For any questions, drop a note at amar@harolikar.com
If you’re a financial analyst wondering: “I ran SPR analysis and QuantStats analysis on the same data but got slightly different results - which one should I trust, and why do they differ?”
Bottom Line:

- ✅ Both are reliable - SPR custom calculations achieve 97%+ accuracy vs QuantStats-Lumi
- ✅ Perfect matches on core metrics: Total Return (100%), CAGR (100%)
- ⚠️ Minor differences expected: Sharpe/Sortino ratios differ by 2-4% due to data quality filtering approaches
- Choose based on needs: QuantStats-Lumi for 2 symbols + comprehensive built-in analytics, SPR for multiple symbols + interactive charts + future customizations
Why both exist:

- QuantStats-Lumi: Tried and tested library with 15+ charts and 50+ metrics, maintained by Lumiwealth (algorithmic trading education company)
- SPR: Custom implementation enabling multiple symbols, interactive Plotly charts, CSV exports, and future custom metrics
When to use which: See detailed comparison below
| Section | Content |
|---|---|
| Why This Document Exists | Analyst’s dilemma and key questions answered |
| System Overview | Architecture comparison and why both exist |
| Data Preprocessing | Complete data pipeline with multi-symbol handling |
| Core Metrics | Total Return, CAGR, Sharpe, Sortino methodologies |
| Supporting Analytics | FFN integration and monthly returns |
| CSV Export Capabilities | Ready-to-use data files for Excel analysis |
| Practical Validation | Real test case with actual results |
| When to Use Which Tool | Decision guide and workflows |
| Technical Summary | Validation status and processing comparison |
You’re analyzing portfolio performance and have two tools:

1. QuantStats-Lumi: Maintained library with comprehensive analytics
2. SPR: Custom implementation with additional capabilities
When you run both on the same data, you notice:

- Total returns match perfectly ✅
- CAGR values are identical ✅
- But Sharpe ratios are slightly different (e.g., 0.785 vs 0.81) ⚠️
Both applications use identical data sources but different processing:
```python
# Common approach - yfinance library
import yfinance as yf

data = yf.download(tickers=' '.join(symbols), start=start_date, end=end_date)
```
```python
# For each symbol individually:
prices = prices.dropna()                                  # Remove NaN values
prices = prices[prices > 0]                               # Remove zero/negative prices
prices = prices[~prices.index.duplicated(keep='first')]   # Remove duplicates
```
Financial Purpose: Ensures each symbol has clean, investable price data before portfolio analysis.
```python
# Merge symbols with outer join to preserve all dates
all_data = pd.merge(all_data, symbol_df, left_index=True, right_index=True, how='outer')
```
Result: Combined dataset may have NaN values where symbols have different trading histories.
```python
# Conservative gap-filling approach
all_data = all_data.ffill(limit=5)   # Forward fill up to 5 days maximum
all_data = all_data.dropna()         # Drop remaining NaN values
```
Financial Impact:

- Forward fill (5-day limit): Handles short market closures (holidays, technical issues)
- Drop remaining NaN: Ensures analysis period covers only dates where all symbols have data
- Conservative approach: Prevents artificial data creation for extended missing periods
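Putting the steps above together, here is a minimal, self-contained sketch of the multi-symbol preprocessing pipeline. The function and variable names (`load_clean_prices`, `ffill_limit`) are illustrative, not SPR's actual code:

```python
import pandas as pd
import yfinance as yf

def load_clean_prices(symbols, start, end, ffill_limit=5):
    """Download, clean, and align closing prices for multiple symbols (illustrative sketch)."""
    raw = yf.download(tickers=' '.join(symbols), start=start, end=end)['Close']
    all_data = None
    for symbol in symbols:
        prices = raw[symbol].dropna()                              # Remove NaN values
        prices = prices[prices > 0]                                # Remove zero/negative prices
        prices = prices[~prices.index.duplicated(keep='first')]    # Remove duplicates
        symbol_df = prices.to_frame(symbol)
        # Outer join preserves all dates across symbols
        all_data = symbol_df if all_data is None else pd.merge(
            all_data, symbol_df, left_index=True, right_index=True, how='outer')
    all_data = all_data.ffill(limit=ffill_limit)   # Bridge short gaps only
    return all_data.dropna()                       # Keep dates where every symbol has data

# Example usage
# clean_prices = load_clean_prices(['^GSPC', '^NSEI'], '2015-06-01', '2025-05-30')
```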
```python
# Both applications: Convert prices to returns (loses first day)
returns = price_data.pct_change().dropna()
```
```python
# QuantStats _match_dates function
def _match_dates(returns):
    loc = returns.ne(0).idxmax()   # Find first non-zero return for each symbol
    return returns.loc[loc:]       # Start analysis from first meaningful return
```
Explanation: This step identifies when each investment actually started generating returns (non-zero), ensuring performance metrics reflect actual investable periods rather than “dead” periods.
```python
# Apply QuantStats logic across multiple symbols
first_nonzero_indices = []
for col in returns_data.columns:
    first_nonzero_idx = returns_data[col].ne(0).idxmax()
    if returns_data[col].ne(0).any():
        first_nonzero_indices.append(first_nonzero_idx)

# Use the LATEST first non-zero date among all symbols
start_date = max(first_nonzero_indices)
returns_data = returns_data.loc[start_date:]
```
Multi-Symbol Financial Logic: When analyzing multiple symbols, SPR waits until ALL symbols have meaningful returns before starting the analysis period. This ensures fair comparison across all portfolio components.
Typical Processing Example (^GSPC & ^NSEI):

- Downloaded: 2,580 daily observations
- After pct_change(): 2,579 observations (lost first day)
- After date matching: 2,578 observations (lost one zero-return day)
- Final analysis period: June 9, 2015 to May 30, 2025
Date Loss Patterns:

- 1-day loss: Normal (pct_change() always loses the first day)
- 2+ day loss: Zero-return filtering + multi-symbol alignment
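A toy example (made-up prices, not SPR output) shows where each observation goes:

```python
import pandas as pd

dates = pd.date_range('2024-01-01', periods=5, freq='D')
prices = pd.Series([100.0, 100.0, 101.0, 102.0, 101.5], index=dates)

returns = prices.pct_change().dropna()      # 4 observations: first day lost to pct_change()
print(len(prices), len(returns))            # 5 4

first_meaningful = returns.ne(0).idxmax()   # First non-zero return (2024-01-03)
trimmed = returns.loc[first_meaningful:]    # 3 observations: leading zero-return day dropped
print(len(trimmed))                         # 3
```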
Mathematical Formula:
```
Total Return = (1 + r₁) × (1 + r₂) × ... × (1 + rₙ) - 1
```
QuantStats Implementation:
```python
def comp(returns):
    return (1 + returns).prod() - 1   # Compound all daily returns
```
Explanation: This calculates the cumulative effect of daily returns over the entire period. A 10% gain followed by a 5% loss results in: (1.10 × 0.95) - 1 = 4.5% total return.
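A quick sanity check of that arithmetic in pandas:

```python
import pandas as pd

returns = pd.Series([0.10, -0.05])        # +10% day, then -5% day
total_return = (1 + returns).prod() - 1   # (1.10 * 0.95) - 1
print(f"{total_return:.3%}")              # 4.500%
```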
SPR Implementation (QuantStats-Compatible):
```python
def compute_total_return(returns):
    returns_processed = _prepare_returns_quantstats_style(returns, rf=0.0)
    return (returns_processed + 1).prod() - 1
```
Validation Result: ✅ 100% Perfect Match
Mathematical Formula:
```
CAGR = (1 + Total_Return)^(1/years) - 1
where years = (end_date - start_date) / 365.25
```
QuantStats Implementation:
```python
def cagr(returns):
    years = (returns.index[-1] - returns.index[0]).days / 365.25
    total_ret = comp(returns)
    return total_ret ** (1 / years) - 1
```
Explanation: CAGR answers “What constant annual return would produce the same final result?” It uses 365.25 days per year to account for leap years and provides fractional year precision.
SPR Implementation:
```python
def compute_cagr(returns):
    delta_seconds = (returns.index[-1] - returns.index[0]).total_seconds()
    years = delta_seconds / (365.25 * 24 * 60 * 60)   # More precise than whole days
    total_return_factor = (returns + 1).prod()
    return total_return_factor ** (1 / years) - 1
```
Validation Result: ✅ 100% Perfect Match
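The match is easy to reproduce from the validation case reported later in this document (184.20% total return from June 9, 2015 to May 30, 2025); a rough check:

```python
import pandas as pd

total_return = 1.8420   # 184.20%, from the validation table below
start, end = pd.Timestamp('2015-06-09'), pd.Timestamp('2025-05-30')
years = (end - start).days / 365.25              # ~9.97 years
cagr = (1 + total_return) ** (1 / years) - 1
print(f"{cagr:.2%}")                             # ~11.04%, matching both tools
```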
Mathematical Formula:
```
Sharpe Ratio = (Mean_Excess_Return / Std_Dev_Excess_Returns) × √365

Where:
- Excess_Return = Daily_Return - Daily_Risk_Free_Rate
- Daily_RF_Rate = (1 + Annual_RF_Rate)^(1/365) - 1
```
QuantStats Implementation:
```python
def sharpe(returns, rf=0.0, periods=365):
    returns = _prepare_returns(returns, rf, periods)   # Subtract RF rate
    return returns.mean() / returns.std(ddof=1) * np.sqrt(periods)
```
Explanation: Sharpe ratio measures risk-adjusted returns. It asks “How much extra return do I get per unit of risk?” The √365 factor annualizes the daily ratio. Higher values indicate better risk-adjusted performance.
Risk-Free Rate Processing (Both Applications):
```python
# Convert annual rate to daily (compound, not simple division)
daily_rf = np.power(1 + annual_rf, 1.0 / 365) - 1.0
excess_returns = daily_returns - daily_rf
```
Explanation: Risk-free rate conversion uses compound mathematics. A 5% annual rate becomes approximately 0.0134% daily, not simply 5%/365 = 0.0137%.
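Putting the pieces together, a minimal sketch of the Sharpe calculation described above, assuming a daily returns Series and a 5% annual risk-free rate (illustrative helper, not either tool's exact code):

```python
import numpy as np
import pandas as pd

def sharpe_ratio(daily_returns: pd.Series, annual_rf: float = 0.05, periods: int = 365) -> float:
    """Annualized Sharpe ratio using compound daily risk-free conversion."""
    daily_rf = np.power(1 + annual_rf, 1.0 / periods) - 1.0   # ~0.0134% per day for 5% annual
    excess = daily_returns - daily_rf
    return excess.mean() / excess.std(ddof=1) * np.sqrt(periods)

# Example with synthetic data
rng = np.random.default_rng(0)
sample = pd.Series(rng.normal(0.0005, 0.011, 2500))
print(round(sharpe_ratio(sample), 3))
```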
Validation Results: 96.9-97.4% Accuracy
Why 3% Differences Exist:

1. Time in Market Calculation: SPR 97.2% vs QuantStats 98% (different filtering approaches)
2. Data Quality Thresholds: Different precision levels for “zero return” detection
3. Multi-Symbol vs Pair Analysis: Portfolio-level vs symbol-level processing differences
Mathematical Formula:
```
Sortino Ratio = (Mean_Excess_Return / Downside_Deviation) × √365
Downside_Deviation = √(Σ(negative_returns²) / total_observations)
```
Critical Implementation Detail: Uses total observations in denominator, not just negative returns.
QuantStats Implementation:
```python
def sortino(returns, rf=0, periods=365):
    returns = _prepare_returns(returns, rf, periods)
    downside = np.sqrt((returns[returns < 0] ** 2).sum() / len(returns))   # Note: len(returns)
    return returns.mean() / downside * np.sqrt(periods)
```
Explanation: Sortino ratio is like Sharpe ratio but only penalizes downside volatility. It recognizes that upside volatility isn’t “bad risk.” The denominator uses all observations to maintain statistical consistency.
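To see why the denominator choice matters, compare the two conventions on toy data (illustrative only):

```python
import numpy as np
import pandas as pd

returns = pd.Series([0.02, -0.01, 0.015, -0.02, 0.01])
negatives = returns[returns < 0]

dd_all = np.sqrt((negatives ** 2).sum() / len(returns))    # QuantStats-style: divide by all observations
dd_neg = np.sqrt((negatives ** 2).sum() / len(negatives))  # Alternative: divide by negative days only
print(round(dd_all, 5), round(dd_neg, 5))                  # dd_all is smaller, so the Sortino ratio is higher
```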
Validation Results: 96.4-97.3% Accuracy
SPR uses FFN’s `calc_stats()` function for comprehensive supporting analytics:
```python
# FFN integration in SPR
ffn_data = data_processor.get_data_for_ffn()   # Raw price data
perf = ffn_data.calc_stats()                   # Comprehensive statistics
perf.set_riskfree_rate(rf_decimal)             # Risk-free rate integration

monthly_returns = ffn.monthly_returns(price_data)
```
Output: Year/month breakdown tables showing periodic performance patterns, seasonality analysis.
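If you want to reproduce a similar year-by-month table without FFN, a small pandas sketch (assuming a daily price Series; this is not FFN's implementation) could look like this:

```python
import pandas as pd

def year_month_returns(prices: pd.Series) -> pd.DataFrame:
    """Pivot monthly returns into a year x month table (illustrative sketch)."""
    monthly = prices.resample('ME').last().pct_change().dropna()   # 'ME' = month-end ('M' on older pandas)
    table = monthly.to_frame('ret')
    table['year'] = table.index.year
    table['month'] = table.index.strftime('%b')
    months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
              'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
    return table.pivot(index='year', columns='month', values='ret').reindex(columns=months)

# year_month_returns(price_data['^GSPC'])
```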
```python
drawdown_series = ffn.to_drawdown_series(price_data)
drawdown_details = ffn.drawdown_details(drawdown_series)
```
Analysis Provided:

- Maximum drawdown periods (peak-to-trough decline)
- Drawdown duration and recovery periods
- Peak and trough identification
Additional metrics beyond the core 4, including volatility measures, rolling statistics, and performance attribution.
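For reference, a minimal standalone FFN session produces the same family of supporting analytics (the tickers here are only an example):

```python
import ffn

# Download prices and compute the full FFN statistics set
prices = ffn.get('spy,agg', start='2015-01-01')
perf = prices.calc_stats()
perf.set_riskfree_rate(0.05)   # 5% annual risk-free rate
perf.display()                 # Summary table: total return, CAGR, drawdown, monthly stats, etc.

# Drawdown detail for one series
dd = prices['spy'].to_drawdown_series()
print(dd.min())                # Maximum drawdown
```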
SPR automatically generates two critical CSV files alongside the HTML report:
Filename: report_AAPL-MSFT-GOOG_timestamp_processed_price_data.csv
Content: Clean, investment-ready price data with:

- ✅ Date alignment across all symbols (conservative multi-symbol approach)
- ✅ Forward filling applied (up to 5-day gaps)
- ✅ Data quality filters (removed zeros, negatives, duplicates)
- ✅ Consistent date range (all symbols have data for same period)
Use Cases:

- Excel analysis: Drop 10+ symbols into a spreadsheet for custom calculations
- External validation: Compare with other data sources
- Independent research: Use clean data without repeating preprocessing
- Quick data pulls: Get multiple symbol prices without yfinance setup
Filename: report_AAPL-MSFT-GOOG_timestamp_cumulative_returns.csv
Content: Daily cumulative return progression showing:
- Performance evolution over time
- Relative performance comparison across symbols
- Ready for charting and trend analysis
Use Cases:

- Custom visualizations: Create your own charts in Excel/Python
- Performance attribution: Analyze contribution periods
- Validation checks: Verify report calculations independently
- Academic research: Clean returns data for statistical analysis
Scenario: You need clean price data for 8 tech stocks for Excel analysis.

- Traditional approach: Download from multiple sources, handle missing data, align dates, forward fill gaps
- SPR approach: Run one analysis, get a ready-to-use CSV with all 8 stocks perfectly aligned
Time savings: Hours of data cleaning reduced to minutes
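Once exported, the CSVs load straight into pandas or Excel. A quick sketch using the example filename patterns above (your actual filenames include a timestamp, and the date is assumed to be the first column):

```python
import pandas as pd

# Processed, aligned price data (replace with your actual exported filename)
prices = pd.read_csv('report_AAPL-MSFT-GOOG_timestamp_processed_price_data.csv',
                     index_col=0, parse_dates=True)

# Daily cumulative returns per symbol
cum_returns = pd.read_csv('report_AAPL-MSFT-GOOG_timestamp_cumulative_returns.csv',
                          index_col=0, parse_dates=True)

print(prices.tail())
print(cum_returns.idxmax())   # Date of peak cumulative return for each symbol
```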
SPR Processing Log:
- Downloaded: 2,580 daily observations
- After cleaning: 2,578 observations
- Date range: 2015-06-09 to 2025-05-30
- Symbols aligned: Both started meaningful returns on same date
| Metric | SPR Result | QuantStats-Lumi | Difference | Status |
|---|---|---|---|---|
| Total Return | 184.20% | 184.20% | 0.00% | ✅ Perfect |
| CAGR | 11.04% | 11.04% | 0.00% | ✅ Perfect |
| Sharpe Ratio | 0.785 | 0.81 | -3.1% | ✅ Acceptable |
| Sortino Ratio | 1.099 | 1.14 | -3.6% | ✅ Acceptable |
Sharpe Ratio Analysis (^GSPC Example):
SPR Debug Information:
- Total sample size: 2,580
- Non-zero returns: 2,508
- Time in Market: 97.2%
- Mean return: 0.00047080
- Std deviation: 0.01146096
- Final Sharpe: 0.784807
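Assuming the logged mean return is already net of the daily risk-free rate, these figures reproduce the final Sharpe directly:

```python
import numpy as np

mean_return = 0.00047080   # From the SPR debug log above
std_dev = 0.01146096
sharpe = mean_return / std_dev * np.sqrt(365)
print(round(sharpe, 6))    # 0.784808, matching the logged 0.784807 up to input rounding
```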
Difference Explanation: The 3% variance likely stems from QuantStats using slightly different data quality filters, resulting in 98% vs 97.2% “time in market” calculation.
✅ Perfect For (QuantStats-Lumi):

- 2-symbol analysis (strategy vs benchmark comparison)
- Comprehensive built-in analytics (15+ charts, 50+ metrics)
- Minimal customization needs
- Tried and tested analysis (maintained by Lumiwealth team of data scientists and engineers)
- Quick, standard portfolio reports
Advantages:

- Extensive pre-built visualizations
- Comprehensive metrics library
- Minimal setup required
- Well-documented and actively maintained
✅ Perfect For (SPR):

- Multi-symbol analysis (portfolio with 3+ holdings)
- Interactive visualizations (Plotly-based charts)
- Clean data extraction (processed price data for 10+ symbols ready for Excel)
- Independent analysis & validation (CSV exports for custom calculations)
- Custom metric requirements (future rolling metrics, specialized calculations)
- Development flexibility (ability to modify calculations without affecting QuantStats)
- Dual validation approach (automatic cross-verification with QuantStats)
Advantages:

- No symbol limit restrictions
- Interactive cumulative returns charts
- Ready-to-use CSV exports: Clean, processed price data and cumulative returns for Excel analysis
- Foundation for custom metric development
- Transparent calculation methodology
| Component | Implementation | Validation Status | Notes |
|---|---|---|---|
| Total Return | Custom QuantStats-compatible | ✅ 100% Match | Perfect mathematical agreement |
| CAGR | Custom QuantStats-compatible | ✅ 100% Match | Identical fractional year calculation |
| Sharpe Ratio | Custom QuantStats-compatible | ✅ 97%+ Match | Minor data filtering differences |
| Sortino Ratio | Custom QuantStats-compatible | ✅ 97%+ Match | Same methodology, slight filtering variance |
| Monthly Returns | FFN library | ✅ Verified | Identical FFN implementation |
| Drawdown Analysis | FFN library | ✅ Verified | Same peak-to-trough methodology |
| Stage | QuantStats-Lumi | SPR | Impact |
|---|---|---|---|
| Data Source | Yahoo Finance | Yahoo Finance | Identical |
| Symbol Limit | 2 symbols max | Unlimited | SPR advantage for portfolios |
| Date Matching | Per-pair basis | Multi-symbol alignment | Conservative SPR approach |
| Missing Data | Library handling | Forward fill (5-day limit) | Explicit SPR control |
| Quality Filters | Internal library logic | Transparent implementation | SPR auditability |
Both tools serve complementary roles in a robust analysis framework. The automatic validation provided by running both tools enhances confidence in results and provides protection against potential library changes or data quality issues.
For most analysts: Start with your primary tool based on symbol count needs, then use the other for validation when making important investment decisions.
Last Updated: 2025-06-17 19:44:27 UTC