For any questions, drop a note at amar@harolikar.com
If you’re a financial analyst wondering: “I ran SPR analysis and QuantStats analysis on the same data but got slightly different results - which one should I trust, and why do they differ?”
Bottom Line:

- ✅ Both are reliable - SPR custom calculations achieve 97%+ accuracy vs QuantStats-Lumi
- ✅ Perfect matches on core metrics: Total Return (100%), CAGR (100%)
- ⚠️ Minor differences expected: Sharpe/Sortino ratios differ by 2-4% due to data quality filtering approaches
- Choose based on needs: QuantStats-Lumi for 2 symbols + comprehensive built-in analytics, SPR for multiple symbols + interactive charts + future customizations
Why both exist:

- QuantStats-Lumi: Tried and tested library with 15+ charts and 50+ metrics, maintained by Lumiwealth (algorithmic trading education company)
- SPR: Custom implementation enabling multiple symbols, interactive Plotly charts, CSV exports, and future custom metrics
When to use which: See detailed comparison below
| Section | Content |
|---|---|
| Why This Document Exists | Analyst’s dilemma and key questions answered |
| System Overview | Architecture comparison and why both exist |
| Data Preprocessing | Complete data pipeline with multi-symbol handling |
| Core Metrics | Total Return, CAGR, Sharpe, Sortino methodologies |
| Supporting Analytics | FFN integration and monthly returns |
| CSV Export Capabilities | Ready-to-use data files for Excel analysis |
| Practical Validation | Real test case with actual results |
| When to Use Which Tool | Decision guide and workflows |
| Technical Summary | Validation status and processing comparison |
You’re analyzing portfolio performance and have two tools:

1. QuantStats-Lumi: Maintained library with comprehensive analytics
2. SPR: Custom implementation with additional capabilities
When you run both on the same data, you notice:

- Total returns match perfectly ✅
- CAGR values are identical ✅
- But Sharpe ratios are slightly different (e.g., 0.785 vs 0.81) ⚠️
Both applications use identical data sources but different processing:
```python
# Common approach - yfinance library
import yfinance as yf

data = yf.download(tickers=' '.join(symbols), start=start_date, end=end_date)
```
```python
# For each symbol individually:
prices = prices.dropna()                                  # Remove NaN values
prices = prices[prices > 0]                               # Remove zero/negative prices
prices = prices[~prices.index.duplicated(keep='first')]   # Remove duplicates
```
Financial Purpose: Ensures each symbol has clean, investable price data before portfolio analysis.
```python
# Merge symbols with outer join to preserve all dates
all_data = pd.merge(all_data, symbol_df, left_index=True, right_index=True, how='outer')
```
Result: Combined dataset may have NaN values where symbols have different trading histories.
```python
# Conservative gap-filling approach
all_data = all_data.ffill(limit=5)   # Forward fill up to 5 days maximum
all_data = all_data.dropna()         # Drop remaining NaN values
```
Financial Impact:

- Forward fill (5-day limit): Handles short market closures (holidays, technical issues)
- Drop remaining NaN: Ensures analysis period covers only dates where all symbols have data
- Conservative approach: Prevents artificial data creation for extended missing periods
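Putting the steps above together, here is a minimal, self-contained sketch of the multi-symbol preprocessing pipeline. The function and variable names (`load_clean_prices`, `ffill_limit`) are illustrative, not SPR's actual code:

```python
import pandas as pd
import yfinance as yf

def load_clean_prices(symbols, start, end, ffill_limit=5):
    """Download, clean, and align closing prices for multiple symbols (illustrative sketch)."""
    raw = yf.download(tickers=' '.join(symbols), start=start, end=end)['Close']
    all_data = None
    for symbol in symbols:
        prices = raw[symbol].dropna()                              # Remove NaN values
        prices = prices[prices > 0]                                # Remove zero/negative prices
        prices = prices[~prices.index.duplicated(keep='first')]    # Remove duplicates
        symbol_df = prices.to_frame(symbol)
        # Outer join preserves all dates across symbols
        all_data = symbol_df if all_data is None else pd.merge(
            all_data, symbol_df, left_index=True, right_index=True, how='outer')
    all_data = all_data.ffill(limit=ffill_limit)   # Bridge short gaps only
    return all_data.dropna()                       # Keep dates where every symbol has data

# Example usage
# clean_prices = load_clean_prices(['^GSPC', '^NSEI'], '2015-06-01', '2025-05-30')
```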
```python
# Both applications: Convert prices to returns (loses first day)
returns = price_data.pct_change().dropna()
```
```python
# QuantStats _match_dates function
def _match_dates(returns):
    loc = returns.ne(0).idxmax()   # Find first non-zero return for each symbol
    return returns.loc[loc:]       # Start analysis from first meaningful return
```
Explanation: This step identifies when each investment actually started generating returns (non-zero), ensuring performance metrics reflect actual investable periods rather than “dead” periods.
```python
# Apply QuantStats logic across multiple symbols
first_nonzero_indices = []
for col in returns_data.columns:
    first_nonzero_idx = returns_data[col].ne(0).idxmax()
    if returns_data[col].ne(0).any():
        first_nonzero_indices.append(first_nonzero_idx)

# Use the LATEST first non-zero date among all symbols
start_date = max(first_nonzero_indices)
returns_data = returns_data.loc[start_date:]
```
Multi-Symbol Financial Logic: When analyzing multiple symbols, SPR waits until ALL symbols have meaningful returns before starting the analysis period. This ensures fair comparison across all portfolio components.
Typical Processing Example (^GSPC & ^NSEI):

- Downloaded: 2,580 daily observations
- After pct_change(): 2,579 observations (lost first day)
- After date matching: 2,578 observations (lost one zero-return day)
- Final analysis period: June 9, 2015 to May 30, 2025
Date Loss Patterns:

- 1-day loss: Normal (pct_change() always loses the first day)
- 2+ day loss: Zero-return filtering + multi-symbol alignment
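A toy example (made-up prices, not SPR output) shows where each observation goes:

```python
import pandas as pd

dates = pd.date_range('2024-01-01', periods=5, freq='D')
prices = pd.Series([100.0, 100.0, 101.0, 102.0, 101.5], index=dates)

returns = prices.pct_change().dropna()      # 4 observations: first day lost to pct_change()
print(len(prices), len(returns))            # 5 4

first_meaningful = returns.ne(0).idxmax()   # First non-zero return (2024-01-03)
trimmed = returns.loc[first_meaningful:]    # 3 observations: leading zero-return day dropped
print(len(trimmed))                         # 3
```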
Mathematical Formula:
```
Total Return = (1 + r₁) × (1 + r₂) × ... × (1 + rₙ) - 1
```
QuantStats Implementation:
```python
def comp(returns):
    return (1 + returns).prod() - 1   # Compound all daily returns
```
Explanation: This calculates the cumulative effect of daily returns over the entire period. A 10% gain followed by a 5% loss results in: (1.10 × 0.95) - 1 = 4.5% total return.
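A quick sanity check of that arithmetic in pandas:

```python
import pandas as pd

returns = pd.Series([0.10, -0.05])        # +10% day, then -5% day
total_return = (1 + returns).prod() - 1   # (1.10 * 0.95) - 1
print(f"{total_return:.3%}")              # 4.500%
```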
SPR Implementation (QuantStats-Compatible):
```python
def compute_total_return(returns):
    returns_processed = _prepare_returns_quantstats_style(returns, rf=0.0)
    return (returns_processed + 1).prod() - 1
```
Validation Result: ✅ 100% Perfect Match
Mathematical Formula:
```
CAGR = (1 + Total_Return)^(1/years) - 1
where years = (end_date - start_date) / 365.25
```
QuantStats Implementation:
```python
def cagr(returns):
    years = (returns.index[-1] - returns.index[0]).days / 365.25
    total_ret = comp(returns)
    return total_ret ** (1 / years) - 1
```
Explanation: CAGR answers “What constant annual return would produce the same final result?” It uses 365.25 days per year to account for leap years and provides fractional year precision.
SPR Implementation:
```python
def compute_cagr(returns):
    delta_seconds = (returns.index[-1] - returns.index[0]).total_seconds()
    years = delta_seconds / (365.25 * 24 * 60 * 60)   # More precise than whole days
    total_return_factor = (returns + 1).prod()
    return total_return_factor ** (1 / years) - 1
```
Validation Result: ✅ 100% Perfect Match
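The match is easy to reproduce from the validation case reported later in this document (184.20% total return from June 9, 2015 to May 30, 2025); a rough check:

```python
import pandas as pd

total_return = 1.8420   # 184.20%, from the validation table below
start, end = pd.Timestamp('2015-06-09'), pd.Timestamp('2025-05-30')
years = (end - start).days / 365.25              # ~9.97 years
cagr = (1 + total_return) ** (1 / years) - 1
print(f"{cagr:.2%}")                             # ~11.04%, matching both tools
```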
Mathematical Formula:
```
Sharpe Ratio = (Mean_Excess_Return / Std_Dev_Excess_Returns) × √365

Where:
- Excess_Return = Daily_Return - Daily_Risk_Free_Rate
- Daily_RF_Rate = (1 + Annual_RF_Rate)^(1/365) - 1
```
QuantStats Implementation:
```python
def sharpe(returns, rf=0.0, periods=365):
    returns = _prepare_returns(returns, rf, periods)   # Subtract RF rate
    return returns.mean() / returns.std(ddof=1) * np.sqrt(periods)
```
Explanation: Sharpe ratio measures risk-adjusted returns. It asks “How much extra return do I get per unit of risk?” The √365 factor annualizes the daily ratio. Higher values indicate better risk-adjusted performance.
Risk-Free Rate Processing (Both Applications):
```python
# Convert annual rate to daily (compound, not simple division)
daily_rf = np.power(1 + annual_rf, 1.0 / 365) - 1.0
excess_returns = daily_returns - daily_rf
```
Explanation: Risk-free rate conversion uses compound mathematics. A 5% annual rate becomes approximately 0.0134% daily, not simply 5%/365 = 0.0137%.
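Putting the pieces together, a minimal sketch of the Sharpe calculation described above, assuming a daily returns Series and a 5% annual risk-free rate (illustrative helper, not either tool's exact code):

```python
import numpy as np
import pandas as pd

def sharpe_ratio(daily_returns: pd.Series, annual_rf: float = 0.05, periods: int = 365) -> float:
    """Annualized Sharpe ratio using compound daily risk-free conversion."""
    daily_rf = np.power(1 + annual_rf, 1.0 / periods) - 1.0   # ~0.0134% per day for 5% annual
    excess = daily_returns - daily_rf
    return excess.mean() / excess.std(ddof=1) * np.sqrt(periods)

# Example with synthetic data
rng = np.random.default_rng(0)
sample = pd.Series(rng.normal(0.0005, 0.011, 2500))
print(round(sharpe_ratio(sample), 3))
```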
Validation Results: 96.9-97.4% Accuracy
Why 3% Differences Exist:

1. Time in Market Calculation: SPR 97.2% vs QuantStats 98% (different filtering approaches)
2. Data Quality Thresholds: Different precision levels for “zero return” detection
3. Multi-Symbol vs Pair Analysis: Portfolio-level vs symbol-level processing differences
Mathematical Formula:
```
Sortino Ratio = (Mean_Excess_Return / Downside_Deviation) × √365
Downside_Deviation = √(Σ(negative_returns²) / total_observations)
```
Critical Implementation Detail: Uses total observations in denominator, not just negative returns.
QuantStats Implementation:
```python
def sortino(returns, rf=0, periods=365):
    returns = _prepare_returns(returns, rf, periods)
    downside = np.sqrt((returns[returns < 0] ** 2).sum() / len(returns))   # Note: len(returns)
    return returns.mean() / downside * np.sqrt(periods)
```
Explanation: Sortino ratio is like Sharpe ratio but only penalizes downside volatility. It recognizes that upside volatility isn’t “bad risk.” The denominator uses all observations to maintain statistical consistency.
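To see why the denominator choice matters, compare the two conventions on toy data (illustrative only):

```python
import numpy as np
import pandas as pd

returns = pd.Series([0.02, -0.01, 0.015, -0.02, 0.01])
negatives = returns[returns < 0]

dd_all = np.sqrt((negatives ** 2).sum() / len(returns))    # QuantStats-style: divide by all observations
dd_neg = np.sqrt((negatives ** 2).sum() / len(negatives))  # Alternative: divide by negative days only
print(round(dd_all, 5), round(dd_neg, 5))                  # dd_all is smaller, so the Sortino ratio is higher
```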
Validation Results: 96.4-97.3% Accuracy
SPR uses FFN’s `calc_stats()` function for comprehensive supporting analytics:
```python
# FFN integration in SPR
ffn_data = data_processor.get_data_for_ffn()   # Raw price data
perf = ffn_data.calc_stats()                   # Comprehensive statistics
perf.set_riskfree_rate(rf_decimal)             # Risk-free rate integration

monthly_returns = ffn.monthly_returns(price_data)
```
Output: Year/month breakdown tables showing periodic performance patterns, seasonality analysis.
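If you want to reproduce a similar year-by-month table without FFN, a small pandas sketch (assuming a daily price Series; this is not FFN's implementation) could look like this:

```python
import pandas as pd

def year_month_returns(prices: pd.Series) -> pd.DataFrame:
    """Pivot monthly returns into a year x month table (illustrative sketch)."""
    monthly = prices.resample('ME').last().pct_change().dropna()   # 'ME' = month-end ('M' on older pandas)
    table = monthly.to_frame('ret')
    table['year'] = table.index.year
    table['month'] = table.index.strftime('%b')
    months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
              'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
    return table.pivot(index='year', columns='month', values='ret').reindex(columns=months)

# year_month_returns(price_data['^GSPC'])
```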
```python
drawdown_series = ffn.to_drawdown_series(price_data)
drawdown_details = ffn.drawdown_details(drawdown_series)
```
Analysis Provided:

- Maximum drawdown periods (peak-to-trough decline)
- Drawdown duration and recovery periods
- Peak and trough identification
Additional metrics beyond the core 4, including volatility measures, rolling statistics, and performance attribution.
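For reference, a minimal standalone FFN session produces the same family of supporting analytics (the tickers here are only an example):

```python
import ffn

# Download prices and compute the full FFN statistics set
prices = ffn.get('spy,agg', start='2015-01-01')
perf = prices.calc_stats()
perf.set_riskfree_rate(0.05)   # 5% annual risk-free rate
perf.display()                 # Summary table: total return, CAGR, drawdown, monthly stats, etc.

# Drawdown detail for one series
dd = prices['spy'].to_drawdown_series()
print(dd.min())                # Maximum drawdown
```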
SPR automatically generates two critical CSV files alongside the HTML report:
Filename: report_AAPL-MSFT-GOOG_timestamp_processed_price_data.csv
Content: Clean, investment-ready price data with:

- ✅ Date alignment across all symbols (conservative multi-symbol approach)
- ✅ Forward filling applied (up to 5-day gaps)
- ✅ Data quality filters (removed zeros, negatives, duplicates)
- ✅ Consistent date range (all symbols have data for same period)
Use Cases:

- Excel analysis: Drop 10+ symbols into a spreadsheet for custom calculations
- External validation: Compare with other data sources
- Independent research: Use clean data without repeating preprocessing
- Quick data pulls: Get multiple symbol prices without yfinance setup
Filename: report_AAPL-MSFT-GOOG_timestamp_cumulative_returns.csv
Content: Daily cumulative return progression showing:
- Performance evolution over time
- Relative performance comparison across symbols
- Ready for charting and trend analysis
Use Cases:

- Custom visualizations: Create your own charts in Excel/Python
- Performance attribution: Analyze contribution periods
- Validation checks: Verify report calculations independently
- Academic research: Clean returns data for statistical analysis
Scenario: You need clean price data for 8 tech stocks for Excel analysis.

- Traditional approach: Download from multiple sources, handle missing data, align dates, forward fill gaps
- SPR approach: Run one analysis, get a ready-to-use CSV with all 8 stocks perfectly aligned
Time savings: Hours of data cleaning reduced to minutes
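Once exported, the CSVs load straight into pandas or Excel. A quick sketch using the example filename patterns above (your actual filenames include a timestamp, and the date is assumed to be the first column):

```python
import pandas as pd

# Processed, aligned price data (replace with your actual exported filename)
prices = pd.read_csv('report_AAPL-MSFT-GOOG_timestamp_processed_price_data.csv',
                     index_col=0, parse_dates=True)

# Daily cumulative returns per symbol
cum_returns = pd.read_csv('report_AAPL-MSFT-GOOG_timestamp_cumulative_returns.csv',
                          index_col=0, parse_dates=True)

print(prices.tail())
print(cum_returns.idxmax())   # Date of peak cumulative return for each symbol
```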
SPR Processing Log:
- Downloaded: 2,580 daily observations
- After cleaning: 2,578 observations
- Date range: 2015-06-09 to 2025-05-30
- Symbols aligned: Both started meaningful returns on same date
| Metric | SPR Result | QuantStats-Lumi | Difference | Status |
|---|---|---|---|---|
| Total Return | 184.20% | 184.20% | 0.00% | ✅ Perfect |
| CAGR | 11.04% | 11.04% | 0.00% | ✅ Perfect |
| Sharpe Ratio | 0.785 | 0.81 | -3.1% | ✅ Acceptable |
| Sortino Ratio | 1.099 | 1.14 | -3.6% | ✅ Acceptable |
Sharpe Ratio Analysis (^GSPC Example):
SPR Debug Information:
- Total sample size: 2,580
- Non-zero returns: 2,508
- Time in Market: 97.2%
- Mean return: 0.00047080
- Std deviation: 0.01146096
- Final Sharpe: 0.784807
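Assuming the logged mean return is already net of the daily risk-free rate, these figures reproduce the final Sharpe directly:

```python
import numpy as np

mean_return = 0.00047080   # From the SPR debug log above
std_dev = 0.01146096
sharpe = mean_return / std_dev * np.sqrt(365)
print(round(sharpe, 6))    # 0.784808, matching the logged 0.784807 up to input rounding
```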
Difference Explanation: The 3% variance likely stems from QuantStats using slightly different data quality filters, resulting in 98% vs 97.2% “time in market” calculation.
✅ Perfect For (QuantStats-Lumi):

- 2-symbol analysis (strategy vs benchmark comparison)
- Comprehensive built-in analytics (15+ charts, 50+ metrics)
- Minimal customization needs
- Tried and tested analysis (maintained by Lumiwealth team of data scientists and engineers)
- Quick, standard portfolio reports
Advantages:

- Extensive pre-built visualizations
- Comprehensive metrics library
- Minimal setup required
- Well-documented and actively maintained
✅ Perfect For (SPR):

- Multi-symbol analysis (portfolio with 3+ holdings)
- Interactive visualizations (Plotly-based charts)
- Clean data extraction (processed price data for 10+ symbols ready for Excel)
- Independent analysis & validation (CSV exports for custom calculations)
- Custom metric requirements (future rolling metrics, specialized calculations)
- Development flexibility (ability to modify calculations without affecting QuantStats)
- Dual validation approach (automatic cross-verification with QuantStats)
Advantages:

- No symbol limit restrictions
- Interactive cumulative returns charts
- Ready-to-use CSV exports: Clean, processed price data and cumulative returns for Excel analysis
- Foundation for custom metric development
- Transparent calculation methodology
| Component | Implementation | Validation Status | Notes |
|---|---|---|---|
| Total Return | Custom QuantStats-compatible | ✅ 100% Match | Perfect mathematical agreement |
| CAGR | Custom QuantStats-compatible | ✅ 100% Match | Identical fractional year calculation |
| Sharpe Ratio | Custom QuantStats-compatible | ✅ 97%+ Match | Minor data filtering differences |
| Sortino Ratio | Custom QuantStats-compatible | ✅ 97%+ Match | Same methodology, slight filtering variance |
| Monthly Returns | FFN library | ✅ Verified | Identical FFN implementation |
| Drawdown Analysis | FFN library | ✅ Verified | Same peak-to-trough methodology |
| Stage | QuantStats-Lumi | SPR | Impact |
|---|---|---|---|
| Data Source | Yahoo Finance | Yahoo Finance | Identical |
| Symbol Limit | 2 symbols max | Unlimited | SPR advantage for portfolios |
| Date Matching | Per-pair basis | Multi-symbol alignment | Conservative SPR approach |
| Missing Data | Library handling | Forward fill (5-day limit) | Explicit SPR control |
| Quality Filters | Internal library logic | Transparent implementation | SPR auditability |
Both tools serve complementary roles in a robust analysis framework. The automatic validation provided by running both tools enhances confidence in results and provides protection against potential library changes or data quality issues.
For most analysts: Start with your primary tool based on symbol count needs, then use the other for validation when making important investment decisions.
Last Updated: 2025-06-17 19:44:27 UTC