Backtesting Guide
Backtesting simulates a trading strategy on historical data. A well-executed backtest gives you evidence of how the strategy would have behaved under realistic data, cost, and execution assumptions.
The Backtest Equation
Backtest Quality = (Data Quality × Cost Modeling × Execution Realism) - Bias
Each component matters. Let's break them down.
Data Quality
Requirements for Quality Data
| Requirement | Why It Matters |
|---|---|
| Adjusted prices | Splits, dividends affect continuity |
| Survivorship-free | Include delisted securities |
| Point-in-time | Use data available at that moment |
| Complete coverage | No gaps or missing bars |
Data Adjustments
# VecAlpha handles adjustments automatically
data = vecalpha.get_data(
    symbol='AAPL',
    start='2020-01-01',
    adjusted=True,           # Split/dividend adjusted
    survivorship_free=True   # Includes delisted
)
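For intuition about what price adjustment does, here is a minimal sketch of back-adjusting closes for a stock split. It is not VecAlpha's implementation: `back_adjust_for_split` and the sample prices are hypothetical, and dividends are ignored.

```python
def back_adjust_for_split(closes, split_index, ratio):
    """Divide every close before `split_index` by `ratio`.

    closes: raw closing prices, oldest first.
    split_index: index of the first bar quoted at the post-split price.
    ratio: shares received per old share (4 for a 4-for-1 split).
    """
    return [
        price / ratio if i < split_index else price
        for i, price in enumerate(closes)
    ]


# Hypothetical series: a 4-for-1 split takes effect at index 2.
raw = [400.0, 404.0, 101.5, 102.0]
adjusted = back_adjust_for_split(raw, split_index=2, ratio=4)
print(adjusted)  # [100.0, 101.0, 101.5, 102.0]
```

Without the adjustment, the drop from 404.0 to 101.5 would look like a 75% loss to any signal computed on raw prices.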
Point-in-Time Data
Critical for avoiding look-ahead bias:
# WRONG: Uses data not available at trade time
if earnings_announced and earnings > expected:
    buy()
# CORRECT: Only use data available before trade
if yesterday.close > yesterday.open:
    buy()
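One way to keep fundamentals point-in-time is to look values up by release date rather than by fiscal period. The sketch below is illustrative only: `latest_known`, the release dates, and the values are hypothetical and not part of VecAlpha.

```python
from bisect import bisect_left
from datetime import date

# (release_date, value) pairs sorted by release date; hypothetical sample data.
earnings_releases = [
    (date(2020, 1, 28), 1.10),
    (date(2020, 4, 30), 1.25),
    (date(2020, 7, 30), 1.05),
]


def latest_known(releases, as_of):
    """Return the most recent value released strictly before `as_of`, or None.

    A figure released on `as_of` itself is excluded, because it may only
    become public after the market closes that day.
    """
    release_dates = [d for d, _ in releases]
    idx = bisect_left(release_dates, as_of) - 1
    return releases[idx][1] if idx >= 0 else None


print(latest_known(earnings_releases, date(2020, 4, 30)))  # 1.1
print(latest_known(earnings_releases, date(2020, 5, 1)))   # 1.25
```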
Cost Modeling
Transaction Costs
Every trade costs money. Include:
| Cost Type | Typical Range | Impact |
|---|---|---|
| Commission | 0.01% - 0.1% | Reduces returns linearly |
| Slippage | 0.01% - 0.1% | Higher for larger orders |
| Spread | 0.01% - 0.05% | Varies by liquidity |
# VecAlpha backtest configuration
backtest_config = {
    'commission': 0.001,               # 0.1% commission
    'slippage_model': 'volume_share',  # Proportional to order size
    'slippage_impact': 0.1,            # Market impact coefficient
}
Slippage Models
Different models for different markets:
# Fixed slippage, as a fraction of price
slippage = 0.0005  # 5 basis points
# Volume-based slippage (more realistic), in price units per share
slippage = order_size / daily_volume * price * 0.1
# Volatility-adjusted slippage, in price units per share
slippage = atr * 0.1  # 10% of ATR
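The three models can be written as small functions that all return slippage in price units per share, which makes them directly comparable. This is a sketch under assumptions: the function names and default coefficients are illustrative and do not describe how VecAlpha implements its `slippage_model` options.

```python
def fixed_slippage(price, rate=0.0005):
    """Fixed model: a constant fraction of price (5 basis points by default)."""
    return price * rate


def volume_share_slippage(price, order_size, daily_volume, impact=0.1):
    """Volume-based model: grows with the order's share of daily volume."""
    if daily_volume <= 0:
        raise ValueError("daily_volume must be positive")
    return (order_size / daily_volume) * price * impact


def volatility_slippage(atr, fraction=0.1):
    """Volatility-adjusted model: a fraction of the average true range."""
    return atr * fraction


price = 100.0
print(round(fixed_slippage(price), 4))                            # 0.05
print(round(volume_share_slippage(price, 10_000, 1_000_000), 4))  # 0.1
print(round(volatility_slippage(atr=2.0), 4))                     # 0.2
```

In this example an order for 1% of daily volume costs twice as much per share as the fixed 5 bps assumption, which is why a fixed model can flatter strategies that trade large size.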
Impact on High-Frequency Strategies
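Costs scale with the number of trades. The comparison below assumes a total cost of 0.2% per trade and subtracts the summed costs from the gross return: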
Strategy A: 100 trades/year, 10% gross return
Cost: 100 × 0.2% = 20%
Net return: -10% (LOSS)
Strategy B: 10 trades/year, 10% gross return
Cost: 10 × 0.2% = 2%
Net return: 8% (PROFIT)
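The arithmetic in the comparison above can be reproduced with a short helper. This is a sketch using the same additive approximation (costs are summed and subtracted, not compounded); `net_return` and `breakeven_trades` are illustrative names.

```python
def net_return(gross_return, trades_per_year, cost_per_trade):
    """Approximate net annual return: gross return minus summed trading costs."""
    return gross_return - trades_per_year * cost_per_trade


def breakeven_trades(gross_return, cost_per_trade):
    """Number of trades per year at which costs consume the whole gross return."""
    return gross_return / cost_per_trade


print(f"Strategy A: {net_return(0.10, 100, 0.002):.1%}")      # -10.0%
print(f"Strategy B: {net_return(0.10, 10, 0.002):.1%}")       # 8.0%
print(f"Break-even: {breakeven_trades(0.10, 0.002):.0f} trades/year")  # 50
```

At 0.2% per trade, a strategy earning 10% gross has to stay under roughly 50 trades per year to remain profitable.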
Execution Realism
Order Types
Model the orders your strategy will use:
| Order Type | When to Use | Modeling |
|---|---|---|
| Market | Immediate execution | Slippage costs |
| Limit | Price target | Fill probability |
| Stop | Risk management | Trigger timing |
# Market order (with slippage)
self.buy(size=100, type='market')
# Limit order (may not fill)
self.buy(size=100, type='limit', price=current_price * 0.99)
# Stop order (triggers on price)
self.sell(size=position, type='stop', price=entry_price * 0.95)
Fill Assumptions
Be realistic about fills:
# Too optimistic: Assume limit always fills
if price <= limit_price:
    filled = True
# More realistic: Partial fills, rejections
filled = simulate_fill_probability(
    order_size=size,
    available_volume=bar_volume,
    price_distance=limit_price - current_price
)
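The guide does not define `simulate_fill_probability`, so the following is one possible definition for a buy limit order, offered as a sketch only. The 10% participation cap and the 50% chance of filling when price merely touches the limit are assumptions chosen for illustration, not calibrated values.

```python
import random


def simulate_fill_probability(order_size, available_volume, price_distance,
                              max_participation=0.1, rng=None):
    """Return the filled quantity for a buy limit order on one bar.

    price_distance is limit_price - current_price. A negative value means the
    market is still above the limit, so nothing fills.
    """
    rng = rng or random.Random()
    if price_distance < 0:
        return 0
    # Assumed queue effect: an order resting exactly at the traded price
    # only fills about half the time.
    if price_distance == 0 and rng.random() < 0.5:
        return 0
    # Partial fill: cap the fill at a share of the bar's volume.
    cap = int(available_volume * max_participation)
    return min(order_size, cap)


print(simulate_fill_probability(100, 10_000, price_distance=-0.5))  # 0
print(simulate_fill_probability(100, 500, price_distance=1.0))      # 50
print(simulate_fill_probability(100, 10_000, price_distance=1.0))   # 100
```

Returning a quantity instead of a boolean lets the backtest carry the unfilled remainder into the next bar.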
Avoiding Bias
Look-Ahead Bias
Using future information:
# WRONG: Uses today's close to trade today
if close > open:
    buy()  # Can't know close until end of day
# CORRECT: Use yesterday's data
if prev_close > prev_open:
    buy()
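A simple structural guard is to compute every signal from the previous bar, so the bar being traded never feeds its own decision. The sketch below uses plain dictionaries as bars; `lagged_signals` and the sample prices are hypothetical.

```python
bars = [
    {"open": 10.0, "close": 10.5},
    {"open": 10.5, "close": 10.2},
    {"open": 10.2, "close": 10.9},
    {"open": 10.9, "close": 11.0},
]


def lagged_signals(bars):
    """Return one buy signal per bar, where bar i only looks at bar i-1.

    The first bar has no history, so its signal is always False.
    """
    signals = [False]
    for prev in bars[:-1]:
        signals.append(prev["close"] > prev["open"])
    return signals


print(lagged_signals(bars))  # [False, True, False, True]
```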
Survivorship Bias
Testing only on successful companies:
# WRONG: Only current S&P 500 stocks
symbols = get_current_sp500()
# CORRECT: Historical S&P 500 constituents
symbols = get_sp500_members(date='2020-01-01')
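Historical constituents are usually stored as membership intervals and filtered by date. The sketch below shows that idea with made-up tickers and dates; `members_on` is a hypothetical helper, not the `get_sp500_members` call shown above.

```python
from datetime import date

# (symbol, added, removed) rows; removed is None while still a member.
# Hypothetical tickers and dates, for illustration only.
membership = [
    ("AAA", date(2010, 1, 4), None),
    ("BBB", date(2012, 6, 1), date(2020, 9, 21)),
    ("CCC", date(2021, 3, 22), None),
]


def members_on(membership, as_of):
    """Return the symbols that were index members on `as_of`, sorted."""
    return sorted(
        symbol
        for symbol, added, removed in membership
        if added <= as_of and (removed is None or as_of < removed)
    )


print(members_on(membership, date(2020, 1, 1)))  # ['AAA', 'BBB']
print(members_on(membership, date(2022, 1, 1)))  # ['AAA', 'CCC']
```

A backtest starting in 2020 that used the 2022 list would silently drop BBB and include CCC before it joined the index.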
Selection Bias
Picking favorable test periods:
# WRONG: Cherry-pick bull market
start = '2020-04-01' # Post-COVID bottom
end = '2021-12-01' # Peak
# CORRECT: Test multiple market regimes
periods = [
    ('2018-01-01', '2019-12-31'),  # Normal
    ('2020-01-01', '2020-12-31'),  # Volatile
    ('2021-01-01', '2022-12-31'),  # Mixed
]
Performance Metrics
Return Metrics
| Metric | Formula | Interpretation |
|---|---|---|
| Total Return | (End - Start) / Start | Overall profit |
| CAGR | (End/Start)^(1/years) - 1 | Annualized growth |
| Monthly Return | Mean of monthly returns | Consistency check |
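As a worked example of the first two formulas, the sketch below computes total return and CAGR for an account that grows from 10,000 to 14,641 over four years. The figures and function names are illustrative.

```python
def total_return(start_value, end_value):
    """(End - Start) / Start."""
    return (end_value - start_value) / start_value


def cagr(start_value, end_value, years):
    """(End / Start)^(1 / years) - 1."""
    return (end_value / start_value) ** (1 / years) - 1


print(f"Total Return: {total_return(10_000, 14_641):.2%}")  # 46.41%
print(f"CAGR: {cagr(10_000, 14_641, 4):.2%}")               # 10.00%
```

The same 46.41% total return spread over eight years would correspond to a much lower CAGR, which is why CAGR is the better figure for comparing backtests of different lengths.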
Risk Metrics
| Metric | Formula | Good Range |
|---|---|---|
| Sharpe Ratio | (Return - Rf) / StdDev | > 1.0 |
| Sortino Ratio | (Return - Rf) / DownsideStd | > 1.5 |
| Max Drawdown | Largest (Peak - Trough) / Peak | < 20% |
| Calmar Ratio | CAGR / MaxDrawdown | > 1.0 |
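The risk formulas in the table can be sketched with the standard library alone. Several choices here are assumptions, not something the table specifies: returns and the risk-free rate are per period, ratios are annualized with the square root of `periods_per_year` (252 for daily data), and downside deviation is measured against the risk-free rate.

```python
import math
from statistics import mean, stdev


def sharpe_ratio(returns, risk_free=0.0, periods_per_year=252):
    """Annualized mean excess return divided by its sample standard deviation."""
    excess = [r - risk_free for r in returns]
    return mean(excess) / stdev(excess) * math.sqrt(periods_per_year)


def sortino_ratio(returns, risk_free=0.0, periods_per_year=252):
    """Like Sharpe, but only returns below the risk-free rate count as risk.

    Raises ZeroDivisionError if no return falls below the risk-free rate.
    """
    excess = [r - risk_free for r in returns]
    downside_dev = math.sqrt(mean([min(e, 0.0) ** 2 for e in excess]))
    return mean(excess) / downside_dev * math.sqrt(periods_per_year)


def max_drawdown(equity):
    """Largest peak-to-trough decline, as a positive fraction of the peak."""
    peak, worst = equity[0], 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst


def calmar_ratio(cagr, max_dd):
    """CAGR divided by maximum drawdown (both as fractions)."""
    return cagr / max_dd


equity_curve = [100.0, 120.0, 90.0, 110.0, 130.0]
print(f"Max Drawdown: {max_drawdown(equity_curve):.2%}")  # 25.00%
```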
Trade Metrics
| Metric | Formula | Target |
|---|---|---|
| Win Rate | Wins / Total Trades | > 45% |
| Profit Factor | Gross Profit / Gross Loss | > 1.5 |
| Avg Win / Avg Loss | Average win size / Average loss | > 1.0 |
| Expectancy | (Win% × AvgWin) - (Loss% × AvgLoss) | > 0 |
# VecAlpha backtest results
results = backtest.run()
print(f"Total Return: {results.total_return:.2%}")
print(f"Sharpe Ratio: {results.sharpe_ratio:.2f}")
print(f"Max Drawdown: {results.max_drawdown:.2%}")
print(f"Win Rate: {results.win_rate:.2%}")
print(f"Profit Factor: {results.profit_factor:.2f}")
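To see how the trade metrics in the table relate to each other, the sketch below computes them from a list of per-trade profit and loss values. `trade_metrics` and the sample P&L figures are hypothetical and independent of the VecAlpha `results` object.

```python
def trade_metrics(pnls):
    """Compute win rate, profit factor, avg win/loss, and expectancy.

    pnls: profit or loss of each closed trade, in account currency.
    Break-even trades (exactly 0) count toward the total but are neither
    wins nor losses.
    """
    wins = [p for p in pnls if p > 0]
    losses = [-p for p in pnls if p < 0]  # stored as positive amounts
    win_rate = len(wins) / len(pnls)
    loss_rate = len(losses) / len(pnls)
    avg_win = sum(wins) / len(wins) if wins else 0.0
    avg_loss = sum(losses) / len(losses) if losses else 0.0
    return {
        "win_rate": win_rate,
        "profit_factor": sum(wins) / sum(losses) if losses else float("inf"),
        "avg_win_loss": avg_win / avg_loss if avg_loss else float("inf"),
        "expectancy": win_rate * avg_win - loss_rate * avg_loss,
    }


metrics = trade_metrics([200.0, -100.0, 300.0, -100.0, -50.0])
print(f"Win Rate: {metrics['win_rate']:.0%}")            # 40%
print(f"Profit Factor: {metrics['profit_factor']:.2f}")  # 2.00
print(f"Expectancy: {metrics['expectancy']:.2f}")        # 50.00
```

The example wins only 40% of the time yet has positive expectancy, because its average win is three times its average loss. Win rate alone says little about profitability.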
Walk-Forward Analysis
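Walk-forward analysis repeatedly optimizes on one window and evaluates on the next, unseen window, which makes it a strong test of robustness: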
from vecalpha import WalkForwardAnalysis
wfa = WalkForwardAnalysis(
    train_period='2Y',  # 2 years for optimization
    test_period='6M',   # 6 months for out-of-sample
    anchor=False        # Rolling (vs anchored)
)
results = wfa.run(strategy, data)
# Compare in-sample vs out-of-sample
print(f"In-Sample Sharpe: {results.in_sample_sharpe:.2f}")
print(f"Out-of-Sample Sharpe: {results.out_of_sample_sharpe:.2f}")
# If OOS < 50% of IS, likely overfitted
if results.out_of_sample_sharpe < results.in_sample_sharpe * 0.5:
    print("WARNING: Strategy may be overfitted")
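The difference between rolling and anchored windows is easiest to see in the index ranges they produce. The sketch below generates them in bar counts, not calendar periods such as '2Y' or '6M'; `walk_forward_windows` is a hypothetical helper and does not describe the internals of `WalkForwardAnalysis`.

```python
def walk_forward_windows(n_bars, train_size, test_size, anchored=False):
    """Return (train_start, train_end, test_start, test_end) tuples.

    All ends are exclusive. Rolling windows slide the training start forward
    by one test period each step; anchored windows keep it at 0 so the
    training set grows over time.
    """
    windows = []
    train_start, train_end = 0, train_size
    while train_end + test_size <= n_bars:
        windows.append((train_start, train_end, train_end, train_end + test_size))
        train_end += test_size
        if not anchored:
            train_start += test_size
    return windows


print(walk_forward_windows(10, train_size=4, test_size=2))
# [(0, 4, 4, 6), (2, 6, 6, 8), (4, 8, 8, 10)]
print(walk_forward_windows(10, train_size=4, test_size=2, anchored=True))
# [(0, 4, 4, 6), (0, 6, 6, 8), (0, 8, 8, 10)]
```

In both modes each test window starts exactly where its training window ends, so no out-of-sample bar is ever used for optimization.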
Monte Carlo Simulation
Estimate the range of outcomes the strategy could plausibly produce, instead of relying on the single historical path:
from vecalpha import MonteCarloSimulation
mc = MonteCarloSimulation(n_simulations=1000)
results = mc.run(strategy, data)
print(f"Expected Return: {results.mean_return:.2%}")
print(f"5th Percentile: {results.percentile_5:.2%}")
print(f"Probability of Loss: {results.prob_loss:.2%}")
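One common way to run such a simulation is to bootstrap the strategy's per-trade returns: resample them with replacement many times and look at the distribution of compounded results. The sketch below shows that approach only; it is an assumption that it resembles what `MonteCarloSimulation` does, and `bootstrap_trades` and the sample returns are hypothetical.

```python
import random


def bootstrap_trades(trade_returns, n_simulations=1000, seed=42):
    """Resample trade returns with replacement and summarize the outcomes.

    Each simulation draws len(trade_returns) trades and compounds them.
    Returns the mean total return, the 5th percentile, and the share of
    simulations that ended with a loss.
    """
    rng = random.Random(seed)  # fixed seed so runs are reproducible
    totals = []
    for _ in range(n_simulations):
        sample = rng.choices(trade_returns, k=len(trade_returns))
        equity = 1.0
        for r in sample:
            equity *= 1.0 + r
        totals.append(equity - 1.0)
    totals.sort()
    return {
        "mean_return": sum(totals) / len(totals),
        "percentile_5": totals[int(0.05 * (len(totals) - 1))],
        "prob_loss": sum(t < 0 for t in totals) / len(totals),
    }


stats = bootstrap_trades([0.02, -0.01, 0.03, -0.02, 0.01])
print(f"Expected Return: {stats['mean_return']:.2%}")
print(f"5th Percentile: {stats['percentile_5']:.2%}")
print(f"Probability of Loss: {stats['prob_loss']:.2%}")
```

Bootstrapping treats trades as independent, so it understates risk for strategies whose losses cluster in time.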
Backtest Checklist
Before trusting a backtest:
- Data is adjusted for splits/dividends
- Survivorship-free data used
- No look-ahead bias in signals
- Realistic transaction costs included
- Slippage model appropriate for strategy
- Tested across multiple market regimes
- Out-of-sample testing performed
- Performance compared to buy-and-hold benchmark
Next Steps
- Optimization - Fine-tune your strategy
- Live Trading - Deploy to production