# Optimization
Optimization is the process of finding the best parameters for your strategy. But beware: aggressive optimization leads to overfitting.
## The Optimization Paradox

More optimization ≠ better results:
- Too little: Suboptimal performance
- Too much: Overfitting (great backtest, terrible live trading)
- Just right: Robust, generalizable strategy
## What to Optimize

### Good Candidates

Parameters with economic rationale:
| Parameter | Rationale | Range |
|---|---|---|
| MA periods | Trend timeframe | 10-200 |
| RSI threshold | Overbought/oversold definition | 25-35, 65-75 |
| Stop loss % | Risk tolerance | 2-10% |
### Poor Candidates

Parameters that just "fit the data":
- Arbitrary numbers without meaning
- Too many parameters simultaneously
- Parameters that perfectly capture historical noise
## Optimization Methods

### Grid Search

Test all combinations:
```python
from vecalpha import GridSearch

params = {
    'short_ma': [10, 20, 30],
    'long_ma': [40, 50, 60],
    'stop_loss': [0.03, 0.05, 0.07],
}

grid = GridSearch(strategy, data, params, metric='sharpe')
results = grid.run()

print(f"Best params: {results.best_params}")
print(f"Best Sharpe: {results.best_score:.2f}")
```
**Pros:** Exhaustive; finds the best combination within the grid.

**Cons:** Computationally expensive; easy to overfit.
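Under the hood, grid search is just an exhaustive loop over the Cartesian product of the value lists, so cost grows multiplicatively: the grid above already means 3 × 3 × 3 = 27 full backtests. A minimal standalone sketch of the same idea (reusing the `backtest`, `strategy`, and `data` names assumed elsewhere on this page):

```python
from itertools import product

params = {
    'short_ma': [10, 20, 30],
    'long_ma': [40, 50, 60],
    'stop_loss': [0.03, 0.05, 0.07],
}

# Every combination = Cartesian product of the value lists (27 here)
combos = [dict(zip(params, values)) for values in product(*params.values())]
print(f"{len(combos)} backtests required")

# Score each combination and keep the best by Sharpe ratio
best = max(combos, key=lambda p: backtest(strategy, data, params=p).sharpe)
```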
### Random Search

Sample random combinations:
```python
from vecalpha import RandomSearch

search = RandomSearch(  # avoid naming this `random`: it shadows the stdlib module
    strategy, data,
    n_iterations=100,  # Test 100 random combinations
    param_distributions={
        'short_ma': (10, 50),
        'long_ma': (50, 200),
    }
)
results = search.run()
```
**Pros:** More efficient than grid search in high dimensions.

**Cons:** May miss the global optimum.
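The mechanic behind random search is simple enough to sketch by hand: draw each parameter independently from its range, score it, keep the best. This is an illustration of the idea, not vecalpha's internals; `backtest`, `strategy`, and `data` are the same assumed names as above:

```python
import random

param_distributions = {
    'short_ma': (10, 50),
    'long_ma': (50, 200),
}

best_params, best_sharpe = None, float('-inf')
for _ in range(100):  # n_iterations
    # Draw each integer parameter uniformly from its (low, high) range
    candidate = {name: random.randint(low, high)
                 for name, (low, high) in param_distributions.items()}
    sharpe = backtest(strategy, data, params=candidate).sharpe
    if sharpe > best_sharpe:
        best_params, best_sharpe = candidate, sharpe
```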
### Bayesian Optimization

Intelligent search using probabilistic models:
```python
from vecalpha import BayesianOptimization

bo = BayesianOptimization(
    strategy, data,
    n_iterations=50,
    param_space={
        'short_ma': (10, 50),
        'long_ma': (50, 200),
    }
)
results = bo.run()
```
**Pros:** Efficient; good for expensive-to-evaluate functions.

**Cons:** Complex; assumes a smooth objective.
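The core loop fits a probabilistic surrogate (typically a Gaussian process) to the evaluations seen so far, then uses an acquisition function to pick the most promising point to try next. To see the idea outside vecalpha, here is a sketch using scikit-optimize (an assumed extra dependency, not part of this library; `backtest`, `strategy`, and `data` as before):

```python
from skopt import gp_minimize

def objective(x):
    short_ma, long_ma = x
    # gp_minimize minimizes, so negate the Sharpe ratio
    return -backtest(strategy, data,
                     params={'short_ma': short_ma, 'long_ma': long_ma}).sharpe

result = gp_minimize(
    objective,
    dimensions=[(10, 50), (50, 200)],  # (low, high) per parameter
    n_calls=50,                        # 50 evaluations instead of a full grid
)
print(f"Best params: {result.x}, best Sharpe: {-result.fun:.2f}")
```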
## Avoiding Overfitting

### 1. Train/Test Split

Never optimize and test on the same data:
```python
# Split data
train_data = data['2020-01-01':'2022-12-31']  # 3 years for training
test_data = data['2023-01-01':'2023-12-31']   # 1 year for testing

# Optimize on the training set
best_params = optimize(strategy, train_data)

# Evaluate on the test set (only once!)
final_sharpe = backtest(strategy, test_data, params=best_params).sharpe
```
### 2. Walk-Forward Optimization

The robust approach:
```python
from vecalpha import WalkForwardOptimization

wfo = WalkForwardOptimization(
    train_period='2Y',
    test_period='6M',
    n_splits=5
)
results = wfo.run(strategy, data)

# Each split has its own optimized params
for i, split in enumerate(results.splits):
    print(f"Split {i}: OOS Sharpe = {split.test_sharpe:.2f}")

# Average out-of-sample performance
print(f"Average OOS Sharpe: {results.avg_oos_sharpe:.2f}")
```
### 3. Parameter Stability

Good parameters are stable: nearby values should perform similarly.
```python
import numpy as np

# Check parameter sensitivity around the chosen value
base_params = {'ma_period': 50}

sensitivity_results = []
for delta in [-10, -5, 0, 5, 10]:
    params = {'ma_period': base_params['ma_period'] + delta}
    result = backtest(strategy, data, params=params)
    sensitivity_results.append(result.sharpe)

# If Sharpe varies wildly with small param changes -> unstable
std_dev = np.std(sensitivity_results)
if std_dev > 0.3:
    print("WARNING: Parameters are unstable")
```
### 4. Complexity Penalty

More parameters = more risk of overfitting:
```python
import numpy as np

# Information criteria (AIC/BIC) penalize complexity.
# count_parameters and log_likelihood are placeholders: count your strategy's
# free parameters and compute the log-likelihood of its fitted returns model.
n_params = count_parameters(strategy)
n_obs = len(data)

aic = -2 * log_likelihood + 2 * n_params
bic = -2 * log_likelihood + n_params * np.log(n_obs)

# Prefer strategies with lower AIC/BIC
```
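A toy comparison shows how the penalty plays out; the log-likelihoods below are made-up numbers purely for illustration:

```python
import numpy as np

n_obs = 1000
# Hypothetical fits: the complex strategy fits slightly better (higher
# log-likelihood) but uses four times as many parameters
candidates = [('simple', -1400.0, 2), ('complex', -1395.0, 8)]

for name, log_likelihood, n_params in candidates:
    aic = -2 * log_likelihood + 2 * n_params
    bic = -2 * log_likelihood + n_params * np.log(n_obs)
    print(f"{name}: AIC={aic:.1f}, BIC={bic:.1f}")

# simple:  AIC=2804.0, BIC=2813.8
# complex: AIC=2806.0, BIC=2845.3  -> the small fit gain doesn't pay for
# the extra parameters; prefer the simple strategy
```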
## Multi-Objective Optimization

Optimize for multiple goals:
```python
from vecalpha import MultiObjectiveOptimization

moo = MultiObjectiveOptimization(
    strategy, data,
    objectives=['return', 'sharpe', 'max_drawdown'],
    weights=[0.4, 0.4, 0.2]  # Priorities
)

# Returns Pareto-optimal solutions
pareto_front = moo.run()
```
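A common way to collapse several objectives into one score is a weighted sum over normalized metrics; drawdown enters with a negative sign because it is a cost. A sketch of that scalarization (illustrative numbers, not necessarily how vecalpha combines objectives internally):

```python
import numpy as np

# One row per candidate parameter set (illustrative values)
returns   = np.array([0.30, 0.22, 0.18])
sharpes   = np.array([1.10, 1.40, 1.20])
drawdowns = np.array([0.25, 0.15, 0.10])

def normalize(x):
    # Rescale each metric to [0, 1] so the weights are comparable
    return (x - x.min()) / (x.max() - x.min())

score = (0.4 * normalize(returns)
         + 0.4 * normalize(sharpes)
         - 0.2 * normalize(drawdowns))
print(f"Best candidate: #{score.argmax()}")  # candidate 1 wins here
```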
### Objective Trade-offs

| Objective | Trade-off |
|---|---|
| Return ↑ | Higher risk, more volatile |
| Sharpe ↑ | Smoother returns, but lower absolute return |
| Drawdown ↓ | Lower risk, but may miss opportunities |
## Optimization Workflow

### Step 1: Define Parameter Space
```python
param_space = {
    'short_window': (5, 30),      # Short MA
    'long_window': (30, 100),     # Long MA
    'stop_loss': (0.02, 0.10),    # Stop loss %
    'take_profit': (0.03, 0.15),  # Take profit %
}
```
### Step 2: Choose Metric

```python
# For trend-following strategies
metric = 'sharpe_ratio'

# For strategies where downside risk matters most
metric = 'sortino_ratio'

# For drawdown-averse strategies
metric = 'calmar_ratio'
```
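For reference, here are the three metrics computed from a daily returns series. These are the common definitions, with the risk-free rate taken as zero and 252 trading days per year assumed:

```python
import numpy as np

def sharpe_ratio(returns):
    # Mean return over total volatility, annualized
    return np.mean(returns) / np.std(returns) * np.sqrt(252)

def sortino_ratio(returns):
    # Like Sharpe, but only downside volatility counts against you
    downside = returns[returns < 0]
    return np.mean(returns) / np.std(downside) * np.sqrt(252)

def calmar_ratio(returns):
    # Annualized return over the worst peak-to-trough drawdown
    equity = np.cumprod(1 + returns)
    max_drawdown = np.max(1 - equity / np.maximum.accumulate(equity))
    annual_return = equity[-1] ** (252 / len(returns)) - 1
    return annual_return / max_drawdown
```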
### Step 3: Run Walk-Forward
```python
wfo = WalkForwardOptimization(
    train_period='2Y',
    test_period='6M',
    param_space=param_space,
    metric=metric
)
results = wfo.run(strategy, data)
```
### Step 4: Analyze Results
```python
# Check consistency across splits
for i, split in enumerate(results.splits):
    print(f"Split {i} train Sharpe: {split.train_sharpe:.2f}")
    print(f"Split {i} test Sharpe:  {split.test_sharpe:.2f}")
    print(f"Degradation: {(1 - split.test_sharpe / split.train_sharpe) * 100:.1f}%")
```
### Step 5: Validate on Holdout
```python
# Final validation on unseen data
holdout_data = data['2024-01-01':]
final_result = backtest(strategy, holdout_data, params=results.best_params)
print(f"Holdout Sharpe: {final_result.sharpe:.2f}")
```
## Red Flags

Signs of overfitting:
| Symptom | Meaning |
|---|---|
| Train Sharpe >> Test Sharpe | Fitted to noise |
| Performance cliff with small param changes | Unstable |
| Too many optimized parameters | Data mining |
| Perfect equity curve | Unrealistic |
| No losing months | Too good to be true |
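The first two symptoms are easy to check automatically; a sketch with rule-of-thumb thresholds (the 50% degradation cut-off is an assumption, not a vecalpha default):

```python
def overfitting_red_flags(train_sharpe, test_sharpe, sensitivity_std):
    """Flag common overfitting symptoms (thresholds are rules of thumb)."""
    flags = []
    if test_sharpe < 0.5 * train_sharpe:  # assumed 50% degradation cut-off
        flags.append("Test Sharpe collapsed vs train: likely fitted to noise")
    if sensitivity_std > 0.3:  # same threshold as the stability check above
        flags.append("Sharpe unstable under small parameter changes")
    return flags

for warning in overfitting_red_flags(train_sharpe=2.4, test_sharpe=0.6,
                                     sensitivity_std=0.45):
    print("WARNING:", warning)
```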
## Best Practices Summary
- **Start simple** - fewer parameters, clearer logic
- **Use out-of-sample testing** - always hold back data
- **Check parameter stability** - robust parameters survive small changes
- **Expect degradation** - live performance < backtest performance
- **Economic rationale** - parameters should make sense
## Next Steps
- Live Trading - Deploy your optimized strategy
- Risk Management - Protect your capital