```sh
npx skills add https://github.com/wshobson/agents --skill backtesting-frameworks
```

## How Backtesting Frameworks fits into a Paperclip company
Backtesting Frameworks drops into any Paperclip agent that handles this kind of work. Assign it to a specialist inside a pre-configured PaperclipOrg company and the skill becomes available on every heartbeat — no prompt engineering, no tool wiring.
SKILL.md (657 lines)
---
name: backtesting-frameworks
description: Build robust backtesting systems for trading strategies with proper handling of look-ahead bias, survivorship bias, and transaction costs. Use when developing trading algorithms, validating strategies, or building backtesting infrastructure.
---

# Backtesting Frameworks

Build robust, production-grade backtesting systems that avoid common pitfalls and produce reliable strategy performance estimates.

## When to Use This Skill

- Developing trading strategy backtests
- Building backtesting infrastructure
- Validating strategy performance
- Avoiding common backtesting biases
- Implementing walk-forward analysis
- Comparing strategy alternatives

## Core Concepts

### 1. Backtesting Biases

| Bias             | Description               | Mitigation              |
| ---------------- | ------------------------- | ----------------------- |
| **Look-ahead**   | Using future information  | Point-in-time data      |
| **Survivorship** | Only testing on survivors | Use delisted securities |
| **Overfitting**  | Curve-fitting to history  | Out-of-sample testing   |
| **Selection**    | Cherry-picking strategies | Pre-registration        |
| **Transaction**  | Ignoring trading costs    | Realistic cost models   |

### 2. Proper Backtest Structure

```
Historical Data
      │
      ▼
┌─────────────────────────────────────────┐
│              Training Set               │
│  (Strategy Development & Optimization)  │
└─────────────────────────────────────────┘
      │
      ▼
┌─────────────────────────────────────────┐
│             Validation Set              │
│   (Parameter Selection, No Peeking)     │
└─────────────────────────────────────────┘
      │
      ▼
┌─────────────────────────────────────────┐
│                Test Set                 │
│     (Final Performance Evaluation)      │
└─────────────────────────────────────────┘
```

### 3. Walk-Forward Analysis

```
Window 1: [Train──────][Test]
Window 2:       [Train──────][Test]
Window 3:             [Train──────][Test]
Window 4:                   [Train──────][Test]
                                        ─────▶ Time
```

## Implementation Patterns

### Pattern 1: Event-Driven Backtester

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from datetime import datetime
from decimal import Decimal
from enum import Enum
from typing import Dict, List, Optional

import pandas as pd
import numpy as np


class OrderSide(Enum):
    BUY = "buy"
    SELL = "sell"


class OrderType(Enum):
    MARKET = "market"
    LIMIT = "limit"
    STOP = "stop"


@dataclass
class Order:
    symbol: str
    side: OrderSide
    quantity: Decimal
    order_type: OrderType
    limit_price: Optional[Decimal] = None
    stop_price: Optional[Decimal] = None
    timestamp: Optional[datetime] = None


@dataclass
class Fill:
    order: Order
    fill_price: Decimal
    fill_quantity: Decimal
    commission: Decimal
    slippage: Decimal
    timestamp: datetime


@dataclass
class Position:
    symbol: str
    quantity: Decimal = Decimal("0")
    avg_cost: Decimal = Decimal("0")
    realized_pnl: Decimal = Decimal("0")

    def update(self, fill: Fill) -> None:
        if fill.order.side == OrderSide.BUY:
            new_quantity = self.quantity + fill.fill_quantity
            if new_quantity != 0:
                self.avg_cost = (
                    (self.quantity * self.avg_cost
                     + fill.fill_quantity * fill.fill_price)
                    / new_quantity
                )
            self.quantity = new_quantity
        else:
            self.realized_pnl += fill.fill_quantity * (fill.fill_price - self.avg_cost)
            self.quantity -= fill.fill_quantity


@dataclass
class Portfolio:
    cash: Decimal
    positions: Dict[str, Position] = field(default_factory=dict)

    def get_position(self, symbol: str) -> Position:
        if symbol not in self.positions:
            self.positions[symbol] = Position(symbol=symbol)
        return self.positions[symbol]

    def process_fill(self, fill: Fill) -> None:
        position = self.get_position(fill.order.symbol)
        position.update(fill)
        if fill.order.side == OrderSide.BUY:
            self.cash -= fill.fill_price * fill.fill_quantity + fill.commission
        else:
            self.cash += fill.fill_price * fill.fill_quantity - fill.commission

    def get_equity(self, prices: Dict[str, Decimal]) -> Decimal:
        equity = self.cash
        for symbol, position in self.positions.items():
            if position.quantity != 0 and symbol in prices:
                equity += position.quantity * prices[symbol]
        return equity


class Strategy(ABC):
    @abstractmethod
    def on_bar(self, timestamp: datetime, data: pd.DataFrame) -> List[Order]:
        pass

    @abstractmethod
    def on_fill(self, fill: Fill) -> None:
        pass


class ExecutionModel(ABC):
    @abstractmethod
    def execute(self, order: Order, bar: pd.Series) -> Optional[Fill]:
        pass


class SimpleExecutionModel(ExecutionModel):
    def __init__(self, slippage_bps: float = 10, commission_per_share: float = 0.01):
        self.slippage_bps = slippage_bps
        self.commission_per_share = commission_per_share

    def execute(self, order: Order, bar: pd.Series) -> Optional[Fill]:
        if order.order_type == OrderType.MARKET:
            base_price = Decimal(str(bar["open"]))

            # Apply slippage: buys fill above the quoted price, sells below
            slippage_mult = 1 + (self.slippage_bps / 10000)
            if order.side == OrderSide.BUY:
                fill_price = base_price * Decimal(str(slippage_mult))
            else:
                fill_price = base_price / Decimal(str(slippage_mult))

            commission = order.quantity * Decimal(str(self.commission_per_share))
            slippage = abs(fill_price - base_price) * order.quantity

            return Fill(
                order=order,
                fill_price=fill_price,
                fill_quantity=order.quantity,
                commission=commission,
                slippage=slippage,
                timestamp=bar.name
            )
        return None


class Backtester:
    def __init__(
        self,
        strategy: Strategy,
        execution_model: ExecutionModel,
        initial_capital: Decimal = Decimal("100000")
    ):
        self.strategy = strategy
        self.execution_model = execution_model
        self.portfolio = Portfolio(cash=initial_capital)
        self.equity_curve: List[tuple] = []
        self.trades: List[Fill] = []

    def run(self, data: pd.DataFrame) -> pd.DataFrame:
        """Run backtest on OHLCV data with DatetimeIndex."""
        pending_orders: List[Order] = []

        for timestamp, bar in data.iterrows():
            # Execute pending orders at today's prices
            for order in pending_orders:
                fill = self.execution_model.execute(order, bar)
                if fill:
                    self.portfolio.process_fill(fill)
                    self.strategy.on_fill(fill)
                    self.trades.append(fill)
            pending_orders.clear()

            # Mark every open position at this bar's close
            # (single-instrument data: one close price covers all positions)
            prices = {symbol: Decimal(str(bar["close"]))
                      for symbol in self.portfolio.positions}
            equity = self.portfolio.get_equity(prices)
            self.equity_curve.append((timestamp, float(equity)))

            # Generate new orders for next bar (only data up to "now" is visible)
            new_orders = self.strategy.on_bar(timestamp, data.loc[:timestamp])
            pending_orders.extend(new_orders)

        return self._create_results()

    def _create_results(self) -> pd.DataFrame:
        equity_df = pd.DataFrame(self.equity_curve, columns=["timestamp", "equity"])
        equity_df.set_index("timestamp", inplace=True)
        equity_df["returns"] = equity_df["equity"].pct_change()
        return equity_df
```

### Pattern 2: Vectorized Backtester (Fast)

```python
import pandas as pd
import numpy as np
from typing import Callable, Dict, Any


class VectorizedBacktester:
    """Fast vectorized backtester for simple strategies."""

    def __init__(
        self,
        initial_capital: float = 100000,
        commission: float = 0.001,  # 0.1%
        slippage: float = 0.0005    # 0.05%
    ):
        self.initial_capital = initial_capital
        self.commission = commission
        self.slippage = slippage

    def run(
        self,
        prices: pd.DataFrame,
        signal_func: Callable[[pd.DataFrame], pd.Series]
    ) -> Dict[str, Any]:
        """
        Run backtest with signal function.

        Args:
            prices: DataFrame with 'close' column
            signal_func: Function that returns position signals (-1, 0, 1)

        Returns:
            Dictionary with results
        """
        # Generate signals (shifted one bar to avoid look-ahead)
        signals = signal_func(prices).shift(1).fillna(0)

        # Calculate returns
        returns = prices["close"].pct_change()

        # Calculate strategy returns with costs
        position_changes = signals.diff().abs()
        trading_costs = position_changes * (self.commission + self.slippage)
        strategy_returns = signals * returns - trading_costs

        # Build equity curve
        equity = (1 + strategy_returns).cumprod() * self.initial_capital

        # Calculate metrics
        results = {
            "equity": equity,
            "returns": strategy_returns,
            "signals": signals,
            "metrics": self._calculate_metrics(strategy_returns, equity)
        }
        return results

    def _calculate_metrics(
        self,
        returns: pd.Series,
        equity: pd.Series
    ) -> Dict[str, float]:
        """Calculate performance metrics."""
        total_return = (equity.iloc[-1] / self.initial_capital) - 1
        annual_return = (1 + total_return) ** (252 / len(returns)) - 1
        annual_vol = returns.std() * np.sqrt(252)
        sharpe = annual_return / annual_vol if annual_vol > 0 else 0

        # Drawdown
        rolling_max = equity.cummax()
        drawdown = (equity - rolling_max) / rolling_max
        max_drawdown = drawdown.min()

        # Win rate
        winning_days = (returns > 0).sum()
        total_days = (returns != 0).sum()
        win_rate = winning_days / total_days if total_days > 0 else 0

        return {
            "total_return": total_return,
            "annual_return": annual_return,
            "annual_volatility": annual_vol,
            "sharpe_ratio": sharpe,
            "max_drawdown": max_drawdown,
            "win_rate": win_rate,
            "num_trades": int((returns != 0).sum())
        }


# Example usage
def momentum_signal(prices: pd.DataFrame, lookback: int = 20) -> pd.Series:
    """Simple momentum strategy: long when price > SMA, else flat."""
    sma = prices["close"].rolling(lookback).mean()
    return (prices["close"] > sma).astype(int)


# Run backtest
# backtester = VectorizedBacktester()
# results = backtester.run(price_data, lambda p: momentum_signal(p, 50))
```

### Pattern 3: Walk-Forward Optimization

```python
from itertools import product
from typing import Callable, Dict, List, Optional, Tuple, Any

import pandas as pd
import numpy as np


class WalkForwardOptimizer:
    """Walk-forward analysis with anchored or rolling windows."""

    def __init__(
        self,
        train_period: int,
        test_period: int,
        anchored: bool = False,
        n_splits: Optional[int] = None
    ):
        """
        Args:
            train_period: Number of bars in training window
            test_period: Number of bars in test window
            anchored: If True, training always starts from the beginning
            n_splits: Number of train/test splits (auto-calculated if None)
        """
        self.train_period = train_period
        self.test_period = test_period
        self.anchored = anchored
        self.n_splits = n_splits

    def generate_splits(
        self,
        data: pd.DataFrame
    ) -> List[Tuple[pd.DataFrame, pd.DataFrame]]:
        """Generate train/test splits."""
        splits = []
        n = len(data)

        if self.n_splits:
            step = (n - self.train_period) // self.n_splits
        else:
            step = self.test_period

        start = 0
        while start + self.train_period + self.test_period <= n:
            if self.anchored:
                train_start = 0
            else:
                train_start = start

            train_end = start + self.train_period
            test_end = min(train_end + self.test_period, n)

            train_data = data.iloc[train_start:train_end]
            test_data = data.iloc[train_end:test_end]
            splits.append((train_data, test_data))

            start += step

        return splits

    def optimize(
        self,
        data: pd.DataFrame,
        strategy_func: Callable,
        param_grid: Dict[str, List],
        metric: str = "sharpe_ratio"
    ) -> Dict[str, Any]:
        """
        Run walk-forward optimization.

        Args:
            data: Full dataset
            strategy_func: Function(data, **params) -> results dict
            param_grid: Parameter combinations to test
            metric: Metric to optimize

        Returns:
            Combined results from all test periods
        """
        splits = self.generate_splits(data)
        all_results = []
        optimal_params_history = []

        for i, (train_data, test_data) in enumerate(splits):
            # Optimize on training data only
            best_params, best_metric = self._grid_search(
                train_data, strategy_func, param_grid, metric
            )
            optimal_params_history.append(best_params)

            # Evaluate out-of-sample with the optimal params
            test_results = strategy_func(test_data, **best_params)
            test_results["split"] = i
            test_results["params"] = best_params
            all_results.append(test_results)

            print(f"Split {i+1}/{len(splits)}: "
                  f"Best {metric}={best_metric:.4f}, params={best_params}")

        return {
            "split_results": all_results,
            "param_history": optimal_params_history,
            "combined_equity": self._combine_equity_curves(all_results)
        }

    def _grid_search(
        self,
        data: pd.DataFrame,
        strategy_func: Callable,
        param_grid: Dict[str, List],
        metric: str
    ) -> Tuple[Dict, float]:
        """Grid search for best parameters."""
        best_params = None
        best_metric = -np.inf

        # Generate all parameter combinations
        param_names = list(param_grid.keys())
        param_values = list(param_grid.values())

        for values in product(*param_values):
            params = dict(zip(param_names, values))
            results = strategy_func(data, **params)

            if results["metrics"][metric] > best_metric:
                best_metric = results["metrics"][metric]
                best_params = params

        return best_params, best_metric

    def _combine_equity_curves(
        self,
        results: List[Dict]
    ) -> pd.Series:
        """Combine equity curves from all test periods."""
        combined = pd.concat([r["equity"] for r in results])
        return combined
```

### Pattern 4: Monte Carlo Analysis

```python
import numpy as np
import pandas as pd
from typing import Dict, List, Optional


class MonteCarloAnalyzer:
    """Monte Carlo simulation for strategy robustness."""

    def __init__(self, n_simulations: int = 1000, confidence: float = 0.95):
        self.n_simulations = n_simulations
        self.confidence = confidence

    def bootstrap_returns(
        self,
        returns: pd.Series,
        n_periods: Optional[int] = None
    ) -> np.ndarray:
        """
        Bootstrap simulation by resampling returns.

        Args:
            returns: Historical returns series
            n_periods: Length of each simulation (default: same as input)

        Returns:
            Array of shape (n_simulations, n_periods)
        """
        if n_periods is None:
            n_periods = len(returns)

        simulations = np.zeros((self.n_simulations, n_periods))
        for i in range(self.n_simulations):
            # Resample with replacement
            simulated_returns = np.random.choice(
                returns.values, size=n_periods, replace=True
            )
            simulations[i] = simulated_returns

        return simulations

    def analyze_drawdowns(
        self,
        returns: pd.Series
    ) -> Dict[str, float]:
        """Analyze drawdown distribution via simulation."""
        simulations = self.bootstrap_returns(returns)
        max_drawdowns = []

        for sim_returns in simulations:
            equity = (1 + sim_returns).cumprod()
            rolling_max = np.maximum.accumulate(equity)
            drawdowns = (equity - rolling_max) / rolling_max
            max_drawdowns.append(drawdowns.min())

        max_drawdowns = np.array(max_drawdowns)

        return {
            "expected_max_dd": np.mean(max_drawdowns),
            "median_max_dd": np.median(max_drawdowns),
            f"worst_{int(self.confidence*100)}pct": np.percentile(
                max_drawdowns, (1 - self.confidence) * 100
            ),
            "worst_case": max_drawdowns.min()
        }

    def probability_of_loss(
        self,
        returns: pd.Series,
        holding_periods: List[int] = [21, 63, 126, 252]
    ) -> Dict[int, float]:
        """Calculate probability of loss over various holding periods."""
        results = {}

        for period in holding_periods:
            if period > len(returns):
                continue
            simulations = self.bootstrap_returns(returns, period)
            total_returns = (1 + simulations).prod(axis=1) - 1
            prob_loss = (total_returns < 0).mean()
            results[period] = prob_loss

        return results

    def confidence_interval(
        self,
        returns: pd.Series,
        periods: int = 252
    ) -> Dict[str, float]:
        """Calculate confidence interval for future returns."""
        simulations = self.bootstrap_returns(returns, periods)
        total_returns = (1 + simulations).prod(axis=1) - 1

        lower = (1 - self.confidence) / 2
        upper = 1 - lower

        return {
            "expected": total_returns.mean(),
            "lower_bound": np.percentile(total_returns, lower * 100),
            "upper_bound": np.percentile(total_returns, upper * 100),
            "std": total_returns.std()
        }
```

## Performance Metrics

```python
def calculate_metrics(returns: pd.Series, rf_rate: float = 0.02) -> Dict[str, float]:
    """Calculate comprehensive performance metrics."""
    # Annualization factor (assuming daily returns)
    ann_factor = 252

    # Basic metrics
    total_return = (1 + returns).prod() - 1
    annual_return = (1 + total_return) ** (ann_factor / len(returns)) - 1
    annual_vol = returns.std() * np.sqrt(ann_factor)

    # Risk-adjusted returns
    sharpe = (annual_return - rf_rate) / annual_vol if annual_vol > 0 else 0

    # Sortino (downside deviation)
    downside_returns = returns[returns < 0]
    downside_vol = downside_returns.std() * np.sqrt(ann_factor)
    sortino = (annual_return - rf_rate) / downside_vol if downside_vol > 0 else 0

    # Calmar ratio
    equity = (1 + returns).cumprod()
    rolling_max = equity.cummax()
    drawdowns = (equity - rolling_max) / rolling_max
    max_drawdown = drawdowns.min()
    calmar = annual_return / abs(max_drawdown) if max_drawdown != 0 else 0

    # Win rate and profit factor
    wins = returns[returns > 0]
    losses = returns[returns < 0]
    active = returns[returns != 0]
    win_rate = len(wins) / len(active) if len(active) > 0 else 0
    profit_factor = wins.sum() / abs(losses.sum()) if losses.sum() != 0 else np.inf

    return {
        "total_return": total_return,
        "annual_return": annual_return,
        "annual_volatility": annual_vol,
        "sharpe_ratio": sharpe,
        "sortino_ratio": sortino,
        "calmar_ratio": calmar,
        "max_drawdown": max_drawdown,
        "win_rate": win_rate,
        "profit_factor": profit_factor,
        "num_trades": int((returns != 0).sum())
    }
```

## Best Practices

### Do's

- **Use point-in-time data** - Avoid look-ahead bias
- **Include transaction costs** - Use realistic estimates
- **Test out-of-sample** - Always reserve data
- **Use walk-forward** - Not just a single train/test split
- **Monte Carlo analysis** - Understand uncertainty

### Don'ts

- **Don't overfit** - Limit parameters
- **Don't ignore survivorship** - Include delisted securities
- **Don't use adjusted data carelessly** - Understand the adjustments
- **Don't optimize on full history** - Reserve a test set
- **Don't ignore capacity** - Market impact matters
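The do's above compose into a single workflow: generate a signal, shift it one bar to avoid look-ahead, subtract trading costs, compute metrics, then bootstrap the drawdown distribution. The self-contained sketch below compresses that pipeline into one script on synthetic random-walk prices; the data, the 50-day lookback, and the 10 bps cost are illustrative assumptions, not part of the skill's API.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)

# Synthetic daily closes: random walk with mild drift (illustrative data only;
# substitute real point-in-time OHLCV in practice)
n_days = 1000
daily = rng.normal(0.0004, 0.01, n_days)
prices = pd.DataFrame({"close": 100 * np.cumprod(1 + daily)})

# 1. Signal: long when close is above its 50-day SMA, else flat
sma = prices["close"].rolling(50).mean()
signal = (prices["close"] > sma).astype(int)

# 2. Shift one bar so today's signal earns tomorrow's return (look-ahead guard)
position = signal.shift(1).fillna(0)

# 3. Strategy returns net of a 10 bps cost per position change
asset_returns = prices["close"].pct_change().fillna(0)
costs = position.diff().abs().fillna(0) * 0.001
strategy_returns = position * asset_returns - costs

# 4. Headline metrics
equity = (1 + strategy_returns).cumprod()
annual_return = equity.iloc[-1] ** (252 / n_days) - 1
annual_vol = strategy_returns.std() * np.sqrt(252)
sharpe = annual_return / annual_vol if annual_vol > 0 else 0.0

# 5. Bootstrap the max-drawdown distribution (Pattern 4 in miniature)
sims = rng.choice(strategy_returns.values, size=(500, n_days), replace=True)
sim_equity = np.cumprod(1 + sims, axis=1)
peaks = np.maximum.accumulate(sim_equity, axis=1)
expected_max_dd = ((sim_equity - peaks) / peaks).min(axis=1).mean()

print(f"Sharpe: {sharpe:.2f}, expected max drawdown: {expected_max_dd:.1%}")
```

On real data the same steps run through `VectorizedBacktester` and `MonteCarloAnalyzer` above; the point of the condensed version is that every line maps to one of the do's.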