Claude Agent Skill · by Wshobson

Backtesting Frameworks

This framework builds event-driven and vectorized backtesting engines that handle the messy realities of strategy validation. It implements proper order execution with realistic transaction costs, and guards against look-ahead and survivorship bias.

Install
```shell
npx skills add https://github.com/wshobson/agents --skill backtesting-frameworks
```
Source file: SKILL.md (657 lines)
---
name: backtesting-frameworks
description: Build robust backtesting systems for trading strategies with proper handling of look-ahead bias, survivorship bias, and transaction costs. Use when developing trading algorithms, validating strategies, or building backtesting infrastructure.
---

# Backtesting Frameworks

Build robust, production-grade backtesting systems that avoid common pitfalls and produce reliable strategy performance estimates.

## When to Use This Skill

- Developing trading strategy backtests
- Building backtesting infrastructure
- Validating strategy performance
- Avoiding common backtesting biases
- Implementing walk-forward analysis
- Comparing strategy alternatives

## Core Concepts

### 1. Backtesting Biases

| Bias             | Description               | Mitigation              |
| ---------------- | ------------------------- | ----------------------- |
| **Look-ahead**   | Using future information  | Point-in-time data      |
| **Survivorship** | Only testing on survivors | Use delisted securities |
| **Overfitting**  | Curve-fitting to history  | Out-of-sample testing   |
| **Selection**    | Cherry-picking strategies | Pre-registration        |
| **Transaction**  | Ignoring trading costs    | Realistic cost models   |

### 2. Proper Backtest Structure

```
Historical Data
┌─────────────────────────────────────────┐
│              Training Set               │
│  (Strategy Development & Optimization)  │
└─────────────────────────────────────────┘
┌─────────────────────────────────────────┐
│             Validation Set              │
│  (Parameter Selection, No Peeking)      │
└─────────────────────────────────────────┘
┌─────────────────────────────────────────┐
│               Test Set                  │
│  (Final Performance Evaluation)         │
└─────────────────────────────────────────┘
```

### 3. Walk-Forward Analysis

```
Window 1: [Train──────][Test]
Window 2:     [Train──────][Test]
Window 3:         [Train──────][Test]
Window 4:             [Train──────][Test]
                                     ─────▶ Time
```

## Implementation Patterns

### Pattern 1: Event-Driven Backtester

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from datetime import datetime
from decimal import Decimal
from enum import Enum
from typing import Dict, List, Optional

import pandas as pd


class OrderSide(Enum):
    BUY = "buy"
    SELL = "sell"


class OrderType(Enum):
    MARKET = "market"
    LIMIT = "limit"
    STOP = "stop"


@dataclass
class Order:
    symbol: str
    side: OrderSide
    quantity: Decimal
    order_type: OrderType
    limit_price: Optional[Decimal] = None
    stop_price: Optional[Decimal] = None
    timestamp: Optional[datetime] = None


@dataclass
class Fill:
    order: Order
    fill_price: Decimal
    fill_quantity: Decimal
    commission: Decimal
    slippage: Decimal
    timestamp: datetime


@dataclass
class Position:
    symbol: str
    quantity: Decimal = Decimal("0")
    avg_cost: Decimal = Decimal("0")
    realized_pnl: Decimal = Decimal("0")

    def update(self, fill: Fill) -> None:
        if fill.order.side == OrderSide.BUY:
            new_quantity = self.quantity + fill.fill_quantity
            if new_quantity != 0:
                self.avg_cost = (
                    (self.quantity * self.avg_cost + fill.fill_quantity * fill.fill_price)
                    / new_quantity
                )
            self.quantity = new_quantity
        else:
            self.realized_pnl += fill.fill_quantity * (fill.fill_price - self.avg_cost)
            self.quantity -= fill.fill_quantity


@dataclass
class Portfolio:
    cash: Decimal
    positions: Dict[str, Position] = field(default_factory=dict)

    def get_position(self, symbol: str) -> Position:
        if symbol not in self.positions:
            self.positions[symbol] = Position(symbol=symbol)
        return self.positions[symbol]

    def process_fill(self, fill: Fill) -> None:
        position = self.get_position(fill.order.symbol)
        position.update(fill)

        if fill.order.side == OrderSide.BUY:
            self.cash -= fill.fill_price * fill.fill_quantity + fill.commission
        else:
            self.cash += fill.fill_price * fill.fill_quantity - fill.commission

    def get_equity(self, prices: Dict[str, Decimal]) -> Decimal:
        equity = self.cash
        for symbol, position in self.positions.items():
            if position.quantity != 0 and symbol in prices:
                equity += position.quantity * prices[symbol]
        return equity


class Strategy(ABC):
    @abstractmethod
    def on_bar(self, timestamp: datetime, data: pd.DataFrame) -> List[Order]:
        pass

    @abstractmethod
    def on_fill(self, fill: Fill) -> None:
        pass


class ExecutionModel(ABC):
    @abstractmethod
    def execute(self, order: Order, bar: pd.Series) -> Optional[Fill]:
        pass


class SimpleExecutionModel(ExecutionModel):
    def __init__(self, slippage_bps: float = 10, commission_per_share: float = 0.01):
        self.slippage_bps = slippage_bps
        self.commission_per_share = commission_per_share

    def execute(self, order: Order, bar: pd.Series) -> Optional[Fill]:
        if order.order_type == OrderType.MARKET:
            base_price = Decimal(str(bar["open"]))

            # Apply slippage: buys fill above the quote, sells below it
            slippage_mult = 1 + (self.slippage_bps / 10000)
            if order.side == OrderSide.BUY:
                fill_price = base_price * Decimal(str(slippage_mult))
            else:
                fill_price = base_price / Decimal(str(slippage_mult))

            commission = order.quantity * Decimal(str(self.commission_per_share))
            slippage = abs(fill_price - base_price) * order.quantity

            return Fill(
                order=order,
                fill_price=fill_price,
                fill_quantity=order.quantity,
                commission=commission,
                slippage=slippage,
                timestamp=bar.name,
            )
        return None


class Backtester:
    def __init__(
        self,
        strategy: Strategy,
        execution_model: ExecutionModel,
        initial_capital: Decimal = Decimal("100000"),
    ):
        self.strategy = strategy
        self.execution_model = execution_model
        self.portfolio = Portfolio(cash=initial_capital)
        self.equity_curve: List[tuple] = []
        self.trades: List[Fill] = []

    def run(self, data: pd.DataFrame, symbol: str) -> pd.DataFrame:
        """Run a single-symbol backtest on OHLCV data with a DatetimeIndex."""
        pending_orders: List[Order] = []

        for timestamp, bar in data.iterrows():
            # Execute orders generated on the previous bar at today's prices
            for order in pending_orders:
                fill = self.execution_model.execute(order, bar)
                if fill:
                    self.portfolio.process_fill(fill)
                    self.strategy.on_fill(fill)
                    self.trades.append(fill)

            pending_orders.clear()

            # Mark the portfolio to market at today's close
            prices = {symbol: Decimal(str(bar["close"]))}
            equity = self.portfolio.get_equity(prices)
            self.equity_curve.append((timestamp, float(equity)))

            # Generate new orders for the next bar, seeing only history up to today
            new_orders = self.strategy.on_bar(timestamp, data.loc[:timestamp])
            pending_orders.extend(new_orders)

        return self._create_results()

    def _create_results(self) -> pd.DataFrame:
        equity_df = pd.DataFrame(self.equity_curve, columns=["timestamp", "equity"])
        equity_df.set_index("timestamp", inplace=True)
        equity_df["returns"] = equity_df["equity"].pct_change()
        return equity_df
```

### Pattern 2: Vectorized Backtester (Fast)

```python
from typing import Any, Callable, Dict

import numpy as np
import pandas as pd


class VectorizedBacktester:
    """Fast vectorized backtester for simple strategies."""

    def __init__(
        self,
        initial_capital: float = 100000,
        commission: float = 0.001,  # 0.1%
        slippage: float = 0.0005,   # 0.05%
    ):
        self.initial_capital = initial_capital
        self.commission = commission
        self.slippage = slippage

    def run(
        self,
        prices: pd.DataFrame,
        signal_func: Callable[[pd.DataFrame], pd.Series],
    ) -> Dict[str, Any]:
        """
        Run backtest with signal function.

        Args:
            prices: DataFrame with 'close' column
            signal_func: Function that returns position signals (-1, 0, 1)

        Returns:
            Dictionary with results
        """
        # Generate signals (shifted one bar to avoid look-ahead)
        signals = signal_func(prices).shift(1).fillna(0)

        # Calculate market returns; fill the first bar's NaN so it cannot
        # propagate through the equity curve
        returns = prices["close"].pct_change().fillna(0)

        # Calculate strategy returns with costs
        position_changes = signals.diff().abs().fillna(0)
        trading_costs = position_changes * (self.commission + self.slippage)

        strategy_returns = signals * returns - trading_costs

        # Build equity curve
        equity = (1 + strategy_returns).cumprod() * self.initial_capital

        # Calculate metrics
        results = {
            "equity": equity,
            "returns": strategy_returns,
            "signals": signals,
            "metrics": self._calculate_metrics(strategy_returns, equity),
        }

        return results

    def _calculate_metrics(
        self,
        returns: pd.Series,
        equity: pd.Series,
    ) -> Dict[str, float]:
        """Calculate performance metrics."""
        total_return = (equity.iloc[-1] / self.initial_capital) - 1
        annual_return = (1 + total_return) ** (252 / len(returns)) - 1
        annual_vol = returns.std() * np.sqrt(252)
        sharpe = annual_return / annual_vol if annual_vol > 0 else 0

        # Drawdown
        rolling_max = equity.cummax()
        drawdown = (equity - rolling_max) / rolling_max
        max_drawdown = drawdown.min()

        # Win rate
        winning_days = (returns > 0).sum()
        total_days = (returns != 0).sum()
        win_rate = winning_days / total_days if total_days > 0 else 0

        return {
            "total_return": total_return,
            "annual_return": annual_return,
            "annual_volatility": annual_vol,
            "sharpe_ratio": sharpe,
            "max_drawdown": max_drawdown,
            "win_rate": win_rate,
            "num_trades": int((returns != 0).sum()),  # active days, not round trips
        }


# Example usage
def momentum_signal(prices: pd.DataFrame, lookback: int = 20) -> pd.Series:
    """Simple momentum strategy: long when price > SMA, else flat."""
    sma = prices["close"].rolling(lookback).mean()
    return (prices["close"] > sma).astype(int)


# Run backtest
# backtester = VectorizedBacktester()
# results = backtester.run(price_data, lambda p: momentum_signal(p, 50))
```

### Pattern 3: Walk-Forward Optimization

```python
from itertools import product
from typing import Any, Callable, Dict, List, Optional, Tuple

import numpy as np
import pandas as pd


class WalkForwardOptimizer:
    """Walk-forward analysis with anchored or rolling windows."""

    def __init__(
        self,
        train_period: int,
        test_period: int,
        anchored: bool = False,
        n_splits: Optional[int] = None,
    ):
        """
        Args:
            train_period: Number of bars in training window
            test_period: Number of bars in test window
            anchored: If True, training always starts from beginning
            n_splits: Number of train/test splits (auto-calculated if None)
        """
        self.train_period = train_period
        self.test_period = test_period
        self.anchored = anchored
        self.n_splits = n_splits

    def generate_splits(
        self,
        data: pd.DataFrame,
    ) -> List[Tuple[pd.DataFrame, pd.DataFrame]]:
        """Generate train/test splits."""
        splits = []
        n = len(data)

        if self.n_splits:
            # Guard against a zero step, which would loop forever
            step = max(1, (n - self.train_period) // self.n_splits)
        else:
            step = self.test_period

        start = 0
        while start + self.train_period + self.test_period <= n:
            if self.anchored:
                train_start = 0
            else:
                train_start = start

            train_end = start + self.train_period
            test_end = min(train_end + self.test_period, n)

            train_data = data.iloc[train_start:train_end]
            test_data = data.iloc[train_end:test_end]

            splits.append((train_data, test_data))
            start += step

        return splits

    def optimize(
        self,
        data: pd.DataFrame,
        strategy_func: Callable,
        param_grid: Dict[str, List],
        metric: str = "sharpe_ratio",
    ) -> Dict[str, Any]:
        """
        Run walk-forward optimization.

        Args:
            data: Full dataset
            strategy_func: Function(data, **params) -> results dict
            param_grid: Parameter combinations to test
            metric: Metric to optimize

        Returns:
            Combined results from all test periods
        """
        splits = self.generate_splits(data)
        all_results = []
        optimal_params_history = []

        for i, (train_data, test_data) in enumerate(splits):
            # Optimize on training data
            best_params, best_metric = self._grid_search(
                train_data, strategy_func, param_grid, metric
            )
            optimal_params_history.append(best_params)

            # Test with optimal params
            test_results = strategy_func(test_data, **best_params)
            test_results["split"] = i
            test_results["params"] = best_params
            all_results.append(test_results)

            print(f"Split {i+1}/{len(splits)}: "
                  f"Best {metric}={best_metric:.4f}, params={best_params}")

        return {
            "split_results": all_results,
            "param_history": optimal_params_history,
            "combined_equity": self._combine_equity_curves(all_results),
        }

    def _grid_search(
        self,
        data: pd.DataFrame,
        strategy_func: Callable,
        param_grid: Dict[str, List],
        metric: str,
    ) -> Tuple[Dict, float]:
        """Grid search for best parameters."""
        best_params = None
        best_metric = -np.inf

        # Generate all parameter combinations
        param_names = list(param_grid.keys())
        param_values = list(param_grid.values())

        for values in product(*param_values):
            params = dict(zip(param_names, values))
            results = strategy_func(data, **params)

            if results["metrics"][metric] > best_metric:
                best_metric = results["metrics"][metric]
                best_params = params

        return best_params, best_metric

    def _combine_equity_curves(
        self,
        results: List[Dict],
    ) -> pd.Series:
        """Combine equity curves from all test periods."""
        combined = pd.concat([r["equity"] for r in results])
        return combined
```

### Pattern 4: Monte Carlo Analysis

```python
from typing import Dict, List, Optional

import numpy as np
import pandas as pd


class MonteCarloAnalyzer:
    """Monte Carlo simulation for strategy robustness."""

    def __init__(self, n_simulations: int = 1000, confidence: float = 0.95):
        self.n_simulations = n_simulations
        self.confidence = confidence

    def bootstrap_returns(
        self,
        returns: pd.Series,
        n_periods: Optional[int] = None,
    ) -> np.ndarray:
        """
        Bootstrap simulation by resampling returns.

        Args:
            returns: Historical returns series
            n_periods: Length of each simulation (default: same as input)

        Returns:
            Array of shape (n_simulations, n_periods)
        """
        if n_periods is None:
            n_periods = len(returns)

        simulations = np.zeros((self.n_simulations, n_periods))

        for i in range(self.n_simulations):
            # Resample with replacement
            simulated_returns = np.random.choice(
                returns.values,
                size=n_periods,
                replace=True,
            )
            simulations[i] = simulated_returns

        return simulations

    def analyze_drawdowns(
        self,
        returns: pd.Series,
    ) -> Dict[str, float]:
        """Analyze drawdown distribution via simulation."""
        simulations = self.bootstrap_returns(returns)

        max_drawdowns = []
        for sim_returns in simulations:
            equity = (1 + sim_returns).cumprod()
            rolling_max = np.maximum.accumulate(equity)
            drawdowns = (equity - rolling_max) / rolling_max
            max_drawdowns.append(drawdowns.min())

        max_drawdowns = np.array(max_drawdowns)

        return {
            "expected_max_dd": np.mean(max_drawdowns),
            "median_max_dd": np.median(max_drawdowns),
            f"worst_{int(self.confidence*100)}pct": np.percentile(
                max_drawdowns, (1 - self.confidence) * 100
            ),
            "worst_case": max_drawdowns.min(),
        }

    def probability_of_loss(
        self,
        returns: pd.Series,
        holding_periods: List[int] = [21, 63, 126, 252],
    ) -> Dict[int, float]:
        """Calculate probability of loss over various holding periods."""
        results = {}

        for period in holding_periods:
            if period > len(returns):
                continue

            simulations = self.bootstrap_returns(returns, period)
            total_returns = (1 + simulations).prod(axis=1) - 1
            prob_loss = (total_returns < 0).mean()
            results[period] = prob_loss

        return results

    def confidence_interval(
        self,
        returns: pd.Series,
        periods: int = 252,
    ) -> Dict[str, float]:
        """Calculate confidence interval for future returns."""
        simulations = self.bootstrap_returns(returns, periods)
        total_returns = (1 + simulations).prod(axis=1) - 1

        lower = (1 - self.confidence) / 2
        upper = 1 - lower

        return {
            "expected": total_returns.mean(),
            "lower_bound": np.percentile(total_returns, lower * 100),
            "upper_bound": np.percentile(total_returns, upper * 100),
            "std": total_returns.std(),
        }
```

## Performance Metrics

```python
from typing import Dict

import numpy as np
import pandas as pd


def calculate_metrics(returns: pd.Series, rf_rate: float = 0.02) -> Dict[str, float]:
    """Calculate comprehensive performance metrics."""
    # Annualization factor (assuming daily returns)
    ann_factor = 252

    # Basic metrics
    total_return = (1 + returns).prod() - 1
    annual_return = (1 + total_return) ** (ann_factor / len(returns)) - 1
    annual_vol = returns.std() * np.sqrt(ann_factor)

    # Risk-adjusted returns
    sharpe = (annual_return - rf_rate) / annual_vol if annual_vol > 0 else 0

    # Sortino (downside deviation)
    downside_returns = returns[returns < 0]
    downside_vol = downside_returns.std() * np.sqrt(ann_factor)
    sortino = (annual_return - rf_rate) / downside_vol if downside_vol > 0 else 0

    # Calmar ratio
    equity = (1 + returns).cumprod()
    rolling_max = equity.cummax()
    drawdowns = (equity - rolling_max) / rolling_max
    max_drawdown = drawdowns.min()
    calmar = annual_return / abs(max_drawdown) if max_drawdown != 0 else 0

    # Win rate and profit factor
    wins = returns[returns > 0]
    losses = returns[returns < 0]
    win_rate = len(wins) / len(returns[returns != 0]) if len(returns[returns != 0]) > 0 else 0
    profit_factor = wins.sum() / abs(losses.sum()) if losses.sum() != 0 else np.inf

    return {
        "total_return": total_return,
        "annual_return": annual_return,
        "annual_volatility": annual_vol,
        "sharpe_ratio": sharpe,
        "sortino_ratio": sortino,
        "calmar_ratio": calmar,
        "max_drawdown": max_drawdown,
        "win_rate": win_rate,
        "profit_factor": profit_factor,
        "num_trades": int((returns != 0).sum()),
    }
```

## Best Practices

### Do's

- **Use point-in-time data** - Avoid look-ahead bias
- **Include transaction costs** - Realistic estimates
- **Test out-of-sample** - Always reserve data
- **Use walk-forward** - Not just train/test
- **Monte Carlo analysis** - Understand uncertainty

### Don'ts

- **Don't overfit** - Limit parameters
- **Don't ignore survivorship** - Include delisted
- **Don't use adjusted data carelessly** - Understand adjustments
- **Don't optimize on full history** - Reserve test set
- **Don't ignore capacity** - Market impact matters
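The first Do — point-in-time data — is easy to demonstrate numerically. A self-contained sketch (the returns are synthetic and purely illustrative): it builds a "signal" with perfect foresight and scores it twice, once applied to the same bar it was computed from, and once shifted forward a bar the way the vectorized backtester shifts its signals. The gap between the two numbers is pure look-ahead bias:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
returns = pd.Series(rng.normal(0.0, 0.01, 1000))  # synthetic daily returns

# "Signal" = sign of the same day's return: perfect foresight
signal = np.sign(returns)

# Biased: trade on the bar the signal was computed from
lookahead_pnl = (signal * returns).sum()  # equals the sum of |returns|

# Honest: the signal can only be acted on one bar later
honest_pnl = (signal.shift(1) * returns).sum()

print(f"with look-ahead: {lookahead_pnl:.2f}  without: {honest_pnl:.2f}")
```

Any strategy whose backtest collapses after a one-bar shift like this was trading on information it could not have had.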
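The drawdown formula used in both metrics functions (equity measured against its running maximum) can be sanity-checked on a series small enough to verify by hand:

```python
import pandas as pd

equity = pd.Series([100.0, 120.0, 90.0, 110.0, 80.0])

rolling_max = equity.cummax()                    # 100, 120, 120, 120, 120
drawdown = (equity - rolling_max) / rolling_max
max_drawdown = drawdown.min()                    # peak 120 -> trough 80

print(f"max drawdown: {max_drawdown:.4f}")       # -0.3333
```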
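The bootstrap at the heart of the Monte Carlo pattern reduces to a few lines of NumPy. A minimal standalone sketch of a bootstrapped confidence interval (the return distribution, horizon, and seed here are illustrative assumptions, not outputs of the classes above):

```python
import numpy as np

rng = np.random.default_rng(7)
daily = rng.normal(0.0004, 0.01, 750)   # synthetic daily returns

# Resample with replacement: 2000 one-year (252-bar) paths
samples = rng.choice(daily, size=(2000, 252), replace=True)
total = (1 + samples).prod(axis=1) - 1  # compounded return per path

lower, upper = np.percentile(total, [2.5, 97.5])
print(f"95% CI for 1-year return: [{lower:.1%}, {upper:.1%}]")
```

The same resampling generalizes directly to drawdown distributions and loss probabilities by computing those statistics per path instead of the compounded return.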