bobscheller/backtest-service


The logic in this service has not yet been fully validated. The service runs, but its results should be reviewed and validated independently. I will continue to add test scenarios and observability to build trust in the output, but the primary focus for now is the agentic workflow this service supports. It is not directing real investments at this time.

Agent Backtesting Service

A production-grade backtesting engine for the agent-trading-firm ecosystem. Runs event-driven strategy simulations against historical OHLCV data fetched from the market-data-service, computes a comprehensive set of performance metrics, and optionally validates strategies using walk-forward analysis or Black-Scholes options simulation.

I/O contract: JSON → stdout · Rich progress → stderr · Exit 0 = success, 1 = soft failure (threshold breach), 2+ = error.
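A caller can act on this contract roughly as follows. This is a minimal sketch, not the service's own wrapper: `run_json_cli` is a hypothetical helper, and it assumes a soft failure (exit 1) still writes valid JSON to stdout. The demo command is a stand-in so the snippet runs without the backtest CLI installed.

```python
import json
import subprocess
import sys


def run_json_cli(argv):
    """Run a command, parse JSON from stdout, and map the exit code."""
    proc = subprocess.run(argv, capture_output=True, text=True)
    if proc.returncode >= 2:
        # Hard error: stdout may not hold valid JSON, so surface stderr instead.
        raise RuntimeError(f"exit {proc.returncode}: {proc.stderr.strip()}")
    payload = json.loads(proc.stdout)
    # Exit 1 is treated as a soft failure: the payload is returned for inspection.
    return payload, proc.returncode == 0


# Stand-in command that prints a JSON object and exits 0.
demo = [sys.executable, "-c", 'import json; print(json.dumps({"passed": True}))']
report, ok = run_json_cli(demo)
```

Keeping progress output on stderr is what makes the `json.loads(proc.stdout)` call safe here.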



Architecture

CLI (backtest run / walk-forward / options-sim / report / list)
        │
        ▼
  BacktestEngine
        │
        ├── market-data CLI ──► OHLCV DataFrame (subprocess, JSON stdout)
        │        └── iv-rank  ──► IV rank time series (optional join)
        │        └── yfinance ──► VIX time series   (optional join)
        │
        ├── compute_indicators() ─► 60+ indicator columns on each bar
        │
        ├── run_signals()         ─► bar-by-bar eval() of condition strings
        │        └── BacktestTrade list
        │
        ├── compute_metrics()     ─► BacktestMetrics (Sharpe, Calmar, DD …)
        ├── check_thresholds()    ─► pass/fail list
        │
        └── ResultsStore (SQLite) + MinIO (Parquet equity curves)

Options path (options-sim): replaces run_signals() with run_options_signals(), which models P&L from Black-Scholes option premium changes rather than underlying price changes.

Walk-forward path (walk-forward): runs N pairs of IS/OOS backtests and reports efficiency = oos_sharpe / is_sharpe.


Installation

cd backtest-service
python -m venv .venv
source .venv/bin/activate
pip install -e .

Requires Python 3.12+. Optional VIX support:

pip install yfinance

The market-data CLI must be on $PATH (or set BT_MARKET_DATA_CMD to its absolute path).


Environment Variables

All variables are prefixed BT_. Can be set in a .env file at the project root.

| Variable | Default | Description |
| --- | --- | --- |
| BT_DB_PATH | data/backtest/results.db | SQLite database path for runs, metrics, and trades |
| BT_MINIO_ENDPOINT | localhost:9000 | MinIO endpoint for Parquet equity curve storage |
| BT_MINIO_ACCESS_KEY | mds | MinIO access key |
| BT_MINIO_SECRET_KEY | mds_secret | MinIO secret key |
| BT_MINIO_BUCKET | backtest-results | Bucket for equity curve Parquet files |
| BT_MINIO_SECURE | false | Use HTTPS for MinIO |
| BT_INITIAL_CAPITAL | 100000.0 | Default starting capital ($) |
| BT_DEFAULT_COMMISSION | 0.65 | Default commission per contract ($) |
| BT_DEFAULT_SLIPPAGE | 0.005 | Default slippage fraction (0.5%) |
| BT_MARKET_DATA_CMD | market-data | Path/name of the market-data CLI |
| BT_MIN_TRADES_REQUIRED | 10 | Minimum trades before warning |
| BT_MAX_HOLDING_BARS | 20 | Safety cap if spec omits max_holding_bars |
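The sketch below shows one way these variables and defaults could be read into a typed settings object. It is illustrative only: `BacktestSettings` and `load_settings` are hypothetical names, only a subset of the variables is covered, and .env loading is left out.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class BacktestSettings:
    db_path: str
    market_data_cmd: str
    initial_capital: float
    default_commission: float
    default_slippage: float
    min_trades_required: int
    max_holding_bars: int
    minio_secure: bool


def load_settings(env=None):
    """Build settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    return BacktestSettings(
        db_path=env.get("BT_DB_PATH", "data/backtest/results.db"),
        market_data_cmd=env.get("BT_MARKET_DATA_CMD", "market-data"),
        initial_capital=float(env.get("BT_INITIAL_CAPITAL", "100000.0")),
        default_commission=float(env.get("BT_DEFAULT_COMMISSION", "0.65")),
        default_slippage=float(env.get("BT_DEFAULT_SLIPPAGE", "0.005")),
        min_trades_required=int(env.get("BT_MIN_TRADES_REQUIRED", "10")),
        max_holding_bars=int(env.get("BT_MAX_HOLDING_BARS", "20")),
        # The accepted truthy spellings are an assumption of this sketch.
        minio_secure=env.get("BT_MINIO_SECURE", "false").lower() in ("1", "true", "yes"),
    )
```

Passing an explicit mapping, for example `load_settings({"BT_INITIAL_CAPITAL": "50000"})`, makes the defaults easy to exercise in tests.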

CLI Reference

run

Run a full backtest for a strategy spec over a date range.

backtest run \
  --strategy <path/to/spec.json> \
  --symbol AAPL \
  --start 2019-01-01 \
  --end 2024-12-31 \
  [--capital 100000] \
  [--commission 0.65] \
  [--slippage 0.005]

Options

| Flag | Required | Description |
| --- | --- | --- |
| --strategy | Yes | Path to strategy spec JSON or YAML |
| --symbol / -s | Yes | Ticker symbol (e.g. SPY, AAPL) |
| --start | Yes | Start date YYYY-MM-DD |
| --end | Yes | End date YYYY-MM-DD |
| --capital | No | Override initial capital ($) |
| --commission | No | Override commission per contract ($) |
| --slippage | No | Override slippage fraction |

Exit codes: 0 = pass · 1 = fail (threshold breach or no data) · 2 = error

Stdout: BacktestReport JSON

Example

backtest run --strategy strategies/spy_short_put_spread_v1.json \
             --symbol SPY --start 2020-01-01 --end 2024-12-31

walk-forward

Run IS/OOS walk-forward validation across N rolling windows.

backtest walk-forward \
  --strategy <path/to/spec.json> \
  --symbol SPY \
  --start 2016-01-01 \
  --end 2024-12-31 \
  [--windows 3] \
  [--is-ratio 0.70]

Options

| Flag | Default | Description |
| --- | --- | --- |
| --strategy | Required | Path to strategy spec |
| --symbol / -s | Required | Ticker symbol |
| --start | Required | Start date YYYY-MM-DD |
| --end | Required | End date YYYY-MM-DD |
| --windows | 3 | Number of walk-forward windows |
| --is-ratio | 0.70 | In-sample fraction per window |

Pass criterion: avg_efficiency >= 0.70 (OOS Sharpe / IS Sharpe averaged across all windows)
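The pass criterion can be sketched in a few lines. The helper names are hypothetical, and skipping windows whose in-sample Sharpe is not positive is an assumption of this sketch, not documented behaviour.

```python
def window_efficiency(is_sharpe, oos_sharpe):
    """Efficiency of one window: out-of-sample Sharpe over in-sample Sharpe."""
    if is_sharpe <= 0:
        # Assumption: the ratio is not meaningful for a non-positive IS Sharpe.
        return None
    return oos_sharpe / is_sharpe


def walk_forward_passes(windows, threshold=0.70):
    """windows is a list of (is_sharpe, oos_sharpe) pairs."""
    ratios = [window_efficiency(i, o) for i, o in windows]
    valid = [r for r in ratios if r is not None]
    if not valid:
        return 0.0, False
    avg = sum(valid) / len(valid)
    return avg, avg >= threshold


avg_efficiency, passed = walk_forward_passes([(1.42, 1.10)])
```

With an in-sample Sharpe of 1.42 and an out-of-sample Sharpe of 1.10, the single-window efficiency is about 0.775, which clears the 0.70 bar.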

Stdout: WalkForwardReport JSON


options-sim

Simulate an options spread strategy using Black-Scholes with stored IV rank data. P&L is computed from option premium changes, not underlying price movements.

Prerequisite: IV rank history must be populated before running the simulation:

market-data iv-rank-backfill --symbol SPY --start 2019-01-01 --yes

Then run the simulation:

backtest options-sim \
  --strategy <path/to/spread_spec.json> \
  --symbol SPY \
  --start 2023-01-01 \
  --end 2024-12-31 \
  [--capital 100000] \
  [--commission 0.65] \
  [--slippage 0.005]

Supported spread types: short_put_spread · short_call_spread · iron_condor

Stdout: BacktestReport JSON (same schema as run)
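To illustrate what "P&L from option premium changes" means, here is a minimal Black-Scholes sketch for a short put spread. It is not the service's run_options_signals(): the function names are hypothetical, strikes are passed in directly rather than derived from deltas, volatility is flat across strikes, dividends are ignored, and the 100-share contract multiplier is assumed.

```python
import math


def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_put(spot, strike, t_years, rate, vol):
    """Black-Scholes price of a European put (no dividends)."""
    if t_years <= 0 or vol <= 0:
        return max(strike - spot, 0.0)  # intrinsic value at expiry
    sqrt_t = math.sqrt(t_years)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * t_years) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t
    return strike * math.exp(-rate * t_years) * norm_cdf(-d2) - spot * norm_cdf(-d1)


def put_spread_value(spot, short_strike, long_strike, t_years, rate, vol):
    """Cost to close a short put spread: short put minus long put."""
    return bs_put(spot, short_strike, t_years, rate, vol) - bs_put(
        spot, long_strike, t_years, rate, vol
    )


# Entry: 21 days to expiry; later: 11 days to expiry with the underlying higher.
credit = put_spread_value(450.0, 435.0, 425.0, 21 / 365, 0.05, 0.20)
later = put_spread_value(455.0, 435.0, 425.0, 11 / 365, 0.05, 0.20)
pnl_per_contract = (credit - later) * 100  # assumed multiplier of 100
```

The seller profits when the spread becomes cheaper to buy back, which is why time decay and a rising underlying both help in this example.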


report

Retrieve the full persisted report for a completed run.

backtest report --run-id <uuid>

Stdout:

{
  "run": { ... },
  "metrics": { ... },
  "trades": [ ... ]
}

list

List recent backtest runs.

backtest list [--symbol AAPL] [--limit 20]

Stdout:

{
  "count": 5,
  "runs": [ { "run_id": "...", "strategy_name": "...", "symbol": "...", "status": "complete", ... } ]
}

Strategy Spec Format

Strategy specs are JSON (or YAML) files that define entry/exit logic as Python boolean expressions evaluated per bar.

StrategySpec (equities)

Used by backtest run and backtest walk-forward.

{
  "name": "spy_momentum_v1",
  "version": "1.0",
  "asset_class": "equities",
  "timeframe": "swing",

  "entry_conditions": [
    "close > sma_50",
    "rsi_14 > 50",
    "macd_hist > 0",
    "not in_position"
  ],
  "exit_conditions": [
    "rsi_14 > 70",
    "close < ema_21"
  ],
  "stop_conditions": [
    "close < sma_200"
  ],
  "filter_conditions": [
    "vix < 35",
    "not fomc_meeting_day"
  ],

  "position_sizing_formula": "risk_pct * capital / atr_14",
  "max_holding_bars": 10,
  "max_concurrent_positions": 1,
  "per_trade_risk_pct": 0.01,

  "requires_options_data": false,
  "description": "SPY trend-following with RSI/MACD confirmation",
  "instruments": ["SPY"]
}

Field reference

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| name | string | required | Strategy identifier |
| version | string | "1.0" | Version label |
| asset_class | enum | "equities" | equities · options · equity_options · futures · multi-leg |
| timeframe | enum | "swing" | intraday · daily · swing · position |
| entry_conditions | list[str] | required | All must be True to open |
| exit_conditions | list[str] | required | All must be True to close at profit target |
| stop_conditions | list[str] | required | All must be True to close at stop |
| filter_conditions | list[str] | [] | All must be True each bar for signals to be evaluated |
| position_sizing_formula | str | "risk_pct * capital / atr_14" | Python expression for share count |
| max_holding_bars | int | 10 | Force-exit after N bars |
| max_concurrent_positions | int | 1 | Max open positions at once |
| per_trade_risk_pct | float | 0.01 | Capital fraction risked per trade (hard cap: 0.02) |
| requires_options_data | bool | false | When True, engine fetches IV rank history and joins it |
| instruments | list[str] | [] | Instruments the strategy trades (informational) |
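As a worked example of the default sizing formula, the sketch below evaluates it for one bar and enforces the 0.02 risk cap. The helper names are hypothetical. Binding risk_pct to per_trade_risk_pct and flooring to whole shares are assumptions of this sketch.

```python
import math

MAX_RISK_PCT = 0.02  # hard cap from the field reference


def validate_risk_pct(per_trade_risk_pct):
    """Reject specs whose per-trade risk exceeds the cap."""
    if per_trade_risk_pct > MAX_RISK_PCT:
        raise ValueError(f"per_trade_risk_pct {per_trade_risk_pct} exceeds {MAX_RISK_PCT}")
    return per_trade_risk_pct


def share_count(formula, risk_pct, capital, atr_14):
    """Evaluate a sizing expression and floor it to whole shares."""
    namespace = {"risk_pct": risk_pct, "capital": capital, "atr_14": atr_14}
    raw = eval(formula, {"__builtins__": {}}, namespace)
    return max(math.floor(raw), 0)


risk = validate_risk_pct(0.01)
shares = share_count("risk_pct * capital / atr_14", risk, 100000.0, 2.5)
```

Here 1% of $100,000 is $1,000 of risk, and dividing by an ATR of 2.5 gives 400 shares.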

OptionsSpreadSpec (options-sim)

Used by backtest options-sim. P&L uses Black-Scholes premium changes, not OHLCV price moves.

{
  "name": "vrp_short_put_spread",
  "version": "1.0",
  "spread_type": "short_put_spread",

  "entry_conditions": [
    "iv_rank is not None and iv_rank >= 30",
    "close > sma_50",
    "not in_position"
  ],
  "filter_conditions": [
    "vix < 40",
    "not fomc_meeting_day"
  ],

  "target_dte": 21,
  "short_delta": 0.30,
  "long_delta": 0.15,
  "profit_target_pct": 0.50,
  "stop_loss_pct": 2.00,
  "max_holding_bars": 15,

  "max_concurrent_positions": 1,
  "per_trade_risk_pct": 0.02,
  "risk_free_rate": 0.05
}

Field reference

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| spread_type | enum | "short_put_spread" | short_put_spread · short_call_spread · iron_condor |
| target_dte | int | 21 | Days to expiration at entry |
| short_delta | float | 0.30 | Short leg absolute delta (e.g. 0.30 = 30Δ) |
| long_delta | float | 0.15 | Long (protective) leg absolute delta |
| profit_target_pct | float | 0.50 | Close when 50% of credit is captured |
| stop_loss_pct | float | 2.00 | Close when spread value = 2× credit received |
| risk_free_rate | float | 0.05 | Risk-free rate for Black-Scholes (annual) |
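The exit fields can be read as the following rule. This is a sketch under stated assumptions: `options_exit_reason` is a hypothetical name, the order in which the checks run is assumed, and the credit is taken to be positive.

```python
def options_exit_reason(
    credit,
    spread_value,
    bars_held,
    profit_target_pct=0.50,
    stop_loss_pct=2.00,
    max_holding_bars=15,
):
    """Return why a short spread should be closed, or None to keep holding."""
    # Fraction of the original credit already captured by the seller.
    captured = (credit - spread_value) / credit
    if captured >= profit_target_pct:
        return "profit_target"
    # Stop when buying the spread back costs stop_loss_pct times the credit.
    if spread_value >= stop_loss_pct * credit:
        return "stop_loss"
    if bars_held >= max_holding_bars:
        return "max_holding"
    return None
```

For a spread sold for 2.00, the position closes at the profit target once it can be bought back for 1.00 or less, and at the stop once it costs 4.00 or more.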

Indicator Namespace

Every bar's condition expressions are evaluated against a namespace containing these variables. All are float unless noted. Missing data resolves to None (so conditions like iv_rank is not None and iv_rank >= 30 are safe).

Trend

| Variable | Description |
| --- | --- |
| open, high, low, close, volume | Raw OHLCV values |
| sma_20, sma_50, sma_200 | Simple moving averages |
| ema_9, ema_21, ema_50 | Exponential moving averages |
| spy_price, spy_close | Alias for close (useful for SPY strategies) |
| spy_sma_20, spy_sma_50, spy_sma_200 | Alias for SMA columns |
| prev_close, prev_high, prev_low | Previous bar values |

Momentum

| Variable | Description |
| --- | --- |
| rsi_14 | RSI 14-period, bounded [0, 100] |
| macd | MACD line (EMA12 − EMA26) |
| macd_signal | MACD signal line (EMA9 of MACD) |
| macd_hist | MACD histogram (MACD − signal) |
| roc_10 | Rate of change, 10-bar (%) |
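The MACD definitions above can be sketched in plain Python as follows. This is not the service's compute_indicators(): the EMA here uses the common smoothing factor 2 / (span + 1) and is seeded with the first value, which is an assumption, so early bars may differ from the service's output.

```python
def ema(values, span):
    """Recursive exponential moving average seeded with the first value."""
    alpha = 2.0 / (span + 1)
    out = [values[0]]
    for value in values[1:]:
        out.append(alpha * value + (1 - alpha) * out[-1])
    return out


def macd(closes, fast=12, slow=26, signal=9):
    """Return (macd line, signal line, histogram) as equal-length lists."""
    fast_ema = ema(closes, fast)
    slow_ema = ema(closes, slow)
    line = [f - s for f, s in zip(fast_ema, slow_ema)]
    signal_line = ema(line, signal)
    hist = [m - s for m, s in zip(line, signal_line)]
    return line, signal_line, hist


closes = [float(price) for price in range(1, 61)]  # steadily rising demo series
macd_line, macd_signal, macd_hist = macd(closes)
```

In a steady uptrend the fast EMA stays above the slow EMA, so the MACD line is positive, which is the behaviour a condition like "macd_hist > 0" relies on.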

Volatility

| Variable | Description |
| --- | --- |
| atr_14 | Average True Range, 14-period |
| bb_upper, bb_lower | Bollinger Bands (20-period, ±2σ) |
| bb_pct | Price position within Bollinger Bands [0, 1] |
| adx_14 | Average Directional Index |
| plus_di, minus_di | Directional movement indicators |
| hv_30 | 30-day historical volatility (annualised) |
| spy_realized_vol_5d, spy_realized_vol_10d | Short-window realised vol |

Volume

| Variable | Description |
| --- | --- |
| obv | On-balance volume |
| spy_adv_20d | 20-day average daily volume |
| volume_spy_today | Current bar volume |

IV / Options

| Variable | Description |
| --- | --- |
| iv_rank | IV rank 0–100 (requires iv-rank-backfill or requires_options_data: true) |
| iv_percentile | IV percentile 0–100 |
| current_iv | Current ATM implied volatility (decimal) |
| iv_hv_ratio | current_iv / hv_30 (VRP indicator) |
| iv_atm | ATM IV; falls back to hv_30 * 1.20 when unavailable |
| dte | Days to next standard monthly OpEx (0–60) |
| short_leg_delta | Approximated short leg delta (default 0.30) |
| short_strike_delta | Alias for short_leg_delta |
| bid_ask_spread_pct | Modelled bid/ask spread fraction |
| pnl_pct_of_max_credit | Options P&L as fraction of max credit received |
| premium_collected_pct | Alias for pnl_pct_of_max_credit |
| open_interest_short_strike | Open interest at short strike (default 2000) |
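For orientation, standard monthly equity options expire on the third Friday of the month, so a dte-style value can be approximated as below. The helper names are hypothetical, calendar days are assumed, and exchange holidays are ignored. The service's own rule may differ, since its documented range is 0–60.

```python
import calendar
from datetime import date, timedelta


def third_friday(year, month):
    """Date of the third Friday of the given month."""
    first = date(year, month, 1)
    # Days from the 1st to the first Friday, then two more weeks.
    offset = (calendar.FRIDAY - first.weekday()) % 7
    return first + timedelta(days=offset + 14)


def days_to_monthly_opex(today):
    """Calendar days until the next third Friday, counting today as 0."""
    expiry = third_friday(today.year, today.month)
    if expiry < today:
        if today.month == 12:
            expiry = third_friday(today.year + 1, 1)
        else:
            expiry = third_friday(today.year, today.month + 1)
    return (expiry - today).days
```

For example, `days_to_monthly_opex(date(2024, 1, 2))` counts forward to Friday 19 January 2024.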

VIX

| Variable | Description |
| --- | --- |
| vix | VIX close (fetched via yfinance; fallback 20.0) |
| vix_spot | Alias for vix |

Calendar

| Variable | Type | Description |
| --- | --- | --- |
| fomc_meeting_day | bool | True on FOMC announcement days (2019–2026) |
| fomc_within_3d | bool | True within 3 calendar days of next FOMC |
| days_to_fomc | int | Calendar days until next FOMC |
| cpi_release_day | bool | True on BLS CPI release days (2019–2026) |
| opex_week | bool | True during standard monthly OpEx week |
| within_3d_of_opex | bool | True within 3 days of nearest monthly OpEx |
| earnings_within_5d | bool | Always False (requires external earnings service) |
| earnings_within_7d | bool | Always False (requires external earnings service) |

Position State

| Variable | Type | Description |
| --- | --- | --- |
| in_position | bool | True when a position is open |
| capital | float | Current available capital ($) |
| bars_held | int | Bars elapsed since entry (0 if flat) |
| days_held | int | Alias for bars_held |
| position_pnl_pct | float | Unrealised P&L as fraction of entry price |

Safe Builtins

The eval scope includes: abs, round, min, max, int, float, bool, len, True, False, None.
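A per-bar evaluation along these lines might look like the sketch below. It is not the service's run_signals(): `all_conditions_true` is a hypothetical helper and the bar dictionary is illustrative. It also shows why the None guard matters for missing data.

```python
SAFE_BUILTINS = {
    "abs": abs, "round": round, "min": min, "max": max,
    "int": int, "float": float, "bool": bool, "len": len,
    "True": True, "False": False, "None": None,
}


def all_conditions_true(conditions, bar):
    """Evaluate each condition string against one bar's namespace."""
    scope = {"__builtins__": SAFE_BUILTINS}  # only the listed builtins resolve
    return all(bool(eval(expr, scope, dict(bar))) for expr in conditions)


bar = {"close": 450.0, "sma_50": 440.0, "iv_rank": None, "in_position": False}

trend_ok = all_conditions_true(["close > sma_50", "not in_position"], bar)
iv_ok = all_conditions_true(["iv_rank is not None and iv_rank >= 30"], bar)
```

Without the guard, the expression "iv_rank >= 30" would raise a TypeError on this bar, because None cannot be compared with an integer.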


Pass / Fail Thresholds

backtest run and backtest options-sim check these after every run. Any violation is reported in failures[] and sets passed: false (exit code 1).

| Metric | Threshold |
| --- | --- |
| win_rate | >= 0.50 |
| profit_factor | >= 1.50 |
| max_drawdown_pct | <= 0.25 in magnitude (drawdown no deeper than −25%) |
| sharpe_ratio | >= 1.00 |
| consecutive_losses_max | <= 8 |
| walk_forward_efficiency | >= 0.70 (walk-forward only) |
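These checks can be expressed as a small table-driven function. This is a sketch rather than the service's check_thresholds(): `failed_thresholds` is a hypothetical name, the failure entries are plain metric names (the real failures[] format is not documented here), and drawdown is compared by magnitude because reports show it as a negative number.

```python
THRESHOLDS = [
    ("win_rate", lambda v: v >= 0.50),
    ("profit_factor", lambda v: v >= 1.50),
    # Assumption: max_drawdown_pct is reported as a negative fraction.
    ("max_drawdown_pct", lambda v: abs(v) <= 0.25),
    ("sharpe_ratio", lambda v: v >= 1.00),
    ("consecutive_losses_max", lambda v: v <= 8),
]


def failed_thresholds(metrics):
    """Return the names of metrics that breach their threshold."""
    return [name for name, ok in THRESHOLDS if not ok(metrics[name])]


metrics = {
    "win_rate": 0.5977,
    "profit_factor": 1.82,
    "max_drawdown_pct": -0.1187,
    "sharpe_ratio": 1.34,
    "consecutive_losses_max": 5,
}
failures = failed_thresholds(metrics)
passed = not failures
```

An empty list corresponds to passed: true; any entry corresponds to a soft failure and exit code 1.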

JSON Output Schemas

All commands emit JSON to stdout. Rich progress and warnings go to stderr.

BacktestReport

Returned by backtest run and backtest options-sim.

{
  "run": {
    "run_id": "uuid-string",
    "strategy_name": "spy_momentum_v1",
    "strategy_version": "1.0",
    "symbol": "SPY",
    "start_date": "2020-01-01",
    "end_date": "2024-12-31",
    "initial_capital": 100000.0,
    "commission_per_contract": 0.65,
    "slippage_pct": 0.005,
    "created_at": "2026-04-19T10:00:00",
    "completed_at": "2026-04-19T10:00:05",
    "status": "complete",
    "error_message": null
  },
  "metrics": {
    "run_id": "uuid-string",
    "total_trades": 87,
    "winning_trades": 52,
    "losing_trades": 35,
    "win_rate": 0.5977,
    "profit_factor": 1.82,
    "expectancy_per_trade": 312.40,
    "avg_win_pct": 0.0421,
    "avg_loss_pct": -0.0198,
    "largest_win": 4210.00,
    "largest_loss": -1850.00,
    "consecutive_losses_max": 5,
    "cagr": 0.1423,
    "max_drawdown_pct": -0.1187,
    "max_drawdown_duration_days": 38,
    "sharpe_ratio": 1.34,
    "sortino_ratio": 2.01,
    "calmar_ratio": 1.20,
    "recovery_factor": 3.74,
    "is_sharpe": null,
    "oos_sharpe": null,
    "walk_forward_efficiency": null
  },
  "equity_curve_url": null,
  "trade_count": 87,
  "passed": true,
  "failures": []
}

WalkForwardReport

Returned by backtest walk-forward.

{
  "strategy_name": "spy_momentum_v1",
  "symbol": "SPY",
  "windows": [
    {
      "window_index": 0,
      "is_start": "2016-01-01",
      "is_end": "2018-10-28",
      "oos_start": "2018-10-29",
      "oos_end": "2019-07-27",
      "is_run_id": "uuid-string",
      "oos_run_id": "uuid-string",
      "is_sharpe": 1.42,
      "oos_sharpe": 1.10,
      "efficiency": 0.775
    }
  ],
  "avg_efficiency": 0.79,
  "avg_oos_sharpe": 1.05,
  "passed": true
}

Storage

SQLite (runs, metrics, trades)

Default path: data/backtest/results.db (overridden by BT_DB_PATH).

Three tables: backtest_runs, backtest_metrics, backtest_trades. Schema defined in backtest/migrations/001_backtest_schema.sql.

# List the 10 most recent runs with their Sharpe ratios
sqlite3 "$BT_DB_PATH" \
  "SELECT run_id, strategy_name, symbol, status, sharpe_ratio \
   FROM backtest_runs r \
   JOIN backtest_metrics m USING (run_id) \
   ORDER BY created_at DESC LIMIT 10;"
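The same query can be issued from Python with the standard-library sqlite3 module. This sketch assumes only the table and column names that appear in the shell query above; `recent_runs` is a hypothetical helper.

```python
import sqlite3

RECENT_RUNS_SQL = """
SELECT run_id, strategy_name, symbol, status, sharpe_ratio
FROM backtest_runs r
JOIN backtest_metrics m USING (run_id)
ORDER BY created_at DESC
LIMIT ?
"""


def recent_runs(conn, limit=10):
    """Return the most recent runs as a list of plain dicts."""
    conn.row_factory = sqlite3.Row  # rows become addressable by column name
    rows = conn.execute(RECENT_RUNS_SQL, (limit,)).fetchall()
    return [dict(row) for row in rows]
```

A caller would open the database with `sqlite3.connect(path)`, using the BT_DB_PATH value or its default, and pass the connection in.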

MinIO (equity curve Parquet)

Equity curves are stored as Parquet in the backtest-results bucket when MinIO is available. The path is returned in BacktestReport.equity_curve_url.


Running Tests

# Unit tests — no infrastructure required
cd backtest-service
.venv/bin/pytest tests/unit/ -m 'not integration' --tb=short -q

# All tests with coverage
.venv/bin/pytest tests/unit/ --cov=backtest --cov-report=term-missing -q

Test suite: tests/unit/test_engine.py, test_metrics.py, test_signal_runner.py, test_cli.py.


Agent Integration

This service is designed to be called by the backtesting_agent in the agent-trading-firm R&D pipeline via the shared call_cli() subprocess wrapper.

Calling from an agent

from shared.tools.cli_tool import call_cli

# Run a backtest
result = call_cli([
    "backtest", "run",
    "--strategy", "/path/to/spec.json",
    "--symbol", "SPY",
    "--start", "2020-01-01",
    "--end", "2024-12-31",
])
# result is a dict parsed from JSON stdout
passed = result["passed"]
run_id = result["run"]["run_id"]
sharpe = result["metrics"]["sharpe_ratio"]

# Walk-forward validation
wf = call_cli([
    "backtest", "walk-forward",
    "--strategy", "/path/to/spec.json",
    "--symbol", "SPY",
    "--start", "2016-01-01",
    "--end", "2024-12-31",
    "--windows", "3",
])
efficiency = wf["avg_efficiency"]   # >= 0.70 = pass

# Retrieve stored report
report = call_cli(["backtest", "report", "--run-id", run_id])

# List recent runs
runs = call_cli(["backtest", "list", "--symbol", "SPY", "--limit", "5"])

Pipeline position

backtest-service sits between strategy_development_agent and monte_carlo_agent in the R&D pipeline:

instrument_research_agent
  → strategy_development_agent  (produces strategy spec JSON)
      → backtesting_agent        (calls this service)
          → monte_carlo_agent    (consumes run_id from BacktestReport)
              → forward_testing_agent
                  → deployment_agent

Key invariants for agents

  • passed: true is required before passing run_id to the Monte Carlo service.
  • walk_forward_efficiency >= 0.70 is the bar for walk-forward validation.
  • All condition strings in specs use the indicator namespace above — eval() is sandboxed; only the listed safe builtins are available.
  • RSI is bounded [0, 100] — use rsi_14 < 0 as an always-false sentinel in tests, not rsi_14 < 5.
  • iv_rank is None (not NaN) in the eval namespace when IV data is absent; always guard with iv_rank is not None.
  • per_trade_risk_pct has a hard server-side cap of 0.02 (2%); specs exceeding this will be rejected at load time.
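The first invariant can be enforced with a small gate before handing off to the next stage. This is a sketch: `run_id_for_monte_carlo` is a hypothetical helper that relies only on the BacktestReport fields shown earlier.

```python
def run_id_for_monte_carlo(report):
    """Return the run_id only when the backtest passed; otherwise None."""
    if report.get("passed") is not True:
        return None
    return report["run"]["run_id"]


passing = {"passed": True, "failures": [], "run": {"run_id": "uuid-string"}}
failing = {"passed": False, "failures": ["sharpe_ratio"], "run": {"run_id": "uuid-string"}}

next_run_id = run_id_for_monte_carlo(passing)
blocked = run_id_for_monte_carlo(failing)
```

Returning None instead of raising lets the calling agent decide whether to revise the strategy or stop the pipeline.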
