
Backtesting & Metrics

Learn how to run a useful backtest in Setup.Cash, interpret core metrics, and avoid common testing mistakes like overfitting and data leakage.

By Setup.Cash Team. Last updated 2026-02-22. 5 min read.

Backtesting is the process of running a strategy against historical data to evaluate how the rules behaved. In Setup.Cash, the purpose of backtesting is not to prove future profits. It is to validate logic, inspect edge cases, and improve the strategy workflow before paper trading or live use.

This page explains how to run backtests that are actually useful.

What Backtesting Can and Cannot Do

What backtesting is good for

Checking whether your blueprint rules execute as expected
Comparing strategy variants using a consistent process
Measuring drawdown, trade frequency, and risk assumptions
Finding weak conditions, missing filters, and invalidation problems

What backtesting cannot guarantee

Future profitability
Exact live execution behavior
Performance under unseen market regimes
Protection from overfitting if the process is weak

This is why backtesting should be followed by paper trading and ongoing review.

Before You Start a Backtest

Backtests are only as good as the blueprint and assumptions behind them. Before running a test, confirm:

The blueprint is clearly defined (see docs/blueprints)
Risk rules are part of the strategy (not added later)
You know what question you are testing
You are not changing multiple variables at once

Example of a good test question:

Does adding an ATR-based stop improve drawdown consistency without reducing expectancy too much?

Example of a weak test question:

Can I make this strategy profitable if I tweak enough settings?

Core Backtesting Workflow in Setup.Cash

1) Run a baseline test

Start with the simplest valid blueprint version. Save the results. This baseline becomes the reference point for all later changes.

2) Inspect trade behavior, not just summary metrics

Do not stop at headline numbers. Review:

Where entries occurred
Whether exits match the defined rules
Losing streak behavior
Trade clustering during volatile periods
Any conditions that produced low-quality setups

3) Track one change at a time

If you change the trend filter, stop method, and session filter in one test, you lose diagnostic power. Make one change, then compare against baseline.
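One way to enforce the one-change rule is to diff the variant's settings against the baseline before running it. The sketch below is standalone Python, not a Setup.Cash API; the setting names (`trend_filter`, `stop`, `session`) and their values are hypothetical.

```python
_MISSING = object()  # sentinel so a missing setting never equals a real value


def changed_keys(baseline, variant):
    """Return the sorted names of settings that differ between two configs."""
    keys = set(baseline) | set(variant)
    return sorted(
        k for k in keys
        if baseline.get(k, _MISSING) != variant.get(k, _MISSING)
    )


# Hypothetical blueprint settings, for illustration only.
baseline = {"trend_filter": "sma_200", "stop": "fixed_pct", "session": "all"}
one_change = {**baseline, "stop": "atr_2x"}
two_changes = {**baseline, "stop": "atr_2x", "session": "london"}

print(changed_keys(baseline, one_change))   # ['stop']
print(changed_keys(baseline, two_changes))  # ['session', 'stop']
```

If the list has more than one entry, the comparison against baseline cannot attribute the result to a single change.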

4) Record assumptions and outcomes

Document what changed and why. Good records are part of the strategy itself.
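A record can be as simple as one structured entry per test. The following is a minimal sketch, assuming you keep your own log outside the platform; the class name `BacktestRecord`, its fields, and the sample values are all hypothetical.

```python
import json
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class BacktestRecord:
    test_id: str
    baseline_id: Optional[str]  # None for the baseline itself
    change: str                 # the single thing that changed
    reason: str                 # why the change was tested
    assumptions: dict           # e.g. fees, slippage, data range
    metrics: dict               # results copied from the test report


# Placeholder values, for illustration only.
record = BacktestRecord(
    test_id="test-002",
    baseline_id="test-001",
    change="stop: fixed_pct -> atr_2x",
    reason="check whether drawdown becomes more consistent",
    assumptions={"fee_rate": 0.001, "slippage": 0.05},
    metrics={"trades": 0, "expectancy": 0.0, "max_drawdown": 0.0},
)

# One JSON line per test is easy to append to a log file and compare later.
line = json.dumps(asdict(record), sort_keys=True)
print(line)
```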

Metrics That Matter (and Why)

Expectancy

Expectancy is the average outcome per trade, combining win rate with the payoff distribution (how large wins and losses tend to be). It is more informative than win rate alone.
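As a rough illustration, expectancy can be written as win rate * average win - loss rate * average loss. The sketch below uses that standard formula on two made-up lists of per-trade results (in risk multiples); it is standalone Python, and the function and variable names are assumptions, not part of Setup.Cash.

```python
def expectancy(pnls):
    """Average outcome per trade from win rate and average win/loss size."""
    n = len(pnls)
    if n == 0:
        raise ValueError("need at least one trade")
    wins = [p for p in pnls if p > 0]
    losses = [-p for p in pnls if p < 0]  # stored as positive sizes
    win_rate = len(wins) / n
    loss_rate = len(losses) / n
    avg_win = sum(wins) / len(wins) if wins else 0.0
    avg_loss = sum(losses) / len(losses) if losses else 0.0
    return win_rate * avg_win - loss_rate * avg_loss


# Made-up results: 80% win rate with small wins and large losses...
frequent_small_wins = [0.5] * 8 + [-5.0] * 2
# ...versus 40% win rate with larger wins and small losses.
rare_large_wins = [3.0] * 4 + [-1.0] * 6

print(round(expectancy(frequent_small_wins), 2))  # -0.6
print(round(expectancy(rare_large_wins), 2))      # 0.6
```

The 80% win-rate sample loses money per trade on average, while the 40% sample does not.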

Drawdown

Drawdown measures how far equity falls from a previous peak. A strategy can look good on returns and still be unusable if its drawdowns exceed your tolerance.
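A minimal sketch of maximum drawdown, assuming an equity curve given as a list of positive account values in time order. The function name and the sample values are illustrative, and the platform's own report may define drawdown differently.

```python
def max_drawdown(equity):
    """Largest peak-to-trough decline, as a fraction of the peak.

    Assumes every equity value is positive.
    """
    if not equity:
        raise ValueError("equity curve is empty")
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)                    # highest value seen so far
        worst = max(worst, (peak - value) / peak)  # decline from that peak
    return worst


# Made-up equity curve that ends above where it started.
equity = [100.0, 120.0, 90.0, 110.0, 130.0, 104.0]
print(max_drawdown(equity))  # 0.25 (the fall from 120 to 90)
```

The curve finishes up 4%, yet the account was down 25% from its peak along the way.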

Trade frequency

Frequency affects operational load and execution quality. A strategy that trades too often may be harder to monitor and more sensitive to costs.

Average hold time

Hold time influences exposure, session risk, and whether your execution process matches the strategy design.
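Trade frequency and average hold time can both be derived from entry and exit timestamps. This is a small sketch in standalone Python, assuming each trade is an (entry, exit) pair of `datetime` values; the timestamps are made up.

```python
from collections import Counter
from datetime import datetime, timedelta


def average_hold(trades):
    """Mean time between entry and exit across all trades."""
    total = sum((exit_t - entry_t for entry_t, exit_t in trades), timedelta())
    return total / len(trades)


def trades_per_day(trades):
    """Count of trades opened on each calendar day."""
    return Counter(entry_t.date() for entry_t, _ in trades)


# Made-up (entry, exit) timestamps.
trades = [
    (datetime(2024, 1, 2, 9, 30), datetime(2024, 1, 2, 10, 30)),  # 1 hour
    (datetime(2024, 1, 2, 13, 0), datetime(2024, 1, 2, 15, 0)),   # 2 hours
    (datetime(2024, 1, 3, 10, 0), datetime(2024, 1, 3, 13, 0)),   # 3 hours
]

print(average_hold(trades))                 # 2:00:00
print(max(trades_per_day(trades).values())) # 2
```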

Profit factor (context, not obsession)

Profit factor (gross profit divided by gross loss) can be useful, but only when read alongside expectancy, drawdown, and sample size.
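Profit factor is conventionally gross profit divided by gross loss. The sketch below, with made-up trade results and an illustrative function name, returns it together with the trade count so the number is never read without its sample size.

```python
import math


def profit_factor(pnls):
    """Return (gross profit / gross loss, number of trades)."""
    gross_profit = sum(p for p in pnls if p > 0)
    gross_loss = -sum(p for p in pnls if p < 0)
    if gross_loss == 0:
        # No losing trades: the ratio is undefined, reported here as infinity.
        return math.inf, len(pnls)
    return gross_profit / gross_loss, len(pnls)


# Made-up results: same ratio, very different amounts of evidence.
two_trades = [300.0, -150.0]
forty_trades = [300.0, -150.0] * 20

print(profit_factor(two_trades))    # (2.0, 2)
print(profit_factor(forty_trades))  # (2.0, 40)
```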

Sample size and regime coverage

A good result on a tiny sample is not strong evidence. Where possible, test across multiple market conditions, such as trending, ranging, and high-volatility periods.
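One rough way to see why a tiny sample is weak evidence is the standard error of the average trade result, which shrinks as the number of trades grows. This is a general statistical illustration, not a Setup.Cash metric; it assumes trades are roughly independent, which real trades often are not, and the data is made up.

```python
import math
import statistics


def standard_error(pnls):
    """Sample standard deviation of trade results divided by sqrt(n)."""
    if len(pnls) < 2:
        raise ValueError("need at least two trades")
    return statistics.stdev(pnls) / math.sqrt(len(pnls))


# Made-up results with the same average (75 per trade).
small_sample = [300.0, -150.0]
larger_sample = [300.0, -150.0] * 20

print(statistics.mean(small_sample), standard_error(small_sample))
print(statistics.mean(larger_sample), standard_error(larger_sample))
```

Both samples average 75 per trade, but the uncertainty around the two-trade average is several times larger than around the forty-trade average.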

Common Backtesting Mistakes

1) Overfitting

Overfitting happens when rules or thresholds are tuned too closely to historical data. The strategy may look strong in test results and fail in new conditions.

For a practical review workflow that compares baseline tests with one-change retests, see Backtesting Explained: How to Test a Trading Strategy Safely.
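A common guard against overfitting, not described in the source and offered here only as a general technique, is to hold back the most recent part of the history: tune on the earlier period, then check the result once on the later period. A minimal sketch, assuming trades are (date, result) pairs; the names and data are hypothetical.

```python
from datetime import date


def split_by_date(trades, cutoff):
    """Split (date, pnl) pairs into an earlier and a later period."""
    in_sample = [t for t in trades if t[0] < cutoff]
    out_of_sample = [t for t in trades if t[0] >= cutoff]
    return in_sample, out_of_sample


# Made-up trade results.
trades = [
    (date(2023, 3, 1), 1.2),
    (date(2023, 8, 15), -1.0),
    (date(2024, 2, 10), 0.8),
    (date(2024, 6, 5), -0.5),
]

# Tune thresholds on the earlier period only; review the later period once.
in_sample, out_of_sample = split_by_date(trades, date(2024, 1, 1))
print(len(in_sample), len(out_of_sample))  # 2 2
```

If results hold up only on the period used for tuning, the thresholds are probably fitted to that history.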

2) Data leakage / hindsight bias

If your rule accidentally uses information that would not have been known at the time of entry, the test is invalid. A typical case is entering at a bar's open based on that same bar's close.
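The usual fix for this kind of leakage is to delay each signal by one bar, so a trade placed on a given bar only uses information from bars that have already closed. The sketch below is standalone Python with made-up prices and a hypothetical threshold rule.

```python
def signals_from_close(closes, threshold):
    """True where a bar closed above the threshold.

    Each value is only known after that bar has ended.
    """
    return [c > threshold for c in closes]


def shift_one_bar(signals):
    """Delay signals by one bar so bar i acts on bar i-1 information."""
    return [False] + signals[:-1]


closes = [99.0, 101.0, 98.0, 103.0]  # made-up closing prices

raw = signals_from_close(closes, threshold=100.0)
tradable = shift_one_bar(raw)

print(raw)       # [False, True, False, True]
print(tradable)  # [False, False, True, False]
```

Acting on `raw` at each bar's open would use a close that has not happened yet; `tradable` is what could actually have been acted on.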

3) Ignoring execution assumptions

Entries, stops, and targets need realistic fill logic. Fills that ignore fees, slippage, or gaps through a stop can make a strategy look stronger than it is.
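To get a feel for how much execution assumptions matter, the sketch below compares one long trade with ideal fills against the same trade with a simple cost model. The fee rate and slippage values are placeholders, not Setup.Cash defaults, and the model (fixed slippage per side, fee as a fraction of traded value) is an assumption.

```python
def net_pnl_long(entry, exit_price, qty, fee_rate=0.0, slippage=0.0):
    """Profit on a long trade after slippage and proportional fees."""
    fill_in = entry + slippage        # buy slightly worse than quoted
    fill_out = exit_price - slippage  # sell slightly worse than quoted
    gross = (fill_out - fill_in) * qty
    fees = (fill_in + fill_out) * qty * fee_rate  # charged on both sides
    return gross - fees


ideal = net_pnl_long(100.0, 101.0, qty=10)
realistic = net_pnl_long(100.0, 101.0, qty=10, fee_rate=0.001, slippage=0.05)

print(ideal)                # 10.0
print(round(realistic, 2))  # 6.99
```

In this example, modest costs remove about 30% of the trade's profit, and the effect compounds for strategies that trade often.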

4) Optimizing too early

First validate that the strategy logic makes sense. Optimization before validation usually creates a fragile system.

5) Using one metric as the decision-maker

No single metric can summarize strategy quality. Read expectancy, drawdown, trade frequency, and sample size together before deciding.

Practical Review Framework (Use After Every Test)

Rule quality

Did the strategy take trades it should have taken, and avoid trades it should have avoided?

Risk quality

Did drawdown and loss size stay within your stated constraints?

Operational quality

Would you realistically be able to execute this strategy consistently in paper trading or live conditions?

Improvement priority

What is the one highest-impact change for the next test cycle?

When to Move to Paper Trading

Move from backtesting to paper trading when:

The blueprint is stable enough to explain clearly
Risk controls are defined and tested
The strategy has acceptable behavior across the sample you used
You have a checklist for execution review

Then continue with the paper trading landing page and the blog post Paper Trading vs Live Trading: When to Switch.

FAQ

How many backtests should I run before paper trading?

There is no fixed number. Move when the strategy logic is stable, risk is defined, and changes are becoming incremental rather than random.

Should I optimize parameters aggressively?

Usually no. Optimize carefully and document why each change exists. The goal is robustness, not a perfect historical curve.

Is a high win rate always better?

No. Win rate without payoff structure and drawdown context can be misleading.

Build your trading bot workflow with structure

Use Setup.Cash to create, backtest, and paper trade rule-based strategies without relying on guesswork. Not financial advice. Trading involves risk.
