Backtesting & Metrics
Learn how to run a useful backtest in Setup.Cash, interpret core metrics, and avoid common testing mistakes like overfitting and data leakage.
Backtesting is the process of running a strategy against historical data to evaluate how the rules behaved. In Setup.Cash, the purpose of backtesting is not to prove future profits. It is to validate logic, inspect edge cases, and improve the strategy workflow before paper trading or live use.
This page explains how to run backtests that are actually useful.
What Backtesting Can and Cannot Do
What backtesting is good for
- Checking whether your blueprint rules execute as expected
- Comparing strategy variants using a consistent process
- Measuring drawdown, trade frequency, and risk assumptions
- Finding weak conditions, missing filters, and invalidation problems
What backtesting cannot guarantee
- Future profitability
- Exact live execution behavior
- Performance under unseen market regimes
- Protection from overfitting if the process is weak
This is why backtesting should be followed by paper trading and ongoing review.
Before You Start a Backtest
Backtests are only as good as the blueprint and assumptions behind them. Before running a test, confirm:
- The blueprint is clearly defined (see docs/blueprints)
- Risk rules are part of the strategy (not added later)
- You know what question you are testing
- You are not changing multiple variables at once
Example of a good test question:
Does adding an ATR-based stop improve drawdown consistency without reducing expectancy too much?
Example of a weak test question:
Can I make this strategy profitable if I tweak enough settings?
Core Backtesting Workflow in Setup.Cash
1) Run a baseline test
Start with the simplest valid blueprint version. Save the results. This baseline becomes the reference point for all later changes.
2) Inspect trade behavior, not just summary metrics
Do not stop at headline numbers. Review:
- Where entries occurred
- Whether exits match the defined rules
- Losing streak behavior
- Trade clustering during volatile periods
- Any conditions that produced low-quality setups
3) Track one change at a time
If you change the trend filter, stop method, and session filter in one test, you lose diagnostic power. Make one change, then compare against baseline.
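One way to enforce this is to diff a variant's settings against the baseline before running it. The sketch below assumes strategy settings can be written as a flat dictionary. The parameter names (`trend_filter`, `stop_method`, `session_filter`) and the helper `changed_parameters` are hypothetical and are not part of Setup.Cash.

```python
_MISSING = object()  # sentinel so a missing key never equals a real value


def changed_parameters(baseline, variant):
    """Return the sorted names of settings that differ between two configs."""
    keys = set(baseline) | set(variant)
    return sorted(
        k for k in keys
        if baseline.get(k, _MISSING) != variant.get(k, _MISSING)
    )


# Hypothetical settings, for illustration only
baseline = {"trend_filter": "ema_200", "stop_method": "fixed_pct", "session_filter": "all"}
variant = dict(baseline, stop_method="atr")

diff = changed_parameters(baseline, variant)
if len(diff) != 1:
    raise ValueError(f"expected exactly one change, found: {diff}")
print(diff)  # ['stop_method']
```

If the check fails, split the variant into separate tests so each result can be traced to a single change.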
4) Record assumptions and outcomes
Document what changed and why. Good records are part of the strategy itself.
Metrics That Matter (and Why)
Expectancy
Expectancy is the average outcome per trade: win rate multiplied by average win, minus loss rate multiplied by average loss. It is more informative than win rate alone because it accounts for payoff size.
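As a minimal sketch, assuming trade results are available as a plain list of per-trade profit and loss values, expectancy can be computed as follows. The function name `expectancy` and the sample numbers are illustrative and do not come from Setup.Cash.

```python
def expectancy(trade_pnls):
    """Average outcome per trade: win_rate * avg_win - loss_rate * avg_loss."""
    if not trade_pnls:
        raise ValueError("need at least one trade")
    wins = [p for p in trade_pnls if p > 0]
    losses = [-p for p in trade_pnls if p < 0]  # stored as positive magnitudes
    n = len(trade_pnls)  # break-even trades count toward n only
    win_rate = len(wins) / n
    loss_rate = len(losses) / n
    avg_win = sum(wins) / len(wins) if wins else 0.0
    avg_loss = sum(losses) / len(losses) if losses else 0.0
    return win_rate * avg_win - loss_rate * avg_loss


# Hypothetical sample: 4 wins of 50 and 6 losses of 20
sample = [50, 50, 50, 50, -20, -20, -20, -20, -20, -20]
print(expectancy(sample))
```

In this sample the win rate is only 40%, yet the average outcome is positive (about 8 per trade) because the wins are larger than the losses.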
Drawdown
Drawdown measures how far equity falls from a prior peak. A strategy can look good on returns and still be unusable if drawdowns exceed your tolerance.
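As a rough sketch, assuming the backtest exposes an equity curve as a list of positive account values, maximum drawdown is the largest peak-to-trough decline. The name `max_drawdown` and the sample curve are illustrative.

```python
def max_drawdown(equity_curve):
    """Largest peak-to-trough decline, as a fraction of the running peak."""
    if not equity_curve:
        raise ValueError("equity curve is empty")
    peak = equity_curve[0]
    worst = 0.0
    for value in equity_curve:
        peak = max(peak, value)            # highest equity seen so far
        drawdown = (peak - value) / peak   # assumes equity stays positive
        worst = max(worst, drawdown)
    return worst


# Hypothetical equity curve
curve = [100, 120, 90, 110, 130, 104]
print(max_drawdown(curve))  # 0.25
```

The curve ends 4% above where it started, but at one point it was 25% below its peak, which is the figure to compare against your tolerance.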
Trade frequency
Frequency affects operational load and execution quality. A strategy that trades too often may be harder to monitor and more sensitive to costs.
Average hold time
Hold time influences exposure, session risk, and whether your execution process matches the strategy design.
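Both trade frequency and average hold time can be derived from entry and exit timestamps. A minimal sketch, assuming each trade is a pair of `datetime` values (the function name `trade_stats` and the sample trades are illustrative):

```python
from datetime import datetime, timedelta


def trade_stats(trades):
    """trades: list of (entry_time, exit_time) pairs.

    Returns (average hold time, trades per calendar day over the tested span).
    """
    hold_times = [exit_t - entry_t for entry_t, exit_t in trades]
    avg_hold = sum(hold_times, timedelta()) / len(hold_times)
    first_entry = min(entry_t for entry_t, _ in trades)
    last_exit = max(exit_t for _, exit_t in trades)
    span_days = (last_exit - first_entry).total_seconds() / 86400
    per_day = len(trades) / span_days if span_days > 0 else float("nan")
    return avg_hold, per_day


# Hypothetical trades: holds of 2h, 4h, and 3h across exactly three days
trades = [
    (datetime(2024, 1, 1, 9), datetime(2024, 1, 1, 11)),
    (datetime(2024, 1, 2, 9), datetime(2024, 1, 2, 13)),
    (datetime(2024, 1, 4, 6), datetime(2024, 1, 4, 9)),
]
avg_hold, per_day = trade_stats(trades)
print(avg_hold, per_day)
```

This counts calendar days, including days with no trades. A session-based count may be more appropriate for markets that do not trade around the clock.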
Profit factor (context, not obsession)
Profit factor (gross profit divided by gross loss) can be useful, but only when read alongside expectancy, drawdown, and sample size.
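A minimal sketch of the calculation, assuming the same kind of per-trade profit and loss list (the function name `profit_factor` and the sample values are illustrative):

```python
def profit_factor(trade_pnls):
    """Gross profit divided by gross loss (both taken as positive amounts)."""
    gross_profit = sum(p for p in trade_pnls if p > 0)
    gross_loss = -sum(p for p in trade_pnls if p < 0)
    if gross_loss == 0:
        # No losing trades: the ratio is undefined, so report infinity
        # (or 0.0 when there were no winning trades either).
        return float("inf") if gross_profit > 0 else 0.0
    return gross_profit / gross_loss


# Hypothetical sample: 200 gross profit against 120 gross loss
sample = [50, 50, 50, 50, -20, -20, -20, -20, -20, -20]
print(profit_factor(sample))
```

The sample gives a ratio of about 1.67. The same ratio could come from ten trades or ten thousand, which is why it needs the other metrics for context.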
Sample size and regime coverage
A good result on a tiny sample is not strong evidence. Include multiple market conditions where possible.
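One rough way to see why sample size matters is the standard error of the average trade outcome, which shrinks as the number of trades grows. The sketch below treats trades as independent, which real trades often are not, so read it as a guide rather than a statistical guarantee. The function name `standard_error` and the sample values are illustrative.

```python
import math
import statistics


def standard_error(trade_pnls):
    """Standard error of the mean PnL per trade (assumes independent trades)."""
    return statistics.stdev(trade_pnls) / math.sqrt(len(trade_pnls))


pattern = [50, -20, -20, 50, -20]  # hypothetical per-trade results
small = pattern * 2                # 10 trades
large = pattern * 40               # 200 trades

# Both samples have the same average outcome per trade...
print(statistics.mean(small), statistics.mean(large))
# ...but the uncertainty around that average is much larger for the small one.
print(standard_error(small), standard_error(large))
```

Here the small sample's standard error is larger than its average outcome, so the apparent edge cannot be distinguished from noise on ten trades alone.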
Common Backtesting Mistakes
1) Overfitting
Overfitting happens when rules or thresholds are tuned too closely to historical data. The strategy may look strong in test results and fail in new conditions.
For a practical review workflow that compares a baseline test against one-change retests, see Backtesting Explained: How to Test a Trading Strategy Safely.
2) Data leakage / hindsight bias
If your rule accidentally uses information that would not have been known at the time of entry, the test is invalid.
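A common source of leakage is computing a signal from a bar's close and then acting on that same bar. The sketch below shows one way to avoid it, assuming a simple moving-average rule over a list of closing prices. The rule, the function names, and the prices are illustrative and do not describe how Setup.Cash evaluates blueprints.

```python
def sma(values, window):
    """Simple moving average of the last `window` values."""
    return sum(values[-window:]) / window


def signals_without_lookahead(closes, window):
    """signals[i] is decided using only bars that closed before bar i."""
    signals = []
    for i in range(len(closes)):
        history = closes[:i]  # excludes bar i itself
        if len(history) < window:
            signals.append(None)  # not enough completed bars yet
        else:
            signals.append(history[-1] > sma(history, window))
    return signals


closes = [10, 11, 12, 11, 13]  # hypothetical closing prices
signals = signals_without_lookahead(closes, 3)
print(signals)  # [None, None, None, True, False]

# Leakage check: changing the final close must not change the final signal.
tampered = closes[:-1] + [999]
assert signals_without_lookahead(tampered, 3) == signals
```

The tampering check at the end is a useful habit: if altering a future value changes a past decision, the rule is reading data it could not have known.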
3) Ignoring execution assumptions
Entries, stops, and targets need realistic logic. Unrealistic fills can make a strategy look stronger than it is.
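To see how much fill assumptions matter, compare an idealized trade with one that includes slippage and fees. This is a sketch for a long trade only. The function name `net_pnl_long` and the default cost values are assumptions chosen for illustration, not Setup.Cash defaults or any broker's actual rates.

```python
def net_pnl_long(entry_price, exit_price, quantity,
                 slippage_pct=0.0005, fee_pct=0.001):
    """PnL of a long trade with adverse slippage on both fills
    and a percentage fee charged on each side."""
    entry_fill = entry_price * (1 + slippage_pct)  # buy slightly higher
    exit_fill = exit_price * (1 - slippage_pct)    # sell slightly lower
    gross = (exit_fill - entry_fill) * quantity
    fees = (entry_fill + exit_fill) * quantity * fee_pct
    return gross - fees


# Hypothetical trade: buy 10 units at 100, sell at 101
ideal = net_pnl_long(100, 101, 10, slippage_pct=0.0, fee_pct=0.0)
realistic = net_pnl_long(100, 101, 10)
print(ideal, realistic)
```

With these assumed costs, roughly 30% of the idealized profit disappears. The effect compounds for strategies with small average wins or high trade frequency.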
4) Optimizing too early
First validate that the strategy logic makes sense. Optimization before validation usually creates a fragile system.
5) Using one metric as the decision-maker
No single metric can summarize strategy quality. Use a balanced review.
Practical Review Framework (Use After Every Test)
Rule quality
Did the strategy take trades it should have taken, and avoid trades it should have avoided?
Risk quality
Did drawdown and loss size stay within your stated constraints?
Operational quality
Would you realistically be able to execute this strategy consistently in paper trading or live conditions?
Improvement priority
What is the one highest-impact change for the next test cycle?
When to Move to Paper Trading
Move from backtesting to paper trading when:
- The blueprint is stable enough to explain clearly
- Risk controls are defined and tested
- The strategy has acceptable behavior across the sample you used
- You have a checklist for execution review
Then use the paper trading landing page and the blog post Paper Trading vs Live Trading: When to Switch.
FAQ
How many backtests should I run before paper trading?
There is no fixed number. Move when the strategy logic is stable, risk is defined, and changes are becoming incremental rather than random.
Should I optimize parameters aggressively?
Usually no. Optimize carefully and document why each change exists. The goal is robustness, not a perfect historical curve.
Is a high win rate always better?
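No. Win rate without payoff structure and drawdown context can be misleading. For example, a strategy that wins 90% of its trades at 1 unit each but loses 10 units on the remaining 10% has a negative expectancy.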
Further Reading
- Backtesting Explained: How to Test a Trading Strategy Safely
- Build a Trading Bot
- Strategy Builder
- Getting Started
Not financial advice. Trading involves risk.