Backtesting: Before You Trust Any Strategy, Read This

A backtest that looks good is not the same as a backtest that means something. Here's how to tell the difference — before you put any money behind it.

June 16, 2026

Someone shows you a backtest. The equity curve goes up and to the right. The annual returns look strong. The drawdowns are manageable. It looks like evidence.

It isn’t. Not yet.

A backtest is a simulation — a set of rules applied to historical price data to show what would have happened if you’d followed those rules in the past. The output is a hypothetical track record, not a real one. That distinction matters enormously, because there are at least five systematic ways a backtest can produce results that look credible and aren’t. Every serious investor should be able to recognize them.

Core Principle

A backtest doesn’t tell you a strategy works.
It tells you the strategy would have worked in the past, on that data, with those parameters.

That’s a much weaker claim — and the distance between those two things is where most investing decisions go wrong. The job of a well-designed backtest is not to prove the strategy works. It’s to give the strategy every honest opportunity to fail before real money is at stake.

The Five Ways a Backtest Can Lie

These aren’t exotic edge cases. They’re the most common reasons that a strategy performs brilliantly in simulation and fails in practice. Each one has its own post in the Concepts library. What follows is the version you need before reading any of them.

Failure Mode	What It Means	The Question to Ask
Survivorship Bias	The backtest only included companies that still exist today. The ones that went bankrupt, got delisted, or quietly failed aren’t in the data. The universe is pre-filtered to winners before the test even begins.	Does the historical dataset include companies that no longer exist?
Look-Ahead Bias	The strategy used information that wouldn’t have been available at the time the trade was made — earnings figures released after market close, revised economic data, end-of-day prices used to generate intraday signals. The strategy knew things it couldn’t have known.	Could every data point used have been known at the exact moment the trade was executed?
Overfitting	The strategy was tuned so precisely to historical data that it’s essentially memorized the past rather than found a pattern in it. Change the parameters slightly and the results collapse. A strategy that only works in one exact configuration hasn’t found edge — it’s found a coincidence.	What happens to the results if the parameters change by 10% in either direction?
Ignoring Transaction Costs	Commissions, bid-ask spreads, and slippage are real costs that compound quickly for strategies that trade frequently. Most backtests assume a perfect fill at the closing price. Real markets don’t work that way — especially when size or speed is involved.	Were realistic execution costs applied to every trade in the simulation?
Data Snooping	The researcher tested hundreds of parameter combinations, identified the one that looked best, and presented it as if that configuration was chosen in advance. The more combinations you test, the more likely one will look good by pure chance. This is the most common failure mode — and the least acknowledged.	How many parameter combinations were tested before arriving at these results?
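
Of the five rows above, look-ahead bias is the easiest to show in code. What follows is a minimal sketch under stated assumptions, not any library’s method: a toy rule that goes long after an up day, run on made-up closing prices. The function names and the `lag` parameter are hypothetical. With `lag=0` the rule trades on the same bar that produced the signal, which is the bias. With `lag=1` it trades on the next bar, the earliest the signal could have been known.

```python
def daily_returns(closes):
    """Simple close-to-close returns; element i is the return on day i+1."""
    return [closes[i] / closes[i - 1] - 1.0 for i in range(1, len(closes))]


def strategy_return(closes, lag):
    """Total compounded return of a toy 'long after an up day' rule.

    lag=0 trades on the same bar that produced the signal (look-ahead bias).
    lag=1 trades on the next bar, the earliest the signal could be known.
    """
    rets = daily_returns(closes)
    signals = [1 if r > 0 else 0 for r in rets]  # 1 = long, 0 = flat
    equity = 1.0
    for i in range(lag, len(rets)):
        equity *= 1.0 + signals[i - lag] * rets[i]
    return equity - 1.0


# Made-up closing prices, for illustration only.
closes = [100, 102, 101, 103, 102, 104, 103, 105]

biased = strategy_return(closes, lag=0)
honest = strategy_return(closes, lag=1)
print(f"with look-ahead: {biased:.2%}, lagged one bar: {honest:.2%}")
```

On this invented series the biased version captures every up day and skips every down day, while the lagged version does the opposite. Real data will rarely be this stark, but the one-bar lag is the check to look for.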

The most dangerous backtest is the one that looks perfect. Real edge is messy. It shows losing periods, sensitivity to parameters, and results that degrade slightly when tested on data the strategy has never seen. A curve that never breaks down hasn’t been properly tested — it’s been tuned until all the inconvenient evidence disappeared.
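
One way to see why a flawless-looking result deserves suspicion is to simulate strategies that have no edge at all and then pick the best one. The sketch below is illustrative only: the daily returns are random draws with a zero mean, and the 1% daily volatility, 252 trading days, and 500 trials are assumptions chosen for the example.

```python
import random
import statistics


def random_strategy_sharpe(rng, n_days=252):
    """Annualized Sharpe ratio of a strategy with zero true edge.

    Daily returns are drawn from a zero-mean normal distribution,
    so any apparent skill is luck by construction.
    """
    rets = [rng.gauss(0.0, 0.01) for _ in range(n_days)]
    return statistics.mean(rets) / statistics.stdev(rets) * (252 ** 0.5)


rng = random.Random(42)  # fixed seed so the run is repeatable
sharpes = [random_strategy_sharpe(rng) for _ in range(500)]

best = max(sharpes)
typical = statistics.median(sharpes)
print(f"median Sharpe: {typical:.2f}, best of 500: {best:.2f}")
```

The median sits near zero, as it should for strategies with no edge, while the best of the 500 looks like a strong result. Presenting only that winner is the data snooping failure described in the table.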

What a Trustworthy Backtest Includes

No backtest eliminates all uncertainty. But a well-constructed one makes the uncertainty visible rather than hiding it. When evaluating any strategy — including every Lab we publish here — look for three things.

Out-of-sample results. The backtest should be split: one portion used to develop the strategy, a separate portion held back and tested afterward. If the results only hold on the data the strategy was built on, that’s not validation — it’s circular reasoning.
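
As a minimal sketch of that split, assuming the data is already sorted oldest to newest: the function name, the 70/30 ratio, and the sample rows are hypothetical choices for illustration. The one rule that matters is that the split is chronological and never shuffled.

```python
def chronological_split(rows, in_sample_fraction=0.7):
    """Split time-ordered rows into (in_sample, out_of_sample).

    The split is by position, never shuffled, so every out-of-sample
    row is later in time than every in-sample row.
    """
    if not 0.0 < in_sample_fraction < 1.0:
        raise ValueError("in_sample_fraction must be between 0 and 1")
    cut = int(len(rows) * in_sample_fraction)
    return rows[:cut], rows[cut:]


# Made-up (date, close) rows, already sorted oldest to newest.
prices = [(f"2020-01-{d:02d}", 100 + d) for d in range(1, 11)]

develop, holdout = chronological_split(prices)
print(len(develop), "rows to build on,", len(holdout), "rows held back")
```

The held-back rows should be touched once, after the rules are fixed. Every extra round of tuning against them turns out-of-sample data back into in-sample data.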

Realistic transaction costs. Every trade should carry a friction estimate that reflects what execution actually costs — not a theoretical best-case fill. Strategies that look profitable before costs and unprofitable after them have no edge. They have overhead.
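
The arithmetic is simple enough to sketch. The example below assumes a 12% gross annual return and an all-in cost of 0.10% per trade; both numbers are illustrative, not measured. Costs are subtracted linearly, which ignores compounding but is enough for a first sanity check.

```python
def net_annual_return(gross_annual_return, trades_per_year, cost_per_trade):
    """Approximate annual return after friction.

    cost_per_trade is the assumed all-in cost of one trade (commission,
    half-spread, slippage) as a fraction of traded value. Costs are
    subtracted linearly, which ignores compounding.
    """
    return gross_annual_return - trades_per_year * cost_per_trade


# Illustrative assumptions: 12% gross, 0.10% all-in cost per trade.
for trades in (10, 100, 250):
    net = net_annual_return(0.12, trades, 0.001)
    print(f"{trades:>3} trades/year -> net {net:.1%}")
```

Under these assumptions, ten trades a year cost about one percentage point, while 250 trades a year turn a 12% gross return into a loss.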

Parameter stability. The results should hold up if the inputs shift slightly. If a two-day difference in a moving average window changes the outcome from profitable to catastrophic, the strategy hasn’t found a real pattern. It’s found a local artifact in one specific dataset.
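
A stability check along those lines might look like the sketch below. Everything in it is a hypothetical stand-in: the two toy backtest functions replace a real simulation, and the rule that each neighboring window must keep at least half of the base return is an arbitrary threshold chosen for the example.

```python
def stability_report(backtest, base_window, shifts=(-2, -1, 0, 1, 2)):
    """Run a backtest at the chosen window and at nearby windows.

    backtest is any callable mapping a window length to a return.
    Returns a dict of window -> return so the neighbors can be compared.
    """
    return {base_window + s: backtest(base_window + s) for s in shifts}


def is_stable(report, base_window, tolerance=0.5):
    """True if every neighbor keeps at least `tolerance` of the base return."""
    base = report[base_window]
    return base > 0 and all(r >= tolerance * base for r in report.values())


# Hypothetical stand-ins for real backtests, for illustration only.
def robust_backtest(window):
    return 0.10 - 0.002 * abs(window - 50)  # degrades gently away from 50


def fragile_backtest(window):
    return 0.22 if window == 50 else -0.05  # works at exactly one setting


for name, bt in (("robust", robust_backtest), ("fragile", fragile_backtest)):
    report = stability_report(bt, base_window=50)
    print(name, "stable" if is_stable(report, 50) else "unstable")
```

The fragile strategy has the higher headline number and fails the check, because its result exists at one setting only.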

Consider a strategy that returns 18% annually on the data it was built from and 9% on data it has never seen. That’s not a failure; it’s an honest result. The degradation is expected, and the strategy may still be worth pursuing. Now consider a strategy that returns 22% on familiar data and loses money on new data. That’s not a strategy. That’s a historical artifact dressed up as one.

The gap between those two scenarios is what out-of-sample testing reveals. Without it, you have no way of knowing which one you’re looking at.

These are the minimum conditions for taking a backtest seriously. They don’t guarantee the strategy will work going forward — nothing does. But their absence is a reliable signal that the results aren’t worth trusting.

This post is the entry point for the Concepts library. Each of the five failure modes above has its own deep-dive post. Every Lab verdict published on this site links back to this framework — because a verdict means nothing without knowing how to evaluate the methodology behind it.
