The Problem With Most Trading Research
There is no shortage of trading strategies on the internet. Momentum plays, mean reversion setups, options flow signals, machine learning overlays. Forums, newsletters, YouTube channels, and paid communities all publish their edges, usually with impressive backtest charts attached.
Look closer and a pattern emerges. The results that get published are the ones that worked. The strategies that failed quietly disappear. The methodology is vague or absent entirely. The backtest window was chosen after the fact. The parameters were tuned to the data. The results couldn’t survive out-of-sample testing because out-of-sample testing was never actually run.
This is not bad faith. It’s just what happens when there’s no structural incentive to publish failure. When your reputation is built on looking right, you publish when you look right. The research record that accumulates isn’t a record of what works — it’s a record of what looked good at the time of publishing.
“Most backtests are optimized to look good, not to tell the truth. The difference between those two objectives produces radically different research.”
Code Assassins was built to operate differently in this environment. Not because transparency is a marketing angle, but because a research lab that only publishes wins isn’t actually doing research; it’s doing PR.
What Code Assassins Actually Is
Code Assassins is an AI-powered trading research lab. We build systematic strategies, backtest them rigorously, run them in paper trading, and publish the full verdict — hypothesis, methodology, results, and honest assessment — regardless of outcome.
We are not a signal service. We do not tell you what to buy or sell. We are not a newsletter delivering “actionable setups.” We are a research operation, and the product is documented, reproducible research that any serious practitioner can interrogate and build on.
Every Lab we run follows the same six-stage pipeline:
- 01 Hypothesis & Data Pull: Define the exact question. Pull clean, point-in-time data. Document assumptions before writing a line of code.
- 02 Backtest Engine: VectorBT for research-speed iteration. Backtrader for execution modeling.
- 03 Results Storage: Every trade, every metric, every equity curve, stored in full.
- 04 Lab Post Builder: The written research document. Hypothesis, methodology, results, and a preliminary read on what the data is actually saying.
- 05 Live Paper Trading: The strategy runs in simulated live conditions before any verdict is issued. Backtest performance and live performance are compared directly.
- 06 Publish the Verdict: Pass, fail, or inconclusive, with full reasoning. The verdict publishes when the evidence is sufficient, not when the results look impressive.
The pipeline is not ceremonial. It’s the mechanism that makes results trustworthy. Skip a stage and the research degrades. Run all six and you have something worth publishing.
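To make stages 05 and 06 concrete, here is a minimal sketch of how a backtest and a paper-trading run could be compared and turned into a verdict. It is an illustration under stated assumptions, not the Code Assassins implementation: the names `sharpe`, `LabResult`, and `issue_verdict` are hypothetical, the thresholds (`min_paper_obs`, `min_sharpe`, `max_decay`) are placeholders, and Sharpe ratio is only one of many metrics a real verdict would weigh.

```python
import math
from dataclasses import dataclass
from enum import Enum
from statistics import mean, stdev


class Verdict(Enum):
    PASS = "pass"
    FAIL = "fail"
    INCONCLUSIVE = "inconclusive"


def sharpe(returns, periods_per_year=252):
    """Annualized Sharpe ratio of per-period returns (risk-free rate assumed 0)."""
    if len(returns) < 2:
        raise ValueError("need at least two returns")
    volatility = stdev(returns)
    if volatility == 0:
        raise ValueError("zero volatility; Sharpe ratio is undefined")
    return mean(returns) / volatility * math.sqrt(periods_per_year)


@dataclass(frozen=True)
class LabResult:
    backtest_returns: list  # per-period returns from the backtest
    paper_returns: list     # per-period returns from paper trading


def issue_verdict(result, min_paper_obs=60, min_sharpe=1.0, max_decay=0.5):
    """Hypothetical verdict rule; every threshold here is an assumption."""
    # Too little paper-trading evidence: do not force a pass or a fail.
    if len(result.paper_returns) < min_paper_obs:
        return Verdict.INCONCLUSIVE

    backtest_sharpe = sharpe(result.backtest_returns)
    paper_sharpe = sharpe(result.paper_returns)

    # The backtest itself has to clear a minimum bar.
    if backtest_sharpe < min_sharpe:
        return Verdict.FAIL
    # Paper performance may not decay by more than max_decay of the backtest.
    if paper_sharpe < backtest_sharpe * (1.0 - max_decay):
        return Verdict.FAIL
    return Verdict.PASS
```

The ordering is the point of the sketch: the sample-size check comes first, so a short paper-trading run returns inconclusive rather than a verdict the evidence cannot support.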
Why We’re Doing This in Public
The research was happening regardless. The question was what to do with it.
Publishing in public forces a discipline that private research doesn’t require. When you know the methodology will be read by people who will try to break it, you write the methodology carefully. When you know the verdict will be public, you don’t quietly shelve the experiments that failed. When reproducibility is the standard, you build the infrastructure to support it.
Accountability creates rigor. That’s the entire argument for doing this in public rather than in a private notebook.
There’s a secondary reason worth naming directly: the systematic trading community deserves better reference material. The existing public literature on backtesting and strategy development is dominated by either academic papers written without practitioners in mind, or retail content optimized for engagement rather than accuracy. The middle ground — rigorous, readable, and honest about failure — is thin.
The Research Infrastructure
The infrastructure behind Code Assassins is built for one thing: producing research results you can trust. We use Python throughout, with industry-standard backtesting tools that separate signal generation from order execution. That separation matters: mixing the two, for instance by filling an order at the same price that generated the signal, is one of the most common sources of contamination in backtests that look great on paper but fail in live conditions.
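The separation of signal generation from order execution can be sketched in plain Python. This is a toy illustration, not the lab's engine and not the VectorBT or Backtrader API: `generate_signals` and `execute_next_open` are hypothetical names, the momentum rule and `lookback` value are arbitrary, and the cost model is a flat fraction per position change. The signal for bar t reads only data up to that bar's close, and the fill happens at the next bar's open.

```python
def generate_signals(closes, lookback=3):
    """Decide a target position at each bar's close: 1 = long, 0 = flat.

    signals[t] reads only closes[0..t], so it never sees future prices.
    """
    signals = []
    for t, close in enumerate(closes):
        if t < lookback:
            signals.append(0)  # not enough history yet
        else:
            signals.append(1 if close > closes[t - lookback] else 0)
    return signals


def execute_next_open(signals, opens, cost_per_trade=0.0):
    """Fill the signal from bar t at the open of bar t+1.

    Returns open-to-open returns for each holding interval, net of a flat
    fractional cost charged whenever the position changes.
    """
    returns = []
    prev_pos = 0
    for t in range(len(opens) - 2):
        pos = signals[t]
        # Position decided at close of bar t is held from open[t+1] to open[t+2].
        gross = pos * (opens[t + 2] / opens[t + 1] - 1.0)
        cost = cost_per_trade * abs(pos - prev_pos)
        returns.append(gross - cost)
        prev_pos = pos
    return returns


# Toy price data, invented for illustration only.
opens = [100.0, 101.0, 102.0, 103.0, 104.0, 105.0]
closes = [100.5, 101.5, 102.5, 103.5, 104.5, 105.5]
signals = generate_signals(closes)
strategy_returns = execute_next_open(signals, opens)
```

Because the execution function only ever receives finished signals, there is no path by which the fill price can leak back into the decision that produced it.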
The stack is simple by design. Complexity in research infrastructure is a tax on research quality. Every hour spent managing tooling is an hour not spent on the work that actually matters.
What’s Coming
Before the first Lab drops, we’re building the conceptual foundation the research rests on. Not prerequisites — reference material. Posts that exist so Lab writeups can stay focused on results rather than stopping to define standard deviation mid-verdict.
The opening posts cover the concepts every serious systematic trader needs locked down: why most backtests fail before they start, how to actually measure risk, and how to evaluate whether a trade is worth taking in the first place.
After that, the bias series — a complete taxonomy of the ways rigorous-looking research produces wrong conclusions. Each one written as a reference, not a content hit. Dense enough to be useful. Readable enough to finish.
The first Lab verdict follows. Whatever the data says, that’s what we publish.
The Standard We’re Holding Ourselves To
Every result we publish will be reproducible. Every verdict will include the full methodology, not a summary of it. Every failed experiment will publish alongside every successful one. If a strategy can’t survive scrutiny, the right answer is to say so.
This is not a difficult standard to articulate. It’s a difficult standard to maintain when the results are bad, the methodology has a flaw you didn’t catch until after publishing, or the experiment that took three weeks to build comes back inconclusive.
We’re publishing anyway. That’s the whole point.