I built a quant engine based on 20 years of OOS data. Tear my methodology apart.

I’ve spent the last year trying to automate Wyckoff institutional accumulation logic alongside a mean-reversion engine. I just finished a 20-year validation run covering 2006 to 2026, and I’m looking for honest peer review from people who actually know how to code and backtest.

The basic stats for 2006 to 2026:
Total Signals: 18,808 (all out of sample)
Combined CAGR: 12.55 percent (gross of the subscription fee, but net of 10 bps for slippage and costs)
Max Drawdown: 32.04 percent (it survived 2008 and 2020 without blowing up)
Sharpe: 0.729
Alpha: 0.509 percent per signal (based on a Carhart 4-factor regression)
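For reference, the per-signal Carhart alpha quoted above is just the intercept of an OLS regression of the strategy's excess returns on the four factors (Mkt-RF, SMB, HML, MOM). A minimal sketch, with synthetic data standing in for the signal and factor series I uploaded:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000  # hypothetical number of per-signal return observations

# Synthetic stand-ins for the four Carhart factors and the strategy's
# excess returns; the real series would come from the uploaded data.
factors = rng.normal(0, 0.01, size=(n, 4))
betas_true = np.array([0.8, 0.1, -0.05, 0.2])
excess_ret = 0.005 + factors @ betas_true + rng.normal(0, 0.02, n)

# OLS: regress excess returns on an intercept plus the four factors.
# The fitted intercept is the (per-signal) Carhart alpha.
X = np.column_stack([np.ones(n), factors])
coef, *_ = np.linalg.lstsq(X, excess_ret, rcond=None)
alpha, betas = coef[0], coef[1:]
print(f"alpha per signal: {alpha:.4%}")
```

With real data you'd also want the intercept's t-statistic, not just its point estimate.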

How I tried to keep it honest:

  1. Survivorship Bias: The universe includes 412 delisted stocks. If a company went bankrupt in 2008, it is in the data.
  2. Out of Sample: I used a walk-forward framework, training on 2006 to 2015 and testing on 2016 to 2025.
  3. No Black Box: It is based on Wyckoff principles like accumulation ranges and springs. It just tracks the volume and price action where big money leaves footprints.
  4. Math: I applied a Bonferroni correction and a block bootstrap to make sure the win rate isn’t just a lucky streak.
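For concreteness, the single 10-year-train / 10-year-test split in point 2 can be written as a degenerate case of a rolling walk-forward generator. This is a sketch of the splitting logic only, not my actual framework:

```python
def walk_forward_splits(start_year, end_year, train_years, test_years):
    """Yield (train_range, test_range) year tuples for a rolling walk-forward.

    Each window trains on `train_years` years, tests on the next
    `test_years` years, then rolls forward by the test span.
    """
    y = start_year
    while y + train_years + test_years - 1 <= end_year:
        train = (y, y + train_years - 1)
        test = (y + train_years, y + train_years + test_years - 1)
        yield train, test
        y += test_years  # roll the window forward by the test span

# The single 10y-train / 10y-test split described above:
splits = list(walk_forward_splits(2006, 2025, 10, 10))
print(splits)  # [((2006, 2015), (2016, 2025))]
```

Shorter test spans (e.g. 2 years) would give several anchored re-trainings instead of one fixed split, which is a stronger version of the same test.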
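Since the engine itself stays proprietary, here is only a toy illustration of what a volume-confirmed "spring" flag could look like in the spirit of point 3. The rule, the `lookback`, and the `vol_mult` threshold are all my assumptions, not the engine's logic:

```python
import numpy as np

def spring_signals(low, close, volume, lookback=20, vol_mult=1.5):
    """Toy Wyckoff-style 'spring' flag: price undercuts the recent
    trading-range low on elevated volume, then closes back inside
    the range. Purely illustrative, not the author's engine.
    """
    low = np.asarray(low)
    close = np.asarray(close)
    volume = np.asarray(volume)
    flags = np.zeros(len(close), dtype=bool)
    for t in range(lookback, len(close)):
        range_low = low[t - lookback:t].min()   # support of the range
        avg_vol = volume[t - lookback:t].mean()
        undercut = low[t] < range_low            # stop-run below support
        recovered = close[t] > range_low         # close back inside the range
        heavy = volume[t] > vol_mult * avg_vol   # the volume footprint
        flags[t] = undercut and recovered and heavy
    return flags
```

The point of the example is only that such rules are mechanical and auditable, which is what "no black box" should mean in practice.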
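A circular block bootstrap along the lines of point 4 might look like the sketch below; the block length, replication count, and the 55 percent hit rate are hypothetical placeholders, since resampling contiguous blocks (rather than single trades) is what preserves any serial dependence between consecutive signals:

```python
import numpy as np

def block_bootstrap_winrate(wins, block_len=20, n_boot=2000, seed=0):
    """Circular block bootstrap 95% CI for a win/loss series."""
    rng = np.random.default_rng(seed)
    wins = np.asarray(wins)
    n = len(wins)
    n_blocks = int(np.ceil(n / block_len))
    stats = np.empty(n_boot)
    for b in range(n_boot):
        starts = rng.integers(0, n, size=n_blocks)
        # Each block is block_len consecutive trades, wrapping at the end.
        idx = (starts[:, None] + np.arange(block_len)) % n
        stats[b] = wins[idx].ravel()[:n].mean()
    return np.percentile(stats, [2.5, 97.5])

# Hypothetical win/loss flags for 18,808 signals with a 55% hit rate:
rng = np.random.default_rng(1)
wins = rng.random(18808) < 0.55
lo, hi = block_bootstrap_winrate(wins)
print(f"95% CI for win rate: [{lo:.3f}, {hi:.3f}]")
```

If the bootstrapped CI still excludes the break-even win rate after the Bonferroni-adjusted threshold, the lucky-streak explanation gets much weaker.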

The Catch:
The 12.55 percent is gross of subscription costs. On a small 10k account, the fees will eat a huge chunk of your gains. The system only starts to beat the benchmarks once your capital is large enough that the fixed overhead stops mattering.
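The break-even point is simple arithmetic: a flat fee is a fixed drag of fee/capital per year. The post doesn't state the fee, so the $100/month figure below is purely hypothetical:

```python
# Back-of-the-envelope drag from a flat subscription fee.
# The actual fee isn't stated here; $100/month is a hypothetical stand-in.
annual_fee = 100 * 12          # hypothetical flat cost, $/year
gross_cagr = 0.1255            # reported gross-of-fee CAGR

for capital in (10_000, 50_000, 250_000):
    drag = annual_fee / capital    # fee as a fraction of capital
    net = gross_cagr - drag        # approximate net annual return
    print(f"${capital:>7,}: fee drag {drag:.2%}, net ~{net:.2%}")
```

Under that assumption a 10k account gives up 12 percent a year to the fee alone, i.e. essentially the entire gross return, which is exactly the point being made.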

What am I missing?
I’m looking for holes in the logic. I uploaded the full validation suite, signal data, and factor data to GitHub so anyone can reproduce these numbers. I’m not sharing the proprietary source code for the engine itself, but all the outputs are there to be checked.

GitHub for verification: https://github.com/signal-validation/krentium

submitted by /u/PracticalOil9183 to r/algotrading