Built a full pairs trading system on the Russell 3000 using a 4-layer Transformer Encoder for supervised trade signal classification. Discovered and eliminated three compounding biases that had inflated the backtested return to 617%; with all three removed, the return drops to 3.97%. Current unbiased result: +3.97% total return, 0.91 Sharpe, -19.3% max drawdown over 2023–2025. The strategy has no edge in its current form; this write-up documents why, and what would fix it.
| Component | Implementation |
|---|---|
| Universe | Russell 3000 (~3,556 symbols) |
| Pair Selection | Engle-Granger cointegration + Hurst exponent + half-life filter (4–120 days) |
| Signal Model | 4-layer Transformer Encoder (8-head attention, d_model=128) — supervised binary classification |
| Entry Signal | Z-score |
| Exit Signal | Z-score |
| Position Sizing | Kelly-inspired: base 4%, max 10%, scaled by volatility + profit protection |
| Risk Manager | Drawdown kill switch (15%), single-day circuit breaker (5%), intraday pause (-3%) |
| Costs | Bid-ask spread + commission + market impact (total cost impact: 60% of capital on 5,600 trades) |
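
The half-life filter in the pair-selection row can be illustrated with a minimal sketch. It assumes the half-life is estimated from an AR(1)-style regression of the spread's daily change on its lagged level, using the common approximation `-ln(2) / b`; the project's actual estimator is not shown here, so the function names `half_life` and `passes_half_life_filter` are hypothetical. Only the 4–120 day bounds come from the table above.

```python
import math
from typing import Sequence


def half_life(spread: Sequence[float]) -> float:
    """Estimate the mean-reversion half-life of a spread, in bars.

    Fits delta_t = a + b * spread_{t-1} by ordinary least squares and
    applies the approximation half_life = -ln(2) / b.
    Returns infinity when the fit shows no mean reversion (b >= 0).
    """
    lagged = list(spread[:-1])
    delta = [curr - prev for prev, curr in zip(spread[:-1], spread[1:])]
    n = len(lagged)
    mean_x = sum(lagged) / n
    mean_y = sum(delta) / n
    var_x = sum((x - mean_x) ** 2 for x in lagged)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lagged, delta))
    if var_x == 0:
        return math.inf  # constant spread: slope is undefined
    b = cov_xy / var_x  # OLS slope on the lagged spread
    if b >= 0:
        return math.inf
    return -math.log(2) / b


def passes_half_life_filter(hl: float, lo: float = 4.0, hi: float = 120.0) -> bool:
    """Keep pairs whose half-life falls inside the 4-120 day window."""
    return lo <= hl <= hi


# Illustrative input only: a noise-free spread that decays by 10% per bar.
demo_spread = [0.9 ** t for t in range(200)]
demo_hl = half_life(demo_spread)
print(f"half-life: {demo_hl:.2f} bars, kept: {passes_half_life_filter(demo_hl)}")
```

On the synthetic series the slope is -0.1, so the estimate is about 6.9 bars and the pair would pass the filter. Real spreads are noisy, so the estimate should be treated as approximate.
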

All three biases were eliminated before this run.
| Metric | Value |
|---|---|
| Total Return (2023–2025) | +3.97% |
| Annualized Return | ~1.6% |
| Sharpe Ratio | 0.91 |
| Max Drawdown | -19.29% |
| Total Trades | 416 |
| Win Rate | 57.69% |
| Profit Factor | 1.01 |

Interpretation: A profit factor of 1.01 means gross profits barely exceed gross losses, so the strategy is essentially break-even after transaction costs. It shows no real edge on out-of-sample data in its current form.
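
The arithmetic behind these figures can be checked with a short sketch. Two assumptions are made that the results table does not state: the test window is taken as roughly 2.5 years (which reproduces the ~1.6% annualized figure), and every trade is counted as either a win or a loss. The function names are illustrative, not taken from the project code.

```python
import math
from typing import Sequence


def annualized_return(total_return: float, years: float) -> float:
    """Compound annual growth rate implied by a total return over `years`."""
    return (1.0 + total_return) ** (1.0 / years) - 1.0


def profit_factor(trade_pnls: Sequence[float]) -> float:
    """Gross profit divided by gross loss across closed trades."""
    gross_profit = sum(p for p in trade_pnls if p > 0)
    gross_loss = -sum(p for p in trade_pnls if p < 0)
    return math.inf if gross_loss == 0 else gross_profit / gross_loss


def implied_payoff_ratio(pf: float, win_rate: float) -> float:
    """Average win / average loss implied by a profit factor and win rate.

    Follows from pf = (win_rate * avg_win) / ((1 - win_rate) * avg_loss),
    assuming no break-even trades.
    """
    return pf * (1.0 - win_rate) / win_rate


# Reported values: +3.97% total return, 57.69% win rate, profit factor 1.01.
# ASSUMPTION: 2.5 years of out-of-sample data; the exact dates are not stated.
ann = annualized_return(0.0397, 2.5)
payoff = implied_payoff_ratio(1.01, 0.5769)
print(f"annualized: {ann:.2%}, avg win / avg loss: {payoff:.2f}")
```

Under those assumptions, a 57.69% win rate combined with a 1.01 profit factor implies that the average winning trade is only about three quarters the size of the average losing trade, which is consistent with the break-even reading above.
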