fix(s6): suspend S6 sizing coefficients — point-in-time overfitting

The S6 multipliers (B3→2.0×, B6→1.5×, etc.) were derived from the ~600-trade
window ending 2026-04-19. The 100+ trades since that window show a regime
reversal: bucket PnL rankings did not hold out-of-sample. Locking per-bucket
performance from a single historical window into operational sizing is
overfitting, whichever window is chosen.

Changes:
- green.yml: s6_size_table → null (uniform 1.0× sizing until coefficients
  demonstrate multi-window out-of-sample stability)
- s6_table_path commented out (same reason)
- B4 ban RETAINED: structural exclusion (only gross-negative bucket,
  -$100 gross before fees, R:R 0.80, WR 34.8%) not a time-window call
- AEM MAE_MULT_BY_BUCKET RETAINED: grounded in asset vol characteristics,
  not point-in-time PnL

Infrastructure (routing layer, recompute script, Prefect flow) fully intact.
Re-enable: set s6_table_path or populate s6_size_table once recompute_s6
shows the coefficients are stable out-of-sample across multiple windows
(variance guard: < 20% bucket PnL move between windows).

Post-mortem note added to CRITICAL_ASSET_PICKING_BRACKETS_VS._ROI_WR_AT_TRADES.md
including system naming clarification (old GREEN vs new-GREEN vs BLUE).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
hjnormey
2026-04-25 14:16:24 +02:00
parent aac4484c0f
commit ea65e2d699
2 changed files with 44 additions and 17 deletions


@@ -433,3 +433,25 @@ The instinct toward S6 for diversification is correct, but the reason is more pr
6. **MEDIUM — B6 validation:** 38 trades, 55% WR, gross +$1,394. Small sample. If B6 validates over next 100 trades, increase multiplier from 1.5× toward 2.0×.
7. **FUTURE — B5 rehabilitation:** B5 has the highest gross alpha of the "fee-loser" buckets (+$2,087 gross, 133 trades). Once fee reduction is addressed (item 3), B5 sizing should be revisited upward from 0.5× toward 1.0×.
---
## ⚠ Post-Mortem: S6 Coefficients Suspended (2026-04-22)
**Diagnosis:** The S6 sizing table above (B3→2.0×, B6→1.5×, etc.) was derived from the ~600-trade window ending 2026-04-19. The 100+ trades since that window have shown a regime reversal: bucket performance rankings have not held out-of-sample. **Applying fixed point-in-time bucket PnL rankings as operational sizing multipliers is overfitting.** The research window captured a real signal, but no single window is sufficient to justify permanent coefficient locks.
**What this changes:**
- `s6_size_table` is set to `null` in `prod/configs/green.yml` — all non-B4 buckets size uniformly at 1.0× until the signal demonstrates multi-window stability.
- **B4 ban is retained.** B4 is the only gross-negative bucket (-$100 before fees). That is a structural characteristic of large-cap low-vol assets with moderate BTC correlation, not a point-in-time performance artifact.
- **AEM per-bucket MAE_MULT table is retained.** Those values are grounded in asset volatility characteristics (B3 naturally rides past 3.5×ATR before TP; B6 extreme vol), not historical PnL.
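The sizing behaviour described in the list above can be sketched as follows. This is a minimal illustration, not the routing layer's actual code: the function name `size_multiplier`, the string bucket labels, and the choice to model the B4 ban as a 0.0 multiplier are all assumptions made for the example.

```python
from typing import Mapping, Optional

# Assumption: buckets are labelled "B0".."B6" and a ban is modelled as 0.0 size.
BANNED_BUCKETS = frozenset({"B4"})


def size_multiplier(
    bucket: str,
    size_table: Optional[Mapping[str, float]] = None,
) -> float:
    """Return the sizing multiplier for a bucket (hypothetical helper).

    size_table=None mirrors `s6_size_table: null` in green.yml:
    every non-banned bucket sizes uniformly at 1.0x.
    """
    if bucket in BANNED_BUCKETS:
        # The ban takes precedence over any table entry.
        return 0.0
    if size_table is None:
        # Coefficients suspended: uniform sizing.
        return 1.0
    # Buckets missing from a populated table fall back to 1.0x.
    return size_table.get(bucket, 1.0)
```

With the table suspended, `size_multiplier("B3")` returns 1.0 rather than the earlier 2.0, while `size_multiplier("B4")` stays at 0.0 whether or not a table is supplied.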
**What has not changed:**
- The bucket *assignments* (KMeans k=7) remain valid. Overfitting was in the *sizing coefficients*, not the asset groupings.
- All infrastructure (routing layer, recompute script, Prefect flow) is intact and ready to re-enable once out-of-sample validation is possible.
**Re-enable path:** When `recompute_s6_coefficients.py` has run across multiple non-overlapping windows and the variance guard consistently passes (< 20% bucket PnL move between windows), the coefficients can be re-applied by setting `s6_table_path` in `green.yml`.
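One way to read the variance guard is sketched below. The exact definition used by `recompute_s6_coefficients.py` is not shown in this document, so the sketch assumes the guard compares each bucket's PnL between consecutive non-overlapping windows and requires the relative move to stay under 20%. The function name `passes_variance_guard` and the input shape are hypothetical.

```python
from typing import Mapping, Sequence


def passes_variance_guard(
    windows: Sequence[Mapping[str, float]],
    max_move: float = 0.20,
) -> bool:
    """Check bucket PnL stability across consecutive windows (hypothetical).

    `windows` holds one {bucket: pnl} mapping per non-overlapping window,
    in chronological order. Assumed rule: every bucket's PnL must move by
    less than `max_move` relative to the previous window.
    """
    if len(windows) < 2:
        # A single window cannot demonstrate out-of-sample stability.
        return False
    for prev, curr in zip(windows, windows[1:]):
        for bucket, prev_pnl in prev.items():
            if bucket not in curr or prev_pnl == 0:
                # Missing data or an undefined relative move fails the guard.
                return False
            move = abs(curr[bucket] - prev_pnl) / abs(prev_pnl)
            if move >= max_move:
                return False
    return True
```

Under this reading, a ranking reversal such as a bucket going from +$1,000 to -$200 between windows fails immediately (a 120% move), which is the situation that motivated the suspension.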
**Note on system naming (2026-04-22 clarification):**
- "GREEN" (old) = Python-based standalone, now running as BLUE semi-shadow. Profit data cited in this doc is from this system.
- "new-GREEN" = Nautilus Trader-based system (sprint target). Pre-S6 and therefore unaffected by the coefficient issue.
- BLUE and old-GREEN remain in profit pre-S6. The overfit risk is specific to new-GREEN if S6 coefficients were applied.