fix(s6): suspend S6 sizing coefficients — point-in-time overfitting

The S6 multipliers (B3→2.0×, B6→1.5×, etc.) were derived from the ~600-trade window ending 2026-04-19. The 100+ trades since that window show a regime reversal: bucket PnL rankings did not hold out-of-sample. Locking per-bucket performance from any single point-in-time window into operational sizing is overfitting.

Changes:
- green.yml: s6_size_table → null (uniform 1.0× sizing until coefficients demonstrate multi-window out-of-sample stability)
- s6_table_path commented out (same reason)
- B4 ban RETAINED: structural exclusion (only gross-negative bucket, -$100 gross before fees, R:R 0.80, WR 34.8%), not a time-window call
- AEM MAE_MULT_BY_BUCKET RETAINED: grounded in asset vol characteristics, not point-in-time PnL

Infrastructure (routing layer, recompute script, Prefect flow) fully intact.

Re-enable: set s6_table_path or populate s6_size_table once recompute_s6 passes the variance guard (< 20%) across multiple out-of-sample windows.

Post-mortem note added to CRITICAL_ASSET_PICKING_BRACKETS_VS._ROI_WR_AT_TRADES.md, including system naming clarification (old GREEN vs new-GREEN vs BLUE).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-25 14:16:24 +02:00
parent aac4484c0f
commit ea65e2d699
2 changed files with 44 additions and 17 deletions
--- a/prod/docs/CRITICAL_ASSET_PICKING_BRACKETS_VS._ROI_WR_AT_TRADES.md
+++ b/prod/docs/CRITICAL_ASSET_PICKING_BRACKETS_VS._ROI_WR_AT_TRADES.md
@@ -433,3 +433,25 @@ The instinct toward S6 for diversification is correct, but the reason is more pr
 6. **MEDIUM — B6 validation:** 38 trades, 55% WR, gross +$1,394. Small sample. If B6 validates over next 100 trades, increase multiplier from 1.5× toward 2.0×.

 7. **FUTURE — B5 rehabilitation:** B5 has the highest gross alpha of the "fee-loser" buckets (+$2,087 gross, 133 trades). Once fee reduction is addressed (item 3), B5 sizing should be revisited upward from 0.5× toward 1.0×.
+
+---
+
+## ⚠ Post-Mortem: S6 Coefficients Suspended (2026-04-22)
+
+**Diagnosis:** The S6 sizing table above (B3→2.0×, B6→1.5×, etc.) was derived from the ~600-trade window ending 2026-04-19. The 100+ trades since that window have shown a regime reversal: bucket performance rankings have not held out-of-sample. **Applying fixed point-in-time bucket PnL rankings as operational sizing multipliers is overfitting.** The research window captured a real signal, but any single window is insufficient to justify permanent coefficient locks.
+
+**What this changes:**
+- `s6_size_table` is set to `null` in `prod/configs/green.yml` — all non-B4 buckets size uniformly at 1.0× until the signal demonstrates multi-window stability.
+- **B4 ban is retained.** B4 is the only gross-negative bucket (-$100 before fees). That is a structural characteristic of large-cap low-vol assets with moderate BTC correlation, not a point-in-time performance artifact.
+- **AEM per-bucket MAE_MULT table is retained.** Those values are grounded in asset volatility characteristics (B3 naturally rides past 3.5×ATR before TP; B6 has extreme volatility), not historical PnL.
+
+**What's not changed:**
+- The bucket *assignments* (KMeans k=7) remain valid. Overfitting was in the *sizing coefficients*, not the asset groupings.
+- All infrastructure (routing layer, recompute script, Prefect flow) is intact and ready to re-enable once out-of-sample validation is possible.
+
+**Re-enable path:** When `recompute_s6_coefficients.py` has run across multiple non-overlapping windows and the variance guard consistently passes (< 20% bucket PnL move between windows), the coefficients can be re-applied by setting `s6_table_path` in `green.yml`.
+
+**Note on system naming (2026-04-22 clarification):**
+- "GREEN" (old) = Python-based standalone, now running as BLUE semi-shadow. Profit data cited in this doc is from this system.
+- "new-GREEN" = Nautilus Trader-based system (sprint target). Pre-S6 and therefore unaffected by the coefficient issue.
+- BLUE and old-GREEN remain in profit pre-S6. The overfit risk applies only to new-GREEN, and only if the S6 coefficients were applied to it.
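
Reviewer sketch (not part of the diff): the re-enable criterion in the post-mortem, a variance guard that passes only when bucket PnL moves less than 20% between non-overlapping windows, could be checked roughly as below. This is a minimal illustration under stated assumptions, not the logic in `recompute_s6_coefficients.py`. The function names, the `WindowPnL` type, and the definition of "move" as the relative change in per-bucket PnL between consecutive windows are all hypothetical.

```python
from typing import Dict, List

# Hypothetical type: one window's gross PnL per bucket, e.g. {"B3": 120.0}.
WindowPnL = Dict[str, float]


def max_relative_move(prev: WindowPnL, curr: WindowPnL) -> float:
    """Return the largest relative per-bucket PnL change from prev to curr.

    Assumption: "move" means |curr - prev| / |prev| for each bucket.
    """
    worst = 0.0
    for bucket, prev_pnl in prev.items():
        if bucket not in curr:
            # A bucket missing from the later window cannot be validated.
            return float("inf")
        delta = abs(curr[bucket] - prev_pnl)
        if prev_pnl == 0:
            # No baseline to scale against: any change counts as unbounded.
            move = 0.0 if delta == 0 else float("inf")
        else:
            move = delta / abs(prev_pnl)
        worst = max(worst, move)
    return worst


def variance_guard_passes(windows: List[WindowPnL], threshold: float = 0.20) -> bool:
    """True only if every consecutive pair of windows stays under threshold.

    A single window can never pass: that is exactly the point-in-time
    situation the post-mortem describes as overfitting.
    """
    if len(windows) < 2:
        return False
    return all(
        max_relative_move(a, b) < threshold
        for a, b in zip(windows, windows[1:])
    )
```

Under this definition a sign flip in any bucket always fails the guard, because the relative move then exceeds 100%. Requiring at least two windows encodes the "multi-window" condition directly, so the guard cannot be satisfied by the original ~600-trade window alone.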