fix(s6): suspend S6 sizing coefficients — point-in-time overfitting

The S6 multipliers (B3→2.0×, B6→1.5×, etc.) were derived from the ~600-trade window ending 2026-04-19. ~100+ trades since that window show regime reversal: bucket PnL rankings did not hold out-of-sample. Locking historical per-bucket performance into operational sizing is overfitting at any fixed point-in-time.

Changes:
- green.yml: s6_size_table → null (uniform 1.0× sizing until coefficients demonstrate multi-window out-of-sample stability)
- s6_table_path commented out (same reason)
- B4 ban RETAINED: structural exclusion (only gross-negative bucket, -$100 gross before fees, R:R 0.80, WR 34.8%) not a time-window call
- AEM MAE_MULT_BY_BUCKET RETAINED: grounded in asset vol characteristics, not point-in-time PnL

Infrastructure (routing layer, recompute script, Prefect flow) fully intact. Re-enable: set s6_table_path or populate s6_size_table once recompute_s6 demonstrates stable multi-window out-of-sample variance (< 20% guard).

Post-mortem note added to CRITICAL_ASSET_PICKING_BRACKETS_VS._ROI_WR_AT_TRADES.md including system naming clarification (old GREEN vs new-GREEN vs BLUE).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-25 14:16:24 +02:00
parent aac4484c0f
commit ea65e2d699
2 changed files with 44 additions and 17 deletions
--- a/prod/configs/green.yml
+++ b/prod/configs/green.yml
@@ -74,25 +74,30 @@ hazelcast:
 # on GREEN, comment the corresponding key — single kill-switch per feature.
 # ──────────────────────────────────────────────────────────────────────────────

-# Strictly-zero buckets banned at the ranking layer (selector skips them so the
-# slot is handed to the next-best asset — does NOT waste capital with 0× sizing).
-# Fractional buckets (B0/B1/B5) stay tradeable via s6_size_table below.
+# B4 structurally banned at the selector (only gross-negative bucket — R:R 0.80,
+# WR 34.8%, gross PnL -$100 before fees). This is a structural exclusion, not a
+# point-in-time performance call, so it survives the overfitting concern below.
+# Slot is rerouted to next-ranked asset, not wasted with 0× sizing.
 asset_bucket_ban_set: [4]

-# Pointer to the generated S6 coefficient table. prod/scripts/recompute_s6_coefficients.py
-# regenerates this file on the configured cadence (env S6_RECOMPUTE_INTERVAL_DAYS).
-# Bucket → per-entry notional multiplier applied at esf_alpha_orchestrator.py single-site.
-s6_table_path: prod/configs/green_s6_table.yml
-
-# Inline fallback used when s6_table_path is missing (bootstrap / first-run):
-# matches the S6 row from prod/docs/CRITICAL_ASSET_PICKING_BRACKETS_VS._ROI_WR_AT_TRADES.md.
-# B4 is absent (banned above); B2 is absent (= 1.0x, no-op).
-s6_size_table:
-  0: 0.40
-  1: 0.30
-  3: 2.00
-  5: 0.50
-  6: 1.50
+# S6 PER-BUCKET SIZING MULTIPLIERS — INTENTIONALLY DISABLED (2026-04-22).
+#
+# Diagnosis: the coefficients below (B3→2.0×, B6→1.5×, etc.) were derived from
+# ~600 trades ending 2026-04-19. The ~100+ trades *since* that window show a
+# regime reversal — bucket performance has not held. Locking historical point-in-time
+# bucket PnL into operational sizing IS overfitting, regardless of how compelling
+# the research window looked.
+#
+# What stays: B4 ban above (structural, gross-negative even before fees).
+# What's disabled: all non-trivial multipliers — system sizes uniformly (1.0× = no-op)
+# until prod/scripts/recompute_s6_coefficients.py has a stable multi-window dataset
+# that demonstrates the coefficients generalise out-of-sample.
+#
+# Infrastructure (routing layer, recompute script, Prefect flow) is fully intact.
+# Re-enable by un-commenting s6_table_path OR populating s6_size_table below.
+#
+# s6_table_path: prod/configs/green_s6_table.yml   # ← un-comment to re-enable
+s6_size_table: null   # disabled — all buckets size at 1.0× until data stabilises

 # EsoF regime gate lives at the top of _try_entry (orchestrator single-site).
 # mult == 0 → regime-wide skip (no selector/sizer work). UNKNOWN replaces NEUTRAL
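
The effect of the green.yml change above can be illustrated with a minimal sketch of how a per-bucket multiplier might be resolved once `s6_size_table` is `null`. The function name `resolve_size_multiplier` and its signature are hypothetical and not taken from the repository. The real ban is applied at the selector and the multiplier at the orchestrator; the sketch merges both only to show the combined outcome.

```python
from typing import Mapping, Optional, Set


def resolve_size_multiplier(
    bucket: int,
    size_table: Optional[Mapping[int, float]],
    ban_set: Set[int],
) -> Optional[float]:
    """Hypothetical helper: return the notional multiplier for a bucket.

    Returns None when the bucket is banned (the slot goes to the next-ranked
    asset), and 1.0 when no size table is configured or the bucket has no entry.
    """
    if bucket in ban_set:
        return None  # banned: skipped by the selector, not sized at 0x
    if not size_table:
        return 1.0  # s6_size_table: null -> uniform sizing
    return float(size_table.get(bucket, 1.0))  # absent bucket = 1.0x no-op


# Values copied from the removed and added lines of this diff.
old_table = {0: 0.40, 1: 0.30, 3: 2.00, 5: 0.50, 6: 1.50}
ban_set = {4}

print(resolve_size_multiplier(3, old_table, ban_set))  # before this commit
print(resolve_size_multiplier(3, None, ban_set))       # after this commit
print(resolve_size_multiplier(4, None, ban_set))       # B4 stays banned
```

Under these assumptions, every non-banned bucket resolves to 1.0 after the change, while B4 is still excluded regardless of the table.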
--- a/prod/docs/CRITICAL_ASSET_PICKING_BRACKETS_VS._ROI_WR_AT_TRADES.md
+++ b/prod/docs/CRITICAL_ASSET_PICKING_BRACKETS_VS._ROI_WR_AT_TRADES.md
@@ -433,3 +433,25 @@ The instinct toward S6 for diversification is correct, but the reason is more pr
 6. **MEDIUM — B6 validation:** 38 trades, 55% WR, gross +$1,394. Small sample. If B6 validates over next 100 trades, increase multiplier from 1.5× toward 2.0×.

 7. **FUTURE — B5 rehabilitation:** B5 has the highest gross alpha of the "fee-loser" buckets (+$2,087 gross, 133 trades). Once fee reduction is addressed (item 3), B5 sizing should be revisited upward from 0.5× toward 1.0×.
+
+---
+
+## ⚠ Post-Mortem: S6 Coefficients Suspended (2026-04-22)
+
+**Diagnosis:** The S6 sizing table above (B3→2.0×, B6→1.5×, etc.) was derived from the ~600-trade window ending 2026-04-19. The ~100+ trades since that window have shown a regime reversal — bucket performance rankings have not held out-of-sample. **Applying fixed point-in-time bucket PnL rankings as operational sizing multipliers is overfitting.** The research window captured a real signal, but any single window is insufficient to justify permanent coefficient locks.
+
+**What this changes:**
+- `s6_size_table` is set to `null` in `prod/configs/green.yml` — all non-B4 buckets size uniformly at 1.0× until the signal demonstrates multi-window stability.
+- **B4 ban is retained.** B4 is the only gross-negative bucket (-$100 before fees). That is a structural characteristic of large-cap low-vol assets with moderate BTC correlation, not a point-in-time performance artifact.
+- **AEM per-bucket MAE_MULT table is retained.** Those values are grounded in asset volatility characteristics (B3 naturally rides past 3.5×ATR before TP; B6 extreme vol), not historical PnL.
+
+**What's not changed:**
+- The bucket *assignments* (KMeans k=7) remain valid. Overfitting was in the *sizing coefficients*, not the asset groupings.
+- All infrastructure (routing layer, recompute script, Prefect flow) is intact and ready to re-enable once out-of-sample validation is possible.
+
+**Re-enable path:** When `recompute_s6_coefficients.py` has run across multiple non-overlapping windows and the variance guard consistently passes (< 20% bucket PnL move between windows), the coefficients can be re-applied by setting `s6_table_path` in `green.yml`.
+
+**Note on system naming (2026-04-22 clarification):**
+- "GREEN" (old) = Python-based standalone, now running as BLUE semi-shadow. Profit data cited in this doc is from this system.
+- "new-GREEN" = Nautilus Trader-based system (sprint target). Pre-S6 and therefore unaffected by the coefficient issue.
+- BLUE and old-GREEN remain in profit pre-S6. The overfit risk is specific to new-GREEN if S6 coefficients were applied.
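
The re-enable path in the post-mortem depends on a variance guard (< 20% bucket PnL move between windows). One possible reading of that guard is sketched below. The function name `variance_guard_passes`, the input shape (one bucket-to-PnL mapping per non-overlapping window), and the relative-move formula are all assumptions. The actual logic in `recompute_s6_coefficients.py` is not shown in this diff and may differ.

```python
from typing import Dict, List


def variance_guard_passes(
    windows: List[Dict[int, float]],
    max_rel_move: float = 0.20,
) -> bool:
    """Hypothetical guard: True only if every bucket's PnL moves by less than
    max_rel_move (relative to the previous window) across all consecutive windows.
    """
    if len(windows) < 2:
        return False  # a single window cannot demonstrate stability
    for prev, curr in zip(windows, windows[1:]):
        for bucket in set(prev) | set(curr):
            if bucket not in prev or bucket not in curr:
                return False  # bucket missing from a window: treat as unstable
            base = abs(prev[bucket])
            if base == 0.0:
                return False  # relative move undefined: fail conservatively
            if abs(curr[bucket] - prev[bucket]) / base >= max_rel_move:
                return False
    return True


# Illustrative numbers only, not taken from the trade data.
stable = [{3: 100.0, 6: 50.0}, {3: 110.0, 6: 45.0}, {3: 105.0, 6: 48.0}]
reversal = [{3: 100.0, 6: 50.0}, {3: -40.0, 6: 10.0}]

print(variance_guard_passes(stable))
print(variance_guard_passes(reversal))
```

The guard fails closed by design: a missing bucket, a zero baseline, or a single window all count as "not yet stable", which matches the commit's intent of keeping uniform sizing until stability is demonstrated.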