fix(s6): suspend S6 sizing coefficients — point-in-time overfitting

The S6 multipliers (B3→2.0×, B6→1.5×, etc.) were derived from the ~600-trade
window ending 2026-04-19. The ~100+ trades since that window show a regime
reversal: bucket PnL rankings did not hold out-of-sample. Locking per-bucket
performance from any single fixed window into operational sizing is overfitting.

Changes:
- green.yml: s6_size_table → null (uniform 1.0× sizing until coefficients
  demonstrate multi-window out-of-sample stability)
- green.yml: s6_table_path commented out (same reason)
- B4 ban RETAINED: structural exclusion (only gross-negative bucket,
  -$100 gross before fees, R:R 0.80, WR 34.8%) not a time-window call
- AEM MAE_MULT_BY_BUCKET RETAINED: grounded in asset vol characteristics,
  not point-in-time PnL

Infrastructure (routing layer, recompute script, Prefect flow) fully intact.
Re-enable: set s6_table_path or populate s6_size_table once recompute_s6
shows multi-window out-of-sample stability (variance guard: < 20% bucket PnL
move between windows).

Post-mortem note added to CRITICAL_ASSET_PICKING_BRACKETS_VS._ROI_WR_AT_TRADES.md
including system naming clarification (old GREEN vs new-GREEN vs BLUE).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
hjnormey
2026-04-25 14:16:24 +02:00
parent aac4484c0f
commit ea65e2d699
2 changed files with 44 additions and 17 deletions

@@ -74,25 +74,30 @@ hazelcast:
# on GREEN, comment the corresponding key — single kill-switch per feature.
# ──────────────────────────────────────────────────────────────────────────────
# Strictly-zero buckets banned at the ranking layer (selector skips them so the
# slot is handed to the next-best asset — does NOT waste capital with 0× sizing).
# Fractional buckets (B0/B1/B5) stay tradeable via s6_size_table below.
# B4 structurally banned at the selector (only gross-negative bucket — R:R 0.80,
# WR 34.8%, gross PnL -$100 before fees). This is a structural exclusion, not a
# point-in-time performance call, so it survives the overfitting concern below.
# Slot is rerouted to next-ranked asset, not wasted with 0× sizing.
asset_bucket_ban_set: [4]
# Pointer to the generated S6 coefficient table. prod/scripts/recompute_s6_coefficients.py
# regenerates this file on the configured cadence (env S6_RECOMPUTE_INTERVAL_DAYS).
# Bucket → per-entry notional multiplier applied at esf_alpha_orchestrator.py single-site.
s6_table_path: prod/configs/green_s6_table.yml
# Inline fallback used when s6_table_path is missing (bootstrap / first-run):
# matches the S6 row from prod/docs/CRITICAL_ASSET_PICKING_BRACKETS_VS._ROI_WR_AT_TRADES.md.
# B4 is absent (banned above); B2 is absent (= 1.0x, no-op).
s6_size_table:
  0: 0.40
  1: 0.30
  3: 2.00
  5: 0.50
  6: 1.50
# S6 PER-BUCKET SIZING MULTIPLIERS — INTENTIONALLY DISABLED (2026-04-22).
#
# Diagnosis: the previous coefficients (B3→2.0×, B6→1.5×, etc.) were derived from
# ~600 trades ending 2026-04-19. The ~100+ trades *since* that window show a
# regime reversal — bucket performance has not held. Locking historical point-in-time
# bucket PnL into operational sizing IS overfitting, regardless of how compelling
# the research window looked.
#
# What stays: B4 ban above (structural, gross-negative even before fees).
# What's disabled: all non-trivial multipliers — system sizes uniformly (1.0× = no-op)
# until prod/scripts/recompute_s6_coefficients.py has a stable multi-window dataset
# that demonstrates the coefficients generalise out-of-sample.
#
# Infrastructure (routing layer, recompute script, Prefect flow) is fully intact.
# Re-enable by un-commenting s6_table_path OR populating s6_size_table below.
#
# s6_table_path: prod/configs/green_s6_table.yml # ← un-comment to re-enable
s6_size_table: null # disabled — all buckets size at 1.0× until data stabilises
# EsoF regime gate lives at the top of _try_entry (orchestrator single-site).
# mult == 0 → regime-wide skip (no selector/sizer work). UNKNOWN replaces NEUTRAL

@@ -433,3 +433,25 @@ The instinct toward S6 for diversification is correct, but the reason is more pr
6. **MEDIUM — B6 validation:** 38 trades, 55% WR, gross +$1,394. Small sample. If B6 validates over next 100 trades, increase multiplier from 1.5× toward 2.0×.
7. **FUTURE — B5 rehabilitation:** B5 has the highest gross alpha of the "fee-loser" buckets (+$2,087 gross, 133 trades). Once fee reduction is addressed (item 3), B5 sizing should be revisited upward from 0.5× toward 1.0×.
---
## ⚠ Post-Mortem: S6 Coefficients Suspended (2026-04-22)
**Diagnosis:** The S6 sizing table above (B3→2.0×, B6→1.5×, etc.) was derived from the ~600-trade window ending 2026-04-19. The ~100+ trades since that window have shown a regime reversal — bucket performance rankings have not held out-of-sample. **Applying fixed point-in-time bucket PnL rankings as operational sizing multipliers is overfitting.** The research window captured a real signal, but any single window is insufficient to justify permanent coefficient locks.
**What this changes:**
- `s6_size_table` is set to `null` in `prod/configs/green.yml` — all non-B4 buckets size uniformly at 1.0× until the signal demonstrates multi-window stability.
- **B4 ban is retained.** B4 is the only gross-negative bucket (-$100 before fees). That is a structural characteristic of large-cap low-vol assets with moderate BTC correlation, not a point-in-time performance artifact.
- **AEM per-bucket MAE_MULT table is retained.** Those values are grounded in asset volatility characteristics (B3 naturally rides past 3.5×ATR before TP; B6 extreme vol), not historical PnL.
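The resulting sizing behaviour can be sketched as follows. This is a minimal illustration, not the orchestrator's actual code: `resolve_multiplier`, `pick_entries`, and the asset symbols are hypothetical. It assumes only what the config comments state: a `null` table means 1.0× for every bucket, a bucket absent from a populated table defaults to 1.0×, and banned buckets are skipped at selection so the slot passes to the next-ranked asset.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical stand-ins for values read from green.yml.
ASSET_BUCKET_BAN_SET: Set[int] = {4}                 # B4 structural ban (retained)
S6_SIZE_TABLE: Optional[Dict[int, float]] = None     # null: per-bucket sizing disabled


def resolve_multiplier(bucket: int, size_table: Optional[Dict[int, float]]) -> float:
    """Return the notional multiplier for a bucket; 1.0 when the table is disabled."""
    if not size_table:
        return 1.0
    # Buckets missing from a populated table are treated as a 1.0x no-op.
    return size_table.get(bucket, 1.0)


def pick_entries(
    ranked: List[Tuple[str, int]],
    slots: int,
    ban_set: Set[int],
    size_table: Optional[Dict[int, float]],
) -> List[Tuple[str, float]]:
    """Walk assets best-first, skipping banned buckets so no slot is wasted."""
    picks: List[Tuple[str, float]] = []
    for symbol, bucket in ranked:
        if len(picks) == slots:
            break
        if bucket in ban_set:
            continue  # slot goes to the next-ranked asset instead of sizing at 0x
        picks.append((symbol, resolve_multiplier(bucket, size_table)))
    return picks


# Made-up ranking: (symbol, bucket), best first.
ranked_assets = [("AAA", 3), ("BBB", 4), ("CCC", 6)]
print(pick_entries(ranked_assets, 2, ASSET_BUCKET_BAN_SET, S6_SIZE_TABLE))
# → [('AAA', 1.0), ('CCC', 1.0)]
```

With the table disabled, the B3 and B6 assets both size at 1.0×, and the B4 asset is passed over rather than occupying a slot.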
**What's not changed:**
- The bucket *assignments* (KMeans k=7) remain valid. Overfitting was in the *sizing coefficients*, not the asset groupings.
- All infrastructure (routing layer, recompute script, Prefect flow) is intact and ready to re-enable once out-of-sample validation is possible.
**Re-enable path:** When `recompute_s6_coefficients.py` has run across multiple non-overlapping windows and the variance guard consistently passes (< 20% bucket PnL move between windows), the coefficients can be re-applied by setting `s6_table_path` in `green.yml`.
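One way to read the variance guard is sketched below. The document does not specify the exact metric, so this assumes a relative change in per-bucket PnL between consecutive non-overlapping windows; `bucket_is_stable`, `guard_passes`, and the sample figures are hypothetical and do not come from `recompute_s6_coefficients.py`.

```python
from typing import Dict, List

VARIANCE_GUARD = 0.20  # "< 20% bucket PnL move between windows"


def bucket_is_stable(pnl_by_window: List[float], guard: float = VARIANCE_GUARD) -> bool:
    """True if every consecutive pair of windows moves by less than `guard` (relative)."""
    if len(pnl_by_window) < 2:
        return False  # a single window cannot demonstrate stability
    for prev, curr in zip(pnl_by_window, pnl_by_window[1:]):
        if prev == 0:
            return False  # relative move undefined; treated as unstable here
        if abs(curr - prev) / abs(prev) >= guard:
            return False
    return True


def guard_passes(windows: Dict[int, List[float]]) -> bool:
    """Require every bucket to be stable before coefficients are re-applied."""
    return bool(windows) and all(bucket_is_stable(s) for s in windows.values())


# Made-up per-bucket PnL over three non-overlapping windows.
example = {3: [1000.0, 1100.0, 1050.0], 6: [500.0, -200.0, 300.0]}
print({bucket: bucket_is_stable(series) for bucket, series in example.items()})
# → {3: True, 6: False}
print(guard_passes(example))
# → False
```

Requiring every bucket to pass is the conservative choice in this sketch: a sign flip such as the B6 series above fails the guard and keeps sizing uniform.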
**Note on system naming (2026-04-22 clarification):**
- "GREEN" (old) = Python-based standalone, now running as BLUE semi-shadow. Profit data cited in this doc is from this system.
- "new-GREEN" = Nautilus Trader-based system (sprint target). Pre-S6 and therefore unaffected by the coefficient issue.
- BLUE and old-GREEN remain in profit pre-S6. The overfit risk is specific to new-GREEN if S6 coefficients were applied.