DOLPHIN NG — Performance Registry
Canonical benchmark tiers. Update this file whenever a new result becomes the production target.
GOLD = what the system must beat or match. SILVER = previous gold. BRONZE = regression floor.
🥇 D_LIQ_GOLD — Active Production Candidate (2026-03-15)
| Metric | Value | vs prev GOLD | vs BRONZE |
|---|---|---|---|
| ROI | 181.81% | +85.26 pp | +93.26 pp |
| DD | 17.65% | +3.33 pp | +2.60 pp |
| Calmar | 10.30 | vs 6.74 | vs 5.88 |
| Trades | 2155 | identical | identical |
| avg_leverage | 4.09x | — | — |
| liquidation_stops | 1 (0.05%) | — | — |
Engine: LiquidationGuardEngine(soft=8x, hard=9x, mc_ref=5x, margin_buffer=0.95, adaptive_beta=True)
Factory: create_d_liq_engine(**engine_kwargs); also the default returned by create_boost_engine()
Module: nautilus_dolphin/nautilus/proxy_boost_engine.py
Config key: engine.boost_mode = "d_liq" (now DEFAULT_BOOST_MODE)
Mechanism:
- Inherits adaptive_beta scale_boost from AdaptiveBoostEngine (GOLD)
- Leverage ceiling raised to 8x soft / 9x hard (from 5x/6x)
- MC-Forewarner assessed at 5x reference (decoupled) → 0 RED/ORANGE/halted days
- Liquidation floor stop at 10.6% adverse move (= 1/9 × 0.95) — prevents exchange force-close
- DD plateau: each +1x above 7x costs only +0.12pp DD (vs +2.6pp for 5→6x)
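The 10.6% liquidation floor follows from the hard leverage cap and the margin buffer. Below is a minimal sketch of that arithmetic, assuming the stop is simply margin_buffer / hard_leverage as the bullet above states. The function name `liquidation_stop_fraction` is hypothetical and is not taken from proxy_boost_engine.py; exchange maintenance margin and fees are ignored.

```python
def liquidation_stop_fraction(hard_leverage: float, margin_buffer: float) -> float:
    """Adverse price move (as a fraction) at which the floor stop would fire.

    Assumption: at leverage L an adverse move of 1/L consumes the posted
    margin, and the buffer places the stop slightly before that point.
    """
    if hard_leverage <= 0:
        raise ValueError("hard_leverage must be positive")
    if not 0 < margin_buffer <= 1:
        raise ValueError("margin_buffer must be in (0, 1]")
    return margin_buffer / hard_leverage


# D_liq parameters from the registry: hard=9x, margin_buffer=0.95
d_liq_stop = liquidation_stop_fraction(hard_leverage=9.0, margin_buffer=0.95)
print(f"D_liq floor stop: {d_liq_stop:.1%}")  # prints "D_liq floor stop: 10.6%"
```

Under the same assumption, a higher hard cap moves the stop closer to the entry price, which is one way to read why the 9/10x config triggers more liquidation stops.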
Validation (exp9b, 2026-03-15):
- All 4 leverage configs compared vs unguarded (exp9): B/C/D all improved ROI + reduced DD
- E (9/10x): 5 liquidation stops → cascade → dead; D (8/9x) is the sweet spot
- pytest -m slow tests/test_proxy_boost_production.py → 9/9 PASSED (2026-03-15)
- MC completely silent: 0 RED, 0 ORANGE, 0 halted across 56 days at 8/9x
- Trade count identical to Silver (2155) — no entry/exit timing change
Compounding ($25k, 56-day periods):
| Periods | ~Time | Value |
|---|---|---|
| 3 | ~5 mo | $559,493 |
| 6 | ~1 yr | $12,521,315 |
| 12 | ~2 yr | $6,271,333,381 |
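The compounding figures can be approximated by reinvesting the full balance each period at the D_LIQ_GOLD ROI. This is a sketch under two assumptions: every period repeats the same 181.81% return, and there are no withdrawals, fees, or capacity limits. Because it uses the rounded ROI, the results land within a small rounding margin of the table rather than matching it to the dollar. The names `START_CAPITAL`, `ROI_PER_PERIOD`, and `compound` are illustrative.

```python
START_CAPITAL = 25_000.0   # starting balance in USD
ROI_PER_PERIOD = 1.8181    # 181.81% per 56-day period (rounded registry figure)


def compound(capital: float, roi: float, periods: int) -> float:
    """Balance after `periods` periods of full reinvestment at a constant ROI."""
    if periods < 0:
        raise ValueError("periods must be non-negative")
    return capital * (1.0 + roi) ** periods


for n in (3, 6, 12):
    print(f"{n:>2} periods: ${compound(START_CAPITAL, ROI_PER_PERIOD, n):,.0f}")
```

A constant per-period return is a strong assumption; the projection is an illustration of the arithmetic, not a forecast.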
🥈 GOLD (prev) — Former Production (demoted 2026-03-15)
| Metric | Value |
|---|---|
| ROI | 96.55% |
| DD | 14.32% |
| Calmar | 6.74 |
| Trades | 2155 |
| scale_mean | 1.088 |
| alpha_eff_mean | 1.429 |
Engine: AdaptiveBoostEngine(threshold=0.35, alpha=1.0, adaptive_beta=True)
Factory: create_boost_engine(mode='adaptive_beta') — non-default, opt-in for conservative/quiet-regime use
Validation: pytest -m slow tests/test_proxy_boost_production.py → 7/7 PASSED 2026-03-15
🥉 BRONZE — Regression Floor (former silver, 2026-03-15)
| Metric | Value |
|---|---|
| ROI | 88.55% |
| PF | 1.215 |
| DD | 15.05% |
| Sharpe | 4.38 |
| Trades | 2155 |
Engine: NDAlphaEngine (no proxy_B boost)
Equivalent factory call: create_boost_engine(mode='none', ...)
Validation script: test_pf_dynamic_beta_validate.py
Bronze is the absolute regression floor. Falling below Bronze on both ROI and DD is a failure.
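One way to encode the floor rule is sketched below. It assumes that "below Bronze on DD" means a larger drawdown than Bronze, and that a run fails only when both ROI and DD are worse. The `Result` class and `fails_bronze_floor` function are hypothetical helpers, not part of the test suite.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    roi_pct: float  # return on investment over the backtest, in percent
    dd_pct: float   # maximum drawdown over the backtest, in percent


BRONZE = Result(roi_pct=88.55, dd_pct=15.05)


def fails_bronze_floor(candidate: Result, floor: Result = BRONZE) -> bool:
    """True when the candidate is worse than the floor on BOTH ROI and DD."""
    worse_roi = candidate.roi_pct < floor.roi_pct
    worse_dd = candidate.dd_pct > floor.dd_pct  # higher drawdown is worse
    return worse_roi and worse_dd


# D_LIQ_GOLD has a higher DD than Bronze but a far higher ROI, so it passes.
print(fails_bronze_floor(Result(roi_pct=181.81, dd_pct=17.65)))  # False
```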
All Boost Modes (exp8 results, 2026-03-14)
| mode | ROI% | DD% | ΔDD | ΔROI | Notes |
|---|---|---|---|---|---|
| none (Bronze) | 88.55 | 15.05 | — | — | Baseline |
| fixed | 93.61 | 14.51 | −0.54 | +5.06 | thr=0.35, a=1.0 |
| adaptive_alpha | 93.40 | 14.51 | −0.54 | +4.86 | alpha×boost |
| adaptive_thr | 94.13 | 14.51 | −0.54 | +5.58 | thr÷boost |
| adaptive_both | 94.11 | 14.51 | −0.54 | +5.57 | both combined |
| adaptive_beta ⭐ | 96.55 | 14.32 | −0.72 | +8.00 | alpha×(1+day_beta); prev GOLD |
Extended Leverage Configs (exp9b results, 2026-03-15)
| Config | ROI% | DD% | Calmar | liq_stops | Notes |
|---|---|---|---|---|---|
| GOLD (5/6x) | 96.55 | 14.32 | 6.74 | 0 | adaptive_beta baseline |
| B_liq (6/7x) | 124.01 | 15.97 | 7.77 | 1 | improved vs unguarded |
| C_liq (7/8x) | 155.60 | 17.18 | 9.05 | 1 | improved vs unguarded |
| D_liq (8/9x) | 181.81 | 17.65 | 10.30 | 1 | D_LIQ_GOLD |
| E_liq (9/10x) | 155.88 | 31.79 | 4.90 | 5 | cascade — dead |
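The Calmar column is consistent with ROI% divided by DD% over the backtest window. That is an inference from the figures, not a definition stated in this registry, and it differs from the annualized form of the ratio. The sketch below recomputes the column from the table and picks the config with the highest ratio; the `CONFIGS` dictionary and `calmar` helper are illustrative names.

```python
# (ROI%, DD%) pairs copied from the exp9b table
CONFIGS = {
    "GOLD (5/6x)": (96.55, 14.32),
    "B_liq (6/7x)": (124.01, 15.97),
    "C_liq (7/8x)": (155.60, 17.18),
    "D_liq (8/9x)": (181.81, 17.65),
    "E_liq (9/10x)": (155.88, 31.79),
}


def calmar(roi_pct: float, dd_pct: float) -> float:
    """Window-level return-to-drawdown ratio (assumed definition)."""
    if dd_pct <= 0:
        raise ValueError("dd_pct must be positive")
    return roi_pct / dd_pct


for name, (roi, dd) in CONFIGS.items():
    print(f"{name:<14} Calmar ~ {calmar(roi, dd):.2f}")

best = max(CONFIGS, key=lambda name: calmar(*CONFIGS[name]))
print("Highest Calmar:", best)  # D_liq (8/9x)
```

Ranking by this ratio rather than by ROI alone is what separates D_liq from E_liq: E_liq's ROI is close to C_liq's, but its drawdown is nearly double.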
Test Suite
# Fast unit tests only (no data needed, ~5 seconds)
pytest tests/test_proxy_boost_production.py -m "not slow" -v
# Full e2e regression (55-day backtests, ~60 minutes)
pytest tests/test_proxy_boost_production.py -m slow -v
Unit tests: ~40 (factory, engine, extended leverage, liquidation guard, actor import)
E2E tests: 9 (baseline + 5 boost modes + winner-beats-baseline + D_liq repro + MC silent)
Last full run: 2026-03-15 — 9/9 PASSED, exit code 0 (50:20)
Promotion Checklist
To promote a new result to D_LIQ_GOLD (production):
- Beats prev GOLD on ROI (+85pp); DD increased +3.33pp but Calmar +53% — acceptable
- Trade count identical (2155) — no re-entry cascade
- MC completely silent at mc_ref=5.0 — 0 RED/ORANGE/halted
- liquidation_stops=1 (0.05%) — negligible, no cascade
- pytest -m slow passes: 9/9 PASSED (2026-03-15, 50:20)
- Updated Registry.md, memory/benchmarks.md, memory/MEMORY.md
- create_d_liq_engine() and classes added to proxy_boost_engine.py
- Wire create_d_liq_engine into DolphinActor as configurable option
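The measurable items in the checklist could be gathered into a single gate, sketched below. The `RunSummary` fields, the `promotion_blockers` function, and the 0.1% liquidation-stop ceiling are all assumptions made for illustration. The registry only calls 0.05% "negligible" and does not define a threshold.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RunSummary:
    roi_pct: float
    dd_pct: float
    trades: int
    liquidation_stops: int
    mc_alert_days: int        # RED + ORANGE + halted days
    slow_tests_passed: bool


def promotion_blockers(candidate: RunSummary, incumbent: RunSummary,
                       max_liq_stop_rate: float = 0.001) -> List[str]:
    """Return the reasons a candidate should not be promoted (empty if none)."""
    blockers = []
    if candidate.roi_pct <= incumbent.roi_pct:
        blockers.append("ROI does not beat incumbent")
    if candidate.roi_pct / candidate.dd_pct <= incumbent.roi_pct / incumbent.dd_pct:
        blockers.append("Calmar does not improve")
    if candidate.trades != incumbent.trades:
        blockers.append("trade count changed")
    if candidate.mc_alert_days > 0:
        blockers.append("MC-Forewarner raised alerts")
    if candidate.liquidation_stops / candidate.trades > max_liq_stop_rate:
        blockers.append("liquidation stop rate above ceiling")
    if not candidate.slow_tests_passed:
        blockers.append("slow test suite did not pass")
    return blockers


prev_gold = RunSummary(96.55, 14.32, 2155, 0, 0, True)
d_liq = RunSummary(181.81, 17.65, 2155, 1, 0, True)
print(promotion_blockers(d_liq, prev_gold))  # []
```

Returning a list of reasons instead of a single boolean keeps the output readable when more than one criterion fails.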