hjnormey 01c19662cb initial: import DOLPHIN baseline 2026-04-21 from dolphinng5_predict working tree

Includes core prod + GREEN/BLUE subsystems:
- prod/ (BLUE harness, configs, scripts, docs)
- nautilus_dolphin/ (GREEN Nautilus-native impl + dvae/ preserved)
- adaptive_exit/ (AEM engine + models/bucket_assignments.pkl)
- Observability/ (EsoF advisor, TUI, dashboards)
- external_factors/ (EsoF producer)
- mc_forewarning_qlabs_fork/ (MC regime/envelope)

Excludes runtime caches, logs, backups, and reproducible artifacts per .gitignore.

2026-04-21 16:58:38 +02:00

4.8 KiB

Executable File

Agent Change Analysis Report

Date: 2026-03-21
Author: Claude Code audit of Antigravity AI agent document

Executive Summary

FORK TEST RESULT: 0/2 PASS. The two fork tests produce +12.83% and +5.92% ROI vs gold +181.81%.

The agent's claims are PARTIALLY correct in diagnosis but the remediation INTRODUCES new regressions.

Test Results

| Test | ROI | Trades | DD | Verdict |
|---|---|---|---|---|
| D_LIQ_GOLD perfect-maker (fork) | +12.83% | 1739 | 26.24% | FAIL ✗ |
| D_LIQ_GOLD stochastic 0.62 (fork) | +5.92% | 1739 | 27.95% | FAIL ✗ |
| replicate_181 style (no hazard call, float64, static vol_p60) | +111.03% | 1959 | 16.89% | FAIL ✗ |
| Gold reference | +181.81% | 2155 | 17.65% | n/a |

Root Cause Analysis

Cause 1: `set_esoteric_hazard_multiplier(0.0)` in exp_shared.run_backtest

The agent added `eng.set_esoteric_hazard_multiplier(0.0)` to `exp_shared.run_backtest`. With the new ceiling=10.0, this call:

- Sets base_max_leverage = 10.0 on a D_LIQ engine designed for 8.0 soft / 9.0 hard
- Raises effective leverage on unboosted days to 9.0x (vs certified 8.0x)
- Amplifies bad-day losses more than good-day gains, as the 5-day comparison of TEST A at 9.0x confirms

Effect: a variance increase that, over 56 days, contributes to +12.83% ROI vs +111.03% (replicate style).
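The asymmetry between bad-day losses and good-day gains is consistent with ordinary compounding arithmetic: a leveraged gain followed by an equal-sized leveraged loss does not return to par, and the shortfall grows with the square of leverage. A minimal sketch, assuming a hypothetical unlevered daily move of ±1% (not a figure taken from the backtest):

```python
def compound(leverage, daily_returns):
    """Compound unlevered daily returns at a fixed leverage; return the equity multiple."""
    equity = 1.0
    for r in daily_returns:
        equity *= 1.0 + leverage * r
    return equity


# Hypothetical path: alternating +1% / -1% unlevered moves over 56 days.
path = [0.01, -0.01] * 28

eq_8x = compound(8.0, path)
eq_9x = compound(9.0, path)
print(f"8.0x: {eq_8x:.4f}  9.0x: {eq_9x:.4f}")
```

Each up/down pair multiplies equity by `1 - (leverage * r)**2`, so moving from 8.0x to 9.0x increases the per-pair drag even though the unlevered path nets to zero. The real backtest path is not symmetric, so this only illustrates the direction of the effect, not its size.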

Cause 2: Rolling vol_p60 (lower threshold on some days)

The rolling vol_p60 can be LOWER than static vol_p60 (especially after quiet days like Jan 1 holiday). This allows more bars to trade in low-quality signal environments.

Day 2 (Jan 1): TEST A has vol_ok=1588 bars vs 791 for TEST B (about 2× more eligible bars; vp60=0.000099 vs 0.000121). More trades on bad-signal days → net negative over 56 days.
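To illustrate why a rolling threshold can admit more bars after a quiet stretch, the sketch below compares a static 60th-percentile volatility threshold with one recomputed from a trailing window. The synthetic data, the window contents, and the `vol > threshold` gate are assumptions for illustration; the actual `vol_p60` computation in the codebase may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic per-bar volatility (hypothetical data, not from the backtest).
normal_vol = rng.uniform(0.00008, 0.00016, size=5000)  # typical sessions
quiet_vol = rng.uniform(0.00004, 0.00010, size=5000)   # e.g. a holiday session
test_day = rng.uniform(0.00008, 0.00016, size=2000)    # day being gated

# Static threshold: 60th percentile fixed once from the typical period.
static_p60 = np.percentile(normal_vol, 60)
# Rolling threshold: 60th percentile of a trailing window that is now all quiet bars.
rolling_p60 = np.percentile(quiet_vol, 60)

# Assumed gate: a bar is eligible when its volatility exceeds the threshold.
static_ok = int((test_day > static_p60).sum())
rolling_ok = int((test_day > rolling_p60).sum())

print(f"static_p60={static_p60:.6f}   eligible bars={static_ok}")
print(f"rolling_p60={rolling_p60:.6f}  eligible bars={rolling_ok}")
```

Because the trailing window is dominated by quiet bars, the rolling percentile drops and more of the test day's bars clear it, which is the same direction as the vol_ok gap reported for Day 2.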

Cause 3: Pre-existing regression (111% vs 181.81%)

Even WITHOUT the agent's specific exp_shared changes, the current code produces 111.03%/1959 vs gold 181.81%/2155. This regression predates the agent's changes and likely stems from:

- ACB change: fund_dbt_btc (Deribit funding) is now preferred over funding_btc. If Deribit funding is less bearish in the Dec-Feb 2026 period, ACB gives lower boost → lower leverage → lower ROI.
- Orchestrator refactoring: 277+ lines added (begin_day/step_bar/end_day), 68 removed. Subtle behavioral changes may have affected trade quality.

Verdict on Agent's Claims

| Claim | Assessment |
|---|---|
| A. Ceiling_lev 6→10 | CORRECT in concept: old 6.0 DID suppress D_LIQ below certified 8.0x. But the fix leaves `set_esoteric_hazard_multiplier(0.0)` in run_backtest, which now drives leverage to 9.0x (not 8.0x): an over-correction. |
| B. MC proportional 0.8x | NEUTRAL for no-forewarner runs (forewarner=None → never called). |
| C. Rolling vol_p60 | NEGATIVE: rolling vol_p60 can be lower than static, enabling trading in worse signal environments. |
| D. Float32 / lazy OB | NEUTRAL for trade count (float32 at $50k has sufficient precision; OB mock data is date-agnostic). |
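The float32 remark in row D can be sanity-checked directly: NumPy's `np.spacing` reports the gap between a value and the next representable float of the same type. A quick check at the $50k price level mentioned in the table:

```python
import numpy as np

price32 = np.float32(50_000.0)
price64 = np.float64(50_000.0)

# Gap to the next representable value at this magnitude (one ULP).
ulp32 = float(np.spacing(price32))
ulp64 = float(np.spacing(price64))

print(f"float32 ULP at 50000: {ulp32}")
print(f"float32 relative resolution: {ulp32 / 50_000.0:.2e}")
print(f"float64 ULP at 50000: {ulp64:.2e}")
```

At this magnitude float32 resolves prices to under half a cent. This only shows the representation gap; whether it is enough for trade-count parity depends on how prices are differenced downstream, which this check does not model.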

Confirmed Mechanism (leverage verification)

Direct Python verification of the hazard call effect:

BEFORE set_esoteric_hazard_multiplier(0.0) [ceiling=10.0]:
  base_max_leverage = 8.0  (certified D_LIQ soft cap)
  bet_sizer.max_leverage = 8.0
  abs_max_leverage = 9.0   (certified D_LIQ hard cap)

AFTER set_esoteric_hazard_multiplier(0.0) [ceiling=10.0]:
  base_max_leverage = 10.0  ← overridden!
  bet_sizer.max_leverage = 10.0  ← overridden!
  abs_max_leverage = 9.0   (unchanged — abs is not touched by hazard call)

Result: effective leverage = min(base=10, abs=9) = 9.0x on ALL days. D_LIQ is certified at 8.0x soft / 9.0x hard. The hard cap should only trigger on proxy_B boost events. The hazard call unconditionally removes the 8.0x soft limit — every day runs at 9.0x.
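The before/after values can be reproduced with a toy model. The class below is a hypothetical stand-in, not the real engine: it mirrors only the three attributes named in the verification output and assumes that a hazard multiplier of 0.0 sets the base and bet-sizer caps to the ceiling, which is the only case this report documents.

```python
class ToyEngine:
    """Hypothetical stand-in mirroring the leverage attributes named in the report."""

    def __init__(self, soft_cap=8.0, hard_cap=9.0, ceiling=10.0):
        self.base_max_leverage = soft_cap       # certified D_LIQ soft cap
        self.bet_sizer_max_leverage = soft_cap
        self.abs_max_leverage = hard_cap        # certified D_LIQ hard cap
        self.ceiling = ceiling

    def set_esoteric_hazard_multiplier(self, hazard):
        # Assumption: only hazard=0.0 is modelled, since that is the only case in the report.
        if hazard != 0.0:
            raise NotImplementedError("only the hazard=0.0 case is described in the report")
        self.base_max_leverage = self.ceiling
        self.bet_sizer_max_leverage = self.ceiling
        # abs_max_leverage is deliberately left untouched.

    def effective_leverage(self):
        return min(self.base_max_leverage, self.abs_max_leverage)


def leverage_before_after(ceiling):
    """Return effective leverage before and after the hazard call for a given ceiling."""
    eng = ToyEngine(ceiling=ceiling)
    before = eng.effective_leverage()
    eng.set_esoteric_hazard_multiplier(0.0)
    return before, eng.effective_leverage()


for ceiling in (6.0, 10.0):
    before, after = leverage_before_after(ceiling)
    print(f"ceiling={ceiling}: effective leverage {before}x -> {after}x")
```

Under these assumptions the old ceiling of 6.0 drops effective leverage from 8.0x to 6.0x, while the new ceiling of 10.0 raises it to 9.0x; neither leaves the certified 8.0x soft cap in place.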

The Real Problem

The gold standard (181.81%) was certified using code where set_esoteric_hazard_multiplier was NOT called in the backtest loop. The replicate_181_gold.py script (which doesn't call it) was the certification vehicle.

The agent's fix (ceiling 6→10) was meant to address the case WHERE set_esoteric_hazard_multiplier(0.0) IS called. With ceiling=6.0: sets base=6.0 < D_LIQ's 8.0 → suppresses leverage. With ceiling=10.0: sets base=10.0 > D_LIQ's abs=9.0 → raises leverage beyond certified. Both are wrong.

Correct fix: Remove eng.set_esoteric_hazard_multiplier(0.0) from exp_shared.run_backtest, OR don't call it when using D_LIQ (which manages its own leverage via extended_soft_cap/extended_abs_cap).
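One way to express the second option is a guard around the hazard call. The sketch below is hypothetical: `uses_own_leverage_caps` is an invented flag name, `configure_hazard` is an invented helper, and the stub engine exists only so the snippet runs on its own. The real `run_backtest` code is not shown in this report.

```python
class StubEngine:
    """Hypothetical engine stub that records hazard calls instead of acting on them."""

    def __init__(self, uses_own_leverage_caps):
        self.uses_own_leverage_caps = uses_own_leverage_caps
        self.hazard_calls = []

    def set_esoteric_hazard_multiplier(self, hazard):
        self.hazard_calls.append(hazard)


def configure_hazard(eng, hazard=0.0):
    """Apply the hazard multiplier only to engines that do not manage their own caps."""
    if getattr(eng, "uses_own_leverage_caps", False):
        # D_LIQ-style engine: leave the certified soft/hard caps untouched.
        return False
    eng.set_esoteric_hazard_multiplier(hazard)
    return True


d_liq_like = StubEngine(uses_own_leverage_caps=True)
other = StubEngine(uses_own_leverage_caps=False)

applied_d_liq = configure_hazard(d_liq_like)
applied_other = configure_hazard(other)
print(f"D_LIQ-like: applied={applied_d_liq}, calls={d_liq_like.hazard_calls}")
print(f"other:      applied={applied_other}, calls={other.hazard_calls}")
```

Removing the call outright is simpler; a guard like this only makes sense if other engine types sharing `run_backtest` still rely on the hazard call.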

Gold Standard Status

The gold standard (181.81%/2155/DD=17.65%) CANNOT be replicated from current code via ANY tested path:

- exp_shared.run_backtest: 12.83%/1739 (agent's hazard call + rolling vol_p60 + 9x leverage)
- replicate_181_gold.py style: 111.03%/1959 (pre-existing regression from orchestrator/ACB changes)

The agent correctly identified that the codebase had regressed, but their fix is incomplete: leaving the hazard call in place over-corrects leverage to 9.0x, and the pre-existing regression remains unaddressed.
