hjnormey 01c19662cb initial: import DOLPHIN baseline 2026-04-21 from dolphinng5_predict working tree

Includes core prod + GREEN/BLUE subsystems:
- prod/ (BLUE harness, configs, scripts, docs)
- nautilus_dolphin/ (GREEN Nautilus-native impl + dvae/ preserved)
- adaptive_exit/ (AEM engine + models/bucket_assignments.pkl)
- Observability/ (EsoF advisor, TUI, dashboards)
- external_factors/ (EsoF producer)
- mc_forewarning_qlabs_fork/ (MC regime/envelope)

Excludes runtime caches, logs, backups, and reproducible artifacts per .gitignore.

2026-04-21 16:58:38 +02:00

11 KiB

Executable File

proxy_B — Research Filing

Date: 2026-03-14
Status: Closed for direct exploitation; open as modulator candidate
Gold baseline: ROI=+88.55%, PF=1.215, DD=15.05%, Sharpe=4.38, Trades=2155

1. Signal Definition

proxy_B = instability_50 - v750_lambda_max_velocity

instability_50: short-window (50-bar) eigenvalue instability in correlation matrix
v750_lambda_max_velocity: long-window (750-bar) max-eigenvalue velocity
Intuition: short-term stress MINUS long-term momentum. When this is high, the eigenspace is rapidly destabilising relative to its recent trend.
Available in 5s scan parquets. Computed in ShadowLoggingEngine.process_day().
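
A minimal sketch of the definition above, assuming the two inputs are already available as equal-length per-bar series. The function name `proxy_b` and the toy values are illustrative only; this is not the code in ShadowLoggingEngine.process_day().

```python
from typing import List, Sequence


def proxy_b(instability_50: Sequence[float],
            v750_lambda_max_velocity: Sequence[float]) -> List[float]:
    """Element-wise proxy_B = instability_50 - v750_lambda_max_velocity."""
    if len(instability_50) != len(v750_lambda_max_velocity):
        raise ValueError("both series must cover the same bars")
    return [s - v for s, v in zip(instability_50, v750_lambda_max_velocity)]


# Toy values, not taken from the scan parquets.
inst50 = [0.10, 0.40, 0.90]
v750_vel = [0.05, 0.10, -0.20]
pb = proxy_b(inst50, v750_vel)

# The last bar has high short-window stress and negative long-window
# velocity, so it carries the largest proxy_B of the three.
print(pb.index(max(pb)))
```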

2. Discovery & Measurement

Experiment: Precursor sweep (e2e_precursor_auc.py, flint_precursor_sweep.py)
Metric: AUC for predicting eigenspace stress events at K=5 bars forward

Window / Signal	AUC (K=5)
proxy_B (inst50 − v750_vel)	0.715
instability_50 alone	~0.65
v750_lambda_max_velocity alone	~0.61
FlintHDVAE latent z (β=0.1)	0.6918
vel_div (entry signal)	baseline

proxy_B leads stress events by ~25 seconds (5-bar horizon on 5s data). It is NOT the entry signal — it measures a different aspect of the eigenspace.
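
The K-bars-forward AUC can be sketched as follows. This is an illustration, not the contents of e2e_precursor_auc.py: it assumes the label for bar t is the stress flag observed exactly K bars later (the filing does not say whether the real label is "at t+K" or "within the next K bars"), and it computes AUC as the rank statistic P(score of a stressed bar > score of a calm bar).

```python
from typing import List, Sequence


def forward_labels(stress: Sequence[int], k: int) -> List[int]:
    """Label bar t with the stress flag observed k bars later (assumption)."""
    return [stress[t + k] for t in range(len(stress) - k)]


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Rank-based AUC: P(positive score > negative score), ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


K = 5            # bars forward, as in the table above
BAR_SECONDS = 5  # 5s scan data
lead_seconds = K * BAR_SECONDS

# Toy series, not real scan data.
signal = [0.1, 0.9, 0.2, 0.8, 0.1, 0.3, 0.2, 0.1, 0.4, 0.2]
stress = [0, 0, 0, 0, 0, 0, 1, 0, 1, 0]
labels = forward_labels(stress, K)
toy_auc = auc(signal[:len(labels)], labels)
```

The pairwise loop is quadratic in the number of bars; it is written for clarity, and a rank-sum formulation would be the usual choice on full-size data.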

3. Orthogonality to System Signals

Test: Exp 4 shadow run. Only 48 of 2155 trades (~2%) had valid, aligned pb_entry and vd_entry values because of the entry_bar alignment bug (see the IMPORTANT CAVEAT under Exp 4 in Section 4). The correlations below are computed on those 48 trades.

Pair	Pearson r	p-value	Spearman rho	Verdict
proxy_B_entry ↔ vel_div_entry	−0.031	0.837	−0.463	Orthogonal (ns)
proxy_B_entry ↔ pnl_frac	+0.166	0.260	+0.158	Not predictive of outcome (ns)
proxy_B_entry ↔ MAE	+0.420	0.003 **	+0.149	Predicts intraday adversity
proxy_B_entry ↔ hold_bars	−0.054	0.717	−0.171	Orthogonal (ns)
proxy_B_max ↔ pnl_frac	+0.066	0.655	−0.379	ns
proxy_B_max ↔ MAE	+0.047	0.750	−0.280	ns

Mann-Whitney (worst-10% pnl vs rest): pb_entry worst=−5.40, rest=+0.27, p=0.183 ns
Mann-Whitney (worst-10% MAE vs rest): pb_entry worst=−5.41, rest=+0.27, p=0.115 ns
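
For reference, the two correlation columns can be reproduced with a few lines of standard-library Python. This is a sketch of the textbook formulas, not the code in exp4_proxy_coupling.py; the function names are illustrative. The t-statistic helper shows how a Pearson r and a sample size map to the test statistic behind the p-value column.

```python
import math
from typing import List, Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rho is the Pearson correlation of the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


def t_statistic(r: float, n: int) -> float:
    """t-statistic for H0: r = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# proxy_B_entry vs MAE row: r = +0.420 on the 48 aligned trades.
t_mae = t_statistic(0.420, 48)
```

With r=+0.420 and n=48 the statistic is a little above 3.1 on 46 degrees of freedom, which is in line with the small p-value reported for the MAE row. Turning t into an exact p-value needs a Student-t CDF, which is left out here to keep the sketch dependency-free.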

Critical finding:

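proxy_B IS linearly orthogonal to vel_div (the entry signal): Pearson r≈0, ns ✓. The Spearman rho of −0.463 in the table points to a possible monotonic relationship that the Pearson test does not capture.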
proxy_B does NOT predict final trade PnL — r=+0.17, ns ✓ (confirms prior findings)
proxy_B DOES predict intraday adversity (MAE): r=+0.42, p=0.003 ← KEY

Mechanistic interpretation: When proxy_B is high at entry, the trade experiences a worse intraday adverse excursion (deeper MAE). But final PnL is unaffected because the engine's exit logic (TP/max_hold/direction-confirm) successfully navigates through the stress period. This is the complete explanation for why:

Gating on proxy_B removes trades that are temporarily stressed but then RECOVER → hurts
A proxy-coupled stop would cut those recoveries short → reduces DD but also reduces ROI
The signal has genuine information content (AUC=0.715, MAE correlation p=0.003) but the system is ALREADY correctly managing the trades it tags as stressed

4. Experiments Performed

Exp 1 — proxy_B Position Sizing (`exp1_proxy_sizing.py`)

Tests bet_sizer.base_fraction * scale(proxy_B_at_entry).

Config	ROI%	PF	DD%	Sharpe	scale_mean
GOLD	88.55	1.215	15.05	4.38	—
Baseline (no sizing)	88.55	1.2147	15.05	4.378	—
S1 [0.5x–1.5x] w500	91.48	1.1782	16.93	3.528	1.004
S2 [0.25x–2.0x] w500	105.51	1.1537	20.30	2.956	1.133
S3 [0.5x–1.5x] w1000	89.49	1.1763	16.69	3.514	1.000
S4 [0.5x–1.5x] clip	87.13	1.1628	18.03	3.184	1.019

Finding: scale_mean ≥ 1.0 in all configs (exactly 1.000 for S3) → proxy_B is more often LOW during trading activity, so the engine sizes UP on average. The higher ROI (S2: +17pp) is a leverage effect, not signal quality: PF and Sharpe fall and DD rises in every sized config, and S4 loses ROI outright (87.13% vs 88.55%). The signal is anti-correlated with trade quality per unit capital.
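
One way such a scale function could look, as a hedged sketch. The filing does not give the actual mapping in exp1_proxy_sizing.py, so everything here is an assumption: the rolling percentile rank, the linear map onto [lo, hi], the reading of "w500" as a 500-value lookback, and the value of `base_fraction`. The direction (low proxy_B → larger size) follows the finding above.

```python
from typing import Sequence


def percentile_rank(value: float, history: Sequence[float]) -> float:
    """Fraction of history values <= value, in [0, 1]."""
    if not history:
        return 0.5  # neutral rank when there is no history (assumption)
    return sum(1 for h in history if h <= value) / len(history)


def size_scale(pb_at_entry: float, history: Sequence[float],
               lo: float = 0.5, hi: float = 1.5, window: int = 500) -> float:
    """Map the rolling rank of proxy_B onto [lo, hi]; low proxy_B sizes up."""
    recent = list(history)[-window:]
    rank = percentile_rank(pb_at_entry, recent)
    return hi - (hi - lo) * rank


base_fraction = 0.10  # hypothetical value, not from the engine config
history = [float(i) for i in range(1000)]  # toy proxy_B history

calm = size_scale(600.0, history)      # low rank within the last 500 values
stressed = size_scale(990.0, history)  # high rank within the last 500 values
calm_fraction = base_fraction * calm
```

Under this mapping a scale_mean above 1.0 can only arise if entries cluster at below-median proxy_B ranks, which is the reading given in the finding.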

Exp 2 — proxy_B Shadow Exit (`exp2_proxy_exit.py`)

Post-hoc test: would exiting when proxy_B < threshold have helped?

Threshold	Trigger rate	AvgDelta%	Early better	Est. ROI
p10	~60% of trades (at ≥1 bar)	−0.15%	37%	−0.96pp
p25	~69%	+0.04%	43%	+0.26pp
p50	~85%	+0.02%	43%	+0.15pp

Note: High trigger rates are mathematically expected. A 120-bar hold has ~100% chance of any bar crossing the p50 level. The signal fires constantly during holds; using it as an exit trigger is noise, not information.

Verdict: Holding to natural exit is better. Early exit is weakly beneficial in only 37-43% of cases.
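
The "mathematically expected" claim can be made concrete with a back-of-envelope calculation. The sketch assumes bars are independent draws, which real 5s bars are not, so it gives an upper bound rather than a prediction.

```python
def trigger_probability(q: float, n_bars: int) -> float:
    """P(at least one of n_bars independent bars falls below the q-quantile)."""
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be a probability")
    if n_bars < 0:
        raise ValueError("n_bars must be non-negative")
    return 1.0 - (1.0 - q) ** n_bars


HOLD_BARS = 120  # hold length used in the note above

for label, q in (("p10", 0.10), ("p25", 0.25), ("p50", 0.50)):
    print(label, round(trigger_probability(q, HOLD_BARS), 6))
```

Under independence even the p10 threshold would fire on essentially every 120-bar hold. The observed rates in the table (~60% to ~85%) are lower, which is what one would expect if proxy_B is strongly autocorrelated and many holds are shorter than 120 bars.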

Exp 3 — Longer Window Proxies (`exp3_longer_proxies.py`)

All 5 proxy variants × 3 modes (gate/size/exit) × 3 thresholds. AE validation of top 10 fast-sweep configs.

Config	AE ROI%	PF	DD%	Note
GOLD	88.55	1.215	15.05	—
V50/gate/p50	−21.58	0.822	31.94	Catastrophic
V150/gate/p50	−24.34	0.909	31.97	Catastrophic
B150/gate/p10	−17.37	0.941	29.00	Catastrophic
B150/gate/p25	−1.26	0.996	28.25	Marginal hurt
Exit modes	88.55 (=base)	—	—	0 early exits

Why velocity gates are catastrophic: V50 = instability_50 − v750_velocity and V150 = instability_150 − v750_velocity. The velocity divergence short-minus-long is highly noisy at short windows. Gating on it suppresses large fractions of trades (compound-leverage paradox: each suppressed trade costs more than it saves due to capital compounding).

Why exit modes show 0 triggers in AE: _try_entry is the wrong hook for exits. The AE exit path goes through exit_manager.evaluate(), so an exit condition attached to _try_entry never fires. The fast-sweep exit approximations remain valid, but AE validation of exit modes requires an exit_manager override.

Exp 5 — Two-pass β DVAE (`exp5_dvae_twopass.py`)

Does high-β first pass → low-β second pass improve latent space?

Variant	AUC	vs baseline	Active dims
A: single-pass β=0.1	0.6918	—	8/8
B: β=4→β=0.1	<0.6918	−0.006	collapsed
C: β=2→β=0.1	<0.6918	negative	partial
D: dual concat β=4‖β=0.1	<0.6918	negative	mixed

Root cause: High-β collapses z_var to 0.008–0.019 (from 0.1+ single-pass) by epoch 10 via KL domination. The collapsed posterior is a worse initialiser than random. β=12 was not tested (β=6 already gave 0/20 active dims).
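
The filing does not state how "active dims" is counted. A common convention, used here purely as an assumption, is to call a latent dimension active when the variance of its posterior mean across samples exceeds a small threshold. The threshold value and the function name are hypothetical.

```python
from statistics import pvariance
from typing import Sequence


def active_dims(latent_means: Sequence[Sequence[float]],
                threshold: float = 0.01) -> int:
    """Count latent dimensions whose mean varies across samples.

    latent_means holds one row per sample and one column per latent
    dimension. The 0.01 threshold is an assumed convention, not a value
    taken from exp5_dvae_twopass.py.
    """
    if not latent_means:
        raise ValueError("need at least one sample")
    n_dims = len(latent_means[0])
    count = 0
    for d in range(n_dims):
        column = [row[d] for row in latent_means]
        if pvariance(column) > threshold:
            count += 1
    return count


# Toy encoder output: dimension 0 has collapsed, dimension 1 is in use.
z_mu = [[0.0, 1.0], [0.0, -1.0], [0.0, 1.0], [0.0, -1.0]]
n_active = active_dims(z_mu)
```

A collapsed posterior shows up as columns with near-zero variance, because the encoder emits nearly the same value for every input.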

Exp 4 — Coupling Sweep (`exp4_proxy_coupling.py`) — COMPLETE

155 configs tested in 0.11s (retroactive). Shadow run confirmed: ROI=88.55%, Trades=2155.

DD < 15.05% AND ROI ≥ 84.1% candidates (19 found; 6 shown, sorted by DD):

Config	ROI%	DD%	ΔROI	ΔDD	Note
B/pb_entry/thr0.35/a1.0	86.93	14.89	−1.62	−0.15	scale_boost, smean=1.061
E/stop_0.003	89.90	14.91	+1.36	−0.14	pure_stop, 18 triggers
B/pb_entry/thr0.35/a0.5	87.74	14.97	−0.81	−0.08	scale_boost
E/stop_0.005	89.29	14.97	+0.74	−0.07	pure_stop, 11 triggers
E/stop_0.015	89.27	14.97	+0.72	−0.07	pure_stop, 2 triggers
F/stop_0.005/gate_p0.5	88.68	15.03	+0.14	−0.01	gated_stop, 4 triggers

Best per mode: scale_suppress → DD worsens; hold_limit → DD worsens; rising_exit → DD worsens; pure_stop → best legitimate DD reducer; gated_stop → marginal (few triggers).

IMPORTANT CAVEAT: entry_bar alignment bug caused 2107/2155 pb_entry to be NaN (entry_bar appears to store global_bar_idx not per-day ri). The proxy-coupling modes (A, B, F) used median fill-in for 98% of trades → effectively a null test. Only Mode E (pure_stop) is fully valid because it uses MAE computed from shadow hold prices.
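
If the diagnosis above holds (entry_bar is a global bar index rather than a per-day row index), the conversion needed before looking up pb_entry could look like the sketch below. The function names and the toy day lengths are hypothetical; the real per-day bar counts would come from the scan parquets.

```python
from bisect import bisect_right
from itertools import accumulate
from typing import List, Sequence, Tuple


def day_start_offsets(day_lengths: Sequence[int]) -> List[int]:
    """Global index of the first bar of each day."""
    return [0] + list(accumulate(day_lengths))[:-1]


def to_day_local(global_bar_idx: int, offsets: Sequence[int],
                 total_bars: int) -> Tuple[int, int]:
    """Map a global bar index to (day number, row index within that day)."""
    if not 0 <= global_bar_idx < total_bars:
        raise ValueError("global_bar_idx is outside the scanned range")
    day = bisect_right(offsets, global_bar_idx) - 1
    return day, global_bar_idx - offsets[day]


# Toy layout: three days with 4, 3 and 5 bars.
day_lengths = [4, 3, 5]
offsets = day_start_offsets(day_lengths)
total = sum(day_lengths)

day, ri = to_day_local(5, offsets, total)
```

Indexing a per-day array with the global index directly would run past the end of the array for every day after the first, which is one way a bug of this kind could produce NaN for the large majority of trades.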

Valid conclusion from Exp 4:

A 0.3% retroactive stop (E/stop_0.003) improves BOTH ROI (+1.36pp) and DD (−0.14pp)
Only 18 trades triggered → the improvement is modest but directionally sound
The proxy-coupled stop (Mode F) needs proper entry_bar alignment to test meaningfully
Next step: implement stop_pct parameter in exit_manager for real engine test
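
A retroactive stop of this kind can be sketched as below. This is not the logic of exp4_proxy_coupling.py: it assumes MAE is stored as a positive fraction, that a stopped trade realises exactly −stop_pct, and it ignores fees, slippage and compounding. The optional `pb_entry`/`pb_gate` arguments illustrate a gated variant in the spirit of Mode F; both names are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def apply_retroactive_stop(
    pnl_frac: Sequence[float],
    mae_frac: Sequence[float],
    stop_pct: float,
    pb_entry: Optional[Sequence[float]] = None,
    pb_gate: Optional[float] = None,
) -> Tuple[List[float], int]:
    """Return (adjusted PnL per trade, number of stop triggers).

    If pb_entry and pb_gate are both given, the stop is armed only for
    trades whose proxy_B at entry is >= pb_gate. A NaN pb_entry compares
    False and therefore leaves the stop unarmed.
    """
    gated = pb_entry is not None and pb_gate is not None
    adjusted: List[float] = []
    triggers = 0
    for i, (pnl, mae) in enumerate(zip(pnl_frac, mae_frac)):
        armed = (not gated) or pb_entry[i] >= pb_gate
        if armed and mae >= stop_pct:
            adjusted.append(-stop_pct)
            triggers += 1
        else:
            adjusted.append(pnl)
    return adjusted, triggers


# Toy trades, not shadow-run data.
pnl = [0.004, -0.006, 0.002, -0.001]
mae = [0.001, 0.007, 0.004, 0.002]

pure, n_pure = apply_retroactive_stop(pnl, mae, stop_pct=0.003)
gated, n_gated = apply_retroactive_stop(
    pnl, mae, stop_pct=0.003, pb_entry=[0.0, 1.0, 0.0, 0.0], pb_gate=0.5)
```

In the toy data the pure stop saves 0.3pp on the second trade but also converts the third trade, which dipped and then recovered to +0.2%, into a −0.3% loss. That trade-off is why the net effect has to be measured on real trades rather than assumed.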

5. Core Findings

5.1 The Compound-Leverage Paradox

With dynamic leverage, gating ANY subset of trades (even below-average quality ones) costs ROI because capital that would have compounded is left idle. Breaking even requires the gated trades to have strongly negative expected value, but with a 50.5% win rate and PF=1.215 the average trade is net-positive.
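
A toy illustration of the compounding argument. It models plain fixed-fraction compounding only, not the engine's dynamic leverage, so it shows the weaker break-even condition: gating helps only when the product of (1 + fraction × return) over the gated trades is below 1. All numbers are invented.

```python
from typing import Iterable, Sequence


def compounded_roi(trade_returns: Iterable[float],
                   fraction: float = 1.0) -> float:
    """Final ROI when each trade's return compounds on current equity."""
    equity = 1.0
    for r in trade_returns:
        equity *= 1.0 + fraction * r
    return equity - 1.0


def without(trades: Sequence[float], gated: Iterable[int]) -> list:
    """Drop the trades at the given indices, as a gate would."""
    skip = set(gated)
    return [r for i, r in enumerate(trades) if i not in skip]


all_trades = [0.020, -0.010, 0.004, -0.012, 0.015, 0.001]

roi_all = compounded_roi(all_trades)
# Gate the two weakest winners: below average, but still positive.
roi_gate_weak_winners = compounded_roi(without(all_trades, [2, 5]))
# Gate the worst loser: the only kind of gate that helps here.
roi_gate_loser = compounded_roi(without(all_trades, [3]))
```

Removing weak but positive trades lowers the final ROI even though it raises the average return per remaining trade, because every later trade compounds on a smaller equity base.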

5.2 Why proxy_B Gating Specifically Hurts

scale_mean ≥ 1.0 in the position sizing tests means proxy_B is LOWER than the neutral baseline during most trading windows. The system already tends not to enter during high-proxy periods. Gating explicitly on high proxy removes the REMAINING high-proxy trades, which happen to be positive on average.

5.3 The Unresolved Question: MAE vs Final PnL

proxy_B has AUC=0.715 for eigenspace stress prediction, so the signal IS predictive of something real. The hypothesis, untested until Exp 4, is that proxy_B predicts intraday adversity (MAE) but NOT final trade outcome, because the engine's exit logic successfully recovers from intraday stress. Section 3 supports this on the 48 aligned trades (MAE: r=+0.42, p=0.003; pnl_frac: r=+0.17, ns). If it holds on a properly aligned sample:

proxy_B fires during the rough patch mid-trade
The trade then recovers to its natural TP/exit
Gating removes trades that look scary but ultimately recover
A tighter retroactive stop ONLY during high-proxy periods might reduce DD without proportionally reducing ROI — if the recovery is systematic

6. Open Research Directions

Priority	Direction	Rationale
HIGH	Exp 4 coupling results	Does gated stop reduce DD without ROI cost?
MED	Exit hook override	Implement `exit_manager` proxy gate for proper AE test
MED	5s crossover test	Does vel_div crossover on 5s data escape fee pressure?
LOW	Longer proxy windows	B300, B500 (instability_300 not in data)
LOW	Combined proxy	B50 × B150 product for sharper stress signal

7. Files

File	Description
`exp1_proxy_sizing.py`	Position scaling by proxy_B
`exp2_proxy_exit.py`	Shadow exit analysis (corrected)
`exp3_longer_proxies.py`	All 5 proxies × all 3 modes × 3 thresholds
`exp4_proxy_coupling.py`	Coupling sweep + orthogonality test
`exp5_dvae_twopass.py`	Two-pass β DVAE test
`exp1_proxy_sizing_results.json`	Logged results
`exp2_proxy_exit_results.json`	Logged results
`exp3_fast_sweep_results.json`	Fast numpy sweep
`exp3_alpha_engine_results.json`	AE validation
`exp4_proxy_coupling_results.json`	Coupling sweep output
`exp5_dvae_twopass_results.json`	Two-pass DVAE output
`flint_hd_vae.py`	FlintHDVAE implementation
`e2e_precursor_auc.py`	AUC measurement infrastructure
