DOLPHIN/nautilus_dolphin/dvae/PROXY_B_RESEARCH_FILING.md
hjnormey 01c19662cb initial: import DOLPHIN baseline 2026-04-21 from dolphinng5_predict working tree
Includes core prod + GREEN/BLUE subsystems:
- prod/ (BLUE harness, configs, scripts, docs)
- nautilus_dolphin/ (GREEN Nautilus-native impl + dvae/ preserved)
- adaptive_exit/ (AEM engine + models/bucket_assignments.pkl)
- Observability/ (EsoF advisor, TUI, dashboards)
- external_factors/ (EsoF producer)
- mc_forewarning_qlabs_fork/ (MC regime/envelope)

Excludes runtime caches, logs, backups, and reproducible artifacts per .gitignore.
2026-04-21 16:58:38 +02:00


proxy_B — Research Filing

Date: 2026-03-14

Status: Closed for direct exploitation; open as modulator candidate

Gold baseline: ROI=+88.55%, PF=1.215, DD=15.05%, Sharpe=4.38, Trades=2155


1. Signal Definition

proxy_B = instability_50 - v750_lambda_max_velocity
  • instability_50: short-window (50-bar) eigenvalue instability in correlation matrix
  • v750_lambda_max_velocity: long-window (750-bar) max-eigenvalue velocity
  • Intuition: short-term stress MINUS long-term momentum. When this is high, the eigenspace is rapidly destabilising relative to its recent trend.
  • Available in 5s scan parquets. Computed in ShadowLoggingEngine.process_day().
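
As a minimal sketch of the definition above (not the code in ShadowLoggingEngine.process_day(), which is not shown here), the signal is an element-wise difference of two bar-aligned series. The function name `compute_proxy_b` and the toy values are hypothetical.

```python
from typing import List, Sequence


def compute_proxy_b(instability_50: Sequence[float],
                    v750_lambda_max_velocity: Sequence[float]) -> List[float]:
    """Element-wise proxy_B = instability_50 - v750_lambda_max_velocity."""
    if len(instability_50) != len(v750_lambda_max_velocity):
        raise ValueError("inputs must be aligned bar-for-bar")
    return [a - b for a, b in zip(instability_50, v750_lambda_max_velocity)]


# Toy values for illustration only, not real scan data.
inst50 = [0.50, 0.75, 1.25]
v750_vel = [0.25, 0.25, 0.50]
proxy_b = compute_proxy_b(inst50, v750_vel)
```

The length check matters in practice: the two inputs come from different window lengths (50 and 750 bars), so their warm-up periods differ and the series must be trimmed to a common range first.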

2. Discovery & Measurement

Experiment: Precursor sweep (e2e_precursor_auc.py, flint_precursor_sweep.py)

Metric: AUC for predicting eigenspace stress events at K=5 bars forward

| Window / Signal | AUC (K=5) |
|---|---|
| proxy_B (inst50 - v750_vel) | 0.715 |
| instability_50 alone | ~0.65 |
| v750_lambda_max_velocity alone | ~0.61 |
| FlintHDVAE latent z (β=0.1) | 0.6918 |
| vel_div (entry signal) | baseline |

proxy_B leads stress events by ~25 seconds (5-bar horizon on 5s data). It is NOT the entry signal — it measures a different aspect of the eigenspace.
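
The K-bar-forward AUC can be sketched as follows. This is an illustration, not the logic of e2e_precursor_auc.py: it assumes the label at bar t is 1 when any stress event falls in bars t+1 to t+K (the filing does not state the exact labelling rule), and it computes AUC by direct pairwise comparison. Both function names are hypothetical.

```python
from typing import List, Sequence


def forward_event_labels(events: Sequence[int], k: int) -> List[int]:
    """label[t] = 1 if any event occurs in bars t+1 .. t+k (assumed rule)."""
    return [int(any(events[t + 1:t + 1 + k])) for t in range(len(events))]


def rank_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC = P(score of a positive > score of a negative), ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Toy series: one stress event at bar 3, horizon k=2.
events = [0, 0, 0, 1, 0, 0]
labels = forward_event_labels(events, k=2)
scores = [0.1, 0.8, 0.9, 0.2, 0.3, 0.1]
auc = rank_auc(scores, labels)

# Lead time implied by the horizon: K bars times the bar length in seconds.
lead_seconds = 5 * 5
```

The pairwise loop is quadratic in the number of bars, which is fine for a toy check but too slow for full 5s scan data; a rank-based implementation would be the usual replacement.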


3. Orthogonality to System Signals

Test: Exp 4 shadow run; 48/2155 trades had valid aligned pb_entry+vd_entry (entry_bar alignment bug: only ~2% of trades yield correctly matched bar-level values; see the IMPORTANT CAVEAT under Exp 4 in Section 4).

| Pair | Pearson r | p-value | Spearman rho | Verdict |
|---|---|---|---|---|
| proxy_B_entry ↔ vel_div_entry | -0.031 | 0.837 | -0.463 | Orthogonal (ns) |
| proxy_B_entry ↔ pnl_frac | +0.166 | 0.260 | +0.158 | Not predictive of outcome (ns) |
| proxy_B_entry ↔ MAE | +0.420 | 0.003 ** | +0.149 | Predicts intraday adversity |
| proxy_B_entry ↔ hold_bars | -0.054 | 0.717 | -0.171 | Orthogonal (ns) |
| proxy_B_max ↔ pnl_frac | +0.066 | 0.655 | -0.379 | ns |
| proxy_B_max ↔ MAE | +0.047 | 0.750 | -0.280 | ns |

Mann-Whitney (worst-10% pnl vs rest): pb_entry worst=-5.40, rest=+0.27, p=0.183 ns

Mann-Whitney (worst-10% MAE vs rest): pb_entry worst=-5.41, rest=+0.27, p=0.115 ns

Critical finding:

  • proxy_B IS orthogonal to vel_div (the entry signal) — r≈0, ns ✓
  • proxy_B does NOT predict final trade PnL — r=+0.17, ns ✓ (confirms prior findings)
  • proxy_B DOES predict intraday adversity (MAE): r=+0.42, p=0.003 ← KEY
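
As a sanity check on the reported significance, a Pearson r converts to a t statistic with n-2 degrees of freedom. The sketch below assumes n=48 (the number of validly aligned trades) applies to every pair; the helper names are hypothetical, and the p-value lookup itself (Student t distribution) is left out to keep the example dependency-free.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Plain Pearson correlation coefficient."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two equal-length series with n >= 3")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r: float, n: int) -> float:
    """t = r * sqrt((n - 2) / (1 - r^2)), with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Reported values; n = 48 is an assumption taken from the aligned-trade count.
t_mae = t_statistic(0.420, 48)   # about 3.14
t_pnl = t_statistic(0.166, 48)   # about 1.14
```

A t near 3.14 on 46 degrees of freedom is far in the tail, while a t near 1.14 is not, which matches the split between the MAE and pnl_frac rows.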

Mechanistic interpretation: When proxy_B is high at entry, the trade experiences a worse intraday adverse excursion (deeper MAE). But final PnL is unaffected because the engine's exit logic (TP/max_hold/direction-confirm) successfully navigates through the stress period. This is the complete explanation for why:

  1. Gating on proxy_B removes trades that are temporarily stressed but then RECOVER → hurts
  2. A proxy-coupled stop would cut those recoveries short → reduces DD but also reduces ROI
  3. The signal has genuine information content (AUC=0.715, MAE correlation p=0.003) but the system is ALREADY correctly managing the trades it tags as stressed
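
The distinction between MAE and final PnL can be made concrete with a small sketch. It assumes MAE is expressed as a positive fraction of entry price and that direction is +1 for long and -1 for short; the filing does not state its sign convention, and `mae_frac` is a hypothetical helper.

```python
from typing import Sequence


def mae_frac(entry_price: float, hold_prices: Sequence[float],
             direction: int) -> float:
    """Maximum adverse excursion as a positive fraction of entry price."""
    if direction not in (1, -1):
        raise ValueError("direction must be +1 (long) or -1 (short)")
    worst = 0.0
    for price in hold_prices:
        # Signed move in the trade's favour; negative means adverse.
        move = direction * (price - entry_price) / entry_price
        worst = min(worst, move)
    return -worst


# Toy long trade: dips 1% against the position, then exits 1% in profit.
prices = [99.5, 99.0, 100.5, 101.0]
mae = mae_frac(100.0, prices, direction=1)
final_pnl = (prices[-1] - 100.0) / 100.0
```

The toy trade has a 1% adverse excursion and a 1% gain at exit, which is the "stressed, then recovers" pattern described in point 1.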

4. Experiments Performed

Exp 1 — proxy_B Position Sizing (exp1_proxy_sizing.py)

Tests bet_sizer.base_fraction * scale(proxy_B_at_entry).

| Config | ROI% | PF | DD% | Sharpe | scale_mean |
|---|---|---|---|---|---|
| GOLD | 88.55 | 1.215 | 15.05 | 4.38 | |
| Baseline (no sizing) | 88.55 | 1.2147 | 15.05 | 4.378 | |
| S1 [0.5x-1.5x] w500 | 91.48 | 1.1782 | 16.93 | 3.528 | 1.004 |
| S2 [0.25x-2.0x] w500 | 105.51 | 1.1537 | 20.30 | 2.956 | 1.133 |
| S3 [0.5x-1.5x] w1000 | 89.49 | 1.1763 | 16.69 | 3.514 | 1.000 |
| S4 [0.5x-1.5x] clip | 87.13 | 1.1628 | 18.03 | 3.184 | 1.019 |

Finding: scale_mean ≥ 1.0 in all configs (1.000 to 1.133) → proxy_B is more often LOW during trading activity, meaning the engine sizes UP on average. The higher ROI (S2: +17pp) is a leverage effect, not signal quality: PF drops in every sized config and Sharpe collapses (4.378 → 2.956 for S2). The signal is anti-correlated with trade quality per unit capital.
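
One way a scale(proxy_B_at_entry) function could produce this effect is sketched below. The actual scale function in exp1_proxy_sizing.py is not shown in this filing, so this is an assumption: a percentile rank over a trailing window, mapped linearly so that low proxy_B sizes up and high proxy_B sizes down. Both helper names are hypothetical.

```python
from typing import Sequence


def rolling_rank(history: Sequence[float], value: float) -> float:
    """Fraction of trailing-window values <= value, in [0, 1]."""
    if not history:
        return 0.5  # neutral rank when there is no history yet
    return sum(1 for h in history if h <= value) / len(history)


def proxy_scale(rank: float, lo: float = 0.5, hi: float = 1.5) -> float:
    """Linear map: rank 0 -> hi (size up), rank 1 -> lo (size down)."""
    return hi - (hi - lo) * rank


window = [0.0, 1.0, 2.0, 3.0]           # toy trailing proxy_B values
calm_entry = proxy_scale(rolling_rank(window, 0.5))    # low proxy_B
stressed_entry = proxy_scale(rolling_rank(window, 3.0))  # high proxy_B
```

Under this mapping the neutral scale is 1.0 at the median rank, so a scale_mean above 1.0 across trades means entries cluster at below-median proxy_B.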

Exp 2 — proxy_B Shadow Exit (exp2_proxy_exit.py)

Post-hoc test: would exiting when proxy_B < threshold have helped?

| Threshold | Trigger rate | AvgDelta% | Early better | Est. ROI |
|---|---|---|---|---|
| p10 | ~60% of trades (at ≥1 bar) | -0.15% | 37% | -0.96pp |
| p25 | ~69% | +0.04% | 43% | +0.26pp |
| p50 | ~85% | +0.02% | 43% | +0.15pp |

Note: High trigger rates are mathematically expected: a 120-bar hold has ~100% chance of any bar crossing the p50 level. The signal fires constantly during holds; using it as an exit trigger is noise, not information.

Verdict: Holding to natural exit is better. Early exit is weakly beneficial in only 37-43% of cases.
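
The "mathematically expected" claim follows from a simple bound, sketched here under the assumption that bars are independent draws (they are not: proxy_B is autocorrelated and many holds are shorter than 120 bars, which is why the observed trigger rates are below this idealised figure). `p_any_trigger` is a hypothetical helper.

```python
def p_any_trigger(q: float, n_bars: int) -> float:
    """P(at least one of n_bars independent bars falls below the q-quantile)."""
    if not 0.0 <= q <= 1.0 or n_bars < 0:
        raise ValueError("q must be in [0, 1] and n_bars >= 0")
    return 1.0 - (1.0 - q) ** n_bars


p50_full_hold = p_any_trigger(0.50, 120)   # effectively 1.0
p10_ten_bars = p_any_trigger(0.10, 10)     # 1 - 0.9**10
```

Even the p10 threshold is more likely than not to fire within ten independent bars, so a trigger during a hold carries little information on its own.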

Exp 3 — Longer Window Proxies (exp3_longer_proxies.py)

All 5 proxy variants × 3 modes (gate/size/exit) × 3 thresholds. AE validation of top 10 fast-sweep configs.

| Config | AE ROI% | PF | DD% | Note |
|---|---|---|---|---|
| GOLD | 88.55 | 1.215 | 15.05 | |
| V50/gate/p50 | -21.58 | 0.822 | 31.94 | Catastrophic |
| V150/gate/p50 | -24.34 | 0.909 | 31.97 | Catastrophic |
| B150/gate/p10 | -17.37 | 0.941 | 29.00 | Catastrophic |
| B150/gate/p25 | -1.26 | 0.996 | 28.25 | Marginal hurt |
| Exit modes | 88.55 (=base) | | | 0 early exits |

Why velocity gates are catastrophic: V50 = instability_50 - v750_velocity and V150 = instability_150 - v750_velocity. The velocity divergence (short minus long) is highly noisy at short windows. Gating on it suppresses large fractions of trades (compound-leverage paradox: each suppressed trade costs more than it saves due to capital compounding).

Exit mode 0-triggers in AE: _try_entry is the wrong hook for exits. The AE exit path goes through exit_manager.evaluate(). Fast-sweep exit approximations are valid; AE validation of exit modes requires exit_manager override.

Exp 5 — Two-pass β DVAE (exp5_dvae_twopass.py)

Does high-β first pass → low-β second pass improve latent space?

| Variant | AUC | vs baseline | Active dims |
|---|---|---|---|
| A: single-pass β=0.1 | 0.6918 | | 8/8 |
| B: β=4→β=0.1 | <0.6918 | -0.006 | collapsed |
| C: β=2→β=0.1 | <0.6918 | negative | partial |
| D: dual concat β=4‖β=0.1 | <0.6918 | negative | mixed |

Root cause: High-β collapses z_var to 0.008-0.019 (from 0.1+ single-pass) by epoch 10 via KL domination. The collapsed posterior is a worse initialiser than random. β=12 was not tested (β=6 already gave 0/20 active dims).
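
Counting active dimensions can be sketched as a per-dimension variance check. This assumes z_var refers to the variance of latent means across samples and uses a hypothetical threshold of 0.05, chosen only because it sits between the collapsed range (0.008-0.019) and the healthy single-pass level (0.1+); neither the threshold nor the function name comes from flint_hd_vae.py.

```python
from statistics import pvariance
from typing import List, Sequence, Tuple


def active_dims(latent_means: Sequence[Sequence[float]],
                var_threshold: float = 0.05) -> Tuple[int, List[float]]:
    """Count latent dims whose variance across samples exceeds the threshold."""
    n_dims = len(latent_means[0])
    variances = [pvariance([row[d] for row in latent_means])
                 for d in range(n_dims)]
    n_active = sum(1 for v in variances if v > var_threshold)
    return n_active, variances


# Toy batch: dim 0 carries information, dim 1 has collapsed to a constant.
z_mu = [[1.0, 0.10], [-1.0, 0.12], [1.0, 0.08], [-1.0, 0.10]]
n_active, z_var = active_dims(z_mu)
```

A dimension whose means barely move across inputs is being ignored by the decoder, which is the collapse pattern the high-β first pass produces.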

Exp 4 — Coupling Sweep (exp4_proxy_coupling.py) — COMPLETE

155 configs tested in 0.11s (retroactive). Shadow run confirmed: ROI=88.55%, Trades=2155.

DD < 15.05% AND ROI ≥ 84.1% candidates (19 found):

| Config | ROI% | DD% | ΔROI | ΔDD | Note |
|---|---|---|---|---|---|
| B/pb_entry/thr0.35/a1.0 | 86.93 | 14.89 | -1.62 | -0.15 | scale_boost, smean=1.061 |
| E/stop_0.003 | 89.90 | 14.91 | +1.36 | -0.14 | pure_stop, 18 triggers |
| B/pb_entry/thr0.35/a0.5 | 87.74 | 14.97 | -0.81 | -0.08 | scale_boost |
| E/stop_0.005 | 89.29 | 14.97 | +0.74 | -0.07 | pure_stop, 11 triggers |
| E/stop_0.015 | 89.27 | 14.97 | +0.72 | -0.07 | pure_stop, 2 triggers |
| F/stop_0.005/gate_p0.5 | 88.68 | 15.03 | +0.14 | -0.01 | gated_stop, 4 triggers |

Best per mode: scale_suppress → DD worsens; hold_limit → DD worsens; rising_exit → DD worsens; pure_stop → best legitimate DD reducer; gated_stop → marginal (few triggers).

IMPORTANT CAVEAT: entry_bar alignment bug caused 2107/2155 pb_entry to be NaN (entry_bar appears to store global_bar_idx not per-day ri). The proxy-coupling modes (A, B, F) used median fill-in for 98% of trades → effectively a null test. Only Mode E (pure_stop) is fully valid because it uses MAE computed from shadow hold prices.
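
If entry_bar does hold a global bar index, as the caveat suspects, the fix is a lookup from global index to (day, row-in-day). The sketch below assumes a sorted list of each day's first global bar index is available; that list, the 17280 bars-per-day figure (24h of 5s bars), and the function name are all hypothetical.

```python
from bisect import bisect_right
from typing import Sequence, Tuple


def to_per_day_index(global_bar_idx: int,
                     day_start_offsets: Sequence[int]) -> Tuple[int, int]:
    """Map a global bar index to (day_number, row_in_day)."""
    if global_bar_idx < 0:
        raise ValueError("global_bar_idx must be non-negative")
    # Index of the last day whose start offset is <= global_bar_idx.
    day = bisect_right(day_start_offsets, global_bar_idx) - 1
    if day < 0:
        raise ValueError("index precedes the first day")
    return day, global_bar_idx - day_start_offsets[day]


# Assumed layout: 17280 five-second bars per 24h day.
offsets = [0, 17280, 34560]
day, ri = to_per_day_index(17285, offsets)
```

Using the global index directly as a per-day row would point past the end of the day's array for every trade after day 0, which is consistent with almost all pb_entry values coming back NaN.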

Valid conclusion from Exp 4:

  • A 0.3% retroactive stop (E/stop_0.003) improves BOTH ROI (+1.36pp) and DD (-0.14pp)
  • Only 18 trades triggered → the improvement is modest but directionally sound
  • The proxy-coupled stop (Mode F) needs proper entry_bar alignment to test meaningfully
  • Next step: implement stop_pct parameter in exit_manager for real engine test
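
A retroactive pure stop of the kind tested in Mode E can be sketched as below. This is a simplification, not the logic of exp4_proxy_coupling.py: it assumes each trade is summarised by (pnl_frac, mae_frac) with MAE as a positive fraction, that a triggered stop fills exactly at -stop_pct, and it ignores fees, slippage, and position size. `apply_retroactive_stop` is a hypothetical name.

```python
from typing import List, Sequence, Tuple


def apply_retroactive_stop(trades: Sequence[Tuple[float, float]],
                           stop_pct: float) -> Tuple[List[float], int]:
    """Replace the PnL of any trade whose MAE reached stop_pct with -stop_pct."""
    adjusted: List[float] = []
    triggers = 0
    for pnl_frac, mae_frac in trades:
        if mae_frac >= stop_pct:
            adjusted.append(-stop_pct)   # stopped out at the stop level
            triggers += 1
        else:
            adjusted.append(pnl_frac)    # trade runs to its natural exit
    return adjusted, triggers


# Toy trades as (pnl_frac, mae_frac).
toy = [(0.004, 0.001), (-0.010, 0.012), (0.002, 0.005)]
stopped_pnls, n_triggers = apply_retroactive_stop(toy, stop_pct=0.003)
```

The toy set shows both sides of the trade-off: the second trade's loss is capped, while the third trade, which dipped and then recovered to a small gain, is turned into a loss.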

5. Core Findings

5.1 The Compound-Leverage Paradox

With dynamic leverage, gating ANY subset of trades (even below-average quality ones) costs ROI because capital that would have compounded is left idle. Gating breaks even only if the gated trades have strongly negative expected value, but with a 50.5% win rate and PF=1.215 the average trade is net-positive.
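
The compounding part of the argument can be illustrated with a toy equity curve. This sketch uses fixed-fraction compounding with made-up returns, so it does not model the engine's dynamic leverage; it only shows that removing a weakly positive subset lowers final equity. The `compound` helper and all return values are hypothetical.

```python
from typing import Sequence


def compound(returns: Sequence[float]) -> float:
    """Final equity multiple after compounding per-trade fractional returns."""
    equity = 1.0
    for r in returns:
        equity *= 1.0 + r
    return equity


# Toy per-trade returns; indices 2, 3 and 5 are the trades a gate would flag.
all_trades = [0.02, -0.01, 0.005, -0.004, 0.03, 0.001]
flagged = {2, 3, 5}
kept_trades = [r for i, r in enumerate(all_trades) if i not in flagged]
gated_trades = [r for i, r in enumerate(all_trades) if i in flagged]

equity_all = compound(all_trades)
equity_gated = compound(kept_trades)
gated_growth = compound(gated_trades)   # > 1.0, so gating loses equity
```

In this simplified model the gate helps only when the flagged trades compound to a growth factor below 1.0; below-average but still positive trades fail that test.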

5.2 Why proxy_B Gating Specifically Hurts

scale_mean ≥ 1.0 in the position sizing tests means proxy_B is LOWER than the neutral baseline during most trading windows. The system already tends not to enter during high-proxy periods. Gating explicitly on high proxy removes the REMAINING high-proxy trades, which happen to be positive on average.

5.3 The Unresolved Question: MAE vs Final PnL

proxy_B has AUC=0.715 for eigenspace stress prediction. The signal IS predictive of something real. The hypothesis (untested until Exp 4): proxy_B predicts intraday adversity (MAE) but NOT final trade outcome, because the engine's exit logic successfully recovers from intraday stress. If confirmed:

  • proxy_B fires during the rough patch mid-trade
  • The trade then recovers to its natural TP/exit
  • Gating removes trades that look scary but ultimately recover
  • A tighter retroactive stop ONLY during high-proxy periods might reduce DD without proportionally reducing ROI — if the recovery is systematic

6. Open Research Directions

| Priority | Direction | Rationale |
|---|---|---|
| HIGH | Exp 4 coupling results | Does gated stop reduce DD without ROI cost? |
| MED | Exit hook override | Implement exit_manager proxy gate for proper AE test |
| MED | 5s crossover test | Does vel_div crossover on 5s data escape fee pressure? |
| LOW | Longer proxy windows | B300, B500 (instability_300 not in data) |
| LOW | Combined proxy | B50 × B150 product for sharper stress signal |

7. Files

| File | Description |
|---|---|
| exp1_proxy_sizing.py | Position scaling by proxy_B |
| exp2_proxy_exit.py | Shadow exit analysis (corrected) |
| exp3_longer_proxies.py | All 5 proxies × all 3 modes × 3 thresholds |
| exp4_proxy_coupling.py | Coupling sweep + orthogonality test |
| exp5_dvae_twopass.py | Two-pass β DVAE test |
| exp1_proxy_sizing_results.json | Logged results |
| exp2_proxy_exit_results.json | Logged results |
| exp3_fast_sweep_results.json | Fast numpy sweep |
| exp3_alpha_engine_results.json | AE validation |
| exp4_proxy_coupling_results.json | Coupling sweep output |
| exp5_dvae_twopass_results.json | Two-pass DVAE output |
| flint_hd_vae.py | FlintHDVAE implementation |
| e2e_precursor_auc.py | AUC measurement infrastructure |