chore: safety snapshot 2026-03-05 — HCM infrastructure before 2y klines experiment

Captures critical infrastructure surrounding the nautilus_dolphin core package:

- dolphin_vbt_real.py: VBT vectorized backtest engine (6008 lines)
- dolphin_paper_trade_adaptive_cb_v2.py: champion runner (champion_5x_f20)
- _update_vbt_cache.py / update_VBT_parquet_cache.bat: cache builder
- external_factors/: ExF system (all 85 indicator fetchers + NPZ cache)
- mc_forewarning_qlabs_fork/: QLabs-enhanced MC-Forewarner research fork
- DATA_LOCATIONS.md: source-of-truth path registry
- .gitignore: excludes vbt_cache*, backfilled_data, .venv, models, etc.

Note: nautilus_dolphin/ has its own (inner) git repo; the safety snapshot was committed there separately.

Champion state: WR=49.3%, ROI=+44.89%, PF=1.123, DD=14.95%, Sharpe=2.50 (55d, full-stack, abs_max_lev=6.0).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-05 23:51:30 +01:00
commit 351ce2044d
38 changed files with 21890 additions and 0 deletions
--- a/mc_forewarning_qlabs_fork/benchmark_results/comparison_report.json
+++ b/mc_forewarning_qlabs_fork/benchmark_results/comparison_report.json
@@ -0,0 +1,52 @@
+{
+  "regression": {
+    "roi": {
+      "baseline_r2": 0.6477214907414871,
+      "qlabs_r2": 0.6619111823995362,
+      "r2_improvement": 0.014189691658049064,
+      "r2_improvement_pct": 2.1907087939610035,
+      "baseline_rmse": 14.992700064057505,
+      "qlabs_rmse": 14.687645475874271,
+      "rmse_improvement": 0.30505458818323383
+    },
+    "dd": {
+      "baseline_r2": 0.7054319934411389,
+      "qlabs_r2": 0.7078504319113373,
+      "r2_improvement": 0.002418438470198403,
+      "r2_improvement_pct": 0.34283084587659785,
+      "baseline_rmse": 5.083696667104963,
+      "qlabs_rmse": 5.062784778354399,
+      "rmse_improvement": 0.020911888750563712
+    }
+  },
+  "classification": {
+    "champion": {
+      "baseline_f1": 0.7580299785867237,
+      "qlabs_f1": 0.7417218543046358,
+      "f1_improvement": -0.016308124282087944,
+      "baseline_accuracy": 0.7175,
+      "qlabs_accuracy": 0.7075,
+      "accuracy_improvement": -0.010000000000000009,
+      "baseline_auc": 0.7762787659531705,
+      "qlabs_auc": 0.789493518239373,
+      "auc_improvement": 0.013214752286202502
+    },
+    "catastrophic": {
+      "baseline_f1": 0.0,
+      "qlabs_f1": 0.3333333333333333,
+      "f1_improvement": 0.3333333333333333,
+      "baseline_accuracy": 0.9875,
+      "qlabs_accuracy": 0.99,
+      "accuracy_improvement": 0.0024999999999999467,
+      "baseline_auc": 0.8830379746835444,
+      "qlabs_auc": 0.9883544303797468,
+      "auc_improvement": 0.1053164556962024
+    }
+  },
+  "summary": {
+    "avg_r2_improvement": 0.008304065064123733,
+    "avg_f1_improvement": 0.15851260452562269,
+    "regression_models": 2,
+    "classification_models": 2
+  }
+}
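The `summary` fields in comparison_report.json are consistent with a plain mean of the per-target improvements (QLabs minus baseline). The report does not state that definition, so the sketch below treats it as an assumption and recomputes the summary from the values above. The helper name `avg_improvement` is hypothetical and not taken from the fork.

```python
from statistics import mean

# Per-target values copied from comparison_report.json above.
report = {
    "regression": {
        "roi": {"baseline_r2": 0.6477214907414871, "qlabs_r2": 0.6619111823995362},
        "dd": {"baseline_r2": 0.7054319934411389, "qlabs_r2": 0.7078504319113373},
    },
    "classification": {
        "champion": {"baseline_f1": 0.7580299785867237, "qlabs_f1": 0.7417218543046358},
        "catastrophic": {"baseline_f1": 0.0, "qlabs_f1": 0.3333333333333333},
    },
}


def avg_improvement(models, metric):
    """Mean of (qlabs - baseline) for one metric across models (assumed definition)."""
    return mean(m[f"qlabs_{metric}"] - m[f"baseline_{metric}"] for m in models.values())


avg_r2 = avg_improvement(report["regression"], "r2")
avg_f1 = avg_improvement(report["classification"], "f1")

# Relative R2 gain for the ROI regressor, as a percentage of the baseline R2.
roi = report["regression"]["roi"]
roi_r2_pct = 100 * (roi["qlabs_r2"] - roi["baseline_r2"]) / roi["baseline_r2"]

print(f"avg R2 improvement: {avg_r2:+.4f}")
print(f"avg F1 improvement: {avg_f1:+.4f}")
print(f"ROI R2 improvement: {roi_r2_pct:.2f}%")
```

Under this reading, the positive average F1 improvement comes entirely from the CATASTROPHIC target, whose baseline F1 is 0.0. The CHAMPION classifier's F1 and accuracy both fell slightly, although its AUC rose.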
--- a/mc_forewarning_qlabs_fork/benchmark_results/comparison_report.md
+++ b/mc_forewarning_qlabs_fork/benchmark_results/comparison_report.md
@@ -0,0 +1,33 @@
+# QLabs Enhancement Benchmark Report
+
+**Date:** 2026-03-05 04:56
+
+## Summary
+
+- Average R² Improvement: +0.0083
+- Average F1 Improvement: +0.1585
+- Regression Models Tested: 2
+- Classification Models Tested: 2
+
+## Regression Results
+
+| Target | Baseline R² | QLabs R² | Improvement |
+|--------|-------------|----------|-------------|
+| ROI | 0.6477 | 0.6619 | +0.0142 |
+| DD | 0.7054 | 0.7079 | +0.0024 |
+
+## Classification Results
+
+| Target | Baseline F1 | QLabs F1 | Improvement |
+|--------|-------------|----------|-------------|
+| CHAMPION | 0.7580 | 0.7417 | -0.0163 |
+| CATASTROPHIC | 0.0000 | 0.3333 | +0.3333 |
+
+## QLabs Techniques Applied
+
+1. **Muon Optimizer**: Orthogonalized gradient updates via Newton-Schulz iteration
+2. **Heavy Regularization**: 16x weight decay (reg_lambda=1.6)
+3. **Epoch Shuffling**: 12 epochs with reshuffling
+4. **SwiGLU Activation**: Gated MLP activations (where applicable)
+5. **U-Net Skip Connections**: Residual pathways (where applicable)
+6. **Deep Ensembling**: Logit averaging across 8 models
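Two of the techniques listed above can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the fork's code. The orthogonalization uses the classic cubic Newton-Schulz iteration, which may differ from the coefficients the fork's Muon optimizer uses. The ensemble helper assumes binary classifiers that expose raw logits. The names `newton_schulz_orthogonalize` and `ensemble_proba` are hypothetical.

```python
import numpy as np


def newton_schulz_orthogonalize(grad, steps=20):
    """Approximate the nearest semi-orthogonal matrix to a tall or square, full-rank `grad`."""
    # Frobenius normalisation puts every singular value in (0, 1],
    # inside the convergence region of the cubic iteration.
    x = grad / (np.linalg.norm(grad) + 1e-12)
    for _ in range(steps):
        # Each step maps a singular value s to 1.5*s - 0.5*s**3, pushing it towards 1.
        x = 1.5 * x - 0.5 * x @ (x.T @ x)
    return x


def ensemble_proba(logits):
    """Average raw logits over the model axis (axis 0), then apply the sigmoid."""
    mean_logit = np.mean(logits, axis=0)
    return 1.0 / (1.0 + np.exp(-mean_logit))


# Toy gradient matrix (illustrative values only).
grad = np.array([[2.0, 1.0], [1.0, 3.0], [0.0, 1.0]])
q = newton_schulz_orthogonalize(grad)
print(np.round(q.T @ q, 6))  # columns of q are approximately orthonormal

# Eight ensemble members scoring five samples (synthetic logits).
rng = np.random.default_rng(0)
member_logits = rng.normal(size=(8, 5))
print(ensemble_proba(member_logits))
```

Averaging logits before the sigmoid differs from averaging probabilities: a single very confident member moves the combined score more in logit space than it would in probability space.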