# VIOLET V3 — Consolidated Findings
**Period:** 2026-06-13 → 2026-06-15. Branch `exp/pink-ditav2-sprint0-20260530`.
Master record of the V3 sprint (DecisionEngine SHADOW), the BLUE margin/edge study,
and all parity findings. Companion to memory `violet_v3_alpha_doctrine`,
`blue_margin_envelope_study`, `violet_subsecond_rebuild_plan` and the per-topic docs
referenced inline.

---

## 0. Governing doctrine (operator-set)
- **Model BLUE, not PINK.** Reference = BLUE's live Alpha Engine (holistic SOA).
Behavioural/distributional fidelity, not PINK pick-parity.
- **Live BLUE code is the sole doctrine.** `blue_parity.py` is a PINK-era distillation
whose fidelity must be VALIDATED (it had drift, see §3).
- **Follow BLUE in all regards — no VIOLET-imposed hygiene.** No filters BLUE lacks;
replicate BLUE's filters exactly. vel_div spikes are signal, not garbage.
- **Reactor substrate.** BLUE's scan-quantized behaviour is hosted on the V0 event
reactor and quantized at Q=scan initially, per-action knobs loosenable later.
- **3-layer:** L1 pure alpha (decide+size as BLUE) → L2 parity harness (mock exchange)
→ L3 tradeability (conviction→exchange leverage + maker policy).
- **Decision layer is slot-independent** — VIOLET decides every scan; the slot only
gates trades (execution layer). Different layers.
## 1. What shipped (V3a–V3.2)
| Commit | Module | Content |
|---|---|---|
| V3a | `alpha_wrappers.py` | V-TYPES wrappers over live `AlphaAssetSelector`/`AlphaBetSizer`/`AlphaExitEngineV7`; `max_leverage=9` pinned |
| V3b | `cadence.py` (+spec) | `CadenceControlPlane` — universal per-action tunable Q, control-plane-surfaced |
| V3c | `decision_engine.py` | `VioletDecisionEngine` reactor-resident SHADOW (no exec) |
| V3d | `parity_harness.py` | base-sizer median-curve parity gate vs recorded BLUE |
| V3e | `shadow_journal.py` + `22_violet_decisions.sql` + launcher | reject-at-source CH journal; DARK soak wiring |
| V3.1 | (decision_engine) | BLUE stablecoin exclusion (parity fix) |
| V3.2 | `modulation.py` | EsoF size-modulation fold (BLUE SC haircut, exact) |

Full violet suite green (129+ tests); ZERO shared-file edits all sprint (mechanical
check per commit).
## 2. The bet-sizing model (validated)
Self-consistent at row level vs recorded `dolphin.trade_events`:
- `our_leverage = entry_price·quantity / capital_before` = notional/capital.
- `leverage` = conviction ∈ [0.5, 9] (cubic-convex strength³ curve).
- **`notional = capital × 0.20 (base_fraction) × leverage`** → our_leverage = 0.20×leverage,
max ≈ 1.81.
- **DUAL-LEVERAGE:** conviction leverage sizes the QUANTITY (internal); exchange leverage
mapped at the venue boundary via `prod/bingx/leverage.py`
`map_internal_conviction_to_exchange_leverage_target` (round_half_even linear
0.5–9.0 → 1..cap; PINK/VIOLET use a max-3× cubic translator).
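
The sizing identities above can be sketched as follows. This is a minimal illustration, not the repository code: `exchange_leverage_sketch` is a hypothetical stand-in for the linear round-half-even mapping described in the last bullet (the real function lives in `prod/bingx/leverage.py` and its exact arithmetic is not reproduced here), and the 0.5 and 9.0 bounds are taken from the conviction range stated above.

```python
BASE_FRACTION = 0.20    # base_fraction from the sizing model above
MIN_CONVICTION = 0.5    # lower end of the stated conviction range
MAX_CONVICTION = 9.0    # upper end of the stated conviction range


def notional(capital: float, conviction: float) -> float:
    """notional = capital x base_fraction x conviction leverage."""
    return capital * BASE_FRACTION * conviction


def our_leverage(entry_price: float, quantity: float, capital_before: float) -> float:
    """Realized notional/capital, as recomputed from a recorded trade row."""
    return entry_price * quantity / capital_before


def exchange_leverage_sketch(conviction: float, cap: int) -> int:
    """ASSUMPTION: linear map of [0.5, 9.0] onto 1..cap, rounded half-even.

    Python's built-in round() already rounds halves to the nearest even integer.
    """
    c = min(max(conviction, MIN_CONVICTION), MAX_CONVICTION)
    t = (c - MIN_CONVICTION) / (MAX_CONVICTION - MIN_CONVICTION)
    return int(round(1 + t * (cap - 1)))


capital = 10_000.0
n = notional(capital, MAX_CONVICTION)
print(our_leverage(entry_price=100.0, quantity=n / 100.0, capital_before=capital))
```

At maximum conviction the realized notional/capital is 0.20 × 9, which is why the internal conviction figure and the leverage requested from the venue are different quantities.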
## 3. blue_parity drift (doctrine validated by evidence)
`blue_parity.py` passes `AlphaBetSizer(max_leverage=8.0)`; live kernel *default* is 5.0;
recorded conviction reaches **9.0** (gold spec). So `blue_parity` is NOT at parity —
VIOLET pins `max_leverage=9.0` explicitly (never inherits a default). Confirmed by V3d:
base sizer reproduces BLUE's recorded MEDIAN curve at **pearson 0.9998 / max_abs_err
0.238** with max_leverage 9 / thr 0.02 / extreme 0.05 / convexity 3.
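
A hedged sketch of what such a parity gate measures. The curve form below (`min + (max - min) × strength³`, with strength normalized between the threshold and the extreme) is an assumption inferred from the parameters quoted above; the authoritative math is in the live `AlphaBetSizer`. The function names are illustrative and are not the `parity_harness.py` API.

```python
import math


def base_conviction(vel_div, thr=0.02, extreme=0.05, convexity=3.0,
                    min_lev=0.5, max_lev=9.0):
    """ASSUMED cubic-convex curve on the vel_div magnitude, saturating at max_lev."""
    strength = (abs(vel_div) - thr) / (extreme - thr)
    strength = min(max(strength, 0.0), 1.0)
    return min_lev + (max_lev - min_lev) * strength ** convexity


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def max_abs_err(xs, ys):
    """Largest pointwise absolute difference."""
    return max(abs(x - y) for x, y in zip(xs, ys))


# Toy comparison: the model curve against a slightly perturbed "recorded" curve.
grid = [0.02 + 0.003 * i for i in range(11)]
model = [base_conviction(v) for v in grid]
recorded = [m + 0.05 for m in model]  # synthetic offset, not real BLUE data
print(pearson(model, recorded), max_abs_err(model, recorded))
```

Reporting both numbers matters: a constant offset leaves the correlation at 1 while still showing up in the maximum absolute error.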
## 4. BLUE margin-envelope + true-edge study (`blue_margin_envelope_study.md`)
**Raw `sum(pnl)` = -$45,981 is artifact-dominated:**
- duplicate-emission: 3453 rows / 2193 trade_ids; 98.3% of multi-row tids have IDENTICAL
pnl (pure dup; real legs live in `trade_exit_legs`); one tid had 317 dup rows.
- HIBERNATE_HALT artifacts incl. ZEC -$39,164 (73.5%, bars_held=0) = 85% of raw loss.

**Cleaned (dedup + drop HIBERNATE & bars_held=0): +$47,068** (2121 trades, 58.4% win).
Independent corroboration: corrected-capital trajectory grew **$33,820 → $69,673
(+$35,852)** in the tracked window. Two methods agree → robust.
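
The cleaning rule can be sketched over plain row dictionaries. This is a simplified illustration: the field name `exit_reason` is hypothetical (the source only says HIBERNATE artifacts are dropped), and keeping the first row per `trade_id` relies on the observation above that duplicate rows carry identical pnl.

```python
def clean_trades(rows):
    """Deduplicate by trade_id, then drop HIBERNATE exits and bars_held == 0 rows."""
    seen = set()
    cleaned = []
    for row in rows:
        if row["trade_id"] in seen:
            continue  # duplicate emission of an already-seen trade
        seen.add(row["trade_id"])
        # `exit_reason` is an assumed column name for the halt marker.
        if row["exit_reason"].startswith("HIBERNATE") or row["bars_held"] == 0:
            continue
        cleaned.append(row)
    return cleaned


def summarize(rows):
    """Total pnl and win rate over a list of trade rows."""
    total = sum(r["pnl"] for r in rows)
    wins = sum(1 for r in rows if r["pnl"] > 0)
    return total, wins / len(rows)


rows = [  # synthetic rows, not recorded data
    {"trade_id": "a", "pnl": 10.0, "bars_held": 5, "exit_reason": "TP"},
    {"trade_id": "a", "pnl": 10.0, "bars_held": 5, "exit_reason": "TP"},
    {"trade_id": "b", "pnl": -500.0, "bars_held": 0, "exit_reason": "HIBERNATE_HALT"},
    {"trade_id": "c", "pnl": -3.0, "bars_held": 7, "exit_reason": "STOP"},
]
print(summarize(rows), summarize(clean_trades(rows)))
```
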

**Margin envelope (856 clean trades):** single-slot (operator-confirmed, no stacking).
Fit at 1× = 73.6%; **fit at 2× = 100%**; median wallet utilization @2× = **3.4%**
⟹ capital substantially UNDER-utilized; margin never binding; edge realizable on-exchange.
Recommend flat **3×** exchange leverage (p95 util 90% at 2×).

**Make-or-break fear REFUTED:** the worry was that ~9× notional would be infeasible; realized
notional/capital maxes at **1.81** (because 0.20 base × 9), so 2× exchange leverage suffices.
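
A minimal sketch of how a fit rate and a utilization figure of this kind could be computed. The definitions are assumptions, not the study's code: a trade "fits" at exchange leverage L when its notional/capital ratio is at most L, and utilization is that ratio divided by L. The input ratios are synthetic.

```python
import statistics


def fit_rate(ratios, exchange_leverage):
    """Share of trades whose notional/capital ratio fits under the leverage."""
    return sum(1 for r in ratios if r <= exchange_leverage) / len(ratios)


def utilizations(ratios, exchange_leverage):
    """Per-trade share of the leveraged wallet actually used (assumed definition)."""
    return [r / exchange_leverage for r in ratios]


ratios = [0.068, 0.5, 1.2, 1.81]  # synthetic notional/capital ratios
for lev in (1, 2, 3):
    print(lev, fit_rate(ratios, lev), statistics.median(utilizations(ratios, lev)))
```

Under these definitions a maximum ratio of 1.81 fits at 2× but not at 1×, which matches the shape of the result reported in this section.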
## 5. The under-utilization caveat (`VIOLET_FINDING__MODULATION_LAYER_VS_UNDERUTILIZATION.md`)
The margin study used ACTUAL recorded notionals (post-modulation), so it is NOT
contradicted by the modulation layer. BUT the median ~6.8% wallet utilization is
**largely the EsoF haircut deliberately de-risking** — NOT free headroom. ⟹ the
#3 base-fraction study must not read it as reclaimable; the modulation layer is
**required before V4 execution** (base-only would trade bigger/riskier than BLUE).
## 6. Regime-conditional edge (`blue_margin_envelope_study.md`)
Edge is regime-CONCENTRATED, not invariant: **95% of clean edge in choppy-bearish**
(short core signal is short-positive; long side = separate operational EFSM algo).
Sub-regimes CONFIRMED inside it (univariate): BTC<MA99 +28.4/trade vs >MA99 +12.6 (2.25×);
hi-DVOL +27.2 vs lo +17.1; strong-velDiv +30.8 vs weak +13.6 at EQUAL win-rate ⟹
**conviction-sizing validated**. MARAS labels UNRELIABLE (39% of "bearish"-labeled trades
had BTC ABOVE MA99) → use `composite_hash`, not the label. Gates tilt only mildly
(~6pts) from bull. Research TODO: stablecoin IRP signal
(`VIOLET_RESEARCH_TODO__STABLECOIN_IRP_SIGNAL.md`).
## 7. Selection parity (VIOLET already matches BLUE)
- BLUE picks via IRP with the alignment gate MUTED (`min_irp_alignment=0.0` = gold
"no IRP filter", `nautilus_event_trader.py:134/605`) + sizes with scan TOP-LEVEL
`vel_div` (:3915). VIOLET defaults `min_alignment=0.0` + `decide(vel_div=payload['vel_div'])`
= EXACT match.
- **Stablecoin exclusion (V3.1):** BLUE removes `_STABLECOIN_SYMBOLS` (10 pegged symbols)
from prices_dict pre-select (:24/3906). VIOLET replicates the exact set; drift-guarded
against BLUE source. (Picking itself unchanged; this is BLUE's separate exclusion gate.)
## 8. EsoF modulation fold (§3 of the modulation doc; V3.2)
`modulation.py` wraps BLUE's `esof_size_mult_from_score` (exact ESOF_* constants:
NEUTRAL 0.8 / UNFAV 0.3 / STALE_FB 0.4 / EDGE 0.02) and applies the SC haircut
step-for-step as `_apply_sc_entry_size_multiplier` (:3307): **mult clamp [0,1]
HAIRCUT-ONLY** (:3316), **near-1 no-op** (:3318), `round(lev×mult,6)` / `round(notional×mult,12)`.
8 tests. Empirical mult-recovery on 1500 recorded trades: **median 1.000**, EsoF haircut
bands (0.65/0.8/0.9/0.3) VISIBLE → fold validated. **Not yet wired into the live engine**
(needs EsoF HZ score plane + restart, held).
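
The haircut application described above can be sketched as follows. This is an illustration of the three stated steps (clamp to [0, 1], near-1 no-op, fixed rounding), not the body of `_apply_sc_entry_size_multiplier`; in particular `NEAR_ONE_EPS` is an assumed tolerance, since the source does not state the near-1 threshold.

```python
NEAR_ONE_EPS = 1e-9  # ASSUMPTION: the real near-1 tolerance is not given in this document


def apply_haircut(leverage: float, notional: float, mult: float):
    """Apply a haircut-only size multiplier to (leverage, notional)."""
    mult = min(max(mult, 0.0), 1.0)          # clamp: the multiplier can only shrink size
    if abs(mult - 1.0) < NEAR_ONE_EPS:       # near-1: leave the inputs untouched
        return leverage, notional
    return round(leverage * mult, 6), round(notional * mult, 12)


print(apply_haircut(4.0, 8000.0, 0.8))   # haircut applied
print(apply_haircut(4.0, 8000.0, 1.5))   # clamped to 1.0, so a no-op
```

Because the clamp caps the multiplier at 1, this step cannot explain recorded leverage above the base curve.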

**OPEN: the 28% upward tail** (`recorded_leverage / base > 1.05`): localized to
mid-range vel_div ONLY (0.02→0.05 where base<9: 36-60% boost; 0.06 where base=9:
0%). NOT EsoF (haircut-only), NOT flat: a **regime-dependent conviction-curve STEEPENER**
(distribution bimodal at 0.04, median matched). This is almost certainly the "gold"/ACB-adjacent
UPWARD sizing organ. Full per-trade parity = base × EsoF-haircut (done) × steepener (NEXT).

**Sizing-engine note:** live BLUE sizes via `AlphaBetSizer` (docstring: "Matches
dolphin_vbt_real.py sizing"). `dolphin_vbt_real` is the **gold-standard BACKTEST engine**
used as the BENCHMARK for BLUE Alpha-Engine algo compliance — likely NOT the live sizer,
but it is the authoritative ORACLE for the exact conviction-curve math (and may be
imported/reused for exact code). For exact-parity work on the steepener, consult
`vbt_real` as the gold reference; check whether the live path reuses its code.
## 8b. COMPLETE sizing composition (authoritative, esf_alpha_orchestrator :597-619)
BLUE's full conviction leverage is a product of FIVE factors (the base cubic and four
multipliers), composed in the orchestrator (NOT in AlphaBetSizer alone). Exact, in operation order:
```
# Factors multiply left to right, in this exact order.
raw_leverage = (
    size_result["leverage"]  # base cubic (AlphaBetSizer, base_max=8)
    * dc_lev_mult            # "DC boost" (operator-recalled factor #1) =
                             #   signal_gen.dc_leverage_boost if signal.dc_status == "CONFIRM" else 1.0
    * regime_size_mult       # ACB_boost × meta_boost × MC_scale (the steepener)
    * market_ob_mult         # OB cross-asset consensus: 1.0 default; up to 1.20
                             #   (eff_imb > 0.08 and agree > 0.70), down to 0.85
    * _esof_size_mult        # EsoF haircut in [0, 1] (V3.2 fold)
)
# base_max_leverage = 8, abs_max_leverage = 9
clamped_max = min(base_max_leverage * regime_size_mult * market_ob_mult * _esof_size_mult,
                  abs_max_leverage)
if _day_posture == 'STALKER':
    clamped_max = min(clamped_max, 2.0)
leverage = min(raw_leverage, clamped_max)
leverage = max(bet_sizer.min_leverage, leverage)
notional = capital * size_result["fraction"] * leverage
```
Components: base ✓(V3a, fix max→8), esof ✓(V3.2). TO WRAP: `regime_size_mult`
(= AdaptiveCircuitBreaker `get_dynamic_boost_for_date/from_hz` → ACB×meta×MC),
`dc_lev_mult` (signal_gen DC-confirm), `market_ob_mult` (OBFeatureEngine market
consensus), `_day_posture` (STALKER 2.0 cap). The composition block itself is
deterministic float arithmetic — replicate in EXACT operation order for bit-identity.

**Validation doctrine (operator, BINDING):** (1) Monte-Carlo the ENTIRE JOINT input
universe of both surfaces (vel_div × ACB signals × w750/β × esof × MC × ob × posture),
(2) match BLUE actual-code output to BIT IDENTITY (only achievable by wrapping real
kernels → mandates wrapping, not reconstructing), (3) THEN test upstream (live/recorded).
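
A sketch of the shape such a Monte-Carlo bit-identity check could take. Everything here is a stand-in: `compose_reference` restates the composition block above as a pure function, the sampling ranges are illustrative (only the 0.85 to 1.20 OB range and the [0, 1] EsoF range come from this document), and `min_lev=0.5` is assumed from the stated conviction range. In the real gate, both callables would be the wrapped BLUE kernel and the VIOLET path rather than reconstructions.

```python
import random
import struct


def float_bits(x: float) -> bytes:
    """Exact IEEE-754 double encoding; equal bytes means bit-identical."""
    return struct.pack(">d", x)


def compose_reference(base_lev, dc, regime, ob, esof, posture,
                      min_lev=0.5, base_max=8.0, abs_max=9.0):
    """Stand-in for the composition block, in the same operation order."""
    raw = base_lev * dc * regime * ob * esof
    clamped_max = min(base_max * regime * ob * esof, abs_max)
    if posture == "STALKER":
        clamped_max = min(clamped_max, 2.0)
    return max(min_lev, min(raw, clamped_max))


def sample_inputs(rng):
    """Draw one joint input point; ranges are ILLUSTRATIVE, not BLUE's."""
    return {
        "base_lev": rng.uniform(0.5, 8.0),
        "dc": rng.uniform(1.0, 1.5),
        "regime": rng.uniform(0.5, 2.0),
        "ob": rng.uniform(0.85, 1.20),
        "esof": rng.uniform(0.0, 1.0),
        "posture": rng.choice(["NORMAL", "STALKER"]),
    }


def count_bit_mismatches(reference, candidate, n, seed):
    """Count sampled inputs where the two callables differ in any bit."""
    rng = random.Random(seed)
    mismatches = 0
    for _ in range(n):
        args = sample_inputs(rng)
        if float_bits(reference(**args)) != float_bits(candidate(**args)):
            mismatches += 1
    return mismatches


print(count_bit_mismatches(compose_reference, compose_reference, n=2000, seed=7))
```

Comparing packed bytes rather than using `==` also distinguishes `0.0` from `-0.0`. A candidate that re-associates the product (for example `base_lev * (dc * regime) * (ob * esof)`) is mathematically equal but can differ in the last bits, which is the kind of drift this gate is meant to catch.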
## 8c. Vision & long-horizon roadmap — from the five-factor map to DISTRACK
This train of thought starts right next to the two-factors resolution (§8b) and runs to
the long-horizon dream. Captured per operator request 2026-06-15.

**Reframe: five lanes is separation-of-concerns, not "smear."** The sizing being
composed across five distinct multipliers (base · DC · ACB-regime · OB-consensus · EsoF)
is a VIRTUE: each is its own distinct, traceable, OBSERVABLE subsystem with a clean
domain boundary → full attributability (tap any lane, ask what it said at bar N). The
composition is a tiny pure fan-in (`base × dc × acb × ob × esof → clamp`); the lanes are
independent pure functions of `(inputs, params)`.

**Holy grail #1: LONG alpha.** The core NG7 eigenvalue-breakdown signal is short-positive;
a reliable LONG-side algo is a grail (the operational EFSM long-reversal — "market must
bounce" mean-reversion — is a profitable signal in that direction). Architecturally LONG
alpha = ONE MORE pure signal lane in the same DAG: attributable, hot-swappable,
VIBRISS-tunable alongside the shorts. You add a lane, you don't fork the system.

**Holy grail #2: FPGA-pure instantiation for VIBRISS banditry.** Distil the algo so an
instance comes up in femtoseconds on a faster-than-gVisor stack → MILLIONS of concurrent
algo-instances hyperadjusting in real time (VIBRISS bandit governance).

**Nirvana: faster-than-ASM/FPGA-like purity WHILE keeping separation of concerns.** Not a
contradiction; a known-reachable shape. The tension (modular-attributable ⊥ fused-fast)
only exists if concerns share mutable state or I/O — here they mostly don't. Path:
**pure-dataflow-DAG → compile.** Source level: each concern stays a typed, observable node
(attributability untouched). Compile level: the composed pure-function DAG fuses to a flat
kernel (Rust/SIMD now, FPGA/RTL later) — cf. Halide (modular schedule → fused kernel),
JAX→XLA, RTL synthesis from modular HDL. **Bit-identity (the operator's MC-to-bit-identity
gate, §8b) is the BRIDGE** — it proves the fused fast kernel computes the identical function
as the readable modular source, turning "distil to FPGA purity" from a leap of faith into a
verified refactor. Separation-of-concerns and faster-than-ASM purity are the SAME artifact
at two compilation stages; purity + bit-identity is the path between them. VIBRISS
millions-of-instances falls out of purity: an "instance" is a parameter binding to a frozen
graph (femto-cheap, no state to construct), and purity makes a million concurrent instances
SAFE (each provably its own clean function, no shared-mutable footguns).

**DISTRACK: the state-side enabler (culmination).** Memory-CONSTANT streaming distribution
tracking: rolling-window param/outcome distributions in O(1) memory (online/streaming
quantiles — t-digest / P² / reservoir / EWMA sketches). Docs:
`CRITICAL_VIOLET_MAYBE_TODO_STREAMING_STATS_COMPRESSION.md`,
`VIOLET_TODO_CRITICAL_DISTRIBUTION_TRACKING_IN_CONSTRAINED_MEMORY.md`. This is what makes
the million-instance DAG AFFORDABLE — per-instance distribution state stays femto-cheap
instead of a memory bomb. DISTRACK is the *state* side of the vision the way bit-identity is
the *correctness* side. Seed already in-house: V0 `LatencyHistogram` does reservoir/percentile
— generalize that pattern. **SEQUENCING: DISTRACK is for AFTER VIOLET is actually trading
(testnet BingX → mainnet) — not before.** First get it live; then the banditry-scale work.
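
As a rough illustration of the constant-memory idea, here is a minimal reservoir-sampling quantile tracker (Algorithm R). It is a sketch, not the V0 `LatencyHistogram` and not a DISTRACK design: the class name and capacity are hypothetical, and it summarizes the whole stream seen so far rather than a rolling window, so a windowed or decayed variant would still be needed.

```python
import random


class ReservoirQuantiles:
    """Fixed-size uniform sample of a stream, queried for approximate quantiles."""

    def __init__(self, capacity=1024, seed=None):
        self.capacity = capacity
        self.count = 0            # items seen so far
        self.sample = []          # never grows beyond `capacity`
        self._rng = random.Random(seed)

    def update(self, x: float) -> None:
        self.count += 1
        if len(self.sample) < self.capacity:
            self.sample.append(x)
            return
        # Algorithm R: the new item replaces a slot with probability capacity/count.
        j = self._rng.randrange(self.count)
        if j < self.capacity:
            self.sample[j] = x

    def quantile(self, q: float) -> float:
        if not self.sample:
            raise ValueError("no data yet")
        ordered = sorted(self.sample)
        idx = min(int(q * len(ordered)), len(ordered) - 1)
        return ordered[idx]


tracker = ReservoirQuantiles(capacity=100, seed=1)
for i in range(10_000):
    tracker.update(float(i))
print(len(tracker.sample), tracker.quantile(0.5))
```

Memory is bounded by `capacity` regardless of stream length; the trade-off is sampling error in the quantile estimate, which is where sketches such as t-digest or P² may do better.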
## 9. Shadow soak validation (2026-06-14/15)
Faithful DARK soak (threshold 0.02), 9h+ stable, single session:
- **4,878 decisions, 0 stablecoin leak, 0 bad rows** (V-TYPES/reject-at-source), **0 orders,
0 errors**, spool steady 4.5M. 20 distinct real assets.
- **Faithfulness:** all SHORT+actuated; all vel_div < -0.02; sizer cubic curve reproduced
live (conviction p50 1.315 = base formula at median vel_div -0.0337); notional_fraction
max exactly 1.80; schema 100% compliant.
- **Statistics:** conviction min 0.5/p50 1.32/avg 3.91/max 9 (BLUE-shaped); temporal
conviction swings 2.6→6.5/hr (tracks vol regime).
- **Diversity:** 20 assets, top-2 ~31%, well-spread.
- **Anomalies:** vel_div spikes to -605 (legitimate signal, saturate to 9); "duplication"
avg 14.5 = sticky-signal re-decisions at ≤1/scan (correct cadence). NO defects.

**VIOLET vs BLUE comparison:** BLUE doesn't log per-scan entry decisions (only trades +
v7 exits); signal-level vel_div matches the scan archive exactly. BLUE's 69 trades in
window across 6 assets — **100% in VIOLET's decision universe, 100% timing co-located**
(same asset, ±5min). Frequency differs by slot only (execution layer, deferred).
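
The co-location check can be sketched as below. This is an assumed formulation (a trade counts as co-located when some decision on the same asset lies within ±5 minutes); the tuple layout and function name are illustrative and the data is synthetic.

```python
from datetime import datetime, timedelta


def colocated_fraction(trades, decisions, window=timedelta(minutes=5)):
    """Share of (asset, timestamp) trades with a same-asset decision inside the window."""
    by_asset = {}
    for asset, ts in decisions:
        by_asset.setdefault(asset, []).append(ts)
    hits = 0
    for asset, ts in trades:
        if any(abs(ts - d) <= window for d in by_asset.get(asset, [])):
            hits += 1
    return hits / len(trades)


trades = [  # synthetic examples
    ("BTCUSDT", datetime(2026, 6, 14, 12, 0)),
    ("ETHUSDT", datetime(2026, 6, 14, 13, 0)),
]
decisions = [
    ("BTCUSDT", datetime(2026, 6, 14, 12, 3)),   # inside the window
    ("ETHUSDT", datetime(2026, 6, 14, 13, 10)),  # outside the window
]
print(colocated_fraction(trades, decisions))
```
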
## 10. Open items / next steps
1. **Gold/ACB conviction-curve steepener** — wrap the upward sizing organ (the 28% tail).
Live sizer = `AlphaBetSizer`; consult `dolphin_vbt_real` (the gold-standard backtest
oracle BLUE is benchmarked against) for the exact curve math / possible code reuse.
Required for full per-trade leverage parity.
2. **Wire EsoF score plane** into the live decision engine (HZ `_read_esof_payload` equiv)
+ restart to fold modulation into the soak.
3. **Trade/slot-granularity comparison** (episode-collapse) — deferred until VIOLET has
comparable execution-layer facilities (decision layer already faithful).
4. **Base-fraction sizing study** (`VIOLET_STUDY_SPEC__BASE_FRACTION_SIZING.md`) — gated on
regime-robustness; respect the de-risking caveat (§5).
5. **composite_hash multivariate sub-regime model** (deeper §6).
6. **V4** = execution on (single asset, conservative caps) — only after the modulation
layer is complete.
7. Operator: VST keys for live exec; stays DARK until then.
## 11. Live state at time of writing
`dolphin_violet` DARK, SHADOW ON (faithful 0.02), 0 orders, stablecoin fix live, EsoF
fold built+committed but NOT yet wired. Overnight monitor report at
`prod/VIOLET_dev/reports/violet_overnight_soak_20260614.log`.