From c3a18f693a031e981d5b5e87408c2c2632b4b34f Mon Sep 17 00:00:00 2001
From: Codex <codex@localhost>
Date: Fri, 12 Jun 2026 15:04:15 +0200
Subject: [PATCH] =?UTF-8?q?docs:=20VIBRISS=20spec=20(+=20=C2=A710.6=20casc?=
 =?UTF-8?q?ade/adaptive-TP=20paramsets),=20PINK=20accounting=20fix=20spec,?=
 =?UTF-8?q?=20BLUE=20incident=20docs?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

VIBRISS_PARAMETER_GOVERNANCE_SPEC §10.6: ob_cascade.count_threshold
(currently cascade_count>0 = ONE asset widens every TP x1.40),
tp_widen_factor, withdrawal_velocity_threshold as governance candidates;
adaptive/Dynamic-TP threshold marked fit for VIBRISS governance; TP_FLOOR
joint-policy reward requirement.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
---
 ...TICAL_VIOLET_DESIGN__BLUE_HYDRATION_BUG.md |  182 +
 prod/docs/MALFORMED_OPEN_RESTORE_BUG.md       |  131 +
 prod/docs/PINK_ACCOUNTING_EXEC_FIX.md         |  362 ++
 .../docs/VIBRISS_PARAMETER_GOVERNANCE_SPEC.md | 2978 +++++++++++++++++
 4 files changed, 3653 insertions(+)
 create mode 100644 prod/docs/CRITICAL_VIOLET_DESIGN__BLUE_HYDRATION_BUG.md
 create mode 100644 prod/docs/MALFORMED_OPEN_RESTORE_BUG.md
 create mode 100644 prod/docs/PINK_ACCOUNTING_EXEC_FIX.md
 create mode 100644 prod/docs/VIBRISS_PARAMETER_GOVERNANCE_SPEC.md

diff --git a/prod/docs/CRITICAL_VIOLET_DESIGN__BLUE_HYDRATION_BUG.md b/prod/docs/CRITICAL_VIOLET_DESIGN__BLUE_HYDRATION_BUG.md
new file mode 100644
index 0000000..1513851
--- /dev/null
+++ b/prod/docs/CRITICAL_VIOLET_DESIGN__BLUE_HYDRATION_BUG.md
@@ -0,0 +1,182 @@
+# Critical Violet Design: BLUE hydration bug
+
+Date: 2026-06-11
+
+## Summary
+
+This incident is a BLUE hydration / restore bug on the XTZUSDT short trade `863c21da`.
+
+The important facts are:
+
+1. The XTZ trade was real and opened at `2026-06-11 17:22:12.678265+00:00`.
+2. The trade did **not** close via TP, SL, or MAX_HOLD before hydration.
+3. The restore path later rebuilt the open slot from `position_state` and `trade_reconstruction`.
+4. The restored state had a chain-token mismatch, but the engine continued with the derived token instead of hard-failing.
+5. A later hydrate-time stop was recorded at `2026-06-11 18:35:52.789008+00:00` with `STOP_LOSS`.
+6. The ledger shows the next trade was admitted while XTZ was still officially open, which violates the single-slot invariant.
+
+## Trade identity
+
+- trade_id: `863c21da`
+- asset: `XTZUSDT`
+- side: `SHORT`
+- entry price: `0.2276`
+- entry notional: `56484.4305702418`
+- leverage: `6.374647927191287`
+- entry bar: `238`
+- tp_base_pct: `0.002`
+- tp_effective_pct: `0.0019999655500463724`
+
+## Ledger evidence
+
+### Open record
+
+`dolphin.trade_reconstruction` contains the canonical open record:
+
+- ts: `2026-06-11 17:22:12.911989`
+- event_type: `OPEN`
+- event_id: `863c21da:open`
+- chain_token: `26852fa25fb5cdaa3b4c354d5e3eea93e27bce0ebdcd0da896d4f981642eeeb2`
+
+The payload confirms:
+
+- `entry_ts = 1781198532678265`
+- `entry_bar = 238`
+- `retraction_legs = 0`
+- `realized_pnl_legs_total = 0.0`
+- `chain_mode = LIVE`
+- `chain_kind = ROOT`
+
+### No close before hydrate
+
+`dolphin.trade_exit_legs` has no rows for `863c21da`.
+
+`dolphin.trade_events` also has no close row for `863c21da`.
+
+So there is no official TP, SL, or MAX_HOLD exit recorded before the restore/hydration event.
+
+### Decision tape before hydrate
+
+`dolphin.v7_decision_events` shows the trade was live and being evaluated:
+
+- `2026-06-11 17:22:13.274556` `HOLD`
+- `2026-06-11 17:22:23.124863` `HOLD`
+- `2026-06-11 17:22:45.232894` `HOLD`
+- `2026-06-11 17:23:28.274004` `HOLD`
+- `2026-06-11 17:24:43.182413` `RETRACT / V7_RISK_DOMINANT`
+
+The best favorable excursion in the pre-hydrate tape was only about `+0.065905094%`, far below the fixed TP threshold of about `0.2%` (`tp_base_pct = 0.002`).
+
+## Restore / hydration behavior
+
+At restore time the engine logged:
+
+- `chain token mismatch on restore: trade=863c21da stored=26852fa25fb5 derived=98875e225e9e — continuing with derived token`
+- `position_state RESTORED: XTZUSDT SHORT entry=0.2276 notional=56484 bars_held≈0 trade=863c21da`
+
+The restore path in [`prod/nautilus_event_trader.py`](../nautilus_event_trader.py) does the following:
+
+- reads `position_state`
+- reconstructs `restored_entry_bar = max(0, self.bar_idx - stored_bars)`
+- loads reconstruction data from `dolphin.trade_reconstruction`
+- rebuilds chain state from the persisted payload
+- if the stored chain token differs from the derived token, it logs the mismatch and continues with the derived token
+
+Relevant code:
+
+- `_chain_state_from_reconstruction(...)` around lines `3315-3348`
+- restore from `position_state` around lines `1944-2058`
+
+The mismatch check therefore only logs: it is a validator, not a hard guardrail.
+
+## Single-slot violation
+
+The next distinct open trade in the reconstruction ledger is:
+
+- ts: `2026-06-11 17:50:50.420620`
+- trade_id: `43494ade`
+- asset: `TRXUSDT`
+- side: `SHORT`
+
+That means the system admitted a new trade while XTZ was still officially open in the ledger.
+
+On a single-slot engine, that should not happen.
+
+## What would have happened without hydration
+
+This is the conservative conclusion from the tape:
+
+- The trade did not hit TP on the observed pre-hydrate tape.
+- The trade did not have an official close row before hydration.
+- The tape does not contain a clean uninterrupted decision path beyond the first pre-hydrate window.
+
+The best-supported natural outcome from the observed tape is the live `RETRACT` state at `2026-06-11 17:24:43.182413`, where the engine still considered the slot active and the trade had only reached `bars_held = 14`.
+
+At that point:
+
+- `current_price = 0.22765000000000002`
+- `pnl_pct = -0.021968365` (in percent units, i.e. about `-0.022%`)
+- `reason = V7_RISK_DOMINANT`
+
+If that retract state had been executed immediately, the estimated trade PnL would have been:
+
+- `-12.4087058758423` USDT on the recorded notional
+- trade ROI: `-0.021968365%`
+
+The max-hold clock also would have forced a decision long before the 18:35 restore:
+
+- trade-specific `market_state_max_hold_bars = 102`
+- live tape reached `bars_held = 14` by `17:24:43`
+- at an ~11 second cadence, the max-hold boundary would have arrived around `17:40-17:41`
+
+So the 18:35 stop-loss is not the natural continuation of the original entry. It is a restore-time artifact on top of a stale open slot.
+
+What is observable is the hydrated-path close that actually got booked:
+
+- exit ts: `2026-06-11 18:35:52.789008+00:00`
+- exit reason: `STOP_LOSS`
+- exit price: `0.23526757499999998`
+- realized pnl_pct: `-0.033056485743551446`
+- realized net_pnl: `-1913.155101369921`
+
+That realized stop corresponds to:
+
+- price move against the short of about `3.3056%`
+- account-level ROI of about `-2.726636%` using capital before exit (`70165.39`)
+
+## Root cause
+
+The bug is the restore path itself:
+
+1. The open trade state was preserved in `trade_reconstruction`.
+2. The current `position_state` snapshot was lossy or stale enough to rehydrate with `bars_held≈0`.
+3. The chain token mismatch was detected, but the code explicitly continues with the derived token.
+4. The engine therefore recovered continuity without enforcing strict equality between the live open chain and the reconstructed state.
+
+That combination makes orphaned trades possible after a bad hydrate.
+
+## Operational impact
+
+- The XTZ short remained open in the ledger with no formal close.
+- The engine later allowed a new trade while the slot should still have been occupied.
+- Capital accounting diverged from the true live slot history.
+- The restore path masked the inconsistency instead of stopping the recovery.
+
+## Recommended fix direction
+
+1. Treat a chain-token mismatch on restore as a hard failure for BLUE when a live open slot exists.
+2. Preserve the original `entry_bar` and bar counter from the open-chain payload instead of reconstructing them from the current `position_state` row when the two disagree materially.
+3. Refuse to admit a new trade until the single-slot invariant is proven flat.
+4. Add a regression test for:
+   - open XTZ trade
+   - stale `position_state`
+   - chain-token mismatch
+   - no new trade admission while the open slot remains unresolved
+
+## Bottom line
+
+XTZ was a real open trade.
+It never got a clean pre-hydrate exit.
+The restore path tolerated chain drift and rebuilt a misleading open state.
+The best-supported no-hydration outcome is the 17:24 retract, roughly flat to slightly negative (about `-12.41` USDT).
+The realized hydrated-path loss was `-3.3056485743551446%` on the position and `-2.726636%` of capital before exit, but that is a restore artifact, not the natural end of the original trade.
diff --git a/prod/docs/MALFORMED_OPEN_RESTORE_BUG.md b/prod/docs/MALFORMED_OPEN_RESTORE_BUG.md
new file mode 100644
index 0000000..a9430d1
--- /dev/null
+++ b/prod/docs/MALFORMED_OPEN_RESTORE_BUG.md
@@ -0,0 +1,131 @@
+# MALFORMED_OPEN_RESTORE_BUG
+
+## Summary
+
+BLUE was repeatedly rehydrating after startup because `dolphin.position_state` contained stale `OPEN` rows with zero effective size.
+
+The restore path treated those rows as fatal:
+
+- it selected the latest `OPEN` row per `trade_id`
+- it accepted that row even when `quantity` or `notional` had been driven to `0`
+- it hard-stopped on `position_state row invalid quantity ...`
+- `supervisord` then restarted the trader
+- the next startup read the same bad row again
+
+That created a restart loop.
+
+This was observed most clearly on the `2026-06-11` BLUE window. The recurring bad row was the legacy `ATOMUSDT` leg `1a3d2f9c`, which was persisted as:
+
+- `status = OPEN`
+- `quantity = 0`
+- `notional = 0`
+- `bars_held = 34`
+
+That row is not a live position. It is a stale snapshot that should have been treated as tombstoned history.
+
+## Root Cause
+
+The bad rows were self-inflicted by the partial-retract path in `nautilus_event_trader.py`.
+
+Before the fix:
+
+1. `_apply_internal_retract()` shrank the live position.
+2. It wrote a new `position_state` row with `status="OPEN"` for the remaining leg.
+3. If the remaining size rounded to zero, the row still existed as an `OPEN` snapshot.
+4. A later startup restore could pick that row and treat it as authoritative.
+
+That is enough to leave behind `OPEN` rows with:
+
+- `quantity = 0`
+- `notional = 0`
+
+These are not valid live positions, but they looked like valid ones to the old restore logic.
+
+There is a second contributing factor in the restore path:
+
+- the restore code historically trusted the latest `OPEN` candidate too early
+- zero-sized `OPEN` rows were only rejected after the row had already been chosen as the best candidate
+- rejection used a hard failure path, which made the process exit instead of trying the next sane source
+
+That means the persistence bug and the restore policy bug reinforced each other.
+
+## Observable Symptoms
+
+- repeated `restore candidate parse failed from capital_update_ledger: 'list' object has no attribute 'get'`
+- repeated `position_state row invalid quantity for trade ...: 0.0`
+- `RESTORE HALT`
+- immediate restart by `supervisord`
+
+The chain-token mismatch logs were a separate warning. They were not the restart trigger.
+
+The capital-ledger parse warning is also distinct:
+
+- it indicates the ledger file is list-shaped, not a dict
+- it forces restore to rely more heavily on the other state surfaces
+- it is noisy, but it is not what actually killed the process in this incident
+
+## Fix Applied
+
+Three changes were made.
+
+### 1. Stop writing zero-sized `OPEN` rows
+
+In `_apply_internal_retract()`:
+
+- compute `remaining_qty`
+- if the remaining size is effectively zero, treat the retract as a full close
+- return the forced exit without emitting a new `position_state` row with `status="OPEN"`
+
+This prevents the bad row from being created in the first place.
+
+### 2. Make restore skip legacy bad `OPEN` rows
+
+In `_restore_position_state()`:
+
+- the ClickHouse restore query now filters `OPEN` rows with `quantity > 0 AND notional > 0`
+- if an invalid candidate still appears, restore logs and rejects it instead of hard-halting the process
+- restore falls back to HZ state or flat continuation rather than turning a stale row into a restart loop
+
+This is important because the repository already contains stale history. The fix is not only to stop producing new malformed rows; it also has to prevent old rows from re-triggering the same failure path on the next reboot.
+
+### 3. Keep the full-close path coherent
+
+The retract path now computes `remaining_qty` explicitly and treats `remaining_notional <= 1e-9` or `remaining_qty <= 0.0` as a full close.
+
+That means:
+
+- a full retract does not leave a zero-size `OPEN` snapshot behind
+- the exit is finalized as a close, not as a pseudo-open partial state
+- the runtime slot is removed cleanly instead of being left in a half-closed limbo
+
+## Verification Added
+
+Regression tests were added for both sides:
+
+- full-close retracts no longer emit zero-sized `OPEN` rows
+- restore skips zero-sized `OPEN` candidates without setting `restore_failed`
+
+The tests use the existing retract and restore harnesses:
+
+- one test seeds a tiny short leg that collapses to zero on retract and asserts no `OPEN` zero-size row is written
+- one test feeds a zero-sized `OPEN` `position_state` row into restore and asserts restore does not hard-halt
+
+## Operational Impact
+
+After this fix:
+
+- stale zero-sized `OPEN` rows no longer restart BLUE
+- malformed open snapshots are quarantined as legacy garbage
+- the live runtime can continue from a sane source instead of bouncing on the same bad record
+
+## What This Does Not Fix
+
+This change does not rewrite historical ClickHouse rows already present in the warehouse.
+
+It only changes:
+
+- new retract writes
+- restore selection and rejection policy
+- restart behavior when the old garbage is encountered
+
+If you want the historical ledger cleaned up, that is a separate reconciliation task. The current patch is intentionally conservative and only stops the bad row from causing further damage.
diff --git a/prod/docs/PINK_ACCOUNTING_EXEC_FIX.md b/prod/docs/PINK_ACCOUNTING_EXEC_FIX.md
new file mode 100644
index 0000000..5b94709
--- /dev/null
+++ b/prod/docs/PINK_ACCOUNTING_EXEC_FIX.md
@@ -0,0 +1,362 @@
+# PINK / DITAv2 Accounting & Execution Fix — Spec and Dev Guide
+
+**Status**: SPEC — ready for implementation agent
+**Date**: 2026-06-11
+**Branch**: `exp/pink-ditav2-sprint0-20260530` (continue on it or fork `fix/pink-accounting-consolidation`)
+**Author of spec**: forensic session 2026-06-11 (FET −$5,990.90 mis-book replay)
+**Prerequisite for**: VIOLET rebuild (`violet_subsecond_rebuild_plan` memory / future plan session)
+
+---
+
+## 0. Why this exists — the incident in one paragraph
+
+On 2026-06-11 PINK closed a FET-USDT short that the exchange settled at
+≈ **+$164 net** (entry VWAP 0.1878, exit 0.1866, ~202K FET) but the kernel
+booked **−$5,990.90** and capital diverged −$6,154 from the exchange wallet.
+Replay against `dolphin_pink.trade_reconstruction` slot images identified
+three stacked defects, all in *derivation* code (none in exchange facts):
+(1) fill events carried BingX's MARKET **protective bound price** (0.229,
++22% off tape) instead of the true fill price; (2) `realized_pnl()` and
+`mark_price()` multiplied PnL by `slot.leverage` (exchange leverage — but
+`slot.size` is exchange *quantity*, so every leg was 3× inflated); (3) the
+Python settle baseline `_last_settled_pnl` resets empty on every restart,
+so reconcile-adopted slots re-settle carried PnL. Exact replay of leg 1:
+`26,007 × (0.1878−0.229)/0.1878 × 0.1878 × 3 = −3,214.4652` ✓ matches the
+booked increment to the cent.
+
+A fourth structural finding: there are **three parallel ledgers** (Rust
+`AccountState` K/E, Python `AccountProjection` — the one persistence reads,
+fee-blind — and `AccountProjectionV2`, dead in the live path). This spec
+consolidates to **E-facts as ledger of record + K as integrity checksum +
+one atomic published snapshot**.
+
+---
+
+## 1. Scope and non-goals
+
+IN SCOPE
+1. Commit + activate the Phase-0 fixes already in the working tree.
+2. E-anchored published capital; single atomic account snapshot.
+3. Per-trade PnL provenance (`exchange | kernel_estimate`) end-to-end.
+4. Sizer feedback off trade-realized PnL (not capital deltas).
+5. Persistence hygiene: duplicate row emission, silent async-insert loss,
+   `event_seq` stamping, `bars_held` clamp, naive-UTC timestamps.
+6. Kernel hardening leftovers: `resolve_slot` no-match sentinel,
+   FILL_SETTLED realized override of flagged estimate legs.
+
+OUT OF SCOPE (separate tickets)
+- BLUE's exit-path masking bug (LINK −$1,248, `TODO_TP_SCAN_CADENCE_BUGFIX.md`) — BLUE stack, not DITAv2.
+- VIOLET fork, sub-second clock, venue price-feed port, cadence quantizer.
+- ch_writer head-of-line poison-row parking redesign (mitigations land here;
+  the full parking-lane design is its own task).
+- prefect.db / ClickHouse TTL disk remediation.
+
+HARD INVARIANTS — MUST NOT CHANGE
+- **Dual leverage**: `slot.size` = exchange quantity; `slot.leverage` =
+  exchange leverage (1–3x cap, set at BingX API); *our*-leverage
+  (conviction) = `size × entry_price / capital`, computed only at
+  `pink_direct._hz_publish` (line ~911). PnL is therefore **leverage-free**:
+  `qty × Δprice`, side-signed. Do not touch the conviction→exchange mapping
+  (`round_half_even_linear_0.5_to_9.0_to_1_to_exchange_cap`) or
+  `target_size` computation.
+- **Exits are never skipped** (exec-router invariant set, §16 kernel ref).
+- **BLUE-parity policy contract**: `DecisionEngine`/`IntentEngine` inputs
+  (MarketSnapshot + capital + slot state) unchanged in shape.
+- **Namespace isolation**: zero writes to `dolphin.*` / `dolphin_prodgreen.*`
+  or BLUE/PRODGREEN HZ maps. Re-verify with `pink_ctl.py mode-verify`.
+- **Data cadences are sacred** (operator rule 2026-06-10): never reduce a
+  data cadence for throughput.
+
+---
+
+## 2. Phase 0 — Commit and activate the already-applied fixes
+
+These changes exist UNCOMMITTED in the working tree as of 2026-06-11 ~16:30.
+Verify each hunk, commit as one reviewed unit, then restart `dolphin_pink`.
+
+### 0.1 `prod/clean_arch/dita_v2/_rust_kernel/src/lib.rs`
+| Function | Change (already applied) |
+|---|---|
+| `KernelCore::realized_pnl` (~line 1153) | PnL = side-signed `qty × (exit − entry)`; **no leverage factor**; returns 0 when `entry<=0 ∨ exit_size<=0 ∨ exit_price<=0 ∨ !finite` |
+| `TradeSlot::mark_price` (~line 394) | no `× leverage` in unrealized; a mark NEVER becomes entry basis — missing basis flags `metadata.entry_basis_missing=true`, unrealized stays 0 |
+| `KernelCore::fill_matches_order` (new) | identity match on `venue_order_id` / `venue_client_id` |
+| `KernelCore::apply_fill` | entry/exit routing by ORDER IDENTITY first, FSM state second (`!id_matches_exit` / `!id_matches_entry` guards); entry basis = **VWAP across entry fills** (`(prev_basis×prev_filled + price×fill)/accumulated`); price-less exit fill reduces size, books 0 PnL, flags `metadata.realized_skipped_no_price=true` |
+
+Rebuild required: `cargo build --release` in `_rust_kernel/` (the `.so` is
+only auto-built when missing — **source/binary drift is a known hazard**;
+add the build to the commit checklist). `cargo test`: 32/32 green as of spec.
+
+### 0.2 `prod/clean_arch/dita_v2/bingx_venue.py`
+Fill events must carry a TRUE fill price or 0.0 — never the order's nominal
+`price` / submit `receipt.price` (BingX MARKET bound price, ±20–25%):
+- `_events_from_submit` fill event (~line 585): `_row_float(ack_row,
+  "avgPrice","ap","lastFillPrice","L", default=0.0)`
+- `_event_from_row` (~line 697): fills use the same true-price chain;
+  non-fill events (ACK/CANCEL/REJECT) may keep nominal `price` as info
+- `_fill_event_from_row` (~line 736): `"lastFillPrice","L","avgPrice","ap"`
+
+### 0.3 `prod/clean_arch/dita_v2/rust_backend.py`
+- `reconcile_from_slots`: seeds `_last_settled_pnl[slot_id] = slot.realized_pnl`
+  and `_slot_was_closed[slot_id] = slot.closed` for every adopted slot.
+- `restore_state`: same re-anchoring after successful restore.
+
+### 0.4 Adjacent fixes riding the same commit
+- `prod/ch_writer.py`: insert URLs append `&date_time_input_format=best_effort`;
+  flush errors log at WARNING (first 10 + every 100th), counter `_flush_errors`.
+- `prod/clean_arch/dita_v2/blue_parity.py` `price_of`: hyphen-tolerant
+  fallback (`FET-USDT` → `FETUSDT`) — fixes the unmanaged-position block.
+- `prod/clickhouse/users.xml`: `date_time_input_format=best_effort` for the
+  `dolphin` user (NOTE: running CH container did not honor it even after
+  restart — the container does not mount compose configs; effective on next
+  compose recreation. The client-side URL param is the operative fix.)
+- `prod/tests/test_dita_v2_kernel.py`: partial→full fill test updated to
+  incremental `filled_size` semantics (BingX WS `lastFilledQty`).
+
+### 0.5 Phase 0 gates
+1. `cargo test` in `_rust_kernel`: 32/32.
+2. `pytest prod/tests/test_dita_v2_kernel.py`: 7/7.
+3. `pytest prod/clean_arch/dita_v2/test_exec_router_runtime.py
+   test_venue_reconcile.py test_orphan_prevention.py
+   prod/tests/test_pink_async_fill_pump.py
+   prod/clean_arch/dita_v2/test_account_core_v2.py test_bingx_bugs.py`: 134/134.
+4. KNOWN pre-existing failures (NOT introduced by this work — verified by
+   hunk-revert): 4 tests in `prod/tests/test_dita_v2_bingx_adapter.py`
+   (snapshot-fill emission broke when sync `submit()` started passing None
+   snapshots on 2026-06-10). Fix or quarantine them explicitly in this phase
+   — do not let them mask new regressions.
+5. Restart `dolphin_pink` at a FLAT moment; verify in logs: no
+   `realized_skipped_no_price` storms, no `entry_basis_missing` on fresh
+   entries, first round-trip books PnL within ±(fees+slippage) of
+   `GET /openApi/swap/v2/user/income` for the same trade.
+
+---
+
+## 3. Phase 1 — E-anchored published capital
+
+**Goal**: the capital that persistence/HZ/sizer see is exchange-anchored;
+K never publishes.
+
+### 3.1 `prod/clean_arch/dita_v2/account.py`
+- Add to `AccountSnapshot`: `capital_source: str` (`"e_anchored" |
+  "k_bridged" | "seed"`), `e_wallet_balance: float`, `event_seq: int`.
+- New method `AccountProjection.anchor_to_exchange(wallet_balance: float,
+  available_margin: float, event_seq: int)`: sets `capital = wallet_balance`
+  (guard `>0` and finite — the zero-wb frame lesson), `capital_source =
+  "e_anchored"`, recomputes equity. `settle()` remains for the BRIDGE case
+  only: between anchors, capital += realized (`capital_source="k_bridged"`).
+- `settle(realized_pnl, fees)`: **stop ignoring fees** — `capital +=
+  realized_pnl − fees` (today fees only accumulate in `fees_paid`; published
+  capital ignores them between reseeds).
+
+### 3.2 `prod/clean_arch/runtime/pink_direct.py`
+- The existing reseed path (balance-bearing ACCOUNT_UPDATE →
+  `kernel.reset_and_seed(wb)`) additionally calls
+  `kernel.account.anchor_to_exchange(...)` — one anchoring action, two
+  ledgers consistent.
+- Boot seed (launcher `exchange_balance_capital` block, pink_direct ~line
+  262) goes through `anchor_to_exchange` instead of direct attribute writes.
+
+### 3.3 Gates
+- New unit tests (`prod/tests/test_pink_account_anchor.py`):
+  anchor sets capital/source; zero/negative/NaN wb rejected; settle bridges
+  with fees; anchor after bridge snaps to wb exactly.
+- Shadow check (live, 24 h on VST): published capital vs
+  `GET /openApi/swap/v2/user/balance` polled 1/min — max |Δ| outside a
+  trade-settlement window ≤ $0.01; during settlement ≤ pending-fee bound.
+
+---
+
+## 4. Phase 2 — Single atomic snapshot, ledger consolidation
+
+**Goal**: one immutable, versioned account snapshot; the two redundant
+ledgers demoted/removed.
+
+### 4.1 `prod/clean_arch/dita_v2/account.py`
+- Make the published snapshot **immutable-replace**: `AccountProjection`
+  builds a new frozen `AccountSnapshot` (carry `event_seq`) on every
+  mutation and swaps a single reference (GIL-atomic). Readers must take
+  `snap = kernel.account.snapshot` once per use (audit call sites:
+  `pink_clickhouse.py`, `hazelcast_projection.py` HZ writer, `pink_direct`).
+- `AccountProjectionV2`: DELETE, or move to `prod/clean_arch/dita_v2/
+  _attic/` with a module docstring pointing here. Its only live-path import
+  is `exchange_event.py` — migrate that import or the dataclasses it uses
+  (`EPosition` is genuinely useful; keep it in `account.py`).
+- The Rust `AccountState` K-ledger STAYS — demoted by documentation and by
+  Phase 1 (it no longer feeds published capital): its jobs are reconcile
+  classification (R1-style), `capital_frozen`, and E-dark bridging. Update
+  the module docstring to say exactly this.
+
+### 4.2 `prod/clean_arch/persistence/pink_clickhouse.py`
+- Read capital/equity/peak/trade_seq from the single snapshot reference;
+  no recomputation.
+- Add columns to emitted rows (and the matching `ALTER TABLE` DDLs under
+  `prod/clickhouse/pink/08_provenance.sql` — **apply DDLs to CH BEFORE
+  deploying code that emits them**; the missing-table head-of-line jam of
+  2026-06-11 is the cautionary tale):
+  - `account_events`, `status_snapshots`: `capital_source LowCardinality(String) DEFAULT ''`,
+    `account_event_seq UInt64 DEFAULT 0`
+  - `trade_events`, `trade_exit_legs`: `pnl_source LowCardinality(String) DEFAULT ''`
+    (`exchange` | `kernel_estimate`)
+- `bars_held`: clamp to `max(0, …)` at row-build time (UInt16 column;
+  negative values currently get an HTTP 400 on trade_events / silently
+  vanish on async tables).
+- Timestamps: route every `ts` through one helper emitting **naive-UTC
+  microsecond ISO** (no `+00:00`) — best_effort already tolerates both, but
+  rows must stop depending on a parser setting.
+
+### 4.3 Duplicate-emission fix (same file)
+Every CH row is currently emitted twice (visible in any query). Hunt the
+double call: instrument `_sink()` with a per-(table, content-hash) debug
+counter in a test, then trace the two call paths (suspect: `persist_result`
+invoked both from the runtime step and from the fill pump for the same
+event). Fix at the caller level; do NOT dedupe by content in the sink
+(masks real double-events). Regression test: one simulated round trip →
+exactly one row per logical event per table.
+
+### 4.4 `prod/ch_writer.py`
+- `wait_for_async_insert`: `"1"` for ALL `dolphin_pink` tables (accounting
+  rows must never be silently lost; the spool absorbs latency). `0` is
+  acceptable only for high-volume shadow tables where measurement shows it
+  is necessary; document any exception inline.
+- Mitigation for head-of-line (full redesign out of scope): after
+  `attempts > 1000` on a row, log ERROR with the CH response body once per
+  100 attempts (today the reject reason is invisible without manual replay).
+
+### 4.5 Gates
+- Full offline suite (the 533+ DITAv2/PINK set) green, minus the Phase-0
+  quarantined adapter tests if still open.
+- One live VST round trip: every table gets exactly one row per event;
+  `pnl_source`/`capital_source` populated; CH `system.text_log` shows zero
+  parse rejections for `dolphin_pink`.
+
+---
+
+## 5. Phase 3 — Sizer feedback off trade-realized PnL
+
+**THE one seam where this refactor can silently change alpha behavior.**
+
+### 5.1 `prod/clean_arch/runtime/pink_direct.py` — `_sizer_trade_feedback` (~line 1453)
+Today: `pnl = acc.capital − self._sizer_entry_capital` (capital delta).
+Under E-anchored capital this absorbs funding, fees of other activity, and
+**foreign fills from the shared VST account** (PRODGREEN collision class).
+Change to:
+```
+pnl = slot_realized_for_trade(trade_id)   # Σ slot.realized_pnl legs, i.e.
+                                          # kernel estimate, overridden by
+                                          # exchange rp when settled (5.2)
+```
+Source: the closing slot dict already carries `realized_pnl`; use it (minus
+the fees recorded for the trade when available) instead of the capital
+delta. Keep the magnitude semantics the sizer expects (sign + rough size —
+per the existing comment, bucket/streak multipliers only need that).
+
+### 5.2 Exchange override (E-led repair) — `bingx_user_stream.py` + `rust_backend.py`
+- The WS `FILL_SETTLED` path already carries the exchange's realized (`rp`)
+  and fee (`n`, sign-flipped at boundary per BingX quirks memory). Extend
+  the kernel account-event payload with `trade_id`, and on receipt:
+  - if the matching slot leg was flagged `realized_skipped_no_price`,
+    ADD the exchange realized to `slot.realized_pnl` (repair) and clear
+    the flag; settle the increment through the normal baseline mechanism;
+  - else record `pnl_source="exchange"` for the trade-event row (the
+    estimate stays as the booked figure unless |estimate−rp| exceeds a
+    tolerance — then log ERROR + emit an `anomaly_events` row; do NOT
+    silently re-book).
+- Rust: add `dita_kernel_repair_realized(slot_id, amount)` FFI (or fold the
+  repair into `on_account_event` with `slot_id` in payload). Keep it
+  idempotent via the existing account-event dedup.
+
+### 5.3 Gates
+- Unit: feedback receives trade-realized, not capital delta (simulate a
+  foreign-fill capital jump mid-trade → feedback unaffected).
+- Unit: price-less exit leg + later FILL_SETTLED repair → slot realized
+  equals exchange `rp`; settle baseline consistent (no double-settle).
+- Parity: `test_blue_parity.py`, `test_alpha_blue_untouched_g7.py` green
+  (sizer behavior unchanged for normal fills).
+
+---
+
+## 6. Phase 4 — Kernel hardening leftovers
+
+### 6.1 `lib.rs` — `resolve_slot` (~line 1099)
+Today it falls back to **slot 0** when nothing matches. Change: return
+`Option<usize>`; on `None`, `on_venue_event` returns
+`UNRESOLVED_SLOT` (diagnostic exists already) without mutating any slot,
+severity WARNING, event recorded in outcome details. Python callers: the
+runtime treats UNRESOLVED_SLOT as a logged no-op (the `_fill_is_ours`
+filter remains first-line defense; this is kernel-side defense for
+venue-agnostic reuse).
+NOTE: several tests construct events with `slot_id=-1` expecting slot-0
+fallback — update them to pass explicit `slot_id=0` (behavioral test
+change; list each in the PR description).
+
+### 6.2 ID-less fill routing (documentation + metric, not code)
+BingX WS omits clientOrderId, so identity routing can't always engage.
+Add a counter metric (`fills_routed_by_state_total`) via an
+`anomaly_events` row per occurrence, severity INFO — gives VIOLET the data
+to justify per-venue synthetic ids later. No FSM behavior change.
+
+### 6.3 Gates
+- New Rust tests: unresolved event mutates nothing; entry-id fill during
+  EXIT_WORKING routes to entry (already covered by Phase-0 routing — add
+  the explicit case); price-less exit leg books 0 + flag.
+
+---
+
+## 7. Test matrix (run-order for the implementing agent)
+
+| Stage | Command (env: `PYTHONPATH=/mnt/dolphinng5_predict:/mnt/dolphinng5_predict/nautilus_dolphin`, venv `/home/dolphin/siloqy_env/bin/python3`) | Pass bar |
+|---|---|---|
+| Rust unit | `cargo test --release` in `_rust_kernel/` | 100% |
+| Kernel FSM | `pytest prod/tests/test_dita_v2_kernel.py` | 100% |
+| Bridge/accounting | `pytest prod/tests/test_pink_ditav2_kernel_bridge.py test_pink_ditav2_accounting_invariants.py prod/clean_arch/dita_v2/test_account_core_v2.py` | 100% |
+| Runtime/reconcile | `pytest prod/clean_arch/dita_v2/test_venue_reconcile.py test_orphan_prevention.py test_exec_router_runtime.py prod/tests/test_pink_async_fill_pump.py test_pink_direct_runtime.py` | 100% |
+| Chaos | `pytest prod/tests/test_pink_ditav2_chaos_harness.py` + `test_dita_v2_e2e_functional.py` | 100% |
+| Parity | `pytest prod/clean_arch/dita_v2/test_blue_parity.py test_alpha_blue_untouched_g7.py` | 100% |
+| Adapter | `pytest prod/tests/test_dita_v2_bingx_adapter.py` | 100% after Phase-0 item 4 resolution |
+| LIVE VST E2E | `python prod/ops/dita_v2_live_bingx_smoke.py --pink --symbol TRXUSDT` | suite green |
+| **Golden replays (NEW — write these)** | `prod/tests/test_pink_accounting_golden.py` | see below |
+| Shadow soak | 24–48 h on VST | capital vs balance ≤ $0.01 idle |
+
+### Golden replay tests (the heart of the acceptance)
+Feed the kernel the recorded FET event sequence (entry fills 195,259 +
+7,017 @ 0.1878; exit fills 26,007 + remainder; the poisoned variant with
+price=0.229 and the clean variant with 0.1866):
+1. Clean prices → realized = `(0.1878−0.1866) × 202,276 ≈ +242.7` gross.
+2. Poisoned price (0.229) reaching the kernel anyway → with the adapter fix
+   it must arrive as 0.0 → leg books 0 + `realized_skipped_no_price`; after
+   synthetic FILL_SETTLED rp=+164 → slot realized = +164, `pnl_source=exchange`.
+3. Restart mid-position (save_state/restore_state + reconcile_from_slots)
+   → next venue event settles ONLY the incremental PnL.
+4. VWAP: two entry fills at different prices → basis = weighted average.
+5. Dual-leverage invariant: same fills at exchange-leverage 1 vs 3 →
+   **identical realized PnL**; only margin fields differ.
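+
+The arithmetic behind cases 1, 4, and 5 can be sketched in a few lines. This is
+an illustration only, not the kernel's implementation: `vwap_basis`,
+`realized_short_pnl`, and `initial_margin` are hypothetical helper names, the
+position is assumed to be a short (as the sign in case 1 implies), and fees are
+ignored.
+
+```python
+from decimal import Decimal
+
+def vwap_basis(fills):
+    """Quantity-weighted average price over (qty, price) fills."""
+    total_qty = sum(q for q, _ in fills)
+    return sum(q * p for q, p in fills) / total_qty
+
+def realized_short_pnl(entry_basis, exit_price, qty):
+    """Gross realized PnL for a short: positive when exit is below entry."""
+    return (entry_basis - exit_price) * qty
+
+def initial_margin(entry_basis, qty, leverage):
+    """Margin posted for the position; the only leverage-dependent field here."""
+    return entry_basis * qty / leverage
+
+# Entry fills from the recorded FET sequence (assumed clean variant).
+entry_fills = [
+    (Decimal("195259"), Decimal("0.1878")),
+    (Decimal("7017"), Decimal("0.1878")),
+]
+qty = sum(q for q, _ in entry_fills)
+basis = vwap_basis(entry_fills)
+gross = realized_short_pnl(basis, Decimal("0.1866"), qty)  # 242.7312
+
+# Dual-leverage invariant: PnL takes no leverage argument, margin does.
+margin_1x = initial_margin(basis, qty, Decimal(1))
+margin_3x = initial_margin(basis, qty, Decimal(3))
+```
+
+Decimal arithmetic is used here so the golden values compare exactly; the real
+tests may prefer a float tolerance if the kernel books in `f64`.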
+
+---
+
+## 8. Rollout & rollback
+
+1. Each phase = one PR-sized commit, gates green before the next.
+2. Activation requires `supervisorctl restart dolphin_pink` — restart at a
+   FLAT moment (check `DOLPHIN_STATE_PINK` + exchange positions). The
+   restart-reconcile path is itself under test here; first restart after
+   Phase 0 should be watched live.
+3. Rollback = `git revert` of the phase commit + rebuild `.so` + restart.
+   The Rust `.so` MUST be rebuilt on both apply and revert — stale-binary
+   drift is how the incremental-fill change sat uncompiled until 2026-06-11.
+4. CH DDLs are additive (`ADD COLUMN ... DEFAULT`) — no destructive
+   migrations anywhere in this spec; rollback leaves unused columns, which
+   is fine.
+5. PINK is VST (virtual funds) — it is the canary by construction. Nothing
+   in this spec touches BLUE files (verify with `git diff --name-only`
+   against the §38.7 checklist).
+
+## 9. Done criteria (the whole spec)
+
+- All phases merged; full matrix green; golden replays green.
+- 48 h VST soak: zero UNEXPLAINED reconcile errors; published capital
+  tracks exchange balance; every closed trade's `trade_events.pnl` within
+  fees+slippage of the exchange income record, with `pnl_source` populated.
+- `pink_ctl.py mode-verify` passes (namespace isolation intact).
+- SYSTEM BIBLE §38 addendum updated (one paragraph: E-led ledger, K as
+  checksum, provenance fields) + `DITA_V2_KERNEL_REFERENCE.md` §"Capital
+  simplification" rewritten to match reality.
diff --git a/prod/docs/VIBRISS_PARAMETER_GOVERNANCE_SPEC.md b/prod/docs/VIBRISS_PARAMETER_GOVERNANCE_SPEC.md
new file mode 100644
index 0000000..d92798c
--- /dev/null
+++ b/prod/docs/VIBRISS_PARAMETER_GOVERNANCE_SPEC.md
@@ -0,0 +1,2978 @@
+# VIBRISS Parameter Governance Spec
+
+**Name**: VIBRISS — Variational Input-driven Bandit-Reactive Intelligent Sensing System  
+**Status**: Design doctrine / implementation target  
+**Scope**: BLUE/PINK parameter governance, initially shadow/advisory only  
+**Canonical dependency**: `SYSTEM_BIBLE_v7.md`  
+**Operational stance**: shadow-first, replay-first, guardrail-first. VIBRISS
+must be useful even when it never gets permission to actuate live.
+
+## 1. Purpose
+
+VIBRISS is the engine's active parameter-sensing and adaptive execution layer.
+Its job is to replace brittle hardcoded execution constants with bounded,
+auditable, continuously re-evaluated parameter recommendations.
+
+VIBRISS is not a new alpha model and not a full RL layer. It is an online
+statistical parameter-governance system: observe outcomes, test safe candidate
+values, score the realized response, retire weak settings, and keep enough
+controlled exploration alive to detect drift.
+
+The first intended target is exit-parameter governance, especially ADVSL and
+fast/cubic TP parameters such as hold-bar limits, floor thresholds, pressure
+thresholds, and TP posture. Later targets can include sizing haircuts, urgency,
+asset-selection posture, and venue-specific execution parameters.
+
+## 2. Design Stance
+
+VIBRISS must be modular, spec-driven, replayable, and safety bounded.
+
+Key doctrine:
+
+- One learner per parameter spec by default.
+- Bundle/slate learning only after interaction effects are repeatedly material.
+- Contextual bandits first; full RL only later if decisions are truly sequential
+  and materially coupled across multiple execution steps.
+- Discrete and bucketed parameters use Thompson Sampling, UCB, LinTS, or LinUCB.
+- Continuous bounded scalars are discretized into safe buckets first.
+- Nonstationary behavior uses discounted or sliding-window evidence plus drift
+  detection.
+- Safety-critical parameters require baseline-safe exploration, confidence
+  thresholds, step limits, cooldowns, and hard guardrails.
+- Passive fill and time-to-fill decisions should use survival-analysis modules
+  where censoring matters.
+
+## 3. System Boundary
+
+VIBRISS must not silently mutate engine internals.
+
+The correct production shape is:
+
+```text
+context ingestion
+  -> admissible candidate generation
+  -> learner scoring
+  -> guardrail filter
+  -> action selection
+  -> advice publication
+  -> allowed engine consumption point
+  -> delayed outcome capture
+  -> reward mapping
+  -> online update
+```
+
+The hot execution path consumes advice only at documented decision points. The
+learner/update path is separate and may lag. If advice is stale, low-confidence,
+or invalid, the engine falls back to the baseline parameter.
+
+BLUE is in-memory/paper and not BingX-enabled. PINK is the BingX venue-facing
+world. VIBRISS may govern both, but its output contract must be namespace-aware
+and must not assume that BLUE has exchange state.
+
+Non-goals:
+
+- VIBRISS does not pick assets.
+- VIBRISS does not replace MARAS, OBF, V7, ACB, EFSM, or SurvivalStack.
+- VIBRISS does not own exchange reconciliation.
+- VIBRISS does not rewrite frozen champion configs.
+- VIBRISS does not turn offline backtest winners into live settings without
+  a shadow/OPE/promotion path.
+
+Its only authority is to publish bounded, versioned parameter advice and to
+learn from the outcome trail.
+
+## 4. Terminology
+
+| Term | Meaning |
+|---|---|
+| `vibrissa` | One probe-trade, parameter test, or market feeler. |
+| `vibrissae` | The active parameter-probe array. |
+| `parameter spec` | Loadable contract defining one tunable parameter. |
+| `arm` | One candidate value or execution configuration. |
+| `reward` | Bounded realized execution-quality score. |
+| `posture` | Current preferred parameter set plus confidence and fallback metadata. |
+| `baseline` | The currently trusted hardcoded or documented production value. |
+
+### 4.1 Control-Plane Elegance Constraints
+
+VIBRISS must remain a disciplined parameter-governance control plane, not an
+unbounded mesh of subsystems mutating each other. Adaptive behavior is allowed
+only when it preserves ownership, auditability, and bounded actuation.
+
+Hard architecture rules:
+
+1. One writer per parameter.
+   - A live parameter may have many sensors and many context inputs, but only
+     one ParamSet is allowed to publish the effective value for that parameter
+     in a given namespace.
+
+2. ParamSpecs and ParamSetSpecs own promotion rules.
+   - Promotion cadence, evidence gates, rollback rules, manual-approval
+     requirements, and replacement rhythm are part of the spec. The runner must
+     execute declared policy, not invent policy.
+
+3. Meta-cadence is itself a parameter, but only at a slower cadence.
+   - VIBRISS may tune replay cadence, promotion-review cadence, checkpoint
+     cadence, or reward-join cadence, but those meta-parameters must move more
+     slowly than the governed trading/execution parameter and must have
+     stronger guardrails.
+
+4. EsoF, ExoF, MARAS, OBF, V7, MHS, and drawdown state are context inputs, not
+   arbitrary controllers.
+   - They may influence candidate scoring, confidence, demotion, or fallback,
+     but they must not directly mutate live parameters outside the owning
+     ParamSet.
+
+5. Every live change must be reproducible.
+   - Log candidate set, chosen action, action probability or confidence,
+     context hash, reward mapping, model version, compiled config hash,
+     fallback reason, promotion state, and rollback path.
+
+6. No hidden cross-subsystem mutation.
+   - If one subsystem changes another subsystem's effective behavior, the change
+     must appear as a typed ParamSet advice event and an audited engine-consumed
+     posture update.
+
+7. Shadow first, replay/OPE second, canary third, live last.
+   - No safety-critical parameter may skip directly from idea or in-sample
+     replay to live actuation. Live promotion requires held-out evidence,
+     shadow logging, explicit approval when required, and automatic demotion
+     conditions.
+
+These constraints are mandatory for all future ADVSL, TP, DVOL/VOL, IRP,
+asset-picker, EFSM/overlay, and meta-cadence ParamSets. If a design violates
+them, the design is considered tangled and must be simplified before
+implementation.
+
+## 5. Parameter Spec Contract
+
+Each adaptive parameter must be declared by a loadable spec. VIBRISS should not
+hardcode knowledge of individual parameters.
+
+Important terminology:
+
+- `ParamSetSpec`: the loadable contract for a family of related parameters.
+- `paramset_config`: configuration that applies to the ParamSet as a whole.
+- `params`: the parameter declarations contained by the ParamSet.
+- `param_defaults`: defaults inherited by every parameter in `params`.
+- per-param override: a field inside one `params.<param_name>` entry that
+  overrides `param_defaults` for that parameter only.
+
+The live runner must not perform complex inheritance during scoring. Specs are
+authored in a rich hierarchical form, validated, compiled, and hash-stamped into
+a flat canonical policy document before the runner consumes them.
+
+Required fields:
+
+```yaml
+identity:
+  name: advsl.overlay_min_hold_bars
+  type: integer
+  units: bars
+  default: 6
+
+domain:
+  candidates: [4, 6, 8, 10, 12, 16, 20]
+  hard_min: 0
+  hard_max: 40
+
+safety:
+  fallback_baseline: 6
+  max_step_change: 4
+  cooldown_trades: 5
+  min_shadow_samples: 100
+  min_live_confidence: 0.80
+  max_exploration_rate: 0.05
+
+placement:
+  consumer: advanced_sl
+  decision_point: open_trade_exit_evaluation
+  namespace: blue
+
+live_change_policy:
+  mode: between_trades
+  allow_intratrade_change: false
+
+candidate_policy:
+  learner: linucb
+  nonstationarity: sliding_window
+  window_trades: 300
+
+success:
+  primary_metric: capital_curve_delta_after_cost
+  secondary_metrics:
+    - clipped_winner_cost
+    - saved_loss
+    - drawdown_delta
+    - recovery_lag
+
+inputs:
+  - maras_latest
+  - v7_decision_events
+  - advanced_sl_monitor_latest
+  - obf_universe_latest
+  - eigen_scan
+  - trade_path
+
+reward_mapping:
+  bounded_range: [-1.0, 1.0]
+  delayed_until: trade_close_or_counterfactual_terminal
+  components:
+    saved_loss: +1.0
+    missed_profit: -1.5
+    drawdown_reduction: +0.5
+    tail_loss: -2.0
+
+promotion_policy:
+  owner: param_set
+  technique: replay_shadow_canary
+  review_cadence_s: 900
+  min_replay_trades: 300
+  min_shadow_decisions: 200
+  min_realized_rewards: 50
+  min_contiguous_regions: 4
+  required_evidence:
+    recursive_capital_curve_delta_after_cost: "> 0"
+    worst_region_delta: ">= configured_floor"
+    clipped_winner_cost: "<= configured_budget"
+    drawdown_delta: "<= 0"
+  allowed_transitions:
+    - disabled_to_shadow
+    - shadow_to_advisory
+    - advisory_to_canary_live
+    - canary_live_to_controlled_live
+  manual_approval_required:
+    - advisory_to_canary_live
+    - canary_live_to_controlled_live
+  automatic_demotion_on:
+    - stale_required_sensor
+    - reward_drift
+    - drawdown_alarm
+    - invalid_checkpoint
+
+meta_cadence_policy:
+  owner: param_set
+  status: shadow_first
+  tunable_cadences:
+    calibration_interval_s: [300, 900, 1800, 3600]
+    promotion_review_interval_s: [900, 1800, 3600, 7200]
+    checkpoint_interval_s: [30, 60, 120, 300]
+    shadow_to_canary_cooldown_trades: [25, 50, 100, 200]
+  context_inputs:
+    - maras_latest
+    - exof_latest
+    - esof_latest
+    - mhs_latest
+    - reward_backlog
+    - drawdown_state
+  success:
+    primary_metric: policy_stability_adjusted_reward
+    secondary_metrics:
+      - stale_advice_rate
+      - promotion_false_positive_rate
+      - missed_adaptation_cost
+      - operator_churn
+      - compute_cost
+  live_change_policy:
+    calibration_cadence: controlled_after_shadow
+    promotion_cadence: advisory_only_until_explicit_approval
+
+outputs:
+  hz_key: DOLPHIN_FEATURES.vibriss_param_advice
+  clickhouse_table: dolphin.vibriss_decisions
+  state_table: dolphin.vibriss_policy_state
+```
+
+### 5.1 ParamSet Config and Per-Parameter Overrides
+
+The canonical authoring shape is:
+
+```yaml
+param_set:
+  id: advsl.hold_substitute.v1
+  version: 1.0.0
+  namespace_default: blue
+  status: shadow_first
+
+paramset_config:
+  consumer: advanced_sl
+  decision_family: exit_risk_timing
+  placement:
+    decision_point: trade_entry
+    live_replacement_rhythm: capture_on_entry
+  promotion_policy:
+    technique: replay_shadow_canary
+    review_cadence_s: 1800
+  meta_cadence_policy:
+    status: shadow_first
+  outputs:
+    hz_key: DOLPHIN_FEATURES.vibriss_hold_substitute_advice
+    decision_table: dolphin.vibriss_decisions
+    reward_table: dolphin.vibriss_rewards
+
+param_defaults:
+  learner:
+    type: discounted_ucb
+    nonstationarity: sliding_window
+    window_trades: 300
+  safety:
+    fallback_baseline: 12
+    min_shadow_samples: 200
+    min_live_confidence: 0.80
+    max_exploration_rate: 0.0
+  reward_mapping:
+    bounded_range: [-1.0, 1.0]
+    primary_metric: recursive_capital_curve_delta_after_cost
+  guardrails:
+    stale_sensor_policy: shrink_to_baseline
+    drawdown_alarm_policy: freeze_to_baseline
+
+params:
+  advsl.min_hold_bars_before_floor_arm:
+    type: integer
+    units: bars
+    domain:
+      candidates: [4, 6, 8, 10, 12, 14, 16, 20, 24, 28, 34, 40]
+      hard_min: 0
+      hard_max: 48
+    default: 12
+    baseline_reference: 20
+
+  advsl.recovery_extension_max_bars:
+    type: integer
+    units: bars
+    domain:
+      candidates: [0, 4, 8, 12, 20, 34]
+      hard_min: 0
+      hard_max: 40
+    default: 0
+    learner:
+      type: shadow_only_discounted_ucb
+    safety:
+      min_shadow_samples: 500
+      min_live_confidence: 0.90
+```
+
+Merge precedence:
+
+```text
+compiled_param =
+  built_in_schema_defaults
+  < paramset_config
+  < param_defaults
+  < params.<param_name>
+  < namespace/runtime override if explicitly allowed by spec
+```
+
+Rules:
+
+- ParamSet-wide promotion and meta-cadence policy live in `paramset_config`
+  unless a parameter explicitly overrides a narrower field.
+- Per-param overrides may tighten safety, narrow domains, or increase sample
+  requirements. They may change the learner type only if the ParamSet allows it.
+- Per-param overrides may not weaken global catastrophic guardrails.
+- The compiler must emit both the original source spec hash and the compiled
+  canonical hash.
+- The runner consumes only the compiled canonical form.
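+
+A minimal sketch of the merge-and-stamp step, assuming each layer has already
+been parsed into a plain dict. `deep_merge`, `canonical_hash`, and
+`compile_param` are hypothetical names; the namespace/runtime override layer,
+the source spec hash, and the check that overrides never weaken catastrophic
+guardrails are omitted for brevity.
+
+```python
+import copy
+import hashlib
+import json
+
+def deep_merge(base, override):
+    """Return a new dict where `override` wins; nested dicts merge recursively."""
+    merged = copy.deepcopy(base)
+    for key, value in override.items():
+        if isinstance(value, dict) and isinstance(merged.get(key), dict):
+            merged[key] = deep_merge(merged[key], value)
+        else:
+            merged[key] = copy.deepcopy(value)
+    return merged
+
+def canonical_hash(doc):
+    """SHA-256 over a key-sorted, whitespace-free JSON encoding."""
+    encoded = json.dumps(doc, sort_keys=True, separators=(",", ":"))
+    return hashlib.sha256(encoded.encode("utf-8")).hexdigest()
+
+def compile_param(schema_defaults, paramset_config, param_defaults, param_entry):
+    """Apply layers lowest-precedence first, then stamp the compiled hash."""
+    compiled = {}
+    for layer in (schema_defaults, paramset_config, param_defaults, param_entry):
+        compiled = deep_merge(compiled, layer)
+    # The hash covers the merged values only; it is computed before the
+    # hash field itself is added to the document.
+    compiled["compiled_config_hash"] = canonical_hash(compiled)
+    return compiled
+```
+
+Sorting keys before hashing makes the hash independent of authoring order, which
+is one way to satisfy the determinism requirement on the compiled hash.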
+
+### 5.2 Spec Compiler and Validation Library
+
+Use an existing platform-agnostic schema/config tool for the authoring layer.
+Do not invent a bespoke inheritance language.
+
+Recommended stance:
+
+| Need | Recommended tool | Runtime placement |
+|---|---|---|
+| Cross-language schema contract | JSON Schema | CI, compiler, runner validation. |
+| Rich defaults, constraints, unification, inheritance-like config | CUE | Spec compiler / CI, not hot path. |
+| Human-friendly authoring | YAML | Source only; compiled immediately. |
+| Runner consumption | canonical JSON | Hot path. |
+| Fast internal representation | dataclass / Pydantic / msgspec-style object | Runner load time only. |
+
+VIBRISS should prefer:
+
+```text
+YAML authoring -> CUE/JSON-Schema validation -> canonical JSON -> runner cache
+```
+
+The live runner should never parse CUE, run template expansion, or resolve a
+large inheritance tree during an advice decision. It should load a precompiled
+canonical JSON document, verify hashes and schema version, then use direct field
+access.
+
+Performance requirements:
+
+- spec compile can be slower because it is CI/worker time;
+- runner spec load should be bounded and rare;
+- advice scoring must use already-merged values;
+- every compiled ParamSet must include a deterministic `compiled_config_hash`;
+- all advice/audit rows must log `spec_hash` and `compiled_config_hash`.
+
+## 6. Candidate Algorithms
+
+V1 should support a small set of algorithms well, rather than a broad library
+surface poorly.
+
+Recommended V1 learners:
+
+| Parameter type | Default learner | Notes |
+|---|---|---|
+| Small categorical | Thompson Sampling | Useful for urgency, route, retry, fixed mode selection. |
+| Ordered discrete scalar | UCB or discounted UCB | Good for hold bars, TP buckets, pressure thresholds. |
+| Contextual finite arms | LinUCB or LinTS | First choice for MARAS/OBF/V7-conditioned advice. |
+| Continuous scalar | Adaptive discretization | Start bucketed; upgrade only if buckets are too coarse. |
+| Passive fill/delay | Survival model | Explicitly handle censored fill and recovery windows. |
+
+Useful libraries to inspect:
+
+- Vowpal Wabbit for contextual bandits, logged propensities, and OPE.
+- River for streaming statistics, online GLMs, and drift detection.
+- Open Bandit Pipeline for offline policy evaluation.
+- MABWiser for fast Python prototype comparison.
+- lifelines or statsmodels for survival analysis.
+- NumPyro/Pyro only when hierarchical Bayesian pooling is justified.
+
+### 6.1 Dependency Placement and Reliability Policy
+
+VIBRISS must distinguish algorithm research from live parameter governance.
+Performance and reliability are more important than using the most general
+library in the first live version.
+
+Dependency rule:
+
+- The live runner should have a small deterministic dependency surface.
+- Heavy learning, OPE, simulation, Bayesian inference, and broad model
+  comparison belong in `vibriss_worker` or offline jobs.
+- The engine consumes compact checkpointed policy state and advice payloads. It
+  must not shell out to a learner or wait on an offline library.
+- ClickHouse writes, model updates, and replay jobs must never block the hot
+  advice publication loop.
+- If a dependency is not needed to score the current checkpointed policy, it is
+  not a live-runner dependency.
+
+Recommended V1 split:
+
+| Layer | Allowed dependency posture | Reason |
+|---|---|---|
+| Engine hot path | no VIBRISS learner dependency | Engine reads validated advice only. |
+| `vibriss_runner` | stdlib + NumPy/Pandas only if needed; optional River subset for drift/stats | Keep startup, memory, and failure modes bounded. |
+| `vibriss_worker` | VW, River, OBP, MABWiser, lifelines, statsmodels, contextual libraries | Calibration, OPE, replay, walk-forward, and report generation. |
+| Research/simulation | ABIDES, Pyro/NumPyro, CATX, experimental packages | Valuable, but not part of the live critical path. |
+
+### 6.2 Library Decision Matrix
+
+| Library / stack | VIBRISS use | Placement | Decision |
+|---|---|---|---|
+| Internal UCB/TS/LinUCB | First production learners for bounded discrete arms. | runner + worker | Use first; easiest to audit and checkpoint. |
+| Vowpal Wabbit | Contextual bandit benchmark, action-dependent features, OPE workflows, possible future compact policy generator. | worker/offline | Approved for evaluation; not a V1 hot-path dependency. |
+| River | Streaming stats, reward normalization, ADWIN/Page-Hinkley/KSWIN-style drift detection, progressive validation. | runner optional; worker default | Approved, but keep live usage narrow. |
+| Open Bandit Pipeline | OPE estimator benchmarking and logged-bandit evaluation. | offline/worker | Approved for reports; not live. |
+| MABWiser | Fast Python comparison of TS/UCB/LinTS/LinUCB policies. | offline/worker | Approved for prototyping; not live. |
+| lifelines / statsmodels | Survival models, recursive diagnostics, stability checks. | worker/offline | Approved for passive fill/recovery modeling. |
+| contextualbandits | Alternative contextual-bandit benchmark implementations. | offline/worker | Research benchmark only. |
+| SMPyBandits / BanditPylib / PyBandits | Algorithm comparison and stochastic-bandit sandboxing. | offline/research | Optional; do not add to live image. |
+| NumPyro / Pyro | Hierarchical Bayesian pooling for sparse per-symbol/per-hash modules. | research/worker | Defer until sparse-data pooling is clearly needed. |
+| CATX | Continuous-action contextual bandit research. | research | Defer; bucketed actions first. |
+| ABIDES / ABIDES-Gym | Market-interactive simulation and stress rehearsal. | research/simulation | Useful later; too heavy for V1 runner. |
+| Kafka / Flink | Durable event-stream backbone and stateful stream processing. | future infra | Defer; Dolphin already has Hazelcast + ClickHouse + supervisord. |
+| scikit-multiflow | Historical stream-learning reference. | none | Do not use for net-new code; prefer River. |
+| banditml | Architectural reference for production bandit services. | research only | Do not depend on it without a fresh maintenance review. |
+
+### 6.3 Performance Budgets
+
+Initial budgets for the live runner:
+
+| Operation | Target | Hard behavior on miss |
+|---|---:|---|
+| Score one ParamSet advice snapshot | `p95 <= 10 ms` | publish fallback or previous checkpoint. |
+| Full live advice loop over enabled ParamSets | `p95 <= 50 ms` | skip noncritical ParamSets first. |
+| Hazelcast publish | nonblocking best effort | mark advice degraded if publish fails. |
+| ClickHouse audit write | never blocks advice | spool locally and expose backlog. |
+| Runner startup with warm checkpoint | `<5 s` target | publish no advice until checkpoint valid. |
+| Memory footprint | bounded and observable | disable worker-style models in runner. |
+
+Candidate sets must stay small. For `advsl.hold_substitute.v1`, a dozen finite
+hold-bar arms is acceptable; hundreds of arms are not. Continuous-action
+learners are disallowed in live V1 because they make bounded behavior harder to
+audit and harder to replay exactly.
+
+### 6.4 Algorithm Defaults by Parameter Class
+
+Concrete defaults:
+
+| Parameter situation | Default | Upgrade path | Notes |
+|---|---|---|---|
+| Small finite categorical, weak context | Thompson Sampling or UCB1 | discounted UCB if drift appears | Use for mode, urgency, route, retry-like knobs. |
+| Ordered discrete scalar | discounted UCB with monotone/smoothness diagnostics | contextual finite-arm learner | Good first fit for hold bars and TP buckets. |
+| Finite arms with rich context | LinUCB or LinTS | GLM-UCB/GLM-TS if reward shape demands it | Use MARAS/OBF/V7/EFSM context. |
+| Continuous bounded scalar | adaptive discretization | continuous-action contextual bandit only after bucket failure | Prefer auditability over fine resolution. |
+| Coupled parameter bundle | small safe bundle catalog | slate/combinatorial learner only if interaction is proven | Avoid action-space explosion. |
+| Nonstationary regime | discounted/sliding-window learner + drift detector | replay-reset logic | Freeze or shrink on drift; do not blindly chase. |
+| Safety/budget constrained parameter | baseline-safe gating around the learner | conservative contextual bandit / budgeted bandit | Guardrails must dominate learner output. |
+| Passive fill or recovery delay | survival model | richer survival only after classical model stability | Treat censoring explicitly. |
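+
+For the ordered-discrete row, a discounted UCB learner over a small finite arm
+list could look roughly like the sketch below. It is a simplified illustration,
+not the planned internal learner: the class name, the `gamma` and `exploration`
+defaults, and the exact bonus form are assumptions, rewards are assumed to be
+bounded in `[-1, 1]`, and the monotone/smoothness diagnostics are left out.
+
+```python
+import math
+
+class DiscountedUCB:
+    """Discounted UCB over a fixed, finite arm list."""
+
+    def __init__(self, arms, gamma=0.99, exploration=1.0):
+        self.arms = list(arms)
+        self.gamma = gamma              # per-update decay of old evidence
+        self.exploration = exploration  # scale of the confidence bonus
+        self.counts = {a: 0.0 for a in self.arms}  # discounted pull counts
+        self.sums = {a: 0.0 for a in self.arms}    # discounted reward sums
+
+    def select(self):
+        # Play any arm that has no evidence yet before using the index.
+        for arm in self.arms:
+            if self.counts[arm] == 0.0:
+                return arm
+        total = max(sum(self.counts.values()), 1.0)
+
+        def index(arm):
+            mean = self.sums[arm] / self.counts[arm]
+            bonus = self.exploration * math.sqrt(math.log(total) / self.counts[arm])
+            return mean + bonus
+
+        return max(self.arms, key=index)
+
+    def update(self, arm, reward):
+        # Decay all evidence, then credit the arm that was played.
+        for a in self.arms:
+            self.counts[a] *= self.gamma
+            self.sums[a] *= self.gamma
+        self.counts[arm] += 1.0
+        self.sums[arm] += reward
+```
+
+Because every arm's evidence decays on each update, an arm that has not been
+played recently regains a larger bonus and is eventually re-tested, which is the
+property that lets the learner notice drift. In shadow mode `select` would only
+be logged, never applied.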
+
+### 6.5 Explicit Deferrals
+
+VIBRISS V1 should not attempt:
+
+- full RL;
+- continuous-action live control;
+- live probe trades by default;
+- Kafka/Flink migration;
+- ABIDES-in-the-loop production scoring;
+- hierarchical Bayesian pooling in the runner;
+- joint optimization of many parameters before single-ParamSet evidence exists.
+
+These are not rejected ideas. They are deferred because the current bottleneck is
+reliable evidence collection, replay/OPE discipline, and safe advice
+publication.
+
+## 7. Reward Design
+
+Rewards must be decomposed, bounded, and auditable. Store both raw components
+and normalized reward.
+
+Typical reward components:
+
+- positive: saved loss, lower drawdown, better realized terminal PnL, better
+  capital compounding trajectory, successful recovery without excess hold.
+- negative: clipped winner, missed TP, extra adverse selection, slippage, timeout,
+  excessive hold, larger tail loss, oscillation, stale-data actuation.
+
+For ADVSL/TP research, the primary reward should be capital-curve delta after
+opportunity cost, not terminal trade PnL alone. A rule that saves losses but
+systematically clips larger winners must be penalized accordingly.
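+
+As a sketch of how a decomposed reward might be bounded while keeping the raw
+components for audit, the function below applies the component weights from the
+§5 `reward_mapping` example and clips to the declared range. The function name
+and the assumption that each component is a non-negative magnitude already
+scaled to `[0, 1]` are illustrative, not part of the spec.
+
+```python
+# Weights copied from the reward_mapping example in section 5.
+WEIGHTS = {
+    "saved_loss": 1.0,
+    "missed_profit": -1.5,
+    "drawdown_reduction": 0.5,
+    "tail_loss": -2.0,
+}
+
+def bounded_reward(components, weights=WEIGHTS, lo=-1.0, hi=1.0):
+    """Weighted sum of components, clipped to [lo, hi]; raw values are kept."""
+    raw = sum(w * components.get(name, 0.0) for name, w in weights.items())
+    normalized = max(lo, min(hi, raw))
+    return {"raw": raw, "normalized": normalized, "components": dict(components)}
+
+# A rule that saves a loss (0.4) but clips a larger winner (0.6) nets negative.
+clipped_winner = bounded_reward({"saved_loss": 0.4, "missed_profit": 0.6})
+
+# A large tail loss saturates at the lower bound instead of dominating updates.
+tail_event = bounded_reward({"tail_loss": 0.9})
+```
+
+Storing `raw` next to `normalized` keeps clipped events visible in replay even
+though the learner only sees the bounded value.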
+
+## 8. Required Audit Logging
+
+Every VIBRISS decision must be replayable.
+
+Minimum decision log fields:
+
+- timestamp and scan number
+- namespace: blue, pink, prodgreen, research
+- parameter spec id and version
+- context snapshot hash
+- MARAS regime, scalar hash, composite hash when available
+- candidate set
+- chosen arm
+- action probability or confidence
+- baseline value
+- guardrail decisions and fallback reason
+- model version
+- advice publication timestamp
+- engine consumption timestamp, if consumed
+- delayed reward components
+- terminal reward
+- policy update version
+
+## 9. Control-Plane Output
+
+VIBRISS publishes advice, not imperative mutations.
+
+Recommended HZ shape:
+
+```json
+{
+  "schema": "vibriss.param_advice.v1",
+  "namespace": "blue",
+  "ts": "2026-06-03T00:00:00Z",
+  "spec_id": "advsl.overlay_min_hold_bars",
+  "spec_version": "1.0.0",
+  "baseline_value": 6,
+  "recommended_value": 12,
+  "confidence": 0.82,
+  "candidate_set": [4, 6, 8, 10, 12, 16, 20],
+  "context_hash": "maras:57957|asset:XLMUSDT|side:LONG",
+  "learner": "linucb",
+  "guardrail_status": "PASS",
+  "fallback_reason": null,
+  "expires_at": "2026-06-03T00:05:00Z"
+}
+```
+
+Consumption rule: the engine may consume this only if the parameter spec says
+the current state is an allowed change point and all guardrails pass. Otherwise
+the baseline remains in force.
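+
+The consumption rule can be sketched as a pure function over the advice payload
+shown above. This is a hedged illustration: the function name, the fallback
+reason strings, and the `min_confidence` argument (taken here from the spec's
+`min_live_confidence`) are assumptions, and `now` is expected to be a
+timezone-aware `datetime`.
+
+```python
+from datetime import datetime, timezone
+
+def effective_value(advice, baseline, now, min_confidence, at_allowed_change_point):
+    """Return (value, fallback_reason); the reason is None when advice is used."""
+    if not at_allowed_change_point:
+        return baseline, "not_allowed_change_point"
+    if not isinstance(advice, dict) or advice.get("schema") != "vibriss.param_advice.v1":
+        return baseline, "invalid_advice"
+    if advice.get("guardrail_status") != "PASS":
+        return baseline, "guardrail_not_pass"
+    raw_expiry = advice.get("expires_at")
+    if not isinstance(raw_expiry, str):
+        return baseline, "invalid_advice"
+    # Older Python versions do not parse a trailing "Z", so normalize it.
+    expires_at = datetime.fromisoformat(raw_expiry.replace("Z", "+00:00"))
+    if now >= expires_at:
+        return baseline, "stale_advice"
+    if advice.get("confidence", 0.0) < min_confidence:
+        return baseline, "low_confidence"
+    if advice.get("recommended_value") not in advice.get("candidate_set", []):
+        return baseline, "value_outside_candidate_set"
+    return advice["recommended_value"], None
+
+advice = {
+    "schema": "vibriss.param_advice.v1",
+    "recommended_value": 12,
+    "confidence": 0.82,
+    "candidate_set": [4, 6, 8, 10, 12, 16, 20],
+    "guardrail_status": "PASS",
+    "expires_at": "2026-06-03T00:05:00Z",
+}
+fresh = datetime(2026, 6, 3, 0, 1, tzinfo=timezone.utc)
+late = datetime(2026, 6, 3, 0, 6, tzinfo=timezone.utc)
+```
+
+Every rejection path returns the baseline together with a reason, so the
+fallback can be written to the audit log without extra branching in the engine.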
+
+## 10. Initial VIBRISS Targets
+
+### 10.1 Conditional Fast TP
+
+First replay-backed target:
+
+- `fast_tp.tp_pct`
+- `fast_tp.bars_held_min`
+- `fast_tp.exit_pressure_min`
+- `fast_tp.mfe_decay_min`
+- `fast_tp.pnl_mfe_frac_max`
+
+Current evidence says blanket first-touch `0.20%` TP clips too many winners, but
+conditional fast TP is net positive in both the full corpus and the
+capital-known BLUE subset. The first VIBRISS job is to turn those calibrated
+constants into a shadow policy with logged propensities and OOS replay.
+
+This TP percentage is a prime VIBRISS assistance target. Treat it as a
+first-class tunable rather than a frozen constant once replay coverage is
+sufficient.
+
+Open research note:
+
+- investigate whether the `0.20%` TP should be risk-normalized by notional
+  risked, using a monotone nonlinearity such as a cubic retract/expansion curve;
+- the candidate question is whether high-notional or high-leverage trades should
+  have a proportionally different TP posture, while keeping the first-touch
+  semantics intact for replay accounting;
+- if tested, this must be evaluated with full capital-curve compounding and
+  opportunity cost, not just raw win-rate or per-trade PnL.
+
+#### 10.1.1 Re-entry-Conditioned Fast TP
+
+Same-asset reentries after a profitable exit are a separate research bucket.
+They should not inherit the exact same fast-TP posture as a first-entry trade
+without evidence. In current BLUE history, same-asset reentries after wins are
+usually profitable, but the average second-leg move is smaller than the initial
+leg, which means a lower TP multiplier may preserve geometry better than a blunt
+`2.0x` repeat.
+
+Recommended candidate arms:
+
+- `fast_tp.reentry_tp_multiplier = 1.2`
+- `fast_tp.reentry_tp_multiplier = 1.5`
+- `fast_tp.reentry_tp_multiplier = 2.0`
+
+Interpretation:
+
+- first-entry trades keep the baseline conditional fast TP
+- re-entry-after-win trades may use a smaller multiplier band
+- re-entry-after-loss trades should remain a separate bucket and may need a
+  slower TP or stronger confirmation, not just a smaller multiplier
+- a mild nonlinear / cubic trim on re-entry is a valid shadow-only follow-up
+  candidate, but only after the flat multiplier band has been replayed first
+
+Ownership rule:
+
+- VIBRISS should learn and score the candidate multiplier in shadow replay
+- EFSM should own live application if the runtime ever consumes the bucket
+- do not flatten the geometric ROI curve by forcing a single multiplier on all
+  reentries
+
+#### 10.1.2 TP Near-Miss Replay
+
+The TP research set must include a distinct near-miss population:
+
+- trades that came within a small epsilon of the candidate TP but did not
+  satisfy the live trigger on the observed cadence
+- trades that briefly exceeded the candidate TP and then reversed before the
+  engine observed the touch
+- trades that later stopped out after first-touch proximity, because those are
+  the exact counterexamples needed to learn whether a lower TP bucket would
+  have been better
+
+This bucket is mandatory because a corpus dominated by profitable TP closes is
+survivorship-biased. A learner trained only on winners can learn that the
+current TP is "usually profitable" while remaining blind to the trades where a
+slightly lower TP would have caught the move and prevented a later stop-loss.
+
+Required replay semantics:
+
+- use first-touch TP labels, not close-only labels
+- keep near-miss candidates separate from clean TP hits
+- score each candidate by recursive capital-curve delta after opportunity cost
+- preserve scan-cadence effects when the live engine is scan-driven
+
+Primary use:
+
+- learn whether a tighter TP bucket is justified for specific regimes, assets,
+  or reentry conditions
+- quantify the opportunity cost of the missed touch itself, not just the later
+  realized close
+- explain repeated "why did this one not TP?" incidents without overfitting to
+  already-winning trades
+
+### 10.2 ADVSL Hold/Floor
+
+Second target:
+
+- `advsl.base_catastrophic_floor_pct`
+- `advsl.overlay_catastrophic_floor_pct`
+- `advsl.overlay_max_loss_usd`
+- `advsl.overlay_min_hold_bars`
+- `advsl.overlay_pressure_min`
+- `advsl.overlay_mae_risk_min`
+
+This is safety-critical. VIBRISS may advise, but live application requires
+strong guardrails, bounded step changes, and explicit fallback to the current
+documented ADVSL values.
+
+Floor percentage is also a prime VIBRISS assistance target, but it must stay
+outside the learner’s ability to disable the catastrophic floor entirely.
+
+Hard safety ceiling:
+
+- the operator may define a non-negotiable max-loss ceiling per trade, per leg,
+  or per session
+- this ceiling is distinct from the replay optimum and distinct from the
+  learner’s preferred floor/TP/hold posture
+- if a candidate policy exceeds the ceiling, the ceiling wins even when the
+  replayed recursive capital curve would otherwise look better
+- VIBRISS may tune inside the ceiling, but it must not optimize the ceiling
+  away, relax it implicitly, or treat operator pain tolerance as a soft signal
+
+### 10.3 MARAS-Conditioned Hold Bars
+
+Third target:
+
+- per-hash or per-regime hold-bar posture
+- per-label bias around known hash medians
+- OBF-conditioned hold extension or contraction
+
+Do not use MARAS labels as hard filters. Labels such as CHOPPY can contain both
+many wins and severe losses. Use the composite hash, raw signature dimensions,
+confidence, conflict, and nearest-neighbor regime evidence as context features.
+
+### 10.4 DVOL/VOL Gate and Trade-Pause Posture
+
+Candidate carefulness-critical target:
+
+- `entry_gate.dvol_threshold`
+- `entry_gate.vol_open_persistence_bars`
+- `entry_gate.min_qualified_cross_rate`
+- `entry_gate.pick_latency_pause_s`
+- `entry_gate.open_gate_no_pick_pause_score`
+
+This target exists because a VOL/DVOL gate can be technically open while the
+engine still sees low-quality entry conditions: few accepted threshold crosses,
+weak asset-pick evidence, or no fresh accepted pick after a normally sufficient
+latency window.
+
+The first useful derived sensor is:
+
+```text
+open_gate_no_pick_pause_score =
+  VOL/DVOL gate open
+  + low recent vel_div threshold-cross density
+  + no accepted entry for expected_pick_latency_s
+  + neutral/hostile EsoF/ExoF/MARAS context
+  + no evidence of stale scans or halted runtime
+```
+
+This must not be treated as an urgent kill switch by default. It is a
+carefulness parameter: VIBRISS should first log it, correlate it with later
+trade quality, and test whether it predicts profitable trade pauses or smaller
+position sizing. The baseline is no pause beyond current gate logic.
+
+Related empirical TODOs:
+
+- Reconsider `min_irp_alignment=0.0` empirically. The live gold config disables
+  the IRP alignment filter, but the larger current corpus may now be sufficient
+  to retest whether a nonzero IRP alignment floor improves asset-pick quality.
+- Examine whether the apparent `VOL open / no immediate pick` condition is a
+  useful trade-pause state or simply the expected effect of the stricter
+  effective signal-strength gate (`vel_div` below roughly `-0.03`).
+- Initial live observation: recent quiet after the last known good picks appears
+  protective rather than broken. This must be tested with opportunity cost:
+  measure what the system avoided during quiet periods and what it missed by not
+  entering.
+- Examine whether MARAS composite hashes need more granularity: more distinct
+  market-descriptive buckets while preserving the sortable scalar hash and
+  nearest-neighbor/similarity behavior.
+
+### 10.5 Capital-Protect / Profit-Lock
+
+Fourth target:
+
+- `capital.protect_arm_threshold_pct`
+- `capital.protect_full_threshold_pct`
+- `capital.protect_tp_min_multiplier`
+- `capital.protect_cubic_coeff`
+- `capital.protect_reset_drawdown_pct`
+- `capital.protect_hysteresis_bars`
+- reset family selector: `capital.protect_reset_mode`
+- time-based reset controls: `capital.protect_reset_time_trades`, `capital.protect_reset_time_seconds`
+- regime/hash reset controls: `capital.protect_reset_regime_whitelist`, `capital.protect_reset_fingerprint_whitelist`
+- sc-EsoF reset controls: `capital.protect_reset_sc_floor`, `capital.protect_reset_sc_neutral_floor`, `capital.protect_reset_sc_positive_floor`
+
+This is the profit-protect / peak-lock family. The idea is not to mute risk
+management, but to preserve capital once the day/session has already become
+meaningfully profitable. The study must test whether a gain threshold such as
+`1.2%`, `2.3%`, `3.3%`, ... should arm a more conservative TP posture for
+subsequent trades, and whether a cubic trim on the TP multiplier is better than
+an abrupt step change.
+
+Required policy questions:
+
+- what profit threshold should arm the protect state
+- how quickly TP should tighten once the threshold is crossed
+- whether the tighten curve should be cubic, stepped, or mixed
+- when the protect state must reset
+- how much drawdown from the protected peak is required to disarm
+- how many bars/trades of hysteresis are needed before a reset is valid
+- whether reset should be keyed to time, regime, known fingerprint, sc-EsoF, or mixed logic
+- whether reset should use a whitelist gate or a change-detection gate for regime/fingerprint families
+
+The baseline reset rule should be conservative:
+
+- arm only after the gain threshold is crossed on the recursive capital curve
+- keep the lock until a real drawdown-from-peak or day/session reset occurs
+- do not reset on a single noisy bar if the protected peak is still intact
+
+This target must be evaluated against:
+
+- recursive capital-curve delta after opportunity cost
+- clipped-winner cost from over-tightening
+- saved-loss from avoiding giveback after the day is already up
+- win-return statistics after the arm event
+- ceiling-violation count, because the profit protect should never create an
+  implicit max-loss escape hatch
+
+It is especially important to compare:
+
+- flat threshold steps vs cubic tightening
+- no hysteresis vs bar-count hysteresis
+- immediate reset vs drawdown-based reset
+- day-reset vs rolling-session reset
+
+The tape should be replayed on the same capital curve used by the live engine,
+so the protect state is evaluated recursively, not from a fixed post-hoc label.
+
+### 10.6 OB Cascade TP-Modulation (added 2026-06-12, LINK 5e05eeeb post-mortem)
+
+Candidate carefulness-critical target — the parameters of the OB
+tail-avoidance layer in `alpha_exit_manager.evaluate()` that silently
+modulate the "fixed" TP:
+
+- `ob_cascade.count_threshold` — number of assets withdrawing liquidity
+  (depth withdrawal velocity < CASCADE_THRESHOLD) required to enter cascade
+  mode. **Currently hardcoded as `cascade_count > 0`, i.e. a SINGLE asset
+  anywhere in the tracked set widens every open trade's TP by x1.40.** The
+  LINK 5e05eeeb diagnosis (2026-06-11, -$1,248.71) showed this trigger is
+  active on a large fraction of trades because entries occur during panics
+  by construction. Domain candidates: {1, 2, 3, n_assets//4, n_assets//2};
+  fallback_baseline: 1 (current behavior).
+- `ob_cascade.tp_widen_factor` — currently hardcoded 1.40. Population
+  evidence (post-2026-05-11 cohort): widening earned ~+$84.7K on
+  continuation trades vs ~-$16.9K given back on reversals, so the factor is
+  net-positive but fat-left-tailed. Domain: [1.0 .. 1.6]; 1.0 = modulation
+  off.
+- `ob_cascade.withdrawal_velocity_threshold` — `CASCADE_THRESHOLD` in
+  `ob_features.py`, currently -0.10 (10% depth pulled over lookback).
+
+The required sensors have existed since 2026-06-12: `dynamic_tp_pct`,
+`tp_mod_factor`, `cascade_count`, `ob_regime_signal`, and `tp_floor_armed` are
+logged on every `dolphin.v7_decision_events` row, so reward attribution can
+be computed offline from the live tape with no new instrumentation.
+
+INTERPLAY (REQUIRED reading for the paramset author): these parameters
+interact with (a) the TP_FLOOR profit-floor ratchet (2026-06-12,
+`DOLPHIN_TP_FLOOR`) which caps the left tail of the widening — reward must
+be computed on the JOINT policy (widen + floor), not the widen alone; and
+(b) §10.1 Conditional Fast TP / the future ADAPTIVE TP THRESHOLD ("Dynamic
+TP"): the adaptive TP threshold itself is hereby marked FIT FOR VIBRISS
+GOVERNANCE — the effective TP should ultimately be one governed surface
+(base x leverage-curve x market-state x cascade modulation), with VIBRISS
+owning the modulation terms and the champion base (0.20%) remaining frozen
+outside governance. A VIOLET-era sub-second exit guard changes the
+actuation latency of both TP and floor; cadence is therefore a context
+feature, not a governed parameter, per the data-cadence operator rule.
+
+## 11. First Concrete ParamSet: ADVSL Hold Substitute
+
+### 11.1 Objective
+
+This is the first concrete VIBRISS use case.
+
+The parameter set replaces a static ADVSL no-arm / min-hold rule with a bounded,
+evidence-scored hold target. The original research problem was the legacy
+`20`-bar hold window: it protects winners from premature ADVSL exits, but it can
+also let fast adverse trades slip through before the floor arms. Replay work
+found that shorter centers, especially around `12` bars, can protect capital in
+tail events, while longer holds can be correct in snapback/recovery pockets.
+
+The VIBRISS answer is not "always use 12" and not "always use 20." It is:
+
+- choose a hold target from a bounded set,
+- condition the choice on current trade/path/regime sensors,
+- score it by recursive capital-curve impact after opportunity cost,
+- keep catastrophic loss floors outside the learner as non-negotiable safety.
+
+The sweep geometry itself is also a VIBRISS parameter. The ParamSet may carry a
+global sweep window plus per-regime/per-hash sweep windows in `sweep_policy`.
+When the derived best band touches the search window boundary, treat that as a
+signal that the search is still censored by the current bounds, not as proof
+that the optimum is "wide open." In that case, expand the admissible sweep
+window and re-evaluate before promoting the range.
+
+### 11.2 ParamSet Identity
+
+```yaml
+param_set:
+  id: advsl.hold_substitute.v1
+  name: ADVSL Hold Substitute
+  status: shadow_first
+  namespace_default: blue
+  consumer: advanced_sl
+  decision_family: exit_risk_timing
+  replaces:
+    - legacy_advsl_min_hold_bars_20
+  related_live_controls:
+    - advsl.base_catastrophic_floor_pct
+    - advsl.overlay_catastrophic_floor_pct
+    - advsl.overlay_max_loss_usd
+    - advsl.overlay_pressure_min
+    - advsl.overlay_mae_risk_min
+```
+
+This spec governs the hold/arming decision only. It may recommend when ADVSL
+is allowed to arm, but it must not remove the catastrophic floor.
+
+### 11.3 ParamSet Config and Parameters
+
+Shared ParamSet config:
+
+```yaml
+paramset_config:
+  consumer: advanced_sl
+  decision_family: exit_risk_timing
+  placement:
+    decision_point: trade_entry
+    live_replacement_rhythm: capture_on_entry
+    intratrade_change_policy: shadow_only
+  outputs:
+    hz_key: DOLPHIN_FEATURES.vibriss_hold_substitute_advice
+    decision_table: dolphin.vibriss_decisions
+    reward_table: dolphin.vibriss_rewards
+
+param_defaults:
+  learner:
+    type: discounted_ucb
+    contextual_shadow_branch: linucb
+    nonstationarity: sliding_window
+    window_trades: 300
+  safety:
+    fallback_baseline: 12
+    max_exploration_rate: 0.0
+    min_shadow_samples: 200
+    min_live_confidence: 0.80
+  reward_mapping:
+    primary_metric: recursive_capital_curve_delta_after_opportunity_cost
+    bounded_range: [-1.0, 1.0]
+  guardrails:
+    stale_obf_policy: ignore_obf_features
+    low_maras_confidence_policy: shrink_to_global_prior
+    drawdown_alarm_policy: freeze_to_safe_baseline
+```
+
+Primary learned parameter:
+
+```yaml
+params:
+  advsl.min_hold_bars_before_floor_arm:
+    type: integer
+    units: bars
+    baseline_reference: 20
+    starting_center: 12
+    current_live_overlay_reference: 6
+    default: 12
+    domain:
+      candidates: [4, 6, 8, 10, 12, 14, 16, 20, 24, 28, 34, 40]
+      hard_min: 0
+      hard_max: 48
+```
+
+Companion deterministic guardrails:
+
+```yaml
+params:
+  advsl.max_loss_usd_floor:
+    type: float
+    units: usd
+    default_overlay: 500.0
+    research_candidate: 400.0
+    learner_controlled: false
+
+  advsl.catastrophic_floor_pct:
+    type: float
+    units: pct
+    default_base: 0.0120
+    default_overlay: 0.0050
+    learner_controlled: false
+
+  advsl.recovery_extension_max_bars:
+    type: integer
+    units: bars
+    default: 0
+    domain:
+      candidates: [0, 4, 8, 12, 20, 34]
+      hard_min: 0
+      hard_max: 40
+    learner_controlled: shadow_only_until_validated
+    safety:
+      min_shadow_samples: 500
+      min_live_confidence: 0.90
+```
+
+Interpretation:
+
+- `baseline_reference=20` keeps the legacy 20-bar hold as the research baseline.
+- `starting_center=12` is the current replay-derived center.
+- `current_live_overlay_reference=6` records the tightened overlay state and
+  must be reported separately from the legacy 20-bar research baseline.
+- `34` and `40` remain candidates because contiguous-region medians observed
+  during replay included materially longer optima.
+
+### 11.4 Required Sensors
+
+The hold substitute must use point-in-time sensors only. End-of-trade labels may
+be used for reward calculation, not for action selection.
+
+Core context sensors:
+
+| Sensor | Source | Use |
+|---|---|---|
+| `asset` | live trade state | Asset-level prior and OBF join key. |
+| `side` | live trade state / EFSM | Separate SHORT base from EFSM-flipped LONG contexts. |
+| `bars_held` | live trade state | Determines current arming progress. |
+| `entry_price` / `current_price` | live trade state | Signed path and current PnL. |
+| `post_gross_path_pct` | trade path replay/live path state | Measures post-entry excursion shape. |
+| `mae_pct` | live path state | Adverse excursion severity. |
+| `mfe_pct` | live path state | Favorable excursion and recovery potential. |
+| `mfe_decay` | derived from MFE/current PnL | Detects giveback and weakening recovery. |
+| `current_pnl_mfe_frac` | derived from current PnL / MFE | Indicates whether recovery is intact or mostly lost. |
+| `v7_exit_pressure` | `v7_decision_events` / live V7 snapshot | Pressure/continuation signal for recovery-unlikely cases. |
+| `v7_mae_risk` | V7 snapshot | Separates ordinary drawdown from risk-tier drawdown. |
+| `v7_action` | V7 snapshot | EXIT/RETRACT/EXTEND/HOLD context. |
+| `state_confidence` | market-state / MARAS / bundle confidence | Low confidence forces conservative fallback. |
+
+OBF sensors:
+
+| Sensor | Source | Use |
+|---|---|---|
+| `obf_depth_1pct_usd` | `obf_universe_latest` / OBF CH | Recovery-capacity and liquidity depth. |
+| `obf_depth_quality` | OBF derived quality | Distinguishes deep snapback pockets from weak-book grinds. |
+| `obf_spread_bps` | OBF | Penalizes bad microstructure. |
+| `obf_imbalance` | OBF | Directional liquidity pressure. |
+| `obf_imbalance_ma5` / `obf_imbalance_ma10` | OBF derived path | Smooths raw book pressure for in-trade TP/SL context. |
+| `obf_imbalance_slope` | OBF derived path | Detects whether pressure is strengthening or fading. |
+| `obf_imbalance_persistence` | OBF derived path | Measures sign stability rather than one-tick noise. |
+| `obf_imbalance_reaccel` | OBF derived path | Detects renewed pressure after a mid-trade weakening/plateau. |
+| `obf_staleness_s` | OBF timestamp | Guardrail; stale OBF cannot steer hold. |
+
+Regime sensors:
+
+| Sensor | Source | Use |
+|---|---|---|
+| `maras_regime` | `maras_latest` / `maras_fingerprint` | Label-level bias only, never hard filter. |
+| `maras_composite_hash` | MARAS Scope B | Exact historical hash prior when sample size is enough. |
+| `maras_scalar_hash` | MARAS Scope A | Coarse sortable regime prior. |
+| `maras_confidence` | MARAS | Low confidence reduces live trust. |
+| `maras_conflict_level` | MARAS | High conflict increases uncertainty/exploration penalty. |
+| `s_eigen_vd`, `s_eigen_w50`, `s_eigen_w750` | MARAS raw signature | Eigen-state context. |
+| `s_btc_dev_pct`, `raw_btc_ma99` | MARAS BTC tier | Trend/uptrend/downtrend pressure context. |
+| `s_acb_boost`, `s_acb_beta` | MARAS/ACB | Protective/risk-on context. |
+
+Outcome-only reward sensors:
+
+| Sensor | Source | Use |
+|---|---|---|
+| `actual_exit_pnl` | `trade_events` | Realized baseline outcome. |
+| `counterfactual_exit_pnl_by_hold` | tape replay | Arm-level reward. |
+| `recovery_lag_s` | tape replay | Time to recover after floor/cut. |
+| `extra_bars_to_recovery` | tape replay | Cost of too-short hold. |
+| `clipped_winner_delta` | tape replay | Opportunity cost of premature exit. |
+| `saved_loss_delta` | tape replay | Loss avoided by earlier floor arm. |
+| `capital_curve_delta` | recursive replay | Primary reward accounting. |
+
+### 11.5 Feature Construction
+
+VIBRISS should compute a compact feature vector from the sensors:
+
+```text
+path_speed = abs(post_gross_path_pct) / max(1, bars_held)
+mae_velocity = mae_pct / max(1, bars_held)
+mfe_velocity = mfe_pct / max(1, bars_held)
+recovery_ratio = current_pnl_mfe_frac
+giveback_ratio = 1.0 - current_pnl_mfe_frac
+liquidity_score = f(obf_depth_1pct_usd, obf_depth_quality, obf_spread_bps)
+signed_obf_imbalance = side_sign * obf_imbalance
+imbalance_confirmation = f(signed_obf_imbalance_ma5, persistence, slope)
+imbalance_reacceleration = f(prior_weakening, current_signed_slope, persistence)
+pressure_score = f(v7_exit_pressure, v7_mae_risk, v7_action)
+regime_key = maras_composite_hash if sample_count(maras_composite_hash) >= min_hash_n else maras_regime
+confidence_weight = min(state_confidence, maras_confidence) * (1.0 - maras_conflict_level)
+```
+
+Feature requirements:
+
+- All features must be point-in-time.
+- Missing OBF must not become zero-depth unless zero-depth is the actual
+  observation. Missing OBF is its own mask feature.
+- MARAS labels are context, not filters. Use hash/sample priors and raw
+  signature dimensions where possible.
+- Side must be explicit. EFSM-flipped LONG trades cannot share a blind SHORT
+  prior.
+- OBF imbalance must be side-normalized. For a SHORT, negative raw imbalance is
+  confirming; for a LONG, positive raw imbalance is confirming.
+- Raw imbalance is not enough. Use moving averages, persistence, slope, and
+  re-acceleration after weakening so a single noisy tick cannot steer ADVSL.
+
+### 11.5.1 OBF Imbalance Assistance Research
+
+Live ENJUSDT observation on `2026-06-04` motivates an explicit research feature
+family for ADVSL/TP assistance. The trade entered SHORT near `10:06:14 UTC` and
+closed `FIXED_TP` near `10:10:11 UTC` for `+$118.53`.
+
+Observed OBF path:
+
+- entry imbalance was near neutral (`~ -0.015` to `+0.001`);
+- within seconds it snapped SHORT-confirming (`~ -0.18` to `-0.21`);
+- mid-trade it weakened and oscillated around neutral in 30s buckets;
+- into TP it re-strengthened materially (`~ -0.30` to `-0.35`).
+
+Conclusion:
+
+- Imbalance did not monotonically increase from entry to exit.
+- It behaved as a confirmation/re-acceleration signal: neutral -> confirming
+  pressure -> weakening/plateau -> renewed confirming pressure into TP.
+- Therefore VIBRISS should not use raw imbalance as a simple exit trigger.
+
+Candidate uses:
+
+| Use | Candidate rule |
+|---|---|
+| TP assist | If price is near TP and side-normalized imbalance re-accelerates in favor, avoid premature ADVSL/retract exits. |
+| SL/ADVSL assist | If adverse PnL appears and side-normalized imbalance persistently contradicts the trade, recovery probability should shrink. |
+| Hold assist | If imbalance is neutral/choppy but not contradictory, do not force an exit from imbalance alone. |
+| Floor timing | Combine `price_progress_to_tp * imbalance_confirmation` with MAE/MFE path shape to decide whether the floor should wait or arm. |
+
+Candidate feature names:
+
+```text
+imbalance_signed_for_trade
+imbalance_ma5_signed
+imbalance_ma10_signed
+imbalance_slope_signed
+imbalance_persistence_signed
+imbalance_reacceleration_after_weakening
+price_progress_to_tp_x_imbalance_confirmation
+adverse_pnl_x_imbalance_contradiction
+```
+
+Research requirement: replay this across completed trades before live use. Score
+it by recursive capital delta after opportunity cost, not by whether it explains
+one ENJ winner.
+
+### 11.5.2 Macro-Thesis Persistence vs Local Danger Research
+
+Live XLMUSDT observation on `2026-06-04` motivates a mandatory ADVSL/VIBRISS
+research direction. The trade suffered a large adverse excursion before closing
+at `FIXED_TP`. Local OBF imbalance and V7 pressure were frightening during the
+worst MAE; they did not cleanly foresee the recovery. The higher-level
+eigen/MARAS context, however, stayed coherent with the trade thesis: bearish or
+choppy-bearish posture, low conflict, active dislocation, and bearish BTC
+context.
+
+Actionable lesson to test to exhaustion:
+
+```text
+ADVSL/V7 local danger should be overruled only when macro thesis persistence
+remains strong, MARAS conflict/novelty remains low, and OBF contradiction is not
+persistent/deep enough to invalidate the thesis.
+```
+
+This is not a live rule yet. It is a research requirement for the first
+VIBRISS-governed ADVSL/bar-hold policy. The learner must explicitly measure
+when local pain is a true invalidation signal versus when it is survivable
+excursion inside a still-valid macro/eigen thesis.
+
+The required research output is a weighting model, not a binary exception. The
+policy must estimate how much authority belongs to local danger signals versus
+macro-thesis persistence under the current context. Those weights are themselves
+VIBRISS-tunable parameters and must be represented in the ParamSet spec with
+safe defaults, bounded candidate ranges, promotion rules, and audit logging.
+
+Candidate feature names:
+
+```text
+macro_thesis_persistence
+maras_conflict_low_during_mae
+maras_hash_knownness_during_mae
+eigen_dislocation_persistence_during_mae
+btc_context_alignment_during_mae
+local_obf_contradiction_persistence
+local_obf_contradiction_depth_weighted
+v7_pressure_without_macro_invalidation
+adverse_move_vs_macro_persistence
+late_recovery_obf_reacceleration
+```
+
+Candidate tunable parameters:
+
+```text
+local_danger_weight
+macro_thesis_weight
+obf_contradiction_weight
+maras_conflict_weight
+eigen_persistence_weight
+btc_context_weight
+v7_pressure_weight
+macro_override_min_confidence
+local_invalidation_min_persistence_bars
+```
+
+The initial decision form should be simple and auditable:
+
+```text
+local_danger_score =
+    local_danger_weight * v7_pressure
+  + obf_contradiction_weight * local_obf_contradiction_persistence
+  + maras_conflict_weight * maras_conflict_or_novelty
+
+macro_thesis_score =
+    macro_thesis_weight * macro_thesis_persistence
+  + eigen_persistence_weight * eigen_dislocation_persistence_during_mae
+  + btc_context_weight * btc_context_alignment_during_mae
+
+hold_or_cut_bias = macro_thesis_score - local_danger_score
+```
+
+VIBRISS may tune the weights, but guardrails must prevent pathological behavior:
+local danger cannot be ignored at extreme MAE, and macro thesis cannot override
+persistent high-depth OBF contradiction plus MARAS conflict/novelty.
+
+Required tests:
+
+- replay all completed trades with this feature family available point-in-time;
+- isolate high-MAE trades that later TP'd from high-MAE trades that continued
+  into real loss;
+- charge every delayed cut for worst-case tail loss and every early cut for
+  missed recovery/opportunity cost;
+- evaluate separately for base SHORTs and EFSM/overlay-flipped LONGs;
+- report per-MARAS-hash, per-label, and nearest-neighbor raw-signature results;
+- report learned/suggested weights and their stability by contiguous region,
+  MARAS hash, side, and asset-liquidity bucket;
+- promote only if held-out contiguous regions improve recursive capital delta
+  without hiding clipped winners or worse tail events.
+
+### 11.5.3 Macro/OBF Evidence Hierarchy Research
+
+Live DASHUSDT observations on `2026-06-04` add a third case study to the XLM
+and ETC findings. DASH produced two fast SHORT `FIXED_TP` trades, including
+`efcc6dce`, which entered near `11:00:15 UTC` and closed near `11:00:38 UTC`
+after only `2` bars for `+$367.92`.
+
+The large DASH trade was not a scary hold-through-MAE case:
+
+- V7 recorded `mae = 0` for the trade path;
+- entry `vel_div` was extreme (`~ -0.2463`);
+- MARAS at entry was `BEARISH`, low conflict, composite hash `58981`;
+- BTC context remained bearish (`s_btc_above_ma99 = 0`);
+- OBF imbalance initially leaned against the SHORT, then flipped materially
+  SHORT-confirming during the price break.
+
+This suggests an evidence hierarchy that must be tested explicitly:
+
+```text
+macro/eigen OK + OBF confirms
+  > macro/eigen OK + OBF neutral/choppy
+  > macro/eigen OK + OBF counters transiently but then flips confirming
+  > macro/eigen OK + OBF persistently counters with depth
+  > macro/eigen weak/conflicted regardless of OBF
+```
+
+The hierarchy is not a live rule. DASH shows that a very strong macro/eigen
+impulse can overcome early OBF contradiction when the contradiction is shallow
+or transient. ETC shows the stronger case, where OBF remained SHORT-confirming
+through adverse price movement. XLM shows the weaker/riskier case, where macro
+thesis persistence carried the trade while OBF was ugly at the worst point.
+
+Candidate features:
+
+```text
+macro_obf_alignment_class
+macro_extreme_impulse_score
+obf_counter_transience_bars
+obf_counter_depth_weighted
+obf_flip_to_confirmation_latency_s
+obf_confirmation_after_macro_impulse
+macro_ok_obf_confirm_weight
+macro_ok_obf_counter_weight
+macro_extreme_overrides_obf_counter_weight
+```
+
+Required tests:
+
+- rank outcomes by `macro_obf_alignment_class`;
+- compare `macro OK + OBF confirm` against `macro OK + OBF counter`;
+- split OBF counter cases into transient, shallow, persistent, and
+  depth-weighted contradiction;
+- measure whether OBF flip-to-confirmation latency predicts TP speed;
+- report whether extreme `vel_div` can safely receive more weight than early
+  OBF contradiction, and where that becomes unsafe;
+- expose the learned hierarchy weights as VIBRISS-tunable parameters, not
+  hardcoded doctrine.
+
+### 11.5.4 Falling-Knife / Missing-Bounce-Sensor Case Study
+
+Live LTCUSDT observation on `2026-06-04` (`c0139cea`) adds an open/pending case
+study for the opposite side of the DASH impulse capture. The trade entered SHORT
+near `11:15:12 UTC` with extreme entry `vel_div` (`~ -0.1942`) and high notional,
+but subsequently showed severe adverse excursion and no meaningful favorable
+excursion at the time of review. V7 also emitted repeated `RETRACT`
+recommendations, but V7 pressure is not treated as truth by itself; XLM showed
+that V7 can scream during a trade that later recovers profitably.
+
+Observed at review time:
+
+- `inverse_ars_bounce_shadow` was stale; latest row was
+  `2026-06-03 18:42:26 UTC`, so the bounce detector was not assisting live;
+- V7 repeatedly emitted `RETRACT / V7_RISK_DOMINANT`, which is local-pain
+  evidence only;
+- V7 observed `mae ~ 0.854%`, `mfe = 0`, and `exit_pressure = 3`;
+- OBF was mostly neutral/choppy with weak, oscillating side-normalized evidence,
+  not a strong rescue signal;
+- MARAS/BTC remained broadly bearish/low-conflict, but recent eigen values were
+  intermittent rather than steadily thesis-confirming.
+
+Research meaning:
+
+```text
+macro/eigen entry impulse alone is insufficient when local danger is extreme,
+MFE remains zero, OBF does not confirm, and the bounce/inverse-risk sensor is
+missing or stale.
+```
+
+V7 pressure must be weighted conditionally:
+
+```text
+V7 pressure is discounted when macro thesis remains strong, OBF confirms, and
+MFE exists.
+
+V7 pressure receives more weight only when independent local invalidation
+features agree: zero MFE, rising MAE, neutral/counter OBF, stale/missing bounce
+sensor, macro impulse decay, or MARAS conflict/novelty.
+```
+
+Candidate features:
+
+```text
+bounce_sensor_freshness_s
+bounce_sensor_missing_mask
+extreme_macro_without_mfe
+v7_retract_persistence_bars
+zero_mfe_high_mae_flag
+obf_neutral_or_counter_during_mae
+macro_impulse_decay_after_entry
+```
+
+Required replay treatment:
+
+- stale/missing bounce data must be an explicit mask feature, not an assumed
+  neutral score;
+- compare extreme-entry trades that get early MFE against extreme-entry trades
+  with zero MFE and rising MAE;
+- treat persistent V7 `RETRACT` as a local-danger amplifier only when confirmed
+  by independent invalidation sensors such as stale bounce, zero MFE, rising
+  MAE, neutral/counter OBF, or macro impulse decay;
+- only promote a macro override if it survives this LTC-style case family after
+  opportunity-cost and tail-loss accounting.
+
+### 11.6 Learning / Computing Model
+
+V1 should use a two-layer policy:
+
+1. Prior/posture estimator:
+   - computes candidate priors from historical replay by MARAS composite hash,
+     MARAS label, asset, side, and contiguous time region.
+   - uses shrinkage: hash prior -> label prior -> global prior.
+   - initializes the hold target near `12` bars unless the context prior has
+     enough evidence to move it.
+
+2. Online contextual bandit:
+   - learner: discounted LinUCB or LinTS over finite hold-bar arms.
+   - arms: `[4, 6, 8, 10, 12, 14, 16, 20, 24, 28, 34, 40]`.
+   - reward: delayed until trade close or replay terminal.
+   - discount/window: sliding 300 closed trades, plus faster decay when drift is
+     detected.
+   - exploration: shadow-only by default; live exploration cap starts at `0`.
+
+Recommended fallback if contextual coverage is sparse:
+
+```text
+if hash_sample_n >= 30:
+    prior = median_best_hold_for_hash
+elif label_side_sample_n >= 100:
+    prior = median_best_hold_for_label_side + label_bias
+else:
+    prior = 12
+
+advice = guardrail_filter(contextual_bandit(prior, candidates))
+```
+
+Optional recovery model:
+
+- Train a survival model for `extra_bars_to_recovery`.
+- Use it only as a veto/adjuster until validated.
+- It may increase hold only when recovery probability is high and expected
+  extra hold is short.
+
+### 11.7 Success Definition
+
+Primary success metric:
+
+```text
+recursive_capital_curve_delta_after_opportunity_cost
+```
+
+This means the replay must account for saved capital compounding forward, and
+must subtract the opportunity cost of trades that would have recovered or won
+after a premature floor/ADVSL action.
+
+Secondary metrics:
+
+- net PnL delta
+- ROI delta
+- max drawdown delta
+- tail-loss count and severity
+- number of hard/floor cuts
+- number of clipped winners
+- gross saved loss
+- gross missed upside
+- average and median recovery lag
+- average and median extra bars to recovery
+- TP near-miss count, TP near-miss recovery lag, and first-touch TP hit rate
+- per-hash and per-label stability
+- out-of-distribution (OOD) region performance
+- worst contiguous-region degradation
+- explicit ceiling-violation count and worst single-loss size under the tested
+  policy, because a "best" replay result is not acceptable if it breaches the
+  operator's declared loss ceiling
+
+Promotion requires:
+
+- positive recursive capital-curve delta on held-out contiguous regions,
+- no unacceptable increase in clipped-winner opportunity cost,
+- no hidden dependence on a single asset or single MARAS hash,
+- improvement or neutral behavior on the EFSM-flipped LONG subset,
+- deterministic replay reproducibility,
+- shadow logging coverage sufficient for off-policy evaluation (OPE).
+
+### 11.8 Calibration Protocol
+
+Calibration must run in this order:
+
+1. Full-tape replay:
+   - evaluate every candidate hold arm on every eligible historical trade path.
+   - include all available BLUE/PINK/PRODGREEN executed trade history only when
+     namespace semantics are kept separate.
+
+2. Capital-aware replay:
+   - recursively recompute capital after each counterfactual exit.
+   - preserve position sizing geometry when the saved/lost capital changes the
+     subsequent notional.
+
+3. Opportunity-cost audit:
+   - for every floor/ADVSL cut, measure whether the trade later recovered.
+   - record recovery lag, extra bars, and missed PnL.
+
+4. Region validation:
+   - split into contiguous time regions with enough trades.
+   - repeat with moving/randomized boundaries.
+   - report median/best hold per region.
+
+5. MARAS proximity validation:
+   - group by composite hash when sample size is enough.
+   - otherwise use nearest-neighbor distance over MARAS raw signature fields.
+   - report whether per-hash/per-neighbor priors outperform the global 12-bar center.
+
+6. OBF validation:
+   - bind optimum hold to `obf_depth_1pct_usd`, `obf_depth_quality`, spread, and
+     imbalance.
+   - test on OOD time slices; do not promote an OBF rule from in-sample fit only.
+
+7. TP near-miss validation:
+   - include trades that nearly touched candidate TP but missed on the observed
+     cadence.
+   - compute first-touch labels from the highest-resolution available path.
+   - isolate the opportunity cost of late reversal after near-touch.
+   - compare the resulting TP bucket against the profitable-close-only sample.
+
+8. Walk-forward:
+   - train on region N, validate on N+1.
+   - repeat across the full history.
+   - freeze the learner if the current best policy degrades versus baseline.
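+
+A minimal sketch of the capital-aware replay in step 2. It assumes
+fixed-fraction sizing (`size_fraction`) as a stand-in for the real position
+sizing geometry, which this spec does not define. `ReplayTrade`,
+`capital_curve`, and `recursive_capital_delta` are illustrative names, not
+existing VIBRISS APIs.
+
+```python
+from dataclasses import dataclass
+from typing import List, Sequence
+
+
+@dataclass(frozen=True)
+class ReplayTrade:
+    """One historical trade, as fractional returns on notional (hypothetical shape)."""
+    actual_return: float          # return of the exit that really happened
+    counterfactual_return: float  # return under the candidate hold arm
+
+
+def capital_curve(trades: Sequence[ReplayTrade], start_capital: float,
+                  size_fraction: float, use_counterfactual: bool) -> List[float]:
+    """Recompute capital after every exit so later notionals reflect earlier PnL."""
+    capital = start_capital
+    curve = []
+    for trade in trades:
+        # Assumption: notional is a fixed fraction of current capital.
+        notional = capital * size_fraction
+        ret = trade.counterfactual_return if use_counterfactual else trade.actual_return
+        capital += notional * ret
+        curve.append(capital)
+    return curve
+
+
+def recursive_capital_delta(trades: Sequence[ReplayTrade],
+                            start_capital: float = 10_000.0,
+                            size_fraction: float = 0.2) -> float:
+    """End-capital difference between the counterfactual and actual curves."""
+    if not trades:
+        return 0.0
+    actual = capital_curve(trades, start_capital, size_fraction, False)
+    counterfactual = capital_curve(trades, start_capital, size_fraction, True)
+    return counterfactual[-1] - actual[-1]
+```
+
+With two trades, a smaller first loss also enlarges the notional of the second
+trade, so the recursive delta differs from a plain sum of per-trade PnL
+differences. That compounding effect is what step 2 is meant to capture.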
+
+### 11.9 Advice Payload
+
+Example advice:
+
+```json
+{
+  "schema": "vibriss.param_set_advice.v1",
+  "namespace": "blue",
+  "param_set_id": "advsl.hold_substitute.v1",
+  "spec_version": "1.0.0",
+  "trade_scope": "on_entry",
+  "baseline_reference": 20,
+  "current_live_overlay_reference": 6,
+  "recommended": {
+    "advsl.min_hold_bars_before_floor_arm": 12,
+    "advsl.recovery_extension_max_bars": 0
+  },
+  "candidate_set": [4, 6, 8, 10, 12, 14, 16, 20, 24, 28, 34, 40],
+  "confidence": 0.74,
+  "context": {
+    "asset": "XLMUSDT",
+    "side": "LONG",
+    "maras_composite_hash": 57957,
+    "maras_regime": "CHOPPY_BEARISH",
+    "obf_depth_quality_bucket": "weak",
+    "v7_pressure_bucket": "high"
+  },
+  "guardrail_status": "SHADOW_ONLY",
+  "fallback_value": 12,
+  "expires_at": "2026-06-03T00:05:00Z"
+}
+```
+
+### 11.10 Guardrails
+
+Mandatory guardrails:
+
+- Shadow-only until walk-forward validation is positive.
+- No live exploration by default.
+- Do not allow the learner to disable catastrophic floors.
+- If OBF is stale, ignore OBF-derived hold extension.
+- If MARAS confidence is low or conflict is high, shrink toward global prior.
+- If context is EFSM-flipped LONG and LONG sample count is sparse, use the
+  tighter safe prior, not a broad SHORT-derived prior.
+- If the recommended hold would increase worst-case open loss beyond the active
+  floor/cap, the floor/cap wins.
+- If capital drawdown alarm is active, freeze to deterministic safe baseline.
+
+### 11.11 Starting Priors From Current Research
+
+Current replay-derived starting posture:
+
+| Context | Starting prior | Rationale |
+|---|---:|---|
+| Global ADVSL hold substitute | `12` bars | Best current center for reducing 20-bar tail slips without assuming all contexts need long waits. |
+| Legacy baseline comparison | `20` bars | Historical no-arm/min-hold reference. |
+| Tight overlay reference | `6` bars | Current live overlay guardrail reference, not the general learned policy. |
+| Recovery/snapback pockets | `24` to `40` bars | Some contiguous-region medians were materially longer; keep as candidates, not defaults. |
+| Sparse/unknown context | `12` bars | Conservative research center with shrinkage. |
+| EFSM-flipped LONG sparse context | `6` to `12` bars | Do not borrow broad SHORT recovery priors blindly. |
+
+Known caution:
+
+- A `$400` hard cap improved one capital-aware slice by about `+$592.83` versus
+  the 12-bar-only replay, but generated a gross forgone-upside bucket around
+  `+$6,617.30` on hard-cap hits. Therefore max-loss floors must be evaluated
+  with opportunity cost and recovery lag, not judged by saved-loss totals alone.
+
+### 11.12 Promotion Policy
+
+Promotion is part of this ParamSet, not a global runner decision.
+
+```yaml
+promotion_policy:
+  owner: advsl.hold_substitute.v1
+  technique: replay_shadow_canary
+  baseline_policy:
+    legacy_reference: 20
+    current_overlay_reference: 6
+    fallback_value: 12
+  cadence:
+    replay_calibration: every_6h_or_50_new_rewards
+    promotion_review: every_30m
+    checkpoint_review: every_60s
+    live_replacement_rhythm: at_trade_entry_only
+  evidence_gates:
+    shadow_to_advisory:
+      min_replay_trades: 300
+      min_contiguous_regions: 4
+      recursive_capital_curve_delta_after_cost: "> 0"
+      worst_region_delta: ">= -0.10 * positive_total_delta"
+      clipped_winner_cost_budget: "documented_and_bounded"
+    advisory_to_canary_live:
+      min_shadow_decisions: 200
+      min_closed_trade_rewards: 50
+      min_days_observed: 3
+      no_unexplained_tail_loss_cluster: true
+      manual_approval_required: true
+    canary_live_to_controlled_live:
+      min_live_consumed_trades: 50
+      live_vs_shadow_regret: "<= 0"
+      no_guardrail_violation: true
+      manual_approval_required: true
+  canary_scope:
+    namespaces: [blue]
+    max_paramsets_live: 1
+    max_live_exploration_rate: 0.0
+    allow_only_capture_on_entry: true
+  automatic_demotion:
+    - stale_obf_or_maras_required_context
+    - reward_backlog_critical
+    - drawdown_alarm
+    - candidate_underperforms_baseline_in_shadow
+    - checkpoint_hash_mismatch
+```
+
+Interpretation:
+
+- `replay_calibration` answers how often the ParamSet re-estimates candidate
+  quality from historical/newly closed data.
+- `promotion_review` answers how often the ParamSet is checked for stronger
+  mode eligibility.
+- `live_replacement_rhythm` answers when the engine may replace the old
+  parameter with the VIBRISS value. For this ParamSet it is only at trade entry.
+- The runner executes this contract. It does not invent promotion thresholds.
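+
+The numeric `shadow_to_advisory` gates above can be checked mechanically. The
+sketch below is illustrative only: `ReplayEvidence` and
+`shadow_to_advisory_failures` are hypothetical names, and it reads
+`positive_total_delta` as the summed region delta once that sum is positive,
+which is an assumption about the contract. The non-numeric
+`clipped_winner_cost_budget` gate is not covered.
+
+```python
+from dataclasses import dataclass
+from typing import List
+
+
+@dataclass(frozen=True)
+class ReplayEvidence:
+    """Replay summary for one ParamSet (hypothetical shape)."""
+    n_replay_trades: int
+    # Recursive capital-curve delta after cost, one value per contiguous region.
+    region_deltas: List[float]
+
+
+def shadow_to_advisory_failures(ev: ReplayEvidence,
+                                min_replay_trades: int = 300,
+                                min_contiguous_regions: int = 4,
+                                worst_region_tolerance: float = 0.10) -> List[str]:
+    """Return the names of failed gates; an empty list means all numeric gates pass."""
+    failures = []
+    if ev.n_replay_trades < min_replay_trades:
+        failures.append("min_replay_trades")
+    if len(ev.region_deltas) < min_contiguous_regions:
+        failures.append("min_contiguous_regions")
+    total = sum(ev.region_deltas)
+    if total <= 0:
+        failures.append("recursive_capital_curve_delta_after_cost")
+    # Assumption: the worst-region bound scales with the (positive) total delta.
+    elif min(ev.region_deltas) < -worst_region_tolerance * total:
+        failures.append("worst_region_delta")
+    return failures
+```
+
+Returning the failed gate names rather than a boolean keeps the rejection
+reason available for the promotion audit trail.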
+
+### 11.13 Meta-Cadence Policy
+
+The cadence parameters are themselves governed by this ParamSet. They are not
+free-floating daemon settings.
+
+```yaml
+meta_cadence_policy:
+  owner: advsl.hold_substitute.v1
+  status: shadow_first
+  learner: discounted_ucb_then_linucb
+  tunable_cadences:
+    replay_calibration_interval_s:
+      baseline: 21600
+      candidates: [1800, 3600, 10800, 21600, 43200]
+    promotion_review_interval_s:
+      baseline: 1800
+      candidates: [900, 1800, 3600, 7200]
+    checkpoint_interval_s:
+      baseline: 60
+      candidates: [30, 60, 120, 300]
+    min_new_rewards_before_recalibration:
+      baseline: 50
+      candidates: [10, 25, 50, 100]
+    shadow_to_canary_cooldown_trades:
+      baseline: 100
+      candidates: [25, 50, 100, 200]
+  context_inputs:
+    maras:
+      - maras_composite_hash
+      - maras_confidence
+      - maras_conflict_level
+      - maras_nearest_distance
+    exof:
+      - exf_latest
+      - btc_regime_features
+      - market_volatility_context
+    esof:
+      - session_bucket
+      - day_of_week
+      - calendar_event_flags
+    ops:
+      - reward_backlog_age_s
+      - ch_write_failure_rate
+      - artifact_disk_free_gb
+      - drawdown_state
+  reward_mapping:
+    positive:
+      - faster_detection_of_degraded_hold_policy
+      - lower_stale_advice_rate
+      - lower_missed_adaptation_cost
+    negative:
+      - promotion_false_positive
+      - noisy_recalibration_churn
+      - excessive_compute_or_backlog
+      - operator_churn
+  live_change_policy:
+    replay_calibration_interval_s: controlled_after_shadow
+    promotion_review_interval_s: advisory_only_until_manual_approval
+    checkpoint_interval_s: fixed_by_ops_until_runner_load_tested
+    shadow_to_canary_cooldown_trades: advisory_only
+```
+
+This makes MARAS, ExoF, and EsoF eligible context for cadence advice. For
+example, VIBRISS may learn that high MARAS novelty plus hostile ExoF context
+requires faster recalibration review, while ordinary stable regimes can use a
+slower cadence to avoid overreacting.
+
+Cadence testing is permitted, but first in shadow:
+
+- log what cadence would have been chosen;
+- replay whether that cadence would have detected degradation sooner;
+- charge compute/backlog cost;
+- charge false-promotion cost;
+- compare against fixed-cadence baseline.
+
+Only after the meta-cadence policy beats fixed cadence in walk-forward replay
+and shadow operation may it control any real scheduler interval.
+
+### 11.14 Catastrophic Floor Derivation Study
+
+The floor percentage is now a dedicated shadow-only VIBRISS research target.
+
+```yaml
+param_set:
+  id: advsl.catastrophic_floor_derivation.v1
+  name: ADVSL Catastrophic Floor Derivation
+  status: shadow_first
+  success:
+    primary_metric: recursive_capital_curve_delta_after_opportunity_cost
+    artifact_kinds: [code, test, spec]
+    artifact_refs:
+      - prod/vibriss/floor_derivation.py
+      - prod/vibriss/test_floor_derivation.py
+      - prod/docs/ADVSL_CATASTROPHIC_FLOOR_DERIVATION_STUDY.md
+      - prod/docs/VIBRISS_PARAMETER_GOVERNANCE_SPEC.md
+```
+
+Current full-tape replay on the blue trade tape:
+
+- replayable trades: `802`
+- actual end capital: `$51,937.21`
+- floor-only best aggregate candidate: `1.50%`
+- floor-only per-regime averages: still centered at `0.50%`
+
+Interpretation:
+
+- this study does **not** validate `1.20%` as a universal standalone floor;
+- it validates the need for a derivation path and the ability to bind the
+  floor to code/test/spec evidence;
+- `1.20%` remains a coupled-policy prior for the broader ADVSL/TP/hold stack,
+  not a floor-only truth.
+
+The floor-only study must remain shadow-only. Live use may only follow a
+coupled policy that demonstrates positive recursive capital curve delta on
+held-out contiguous regions.
+
+### 11.15 Acceptance Tests
+
+Minimum tests before implementation can be called complete:
+
+- Given a fixed replay window, the same hold recommendation and reward are
+  reproduced bit-for-bit or within declared float tolerance.
+- Candidate arms outside the hard range are rejected.
+- Stale OBF creates a masked feature, not a fake zero-depth observation.
+- Low MARAS confidence or high conflict shrinks advice toward the global prior.
+- EFSM-flipped LONG contexts do not use unqualified SHORT-only priors.
+- Capital-aware replay compounds saved/lost capital forward.
+- Opportunity cost is charged when a cut trade later recovers.
+- The shadow advice payload contains candidate set, chosen arm, confidence,
+  baseline, guardrail result, and reproducibility keys.
+- Promotion decisions are rejected when the ParamSet omits `promotion_policy`.
+- Meta-cadence advice is logged as a ParamSet decision, not a runner-local
+  heuristic.
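+
+The stale-OBF test above could be exercised against a feature builder along
+these lines. This is a sketch under stated assumptions: `obf_features` and the
+`max_age_s` staleness threshold are hypothetical, and NaN is used here as the
+mask value only for illustration. The `obf_stale` and `obf_depth_1pct_usd`
+names follow the decision-row fields used elsewhere in this spec.
+
+```python
+import math
+from typing import Dict
+
+
+def obf_features(depth_1pct_usd: float, age_s: float,
+                 max_age_s: float = 5.0) -> Dict[str, float]:
+    """Build OBF features, masking depth when the snapshot is stale.
+
+    Assumption: a snapshot older than `max_age_s` counts as stale.
+    """
+    stale = age_s > max_age_s
+    return {
+        "obf_stale": 1.0 if stale else 0.0,
+        # A stale book is "unknown", not "empty": never report 0.0 depth here.
+        "obf_depth_1pct_usd": math.nan if stale else depth_1pct_usd,
+    }
+```
+
+The explicit `obf_stale` flag lets a learner distinguish a masked value from a
+genuinely thin book, which a zero-filled depth would hide.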
+
+## 12. VIBRISS Ops / Runner System
+
+### 12.1 Operational Objective
+
+VIBRISS must run as an observable production subsystem, not as an ad hoc
+notebook or one-off replay script.
+
+The runner is responsible for:
+
+- loading parameter specs and ParamSet specs,
+- ingesting live context from Hazelcast and historical context from ClickHouse,
+- publishing shadow/advisory parameter postures,
+- scheduling replay/calibration subtasks,
+- writing full audit logs,
+- exposing health sensors to MHS,
+- feeding TUI/observability surfaces,
+- checkpointing learner state so recommendations are reproducible after restart.
+
+The runner must reuse the existing infrastructure pattern:
+
+- supervisord is the process authority;
+- Hazelcast is the live bus;
+- ClickHouse is the audit/event store;
+- NATS is the optional event transport for replay, reward, and policy-state
+  fanout when decoupled workers or durable queues are useful;
+- MHS reads composite health from HZ and reports it in `DOLPHIN_META_HEALTH`;
+- TUI observes primarily through HZ listeners and polls CH only for heavier
+  historical panels;
+- Prefect is optional for scheduled offline jobs, not required for the hot
+  VIBRISS daemon.
+
+### 12.2 Process Topology
+
+VIBRISS should be containerized, but still owned by supervisord.
+In the current production layout, the host supervisord owns only the
+container bootstrap wrapper; the container itself runs its own supervisord
+instance, which owns the live runner process. That makes later full-system
+containerization easier without changing the runner contract.
+
+If sandboxing is enabled, gVisor is the outer runtime boundary for the
+container or worker container. VIBRISS does not instantiate or manage gVisor
+from inside the container; the host/container runtime selects that boundary at
+launch time. The containerized runner must still reach host Hazelcast and
+ClickHouse over the configured backplane. If NATS is enabled, it runs as a
+sibling stack service on the host backplane and the container talks to it over
+`nats://localhost:4222`.
+
+Recommended process shape:
+
+```text
+supervisord
+  -> vibriss_runner container
+       -> live advice loop
+       -> spec loader
+       -> health publisher
+       -> lightweight replay scheduler
+       -> learner checkpoint writer
+
+  -> optional vibriss_worker container(s)
+       -> full-tape replay
+       -> walk-forward validation
+       -> OBF/MARAS proximity calibration
+       -> offline policy evaluation
+```
+
+The live runner is a long-lived daemon. Heavy replay/calibration jobs are
+separate subtasks so the live advice loop cannot be blocked by ML work.
+
+The experiment-side harness that replays trade episodes, sweep ranges, and
+walk-forward windows is specified separately in
+[`VIBRASS_EXPERIMENT_RUNNER_SPEC.md`](VIBRASS_EXPERIMENT_RUNNER_SPEC.md).
+
+Container runtime:
+
+- Docker or Podman is acceptable.
+- Prefer Podman if rootless isolation becomes important.
+- Optional sandbox runtime: gVisor may wrap the launched container or worker
+  container, but it is selected outside VIBRISS by the host/container runtime.
+  VIBRISS must not attempt to manage the sandbox boundary from inside the
+  container.
+- Do not put Hazelcast in the VIBRISS container.
+- Do not restart Hazelcast as part of VIBRISS recovery.
+- Mount large replay outputs to `/mnt/dolphin_training/vibriss/`, not the SMB
+  repo path.
+- Write only small docs/specs to `/mnt/dolphinng5_predict/prod/docs/`.
+
+### 12.3 Supervisor Contract
+
+Recommended supervisord entries:
+
+```ini
+[program:vibriss_runner]
+command=/usr/bin/podman run --rm --name dolphin-vibriss-runner
+        --network host
+        -v /mnt/dolphinng5_predict:/mnt/dolphinng5_predict:ro
+        -v /mnt/dolphin_training/vibriss:/mnt/dolphin_training/vibriss:rw
+        -v /mnt/ng6_data:/mnt/ng6_data:ro
+        -e HZ_HOST=localhost:5701
+        -e CH_URL=http://localhost:8123/
+        -e CH_DB=dolphin
+        dolphin-vibriss:latest
+        python -m vibriss.runner --mode shadow
+directory=/mnt/dolphinng5_predict/prod
+autostart=true
+autorestart=true
+startsecs=10
+startretries=5
+stopwaitsecs=20
+stopasgroup=true
+killasgroup=true
+stdout_logfile=/mnt/dolphin_training/vibriss/logs/supervisor/vibriss_runner.log
+stderr_logfile=/mnt/dolphin_training/vibriss/logs/supervisor/vibriss_runner-error.log
+
+[program:vibriss_worker]
+command=/usr/bin/podman run --rm --name dolphin-vibriss-worker
+        --network host
+        -v /mnt/dolphinng5_predict:/mnt/dolphinng5_predict:ro
+        -v /mnt/dolphin_training/vibriss:/mnt/dolphin_training/vibriss:rw
+        -v /mnt/ng6_data:/mnt/ng6_data:ro
+        dolphin-vibriss:latest
+        python -m vibriss.worker --idle
+directory=/mnt/dolphinng5_predict/prod
+autostart=false
+autorestart=false
+startsecs=0
+stopwaitsecs=30
+stdout_logfile=/mnt/dolphin_training/vibriss/logs/supervisor/vibriss_worker.log
+stderr_logfile=/mnt/dolphin_training/vibriss/logs/supervisor/vibriss_worker-error.log
+```
+
+Group placement:
+
+```ini
+[group:dolphin_data]
+programs=exf_fetcher,acb_processor,obf_universe,meta_health,system_stats,
+         esof_advisor,maras_service,vibriss_runner
+```
+
+Rationale:
+
+- VIBRISS is data/control-plane infrastructure, not the trader itself.
+- The runner can be autostarted because it begins shadow-only.
+- Workers remain manual or scheduler-launched because full replay can be heavy.
+- MHS must observe VIBRISS health, but must not fight the container runtime
+  through systemd.
+
+### 12.4 Container Interface
+
+Required environment variables:
+
+| Env | Meaning |
+|---|---|
+| `HZ_HOST` | Hazelcast host/port, default `localhost:5701`. |
+| `CH_URL` | ClickHouse HTTP URL. |
+| `CH_DB` | Namespace DB: `dolphin`, `dolphin_prodgreen`, or PINK-specific DB. |
+| `CH_USER` / `CH_PASS` | ClickHouse credentials. |
+| `NATS_URL` | Optional NATS server URL, default `nats://localhost:4222`. |
+| `VIBRISS_ENABLE_NATS_TRANSPORT` | Enable best-effort NATS publication. |
+| `VIBRISS_NATS_SUBJECT_PREFIX` | Subject prefix, default `vibriss`. |
+| `VIBRISS_MODE` | `shadow`, `advisory`, `canary`, or `disabled`. |
+| `VIBRISS_NAMESPACE` | `blue`, `pink`, `prodgreen`, or `research`. |
+| `VIBRISS_SPEC_DIR` | Param spec directory. |
+| `VIBRISS_STATE_DIR` | Checkpoint/output directory. |
+| `VIBRISS_ENABLE_LIVE_ACTUATION` | Must default to `0`. |
+| `VIBRISS_CALIBRATION_INTERVAL_S` | Default replay/calibration scheduler interval. |
+| `VIBRISS_PROMOTION_REVIEW_INTERVAL_S` | Default promotion-gate review interval. |
+| `VIBRISS_META_CADENCE_MODE` | `fixed`, `shadow`, or `controlled`; defaults to `fixed`. |
+| `VIBRISS_MHS_SENSOR_KEY` | Default `vibriss_sensors_blue`. |
+| `VIBRISS_HEALTH_INTERVAL_S` | Default `5`. |
+
+Filesystem contract:
+
+| Path | Mode | Use |
+|---|---|---|
+| `/mnt/dolphinng5_predict` | read-only in container | Code/spec/doc access. |
+| `/mnt/dolphin_training/vibriss` | read-write | Learner state, replay artifacts, reports. |
+| `/mnt/ng6_data` | read-only | Tape, OBF, scan data. |
+| `/tmp` inside container | read-write ephemeral | Small temporary files only. |
+
+### 12.5 Internal Runner Loops
+
+The runner should have separate loops with independent health status:
+
+| Loop | Cadence | Responsibility |
+|---|---:|---|
+| `spec_loader` | startup + 60s | Load/validate ParamSpec and ParamSetSpec files. |
+| `context_ingestor` | 0.5s to 5s | Read HZ live context and keep a point-in-time snapshot. |
+| `advice_loop` | on context/trade event | Score candidates and publish shadow/advisory advice. |
+| `reward_collector` | 10s to 60s | Join closed trades to advice and write delayed rewards. |
+| `checkpoint_loop` | 60s | Persist learner state and model metadata. |
+| `calibration_scheduler` | 5m+ | Queue replay/validation subtasks when new data warrants it. |
+| `promotion_evaluator` | 15m+ | Evaluate whether a ParamSet may move to a stronger mode. |
+| `meta_cadence_evaluator` | 15m+ | Shadow-test cadence settings for calibration/promotion/update loops. |
+| `health_publisher` | 5s | Publish MHS-compatible sensor payload. |
+
+The advice loop must never wait on full replay, model training, or ClickHouse
+backfill. If ClickHouse is slow, advice may continue from the latest checkpoint
+and mark reward collection degraded.
+
+### 12.6 Hazelcast Surfaces
+
+Recommended HZ maps/keys:
+
+| Map | Key | Producer | Consumer | Purpose |
+|---|---|---|---|---|
+| `DOLPHIN_FEATURES` | `vibriss_param_advice` | runner | BLUE/PINK/TUI | Latest general parameter advice. |
+| `DOLPHIN_FEATURES` | `vibriss_hold_substitute_advice` | runner | ADVSL/TUI | Latest ADVSL hold-substitute advice. |
+| `DOLPHIN_FEATURES` | `vibriss_latest` | runner | TUI/MHS/manual ops | Compact subsystem summary. |
+| `DOLPHIN_META_HEALTH` | `vibriss_sensors_blue` | runner | MHS | BLUE VIBRISS sensor payload. |
+| `DOLPHIN_META_HEALTH` | `vibriss_sensors_pink` | runner | MHS | PINK VIBRISS sensor payload. |
+| `DOLPHIN_HEARTBEAT` | `vibriss_runner_heartbeat` | runner | MHS/TUI | Liveness heartbeat. |
+| `DOLPHIN_CONTROL_PLANE` | `vibriss_commands` | ops/TUI | runner | Freeze, unfreeze, replay, reload specs. |
+
+Advice remains separate from commands. An advice key tells the engine what
+VIBRISS recommends; a command key tells VIBRISS what operators want it to do.
+
+### 12.7 ClickHouse Tables
+
+VIBRISS needs durable audit tables. Recommended tables:
+
+| Table | Purpose |
+|---|---|
+| `dolphin.vibriss_decisions` | One row per candidate-scoring decision. |
+| `dolphin.vibriss_rewards` | Delayed realized/counterfactual reward rows. |
+| `dolphin.vibriss_policy_state` | Checkpoint metadata and active posture versions. |
+| `dolphin.vibriss_paramset_status` | Per-ParamSet health/performance summary. |
+| `dolphin.vibriss_subtasks` | Replay/calibration/ML subtask lifecycle. |
+
+Minimum `vibriss_decisions` fields:
+
+```sql
+ts DateTime64(6, 'UTC'),
+namespace LowCardinality(String),
+mode LowCardinality(String),
+param_set_id LowCardinality(String),
+spec_version String,
+decision_id String,
+trade_id String,
+asset LowCardinality(String),
+side LowCardinality(String),
+scan_number UInt64,
+context_hash String,
+maras_composite_hash UInt32,
+maras_regime LowCardinality(String),
+candidate_set_json String,
+chosen_arm String,
+baseline_value String,
+recommended_value String,
+confidence Float32,
+propensity Float32,
+guardrail_status LowCardinality(String),
+fallback_reason String,
+model_version String,
+payload_json String
+```
+
+Minimum `vibriss_rewards` fields:
+
+```sql
+ts DateTime64(6, 'UTC'),
+decision_id String,
+trade_id String,
+reward_status LowCardinality(String),
+actual_exit_pnl Float64,
+counterfactual_exit_pnl Float64,
+saved_loss_delta Float64,
+clipped_winner_delta Float64,
+capital_curve_delta Float64,
+drawdown_delta Float64,
+recovery_lag_s Float32,
+extra_bars_to_recovery Float32,
+normalized_reward Float32,
+reward_components_json String
+```
+
+Subtask rows must include `subtask_id`, `param_set_id`, `kind`, `status`,
+`started_at`, `finished_at`, `input_window`, `artifact_path`, `n_trades`,
+`primary_metric`, `failure_reason`, and `parent_decision_id` when applicable.
+
+### 12.8 MHS Sensor Contract
+
+VIBRISS should expose an MHS-compatible composite payload, modeled after the
+existing optional DITA sensor pattern.
+
+Recommended HZ key:
+
+```text
+DOLPHIN_META_HEALTH["vibriss_sensors_blue"]
+```
+
+Payload:
+
+```json
+{
+  "schema": "vibriss.mhs_sensors.v1",
+  "namespace": "blue",
+  "ts": "2026-06-03T00:00:00Z",
+  "rm_meta": 0.93,
+  "status": "GREEN",
+  "m14_vibriss_runner_liveness": 1.0,
+  "m15_vibriss_spec_integrity": 1.0,
+  "m16_vibriss_data_freshness": 0.9,
+  "m17_vibriss_advice_integrity": 1.0,
+  "m18_vibriss_reward_backlog": 0.85,
+  "m19_vibriss_paramset_health": 0.95,
+  "param_sets": {
+    "advsl.hold_substitute.v1": {
+      "score": 0.94,
+      "status": "GREEN",
+      "mode": "shadow",
+      "last_advice_age_s": 2.4,
+      "last_reward_age_s": 31.0,
+      "open_decisions": 1,
+      "reward_backlog": 3,
+      "shadow_samples": 240,
+      "walk_forward_status": "pending",
+      "latest_recommended_hold": 12
+    }
+  },
+  "subtasks": {
+    "full_tape_replay": {"score": 1.0, "status": "IDLE"},
+    "walk_forward": {"score": 0.8, "status": "STALE"},
+    "obf_binding": {"score": 1.0, "status": "IDLE"}
+  }
+}
+```
+
+Sensor scoring:
+
+| Sensor | Score rule |
+|---|---|
+| `m14_vibriss_runner_liveness` | 1 if heartbeat age < 15s, 0.5 if < 60s, else 0. |
+| `m15_vibriss_spec_integrity` | Fraction of loaded specs passing validation. |
+| `m16_vibriss_data_freshness` | Freshness of HZ context, CH close rows, OBF/MARAS context. |
+| `m17_vibriss_advice_integrity` | 1 when latest advice is schema-valid and guardrailed. |
+| `m18_vibriss_reward_backlog` | Penalizes unjoined decisions awaiting reward too long. |
+| `m19_vibriss_paramset_health` | Mean score of all enabled ParamSets. |
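+
+Two of these rules are simple enough to sketch directly. The function names
+are illustrative, and returning `1.0` when no ParamSet is enabled is an
+assumption made here so that an idle subsystem stays neutral; the table does
+not specify that case.
+
+```python
+from statistics import fmean
+from typing import Sequence
+
+
+def runner_liveness_score(heartbeat_age_s: float) -> float:
+    """m14: 1 if heartbeat age < 15s, 0.5 if < 60s, else 0."""
+    if heartbeat_age_s < 15.0:
+        return 1.0
+    if heartbeat_age_s < 60.0:
+        return 0.5
+    return 0.0
+
+
+def paramset_health_score(enabled_scores: Sequence[float]) -> float:
+    """m19: mean score of all enabled ParamSets."""
+    if not enabled_scores:
+        # Assumption: no enabled ParamSets is treated as neutral, not failing.
+        return 1.0
+    return fmean(enabled_scores)
+```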
+
+MHS integration rule:
+
+- VIBRISS starts with weight `0.0` in RM_META until stable.
+- Then enable a small optional weight, analogous to DITA sensors.
+- Suggested initial weight: `0.02`.
+- Maximum allowed weight: `0.10` until the subsystem is live-actuating.
+- If VIBRISS is disabled, MHS score must be neutral and must not degrade BLUE.
+
+Suggested MHS env shape:
+
+```text
+DOLPHIN_MHS_USE_VIBRISS_SENSORS=1
+DOLPHIN_MHS_VIBRISS_SENSOR_WEIGHT=0.02
+DOLPHIN_VIBRISS_SENSOR_KEY=vibriss_sensors_blue
+DOLPHIN_MHS_VIBRISS_SENSOR_MAPS=DOLPHIN_META_HEALTH,DOLPHIN_FEATURES
+```
+
+### 12.9 Observability / TUI Integration
+
+TUI integration should follow the existing v9 pattern:
+
+- use HZ listeners for latest VIBRISS state;
+- add CH polling only for historical/replay-heavy summaries;
+- never poll origin subsystems directly from the TUI.
+
+Recommended panels:
+
+| Panel | Source | Cadence | Content |
+|---|---|---:|---|
+| `VIBRISS` main panel | `DOLPHIN_FEATURES/vibriss_latest` | HZ listener | mode, status, latest ParamSet advice, confidence, MHS score. |
+| `VIBRISS Hold` footer | `vibriss_hold_substitute_advice` + CH rewards | HZ + 60s CH | recommended hold, baseline, prior, reward backlog, recent net delta. |
+| `VIBRISS Tasks` footer | `vibriss_subtasks` | 60s CH | replay/walk-forward/OBF binding status. |
+| `MHS` existing panel | `DOLPHIN_META_HEALTH/latest` | HZ listener | include VIBRISS sensor details if enabled. |
+
+Display fields for `advsl.hold_substitute.v1`:
+
+```text
+VIBRISS HOLD  mode=shadow  rec=12b  base=20b  live_ref=6b
+conf=74%  guard=PASS  hash=57957  obf=weak  pressure=high
+reward_backlog=3  wf=pending  samples=240
+```
+
+The TUI must clearly distinguish:
+
+- baseline reference,
+- current live reference,
+- VIBRISS recommendation,
+- whether recommendation is shadow-only or live-consumed.
+
+Implementation note:
+
+- `prod/vibriss/vibriss_tui.py` now provides the Textual dashboard, and
+  `python -m vibriss.vibriss_runner tui` launches it in read-only shadow mode.
+- The UI is panel-registry based so additional metrics can be added without
+  rewriting the dashboard shell.
+
+### 12.10 Control Commands
+
+Commands should be written to `DOLPHIN_CONTROL_PLANE["vibriss_commands"]`.
+
+Allowed commands:
+
+| Command | Effect |
+|---|---|
+| `RELOAD_SPECS` | Reload ParamSpec/ParamSetSpec files and validate. |
+| `FREEZE_PARAMSET` | Stop updating and publish fallback for one ParamSet. |
+| `UNFREEZE_PARAMSET` | Resume shadow/advisory scoring. |
+| `RUN_REPLAY` | Queue replay subtask for a parameter set/window. |
+| `RUN_WALK_FORWARD` | Queue walk-forward validation. |
+| `SET_MODE` | Move `disabled -> shadow -> advisory`; live/canary requires explicit code/config gate. |
+| `CHECKPOINT_NOW` | Persist learner state immediately. |
+
+Commands must be acknowledged to:
+
+```text
+DOLPHIN_CONTROL_PLANE["vibriss_command_ack"]
+```
+
+Ack payloads must include command id, acceptance/rejection, reason, and current
+mode. Queue consumption alone is not success.
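+
+A minimal sketch of an ack builder that satisfies this rule. The payload field
+names (`command_id`, `accepted`, `reason`, `current_mode`) and the `build_ack`
+helper are assumptions for illustration; only the command names come from the
+table above.
+
+```python
+from typing import Dict, Union
+
+ALLOWED_COMMANDS = frozenset({
+    "RELOAD_SPECS", "FREEZE_PARAMSET", "UNFREEZE_PARAMSET", "RUN_REPLAY",
+    "RUN_WALK_FORWARD", "SET_MODE", "CHECKPOINT_NOW",
+})
+
+
+def build_ack(command_id: str, command: str,
+              current_mode: str) -> Dict[str, Union[str, bool]]:
+    """Build an ack that states acceptance or rejection explicitly.
+
+    This only validates the command name; a real runner would also report
+    whether execution succeeded, since consumption alone is not success.
+    """
+    accepted = command in ALLOWED_COMMANDS
+    reason = "accepted" if accepted else f"unknown command: {command}"
+    return {
+        "command_id": command_id,
+        "accepted": accepted,
+        "reason": reason,
+        "current_mode": current_mode,
+    }
+```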
+
+### 12.11 Prefect Role
+
+Prefect is optional for VIBRISS. It should not be required for live advice.
+
+Acceptable Prefect use:
+
+- daily full-tape replay,
+- scheduled walk-forward validation,
+- artifact publication,
+- long offline calibration runs.
+
+Not acceptable:
+
+- live advice loop,
+- hot-path reward joining,
+- health publication,
+- operator freeze/unfreeze commands.
+
+If Prefect is unavailable, the VIBRISS runner should continue shadow/advisory
+operation from the last checkpoint and mark scheduled calibration stale.
+
+### 12.12 Failure Modes and Fallback
+
+| Failure | Required behavior |
+|---|---|
+| HZ unavailable | Runner logs degraded, cannot publish advice, MHS score <= 0.5. |
+| CH unavailable | Advice may continue from checkpoint; reward collector degrades. |
+| OBF stale | Mask OBF features; do not use OBF hold extension. |
+| MARAS stale | Shrink to global/label-free prior. |
+| Spec validation failure | Disable affected ParamSet, publish fallback. |
+| Learner checkpoint corrupt | Revert to last good checkpoint or baseline prior. |
+| Replay worker OOM/fails | Mark subtask failed; live runner continues. |
+| Advice schema invalid | Do not publish; MHS advice integrity drops. |
+| Drawdown alarm | Freeze to deterministic safe baseline. |
+
+### 12.13 Promotion Gates
+
+Before any engine consumes VIBRISS hold advice live:
+
+1. Runner has been stable for at least 7 calendar days.
+2. MHS VIBRISS sensors are GREEN or neutral for at least 95% of runner uptime.
+3. `advsl.hold_substitute.v1` has completed full-tape replay.
+4. Walk-forward is positive versus baseline on capital-curve delta after
+   opportunity cost.
+5. OOD region performance has no catastrophic degradation.
+6. TUI displays baseline/current/recommended state correctly.
+7. Command ack path is verified.
+8. Safe fallback is tested by intentionally freezing the ParamSet.
+9. Engine consumption is limited to one ParamSet and one namespace.
+10. `VIBRISS_ENABLE_LIVE_ACTUATION=1` is explicitly set and reviewed.
+
+## 13. V1 Rollout Plan
+
+1. Offline replay only:
+   - replay historical decisions from ClickHouse and tape.
+   - benchmark against baseline constants.
+   - compute OPE where logged propensities exist.
+   - report by asset, side, MARAS hash, regime label, V7 reason, OBF bucket,
+     and contiguous time region.
+
+2. Shadow mode:
+   - publish advice to HZ.
+   - do not allow engine consumption.
+   - write `vibriss_decisions`, `vibriss_rewards`, and `vibriss_policy_state`.
+
+3. Guarded advisory:
+   - engine reads advice and surfaces what it would have used.
+   - still no actuation.
+
+4. Canary live:
+   - one parameter only.
+   - no simultaneous bundle changes.
+   - low exploration cap.
+   - hard fallback on stale data, drawdown alarm, or drift alarm.
+
+5. Controlled live comparison:
+   - compare baseline-vs-advised on matched contexts.
+   - freeze policy if replay quality deteriorates.
+
+## 14. Safety Rules
+
+Mandatory:
+
+- no direct mutation of `blue.yml` or frozen champion config from VIBRISS.
+- no live promotion without replay, shadow, and documented approval.
+- no advice consumption when data is stale.
+- no advice consumption inside disallowed live-change windows.
+- no multi-parameter bundle learning until single-parameter learners prove that
+  independent adaptation is insufficient.
+- every live-consumed recommendation must be reconstructable from logs.
+- every safety-critical parameter must preserve a catastrophic fallback floor.
+
+## 15. Concrete Storage and Schema
+
+VIBRISS must be event-sourced. Current policy state is a cache; decisions and
+rewards are the durable truth.
+
+### 15.1 ClickHouse DDL
+
+Recommended DDL:
+
+```sql
+CREATE TABLE IF NOT EXISTS dolphin.vibriss_decisions
+(
+    ts DateTime64(6, 'UTC'),
+    namespace LowCardinality(String),
+    mode LowCardinality(String),
+    param_set_id LowCardinality(String),
+    spec_version String,
+    decision_id String,
+    parent_decision_id String,
+    trade_id String,
+    asset LowCardinality(String),
+    side LowCardinality(String),
+    scan_number UInt64,
+    bars_held UInt32,
+    context_hash String,
+    context_schema String,
+    maras_composite_hash UInt32,
+    maras_scalar_hash UInt32,
+    maras_regime LowCardinality(String),
+    maras_confidence Float32,
+    maras_conflict Float32,
+    obf_stale UInt8,
+    obf_depth_1pct_usd Float64,
+    obf_depth_quality Float32,
+    v7_pressure Float32,
+    v7_mae_risk Float32,
+    candidate_set_json String,
+    chosen_arm String,
+    baseline_value String,
+    recommended_value String,
+    confidence Float32,
+    propensity Float32,
+    guardrail_status LowCardinality(String),
+    fallback_reason String,
+    model_version String,
+    policy_version String,
+    compiled_config_hash String,
+    consumed UInt8,
+    consumed_ts Nullable(DateTime64(6, 'UTC')),
+    payload_json String
+)
+ENGINE = MergeTree
+PARTITION BY toYYYYMM(ts)
+ORDER BY (namespace, param_set_id, ts, decision_id)
+TTL ts + INTERVAL 180 DAY;
+
+CREATE TABLE IF NOT EXISTS dolphin.vibriss_rewards
+(
+    ts DateTime64(6, 'UTC'),
+    namespace LowCardinality(String),
+    param_set_id LowCardinality(String),
+    decision_id String,
+    trade_id String,
+    reward_status LowCardinality(String),
+    reward_delay_s Float32,
+    actual_exit_reason LowCardinality(String),
+    counterfactual_exit_reason LowCardinality(String),
+    actual_exit_pnl Float64,
+    counterfactual_exit_pnl Float64,
+    saved_loss_delta Float64,
+    clipped_winner_delta Float64,
+    capital_curve_delta Float64,
+    drawdown_delta Float64,
+    recovery_lag_s Float32,
+    extra_bars_to_recovery Float32,
+    normalized_reward Float32,
+    opportunity_cost_charged UInt8,
+    replay_artifact_path String,
+    reward_components_json String
+)
+ENGINE = MergeTree
+PARTITION BY toYYYYMM(ts)
+ORDER BY (namespace, param_set_id, ts, decision_id)
+TTL ts + INTERVAL 365 DAY;
+
+CREATE TABLE IF NOT EXISTS dolphin.vibriss_policy_state
+(
+    ts DateTime64(6, 'UTC'),
+    namespace LowCardinality(String),
+    param_set_id LowCardinality(String),
+    policy_version String,
+    mode LowCardinality(String),
+    learner LowCardinality(String),
+    checkpoint_path String,
+    checkpoint_hash String,
+    spec_hash String,
+    compiled_config_hash String,
+    n_decisions UInt64,
+    n_rewards UInt64,
+    shadow_samples UInt64,
+    walk_forward_status LowCardinality(String),
+    active_baseline_value String,
+    active_recommended_value String,
+    confidence Float32,
+    state_json String
+)
+ENGINE = ReplacingMergeTree(ts)
+ORDER BY (namespace, param_set_id, policy_version);
+
+CREATE TABLE IF NOT EXISTS dolphin.vibriss_subtasks
+(
+    ts DateTime64(6, 'UTC'),
+    namespace LowCardinality(String),
+    subtask_id String,
+    param_set_id LowCardinality(String),
+    kind LowCardinality(String),
+    status LowCardinality(String),
+    started_at DateTime64(6, 'UTC'),
+    finished_at Nullable(DateTime64(6, 'UTC')),
+    input_window String,
+    n_trades UInt64,
+    n_decisions UInt64,
+    primary_metric Float64,
+    baseline_metric Float64,
+    artifact_path String,
+    artifact_hash String,
+    failure_reason String,
+    payload_json String
+)
+ENGINE = MergeTree
+PARTITION BY toYYYYMM(started_at)
+ORDER BY (namespace, param_set_id, started_at, subtask_id)
+TTL started_at + INTERVAL 365 DAY;
+
+CREATE TABLE IF NOT EXISTS dolphin.vibriss_promotions
+(
+    ts DateTime64(6, 'UTC'),
+    namespace LowCardinality(String),
+    param_set_id LowCardinality(String),
+    promotion_id String,
+    from_mode LowCardinality(String),
+    to_mode LowCardinality(String),
+    requested_by LowCardinality(String),
+    approved_by LowCardinality(String),
+    policy_version String,
+    checkpoint_hash String,
+    evidence_window String,
+    n_decisions UInt64,
+    n_rewards UInt64,
+    n_shadow_samples UInt64,
+    n_live_samples UInt64,
+    recursive_capital_delta Float64,
+    opportunity_cost_delta Float64,
+    max_drawdown_delta Float64,
+    worst_region_delta Float64,
+    baseline_metric Float64,
+    candidate_metric Float64,
+    guardrail_status LowCardinality(String),
+    decision LowCardinality(String),
+    reason String,
+    artifact_path String,
+    payload_json String
+)
+ENGINE = MergeTree
+PARTITION BY toYYYYMM(ts)
+ORDER BY (namespace, param_set_id, ts, promotion_id)
+TTL ts + INTERVAL 730 DAY;
+
+CREATE TABLE IF NOT EXISTS dolphin.vibriss_meta_cadence_decisions
+(
+    ts DateTime64(6, 'UTC'),
+    namespace LowCardinality(String),
+    param_set_id LowCardinality(String),
+    cadence_id LowCardinality(String),
+    decision_id String,
+    mode LowCardinality(String),
+    context_hash String,
+    maras_composite_hash UInt32,
+    maras_regime LowCardinality(String),
+    exof_state String,
+    esof_state String,
+    candidate_set_json String,
+    chosen_value String,
+    baseline_value String,
+    confidence Float32,
+    reward_status LowCardinality(String),
+    reward_value Float32,
+    guardrail_status LowCardinality(String),
+    fallback_reason String,
+    policy_version String,
+    payload_json String
+)
+ENGINE = MergeTree
+PARTITION BY toYYYYMM(ts)
+ORDER BY (namespace, param_set_id, cadence_id, ts, decision_id)
+TTL ts + INTERVAL 365 DAY;
+```
+
+These tables are deliberately narrow enough for hot audit reads and broad enough
+to replay the decision. Large path arrays, per-bar simulations, and model
+artifacts must be written to artifact storage, not inlined into ClickHouse.
+
+### 15.2 Artifact Layout
+
+Use a non-SMB path for generated artifacts:
+
+```text
+/mnt/dolphin_training/vibriss/
+  specs/
+    advsl.hold_substitute.v1.yaml
+  checkpoints/
+    blue/advsl.hold_substitute.v1/<policy_version>/
+      state.json
+      learner.pkl
+      manifest.json
+  replays/
+    <YYYY-MM-DD>/<subtask_id>/
+      config.yaml
+      replay_summary.json
+      capital_curve.csv
+      per_trade_counterfactuals.parquet
+      opportunity_cost_audit.parquet
+  reports/
+    walk_forward/
+    obf_binding/
+    maras_hash_priors/
+```
+
+Every artifact directory, including the replay and report directories where the
+tree above does not list one, must contain a `manifest.json`:
+
+```json
+{
+  "schema": "vibriss.artifact_manifest.v1",
+  "subtask_id": "wf-20260603-001",
+  "param_set_id": "advsl.hold_substitute.v1",
+  "namespace": "blue",
+  "created_at": "2026-06-03T00:00:00Z",
+  "git_sha": "unknown-or-sha",
+  "spec_hash": "sha256:...",
+  "input_tables": {
+    "trade_events": {"min_ts": "...", "max_ts": "...", "row_count": 1234},
+    "v7_decision_events": {"min_ts": "...", "max_ts": "...", "row_count": 9999}
+  },
+  "tape_sources": ["/mnt/ng6_data/arrow_scans/..."],
+  "random_seed": 0,
+  "artifact_hashes": {
+    "replay_summary.json": "sha256:...",
+    "per_trade_counterfactuals.parquet": "sha256:..."
+  }
+}
+```
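+
+A minimal sketch of how a worker might populate `artifact_hashes`, assuming
+each artifact is a regular file directly inside the artifact directory. The
+helper names `sha256_of` and `write_manifest` are hypothetical and not part of
+this spec; the `header` argument stands in for the remaining manifest fields
+shown above.
+
+```python
+import hashlib
+import json
+from pathlib import Path
+
+
+def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
+    """Stream a file through SHA-256 and return it as 'sha256:<hex>'."""
+    digest = hashlib.sha256()
+    with path.open("rb") as fh:
+        for chunk in iter(lambda: fh.read(chunk_size), b""):
+            digest.update(chunk)
+    return f"sha256:{digest.hexdigest()}"
+
+
+def write_manifest(artifact_dir: Path, header: dict) -> dict:
+    """Hash every file except manifest.json, then write the manifest."""
+    hashes = {
+        p.name: sha256_of(p)
+        for p in sorted(artifact_dir.iterdir())
+        if p.is_file() and p.name != "manifest.json"
+    }
+    manifest = {
+        "schema": "vibriss.artifact_manifest.v1",
+        **header,
+        "artifact_hashes": hashes,
+    }
+    out = artifact_dir / "manifest.json"
+    out.write_text(json.dumps(manifest, indent=2, sort_keys=True))
+    return manifest
+```
+
+Streaming each file in chunks keeps memory bounded for large Parquet outputs,
+and excluding `manifest.json` from its own hash map avoids a self-reference.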
+
+## 16. Replay, OPE, and Causality Rules
+
+VIBRISS must be explicit about what kind of evidence it has.
+
+Evidence classes:
+
+| Class | Meaning | Allowed use |
+|---|---|---|
+| `realized_live` | Parameter was actually used live. | Highest-quality reward. |
+| `shadow_counterfactual` | Advice logged, baseline used, tape can replay alternative. | OPE/research only unless validated. |
+| `historical_replay` | Offline replay over historical trades with no logged propensity. | Calibration prior, not proof. |
+| `synthetic_mc` | Monte Carlo augmentation from validated distribution. | Stress coverage only. |
+| `expert_baseline` | Human/research default such as 12 bars. | Fallback/prior. |
+
+Counterfactual replay must store:
+
+- actual entry, actual exit, and actual capital before/after;
+- counterfactual exit scan/bar and price;
+- whether the counterfactual exit depends on sub-bar, bar-close, or tape-close
+  cadence;
+- whether the trade later recovered;
+- how many bars/seconds were needed for recovery;
+- opportunity cost charged;
+- recursive capital state after applying the counterfactual.
+
+OPE rules:
+
+- Use inverse propensity or doubly robust estimators only when propensities were
+  actually logged.
+- Do not pretend historical replay has logged propensities.
+- For shadow decisions without randomized action, report them as model
+  counterfactuals, not causal estimates.
+- Region splits must be contiguous first; randomized splits are secondary
+  robustness checks only.
+- A policy that wins by one tail event and loses broadly must be flagged as
+  fragile even when net capital delta is positive.
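+
+The first two OPE rules can be enforced in code rather than by convention. The
+sketch below is an illustration, not the VIBRISS estimator: `LoggedDecision`
+and `ips_estimate` are hypothetical names, the target policy is assumed to be
+the deterministic policy "always play `target_arm`", and `reward` is assumed to
+be an already-normalized per-decision value.
+
+```python
+from dataclasses import dataclass
+from typing import Optional, Sequence
+
+
+@dataclass(frozen=True)
+class LoggedDecision:
+    chosen_arm: str
+    reward: float
+    propensity: Optional[float]  # None when no propensity was logged
+
+
+def ips_estimate(decisions: Sequence[LoggedDecision], target_arm: str) -> float:
+    """Inverse-propensity estimate of the mean reward of always playing target_arm."""
+    if not decisions:
+        raise ValueError("no decisions to evaluate")
+    total = 0.0
+    for d in decisions:
+        # Refuse to estimate rather than silently impute a propensity.
+        if d.propensity is None or not (0.0 < d.propensity <= 1.0):
+            raise ValueError("IPS requires a logged propensity in (0, 1] on every row")
+        if d.chosen_arm == target_arm:
+            total += d.reward / d.propensity
+    return total / len(decisions)
+```
+
+Raising on a missing propensity means historical replay rows cannot slip into
+a causal estimate by accident; they have to be reported through a separate,
+explicitly non-causal path.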
+
+Minimum replay report:
+
+```text
+baseline_end_capital
+policy_end_capital
+recursive_delta
+gross_saved_loss
+gross_opportunity_cost
+net_trade_pnl_delta
+max_drawdown_delta
+tail_loss_count_delta
+clipped_winner_count
+recovered_cut_count
+median_recovery_lag_s
+worst_region_delta
+best_region_delta
+per_asset_concentration
+per_hash_concentration
+```
+
+## 17. Mode State Machine
+
+VIBRISS modes are explicit and monotonic unless an operator command or guardrail
+forces demotion.
+
+```text
+disabled
+  -> shadow
+  -> advisory
+  -> canary_live
+  -> controlled_live
+```
+
+Mode meanings:
+
+| Mode | Publishes advice | Engine may read | Engine may act | Learner updates |
+|---|---:|---:|---:|---:|
+| `disabled` | no | no | no | no |
+| `shadow` | yes | no | no | yes |
+| `advisory` | yes | yes, display only | no | yes |
+| `canary_live` | yes | yes | yes, one ParamSet/namespace | yes |
+| `controlled_live` | yes | yes | yes, bounded | yes |
+
+Automatic demotions:
+
+- stale required sensor -> `shadow` or fallback advice;
+- invalid spec -> affected ParamSet disabled;
+- reward backlog beyond threshold -> freeze learner updates;
+- drawdown alarm -> deterministic safe baseline;
+- ClickHouse unavailable -> keep publishing only if checkpoint is fresh; mark
+  reward collection degraded;
+- Hazelcast unavailable -> no advice publication;
+- policy drift alarm -> freeze to last known-good checkpoint.
+
+Promotion technique, thresholds, cadence, and evidence gates must be declared
+inside the affected ParamSet spec. The runner evaluates and records those gates;
+it is not allowed to invent a promotion policy from global defaults.
+
+Promotion must be manual and auditable for any transition that enables live
+actuation. No health recovery path may silently promote VIBRISS into a stronger
+actuation mode.
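+
+A minimal sketch of the transition rule, under two assumptions that this
+section does not state outright: promotion advances exactly one mode at a time,
+and demotion to any weaker mode is always permitted. The function name
+`transition` and its parameters are hypothetical.
+
+```python
+MODES = ("disabled", "shadow", "advisory", "canary_live", "controlled_live")
+LIVE_MODES = frozenset({"canary_live", "controlled_live"})
+
+
+def transition(current: str, target: str, *, approved_by: str = "",
+               guardrail_forced: bool = False) -> str:
+    """Return the new mode, or raise if the transition is not permitted."""
+    # MODES.index raises ValueError for an unknown mode name.
+    cur, tgt = MODES.index(current), MODES.index(target)
+    if tgt <= cur:
+        # Demotion or no-op: allowed for operators and guardrails alike.
+        return target
+    if guardrail_forced:
+        raise PermissionError("guardrail and health-recovery paths may only demote")
+    if tgt != cur + 1:
+        raise ValueError("promotion must advance exactly one mode (assumption)")
+    if target in LIVE_MODES and not approved_by:
+        raise PermissionError(f"{target} enables live actuation; manual approval required")
+    return target
+```
+
+Keeping the approval check inside the only function that can raise the mode
+makes it harder for a recovery path to promote silently; the caller would still
+need to record the transition in `dolphin.vibriss_promotions`.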
+
+### 17.1 ParamSet-Owned Promotion Lifecycle
+
+Every ParamSet must answer these questions before it can leave `shadow`:
+
+| Question | Required ParamSet field |
+|---|---|
+| What baseline is being challenged? | `promotion_policy.baseline_policy` |
+| What evidence class is allowed? | `promotion_policy.technique` and `promotion_policy.evidence_gates` |
+| How often is the evidence recomputed? | `promotion_policy.cadence.replay_calibration` |
+| How often is promotion eligibility reviewed? | `promotion_policy.cadence.promotion_review` |
+| When may the engine replace the old value? | `promotion_policy.cadence.live_replacement_rhythm` |
+| What samples are required? | `promotion_policy.evidence_gates.*min*` |
+| What demotes it? | `promotion_policy.automatic_demotion` |
+| Who approves live use? | `promotion_policy.*manual_approval_required` |
+
+Promotion is also subject to the control-plane elegance constraints in §4.1:
+one writer per parameter, spec-owned promotion, slow-governed meta-cadence,
+context inputs instead of arbitrary controllers, reproducible live changes, no
+hidden cross-subsystem mutation, and shadow/replay/canary before live.
+
+Default lifecycle:
+
+```text
+historical_replay
+  -> walk_forward_replay
+  -> shadow_advice_logging
+  -> advisory_display
+  -> canary_live_capture
+  -> controlled_live
+```
+
+The cadence of each phase is also ParamSet-owned:
+
+- `advice cadence`: how often the ParamSet emits advice.
+- `reward cadence`: how often delayed rewards are joined and scored.
+- `calibration cadence`: how often the learner updates from replay/rewards.
+- `promotion-review cadence`: how often mode eligibility is evaluated.
+- `replacement rhythm`: the exact engine decision point where a live parameter
+  can replace the baseline.
+
+For safety-critical exit parameters, replacement rhythm should usually be
+`capture_on_entry` or `between_trades`, not arbitrary intratrade mutation.
+
+### 17.2 Meta-Cadences as Governed Parameters
+
+Meta-cadences are tunable parameters. If VIBRISS changes them, they must be
+declared in the ParamSet under `meta_cadence_policy`.
+
+Examples:
+
+| Meta-cadence | Meaning |
+|---|---|
+| `replay_calibration_interval_s` | How often to re-run replay/calibration. |
+| `promotion_review_interval_s` | How often to evaluate mode promotion/demotion. |
+| `checkpoint_interval_s` | How often to persist learner state. |
+| `min_new_rewards_before_recalibration` | Event-driven cadence threshold. |
+| `shadow_to_canary_cooldown_trades` | Minimum number of trades of stable shadow evidence before live canary. |
+
+MARAS, ExoF, EsoF, OBF, V7, MHS, and drawdown state may be context inputs for
+meta-cadence advice, but the cadence learner is subject to the same evidence
+rules as any other parameter learner. In particular:
+
+- fixed cadence is the baseline;
+- shadow cadence decisions must be logged with candidate set and confidence;
+- replay must estimate missed-adaptation cost and false-promotion cost;
+- compute/backlog cost is part of reward;
+- live control of promotion cadence requires explicit manual approval.
+
+## 18. Engine Consumption Contract
+
+The engine must treat VIBRISS advice as optional, expiring input.
+
+Consumption algorithm:
+
+```text
+read advice payload
+validate schema and spec_version
+check namespace matches runtime
+check mode permits consumption
+check expires_at > now
+check trade_scope is current decision point
+check recommendation within hard range
+check guardrail_status == PASS or permitted advisory state
+check fallback/catastrophic floor remains active
+capture value into trade-local immutable parameter snapshot
+emit consumption audit
+```
+
+For `advsl.hold_substitute.v1`, the first live contract should be:
+
+- consume only on entry;
+- store the selected hold bars in the pending/open trade state;
+- do not mutate it intratrade;
+- allow intratrade VIBRISS values only as shadow comparisons;
+- let catastrophic floor and max-dollar floor override hold advice.
+
+This avoids a subtle failure mode where a learner changes the hold target after
+seeing adverse movement that was not available at entry. Intratrade contraction
+can be researched later, but it is a different ParamSet.
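+
+The consumption algorithm and the capture-on-entry contract can be sketched as
+follows. This is an illustration under stated assumptions: the `Advice` field
+names are hypothetical stand-ins for the advice payload schema, the snapshot
+keys are invented for the example, and schema validation, the catastrophic
+floor check, and the consumption audit are omitted.
+
+```python
+from dataclasses import dataclass
+from datetime import datetime
+from types import MappingProxyType
+from typing import Mapping, Tuple
+
+ACTING_MODES = frozenset({"canary_live", "controlled_live"})
+
+
+@dataclass(frozen=True)
+class Advice:  # hypothetical subset of the advice payload
+    namespace: str
+    mode: str
+    expires_at: datetime
+    trade_scope: str
+    recommended_value: float
+    guardrail_status: str
+
+
+def capture_on_entry(advice: Advice, *, runtime_namespace: str,
+                     hard_range: Tuple[float, float], baseline: float,
+                     now: datetime) -> Mapping[str, object]:
+    """Return a read-only trade-local snapshot; fall back to baseline on any failed check."""
+    lo, hi = hard_range
+    failures = []
+    if advice.namespace != runtime_namespace:
+        failures.append("namespace_mismatch")
+    if advice.mode not in ACTING_MODES:
+        failures.append("mode_forbids_action")
+    if advice.expires_at <= now:
+        failures.append("expired")
+    if advice.trade_scope != "entry":
+        failures.append("wrong_decision_point")
+    if not (lo <= advice.recommended_value <= hi):
+        failures.append("out_of_hard_range")
+    if advice.guardrail_status != "PASS":
+        failures.append("guardrail_not_pass")
+    return MappingProxyType({
+        "hold_bars": baseline if failures else advice.recommended_value,
+        "source": "baseline" if failures else "vibriss",
+        "fallback_reasons": tuple(failures),
+    })
+```
+
+Returning a `MappingProxyType` makes the snapshot read-only, so later advice
+cannot overwrite the hold target once the trade is open.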
+
+## 19. Drift, Novelty, and Freezing
+
+VIBRISS must separate three conditions:
+
+1. data-quality degradation,
+2. market/regime novelty,
+3. policy underperformance.
+
+Drift sensors:
+
+| Sensor | Trigger |
+|---|---|
+| context distribution drift | MARAS/OBF/V7 feature distribution shifts versus training window. |
+| reward drift | rolling reward lower than baseline beyond confidence bound. |
+| regret drift | chosen arm underperforms baseline arm in shadow replay. |
+| tail cluster | tail-loss or floor-hit count above historical percentile. |
+| sparse regime | nearest-neighbor distance to known MARAS/OBF contexts too high. |
+
+Actions:
+
+- distribution drift alone: shrink toward baseline and raise uncertainty;
+- reward drift: freeze learner updates and publish fallback;
+- tail cluster: tighten safety floors only if pre-authorized by the ParamSet;
+- sparse regime: use global safe prior, not nearest hash overfit;
+- data-quality drift: stop consuming affected sensors.
+
+VIBRISS should publish drift state in `vibriss_latest` and
+`vibriss_paramset_status`.
+
+## 20. Data Volume and Backpressure
+
+VIBRISS must be designed around the ClickHouse outage and spool-backlog failure
+mode.
+
+Rules:
+
+- VIBRISS must have its own spool and backlog metric.
+- Advice publication must not block on ClickHouse.
+- Reward collection may lag, but the lag must be visible in MHS.
+- Large per-bar OBF or path arrays must not be written to hot audit tables.
+- Calibration workers must rate-limit writes and should prefer compact Parquet
+  artifacts for heavy outputs.
+- If ClickHouse spool backlog exceeds threshold, VIBRISS must degrade to
+  `shadow_no_update`: publish from checkpoint only, do not update learners from
+  partial reward data.
+
+Recommended thresholds:
+
+| Metric | GREEN | DEGRADED | CRITICAL |
+|---|---:|---:|---:|
+| decision spool backlog | `<1k` | `1k-50k` | `>50k` |
+| reward backlog age | `<10m` | `10m-2h` | `>2h` |
+| artifact disk free | `>20GB` | `5-20GB` | `<5GB` |
+| CH write failure rate | `<1%` | `1-10%` | `>10%` |
+
+VIBRISS must not repeat the OBF-style failure mode of letting millions of
+low-priority rows delay high-priority trade/reward rows. Use priority queues:
+
+1. decisions, rewards, policy state;
+2. trade/path summary;
+3. calibration summary;
+4. heavy diagnostics.
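+
+One way to get this ordering is a heap keyed on priority class, with an
+insertion counter to keep first-in-first-out order inside each class. The
+sketch below is an in-memory illustration only: `PrioritySpool` and the row
+`kind` labels are hypothetical, and a real spool would also need durable
+storage.
+
+```python
+import heapq
+import itertools
+from typing import Any, List, Tuple
+
+# Lower number flushes first; mirrors the four classes listed above.
+PRIORITY = {
+    "decision": 0, "reward": 0, "policy_state": 0,
+    "trade_path_summary": 1,
+    "calibration_summary": 2,
+    "diagnostic": 3,
+}
+
+
+class PrioritySpool:
+    def __init__(self) -> None:
+        self._heap: List[Tuple[int, int, str, Any]] = []
+        self._seq = itertools.count()  # tie-breaker: FIFO within a class
+
+    def put(self, kind: str, row: Any) -> None:
+        heapq.heappush(self._heap, (PRIORITY[kind], next(self._seq), kind, row))
+
+    def drain(self, max_rows: int) -> List[Tuple[str, Any]]:
+        """Pop up to max_rows rows, highest-priority class first."""
+        out: List[Tuple[str, Any]] = []
+        while self._heap and len(out) < max_rows:
+            _, _, kind, row = heapq.heappop(self._heap)
+            out.append((kind, row))
+        return out
+
+    def __len__(self) -> int:
+        return len(self._heap)
+```
+
+Because the sequence number is unique, the heap never compares row payloads,
+and `len(spool)` can feed the decision spool backlog metric.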
+
+## 21. Security and Operational Guardrails
+
+Secrets:
+
+- use existing ClickHouse user/password env pattern;
+- do not write credentials into spec files;
+- do not put secrets in artifact manifests.
+
+Filesystem:
+
+- code/spec mount is read-only inside the container;
+- learner state and replay artifacts are written outside the SMB repo path;
+- runner must check free disk before replay subtasks;
+- no large file writes to `/mnt/dolphinng5_predict`.
+
+Runtime:
+
+- do not restart Hazelcast;
+- do not use systemd for Dolphin services;
+- use supervisord as the owner of the container process;
+- if gVisor is used, treat it as a host-selected sandbox/runtime wrapper, not a
+  process owned by VIBRISS internals;
+- worker OOM must not kill the live advice runner;
+- health checks must distinguish runner alive from learner valid.
+
+## 22. Implementation Defaults
+
+These decisions are now recommended defaults, not open questions:
+
+- First learner: discounted UCB for non-contextual hold-bar baseline plus LinUCB
+  shadow branch for MARAS/OBF/V7 context.
+- First live dependency posture: internal finite-arm learners and compact
+  checkpointed state in the runner; no VW, OBP, ABIDES, Pyro/NumPyro, CATX, or
+  broad benchmark libraries in the live advice path.
+- First worker dependency posture: VW, River, OBP, MABWiser, lifelines,
+  statsmodels, and benchmark libraries are allowed only in replay/OPE/calibration
+  jobs with bounded memory and artifact output.
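+- First drift implementation: simple internal rolling statistics in the runner.
+  River-backed detectors are optional and are the one exception to the
+  worker-only dependency posture above, permitted only if the dependency
+  remains stable inside the runner.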
+- First HZ publication surface: `DOLPHIN_FEATURES["vibriss_param_advice"]` plus
+  dedicated keys for high-value ParamSets such as
+  `vibriss_hold_substitute_advice`.
+- First consumption point for ADVSL hold substitute: capture-on-entry only.
+- Counterfactual rewards: store as `shadow_counterfactual` with explicit
+  replay artifact path and no causal-propensity claim.
+- Drift ownership: VIBRISS computes policy/reward drift and subscribes to MHS,
+  MARAS, OBF, and SurvivalStack for external drift/context.
+- Container launch: use a small wrapper script under supervisord in production
+  so image existence, disk space, mount health, and env are checked before
+  `podman run` or `docker run`.
+- MHS integration: prefer a generic external-sensor loader eventually, but V1
+  may implement a VIBRISS-specific optional sensor as long as it is neutral when
+  disabled.
+- Infrastructure posture: keep Hazelcast + ClickHouse + supervisord for V1;
+  Kafka/Flink are deferred until measured event volume or recovery requirements
+  exceed the existing bus/audit pattern.
+
+## 23. Open Implementation Questions
+
+- Exact minimum sample thresholds per parameter family after the full 1.7k+
+  trade corpus is rebuilt under the same capital geometry.
+- Whether hard `$400` floors should be a separate ParamSet or remain outside
+  VIBRISS as fixed safety policy.
+- How to measure sub-bar TP/cadence opportunity cost in a way compatible with
+  bar-based ADVSL replay.
+- Whether intratrade hold contraction deserves a second ParamSet after
+  entry-captured hold advice is validated.
+- How much MC/synthetic data is statistically acceptable without overstating
+  confidence in rare-tail regimes.
+- Whether PINK can share BLUE priors after venue slippage, fills, and exchange
+  state are included, or must maintain separate priors from day one.
+
+## 24. Recommended First Build
+
+Build VIBRISS V1 as a shadow-only package with:
+
+- `ParamSpec` dataclasses and YAML loader.
+- `ParamSetSpec` support for `advsl.hold_substitute.v1`.
+- discounted UCB and categorical Thompson learners over finite arms.
+- contextual LinUCB learner stub or implementation.
+- advice publisher.
+- ClickHouse audit writer.
+- MHS-compatible sensor publisher.
+- supervisord/container runner definition.
+- offline replay harness for conditional fast TP and ADVSL hold bars.
+- capital-aware replay and opportunity-cost accounting for the hold substitute.
+- no live actuation.
+
+Recommended package layout:
+
+```text
+/mnt/dolphinng5_predict/vibriss/
+  __init__.py
+  specs.py                 # ParamSpec / ParamSetSpec dataclasses and validation
+  context.py               # HZ/CH context snapshots, masks, point-in-time joins
+  features.py              # deterministic feature construction
+  learners/
+    __init__.py
+    ucb.py                 # discounted UCB over finite arms
+    thompson.py            # categorical Thompson sampling
+    linucb.py              # contextual finite-arm learner
+    priors.py              # MARAS/label/asset/side shrinkage priors
+  guardrails.py            # hard range, freshness, confidence, drawdown gates
+  advice.py                # advice payload builder + schema validation
+  publisher.py             # Hazelcast publication
+  audit.py                 # ClickHouse writer facade and spool priority
+  rewards.py               # delayed reward joining and opportunity cost
+  replay/
+    tape.py                # tape/path loading
+    capital_curve.py       # recursive capital replay
+    counterfactuals.py     # arm-level exit simulation
+    walk_forward.py        # contiguous and moving-window validation
+    reports.py             # JSON/CSV/Parquet artifact writers
+  runner.py                # live shadow/advisory daemon
+  worker.py                # offline subtasks
+  cli.py                   # ops commands and local replay entry points
+  tests/
+```
+
+V1 module responsibilities:
+
+| Module | Must do | Must not do |
+|---|---|---|
+| `specs.py` | validate ranges, modes, required sensors, output surfaces | import live trader code |
+| `context.py` | build point-in-time snapshots with freshness masks | fill missing market data with fake zeros |
+| `features.py` | compute deterministic feature vectors | read future outcome labels |
+| `learners/*` | expose `choose`, `update`, `checkpoint`, `restore` | know about ADVSL internals |
+| `guardrails.py` | enforce hard safety and fallback | optimize reward |
+| `advice.py` | produce schema-valid advice payloads | publish directly to HZ |
+| `publisher.py` | write HZ advice and heartbeat | mutate engine state |
+| `rewards.py` | join decisions to realized/counterfactual outcomes | update policy without reward status |
+| `replay/*` | reproduce capital-aware backtests | depend on live HZ |
+| `runner.py` | run shadow loops and MHS payloads | run full replay inline |
+| `worker.py` | run heavy calibration/replay jobs | publish live advice |
+
+Minimum local commands:
+
+```bash
+python -m vibriss.cli validate-specs \
+  --spec-dir /mnt/dolphin_training/vibriss/specs
+
+python -m vibriss.cli replay \
+  --param-set advsl.hold_substitute.v1 \
+  --namespace blue \
+  --from 2026-05-01 --to 2026-06-04 \
+  --out /mnt/dolphin_training/vibriss/replays/manual
+
+python -m vibriss.runner \
+  --mode shadow \
+  --namespace blue \
+  --spec-dir /mnt/dolphin_training/vibriss/specs \
+  --state-dir /mnt/dolphin_training/vibriss/checkpoints
+```
+
+Minimum test set:
+
+| Test | Purpose |
+|---|---|
+| `test_spec_validation.py` | rejects invalid ranges, missing sensors, unsafe live policies. |
+| `test_advice_schema.py` | validates HZ payloads and expiry/fallback fields. |
+| `test_guardrails.py` | proves stale OBF/MARAS and drawdown alarms force fallback. |
+| `test_replay_determinism.py` | same tape/spec/seed gives same capital curve. |
+| `test_opportunity_cost.py` | recovered cut trades charge missed upside. |
+| `test_priority_spool.py` | high-priority decision/reward rows flush before diagnostics. |
+| `test_mode_state_machine.py` | promotion is manual; demotion is automatic. |
+| `test_no_live_actuation_default.py` | default env cannot make engine consume advice. |
+
+The first acceptance test is not "did it make more money in-sample." The first
+acceptance test is:
+
+1. the same historical decision can be replayed deterministically,
+2. every recommended parameter has a valid spec and guardrail trail,
+3. baseline fallback is used under stale/low-confidence context,
+4. reward accounting includes clipped-winner opportunity cost,
+5. the replayed capital curve is reproducible.
+
+The first useful artifact is a replay bundle, not a daemon:
+
+```text
+replay_summary.json
+capital_curve.csv
+per_trade_counterfactuals.parquet
+opportunity_cost_audit.parquet
+maras_hash_hold_priors.parquet
+obf_hold_binding_report.json
+walk_forward_summary.json
+```
+
+Only after that bundle is reproducible should the shadow runner be started.