docs: VIBRISS spec (+ §10.6 cascade/adaptive-TP paramsets), PINK accounting fix spec, BLUE incident docs

VIBRISS_PARAMETER_GOVERNANCE_SPEC §10.6: ob_cascade.count_threshold (currently cascade_count>0 = ONE asset widens every TP x1.40), tp_widen_factor, withdrawal_velocity_threshold as governance candidates; adaptive/Dynamic-TP threshold marked fit for VIBRISS governance; TP_FLOOR joint-policy reward requirement. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-12 15:04:15 +02:00
parent f4ff1cd9b7
commit c3a18f693a
4 changed files with 3653 additions and 0 deletions
--- a/prod/docs/CRITICAL_VIOLET_DESIGN__BLUE_HYDRATION_BUG.md
+++ b/prod/docs/CRITICAL_VIOLET_DESIGN__BLUE_HYDRATION_BUG.md
@@ -0,0 +1,182 @@
+# Critical Violet Design: BLUE hydration bug
+
+Date: 2026-06-11
+
+## Summary
+
+This incident is a BLUE hydration / restore bug on the XTZUSDT short trade `863c21da`.
+
+The important facts are:
+
+1. The XTZ trade was real and opened at `2026-06-11 17:22:12.678265+00:00`.
+2. The trade did **not** close via TP, SL, or MAX_HOLD before hydration.
+3. The restore path later rebuilt the open slot from `position_state` and `trade_reconstruction`.
+4. The restored state had a chain-token mismatch, but the engine continued with the derived token instead of hard-failing.
+5. A later hydrate-time stop was recorded at `2026-06-11 18:35:52.789008+00:00` with `STOP_LOSS`.
+6. The ledger shows the next trade was admitted while XTZ was still officially open, which violates the single-slot invariant.
+
+## Trade identity
+
+- trade_id: `863c21da`
+- asset: `XTZUSDT`
+- side: `SHORT`
+- entry price: `0.2276`
+- entry notional: `56484.4305702418`
+- leverage: `6.374647927191287`
+- entry bar: `238`
+- tp_base_pct: `0.002`
+- tp_effective_pct: `0.0019999655500463724`
+
+## Ledger evidence
+
+### Open record
+
+`dolphin.trade_reconstruction` contains the canonical open record:
+
+- ts: `2026-06-11 17:22:12.911989`
+- event_type: `OPEN`
+- event_id: `863c21da:open`
+- chain_token: `26852fa25fb5cdaa3b4c354d5e3eea93e27bce0ebdcd0da896d4f981642eeeb2`
+
+The payload confirms:
+
+- `entry_ts = 1781198532678265`
+- `entry_bar = 238`
+- `retraction_legs = 0`
+- `realized_pnl_legs_total = 0.0`
+- `chain_mode = LIVE`
+- `chain_kind = ROOT`
+
+### No close before hydrate
+
+`dolphin.trade_exit_legs` has no rows for `863c21da`.
+
+`dolphin.trade_events` also has no close row for `863c21da`.
+
+So there is no official TP, SL, or MAX_HOLD exit recorded before the restore/hydration event.
+
+### Decision tape before hydrate
+
+`dolphin.v7_decision_events` shows the trade was live and being evaluated:
+
+- `2026-06-11 17:22:13.274556` `HOLD`
+- `2026-06-11 17:22:23.124863` `HOLD`
+- `2026-06-11 17:22:45.232894` `HOLD`
+- `2026-06-11 17:23:28.274004` `HOLD`
+- `2026-06-11 17:24:43.182413` `RETRACT / V7_RISK_DOMINANT`
+
+The best favorable excursion in the pre-hydrate tape was only about `+0.065905094%`, which is far below the fixed `0.2%` TP threshold (`tp_base_pct = 0.002`).
+
+## Restore / hydration behavior
+
+At restore time the engine logged:
+
+- `chain token mismatch on restore: trade=863c21da stored=26852fa25fb5 derived=98875e225e9e — continuing with derived token`
+- `position_state RESTORED: XTZUSDT SHORT entry=0.2276 notional=56484 bars_held≈0 trade=863c21da`
+
+The restore path in [`prod/nautilus_event_trader.py`](../nautilus_event_trader.py) does the following:
+
+- reads `position_state`
+- reconstructs `restored_entry_bar = max(0, self.bar_idx - stored_bars)`
+- loads reconstruction data from `dolphin.trade_reconstruction`
+- rebuilds chain state from the persisted payload
+- if the stored chain token differs from the derived token, it logs the mismatch and continues with the derived token
+
+Relevant code:
+
+- `_chain_state_from_reconstruction(...)` around lines `3315-3348`
+- restore from `position_state` around lines `1944-2058`
+
+The chain-token check is therefore a validator that only logs, not a hard guardrail.
+
+## Single-slot violation
+
+The next distinct open trade in the reconstruction ledger is:
+
+- ts: `2026-06-11 17:50:50.420620`
+- trade_id: `43494ade`
+- asset: `TRXUSDT`
+- side: `SHORT`
+
+That means the system admitted a new trade while XTZ was still officially open in the ledger.
+
+On a single-slot engine, that should not happen.
+
+## What would have happened without hydration
+
+This is the conservative conclusion from the tape:
+
+- The trade did not hit TP on the observed pre-hydrate tape.
+- The trade did not have an official close row before hydration.
+- The tape does not contain a clean uninterrupted decision path beyond the first pre-hydrate window.
+
+The best-supported natural outcome from the observed tape is the live `RETRACT` state at `2026-06-11 17:24:43.182413`, where the engine still considered the slot active and the trade had only reached `bars_held = 14`.
+
+At that point:
+
+- `current_price = 0.22765000000000002`
+- `pnl_pct = -0.021968365`
+- `reason = V7_RISK_DOMINANT`
+
+If that retract state had been executed immediately, the estimated trade PnL would have been:
+
+- `-12.4087058758423` USDT on the recorded notional
+- trade ROI: `-0.021968365%`
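The retract estimate above can be reproduced with a few lines of arithmetic. This is a minimal sketch, assuming that `pnl_pct` is expressed in percent units (so `-0.021968365` means roughly -0.022%) and that PnL is simply notional times the fractional move; the variable names are illustrative and are not taken from the engine.

```python
# Recorded values from the 17:24:43 RETRACT row and the trade identity block.
entry_price = 0.2276
current_price = 0.22765000000000002
entry_notional = 56484.4305702418
recorded_pnl_pct = -0.021968365  # ASSUMPTION: percent units, not a fraction

# A short loses when price rises, so the sign is (entry - current).
derived_pnl_pct = (entry_price - current_price) / entry_price * 100.0

# Estimated PnL in USDT if the retract had executed at current_price.
estimated_pnl = entry_notional * recorded_pnl_pct / 100.0

print(f"derived pnl_pct = {derived_pnl_pct:.9f}%")
print(f"estimated pnl   = {estimated_pnl:.6f} USDT")
```

The price-derived percentage agrees with the recorded `pnl_pct` to the precision shown in the tape, which supports reading the field as a percentage.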
+
+The max-hold clock also would have forced a decision long before the 18:35 restore:
+
+- trade-specific `market_state_max_hold_bars = 102`
+- live tape reached `bars_held = 14` by `17:24:43`
+- at an ~11 second cadence, the max-hold boundary would have arrived around `17:40-17:41`
+
+So the 18:35 stop-loss is not the natural continuation of the original entry. It is a restore-time artifact on top of a stale open slot.
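The max-hold timing can be checked the same way. The sketch below assumes that `bars_held` counts from zero at the entry timestamp and that the bar cadence stays at its observed average; both are assumptions of this illustration, not confirmed engine behavior.

```python
from datetime import datetime, timedelta

entry_ts = datetime.fromisoformat("2026-06-11 17:22:12.678265")
retract_ts = datetime.fromisoformat("2026-06-11 17:24:43.182413")
restore_stop_ts = datetime.fromisoformat("2026-06-11 18:35:52.789008")
bars_at_retract = 14
max_hold_bars = 102

# Average bar cadence observed between entry and the RETRACT decision.
cadence_s = (retract_ts - entry_ts).total_seconds() / bars_at_retract

# Project the remaining bars forward from the last observed decision.
remaining_bars = max_hold_bars - bars_at_retract
max_hold_eta = retract_ts + timedelta(seconds=remaining_bars * cadence_s)

print(f"cadence ~ {cadence_s:.2f}s, max-hold ETA ~ {max_hold_eta:%H:%M:%S}")
print(f"restore stop came {restore_stop_ts - max_hold_eta} after that boundary")
```

Under these assumptions the boundary lands inside the 17:40 minute, close to an hour before the booked stop.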
+
+What is observable is the hydrated-path close that actually got booked:
+
+- exit ts: `2026-06-11 18:35:52.789008+00:00`
+- exit reason: `STOP_LOSS`
+- exit price: `0.23526757499999998`
+- realized pnl_pct: `-0.033056485743551446`
+- realized net_pnl: `-1913.155101369921`
+
+That realized stop corresponds to:
+
+- price move against the short of about `3.3056%`
+- account-level ROI of about `-2.726636%` using capital before exit (`70165.39`)
+
+## Root cause
+
+The bug is the restore path itself:
+
+1. The open trade state was preserved in `trade_reconstruction`.
+2. The current `position_state` snapshot was lossy or stale enough to rehydrate with `bars_held≈0`.
+3. The chain token mismatch was detected, but the code explicitly continues with the derived token.
+4. The engine therefore recovered continuity without enforcing strict equality between the live open chain and the reconstructed state.
+
+That combination makes orphaned trades possible after a bad hydrate.
+
+## Operational impact
+
+- The XTZ short remained open in the ledger with no formal close.
+- The engine later allowed a new trade while the slot should still have been occupied.
+- Capital accounting diverged from the true live slot history.
+- The restore path masked the inconsistency instead of stopping the recovery.
+
+## Recommended fix direction
+
+1. Treat a chain-token mismatch on restore as a hard failure for BLUE when a live open slot exists.
+2. Preserve the original `entry_bar` and bar counter from the open-chain payload instead of reconstructing them from the current `position_state` row when the two disagree materially.
+3. Refuse to admit a new trade until the single-slot invariant is proven flat.
+4. Add a regression test for:
+   - open XTZ trade
+   - stale `position_state`
+   - chain-token mismatch
+   - no new trade admission while the open slot remains unresolved
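Fix directions 1 and 3 could be sketched as follows. Every name here (`RestoreHalt`, `RestoreCandidate`, `check_chain_token`, `may_admit_new_trade`) is hypothetical and does not come from `nautilus_event_trader.py`; the sketch only illustrates turning the logged mismatch into a hard stop when a live slot exists.

```python
from dataclasses import dataclass


class RestoreHalt(RuntimeError):
    """Raised when restored state cannot be trusted (hypothetical name)."""


@dataclass(frozen=True)
class RestoreCandidate:
    trade_id: str
    stored_token: str
    derived_token: str
    has_open_slot: bool


def check_chain_token(c: RestoreCandidate) -> str:
    """Return the token to continue with, or halt on drift for an open slot."""
    if c.stored_token == c.derived_token:
        return c.stored_token
    if c.has_open_slot:
        # Fix direction 1: a mismatch on a live open slot is a hard failure.
        raise RestoreHalt(
            f"chain token mismatch on restore: trade={c.trade_id} "
            f"stored={c.stored_token[:12]} derived={c.derived_token[:12]}"
        )
    # No open slot: this sketch keeps the existing log-and-continue behavior.
    return c.derived_token


def may_admit_new_trade(open_trade_ids: set) -> bool:
    """Fix direction 3: a single-slot engine admits only when provably flat."""
    return len(open_trade_ids) == 0
```

A regression test along the lines of item 4 would build a candidate with the XTZ tokens from the log, assert that `RestoreHalt` is raised, and assert that admission stays blocked while `863c21da` is unresolved.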
+
+## Bottom line
+
+XTZ was a real open trade.
+It never got a clean pre-hydrate exit.
+The restore path tolerated chain drift and rebuilt a misleading open state.
+The best-supported no-freeze outcome is the 17:24 retract, roughly flat to slightly negative.
+The realized hydrated-path loss was `-3.3056485743551446%` on the position and `-2.726636%` of capital before exit, but that is a restore artifact, not the natural end of the original trade.
--- a/prod/docs/MALFORMED_OPEN_RESTORE_BUG.md
+++ b/prod/docs/MALFORMED_OPEN_RESTORE_BUG.md
@@ -0,0 +1,131 @@
+# MALFORMED_OPEN_RESTORE_BUG
+
+## Summary
+
+BLUE was repeatedly rehydrating after startup because `dolphin.position_state` contained stale `OPEN` rows with zero effective size.
+
+The restore path treated those rows as fatal:
+
+- it selected the latest `OPEN` row per `trade_id`
+- it accepted that row even when `quantity` or `notional` had been driven to `0`
+- it hard-stopped on `position_state row invalid quantity ...`
+- `supervisord` then restarted the trader
+- the next startup read the same bad row again
+
+That created a restart loop.
+
+This was observed most clearly on the `2026-06-11` BLUE window. The recurring bad row was the legacy `ATOMUSDT` leg `1a3d2f9c`, which was persisted as:
+
+- `status = OPEN`
+- `quantity = 0`
+- `notional = 0`
+- `bars_held = 34`
+
+That row is not a live position. It is a stale snapshot that should have been treated as tombstoned history.
+
+## Root Cause
+
+The bad rows were self-inflicted by the partial-retract path in `nautilus_event_trader.py`.
+
+Before the fix:
+
+1. `_apply_internal_retract()` shrank the live position.
+2. It wrote a new `position_state` row with `status="OPEN"` for the remaining leg.
+3. If the remaining size rounded to zero, the row still existed as an `OPEN` snapshot.
+4. A later startup restore could pick that row and treat it as authoritative.
+
+That is enough to leave behind `OPEN` rows with:
+
+- `quantity = 0`
+- `notional = 0`
+
+These are not valid live positions, but the old restore logic treated them as if they were.
+
+There is a second contributing factor in the restore path:
+
+- the restore code historically trusted the latest `OPEN` candidate too early
+- zero-sized `OPEN` rows were only rejected after the row had already been chosen as the best candidate
+- rejection used a hard failure path, which made the process exit instead of trying the next sane source
+
+That means the persistence bug and the restore policy bug reinforced each other.
+
+## Observable Symptoms
+
+- repeated `restore candidate parse failed from capital_update_ledger: 'list' object has no attribute 'get'`
+- repeated `position_state row invalid quantity for trade ...: 0.0`
+- `RESTORE HALT`
+- immediate restart by `supervisord`
+
+The chain-token mismatch logs were a separate warning. They were not the restart trigger.
+
+The capital-ledger parse warning is also distinct:
+
+- it indicates the ledger file is list-shaped, not a dict
+- it forces restore to rely more heavily on the other state surfaces
+- it is noisy, but it is not what actually killed the process in this incident
+
+## Fix Applied
+
+Three changes were made.
+
+### 1. Stop writing zero-sized `OPEN` rows
+
+In `_apply_internal_retract()`:
+
+- compute `remaining_qty`
+- if the remaining size is effectively zero, treat the retract as a full close
+- return the forced exit without emitting a new `position_state` row with `status="OPEN"`
+
+This prevents the bad row from being created in the first place.
+
+### 2. Make restore skip legacy bad `OPEN` rows
+
+In `_restore_position_state()`:
+
+- the ClickHouse restore query now filters `OPEN` rows with `quantity > 0 AND notional > 0`
+- if an invalid candidate still appears, restore logs and rejects it instead of hard-halting the process
+- restore falls back to HZ state or flat continuation rather than turning a stale row into a restart loop
+
+This is important because the repository already contains stale history. The fix is not only to stop producing new malformed rows; it also has to prevent old rows from re-triggering the same failure path on the next reboot.
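The restore-side policy can be illustrated with a small filter. This is a sketch under assumptions: rows are plain dicts, candidates arrive newest first, and the function names are hypothetical rather than the real `_restore_position_state()` internals.

```python
import logging
import math
from typing import Iterable, Optional

log = logging.getLogger("restore_sketch")


def is_live_open_row(row: dict) -> bool:
    """True only for OPEN rows with strictly positive, finite size."""
    if row.get("status") != "OPEN":
        return False
    try:
        qty = float(row.get("quantity", 0.0))
        notional = float(row.get("notional", 0.0))
    except (TypeError, ValueError):
        return False
    return math.isfinite(qty) and math.isfinite(notional) and qty > 0 and notional > 0


def pick_restore_candidate(rows: Iterable[dict]) -> Optional[dict]:
    """Return the first sane OPEN row; None tells the caller to fall back."""
    for row in rows:
        if is_live_open_row(row):
            return row
        # Reject and keep scanning instead of halting the process.
        log.warning("skipping malformed OPEN row for trade %s", row.get("trade_id"))
    return None
```

Returning `None` rather than raising is what breaks the restart loop: the caller can move on to HZ state or a flat continuation.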
+
+### 3. Keep the full-close path coherent
+
+The retract path now computes `remaining_qty` explicitly and treats `remaining_notional <= 1e-9` or `remaining_qty <= 0.0` as a full close.
+
+That means:
+
+- a full retract does not leave a zero-size `OPEN` snapshot behind
+- the exit is finalized as a close, not as a pseudo-open partial state
+- the runtime slot is removed cleanly instead of being left in a half-closed limbo
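The full-close classification can be sketched as below. It assumes notional is `quantity × price` and uses a hypothetical helper `split_retract`; only the two thresholds (`remaining_notional <= 1e-9`, `remaining_qty <= 0.0`) come from the text.

```python
FULL_CLOSE_NOTIONAL_EPS = 1e-9  # threshold quoted in the text


def split_retract(qty: float, price: float, retract_qty: float) -> dict:
    """Classify a retract as a partial reduction or a full close."""
    remaining_qty = qty - retract_qty
    remaining_notional = remaining_qty * price  # ASSUMPTION: notional = qty * price
    if remaining_notional <= FULL_CLOSE_NOTIONAL_EPS or remaining_qty <= 0.0:
        # Full close: no position_state row with status="OPEN" is produced.
        return {"action": "CLOSE", "remaining_qty": 0.0, "open_row": None}
    open_row = {
        "status": "OPEN",
        "quantity": remaining_qty,
        "notional": remaining_notional,
    }
    return {"action": "PARTIAL", "remaining_qty": remaining_qty, "open_row": open_row}
```

With this shape, a zero-size `OPEN` row cannot be built at all, because the only code path that constructs an `OPEN` row runs after both thresholds have been passed.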
+
+## Verification Added
+
+Regression tests were added for both sides:
+
+- full-close retracts no longer emit zero-sized `OPEN` rows
+- restore skips zero-sized `OPEN` candidates without setting `restore_failed`
+
+The tests use the existing retract and restore harnesses:
+
+- one test seeds a tiny short leg that collapses to zero on retract and asserts no `OPEN` zero-size row is written
+- one test feeds a zero-sized `OPEN` `position_state` row into restore and asserts restore does not hard-halt
+
+## Operational Impact
+
+After this fix:
+
+- stale zero-sized `OPEN` rows no longer restart BLUE
+- malformed open snapshots are quarantined as legacy garbage
+- the live runtime can continue from a sane source instead of bouncing on the same bad record
+
+## What This Does Not Fix
+
+This change does not rewrite historical ClickHouse rows already present in the warehouse.
+
+It only changes:
+
+- new retract writes
+- restore selection and rejection policy
+- restart behavior when the old garbage is encountered
+
+If you want the historical ledger cleaned up, that is a separate reconciliation task. The current patch is intentionally conservative and only stops the bad row from causing further damage.
--- a/prod/docs/PINK_ACCOUNTING_EXEC_FIX.md
+++ b/prod/docs/PINK_ACCOUNTING_EXEC_FIX.md
@@ -0,0 +1,362 @@
+# PINK / DITAv2 Accounting & Execution Fix — Spec and Dev Guide
+
+**Status**: SPEC — ready for implementation agent
+**Date**: 2026-06-11
+**Branch**: `exp/pink-ditav2-sprint0-20260530` (continue on it or fork `fix/pink-accounting-consolidation`)
+**Author of spec**: forensic session 2026-06-11 (FET −$5,990.90 mis-book replay)
+**Prerequisite for**: VIOLET rebuild (`violet_subsecond_rebuild_plan` memory / future plan session)
+
+---
+
+## 0. Why this exists — the incident in one paragraph
+
+On 2026-06-11 PINK closed a FET-USDT short that the exchange settled at
+≈ **+$164 net** (entry VWAP 0.1878, exit 0.1866, ~202K FET) but the kernel
+booked **−$5,990.90** and capital diverged −$6,154 from the exchange wallet.
+Replay against `dolphin_pink.trade_reconstruction` slot images identified
+three stacked defects, all in *derivation* code (none in exchange facts):
+(1) fill events carried BingX's MARKET **protective bound price** (0.229,
+22% off tape) instead of the true fill price; (2) `realized_pnl()` and
+`mark_price()` multiplied PnL by `slot.leverage` (exchange leverage — but
+`slot.size` is exchange *quantity*, so every leg was 3× inflated); (3) the
+Python settle baseline `_last_settled_pnl` resets empty on every restart,
+so reconcile-adopted slots re-settle carried PnL. Exact replay of leg 1:
+`−26,007 × (0.229−0.1878)/0.1878 × 0.1878 × 3 = −3,214.4652` (side-signed for the short) ✓ matches the
+booked increment to the cent.
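The leg-1 replay is easy to reproduce. The sketch below recomputes the booked figure from the poisoned bound price with the 3x factor, then the leverage-free figure on the true exit price; it is plain arithmetic on the numbers quoted above, not the kernel code.

```python
qty = 26_007          # FET closed in exit leg 1
entry = 0.1878        # entry VWAP
bound_price = 0.229   # BingX MARKET protective bound, not a real fill
true_exit = 0.1866    # actual exit price
exchange_leverage = 3

# Defective path: percent move times notional, then multiplied by leverage.
pct_move = (bound_price - entry) / entry
booked = -qty * pct_move * entry * exchange_leverage  # short: rising price loses

# Leverage-free path on the true fill: qty x delta price, side-signed for a short.
clean = qty * (entry - true_exit)

print(f"booked (poisoned price, 3x) = {booked:.4f}")
print(f"clean leverage-free leg     = {clean:.4f}")
```

The two defects compound: the bound price flips a small gain into a loss, and the leverage factor then triples it.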
+
+A fourth structural finding: there are **three parallel ledgers** (Rust
+`AccountState` K/E, Python `AccountProjection` — the one persistence reads,
+fee-blind — and `AccountProjectionV2`, dead in the live path). This spec
+consolidates to **E-facts as ledger of record + K as integrity checksum +
+one atomic published snapshot**.
+
+---
+
+## 1. Scope and non-goals
+
+IN SCOPE
+1. Commit + activate the Phase-0 fixes already in the working tree.
+2. E-anchored published capital; single atomic account snapshot.
+3. Per-trade PnL provenance (`exchange | kernel_estimate`) end-to-end.
+4. Sizer feedback off trade-realized PnL (not capital deltas).
+5. Persistence hygiene: duplicate row emission, silent async-insert loss,
+   `event_seq` stamping, `bars_held` clamp, naive-UTC timestamps.
+6. Kernel hardening leftovers: `resolve_slot` no-match sentinel,
+   FILL_SETTLED realized override of flagged estimate legs.
+
+OUT OF SCOPE (separate tickets)
+- BLUE's exit-path masking bug (LINK −$1,248, `TODO_TP_SCAN_CADENCE_BUGFIX.md`) — BLUE stack, not DITAv2.
+- VIOLET fork, sub-second clock, venue price-feed port, cadence quantizer.
+- ch_writer head-of-line poison-row parking redesign (mitigations land here;
+  the full parking-lane design is its own task).
+- prefect.db / ClickHouse TTL disk remediation.
+
+HARD INVARIANTS — MUST NOT CHANGE
+- **Dual leverage**: `slot.size` = exchange quantity; `slot.leverage` =
+  exchange leverage (1–3x cap, set at BingX API); *our*-leverage
+  (conviction) = `size × entry_price / capital`, computed only at
+  `pink_direct._hz_publish` (line ~911). PnL is therefore **leverage-free**:
+  `qty × Δprice`, side-signed. Do not touch the conviction→exchange mapping
+  (`round_half_even_linear_0.5_to_9.0_to_1_to_exchange_cap`) or
+  `target_size` computation.
+- **Exits are never skipped** (exec-router invariant set, §16 kernel ref).
+- **BLUE-parity policy contract**: `DecisionEngine`/`IntentEngine` inputs
+  (MarketSnapshot + capital + slot state) unchanged in shape.
+- **Namespace isolation**: zero writes to `dolphin.*` / `dolphin_prodgreen.*`
+  or BLUE/PRODGREEN HZ maps. Re-verify with `pink_ctl.py mode-verify`.
+- **Data cadences are sacred** (operator rule 2026-06-10): never reduce a
+  data cadence for throughput.
+
+---
+
+## 2. Phase 0 — Commit and activate the already-applied fixes
+
+These changes exist UNCOMMITTED in the working tree as of 2026-06-11 ~16:30.
+Verify each hunk, commit as one reviewed unit, then restart `dolphin_pink`.
+
+### 0.1 `prod/clean_arch/dita_v2/_rust_kernel/src/lib.rs`
+| Function | Change (already applied) |
+|---|---|
+| `KernelCore::realized_pnl` (~line 1153) | PnL = side-signed `qty × (exit − entry)`; **no leverage factor**; returns 0 when `entry<=0 ∨ exit_size<=0 ∨ exit_price<=0 ∨ !finite` |
+| `TradeSlot::mark_price` (~line 394) | no `× leverage` in unrealized; a mark NEVER becomes entry basis — missing basis flags `metadata.entry_basis_missing=true`, unrealized stays 0 |
+| `KernelCore::fill_matches_order` (new) | identity match on `venue_order_id` / `venue_client_id` |
+| `KernelCore::apply_fill` | entry/exit routing by ORDER IDENTITY first, FSM state second (`!id_matches_exit` / `!id_matches_entry` guards); entry basis = **VWAP across entry fills** (`(prev_basis×prev_filled + price×fill)/accumulated`); price-less exit fill reduces size, books 0 PnL, flags `metadata.realized_skipped_no_price=true` |
+
+Rebuild required: `cargo build --release` in `_rust_kernel/` (the `.so` is
+only auto-built when missing — **source/binary drift is a known hazard**;
+add the build to the commit checklist). `cargo test`: 32/32 green as of spec.
+
+### 0.2 `prod/clean_arch/dita_v2/bingx_venue.py`
+Fill events must carry a TRUE fill price or 0.0 — never the order's nominal
+`price` / submit `receipt.price` (BingX MARKET bound price, ±20–25%):
+- `_events_from_submit` fill event (~line 585): `_row_float(ack_row,
+  "avgPrice","ap","lastFillPrice","L", default=0.0)`
+- `_event_from_row` (~line 697): fills use the same true-price chain;
+  non-fill events (ACK/CANCEL/REJECT) may keep nominal `price` as info
+- `_fill_event_from_row` (~line 736): `"lastFillPrice","L","avgPrice","ap"`
+
+### 0.3 `prod/clean_arch/dita_v2/rust_backend.py`
+- `reconcile_from_slots`: seeds `_last_settled_pnl[slot_id] = slot.realized_pnl`
+  and `_slot_was_closed[slot_id] = slot.closed` for every adopted slot.
+- `restore_state`: same re-anchoring after successful restore.
+
+### 0.4 Adjacent fixes riding the same commit
+- `prod/ch_writer.py`: insert URLs append `&date_time_input_format=best_effort`;
+  flush errors log at WARNING (first 10 + every 100th), counter `_flush_errors`.
+- `prod/clean_arch/dita_v2/blue_parity.py` `price_of`: hyphen-tolerant
+  fallback (`FET-USDT` → `FETUSDT`) — fixes the unmanaged-position block.
+- `prod/clickhouse/users.xml`: `date_time_input_format=best_effort` for the
+  `dolphin` user (NOTE: running CH container did not honor it even after
+  restart — the container does not mount compose configs; effective on next
+  compose recreation. The client-side URL param is the operative fix.)
+- `prod/tests/test_dita_v2_kernel.py`: partial→full fill test updated to
+  incremental `filled_size` semantics (BingX WS `lastFilledQty`).
+
+### 0.5 Phase 0 gates
+1. `cargo test` in `_rust_kernel`: 32/32.
+2. `pytest prod/tests/test_dita_v2_kernel.py`: 7/7.
+3. `pytest prod/clean_arch/dita_v2/test_exec_router_runtime.py
+   test_venue_reconcile.py test_orphan_prevention.py
+   prod/tests/test_pink_async_fill_pump.py
+   prod/clean_arch/dita_v2/test_account_core_v2.py test_bingx_bugs.py`: 134/134.
+4. KNOWN pre-existing failures (NOT introduced by this work — verified by
+   hunk-revert): 4 tests in `prod/tests/test_dita_v2_bingx_adapter.py`
+   (snapshot-fill emission broke when sync `submit()` started passing None
+   snapshots on 2026-06-10). Fix or quarantine them explicitly in this phase
+   — do not let them mask new regressions.
+5. Restart `dolphin_pink` at a FLAT moment; verify in logs: no
+   `realized_skipped_no_price` storms, no `entry_basis_missing` on fresh
+   entries, first round-trip books PnL within ±(fees+slippage) of
+   `GET /openApi/swap/v2/user/income` for the same trade.
+
+---
+
+## 3. Phase 1 — E-anchored published capital
+
+**Goal**: the capital that persistence/HZ/sizer see is exchange-anchored;
+K never publishes.
+
+### 3.1 `prod/clean_arch/dita_v2/account.py`
+- Add to `AccountSnapshot`: `capital_source: str` (`"e_anchored" |
+  "k_bridged" | "seed"`), `e_wallet_balance: float`, `event_seq: int`.
+- New method `AccountProjection.anchor_to_exchange(wallet_balance: float,
+  available_margin: float, event_seq: int)`: sets `capital = wallet_balance`
+  (guard `>0` and finite — the zero-wb frame lesson), `capital_source =
+  "e_anchored"`, recomputes equity. `settle()` remains for the BRIDGE case
+  only: between anchors, capital += realized (`capital_source="k_bridged"`).
+- `settle(realized_pnl, fees)`: **stop ignoring fees** — `capital +=
+  realized_pnl − fees` (today fees only accumulate in `fees_paid`; published
+  capital ignores them between reseeds).
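A minimal sketch of the anchoring contract described in 3.1 follows. The class name `AccountProjectionSketch` is hypothetical, the field set is reduced, and the equity recomputation is omitted; the method names and `capital_source` values are taken from the bullets above.

```python
import math
from dataclasses import dataclass


@dataclass
class AccountProjectionSketch:
    capital: float = 0.0
    fees_paid: float = 0.0
    capital_source: str = "seed"
    e_wallet_balance: float = 0.0
    event_seq: int = 0

    def anchor_to_exchange(self, wallet_balance: float,
                           available_margin: float, event_seq: int) -> bool:
        """Snap capital to the exchange wallet; reject zero/negative/NaN frames."""
        if not math.isfinite(wallet_balance) or wallet_balance <= 0:
            return False  # keep the previous capital untouched
        # available_margin would feed the equity recomputation (omitted here).
        self.capital = wallet_balance
        self.e_wallet_balance = wallet_balance
        self.capital_source = "e_anchored"
        self.event_seq = event_seq
        return True

    def settle(self, realized_pnl: float, fees: float) -> None:
        """Bridge between anchors: capital moves by realized PnL net of fees."""
        self.capital += realized_pnl - fees
        self.fees_paid += fees
        self.capital_source = "k_bridged"
```

Returning a boolean from `anchor_to_exchange` is a design choice of this sketch so that callers and tests can see a rejected frame; the spec itself only requires the guard.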
+
+### 3.2 `prod/clean_arch/runtime/pink_direct.py`
+- The existing reseed path (balance-bearing ACCOUNT_UPDATE →
+  `kernel.reset_and_seed(wb)`) additionally calls
+  `kernel.account.anchor_to_exchange(...)` — one anchoring action, two
+  ledgers consistent.
+- Boot seed (launcher `exchange_balance_capital` block, pink_direct ~line
+  262) goes through `anchor_to_exchange` instead of direct attribute writes.
+
+### 3.3 Gates
+- New unit tests (`prod/tests/test_pink_account_anchor.py`):
+  anchor sets capital/source; zero/negative/NaN wb rejected; settle bridges
+  with fees; anchor after bridge snaps to wb exactly.
+- Shadow check (live, 24 h on VST): published capital vs
+  `GET /openApi/swap/v2/user/balance` polled 1/min — max |Δ| outside a
+  trade-settlement window ≤ $0.01; during settlement ≤ pending-fee bound.
+
+---
+
+## 4. Phase 2 — Single atomic snapshot, ledger consolidation
+
+**Goal**: one immutable, versioned account snapshot; the two redundant
+ledgers demoted/removed.
+
+### 4.1 `prod/clean_arch/dita_v2/account.py`
+- Make the published snapshot **immutable-replace**: `AccountProjection`
+  builds a new frozen `AccountSnapshot` (carry `event_seq`) on every
+  mutation and swaps a single reference (GIL-atomic). Readers must take
+  `snap = kernel.account.snapshot` once per use (audit call sites:
+  `pink_clickhouse.py`, `hazelcast_projection.py` HZ writer, `pink_direct`).
+- `AccountProjectionV2`: DELETE, or move to `prod/clean_arch/dita_v2/
+  _attic/` with a module docstring pointing here. Its only live-path import
+  is `exchange_event.py` — migrate that import or the dataclasses it uses
+  (`EPosition` is genuinely useful; keep it in `account.py`).
+- The Rust `AccountState` K-ledger STAYS — demoted by documentation and by
+  Phase 1 (it no longer feeds published capital): its jobs are reconcile
+  classification (R1-style), `capital_frozen`, and E-dark bridging. Update
+  the module docstring to say exactly this.
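The immutable-replace pattern from the first bullet can be sketched as follows. `SnapshotHolder` is a hypothetical stand-in for `AccountProjection`, and the snapshot carries only a few illustrative fields; the point is that a reader holding `snap` never sees a half-updated state.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class AccountSnapshot:
    capital: float
    equity: float
    capital_source: str
    event_seq: int


class SnapshotHolder:
    """Publishes snapshots by swapping one reference; never mutates in place."""

    def __init__(self, initial: AccountSnapshot) -> None:
        self.snapshot = initial

    def publish(self, **changes) -> AccountSnapshot:
        current = self.snapshot
        # Build a complete new frozen object, then swap the single reference.
        # event_seq is bumped here, so callers must not pass it in changes.
        new = replace(current, event_seq=current.event_seq + 1, **changes)
        self.snapshot = new
        return new
```

Readers take `snap = holder.snapshot` once and read every field from `snap`; reading `holder.snapshot.capital` and then `holder.snapshot.equity` separately could straddle two versions.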
+
+### 4.2 `prod/clean_arch/persistence/pink_clickhouse.py`
+- Read capital/equity/peak/trade_seq from the single snapshot reference;
+  no recomputation.
+- Add columns to emitted rows (and the matching `ALTER TABLE` DDLs under
+  `prod/clickhouse/pink/08_provenance.sql` — **apply DDLs to CH BEFORE
+  deploying code that emits them**; the missing-table head-of-line jam of
+  2026-06-11 is the cautionary tale):
+  - `account_events`, `status_snapshots`: `capital_source LowCardinality(String) DEFAULT ''`,
+    `account_event_seq UInt64 DEFAULT 0`
+  - `trade_events`, `trade_exit_legs`: `pnl_source LowCardinality(String) DEFAULT ''`
+    (`exchange` | `kernel_estimate`)
+- `bars_held`: clamp to `max(0, …)` at row-build time (UInt16 column;
+  negative values are currently rejected with a 400 on `trade_events` and
+  silently vanish on async tables).
+- Timestamps: route every `ts` through one helper emitting **naive-UTC
+  microsecond ISO** (no `+00:00`) — best_effort already tolerates both, but
+  rows must stop depending on a parser setting.
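The two row-build helpers in the last bullets might look like this. The names `ch_ts` and `clamp_bars_held` are hypothetical, the space separator is a choice of this sketch, and naive inputs are assumed to already be UTC.

```python
from datetime import datetime, timezone


def ch_ts(ts: datetime) -> str:
    """Format a timestamp as naive-UTC ISO with microseconds and no offset."""
    if ts.tzinfo is not None:
        # Convert aware timestamps to UTC, then drop the offset.
        ts = ts.astimezone(timezone.utc).replace(tzinfo=None)
    # ASSUMPTION: a naive input is already in UTC.
    return ts.isoformat(sep=" ", timespec="microseconds")


def clamp_bars_held(bars_held: int) -> int:
    """Clamp negatives to zero so the UInt16 column never sees them."""
    return max(0, int(bars_held))
```

Routing every `ts` through one function means the emitted text no longer contains `+00:00`, so inserts parse the same way with or without `best_effort`.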
+
+### 4.3 Duplicate-emission fix (same file)
+Every CH row is currently emitted twice (visible in any query). Hunt the
+double call: instrument `_sink()` with a per-(table, content-hash) debug
+counter in a test, then trace the two call paths (suspect: `persist_result`
+invoked both from the runtime step and from the fill pump for the same
+event). Fix at the caller level; do NOT dedupe by content in the sink
+(masks real double-events). Regression test: one simulated round trip →
+exactly one row per logical event per table.
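The instrumentation step could use a test double along these lines. `CountingSink` is a hypothetical helper, not the real `_sink()`; it only counts identical emissions per table so a test can assert that none occurs twice.

```python
import hashlib
import json
from collections import Counter


class CountingSink:
    """Test double that counts identical (table, row) emissions."""

    def __init__(self) -> None:
        self.counts: Counter = Counter()

    def __call__(self, table: str, row: dict) -> None:
        # Canonical JSON so that key order does not change the hash.
        payload = json.dumps(row, sort_keys=True, default=str).encode()
        self.counts[(table, hashlib.sha256(payload).hexdigest())] += 1

    def duplicates(self) -> dict:
        """Return only the (table, hash) keys that were emitted more than once."""
        return {key: n for key, n in self.counts.items() if n > 1}
```

In the regression test, the sink replaces the real writer for one simulated round trip and the assertion is simply `sink.duplicates() == {}`. The counter lives in the test only, which keeps the production sink free of content-based dedup.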
+
+### 4.4 `prod/ch_writer.py`
+- `wait_for_async_insert`: `"1"` for ALL `dolphin_pink` tables (accounting
+  rows must never be silently lost; the spool absorbs latency). `0` is
+  acceptable only for high-volume shadow tables, and only if measurement
+  shows it is necessary; document any exception inline.
+- Mitigation for head-of-line (full redesign out of scope): after
+  `attempts > 1000` on a row, log ERROR with the CH response body once per
+  100 attempts (today the reject reason is invisible without manual replay).
+
+### 4.5 Gates
+- Full offline suite (the 533+ DITAv2/PINK set) green, minus the Phase-0
+  quarantined adapter tests if still open.
+- One live VST round trip: every table gets exactly one row per event;
+  `pnl_source`/`capital_source` populated; CH `system.text_log` shows zero
+  parse rejections for `dolphin_pink`.
+
+---
+
+## 5. Phase 3 — Sizer feedback off trade-realized PnL
+
+**THE one seam where this refactor can silently change alpha behavior.**
+
+### 5.1 `prod/clean_arch/runtime/pink_direct.py` — `_sizer_trade_feedback` (~line 1453)
+Today: `pnl = acc.capital − self._sizer_entry_capital` (capital delta).
+Under E-anchored capital this absorbs funding, fees of other activity, and
+**foreign fills from the shared VST account** (PRODGREEN collision class).
+Change to:
+```
+pnl = slot_realized_for_trade(trade_id)   # Σ slot.realized_pnl legs, i.e.
+                                          # kernel estimate, overridden by
+                                          # exchange rp when settled (5.2)
+```
+Source: the closing slot dict already carries `realized_pnl`; use it (minus
+the fees recorded for the trade when available) instead of the capital
+delta. Keep the magnitude semantics the sizer expects (sign + rough size —
+per the existing comment, bucket/streak multipliers only need that).
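The difference between the two feedback sources can be shown with a small sketch. `trade_feedback_pnl` is a hypothetical helper, and all numbers below are made up for illustration; it assumes each closing leg is a dict carrying `realized_pnl`.

```python
from typing import Iterable, Optional


def trade_feedback_pnl(legs: Iterable[dict], fees: Optional[float] = None) -> float:
    """Sum realized PnL over a trade's legs, net of fees when they are known."""
    realized = sum(float(leg.get("realized_pnl", 0.0)) for leg in legs)
    return realized - fees if fees is not None else realized


# Illustrative numbers only: two exit legs and the fees recorded for the trade.
legs = [{"realized_pnl": 10.0}, {"realized_pnl": 5.0}]
trade_pnl = trade_feedback_pnl(legs, fees=1.5)

# A foreign fill on the shared account moves capital mid-trade.
entry_capital = 10_000.0
foreign_fill_effect = 500.0
capital_now = entry_capital + trade_pnl + foreign_fill_effect
capital_delta = capital_now - entry_capital

print(f"trade-realized feedback = {trade_pnl}")
print(f"capital-delta feedback  = {capital_delta}")
```

The capital delta absorbs the foreign fill while the trade-realized figure does not, which is the behavior the first gate in 5.3 is meant to pin down.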
+
+### 5.2 Exchange override (E-led repair) — `bingx_user_stream.py` + `rust_backend.py`
+- The WS `FILL_SETTLED` path already carries the exchange's realized (`rp`)
+  and fee (`n`, sign-flipped at boundary per BingX quirks memory). Extend
+  the kernel account-event payload with `trade_id`, and on receipt:
+  - if the matching slot leg was flagged `realized_skipped_no_price`,
+    ADD the exchange realized to `slot.realized_pnl` (repair) and clear
+    the flag; settle the increment through the normal baseline mechanism;
+  - else record `pnl_source="exchange"` for the trade-event row (the
+    estimate stays as the booked figure unless |estimate−rp| exceeds a
+    tolerance — then log ERROR + emit an `anomaly_events` row; do NOT
+    silently re-book).
+- Rust: add `dita_kernel_repair_realized(slot_id, amount)` FFI (or fold the
+  repair into `on_account_event` with `slot_id` in payload). Keep it
+  idempotent via the existing account-event dedup.
+
+### 5.3 Gates
+- Unit: feedback receives trade-realized, not capital delta (simulate a
+  foreign-fill capital jump mid-trade → feedback unaffected).
+- Unit: price-less exit leg + later FILL_SETTLED repair → slot realized
+  equals exchange `rp`; settle baseline consistent (no double-settle).
+- Parity: `test_blue_parity.py`, `test_alpha_blue_untouched_g7.py` green
+  (sizer behavior unchanged for normal fills).
+
+---
+
+## 6. Phase 4 — Kernel hardening leftovers
+
+### 6.1 `lib.rs` — `resolve_slot` (~line 1099)
+Falls back to **slot 0** when nothing matches. Change: return
+`Option<usize>`; on `None`, `on_venue_event` returns
+`UNRESOLVED_SLOT` (diagnostic exists already) without mutating any slot,
+severity WARNING, event recorded in outcome details. Python callers: the
+runtime treats UNRESOLVED_SLOT as a logged no-op (the `_fill_is_ours`
+filter remains first-line defense; this is kernel-side defense for
+venue-agnostic reuse).
+NOTE: several tests construct events with `slot_id=-1` expecting slot-0
+fallback — update them to pass explicit `slot_id=0` (behavioral test
+change; list each in the PR description).
+
+### 6.2 ID-less fill routing (documentation + metric, not code)
+BingX WS omits clientOrderId, so identity routing can't always engage.
+Add a counter metric (`fills_routed_by_state_total`) via an
+`anomaly_events` row per occurrence, severity INFO — gives VIOLET the data
+to justify per-venue synthetic ids later. No FSM behavior change.
+
+### 6.3 Gates
+- New Rust tests: unresolved event mutates nothing; entry-id fill during
+  EXIT_WORKING routes to entry (already covered by Phase-0 routing — add
+  the explicit case); price-less exit leg books 0 + flag.
+
+---
+
+## 7. Test matrix (run-order for the implementing agent)
+
+| Stage | Command (env: `PYTHONPATH=/mnt/dolphinng5_predict:/mnt/dolphinng5_predict/nautilus_dolphin`, venv `/home/dolphin/siloqy_env/bin/python3`) | Pass bar |
+|---|---|---|
+| Rust unit | `cargo test --release` in `_rust_kernel/` | 100% |
+| Kernel FSM | `pytest prod/tests/test_dita_v2_kernel.py` | 100% |
+| Bridge/accounting | `pytest prod/tests/test_pink_ditav2_kernel_bridge.py test_pink_ditav2_accounting_invariants.py prod/clean_arch/dita_v2/test_account_core_v2.py` | 100% |
+| Runtime/reconcile | `pytest prod/clean_arch/dita_v2/test_venue_reconcile.py test_orphan_prevention.py test_exec_router_runtime.py prod/tests/test_pink_async_fill_pump.py test_pink_direct_runtime.py` | 100% |
+| Chaos | `pytest prod/tests/test_pink_ditav2_chaos_harness.py` + `test_dita_v2_e2e_functional.py` | 100% |
+| Parity | `pytest prod/clean_arch/dita_v2/test_blue_parity.py test_alpha_blue_untouched_g7.py` | 100% |
+| Adapter | `pytest prod/tests/test_dita_v2_bingx_adapter.py` | 100% after Phase-0 item 4 resolution |
+| LIVE VST E2E | `python prod/ops/dita_v2_live_bingx_smoke.py --pink --symbol TRXUSDT` | suite green |
+| **Golden replays (NEW — write these)** | `prod/tests/test_pink_accounting_golden.py` | see below |
+| Shadow soak | 24–48 h on VST | capital vs balance ≤ $0.01 idle |
+
+### Golden replay tests (the heart of the acceptance)
+Feed the kernel the recorded FET event sequence (entry fills 195,259 +
+7,017 @ 0.1878; exit fills 26,007 + remainder; the poisoned variant with
+price=0.229 and the clean variant with 0.1866):
+1. Clean prices → realized = `(0.1878−0.1866) × 202,276 ≈ +242.7` gross.
+2. Poisoned variant (venue row carries 0.229) → with the adapter fix the
+   fill must reach the kernel as 0.0 → leg books 0 + `realized_skipped_no_price`; after
+   synthetic FILL_SETTLED rp=+164 → slot realized = +164, `pnl_source=exchange`.
+3. Restart mid-position (save_state/restore_state + reconcile_from_slots)
+   → next venue event settles ONLY the incremental PnL.
+4. VWAP: two entry fills at different prices → basis = weighted average.
+5. Dual-leverage invariant: same fills at exchange-leverage 1 vs 3 →
+   **identical realized PnL**; only margin fields differ.
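A pure-Python reference model is one way to pin the expected numbers for cases 1, 4 and 5 before wiring the real kernel into the golden tests. The function names are hypothetical; the VWAP update follows the formula given in the Phase-0 table, and the PnL is the leverage-free `qty × Δprice` for a short.

```python
def vwap_basis(fills):
    """Entry basis as the volume-weighted average over (qty, price) fills."""
    basis, filled = 0.0, 0.0
    for qty, price in fills:
        accumulated = filled + qty
        basis = (basis * filled + price * qty) / accumulated
        filled = accumulated
    return basis, filled


def realized_short(qty, entry, exit_price, leverage):
    """Leverage-free PnL for a short; leverage is accepted but deliberately unused."""
    if entry <= 0 or qty <= 0 or exit_price <= 0:
        return 0.0
    return qty * (entry - exit_price)


# Case 1: the recorded FET entry fills, closed at the clean exit price.
basis, size = vwap_basis([(195_259, 0.1878), (7_017, 0.1878)])
gross = realized_short(size, basis, 0.1866, leverage=3)

# Case 5: the same fills at exchange leverage 1 must give the same PnL.
gross_lev1 = realized_short(size, basis, 0.1866, leverage=1)

print(f"basis={basis:.4f} size={size:.0f} gross={gross:.4f}")
```

Keeping `leverage` in the signature but unused makes the invariant explicit: a golden test can vary it and assert that nothing changes.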
+
+---
+
+## 8. Rollout & rollback
+
+1. Each phase = one PR-sized commit, gates green before the next.
+2. Activation requires `supervisorctl restart dolphin_pink` — restart at a
+   FLAT moment (check `DOLPHIN_STATE_PINK` + exchange positions). The
+   restart-reconcile path is itself under test here; first restart after
+   Phase 0 should be watched live.
+3. Rollback = `git revert` of the phase commit + rebuild `.so` + restart.
+   The Rust `.so` MUST be rebuilt on both apply and revert — stale-binary
+   drift is how the incremental-fill change sat uncompiled until 2026-06-11.
+4. CH DDLs are additive (`ADD COLUMN ... DEFAULT`) — no destructive
+   migrations anywhere in this spec; rollback leaves unused columns, which
+   is fine.
+5. PINK is VST (virtual funds) — it is the canary by construction. Nothing
+   in this spec touches BLUE files (verify with `git diff --name-only`
+   against the §38.7 checklist).
+
+## 9. Done criteria (the whole spec)
+
+- All phases merged; full matrix green; golden replays green.
+- 48 h VST soak: zero UNEXPLAINED reconcile errors; published capital
+  tracks exchange balance; every closed trade's `trade_events.pnl` within
+  fees+slippage of the exchange income record, with `pnl_source` populated.
+- `pink_ctl.py mode-verify` passes (namespace isolation intact).
+- SYSTEM BIBLE §38 addendum updated (one paragraph: E-led ledger, K as
+  checksum, provenance fields) + `DITA_V2_KERNEL_REFERENCE.md` §"Capital
+  simplification" rewritten to match reality.
--- a/prod/docs/VIBRISS_PARAMETER_GOVERNANCE_SPEC.md
+++ b/prod/docs/VIBRISS_PARAMETER_GOVERNANCE_SPEC.md