docs: VIBRISS spec (+ §10.6 cascade/adaptive-TP paramsets), PINK accounting fix spec, BLUE incident docs

VIBRISS_PARAMETER_GOVERNANCE_SPEC §10.6: ob_cascade.count_threshold
(currently cascade_count>0 = ONE asset widens every TP x1.40),
tp_widen_factor, withdrawal_velocity_threshold as governance candidates;
adaptive/Dynamic-TP threshold marked fit for VIBRISS governance; TP_FLOOR
joint-policy reward requirement.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Codex
2026-06-12 15:04:15 +02:00
parent f4ff1cd9b7
commit c3a18f693a
4 changed files with 3653 additions and 0 deletions

@@ -0,0 +1,362 @@
# PINK / DITAv2 Accounting & Execution Fix — Spec and Dev Guide
**Status**: SPEC — ready for implementation agent
**Date**: 2026-06-11
**Branch**: `exp/pink-ditav2-sprint0-20260530` (continue on it or fork `fix/pink-accounting-consolidation`)
**Author of spec**: forensic session 2026-06-11 (FET $5,990.90 mis-book replay)
**Prerequisite for**: VIOLET rebuild (`violet_subsecond_rebuild_plan` memory / future plan session)
---
## 0. Why this exists — the incident in one paragraph
On 2026-06-11 PINK closed a FET-USDT short that the exchange settled at
**+$164 net** (entry VWAP 0.1878, exit 0.1866, ~202K FET) but the kernel
booked **$5,990.90** and capital diverged $6,154 from the exchange wallet.
Replay against `dolphin_pink.trade_reconstruction` slot images identified
three stacked defects, all in *derivation* code (none in exchange facts):
(1) fill events carried BingX's MARKET **protective bound price** (0.229,
+22% off tape) instead of the true fill price; (2) `realized_pnl()` and
`mark_price()` multiplied PnL by `slot.leverage` (exchange leverage — but
`slot.size` is exchange *quantity*, so every leg was 3× inflated); (3) the
Python settle baseline `_last_settled_pnl` resets empty on every restart,
so reconcile-adopted slots re-settle carried PnL. Exact replay of leg 1:
`26,007 × (0.229 - 0.1878)/0.1878 × 0.1878 × 3 = 3,214.4652` ✓ matches the
booked increment to the cent.
A fourth structural finding: there are **three parallel ledgers** (Rust
`AccountState` K/E, Python `AccountProjection` — the one persistence reads,
fee-blind — and `AccountProjectionV2`, dead in the live path). This spec
consolidates to **E-facts as ledger of record + K as integrity checksum +
one atomic published snapshot**.
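
The leg-1 arithmetic above can be re-run as a quick check. The sketch below is illustrative only: the function names are hypothetical and are not taken from the kernel, and the figures are the ones quoted in this section. It contrasts the pre-fix formula (bound price, multiplied by exchange leverage) with the leverage-free rule for a short evaluated at the true exit price.

```python
# Figures quoted in the incident paragraph (leg 1 of the FET-USDT exit).
QTY = 26_007            # FET closed in leg 1
ENTRY_VWAP = 0.1878     # entry basis
BOUND_PRICE = 0.229     # BingX MARKET protective bound price, not a fill
TRUE_EXIT = 0.1866      # exchange-settled exit price
EXCHANGE_LEVERAGE = 3   # slot.leverage at the time


def buggy_increment(qty, entry, exit_price, leverage):
    """Pre-fix arithmetic as replayed in the spec (hypothetical helper):
    return-on-entry times entry notional, then times exchange leverage."""
    return qty * (exit_price - entry) / entry * entry * leverage


def leverage_free_short_pnl(qty, entry, exit_price):
    """Post-fix rule for a short: quantity times price move, no leverage."""
    return qty * (entry - exit_price)


booked = buggy_increment(QTY, ENTRY_VWAP, BOUND_PRICE, EXCHANGE_LEVERAGE)
expected = leverage_free_short_pnl(QTY, ENTRY_VWAP, TRUE_EXIT)

print(f"booked by the buggy path : {booked:.4f}")
print(f"leverage-free, true price: {expected:.4f}")
```

The first figure reproduces the booked increment; the second is what the same leg is worth once both the price source and the leverage factor are corrected.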
---
## 1. Scope and non-goals
IN SCOPE
1. Commit + activate the Phase-0 fixes already in the working tree.
2. E-anchored published capital; single atomic account snapshot.
3. Per-trade PnL provenance (`exchange | kernel_estimate`) end-to-end.
4. Sizer feedback off trade-realized PnL (not capital deltas).
5. Persistence hygiene: duplicate row emission, silent async-insert loss,
`event_seq` stamping, `bars_held` clamp, naive-UTC timestamps.
6. Kernel hardening leftovers: `resolve_slot` no-match sentinel,
FILL_SETTLED realized override of flagged estimate legs.
OUT OF SCOPE (separate tickets)
- BLUE's exit-path masking bug (LINK $1,248, `TODO_TP_SCAN_CADENCE_BUGFIX.md`) — BLUE stack, not DITAv2.
- VIOLET fork, sub-second clock, venue price-feed port, cadence quantizer.
- ch_writer head-of-line poison-row parking redesign (mitigations land here;
the full parking-lane design is its own task).
- prefect.db / ClickHouse TTL disk remediation.
HARD INVARIANTS — MUST NOT CHANGE
- **Dual leverage**: `slot.size` = exchange quantity; `slot.leverage` =
exchange leverage (1–3x cap, set at BingX API); *our*-leverage
(conviction) = `size × entry_price / capital`, computed only at
`pink_direct._hz_publish` (line ~911). PnL is therefore **leverage-free**:
`qty × Δprice`, side-signed. Do not touch the conviction→exchange mapping
(`round_half_even_linear_0.5_to_9.0_to_1_to_exchange_cap`) or
`target_size` computation.
- **Exits are never skipped** (exec-router invariant set, §16 kernel ref).
- **BLUE-parity policy contract**: `DecisionEngine`/`IntentEngine` inputs
(MarketSnapshot + capital + slot state) unchanged in shape.
- **Namespace isolation**: zero writes to `dolphin.*` / `dolphin_prodgreen.*`
or BLUE/PRODGREEN HZ maps. Re-verify with `pink_ctl.py mode-verify`.
- **Data cadences are sacred** (operator rule 2026-06-10): never reduce a
data cadence for throughput.
---
## 2. Phase 0 — Commit and activate the already-applied fixes
These changes exist UNCOMMITTED in the working tree as of 2026-06-11 ~16:30.
Verify each hunk, commit as one reviewed unit, then restart `dolphin_pink`.
### 0.1 `prod/clean_arch/dita_v2/_rust_kernel/src/lib.rs`
| Function | Change (already applied) |
|---|---|
| `KernelCore::realized_pnl` (~line 1153) | PnL = side-signed `qty × (exit - entry)`; **no leverage factor**; returns 0 when `entry<=0`, `exit_size<=0`, `exit_price<=0`, or any input is non-finite |
| `TradeSlot::mark_price` (~line 394) | no `× leverage` in unrealized; a mark NEVER becomes entry basis — missing basis flags `metadata.entry_basis_missing=true`, unrealized stays 0 |
| `KernelCore::fill_matches_order` (new) | identity match on `venue_order_id` / `venue_client_id` |
| `KernelCore::apply_fill` | entry/exit routing by ORDER IDENTITY first, FSM state second (`!id_matches_exit` / `!id_matches_entry` guards); entry basis = **VWAP across entry fills** (`(prev_basis×prev_filled + price×fill)/accumulated`); price-less exit fill reduces size, books 0 PnL, flags `metadata.realized_skipped_no_price=true` |
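
As a reading aid for the table above, here is a minimal Python model of the two numeric rules (leverage-free realized PnL with input guards, and VWAP entry basis). It is a sketch, not the Rust code: the function signatures and the `"LONG"`/`"SHORT"` side strings are assumptions made for illustration.

```python
import math


def realized_pnl(side, entry, exit_price, exit_size):
    """Side-signed qty x (exit - entry), with no leverage factor.

    Returns 0.0 for any non-finite or non-positive input, mirroring the
    guard described in the table. The side encoding is an assumption.
    """
    values = (entry, exit_price, exit_size)
    if not all(math.isfinite(v) for v in values):
        return 0.0
    if entry <= 0 or exit_price <= 0 or exit_size <= 0:
        return 0.0
    sign = 1.0 if side == "LONG" else -1.0
    return sign * exit_size * (exit_price - entry)


def vwap_basis(prev_basis, prev_filled, price, fill):
    """(prev_basis x prev_filled + price x fill) / accumulated."""
    accumulated = prev_filled + fill
    if accumulated <= 0:
        return prev_basis
    return (prev_basis * prev_filled + price * fill) / accumulated


# Two entry fills at different prices, then a partial exit of the short.
basis = vwap_basis(0.0, 0.0, 0.1870, 100.0)
basis = vwap_basis(basis, 100.0, 0.1890, 300.0)
pnl = realized_pnl("SHORT", basis, 0.1866, 400.0)
no_price = realized_pnl("SHORT", basis, 0.0, 400.0)  # price-less exit leg
```

A price-less exit leg falls into the guard and books 0, which is the case the kernel additionally flags with `metadata.realized_skipped_no_price=true`.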
Rebuild required: `cargo build --release` in `_rust_kernel/` (the `.so` is
only auto-built when missing — **source/binary drift is a known hazard**;
add the build to the commit checklist). `cargo test`: 32/32 green as of spec.
### 0.2 `prod/clean_arch/dita_v2/bingx_venue.py`
Fill events must carry a TRUE fill price or 0.0 — never the order's nominal
`price` / submit `receipt.price` (BingX MARKET bound price, ±20–25%):
- `_events_from_submit` fill event (~line 585): `_row_float(ack_row,
"avgPrice","ap","lastFillPrice","L", default=0.0)`
- `_event_from_row` (~line 697): fills use the same true-price chain;
non-fill events (ACK/CANCEL/REJECT) may keep nominal `price` as info
- `_fill_event_from_row` (~line 736): `"lastFillPrice","L","avgPrice","ap"`
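
The true-price chain can be pictured as a first-usable-value lookup. The helper below is a hypothetical stand-in for `_row_float`, whose real implementation is not shown in this spec; in particular, treating zero, empty, and non-finite values as "missing" is an assumption of this sketch.

```python
import math
from typing import Any, Mapping


def row_float(row: Mapping[str, Any], *keys: str, default: float = 0.0) -> float:
    """Return the first key whose value parses to a finite, positive float.

    Hypothetical re-implementation for illustration only.
    """
    for key in keys:
        raw = row.get(key)
        if raw is None or raw == "":
            continue
        try:
            value = float(raw)
        except (TypeError, ValueError):
            continue
        if math.isfinite(value) and value > 0:
            return value
    return default


# The nominal bound price sits under "price" and is never consulted.
ack_row = {"price": "0.229", "avgPrice": "0", "ap": "", "L": "0.1866"}
fill_price = row_float(ack_row, "avgPrice", "ap", "lastFillPrice", "L", default=0.0)

# With no true-price field present, the fill carries 0.0, not the bound.
bare_row = {"price": "0.229"}
missing_price = row_float(bare_row, "avgPrice", "ap", "lastFillPrice", "L", default=0.0)
```

The important property is the key list itself: `price` is simply absent from it, so a bound price cannot leak into a fill event.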
### 0.3 `prod/clean_arch/dita_v2/rust_backend.py`
- `reconcile_from_slots`: seeds `_last_settled_pnl[slot_id] = slot.realized_pnl`
and `_slot_was_closed[slot_id] = slot.closed` for every adopted slot.
- `restore_state`: same re-anchoring after successful restore.
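
To see why the seeding matters, consider a toy model of the settle baseline. The class below is a sketch under assumptions: only the two dictionary names come from this spec, while the class and method names are hypothetical.

```python
class SettleBaseline:
    """Toy model of the Python-side settle baseline (hypothetical class)."""

    def __init__(self):
        self._last_settled_pnl = {}
        self._slot_was_closed = {}

    def seed_from_slot(self, slot_id, realized_pnl, closed):
        """Re-anchor on reconcile/restore so carried PnL is not re-settled."""
        self._last_settled_pnl[slot_id] = realized_pnl
        self._slot_was_closed[slot_id] = closed

    def increment(self, slot_id, realized_pnl):
        """Return only the PnL not yet settled for this slot."""
        previous = self._last_settled_pnl.get(slot_id, 0.0)
        self._last_settled_pnl[slot_id] = realized_pnl
        return realized_pnl - previous


# Restart WITHOUT seeding: the slot arrives carrying 120.0 already settled
# before the restart, and the next event settles it a second time.
unseeded = SettleBaseline()
resettled = unseeded.increment(slot_id=0, realized_pnl=120.0)

# Restart WITH seeding: only the new 30.0 is settled.
seeded = SettleBaseline()
seeded.seed_from_slot(slot_id=0, realized_pnl=120.0, closed=False)
incremental = seeded.increment(slot_id=0, realized_pnl=150.0)
```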
### 0.4 Adjacent fixes riding the same commit
- `prod/ch_writer.py`: insert URLs append `&date_time_input_format=best_effort`;
flush errors log at WARNING (first 10 + every 100th), counter `_flush_errors`.
- `prod/clean_arch/dita_v2/blue_parity.py` `price_of`: hyphen-tolerant
fallback (`FET-USDT` → `FETUSDT`) — fixes the unmanaged-position block.
- `prod/clickhouse/users.xml`: `date_time_input_format=best_effort` for the
`dolphin` user (NOTE: running CH container did not honor it even after
restart — the container does not mount compose configs; effective on next
compose recreation. The client-side URL param is the operative fix.)
- `prod/tests/test_dita_v2_kernel.py`: partial→full fill test updated to
incremental `filled_size` semantics (BingX WS `lastFilledQty`).
### 0.5 Phase 0 gates
1. `cargo test` in `_rust_kernel`: 32/32.
2. `pytest prod/tests/test_dita_v2_kernel.py`: 7/7.
3. `pytest prod/clean_arch/dita_v2/test_exec_router_runtime.py
test_venue_reconcile.py test_orphan_prevention.py
prod/tests/test_pink_async_fill_pump.py
prod/clean_arch/dita_v2/test_account_core_v2.py test_bingx_bugs.py`: 134/134.
4. KNOWN pre-existing failures (NOT introduced by this work — verified by
hunk-revert): 4 tests in `prod/tests/test_dita_v2_bingx_adapter.py`
(snapshot-fill emission broke when sync `submit()` started passing None
snapshots on 2026-06-10). Fix or quarantine them explicitly in this phase
— do not let them mask new regressions.
5. Restart `dolphin_pink` at a FLAT moment; verify in logs: no
`realized_skipped_no_price` storms, no `entry_basis_missing` on fresh
entries, first round-trip books PnL within ±(fees+slippage) of
`GET /openApi/swap/v2/user/income` for the same trade.
---
## 3. Phase 1 — E-anchored published capital
**Goal**: the capital that persistence/HZ/sizer see is exchange-anchored;
K never publishes.
### 3.1 `prod/clean_arch/dita_v2/account.py`
- Add to `AccountSnapshot`: `capital_source: str` (`"e_anchored" |
"k_bridged" | "seed"`), `e_wallet_balance: float`, `event_seq: int`.
- New method `AccountProjection.anchor_to_exchange(wallet_balance: float,
available_margin: float, event_seq: int)`: sets `capital = wallet_balance`
(guard `>0` and finite — the zero-wb frame lesson), `capital_source =
"e_anchored"`, recomputes equity. `settle()` remains for the BRIDGE case
only: between anchors, capital += realized (`capital_source="k_bridged"`).
- `settle(realized_pnl, fees)`: **stop ignoring fees**. Set
`capital += realized_pnl - fees` (today fees only accumulate in `fees_paid`;
published capital ignores them between reseeds).
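
A minimal sketch of the anchor/bridge behavior described in this subsection follows. It assumes a simplified account object: the class name, the boolean return value, and the omission of the equity recompute are choices made for illustration, not the real `AccountProjection`.

```python
import math
from dataclasses import dataclass


@dataclass
class AccountSketch:
    """Simplified stand-in for the projection (hypothetical class)."""
    capital: float = 0.0
    fees_paid: float = 0.0
    capital_source: str = "seed"
    e_wallet_balance: float = 0.0
    available_margin: float = 0.0
    event_seq: int = 0

    def anchor_to_exchange(self, wallet_balance, available_margin, event_seq):
        """Snap capital to the exchange wallet; reject zero/negative/NaN."""
        if not math.isfinite(wallet_balance) or wallet_balance <= 0:
            return False  # keep the previous capital (zero-wb frame guard)
        self.capital = wallet_balance
        self.e_wallet_balance = wallet_balance
        self.available_margin = available_margin
        self.event_seq = event_seq
        self.capital_source = "e_anchored"
        return True

    def settle(self, realized_pnl, fees):
        """Bridge between anchors: capital moves by realized minus fees."""
        self.fees_paid += fees
        self.capital += realized_pnl - fees
        self.capital_source = "k_bridged"


acct = AccountSketch()
acct.anchor_to_exchange(1000.0, 900.0, event_seq=1)
acct.settle(realized_pnl=10.0, fees=0.5)          # bridged: 1009.5
rejected = acct.anchor_to_exchange(0.0, 0.0, event_seq=2)
bridged_capital = acct.capital
acct.anchor_to_exchange(1009.37, 905.0, event_seq=3)  # snaps to wb exactly
```

The last call shows the intended precedence: whatever the bridge accumulated, the next valid anchor replaces it with the exchange figure.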
### 3.2 `prod/clean_arch/runtime/pink_direct.py`
- The existing reseed path (balance-bearing ACCOUNT_UPDATE →
`kernel.reset_and_seed(wb)`) additionally calls
`kernel.account.anchor_to_exchange(...)` — one anchoring action, two
ledgers consistent.
- Boot seed (launcher `exchange_balance_capital` block, pink_direct ~line
262) goes through `anchor_to_exchange` instead of direct attribute writes.
### 3.3 Gates
- New unit tests (`prod/tests/test_pink_account_anchor.py`):
anchor sets capital/source; zero/negative/NaN wb rejected; settle bridges
with fees; anchor after bridge snaps to wb exactly.
- Shadow check (live, 24 h on VST): published capital vs
`GET /openApi/swap/v2/user/balance` polled 1/min — max |Δ| outside a
trade-settlement window ≤ $0.01; during settlement ≤ pending-fee bound.
---
## 4. Phase 2 — Single atomic snapshot, ledger consolidation
**Goal**: one immutable, versioned account snapshot; the two redundant
ledgers demoted/removed.
### 4.1 `prod/clean_arch/dita_v2/account.py`
- Make the published snapshot **immutable-replace**: `AccountProjection`
builds a new frozen `AccountSnapshot` (carry `event_seq`) on every
mutation and swaps a single reference (GIL-atomic). Readers must take
`snap = kernel.account.snapshot` once per use (audit call sites:
`pink_clickhouse.py`, `hazelcast_projection.py` HZ writer, `pink_direct`).
- `AccountProjectionV2`: DELETE, or move to `prod/clean_arch/dita_v2/
_attic/` with a module docstring pointing here. Its only live-path import
is `exchange_event.py` — migrate that import or the dataclasses it uses
(`EPosition` is genuinely useful; keep it in `account.py`).
- The Rust `AccountState` K-ledger STAYS — demoted by documentation and by
Phase 1 (it no longer feeds published capital): its jobs are reconcile
classification (R1-style), `capital_frozen`, and E-dark bridging. Update
the module docstring to say exactly this.
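
The immutable-replace pattern can be sketched with a frozen dataclass and a single attribute rebind. The field set is trimmed and the projection class is hypothetical; the point is only that a reader holding `snap` never sees a half-updated account.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class AccountSnapshot:
    """Trimmed, illustrative field set; the real snapshot carries more."""
    capital: float
    equity: float
    capital_source: str
    event_seq: int


class AccountProjectionSketch:
    """Hypothetical projection that publishes by swapping one reference."""

    def __init__(self, capital):
        self.snapshot = AccountSnapshot(capital, capital, "seed", 0)

    def _publish(self, **changes):
        current = self.snapshot
        # Build the complete new snapshot first, then rebind once.
        self.snapshot = replace(current, event_seq=current.event_seq + 1, **changes)

    def anchor(self, wallet_balance):
        self._publish(capital=wallet_balance, equity=wallet_balance,
                      capital_source="e_anchored")


proj = AccountProjectionSketch(1000.0)
before = proj.snapshot          # readers take the reference once per use
proj.anchor(1010.0)
after = proj.snapshot
```

Because `before` is frozen and never mutated, all of its fields remain mutually consistent even after later anchors; a reader that instead dereferences `proj.snapshot` several times within one use can mix fields from two versions.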
### 4.2 `prod/clean_arch/persistence/pink_clickhouse.py`
- Read capital/equity/peak/trade_seq from the single snapshot reference;
no recomputation.
- Add columns to emitted rows (and the matching `ALTER TABLE` DDLs under
`prod/clickhouse/pink/08_provenance.sql` — **apply DDLs to CH BEFORE
deploying code that emits them**; the missing-table head-of-line jam of
2026-06-11 is the cautionary tale):
- `account_events`, `status_snapshots`: `capital_source LowCardinality(String) DEFAULT ''`,
`account_event_seq UInt64 DEFAULT 0`
- `trade_events`, `trade_exit_legs`: `pnl_source LowCardinality(String) DEFAULT ''`
(`exchange` | `kernel_estimate`)
- `bars_held`: clamp to `max(0, …)` at row-build time (UInt16 column;
negative values currently 400 on trade_events / silently vanish on
async tables).
- Timestamps: route every `ts` through one helper emitting **naive-UTC
microsecond ISO** (no `+00:00`) — best_effort already tolerates both, but
rows must stop depending on a parser setting.
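
The two row-build rules at the end of this list (timestamp normalization and the `bars_held` clamp) could look like the helpers below. Both names are hypothetical, and the sketch assumes that a naive input datetime is already in UTC.

```python
from datetime import datetime, timedelta, timezone


def ch_ts(ts: datetime) -> str:
    """Naive-UTC ISO string with microseconds and no offset suffix.

    Aware datetimes are converted to UTC first; naive ones are assumed
    to be UTC already (an assumption of this sketch).
    """
    if ts.tzinfo is not None:
        ts = ts.astimezone(timezone.utc).replace(tzinfo=None)
    return ts.isoformat(timespec="microseconds")


def clamp_bars_held(value) -> int:
    """Never emit a negative count into the UInt16 column."""
    return max(0, int(value))


local = datetime(2026, 6, 11, 16, 30, 0, 123, tzinfo=timezone(timedelta(hours=2)))
stamp = ch_ts(local)
bars = clamp_bars_held(-3)
```

Routing every `ts` through one helper like this keeps the emitted format independent of whether the server or URL enables `best_effort` parsing.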
### 4.3 Duplicate-emission fix (same file)
Every CH row is currently emitted twice (visible in any query). Hunt the
double call: instrument `_sink()` with a per-(table, content-hash) debug
counter in a test, then trace the two call paths (suspect: `persist_result`
invoked both from the runtime step and from the fill pump for the same
event). Fix at the caller level; do NOT dedupe by content in the sink
(masks real double-events). Regression test: one simulated round trip →
exactly one row per logical event per table.
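
One way to instrument the sink in a test is a counting stand-in keyed by table and content hash, as sketched below. The class is hypothetical and assumes rows are JSON-serializable dicts; it is a diagnostic for locating the double call, not a dedup layer.

```python
import hashlib
import json
from collections import Counter


class CountingSink:
    """Test double for `_sink()` that counts identical rows per table."""

    def __init__(self):
        self.counts = Counter()

    def __call__(self, table, row):
        payload = json.dumps(row, sort_keys=True, default=str).encode()
        digest = hashlib.sha256(payload).hexdigest()
        self.counts[(table, digest)] += 1

    def duplicates(self):
        """(table, digest) pairs that were emitted more than once."""
        return {key: n for key, n in self.counts.items() if n > 1}


sink = CountingSink()
row = {"trade_id": "T1", "pnl": 31.2}
sink("trade_events", row)      # e.g. emitted from the runtime step
sink("trade_events", row)      # e.g. emitted again from the fill pump
sink("account_events", row)    # same content, different table: not a dup
dupes = sink.duplicates()
```

In the regression test the assertion is simply that `duplicates()` is empty after one simulated round trip.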
### 4.4 `prod/ch_writer.py`
- `wait_for_async_insert`: `"1"` for ALL `dolphin_pink` tables (accounting
rows must never be silently lost; the spool absorbs latency). Keep `0`
acceptable only for high-volume shadow tables if measured necessary —
document any exception inline.
- Mitigation for head-of-line (full redesign out of scope): after
`attempts > 1000` on a row, log ERROR with the CH response body once per
100 attempts (today the reject reason is invisible without manual replay).
### 4.5 Gates
- Full offline suite (the 533+ DITAv2/PINK set) green, minus the Phase-0
quarantined adapter tests if still open.
- One live VST round trip: every table gets exactly one row per event;
`pnl_source`/`capital_source` populated; CH `system.text_log` shows zero
parse rejections for `dolphin_pink`.
---
## 5. Phase 3 — Sizer feedback off trade-realized PnL
**THE one seam where this refactor can silently change alpha behavior.**
### 5.1 `prod/clean_arch/runtime/pink_direct.py` — `_sizer_trade_feedback` (~line 1453)
Today: `pnl = acc.capital - self._sizer_entry_capital` (capital delta).
Under E-anchored capital this absorbs funding, fees of other activity, and
**foreign fills from the shared VST account** (PRODGREEN collision class).
Change to:
```
pnl = slot_realized_for_trade(trade_id) # Σ slot.realized_pnl legs, i.e.
# kernel estimate, overridden by
# exchange rp when settled (5.2)
```
Source: the closing slot dict already carries `realized_pnl`; use it (minus
the fees recorded for the trade when available) instead of the capital
delta. Keep the magnitude semantics the sizer expects (sign + rough size —
per the existing comment, bucket/streak multipliers only need that).
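
The difference between the two feedback sources shows up as soon as something unrelated moves capital mid-trade. The sketch below assumes each slot leg is a dict with `trade_id` and `realized_pnl` keys; that shape and the helper name are illustrative, not the actual runtime structures.

```python
def trade_feedback_pnl(slot_legs, trade_id, fees=None):
    """Sum realized PnL over the legs of one trade, minus fees if known."""
    realized = sum(
        leg["realized_pnl"] for leg in slot_legs if leg["trade_id"] == trade_id
    )
    return realized - (fees or 0.0)


legs = [
    {"trade_id": "T1", "realized_pnl": 20.0},
    {"trade_id": "T1", "realized_pnl": 11.2},
    {"trade_id": "T2", "realized_pnl": -5.0},
]

entry_capital = 10_000.0
foreign_fill = 500.0  # another strategy's fill on the shared VST account
exit_capital = entry_capital + 31.2 - 1.2 + foreign_fill

capital_delta_feedback = exit_capital - entry_capital        # polluted
trade_realized_feedback = trade_feedback_pnl(legs, "T1", fees=1.2)
```

The capital delta reports roughly 530 for a trade that made roughly 30; the trade-realized figure is unaffected by the foreign fill.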
### 5.2 Exchange override (E-led repair) — `bingx_user_stream.py` + `rust_backend.py`
- The WS `FILL_SETTLED` path already carries the exchange's realized (`rp`)
and fee (`n`, sign-flipped at boundary per BingX quirks memory). Extend
the kernel account-event payload with `trade_id`, and on receipt:
- if the matching slot leg was flagged `realized_skipped_no_price`,
ADD the exchange realized to `slot.realized_pnl` (repair) and clear
the flag; settle the increment through the normal baseline mechanism;
- else record `pnl_source="exchange"` for the trade-event row (the
estimate stays as the booked figure unless |estimate − rp| exceeds a
tolerance — then log ERROR + emit an `anomaly_events` row; do NOT
silently re-book).
- Rust: add `dita_kernel_repair_realized(slot_id, amount)` FFI (or fold the
repair into `on_account_event` with `slot_id` in payload). Keep it
idempotent via the existing account-event dedup.
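
The decision logic for a `FILL_SETTLED` receipt can be sketched as a small function. Everything here is a simplified model: the `SlotLeg` class, the return labels, and the set-based dedup are assumptions standing in for the kernel's slot state and its existing account-event dedup.

```python
from dataclasses import dataclass


@dataclass
class SlotLeg:
    """Illustrative leg state; the real slot carries much more."""
    realized_pnl: float = 0.0
    realized_skipped_no_price: bool = False


def apply_fill_settled(leg, rp, tolerance, seen_event_ids, event_id):
    """Repair a flagged leg, or compare the estimate against exchange rp."""
    if event_id in seen_event_ids:
        return "duplicate"            # idempotent: replay changes nothing
    seen_event_ids.add(event_id)
    if leg.realized_skipped_no_price:
        leg.realized_pnl += rp        # repair with the exchange figure
        leg.realized_skipped_no_price = False
        return "repaired"
    if abs(leg.realized_pnl - rp) > tolerance:
        return "anomaly"              # log ERROR + anomaly row; no re-book
    return "exchange_confirmed"


seen = set()
flagged = SlotLeg(realized_pnl=0.0, realized_skipped_no_price=True)
first = apply_fill_settled(flagged, rp=164.0, tolerance=5.0,
                           seen_event_ids=seen, event_id="ev-1")
replay = apply_fill_settled(flagged, rp=164.0, tolerance=5.0,
                            seen_event_ids=seen, event_id="ev-1")

estimated = SlotLeg(realized_pnl=242.7)
verdict = apply_fill_settled(estimated, rp=164.0, tolerance=5.0,
                             seen_event_ids=seen, event_id="ev-2")
```

Note that the anomaly branch leaves `realized_pnl` untouched, matching the rule that a disagreement is surfaced rather than silently re-booked.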
### 5.3 Gates
- Unit: feedback receives trade-realized, not capital delta (simulate a
foreign-fill capital jump mid-trade → feedback unaffected).
- Unit: price-less exit leg + later FILL_SETTLED repair → slot realized
equals exchange `rp`; settle baseline consistent (no double-settle).
- Parity: `test_blue_parity.py`, `test_alpha_blue_untouched_g7.py` green
(sizer behavior unchanged for normal fills).
---
## 6. Phase 4 — Kernel hardening leftovers
### 6.1 `lib.rs` — `resolve_slot` (~line 1099)
Falls back to **slot 0** when nothing matches. Change: return
`Option<usize>`; on `None`, `on_venue_event` returns
`UNRESOLVED_SLOT` (diagnostic exists already) without mutating any slot,
severity WARNING, event recorded in outcome details. Python callers: the
runtime treats UNRESOLVED_SLOT as a logged no-op (the `_fill_is_ours`
filter remains first-line defense; this is kernel-side defense for
venue-agnostic reuse).
NOTE: several tests construct events with `slot_id=-1` expecting slot-0
fallback — update them to pass explicit `slot_id=0` (behavioral test
change; list each in the PR description).
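
The intended behavior can be modeled in Python, with `Optional[int]` standing in for Rust's `Option<usize>`. This is a behavioral sketch only: the matching rule (explicit `slot_id`, then `venue_order_id`) and the outcome dict shape are assumptions, since the actual `resolve_slot` logic is not reproduced in this spec.

```python
import copy
from typing import Optional

UNRESOLVED_SLOT = "UNRESOLVED_SLOT"


def resolve_slot(slots, event) -> Optional[int]:
    """Return a slot index, or None when nothing matches (no slot-0 fallback)."""
    slot_id = event.get("slot_id", -1)
    if 0 <= slot_id < len(slots):
        return slot_id
    order_id = event.get("venue_order_id")
    if order_id:
        for index, slot in enumerate(slots):
            if slot.get("venue_order_id") == order_id:
                return index
    return None


def on_venue_event(slots, event):
    """Apply the event to its slot, or report a no-op diagnostic."""
    index = resolve_slot(slots, event)
    if index is None:
        return {"code": UNRESOLVED_SLOT, "severity": "WARNING",
                "event": dict(event)}
    slots[index]["events_applied"] = slots[index].get("events_applied", 0) + 1
    return {"code": "OK", "slot": index}


slots = [{"venue_order_id": "A1"}, {"venue_order_id": "B2"}]
untouched = copy.deepcopy(slots)
outcome = on_venue_event(slots, {"slot_id": -1, "venue_order_id": "ZZ"})
slots_after_unresolved = copy.deepcopy(slots)
matched = on_venue_event(slots, {"slot_id": -1, "venue_order_id": "B2"})
```

Tests that previously relied on `slot_id=-1` landing in slot 0 correspond to the first call here, which now mutates nothing.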
### 6.2 ID-less fill routing (documentation + metric, not code)
BingX WS omits clientOrderId, so identity routing can't always engage.
Add a counter metric (`fills_routed_by_state_total`) via an
`anomaly_events` row per occurrence, severity INFO — gives VIOLET the data
to justify per-venue synthetic ids later. No FSM behavior change.
### 6.3 Gates
- New Rust tests: unresolved event mutates nothing; entry-id fill during
EXIT_WORKING routes to entry (already covered by Phase-0 routing — add
the explicit case); price-less exit leg books 0 + flag.
---
## 7. Test matrix (run-order for the implementing agent)
| Stage | Command (env: `PYTHONPATH=/mnt/dolphinng5_predict:/mnt/dolphinng5_predict/nautilus_dolphin`, venv `/home/dolphin/siloqy_env/bin/python3`) | Pass bar |
|---|---|---|
| Rust unit | `cargo test --release` in `_rust_kernel/` | 100% |
| Kernel FSM | `pytest prod/tests/test_dita_v2_kernel.py` | 100% |
| Bridge/accounting | `pytest prod/tests/test_pink_ditav2_kernel_bridge.py test_pink_ditav2_accounting_invariants.py prod/clean_arch/dita_v2/test_account_core_v2.py` | 100% |
| Runtime/reconcile | `pytest prod/clean_arch/dita_v2/test_venue_reconcile.py test_orphan_prevention.py test_exec_router_runtime.py prod/tests/test_pink_async_fill_pump.py test_pink_direct_runtime.py` | 100% |
| Chaos | `pytest prod/tests/test_pink_ditav2_chaos_harness.py` + `test_dita_v2_e2e_functional.py` | 100% |
| Parity | `pytest prod/clean_arch/dita_v2/test_blue_parity.py test_alpha_blue_untouched_g7.py` | 100% |
| Adapter | `pytest prod/tests/test_dita_v2_bingx_adapter.py` | 100% after Phase-0 item 4 resolution |
| LIVE VST E2E | `python prod/ops/dita_v2_live_bingx_smoke.py --pink --symbol TRXUSDT` | suite green |
| **Golden replays (NEW — write these)** | `prod/tests/test_pink_accounting_golden.py` | see below |
| Shadow soak | 24–48 h on VST | capital vs balance ≤ $0.01 idle |
### Golden replay tests (the heart of the acceptance)
Feed the kernel the recorded FET event sequence (entry fills 195,259 +
7,017 @ 0.1878; exit fills 26,007 + remainder; the poisoned variant with
price=0.229 and the clean variant with 0.1866):
1. Clean prices → realized = `(0.1878 - 0.1866) × 202,276 ≈ +242.7` gross.
2. Poisoned price (0.229) reaching the kernel anyway → with the adapter fix
it must arrive as 0.0 → leg books 0 + `realized_skipped_no_price`; after
synthetic FILL_SETTLED rp=+164 → slot realized = +164, `pnl_source=exchange`.
3. Restart mid-position (save_state/restore_state + reconcile_from_slots)
→ next venue event settles ONLY the incremental PnL.
4. VWAP: two entry fills at different prices → basis = weighted average.
5. Dual-leverage invariant: same fills at exchange-leverage 1 vs 3 →
**identical realized PnL**; only margin fields differ.
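
Cases 1, 4 and 5 reduce to arithmetic that can be pinned down before the kernel-driven tests exist. The sketch below uses standalone helper functions (hypothetical names) rather than the kernel, and it assumes initial margin is notional divided by exchange leverage, which is a simplification.

```python
def short_realized(entry, exit_price, qty):
    """Leverage-free PnL for a short: quantity times (entry - exit)."""
    return qty * (entry - exit_price)


def initial_margin(entry, qty, exchange_leverage):
    """Simplified margin model (assumption): notional / leverage."""
    return entry * qty / exchange_leverage


ENTRY_FILLS = [(195_259, 0.1878), (7_017, 0.1878)]  # (qty, price)
CLEAN_EXIT = 0.1866

total_qty = sum(qty for qty, _ in ENTRY_FILLS)
basis = sum(qty * price for qty, price in ENTRY_FILLS) / total_qty  # VWAP
gross = short_realized(basis, CLEAN_EXIT, total_qty)

# Dual-leverage invariant: PnL has no leverage input at all, so it cannot
# differ between exchange leverage 1 and 3; only the margin changes.
margin_1x = initial_margin(basis, total_qty, 1)
margin_3x = initial_margin(basis, total_qty, 3)
```

Cases 2 and 3 need the recorded event sequence and the real restore path, so they are not modeled here.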
---
## 8. Rollout & rollback
1. Each phase = one PR-sized commit, gates green before the next.
2. Activation requires `supervisorctl restart dolphin_pink` — restart at a
FLAT moment (check `DOLPHIN_STATE_PINK` + exchange positions). The
restart-reconcile path is itself under test here; first restart after
Phase 0 should be watched live.
3. Rollback = `git revert` of the phase commit + rebuild `.so` + restart.
The Rust `.so` MUST be rebuilt on both apply and revert — stale-binary
drift is how the incremental-fill change sat uncompiled until 2026-06-11.
4. CH DDLs are additive (`ADD COLUMN ... DEFAULT`) — no destructive
migrations anywhere in this spec; rollback leaves unused columns, which
is fine.
5. PINK is VST (virtual funds) — it is the canary by construction. Nothing
in this spec touches BLUE files (verify with `git diff --name-only`
against the §38.7 checklist).
## 9. Done criteria (the whole spec)
- All phases merged; full matrix green; golden replays green.
- 48 h VST soak: zero UNEXPLAINED reconcile errors; published capital
tracks exchange balance; every closed trade's `trade_events.pnl` within
fees+slippage of the exchange income record, with `pnl_source` populated.
- `pink_ctl.py mode-verify` passes (namespace isolation intact).
- SYSTEM BIBLE §38 addendum updated (one paragraph: E-led ledger, K as
checksum, provenance fields) + `DITA_V2_KERNEL_REFERENCE.md` §"Capital
simplification" rewritten to match reality.