# PINK / DITAv2 Accounting & Execution Fix — Spec and Dev Guide
**Status**: SPEC — ready for implementation agent
**Date**: 2026-06-11
**Branch**: `exp/pink-ditav2-sprint0-20260530` (continue on it or fork `fix/pink-accounting-consolidation`)
**Author of spec**: forensic session 2026-06-11 (FET $5,990.90 mis-book replay)
**Prerequisite for**: VIOLET rebuild (`violet_subsecond_rebuild_plan` memory / future plan session)
---
## 0. Why this exists — the incident in one paragraph
On 2026-06-11 PINK closed a FET-USDT short that the exchange settled at
**+$164 net** (entry VWAP 0.1878, exit 0.1866, ~202K FET) but the kernel
booked **$5,990.90** and capital diverged $6,154 from the exchange wallet.
Replay against `dolphin_pink.trade_reconstruction` slot images identified
three stacked defects, all in *derivation* code (none in exchange facts):
(1) fill events carried BingX's MARKET **protective bound price** (0.229,
+22% off tape) instead of the true fill price; (2) `realized_pnl()` and
`mark_price()` multiplied PnL by `slot.leverage` (exchange leverage — but
`slot.size` is exchange *quantity*, so every leg was 3× inflated); (3) the
Python settle baseline `_last_settled_pnl` resets empty on every restart,
so reconcile-adopted slots re-settle carried PnL. Exact replay of leg 1:
`26,007 × (0.229 - 0.1878)/0.1878 × 0.1878 × 3 = 3,214.4652` ✓ matches the
booked increment to the cent.
A fourth structural finding: there are **three parallel ledgers** (Rust
`AccountState` K/E, Python `AccountProjection` — the one persistence reads,
fee-blind — and `AccountProjectionV2`, dead in the live path). This spec
consolidates to **E-facts as ledger of record + K as integrity checksum +
one atomic published snapshot**.
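The leg-1 replay above can be checked with a few lines of arithmetic. This is an illustrative sketch only: the variable names are not taken from the codebase, and the short-side sign is left out so that the magnitudes line up with the figures quoted in this section.

```python
# Leg-1 replay of the FET incident, using the figures quoted above.
qty = 26_007            # exit fill quantity (FET)
entry = 0.1878          # entry VWAP
poisoned_exit = 0.229   # BingX MARKET protective bound, not a real fill price
leverage = 3            # exchange leverage that was wrongly multiplied in

# Buggy derivation: percentage move x notional-per-unit x leverage.
booked = qty * (poisoned_exit - entry) / entry * entry * leverage

# Same (still poisoned) price without the leverage factor, for comparison.
unlevered = qty * (poisoned_exit - entry)

print(round(booked, 4))     # 3214.4652
print(round(unlevered, 4))  # 1071.4884
```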
---
## 1. Scope and non-goals
IN SCOPE
1. Commit + activate the Phase-0 fixes already in the working tree.
2. E-anchored published capital; single atomic account snapshot.
3. Per-trade PnL provenance (`exchange | kernel_estimate`) end-to-end.
4. Sizer feedback off trade-realized PnL (not capital deltas).
5. Persistence hygiene: duplicate row emission, silent async-insert loss,
`event_seq` stamping, `bars_held` clamp, naive-UTC timestamps.
6. Kernel hardening leftovers: `resolve_slot` no-match sentinel,
FILL_SETTLED realized override of flagged estimate legs.
OUT OF SCOPE (separate tickets)
- BLUE's exit-path masking bug (LINK $1,248, `TODO_TP_SCAN_CADENCE_BUGFIX.md`) — BLUE stack, not DITAv2.
- VIOLET fork, sub-second clock, venue price-feed port, cadence quantizer.
- ch_writer head-of-line poison-row parking redesign (mitigations land here;
the full parking-lane design is its own task).
- prefect.db / ClickHouse TTL disk remediation.
HARD INVARIANTS — MUST NOT CHANGE
- **Dual leverage**: `slot.size` = exchange quantity; `slot.leverage` =
exchange leverage (1-3x cap, set at BingX API); *our*-leverage
(conviction) = `size × entry_price / capital`, computed only at
`pink_direct._hz_publish` (line ~911). PnL is therefore **leverage-free**:
`qty × Δprice`, side-signed. Do not touch the conviction→exchange mapping
(`round_half_even_linear_0.5_to_9.0_to_1_to_exchange_cap`) or
`target_size` computation.
- **Exits are never skipped** (exec-router invariant set, §16 kernel ref).
- **BLUE-parity policy contract**: `DecisionEngine`/`IntentEngine` inputs
(MarketSnapshot + capital + slot state) unchanged in shape.
- **Namespace isolation**: zero writes to `dolphin.*` / `dolphin_prodgreen.*`
or BLUE/PRODGREEN HZ maps. Re-verify with `pink_ctl.py mode-verify`.
- **Data cadences are sacred** (operator rule 2026-06-10): never reduce a
data cadence for throughput.
---
## 2. Phase 0 — Commit and activate the already-applied fixes
These changes exist UNCOMMITTED in the working tree as of 2026-06-11 ~16:30.
Verify each hunk, commit as one reviewed unit, then restart `dolphin_pink`.
### 0.1 `prod/clean_arch/dita_v2/_rust_kernel/src/lib.rs`
| Function | Change (already applied) |
|---|---|
| `KernelCore::realized_pnl` (~line 1153) | PnL = side-signed `qty × (exit - entry)`; **no leverage factor**; returns 0 when `entry<=0`, `exit_size<=0`, `exit_price<=0`, or any input is non-finite |
| `TradeSlot::mark_price` (~line 394) | no `× leverage` in unrealized; a mark NEVER becomes entry basis — missing basis flags `metadata.entry_basis_missing=true`, unrealized stays 0 |
| `KernelCore::fill_matches_order` (new) | identity match on `venue_order_id` / `venue_client_id` |
| `KernelCore::apply_fill` | entry/exit routing by ORDER IDENTITY first, FSM state second (`!id_matches_exit` / `!id_matches_entry` guards); entry basis = **VWAP across entry fills** (`(prev_basis×prev_filled + price×fill)/accumulated`); price-less exit fill reduces size, books 0 PnL, flags `metadata.realized_skipped_no_price=true` |
Rebuild required: `cargo build --release` in `_rust_kernel/` (the `.so` is
only auto-built when missing — **source/binary drift is a known hazard**;
add the build to the commit checklist). `cargo test`: 32/32 green as of spec.
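As a reading aid for the table above, here is a minimal Python reference model of the two formulas. It is a sketch, not the Rust code: the function names, the `"LONG"`/`"SHORT"` side strings, and the behaviour of `vwap_basis` when nothing has been filled are assumptions made for illustration.

```python
import math


def realized_pnl(side: str, entry: float, exit_price: float, exit_size: float) -> float:
    """Leverage-free realized PnL: side-signed qty x (exit - entry)."""
    values = (entry, exit_price, exit_size)
    # Guard: any non-finite or non-positive input books 0.
    if not all(math.isfinite(v) for v in values):
        return 0.0
    if entry <= 0 or exit_price <= 0 or exit_size <= 0:
        return 0.0
    sign = 1.0 if side == "LONG" else -1.0  # side strings are hypothetical
    return sign * exit_size * (exit_price - entry)


def vwap_basis(prev_basis: float, prev_filled: float, price: float, fill: float) -> float:
    """Entry basis as VWAP across entry fills."""
    accumulated = prev_filled + fill
    if accumulated <= 0:
        return prev_basis  # assumption: keep the old basis when nothing is filled
    return (prev_basis * prev_filled + price * fill) / accumulated


# Clean FET short: entry 0.1878, exit 0.1866, 202,276 units, roughly +242.73 gross.
print(round(realized_pnl("SHORT", 0.1878, 0.1866, 202_276), 2))
```

Note that exchange leverage is not a parameter of either function, which is the dual-leverage invariant expressed in code form.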
### 0.2 `prod/clean_arch/dita_v2/bingx_venue.py`
Fill events must carry a TRUE fill price or 0.0 — never the order's nominal
`price` / submit `receipt.price` (BingX MARKET bound price, ±20-25%):
- `_events_from_submit` fill event (~line 585): `_row_float(ack_row,
"avgPrice","ap","lastFillPrice","L", default=0.0)`
- `_event_from_row` (~line 697): fills use the same true-price chain;
non-fill events (ACK/CANCEL/REJECT) may keep nominal `price` as info
- `_fill_event_from_row` (~line 736): `"lastFillPrice","L","avgPrice","ap"`
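The true-price chain can be pictured as "first usable key wins, otherwise 0.0". The helper below is a hypothetical stand-in for `_row_float`, written only to illustrate that behaviour; the real helper's signature and its handling of zero or malformed values may differ.

```python
import math
from typing import Any, Mapping


def row_float(row: Mapping[str, Any], *keys: str, default: float = 0.0) -> float:
    """Return the first key whose value parses to a finite, positive float."""
    for key in keys:
        raw = row.get(key)
        if raw is None or raw == "":
            continue
        try:
            value = float(raw)
        except (TypeError, ValueError):
            continue
        # Assumption: a zero or negative price is treated as "no price here".
        if math.isfinite(value) and value > 0:
            return value
    return default


FILL_PRICE_KEYS = ("avgPrice", "ap", "lastFillPrice", "L")

ack_with_fill = {"price": "0.229", "avgPrice": "0.1866"}
ack_bound_only = {"price": "0.229"}  # only the protective bound is present

print(row_float(ack_with_fill, *FILL_PRICE_KEYS))   # 0.1866
print(row_float(ack_bound_only, *FILL_PRICE_KEYS))  # 0.0
```

Because the nominal `price` key is never in the chain, a row that carries only the bound price yields 0.0 and the kernel takes the price-less path instead of booking against 0.229.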
### 0.3 `prod/clean_arch/dita_v2/rust_backend.py`
- `reconcile_from_slots`: seeds `_last_settled_pnl[slot_id] = slot.realized_pnl`
and `_slot_was_closed[slot_id] = slot.closed` for every adopted slot.
- `restore_state`: same re-anchoring after successful restore.
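To see why the re-anchoring matters, consider a toy model of the settle baseline. The class and method names below are hypothetical; only the `_last_settled_pnl` attribute name comes from this spec.

```python
from typing import Dict


class SettleBaseline:
    """Toy model: settle only the PnL increment since the last settle."""

    def __init__(self) -> None:
        self._last_settled_pnl: Dict[int, float] = {}

    def seed(self, slot_id: int, realized_pnl: float) -> None:
        # What reconcile_from_slots / restore_state do for adopted slots.
        self._last_settled_pnl[slot_id] = realized_pnl

    def increment(self, slot_id: int, realized_pnl: float) -> float:
        delta = realized_pnl - self._last_settled_pnl.get(slot_id, 0.0)
        self._last_settled_pnl[slot_id] = realized_pnl
        return delta


# A slot carried 100.0 of realized PnL across a restart, then earns 50.0 more.
unseeded = SettleBaseline()                  # empty baseline after restart
double_settled = unseeded.increment(0, 150.0)

seeded = SettleBaseline()
seeded.seed(0, 100.0)                        # re-anchored on adoption
incremental = seeded.increment(0, 150.0)

print(double_settled, incremental)           # 150.0 50.0
```

The unseeded case settles the carried 100.0 a second time; the seeded case settles only the new 50.0.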
### 0.4 Adjacent fixes riding the same commit
- `prod/ch_writer.py`: insert URLs append `&date_time_input_format=best_effort`;
flush errors log at WARNING (first 10 + every 100th), counter `_flush_errors`.
- `prod/clean_arch/dita_v2/blue_parity.py` `price_of`: hyphen-tolerant
fallback (`FET-USDT` → `FETUSDT`) — fixes the unmanaged-position block.
- `prod/clickhouse/users.xml`: `date_time_input_format=best_effort` for the
`dolphin` user (NOTE: running CH container did not honor it even after
restart — the container does not mount compose configs; effective on next
compose recreation. The client-side URL param is the operative fix.)
- `prod/tests/test_dita_v2_kernel.py`: partial→full fill test updated to
incremental `filled_size` semantics (BingX WS `lastFilledQty`).
### 0.5 Phase 0 gates
1. `cargo test` in `_rust_kernel`: 32/32.
2. `pytest prod/tests/test_dita_v2_kernel.py`: 7/7.
3. `pytest prod/clean_arch/dita_v2/test_exec_router_runtime.py
test_venue_reconcile.py test_orphan_prevention.py
prod/tests/test_pink_async_fill_pump.py
prod/clean_arch/dita_v2/test_account_core_v2.py test_bingx_bugs.py`: 134/134.
4. KNOWN pre-existing failures (NOT introduced by this work — verified by
hunk-revert): 4 tests in `prod/tests/test_dita_v2_bingx_adapter.py`
(snapshot-fill emission broke when sync `submit()` started passing None
snapshots on 2026-06-10). Fix or quarantine them explicitly in this phase
— do not let them mask new regressions.
5. Restart `dolphin_pink` at a FLAT moment; verify in logs: no
`realized_skipped_no_price` storms, no `entry_basis_missing` on fresh
entries, first round-trip books PnL within ±(fees+slippage) of
`GET /openApi/swap/v2/user/income` for the same trade.
---
## 3. Phase 1 — E-anchored published capital
**Goal**: the capital that persistence/HZ/sizer see is exchange-anchored;
K never publishes.
### 3.1 `prod/clean_arch/dita_v2/account.py`
- Add to `AccountSnapshot`: `capital_source: str` (`"e_anchored" |
"k_bridged" | "seed"`), `e_wallet_balance: float`, `event_seq: int`.
- New method `AccountProjection.anchor_to_exchange(wallet_balance: float,
available_margin: float, event_seq: int)`: sets `capital = wallet_balance`
(guard `>0` and finite — the zero-wb frame lesson), `capital_source =
"e_anchored"`, recomputes equity. `settle()` remains for the BRIDGE case
only: between anchors, capital += realized (`capital_source="k_bridged"`).
- `settle(realized_pnl, fees)`: **stop ignoring fees** — `capital +=
realized_pnl - fees` (today fees only accumulate in `fees_paid`; published
capital ignores them between reseeds).
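A minimal sketch of the anchor/bridge behaviour described above, under stated assumptions: the class is a stand-in rather than the real `AccountProjection`, rejection is signalled by returning `False` (the spec does not say whether to return or raise), and the equity recomputation and any use of `available_margin` are omitted.

```python
import math
from dataclasses import dataclass


@dataclass
class AccountSketch:
    capital: float = 0.0
    fees_paid: float = 0.0
    capital_source: str = "seed"
    e_wallet_balance: float = 0.0
    event_seq: int = 0

    def anchor_to_exchange(self, wallet_balance: float,
                           available_margin: float, event_seq: int) -> bool:
        # Guard against zero, negative, NaN and infinite wallet balances.
        if not math.isfinite(wallet_balance) or wallet_balance <= 0:
            return False
        self.capital = wallet_balance
        self.e_wallet_balance = wallet_balance
        self.capital_source = "e_anchored"
        self.event_seq = event_seq
        return True

    def settle(self, realized_pnl: float, fees: float) -> None:
        # Bridge case only: between anchors, capital moves by realized minus fees.
        self.capital += realized_pnl - fees
        self.fees_paid += fees
        self.capital_source = "k_bridged"
```

In this model a settle between two anchors moves capital by `realized_pnl - fees`, and the next valid anchor snaps capital to the wallet balance exactly, whatever the bridge had accumulated.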
### 3.2 `prod/clean_arch/runtime/pink_direct.py`
- The existing reseed path (balance-bearing ACCOUNT_UPDATE →
`kernel.reset_and_seed(wb)`) additionally calls
`kernel.account.anchor_to_exchange(...)` — one anchoring action, two
ledgers consistent.
- Boot seed (launcher `exchange_balance_capital` block, pink_direct ~line
262) goes through `anchor_to_exchange` instead of direct attribute writes.
### 3.3 Gates
- New unit tests (`prod/tests/test_pink_account_anchor.py`):
anchor sets capital/source; zero/negative/NaN wb rejected; settle bridges
with fees; anchor after bridge snaps to wb exactly.
- Shadow check (live, 24 h on VST): published capital vs
`GET /openApi/swap/v2/user/balance` polled 1/min — max |Δ| outside a
trade-settlement window ≤ $0.01; during settlement ≤ pending-fee bound.
---
## 4. Phase 2 — Single atomic snapshot, ledger consolidation
**Goal**: one immutable, versioned account snapshot; the two redundant
ledgers demoted/removed.
### 4.1 `prod/clean_arch/dita_v2/account.py`
- Make the published snapshot **immutable-replace**: `AccountProjection`
builds a new frozen `AccountSnapshot` (carry `event_seq`) on every
mutation and swaps a single reference (GIL-atomic). Readers must take
`snap = kernel.account.snapshot` once per use (audit call sites:
`pink_clickhouse.py`, `hazelcast_projection.py` HZ writer, `pink_direct`).
- `AccountProjectionV2`: DELETE, or move to `prod/clean_arch/dita_v2/
_attic/` with a module docstring pointing here. Its only live-path import
is `exchange_event.py` — migrate that import or the dataclasses it uses
(`EPosition` is genuinely useful; keep it in `account.py`).
- The Rust `AccountState` K-ledger STAYS — demoted by documentation and by
Phase 1 (it no longer feeds published capital): its jobs are reconcile
classification (R1-style), `capital_frozen`, and E-dark bridging. Update
the module docstring to say exactly this.
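The immutable-replace pattern can be sketched as follows. The names are illustrative stand-ins for `AccountSnapshot` and `AccountProjection`, and the sketch assumes a single writer: swapping one reference is atomic under the CPython GIL, but concurrent writers doing read-modify-write would still need a lock.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class SnapshotSketch:
    capital: float
    equity: float
    event_seq: int
    capital_source: str = "seed"


class ProjectionSketch:
    def __init__(self, capital: float) -> None:
        self.snapshot = SnapshotSketch(capital=capital, equity=capital, event_seq=0)

    def publish(self, event_seq: int, **changes) -> None:
        # Build a complete new snapshot, then swap the single reference.
        self.snapshot = replace(self.snapshot, event_seq=event_seq, **changes)


proj = ProjectionSketch(1000.0)

snap = proj.snapshot  # reader takes the reference ONCE per use
proj.publish(7, capital=1010.0, equity=1010.0, capital_source="e_anchored")

# The reader's view stays internally consistent; the new view is separate.
print(snap.capital, snap.event_seq)                    # 1000.0 0
print(proj.snapshot.capital, proj.snapshot.event_seq)  # 1010.0 7
```

A reader that dereferences `proj.snapshot` several times within one computation could mix fields from two versions, which is why the call-site audit matters.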
### 4.2 `prod/clean_arch/persistence/pink_clickhouse.py`
- Read capital/equity/peak/trade_seq from the single snapshot reference;
no recomputation.
- Add columns to emitted rows (and the matching `ALTER TABLE` DDLs under
`prod/clickhouse/pink/08_provenance.sql` — **apply DDLs to CH BEFORE
deploying code that emits them**; the missing-table head-of-line jam of
2026-06-11 is the cautionary tale):
- `account_events`, `status_snapshots`: `capital_source LowCardinality(String) DEFAULT ''`,
`account_event_seq UInt64 DEFAULT 0`
- `trade_events`, `trade_exit_legs`: `pnl_source LowCardinality(String) DEFAULT ''`
(`exchange` | `kernel_estimate`)
- `bars_held`: clamp to `max(0, …)` at row-build time (UInt16 column;
negative values currently cause an HTTP 400 on `trade_events` and vanish
silently on async tables).
- Timestamps: route every `ts` through one helper emitting **naive-UTC
microsecond ISO** (no `+00:00`) — best_effort already tolerates both, but
rows must stop depending on a parser setting.
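The two row-build rules above (the `bars_held` clamp and the single timestamp helper) might look like the sketch below. The helper names are hypothetical, naive datetimes are assumed to already be UTC, and the `T` separator is kept only because the spec says ISO.

```python
from datetime import datetime, timedelta, timezone


def clamp_bars_held(bars_held: int) -> int:
    """Never emit a negative value into the UInt16 column."""
    return max(0, int(bars_held))


def ch_ts(ts: datetime) -> str:
    """Naive-UTC microsecond ISO string, with no '+00:00' suffix."""
    if ts.tzinfo is not None:
        # Convert aware datetimes to UTC, then drop the offset.
        ts = ts.astimezone(timezone.utc).replace(tzinfo=None)
    # Assumption: naive inputs are already UTC.
    return ts.isoformat(timespec="microseconds")


cest = timezone(timedelta(hours=2))
print(ch_ts(datetime(2026, 6, 11, 16, 30, tzinfo=cest)))  # 2026-06-11T14:30:00.000000
print(clamp_bars_held(-3))                                # 0
```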
### 4.3 Duplicate-emission fix (same file)
Every CH row is currently emitted twice (visible in any query). Hunt the
double call: instrument `_sink()` with a per-(table, content-hash) debug
counter in a test, then trace the two call paths (suspect: `persist_result`
invoked both from the runtime step and from the fill pump for the same
event). Fix at the caller level; do NOT dedupe by content in the sink
(masks real double-events). Regression test: one simulated round trip →
exactly one row per logical event per table.
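One possible shape for the test-only instrumentation is a counting stand-in for `_sink()`. The class below is hypothetical and assumes rows are plain dicts; it detects duplicates for the test to assert on, and deliberately does not suppress them.

```python
import hashlib
import json
from collections import Counter
from typing import Any, Dict, Tuple


class CountingSink:
    """Test double: counts emissions per (table, content-hash)."""

    def __init__(self) -> None:
        self.seen: Counter = Counter()

    def __call__(self, table: str, row: Dict[str, Any]) -> None:
        payload = json.dumps(row, sort_keys=True, default=str).encode()
        self.seen[(table, hashlib.sha256(payload).hexdigest())] += 1

    def duplicates(self) -> Dict[Tuple[str, str], int]:
        return {key: n for key, n in self.seen.items() if n > 1}


sink = CountingSink()
row = {"trade_id": "t1", "pnl": 1.0}
sink("trade_events", row)
sink("trade_events", dict(row))          # the same logical event, emitted twice
sink("account_events", {"capital": 1000.0})

print(len(sink.duplicates()))            # 1
```

The regression test would then assert `sink.duplicates() == {}` after one simulated round trip.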
### 4.4 `prod/ch_writer.py`
- `wait_for_async_insert`: `"1"` for ALL `dolphin_pink` tables (accounting
rows must never be silently lost; the spool absorbs latency). `0` is
acceptable only for high-volume shadow tables, and only if measurement
shows it is necessary; document any exception inline.
- Mitigation for head-of-line (full redesign out of scope): after
`attempts > 1000` on a row, log ERROR with the CH response body once per
100 attempts (today the reject reason is invisible without manual replay).
### 4.5 Gates
- Full offline suite (the 533+ DITAv2/PINK set) green, minus the Phase-0
quarantined adapter tests if still open.
- One live VST round trip: every table gets exactly one row per event;
`pnl_source`/`capital_source` populated; CH `system.text_log` shows zero
parse rejections for `dolphin_pink`.
---
## 5. Phase 3 — Sizer feedback off trade-realized PnL
**THE one seam where this refactor can silently change alpha behavior.**
### 5.1 `prod/clean_arch/runtime/pink_direct.py` — `_sizer_trade_feedback` (~line 1453)
Today: `pnl = acc.capital - self._sizer_entry_capital` (capital delta).
Under E-anchored capital this absorbs funding, fees of other activity, and
**foreign fills from the shared VST account** (PRODGREEN collision class).
Change to:
```
pnl = slot_realized_for_trade(trade_id) # Σ slot.realized_pnl legs, i.e.
# kernel estimate, overridden by
# exchange rp when settled (5.2)
```
Source: the closing slot dict already carries `realized_pnl`; use it (minus
the fees recorded for the trade when available) instead of the capital
delta. Keep the magnitude semantics the sizer expects (sign + rough size —
per the existing comment, bucket/streak multipliers only need that).
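The difference between the two feedback signals shows up as soon as a foreign fill lands mid-trade. In the sketch below the leg dictionaries, their keys, and all figures are invented for illustration; only the idea of summing `realized_pnl` per trade and subtracting recorded fees comes from this section.

```python
from typing import Any, Iterable, Mapping, Optional


def slot_realized_for_trade(legs: Iterable[Mapping[str, Any]], trade_id: str,
                            fees: Optional[float] = None) -> float:
    """Sum realized PnL over the legs of one trade, minus fees when known."""
    total = sum(leg["realized_pnl"] for leg in legs if leg["trade_id"] == trade_id)
    return total - (fees or 0.0)


entry_capital = 10_000.0
legs = [
    {"trade_id": "t1", "realized_pnl": 100.0},
    {"trade_id": "t1", "realized_pnl": 70.0},
    {"trade_id": "other", "realized_pnl": 500.0},  # foreign fill, shared account
]

# Capital after the trade also reflects the foreign fill.
capital_now = entry_capital + 100.0 + 70.0 + 500.0

capital_delta_feedback = capital_now - entry_capital          # old behaviour
trade_feedback = slot_realized_for_trade(legs, "t1", fees=6.0)  # new behaviour

print(capital_delta_feedback, trade_feedback)  # 670.0 164.0
```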
### 5.2 Exchange override (E-led repair) — `bingx_user_stream.py` + `rust_backend.py`
- The WS `FILL_SETTLED` path already carries the exchange's realized (`rp`)
and fee (`n`, sign-flipped at boundary per BingX quirks memory). Extend
the kernel account-event payload with `trade_id`, and on receipt:
- if the matching slot leg was flagged `realized_skipped_no_price`,
ADD the exchange realized to `slot.realized_pnl` (repair) and clear
the flag; settle the increment through the normal baseline mechanism;
- else record `pnl_source="exchange"` for the trade-event row (the
estimate stays as the booked figure unless `|estimate - rp|` exceeds a
tolerance — then log ERROR + emit an `anomaly_events` row; do NOT
silently re-book).
- Rust: add `dita_kernel_repair_realized(slot_id, amount)` FFI (or fold the
repair into `on_account_event` with `slot_id` in payload). Keep it
idempotent via the existing account-event dedup.
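The two branches above can be modelled in a few lines. This is a sketch under assumptions: the types and the function name are hypothetical, the tolerance is a free parameter, the anomaly is returned as a flag rather than logged or written to `anomaly_events`, and the account-event dedup is not modelled.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class SlotSketch:
    realized_pnl: float = 0.0
    metadata: Dict[str, bool] = field(default_factory=dict)


@dataclass
class SettleOutcome:
    pnl_source: str
    repaired: bool
    anomaly: bool


def on_fill_settled(slot: SlotSketch, rp: float, leg_estimate: float,
                    tolerance: float) -> SettleOutcome:
    if slot.metadata.get("realized_skipped_no_price"):
        # Repair: the leg booked 0, so add the exchange realized and clear the flag.
        slot.realized_pnl += rp
        slot.metadata["realized_skipped_no_price"] = False
        return SettleOutcome("exchange", repaired=True, anomaly=False)
    # Otherwise the estimate stays booked; a large gap is only flagged.
    anomaly = abs(leg_estimate - rp) > tolerance
    return SettleOutcome("exchange", repaired=False, anomaly=anomaly)
```

For a price-less exit leg followed by a settled `rp` of 164.0, the slot ends at 164.0 with `repaired=True`. For a normal leg whose estimate differs from `rp` by more than the tolerance, `realized_pnl` is left untouched and only the anomaly flag is raised.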
### 5.3 Gates
- Unit: feedback receives trade-realized, not capital delta (simulate a
foreign-fill capital jump mid-trade → feedback unaffected).
- Unit: price-less exit leg + later FILL_SETTLED repair → slot realized
equals exchange `rp`; settle baseline consistent (no double-settle).
- Parity: `test_blue_parity.py`, `test_alpha_blue_untouched_g7.py` green
(sizer behavior unchanged for normal fills).
---
## 6. Phase 4 — Kernel hardening leftovers
### 6.1 `lib.rs` — `resolve_slot` (~line 1099)
Today `resolve_slot` falls back to **slot 0** when nothing matches. Change: return
`Option<usize>`; on `None`, `on_venue_event` returns
`UNRESOLVED_SLOT` (diagnostic exists already) without mutating any slot,
severity WARNING, event recorded in outcome details. Python callers: the
runtime treats UNRESOLVED_SLOT as a logged no-op (the `_fill_is_ours`
filter remains first-line defense; this is kernel-side defense for
venue-agnostic reuse).
NOTE: several tests construct events with `slot_id=-1` expecting slot-0
fallback — update them to pass explicit `slot_id=0` (behavioral test
change; list each in the PR description).
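The intended contract, modelled in Python for readability (the real change is in Rust, where the return type is `Option<usize>`). The matching rule, the dict-shaped slots and events, and the outcome layout are assumptions for illustration only.

```python
from typing import Any, Dict, List, Optional

UNRESOLVED_SLOT = "UNRESOLVED_SLOT"


def resolve_slot(slots: List[Dict[str, Any]], slot_id: int,
                 venue_order_id: Optional[str]) -> Optional[int]:
    """Return a slot index, or None when nothing matches (no slot-0 fallback)."""
    if 0 <= slot_id < len(slots):
        return slot_id
    if venue_order_id:
        for idx, slot in enumerate(slots):
            if slot.get("venue_order_id") == venue_order_id:
                return idx
    return None


def on_venue_event(slots: List[Dict[str, Any]], event: Dict[str, Any]) -> Dict[str, Any]:
    idx = resolve_slot(slots, event.get("slot_id", -1), event.get("venue_order_id"))
    if idx is None:
        # No slot is mutated; the event is only recorded in the outcome details.
        return {"code": UNRESOLVED_SLOT, "severity": "WARNING",
                "details": {"event": dict(event)}}
    slots[idx]["events_seen"] = slots[idx].get("events_seen", 0) + 1
    return {"code": "OK", "slot": idx}
```

With this shape, an event carrying `slot_id=-1` and an unknown order id leaves every slot untouched, which is the behaviour the updated tests should assert.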
### 6.2 ID-less fill routing (documentation + metric, not code)
BingX WS omits clientOrderId, so identity routing can't always engage.
Add a counter metric (`fills_routed_by_state_total`) via an
`anomaly_events` row per occurrence, severity INFO — gives VIOLET the data
to justify per-venue synthetic ids later. No FSM behavior change.
### 6.3 Gates
- New Rust tests: unresolved event mutates nothing; entry-id fill during
EXIT_WORKING routes to entry (already covered by Phase-0 routing — add
the explicit case); price-less exit leg books 0 + flag.
---
## 7. Test matrix (run-order for the implementing agent)
| Stage | Command (env: `PYTHONPATH=/mnt/dolphinng5_predict:/mnt/dolphinng5_predict/nautilus_dolphin`, venv `/home/dolphin/siloqy_env/bin/python3`) | Pass bar |
|---|---|---|
| Rust unit | `cargo test --release` in `_rust_kernel/` | 100% |
| Kernel FSM | `pytest prod/tests/test_dita_v2_kernel.py` | 100% |
| Bridge/accounting | `pytest prod/tests/test_pink_ditav2_kernel_bridge.py test_pink_ditav2_accounting_invariants.py prod/clean_arch/dita_v2/test_account_core_v2.py` | 100% |
| Runtime/reconcile | `pytest prod/clean_arch/dita_v2/test_venue_reconcile.py test_orphan_prevention.py test_exec_router_runtime.py prod/tests/test_pink_async_fill_pump.py test_pink_direct_runtime.py` | 100% |
| Chaos | `pytest prod/tests/test_pink_ditav2_chaos_harness.py` + `test_dita_v2_e2e_functional.py` | 100% |
| Parity | `pytest prod/clean_arch/dita_v2/test_blue_parity.py test_alpha_blue_untouched_g7.py` | 100% |
| Adapter | `pytest prod/tests/test_dita_v2_bingx_adapter.py` | 100% after Phase-0 item 4 resolution |
| LIVE VST E2E | `python prod/ops/dita_v2_live_bingx_smoke.py --pink --symbol TRXUSDT` | suite green |
| **Golden replays (NEW — write these)** | `prod/tests/test_pink_accounting_golden.py` | see below |
| Shadow soak | 24-48 h on VST | capital vs balance ≤ $0.01 idle |
### Golden replay tests (the heart of the acceptance)
Feed the kernel the recorded FET event sequence (entry fills 195,259 +
7,017 @ 0.1878; exit fills 26,007 + remainder; the poisoned variant with
price=0.229 and the clean variant with 0.1866):
1. Clean prices → realized = `(0.1878 - 0.1866) × 202,276 ≈ +242.7` gross.
2. Poisoned price (0.229) reaching the kernel anyway → with the adapter fix
it must arrive as 0.0 → leg books 0 + `realized_skipped_no_price`; after
synthetic FILL_SETTLED rp=+164 → slot realized = +164, `pnl_source=exchange`.
3. Restart mid-position (save_state/restore_state + reconcile_from_slots)
→ next venue event settles ONLY the incremental PnL.
4. VWAP: two entry fills at different prices → basis = weighted average.
5. Dual-leverage invariant: same fills at exchange-leverage 1 vs 3 →
**identical realized PnL**; only margin fields differ.
---
## 8. Rollout & rollback
1. Each phase = one PR-sized commit, gates green before the next.
2. Activation requires `supervisorctl restart dolphin_pink` — restart at a
FLAT moment (check `DOLPHIN_STATE_PINK` + exchange positions). The
restart-reconcile path is itself under test here; first restart after
Phase 0 should be watched live.
3. Rollback = `git revert` of the phase commit + rebuild `.so` + restart.
The Rust `.so` MUST be rebuilt on both apply and revert — stale-binary
drift is how the incremental-fill change sat uncompiled until 2026-06-11.
4. CH DDLs are additive (`ADD COLUMN ... DEFAULT`) — no destructive
migrations anywhere in this spec; rollback leaves unused columns, which
is fine.
5. PINK is VST (virtual funds) — it is the canary by construction. Nothing
in this spec touches BLUE files (verify with `git diff --name-only`
against the §38.7 checklist).
## 9. Done criteria (the whole spec)
- All phases merged; full matrix green; golden replays green.
- 48 h VST soak: zero UNEXPLAINED reconcile errors; published capital
tracks exchange balance; every closed trade's `trade_events.pnl` within
fees+slippage of the exchange income record, with `pnl_source` populated.
- `pink_ctl.py mode-verify` passes (namespace isolation intact).
- SYSTEM BIBLE §38 addendum updated (one paragraph: E-led ledger, K as
checksum, provenance fields) + `DITA_V2_KERNEL_REFERENCE.md` §"Capital
simplification" rewritten to match reality.