Commit Graph

124 Commits

Author SHA1 Message Date
Codex
130c466cc9 SYSTEM BIBLE v7.1: OOM recovery runbook + supervisord safety rules
Add §16.10 corrected daemon start sequence (supervisord NOT auto-started on boot),
§16.12 critical supervisord.conf rules (no /tmp paths, OBF starvation → BLUE freeze,
pre-restart position check), §16.13 OOM recovery runbook with exact commands.

Incident context (2026-06-08):
- Previous agent set nautilus_trader to /tmp/blue_runtime_mirror/ — broken after OOM reboot
- OBF died during BLUE run, degraded gate for 285+ bars, BLUE stuck in RETRACT on LTCUSDT
- Fix: revert supervisord.conf to /mnt canonical paths, restart supervisord

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-08 13:25:50 +02:00
Codex
8f57f4d855 CRITICAL: fix nautilus_trader supervisord path — revert /tmp clone to /mnt canonical
A previous agent changed nautilus_trader to run from /tmp/blue_runtime_mirror/prod/
and updated PYTHONPATH + DOLPHIN_LOCAL_RUNTIME_ROOT to match. /tmp/blue_runtime_mirror
no longer exists after the OOM reboot, so supervisord could not start BLUE.

Fix: restore canonical paths for nautilus_trader:
  command:    /mnt/dolphinng5_predict/prod/nautilus_event_trader.py
  directory:  /mnt/dolphinng5_predict/prod
  PYTHONPATH: /mnt/dolphinng5_predict:/mnt/dolphinng5_predict/nautilus_dolphin:/mnt/dolphinng5_predict/prod
  DOLPHIN_LOCAL_RUNTIME_ROOT: /mnt/dolphinng5_predict

Rule: NEVER point nautilus_trader at /tmp. /tmp dirs are volatile;
canonical trader binaries must always be referenced via /mnt paths.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-08 13:19:29 +02:00
Codex
0eac51d2e9 PINK: docs v7 + reset_and_seed startup fix — 451/451 tests green
- SYSTEM_BIBLE.md → v7.0: documents fee-sign fix (Defect A), opening-fee
  fix (Defect B), WARN-unfreeze, orphan prevention, reset_and_seed startup,
  and vel_div env-override.
- CAPITAL_BOOKKEEPING_DESIGN.md: status updated to PHASE-1 BUGFIXES APPLIED;
  sections 8.1-8.4 (applied fixes + 34-test coverage) were already present.
- rust_backend.py: expose dita_kernel_reset_and_seed() via _RustKernelLib +
  ExecutionKernel.reset_and_seed(); zeros stale K-accumulators at startup so
  K=E=live_capital → delta=0 → capital_frozen=False on every clean restart.
- pink_direct.py: call kernel.reset_and_seed(live_capital) after
  _restore_kernel_snapshot() so BingX is always the ledger of record.
- launch_dolphin_pink.py: DOLPHIN_PINK_VEL_DIV_THRESHOLD env-var override
  for on-exchange debugging; BLUE unaffected.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-08 12:33:50 +02:00
Codex
e38ec77221 PINK: fix fee-sign bug + WARN-unfreeze — 451/451 tests green
Defect A (fee sign): bingx_user_stream._normalise_order flipped to
  fee = -raw_fee so BingX negative-n costs arrive as positive kernel
  costs.  k_maker_rebates no longer accumulates phantom rebates.

Defect B (opening fee dropped): fill_qty now falls back to "z"
  (cumFilledQty) when "l" (lastFilledQty) is zero/absent, so
  apply_predicted_fill computes a non-zero opening-leg fee.
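
A minimal sketch of Defects A and B, assuming the order update arrives as a flat dict. The helper name `normalise_fill` is hypothetical and is not the real `bingx_user_stream._normalise_order`; only the keys "n", "l", "z" and the sign flip come from the message above.

```python
def normalise_fill(order: dict) -> dict:
    """Hypothetical helper illustrating the fee-sign and fill-qty fixes."""
    raw_fee = float(order.get("n") or 0.0)   # BingX reports a cost as a negative "n"
    last_qty = float(order.get("l") or 0.0)  # lastFilledQty
    cum_qty = float(order.get("z") or 0.0)   # cumFilledQty
    return {
        # Defect A: flip the sign so a cost reaches the kernel as a positive fee.
        "fee": -raw_fee,
        # Defect B: fall back to the cumulative quantity when "l" is zero or absent.
        "fill_qty": last_qty if last_qty > 0 else cum_qty,
    }
```

With this shape, an opening fill that reports `"l": "0"` still yields a non-zero quantity for the fee calculation.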

Architectural fix (WARN unfreezes): lib.rs reconcile() now unfreezes
  capital_frozen on WARN as well as OK.  WARN (0.01-20 USDT delta) is
  normal in-flight settlement — only ERROR (≥20, unexplained) should
  halt ENTERs.  The old keep-state logic trapped the kernel permanently
  frozen after the first trade's ENTER predicted-fee phase pushed delta
  briefly into ERROR.
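
The WARN-unfreeze rule can be sketched in Python as follows. The real logic lives in lib.rs; the function name, the OK bound below 0.01 USDT, and the omission of the "unexplained" qualifier on ERROR are assumptions made for illustration.

```python
def reconcile_status(k_capital: float, exchange_balance: float) -> tuple:
    """Hypothetical classifier: returns (status, capital_frozen)."""
    delta = abs(k_capital - exchange_balance)
    if delta < 0.01:        # assumed OK band
        status = "OK"
    elif delta < 20.0:      # WARN: 0.01-20 USDT, normal in-flight settlement
        status = "WARN"
    else:                   # ERROR: >= 20 USDT
        status = "ERROR"
    # Only ERROR freezes capital; OK and WARN both unfreeze.
    return status, status == "ERROR"
```

The point of the design is that the frozen flag is recomputed on every reconcile rather than kept from the previous state, so a transient ERROR cannot latch.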

Acceptance criterion: |k_capital - bingx_balance| < 1 USDT, frozen=False
after every round-trip trade — verified numerically against T-1/T-2
ground truth from the CRITICAL doc.

Docs: CRITICAL_AGENT-TODO_ACCOUNTING_BUGFIX.md §12-13 (fix record),
      CAPITAL_BOOKKEEPING_DESIGN.md §8 (kernel spec), SYSTEM_BIBLE §11.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-08 11:08:31 +02:00
Codex
7e83a5c5c5 PINK: fix event-loop corruption in open_positions/open_orders + 23-test orphan suite
Root cause: open_positions()/open_orders() called _backend_snapshot() ->
_call_backend() -> _run() -> pool.submit(asyncio.run, coro), which spawned a
temporary event loop in a worker thread. The httpx AsyncClient was created
inside that temporary loop, and the loop closed immediately afterwards. All
subsequent HTTP calls raised "Event loop is closed" or reported an
asyncio.locks.Event bound to a different loop. The crash triggered WS stream
reconnects; each reconnect re-ran reconcile with N>1 BingX positions and
orphaned all but the largest.

Fix: open_positions()/open_orders() now read backend._state (populated by
await backend.connect() in the main loop). Fallback to _backend_snapshot()
for callers without a connected backend.
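
The root cause is reproducible with the standard library alone. This sketch only demonstrates that `asyncio.run()` in a worker thread leaves behind a closed loop; the helper names are hypothetical and no httpx client is involved.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


async def _capture_loop() -> asyncio.AbstractEventLoop:
    # Anything created here (clients, locks) would be tied to this loop.
    return asyncio.get_running_loop()


def loop_from_worker_thread() -> asyncio.AbstractEventLoop:
    """Mimic pool.submit(asyncio.run, coro): run a coroutine on a throwaway loop."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        return pool.submit(asyncio.run, _capture_loop()).result(timeout=5)


temp_loop = loop_from_worker_thread()
print(temp_loop.is_closed())  # True: asyncio.run() closes its loop on exit
```

Any object that keeps a reference to `temp_loop` is unusable afterwards, which is why reading cached state from the main loop avoids the problem.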

Fixes test_bingx_bugs::TestConnectNoDoubleRefresh: connect() is now async.

New test_orphan_prevention.py: 23 tests covering all 5 orphan mechanisms:
  A. open_positions/open_orders use backend._state, never hit thread pool
  B. connect() awaitable, backend.connect() runs in main event loop
  C. Reconcile guard: >1 position logs ERROR and takes only largest
  D. clientOrderId p-action-base36-rand4 on every order
  E. EXIT sizing capped to kernel slot_size

391 passed, 2 skipped, 0 failed across all 14 test files.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-06 20:53:41 +02:00
Codex
a3169b762d PINK: reconcile guard — refuse to silently drop orphan positions on restart
_reconcile_position_slot passed all N BingX positions (all slot_id=0) to
reconcile_from_slots; with N>1 the kernel silently took one and forgot the
rest. Now: sort by size desc, take only the largest, log ERROR naming every
ignored orphan symbol. Caller must flatten exchange to 0 before restarting.
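
A minimal sketch of the guard, assuming each position is a dict with "symbol" and "size" keys and that "largest" means largest absolute size. The function name and logger name are hypothetical.

```python
import logging
from typing import Optional

logger = logging.getLogger("pink.reconcile")  # hypothetical logger name


def pick_largest_position(positions: list) -> Optional[dict]:
    """Keep only the largest position and log every ignored orphan."""
    if not positions:
        return None
    ranked = sorted(positions, key=lambda p: abs(p["size"]), reverse=True)
    kept, orphans = ranked[0], ranked[1:]
    if orphans:
        logger.error(
            "Ignoring %d orphan position(s): %s; flatten the exchange before restarting",
            len(orphans),
            ", ".join(p["symbol"] for p in orphans),
        )
    return kept
```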

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-06 18:12:07 +02:00
Codex
10a44d86b1 PINK: wire CH persistence to monitor + add missing friction DDL
- migrate_pink_sink_schema.sql: ALTER TABLE adds fee/fee_source/is_maker/
  slippage_bps/mark_at_submit/exchange_ts to trade_events and trade_exit_legs;
  CREATE TABLE fee_settled_events (was missing entirely). DDL already applied.
- monitor_pink.py: wire real PinkClickHousePersistence so monitor roundtrips
  write to dolphin_pink CH tables. Adds _make_decision_intent() helper; calls
  persist_step() after ENTER and EXIT. Persistence failure is non-fatal (warns
  and continues). 42 persistence tests green.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-06 11:10:49 +02:00
Codex
33d8e855c8 PINK: fix EXIT position not closing — 3 root causes, 368/368 tests green
Root cause 1 (http.py): duplicate signature= in POST body — canonical_query
included signature key after build_signed_params injected it, then body
appended &signature= again. Fix: exclude 'signature' from canonical.

Root cause 2 (bingx_direct + http.py): HTTP retry sent same MARKET order to
backup URL (bingx.pro), which hits the same VST account. Without clientOrderId,
each retry opened a new SHORT position; EXIT BUY 10 only closed one. Fix:
restore clientOrderId in hyphen format p-{e/x}-{base36_ts}-{rand4} (pure
alphanumeric rejected by VST; hyphen format accepted). Adds max_retries_override
+ urls_to_try to _request_json for non-idempotent override path.
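
A sketch of the hyphenated clientOrderId format p-{e/x}-{base36_ts}-{rand4}. The helper names, the millisecond timestamp, and the lowercase alphabet are assumptions; only the overall shape comes from the message above.

```python
import random
import string
import time
from typing import Optional

_B36 = string.digits + string.ascii_lowercase


def to_base36(n: int) -> str:
    """Encode a non-negative integer in base 36."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return "0"
    digits = []
    while n:
        n, rem = divmod(n, 36)
        digits.append(_B36[rem])
    return "".join(reversed(digits))


def make_client_order_id(action: str, now_ms: Optional[int] = None) -> str:
    """action is 'e' (ENTER) or 'x' (EXIT); the timestamp unit is an assumption."""
    if action not in ("e", "x"):
        raise ValueError("action must be 'e' or 'x'")
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    rand4 = "".join(random.choices(_B36, k=4))
    return f"p-{action}-{to_base36(now_ms)}-{rand4}"
```

For the retry fix to work, the id would be generated once per order and reused on every retry, including the backup URL, so the exchange can recognise the duplicate.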

Root cause 3 (flat_and_start_pink): k.venue.connect() ran backend.connect()
inside asyncio.run() in a thread-pool. httpx session created there references
a dead event loop; order POSTs raise RuntimeError("Event loop is closed").
Fix: await adapter.connect() directly from main event loop.

Also: enter_wall_ms + tight _is_our_position createTime filter to separate
PINK's position from concurrent strategies on shared VST account. 1.5s
settle sleep before flat check.

New test suite test_bingx_http_safety.py: 20 tests covering idempotency,
retry correctness, backup-URL dedup, event-loop hygiene, signing correctness.

Live result: ENTER 290ms, EXIT 260ms — both sub-second. Position flat.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-06 01:39:35 +02:00
Codex
535eea855d PINK: cancel_async, S2 task guard, 29 new regression tests — 346/346 green
Bug fixes:
  1. bingx_venue.py: add cancel_async() — async cancel that awaits backend.cancel()
     directly in the main event loop. The sync cancel() path goes through _run()
     → thread-pool → asyncio.run() in a new thread, but aiohttp is bound to the
     main loop → deadlock. Identical root cause as the old sync submit() → fixed
     via submit_async. Remove dead cancel_order branch (BingxDirectExecutionAdapter
     has cancel, not cancel_order).

  2. rust_backend.py: process_intent_async CANCEL path now uses cancel_async when
     available (matching the submit_async pattern for ENTER/EXIT). Sync cancel()
     fallback kept for MockVenueAdapter compat.

  3. bingx_direct.py: guard S2 background refresh task per symbol. Old code discarded
     the task reference; rapid submits piled up concurrent _refresh_state_background
     calls all writing self._state in arbitrary completion order (stale last-writer-
     wins). Now: skip creating a new task if one is already pending for the symbol;
     store reference and clear via done-callback.
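
The per-symbol guard in fix 3 can be sketched as below. `RefreshGuard` and `demo` are hypothetical names; the real code guards `_refresh_state_background` inside bingx_direct.py.

```python
import asyncio


class RefreshGuard:
    """Allow at most one pending background refresh per symbol."""

    def __init__(self, refresh):
        self._refresh = refresh  # async callable taking a symbol
        self._pending = {}       # symbol -> asyncio.Task

    def schedule(self, symbol: str) -> bool:
        """Call from a running event loop. Returns False if a task is pending."""
        if symbol in self._pending:
            return False
        task = asyncio.create_task(self._refresh(symbol))
        self._pending[symbol] = task  # keep a reference so the task is not lost
        task.add_done_callback(lambda _t, s=symbol: self._pending.pop(s, None))
        return True


async def demo():
    calls = []

    async def refresh(symbol):
        calls.append(symbol)
        await asyncio.sleep(0)

    guard = RefreshGuard(refresh)
    first = guard.schedule("TRXUSDT")
    second = guard.schedule("TRXUSDT")  # skipped: first task still pending
    await asyncio.sleep(0.05)           # let the task finish and clear itself
    third = guard.schedule("TRXUSDT")
    await asyncio.sleep(0.05)
    return first, second, third, calls
```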

Test additions (test_bingx_bugs.py, 29 tests):
  - cancel_async: awaitable, calls backend.cancel directly, maps all statuses
  - process_intent_async CANCEL: dispatches cancel_async / falls back to sync
  - S2 guard: task stored, no duplicates while pending, new task after done
  - _events_from_submit with None snapshots: FILLED/NEW/REJECTED/PARTIAL/RATE_LIMITED
  - _filled_size_from_snapshots(None, None): safe 0.0 return
  - _events_from_cancel: before/after completely ignored
  - connect(): no double refresh_state, no-op if backend has no connect
  - submit() sync with None snapshots: FULL_FILL still emitted
  - cancel() branch audit: uses cancel not cancel_order, raises for no-cancel backend

Fix: test_exchange_event_seam_parity.py TestMockSubscribe — replace deprecated
asyncio.get_event_loop().run_until_complete() with asyncio.run() (Python 3.12
raises RuntimeError when event loop is closed by earlier suite tests).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-05 16:02:13 +02:00
Codex
f2596e1155 PINK: S3 dead-snapshot removal — connect/cancel/submit overhead cuts
Fix 1: connect() — remove redundant _backend_snapshot(include_history=True).
backend.connect() already called refresh_state(); this was a second identical
network round-trip at startup (~400ms wasted).

Fix 2: cancel() — remove snapshot_before + snapshot_after. _events_from_cancel
never reads 'before' or 'after' — two gratuitous round-trips per cancel with
zero benefit.

Fix 3: submit() (sync/legacy path) — drop both _backend_snapshot calls, pass
None like submit_async already does. Receipt executedQty fields take precedence;
_filled_size_from_snapshots returning 0.0 is the correct safe fallback.

117/117 tests pass (2 pre-existing pytest-ordering failures in TestMockSubscribe
are unrelated — asyncio.get_event_loop() contamination from other suite files,
25/25 pass when the file runs alone).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-05 14:07:45 +02:00
Codex
c864e9c550 PINK: S1 leverage cache, S2 background refresh, Gap 1/2/3 fee+slippage logging
S1 — Leverage cache (bingx_direct.py):
  _ensure_leverage(): per-symbol asyncio.Lock + cached value check; skips ~350ms
  POST when exchange already has the requested leverage.  Saves ~350ms/trade.
  Cache updated ONLY on success; failed POST leaves cache stale → correct retry.
  Persist: JSON sidecar /tmp/.bingx_leverage_cache_{env}.json; survives restarts.
  connect(): _verify_leverage_drift() detects when another process changed leverage
  at the exchange and updates cache to exchange truth (logs WARNING on drift).
  Multi-runner contract: leverage is account-level on BingX; documented that
  concurrent runners with different leverage desires for same symbol conflict.
  20 mock tests: same-lev skip, change-triggers-POST, failure-no-cache-update,
  concurrent-same-symbol (lock prevents race), drift-detect, persist/restore,
  multi-runner known-limitation documentation test.
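
A sketch of the S1 cache-and-lock pattern, assuming the leverage POST is an async callable that raises on failure. `LeverageCache`, `ensure`, and `demo` are hypothetical names; persistence and drift detection are left out.

```python
import asyncio
from collections import defaultdict


class LeverageCache:
    """Skip the leverage POST when the cached value already matches."""

    def __init__(self, post_leverage):
        self._post = post_leverage               # async (symbol, leverage); raises on failure
        self._cache = {}                         # symbol -> last confirmed leverage
        self._locks = defaultdict(asyncio.Lock)  # one lock per symbol, created lazily

    async def ensure(self, symbol: str, leverage: int) -> bool:
        """Return True if a POST was sent, False if the cache made it unnecessary."""
        async with self._locks[symbol]:
            if self._cache.get(symbol) == leverage:
                return False
            await self._post(symbol, leverage)   # an exception leaves the cache untouched
            self._cache[symbol] = leverage
            return True


async def demo():
    posts = []

    async def post(symbol, leverage):
        posts.append((symbol, leverage))

    cache = LeverageCache(post)
    results = await asyncio.gather(
        cache.ensure("TRXUSDT", 3),
        cache.ensure("TRXUSDT", 3),
    )
    return results, posts
```

Because the cache is written only after the POST returns, a failed request is retried on the next call instead of being masked by a stale entry.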

S2 — Background state refresh (bingx_direct.py):
  MARKET fills: asyncio.create_task(_refresh_state_background) — does not block
  submit path.  WS FILL_SETTLED + ACCOUNT_UPDATE deliver capital truth anyway.
  LIMIT fills: synchronous refresh retained (include_history=False, not True) —
  needed to detect resting order state for next pump cycle.
  Saves ~600–900ms/trade on MARKET exits. ENTER similarly improved.

Gap 1 — VenueEvent friction fields (contracts.py):
  Added: fee, fee_asset, fee_source, is_maker, exchange_ts, slippage_bps,
  mark_at_submit — all with defaults so existing callers are unaffected.
  Detailed inline docs for sign conventions and provenance codes.

Gap 2 — Fee estimation + WS_SETTLED provenance (bingx_direct.py, pink_clickhouse.py):
  submit_intent: estimates fee from fill_price × fill_qty × taker/maker rate;
  annotates ack_row with _fee_estimated, _fee_source, _is_maker_est.
  persist_fee_settled(): new method writes fee_settled_events row when WS
  ORDER_TRADE_UPDATE delivers actual commission ("n" field); fee_source="WS_SETTLED".
  pink_direct._run_account_stream: calls persist_fee_settled on FILL_SETTLED.

Gap 3 — Slippage measurement (bingx_direct.py, bingx_venue.py, pink_clickhouse.py):
  Captures mark_at_submit before the order POST; computes slippage_bps signed
  by side: positive = adverse (taker overpaid / maker undersold), negative =
  price improvement.  Measured for BOTH taker and maker fills for symmetry.
  Flows through VenueEvent → trade_events.slippage_bps + trade_exit_legs.slippage_bps.
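
The sign convention for slippage_bps can be illustrated as follows. This is a sketch under the assumption that slippage is normalised by mark_at_submit and that sides are "BUY" and "SELL"; the function is hypothetical.

```python
def slippage_bps(side: str, fill_price: float, mark_at_submit: float) -> float:
    """Positive = adverse fill, negative = price improvement."""
    if mark_at_submit <= 0:
        raise ValueError("mark_at_submit must be positive")
    if side == "BUY":
        signed = fill_price - mark_at_submit   # paying above mark is adverse
    elif side == "SELL":
        signed = mark_at_submit - fill_price   # selling below mark is adverse
    else:
        raise ValueError("side must be BUY or SELL")
    return signed / mark_at_submit * 10_000
```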

S3 / SOR — Maker order placement: comprehensive TODO block in submit_intent with:
  SHORT/LONG-aware price offset design, OBF integration requirements,
  TODO_ADD_PARAMSET_VIBRISS for spread_bps threshold, intelligent timeout_s
  calibration requirements, price-impact awareness gap, SOR abstraction CRITICAL TODO.
  REST/WS split: documented why BingX (and all retail venues) separate these
  and why a unified VenueAdapter protocol is the long-term solution.

151/151 existing tests green + 20 new leverage cache tests = 171 total.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-05 12:25:12 +02:00
Codex
714913bab6 PINK: flat_and_start_pink — persistence check + asset-scoped flatness test
Startup roundtrip now verifies persistence accounting inline:
  - Wires PinkClickHousePersistence to a mock sink after EXIT
  - Checks trade_events: exit_price != entry_price, pnl finite, capital finite
  - Checks trade_exit_legs: pnl_leg, exit_qty, exit_price all populated
  - Logs ALL FIELDS CORRECT / ISSUES DETECTED

Flatness check scoped to PINK's traded asset only (other strategies
may have open positions; those are reported informational, not failure).
Adds _normalize_asset() for TRX-USDT → TRXUSDT symbol normalization.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-05 09:38:59 +02:00
Codex
f7ee491f15 PINK: FLAWS doc — backfill SHA b30205c for pass-6 persistence entries
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-05 09:32:01 +02:00
Codex
b30205ceb6 PINK: fix persistence layer — exit_price, entry_bar, recovery, external exits, NaN tracing
G21/E23/A13 — exit_price used entry_price (every trade had exit_price==entry_price):
  _write_trade_event: exit_price = fill_price_hint > intent.reference_price > decision.reference_price
  _write_trade_exit_leg: same priority chain via fill_price_hint parameter
  persist_result: extracts fill_price_hint from FULL_FILL/PARTIAL_FILL events in outcome
  persist_fill_events: intent.reference_price = actual fill price → propagates correctly

A14 — entry_bar was active_leg_index (exit leg counter, not bar count):
  _write_position_state: entry_bar = intent.bars_held (0 when intent is None)

A15 — persist_recovery_state used acc_dict as slot_dict (trade_id always ""):
  Now reads kernel.slot(0).to_dict() when kernel is wired; trade_id from real slot

External-position exit_qty=0 fix:
  _write_trade_exit_leg: when prev_size<=0 (no prior ENTER tracked), falls back to
  initial_size or intent.target_size so exit legs for reconcile-detected positions are meaningful

exit_qty field added to trade_exit_legs rows (was computed but not emitted)

NaN tracing (_checked_float):
  Introduces _checked_float() wrapper that logs WARNING + writes anomaly_events spool
  row on NaN/inf in financial fields; applied to realized_pnl in exit paths
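
A rough sketch of a NaN-tracing wrapper in the spirit of `_checked_float()`. The return value on a bad input (a default of 0.0 here) and the logger name are assumptions, and the anomaly_events spool write is omitted.

```python
import logging
import math

logger = logging.getLogger("pink.persistence")  # hypothetical logger name


def checked_float(value, field: str, default: float = 0.0) -> float:
    """Return a finite float, logging a WARNING when the input is NaN/inf/unparseable."""
    try:
        result = float(value)
    except (TypeError, ValueError):
        result = math.nan
    if not math.isfinite(result):
        logger.warning("non-finite value in %s: %r", field, value)
        return default  # assumption: the real wrapper may propagate a different value
    return result
```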

29 new persistence unit tests (mocked) + chaos/fuzz suite:
  exit_price correctness, capital ordering, pnl_leg incremental, entry_bar,
  recovery trade_id, external position exits, multi-leg, restart-mid-trade, NaN/None fields
  164/164 total (97 flaws + 25 kernel reliability + 29 persistence + 13 phase4) green

FLAWS doc: pass 6 — G21/E23/A13/A14/A15 closed; 26 total fixed

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-05 09:30:30 +02:00
Codex
025d381623 PINK: flat_and_start_pink.py — flatten BingX VST + async startup check
CLI: python flat_and_start_pink.py [--flatten] [--no-start]
  --flatten : cancel all orders + MARKET-close all positions (correct side
              by positionAmt sign, not abs()); verifies account flat after
  --no-start: flatten only, skip roundtrip
  (no flags): startup roundtrip only — ENTER/EXIT via process_intent_async()

Startup roundtrip exercises the full N2/N3/N4 async hot path:
  process_intent_async → submit_async → await backend.submit_intent →
  BingX POST → on_venue_event(ORDER_ACK+FULL_FILL) → POSITION_OPEN → CLOSED

Min-order detection: queries /quote/contracts for tradeMinQty/minOrderQuantity;
fallback 10 units. Fixes the 0.001-TRX rejection that BingX returned.

Bugs fixed in flatten:
  - positionAmt sign was lost via abs(); SHORT positions now correctly use BUY
    (positionAmt < 0) vs SELL for LONG (positionAmt > 0) with reduceOnly=true
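
The side selection in the flatten fix can be sketched as below. The function name and the returned dict layout are hypothetical; only the sign rule and reduceOnly come from the message above.

```python
def closing_order(position_amt: float) -> dict:
    """Build a MARKET close for a signed position amount."""
    if position_amt == 0:
        raise ValueError("position is already flat")
    return {
        "side": "BUY" if position_amt < 0 else "SELL",  # SHORT closes with BUY
        "quantity": abs(position_amt),                  # abs() only for the size
        "type": "MARKET",
        "reduceOnly": True,
    }
```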

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-05 07:37:56 +02:00
Codex
feaf75e70f PINK: FLAWS doc — backfill real SHA f3a5f21 for pass-5 entries
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-04 21:03:58 +02:00
Codex
f3a5f21460 PINK: async submit + process_intent hot path; async/race flaw audit (pass 5)
N2/N3/N4 (3x Critical async bugs):
- BingxVenueAdapter.submit_async(): awaits backend.submit_intent() directly
  in caller's event loop — no thread-pool, no asyncio.run(), no _backend_snapshot()
- ExecutionKernel.process_intent_async(): same FSM guard logic as sync version;
  replaces venue.submit() with await venue.submit_async(); sync process_intent()
  untouched so all 122 tests stay green
- pink_direct.step() line 952: process_intent() -> await process_intent_async()

restore_state JSON parse (test fix):
- ExecutionKernel.restore_state() wraps Rust FFI in try/except JSONDecodeError
  returns False; matches documented contract; test_restore_corrupt_json_rejected passes

FLAWS doc: pass 5 table added; 21 total fixed; Z6/N5 marked resolved

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-04 21:02:26 +02:00
Codex
a9ba407ae2 PINK: fix reconcile 30s deadlock — async def + direct await
Root cause: _run() → pool.submit(asyncio.run, coro).result(30s) created a
new event loop in a thread-pool thread; aiohttp session is main-loop-bound
→ silent deadlock every step cycle. BingX VST is healthy (544ms gather).

Fix: async def reconcile() + await self.backend.refresh_state() in main loop.
pump_venue_events() already handles isawaitable → zero caller changes.
include_history=False (symbol=None skips history anyway).
Tests: 13/13 passing (async contract, 3 fault paths, <2s timing, gather-10).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-04 18:46:19 +02:00
Codex
d7e272e148 PINK: FSM occupancy & rollback test suite (9 new tests, 97/97 green)
TestFSMOccupancyAndRollback covers the invariants that prevent orphaned
exchange orders — the production failure mode where multiple positions
accumulated because slot state wasn't rolled back on submit failure:

  - ENTRY_WORKING blocks new ENTER (different trade_id → SLOT_BUSY)
  - POSITION_OPEN blocks new ENTER
  - venue.submit raise → synthetic REJECTED → FSM back to IDLE
  - After rollback slot immediately reusable
  - N consecutive submit failures never strand the slot
  - submit-fail then success → exactly 1 position, not N
  - 20 rapid enter→exit cycles leave no residual state
  - EXIT on IDLE always rejected (no phantom closes)
  - 5 assets, 1 slot → only first accepted, rest SLOT_BUSY

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-04 18:31:50 +02:00
Codex
a5894a7196 PINK: FSM rollback on venue.submit failure via synthetic REJECTED event
When venue.submit() raises (BingX timeout / network error), the Rust FSM
had already advanced to ORDER_REQUESTED/ENTRY_WORKING with no corresponding
exchange order — stranding the slot. Every subsequent ENTER for a different
asset hit SLOT_BUSY, preventing recovery without a restart. Restarts create
a fresh IDLE kernel, leaving the orphaned exchange position unmanaged.

Fix: catch submit exceptions, synthesise an ORDER_REJECT VenueEvent, feed it
through on_venue_event() so the FSM rolls back to IDLE atomically. The slot
is free on the next cycle with no orphan on the exchange.
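
The rollback can be sketched with a toy single-slot FSM in Python. The real FSM is in the Rust kernel; `Slot` and `process_enter` are hypothetical and the state set is reduced to what the message above mentions.

```python
class Slot:
    """Toy single-slot FSM: IDLE -> ENTRY_WORKING -> POSITION_OPEN."""

    def __init__(self):
        self.state = "IDLE"

    def on_venue_event(self, kind: str) -> None:
        if self.state == "ENTRY_WORKING" and kind == "ORDER_REJECT":
            self.state = "IDLE"           # rollback: slot is reusable again
        elif self.state == "ENTRY_WORKING" and kind == "FULL_FILL":
            self.state = "POSITION_OPEN"


def process_enter(slot: Slot, submit) -> str:
    if slot.state != "IDLE":
        return "SLOT_BUSY"
    slot.state = "ENTRY_WORKING"          # FSM advances before the venue call
    try:
        submit()
    except Exception:
        # Synthesise a rejection so the FSM rolls back through its normal path.
        slot.on_venue_event("ORDER_REJECT")
        return "REJECTED"
    return "ACCEPTED"
```

Routing the failure through `on_venue_event` rather than resetting the state directly keeps a single code path for every transition back to IDLE.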

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-04 18:17:31 +02:00
Codex
9acaeafc8b PINK: seed capital from BingX ledger of record on startup
Previously: set_seed_capital(hardcoded_25000) then on_account_event(BingX_100K+)
→ reconcile delta ~75K → capital_frozen=True → no trades allowed.

Fix: _fetch_exchange_wallet_balance() queries BingX wallet balance BEFORE
seeding the kernel. set_seed_capital() and the subsequent ACCOUNT_UPDATE
reconcile now agree → delta ≈ 0 → capital_frozen=False → sizing correct.

Falls back to DOLPHIN_INITIAL_CAPITAL if BingX is unreachable, with WARNING.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-04 17:53:34 +02:00
Codex
f78cc0d3f9 PINK: fix last c_char_p temporary in set_slot_json
Completes the ctypes lifetime audit. All eight FFI call sites now
assign _to_rust_bytes() to a local var before passing to c_char_p,
ensuring the bytes object lives for the full duration of the Rust call.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-04 17:00:16 +02:00
Codex
acffc783e6 PINK: fix naive datetime → INVALID_INTENT_PARSE at column 41
Root cause: intent.timestamp was a naive datetime (no tzinfo). isoformat()
produces '2026-06-04T14:26:55.098914' (26 chars). The JSON prefix
'{"timestamp":"' is 14 chars → closing quote lands at column 41. Rust's
chrono::DateTime<Utc> serde rejects naive timestamps and serde_json reports
the error as 'premature end of input at line 1 column 41'.

Fix: _utc_isoformat() attaches UTC tzinfo before isoformat(), producing
'2026-06-04T14:26:55.098914+00:00' which chrono accepts.
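
A minimal sketch of the fix, assuming naive timestamps are already in UTC. The function below is illustrative and may differ from the real `_utc_isoformat()`.

```python
from datetime import datetime, timezone


def utc_isoformat(ts: datetime) -> str:
    """Return an ISO-8601 string that always carries a UTC offset."""
    if ts.tzinfo is None:
        ts = ts.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
    return ts.astimezone(timezone.utc).isoformat()


naive = datetime(2026, 6, 4, 14, 26, 55, 98914)
print(len(naive.isoformat()))  # 26
print(utc_isoformat(naive))    # 2026-06-04T14:26:55.098914+00:00
```

The 26-character naive string plus the 14-character JSON prefix accounts for the column 41 reported by serde_json.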

Previous null-byte fix (_to_rust_bytes) and dangling-pointer fix (local vars)
remain correct and address real separate failure modes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-04 16:45:33 +02:00
Codex
55197b2047 PINK: fix ctypes dangling-pointer + venue.submit guard
Two bugs causing INVALID_INTENT_PARSE at FFI boundary:

1. Dangling pointer: ctypes.c_char_p stores a raw C pointer without
   incrementing the Python refcount. Temporaries passed inline are freed
   by CPython before the Rust FFI call executes, giving Rust a dangling
   pointer whose freed memory looks like truncated JSON (column 41).
   Fix: assign bytes to local vars (_pb/_mb/_vb) to hold refs alive.

2. venue.submit guard: process_intent() called venue.submit() even when
   the kernel returned INVALID_INTENT, cascading a 30s BingX timeout
   into a fatal crash. Fix: gate on outcome.accepted.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-04 16:14:42 +02:00
Codex
a89e766da1 PINK: fix ctypes c_char_p null-byte truncation (INVALID_INTENT_PARSE)
_to_rust_bytes() centralises all Python→Rust JSON serialisation:
- _json_null_clean() strips U+0000 from all string values recursively
- ensure_ascii=True guarantees no 0x00 in output bytes
- All _json() call sites migrated; mode/verbosity now .encode("ascii")
- 9 null-safety unit tests added to TestRustBytesNullSafety

Root cause: ctypes.c_char_p silently truncates at first 0x00 byte,
causing serde_json "premature end of input at column 41" on EXIT intents
with BNB-USDT leverage values. Long-term fix: Rust FFI (ptr, len) pairs.
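
The serialisation rule can be sketched as below. The function names mirror the ones in the message but the bodies are assumptions, not the actual rust_backend.py code.

```python
import json


def json_null_clean(value):
    """Recursively strip U+0000 from every string value."""
    if isinstance(value, str):
        return value.replace("\x00", "")
    if isinstance(value, dict):
        return {key: json_null_clean(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [json_null_clean(item) for item in value]
    return value


def to_rust_bytes(payload) -> bytes:
    """Serialise to ASCII-only JSON bytes that contain no 0x00 byte."""
    return json.dumps(json_null_clean(payload), ensure_ascii=True).encode("ascii")
```

A NUL-free buffer matters because a C string consumer stops reading at the first 0x00, which is what truncated the JSON seen by serde_json.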

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-03 18:30:10 +02:00
Codex
beef39eaf5 PINK: HOLD_DC_CONTRADICTED enum + trace log (104/104 green)
- contracts.py: DecisionAction.HOLD_DC_CONTRADICTED = "HOLD_DC_CONTRADICTED"
  Interim policy-gate veto enum. Comment marks CRITICAL TODO: KernelPolicyGate
  hook system (downstream-registered hooks; see memory).
- pink_direct.py: dc_contradicts() now sets HOLD_DC_CONTRADICTED (was plain HOLD)
  + logger.info trace with vel_div / scan_number / symbol — observable in logs,
  CH persistence, and Hz engine_snapshot.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-03 17:05:49 +02:00
Codex
29d44c338e PINK: TUI Hz fix + DC gate + ACB boost + 10 new tests (104/104 green)
TUI Hz fix:
- hazelcast_projection.py: write_engine_snapshot now writes all NAUTILUS-era
  field aliases (trades_executed, current_leverage, open_positions as list,
  last_scan_number, last_vel_div, vol_ok, open_notional) so gear_rows/capital
  panel work with no TUI changes.
- dolphin_status_pink.py: _normalize_eng_for_tui() safety-net translation added;
  render() uses it on every Hz read.

DC gate (SYSTEM BIBLE §4.2, champion config):
- pink_direct.py: _dc_contradicts() — 7-tick lookback, 0.75 bps threshold.
  Rising price (chg > 0.75 bps) blocks ENTER via dataclasses.replace(HOLD, DC_CONTRADICT).
  Price history deque initialized in connect(); dc_skip_contradicts=True enforced.
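
A sketch of the DC gate under two assumptions: the change is measured from the oldest to the newest price in the 7-tick window, and the gate stays open until the window is full. `DCGate` is a hypothetical class, not the `_dc_contradicts()` implementation.

```python
from collections import deque


class DCGate:
    """Block ENTER when price rose more than threshold_bps over the lookback."""

    def __init__(self, lookback: int = 7, threshold_bps: float = 0.75):
        self._prices = deque(maxlen=lookback)
        self._threshold_bps = threshold_bps

    def update(self, price: float) -> None:
        self._prices.append(price)

    def contradicts(self) -> bool:
        if len(self._prices) < self._prices.maxlen:
            return False  # assumption: not enough history means no veto
        oldest, newest = self._prices[0], self._prices[-1]
        chg_bps = (newest - oldest) / oldest * 10_000
        return chg_bps > self._threshold_bps
```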

ACB boost (SYSTEM BIBLE §10):
- hazelcast_feed.py: fix wrong key "latest_acb" → "acb_boost" (DOLPHIN_FEATURES key
  written by acb_processor_service.py).
- pink_direct.py: _last_acb_boost read from scan_payload["acb_boost"] first (scan
  bridge may embed it), then Hz direct fallback. Applied to intent.leverage via
  dataclasses.replace() after IntentEngine.plan(), capped at 3x.
- _last_scan_number, _last_vel_div, _last_vol_ok tracked from scan_payload.

OBF gate: NOT implemented. OBF shards (DOLPHIN_FEATURES_SHARD_*) require new
Hz map connections + symbol routing. Gap documented; requires separate decision.

Tests: TestDCGate (5) + TestNormalizeEngForTui (5) — 10 new, 104 total, all green.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-03 14:00:48 +02:00
Codex
8d85d75ded PINK DITAv2: Hz writes + vol_ok gate + leverage logging + 8 new tests (94/94 green)
2026-06-03 13:26:36 +02:00
Codex
0f2d3f556d PINK DITAv2 flaw doc: V1+V2+V3+W10 fix markers + pass 3/4 tables
Integrates flaw doc updates from the side chain (post-Pass20) onto branch HEAD:
- V1/V2/V3 rows marked FIXED 8d9762c
- W10 row marked FIXED e90d542
- Pass 3 fixes table (V1/V2/V3 detail)
- Pass 4 fixes table (W10 detail)
- Header: "17 total fixed"

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-02 22:12:28 +02:00
Codex
ef473ba372 PINK: E2E trace analysis — Pass 23 closure review/unfinished fixes/ops gaps (Z1-Z14)
Twenty-third (final) pass: _safe_enum fix applied to rust_backend.py but NOT to
real_zinc_plane.py, so the other copy crashes (Z1 High); no health check
endpoint, so silent failures are invisible to orchestration (Z5 High);
process_intent calls venue.submit without an exception handler, so a venue
error bypasses the Rust FSM (Z6 High); snapshot mixes Rust and Python
accounting, so capital can diverge (Z7 Medium); BingxVenueAdapter.close has an
executor null-to-shutdown TOCTOU race (Z8 Medium); generated test f-string
chr(34) template risks a SyntaxError on old Python (Z9 Medium); launcher uses
Python 3.10+ | union syntax with no minimum version documented (Z10 Medium);
concurrent process_intent on the same slot has no lock and no queue (Z12
Medium). 403 total flaws across 23 passes.

Co-authored-by: CommandCodeBot <noreply@commandcode.ai>
2026-06-02 19:44:15 +02:00
Codex
13822d5bfa PINK: E2E trace analysis — Pass 22 serde round-trip/mock fidelity/protocol (Y1-Y14)
Twenty-second pass: asyncio.sleep(0.8) in ~295 generated test bodies is flaky
(Y5 Critical); MockVenueAdapter has no rate_limit flag, so the RATE_LIMITED
path is untested (Y6 High); reconcile() always returns [], so late fills are
untestable (Y7 High); one fill is emitted per submit, so multi-partial-fill is
untestable (Y8 High); no connect(), a runtime error if the protocol gains it
(Y9 High); exit_leg_ratios serde default [] vs struct default vec[1.0] gives
the wrong ratio on restore (Y1 Medium); libc is a dead dependency (Y10 Medium);
no close() (Y11 Medium); synchronous fills mask timing bugs (Y12 Medium);
_slot_from_payload is duplicated in two files with different behavior (Y14
Medium). 389 total flaws across 22 passes.

Co-authored-by: CommandCodeBot <noreply@commandcode.ai>
2026-06-02 18:39:49 +02:00
Codex
09db2e694b PINK: E2E trace analysis — Pass 21 rust build/deps/python packaging/shared mem (X1-X14)
Twenty-first pass: no ABI compatibility check on Rust .so load, so a stale
binary corrupts silently (X1 Critical); real_zinc_plane _write_region zeroes
the entire buffer before writing, leaving a visible all-zero window (X2
Critical); no requirements.txt, setup.py, or pyproject.toml, so there are zero
Python dependency declarations (X3 Critical); RealZincControlPlane.update() has
no thread lock, so concurrent calls corrupt seq and shared memory (X4 High);
libc is declared in Cargo.toml but never used, a dead dependency (X5 High);
5 test files use a hardcoded, non-portable sys.path.insert (X6 High);
_decode_packet has no try/except on json.loads, so a partial body read crashes
the reader (X7 High); ExchangeEvent is not exported from __init__.py, a package
API inconsistency (X8 High); RealZincPlane and RealZincControlPlane collide on
the {prefix}_control region name (X10 Medium). 375 total flaws across 21 passes.

Co-authored-by: CommandCodeBot <noreply@commandcode.ai>
2026-06-02 18:04:33 +02:00
Codex
b270b164ba PINK: E2E trace analysis — Pass 20 config/math signs/BingX protocol (W1-W14)
Twentieth pass: int() on 3 env vars raises an uncaught ValueError (W1
Critical); DITA_V2_PREFIX default "dita_v2" allows multi-process shared memory
corruption (W2 Critical); funding sign is opposite in Python V2 vs Rust, so the
same raw value has the opposite capital effect (W3 Critical); listenKeyExpired
frames are silently swallowed because continue skips the expiry check, leaving
it dead code (W4 Critical); RECV_WINDOW_MS has no upper bound, enabling replay
attacks (W5 High); ACTIVE_SLOT_LIMIT is stored but never enforced by the Rust
kernel (W6 High); no fill history is fetched during WS reconnect gap-backfill,
so fills are lost (W7 High); rate limit detection fails on HTTP 429 with no
matching message, causing an instant retry (W8 High); CONTROL_PLANE=REAL_ZINC
silently falls back to in-memory (W9 High); all BingxHttpError instances are
mapped to REJECTED, so errors cannot be distinguished (W10 High); os.environ
bracket access vs .get() is inconsistent (W11 High).
361 total flaws across 20 passes.

Co-authored-by: CommandCodeBot <noreply@commandcode.ai>
2026-06-02 17:13:21 +02:00
Codex
ded4b59891 PINK: E2E trace analysis — Pass 19 lifecycle/Rust subtleties/test infra (V1-V14)
Nineteenth pass: DITAv2LauncherBundle.close() never calls kernel.close(), so
the Rust handle leaks via __del__ (V1 Critical); BingxVenueAdapter has no
close/disconnect, so ThreadPoolExecutor/HTTP resources are never released (V2
Critical); 3 generators write the same output file, so the last writer wins
with incompatible prologues (V4 Critical); generated tests are triple env-gated
and never run in CI, making them dead code (V5 Critical); kernel.close()
destroys the Rust handle immediately with no drain and no flush, a UAF risk (V6
Critical); process_intent ENTER doesn't clear seen_event_ids, so old dedup
state pollutes the new trade (V3 High); no conftest/pytest.ini/asyncio_mode, so
test discovery is fragile (V9 High); #[serde(default)] leverage:0.0 and
mark_price with no .max(1.0), a silent accounting error (V8 Medium).
347 total flaws across 19 passes.

Co-authored-by: CommandCodeBot <noreply@commandcode.ai>
2026-06-02 16:34:58 +02:00
Codex
94078ee8fe PINK: E2E trace analysis — Pass 18 rust test gaps/accounting/FFI types (U1-U14)
Eighteenth pass: R2 compares cumulative vs last-fill realized PnL broken after
2nd fill (U3 Critical), R4 compares open_notional vs used_margin fundamentally
different quantities (U4 Critical), on_venue_event/apply_fill no NaN guards
price/size propagates NaN (U6 Critical), order_type/limit_price sent to Rust
no fields silently dropped (U1 High), VenueEventStatus expects
"CANCEL_REJECTED" typo fails deserialization (U2 High), R3 skipped when
len(e.positions)==0 silent false negative (U5 High), zero Rust tests for
ORDER_REJECT/PARTIAL_FILL/TERMINAL_STATE guard (U7 High), safe_float returns
NaN/Inf contradicts _safe (U8 Medium), _scan_slots uses metadata leverage not
slot.leverage (U9 Medium). 333 total flaws across 18 passes.

Co-authored-by: CommandCodeBot <noreply@commandcode.ai>
2026-06-02 14:47:36 +02:00
Codex
66b403ff7d PINK: E2E trace analysis — Pass 17 unsafe review/dead code/build/protocols (T1-T14)
Seventeenth pass: catch_unwind + AssertUnwindSafe partially mutated state no
rollback (T1 High), HazelcastRowWriter bare json.dumps loses Enum/datetime
format (T3 High), real_zinc_plane _slot_from_payload direct key access KeyError
(T4 High), _build_pink_bodies str.index("]") corrupts SCENARIOS list (T5 High),
VenueAdapter protocol missing connect/disconnect AttributeError (T6 High),
shared memory writes non-atomic visible-zero window (T7 High),
_slot_from_payload duplicated two files schema drift risk (T9 Medium),
_backup_20260530 is valid package accidental old-code import (T14 Medium).
319 total flaws across 17 passes.

Co-authored-by: CommandCodeBot <noreply@commandcode.ai>
2026-06-02 14:10:49 +02:00
Codex
b0aa91229f PINK: E2E trace analysis — Pass 16 error handling/arithmetic/test infra (S1-S16)
Sixteenth pass: realized_pnl/mark_price NaN bypasses <=0 guard (S1 Critical),
MockVenue _exchange_event_queue check-then-act race drops events (S2 Critical),
no test_kernel_fsm.py exists (S3 Critical), generated tests use asyncio.sleep(0.8)
flaky on slow CI (S4 Critical), _rate_limit_retry_after_ms returns 0 on parse
failure instant retry storm (S5 High), venue adapter detects rate limits but
enforces zero backoff (S6 High), capital_epsilon=1e-4 too tight false WARN (S7
High), tests use asyncio.run() leaks tasks on 3.12+ (S8 High), str.replace()
patching silently does nothing (S9 High), WS _consume no per-message timeout (S10
High), _run blocks pool thread with no timeout lock adapter (S11 High).
305 total flaws across 16 passes.

Co-authored-by: CommandCodeBot <noreply@commandcode.ai>
2026-06-02 13:32:53 +02:00
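
Finding S1 in b0aa91229f (NaN bypasses a `<= 0` guard) follows from IEEE 754 comparison rules: every ordered comparison with NaN is false, so a check written as "reject if x <= 0" lets NaN through. A short sketch of the failure and one possible guard; `is_positive_finite` is a hypothetical name, not a function from the repository.

```python
import math

def is_positive_finite(x: float) -> bool:
    """Accept only finite values strictly greater than zero (rejects NaN and +/-Inf)."""
    return math.isfinite(x) and x > 0

nan = float("nan")

# The naive guard "reject if x <= 0" does not fire for NaN,
# because the comparison itself evaluates to False.
naive_guard_fires = nan <= 0

# The explicit finiteness check rejects it.
accepted = is_positive_finite(nan)
```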
Codex
a4c1ec6139 PINK: E2E trace analysis — Pass 15 resource leaks/trust boundaries/security (R1-R14)
Fifteenth pass: exchange REST/WS data parsed without schema validation (R7
Critical), restore_state() deserializes arbitrary JSON full kernel takeover
(R9 Critical), ThreadPoolExecutor never shut down 3 threads leak (R1 High),
BingxVenueAdapter no close() HTTP client unreleasable (R2 High),
_intent_cache unbounded growth (R3 High), shared memory JSON no integrity
check (R8 High), env-based mainnet switch (R10 High), .env secrets exposure
(R11 High), listenKey in WS URL f-string MITM injection (R13 High).
289 total flaws across 15 passes.

Co-authored-by: CommandCodeBot <noreply@commandcode.ai>
2026-06-02 12:54:02 +02:00
Codex
062b929caf PINK: E2E trace analysis — Pass 14 serde edges/backup diffs/market data (Q1-Q12)
Fourteenth pass: fromisoformat can't parse Rust Z-suffix timestamps on Python
< 3.11 — crashes every timestamp deserialization (Q1/Q6/Q12 High), MarketSnapshot
timestamp type inconsistent float vs datetime in same file (Q5 High), no
#[serde(deny_unknown_fields)] — misspelled fields silently default (Q2 Medium),
no upper-bound price validation (Q7 Medium), threading.Event.wait uses platform-
dependent clock NTP jump risk (Q10 Medium). Backup diff reveals 6 critical bug
fixes between backup and current. 275 total flaws across 14 passes.

Co-authored-by: CommandCodeBot <noreply@commandcode.ai>
2026-06-02 12:00:22 +02:00
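
Findings Q1/Q6/Q12 in 062b929caf concern `datetime.fromisoformat`, which rejects a trailing `Z` on Python versions before 3.11. One common workaround is to rewrite the suffix as an explicit UTC offset before parsing. A minimal sketch; `parse_utc_timestamp` is a hypothetical helper name, and it assumes the fractional part (if any) has 3 or 6 digits, since older Python versions may reject other widths.

```python
from datetime import datetime, timezone

def parse_utc_timestamp(ts: str) -> datetime:
    """Parse an ISO 8601 timestamp, accepting a trailing 'Z' on any Python 3.7+."""
    if ts.endswith(("Z", "z")):
        ts = ts[:-1] + "+00:00"  # fromisoformat understands explicit offsets
    return datetime.fromisoformat(ts)

parsed = parse_utc_timestamp("2026-06-02T12:00:22Z")
```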
Codex
b922f5ff1c PINK: E2E trace analysis — Pass 13 FFI safety/dangling pointers/coverage (P1-P9)
Thirteenth pass: dita_kernel_destroy double-free UB — Python doesn't null
handle.value (P1 Critical), CStr::from_ptr(payload) without null guard in
3 FFI exports (P2 High), _check_open_orders asyncio.run from async _verify
crashes live tests (P3 High), _get_rust() TOCTOU race concurrent cargo build
(P6 High), into_c_string NUL sanitizer produces invalid JSON (P4 Medium),
reconcile/snapshot_json null on failure no diagnostic (P5 Medium).
263 total flaws across 13 passes.

Co-authored-by: CommandCodeBot <noreply@commandcode.ai>
2026-06-02 11:06:18 +02:00
Codex
d1a6be0d27 PINK: E2E trace analysis — Pass 12 sync/async wider scope (O1-O11)
Twelfth pass: _maybe_close asyncio.run silently skips close from async
context (O1), _pick_live_symbol missing await crashes on coroutine iteration
(O3), _run() pool .result() no timeout — backend hang freezes process (O5),
KernelSlotView.__getattr__ N FFI calls for N fields no caching (O8),
DITAv2LauncherBundle no __del__ leaks resource tree (O9), ExecutionKernel
no close() — __del__ only cleanup (O10), __setattr__ triggers 5 persistence
side effects undocumented (O11). 254 total flaws.

Co-authored-by: CommandCodeBot <noreply@commandcode.ai>
2026-06-02 09:27:25 +02:00
Codex
24034416e0 PINK: E2E trace analysis — Pass 11 async/sync seams/locks/threading (N1-N10)
Eleventh pass: Rust kernel with_handle_mut has zero synchronization —
&mut KernelCore from raw pointer with no Mutex, concurrent FFI calls cause
UB (N1 Critical), _run() has two completely different code paths depending
on event loop state (N2 Critical), path B blocks event loop thread for
every HTTP operation (N3 Critical), asyncio.run() called repeatedly creating
destroying event loops per call (N4 Critical), _snapshot_ready Event cascading
re-fetch — N callers produce N overlapping HTTP calls (N5 High). 243 total.

Co-authored-by: CommandCodeBot <noreply@commandcode.ai>
2026-06-02 08:00:50 +02:00
Codex
81fe1d6d25 PINK: E2E trace analysis — Pass 10 runtime/test bugs/FSM/persistence/metrics (M1-M18)
Tenth pass: ENTER transition always says prev_state=IDLE (M1 Critical), CANCEL
creates no transition record (M2 Critical), ORDER_REJECT on POSITION_OPEN with
stale entry order destroys position (M9 Critical), _mk_intent test helper drops
order_type/limit_price into metadata not proper field (M3 High), four tests that
claim to test cancel but never cancel (M4, M17), no metric aggregation for trade
count/latency/slippage (M10 High), no ClickHouse INSERT retry (M12 High),
_decision_to_kernel_intent drops order_type/limit_price making LIMIT orders
dead from the runtime (M18 High). 233 total flaws.

Co-authored-by: CommandCodeBot <noreply@commandcode.ai>
2026-06-02 00:03:41 +02:00
Codex
b3b28bb44a PINK: kernel fee prediction + calibration loop
ExchangeFeeConfig in AccountState:
  taker_rate, maker_rate, lot_step, tick_size, funding_interval_secs
  calibration_ratio: EMA of actual/expected, updated on every fill

Kernel now predicts fees at fill time (PREDICTED_FILL event):
  k_capital updated immediately without waiting for WS FILL_SETTLED
  When actual fee arrives, prediction is replaced and ratio recalibrated
  Reconcile delta: 0.000000 (was ~0.9 USDT in canary without prediction)

Calibration loop on connect():
  Fetches recent fill history, validates model vs exchange actuals
  deviation < 1pct -> OK; < 5pct -> WARN; >= 5pct -> ERROR (pre-trade gate)

New FFI: dita_kernel_set_exchange_config_json, dita_kernel_calibrate_fee_json
New ExecutionKernel methods: set_exchange_config(), calibrate_fee()
pink_direct.py: loads BingX fee config on connect, calibrates before stream

131/131 offline pass.
2026-06-01 23:45:50 +02:00
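
The calibration logic described in b3b28bb44a (an EMA of actual/expected fee, plus a deviation gate of < 1pct OK, < 5pct WARN, >= 5pct ERROR) can be sketched as follows. This is an illustration in Python, not the Rust kernel's code: the function names, the smoothing factor `alpha=0.1`, and the handling of a non-positive expected fee are assumptions.

```python
def update_calibration_ratio(prev_ratio: float, actual_fee: float,
                             expected_fee: float, alpha: float = 0.1) -> float:
    """Fold one fill into an exponential moving average of actual/expected fee.

    alpha is an assumed smoothing factor; the commit does not state one.
    """
    if expected_fee <= 0:
        return prev_ratio  # assumption: skip samples with no usable prediction
    sample = actual_fee / expected_fee
    return (1.0 - alpha) * prev_ratio + alpha * sample

def calibration_status(deviation: float) -> str:
    """Map a fractional deviation (0.01 == 1pct) to the gate levels in the commit."""
    d = abs(deviation)
    if d < 0.01:
        return "OK"
    if d < 0.05:
        return "WARN"
    return "ERROR"  # per the commit, this level acts as a pre-trade gate

# Example: the exchange charged 2pct more than predicted on one fill.
ratio = update_calibration_ratio(1.0, actual_fee=0.0102, expected_fee=0.0100)
status = calibration_status(ratio - 1.0)
```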
Codex
7d13df35db PINK: E2E trace analysis — Pass 9 contracts/events/network/FFI/diffs (L1-L16)
Ninth pass: VenueEvent.price=0 causes 100% PnL loss (L3), available_margin
set to wrong field in user stream (L4), wallet_balance defaults to 0 (L5),
14+ bugs fixed between backup and current code (L12), real pipeline never
tested by any test function (L13), no proxy support (L9), 5-min DNS cache
(L10). Backup diff reveals the current Rust kernel has ~14 bugs fixed vs
the backup version. 16 new flaws, 215 total.

Co-authored-by: CommandCodeBot <noreply@commandcode.ai>
2026-06-01 23:11:15 +02:00
Codex
23619e603a PINK Phase 5+6 (G6+G7): live VST gate + BLUE fence
bingx_user_stream.py: fix account_snapshot() for VST v3 balance
  (v3 returns list, not dict; extract first element)

test_pink_account_ws_g6.py (Gate G6 basic):
  I7: poll snapshot wallet_balance > 0 (PASS - live VST)
  I1-I5: seed + E-fact -> k_capital, available=E, reconcile OK/WARN,
    FILL_SETTLED folds correctly, delta=net_drift (PASS - live VST)
  I6: WS connects + gap-backfill delivers ACCOUNT_UPDATE source=poll
    (PASS - live VST)

test_alpha_blue_untouched_g7.py (Gate G7):
  mainnet hard-disabled, no BLUE imports, git diff clean

3/3 live + 131 offline = all gates green.
2026-06-01 22:35:27 +02:00
Codex
577392be8c PINK Phase 6 (G7): alpha-unchanged + BLUE-untouched gate
test_alpha_blue_untouched_g7.py:
- DOLPHIN_BINGX_ALLOW_MAINNET=0 enforced
- gen2.py uses VST + allow_mainnet=False
- New PINK modules (exchange_event, bingx_user_stream, account) import no BLUE
- Git diff confirms prod/bingx/, nautilus_dolphin, adapters/bingx_direct unchanged
- DecisionContext.capital remains a plain float (read-only new source)

G7: 9 pass, 2 skip (optional engine introspection), 0 fail.
131/131 total offline tests pass.
2026-06-01 22:07:48 +02:00
Codex
e644ee0add PINK Phase 4 (G5): reconcile_events persistence + event_seq on account rows
pink_clickhouse.py:
- optional kernel param to __init__ + set_kernel() for post-construction wiring
- _account_event_seq(): reads event_seq from kernel.snapshot()[account]
- _kernel_account(): full kernel account snapshot dict
- write_reconcile_event(): reconcile_events table writer (idempotent by seq)
- _write_account_event(): now includes account_event_seq + reconcile_status
  and auto-emits reconcile_events row when E-facts present

Gate G5: 13 tests -- event_seq wiring, row shape, one-direction invariant.
122/122 total tests pass.
2026-06-01 22:03:11 +02:00
Codex
e6988324ca PINK Phase 3 (G4): stream wiring + recovery + reconcile gate
pink_direct.py:
- connect(): set_seed_capital + REST account snapshot for crash recovery
- _run_account_stream(): BingxUserStream -> kernel.on_account_event()
  FILL_SETTLED folds K; ACCOUNT_UPDATE stores E-facts + runs reconcile;
  reconcile ERROR -> _enter_frozen=True (ENTERs blocked, exits always free)
  FUNDING_FEE folds K-funding_net
- _unsafe_entry_reason(): checks _enter_frozen first
- step(): capital from available_capital (E rules when present, K fallback)
- _venue_http_client() / _venue_ws_url() helpers

test_account_reconcile_faults.py (Gate G4):
  fee/funding/rounding -> WARN; unexplained -> ERROR
  crash-recovery sequence; exit-never-frozen invariant

109/109 total offline tests pass.
2026-06-01 21:41:30 +02:00
Codex
468984baab PINK: Rust kernel atomic K/E account layer (AccountState + FFI)
AccountState in KernelCore/KernelSnapshot:
- K-values: seed_capital, k_realized_pnl, k_fees_paid, k_funding_net
- E-facts: e_wallet_balance, e_available_margin, e_used_margin, e_maint_margin
- Cached: k_capital, available_capital (E rules when present; K fallback)
- Reconcile: OK/WARN(<20)/ERROR(>=20 delta) runs atomically on every event

New FFI:
  dita_kernel_set_seed_capital(handle, seed: f64) -> i32
  dita_kernel_on_account_event_json(handle, payload) -> *char
  Kinds: FILL_SETTLED | ACCOUNT_UPDATE | FUNDING_FEE

rust_backend.py: wires set_seed_capital() and on_account_event();
snapshot()[account] exposes both legacy and V2 fields.

Smoke-tested: fill->E_update->funding->re-sync all produce correct
K/E values and reconcile transitions (OK->OK->WARN->OK).
89/89 offline tests pass.
2026-06-01 21:22:01 +02:00
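
The K/E reconcile rule from 468984baab (OK / WARN below a delta of 20 / ERROR at 20 or more) can be illustrated with a small Python sketch. Several parts are assumptions rather than facts from the commit: the formula combining the K-values into k_capital, the comparison against e_wallet_balance, and the `ok_epsilon` boundary between OK and WARN (1e-4 here, borrowed from the capital_epsilon value mentioned in a later pass). The actual logic lives in the Rust kernel.

```python
def k_capital(seed_capital: float, k_realized_pnl: float,
              k_fees_paid: float, k_funding_net: float) -> float:
    """Assumed bookkeeping identity: seed + realized PnL - fees + net funding."""
    return seed_capital + k_realized_pnl - k_fees_paid + k_funding_net

def reconcile_status(k_cap: float, e_wallet_balance: float,
                     ok_epsilon: float = 1e-4,
                     error_threshold: float = 20.0) -> str:
    """Classify the gap between kernel-computed and exchange-reported capital."""
    delta = abs(k_cap - e_wallet_balance)
    if delta <= ok_epsilon:
        return "OK"
    if delta < error_threshold:
        return "WARN"
    return "ERROR"

# Example: kernel view after one profitable trade with fees and funding.
k = k_capital(seed_capital=1000.0, k_realized_pnl=12.5,
              k_fees_paid=0.9, k_funding_net=-0.1)
status_close = reconcile_status(k, e_wallet_balance=1010.0)  # small drift
status_far = reconcile_status(k, e_wallet_balance=980.0)     # large drift
```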