PINK: E2E trace analysis — Pass 19 lifecycle/Rust subtleties/test infra (V1-V14)

Nineteenth pass. Critical: DITAv2LauncherBundle.close() never calls
kernel.close(), so the Rust handle leaks until __del__ runs (V1);
BingxVenueAdapter has no close()/disconnect(), so its ThreadPoolExecutor
and HTTP resources are never released (V2); three generators write the
same output file with incompatible prologues, and the last writer wins
(V4); generated tests are triple env-gated and never run in CI, making
them dead code (V5); kernel.close() destroys the Rust handle immediately
with no drain or flush, risking use-after-free (V6). High: process_intent
ENTER does not clear seen_event_ids, so old dedup state pollutes the new
trade (V3); there is no conftest.py, pytest.ini, or asyncio_mode, so test
discovery is fragile (V9). Medium: #[serde(default)] gives leverage a
default of 0.0, which mark_price uses without .max(1.0), a silent
accounting error (V8). 347 total flaws across 19 passes.

Co-authored-by: CommandCodeBot <noreply@commandcode.ai>
Codex
2026-06-02 16:34:58 +02:00
parent 94078ee8fe
commit ded4b59891
3 changed files with 1502 additions and 10 deletions


@@ -1,7 +1,26 @@
# PINK DITAv2 — Structural Flaw Analysis (CENTRAL)
**Analysis date:** 2026-05-31
**Last updated:** 2026-06-02 (flaw fix pass — 8 flaws closed)
**Scope:** Full PINK pipeline — all flaws across all modules.
> **Fix notation:** Rows marked **✅ FIXED `<sha>`** are verified-fixed with a test commit on branch `exp/pink-ditav2-sprint0-20260530`.
> Unfixed flaws remain as originally described.
### Fixes applied (2026-06-02)
| Flaw | Commit | What changed |
|------|--------|--------------|
| I1 — apply_fill partial fill accumulation | `c87ca78` | `slot.size += fill_size` (was `=`); all partial fills accumulate |
| I10 — seen_event_ids lost on restart | `c87ca78` + `3ca154e` | IndexSet dedup in AccountState; KernelFullSnapshot persists per-slot seen_event_ids; startup reconcile wired (I14) |
| I13 — stray event reactivates CLOSED slot | `c87ca78` | `slot.closed` guard returns TERMINAL_STATE before FSM branch |
| I14 — no Zinc restore on startup | `3ca154e` | `__init__` now calls `reconcile_from_slots(zinc_live)` for any non-idle Zinc slots |
| I15 — CANCEL_REJECT leaves slot in EXIT_WORKING | `9a8d1b9` | Clears `active_exit_order`, transitions to `POSITION_OPEN` |
| O1 — `_maybe_close()` silent skip from async context | `338811e` | Routes to thread-pool executor when a running loop is detected |
| O5 — `_run()` no timeout → process hang | `338811e` | `Future.result(timeout=_BACKEND_TIMEOUT_S)` (default 30 s); raises `TimeoutError` |
| O10 — no `close()` on ExecutionKernel | `3ca154e` | `close()` nulls `_backend` to prevent double-free; `__enter__`/`__exit__` added |
| N1 — `with_handle_mut` zero sync (partial) | `c87ca78` | `catch_unwind` at FFI boundary; concurrent-call UB mitigated by Python GIL |
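The I1, I10, and I13 rows above describe three guards on the fill path: accumulate instead of overwrite, deduplicate by event id, and refuse events on a closed slot. A minimal Python model of that behaviour is sketched below. The real kernel is Rust; the `Slot` class, the `apply_fill` signature, the `DUPLICATE`/`APPLIED` return values, and the order of the two guards are assumptions made for illustration only.

```python
from dataclasses import dataclass, field

TERMINAL_STATE = "TERMINAL_STATE"  # name taken from the I13 row
DUPLICATE = "DUPLICATE"            # hypothetical return value
APPLIED = "APPLIED"                # hypothetical return value


@dataclass
class Slot:
    """Hypothetical stand-in for one kernel slot."""
    size: float = 0.0
    closed: bool = False
    # A dict keeps insertion order, standing in for the Rust IndexSet.
    seen_event_ids: dict = field(default_factory=dict)


def apply_fill(slot: Slot, event_id: str, fill_size: float) -> str:
    # I13: a closed slot rejects every event before any state change.
    if slot.closed:
        return TERMINAL_STATE
    # I10: an event id that was already processed is ignored.
    if event_id in slot.seen_event_ids:
        return DUPLICATE
    slot.seen_event_ids[event_id] = None
    # I1: partial fills accumulate (`+=`), they do not overwrite (`=`).
    slot.size += fill_size
    return APPLIED


slot = Slot()
apply_fill(slot, "e1", 0.25)
apply_fill(slot, "e2", 0.5)
apply_fill(slot, "e2", 0.5)   # replayed event, ignored
print(slot.size)
```

Persisting `seen_event_ids` with the slot snapshot, as the I10 row describes, is what keeps the duplicate check effective across a restart.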
**Sources:**
- This file (A-series): Detailed writeups for architectural flaws.
- [PINK_DITAv2_E2E_TRACE_ANALYSIS.md](./PINK_DITAv2_E2E_TRACE_ANALYSIS.md) (E, F, G-series):
@@ -33,7 +52,8 @@
| S | Pass 16 (Error Handling/Arithmetic/Test Infra) | 16 | 4 | 7 | 5 | 0 | 0 |
| T | Pass 17 (Unsafe Review/Dead Code/Build/Protocols) | 14 | 0 | 5 | 5 | 4 | 0 |
| U | Pass 18 (Rust Test Gaps/Accounting/FFI Types) | 14 | 3 | 4 | 4 | 3 | 0 |
| **Total** | | **333** | **30** | **99** | **96** | **64** | **37** |
| V | Pass 19 (Lifecycle/Rust Subtleties/Test Infra) | 14 | 5 | 2 | 4 | 3 | 0 |
| **Total** | | **347** | **35** | **101** | **100** | **67** | **37** |
---
@@ -180,7 +200,7 @@
| # | Flaw | Layer | Severity |
|---|------|-------|----------|
| I1 | Entry `apply_fill` multiple partial fills overwrite size instead of accumulating | Rust | **Critical** |
| I1 | Entry `apply_fill` multiple partial fills overwrite size instead of accumulating **✅ FIXED `c87ca78`** | Rust | **Critical** |
| I2 | Zero exit_ratio creates zero-size exit order — slot stuck in EXIT_REQUESTED | Rust | Medium |
| I3 | entry_price inconsistency — Python falsy vs Rust `<= 0.0` gate | Bridge | Info |
| I4 | Only 1 Rust unit test for 1765-line kernel — 99% untested at Rust layer | Rust | **High** |
@@ -189,12 +209,12 @@
| I7 | Three weak/vacuous assertions in test_flaws.py | Test | Low |
| I8 | Entry overfill no guard | Rust | Low |
| I9 | No crash durability — slot state pure in-memory until step 7 of process_intent | Bridge | **Critical** |
| I10 | seen_event_ids lost on restart — events double-processed | Rust | **Critical** |
| I10 | seen_event_ids lost on restart — events double-processed **✅ FIXED `c87ca78` + `3ca154e`** (IndexSet dedup + snapshot; startup restore wired) | Rust | **Critical** |
| I11 | No idempotency key sent to BingX — lost response creates duplicate orders | Venue | **High** |
| I12 | No graceful degradation for ANY subsystem | All | **High** |
| I13 | Stray venue event can reactivate CLOSED slot — no guard | Rust | **High** |
| I14 | No reconcile_from_slots call on startup — Zinc state never loaded into kernel | Restart | **High** |
| I15 | CANCEL_REJECT doesn't clear active_exit_order — slot stuck in EXIT_WORKING | Rust | Medium |
| I13 | Stray venue event can reactivate CLOSED slot — no guard **✅ FIXED `c87ca78`** | Rust | **High** |
| I14 | No reconcile_from_slots call on startup — Zinc state never loaded into kernel **✅ FIXED `3ca154e`** | Restart | **High** |
| I15 | CANCEL_REJECT doesn't clear active_exit_order — slot stuck in EXIT_WORKING **✅ FIXED `9a8d1b9`** | Rust | Medium |
| I16 | Zinc shared memory world-readable/writable by same-machine processes | Zinc | **High** |
| I17 | KernelSlotView unrestricted getattr/setattr — bypasses all FSM guards | Bridge | **High** |
| I18 | sys.path.insert(0) at import time in 3 production files — malicious module loading | Build | **High** |
@@ -320,7 +340,7 @@
| # | Flaw | Layer | Severity |
|---|------|-------|----------|
| N1 | Rust kernel `with_handle_mut` zero sync — `&mut` from raw ptr, UB on concurrent FFI | Rust | **Critical** |
| N1 | Rust kernel `with_handle_mut` zero sync — `&mut` from raw ptr, UB on concurrent FFI *mitigated by Python GIL (single-threaded caller); `catch_unwind` added `c87ca78`* | Rust | **Critical** |
| N2 | `_run()` has two completely different code paths — runtime branch, not design | Venue | **Critical** |
| N3 | `_run()` path B blocks event loop thread for every venue HTTP operation | Venue | **Critical** |
| N4 | `asyncio.run()` called repeatedly — creates/destroys event loops per call | Venue | **Critical** |
@@ -339,16 +359,16 @@
| # | Flaw | Layer | Severity |
|---|------|-------|----------|
| O1 | `_maybe_close()` asyncio.run without loop guard — close skipped from async context | Launcher | **High** |
| O1 | `_maybe_close()` asyncio.run without loop guard — close skipped from async context **✅ FIXED `338811e`** | Launcher | **High** |
| O2 | `async def connect()` shims call sync venue.connect() without await — blocking | Test | Medium |
| O3 | `_contract_rows(client)` NOT awaited — `_pick_live_symbol` iterates coroutine = crash | Test | **High** |
| O4 | Deprecated `get_event_loop().run_until_complete()` in test file | Test | Medium |
| O5 | `_run()` thread pool .result() no timeout — backend hang freezes process | Venue | **High** |
| O5 | `_run()` thread pool .result() no timeout — backend hang freezes process **✅ FIXED `338811e`** | Venue | **High** |
| O6 | MockVenueAdapter never exercises thread-pool bridge — untested in CI | Venue | Medium |
| O7 | `_keepalive_loop`/`_rotation_sentinel` fire-and-forget — exceptions silently lost | Stream | Low |
| O8 | `KernelSlotView.__getattr__` N FFI calls for N fields — no caching | Bridge | Medium |
| O9 | `DITAv2LauncherBundle` no `__del__` — GC'd bundle leaks resource tree | Launcher | Medium |
| O10 | `ExecutionKernel` no `close()` — Rust handle only freed by unpredictable __del__ | Bridge | Medium |
| O10 | `ExecutionKernel` no `close()` — Rust handle only freed by unpredictable __del__ **✅ FIXED `3ca154e`** | Bridge | Medium |
| O11 | `KernelSlotView.__setattr__` triggers 5 persistence side effects — no read-only view | Bridge | Medium |
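
The O1 and O5 fixes both concern running a coroutine from synchronous code: detect a running loop, hand the work to a thread pool when one exists, and bound the wait. The sketch below shows one way this could look using only the standard library. `run_blocking`, `_EXECUTOR`, and the factory-style argument are hypothetical names; only `_BACKEND_TIMEOUT_S` and its 30 s default come from the fix table.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

_BACKEND_TIMEOUT_S = 30.0                      # default named in the O5 fix
_EXECUTOR = ThreadPoolExecutor(max_workers=1)  # hypothetical shared pool


def run_blocking(coro_factory, timeout=_BACKEND_TIMEOUT_S):
    """Run a coroutine to completion from sync code with a bounded wait."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No loop is running in this thread, so asyncio.run() is safe.
        return asyncio.run(asyncio.wait_for(coro_factory(), timeout))
    # A loop is already running here and asyncio.run() would raise,
    # so run the coroutine on a fresh loop in a worker thread instead.
    future = _EXECUTOR.submit(asyncio.run, coro_factory())
    # A hung backend now surfaces as a timeout error, not a frozen process.
    return future.result(timeout=timeout)


async def _ping():
    await asyncio.sleep(0)
    return "pong"


print(run_blocking(_ping))
```

Note that the second path still blocks the calling thread while it waits, which is the concern recorded separately as N3; the timeout only caps how long that block can last.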
---
@@ -486,6 +506,29 @@
---
## V-Series: Startup/Shutdown Lifecycle, Rust Kernel Subtleties, Generated Test Infra (Pass 19)
*Full detail in TRACE doc under "PASS 19 — STARTUP/SHUTDOWN LIFECYCLE, RUST KERNEL SUBTLETIES, GENERATED TEST INFRA."*
| # | Flaw | Layer | Severity |
|---|------|-------|----------|
| V1 | `DITAv2LauncherBundle.close()` never calls `kernel.close()`; Rust handle leaks via `__del__` | Launcher | **Critical** |
| V2 | `BingxVenueAdapter` has no `close()`/`disconnect()`; ThreadPoolExecutor/HTTP resources never released | Venue | **Critical** |
| V3 | `process_intent` ENTER doesn't clear `seen_event_ids`; old dedup state pollutes new trade | Rust | **High** |
| V4 | 3 generators write the same output file; last writer wins, incompatible prologues | Test | **Critical** |
| V5 | Generated tests are triple env-gated; never run in CI, dead code | Test | **Critical** |
| V6 | `kernel.close()` destroys Rust handle immediately; no drain, no flush, UAF risk | Bridge | **Critical** |
| V7 | `_last_settled_pnl` dict accessed from process_intent and on_venue_event without locks | Bridge | Medium |
| V8 | `#[serde(default)] leverage: f64` defaults to 0.0; `mark_price` uses it directly with no `.max(1.0)` | Rust | Medium |
| V9 | No `conftest.py`, no `pytest.ini`, no `asyncio_mode`; test discovery fragile | Test | **High** |
| V10 | `kernel.close()` has `except Exception: pass`; silently swallows destroy errors | Bridge | Low |
| V11 | `build_launcher_bundle()` has no cleanup on partial failure; OOM orphans 4 components | Launcher | Medium |
| V12 | `KernelResult` clones entire kernel state on every FFI call; wasted allocations | Rust | Medium |
| V13 | `_build_rb()` leaks bundle on post-creation failure | Test | Low |
| V14 | `_maybe_close` breaks after first method; never tries both close and disconnect | Launcher | Low |
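
Several V-series rows (V1, V10, V14) describe the same gap: shutdown does not reach every component, stops after the first teardown method, or hides failures. A minimal sketch of a teardown that avoids those three problems follows. `LauncherBundle` and `close_component` are hypothetical stand-ins, not the project's classes, and closing the venue before the kernel is an assumption about the desired ordering.

```python
import logging

log = logging.getLogger("pink.lifecycle")  # hypothetical logger name


def close_component(component, name):
    """Call every teardown method the component exposes; return the errors."""
    errors = []
    # V14: try both methods instead of stopping after the first one found.
    for method_name in ("disconnect", "close"):
        method = getattr(component, method_name, None)
        if not callable(method):
            continue
        try:
            method()
        except Exception as exc:
            # V10: record and report the failure instead of `pass`.
            log.warning("%s.%s() failed: %r", name, method_name, exc)
            errors.append((name, method_name, exc))
    return errors


class LauncherBundle:
    """Hypothetical stand-in for DITAv2LauncherBundle."""

    def __init__(self, kernel, venue):
        self.kernel = kernel
        self.venue = venue
        self._closed = False

    def close(self):
        if self._closed:          # idempotent: a second call is a no-op
            return []
        self._closed = True
        # Assumed order: stop the event source first, then the kernel (V1).
        errors = close_component(self.venue, "venue")
        errors += close_component(self.kernel, "kernel")
        return errors

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        self.close()
        return False
```

Returning the collected errors lets the caller decide whether a failed teardown is fatal, while still guaranteeing that every component was attempted.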
---
## H-Series: Edge Domains — Dependencies, Error Handling, Types, Contracts (Pass 5)
*Full detail in TRACE doc under "PASS 5 — EDGE DOMAINS."*