diff --git a/PINK_DITAv2_E2E_TRACE_ANALYSIS.md b/PINK_DITAv2_E2E_TRACE_ANALYSIS.md
index 20d1a1b..cbf6bc7 100644
--- a/PINK_DITAv2_E2E_TRACE_ANALYSIS.md
+++ b/PINK_DITAv2_E2E_TRACE_ANALYSIS.md
@@ -1549,3 +1549,622 @@
 undo, no validation that the slot reached a consistent state. This is the single
 highest-impact E2E flaw because it requires no concurrency, no race condition,
 no unusual market conditions, just a transient FFI error during normal operation.
+
+---
+
+## PASS 4: SYSTEMATIC DOMAIN SCANS (Config, Rust, Persistence, Lifecycle)
+
+### Rust Kernel: Numeric & FSM Invariants
+
+#### G1: EXIT_RESIDUAL action is entirely missing from Rust KernelCommandType
+
+**File:** `_rust_kernel/src/lib.rs`
+
+```rust
+string_enum! {
+    enum KernelCommandType {
+        ENTER, EXIT, MARK_PRICE, RECONCILE, CONTROL, CANCEL,
+    }
+}
+```
+
+Six variants. **No `EXIT_RESIDUAL`.** If any caller submits an intent with `action = "EXIT_RESIDUAL"`, the `string_enum` deserializer rejects it and the call returns `INVALID_INTENT_PARSE`. Even if deserialization worked, there is no branch to handle residual-position cleanup. Any position with remaining size after partial exit legs has **no way to trigger a clean-up exit** via the intent system.
+
+The Python `KernelCommandType` enum (contracts.py) does have `EXIT_RESIDUAL`, translated to the string `"EXIT_RESIDUAL"` by `_intent_to_payload`. This string hits Rust's `string_enum` → parse error → `INVALID_INTENT_PARSE`.
+
+**Fix:** Add an `EXIT_RESIDUAL` variant to the Rust enum, plus a match arm that skips the `NO_OPEN_POSITION` guard for residual-sized positions.
+
+**Severity: Critical**
+
+#### G2: `into_c_string` uses `unwrap()` and panics on interior NUL byte
+
+**File:** `_rust_kernel/src/lib.rs:1477`
+
+```rust
+fn into_c_string(value: &str) -> *mut c_char {
+    CString::new(value).unwrap().into_raw()
+}
+```
+
+`CString::new()` returns `Err` if the string contains a NUL (`'\0'`) byte, and `.unwrap()` then panics at the C FFI boundary, which **takes down the entire process**. One caveat on reachability: `serde_json::to_string()` escapes U+0000 as `\u0000`, so a serialized `KernelIntent`, `VenueEvent`, or `TradeSlot` should not itself contain a raw NUL. The panic is reachable for any string that reaches `into_c_string` without passing through the serializer.
+
+Every FFI call that returns a string goes through this function:
+- `dita_kernel_process_intent_json`
+- `dita_kernel_on_venue_event_json`
+- `dita_kernel_reconcile_slots_json`
+- `dita_kernel_snapshot_json`
+- `dita_kernel_get_slot_json`
+
+**Fix:** Map the `Err` case to a null pointer (for example `CString::new(value).map(CString::into_raw).unwrap_or_else(|_| ptr::null_mut())`) or route it through `invalid_intent_cstring`.
+
+**Severity: Critical**
+
+#### G3: `process_intent` EXIT hardcodes `prev_state = POSITION_OPEN` unconditionally
+
+**File:** `_rust_kernel/src/lib.rs:842-890`
+
+```rust
+slot.fsm_state = TradeStage::EXIT_REQUESTED; // unconditional override
+let transition = self.transition(
+    &slot,
+    TradeStage::POSITION_OPEN, // always POSITION_OPEN
+    slot.fsm_state.clone(),
+    "EXIT_INTENT",
+);
+```
+
+Three problems:
+
+(a) **Transition prev_state is a lie.** If the slot was in `EXIT_WORKING`, `EXIT_SENT`, `EXIT_REQUESTED`, or `POSITION_PARTIALLY_CLOSED`, the transition record still says `POSITION_OPEN`, which is wrong.
+
+(b) **Backward transition.** If the slot is `EXIT_WORKING` and a new EXIT intent arrives, `fsm_state` is set to `EXIT_REQUESTED`, a backward transition from `EXIT_WORKING` → `EXIT_REQUESTED`. This corrupts the FSM.
+
+(c) **No state guard.** EXIT should only be allowed from `POSITION_OPEN`, `EXIT_WORKING` (for additional legs), or `POSITION_PARTIALLY_CLOSED`. Currently any state that passes `!is_free() && !closed && size > 0` can transition to `EXIT_REQUESTED`.
+
+**Fix:** Check the actual FSM state before allowing EXIT, log the actual prev_state, and guard against backward transitions.
+
+**Severity: Critical**
+
+#### G4: `consume_exit_leg` advances beyond last valid index: stale `all_legs_done` variable
+
+**File:** `_rust_kernel/src/lib.rs:1420-1435`
+
+```rust
+let all_legs_done = slot.active_leg_index >= slot.exit_leg_ratios.len(); // (A)
+let should_close = (slot.size <= 1e-12 || (!partial && all_legs_done)); // (B)
+
+if !partial {
+    slot.consume_exit_leg(); // (C) advances active_leg_index AFTER (A)
+}
+
+if should_close && slot.size <= 1e-12 { // (D) close
+} else if !partial && !all_legs_done { // (E) stale! uses (A), not the post-advance index
+```
+
+On the last leg (`active_leg_index = len - 1`):
+- (A): `all_legs_done = false` (pre-advance)
+- (C): advances to `len` (exhausted)
+- (E): `!partial && !false` = true → enters `POSITION_OPEN` instead of examining `should_close` with the post-advance index
+
+The `all_legs_done` variable is captured **before** `consume_exit_leg` advances the index. Branch (E) should use the post-advance index to correctly detect exhaustion.
+
+After exhaustion, `next_exit_ratio()` returns `1.0` (out-of-bounds `unwrap_or(1.0)`), so it silently tries to exit the remaining size as 100% instead of detecting completion.
+
+**Severity: Critical**
+
+#### G5: `realized_pnl` uses unbounded f64 and overflows to inf at extreme values
+
+**File:** `_rust_kernel/src/lib.rs:648-656`
+
+```rust
+let notional = exit_size * slot.entry_price * slot.leverage.max(1.0);
+delta * notional
+```
+
+No `is_finite()` check on intermediate products. At `exit_price=1e200`, `entry_price=1e-200`: `delta` = `(1e200 - 1e-200) / 1e-200` ≈ `1e400` → `inf`. The resulting `inf` is stored in `slot.realized_pnl`, corrupting all future PnL tracking.
+
+Subnormals: `entry_price=5e-324` (the smallest subnormal) makes the division overflow to `inf` even for modest exit prices.
+
+**Fix:** Add `is_finite()` guards on both prices and cap intermediate products.
+
+**Severity: High**
+
+#### G6: `mark_price` produces unbounded `unrealized_pnl`
+
+**File:** `_rust_kernel/src/lib.rs:384-399`
+
+```rust
+self.unrealized_pnl = delta * self.size * self.entry_price * self.leverage;
+// No is_finite() check on result
+```
+
+If any of `delta`, `size`, `entry_price`, or `leverage` is extreme, the product overflows to `inf`. There is no guard on the result, so `inf` stays in `unrealized_pnl` indefinitely. The only check is the `price <= 0.0` guard on the input; nothing guards the computation chain.
+
+Also: `self.entry_price = price` at line 388 overwrites `entry_price` on any `mark_price` call for a position with `entry_price <= 0.0`, even when the position has been open for a while. For a stale-zero `entry_price` this sets it to the current market price on the first `mark_price` after open, which is correct. But if the slot is reused (re-entry without resetting `entry_price`), the old entry price from the prior trade bleeds into unrealized PnL.
+
+**Severity: High**
+
+#### G7: `process_intent` ENTER: no `is_finite()` guard on `target_size`
+
+**File:** `_rust_kernel/src/lib.rs:806-807`
+
+```rust
+intended_size: intent.target_size.max(0.0),
+```
+
+`f64::INFINITY.max(0.0)` returns `inf`, so an infinite `target_size` passes straight through. `f64::NAN.max(0.0)` returns `0.0`, because Rust's `f64::max` returns the non-NaN operand, so a NaN is silently turned into a zero-size order instead of being rejected. Neither case produces a diagnostic.
+
+On the JSON FFI path, `serde_json` rejects bare `NaN` and `Infinity` tokens by default (they are not valid JSON), so those inputs should surface as a parse error rather than reach this line. The missing guard matters for any path that builds the intent without JSON parsing, and the Python-side `_first_invalid_intent_field` guard cannot be relied on to cover it (F3: it allows these through). An `inf` that does arrive propagates into `intended_size` in `VenueOrder`, corrupting all fill calculations.
+
+Similarly, `reference_price` is never validated for finiteness before being stored in `VenueOrder.metadata`.
+
+**Severity: High**
+
+#### G8: `reconcile_slots_json`: no dedup or bounds validation
+
+**File:** `_rust_kernel/src/lib.rs:1668-1675`
+
+```rust
+for slot in slots {
+    if slot.slot_id < core.slots.len() {
+        core.slots[slot.slot_id] = slot.clone();
+    }
+}
+```
+
+Two slots with the same `slot_id`: the **second overwrites the first** silently. A slot with `slot_id >= core.slots.len()`: **silently dropped**, with no error and no diagnostic. The caller sees `accepted=true` even if some or all slots were not applied.
+
+**Severity: High**
+
+#### G9: `exchange_order_id` propagation uses wrong order target
+
+**File:** `_rust_kernel/src/lib.rs:1110-1125`
+
+```rust
+let target = if slot.active_entry_order.is_some() {
+    slot.active_entry_order.as_mut()
+} else {
+    slot.active_exit_order.as_mut()
+};
+```
+
+If an **entry** order exists (even if fully filled) and an **exit** fill event arrives, the code updates the entry order's `venue_order_id` instead of the exit order's. The exit order's `venue_order_id` stays empty. Any subsequent `CANCEL` intent on the exit order fails because `active_exit_order.venue_order_id` is empty, so the venue can't match the cancel.
+
+**Fix:** Disambiguate by matching `venue_client_id`, or clear `active_entry_order` when entry is complete.
+
+**Severity: High**
+
+#### G10: CANCEL diagnostic code says NO_ACTIVE_EXIT_ORDER for entry cancel too
+
+**File:** `_rust_kernel/src/lib.rs:966-1005`
+
+```rust
+if !has_cancellable_exit && !has_cancellable_entry {
+    return KernelResult {
+        diagnostic_code: KernelDiagnosticCode::NO_ACTIVE_EXIT_ORDER, // always says exit
+        details: json!({"reason": "NO_ACTIVE_EXIT_ORDER"}),
+    };
+}
+```
+
+When neither exit nor entry is cancellable, the diagnostic returns `NO_ACTIVE_EXIT_ORDER` regardless of which order was the target. If the user wanted to cancel an entry order that's not in a cancellable state, the diagnostic is misleading.
+
+**Fix:** Separate diagnostic codes: `NO_ACTIVE_EXIT_ORDER`, `NO_ACTIVE_ENTRY_ORDER`, `ENTRY_NOT_CANCELLABLE`.
+
+**Severity: High**
+
+#### G11: `apply_fill` entry-fill overwrites `active_entry_order.intended_size` with `slot.size`
+
+**File:** `_rust_kernel/src/lib.rs:1363-1377`
+
+On FULL_FILL entry, `slot.active_entry_order` is entirely replaced with a new `VenueOrder` where `intended_size = slot.size` (the fill amount) instead of the original intended size. The original intended size (which could be larger than fill size for partial fills) is lost.
+
+If a duplicate fill event arrives (dedup fails due to missing event_id), the second fill would use `slot.size` as the basis for further fills, giving wrong values.
+
+**Severity: Medium**
+
+#### G12: `leverage` unbounded after `is_finite()`: no maximum cap
+
+**File:** `_rust_kernel/src/lib.rs:778`
+
+```rust
+slot.leverage = if intent.leverage.is_finite() && intent.leverage > 0.0 {
+    intent.leverage // 1e100 accepted here
+} else { 1.0 };
+```
+
+`leverage = 1e100` passes `is_finite()`. Feeds into `realized_pnl()` as `slot.leverage.max(1.0) = 1e100`, producing `notional = exit_size * entry_price * 1e100`. Makes `unrealized_pnl` arbitrarily large.
+
+No maximum leverage cap is enforced anywhere. The exchange-level cap (`DOLPHIN_BINGX_EXCHANGE_LEVERAGE_CAP`) exists in `BingxExecClientConfig` but is **never passed to the Rust kernel**.
+
+**Severity: Medium**
+
+#### G13: `resolve_slot` fallback returns `unwrap_or(0)` and can misroute events
+
+**File:** `_rust_kernel/src/lib.rs:623`
+
+```rust
+self.slots.first().map(|slot| slot.slot_id).unwrap_or(0)
+```
+
+When no slot matches the event (`slot_id` out of range or all slot filters fail), returns `slot_id` of the **first slot** (which may be 0 or any value). No diagnostic is emitted: the caller sees a slot state change with no idea the event was misrouted.
+
+**Severity: Medium**
+
+#### G14: `commit_slot` silently ignores out-of-bounds slot_id
+
+**File:** `_rust_kernel/src/lib.rs:595-600`
+
+```rust
+fn commit_slot(&mut self, slot: TradeSlot) {
+    if slot.slot_id < self.slots.len() {
+        self.slots[slot_id] = slot;
+    }
+    // else: silently dropped, no error returned
+}
+```
+
+Mutations to an out-of-bounds slot are silently discarded. This can happen if `slot.slot_id` is corrupted via `set_slot_from_json`, causing an index mismatch between `slot.slot_id` and the actual slot position.
+
+**Severity: Medium**
+
+---
+
+### Configuration & Validation Chain
+
+#### G15: Zero `__post_init__` validators on all config dataclasses
+
+Every config dataclass in the system has zero field-level validation:
+
+| Dataclass | Fields | Validators |
+|-----------|--------|------------|
+| `KernelControlSnapshot` | 16 | **0** |
+| `ControlUpdate` | 16 | **0** |
+| `KernelIntent` | 19 | **0** |
+| `TradeSlot` | 22 | **0** |
+| `VenueOrder` | 8 | **0** |
+| `VenueEvent` | 18 | **0** |
+| `KernelTransition` | 11 | **0** |
+| `KernelOutcome` | 8 | **0** |
+| `AccountSnapshot` | 9 | **0** |
+| **Total** | **127** | **0** |
+
+The only validation in the entire chain:
+- `_first_invalid_intent_field()`: finiteness guard at the Python→Rust FFI boundary (not a dataclass validator)
+- Rust `leverage = if is_finite && > 0.0 { val } else { 1.0 }`: post-hoc clamp
+- Rust `KernelCore::new(max_slots.max(1))`: floor only, no ceiling
+- `launcher.py:143`: `max(1, int(...))` for `active_slot_limit`: floor only
+
+**No `__post_init__` exists anywhere. No bounds check on any field except the two floor-only guards.**
+
+**Severity: High**
+
+#### G16: `DITA_V2_DEBUG_CLICKHOUSE` defaults to `True` when env var is unset
+
+**File:** `launcher.py:133`
+
+```python
+debug = _env_bool("DITA_V2_DEBUG_CLICKHOUSE", True)
+```
+
+`_env_bool` (launcher.py:75) returns `default` when the env var is unset. So `debug = True` by default. Every runtime writes debug traces to ClickHouse by default. `DITA_V2_DEBUG_CLICKHOUSE=False` is required to disable it.
+
+This is not a bug per se, but it means debug ClickHouse writes are **on by default**, adding ~10 ClickHouse insertions per process_intent call (every transition + position state + trade event) that most production deployments may not want.
+
+**Severity: Informational**
+
+#### G17: String config fields have no charset/length validation (Zinc region injection risk)
+
+**File:** `control.py:31-53`, `real_zinc_plane.py:30`
+
+`runtime_namespace`, `strategy_namespace`, `event_namespace`, `actor_name`, `exec_venue`, `data_venue`, `ledger_authority` are all free-form strings with no validation. They're used as:
+
+1. **Zinc shared memory region names**: `self.prefix + "." + namespace + "." + kind`; an attacker-controlled namespace could collide with other processes' Zinc regions
+2. **ClickHouse table names**: `DOLPHIN_BINGX_JOURNAL_STRATEGY` is used as a table suffix, an SQL injection risk in the ClickHouse journal
+3. **Hazelcast map names**: Same injection risk via `event_namespace`
+
+**Severity: Medium**
+
+#### G18: `exit_leg_ratios` no sum-to-1 validation
+
+`KernelIntent.exit_leg_ratios` and `TradeSlot.exit_leg_ratios` are tuple/list of floats. No validator ensures they sum to approximately 1.0. With ratios summing to 0.5, half of the position is still open once the configured legs are consumed, and nothing schedules an exit for that residual. If a further exit is requested, `next_exit_ratio()` returns `1.0` after exhaustion and exits 100% of the remaining size, which may exceed the intended residual.
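
A validator for the sum-to-1 gap could look like the following minimal sketch. `ExitPlan` is a hypothetical stand-in, not a class from the codebase. It assumes each ratio is a fraction of the original position size and that an empty tuple means a single full exit; the source confirms neither.

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ExitPlan:
    """Hypothetical stand-in for the exit_leg_ratios field of KernelIntent."""

    exit_leg_ratios: Tuple[float, ...] = ()

    def __post_init__(self) -> None:
        ratios = self.exit_leg_ratios
        if not ratios:
            # Assumption: no configured legs means one exit of the full size.
            return
        for ratio in ratios:
            # Reject NaN/inf and anything outside the half-open interval (0, 1].
            if not math.isfinite(ratio) or not 0.0 < ratio <= 1.0:
                raise ValueError(f"invalid exit leg ratio: {ratio!r}")
        total = math.fsum(ratios)
        if not math.isclose(total, 1.0, abs_tol=1e-9):
            raise ValueError(f"exit_leg_ratios sum to {total}, expected 1.0")
```

Raising at construction time keeps a bad ratio list from ever reaching the FFI boundary; whether the real dataclasses should raise or clamp is a design decision for the authors.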
+
+**Severity: Low**
+
+#### G19: `RealZincControlPlane.read()` has no sequence check (torn-read risk)
+
+**File:** `real_control_plane.py:88-94`
+
+```python
+def read(self):
+    payload = _decode_packet(self.region.as_buffer())
+    control = payload.get("control")
+    if not isinstance(control, dict):
+        return self._snapshot
+    self._snapshot = KernelControlSnapshot(**control)
+    return self._snapshot
+```
+
+The binary packet has a 64-bit sequence number but `read()` **never checks it**. Between the zero-write and packet-write in `_write_region`, a reader sees an empty buffer → `_decode_packet` fails → falls back to `self._snapshot` (stale). Between the packet-write and `struct.pack` header (order depends on implementation), a reader sees a partial write with wrong size → `_decode_packet` fails.
+
+No checksum on the wire format: `struct.pack("!QQ", seq, len) + json_bytes`. A torn write produces garbage that `json.loads` may or may not parse successfully.
+
+**Severity: Low**
+
+#### G20: `DOLPHIN_BINGX_JOURNAL_STRATEGY`/`_DB`: ClickHouse SQL injection risk
+
+**File:** `launcher.py:202-203`
+
+```python
+"DOLPHIN_BINGX_JOURNAL_STRATEGY": os.environ.get("DOLPHIN_BINGX_JOURNAL_STRATEGY", ""),
+"DOLPHIN_BINGX_JOURNAL_DB": os.environ.get("DOLPHIN_BINGX_JOURNAL_DB", ""),
+```
+
+These are used as ClickHouse table and database name suffixes in `pink_clickhouse.py`. An attacker who can set env vars can inject SQL via semicolons or quotes in the table name. ClickHouse supports `INSERT INTO db.table FORMAT JSONEachRow`, so a table name like `positions; DROP TABLE ...;` could be destructive.
+
+**Severity: Low** (requires env var control, which implies broader access)
+
+---
+
+### Persistence Schema Alignment
+
+#### G21: `entry_price` used as `exit_price` in `trade_events` (data loss)
+
+**File:** `pink_clickhouse.py (outside workspace)`
+
+The `_write_trade_event` function maps `entry_price` from `slot.to_dict()` to both the `entry_price` and `exit_price` columns. The actual exit fill price (available on the `VenueEvent` object) is **never written** to the `exit_price` column.
+
+**Result:** Every `trade_events` row has `exit_price == entry_price`. The `exit_price` column is a dead column: it always contains the entry price, never the actual fill.
+
+**Severity: High** (data loss to DB for the most important trade metric)
+
+#### G22: `active_leg_index` → `entry_bar` semantic mis-mapping
+
+**File:** `pink_clickhouse.py (outside workspace)`
+
+```python
+"entry_bar": int(slot_dict.get("active_leg_index", 0) or 0),
+```
+
+`active_leg_index` tracks the exit-leg-ratios cursor (which leg of a multi-leg exit we're on), not a bar count. The value is `0` at position open and `1` after the first exit leg; neither value represents bars held. **The `entry_bar` column stores the wrong concept.**
+
+**Severity: Medium** (column contains semantically meaningless data)
+
+#### G23: `capital_before` arithmetic reconstruction absorbs cross-slot PnL
+
+**File:** `pink_clickhouse.py (outside workspace)`
+
+```python
+capital_before = capital_after - pnl_leg
+```
+
+`capital_before` is reconstructed by subtracting the current leg's PnL from the current capital. In a multi-slot system, other slots' PnL changes between legs are absorbed into `capital_before`. The column is **always wrong** in multi-slot scenarios because `capital_after` reflects total PnL from all slots, not just the leg being recorded.
+
+**Severity: Medium** (wrong `capital_before` for multi-slot trading)
+
+#### G24: Recovery `trade_reconstruction` always has `trade_id=""`
+
+**File:** `pink_clickhouse.py (outside workspace)`
+
+The `persist_recovery_state` function passes `kernel.snapshot()["account"]` (an account dict with keys `capital, equity, realized_pnl, ...`) where a slot dict is expected. The `trade_id` key **does not exist** on the account dict. The `recovery_state` row always has `trade_id=""`.
+
+**Severity: Medium** (recovery data is not associable with any trade)
+
+#### G25: `seen_event_ids`, `exit_leg_ratios`, `VenueOrder`, `metadata` not in flat ClickHouse tables
+
+These fields are:
+- Present on the Python `TradeSlot` ✅
+- Transmitted through Zinc shared memory ✅
+- Stored in Hazelcast ✅
+- Stored in ClickHouse `dita_kernel_debug` (full JSON) ✅
+- **NOT extracted** into main ClickHouse flat tables `position_state`, `trade_events`, `trade_exit_legs` ❌
+
+Data exists at the source, travels through the pipeline, and hits the debug journal, but is lost in the main analytical tables.
+
+**Severity: Low** (data exists in debug journal if needed for reconstruction)
+
+#### G26: `_safe_float` silently converts NaN/None/Inf to 0.0
+
+**File:** `utils.py:15`
+
+```python
+def _safe_float(v, default=0.0):
+    try:
+        f = float(v)
+        if not math.isfinite(f):
+            return default
+        return f
+    except (TypeError, ValueError, OverflowError):
+        return default
+```
+
+Used in multiple ClickHouse writers. Silently converts `NaN`/`Inf`/parsing errors to `0.0`. No diagnostic is emitted when a non-finite value reaches the persistence layer, so the data is silently zeroed.
+
+**Severity: Low** (safe default but silent corruption)
+
+---
+
+### Lifecycle & Resource Management
+
+#### G27: `build_launcher_bundle` has no exception safety, so prior resources leak
+
+**File:** `launcher.py:264-300`
+
+```python
+def build_launcher_bundle(...):
+    control_plane = _build_control_plane(...)
+    projection = build_projection(...)
+    zinc_plane = _build_zinc_plane(...)
+    venue = _build_venue(...)
+    kernel = ExecutionKernel(...)  # ← if THIS fails, everything above leaks
+```
+
+If any step after the first raises, all previously built resources leak:
+- `RealZincPlane` created → `_build_venue()` fails → 3 shared memory regions orphaned
+- `RealZincControlPlane` created → `_build_zinc_plane()` fails → 1 shared memory region orphaned
+- `BingxVenueAdapter` created → `ExecutionKernel.__init__()` fails → HTTP connection leaked
+
+**No `try/finally` anywhere in the builder.** The init order is also optimized for forward construction, not backward cleanup.
+
+**Severity: High** (shared memory leak on any build failure)
+
+#### G28: `RealZincPlane` and `RealZincControlPlane` have no `__del__`
+
+When `close()` is not called (exception in builder, forgotten cleanup, GC during shutdown), the shared memory regions opened by `RealZincPlane` (3 regions) and `RealZincControlPlane` (1 region) are **orphaned on the OS**. They persist in `/dev/shm/` (or platform equivalent) until system reboot.
+
+Python's `__del__` is unreliable (not called on SIGKILL, not called if the object is part of a cycle without a GC run), but its absence means even normal garbage collection can't clean up.
+
+**Severity: High** (shared memory leaks)
+
+#### G29: Zero signal handlers, so no cleanup on SIGTERM/SIGINT
+
+```bash
+$ grep -rn "signal\|SIGTERM\|SIGINT\|atexit" *.py  # ZERO matches
+```
+
+When SIGTERM or SIGINT arrives:
+1. SIGTERM's default action terminates the process immediately; SIGINT is raised as `KeyboardInterrupt`, and nothing catches it to run cleanup
+2. No `DITAv2LauncherBundle.close()` is called
+3. No `ExecutionKernel.__del__` is called (CPython may run GC on normal exit but not reliably)
+4. All shared memory (RealZincPlane, RealZincControlPlane) is orphaned
+5. In-flight BingX HTTP calls are interrupted mid-stream
+6. Rust kernel handle is leaked
+
+**Severity: High**
+
+#### G30: `ExecutionKernel` has no `close()` and relies on `__del__` for Rust handle cleanup
+
+`ExecutionKernel` has `__del__` which calls `_get_rust().destroy(backend)`. No `close()` method. `DITAv2LauncherBundle.close()` never touches the kernel, so the Rust handle is only freed by GC at an unpredictable time.
+
+If any code holds a stale `_backend` pointer, the handle dangles when GC runs. If `__del__` is suppressed (e.g., during interpreter shutdown with cyclic references), the Rust handle leaks permanently.
+
+**Fix:** Add `close()` to `ExecutionKernel`, call it from `DITAv2LauncherBundle.close()`.
+
+**Severity: High**
+
+#### G31: `projection` (Hazelcast) never closed
+
+`build_projection()` returns a `HazelcastProjection` which holds a Hazelcast client connection. No `close()` or `disconnect()` method exists on the projection, projector, or row writer. `DITAv2LauncherBundle.close()` doesn't touch the projection. The Hazelcast client connection leaks on shutdown.
+
+**Severity: Medium**
+
+#### G32: `_maybe_close()` only calls the first method found; `break` skips the second
+
+**File:** `launcher.py:233-243`
+
+```python
+for method_name in ("close", "disconnect"):
+    method = getattr(obj, method_name, None)
+    if method is None:
+        continue
+    try:
+        result = method()
+    except TypeError:
+        continue
+    if inspect.isawaitable(result):
+        try:
+            asyncio.run(result)
+        except RuntimeError:
+            pass
+    break  # ← ONLY calls the FIRST found method, never both
+```
+
+If an object has both `close()` and `disconnect()`, only `close()` is called. `disconnect()` is silently skipped. Also: `asyncio.run(result)` silently swallows `RuntimeError` when a running event loop exists, so the coroutine is **never executed**.
+
+Currently no object has both, but the pattern is fragile.
+
+**Severity: Low**
+
+#### G33: `close()` is not idempotent for RealZinc components
+
+`RealZincPlane.close()` and `RealZincControlPlane.close()` call their Zinc region's `close()` method. If called twice, the second call operates on an already-closed region and likely crashes inside Zinc's shared memory code.
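
One way to make `close()` safe to call twice is to detach the region reference before closing it. The sketch below uses a hypothetical `IdempotentRegionOwner` and a fake region object as a test double; the real `RealZincPlane` holds three regions and may need different handling.

```python
class IdempotentRegionOwner:
    """Hypothetical owner of one closeable region; not the real RealZincPlane."""

    def __init__(self, region):
        self._region = region

    @property
    def closed(self) -> bool:
        return self._region is None

    def close(self) -> None:
        # Detach the reference first so a repeated or re-entrant call is a no-op.
        region, self._region = self._region, None
        if region is not None:
            region.close()


class FakeRegion:
    """Test double that counts how often close() is invoked."""

    def __init__(self):
        self.close_calls = 0

    def close(self):
        self.close_calls += 1
```

With this shape, calling `close()` on the owner any number of times reaches the underlying region exactly once.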
+
+References are not nulled after close: `DITAv2LauncherBundle.close()` calls `_maybe_close()` on `self.venue`, `self.zinc_plane`, and `self.control_plane`, and `_maybe_close()` never sets them to `None`. Double `close()` is therefore unsafe.
+
+**Severity: Low**
+
+#### G34: No context manager on `DITAv2LauncherBundle`
+
+`DITAv2LauncherBundle` has no `__enter__`/`__exit__`. Users must manually call `close()`. No `with` pattern exists anywhere in the source for lifecycle management. No `__del__` fallback on the bundle either.
+
+**Severity: Low** (ergonomic, not a leak source if caller follows the pattern)
+
+#### G35: `BingxVenueAdapter.connect()` exists but is never called by the launcher
+
+`BingxDirectExecutionAdapter` has a `connect()` method that initializes the lifetime HTTP client. `BingxVenueAdapter` has `connect()` that calls `_call_backend("connect")`. Neither is called in `build_launcher_bundle()` or `_build_venue()`. If the adapter's `submit_intent()` relies on a connected client, it must be initializing one lazily; the explicit connect path is dead code.
+
+**Severity: Informational**
+
+#### G36: Only one `try/finally` in the entire codebase
+
+The only `try/finally` is `_RustKernelLib._take_string()` (rust_backend.py:140-143) which frees the Rust C string. All other resource management uses `try/except` with no `finally`.
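
A possible remedy for the builder case (G27) is `contextlib.ExitStack`, which unwinds registered callbacks in reverse order if a later step raises. The names `Resource` and `build_bundle` below are illustrative only and do not mirror the real `build_launcher_bundle()` signature.

```python
from contextlib import ExitStack


class Resource:
    """Hypothetical closeable resource that records when it is closed."""

    def __init__(self, name, log, fail=False):
        if fail:
            raise RuntimeError(f"{name} failed to build")
        self.name = name
        self._log = log

    def close(self):
        self._log.append(self.name)


def build_bundle(log, fail_on=None):
    """Build resources in order; on failure, close the ones already built."""
    with ExitStack() as stack:
        built = {}
        for name in ("control_plane", "zinc_plane", "venue", "kernel"):
            res = Resource(name, log, fail=(name == fail_on))
            # Registered only after a successful build, so a failed
            # constructor never gets a close() call.
            stack.callback(res.close)
            built[name] = res
        # Success: move the callbacks to a new stack owned by the caller,
        # so leaving this `with` block closes nothing.
        return built, stack.pop_all()
```

On success the caller receives a second `ExitStack` and closes it at shutdown; on failure the partially built resources are closed in reverse construction order before the exception propagates.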
+
+No cleanup is guaranteed on exception:
+- `build_launcher_bundle()`: no cleanup on failure
+- `process_intent()`: no cleanup of partial slot state on venue event exception
+- `on_venue_event()`: no cleanup on FFI failure
+- `_set_slot()`: no cleanup on projection or Zinc write failure
+
+**Severity: High** (across all layers)
+
+---
+
+## Pass 4 Summary
+
+| # | Flaw | Layer | Severity |
+|---|------|-------|----------|
+| G1 | EXIT_RESIDUAL action missing from Rust KernelCommandType | Rust | **Critical** |
+| G2 | `into_c_string` unwrap() panics on NUL byte | Rust | **Critical** |
+| G3 | EXIT hardcodes prev_state=POSITION_OPEN, allows backward FSM transition | Rust | **Critical** |
+| G4 | `consume_exit_leg` stale `all_legs_done` variable: wrong branch after last leg | Rust | **Critical** |
+| G5 | `realized_pnl` unbounded f64 overflow to inf | Rust | **High** |
+| G6 | `mark_price` unbounded unrealized_pnl, no result guard | Rust | **High** |
+| G7 | ENTER no is_finite() guard on target_size | Rust | **High** |
+| G8 | `reconcile_slots_json` no dedup or bounds validation | Rust | **High** |
+| G9 | `exchange_order_id` update targets wrong order, exit cancel broken | Rust | **High** |
+| G10 | CANCEL diagnostic always says NO_ACTIVE_EXIT_ORDER | Rust | **High** |
+| G11 | `apply_fill` overwrites intended_size with slot.size | Rust | Medium |
+| G12 | No max leverage cap enforced by kernel | Rust | Medium |
+| G13 | `resolve_slot` fallback returns unwrap_or(0), misroutes events | Rust | Medium |
+| G14 | `commit_slot` silently ignores out-of-bounds slot_id | Rust | Medium |
+| G15 | Zero `__post_init__` validators on all config dataclasses | Config | **High** |
+| G16 | DITA_V2_DEBUG_CLICKHOUSE defaults to True when unset | Config | Info |
+| G17 | String config fields: Zinc region injection risk | Config | Medium |
+| G18 | `exit_leg_ratios` no sum-to-1 validation | Config | Low |
+| G19 | RealZincControlPlane.read() no sequence check: torn-read risk | Config | Low |
+| G20 | ClickHouse journal strategy/db env vars: SQL injection risk | Config | Low |
+| G21 | entry_price used as exit_price in trade_events: data loss | Persistence | **High** |
+| G22 | active_leg_index → entry_bar semantic mis-mapping | Persistence | Medium |
+| G23 | capital_before arithmetic absorbs cross-slot PnL | Persistence | Medium |
+| G24 | Recovery trade_reconstruction always has trade_id="" | Persistence | Medium |
+| G25 | seen_event_ids, exit_leg_ratios, VenueOrder, metadata not in flat CH tables | Persistence | Low |
+| G26 | _safe_float silently converts NaN/None/Inf to 0.0 | Persistence | Low |
+| G27 | build_launcher_bundle no exception safety: prior resources leak | Lifecycle | **High** |
+| G28 | RealZincPlane/RealZincControlPlane no __del__: SHM orphaned | Lifecycle | **High** |
+| G29 | Zero signal handlers: no cleanup on SIGTERM/SIGINT | Lifecycle | **High** |
+| G30 | ExecutionKernel has no close(), relies on __del__ for Rust handle | Lifecycle | **High** |
+| G31 | Hazelcast projection never closed | Lifecycle | Medium |
+| G32 | _maybe_close() break skips second method | Lifecycle | Low |
+| G33 | close() not idempotent for RealZinc components | Lifecycle | Low |
+| G34 | No context manager on DITAv2LauncherBundle | Lifecycle | Low |
+| G35 | BingxVenueAdapter.connect() never called | Lifecycle | Info |
+| G36 | Only one try/finally in entire codebase | Lifecycle | **High** |
+
+### Pass 4 Severity Distribution
+
+| Severity | Count |
+|----------|-------|
+| **Critical** | 4 (G1, G2, G3, G4) |
+| **High** | 13 (G5-G10, G15, G21, G27, G28, G29, G30, G36) |
+| Medium | 9 (G11-G14, G17, G22, G23, G24, G31) |
+| Low | 8 (G18, G19, G20, G25, G26, G32, G33, G34) |
+| Info | 2 (G16, G35) |
+
+### Combined Catalog (All 4 Passes)
+
+| Pass | Focus | Count | Critical | High | Medium | Low | Info |
+|------|-------|-------|----------|------|--------|-----|------|
+| A | Architectural | 15 | 0 | 2 | 0 | 2 | 11 |
+| T | Threading/Atomicity | 9 | 1 | 3 | 3 | 2 | 0 |
+| E | E2E Trace | 26 | 0 | 4 | 10 | 11 | 1 |
+| F | Deep E2E (Pass 3) | 30 | 0 | 1 | 8 | 17 | 4 |
+| G | Domain Scans (Pass 4) | 36 | 4 | 13 | 9 | 8 | 2 |
+| **Total** | | **116** | **5** | **23** | **30** | **40** | **18** |