PINK: E2E trace analysis — Pass 4 domain scans (G1-G36)

Four systematic passes covering Rust kernel invariants (4 criticals — missing
EXIT_RESIDUAL action, unwrap() panic on NUL, backward FSM transition, stale
all_legs_done variable), config validation chain (zero validators on 127 fields),
persistence schema drift (7 confirmed field-level mismatches), and lifecycle
management (no signal handlers, no __del__, no exception safety in builder).

Co-authored-by: CommandCodeBot <noreply@commandcode.ai>
Codex
2026-06-01 14:26:36 +02:00
parent d475e9246b
commit d9dd54c24e

@@ -1549,3 +1549,622 @@ undo, no validation that the slot reached a consistent state.
This is the single highest-impact E2E flaw because it requires no concurrency,
no race condition, no unusual market conditions — just a transient FFI error
during normal operation.
---
## PASS 4 — SYSTEMATIC DOMAIN SCANS (Config, Rust, Persistence, Lifecycle)
### Rust Kernel — Numeric & FSM Invariants
#### G1: EXIT_RESIDUAL action is entirely missing from Rust KernelCommandType
**File:** `_rust_kernel/src/lib.rs`
```rust
string_enum! {
    enum KernelCommandType {
        ENTER, EXIT, MARK_PRICE, RECONCILE, CONTROL, CANCEL,
    }
}
```
Six variants. **No `EXIT_RESIDUAL`.** If any caller submits an intent with `action = "EXIT_RESIDUAL"`, the string_enum deserializer fails — serde returns `INVALID_INTENT_PARSE`. Even if deserialization worked, there's no branch to handle residual-position cleanup. Any position with remaining size after partial exit legs has **no way to trigger a clean-up exit** via the intent system.
The Python `KernelCommandType` enum (contracts.py) does have `EXIT_RESIDUAL`, translated to `"EXIT_RESIDUAL"` string by `_intent_to_payload`. This string hits Rust's string_enum → parse error → `INVALID_INTENT_PARSE`.
**Fix:** Add `EXIT_RESIDUAL` variant to Rust enum + match arm that skips the `NO_OPEN_POSITION` guard for residual-sized positions.
**Severity: Critical**
#### G2: `into_c_string` uses `unwrap()` — panics on interior NUL byte
**File:** `_rust_kernel/src/lib.rs:1477`
```rust
fn into_c_string(value: &str) -> *mut c_char {
    CString::new(value).unwrap().into_raw()
}
```
`CString::new()` returns `Err` if the string contains a NUL (`'\0'`) byte. `.unwrap()` panics at the C FFI boundary. If any `serde_json::to_string()` output (e.g., user-controlled string in `KernelIntent`, `VenueEvent`, or `TradeSlot`) contains a NUL byte, this **panics the entire process**.
Triggered by every FFI call that returns a string:
- `dita_kernel_process_intent_json`
- `dita_kernel_on_venue_event_json`
- `dita_kernel_reconcile_slots_json`
- `dita_kernel_snapshot_json`
- `dita_kernel_get_slot_json`
**Fix:** Replace `.unwrap().into_raw()` with `.map(CString::into_raw).unwrap_or(ptr::null_mut())` (callers must then handle a null pointer), or feed through `invalid_intent_cstring`.
**Severity: Critical**
#### G3: `process_intent` EXIT hardcodes `prev_state = POSITION_OPEN` unconditionally
**File:** `_rust_kernel/src/lib.rs:842-890`
```rust
slot.fsm_state = TradeStage::EXIT_REQUESTED; // unconditional override
let transition = self.transition(
    &slot,
    TradeStage::POSITION_OPEN, // always POSITION_OPEN
    slot.fsm_state.clone(),
    "EXIT_INTENT",
);
```
Three problems:
(a) **Transition prev_state is a lie.** If the slot was in `EXIT_WORKING`, `EXIT_SENT`, `EXIT_REQUESTED`, or `POSITION_PARTIALLY_CLOSED`, the transition record says `POSITION_OPEN` — wrong.
(b) **Backward transition.** If the slot is `EXIT_WORKING` and a new EXIT intent arrives, `fsm_state` is set to `EXIT_REQUESTED`, a backward transition from `EXIT_WORKING` → `EXIT_REQUESTED`. This corrupts the FSM.
(c) **No state guard.** EXIT should only be allowed from `POSITION_OPEN`, `EXIT_WORKING` (for additional legs), or `POSITION_PARTIALLY_CLOSED`. Currently any state that passes `!is_free() && !closed && size > 0` can transition to `EXIT_REQUESTED`.
**Fix:** Check actual FSM state before allowing EXIT, log actual prev_state, guard against backward transitions.
**Severity: Critical**
#### G4: `consume_exit_leg` advances beyond last valid index — stale `all_legs_done` variable
**File:** `_rust_kernel/src/lib.rs:1420-1435`
```rust
let all_legs_done = slot.active_leg_index >= slot.exit_leg_ratios.len(); // (A)
let should_close = slot.size <= 1e-12 || (!partial && all_legs_done); // (B)
if !partial {
    slot.consume_exit_leg(); // (C): advances active_leg_index after (A)
}
if should_close && slot.size <= 1e-12 { // (D): close
    // ...
} else if !partial && !all_legs_done { // (E): stale, uses (A) rather than the post-advance index
    // ...
}
```
On the last leg (`active_leg_index = len - 1`):
- (A): `all_legs_done = false` (pre-advance)
- (C): advances to `len` (exhausted)
- (E): `!partial && !false` = true → enters `POSITION_OPEN` instead of examining `should_close` with post-advance index
The `all_legs_done` variable is captured **before** `consume_exit_leg` advances the index. Branch (E) should use the post-advance index to correctly detect exhaustion.
After exhaustion, `next_exit_ratio()` returns `1.0` (out-of-bounds `unwrap_or(1.0)`) — silently tries to exit remaining size as 100% instead of detecting completion.
**Severity: Critical**
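The ordering problem can be reproduced in a few lines. The following is a minimal Python model of the Rust branch logic, not the kernel code itself: `SlotModel`, the two functions, and the returned stage labels (`UNCHANGED`, `LEGS_EXHAUSTED_WITH_RESIDUAL`) are hypothetical names introduced for illustration. It only shows how evaluating exhaustion before versus after `consume_exit_leg` changes the branch taken on the last leg.

```python
from dataclasses import dataclass


@dataclass
class SlotModel:
    """Hypothetical stand-in for the Rust TradeSlot; only the fields G4 touches."""
    size: float
    exit_leg_ratios: list
    active_leg_index: int = 0

    def consume_exit_leg(self) -> None:
        self.active_leg_index += 1


def next_stage_stale(slot: SlotModel, partial: bool) -> str:
    # (A) captured before the index advances, as in the excerpt above
    all_legs_done = slot.active_leg_index >= len(slot.exit_leg_ratios)
    should_close = slot.size <= 1e-12 or (not partial and all_legs_done)
    if not partial:
        slot.consume_exit_leg()  # (C)
    if should_close and slot.size <= 1e-12:
        return "CLOSED"
    if not partial and not all_legs_done:  # (E) still sees the pre-advance value
        return "POSITION_OPEN"
    return "UNCHANGED"


def next_stage_fresh(slot: SlotModel, partial: bool) -> str:
    if not partial:
        slot.consume_exit_leg()
    # Evaluate exhaustion only after the index has advanced.
    all_legs_done = slot.active_leg_index >= len(slot.exit_leg_ratios)
    if slot.size <= 1e-12:
        return "CLOSED"
    if not partial and not all_legs_done:
        return "POSITION_OPEN"
    if not partial and all_legs_done:
        return "LEGS_EXHAUSTED_WITH_RESIDUAL"  # hypothetical label
    return "UNCHANGED"
```

With two legs and the cursor on the last one, the stale version reports `POSITION_OPEN` while the post-advance version detects exhaustion with residual size. How the kernel should react to that state is a design decision this sketch does not settle.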
#### G5: `realized_pnl` uses unbounded f64 — overflows to inf at extreme values
**File:** `_rust_kernel/src/lib.rs:648-656`
```rust
let notional = exit_size * slot.entry_price * slot.leverage.max(1.0);
delta * notional
```
No `is_finite()` check on intermediate products. At `exit_price=1e200`, `entry_price=1e-200`: `delta` = `(1e200 - 1e-200) / 1e-200` ≈ `1e400`, which exceeds the f64 range and becomes `inf`. The resulting `inf` is stored in `slot.realized_pnl`, corrupting all future PnL tracking.
Subnormals: `entry_price=5e-324` (the smallest subnormal) makes the division overflow to `inf` even for modest exit prices.
**Fix:** Add `is_finite()` guards on both prices and cap intermediate products.
**Severity: High**
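The guard suggested in G5 and G6 can be modeled as follows. This is a Python sketch of the arithmetic quoted above, not the Rust implementation: the function name is hypothetical, the long-side sign convention is an assumption, and returning `None` stands in for whatever diagnostic the kernel would emit.

```python
import math
from typing import Optional


def checked_realized_pnl(entry_price: float, exit_price: float,
                         exit_size: float, leverage: float) -> Optional[float]:
    """Return the leg PnL, or None when any input or intermediate is non-finite."""
    inputs = (entry_price, exit_price, exit_size, leverage)
    if not all(math.isfinite(x) for x in inputs) or entry_price <= 0.0:
        return None
    delta = (exit_price - entry_price) / entry_price  # long-side return (assumed)
    notional = exit_size * entry_price * max(leverage, 1.0)
    pnl = delta * notional
    # Inputs can all be finite while an intermediate product overflows.
    if not (math.isfinite(delta) and math.isfinite(notional) and math.isfinite(pnl)):
        return None
    return pnl
```

The point of checking the intermediates as well as the inputs is that both extreme-value cases in G5 start from finite prices and only become `inf` after the division.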
#### G6: `mark_price` produces unbounded `unrealized_pnl`
**File:** `_rust_kernel/src/lib.rs:384-399`
```rust
self.unrealized_pnl = delta * self.size * self.entry_price * self.leverage;
// No is_finite() check on result
```
If any of `delta`, `size`, `entry_price`, or `leverage` is extreme, the product overflows to `inf`. No result guard. `inf` stored in `unrealized_pnl` forever. Capped only by the `price <= 0.0` guard on input — no guard on the computation chain.
Also: `self.entry_price = price` at line 388 overwrites entry_price on every mark_price call for a position with `entry_price <= 0.0`, even when the position has been open for a while. This means a stale-zero entry_price gets set to the current market price on first mark_price after open, which is correct — but if the slot is reused (re-entry without resetting entry_price), the old entry price from the prior trade bleeds into unrealized PnL.
**Severity: High**
#### G7: `process_intent` ENTER — no `is_finite()` guard on `target_size`
**File:** `_rust_kernel/src/lib.rs:806-807`
```rust
intended_size: intent.target_size.max(0.0),
```
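`f64::INFINITY.max(0.0)` returns `inf`, so an infinite `target_size` passes straight through. `f64::NAN.max(0.0)` returns `0.0`, because Rust's `f64::max` returns the non-NaN operand, so a NaN is silently converted into a zero-size order instead of being rejected. `NaN` and `Infinity` are not valid JSON tokens and serde_json rejects them when parsing, but Python's `json.dumps` emits them by default (`allow_nan=True`), so the two sides disagree about what a legal payload is. If the Python-side `_first_invalid_intent_field` guard is bypassed (F3: it allows these through), the kernel itself has no finiteness check to stop a non-finite `intended_size` from reaching `VenueOrder` and corrupting all fill calculations.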
Similarly, `reference_price` is never validated for finiteness before being stored in `VenueOrder.metadata`.
**Severity: High**
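On the Python side of the FFI boundary, one way to keep non-finite floats out of the payload is to serialize strictly. The sketch below assumes the intent payload is a plain `dict` serialized with the standard `json` module; `encode_intent_strict`, `first_non_finite`, and the field list are hypothetical helpers, not functions from the codebase.

```python
import json
import math


def encode_intent_strict(payload: dict) -> str:
    """Serialize for the FFI call, refusing NaN/Infinity instead of emitting them."""
    # allow_nan=False makes json.dumps raise ValueError on non-finite floats.
    return json.dumps(payload, allow_nan=False)


def first_non_finite(payload: dict,
                     fields=("target_size", "reference_price", "leverage")):
    """Return the first listed field whose value is a non-finite float, else None."""
    for name in fields:
        value = payload.get(name)
        if isinstance(value, float) and not math.isfinite(value):
            return name
    return None
```

By default `json.dumps` writes `NaN` and `Infinity` as bare tokens, so a strict encoder turns a silent pass-through into an immediate `ValueError` at the caller. A matching `is_finite()` check in the Rust kernel would still be needed for defense in depth.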
#### G8: `reconcile_slots_json` — no dedup or bounds validation
**File:** `_rust_kernel/src/lib.rs:1668-1675`
```rust
for slot in slots {
    if slot.slot_id < core.slots.len() {
        core.slots[slot.slot_id] = slot.clone();
    }
}
```
Two slots with the same `slot_id`: the **second overwrites the first** silently. A slot with `slot_id >= core.slots.len()`: **silently dropped** — no error, no diagnostic. Caller sees `accepted=true` even if some/all slots were not applied.
**Severity: High**
#### G9: `exchange_order_id` propagation uses wrong order target
**File:** `_rust_kernel/src/lib.rs:1110-1125`
```rust
let target = if slot.active_entry_order.is_some() {
    slot.active_entry_order.as_mut()
} else {
    slot.active_exit_order.as_mut()
};
```
If an **entry** order exists (even if fully filled) and an **exit** fill event arrives, the code updates the entry order's `venue_order_id` instead of the exit order's. The exit order's `venue_order_id` stays empty. Any subsequent `CANCEL` intent on the exit order fails because `active_exit_order.venue_order_id` is empty — the venue can't match the cancel.
**Fix:** Disambiguate by matching `venue_client_id`, or clear `active_entry_order` when entry is complete.
**Severity: High**
#### G10: CANCEL diagnostic code says NO_ACTIVE_EXIT_ORDER for entry cancel too
**File:** `_rust_kernel/src/lib.rs:966-1005`
```rust
if !has_cancellable_exit && !has_cancellable_entry {
    return KernelResult {
        diagnostic_code: KernelDiagnosticCode::NO_ACTIVE_EXIT_ORDER, // always says exit
        details: json!({"reason": "NO_ACTIVE_EXIT_ORDER"}),
    };
}
```
When neither exit nor entry is cancellable, the diagnostic returns `NO_ACTIVE_EXIT_ORDER` regardless of which order was the target. If the user wanted to cancel an entry order that's not in a cancellable state, the diagnostic is misleading.
**Fix:** Separate diagnostic codes: `NO_ACTIVE_EXIT_ORDER`, `NO_ACTIVE_ENTRY_ORDER`, `ENTRY_NOT_CANCELLABLE`.
**Severity: High**
#### G11: `apply_fill` entry-fill overwrites `active_entry_order.intended_size` with `slot.size`
**File:** `_rust_kernel/src/lib.rs:1363-1377`
On FULL_FILL entry, `slot.active_entry_order` is entirely replaced with a new `VenueOrder` where `intended_size = slot.size` (the fill amount) instead of the original intended size. The original intended size (which could be larger than fill size for partial fills) is lost.
If a duplicate fill event arrives (dedup fails due to missing event_id), the second fill would use `slot.size` as the basis for further fills — wrong values.
**Severity: Medium**
#### G12: `leverage` unbounded after `is_finite()` — no maximum cap
**File:** `_rust_kernel/src/lib.rs:778`
```rust
slot.leverage = if intent.leverage.is_finite() && intent.leverage > 0.0 {
    intent.leverage // 1e100 accepted here
} else {
    1.0
};
```
`leverage = 1e100` passes `is_finite()`. Feeds into `realized_pnl()` as `slot.leverage.max(1.0) = 1e100`, producing `notional = exit_size * entry_price * 1e100`. Makes `unrealized_pnl` arbitrarily large.
No maximum leverage cap enforced anywhere — the exchange-level cap (`DOLPHIN_BINGX_EXCHANGE_LEVERAGE_CAP`) exists in `BingxExecClientConfig` but is **never passed to the Rust kernel**.
**Severity: Medium**
#### G13: `resolve_slot` fallback returns `unwrap_or(0)` — can misroute events
**File:** `_rust_kernel/src/lib.rs:623`
```rust
self.slots.first().map(|slot| slot.slot_id).unwrap_or(0)
```
When no slot matches the event (`slot_id` out of range or all slot filters fail), returns `slot_id` of the **first slot** (which may be 0 or any value). No diagnostic emitted — caller sees slot state change with no idea the event was misrouted.
**Severity: Medium**
#### G14: `commit_slot` silently ignores out-of-bounds slot_id
**File:** `_rust_kernel/src/lib.rs:595-600`
```rust
fn commit_slot(&mut self, slot: TradeSlot) {
    let slot_id = slot.slot_id; // copy the index before `slot` is moved
    if slot_id < self.slots.len() {
        self.slots[slot_id] = slot;
    }
    // else: silently dropped, no error returned
}
```
Mutations to out-of-bounds slot are silently discarded. Can happen if `slot.slot_id` is corrupted via `set_slot_from_json` causing index mismatch between `slot.slot_id` and the actual slot position.
**Severity: Medium**
---
### Configuration & Validation Chain
#### G15: Zero `__post_init__` validators on all config dataclasses
Every config dataclass in the system has zero field-level validation:
| Dataclass | Fields | Validators |
|-----------|--------|------------|
| `KernelControlSnapshot` | 16 | **0** |
| `ControlUpdate` | 16 | **0** |
| `KernelIntent` | 19 | **0** |
| `TradeSlot` | 22 | **0** |
| `VenueOrder` | 8 | **0** |
| `VenueEvent` | 18 | **0** |
| `KernelTransition` | 11 | **0** |
| `KernelOutcome` | 8 | **0** |
| `AccountSnapshot` | 9 | **0** |
| **Total** | **127** | **0** |
The only validation in the entire chain:
- `_first_invalid_intent_field()` — finiteness guard at Python→Rust FFI boundary (not a dataclass validator)
- Rust `leverage = if is_finite && > 0.0 { val } else { 1.0 }` — post-hoc clamp
- Rust `KernelCore::new(max_slots.max(1))` — floor only, no ceiling
- `launcher.py:143`: `max(1, int(...))` for `active_slot_limit` — floor only
**No `__post_init__` exists anywhere. No bounds check on any field except the two floor-only guards.**
**Severity: High**
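As an illustration of what field-level validation could look like, here is a minimal sketch using `__post_init__`. `IntentFields` is a hypothetical subset of `KernelIntent`, the `max_leverage` ceiling of 125 is an assumed value, and the sum-to-one rule for `exit_leg_ratios` follows the reading in G18 rather than any confirmed contract.

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class IntentFields:
    """Hypothetical subset of KernelIntent; field names follow the findings above."""
    target_size: float
    leverage: float = 1.0
    exit_leg_ratios: Tuple[float, ...] = (1.0,)
    max_leverage: float = 125.0  # assumed ceiling, not taken from the source

    def __post_init__(self) -> None:
        for name in ("target_size", "leverage"):
            value = getattr(self, name)
            if not math.isfinite(value):
                raise ValueError(f"{name} must be finite, got {value!r}")
        if self.target_size < 0.0:
            raise ValueError("target_size must be >= 0")
        if not 0.0 < self.leverage <= self.max_leverage:
            raise ValueError(f"leverage must be in (0, {self.max_leverage}]")
        if not self.exit_leg_ratios or any(
            not math.isfinite(r) or r <= 0.0 for r in self.exit_leg_ratios
        ):
            raise ValueError("exit_leg_ratios must be non-empty, positive and finite")
        if not math.isclose(sum(self.exit_leg_ratios), 1.0, abs_tol=1e-9):
            raise ValueError("exit_leg_ratios must sum to 1.0")
```

Constructing `IntentFields(target_size=1.0, leverage=1e100)` then fails at the dataclass boundary instead of reaching the Rust kernel, which is the behavior G12 and G15 find missing.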
#### G16: `DITA_V2_DEBUG_CLICKHOUSE` defaults to `True` when env var is unset
**File:** `launcher.py:133`
```python
debug = _env_bool("DITA_V2_DEBUG_CLICKHOUSE", True)
```
`_env_bool` (launcher.py:75) returns `default` when the env var is unset. So `debug = True` by default. Every runtime writes debug traces to ClickHouse by default. `DITA_V2_DEBUG_CLICKHOUSE=False` is required to disable it.
This is not a bug per se, but it means debug ClickHouse writes are **on by default**, adding ~10 ClickHouse insertions per process_intent call (every transition + position state + trade event) that most production deployments may not want.
**Severity: Informational**
#### G17: String config fields have no charset/length validation — Zinc region injection risk
**File:** `control.py:31-53`, `real_zinc_plane.py:30`
`runtime_namespace`, `strategy_namespace`, `event_namespace`, `actor_name`, `exec_venue`, `data_venue`, `ledger_authority` are all free-form strings with no validation. They're used as:
1. **Zinc shared memory region names**: `self.prefix + "." + namespace + "." + kind` — an attacker-controlled namespace could collide with other processes' Zinc regions
2. **ClickHouse table names**: `DOLPHIN_BINGX_JOURNAL_STRATEGY` is used as a table suffix — SQL injection risk in ClickHouse journal
3. **Hazelcast map names**: Same injection risk via `event_namespace`
**Severity: Medium**
#### G18: `exit_leg_ratios` no sum-to-1 validation
`KernelIntent.exit_leg_ratios` and `TradeSlot.exit_leg_ratios` are tuple/list of floats. No validator ensures they sum to approximately 1.0. Ratios summing to 0.5 leave the position partially closed forever (residual can't be exited because `next_exit_ratio()` returns `1.0` after exhaustion, exiting 100% of remaining — which may exceed the intended residual).
**Severity: Low**
#### G19: `RealZincControlPlane.read()` has no sequence check — torn-read risk
**File:** `real_control_plane.py:88-94`
```python
def read(self):
    payload = _decode_packet(self.region.as_buffer())
    control = payload.get("control")
    if not isinstance(control, dict):
        return self._snapshot
    self._snapshot = KernelControlSnapshot(**control)
    return self._snapshot
```
The binary packet has a 64-bit sequence number but `read()` **never checks it**. Between the zero-write and packet-write in `_write_region`, a reader sees an empty buffer → `_decode_packet` fails → falls back to `self._snapshot` (stale). Between the packet-write and `struct.pack` header (order depends on implementation), a reader sees a partial write with wrong size → `_decode_packet` fails.
No checksum on the wire format: `struct.pack("!QQ", seq, len) + json_bytes`. A torn write produces garbage that `json.loads` may or may not parse successfully.
**Severity: Low**
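A reader-side sequence check could look roughly like the sketch below. It assumes the wire format quoted above (`struct.pack("!QQ", seq, len)` followed by JSON bytes) and that the writer changes `seq` on every write; `read_consistent` and the retry count are hypothetical, and this is not the project's `_decode_packet`.

```python
import json
import struct

HEADER = struct.Struct("!QQ")  # (sequence, payload length), as in the finding


def read_consistent(buf, max_attempts=3):
    """Return (seq, payload) if a stable packet was read, else None."""
    for _ in range(max_attempts):
        if len(buf) < HEADER.size:
            return None
        seq_before, length = HEADER.unpack_from(buf, 0)
        end = HEADER.size + length
        if length == 0 or end > len(buf):
            continue  # zeroed or implausible header: treat as a write in progress
        body = bytes(buf[HEADER.size:end])  # copy before re-checking the header
        if HEADER.unpack_from(buf, 0) != (seq_before, length):
            continue  # header changed while copying: torn read, retry
        try:
            return seq_before, json.loads(body)
        except ValueError:
            continue  # covers JSON and UTF-8 decode errors
    return None
```

Returning the sequence number lets the caller distinguish a fresh snapshot from a repeated one. A header re-check alone cannot catch every interleaving, so a checksum over the body would be a stronger option if the writer can be changed.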
#### G20: `DOLPHIN_BINGX_JOURNAL_STRATEGY`/`_DB` — ClickHouse SQL injection risk
**File:** `launcher.py:202-203`
```python
"DOLPHIN_BINGX_JOURNAL_STRATEGY": os.environ.get("DOLPHIN_BINGX_JOURNAL_STRATEGY", ""),
"DOLPHIN_BINGX_JOURNAL_DB": os.environ.get("DOLPHIN_BINGX_JOURNAL_DB", ""),
```
These are used as ClickHouse table and database name suffixes in `pink_clickhouse.py`. An attacker who can set env vars can inject SQL via semicolons or quotes in the table name. ClickHouse supports `INSERT INTO db.table FORMAT JSONEachRow` — a table name like `positions; DROP TABLE ...;` could be destructive.
**Severity: Low** (requires env var control, which implies broader access)
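For both G17 and G20, an allowlist check at the point where the string is read is usually enough. The following is a sketch under assumptions: the pattern (ASCII identifier, at most 63 characters) is a conservative guess rather than a documented ClickHouse or Zinc limit, and `validate_identifier` and `env_identifier` are hypothetical helpers.

```python
import re

# Conservative allowlist: letters, digits, underscore; must not start with a digit.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]{0,62}")


def validate_identifier(value: str, what: str) -> str:
    """Return value unchanged if it is a safe identifier, else raise ValueError."""
    if not isinstance(value, str) or _IDENTIFIER.fullmatch(value) is None:
        raise ValueError(f"{what} must match {_IDENTIFIER.pattern}, got {value!r}")
    return value


def env_identifier(environ, key: str) -> str:
    """Read an optional identifier from environ; empty or unset stays empty."""
    raw = environ.get(key, "")
    return validate_identifier(raw, key) if raw else ""
```

A call such as `env_identifier(os.environ, "DOLPHIN_BINGX_JOURNAL_STRATEGY")` would then reject values containing semicolons, quotes, dots, or whitespace before they are interpolated into a table or region name.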
---
### Persistence Schema Alignment
#### G21: `entry_price` used as `exit_price` in `trade_events` — data loss
**File:** `pink_clickhouse.py (outside workspace)`
The `_write_trade_event` function maps `entry_price` from `slot.to_dict()` to both the `entry_price` and `exit_price` columns. The actual exit fill price (available on the `VenueEvent` object) is **never written** to the `exit_price` column.
**Result:** Every `trade_events` row has `exit_price == entry_price`. The `exit_price` column is a dead column — always contains the entry price, never the actual fill.
**Severity: High** — data loss to DB for the most important trade metric.
#### G22: `active_leg_index` → `entry_bar` semantic mis-mapping
**File:** `pink_clickhouse.py (outside workspace)`
```python
"entry_bar": int(slot_dict.get("active_leg_index", 0) or 0),
```
`active_leg_index` tracks the exit-leg-ratios cursor (which leg of a multi-leg exit we're on), not a bar count. The value `0` at position open and `1` after the first exit leg — neither value represents bars held. **The `entry_bar` column stores the wrong concept.**
**Severity: Medium** — column contains semantically meaningless data.
#### G23: `capital_before` arithmetic reconstruction absorbs cross-slot PnL
**File:** `pink_clickhouse.py (outside workspace)`
```python
capital_before = capital_after - pnl_leg
```
`capital_before` is reconstructed by subtracting the current leg's PnL from the current capital. In a multi-slot system, other slots' PnL changes between legs are absorbed into `capital_before`. The column is **always wrong** in multi-slot scenarios because `capital_after` reflects total PnL from all slots, not just the leg being recorded.
**Severity: Medium** — wrong `capital_before` for multi-slot trading.
#### G24: Recovery `trade_reconstruction` always has `trade_id=""`
**File:** `pink_clickhouse.py (outside workspace)`
The `persist_recovery_state` function passes `kernel.snapshot()["account"]` (an account dict with keys `capital, equity, realized_pnl, ...`) where a slot dict is expected. The `trade_id` key **does not exist** on the account dict. The `recovery_state` row always has `trade_id=""`.
**Severity: Medium** — recovery data is not associable with any trade.
#### G25: `seen_event_ids`, `exit_leg_ratios`, `VenueOrder`, `metadata` not in flat ClickHouse tables
These fields are:
- Present on the Python `TradeSlot` ✅
- Transmitted through Zinc shared memory ✅
- Stored in Hazelcast ✅
- Stored in ClickHouse `dita_kernel_debug` (full JSON) ✅
- **NOT extracted** into main ClickHouse flat tables `position_state`, `trade_events`, `trade_exit_legs` ❌
Data exists at the source, travels through the pipeline, hits the debug journal — but is lost in the main analytical tables.
**Severity: Low** (data exists in debug journal if needed for reconstruction)
#### G26: `_safe_float` silently converts NaN/None/Inf to 0.0
**File:** `utils.py:15`
```python
def _safe_float(v, default=0.0):
    try:
        f = float(v)
        if not math.isfinite(f):
            return default
        return f
    except (TypeError, ValueError, OverflowError):
        return default
```
Used in multiple ClickHouse writers. Silently converts `NaN`/`Inf`/parsing errors to `0.0`. No diagnostic emitted when a non-finite value reaches the persistence layer — data silently zeroed.
**Severity: Low** (safe default but silent corruption)
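A variant that keeps the safe default but leaves a trace might look like this. It is a sketch only: the logger name, the `field` parameter, and the function name `safe_float_logged` are assumptions, and the real writers may prefer a counter or a diagnostic row over a log line.

```python
import logging
import math

log = logging.getLogger("persistence.safe_float")  # hypothetical logger name


def safe_float_logged(v, default=0.0, field="?"):
    """Like _safe_float, but emit a warning whenever the default is substituted."""
    try:
        f = float(v)
    except (TypeError, ValueError, OverflowError):
        log.warning("non-numeric value for %s: %r replaced by %r", field, v, default)
        return default
    if not math.isfinite(f):
        log.warning("non-finite value for %s: %r replaced by %r", field, v, default)
        return default
    return f
```

Passing the column name as `field` makes it possible to tell from the logs which persisted value was zeroed and why.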
---
### Lifecycle & Resource Management
#### G27: `build_launcher_bundle` has no exception safety — prior resources leak
**File:** `launcher.py:264-300`
```python
def build_launcher_bundle(...):
    control_plane = _build_control_plane(...)
    projection = build_projection(...)
    zinc_plane = _build_zinc_plane(...)
    venue = _build_venue(...)
    kernel = ExecutionKernel(...)  # ← if THIS fails, everything above leaks
```
If any step after the first raises, all previously built resources leak:
- `RealZincPlane` created → `_build_venue()` fails → 3 shared memory regions orphaned
- `RealZincControlPlane` created → `_build_zinc_plane()` fails → 1 shared memory region orphaned
- `BingxVenueAdapter` created → `ExecutionKernel.__init__()` fails → HTTP connection leaked
**No `try/finally` anywhere in the builder.** The init order is also optimized for forward construction, not backward cleanup.
**Severity: High** — shared memory leak on any build failure.
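One standard-library way to make a multi-step builder exception-safe is `contextlib.ExitStack`. The sketch below is generic: `build_all` and the idea of passing zero-argument factories are illustrative assumptions, and each built resource is assumed to expose a `close()` method.

```python
from contextlib import ExitStack


def build_all(factories):
    """Build resources in order; if one factory raises, close the earlier ones."""
    with ExitStack() as stack:
        built = []
        for factory in factories:
            resource = factory()
            stack.callback(resource.close)  # registered only after a successful build
            built.append(resource)
        stack.pop_all()  # success: drop the callbacks so nothing is closed here
        return built
```

If the third factory raises, the first two resources are closed in reverse construction order before the exception propagates. On success, `pop_all()` hands ownership to the caller, who remains responsible for closing the bundle.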
#### G28: `RealZincPlane` and `RealZincControlPlane` have no `__del__`
When `close()` is not called (exception in builder, forgotten cleanup, GC during shutdown), the shared memory regions opened by `RealZincPlane` (3 regions) and `RealZincControlPlane` (1 region) are **orphaned on the OS**. They persist in `/dev/shm/` (or platform equivalent) until system reboot.
Python's `__del__` is unreliable (not called on SIGKILL, not called if the object is part of a cycle without a GC run), but its absence means even normal garbage collection can't clean up.
**Severity: High** — shared memory leaks.
#### G29: Zero signal handlers — no cleanup on SIGTERM/SIGINT
```bash
$ grep -rn "signal\|SIGTERM\|SIGINT\|atexit" *.py # ZERO matches
```
When SIGTERM or SIGINT arrives:
1. SIGTERM's default disposition terminates the process immediately; SIGINT raises `KeyboardInterrupt` in the main thread, and nothing catches it to run cleanup
2. No `DITAv2LauncherBundle.close()` is called
3. No `ExecutionKernel.__del__` is called (CPython may run GC on normal exit but not reliably)
4. All shared memory (RealZincPlane, RealZincControlPlane) is orphaned
5. In-flight BingX HTTP calls are interrupted mid-stream
6. Rust kernel handle is leaked
**Severity: High**
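A minimal way to get cleanup on SIGTERM and SIGINT is to convert the signal into `SystemExit` and rely on `try/finally`. This is a sketch, not the launcher's code: `make_shutdown_handler`, `install_shutdown_handlers`, and `run` are hypothetical names, and `close` stands for something like the bundle's `close()` method.

```python
import signal


def make_shutdown_handler():
    """Return a handler that turns a termination signal into SystemExit."""
    def _handler(signum, frame):
        # Raising SystemExit unwinds the main thread, so `finally` blocks and
        # atexit hooks run instead of the process dying with no cleanup.
        raise SystemExit(128 + signum)
    return _handler


def install_shutdown_handlers():
    # signal.signal must be called from the main thread.
    handler = make_shutdown_handler()
    for signum in (signal.SIGTERM, signal.SIGINT):
        signal.signal(signum, handler)


def run(main, close):
    """Run main() and always call close(), including on SIGTERM/SIGINT."""
    install_shutdown_handlers()
    try:
        return main()
    finally:
        close()
```

This does nothing for SIGKILL, which cannot be caught, so orphaned shared memory after a hard kill still needs a separate startup-time sweep.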
#### G30: `ExecutionKernel` has no `close()` — relies on `__del__` for Rust handle cleanup
`ExecutionKernel` has `__del__` which calls `_get_rust().destroy(backend)`. No `close()` method. `DITAv2LauncherBundle.close()` never touches the kernel — the Rust handle is only freed by GC at unpredictable time.
If any code holds a stale `_backend` pointer, the handle dangles when GC runs. If `__del__` is suppressed (e.g., during interpreter shutdown with cyclic references), the Rust handle leaks permanently.
**Fix:** Add `close()` to `ExecutionKernel`, call it from `DITAv2LauncherBundle.close()`.
**Severity: High**
#### G31: `projection` (Hazelcast) never closed
`build_projection()` returns a `HazelcastProjection` which holds a Hazelcast client connection. No `close()` or `disconnect()` method exists on the projection, projector, or row writer. `DITAv2LauncherBundle.close()` doesn't touch the projection. The Hazelcast client connection leaks on shutdown.
**Severity: Medium**
#### G32: `_maybe_close()` only calls the first method found — `break` skips the second
**File:** `launcher.py:233-243`
```python
for method_name in ("close", "disconnect"):
    method = getattr(obj, method_name, None)
    if method is None:
        continue
    try:
        result = method()
    except TypeError:
        continue
    if inspect.isawaitable(result):
        try:
            asyncio.run(result)
        except RuntimeError:
            pass
    break  # ← ONLY calls the FIRST found method, never both
```
If an object has both `close()` and `disconnect()`, only `close()` is called. `disconnect()` is silently skipped. Also: `asyncio.run(result)` silently swallows `RuntimeError` when a running event loop exists — the coroutine is **never executed**.
Currently no object has both, but the pattern is fragile.
**Severity: Low**
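A less fragile shape for this helper is sketched below. It is an assumption-laden rewrite rather than a drop-in patch: `close_all` is a hypothetical name, it calls every listed method that exists, and when an event loop is already running it schedules the awaitable on that loop instead of discarding it.

```python
import asyncio
import inspect


async def _await(awaitable):
    # asyncio.run needs a coroutine, so wrap arbitrary awaitables.
    return await awaitable


def close_all(obj, method_names=("close", "disconnect")):
    """Call every listed method that exists on obj; return the names called."""
    called = []
    for name in method_names:
        method = getattr(obj, name, None)
        if not callable(method):
            continue
        result = method()
        if inspect.isawaitable(result):
            try:
                asyncio.get_running_loop()
            except RuntimeError:
                asyncio.run(_await(result))  # no loop running: safe to block here
            else:
                # A loop is already running: schedule the awaitable on it
                # instead of dropping it; the loop must keep running to finish it.
                asyncio.ensure_future(result)
        called.append(name)
    return called
```

Returning the list of method names makes the behavior observable in tests and logs, which the original `break`-based loop does not allow.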
#### G33: `close()` is not idempotent for RealZinc components
`RealZincPlane.close()` and `RealZincControlPlane.close()` call their Zinc region's `close()` method. If called twice, the second call operates on an already-closed region and will likely crash inside the shared-memory code.
References are not nulled after close: `DITAv2LauncherBundle.close()` calls `_maybe_close()` on `self.venue`, `self.zinc_plane`, and `self.control_plane` but never sets them to `None`, so a double `close()` is unsafe.
**Severity: Low**
#### G34: No context manager on `DITAv2LauncherBundle`
`DITAv2LauncherBundle` has no `__enter__`/`__exit__`. Users must manually call `close()`. No `with` pattern exists anywhere in the source for lifecycle management. No `__del__` fallback on the bundle either.
**Severity: Low** (ergonomic, not a leak source if caller follows the pattern)
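The points raised in G30, G33, and G34 (explicit `close()`, idempotence, and `with` support) fit in one small pattern. The class below is a hypothetical wrapper for illustration; `release` stands for whatever frees the underlying handle, such as a region close or a Rust `destroy` call.

```python
class ManagedResource:
    """Hypothetical wrapper: idempotent close() plus `with` support."""

    def __init__(self, release):
        self._release = release  # callable that frees the underlying handle
        self._closed = False

    def close(self) -> None:
        if self._closed:
            return  # a second close() is a no-op
        self._closed = True
        release, self._release = self._release, None  # drop the reference
        release()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        self.close()
        return False  # never swallow the exception
```

Setting the flag before calling `release()` means a failing release is not retried on a later `close()`. Whether that is the right trade-off depends on whether the underlying handle tolerates a second release attempt.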
#### G35: `BingxVenueAdapter.connect()` exists but is never called by the launcher
`BingxDirectExecutionAdapter` has a `connect()` method that initializes the lifetime HTTP client. `BingxVenueAdapter` has `connect()` that calls `_call_backend("connect")`. Neither is called in `build_launcher_bundle()` or `_build_venue()`. If the adapter's `submit_intent()` relies on a connected client, it initializes lazily — but the connect path is dead code that exists but is never invoked.
**Severity: Informational**
#### G36: Only one `try/finally` in the entire codebase
The only `try/finally` is `_RustKernelLib._take_string()` (rust_backend.py:140-143) which frees the Rust C string. All other resource management uses `try/except` with no `finally`.
No cleanup is guaranteed on exception:
- `build_launcher_bundle()` — no cleanup on failure
- `process_intent()` — no cleanup of partial slot state on venue event exception
- `on_venue_event()` — no cleanup on FFI failure
- `_set_slot()` — no cleanup on projection or Zinc write failure
**Severity: High** (across all layers)
---
## Pass 4 Summary
| # | Flaw | Layer | Severity |
|---|------|-------|----------|
| G1 | EXIT_RESIDUAL action missing from Rust KernelCommandType | Rust | **Critical** |
| G2 | `into_c_string` unwrap() panics on NUL byte | Rust | **Critical** |
| G3 | EXIT hardcodes prev_state=POSITION_OPEN, allows backward FSM transition | Rust | **Critical** |
| G4 | `consume_exit_leg` stale `all_legs_done` variable — wrong branch after last leg | Rust | **Critical** |
| G5 | `realized_pnl` unbounded f64 overflow to inf | Rust | **High** |
| G6 | `mark_price` unbounded unrealized_pnl — no result guard | Rust | **High** |
| G7 | ENTER no is_finite() guard on target_size | Rust | **High** |
| G8 | `reconcile_slots_json` no dedup or bounds validation | Rust | **High** |
| G9 | `exchange_order_id` update targets wrong order — exit cancel broken | Rust | **High** |
| G10 | CANCEL diagnostic always says NO_ACTIVE_EXIT_ORDER | Rust | **High** |
| G11 | `apply_fill` overwrites intended_size with slot.size | Rust | Medium |
| G12 | No max leverage cap enforced by kernel | Rust | Medium |
| G13 | `resolve_slot` fallback returns unwrap_or(0) — misroutes events | Rust | Medium |
| G14 | `commit_slot` silently ignores out-of-bounds slot_id | Rust | Medium |
| G15 | Zero `__post_init__` validators on all config dataclasses | Config | **High** |
| G16 | DITA_V2_DEBUG_CLICKHOUSE defaults to True when unset | Config | Info |
| G17 | String config fields — Zinc region injection risk | Config | Medium |
| G18 | `exit_leg_ratios` no sum-to-1 validation | Config | Low |
| G19 | RealZincControlPlane.read() no sequence check — torn-read risk | Config | Low |
| G20 | ClickHouse journal strategy/db env vars — SQL injection risk | Config | Low |
| G21 | entry_price used as exit_price in trade_events — data loss | Persistence | **High** |
| G22 | active_leg_index → entry_bar semantic mis-mapping | Persistence | Medium |
| G23 | capital_before arithmetic absorbs cross-slot PnL | Persistence | Medium |
| G24 | Recovery trade_reconstruction always has trade_id="" | Persistence | Medium |
| G25 | seen_event_ids, exit_leg_ratios, VenueOrder, metadata not in flat CH tables | Persistence | Low |
| G26 | _safe_float silently converts NaN/None/Inf to 0.0 | Persistence | Low |
| G27 | build_launcher_bundle no exception safety — prior resources leak | Lifecycle | **High** |
| G28 | RealZincPlane/RealZincControlPlane no __del__ — SHM orphaned | Lifecycle | **High** |
| G29 | Zero signal handlers — no cleanup on SIGTERM/SIGINT | Lifecycle | **High** |
| G30 | ExecutionKernel has no close() — relies on __del__ for Rust handle | Lifecycle | **High** |
| G31 | Hazelcast projection never closed | Lifecycle | Medium |
| G32 | _maybe_close() break skips second method | Lifecycle | Low |
| G33 | close() not idempotent for RealZinc components | Lifecycle | Low |
| G34 | No context manager on DITAv2LauncherBundle | Lifecycle | Low |
| G35 | BingxVenueAdapter.connect() never called | Lifecycle | Info |
| G36 | Only one try/finally in entire codebase | Lifecycle | **High** |
### Pass 4 Severity Distribution
| Severity | Count |
|----------|-------|
| **Critical** | 4 (G1, G2, G3, G4) |
| **High** | 13 (G5-G10, G15, G21, G27, G28, G29, G30, G36) |
| Medium | 9 (G11-G14, G17, G22, G23, G24, G31) |
| Low | 8 (G18, G19, G20, G25, G26, G32, G33, G34) |
| Info | 2 (G16, G35) |
### Combined Catalog (All 4 Passes)
| Pass | Focus | Count | Critical | High | Medium | Low | Info |
|------|-------|-------|----------|------|--------|-----|------|
| A | Architectural | 15 | 0 | 2 | 0 | 2 | 11 |
| T | Threading/Atomicity | 9 | 1 | 3 | 3 | 2 | 0 |
| E | E2E Trace | 26 | 0 | 4 | 10 | 11 | 1 |
| F | Deep E2E (Pass 3) | 30 | 0 | 1 | 8 | 17 | 4 |
| G | Domain Scans (Pass 4) | 36 | 4 | 13 | 9 | 8 | 2 |
| **Total** | | **116** | **5** | **23** | **30** | **40** | **18** |