Six variants. **No `EXIT_RESIDUAL`.** If any caller submits an intent with `action = "EXIT_RESIDUAL"`, the string_enum deserializer fails — serde returns `INVALID_INTENT_PARSE`. Even if deserialization worked, there's no branch to handle residual-position cleanup. Any position with remaining size after partial exit legs has **no way to trigger a clean-up exit** via the intent system.
The Python `KernelCommandType` enum (contracts.py) does have `EXIT_RESIDUAL`, translated to `"EXIT_RESIDUAL"` string by `_intent_to_payload`. This string hits Rust's string_enum → parse error → `INVALID_INTENT_PARSE`.
**Fix:** Add `EXIT_RESIDUAL` variant to Rust enum + match arm that skips the `NO_OPEN_POSITION` guard for residual-sized positions.
`CString::new()` returns `Err` if the string contains a NUL (`'\0'`) byte. `.unwrap()` panics at the C FFI boundary. If any `serde_json::to_string()` output (e.g., user-controlled string in `KernelIntent`, `VenueEvent`, or `TradeSlot`) contains a NUL byte, this **panics the entire process**.
Triggered by every FFI call that returns a string:
- `dita_kernel_process_intent_json`
- `dita_kernel_on_venue_event_json`
- `dita_kernel_reconcile_slots_json`
- `dita_kernel_snapshot_json`
- `dita_kernel_get_slot_json`
**Fix:** Replace `.unwrap()` with `unwrap_or_else(|_| ptr::null_mut())` or feed through `invalid_intent_cstring`.
(a) **Transition prev_state is a lie.** If the slot was in `EXIT_WORKING`, `EXIT_SENT`, `EXIT_REQUESTED`, or `POSITION_PARTIALLY_CLOSED`, the transition record says `POSITION_OPEN` — wrong.
(b) **Backward transition.** If the slot is `EXIT_WORKING` and a new EXIT intent arrives, `fsm_state` is set to `EXIT_REQUESTED` — a backward transition from `EXIT_WORKING` → `EXIT_REQUESTED`. This corrupts the FSM.
(c) **No state guard.** EXIT should only be allowed from `POSITION_OPEN`, `EXIT_WORKING` (for additional legs), or `POSITION_PARTIALLY_CLOSED`. Currently any state that passes `!is_free() && !closed && size > 0` can transition to `EXIT_REQUESTED`.
**Fix:** Check actual FSM state before allowing EXIT, log actual prev_state, guard against backward transitions.
**Severity: Critical**
#### G4: `consume_exit_leg` advances beyond last valid index — stale `all_legs_done` variable
**File:** `_rust_kernel/src/lib.rs:1420-1435`
```rust
let all_legs_done = slot.active_leg_index >= slot.exit_leg_ratios.len(); // (A)
slot.consume_exit_leg(); // (C) — advances active_leg_index POST (A)
}
if should_close && slot.size <= 1e-12 { // (D) — close
} else if !partial && !all_legs_done { // (E) — stale! uses (A) not post-advance index
```
On the last leg (`active_leg_index = len - 1`):
- (A): `all_legs_done = false` (pre-advance)
- (C): advances to `len` (exhausted)
- (E): `!partial && !false` = true → enters `POSITION_OPEN` instead of examining `should_close` with post-advance index
The `all_legs_done` variable is captured **before** `consume_exit_leg` advances the index. Branch (E) should use the post-advance index to correctly detect exhaustion.
After exhaustion, `next_exit_ratio()` returns `1.0` (out-of-bounds `unwrap_or(1.0)`) — silently tries to exit remaining size as 100% instead of detecting completion.
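The ordering problem can be reproduced in a few lines of Python. This is a simplified model, not the kernel code: `Slot`, `stale_check`, and `fresh_check` are hypothetical stand-ins that only track the exit-leg cursor named above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Slot:
    # Hypothetical stand-in for the Rust slot; only the exit-leg cursor is modeled.
    exit_leg_ratios: List[float] = field(default_factory=lambda: [0.5, 0.5])
    active_leg_index: int = 0

    def consume_exit_leg(self) -> None:
        self.active_leg_index += 1

    def legs_exhausted(self) -> bool:
        return self.active_leg_index >= len(self.exit_leg_ratios)


def stale_check(slot: Slot) -> bool:
    all_legs_done = slot.legs_exhausted()  # (A) captured before the advance
    slot.consume_exit_leg()                # (C) cursor moves past the last leg
    return all_legs_done                   # (E) still sees the pre-advance value


def fresh_check(slot: Slot) -> bool:
    slot.consume_exit_leg()                # advance first
    return slot.legs_exhausted()           # then test the post-advance index


# Both slots start on the last leg (index = len - 1).
last_leg_stale = stale_check(Slot(active_leg_index=1))
last_leg_fresh = fresh_check(Slot(active_leg_index=1))
```

On the last leg the stale version reports that legs remain, while the reordered version reports exhaustion.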
**Severity: Critical**
#### G5: `realized_pnl` uses unbounded f64 — overflows to inf at extreme values
**File:** `_rust_kernel/src/lib.rs:648-656`
```rust
let notional = exit_size * slot.entry_price * slot.leverage.max(1.0);
delta * notional
```
No `is_finite()` check on intermediate products. At `exit_price=1e200`, `entry_price=1e-200`: `delta` = `(1e200 - 1e-200) / 1e-200` ≈ `1e400` → `inf`. The resulting `inf` is stored in `slot.realized_pnl`, corrupting all future PnL tracking.
Subnormals: `entry_price=5e-324` (subnormal) causes division to produce `inf` for modest exit prices on some platforms.
**Fix:** Add `is_finite()` guards on both prices and cap intermediate products.
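A guarded version of the computation might look like the sketch below. It is written in Python for illustration, assumes a long position with `delta = (exit_price - entry_price) / entry_price`, and returns `None` instead of storing a non-finite value. The function name and the `None` convention are assumptions, not the kernel's API.

```python
import math
from typing import Optional


def realized_pnl(entry_price: float, exit_price: float,
                 exit_size: float, leverage: float) -> Optional[float]:
    """Return PnL, or None if any input or intermediate value is not finite."""
    inputs = (entry_price, exit_price, exit_size, leverage)
    if not all(math.isfinite(x) for x in inputs) or entry_price <= 0.0:
        return None
    delta = (exit_price - entry_price) / entry_price
    notional = exit_size * entry_price * max(leverage, 1.0)
    pnl = delta * notional
    # Float division and multiplication overflow to inf without raising,
    # so every intermediate is checked explicitly.
    if not all(math.isfinite(x) for x in (delta, notional, pnl)):
        return None
    return pnl
```

A caller would treat `None` as a rejected computation and leave `slot.realized_pnl` untouched.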
If any of `delta`, `size`, `entry_price`, or `leverage` is extreme, the product overflows to `inf`. No result guard. `inf` stored in `unrealized_pnl` forever. Capped only by the `price <= 0.0` guard on input — no guard on the computation chain.
Also: `self.entry_price = price` at line 388 overwrites entry_price on every mark_price call for a position with `entry_price <= 0.0`, even when the position has been open for a while. This means a stale-zero entry_price gets set to the current market price on first mark_price after open, which is correct — but if the slot is reused (re-entry without resetting entry_price), the old entry price from the prior trade bleeds into unrealized PnL.
**Severity: High**
#### G7: `process_intent` ENTER — no `is_finite()` guard on `target_size`
**File:** `_rust_kernel/src/lib.rs:806-807`
```rust
intended_size: intent.target_size.max(0.0),
```
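`f64::INFINITY.max(0.0)` returns `inf`, so an infinite `target_size` passes through unchanged. `f64::NAN.max(0.0)` returns `0.0`, because Rust's `f64::max` returns the other operand when one operand is NaN, so a NaN size is silently turned into a zero-size order instead of being rejected. Note that `serde_json` does **not** accept bare `NaN` or `Infinity` tokens (they are not valid JSON) and fails the parse, so these values cannot arrive through a standard `serde_json` parse of the intent payload. Python's `json.dumps` does emit them by default, which is why the Python-side `_first_invalid_intent_field` guard matters (F3: it allows these through). Any non-finite value that does reach this line by another path propagates into `intended_size` in `VenueOrder` and corrupts all fill calculations.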
Similarly, `reference_price` is never validated for finiteness before being stored in `VenueOrder.metadata`.
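On the Python side, a finiteness check before the FFI call could be sketched as follows. `first_non_finite_field` and `encode_intent` are hypothetical helpers (not the existing `_first_invalid_intent_field`), and the list of numeric fields is an assumption based on the names used in this report. Passing `allow_nan=False` makes `json.dumps` raise rather than emit `NaN`/`Infinity`.

```python
import json
import math
from typing import Optional

# Assumed subset of numeric intent fields, for illustration only.
NUMERIC_FIELDS = ("target_size", "reference_price", "leverage")


def first_non_finite_field(payload: dict) -> Optional[str]:
    """Return the name of the first numeric field that is NaN or infinite."""
    for name in NUMERIC_FIELDS:
        value = payload.get(name)
        if isinstance(value, float) and not math.isfinite(value):
            return name
    return None


def encode_intent(payload: dict) -> str:
    bad = first_non_finite_field(payload)
    if bad is not None:
        raise ValueError(f"non-finite value in field {bad!r}")
    # allow_nan=False: raise instead of writing NaN/Infinity tokens.
    return json.dumps(payload, allow_nan=False)
```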
**Severity: High**
#### G8: `reconcile_slots_json` — no dedup or bounds validation
**File:** `_rust_kernel/src/lib.rs:1668-1675`
```rust
for slot in slots {
    if slot.slot_id < core.slots.len() {
        core.slots[slot.slot_id] = slot.clone();
    }
}
```
Two slots with the same `slot_id`: the **second overwrites the first** silently. A slot with `slot_id >= core.slots.len()`: **silently dropped** — no error, no diagnostic. Caller sees `accepted=true` even if some/all slots were not applied.
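One way to surface both problems is to collect rejections and return them to the caller. The sketch below models the loop in Python with plain dicts. `reconcile_slots` and its return shape are hypothetical, and the policy of keeping the first occurrence of a duplicate is an assumption.

```python
from typing import List, Tuple


def reconcile_slots(current: List[dict],
                    incoming: List[dict]) -> Tuple[List[dict], List[str]]:
    """Apply incoming slots by slot_id; return (new_slots, rejection_reasons)."""
    updated = list(current)
    rejected: List[str] = []
    seen = set()
    for slot in incoming:
        slot_id = slot.get("slot_id")
        if not isinstance(slot_id, int) or not 0 <= slot_id < len(updated):
            rejected.append(f"slot_id {slot_id!r} out of range")
            continue
        if slot_id in seen:
            rejected.append(f"slot_id {slot_id} duplicated")
            continue
        seen.add(slot_id)
        updated[slot_id] = dict(slot)
    return updated, rejected
```

The caller can then report `accepted=False` (or a partial-accept diagnostic) whenever the rejection list is non-empty.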
**Severity: High**
#### G9: `exchange_order_id` propagation uses wrong order target
**File:** `_rust_kernel/src/lib.rs:1110-1125`
```rust
let target = if slot.active_entry_order.is_some() {
    slot.active_entry_order.as_mut()
} else {
    slot.active_exit_order.as_mut()
};
```
If an **entry** order exists (even if fully filled) and an **exit** fill event arrives, the code updates the entry order's `venue_order_id` instead of the exit order's. The exit order's `venue_order_id` stays empty. Any subsequent `CANCEL` intent on the exit order fails because `active_exit_order.venue_order_id` is empty — the venue can't match the cancel.
**Fix:** Disambiguate by matching `venue_client_id`, or clear `active_entry_order` when entry is complete.
**Severity: High**
#### G10: CANCEL diagnostic code says NO_ACTIVE_EXIT_ORDER for entry cancel too
**File:** `_rust_kernel/src/lib.rs:966-1005`
```rust
if !has_cancellable_exit && !has_cancellable_entry {
```
When neither exit nor entry is cancellable, the diagnostic returns `NO_ACTIVE_EXIT_ORDER` regardless of which order was the target. If the user wanted to cancel an entry order that's not in a cancellable state, the diagnostic is misleading.
**Fix:** Separate diagnostic codes: `NO_ACTIVE_EXIT_ORDER`, `NO_ACTIVE_ENTRY_ORDER`, `ENTRY_NOT_CANCELLABLE`.
**Severity: High**
#### G11: `apply_fill` entry-fill overwrites `active_entry_order.intended_size` with `slot.size`
**File:** `_rust_kernel/src/lib.rs:1363-1377`
On FULL_FILL entry, `slot.active_entry_order` is entirely replaced with a new `VenueOrder` where `intended_size = slot.size` (the fill amount) instead of the original intended size. The original intended size (which could be larger than fill size for partial fills) is lost.
If a duplicate fill event arrives (dedup fails due to missing event_id), the second fill would use `slot.size` as the basis for further fills — wrong values.
**Severity: Medium**
#### G12: `leverage` unbounded after `is_finite()` — no maximum cap
**File:** `_rust_kernel/src/lib.rs:778`
```rust
slot.leverage = if intent.leverage.is_finite() && intent.leverage > 0.0 {
    intent.leverage // 1e100 accepted here
} else { 1.0 };
```
`leverage = 1e100` passes `is_finite()`. Feeds into `realized_pnl()` as `slot.leverage.max(1.0) = 1e100`, producing `notional = exit_size * entry_price * 1e100`. Makes `unrealized_pnl` arbitrarily large.
No maximum leverage cap enforced anywhere — the exchange-level cap (`DOLPHIN_BINGX_EXCHANGE_LEVERAGE_CAP`) exists in `BingxExecClientConfig` but is **never passed to the Rust kernel**.
**Severity: Medium**
#### G13: `resolve_slot` fallback returns `unwrap_or(0)` — can misroute events
When no slot matches the event (`slot_id` out of range or all slot filters fail), returns `slot_id` of the **first slot** (which may be 0 or any value). No diagnostic emitted — caller sees slot state change with no idea the event was misrouted.
Mutations to out-of-bounds slot are silently discarded. Can happen if `slot.slot_id` is corrupted via `set_slot_from_json` causing index mismatch between `slot.slot_id` and the actual slot position.
**Severity: Medium**
---
### Configuration & Validation Chain
#### G15: Zero `__post_init__` validators on all config dataclasses
Every config dataclass in the system has zero field-level validation:
| Dataclass | Fields | Validators |
|-----------|--------|------------|
| `KernelControlSnapshot` | 16 | **0** |
| `ControlUpdate` | 16 | **0** |
| `KernelIntent` | 19 | **0** |
| `TradeSlot` | 22 | **0** |
| `VenueOrder` | 8 | **0** |
| `VenueEvent` | 18 | **0** |
| `KernelTransition` | 11 | **0** |
| `KernelOutcome` | 8 | **0** |
| `AccountSnapshot` | 9 | **0** |
| **Total** | **127** | **0** |
The only validation in the entire chain:
- `_first_invalid_intent_field()`: finiteness guard at the Python→Rust FFI boundary (not a dataclass validator)
- Rust `leverage = if is_finite && > 0.0 { val } else { 1.0 }`: post-hoc clamp
- Rust `KernelCore::new(max_slots.max(1))`: floor only, no ceiling
- `launcher.py:143`: `max(1, int(...))` for `active_slot_limit`, floor only
**No `__post_init__` exists anywhere. No bounds check on any field except the two floor-only guards.**
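A field-level validator on a dataclass could look like the sketch below. `IntentLimits` is a hypothetical two-field subset of `KernelIntent`, and `MAX_LEVERAGE = 125.0` is an arbitrary illustrative ceiling, not a value taken from the codebase.

```python
import math
from dataclasses import dataclass

MAX_LEVERAGE = 125.0  # assumed ceiling, for illustration only


@dataclass(frozen=True)
class IntentLimits:
    # Hypothetical subset of KernelIntent's numeric fields.
    target_size: float
    leverage: float = 1.0

    def __post_init__(self) -> None:
        for name in ("target_size", "leverage"):
            value = getattr(self, name)
            if not math.isfinite(value):
                raise ValueError(f"{name} must be finite, got {value!r}")
        if self.target_size < 0.0:
            raise ValueError("target_size must be non-negative")
        if not 0.0 < self.leverage <= MAX_LEVERAGE:
            raise ValueError(f"leverage must be in (0, {MAX_LEVERAGE}]")
```

Because the check runs in `__post_init__`, an invalid instance can never be constructed, so downstream code does not need to re-validate.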
**Severity: High**
#### G16: `DITA_V2_DEBUG_CLICKHOUSE` defaults to `True` when env var is unset
`_env_bool` (launcher.py:75) returns `default` when the env var is unset. So `debug = True` by default. Every runtime writes debug traces to ClickHouse by default. `DITA_V2_DEBUG_CLICKHOUSE=False` is required to disable it.
This is not a bug per se, but it means debug ClickHouse writes are **on by default**, adding ~10 ClickHouse insertions per process_intent call (every transition + position state + trade event) that most production deployments may not want.
**Severity: Informational**
#### G17: String config fields have no charset/length validation — Zinc region injection risk
`runtime_namespace`, `strategy_namespace`, `event_namespace`, `actor_name`, `exec_venue`, `data_venue`, `ledger_authority` are all free-form strings with no validation. They're used as:
1. **Zinc shared memory region names**: `self.prefix + "." + namespace + "." + kind`. An attacker-controlled namespace could collide with other processes' Zinc regions
2. **ClickHouse table names**: `DOLPHIN_BINGX_JOURNAL_STRATEGY` is used as a table suffix, a SQL injection risk in the ClickHouse journal
3. **Hazelcast map names**: same injection risk via `event_namespace`
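An allow-list check applied once at config load would cover all three uses. The sketch below assumes a policy of ASCII letters, digits, and underscores, up to 64 characters. Both the policy and the helper name `validate_namespace` are assumptions.

```python
import re

# Assumed policy: identifier-like names only, 1 to 64 characters.
_NAMESPACE_RE = re.compile(r"[A-Za-z0-9_]{1,64}")


def validate_namespace(value: str, field_name: str = "namespace") -> str:
    """Return value unchanged if it matches the allow-list, else raise."""
    if not isinstance(value, str) or _NAMESPACE_RE.fullmatch(value) is None:
        raise ValueError(
            f"{field_name} must match [A-Za-z0-9_]{{1,64}}, got {value!r}"
        )
    return value
```

An allow-list is used rather than escaping because the same string feeds three different backends with different quoting rules.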
**Severity: Medium**
#### G18: `exit_leg_ratios` no sum-to-1 validation
`KernelIntent.exit_leg_ratios` and `TradeSlot.exit_leg_ratios` are tuple/list of floats. No validator ensures they sum to approximately 1.0. Ratios summing to 0.5 leave the position partially closed forever (residual can't be exited because `next_exit_ratio()` returns `1.0` after exhaustion, exiting 100% of remaining — which may exceed the intended residual).
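A validator for the ratio list might be sketched as below. It assumes the ratios are fractions of the original position size, that an empty list is invalid, and that a tolerance of `1e-9` is acceptable. None of these are confirmed by the source.

```python
import math
from typing import Sequence, Tuple


def validate_exit_leg_ratios(ratios: Sequence[float],
                             tol: float = 1e-9) -> Tuple[float, ...]:
    """Return the ratios as a tuple if they are positive and sum to 1.0."""
    if not ratios:
        raise ValueError("exit_leg_ratios must not be empty")
    if any(not math.isfinite(r) or r <= 0.0 for r in ratios):
        raise ValueError("each ratio must be finite and positive")
    total = sum(ratios)
    if not math.isclose(total, 1.0, abs_tol=tol):
        raise ValueError(f"ratios sum to {total!r}, expected 1.0")
    return tuple(ratios)
```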
**Severity: Low**
#### G19: `RealZincControlPlane.read()` has no sequence check — torn-read risk
**File:** `real_control_plane.py:88-94`
```python
def read(self):
    payload = _decode_packet(self.region.as_buffer())
    control = payload.get("control")
    if not isinstance(control, dict):
        return self._snapshot
    self._snapshot = KernelControlSnapshot(**control)
    return self._snapshot
```
The binary packet has a 64-bit sequence number but `read()` **never checks it**. Between the zero-write and packet-write in `_write_region`, a reader sees an empty buffer → `_decode_packet` fails → falls back to `self._snapshot` (stale). Between the packet-write and `struct.pack` header (order depends on implementation), a reader sees a partial write with wrong size → `_decode_packet` fails.
No checksum on the wire format: `struct.pack("!QQ", seq, len) + json_bytes`. A torn write produces garbage that `json.loads` may or may not parse successfully.
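A reader can detect both stale and torn packets if the header carries a checksum and the reader compares sequence numbers. The sketch below extends the `!QQ` header with a CRC32 field. The extra field, the helper names, and the `None`-on-failure convention are assumptions for illustration, not the existing wire format.

```python
import json
import struct
import zlib
from typing import Optional, Tuple

# seq, payload length, CRC32 of the payload (the CRC is an assumed addition).
_HEADER = struct.Struct("!QQI")


def encode_packet(seq: int, payload: dict) -> bytes:
    body = json.dumps(payload).encode("utf-8")
    return _HEADER.pack(seq, len(body), zlib.crc32(body)) + body


def decode_packet(buf: bytes, last_seq: int) -> Optional[Tuple[int, dict]]:
    """Return (seq, payload) for a complete, newer packet; None otherwise."""
    if len(buf) < _HEADER.size:
        return None
    seq, size, crc = _HEADER.unpack_from(buf, 0)
    body = bytes(buf[_HEADER.size:_HEADER.size + size])
    if seq <= last_seq or len(body) != size or zlib.crc32(body) != crc:
        return None  # stale, truncated, or torn
    return seq, json.loads(body.decode("utf-8"))
```

A `None` result lets the caller distinguish "nothing new" from a successful read, instead of silently returning the cached snapshot.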
These are used as ClickHouse table and database name suffixes in `pink_clickhouse.py`. An attacker who can set env vars can inject SQL via semicolons or quotes in the table name. ClickHouse supports `INSERT INTO db.table FORMAT JSONEachRow` — a table name like `positions; DROP TABLE ...;` could be destructive.
**Severity: Low** (requires env var control, which implies broader access)
---
### Persistence Schema Alignment
#### G21: `entry_price` used as `exit_price` in `trade_events` — data loss
The `_write_trade_event` function maps `entry_price` from `slot.to_dict()` to both the `entry_price` and `exit_price` columns. The actual exit fill price (available on the `VenueEvent` object) is **never written** to the `exit_price` column.
**Result:** Every `trade_events` row has `exit_price == entry_price`. The `exit_price` column is a dead column — always contains the entry price, never the actual fill.
**Severity: High** (data loss to DB for the most important trade metric)
```python
"entry_bar": int(slot_dict.get("active_leg_index", 0) or 0),
```
`active_leg_index` tracks the exit-leg-ratios cursor (which leg of a multi-leg exit we're on), not a bar count. The value is `0` at position open and `1` after the first exit leg; neither represents bars held. **The `entry_bar` column stores the wrong concept.**
`capital_before` is reconstructed by subtracting the current leg's PnL from the current capital. In a multi-slot system, other slots' PnL changes between legs are absorbed into `capital_before`. The column is **always wrong** in multi-slot scenarios because `capital_after` reflects total PnL from all slots, not just the leg being recorded.
**Severity: Medium** — wrong `capital_before` for multi-slot trading.
#### G24: Recovery `trade_reconstruction` always has `trade_id=""`
The `persist_recovery_state` function passes `kernel.snapshot()["account"]` (an account dict with keys `capital, equity, realized_pnl, ...`) where a slot dict is expected. The `trade_id` key **does not exist** on the account dict. The `recovery_state` row always has `trade_id=""`.
**Severity: Medium** — recovery data is not associable with any trade.
#### G25: `seen_event_ids`, `exit_leg_ratios`, `VenueOrder`, `metadata` not in flat ClickHouse tables
These fields are:
- Present on the Python `TradeSlot` ✅
- Transmitted through Zinc shared memory ✅
- Stored in Hazelcast ✅
- Stored in ClickHouse `dita_kernel_debug` (full JSON) ✅
- **NOT extracted** into main ClickHouse flat tables `position_state`, `trade_events`, `trade_exit_legs` ❌
Data exists at the source, travels through the pipeline, hits the debug journal — but is lost in the main analytical tables.
**Severity: Low** (data exists in debug journal if needed for reconstruction)
#### G26: `_safe_float` silently converts NaN/None/Inf to 0.0
**File:** `utils.py:15`
```python
def _safe_float(v, default=0.0):
    try:
        f = float(v)
        if not math.isfinite(f):
            return default
        return f
    except (TypeError, ValueError, OverflowError):
        return default
```
Used in multiple ClickHouse writers. Silently converts `NaN`/`Inf`/parsing errors to `0.0`. No diagnostic emitted when a non-finite value reaches the persistence layer — data silently zeroed.
**Severity: Low** (safe default but silent corruption)
---
### Lifecycle & Resource Management
#### G27: `build_launcher_bundle` has no exception safety — prior resources leak
**File:** `launcher.py:264-300`
```python
def build_launcher_bundle(...):
    control_plane = _build_control_plane(...)
    projection = build_projection(...)
    zinc_plane = _build_zinc_plane(...)
    venue = _build_venue(...)
    kernel = ExecutionKernel(...)  # ← if THIS fails, everything above leaks
```
If any step after the first raises, all previously built resources leak:
- `RealZincPlane` created → `_build_venue()` fails → 3 shared memory regions orphaned
- `RealZincControlPlane` created → `_build_zinc_plane()` fails → 1 shared memory region orphaned
- `BingxVenueAdapter` created → `ExecutionKernel.__init__()` fails → HTTP connection leaked
**No `try/finally` anywhere in the builder.** The init order is also optimized for forward construction, not backward cleanup.
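The standard-library `contextlib.ExitStack` gives the builder reverse-order cleanup on failure without nested `try` blocks. In the sketch below, `Resource` and `build_bundle` are hypothetical stand-ins for the planes, venue, and `build_launcher_bundle()`.

```python
from contextlib import ExitStack
from typing import Callable, List


class Resource:
    # Hypothetical stand-in for a plane, venue, or kernel with a close() method.
    def __init__(self, name: str, log: List[str]) -> None:
        self.name, self._log = name, log

    def close(self) -> None:
        self._log.append(f"closed {self.name}")


def build_bundle(factories: List[Callable[[], Resource]]) -> List[Resource]:
    """Build resources in order; if a factory raises, close those already built."""
    with ExitStack() as stack:
        built = []
        for factory in factories:
            resource = factory()
            stack.callback(resource.close)  # registered for LIFO cleanup
            built.append(resource)
        stack.pop_all()  # success: callbacks are detached, caller owns the resources
        return built
```

If any factory raises, the `with` block exits with the callbacks still registered, so everything built so far is closed in reverse order before the exception propagates.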
**Severity: High** — shared memory leak on any build failure.
#### G28: `RealZincPlane` and `RealZincControlPlane` have no `__del__`
When `close()` is not called (exception in builder, forgotten cleanup, GC during shutdown), the shared memory regions opened by `RealZincPlane` (3 regions) and `RealZincControlPlane` (1 region) are **orphaned on the OS**. They persist in `/dev/shm/` (or platform equivalent) until system reboot.
Python's `__del__` is unreliable (not called on SIGKILL, not called if the object is part of a cycle without a GC run), but its absence means even normal garbage collection can't clean up.
**Severity: High** — shared memory leaks.
#### G29: Zero signal handlers — no cleanup on SIGTERM/SIGINT
```bash
$ grep -rn "signal\|SIGTERM\|SIGINT\|atexit" *.py # ZERO matches
```
When SIGTERM or SIGINT arrives:
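1. SIGTERM takes the OS default action and terminates the process immediately; SIGINT raises `KeyboardInterrupt` in the main thread, which unwinds the stack with no cleanup handler in its path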
2. No `DITAv2LauncherBundle.close()` is called
3. No `ExecutionKernel.__del__` is called (CPython may run GC on normal exit but not reliably)
4. All shared memory (RealZincPlane, RealZincControlPlane) is orphaned
5. In-flight BingX HTTP calls are interrupted mid-stream
6. Rust kernel handle is leaked
**Severity: High**
#### G30: `ExecutionKernel` has no `close()` — relies on `__del__` for Rust handle cleanup
`ExecutionKernel` has `__del__` which calls `_get_rust().destroy(backend)`. No `close()` method. `DITAv2LauncherBundle.close()` never touches the kernel — the Rust handle is only freed by GC at unpredictable time.
If any code holds a stale `_backend` pointer, the handle dangles when GC runs. If `__del__` is suppressed (e.g., during interpreter shutdown with cyclic references), the Rust handle leaks permanently.
**Fix:** Add `close()` to `ExecutionKernel`, call it from `DITAv2LauncherBundle.close()`.
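An explicit, idempotent `close()` with a context-manager wrapper could be sketched as follows. `KernelHandle` is a hypothetical class, and the `destroy` callable stands in for `_get_rust().destroy(backend)`.

```python
class KernelHandle:
    """Hypothetical wrapper showing an explicit, idempotent close()."""

    def __init__(self, destroy) -> None:
        self._destroy = destroy  # callable that frees the native handle
        self._closed = False

    def close(self) -> None:
        if self._closed:  # second and later calls are no-ops
            return
        self._closed = True
        self._destroy()

    def __enter__(self) -> "KernelHandle":
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        self.close()

    def __del__(self) -> None:
        # Last-resort fallback only; explicit close() is the primary path.
        try:
            self.close()
        except Exception:
            pass
```

Setting `_closed` before calling `_destroy()` means a failing destroy is never retried, which avoids a double free at the cost of a possible leak.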
**Severity: High**
#### G31: `projection` (Hazelcast) never closed
`build_projection()` returns a `HazelcastProjection` which holds a Hazelcast client connection. No `close()` or `disconnect()` method exists on the projection, projector, or row writer. `DITAv2LauncherBundle.close()` doesn't touch the projection. The Hazelcast client connection leaks on shutdown.
**Severity: Medium**
#### G32: `_maybe_close()` only calls the first method found — `break` skips the second
**File:** `launcher.py:233-243`
```python
for method_name in ("close", "disconnect"):
    method = getattr(obj, method_name, None)
    if method is None:
        continue
    try:
        result = method()
    except TypeError:
        continue
    if inspect.isawaitable(result):
        try:
            asyncio.run(result)
        except RuntimeError:
            pass
    break  # ← ONLY calls the FIRST found method, never both
```
If an object has both `close()` and `disconnect()`, only `close()` is called. `disconnect()` is silently skipped. Also: `asyncio.run(result)` silently swallows `RuntimeError` when a running event loop exists — the coroutine is **never executed**.
Currently no object has both, but the pattern is fragile.
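A less fragile variant calls every close-like method and never drops a coroutine. The sketch below is one possible shape: `close_all` is a hypothetical replacement for `_maybe_close()`, and scheduling the awaitable on the running loop (rather than blocking) is an assumed policy.

```python
import asyncio
import inspect
from typing import List


async def _await(awaitable):
    # Wraps any awaitable so asyncio.run() receives a coroutine.
    return await awaitable


def close_all(obj) -> List[str]:
    """Call every close-like method on obj; return the names actually invoked."""
    called = []
    for method_name in ("close", "disconnect"):
        method = getattr(obj, method_name, None)
        if not callable(method):
            continue
        result = method()
        if inspect.isawaitable(result):
            try:
                asyncio.get_running_loop()
            except RuntimeError:
                asyncio.run(_await(result))    # no loop running: safe to block
            else:
                asyncio.ensure_future(result)  # inside a loop: schedule it
        called.append(method_name)
    return called
```

When called from inside a running loop the scheduled task completes later, so an async caller that needs the close to finish should await the method directly instead.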
**Severity: Low**
#### G33: `close()` is not idempotent for RealZinc components
`RealZincPlane.close()` and `RealZincControlPlane.close()` call their Zinc region's `close()` method. If called twice, the second call operates on an already-closed region and likely crashes inside the shared memory code.
`DITAv2LauncherBundle.close()` does not set `self.venue`, `self.zinc_plane`, or `self.control_plane` to `None` after closing them: it only calls `_maybe_close()`, which leaves the references in place. A double `close()` is therefore unsafe.
**Severity: Low**
#### G34: No context manager on `DITAv2LauncherBundle`
`DITAv2LauncherBundle` has no `__enter__`/`__exit__`. Users must manually call `close()`. No `with` pattern exists anywhere in the source for lifecycle management. No `__del__` fallback on the bundle either.
**Severity: Low** (ergonomic, not a leak source if caller follows the pattern)
#### G35: `BingxVenueAdapter.connect()` exists but is never called by the launcher
`BingxDirectExecutionAdapter` has a `connect()` method that initializes the lifetime HTTP client. `BingxVenueAdapter` has `connect()` that calls `_call_backend("connect")`. Neither is called in `build_launcher_bundle()` or `_build_venue()`. If the adapter's `submit_intent()` relies on a connected client, it initializes lazily; the explicit connect path is dead code that is never invoked.
**Severity: Informational**
#### G36: Only one `try/finally` in the entire codebase
The only `try/finally` is `_RustKernelLib._take_string()` (rust_backend.py:140-143) which frees the Rust C string. All other resource management uses `try/except` with no `finally`.
No cleanup is guaranteed on exception:
- `build_launcher_bundle()`: no cleanup on failure
- `process_intent()`: no cleanup of partial slot state on venue event exception
- `on_venue_event()`: no cleanup on FFI failure
- `_set_slot()`: no cleanup on projection or Zinc write failure
### H1: No Python dependency declaration files exist in workspace
**Files:** workspace root
Zero `requirements.txt`, `setup.py`, `setup.cfg`, `pyproject.toml`, `Pipfile`, or `poetry.lock` anywhere. All Python package dependencies are entirely implicit — determined by what's installed in the runtime environment. No reproducible installs, no version pinning, no audit trail.
The Rust side does have `Cargo.toml` + `Cargo.lock` — but all 4 direct Rust deps use open ranges (`"0.4"`, `"0.2"`, `"1"`, `"1"`).
**Severity: Critical**
### H2: Rust kernel compiled from source on every cold start via subprocess
**File:** `rust_backend.py:60-72`
```python
def _ensure_library() -> Path:
    path = _library_path()
    if not path.exists():
        _build_library()  # cargo build --release
    return path

def _build_library():
    subprocess.run(
        ["cargo", "build", "--release", ...],
        check=True,  # no timeout!
    )
```
First load takes 3-10 minutes (Rust compilation). Requires Rust toolchain in production. `subprocess.run()` has no `timeout=` — if `cargo` hangs (network, disk, lock contention), the Python process hangs indefinitely. No prebuilt binary distribution.
**Severity: Critical**
### H3: Zero logging — every swallowed error is invisible
The entire codebase has zero use of Python's `logging` module, `print()`, or `warnings.warn()` for error reporting. Every `except: pass`, `except Exception: pass`, and `return default` silently discards the error. **There is no mechanism to detect, alert, or diagnose production failures.**
All `try/except: pass` sites found:
| # | File:Line | What's Hidden |
|---|-----------|---------------|
| 1 | `bingx_venue.py:51` | `float()` conversion failure on any API field value |
| 2 | `bingx_venue.py:133` | regex match failure in rate-limit parsing |
| 3 | `bingx_venue.py:136` | int/float conversion of retry_after |
### H4: `_row_float` rejects zero as a valid value — `or` pattern treats 0 as missing
**File:** `bingx_venue.py:47-55`
```python
def _row_float(row, *keys, default=0.0):
    for key in keys:
        try:
            value = float(row.get(key) or 0.0)  # `or 0.0` treats 0 as missing
        except Exception:
            continue
        if value == value and value not in (float("inf"), float("-inf")) and value != 0.0:
            return value  # explicitly rejects 0.0
    return default
```
Two bugs: (a) `except Exception: continue` swallows ALL conversion errors, and (b) `value != 0.0` explicitly rejects zero as a valid return value. A legitimate zero price, zero filled quantity, or zero position amount causes `_row_float` to skip that key and search further. If ALL keys return 0, the default `0.0` is returned — indistinguishable from "none of the keys existed."
Called by every BingX API response parser: `_position_qty()`, `_position_price()`, `_venue_order_from_row()`, `_event_from_row()`, `_fill_event_from_row()`, `_events_from_submit()`, `_events_from_cancel()`, `_filled_size_from_snapshots()`. None of them can distinguish a returned `0.0` that is a real zero from one that stands for a missing field.
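Returning `None` for "no usable key" keeps zero distinguishable from missing. The sketch below is a hypothetical `row_float` with narrowed exception handling; callers would have to handle the `None` case explicitly, which is a behavior change from the current default-returning helper.

```python
import math
from typing import Mapping, Optional


def row_float(row: Mapping, *keys: str) -> Optional[float]:
    """Return the first finite float found under keys, or None if no key is usable."""
    for key in keys:
        raw = row.get(key)
        if raw is None or raw == "":
            continue  # missing field: try the next key
        try:
            value = float(raw)
        except (TypeError, ValueError):
            continue  # unparseable: try the next key
        if math.isfinite(value):
            return value  # 0.0 is a legitimate result
    return None
```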
**Severity: High**
### H5: `_backend_snapshot` timeout returns stale data with no signal to callers
```python
if not self._snapshot_ready.wait(timeout=timeout_ms / 1000.0):
    with self._snap_lock:
        return self._last_snapshot  # STALE: could be hours old
```
When the snapshot-fetch condition times out, returns `self._last_snapshot` — initialized to `None` and only updated on successful fetches. First timeout returns `None`. All callers (`cancel()`, `open_orders()`, `open_positions()`, `reconcile()`, `submit()`) access `.open_orders`, `.open_positions` immediately — crash with `AttributeError: 'NoneType' object has no attribute 'open_orders'`.
Even after the first fetch succeeds, subsequent timeouts return the last-good snapshot which could be arbitrarily stale. No caller timestamps, version-checks, or requests a refresh.
**Severity: High**
### H6: All enum-from-raw-string sites crash on unknown value — zero fallback
If the Rust kernel introduces a new enum variant (e.g., `TradeStage::ENTRY_REJECTED`) not in the Python `TradeStage` enum, `TradeStage("ENTRY_REJECTED")` raises `ValueError` with zero fallback. Crashes `_outcome_from_payload()` and takes down the kernel's event processing loop.
17 sites total across `rust_backend.py` and `real_zinc_plane.py`. No try/except, no mapping, no fallback on any of them.
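Python's `Enum` supports a `_missing_` hook that can map unrecognized values to a sentinel instead of raising. The sketch below uses a hypothetical subset of `TradeStage`; the `IDLE` member and the `UNKNOWN` sentinel are assumptions for illustration.

```python
from enum import Enum


class TradeStage(str, Enum):
    # Hypothetical subset of the real enum, plus an assumed UNKNOWN sentinel.
    IDLE = "IDLE"
    POSITION_OPEN = "POSITION_OPEN"
    UNKNOWN = "UNKNOWN"

    @classmethod
    def _missing_(cls, value):
        # Called by Enum when no member matches; returning a member avoids ValueError.
        return cls.UNKNOWN
```

Callers still need to decide what to do with `UNKNOWN` (for example, reject the event and emit a diagnostic); the hook only keeps the processing loop alive.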
```python
metadata["_limit_price"] = float(getattr(intent, "limit_price", 0.0) or 0.0)
```
`order_type` and `limit_price` are NOT fields on `KernelIntent` (contracts.py). They only exist in `intent.metadata` as `metadata["order_type"]` if set by the caller. `getattr(intent, "order_type", "MARKET")` checks the dataclass field — not the metadata dict — so it ALWAYS returns `"MARKET"`.
Even when the PINK runtime produces a LIMIT intent (LIMIT_DECISION → `metadata["order_type"] = "LIMIT"`), the legacy adapter converts it to MARKET because it reads the wrong source. Every LIMIT order is submitted as MARKET.
Similarly, `limit_price` is always `0.0` — any limit price from the metadata dict is lost.
**Severity: High**
### H8: `_venue_event_status_from_row` silently maps unknown venue status to ACKED
```python
return VenueEventStatus.ACKED  # fallthrough for anything unknown
```
If BingX introduces a new status (`"SUSPENDED"`, `"PENDING_CANCEL"`, `"EXPIRED"`), it doesn't match any known mapping and silently returns `ACKED`. The kernel treats a suspended/cancelled/expired order as acknowledged — dangerous misclassification.
**Severity: High**
### H9: `RealZincPlane.write_slot()` — slot written to `slot_id >= slot_count` is invisible
**File:** `real_zinc_plane.py:206-210`
```python
def write_slot(self, slot):
    with self._lock:
        self._slot_cache[int(slot.slot_id)] = slot
        payload = {"slots": [self._slot_cache[key].to_dict() for key in range(self._slot_count)]}
```
`_slot_cache` is a plain dict — accepts any key. But `read_slots()` only reads 0..slot_count-1. Writing to `slot_id >= slot_count` stores the slot in the cache but it's **never serialized or read back**. No error.
**Severity: High**
### H10: `RealZincControlPlane.read()` has no atomicity with concurrent `update()`
**File:** `real_control_plane.py:70-77`
`_write_region()` zero-fills the buffer then writes the packet. If `read()` interleaves between zero-fill and write, it sees a partially-zeroed buffer → `_decode_packet` returns `{}` → returns stale `self._snapshot` with no observable error. No lock, no sequence check, no atomic read.
The same bug exists in `RealZincPlane.read_slots()` (real_zinc_plane.py:220-230) — reads shared memory while a concurrent `write_slot()` is in progress.
**Severity: High**
### H11: `_RustKernelLib` lazily initialized with race condition
**File:** `rust_backend.py:187-190`
```python
_RUST: _RustKernelLib | None = None

def _get_rust():
    global _RUST
    if _RUST is None:
        _RUST = _RustKernelLib()  # no lock: two threads can both create
    return _RUST
```
No threading lock. Two concurrent calls to `_get_rust()` (possible via `BingxVenueAdapter`'s thread pool) can create two `_RustKernelLib` objects. The `_RustKernelLib()` constructor runs `_ensure_library()` which runs `subprocess.run(["cargo", "build", ...], check=True)` — concurrent `cargo build` can corrupt the build directory.
**Severity: High**
### H12: `ExecutionKernel.__del__` can deadlock or use-after-free
`_get_rust()` accesses the module-level `_RUST` singleton, which may already be destroyed if the module's garbage collection runs before the instance's. The destroy call happens outside any lock — one thread's destructor could destroy the Rust kernel while another thread is still using it. Use-after-free.
`ControlPlane` protocol defines `wait()` and `notify()`. `MirroredControlPlane` inherits from nothing and only implements `read()`, `update()`, and `mirror()`. Calling `plane.wait()` on a `MirroredControlPlane` raises `AttributeError`.
**Severity: Medium**
### H14: `TradeSlot.remaining_size()` and `VenueOrder.remaining_size()` — same name, different semantics
```python
# TradeSlot (body of remaining_size):
    return max(0.0, float(self.size))  # open position size
# VenueOrder:
def remaining_size(self) -> float:
    return max(0.0, self.intended_size - self.filled_size)  # unfilled order qty
```
Same method name, completely different semantics. `TradeSlot.remaining_size()` returns the current open position size. `VenueOrder.remaining_size()` returns the untracked/unfilled order quantity. A caller using `slot.remaining_size()` to check if an order is fully filled gets position size, which doesn't change with fills — it changes with entry/exit.
**Severity: Medium**
### H15: `_maybe_close()` — `asyncio.run()` RuntimeError silently swallowed for coroutines
**File:** `launcher.py:233-243`
```python
if inspect.isawaitable(result):
    try:
        asyncio.run(result)
    except RuntimeError:
        pass  # SILENT: coroutine never executed
```
When `_maybe_close` is called from an async context (`DITAv2LauncherBundle.close()` is used in async test code), `asyncio.run()` raises `RuntimeError("asyncio.run() cannot be called from a running event loop")`. The exception is swallowed, the coroutine is never awaited, and the close/disconnect never happens.
Also: `break` after calling the first found method means if an object has both `close()` and `disconnect()`, `disconnect()` is never called.
**Severity: Medium**
### H16: `_build_venue` imports `BingxDirectExecutionAdapter` inside function: lazy loading masks import errors
**File:** `launcher.py:254`
```python
def _build_venue(...):
    from prod.clean_arch.adapters.bingx_direct import BingxDirectExecutionAdapter
```
Import inside function — safe, lazy, no side effects. But if the `bingx_direct` module has an import error (missing dependency, version mismatch), it only surfaces at bundle construction time, not at process start. A misconfigured production deployment would fail on the first trade, not on boot.
**Severity: Informational**
### H17: `load_dotenv()` at module level — import-time filesystem I/O and env mutation
**File:** `launcher.py:49-51`
```python
load_dotenv(PROJECT_ROOT / ".env") # executes on module import
```
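Runs on every import of `launcher.py`: it reads the filesystem and mutates the process environment. This is hard to isolate in tests. Assuming this is python-dotenv's `load_dotenv` with its default `override=False`, variables already set by the test are kept, but every other variable in `.env` is injected on import, so a test cannot rely on a variable being unset. Also, if `.env` doesn't exist, `load_dotenv()` silently does nothing, so missing config is invisible.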
**Severity: Medium**
### H18: `_run()` in `BingxVenueAdapter` — `asyncio.run()` thread-pool bridge blocks on every call
Every call to `_run()` that receives an awaitable blocks the calling thread via `.result()`. The BingX HTTP call inside `submit_intent()` can take 1-5 seconds. During this block, the event loop cannot process other tasks. In a single-runtime deployment, this stalls the entire policy cycle.
**Severity: Medium**
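For comparison, the sketch below shows the non-blocking pattern with `asyncio.to_thread` (Python 3.9+). `blocking_submit` is a hypothetical stand-in for the slow HTTP call; the real adapter code is not reproduced here.

```python
import asyncio
import time


def blocking_submit(delay):
    """Hypothetical stand-in for a slow, blocking venue call."""
    time.sleep(delay)
    return "ack"


async def demo(delay=0.1):
    """Run the blocking call in a worker thread while the loop keeps ticking."""
    ticks = 0

    async def ticker():
        nonlocal ticks
        while True:
            await asyncio.sleep(0.01)
            ticks += 1

    task = asyncio.create_task(ticker())
    # The await yields to the event loop, so ticker() continues to run.
    result = await asyncio.to_thread(blocking_submit, delay)
    task.cancel()
    return result, ticks
```

If the same call were made through a bare `.result()` on the loop thread, `ticks` would stay at zero for the duration of the call.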
### H19: `HazelcastClientLike` protocol has zero concrete implementations in workspace
**File:** `hazelcast_projection.py:13-15`
```python
class HazelcastClientLike(Protocol):
    def get_map(self, name: str): ...
    def get_topic(self, name: str): ...
```
Used as a type hint. No code in the workspace creates an object that satisfies this protocol. The Hazelcast client comes from an external package. If the external API changes, the protocol silently drifts — no compilation check.
**Severity: Low**
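A cheap guard against drift is a runtime check at the point where the client is injected. The sketch below assumes the protocol shown above; `require_client` and `FakeClient` are hypothetical. Note that `runtime_checkable` only verifies that the methods exist, not their signatures.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class HazelcastClientLike(Protocol):
    def get_map(self, name: str): ...
    def get_topic(self, name: str): ...


class FakeClient:
    """Hypothetical in-workspace implementation, usable in tests."""

    def get_map(self, name):
        return {}

    def get_topic(self, name):
        return []


def require_client(client):
    """Reject objects that lack the protocol's methods (presence check only)."""
    if not isinstance(client, HazelcastClientLike):
        raise TypeError(f"{type(client).__name__} does not satisfy HazelcastClientLike")
    return client
```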
### H20: `_decode_packet` in RealZinc — no bound check on `size` beyond `> len(buf)-16`
If shared memory contains a corrupted `size` field within bounds, `.decode()` or `json.loads()` raises — uncaught by callers. A single corrupted byte in shared memory crashes the kernel.
**Severity: Low**
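A defensive decoder could treat a corrupt payload as "no packet" rather than raising. The sketch below assumes the 16-byte `"!QQ"` header (sequence, length) followed by JSON that this review describes for the Zinc wire format; `safe_decode_packet` is a hypothetical name, not the `RealZinc` method.

```python
import json
import struct

HEADER = struct.Struct("!QQ")  # assumed layout: sequence number, payload length


def safe_decode_packet(buf):
    """Return (seq, payload) or None when the region holds no valid packet."""
    if len(buf) < HEADER.size:
        return None
    seq, size = HEADER.unpack_from(buf, 0)
    if size == 0 or size > len(buf) - HEADER.size:
        return None
    raw = bytes(buf[HEADER.size:HEADER.size + size])
    try:
        return seq, json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        # An in-bounds size with corrupted bytes ends up here instead of
        # propagating to the caller.
        return None
```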
### H21: All Rust crate features enabled by default — `wasm-bindgen` compiled into native shared library
The Rust kernel is a native `.so`/`.dylib` but chrono's `iana-time-zone` pulls in `js-sys` and `wasm-bindgen` (WebAssembly support) even on native Linux. Larger binary, longer compile times. `cc` crate pulled in for `iana-time-zone-haiku` which only compiles on Haiku OS.
**Severity: Low**
### H22: `socket.getaddrinfo` monkey-patch in test generator code
**File:** `gen2.py:295-298`
Monkey-patches Python stdlib `socket.getaddrinfo` to force IPv4 as a workaround for IPv6 resolution failure in the deployment environment. If copied to production code, would break IPv6 connectivity.
After both fills, the actual position is 0.8 but `slot.size` reports 0.3. The position is under-counted by 0.5 — 62.5% error.
The exit path correctly does `slot.size = (slot.size - fill_size).max(0.0)` (subtractive). The entry path should accumulate: `slot.size += fill_size`.
This only manifests with LIMIT orders that receive multiple partial fills over time, a scenario entirely absent from tests (I6).
**Severity: Critical**
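The arithmetic can be checked with a small sketch. The two fill sizes (0.5, then 0.3) are inferred from the figures quoted above, and `Slot` is a hypothetical stand-in that models only `size`; it is not the kernel's type.

```python
class Slot:
    """Hypothetical stand-in for the kernel slot; only `size` is modeled."""

    def __init__(self):
        self.size = 0.0


def apply_entry_fill_overwrite(slot, fill_size):
    # Behavior described in this finding: each fill replaces the size.
    slot.size = fill_size


def apply_entry_fill_accumulate(slot, fill_size):
    # Proposed behavior: partial fills add up.
    slot.size += fill_size


fills = [0.5, 0.3]  # assumed partial fills of one LIMIT entry order
buggy, fixed = Slot(), Slot()
for fill in fills:
    apply_entry_fill_overwrite(buggy, fill)
    apply_entry_fill_accumulate(fixed, fill)

undercount = fixed.size - buggy.size        # size missing from the buggy slot
relative_error = undercount / fixed.size    # fraction of the true position
```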
### I2: `exit_ratio = 0.0` creates zero-size exit order — slot stuck in EXIT_REQUESTED
**File:** `_rust_kernel/src/lib.rs:467-469`
```rust
let exit_ratio = slot.next_exit_ratio(); // returns 0.0 from exit_leg_ratios=[0.0, ...]
let base_size = if slot.initial_size > 0.0 { ... } else { slot.size };
let exit_size = (base_size * exit_ratio).max(0.0); // = 0.0
```
When `exit_leg_ratios` contains `0.0` in any position, `exit_size = 0.0`. The zero-size exit order is submitted to the venue (`intended_size = 0`). On the fill side, `realized_pnl()` returns 0.0 (guarded by `exit_size <= 0.0`), and `slot.size` is unchanged. The slot stays in `EXIT_REQUESTED` with no means to advance — the leg is consumed but nothing happened. Subsequent exits may eventually handle this, but the zero-size leg is a wasted FSM transition that leaves the slot in a confusing intermediate state.
Also: `NaN` in `exit_leg_ratios` (from `clamp(0.0, 1.0)` not guarding NaN, though serde_json rejects NaN) would produce the same zero-size exit behavior.
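One option is to reject such ratios before they reach the kernel. The following is a sketch under the assumption that every leg ratio must lie in (0, 1]; `validate_exit_leg_ratios` is a hypothetical helper, not an existing function in `contracts.py`.

```python
import math


def validate_exit_leg_ratios(ratios):
    """Return the ratios as a tuple, or raise ValueError on a bad entry.

    Assumed rule: each ratio must be finite and in (0, 1], which excludes
    0.0 (zero-size exit) and NaN (which would also yield a zero-size exit).
    """
    cleaned = []
    for index, ratio in enumerate(ratios):
        value = float(ratio)
        if not math.isfinite(value) or value <= 0.0 or value > 1.0:
            raise ValueError(f"exit_leg_ratios[{index}] = {ratio!r} is not in (0, 1]")
        cleaned.append(value)
    return tuple(cleaned)
```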
```rust
if self.entry_price <= 0.0 { self.entry_price = price; } // catches -0.5, replaces it
```
If `entry_price` is negative (possible only via `set_slot_json` direct injection — not from normal trading), Python keeps it and computes `unrealized_pnl` with wrong sign. Rust replaces it. The Python-side `mark_price` is only called from `ExecutionKernel.mark_price()` in rust_backend.py:LOW-1, which never writes back to the Rust kernel — so the Python-side calculation is purely local and the inconsistency has no effect on the Rust kernel's canonical state. However, the `observe_slots` call after `mark_price` re-reads from the Rust kernel, which recomputes PnL correctly. The Python-side mark_price is effectively wasted computation that never feeds back.
**Severity: Informational**
### I4: No Rust unit tests for 99% of kernel functionality
**File:** `_rust_kernel/src/lib.rs:1731-1765`
Only 1 Rust test exists: `enter_then_ack_fill` — creates a 2-slot kernel, submits ENTER, sends ACK, asserts state transitions.
**Not tested in Rust:**
- EXIT, CANCEL, MARK_PRICE, RECONCILE, CONTROL actions
- Any FILL event (PARTIAL, FULL)
- CANCEL_ACK, CANCEL_REJECT, ORDER_REJECT
- RATE_LIMITED handling
- Multi-leg exits
- `consume_exit_leg` edge cases
- `realized_pnl()` formula with boundary values
- `mark_price()` with extreme values
- `resolve_slot()` fallback path
- `reconcile_slots_json` dedup/overflow
- Any C FFI boundary function
- Any serde deserialization failure
- Null pointer handling
No `#[cfg(test)]` module exists — the single test is inline. No Rust integration tests (`tests/` directory).
**Severity: High**
### I5: `MockVenueScenario` rejection flags exist but zero tests use them
**File:** `mock_venue.py:23-35`
```python
@dataclass
class MockVenueScenario:
    reject_entries: bool = False
    reject_exits: bool = False
    cancel_reject: bool = False
```
Three boolean flags to simulate venue rejection of orders. Not a single test in `test_flaws.py` sets any of them to `True`. The `ORDER_REJECT` handler in the Rust kernel's `on_venue_event` exists (lib.rs lines ~1440-1460) but is never exercised by any test.
Similarly, `entry_partial_fill_ratio` and `exit_partial_fill_ratio` exist on `MockVenueScenario` but only one test (`test_cancel_entry_with_partial_fill`) uses partial fills at all — and it only checks `size > 0`, not the full capital-accrual chain.
**Severity: High**
### I6: No LIMIT order test through the full kernel path
The test suite has zero LIMIT orders. The Rust kernel has no reachable LIMIT-specific logic, so in practice all orders are MARKET. The generated live tests have `limit_does_not_fill` and `limit_immediate_fill` scenario placeholders, but:
- `limit_does_not_fill` uses `reference_price=0.0` (not a real LIMIT order)
- `limit_immediate_fill` uses `target_size=-0.001` (negative size → clamped to 0.0)
Neither scenario actually submits a LIMIT order with `order_type="LIMIT"` and a non-zero `limit_price`. The `_legacy_intent` bug (H7) would convert any LIMIT attempt to MARKET anyway.
The only LIMIT-related code is the Rust kernel's `if intent.order_type == "LIMIT"` branches (lib.rs:503, 1584) which are compile-time dead code — `KernelIntent` doesn't have an `order_type` field that serde would populate.
**Severity: High**
### I7: Three weak/vacuous assertions in `test_flaws.py`
**File:** `test_flaws.py`
1. **Line 512:** `assert order.metadata.get("asset") is not None or order.metadata.get("slot_id") is not None`. The mock venue always sets both, so this can never fail.
2. **Line 700:** `test_pnl_warning_on_unsettled_reentry` is titled as asserting that a warning is raised, but it only checks `r.accepted`. It never checks `diagnostic_code` or verifies that the warning was issued.
3. **Line 318:** `assert slot.active_entry_order is None or slot.active_entry_order.status == VenueOrderStatus.FILLED`. The `or` allows two different scenarios to pass, reducing diagnostic power.
**Severity: Low**
### I8: `slot.size = fill_size` has no guard against entry overfill
**File:** `_rust_kernel/src/lib.rs:798`
Already noted in I1 — entry fill sets `slot.size` directly to `fill_size`. Unlike exit fill which has `(slot.size - fill_size).max(0.0)`, there's no guard against entry overfill (venue fills more than the intended order size). For MARKET orders this is fine (one fill per order), but for LIMIT orders with multiple partial fills, the accumulated fill could exceed `initial_size`.
**Severity: Low** (only relevant with LIMIT + partial fills, which don't exist in the codebase)
### I9: No crash durability — slot state is pure in-memory until step 7 of process_intent
If the process crashes between steps 2-5, the slot state accumulated in the Rust kernel's in-memory `KernelCore` is **completely lost**. The Rust kernel has no WAL, no journal, no persistent store. On restart, `ExecutionKernel.__init__` creates a fresh `KernelCore` with all slots IDLE.
The crash between step 3 and step 5 is the most dangerous: the exchange has an open order/position, but the kernel has no record of it. On restart:
- The Rust kernel sees `slot.fsm_state = IDLE`
- The Zinc slot cache may or may not have the pre-crash state (depends on timing)
- No code on restart loads Zinc state back into the Rust kernel (I14)
- The exchange order lives until it fills (unexpected position) or is manually cancelled
**Concrete example:** `venue.submit()` sends a POST to BingX and the order is placed. The HTTP response arrives, and `on_venue_event(ORDER_ACK)` transitions the slot to `ENTRY_WORKING`. The process then crashes between returning from `on_venue_event` and `zinc_plane.write_slot()`. On restart the slot is IDLE, there is no active entry order, and `_last_settled_pnl` is reset, while the exchange still has a live working entry order. The next `process_intent(ENTER)` is not rejected with `SLOT_BUSY`: the fresh kernel doesn't know the order exists, sees the slot as IDLE, and allows a new ENTER. When the old order fills on the exchange, the result is a double position.
**Severity: Critical**
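A minimal write-ahead journal illustrates what closing this window could look like. Everything in the sketch is hypothetical (the `IntentJournal` class, the `SUBMITTING`/`DURABLE` phase names, the record shape); it only shows the idea of recording an intent durably before the venue call and detecting unfinished intents on restart.

```python
import json
import os


class IntentJournal:
    """Append-only JSON-lines journal (hypothetical sketch)."""

    def __init__(self, path):
        self.path = path

    def append(self, record):
        with open(self.path, "a", encoding="utf-8") as fh:
            fh.write(json.dumps(record) + "\n")
            fh.flush()
            os.fsync(fh.fileno())  # force the record to disk before returning

    def replay(self):
        if not os.path.exists(self.path):
            return []
        with open(self.path, "r", encoding="utf-8") as fh:
            return [json.loads(line) for line in fh if line.strip()]


def unresolved_intents(records):
    """Intents journaled as SUBMITTING that never reached DURABLE."""
    pending = {}
    for rec in records:
        if rec["phase"] == "SUBMITTING":
            pending[rec["intent_id"]] = rec
        elif rec["phase"] == "DURABLE":
            pending.pop(rec["intent_id"], None)
    return list(pending.values())
```

On startup, any intent returned by `unresolved_intents(journal.replay())` marks a slot that must be reconciled against the exchange before new ENTER intents are accepted for it.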
### I10: `seen_event_ids` lost on restart — events replayed after restart are double-processed
**File:** `_rust_kernel/src/lib.rs:672-683`
`seen_event_ids` is per-slot, per-`KernelCore` instance, and lives purely in process memory. On restart with a fresh `KernelCore`, every slot has `seen_event_ids = Vec::new()`. If events are replayed (from `pump_venue_events()` calling `venue.reconcile()` which re-fetches exchange state):
1. Original run: order fills → `FULL_FILL` with `event_id = "EV-00000042"` → processed, slot → `POSITION_OPEN`
4. `pump_venue_events()` fetches same exchange state → new `VenueEvent` objects with new event IDs (adapter's `_event_seq` resets)
5. Rust kernel sees these as novel events — processes them again
6. Position is double-booked, PnL double-settled
The `bingx_venue._event_seq` is an instance-level `itertools.count()` starting from 1. On adapter restart, it resets — so the new event IDs won't match the old ones anyway. Dedup is fundamentally impossible across restarts.
**Severity: Critical**
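Dedup across restarts needs event IDs that are derived from exchange-side facts rather than from a process-local counter. The sketch below is one hypothetical scheme: the inputs (`order_id`, event kind, cumulative filled quantity) are assumptions about what the adapter can read from the exchange, and persisting the seen IDs is still required separately.

```python
import hashlib


def stable_event_id(order_id, kind, cumulative_filled):
    """Derive an event ID that is identical before and after a restart.

    Hypothetical scheme: hash exchange-provided fields, so re-fetching the
    same exchange state reproduces the same ID.
    """
    key = f"{order_id}|{kind}|{cumulative_filled:.12f}"
    return "EV-" + hashlib.sha256(key.encode("utf-8")).hexdigest()[:16]
```

With IDs like this, a replayed `FULL_FILL` for the same order maps to the same `event_id`, so a persisted `seen_event_ids` set could reject it.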
### I11: No idempotency key (`newClientOrderId`) sent to BingX
BingX supports `newClientOrderId` for order idempotency — sending the same ID twice returns the original order status instead of creating a duplicate. The DITAv2 kernel passes `intent.intent_id` as `decision_id` to the legacy adapter, but there's no guarantee this maps to `newClientOrderId` in the BingX payload.
If the HTTP POST to `/trade/order` times out before the response is read:
1. The order was placed on the exchange
2. `_call_backend` raises a `BingxHttpError` (or similar network exception)
3. `process_intent()` propagates the exception; there is no retry
4. Next cycle: caller may retry with a new `intent_id`
5. Second POST creates a **second order** on the exchange — duplicate position
Without a client-order-id that persists across retries, the system can create duplicate orders on network timeouts. The exchange has no way to deduplicate.
**Severity: High**
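The retry behavior can be sketched against a fake exchange. `FakeExchange`, `TimeoutAfterPlacement`, and the `dita-` prefix are hypothetical; the sketch assumes the exchange deduplicates on the client order ID as described above, and that the ID is derived from the intent so it stays the same across retries.

```python
class TimeoutAfterPlacement(Exception):
    """The order reached the exchange but the response was lost."""


class FakeExchange:
    """Hypothetical exchange that deduplicates on client order ID."""

    def __init__(self, timeouts=1):
        self.orders = {}
        self._timeouts = timeouts

    def place_order(self, client_order_id, size):
        # Idempotent: a repeated ID returns the original order.
        self.orders.setdefault(client_order_id, size)
        if self._timeouts > 0:
            self._timeouts -= 1
            raise TimeoutAfterPlacement(client_order_id)
        return {"client_order_id": client_order_id, "size": self.orders[client_order_id]}


def submit_with_retry(exchange, intent_id, size, attempts=3):
    """Retry with the SAME client order ID so a retry cannot duplicate."""
    client_order_id = f"dita-{intent_id}"  # assumed derivation from the intent
    for _ in range(attempts):
        try:
            return exchange.place_order(client_order_id, size)
        except TimeoutAfterPlacement:
            continue
    raise RuntimeError(f"order {client_order_id} unconfirmed after {attempts} attempts")
```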
### I12: No graceful degradation for ANY subsystem
Every subsystem failure mode examined:
| Subsystem | Failure | Current behavior |
|-----------|---------|-----------------|
| Zinc SHM init | Corrupted region, OOM | Silent fallback to InMemoryZincPlane (no operator signal) |
| Memory pressure | OOM | Process killed by the OS kernel; no signal handlers are installed |
**No subsystem has a graceful degradation path.** No circuit breaker, no retry queue, no fallback to log-only mode, no offline/cached trading mode. Every failure (except the two init-time silent fallbacks) crashes the current kernel operation.
**Severity: High**
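As an illustration of one of the missing mechanisms, here is a minimal circuit breaker. It is a generic sketch, not tied to any existing class in the codebase; the thresholds and the injectable `clock` are assumptions made to keep it testable.

```python
import time


class CircuitOpen(Exception):
    """Raised while the breaker is refusing calls."""


class CircuitBreaker:
    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self._max_failures = max_failures
        self._reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._opened_at = None

    def call(self, fn, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self._reset_after:
                raise CircuitOpen("subsystem disabled after repeated failures")
            # Cool-down elapsed: allow a trial call with a fresh count.
            self._opened_at = None
            self._failures = 0
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self._failures >= self._max_failures:
                self._opened_at = self._clock()
            raise
        self._failures = 0
        return result
```

Wrapping a venue or Zinc call in `breaker.call(...)` lets the caller catch `CircuitOpen` and switch to a degraded mode (for example, log-only) instead of failing every cycle.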
### I13: Stray venue event can reactivate a CLOSED slot — no guard
**File:** `_rust_kernel/src/lib.rs:625+`
The `on_venue_event` function has no guard for closed slots:
A CLOSED slot should be a terminal state that rejects all events. Currently only CANCEL_ACK is harmless on a closed slot; the rest can revive a dead position.
**Severity: High**
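The intended guard is simple to state. The sketch below expresses it in Python for illustration only (the kernel itself is Rust); the state and event names are taken from this review, while `route_event` and its return labels are hypothetical.

```python
TERMINAL_STATES = frozenset({"CLOSED"})
HARMLESS_ON_CLOSED = frozenset({"CANCEL_ACK"})


def route_event(fsm_state, event_kind):
    """Decide what to do with a venue event before any state mutation.

    Returns "APPLY" for live slots, "IGNORE" for harmless events on a
    closed slot, and "REJECT" for anything else arriving on a closed slot.
    """
    if fsm_state in TERMINAL_STATES:
        return "IGNORE" if event_kind in HARMLESS_ON_CLOSED else "REJECT"
    return "APPLY"
```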
### I14: No `reconcile_from_slots` call on startup — Zinc state never loaded into Rust kernel
1. `RealZincPlane.__init__` reads state from Zinc shared memory into `_slot_cache`
2. `ExecutionKernel.__init__` creates fresh `KernelCore`: all slots IDLE
3. `KernelStateView(self)` reads from the fresh kernel
4. `account.observe_slots([self._get_slot(i) for i in range(max_slots)])`: all slots IDLE
Steps 3 and 4 read from the Rust kernel, NOT from Zinc. The Zinc `_slot_cache` populated in step 1 is **never loaded into the Rust kernel**. The `reconcile_on_restart` flag exists in `KernelControlSnapshot` (default `True`) but is never checked anywhere in `ExecutionKernel.__init__` or the launcher.
The system always starts with a blank state even when durable shared memory state exists.
When the exchange rejects a cancel (typically because the order was already filled or no longer exists), the slot stays in `EXIT_WORKING` with `active_exit_order` still attached. Every subsequent CANCEL attempt hits the same path — the exchange returns "order not found," the kernel sees `CANCEL_REJECT`, and the slot is stuck forever.
If the order was already filled (CANCEL_REJECT means "can't cancel, no longer open"), the slot should check the actual position size and potentially transition to `POSITION_OPEN` or `CLOSED` depending on fill status.
**Severity: Medium**
### I16: Zinc shared memory — world-readable/writable by same-machine processes
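Region names are predictable (prefix defaults to `"dita_v2"`). The `SharedRegion` uses POSIX `shm_open`, and the effective permissions depend on the mode passed and the umask (typically `0644` or `0600`). Any process on the same machine that those permissions admit (at minimum, every process running as the same user) can: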
- **Read**: Open the region → `as_buffer()` → `_decode_packet()` → read all slot state, PnL, open orders, control settings
- **Write**: Open the region → forge a packet (`struct.pack("!QQ", seq, len) + json_bytes`) → overwrite slot state, inject fake intents, modify control plane
No access control, no encryption, no integrity check (HMAC/signature) on the wire format. The sequence number is the only ordering mechanism, and it's trivially predictable.
**Severity: High**
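An integrity tag on each packet would at least make forged writes detectable. The sketch below assumes the `"!QQ"` header plus JSON layout mentioned above and appends an HMAC-SHA256 tag; `seal_packet`, `open_packet`, and the key handling are hypothetical, and key distribution is out of scope here.

```python
import hashlib
import hmac
import json
import struct

HEADER = struct.Struct("!QQ")  # assumed layout: sequence number, payload length
TAG_LEN = 32                   # SHA-256 digest size


def seal_packet(key, seq, payload):
    """Serialize payload and append an HMAC over header + body."""
    body = json.dumps(payload, sort_keys=True).encode("utf-8")
    header = HEADER.pack(seq, len(body))
    tag = hmac.new(key, header + body, hashlib.sha256).digest()
    return header + body + tag


def open_packet(key, packet):
    """Return (seq, payload) only if the tag verifies; otherwise None."""
    if len(packet) < HEADER.size + TAG_LEN:
        return None
    seq, size = HEADER.unpack_from(packet, 0)
    end = HEADER.size + size
    if end + TAG_LEN > len(packet):
        return None
    tag = packet[end:end + TAG_LEN]
    expected = hmac.new(key, packet[:end], hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None
    return seq, json.loads(packet[HEADER.size:end].decode("utf-8"))
```

This covers integrity only. Confidentiality would still require restrictive region permissions or encryption.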
### I17: `KernelSlotView` exposes full slot state via unrestricted `__getattr__`/`__setattr__`
**File:** `rust_backend.py:411-460`
```python
class KernelSlotView:
    def __getattr__(self, name):
        slot = self._snapshot()
        return getattr(slot, name)  # read ANY field

    def __setattr__(self, name, value):
        setattr(slot, name, value)
        self._kernel._set_slot(slot)  # write ANY field — bypasses FSM
```
- Write all slot fields: `slot_view.realized_pnl = -9999999` directly manipulates PnL figures flowing into capital settlement
The `_set_slot` call writes through to the Rust kernel without any FSM validation. The entire kernel state is exposed through mutable Python objects with zero access control.
**Severity: High**
### I18: `sys.path.insert(0, ...)` at import time in three production files
```python
# real_control_plane.py, real_zinc_plane.py — at MODULE LEVEL:
sys.path.insert(0, str(_ZINC_ADAPTER_PATH))
# test_flaws.py, _build_pink_bodies.py, _gen_test.py — at MODULE LEVEL:
sys.path.insert(0, '/mnt/dolphinng5_predict')
```
`sys.path.insert(0, ...)` gives the injected path highest import priority. An attacker with filesystem write access to the inserted path can create a malicious module that shadows a legitimate import (e.g., `zinc.py`, `utils.py`, `typing.py`). When any subsequent `from X import Y` runs, the attacker's module loads with the full privileges of the kernel process.
The production files use a relative path resolution (`Path(__file__).resolve().parents[3] / "zinc" / "adapters" / "python"`), while the test files use a hardcoded absolute path (`'/mnt/dolphinng5_predict'`). Both patterns are dangerous.
**Severity: High**
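Where a single module must be loaded from a known location, it can be imported by explicit file path without touching `sys.path`. The sketch below uses the standard `importlib.util` API; `load_module_from_path` is a hypothetical helper, and a packaged install of the adapter would remove the need for it entirely.

```python
import importlib.util
from pathlib import Path


def load_module_from_path(name, file_path):
    """Load exactly one file as a module, leaving sys.path untouched.

    Because the import search path is not modified, a file dropped next to
    the target cannot shadow unrelated imports such as `typing` or `utils`.
    """
    file_path = Path(file_path).resolve()
    spec = importlib.util.spec_from_file_location(name, file_path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load {name!r} from {file_path}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module
```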
### I19: `pump_venue_events` re-fetches exchange state that can produce phantom position events
**File:** `bingx_venue.py:395-415`
`reconcile()` calls `_backend_snapshot()` which fetches current positions and open orders from the exchange. The `_events_from_snapshot` method diffs the current snapshot against the last-known snapshot to produce events:
```python
def _events_from_snapshot(self, before, after):
    for symbol, current_pos in after.open_positions.items():
        prev_pos = before.open_positions.get(symbol)
        if current_pos and (not prev_pos or abs(prev_pos.position_amount) < 1e-12):
            # This looks like a new position — emit event
```
If `before` is stale (from `_backend_snapshot` timeout), the diff can produce spurious events. A position that existed before the crash is absent from the stale snapshot → the diff sees it as "new" → emits an entry fill event → Rust kernel processes it as a fresh enter → double position. This compounds with I10 (seen_event_ids lost on restart).
**Severity: High**
### I20: `exit_leg_ratios` no guard against empty list — `next_exit_ratio` returns 1.0
**File:** `contracts.py:196-198`
```python
def next_exit_ratio(self) -> float:
    if self.active_leg_index < len(self.exit_leg_ratios):
        ...
```
If `exit_leg_ratios` is empty (the dataclass default `(1.0,)` normally prevents this, but only for slots constructed through the dataclass without an explicit value), `next_exit_ratio()` returns `1.0`. This is the same as "exit everything": the `consume_exit_leg` then advances `active_leg_index` to `min(1, 1) = 1`, and `all_legs_done = active_leg_index >= exit_leg_ratios.len()` → `1 >= 0 = true` → slot closes. The empty-ratios edge case is silently handled with `unwrap_or(1.0)`, which happens to be correct but is undocumented.
**Severity: Informational**
### I21: No test for rate-limited events — `RATE_LIMITED` kernel path is dead code
**File:** `_rust_kernel/src/lib.rs` (event handler), `mock_venue.py` (`MockVenueScenario` has no rate-limit flag)
The Rust kernel has a handler for `KernelEventKind::RATE_LIMITED` (lib.rs lines ~1480-1500). The event flows through the Python bridge's `process_intent()` rate-limit detection (rust_backend.py:585-593). But `MockVenueScenario` has no flag to emit rate-limited events. The only path to trigger `RATE_LIMITED` is from the real BingX adapter — which requires live exchange connectivity.
The entire RATE_LIMITED code path — in both Python and Rust — is untested in CI. Any bug in this path only surfaces in production under rate-limit conditions.
**Severity: Medium**
### I22: Thread pool for `_run` — `max_workers=3` shared across ALL adapter instances
Class-level singleton — all `BingxVenueAdapter` instances share the same 3-thread pool. With the runtime's `step()` calling `submit()` (1 thread) + `_backend_snapshot` (potentially another thread for open orders) + `cancel()` (1 thread in parallel), all 3 threads are consumed. A fourth concurrent call blocks the calling thread at `.result()` indefinitely — freezing the entire event loop.
The pool is never shut down. If a `BingxVenueAdapter` is destroyed, the threads remain running (zombie workers). No `close()`/`disconnect()` path shuts down the executor.