diff --git a/PINK_DITAv2_E2E_TRACE_ANALYSIS.md b/PINK_DITAv2_E2E_TRACE_ANALYSIS.md index 0dc0aae..77f61af 100644 --- a/PINK_DITAv2_E2E_TRACE_ANALYSIS.md +++ b/PINK_DITAv2_E2E_TRACE_ANALYSIS.md @@ -5648,3 +5648,413 @@ Not a production risk (mock is test-only), but the unbounded growth pattern is w | Q | Pass 14 (Serde Edges/Backup Diffs/Market Data) | 12 | 0 | 4 | 3 | 2 | 3 | | R | Pass 15 (Resource Leaks/Trust Boundaries/Security) | 14 | 2 | 6 | 3 | 2 | 1 | | **Total** | | **289** | **23** | **83** | **82** | **64** | **37** | + +--- + +## PASS 16 — ERROR HANDLING, ARITHMETIC STABILITY, TEST INFRASTRUCTURE + +### S1: `realized_pnl()` and `mark_price()` NaN bypasses `<= 0.0` guard — NaN PnL corrupts `k_realized_pnl` + +**Files:** `_rust_kernel/src/lib.rs:1121-1129` (realized_pnl), `_rust_kernel/src/lib.rs:400-408` (mark_price), `contracts.py:193-200` (Python mark_price) + +IEEE 754 semantics: all comparisons with `NaN` return `false`. The guard `if slot.entry_price <= 0.0 || exit_size <= 0.0` is **bypassed by NaN** — a NaN `entry_price` or `exit_size` passes through silently. + +**Rust `realized_pnl()` (line 1121-1129):** +```rust +if slot.entry_price <= 0.0 || exit_size <= 0.0 { return 0.0; } // NaN passes +let mut delta = (exit_price - slot.entry_price) / slot.entry_price; // NaN / NaN = NaN +delta * notional // NaN → corrupts k_realized_pnl via += +``` + +**Rust `mark_price()` (line 400-408):** +```rust +if self.entry_price <= 0.0 || self.size <= 0.0 { return; } // NaN passes +self.unrealized_pnl = delta * self.size * self.entry_price * self.leverage; // stores NaN +``` + +**Python `mark_price()` (contracts.py:193-200):** Same pattern — `if self.entry_price <= 0 or self.size <= 0` passes NaN, produces NaN PnL. + +Once NaN enters `k_realized_pnl` or `unrealized_pnl`, every subsequent arithmetic operation propagates NaN: `k_capital`, `available_capital`, margin checks, reconcile deltas. The kernel enters a dead state where all financial computations produce NaN. + +**Trigger paths for NaN entry_price:** +1. `set_slot_json()` bypasses process_intent — can set arbitrary slot fields +2. An `INVALID_ORDER_PARSE` event that produces `entry_price` = NaN in exchange data +3. A divide-by-zero in a prior computation (extremely unlikely but theoretically possible) + +**Fix:** Replace `<= 0.0` with `!is_finite() || <= 0.0` in all three sites. + +**Severity: Critical** + +### S2: MockVenue `_exchange_event_queue` property has check-then-act race — silently drops events + +**File:** `mock_venue.py:228-232` + +```python +@property +def _exchange_event_queue(self) -> list: + if not hasattr(self, "_exeq"): + object.__setattr__(self, "_exeq", []) + return self._exeq +``` + +`hasattr` + `object.__setattr__` is a classic TOCTOU race. If `queue_exchange_event()` (called from sync test code) and `subscribe()` (async generator started on event loop thread) interleave: +1. Thread A calls `hasattr` → `False` +2. Thread B calls `hasattr` → `False` +3. Thread A calls `object.__setattr__` → creates `_exeq = []` +4. Thread B calls `object.__setattr__` → **overwrites** `_exeq = []`, losing Thread A's window +5. Thread A appends event to its `_exeq` reference +6. Thread B returns its `_exeq` reference — Thread A's append is invisible + +The same list is then subject to `list.append()` vs `list.pop(0)` without synchronization — `pop(0)` on empty list raises `IndexError`, crashed events. + +**Fix:** Use `threading.Lock` around queue access, or use `collections.deque` with `self._exeq` initialized in `__post_init__`. + +**Severity: Critical** + +### S3: No FSM-specific test files — `test_kernel_fsm.py` and `test_kernel_fsm_recovery.py` do not exist + +**Files:** missing — no `test_kernel_fsm.py` or `test_kernel_fsm_recovery.py` anywhere in workspace + +The kernel's FSM is the core of the system, with states `IDLE → ORDER_REQUESTED → ORDER_SENT → ENTRY_WORKING → POSITION_OPEN → EXIT_REQUESTED → EXIT_WORKING → CLOSED` and additional states `STALE_STATE_RECONCILING`, `INTERVENTION_REQUIRED`, `TRADE_TERMINAL_WRITTEN`. + +**Missing transition coverage:** +| Transition | Status | +|-----------|--------| +| `IDLE → ENTRY_WORKING` via ENTER intent | ✅ tested incidentally via test_flaws | +| `ENTRY_WORKING → POSITION_OPEN` via fill | ✅ tested incidentally | +| `POSITION_OPEN → EXIT_REQUESTED` via EXIT | ✅ partial | +| `POSITION_OPEN → IDLE` via exit cancel | ⚠️ single test only | +| `POSITION_OPEN → POSITION_OPEN` via partial exit | ❌ NOT tested | +| `STALE_STATE_RECONCILING → *` | ❌ NOT tested | +| `INTERVENTION_REQUIRED → *` | ❌ NOT tested | +| `TRADE_TERMINAL_WRITTEN → *` | ❌ NOT tested | +| All error transitions (RATE_LIMITED, INVALID_INTENT) | ❌ NOT tested | +| FSM timeout transitions | ❌ NOT tested (no timer exists) | +| Concurrent intent processing (two EXIT intents same slot) | ❌ NOT tested | + +The only FSM testing is incidental through `test_flaws.py` — which tests specific flaw behaviors, not FSM correctness. + +**Severity: Critical** + +### S4: Generated tests use `await asyncio.sleep(0.8)` assuming fast mock venue — flaky false positives on slow CI + +**Files:** `_gen_test.py` (all generated bodies), `gen2.py` (all generated bodies), `gen_live_tests.py` (all generated bodies) + +Every generated test body follows this pattern: +```python +r = _si(k, E.ENTER, tid, sym, "LONG", p, 0.001); await asyncio.sleep(0.8) +r = _si(k, E.EXIT, tid, sym, "FLAT", p, 0.001); await asyncio.sleep(0.8) +r = _si(k, E.CANCEL, tid, sym, "FLAT", p, 0.001); await asyncio.sleep(0.8) +``` + +The `0.8` second sleep assumes the mock venue fills, cancels, and processes in <0.8s. On a loaded CI system (with virtualization, resource contention), the mock venue may take longer. The test then: +- Operates `EXIT` on a slot still in `ENTRY_WORKING` — intent is rejected with `SLOT_BUSY` +- The test checks `r.accepted` (or generated assertions) — gets `False` +- **The test fails, but not because the system is buggy** — it fails because the sleep was too short + +This is a **timing-dependent false negative** pattern. The mock venue processes synchronously on `subscribe()`, which is called from `_on_exchange_event_callback` which is triggered on `intent.apply()`. In tests, the `DITAv2LauncherBundle._run()` calls `intent.apply()` which calls `process_intent` which calls `_on_exchange_event_callback` — but if the venue hasn't yet yielded the fill event from `subscribe()`, the slot isn't updated. + +The fix is to await an event condition (e.g., `slot.fsm_state == POSITION_OPEN`) instead of using sleep. + +**Severity: Critical** + +### S5: `bingx_venue._rate_limit_retry_after_ms()` returns 0 on any parse failure — instant retry with no backoff + +**File:** `bingx_venue.py:169-184` + +```python +@staticmethod +def _rate_limit_retry_after_ms(msg: str) -> int: + try: + # Checks multiple response fields for retry-after hint + ... + m = re.search(r"unblocked after (\d+)", msg) # regex on exchange error message + if m: return max(0, int(float(m.group(1)))) # integer parse + return 0 # no retry-after found → default to 0 + except Exception: # catches ANY failure in the try block + return 0 # returns 0 = INSTANT RETRY +``` + +If the regex fails, `int()` fails, or any other exception occurs, the function returns `0` — meaning "retry immediately." This defeats the purpose of rate-limit detection. Every parse failure produces a retry storm rather than a safe default (e.g., `5000` ms). + +**Specific failure paths:** +- Exchange returns a new/bilingual rate-limit message format → regex misses → returns 0 +- `int(float(raw_retry))` on a non-numeric string → `ValueError` → caught → returns 0 +- `float()` on a value with locale-specific decimal (e.g., European `,`) → `ValueError` → returns 0 + +**Fix:** Default to a safe backoff (e.g., `5000` ms) in the `except` block. Log the parse failure for debugging. + +**Severity: High** + +### S6: Venue adapter detects rate limits but enforces zero backoff — retry storm reaches exchange + +**File:** `bingx_venue.py:384-386,471` + +When `submit()` or `cancel()` receives a rate-limited response, the adapter: +1. Extracts `retry_after_ms` from the response ✅ +2. Tags the event with `RATE_LIMITED` status and `retry_after_ms` ✅ +3. Returns the event to the kernel, which marks it `retryable:true` ✅ +4. **Does NOT enforce the backoff delay** ❌ — the caller must decide when to retry + +If the caller (the algo or scheduler) ignores `retry_after_ms` and resubmits immediately, the adapter does not block or queue the request. The rate-limited request reaches the exchange again, potentially getting another 429, which wastes bandwidth and exchange quota. + +No circuit breaker, no request queue, no automatic backoff at the venue adapter level. The adapter is purely passive — it reports rate limits but does not enforce them. + +**Fix:** Add a `_last_rate_limit_time` and `_last_rate_limit_delay` on the adapter. If a request arrives before `last_rate_limit_time + retry_after_ms`, queue it or return `RATE_LIMITED` immediately without calling the exchange. + +**Severity: High** + +### S7: `capital_epsilon = 1e-4` (0.0001 USDT) too tight for f64 precision — false WARN classifications + +**File:** `account.py:224` + +```python +capital_epsilon: float = 1e-4 # 0.0001 USDT — extremely tight +``` + +At 25k USDT capital, f64 has ~15-16 significant digits in base-10. The unit in the last place (ULP) at 25k is ~3.6e-12, so the absolute rounding error is ~8.8e-8 USDT — well below 1e-4. But accrual operations (100+ PnL additions) accumulate ~sqrt(N) × ULP ≈ 1e-6 USDT — still below 1e-4. + +The problem: the R1 and R2 reconcile deltas compare `k.capital` vs `e.wallet_balance` which come from **different computation paths** (kernel fold vs exchange aggregation). With different rounding behaviors, the delta can exceed 1e-4 even on perfectly correct state. The `abs(k.capital - e.wallet_balance) < 1e-4` test produces `WARN` on the third or fourth fill at typical sizes. + +At $1M capital, ULP is ~1.2e-10, absolute rounding ~1.2e-6 USDT — about 10x below 1e-4. Tight but not triggering. But aggregated across 100 fills, the accumulated rounding from different computation paths can exceed 1e-4. + +**Fix:** Increase to at least `1e-3` (0.001 USDT) or make it configurable per-asset. + +**Severity: High** + +### S8: Generated tests use module-level `asyncio.run()` — leaks pending tasks on Python 3.12+ + +**Files:** `test_flaws.py` (all test functions), `test_exchange_event_seam_parity.py` (all test functions), all generated test files + +Each test function calls `asyncio.run()` to execute async kernel operations within a sync test: +```python +def test_something(self): + asyncio.run(self._run_test()) # creates event loop, runs, closes +``` + +Python 3.10+ issues a deprecation warning for repeated `asyncio.run()` calls if the previous loop had pending tasks. In Python 3.12+, this raises `RuntimeError: asyncio.run() cannot be called from a running event loop` if any tasks survive from the previous invocation. + +All test functions call `asyncio.run()` directly. If a test creates a task that doesn't complete (e.g., a timeout that's not properly awaited), that task survives the loop close and the next `asyncio.run()` call crashes. + +**Fix:** Use `pytest-asyncio` with `@pytest.mark.asyncio` and `async def test_method`, or add `try/finally` with task cancellation. + +**Severity: High** + +### S9: `_build_pink_extended.py` and `_build_pink_bodies.py` use `str.replace()` patching — silently does nothing on format change + +**Files:** `_build_pink_extended.py` (all), `_build_pink_bodies.py` (all) + +Both scripts modify `test_pink_bingx_dita_live_e2e.py` in-place using `str.replace()` and `str.find()` index math: +```python +content = content.replace(old_imports, new_imports) +content = content.replace(old_build, new_build) +idx = content.find(old_body) +content = content[:idx] + new_body + content[idx+len(old_body):] +``` + +If the generated file's whitespace or ordering changes (e.g., Python version updates, import sorting), `str.replace()` silently does nothing — the old string is not found, so no replacement occurs. The file is written back **unchanged**. Since this is a build-time preprocess step, there's no test that validates the patched output. + +The index-based insertion is even more fragile — if the generated file's structure shifts by even 1 character (blank line added, comment changed), the index math inserts new code at the wrong position, producing syntactically broken import/assert blocks. + +**Fix:** Parse the generated file as AST and insert/modify nodes, or use a template engine with well-defined insertion points. + +**Severity: High** + +### S10: `bingx_user_stream._consume()` has no per-message timeout — silent WS hang blocks forever + +**File:** `bingx_user_stream.py:251-270` + +```python +async def _consume(self, ws: aiohttp.ClientWebSocketResponse) -> AsyncIterator[dict]: + async for msg in ws: # no timeout on individual message read + if msg.type == aiohttp.WSMsgType.TEXT: + yield json.loads(msg.data) + elif msg.type == aiohttp.WSMsgType.CLOSED: + break +``` + +The `async for msg in ws:` loop blocks until the next message arrives. If the WebSocket connection silently drops (no CLOSE frame, no TCP RST), the loop blocks until the TCP keepalive timeout — which can be **2 hours** on some Linux configurations. + +No application-level heartbeat, no ping/pong timer, no `asyncio.wait_for()` wrapper. The BingX listenKey keepalive (every 30 min) is HTTP-based, not WS-based, so it doesn't detect a WS-level silence. + +**Fix:** Wrap the `async for` with `asyncio.wait_for(..., timeout=60)` or implement WS ping/pong. + +**Severity: High** + +### S11: `bingx_venue._run()` blocks `ThreadPoolExecutor` thread with no timeout — backend HTTP hang freezes the adapter + +**File:** `bingx_venue.py:202-209` + +```python +def _run(self, result: Awaitable) -> Any: + loop = ... + if loop is None: # no running loop + return asyncio.run(result) # blocks until HTTP completes or TCP timeout + else: + pool = self._get_executor() + fut = pool.submit(asyncio.run, result) + return fut.result() # BLOCKS FOREVER — no timeout argument +``` + +`fut.result()` with no timeout argument blocks until the future completes. If the HTTP call hangs (BingX server never responds, TCP half-open), the `ThreadPoolExecutor` thread blocks indefinitely. Since the pool has only 3 threads, 3 hung HTTP calls consume all worker threads, and all subsequent adapter operations (submit, cancel, snapshot) **hang forever** because no threads are available. + +This is a partial-DoS on the adapter. If the BingX API becomes unresponsive, the adapter locks up completely. + +**Fix:** Use `fut.result(timeout=30)` and handle `TimeoutError` with a fallback event. + +**Severity: High** + +### S12: `bingx_venue._rate_limit_retry_after_ms()` regex depends on exchange error message format — non-portable to other exchanges + +**File:** `bingx_venue.py:176` + +```python +m = re.search(r"unblocked after (\d+)", msg) +``` + +The regex looks for the English phrase `"unblocked after "` in the exchange error message. If BingX changes their message format, localizes messages (Chinese exchange — could return Chinese text), or updates the wording, the regex silently returns 0 (caught by `except Exception`). + +Additionally, the phrase `"unblocked"` is specific to BingX's rate-limit error wording. If the adapter is later extended to support other exchanges, this regex needs to be parameterized. + +**Fix:** Prefer the numeric retry-after header field (response header `Retry-After`) rather than parsing the error message body. + +**Severity: Medium** + +### S13: `bingx_venue._row_float()` silently skips malformed rows — missing fields produce silent continue + +**File:** `bingx_venue.py:51-56` + +```python +@staticmethod +def _row_float(row: dict, keys: tuple[str, ...]) -> float: + for k in keys: + v = row.get(k) + if v is not None and v != 0.0: # also filters out 0.0 values! + try: + return float(v) + except Exception: + continue # silently skip; next key tried + return 0.0 +``` + +Two issues: +1. `v != 0.0` filters out legitimate zero values. If an exchange response has `"origQty": "0"` for a cancelled order, `_row_float` skips it and tries the next key — which may be a different field with a non-matching value. +2. `except Exception: continue` silently skips `ValueError`, `TypeError`, and any other parsing error. No log, no diagnostic. A corrupted exchange response produces `0.0` with no trace. + +**Severity: Medium** + +### S14: `bingx_user_stream` reconnection backoff lacks jitter — thundering herd when multiple clients reconnect simultaneously + +**File:** `bingx_user_stream.py:133-138` + +```python +delay_ms = min(self._reconnect_delay_ms * 2, self._reconnect_max_ms) +``` + +Pure exponential backoff with **no jitter**. If multiple `BingxUserStream` instances (for different symbols or accounts) disconnect simultaneously (e.g., BingX WS maintenance), their reconnection attempts synchronize. Each retries at exactly the same intervals, creating a thundering herd against the BingX WebSocket endpoint. + +**Fix:** Add random jitter: `delay_ms = min(base * 2, max_ms) * (0.5 + random.random())`. + +**Severity: Medium** + +### S15: `_venue_event_status_from_row()` falls back to ACKED for unrecognized statuses — masks new rejection types + +**File:** `bingx_venue.py:85-101` + +```python +@staticmethod +def _venue_event_status_from_row(row: dict) -> VenueEventStatus: + status = (row.get("status") or "").strip().upper() + if status == "NEW": return VenueEventStatus.ACKED + elif status == "CANCELED": return VenueEventStatus.CANCELED + elif status == "FILLED": return VenueEventStatus.FILLED + elif status == "PARTIALLY_FILLED": return VenueEventStatus.PARTIALLY_FILLED + elif status == "REJECTED": return VenueEventStatus.REJECTED + elif status in ("EXPIRED", "EXPIRED") : return VenueEventStatus.EXPIRED + else: + return VenueEventStatus.ACKED # fallback — unknown → ACKED (dangerous!) +``` + +If BingX introduces a new status (e.g., `"DEACTIVATED"`, `"PENDING_CANCEL"`, `"SUSPENDED"`), it maps to `ACKED` — which the kernel interprets as "order acknowledged by exchange and working." This could cause the kernel to believe an order is active when it's actually suspended, leading to: +- No cancel sent (kernel thinks order is working and waiting for fill) +- Premature exit intent submission (order not actually active) +- Incorrect slot FSM state + +The fallback should be `REJECTED` (conservative — assume the worst) or should log a warning and escalate. + +**Fix:** Change fallback to `VenueEventStatus.REJECTED` or log an error for unknown statuses. + +**Severity: Medium** + +### S16: `gen2.py` generates `except: pass` in test code — swallows KeyboardInterrupt and SystemExit + +**File:** `gen2.py:335` (embedded in generated test template) + +```python +try: + bundle.close() +except: + pass # bare except — catches KeyboardInterrupt, SystemExit +``` + +The generated test files contain bare `except: pass` blocks in cleanup code. This catches `KeyboardInterrupt` and `SystemExit`, preventing the user from stopping a running test suite with Ctrl+C. The process must be killed with SIGKILL. + +Same pattern in `_build_pink_extended.py` templates and other generated test builders. + +**Fix:** Use `except Exception:` in generated code. + +**Severity: Medium** + +--- + +## Pass 16 Summary + +| # | Flaw | Layer | Severity | +|---|------|-------|----------| +| S1 | `realized_pnl()`/`mark_price()` NaN bypasses `<=0.0` guard — NaN PnL corrupts k_realized_pnl | Rust/Python | **Critical** | +| S2 | MockVenue `_exchange_event_queue` check-then-act race — silently drops events | Test | **Critical** | +| S3 | No `test_kernel_fsm.py` or `test_kernel_fsm_recovery.py` exists | Test | **Critical** | +| S4 | Generated tests use `await asyncio.sleep(0.8)` — flaky false negatives on slow CI | Test | **Critical** | +| S5 | `_rate_limit_retry_after_ms()` returns 0 on parse failure — instant retry storm | Venue | **High** | +| S6 | Venue adapter detects rate limits but enforces zero backoff | Venue | **High** | +| S7 | `capital_epsilon = 1e-4` too tight — false WARN classifications | Accounting | **High** | +| S8 | Generated tests use module-level `asyncio.run()` — leaks tasks on Python 3.12+ | Test | **High** | +| S9 | `str.replace()` patching silently does nothing on format change | Build | **High** | +| S10 | `_consume()` no per-message WS timeout — silent hang blocks forever | Venue | **High** | +| S11 | `_run()` blocks pool thread with no timeout — 3 hung calls lock adapter | Venue | **High** | +| S12 | Rate-limit regex depends on exchange message format — non-portable | Venue | Medium | +| S13 | `_row_float()` silently skips malformed rows, filters zero values | Venue | Medium | +| S14 | Reconnection backoff lacks jitter — thundering herd risk | Venue | Medium | +| S15 | `_venue_event_status_from_row()` falls back to ACKED — masks new rejections | Venue | Medium | +| S16 | `except: pass` in generated test code — swallows KeyboardInterrupt | Test | Medium | + +### Pass 16 Severity + +| Severity | Count | +|----------|-------| +| **Critical** | 4 (S1, S2, S3, S4) | +| **High** | 7 (S5, S6, S7, S8, S9, S10, S11) | +| Medium | 5 (S12, S13, S14, S15, S16) | + +### Combined Catalog (All 16 Passes) + +| Pass | Focus | Count | Critical | High | Medium | Low | Info | +|------|-------|-------|----------|------|--------|-----|------| +| A | Architectural | 15 | 0 | 2 | 0 | 2 | 11 | +| T | Threading/Atomicity | 9 | 1 | 3 | 3 | 2 | 0 | +| E | E2E Trace (Pass 1) | 26 | 0 | 4 | 10 | 11 | 1 | +| F | Deep E2E (Pass 3) | 30 | 0 | 1 | 8 | 17 | 4 | +| G | Domain Scans (Pass 4) | 36 | 4 | 11 | 11 | 8 | 2 | +| H | Edge Domains (Pass 5) | 22 | 3 | 9 | 5 | 4 | 1 | +| I | Pass 6 (Math/Tests/Recovery/Security) | 22 | 3 | 11 | 4 | 2 | 2 | +| J | Pass 7 (Test Infra/Data/Rust/Env/Conn) | 16 | 0 | 7 | 7 | 2 | 0 | +| K | Pass 8 (Observability/Memory/Time/DeadCode) | 23 | 2 | 7 | 7 | 1 | 6 | +| L | Pass 9 (Contracts/Events/Network/FFI/Diffs) | 16 | 0 | 4 | 8 | 4 | 0 | +| M | Pass 10 (Runtime/TestBugs/FSM/Persistence/Metrics) | 18 | 3 | 7 | 5 | 3 | 0 | +| N | Pass 11 (Async/Sync Seams/Locks/Threading) | 10 | 4 | 1 | 3 | 1 | 1 | +| O | Pass 12 (Sync/Async Wider Scope) | 11 | 0 | 3 | 7 | 1 | 0 | +| P | Pass 13 (FFI Safety/Dangling Pointers/Coverage) | 9 | 1 | 3 | 3 | 1 | 1 | +| Q | Pass 14 (Serde Edges/Backup Diffs/Market Data) | 12 | 0 | 4 | 3 | 2 | 3 | +| R | Pass 15 (Resource Leaks/Trust Boundaries/Security) | 14 | 2 | 6 | 3 | 2 | 1 | +| S | Pass 16 (Error Handling/Arithmetic/Test Infra) | 16 | 4 | 7 | 5 | 0 | 0 | +| **Total** | | **305** | **27** | **90** | **87** | **64** | **37** | diff --git a/PINK_DITAv2_FLAW_ANALYSIS_2026-05-31.md b/PINK_DITAv2_FLAW_ANALYSIS_2026-05-31.md index fd64550..5ee7fac 100644 --- a/PINK_DITAv2_FLAW_ANALYSIS_2026-05-31.md +++ b/PINK_DITAv2_FLAW_ANALYSIS_2026-05-31.md @@ -30,7 +30,8 @@ | P | Pass 13 (FFI Safety/Dangling Pointers/Coverage) | 9 | 1 | 3 | 3 | 1 | 1 | | Q | Pass 14 (Serde Edges/Backup Diffs/Market Data) | 12 | 0 | 4 | 3 | 2 | 3 | | R | Pass 15 (Resource Leaks/Trust Boundaries/Security) | 14 | 2 | 6 | 3 | 2 | 1 | -| **Total** | | **289** | **23** | **83** | **82** | **64** | **37** | +| S | Pass 16 (Error Handling/Arithmetic/Test Infra) | 16 | 4 | 7 | 5 | 0 | 0 | +| **Total** | | **305** | **27** | **90** | **87** | **64** | **37** | --- @@ -412,6 +413,31 @@ --- +## S-Series: Error Handling, Arithmetic Stability, Test Infrastructure (Pass 16) + +*Full detail in TRACE doc under "PASS 16 — ERROR HANDLING, ARITHMETIC STABILITY, TEST INFRASTRUCTURE."* + +| # | Flaw | Layer | Severity | +|---|------|-------|----------| +| S1 | `realized_pnl()`/`mark_price()` NaN bypasses `<=0.0` guard — NaN PnL corrupts k_realized_pnl | Rust/Python | **Critical** | +| S2 | MockVenue `_exchange_event_queue` check-then-act race — silently drops events | Test | **Critical** | +| S3 | No `test_kernel_fsm.py` or `test_kernel_fsm_recovery.py` exists | Test | **Critical** | +| S4 | Generated tests use `await asyncio.sleep(0.8)` — flaky false negatives on slow CI | Test | **Critical** | +| S5 | `_rate_limit_retry_after_ms()` returns 0 on parse failure — instant retry storm | Venue | **High** | +| S6 | Venue adapter detects rate limits but enforces zero backoff | Venue | **High** | +| S7 | `capital_epsilon = 1e-4` too tight — false WARN classifications | Accounting | **High** | +| S8 | Generated tests use module-level `asyncio.run()` — leaks tasks on Python 3.12+ | Test | **High** | +| S9 | `str.replace()` patching silently does nothing on format change | Build | **High** | +| S10 | `_consume()` no per-message WS timeout — silent hang blocks forever | Venue | **High** | +| S11 | `_run()` blocks pool thread with no timeout — 3 hung calls lock adapter | Venue | **High** | +| S12 | Rate-limit regex depends on exchange message format — non-portable | Venue | Medium | +| S13 | `_row_float()` silently skips malformed rows, filters zero values | Venue | Medium | +| S14 | Reconnection backoff lacks jitter — thundering herd risk | Venue | Medium | +| S15 | `_venue_event_status_from_row()` falls back to ACKED — masks new rejections | Venue | Medium | +| S16 | `except: pass` in generated test code — swallows KeyboardInterrupt | Test | Medium | + +--- + ## H-Series: Edge Domains — Dependencies, Error Handling, Types, Contracts (Pass 5) *Full detail in TRACE doc under "PASS 5 — EDGE DOMAINS."*