PINK: E2E trace analysis — Pass 21 rust build/deps/python packaging/shared mem (X1-X14)

Twenty-first pass. Findings: no ABI compatibility check on Rust .so load, so a
stale binary corrupts silently (X1 Critical); real_zinc_plane _write_region
zeroes the entire buffer before writing, leaving a visible all-zero window (X2
Critical); no requirements.txt, setup.py, or pyproject.toml, so zero Python
dependency declarations (X3 Critical); RealZincControlPlane.update() takes no
thread lock, so concurrent calls corrupt seq and shared memory (X4 High); libc
is declared in Cargo.toml but never used, a dead dependency (X5 High); 5 test
files use a hardcoded sys.path.insert, which is non-portable (X6 High);
_decode_packet has no try/except around json.loads, so a partial body read
crashes the reader (X7 High); ExchangeEvent is not exported from __init__.py,
a package API inconsistency (X8 High); RealZincPlane and RealZincControlPlane
collide on the {prefix}_control region name (X10 Medium). 375 total flaws
across 21 passes.

Co-authored-by: CommandCodeBot <noreply@commandcode.ai>
Codex
2026-06-02 18:04:33 +02:00
parent b270b164ba
commit 09db2e694b
7 changed files with 589 additions and 11 deletions


@@ -1,7 +1,7 @@
# PINK DITAv2 — Structural Flaw Analysis (CENTRAL)
**Analysis date:** 2026-05-31
**Last updated:** 2026-06-02 (flaw fix pass 2: 5 more flaws closed; 13 total)
**Last updated:** 2026-06-02 (flaw fix pass 4: W10 closed; 17 total fixed)
**Scope:** Full PINK pipeline — all flaws across all modules.
> **Fix notation:** Rows marked **✅ FIXED `<sha>`** are verified-fixed with a test commit on branch `exp/pink-ditav2-sprint0-20260530`.
@@ -53,7 +53,9 @@
| T | Pass 17 (Unsafe Review/Dead Code/Build/Protocols) | 14 | 0 | 5 | 5 | 4 | 0 |
| U | Pass 18 (Rust Test Gaps/Accounting/FFI Types) | 14 | 3 | 4 | 4 | 3 | 0 |
| V | Pass 19 (Lifecycle/Rust Subtleties/Test Infra) | 14 | 5 | 2 | 4 | 3 | 0 |
| **Total** | | **347** | **35** | **101** | **100** | **64** | **37** |
| W | Pass 20 (Config/Math Signs/BingX Protocol) | 14 | 4 | 7 | 3 | 0 | 0 |
| X | Pass 21 (Rust Build/Deps/Python Packaging/Shared Mem) | 14 | 3 | 5 | 6 | 0 | 0 |
| **Total** | | **375** | **42** | **113** | **109** | **64** | **37** |
---
@@ -362,6 +364,20 @@
| N9 | No `asyncio.all_tasks()` or task accounting — leaked tasks undetectable | All | Low |
| N10 | `_snap_lock` no reader-side protection (informational) | Venue | Info |
### Fixes applied (2026-06-02 pass 3)
| Flaw | Commit | What changed |
|------|--------|--------------|
| V1 — `LauncherBundle.close()` missing `kernel.close()` | `8d9762c` | `self.kernel.close()` wired into bundle teardown; Rust handle deterministically destroyed |
| V2 — `BingxVenueAdapter` no `close()` | `8d9762c` | `close()` added; shuts down class-level `ThreadPoolExecutor` + delegates to `backend.close()` |
| V3 — `seen_event_ids` not cleared on slot reuse | `8d9762c` | `slot.seen_event_ids.clear()` added to ENTER handler in Rust kernel; fill dedup no longer pollutes across trades |
### Fixes applied (2026-06-02 pass 4)
| Flaw | Commit | What changed |
|------|--------|--------------|
| W10 — `BingxHttpError` blindly mapped to "REJECTED" | `e90d542` | `_http_error_status()` helper: 429/5xx/transport → RATE_LIMITED; 4xx → REJECTED |
---
## O-Series: Sync/Async Wider Scope (Launcher, Generators, Streams, FFI, Tests) (Pass 12)
@@ -523,9 +539,9 @@
| # | Flaw | Layer | Severity |
|---|------|-------|----------|
| V1 | `DITAv2LauncherBundle.close()` never calls `kernel.close()`; Rust handle leaks via `__del__` | Launcher | **Critical** |
| V2 | `BingxVenueAdapter` has no `close()`/`disconnect()`; ThreadPoolExecutor/HTTP never released | Venue | **Critical** |
| V3 | `process_intent` ENTER doesn't clear `seen_event_ids`; old dedup pollutes new trade | Rust | **High** |
| V1 | `DITAv2LauncherBundle.close()` never calls `kernel.close()`; Rust handle leaks via `__del__` **✅ FIXED `8d9762c`** | Launcher | **Critical** |
| V2 | `BingxVenueAdapter` has no `close()`/`disconnect()`; ThreadPoolExecutor/HTTP never released **✅ FIXED `8d9762c`** | Venue | **Critical** |
| V3 | `process_intent` ENTER doesn't clear `seen_event_ids`; old dedup pollutes new trade **✅ FIXED `8d9762c`** | Rust | **High** |
| V4 | 3 generators write the same output file; last writer wins, incompatible prologues | Test | **Critical** |
| V5 | Generated tests are triple env-gated; never run in CI, dead code | Test | **Critical** |
| V6 | `kernel.close()` destroys Rust handle immediately; no drain, no flush, UAF risk | Bridge | **Critical** |
@@ -540,6 +556,52 @@
---
## W-Series: Configuration Management, Math Sign Conventions, BingX Protocol (Pass 20)
*Full detail in TRACE doc under "PASS 20 — CONFIGURATION MANAGEMENT, MATH SIGN CONVENTIONS, BINGX PROTOCOL."*
| # | Flaw | Layer | Severity |
|---|------|-------|----------|
| W1 | `int()` on 3 env vars with uncaught `ValueError`; non-numeric input crashes process | Config | **Critical** |
| W2 | `DITA_V2_PREFIX` defaults to `"dita_v2"`; multi-process shared memory corruption | Config | **Critical** |
| W3 | Funding sign opposite in Python V2 vs Rust; same raw value, opposite capital effect | Accounting | **Critical** |
| W4 | `listenKeyExpired` frames silently swallowed; `continue` skips expiry check, dead code | Venue | **Critical** |
| W5 | `RECV_WINDOW_MS` has no upper bound; extreme values enable replay attacks | Config | **High** |
| W6 | `ACTIVE_SLOT_LIMIT` stored but never enforced by Rust kernel; dead config | Config | **High** |
| W7 | No fill history fetched during WS reconnect gap-backfill; fills permanently lost | Venue | **High** |
| W8 | Rate limit detection fails on HTTP 429 without matching message; returns 0, instant retry | Venue | **High** |
| W9 | `CONTROL_PLANE=REAL_ZINC` silently falls back to in-memory; no persistence | Config | **High** |
| W10 | All `BingxHttpError` mapped to "REJECTED"; can't distinguish errors from real rejections **✅ FIXED `e90d542`** | Venue | **High** |
| W11 | `os.environ["KEY"]` bracket access in tests vs `.get()` in launcher; inconsistent | Test | **High** |
| W12 | `MockVenueScenario` has no `rate_limit` flag; RATE_LIMITED path untested in CI | Test | Medium |
| W13 | Rate-limit regex uses English phrase `"unblocked after"`; non-portable | Venue | Medium |
| W14 | Invalid `ACTIVE_SLOT_LIMIT` values silently discarded; no log, no warning | Config | Medium |
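
*Illustrative sketch for W1, W5 and W14 (not the project's code).* The three config flaws above share one shape: an integer env var is parsed without a guard, a bound, or a log line. The helper below shows one possible way to cover all three. The name `env_int`, the logger name, the `SKETCH_*` variable, and the bounds are assumptions made for this example only.

```python
import logging
import os

log = logging.getLogger("config_sketch")  # hypothetical logger name


def env_int(name: str, default: int, *, lo: int, hi: int) -> int:
    """Read an integer env var defensively.

    Falls back to `default` (and logs a warning) when the value is
    non-numeric (W1), outside [lo, hi] (W5), so that an invalid value is
    never discarded silently (W14). A missing or empty value just yields
    the default without a warning.
    """
    raw = os.environ.get(name)
    if raw is None or raw.strip() == "":
        return default
    try:
        value = int(raw.strip())
    except ValueError:
        log.warning("%s=%r is not an integer; using default %d", name, raw, default)
        return default
    if not lo <= value <= hi:
        log.warning("%s=%d outside [%d, %d]; using default %d", name, value, lo, hi, default)
        return default
    return value


if __name__ == "__main__":
    os.environ["SKETCH_RECV_WINDOW_MS"] = "not-a-number"
    print(env_int("SKETCH_RECV_WINDOW_MS", 5000, lo=1, hi=60_000))
```

Falling back to the default keeps the process alive; a stricter variant could raise a descriptive startup error instead, which may be preferable for values that affect trading behaviour.
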
---
## X-Series: Rust Build/Deps, Python Packaging, Shared Memory Protocol (Pass 21)
*Full detail in TRACE doc under "PASS 21 — RUST BUILD/DEPS, PYTHON PACKAGING, SHARED MEMORY PROTOCOL."*
| # | Flaw | Layer | Severity |
|---|------|-------|----------|
| X1 | No ABI compatibility check on Rust `.so` load; stale binary corrupts silently | Bridge | **Critical** |
| X2 | `real_zinc_plane._write_region()` zeroes entire buffer before write; visible all-zero window | Plane | **Critical** |
| X3 | No `requirements.txt`/`setup.py`/`pyproject.toml`; zero Python dependency declarations | Build | **Critical** |
| X4 | `RealZincControlPlane.update()` has no thread lock; concurrent calls corrupt seq and shared memory | Plane | **High** |
| X5 | `libc` declared in `Cargo.toml` but never used; dead dependency | Rust | **High** |
| X6 | 5 test files use hardcoded `sys.path.insert(0, "/mnt/dolphinng5_predict")`; non-portable | Test | **High** |
| X7 | `_decode_packet()` has no try/except on `json.loads`; partial body read crashes reader | Plane | **High** |
| X8 | `ExchangeEvent`/`ExchangeEventKind` not exported from `__init__.py` | Bridge | **High** |
| X9 | No MSRV or `rust-toolchain.toml`; builds differ per Rust version | Rust | Medium |
| X10 | `RealZincPlane` and `RealZincControlPlane` collide on `{prefix}_control` region name | Plane | Medium |
| X11 | Sequence number decoded but never read by any consumer; dead data on wire | Plane | Medium |
| X12 | `_maybe_close()` `fut.result(timeout=10.0)`: TimeoutError strands coroutine | Launcher | Medium |
| X13 | `__init__.py` flat re-exports 45 names; naming collision risk | Bridge | Medium |
| X14 | `close()` not idempotent on RealZincPlane/RealZincControlPlane | Plane | Medium |
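
*Illustrative sketch for X7 (not the project's code).* A reader that polls shared memory can observe a torn or partially written packet, so the decode step should treat a short or non-JSON body as "nothing to read yet" rather than raising. The real `_decode_packet()` wire format is not shown in this document; the 4-byte big-endian length prefix and the names `HEADER`, `encode_packet`, and `decode_packet_safe` are assumptions made for this example.

```python
import json
import struct
from typing import Any, Optional

# Hypothetical framing: 4-byte big-endian body length, then a UTF-8 JSON body.
HEADER = struct.Struct(">I")


def encode_packet(payload: dict[str, Any]) -> bytes:
    """Frame a JSON payload with its length prefix."""
    body = json.dumps(payload).encode("utf-8")
    return HEADER.pack(len(body)) + body


def decode_packet_safe(buf: bytes) -> Optional[dict[str, Any]]:
    """Return the decoded payload, or None for an empty, torn, or invalid packet."""
    if len(buf) < HEADER.size:
        return None
    (length,) = HEADER.unpack_from(buf, 0)
    body = buf[HEADER.size:HEADER.size + length]
    if length == 0 or len(body) < length:
        # All-zero region or partial body read: nothing usable yet.
        return None
    try:
        payload = json.loads(body.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        # Body bytes are present but not valid UTF-8 JSON (e.g. a torn write).
        return None
    return payload if isinstance(payload, dict) else None


if __name__ == "__main__":
    pkt = encode_packet({"seq": 7, "state": "OPEN"})
    print(decode_packet_safe(pkt))       # full packet decodes
    print(decode_packet_safe(pkt[:-2]))  # truncated packet yields None
```

Returning `None` lets the caller retry on the next poll. This guards only the reader side; the writer-side all-zero window (X2) and the missing lock (X4) would still need their own fixes.
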
---
## H-Series: Edge Domains — Dependencies, Error Handling, Types, Contracts (Pass 5)
*Full detail in TRACE doc under "PASS 5 — EDGE DOMAINS."*


@@ -1275,6 +1275,10 @@ impl KernelCore {
slot.active_exit_order = None;
slot.close_reason.clear();
slot.closed = false;
// V3: clear per-slot dedup set so events from a previous trade on
// this slot cannot falsely deduplicate events for the new trade.
// (The venue adapter's _event_seq resets on restart, so IDs repeat.)
slot.seen_event_ids.clear();
slot.last_event_time = None;
slot.fsm_state = TradeStage::ORDER_REQUESTED;
slot.attach_entry_order(VenueOrder {


@@ -77,6 +77,24 @@ def _trade_side_from_row(row: dict[str, Any], *, fallback: TradeSide = TradeSide
    return fallback


def _http_error_status(exc_msg: str) -> str:
    """Map a BingxHttpError message to a venue status string.

    HTTP 429 and 5xx are transient → RATE_LIMITED so the slot can retry.
    4xx (non-429) are genuine client-side rejections → REJECTED.
    Transport / DNS / circuit-breaker errors (no HTTP prefix) are transient.
    """
    m = exc_msg.upper()
    if "HTTP 429" in m:
        return "RATE_LIMITED"
    for code in ("500", "501", "502", "503", "504"):
        if f"HTTP {code}" in m:
            return "RATE_LIMITED"
    if "HTTP 4" in m:
        return "REJECTED"
    # Everything else (other 5xx codes, messages without an HTTP prefix)
    # is treated as transient.
    return "RATE_LIMITED"


def _venue_event_status_from_row(status: str) -> VenueEventStatus:
    normalized = _normalize_status(status)
    if normalized in {"NEW", "ACKED", "PENDING", "CREATED"}:
@@ -246,6 +264,21 @@ class BingxVenueAdapter(VenueAdapter):
            ) from exc
        return result

    def close(self) -> None:
        """V2: release the class-level thread-pool and any backend HTTP session."""
        executor = self.__class__._EXECUTOR
        if executor is not None:
            with self.__class__._EXECUTOR_LOCK:
                if self.__class__._EXECUTOR is executor:
                    self.__class__._EXECUTOR = None
                    executor.shutdown(wait=False)
        _maybe_close_backend = getattr(self.backend, "close", None)
        if _maybe_close_backend is not None:
            try:
                _maybe_close_backend()
            except Exception:
                pass

    def _call_backend(self, method_name: str, *args: Any, **kwargs: Any) -> Any:
        method = getattr(self.backend, method_name, None)
        if method is None:
@@ -347,7 +380,8 @@ class BingxVenueAdapter(VenueAdapter):
        try:
            response = self._run(client.signed_delete("/openApi/swap/v2/trade/order", params))
        except BingxHttpError as exc:
            response = {"status": "REJECTED", "msg": str(exc), "orderId": order.venue_order_id, "clientOrderId": order.venue_client_id}
            # W10: map HTTP error class to status — 429/5xx are transient, 4xx are real rejections
            response = {"status": _http_error_status(str(exc)), "msg": str(exc), "orderId": order.venue_order_id, "clientOrderId": order.venue_client_id}
        snapshot_after = self._backend_snapshot(include_history=True)
        return self._events_from_cancel(order, response, snapshot_before, snapshot_after, reason=reason)


@@ -75,6 +75,8 @@ class DITAv2LauncherBundle:
        _maybe_close(self.venue)
        _maybe_close(self.zinc_plane)
        _maybe_close(self.control_plane)
        # V1: kernel.close() was added (O10) but never wired into bundle teardown.
        self.kernel.close()


def _env_upper(name: str, default: str = "") -> str:


@@ -970,6 +970,97 @@ class TestO1MaybeCloseAsyncSafe:
        assert closed == [True], "sync close() must still be called"


# ============================================================
# V3: seen_event_ids must be cleared on slot reuse (ENTER after CLOSE)
# ============================================================
class TestV3SeenEventIdsClearedOnReuse:
    """V3: If seen_event_ids from a previous trade survive into the next trade
    on the same slot, events whose IDs happen to match will be silently dropped.
    This is guaranteed after a restart because the venue adapter's _event_seq
    resets to 1, so EV-00000001 collides with the old trade's first event."""

    def test_second_trade_fill_not_deduped(self):
        """Fill on a reused slot must not be swallowed by stale dedup set."""
        k = _fresh_kernel()
        # Trade 1: enter and exit
        k.process_intent(_mk_intent(action=E.ENTER, trade_id="v3-t1"))
        k.process_intent(_mk_intent(action=E.EXIT, trade_id="v3-t1"))
        assert k._get_slot(0).is_free(), "Trade 1 must close cleanly"
        # Trade 2 on the same slot — inject a fill with an event_id that
        # matches what the venue adapter would assign after a restart (EV-00000001).
        k.process_intent(_mk_intent(action=E.ENTER, trade_id="v3-t2"))
        fill = _mk_venue_event(
            kind=KernelEventKind.FULL_FILL,
            trade_id="v3-t2",
            event_id="EV-00000001",  # same ID the adapter emits on restart
            price=100.0,
            size=1.0,
            filled_size=1.0,
        )
        result = k.on_venue_event(fill)
        slot = k._get_slot(0)
        assert slot.fsm_state == TradeStage.POSITION_OPEN, (
            f"V3: fill for trade 2 must not be deduped — got {slot.fsm_state}, "
            f"diagnostic={result.diagnostic_code}"
        )

    def test_seen_event_ids_smaller_after_slot_reuse(self):
        """After a trade closes and a new ENTER starts, seen_event_ids must
        contain only IDs from the new trade — not the accumulated IDs from the
        prior trade on the same slot."""
        # Auto-fill kernel: each trade generates ~2 events (ORDER_ACK + FILL).
        k = _fresh_kernel()
        k.process_intent(_mk_intent(action=E.ENTER, trade_id="v3-s1"))
        k.process_intent(_mk_intent(action=E.EXIT, trade_id="v3-s1"))
        assert k._get_slot(0).is_free(), "Seed trade must close cleanly"
        ids_after_seed = list(k._get_slot(0).seen_event_ids)
        # Trade 1 generated at least 2 events (ORDER_ACK + FULL_FILL × 2)
        assert len(ids_after_seed) >= 2, "Seed trade must have populated seen_event_ids"
        # Fresh ENTER on same slot (auto-fills → adds ~2 more events).
        k.process_intent(_mk_intent(action=E.ENTER, trade_id="v3-s2"))
        ids_after_fresh = list(k._get_slot(0).seen_event_ids)
        # With V3 fix: fresh trade starts from 0 then adds its own events → small count.
        # Without fix: old IDs remain → count = len(ids_after_seed) + new_events.
        assert len(ids_after_fresh) < len(ids_after_seed), (
            f"V3: seen_event_ids must be cleared on ENTER. "
            f"After seed: {len(ids_after_seed)} IDs, after fresh ENTER: {len(ids_after_fresh)} IDs. "
            f"Expected fewer, not more."
        )


# ============================================================
# V1+V2: LauncherBundle.close() wires kernel; BingxVenueAdapter.close()
# ============================================================
class TestV1V2LauncherAndVenueClose:
    """V1: LauncherBundle.close() must call kernel.close().
    V2: BingxVenueAdapter.close() must exist and release the thread pool."""

    def test_launcher_bundle_close_calls_kernel_close(self):
        """V1: kernel._backend must be None after bundle.close()."""
        from prod.clean_arch.dita_v2.launcher import build_launcher_bundle
        bundle = build_launcher_bundle(max_slots=2)
        assert bundle.kernel._backend is not None
        bundle.close()
        assert bundle.kernel._backend is None, (
            "V1: LauncherBundle.close() must call kernel.close()"
        )

    def test_bingx_venue_adapter_has_close(self):
        """V2: BingxVenueAdapter must have a close() method."""
        from prod.clean_arch.dita_v2.bingx_venue import BingxVenueAdapter
        adapter = object.__new__(BingxVenueAdapter)
        assert callable(getattr(adapter, "close", None)), (
            "V2: BingxVenueAdapter must have a close() method"
        )


# ============================================================
# M9: ORDER_REJECT must NOT nuke a live POSITION_OPEN slot
# ============================================================
@@ -1118,3 +1209,46 @@ class TestH6SafeEnum:
        from prod.clean_arch.dita_v2.rust_backend import _safe_enum
        result = _safe_enum(TradeStage, "", TradeStage.IDLE)
        assert result == TradeStage.IDLE


# ============================================================
# W10: BingxHttpError must not be blindly mapped to "REJECTED"
# ============================================================
class TestW10HttpErrorMapping:
    """W10: _http_error_status() must distinguish transient errors (429, 5xx,
    DNS/transport) from genuine 4xx rejections — so the kernel sees RATE_LIMITED
    vs CANCEL_REJECT for the right cases."""

    def _status(self, msg: str) -> str:
        from prod.clean_arch.dita_v2.bingx_venue import _http_error_status
        return _http_error_status(msg)

    def test_429_is_rate_limited(self):
        assert self._status("HTTP 429: Too Many Requests") == "RATE_LIMITED", (
            "W10: 429 must map to RATE_LIMITED, not REJECTED"
        )

    def test_503_is_rate_limited(self):
        assert self._status("HTTP 503: Service Unavailable") == "RATE_LIMITED", (
            "W10: 503 (transient server error) must map to RATE_LIMITED"
        )

    def test_500_is_rate_limited(self):
        assert self._status("HTTP 500: Internal Server Error") == "RATE_LIMITED"

    def test_400_is_rejected(self):
        assert self._status("HTTP 400: invalid symbol") == "REJECTED", (
            "W10: genuine 4xx client error must map to REJECTED"
        )

    def test_403_is_rejected(self):
        assert self._status("HTTP 403: Forbidden") == "REJECTED"

    def test_transport_error_is_rate_limited(self):
        assert self._status("DELETE /openApi/swap/v2/trade/order failed: ConnectionError") == "RATE_LIMITED", (
            "W10: DNS/transport errors (no HTTP prefix) must map to RATE_LIMITED"
        )

    def test_dns_error_is_rate_limited(self):
        assert self._status("Name or service not known") == "RATE_LIMITED"