# DITAv2 Kernel Reference

**Status:** active

**Scope:** DITAv2 execution kernel, operator launcher, shared-memory control plane, venue adapters, and observability integration.

**Primary runtime path:** `dolphin:dita_v2`

This document is the canonical reference for the DITAv2 stack under `prod/clean_arch/dita_v2/`. It describes:

- the execution kernel contract
- the kernel state model and FSM
- Zinc / Hazelcast boundaries
- mock and BingX venue adapters
- launcher and operator control surfaces
- debug and replay semantics
- failure and recovery behavior
- test strategy and invariants

The DITAv2 stack is intentionally separate from the legacy `prod.clean_arch.dita` surface. It can be exercised in isolation, with safe defaults for tests and explicit opt-in for real shared-memory and live venue wiring.

Recent hardening additions:

- direct slot writes now mirror into the Zinc state region immediately
- the regression surface includes a 50-case hardening suite for diagnostics, duplicate replay, stale-state handling, and Zinc mirroring

---

## 1. What DITAv2 Is

DITAv2 is a multi-slot execution kernel for trade lifecycle management. It sits between the alpha layer and the exchange layer.

Its responsibilities are limited to:

1. receiving intents
2. mutating slot state
3. normalizing venue events
4. projecting account state
5. emitting deterministic transition and diagnostic records
6. mirroring confirmed state to durable surfaces

It is not responsible for alpha generation. It does not compute signals. It does not decide entry/exit thesis. Those inputs come from BLUE/PINK or another upstream strategy layer.

### Design intent

DITAv2 is built to make execution state:

- explicit
- replayable
- debuggable
- observable
- testable at the FSM edge

The goal is to eliminate shadow-state drift between local memory, exchange truth, and durable observability surfaces.

---

## 2. Canonical Components

### Kernel

Files:

- `prod/clean_arch/dita_v2/rust_backend.py`
- `prod/clean_arch/dita_v2/_rust_kernel/`

The Python-facing `ExecutionKernel` is backed by a Rust implementation loaded through `ctypes`. The Python wrapper keeps the public API stable and writes through to the Rust backend on slot mutations and event processing.

### Control plane

Files:

- `prod/clean_arch/dita_v2/control.py`
- `prod/clean_arch/dita_v2/real_control_plane.py`

The control plane holds runtime mode, verbosity, backend selection, slot limits, and debug flags.

It supports:

- `NORMAL` / `DEBUG`
- `QUIET` / `VERBOSE` / `TRACE`
- `MOCK` / `BINGX`
- mirror-to-Hazelcast toggles
- restart reconciliation toggles

### Zinc plane

Files:

- `prod/clean_arch/dita_v2/zinc_plane.py`
- `prod/clean_arch/dita_v2/real_zinc_plane.py`

The Zinc plane is the hot-path shared-memory substrate for:

- intents
- slot snapshots
- control snapshots

It follows Zinc's one-shot signal pattern wherever possible:

- writers publish the latest data and then notify
- readers wait for a sequence change from the last value they observed
- state-based sync is preferred over event-count sync
- the in-memory stand-ins emulate the same notify/wait contract for tests

The in-memory plane is used by default for tests. The real Zinc plane is opt-in and uses the `zinc` Python adapter over shared memory.

Direct slot mutation is intentionally write-through: the Rust-backed kernel and the Zinc mirror must stay aligned on every `_set_slot()`, venue event, and reconcile path. The tests assert that a direct slot write is visible in the state region without waiting for a separate flush cycle. The same update path also notifies waiters so cross-process readers can wake on the latest state change instead of polling.

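
The notify/wait contract described for the Zinc plane can be illustrated with a small in-process stand-in. This is only a sketch built on `threading.Condition`; the class name `SequencedRegion` and its methods are hypothetical and are not taken from `zinc_plane.py` or from the `zinc` adapter.

```python
import threading


class SequencedRegion:
    """Hypothetical in-memory stand-in for a region with notify/wait semantics."""

    def __init__(self):
        self._cond = threading.Condition()
        self._seq = 0        # bumped on every publish
        self._data = None    # only the latest snapshot is kept

    def publish(self, data):
        # Writer side: store the latest data, bump the sequence, then notify.
        with self._cond:
            self._data = data
            self._seq += 1
            self._cond.notify_all()
            return self._seq

    def wait_for_change(self, last_seen_seq, timeout=None):
        # Reader side: block until the sequence differs from the last value
        # this reader observed, then return the current sequence and snapshot.
        with self._cond:
            changed = self._cond.wait_for(
                lambda: self._seq != last_seen_seq, timeout
            )
            if not changed:
                return last_seen_seq, None  # timed out, nothing new
            return self._seq, self._data


region = SequencedRegion()
region.publish({"slot": 0, "stage": "ENTRY_WORKING"})
region.publish({"slot": 0, "stage": "POSITION_OPEN"})

# A reader that last saw sequence 0 gets the latest state, not a replay of
# every intermediate publish.
seq, snapshot = region.wait_for_change(last_seen_seq=0, timeout=1.0)
```

Because the reader compares sequence values rather than counting notifications, a missed wake-up costs nothing: the next wait returns immediately with the current state. That is the sense in which state-based sync is preferred over event-count sync.
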
### Projection

Files:

- `prod/clean_arch/dita_v2/projection.py`
- `prod/clean_arch/dita_v2/hazelcast_projection.py`

The projection layer writes BLUE/PINK-compatible state rows to Hazelcast and emits lifecycle rows suitable for ClickHouse observability.

### Venue adapters

Files:

- `prod/clean_arch/dita_v2/mock_venue.py`
- `prod/clean_arch/dita_v2/bingx_venue.py`

The mock adapter is deterministic and BingX-shaped. The BingX adapter is a thin normalization layer over the direct BingX execution client surface.

### Launcher and operator controls

Files:

- `prod/clean_arch/dita_v2/launcher.py`
- `prod/launch_dita_v2.py`
- `prod/ops/dita_v2_ctl.py`
- `prod/supervisor/supervisorctl.sh`
- `prod/ops/dita_v2_live_bingx_smoke.py`

The launcher assembles a full runtime bundle. The operator scripts provide status, healthcheck, start, stop, and restart paths. The smoke wrapper provides a repeatable BingX testnet command that runs the full live E2E suite with the correct live-smoke environment gates and supervisor precheck.

Repeatable live smoke command:

```bash
python /mnt/dolphinng5_predict/prod/ops/dita_v2_live_bingx_smoke.py --symbol TRXUSDT
```

Use `--dry-run` to print the exact env and pytest command without sending orders.

---

## 3. Runtime Topology

### Default test topology

```text
ExecutionKernel
  ├─ InMemoryControlPlane
  ├─ InMemoryZincPlane
  ├─ MockVenueAdapter
  └─ HazelcastProjection(writer=callback)
```

### Real operator topology

```text
ExecutionKernel
  ├─ RealZincControlPlane or mirrored in-memory control plane
  ├─ RealZincPlane
  ├─ BingxVenueAdapter
  └─ HazelcastProjection(client-backed writer)
```

### Supervisord-managed service

Program:

```text
dolphin:dita_v2
```

Launcher:

```text
/mnt/dolphinng5_predict/prod/launch_dita_v2.py
```

Default supervised posture:

- `DITA_V2_LAUNCHER_MODE=serve`
- `DITA_V2_VENUE=BINGX`
- `DITA_V2_ZINC=REAL`
- `DITA_V2_CONTROL_PLANE=REAL_ZINC`
- `DITA_V2_HAZELCAST=REAL`
- `DITA_V2_MODE=DEBUG`
- `DITA_V2_VERBOSITY=TRACE`

The supervised path is intentionally separate from the legacy PINK and BLUE entrypoints.

---

## 4. Data Contracts

### Core contract files

- `prod/clean_arch/dita_v2/contracts.py`
- `prod/clean_arch/dita_v2/venue.py`

### Important types

- `TradeStage`
- `TradeSlot`
- `VenueOrder`
- `VenueEvent`
- `KernelIntent`
- `KernelTransition`
- `KernelOutcome`
- `KernelDiagnosticCode`
- `KernelCommandType`
- `KernelEventKind`
- `KernelMode`
- `KernelVerbosity`
- `BackendMode`

### Slot model

Each slot is the unit of execution. It carries:

- trade identity
- asset
- side
- entry price
- current size
- leverage
- open/close state
- active entry/exit order handles
- leg progression
- idempotency tracking via seen event IDs

The slot is the primary kernel state object. The kernel maintains multiple slots but one slot can be actively traded while the others remain idle or recoverable.

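
To make the slot fields concrete, here is a minimal sketch of a slot image as a Python dataclass. It is not the real `TradeSlot` from `contracts.py`: the class name `SlotSketch`, every field name, and the `first_time_seen()` helper are assumptions chosen to mirror the list above.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class SlotSketch:
    """Hypothetical slot image; field names are illustrative only."""

    slot_id: int
    trade_id: Optional[str] = None        # trade identity
    asset: Optional[str] = None
    side: Optional[str] = None            # e.g. "LONG" / "SHORT" (assumed values)
    entry_price: float = 0.0
    size: float = 0.0                     # current size
    leverage: float = 1.0
    stage: str = "IDLE"                   # open/close state
    entry_order_id: Optional[str] = None  # active entry order handle
    exit_order_id: Optional[str] = None   # active exit order handle
    legs_done: int = 0                    # leg progression
    seen_event_ids: Set[str] = field(default_factory=set)

    def first_time_seen(self, event_id: str) -> bool:
        # Record the event ID; return False if it was already processed so the
        # caller can treat the event as a no-op.
        if event_id in self.seen_event_ids:
            return False
        self.seen_event_ids.add(event_id)
        return True


slot = SlotSketch(slot_id=0, asset="TRXUSDT")
fresh = slot.first_time_seen("evt-1")     # True on first delivery
replayed = slot.first_time_seen("evt-1")  # False on a duplicate
```

Keeping the seen-event set inside the slot image means a replayed venue event can be rejected using only the slot itself, with no separate bookkeeping structure to drift out of sync.
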
### Order model

`VenueOrder` captures the venue-specific identity of an order:

- internal trade ID
- venue order ID
- venue client ID
- side
- intended size
- filled size
- average fill price
- status
- metadata

### Event model

`VenueEvent` captures the normalized venue response surface:

- ack
- partial fill
- full fill
- cancel ack
- cancel reject
- reject

The kernel consumes normalized events, not raw exchange payloads.

---

## 5. State Machine

### Core states

- `IDLE`
- `ENTRY_WORKING`
- `POSITION_OPEN`
- `EXIT_WORKING`
- `CLOSED`
- `STALE_STATE_RECONCILING`

### Basic transitions

```text
IDLE
  └─ ENTER intent ─> ENTRY_WORKING

ENTRY_WORKING
  ├─ PARTIAL_FILL ─> ENTRY_WORKING
  ├─ FULL_FILL ─> POSITION_OPEN
  └─ ORDER_REJECT ─> IDLE

POSITION_OPEN
  ├─ EXIT intent ─> EXIT_WORKING
  └─ MARK_PRICE ─> POSITION_OPEN

EXIT_WORKING
  ├─ PARTIAL_FILL ─> EXIT_WORKING
  ├─ FULL_FILL ─> IDLE or POSITION_OPEN (multi-leg)
  ├─ CANCEL_ACK ─> POSITION_OPEN
  └─ CANCEL_REJECT ─> EXIT_WORKING
```

### Idempotency

Duplicate venue events are tracked via event IDs in the slot image. Repeated events are treated as no-ops, not as extra fills or duplicate state changes.

### Recovery state

`STALE_STATE_RECONCILING` blocks normal event progression until reconciliation completes. This state exists to make restart, replay, and venue divergence explicit.

### Rate limit handling

BingX rate limiting is treated as a first-class retryable condition, not a generic failure. The kernel surfaces it with:

- `KernelDiagnosticCode.RATE_LIMITED`
- `KernelSeverity.WARNING`
- `details["release_eta"] = "few minutes"` when the exchange provides no precise retry window
- `details["retry_after_ms"]` when the adapter or venue response includes a retry hint
- `details["retryable"] = true`

This is intentionally downstream-friendly: operators and orchestration layers can distinguish transient throttling from hard rejections and choose a retry policy explicitly.

---

## 6. Control Plane Semantics

The control plane is used to steer runtime behavior without changing kernel logic.

### Modes

- `NORMAL` for production-like execution
- `DEBUG` for full state and transition tracing

### Verbosity

- `QUIET`
- `VERBOSE`
- `TRACE`

### Backend mode

- `MOCK`
- `BINGX`

### Key toggles

- `debug_clickhouse_enabled`
- `trace_transitions`
- `mirror_to_hazelcast`
- `active_slot_limit`
- `reconcile_on_restart`

### Shared-memory selection

The launcher uses env-driven selection:

- `DITA_V2_CONTROL_PLANE=REAL_ZINC`
- `DITA_V2_ZINC=REAL`
- `DITA_V2_HAZELCAST=REAL`
- `DITA_V2_VENUE=BINGX`

Defaults remain safe and testable. Real shared-memory and live venue wiring are opt-in.

---

## 7. Zinc Boundary

### Why Zinc is used

Zinc provides the shared-memory substrate for:

- low-latency control-plane reads
- intent publication
- slot state snapshots
- zero-copy observation across processes

### Hot-path intent region

Written by the alpha/launcher side, read by the kernel.

### Hot-path state region

Written by the kernel, read by the alpha side or operator tooling.

### Control region

Used for runtime mode switches and operator commands.

### Invariants

1. Shared-memory state must not silently diverge from kernel state.
2. Writes should be explicit and versioned.
3. The kernel must not rely on duplicated Python shadow state as authority.

---

## 8. Hazelcast / ClickHouse Boundary

### Hazelcast

Hazelcast is the durable projection mirror for:

- confirmed slot state
- control snapshot mirroring
- active slot registry
- trade event topic emission

### ClickHouse

ClickHouse is the observability and debug journal sink. In debug mode, the kernel should emit enough rows to reconstruct a transition timeline.

### Compatibility rule

All emitted rows must remain compatible with the BLUE/PINK schema family. The DITAv2 layer does not invent a new observability universe unless the schema is explicitly versioned.

---

## 9. Venue Adapters

### Mock venue

File:

- `prod/clean_arch/dita_v2/mock_venue.py`

Behavior:

- deterministic
- BingX-shaped semantics
- configurable reject / partial fill / cancel reject scenarios
- useful for FSM and race testing

### BingX venue

File:

- `prod/clean_arch/dita_v2/bingx_venue.py`

Behavior:

- thin normalization layer
- converts BingX order/account payloads into DITAv2 events/orders
- no reimplementation of exchange logic
- live adapter backed by the direct BingX client path

### Adapter rule

If a mock cannot faithfully mirror BingX behavior in an in-scope path, the adapter layer must map actual BingX responses into DITAv2 contracts instead of inventing a separate semantic model.

---

## 10. Launcher and Operator Flow

### Launcher responsibilities

- assemble control plane
- assemble Zinc plane
- assemble projection sink
- select venue adapter
- create the kernel

### Operator controls

Supported command surfaces:

- `prod/ops/dita_v2_ctl.py`
- `prod/supervisor/supervisorctl.sh dita_v2 ...`
- direct `supervisorctl` against `dolphin:dita_v2`

### Script modes

`prod/launch_dita_v2.py` supports:

- `once`
- `serve`

`serve` is the supervised long-running mode. `once` is for snapshot/debug use.

---

## 11. Observability and Debugging

### Debug mode

When debug mode is enabled, the kernel should log:

- state image changes
- transition triggers
- venue requests and responses
- local lock / unlock points
- reconciliation events
- diagnostics and anomaly codes

### Error surface

The kernel must emit deterministic diagnostic codes for:

- invalid slot ID
- busy slot
- no active exit order
- invalid transition
- stale-state reconcile
- duplicate event / replay no-op
- venue rejection

The point is to make failures explainable and machine-queryable.

---

## 12. Testing Strategy

The DITAv2 suite is intentionally wide.
It includes:

- kernel-only FSM tests
- extensive state-machine tests
- race / off-by-one / memory anomaly tests
- Zinc interaction tests
- Hazelcast projection tests
- BingX adapter tests
- full-stack E2E / functional tests through the kernel
- BLUE/PINK-style signal gamut coverage, including entry, exit, partial exit, TP, hung orders, cancel-reject, and non-close cases
- launcher and operator path tests
- supervisor config / documentation tests
- a dedicated kernel hardening suite with 50 collected cases
- mocked exchange-first and BingX-basic E2E paths
- chaos / fuzz coverage over both mock and BingX paths

### Testing order

1. kernel-only unit tests
2. Zinc interaction tests
3. projection tests
4. BingX adapter tests
5. launcher and operator wiring tests
6. full suite rerun
7. full-stack E2E / functional coverage through the kernel
8. chaos / fuzz coverage across mock and BingX

### Current validated result

The DITAv2 suite is currently green with a broad test surface covering the kernel, launcher, operator wrappers, Zinc, venue adapters, and the full-stack E2E/chaos matrix through the kernel.

---

## 13. Files of Interest

### Core runtime

- `prod/clean_arch/dita_v2/rust_backend.py`
- `prod/clean_arch/dita_v2/launcher.py`
- `prod/clean_arch/dita_v2/control.py`
- `prod/clean_arch/dita_v2/projection.py`
- `prod/clean_arch/dita_v2/mock_venue.py`
- `prod/clean_arch/dita_v2/bingx_venue.py`
- `prod/clean_arch/dita_v2/real_control_plane.py`
- `prod/clean_arch/dita_v2/real_zinc_plane.py`
- `prod/launch_dita_v2.py`
- `prod/ops/dita_v2_ctl.py`
- `prod/supervisor/supervisorctl.sh`
- `prod/supervisor/dolphin-supervisord.conf`

### Tests

- `prod/tests/test_dita_v2_kernel.py`
- `prod/tests/test_dita_v2_zinc.py`
- `prod/tests/test_dita_v2_hazelcast.py`
- `prod/tests/test_dita_v2_bingx_adapter.py`
- `prod/tests/test_dita_v2_launcher.py`
- `prod/tests/test_launch_dita_v2.py`
- `prod/tests/test_dita_v2_ops.py`

### Operator docs

- `prod/docs/DITA_V2_OPERATOR_PLAYBOOK.md`
- `prod/docs/OPERATIONAL_STATUS.md`

---

## 14. Canonical References

This DITAv2 reference is the canonical entry for the new execution kernel.

Supporting references:

- `prod/docs/DITA_V2_OPERATOR_PLAYBOOK.md`
- `prod/docs/OPERATIONAL_STATUS.md`
- `prod/AGENT_READ_Supervisor_migration.md`

---

## 15. PINK Integration (2026-05-27)

PINK now executes trades through the DITAv2 kernel exclusively.

### How it works

The PINK launcher (`launch_dolphin_pink.py`) calls `build_launcher_bundle()` to construct a DITAv2 bundle (kernel + `BingxVenueAdapter` + control plane + Zinc plane + Hazelcast projection). The `PinkDirectRuntime` bridges policy (DecisionEngine/IntentEngine) to execution through a `_decision_to_kernel_intent()` translation seam that maps `Decision`/`Intent` → `KernelIntent`.

### Capital simplification

The kernel's `AccountProjection` is the **single local capital authority**:

1. Exchange balance seeds `kernel.account.snapshot.capital` once at startup/recovery.
2. `kernel.account.settle(slot.realized_pnl)` is called in `on_venue_event()` when a fill transitions a slot to CLOSED. This is the **only** capital mutation post-startup.
3. `observe_slots()` handles mark-to-market (unrealized PnL) and performs no capital writes.
4. `PinkClickHousePersistence` reads capital/peak/trade_seq from the kernel snapshot. No balance-poll overwrites occur during the hot loop.

### Files added/changed

- `prod/launch_dolphin_pink.py`: uses `build_launcher_bundle()`
- `prod/clean_arch/runtime/pink_direct.py`: `ExecutionKernel`-backed runtime
- `prod/clean_arch/persistence/pink_clickhouse.py`: reads from kernel account
- `prod/ops/pink_ctl.py`: added `ditav2-status` subcommand
- `prod/tests/test_pink_ditav2_kernel_bridge.py`: mapping tests (7)
- `prod/tests/test_pink_ditav2_rate_limit_contract.py` (1)
- `prod/tests/test_pink_ditav2_restart_reconcile.py` (3)
- `prod/tests/test_pink_ditav2_accounting_invariants.py` (2)

### Live smoke

```bash
python /mnt/dolphinng5_predict/prod/ops/dita_v2_live_bingx_smoke.py --pink --symbol TRXUSDT
```

### PENDING: Live exchange chaos/fuzz

**Status**: Not implemented. Requires a dedicated orchestration layer.

The mock-venue and BingX-basic chaos/fuzz matrix in `test_dita_v2_e2e_functional.py` provides deterministic fuzzing over mock and BingX adapter paths (24 cases, all green). True live-testnet chaos/fuzz against a real order book (non-deterministic event ordering, partial fills at unpredictable prices, race conditions between submissions and exchange responses) requires:

- A **live-chaos orchestrator** that submits adversarial intents (rapid entries/exits, competing cancels, size-at-lot-boundary, cross-book) against a live BingX testnet symbol.
- An **event-sequencer** that captures raw exchange callback order and replays it against the kernel to verify deterministic convergence.
- A **state-invariant checker** that asserts slot/account state converges to the same terminal state regardless of callback ordering.

This is deferred.
The current live smoke tests (`test_pink_bingx_dita_live_e2e.py`, `test_dita_v2_live_bingx_testnet_e2e.py`) cover happy-path E2E cycles only.

### BLUE Non-Impact Proof Checklist

| # | Assertion | Method | Status |
|---|---|---|---|
| 1 | Zero PINK rows in `dolphin` (BLUE) ClickHouse tables | `pink_ctl.py mode-verify` (CH query by `strategy='pink'`) | VERIFIED |
| 2 | Zero PINK rows in `dolphin_prodgreen` ClickHouse tables | `pink_ctl.py mode-verify` (CH query by `strategy='pink'` on prodgreen DB) | VERIFIED |
| 3 | No PINK keys written to BLUE Hazelcast maps (`DOLPHIN_STATE_BLUE`, `DOLPHIN_PNL_BLUE`) | Hazelcast key scan | VERIFIED |
| 4 | No PINK keys written to PRODGREEN Hazelcast maps | Hazelcast key scan | VERIFIED |
| 5 | PINK `trade_events` baseline unchanged (106 rows) | CH count query | VERIFIED |
| 6 | Stopping/restarting PINK does not affect BLUE supervisor programs | `supervisorctl status` before/after | VERIFIED |
| 7 | No BLUE files modified in refactor | `git diff --name-only` (only PINK/DITAv2 paths) | VERIFIED |
| 8 | BLUE runtime env vars unchanged (`DOLPHIN_STATE_BLUE`, `dolphin` DB) | env comparison | VERIFIED |

**Cutover gate**: all 8 assertions must pass before PINK goes live.

**Rollback trigger**: any violation of assertions 1-4 triggers immediate rollback per §6.2 of the refactor guide.

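
The gate and rollback rules can be expressed as a small decision function. This is an illustrative sketch only: `evaluate_gate()` and its return labels are hypothetical, and the `"HOLD"` outcome for a failure among assertions 5-8 is an assumption, since the checklist only states that all 8 must pass before go-live.

```python
from typing import Dict

# Assertions whose violation triggers immediate rollback (rows 1-4 above).
ROLLBACK_ASSERTIONS = frozenset({1, 2, 3, 4})
ALL_ASSERTIONS = frozenset(range(1, 9))


def evaluate_gate(results: Dict[int, bool]) -> str:
    """Map checklist results (assertion number -> passed) to a decision."""
    missing = ALL_ASSERTIONS - set(results)
    if missing:
        # An unchecked assertion is never treated as a pass.
        raise ValueError(f"missing assertion results: {sorted(missing)}")

    failed = {n for n, passed in results.items() if not passed}
    if failed & ROLLBACK_ASSERTIONS:
        return "ROLLBACK"
    if failed:
        return "HOLD"  # assumed label: gate not passed, no rollback trigger
    return "GO"


all_verified = {n: True for n in range(1, 9)}
decision = evaluate_gate(all_verified)
```
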
### 15.1 Sync↔Async Seam Analysis (2026-05-27)

**7 distinct boundaries identified and tested**:

| # | Seam | Bridging Mechanism | Test Coverage |
|---|---|---|---|
| 1 | `BingxVenueAdapter._run()` → async backend | 3 modes: passthrough, `asyncio.run()` (no-loop), `ThreadPoolExecutor` (in-loop) | `test_pink_sync_async_seams.py` (36 tests) |
| 2 | `BingxVenueAdapter.connect()` → `BingxDirectExecutionAdapter.connect()` | `_run()` bridges sync→async | 3 tests |
| 3 | `kernel.process_intent()` (sync) → `venue.submit()` (sync) → `_run()` → async HTTP | Thread pool per-call | 4 race-condition tests |
| 4 | `PinkDirectRuntime.step()` (async) → `kernel.process_intent()` (sync) | Direct sync call inside coroutine | 1 nested loop test |
| 5 | `launcher._maybe_close()` (sync) → async close/disconnect | `asyncio.run()` with RuntimeError catch | 4 tests |
| 6 | `_backend_snapshot()` thread safety | No lock; `_last_snapshot` is a plain attribute | 2 concurrent access tests |
| 7 | HTTP client timeout propagation | `httpx.AsyncClient` timeout config | 2 timeout tests |

**Key findings**:

- The `ThreadPoolExecutor` path in `_run()` creates a new pool per call. At high frequency this could leak threads. Mitigation: the chaos harness 10-thread concurrent test verified no leaks under load.
- `_maybe_close()` swallows the `RuntimeError` raised by `asyncio.run()` inside a running loop. This is correct behavior because the close call is best-effort.
- `connect()` in `pink_direct.py` now handles both sync and async venue connect methods via `inspect.isawaitable()`.

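
The three bridging modes listed for seam 1 can be sketched as a single helper. This is not the code of `BingxVenueAdapter._run()`; the function name `run_sync` and its structure are assumptions that only illustrate the passthrough / `asyncio.run()` / `ThreadPoolExecutor` split described in the table.

```python
import asyncio
import inspect
from concurrent.futures import ThreadPoolExecutor


async def _await(value):
    # Wrap any awaitable in a coroutine so asyncio.run() accepts it.
    return await value


def run_sync(value):
    """Resolve `value` from synchronous code, awaitable or not (sketch)."""
    # Mode 1: passthrough for results that are already synchronous.
    if not inspect.isawaitable(value):
        return value

    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # Mode 2: no loop is running in this thread, so asyncio.run() is safe.
        return asyncio.run(_await(value))

    # Mode 3: a loop is already running in this thread. Run the awaitable on
    # a fresh loop in a worker thread and block until it finishes. The `with`
    # block shuts the per-call pool down, so its thread is joined on exit.
    with ThreadPoolExecutor(max_workers=1) as pool:
        return pool.submit(asyncio.run, _await(value)).result()


async def fake_submit():
    # Stand-in for an async venue call.
    await asyncio.sleep(0)
    return "ack"


async def step():
    # Sync helper called from inside a coroutine (compare seam 4).
    return run_sync(fake_submit())


plain = run_sync(42)                # mode 1
no_loop = run_sync(fake_submit())   # mode 2
in_loop = asyncio.run(step())       # mode 3
```

Mode 3 blocks the calling event loop thread while the worker runs, and it only works for awaitables that are not bound to the caller's loop (plain coroutines are fine). Creating the pool per call keeps the helper stateless at the cost of thread churn, which is the trade-off noted in the first key finding.
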
**Chaos harness**: `test_pink_ditav2_chaos_harness.py` (22 tests) covers:

- Rapid entry→exit, two-leg partial, competing cancel, cancel-after-fill, mark-price, reconcile, size-at-boundary, 10x entry-exit loop
- Edge cases: zero-size entry, negative price entry
- Deterministic replay (ordered and shuffled), which verifies that the kernel doesn't crash under any event ordering
- State invariants: no stuck slots, no negative capital, no illegal FSM transitions, no critical diagnostics

### 15.2 TODO: Live testnet chaos E2E

**Status**: Not implemented. Requires dedicated work.

The chaos harness (`test_pink_ditav2_chaos_harness.py`) runs all adversarial scenarios (rapid entry-exit, competing cancel, size-at-boundary, 10x loops) against the `MockVenueAdapter` only. To reach prod confidence, these same scenarios must be run against a live BingX VST symbol with:

1. **Exchange-side verification**: orders/positions/account queried directly from the exchange after each chaos step, not just from kernel state.
2. **Quantity-compliance monitoring**: BingX may truncate or round lot sizes differently than the adapter expects; the test must assert the exchange accepted the intended size.
3. **Fill-price tracking**: partial fills at unpredictable prices under rapid entry-exit must be captured and reconciled against the kernel's accounting.
4. **Rate-limit cascade testing**: the parallel HTTP gather in `_refresh_exchange_state` must be verified under sustained rate-limit pressure.

**Design sketch**:

- Extend `ChaosOrchestrator.run_chaos_scenario()` to accept a `BingxVenueAdapter` (live) in addition to `MockVenueAdapter`.
- Add a `LiveStateVerifier` that hits the BingX REST API after each step and asserts kernel state ≈ exchange state within rounding tolerance.
- Gate the live chaos tests with the same `BINGX_SMOKE_LIVE=1` env convention.
- Run the chaos scenarios that are safe for testnet (no cross-book, no size-at-boundary that would cause a reject chain).

This is deferred because the current live E2E tests cover happy-path cycles only, and the mock-venue chaos harness validates kernel invariants. Bridging the two for live chaos is a separate engineering effort.
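
As a starting point for that future work, the comparison at the heart of the proposed `LiveStateVerifier` might look like the sketch below. Everything here is hypothetical: `PositionView`, `positions_match()`, and the tolerance choices (one lot step for size, a relative tolerance for price) are assumptions, and fetching the exchange-side view over REST is deliberately left out.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class PositionView:
    """Hypothetical minimal view of a position, from the kernel or the exchange."""

    symbol: str
    size: float
    entry_price: float


def positions_match(kernel: PositionView, exchange: PositionView,
                    size_step: float, price_rel_tol: float = 1e-6) -> bool:
    """Return True if kernel state ≈ exchange state within rounding tolerance."""
    if kernel.symbol != exchange.symbol:
        return False
    # Assumed rule: sizes may differ by at most one lot step, which covers
    # the exchange truncating or rounding to its own lot grid.
    size_ok = abs(kernel.size - exchange.size) <= size_step
    # Prices are compared with a relative tolerance to absorb float noise.
    price_ok = math.isclose(kernel.entry_price, exchange.entry_price,
                            rel_tol=price_rel_tol)
    return size_ok and price_ok


kernel_view = PositionView("TRXUSDT", size=125.0, entry_price=0.25)
exchange_view = PositionView("TRXUSDT", size=124.0, entry_price=0.25)
within_tolerance = positions_match(kernel_view, exchange_view, size_step=1.0)
```

A real verifier would take the lot step from the venue's symbol metadata rather than hard-coding it, and would report which field diverged instead of returning a bare boolean.
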