PINK DITAv2 Sprint 2-3: accounting parity + multi-leg groundwork

Sprint 2 (accounting + observability parity, PINK scope):
- Verified pink_clickhouse.py writes the 8 BLUE-legacy row families at
  matching schema and that capital authority in pink_direct.step() is
  solely kernel.account (no balance-poll overwrite in the hot loop).
- Report: prod/clean_arch/dita_v2/SPRINT2_ACCOUNTING_PARITY.md.

Sprint 3 offline groundwork (no exchange contact):
- Add _write_trade_exit_leg to pink_clickhouse.py: one BLUE-schema-faithful
  trade_exit_legs row per exit leg, with isolated (non-cumulative) per-leg
  deltas tracked via _leg_state (reset on ENTER). Closes the docstring gap.
- New offline suite test_pink_multi_exit_groundwork.py (3 passed):
  * Flaw 4 — two-leg exit closes once, realized accrues per leg, closed
    slot rejects further EXIT (no double-close).
  * Overshoot invariant — a final EXIT requesting more than the remaining
    size CLAMPS (size to 0, no oversell), retiring the Sprint 0
    cumulative-ratio risk empirically.
  * trade_exit_legs delta + full BLUE column-set assertions.
- Persistence regression after edits: 10 passed.

BLUE untouched: no changes to dolphin.* / DOLPHIN_*_BLUE / nautilus_event_trader.py.
Live VST multi-leg run remains deferred pending explicit authorization.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
764
prod/docs/DITA_V2_KERNEL_REFERENCE.md
Normal file
@@ -0,0 +1,764 @@
# DITAv2 Kernel Reference

**Status:** active
**Scope:** DITAv2 execution kernel, operator launcher, shared-memory control plane, venue adapters, and observability integration.
**Primary runtime path:** `dolphin:dita_v2`

This document is the canonical reference for the DITAv2 stack under
`prod/clean_arch/dita_v2/`.

It describes:

- the execution kernel contract
- the kernel state model and FSM
- Zinc / Hazelcast boundaries
- mock and BingX venue adapters
- launcher and operator control surfaces
- debug and replay semantics
- failure and recovery behavior
- test strategy and invariants

The DITAv2 stack is intentionally separate from the legacy `prod.clean_arch.dita`
surface. It can be exercised in isolation, with safe defaults for tests and
explicit opt-in for real shared-memory and live venue wiring.

Recent hardening additions:

- direct slot writes now mirror into the Zinc state region immediately
- the regression surface includes a 50-case hardening suite for diagnostics,
  duplicate replay, stale-state handling, and Zinc mirroring

---

## 1. What DITAv2 Is

DITAv2 is a multi-slot execution kernel for trade lifecycle management.
It sits between the alpha layer and the exchange layer.

Its responsibilities are limited to:

1. receiving intents
2. mutating slot state
3. normalizing venue events
4. projecting account state
5. emitting deterministic transition and diagnostic records
6. mirroring confirmed state to durable surfaces

It is not responsible for alpha generation. It does not compute signals.
It does not decide entry/exit thesis. Those inputs come from BLUE/PINK or
another upstream strategy layer.

### Design intent

DITAv2 is built to make execution state:

- explicit
- replayable
- debuggable
- observable
- testable at the FSM edge

The goal is to eliminate shadow-state drift between local memory, exchange
truth, and durable observability surfaces.

---

## 2. Canonical Components

### Kernel

Files:

- `prod/clean_arch/dita_v2/rust_backend.py`
- `prod/clean_arch/dita_v2/_rust_kernel/`

The Python-facing `ExecutionKernel` is backed by a Rust implementation loaded
through `ctypes`. The Python wrapper keeps the public API stable and writes
through to the Rust backend on slot mutations and event processing.

### Control plane

Files:

- `prod/clean_arch/dita_v2/control.py`
- `prod/clean_arch/dita_v2/real_control_plane.py`

The control plane holds runtime mode, verbosity, backend selection, slot
limits, and debug flags. It supports:

- `NORMAL` / `DEBUG`
- `QUIET` / `VERBOSE` / `TRACE`
- `MOCK` / `BINGX`
- mirror-to-Hazelcast toggles
- restart reconciliation toggles

### Zinc plane

Files:

- `prod/clean_arch/dita_v2/zinc_plane.py`
- `prod/clean_arch/dita_v2/real_zinc_plane.py`

The Zinc plane is the hot-path shared-memory substrate for:

- intents
- slot snapshots
- control snapshots

It follows Zinc's one-shot signal pattern wherever possible:

- writers publish the latest data and then notify
- readers wait for a sequence change from the last value they observed
- state-based sync is preferred over event-count sync
- the in-memory stand-ins emulate the same notify/wait contract for tests

The in-memory plane is used by default for tests. The real Zinc plane is
opt-in and uses the `zinc` Python adapter over shared memory.

Direct slot mutation is intentionally write-through: the Rust-backed kernel
and the Zinc mirror must stay aligned on every `_set_slot()`, venue event, and
reconcile path. The tests assert that a direct slot write is visible in the
state region without waiting for a separate flush cycle. The same update path
also notifies waiters so cross-process readers can wake on the latest state
change instead of polling.
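
The notify/wait contract described above can be illustrated with a small standard-library sketch. `SequencedRegion`, `publish`, and `wait_for_change` are hypothetical names chosen for this example; they are not the `zinc` adapter's API and not the actual in-memory plane. The sketch only shows why waiting on a sequence change gives state-based sync: a reader that missed intermediate writes still wakes up on the latest value.

```python
import threading


class SequencedRegion:
    """Hypothetical in-process stand-in for a region guarded by a sequence counter."""

    def __init__(self):
        self._cond = threading.Condition()
        self._seq = 0
        self._value = None

    def publish(self, value):
        # Writer: store the latest data, bump the sequence, then notify waiters.
        with self._cond:
            self._value = value
            self._seq += 1
            self._cond.notify_all()

    def wait_for_change(self, last_seen_seq, timeout=1.0):
        # Reader: block until the sequence differs from the last one observed.
        with self._cond:
            changed = self._cond.wait_for(
                lambda: self._seq != last_seen_seq, timeout
            )
            if not changed:
                return last_seen_seq, None
            return self._seq, self._value


region = SequencedRegion()
region.publish({"slot": 0, "stage": "ENTRY_WORKING"})
region.publish({"slot": 0, "stage": "POSITION_OPEN"})

# A reader that last saw sequence 0 gets only the latest state, not two events.
seq, latest = region.wait_for_change(last_seen_seq=0)
```

Because the reader compares sequence numbers instead of counting notifications, a missed wake-up cannot leave it permanently behind.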

### Projection

Files:

- `prod/clean_arch/dita_v2/projection.py`
- `prod/clean_arch/dita_v2/hazelcast_projection.py`

The projection layer writes BLUE/PINK-compatible state rows to Hazelcast
and emits lifecycle rows suitable for ClickHouse observability.

### Venue adapters

Files:

- `prod/clean_arch/dita_v2/mock_venue.py`
- `prod/clean_arch/dita_v2/bingx_venue.py`

The mock adapter is deterministic and BingX-shaped. The BingX adapter is a
thin normalization layer over the direct BingX execution client surface.

### Launcher and operator controls

Files:

- `prod/clean_arch/dita_v2/launcher.py`
- `prod/launch_dita_v2.py`
- `prod/ops/dita_v2_ctl.py`
- `prod/supervisor/supervisorctl.sh`
- `prod/ops/dita_v2_live_bingx_smoke.py`

The launcher assembles a full runtime bundle. The operator scripts provide
status, healthcheck, start, stop, and restart paths. The smoke wrapper
provides a repeatable BingX testnet command that runs the full live E2E suite
with the correct live-smoke environment gates and supervisor precheck.

Repeatable live smoke command:

```bash
python /mnt/dolphinng5_predict/prod/ops/dita_v2_live_bingx_smoke.py --symbol TRXUSDT
```

Use `--dry-run` to print the exact env and pytest command without sending
orders.

---

## 3. Runtime Topology

### Default test topology

```text
ExecutionKernel
  ├─ InMemoryControlPlane
  ├─ InMemoryZincPlane
  ├─ MockVenueAdapter
  └─ HazelcastProjection(writer=callback)
```

### Real operator topology

```text
ExecutionKernel
  ├─ RealZincControlPlane or mirrored in-memory control plane
  ├─ RealZincPlane
  ├─ BingxVenueAdapter
  └─ HazelcastProjection(client-backed writer)
```

### Supervisord-managed service

Program:

```text
dolphin:dita_v2
```

Launcher:

```text
/mnt/dolphinng5_predict/prod/launch_dita_v2.py
```

Default supervised posture:

- `DITA_V2_LAUNCHER_MODE=serve`
- `DITA_V2_VENUE=BINGX`
- `DITA_V2_ZINC=REAL`
- `DITA_V2_CONTROL_PLANE=REAL_ZINC`
- `DITA_V2_HAZELCAST=REAL`
- `DITA_V2_MODE=DEBUG`
- `DITA_V2_VERBOSITY=TRACE`

The supervised path is intentionally separate from the legacy PINK and BLUE
entrypoints.

---

## 4. Data Contracts

### Core contract files

- `prod/clean_arch/dita_v2/contracts.py`
- `prod/clean_arch/dita_v2/venue.py`

### Important types

- `TradeStage`
- `TradeSlot`
- `VenueOrder`
- `VenueEvent`
- `KernelIntent`
- `KernelTransition`
- `KernelOutcome`
- `KernelDiagnosticCode`
- `KernelCommandType`
- `KernelEventKind`
- `KernelMode`
- `KernelVerbosity`
- `BackendMode`

### Slot model

Each slot is the unit of execution. It carries:

- trade identity
- asset
- side
- entry price
- current size
- leverage
- open/close state
- active entry/exit order handles
- leg progression
- idempotency tracking via seen event IDs

The slot is the primary kernel state object. The kernel maintains multiple
slots; one slot can be actively traded while the others remain idle or
recoverable.

### Order model

`VenueOrder` captures the venue-specific identity of an order:

- internal trade ID
- venue order ID
- venue client ID
- side
- intended size
- filled size
- average fill price
- status
- metadata

### Event model

`VenueEvent` captures the normalized venue response surface:

- ack
- partial fill
- full fill
- cancel ack
- cancel reject
- reject

The kernel consumes normalized events, not raw exchange payloads.

---

## 5. State Machine

### Core states

- `IDLE`
- `ENTRY_WORKING`
- `POSITION_OPEN`
- `EXIT_WORKING`
- `CLOSED`
- `STALE_STATE_RECONCILING`

### Basic transitions

```text
IDLE
  └─ ENTER intent ─> ENTRY_WORKING
ENTRY_WORKING
  ├─ PARTIAL_FILL ─> ENTRY_WORKING
  ├─ FULL_FILL ─> POSITION_OPEN
  └─ ORDER_REJECT ─> IDLE
POSITION_OPEN
  ├─ EXIT intent ─> EXIT_WORKING
  └─ MARK_PRICE ─> POSITION_OPEN
EXIT_WORKING
  ├─ PARTIAL_FILL ─> EXIT_WORKING
  ├─ FULL_FILL ─> IDLE or POSITION_OPEN (multi-leg)
  ├─ CANCEL_ACK ─> POSITION_OPEN
  └─ CANCEL_REJECT ─> EXIT_WORKING
```
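
The diagram above can be read as a lookup table keyed by `(stage, trigger)`. The sketch below transcribes it into plain Python; `TRANSITIONS` and `next_stage` are hypothetical names, and the rule used to pick between `IDLE` and `POSITION_OPEN` after an exit fill (remaining size greater than zero) is an assumption, since the diagram only labels that branch "multi-leg". It is not the Rust kernel's implementation.

```python
# (stage, trigger) -> set of allowed target stages, transcribed from the diagram.
TRANSITIONS = {
    ("IDLE", "ENTER"): {"ENTRY_WORKING"},
    ("ENTRY_WORKING", "PARTIAL_FILL"): {"ENTRY_WORKING"},
    ("ENTRY_WORKING", "FULL_FILL"): {"POSITION_OPEN"},
    ("ENTRY_WORKING", "ORDER_REJECT"): {"IDLE"},
    ("POSITION_OPEN", "EXIT"): {"EXIT_WORKING"},
    ("POSITION_OPEN", "MARK_PRICE"): {"POSITION_OPEN"},
    ("EXIT_WORKING", "PARTIAL_FILL"): {"EXIT_WORKING"},
    ("EXIT_WORKING", "FULL_FILL"): {"IDLE", "POSITION_OPEN"},
    ("EXIT_WORKING", "CANCEL_ACK"): {"POSITION_OPEN"},
    ("EXIT_WORKING", "CANCEL_REJECT"): {"EXIT_WORKING"},
}


def next_stage(stage: str, trigger: str, remaining_size: float = 0.0) -> str:
    """Return the next stage, or raise ValueError for an edge not in the table."""
    targets = TRANSITIONS.get((stage, trigger))
    if targets is None:
        raise ValueError(f"invalid transition: {stage} on {trigger}")
    if len(targets) == 1:
        return next(iter(targets))
    # Assumption: an exit-leg fill returns to POSITION_OPEN while size remains
    # (multi-leg) and to IDLE once the position is flat.
    return "POSITION_OPEN" if remaining_size > 0 else "IDLE"


# Walk one single-leg trade through the table.
stage = next_stage("IDLE", "ENTER")
stage = next_stage(stage, "FULL_FILL")
stage = next_stage(stage, "EXIT")
stage = next_stage(stage, "FULL_FILL", remaining_size=0.0)
```

Keeping the edges in data makes an invalid transition an explicit error instead of a silent state change.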

### Idempotency

Duplicate venue events are tracked via event IDs in the slot image. Repeated
events are treated as no-ops, not as extra fills or duplicate state changes.
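
A minimal sketch of the seen-event-ID check, assuming the slot image keeps a set of processed event IDs. `SlotImage` and `apply_fill` are illustrative names only; the real `TradeSlot` fields and kernel entry points may differ.

```python
from dataclasses import dataclass, field


@dataclass
class SlotImage:
    """Hypothetical slot image: current size plus the event IDs already applied."""

    size: float = 0.0
    seen_event_ids: set = field(default_factory=set)


def apply_fill(slot: SlotImage, event_id: str, fill_size: float) -> bool:
    """Apply a fill once; return False when the event is a replayed duplicate."""
    if event_id in slot.seen_event_ids:
        return False  # duplicate: no-op, size stays unchanged
    slot.seen_event_ids.add(event_id)
    slot.size += fill_size
    return True


slot = SlotImage()
first = apply_fill(slot, "evt-1", 5.0)
replayed = apply_fill(slot, "evt-1", 5.0)
```

After the replay, `slot.size` is still `5.0` and `replayed` is `False`, which is the no-op behavior described above.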

### Recovery state

`STALE_STATE_RECONCILING` blocks normal event progression until reconciliation
completes. This state exists to make restart, replay, and venue divergence
explicit.

### Rate limit handling

BingX rate limiting is treated as a first-class retryable condition, not a
generic failure. The kernel surfaces it with:

- `KernelDiagnosticCode.RATE_LIMITED`
- `KernelSeverity.WARNING`
- `details["release_eta"] = "few minutes"` when the exchange provides no
  precise retry window
- `details["retry_after_ms"]` when the adapter or venue response includes a
  retry hint
- `details["retryable"] = true`

This is intentionally downstream-friendly: operators and orchestration layers
can distinguish transient throttling from hard rejections and choose a retry
policy explicitly.
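
One way a downstream layer might turn those `details` fields into a retry decision is sketched here. `retry_delay_seconds` and the 120-second fallback are assumptions made for illustration; the source only says the ETA is "few minutes" when no precise hint exists, and it does not prescribe a retry policy.

```python
from typing import Optional


def retry_delay_seconds(details: dict, default_backoff_s: float = 120.0) -> Optional[float]:
    """Return how long to wait before retrying, or None for a hard rejection."""
    if not details.get("retryable", False):
        return None  # not flagged retryable: treat as a hard failure
    retry_after_ms = details.get("retry_after_ms")
    if retry_after_ms is not None:
        # Prefer the precise hint when the adapter or venue supplied one.
        return max(0.0, retry_after_ms / 1000.0)
    # No precise window (release_eta is only "few minutes"): use the fallback.
    return default_backoff_s


with_hint = retry_delay_seconds({"retryable": True, "retry_after_ms": 1500})
without_hint = retry_delay_seconds({"retryable": True, "release_eta": "few minutes"})
hard_reject = retry_delay_seconds({"retryable": False})
```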

---

## 6. Control Plane Semantics

The control plane is used to steer runtime behavior without changing kernel
logic.

### Modes

- `NORMAL` for production-like execution
- `DEBUG` for full state and transition tracing

### Verbosity

- `QUIET`
- `VERBOSE`
- `TRACE`

### Backend mode

- `MOCK`
- `BINGX`

### Key toggles

- `debug_clickhouse_enabled`
- `trace_transitions`
- `mirror_to_hazelcast`
- `active_slot_limit`
- `reconcile_on_restart`

### Shared-memory selection

The launcher uses env-driven selection:

- `DITA_V2_CONTROL_PLANE=REAL_ZINC`
- `DITA_V2_ZINC=REAL`
- `DITA_V2_HAZELCAST=REAL`
- `DITA_V2_VENUE=BINGX`

Defaults remain safe and testable. Real shared-memory and live venue wiring are
opt-in.
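
A sketch of what "safe by default, opt-in for real wiring" can look like in an env-driven resolver. The variable names come from the list above; the fallback values `IN_MEMORY` and `CALLBACK`, and the function `resolve_selection`, are hypothetical and only stand in for whatever defaults the launcher actually uses. `MOCK` is the backend mode documented earlier.

```python
import os

# Fallbacks used when a variable is unset. Values other than MOCK are
# placeholders for illustration, not the launcher's real default strings.
SAFE_DEFAULTS = {
    "DITA_V2_CONTROL_PLANE": "IN_MEMORY",
    "DITA_V2_ZINC": "IN_MEMORY",
    "DITA_V2_HAZELCAST": "CALLBACK",
    "DITA_V2_VENUE": "MOCK",
}


def resolve_selection(environ=None) -> dict:
    """Resolve each selector from the environment, falling back to safe defaults."""
    env = os.environ if environ is None else environ
    return {
        key: env.get(key, default).strip().upper()
        for key, default in SAFE_DEFAULTS.items()
    }


# With nothing set, every selector stays on the in-process path.
test_selection = resolve_selection({})

# Real wiring requires each variable to be set explicitly.
supervised_selection = resolve_selection(
    {
        "DITA_V2_CONTROL_PLANE": "REAL_ZINC",
        "DITA_V2_ZINC": "REAL",
        "DITA_V2_HAZELCAST": "REAL",
        "DITA_V2_VENUE": "BINGX",
    }
)
```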

---

## 7. Zinc Boundary

### Why Zinc is used

Zinc provides the shared-memory substrate for:

- low-latency control-plane reads
- intent publication
- slot state snapshots
- zero-copy observation across processes

### Hot-path intent region

Written by the alpha/launcher side, read by the kernel.

### Hot-path state region

Written by the kernel, read by the alpha side or operator tooling.

### Control region

Used for runtime mode switches and operator commands.

### Invariants

1. Shared-memory state must not silently diverge from kernel state.
2. Writes should be explicit and versioned.
3. The kernel must not rely on duplicated Python shadow state as authority.

---

## 8. Hazelcast / ClickHouse Boundary

### Hazelcast

Hazelcast is the durable projection mirror for:

- confirmed slot state
- control snapshot mirroring
- active slot registry
- trade event topic emission

### ClickHouse

ClickHouse is the observability and debug journal sink. In debug mode, the
kernel should emit enough rows to reconstruct a transition timeline.

### Compatibility rule

All emitted rows must remain compatible with the BLUE/PINK schema family.
The DITAv2 layer does not invent a new observability universe unless the
schema is explicitly versioned.

---

## 9. Venue Adapters

### Mock venue

File:

- `prod/clean_arch/dita_v2/mock_venue.py`

Behavior:

- deterministic
- BingX-shaped semantics
- configurable reject / partial fill / cancel reject scenarios
- useful for FSM and race testing

### BingX venue

File:

- `prod/clean_arch/dita_v2/bingx_venue.py`

Behavior:

- thin normalization layer
- converts BingX order/account payloads into DITAv2 events/orders
- no reimplementation of exchange logic
- live adapter backed by the direct BingX client path

### Adapter rule

If a mock cannot faithfully mirror BingX behavior in an in-scope path, the
adapter layer must map actual BingX responses into DITAv2 contracts instead of
inventing a separate semantic model.

---

## 10. Launcher and Operator Flow

### Launcher responsibilities

- assemble control plane
- assemble Zinc plane
- assemble projection sink
- select venue adapter
- create the kernel

### Operator controls

Supported command surfaces:

- `prod/ops/dita_v2_ctl.py`
- `prod/supervisor/supervisorctl.sh dita_v2 ...`
- direct `supervisorctl` against `dolphin:dita_v2`

### Script modes

`prod/launch_dita_v2.py` supports:

- `once`
- `serve`

`serve` is the supervised long-running mode. `once` is for snapshot/debug use.

---

## 11. Observability and Debugging

### Debug mode

When debug mode is enabled, the kernel should log:

- state image changes
- transition triggers
- venue requests and responses
- local lock / unlock points
- reconciliation events
- diagnostics and anomaly codes

### Error surface

The kernel must emit deterministic diagnostic codes for:

- invalid slot ID
- busy slot
- no active exit order
- invalid transition
- stale-state reconcile
- duplicate event / replay no-op
- venue rejection

The point is to make failures explainable and machine-queryable.

---

## 12. Testing Strategy

The DITAv2 suite is intentionally wide. It includes:

- kernel-only FSM tests
- extensive state-machine tests
- race / off-by-one / memory anomaly tests
- Zinc interaction tests
- Hazelcast projection tests
- BingX adapter tests
- full-stack E2E / functional tests through the kernel
- BLUE/PINK-style signal gamut coverage, including entry, exit, partial exit, TP, hung orders, cancel-reject, and non-close cases
- launcher and operator path tests
- supervisor config / documentation tests
- a dedicated kernel hardening suite with 50 collected cases
- mocked exchange-first and BingX-basic E2E paths
- chaos / fuzz coverage over both mock and BingX paths

### Testing order

1. kernel-only unit tests
2. Zinc interaction tests
3. projection tests
4. BingX adapter tests
5. launcher and operator wiring tests
6. full suite rerun
7. full-stack E2E / functional coverage through the kernel
8. chaos / fuzz coverage across mock and BingX

### Current validated result

The DITAv2 suite is currently green with a broad test surface covering the
kernel, launcher, operator wrappers, Zinc, venue adapters, and the full-stack
E2E/chaos matrix through the kernel.

---

## 13. Files of Interest

### Core runtime

- `prod/clean_arch/dita_v2/rust_backend.py`
- `prod/clean_arch/dita_v2/launcher.py`
- `prod/clean_arch/dita_v2/control.py`
- `prod/clean_arch/dita_v2/projection.py`
- `prod/clean_arch/dita_v2/mock_venue.py`
- `prod/clean_arch/dita_v2/bingx_venue.py`
- `prod/clean_arch/dita_v2/real_control_plane.py`
- `prod/clean_arch/dita_v2/real_zinc_plane.py`
- `prod/launch_dita_v2.py`
- `prod/ops/dita_v2_ctl.py`
- `prod/supervisor/supervisorctl.sh`
- `prod/supervisor/dolphin-supervisord.conf`

### Tests

- `prod/tests/test_dita_v2_kernel.py`
- `prod/tests/test_dita_v2_zinc.py`
- `prod/tests/test_dita_v2_hazelcast.py`
- `prod/tests/test_dita_v2_bingx_adapter.py`
- `prod/tests/test_dita_v2_launcher.py`
- `prod/tests/test_launch_dita_v2.py`
- `prod/tests/test_dita_v2_ops.py`

### Operator docs

- `prod/docs/DITA_V2_OPERATOR_PLAYBOOK.md`
- `prod/docs/OPERATIONAL_STATUS.md`

---

## 14. Canonical References

This DITAv2 reference is the canonical entry for the new execution kernel.

Supporting references:

- `prod/docs/DITA_V2_OPERATOR_PLAYBOOK.md`
- `prod/docs/OPERATIONAL_STATUS.md`
- `prod/AGENT_READ_Supervisor_migration.md`

---

## 15. PINK Integration (2026-05-27)

PINK now executes trades through the DITAv2 kernel exclusively.

### How it works

The PINK launcher (`launch_dolphin_pink.py`) calls `build_launcher_bundle()` to
construct a DITAv2 bundle (kernel + `BingxVenueAdapter` + control plane + Zinc
plane + Hazelcast projection). The `PinkDirectRuntime` bridges policy
(DecisionEngine/IntentEngine) to execution through a `_decision_to_kernel_intent()`
translation seam that maps `Decision`/`Intent` → `KernelIntent`.

### Capital simplification

The kernel's `AccountProjection` is the **single local capital authority**:

1. Exchange balance seeds `kernel.account.snapshot.capital` once at startup/recovery.
2. `kernel.account.settle(slot.realized_pnl)` is called in `on_venue_event()` when
   a fill transitions a slot to CLOSED — the **only** capital mutation post-startup.
3. `observe_slots()` handles mark-to-market (unrealized PnL) — no capital writes.
4. `PinkClickHousePersistence` reads capital/peak/trade_seq from the kernel snapshot.

No balance-poll overwrites occur during the hot loop.
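
The four rules above can be sketched as a tiny account object in which `settle()` is the only method that writes capital. `AccountSnapshot` and `AccountProjectionSketch` are hypothetical stand-ins, not the kernel's `AccountProjection`; in particular, updating `peak` and incrementing `trade_seq` inside `settle()` is an assumption made for the example.

```python
from dataclasses import dataclass


@dataclass
class AccountSnapshot:
    capital: float
    peak: float
    trade_seq: int = 0


class AccountProjectionSketch:
    """Hypothetical single capital authority: seeded once, mutated only by settle()."""

    def __init__(self, seed_capital: float):
        # Rule 1: the exchange balance seeds capital once at startup/recovery.
        self.snapshot = AccountSnapshot(capital=seed_capital, peak=seed_capital)
        self.unrealized_pnl = 0.0

    def settle(self, realized_pnl: float) -> None:
        # Rule 2: the only capital mutation after startup (slot reached CLOSED).
        self.snapshot.capital += realized_pnl
        self.snapshot.peak = max(self.snapshot.peak, self.snapshot.capital)
        self.snapshot.trade_seq += 1  # assumption: one sequence step per closed trade

    def observe(self, unrealized_pnl: float) -> None:
        # Rule 3: mark-to-market is recorded separately and never touches capital.
        self.unrealized_pnl = unrealized_pnl


account = AccountProjectionSketch(seed_capital=1000.0)
account.observe(unrealized_pnl=25.0)   # capital is unchanged by mark-to-market
capital_after_mark = account.snapshot.capital
account.settle(realized_pnl=-40.0)     # capital moves only when a trade closes
```

A persistence layer in this model would read `account.snapshot` (rule 4) and never write back to it.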

### Files added/changed

- `prod/launch_dolphin_pink.py` — uses `build_launcher_bundle()`
- `prod/clean_arch/runtime/pink_direct.py` — `ExecutionKernel`-backed runtime
- `prod/clean_arch/persistence/pink_clickhouse.py` — reads from kernel account
- `prod/ops/pink_ctl.py` — added `ditav2-status` subcommand
- `prod/tests/test_pink_ditav2_kernel_bridge.py` — mapping tests (7)
- `prod/tests/test_pink_ditav2_rate_limit_contract.py` (1)
- `prod/tests/test_pink_ditav2_restart_reconcile.py` (3)
- `prod/tests/test_pink_ditav2_accounting_invariants.py` (2)

### Live smoke

```bash
python /mnt/dolphinng5_predict/prod/ops/dita_v2_live_bingx_smoke.py --pink --symbol TRXUSDT
```

### PENDING — Live exchange chaos/fuzz

**Status**: Not implemented. Requires a dedicated orchestration layer.

The mock-venue and BingX-basic chaos/fuzz matrix in
`test_dita_v2_e2e_functional.py` provides deterministic fuzzing over mock and
BingX adapter paths (24 cases, all green). True live-testnet chaos/fuzz
against a real order book — non-deterministic event ordering, partial fills at
unpredictable prices, race conditions between submissions and exchange
responses — requires:

- A **live-chaos orchestrator** that submits adversarial intents (rapid
  entries/exits, competing cancels, size-at-lot-boundary, cross-book) against
  a live BingX testnet symbol.
- An **event-sequencer** that captures raw exchange callback order and
  replays it against the kernel to verify deterministic convergence.
- A **state-invariant checker** that asserts slot/account state converges to
  the same terminal state regardless of callback ordering.

This is deferred. The current live smoke tests (`test_pink_bingx_dita_live_e2e.py`,
`test_dita_v2_live_bingx_testnet_e2e.py`) cover happy-path E2E cycles only.

### BLUE Non-Impact Proof Checklist

| # | Assertion | Method | Status |
|---|---|---|---|
| 1 | Zero PINK rows in `dolphin` (BLUE) ClickHouse tables | `pink_ctl.py mode-verify` (CH query by `strategy='pink'`) | VERIFIED |
| 2 | Zero PINK rows in `dolphin_prodgreen` ClickHouse tables | `pink_ctl.py mode-verify` (CH query by `strategy='pink'` on prodgreen DB) | VERIFIED |
| 3 | No PINK keys written to BLUE Hazelcast maps (`DOLPHIN_STATE_BLUE`, `DOLPHIN_PNL_BLUE`) | Hazelcast key scan | VERIFIED |
| 4 | No PINK keys written to PRODGREEN Hazelcast maps | Hazelcast key scan | VERIFIED |
| 5 | PINK `trade_events` baseline unchanged (106 rows) | CH count query | VERIFIED |
| 6 | Stopping/restarting PINK does not affect BLUE supervisor programs | `supervisorctl status` before/after | VERIFIED |
| 7 | No BLUE files modified in refactor | `git diff --name-only` (only PINK/DITAv2 paths) | VERIFIED |
| 8 | BLUE runtime env vars unchanged (`DOLPHIN_STATE_BLUE`, `dolphin` DB) | env comparison | VERIFIED |

**Cutover gate**: all 8 assertions must pass before PINK goes live.
**Rollback trigger**: any violation of assertions 1-4 triggers immediate rollback per §6.2 of the refactor guide.

### 15.1 Sync↔Async Seam Analysis (2026-05-27)

**7 distinct boundaries identified and tested**:

| # | Seam | Bridging Mechanism | Test Coverage |
|---|---|---|---|
| 1 | `BingxVenueAdapter._run()` → async backend | 3 modes: passthrough, `asyncio.run()` (no-loop), `ThreadPoolExecutor` (in-loop) | `test_pink_sync_async_seams.py` (36 tests) |
| 2 | `BingxVenueAdapter.connect()` → `BingxDirectExecutionAdapter.connect()` | `_run()` bridges sync→async | 3 tests |
| 3 | `kernel.process_intent()` (sync) → `venue.submit()` (sync) → `_run()` → async HTTP | Thread pool per-call | 4 race-condition tests |
| 4 | `PinkDirectRuntime.step()` (async) → `kernel.process_intent()` (sync) | Direct sync call inside coroutine | 1 nested loop test |
| 5 | `launcher._maybe_close()` (sync) → async close/disconnect | `asyncio.run()` with RuntimeError catch | 4 tests |
| 6 | `_backend_snapshot()` thread safety | No lock — `_last_snapshot` is a plain attribute | 2 concurrent access tests |
| 7 | HTTP client timeout propagation | `httpx.AsyncClient` timeout config | 2 timeout tests |

**Key findings**:
- `_run()` ThreadPoolExecutor creates a new pool per call. At high frequency this could leak threads. Mitigation: chaos harness 10-thread concurrent test verified no leaks under load.
- `_maybe_close()` swallows `RuntimeError` from `asyncio.run()` inside a running loop. This is correct behavior — the close call is best-effort.
- `pink_direct.py` `connect()` now handles both sync and async venue connect methods via `inspect.isawaitable()`.
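
The three bridging modes listed for seam 1 (passthrough, `asyncio.run()` with no loop, worker thread when a loop is already running) can be sketched with the standard library alone. `run_sync` is a hypothetical helper written for this example and is not the adapter's actual `_run()`; note that in the third mode the calling loop is blocked until the worker finishes.

```python
import asyncio
import inspect
from concurrent.futures import ThreadPoolExecutor


async def _await(value):
    # Wrap any awaitable in a coroutine so asyncio.run() accepts it.
    return await value


def run_sync(value):
    """Resolve a plain value or an awaitable from synchronous code."""
    if not inspect.isawaitable(value):
        return value  # mode 1: passthrough
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        return asyncio.run(_await(value))  # mode 2: no loop is running
    # Mode 3: a loop is already running in this thread, so run the awaitable
    # on a worker thread with its own loop. Leaving the `with` block joins
    # the worker, so the per-call pool does not outlive the call.
    with ThreadPoolExecutor(max_workers=1) as pool:
        return pool.submit(asyncio.run, _await(value)).result()


async def fetch_status():
    await asyncio.sleep(0)
    return "ok"


async def called_from_coroutine():
    # Sync helper invoked while a loop is running (the in-loop case).
    return run_sync(fetch_status())


plain = run_sync(42)
no_loop = run_sync(fetch_status())
in_loop = asyncio.run(called_from_coroutine())
```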

**Chaos harness**: `test_pink_ditav2_chaos_harness.py` (22 tests) covers:
- Rapid entry→exit, two-leg partial, competing cancel, cancel-after-fill, mark-price, reconcile, size-at-boundary, 10x entry-exit loop
- Edge cases: zero-size entry, negative price entry
- Deterministic replay (ordered and shuffled) — verifies kernel doesn't crash under any event ordering
- State invariants: no stuck slots, no negative capital, no illegal FSM transitions, no critical diagnostics

### 15.2 TODO — Live testnet chaos E2E

**Status**: Not implemented. Requires dedicated work.

The chaos harness (`test_pink_ditav2_chaos_harness.py`) runs all adversarial
scenarios (rapid entry-exit, competing cancel, size-at-boundary, 10x loops)
against the `MockVenueAdapter` only. To reach prod confidence, these same
scenarios must be run against a live BingX VST symbol with:

1. **Exchange-side verification** — orders/positions/account queried directly
   from the exchange after each chaos step, not just from kernel state.
2. **Quantity-compliance monitoring** — BingX may truncate or round lot sizes
   differently than the adapter expects; the test must assert the exchange
   accepted the intended size.
3. **Fill-price tracking** — partial fills at unpredictable prices under
   rapid entry-exit must be captured and reconciled against the kernel's
   accounting.
4. **Rate-limit cascade testing** — the parallel HTTP gather in
   `_refresh_exchange_state` must be verified under sustained rate-limit
   pressure.

**Design sketch**:
- Extend `ChaosOrchestrator.run_chaos_scenario()` to accept a
  `BingxVenueAdapter` (live) in addition to `MockVenueAdapter`.
- Add a `LiveStateVerifier` that hits the BingX REST API after each step
  and asserts kernel state ≈ exchange state within rounding tolerance.
- Gate the live chaos tests with the same `BINGX_SMOKE_LIVE=1` env convention.
- Run the chaos scenarios that are safe for testnet (no cross-book, no
  size-at-boundary that would cause a reject chain).
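
The "kernel state ≈ exchange state within rounding tolerance" check in the design sketch above could reduce to a comparison like the one below. `PositionView` and `positions_agree` are hypothetical names, and the half-lot-step tolerance is an assumption; since `LiveStateVerifier` is not implemented, this only illustrates the comparison and does not call any BingX API.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class PositionView:
    """Hypothetical minimal view of a position, from the kernel or the exchange."""

    symbol: str
    size: float


def positions_agree(kernel: PositionView, exchange: PositionView, lot_step: float) -> bool:
    """True when symbols match and sizes differ by at most half a lot step."""
    if kernel.symbol != exchange.symbol:
        return False
    return math.isclose(kernel.size, exchange.size, rel_tol=0.0, abs_tol=lot_step / 2)


kernel_view = PositionView("TRXUSDT", 100.0)
rounded_ok = positions_agree(kernel_view, PositionView("TRXUSDT", 100.0004), lot_step=0.001)
diverged = positions_agree(kernel_view, PositionView("TRXUSDT", 100.002), lot_step=0.001)
```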

This is deferred because the current live E2E tests cover happy-path cycles
only, and the mock-venue chaos harness validates kernel invariants. Bridging
the two for live chaos is a separate engineering effort.

116
prod/docs/DITA_V2_OPERATOR_PLAYBOOK.md
Normal file
@@ -0,0 +1,116 @@
# DITAv2 Operator Playbook

This playbook documents the operator-facing control surface for the DITAv2
execution kernel.

## Supervisor program

The process is managed as:

`dolphin:dita_v2`

Launcher:

`/mnt/dolphinng5_predict/prod/launch_dita_v2.py`

## Default runtime posture

- `DITA_V2_LAUNCHER_MODE=serve`
- `DITA_V2_VENUE=BINGX`
- `DITA_V2_ZINC=REAL`
- `DITA_V2_CONTROL_PLANE=REAL_ZINC`
- `DITA_V2_HAZELCAST=REAL`
- `DITA_V2_MODE=DEBUG`
- `DITA_V2_VERBOSITY=TRACE`

The launcher defaults remain safe in-process for tests, but the supervised
program is configured for the real shared-memory / live venue path.

## Control commands

Use:

```bash
python /mnt/dolphinng5_predict/prod/ops/dita_v2_ctl.py status
python /mnt/dolphinng5_predict/prod/ops/dita_v2_ctl.py start
python /mnt/dolphinng5_predict/prod/ops/dita_v2_ctl.py stop
python /mnt/dolphinng5_predict/prod/ops/dita_v2_ctl.py restart
python /mnt/dolphinng5_predict/prod/ops/dita_v2_ctl.py healthcheck
```

These map to:

```bash
supervisorctl -c /mnt/dolphinng5_predict/prod/supervisor/dolphin-supervisord.conf <action> dolphin:dita_v2
```
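
For reference, the mapping from an action name to the `supervisorctl` command line above can be sketched as follows. `build_supervisorctl_argv` is a hypothetical helper and is not taken from `dita_v2_ctl.py`. It covers only the four actions that are native `supervisorctl` verbs; how `healthcheck` is implemented is not stated here, so it is left out.

```python
import shlex

SUPERVISOR_CONF = "/mnt/dolphinng5_predict/prod/supervisor/dolphin-supervisord.conf"
PROGRAM = "dolphin:dita_v2"

# Actions that correspond directly to supervisorctl verbs.
SUPERVISOR_ACTIONS = ("status", "start", "stop", "restart")


def build_supervisorctl_argv(action: str) -> list:
    """Return the argv list for one supervisor action, rejecting unknown actions."""
    if action not in SUPERVISOR_ACTIONS:
        raise ValueError(f"unsupported action: {action}")
    return ["supervisorctl", "-c", SUPERVISOR_CONF, action, PROGRAM]


argv = build_supervisorctl_argv("restart")
command_line = shlex.join(argv)  # printable form, e.g. for a dry run
```

Building an argv list instead of a shell string keeps the call safe to pass to `subprocess.run(argv)` without shell quoting concerns.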

## Live BingX testnet smoke

Use the repeatable live smoke wrapper:

```bash
python /mnt/dolphinng5_predict/prod/ops/dita_v2_live_bingx_smoke.py
```

Recommended explicit symbol:

```bash
python /mnt/dolphinng5_predict/prod/ops/dita_v2_live_bingx_smoke.py --symbol TRXUSDT
```

What it does:

- loads `/mnt/dolphinng5_predict/.env`
- sets `BINGX_SMOKE_LIVE=1`
- sets `BINGX_SMOKE_ALLOW_TRADE=1`
- sets `DITA_V2_LIVE_BINGX=1`
- starts `dolphin:dita_v2` if it is not already running
- runs `prod/tests/test_dita_v2_live_bingx_testnet_e2e.py`
- preserves the live suite's rate-limit-aware behavior and cleanup paths

Use `--dry-run` to print the exact command and env without trading.

## Validation order

1. Start the process with `start`.
2. Check `status`.
3. Run `healthcheck`.
4. Inspect the logs:
   - `/tmp/dolphin_logs/supervisor/dita_v2.log`
   - `/tmp/dolphin_logs/supervisor/dita_v2-error.log`

## Stop sequence

1. `python /mnt/dolphinng5_predict/prod/ops/dita_v2_ctl.py stop`
2. Confirm `status` shows the program stopped.
3. Only after that, touch the launcher config or shared-memory state.

## PINK-on-DITAv2 commands

PINK now executes through the DITAv2 kernel. The same supervisor commands
apply, and the following PINK-specific surfaces are available:

### PINK control

```bash
python /mnt/dolphinng5_predict/prod/ops/pink_ctl.py status
python /mnt/dolphinng5_predict/prod/ops/pink_ctl.py healthcheck
python /mnt/dolphinng5_predict/prod/ops/pink_ctl.py ditav2-status
python /mnt/dolphinng5_predict/prod/ops/pink_ctl.py mode-verify
```

`ditav2-status` checks the DITAv2 env vars (`DITA_V2_MODE`, `DITA_V2_VENUE`,
etc.) and the `dolphin_pink` supervisor program status.

### PINK live BingX testnet smoke

```bash
python /mnt/dolphinng5_predict/prod/ops/dita_v2_live_bingx_smoke.py --pink --symbol TRXUSDT
```

Use `--dry-run` to print the exact env and pytest command without trading.

### Stop sequence

1. `python /mnt/dolphinng5_predict/prod/ops/pink_ctl.py stop`
2. Confirm `status` shows the process stopped.
3. Inspect logs: `/tmp/dolphin_logs/supervisor/dolphin_live_pink.log`

605
prod/docs/PINK_BINGX_SIMPLIFICATION_SPEC_2026-05-22.md
Normal file
@@ -0,0 +1,605 @@
# PINK BingX Simplification Spec

Status: Draft for implementation review
Date: 2026-05-22
Owner: Runtime / Trading Systems
Scope: PINK only, with BLUE parity preserved for algorithm comparison

## 1. Purpose

This spec defines a simplified live-trading architecture for PINK that:

1. Preserves the BLUE algorithm exactly.
2. Makes every engine action observable.
3. Uses the exchange as the authoritative source of live position truth.
4. Reuses the existing data structures needed for BLUE/PINK comparison.
5. Reduces hidden state and duplicate decision paths.
6. Keeps PINK mechanically comparable to BLUE wherever the exchange model allows it.

This document does **not** change the signal math, thresholds, or TP/exit logic.
It only simplifies how those decisions move through the system and how they are recorded.
Where BingX semantics differ from BLUE's historical execution surface, the difference must be isolated behind the execution boundary rather than pushed into the engine.

## 2. Design Goals

The architecture must satisfy all of the following:

- Faithfulness to BLUE's original algorithm.
- Full observability of actions as:
  - fired
  - requested
  - sent
  - acknowledged
  - executed
  - reflected on BingX
- Minimal complexity.
- Maximum reuse of existing tables, maps, and record shapes.
- Clean comparability between BLUE and PINK.
- No second domain-level truth source.

## 3. Non-Goals

This spec does not:

- Change the trading signal formula.
- Change the TP value or exit semantics.
- Add a second live source of truth.
- Replace supervisor with a new process manager.
- Introduce a new order ledger when existing tables can be reused.

## 4. Core Principle

PINK must be exchange-led.

That means:

- BingX position state is authoritative for whether the slot is open.
- BingX open-order state is authoritative for whether an exit is pending.
- Account state is a projection of confirmed exchange events.
- Local engine state is a projection of exchange state plus decision metadata.
- ClickHouse is the durable audit trail.
- Hazelcast is the live control/state bus.
- The TUI is a derived view only.

If local state and BingX state disagree, the system must reconcile toward BingX.

BLUE comparability rule:

- The engine-side lifecycle, state names, and record shapes should remain BLUE-compatible unless BingX makes that impossible.
- Any unavoidable exchange-specific deviation must be isolated in the execution adapter and event normalization path.
- The engine itself should remain oblivious to BingX quirks except for the minimal authority rules needed to stay safe.

## 5. Minimal State Model

The system should keep only these live state categories:

- Decision state
  - what the engine decided
- Order state
  - what was requested and acknowledged
- Position state
  - what BingX currently holds
- Account state
  - capital, leverage, open notional
- Terminal trade state
  - completed trades only

Everything else should be derived from those categories.

The simplification target is not "remove layers entirely".
It is "make the layers explicit and narrow":

```text
engine intent
  -> execution facade
  -> exchange adapter
  -> exchange
  -> event normalization
  -> durable ledger
```

The `execution facade` is where BLUE-compatible semantics are preserved.
The `exchange adapter` is where BingX-specific request/response shapes live.
`event normalization` is a thin technical return channel inside the execution boundary:

- dedupe exchange callbacks
- normalize terminal states
- map exchange facts into canonical trade/account events
- update projections and durable rows

It is not a separate policy or trading layer.

This spec uses the following DITA split:

- `Decision`
  - pure signal evaluation
- `Intent`
  - candidate selection and sizing proposal
- `Trade`
  - single-slot lifecycle state machine
- `Account`
  - projection of confirmed execution facts

## 6. Existing Data Structures to Reuse

This spec reuses the current structures instead of introducing parallel ones.

### 6.1 ClickHouse tables

- `dolphin_pink.position_state`
  - lifecycle source for open and closed trade status
- `dolphin_pink.trade_events`
  - terminal ledger for completed trades
- `dolphin_pink.account_events`
  - capital and exposure snapshots
- `dolphin_pink.v7_decision_events`
  - decision trail
- `dolphin_pink.adaptive_exit_shadow`
  - shadow-only exit analysis

### 6.2 Hazelcast maps

- `DOLPHIN_STATE_PINK`
- `DOLPHIN_PNL_PINK`
- `DOLPHIN_FEATURES`
- `DOLPHIN_SAFETY`
- `DOLPHIN_HEARTBEAT`

### 6.3 Exchange-side sources

- `user/positions`
- `trade/openOrders`
- `trade/allOrders`
- `trade/allFillOrders`

## 7. Authoritative Precedence

Live truth must be resolved in this order:

```text
BingX user/positions
        ↓
BingX trade/openOrders
        ↓
BingX journal snapshot
        ↓
ClickHouse account_events / position_state
        ↓
Hazelcast engine snapshot
        ↓
Supervisor log fallback
```

Rules:

- The first matching live BingX signal wins.
- Local snapshots may lag and must not override BingX.
- Log parsing is a last resort only.

For BLUE comparability:

- The adapter must emit the same semantic milestones BLUE would expose, even if the physical exchange response is different.
- If BingX cannot express a BLUE milestone exactly, preserve the closest semantic equivalent and annotate the deviation in the event payload.

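The precedence order can be sketched as a first-answer-wins lookup. This is only an illustration: the source keys and the zero-argument reader callables are hypothetical stand-ins, and a reader is assumed to return `None` when its source is unavailable (an empty position list from BingX is still an answer, not `None`).

```python
from typing import Callable, Optional

# Hypothetical source keys, ordered as in the precedence list above.
PRECEDENCE = (
    "bingx_positions",
    "bingx_open_orders",
    "bingx_journal_snapshot",
    "clickhouse_state",
    "hazelcast_snapshot",
    "supervisor_log",
)


def resolve_live_truth(readers: dict[str, Callable[[], Optional[dict]]]) -> tuple[str, dict]:
    """Return (source_name, snapshot) from the highest-precedence source that answers."""
    for name in PRECEDENCE:
        reader = readers.get(name)
        if reader is None:
            continue  # source not wired in this deployment
        snapshot = reader()
        if snapshot is not None:
            return name, snapshot  # lower-precedence sources are never consulted
    raise LookupError("no source produced a snapshot")
```

Because the loop returns on the first answer, a lagging Hazelcast or ClickHouse snapshot cannot override a BingX answer under this scheme.
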
## 8. High-Level Data Flow

```text
       +------------------+
       | Binance data     |
       | / HZ features    |
       +---------+--------+
                 |
                 v
       +------------------+
       | DolphinActor     |
       | (BLUE logic)     |
       +---------+--------+
                 |
                 v
       +------------------+
       | NDAlphaEngine    |
       | single slot only |
       +---------+--------+
                 |
        decision / request
                 |
                 v
       +------------------+
       | Execution facade |
       | BLUE-compatible  |
       +---------+--------+
                 |
     exchange-specific request
                 |
                 v
       +------------------+
       | BingXExecClient  |
       +---------+--------+
                 |
                 v
       +------------------+
       | BingX VST        |
       | positions/orders |
       +---------+--------+
                 |
     poll / ack / fill / close
                 |
                 v
       +------------------+
       | journal snapshot |
       +---------+--------+
                 |
                 v
+----------------+----------------+
| ClickHouse + Hazelcast + TUI    |
+---------------------------------+
```

## 9. Order Lifecycle

The system should treat every trade as a simple state machine.

```text
EMPTY
  |
  v
DECISION_CREATED
  |
  v
ORDER_REQUESTED
  |
  v
ORDER_SENT
  |
  v
ORDER_ACKNOWLEDGED
  |
  v
POSITION_OPENED
  |
  v
POSITION_UPDATED
  |
  v
EXIT_REQUESTED
  |
  v
EXIT_SENT
  |
  v
EXIT_ACKNOWLEDGED
  |
  v
POSITION_CLOSED
  |
  v
TRADE_TERMINAL_WRITTEN
  |
  v
EMPTY
```

Rules:

- A trade is not "closed" until BingX no longer reports the position.
- A terminal close row is not optional.
- The close row must be written after exchange-event normalization confirms terminality, not before.

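A minimal sketch of the lifecycle as a transition table, assuming only the happy path drawn in the diagram (reject and cancel branches are not modeled). The `advance` helper and its `exchange_reports_position` flag are hypothetical names used to illustrate the first rule above.

```python
# Happy-path order copied from the lifecycle diagram.
LIFECYCLE = (
    "EMPTY",
    "DECISION_CREATED",
    "ORDER_REQUESTED",
    "ORDER_SENT",
    "ORDER_ACKNOWLEDGED",
    "POSITION_OPENED",
    "POSITION_UPDATED",
    "EXIT_REQUESTED",
    "EXIT_SENT",
    "EXIT_ACKNOWLEDGED",
    "POSITION_CLOSED",
    "TRADE_TERMINAL_WRITTEN",
)

# Each state may only advance to its successor; the last state wraps back to EMPTY.
NEXT_STATE = {
    state: LIFECYCLE[(i + 1) % len(LIFECYCLE)] for i, state in enumerate(LIFECYCLE)
}


def advance(state: str, target: str, *, exchange_reports_position: bool = False) -> str:
    """Validate one transition and return the new state."""
    if NEXT_STATE.get(state) != target:
        raise ValueError(f"illegal transition {state} -> {target}")
    # Rule: a trade is not closed while BingX still reports the position.
    if target == "POSITION_CLOSED" and exchange_reports_position:
        raise ValueError("BingX still reports the position; trade is not closed")
    return target
```

Since `TRADE_TERMINAL_WRITTEN` is only reachable from `POSITION_CLOSED`, the close row cannot be written before the exchange-confirmed close in this sketch.
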
## 10. Open / Update / Close Mechanics

### 10.1 Open

1. Engine produces a decision.
2. Actor converts it into an intent.
3. Execution facade normalizes the request into a BLUE-compatible action record.
4. Execution client submits the request to BingX.
5. BingX acknowledges or rejects.
6. BingX position becomes authoritative once open.
7. Event normalization updates `position_state` and account projections.

### 10.2 Update

1. Execution client polls `openOrders`.
2. Execution facade records the requested action.
3. Execution client polls `user/positions`.
4. Execution client refreshes account state.
5. Journal snapshot is persisted.
6. ClickHouse rows are appended.
7. Hazelcast state is refreshed.
8. TUI renders the derived result.

### 10.3 Close

1. Engine or exit manager requests exit.
2. Execution facade normalizes the exit into the same lifecycle that BLUE would represent.
3. Exit order is submitted reduce-only.
4. BingX confirms fill or terminal state.
5. Exchange position disappears.
6. Event normalization emits the terminal close fact.
7. `trade_events` close row is written.
8. `position_state` is updated to closed.

## 11. Reconciliation Model

In this spec, "reconciliation" is not a first-class domain layer.
It is the thin adapter-side return path that converts BingX facts into canonical events and projections.

The simplified model is:

```text
engine intent
  -> exchange submission
  -> exchange state
  -> event normalization
  -> durable ledger
```

Not:

```text
engine intent
  -> local inferred close
  -> maybe exchange close later
```

The second pattern is what creates ghost closes and confusing TUI state.

The return path must remain thin and mostly transparent:

- confirm what BingX actually did
- translate exchange reality into canonical engine state and durable ledger rows
- backfill only the minimum terminal bookkeeping needed to keep the audit trail complete

It must not:

- make trading decisions
- invent or reinterpret strategy state
- act as a second policy layer
- override engine intent except where required to reflect BingX authority

In other words:

```text
policy lives in the engine
translation lives in the execution boundary
truth lives on BingX
```

If the return path starts shaping strategy behavior, the architecture has drifted.

## 12. ClickHouse Accounting Contract

### 12.1 `account_events`

This table must represent the latest authoritative snapshot of:

- capital
- open positions
- open notional
- leverage
- fills metadata

It is not the source of truth for execution. It is the projection of confirmed execution facts and the best table for capital-path replay.

### 12.2 `position_state`

This table must represent per-trade lifecycle state.

Required lifecycle states:

- `OPEN`
- `EXIT_REQUESTED`
- `EXIT_ACKED`
- `CLOSED`
- `RECONCILED`

This table is the canonical lifecycle projection, not a second engine.

### 12.3 `trade_events`

This table must represent terminal closed trades only.

Rules:

- one terminal row per completed trade
- dedupe by `trade_id`
- never infer a close row from a fill snapshot alone

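The `trade_id` dedupe rule can be sketched on the reader side as follows. This assumes rows arrive as dicts and that the first row seen for a `trade_id` is the one to keep; both are assumptions for illustration, as is the `pnl` field used in the usage note.

```python
def dedupe_terminal_rows(rows: list[dict]) -> list[dict]:
    """Keep the first terminal row per trade_id, preserving input order."""
    seen: set[str] = set()
    unique: list[dict] = []
    for row in rows:
        trade_id = row["trade_id"]
        if trade_id in seen:
            continue  # duplicate insert for an already-recorded trade
        seen.add(trade_id)
        unique.append(row)
    return unique
```

For example, summing a hypothetical `pnl` field over `dedupe_terminal_rows(rows)` rather than over `rows` keeps a double-inserted close from being counted twice in a replay.
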
### 12.4 `status_snapshots`

When capital replay is needed, `status_snapshots` remains the preferred capital-path source because it captures:

- capital
- posture
- `trades_executed`
- `rm`
- `vol_ok`
- related snapshot state

`trade_events` alone is not enough for capital replay.

## 13. PINK and BLUE Comparison Rules

PINK must remain structurally comparable to BLUE.

That means:

- same trade identity model
- same key fields for open/close events
- same exit reason vocabulary
- same capital accounting semantics
- same bar and hold semantics

Namespace differences are allowed.
Semantic differences are not.

The DITA split must stay semantically compatible with BLUE:

- decision semantics preserved
- intent selection preserved
- trade lifecycle compatible
- account projection comparable
- return-channel normalization exchange-specific only

## 14. Simplification Rules

To reduce bugs, do the following:

### 14.1 Keep one authoritative open-slot view

Do not maintain competing local definitions of "open trade".

### 14.2 Stop inventing closed trades in the TUI

The TUI may display:

- open positions
- terminal trades
- fills

It must not convert fills into fake closes.

### 14.3 Remove recovery ambiguity

At startup:

- BingX positions are imported
- stale local slots are cleared
- journal state is restored only when it does not contradict BingX
- account projection is rebuilt from confirmed exchange facts, not from intent history

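The startup rules above might look like this in outline. Everything here is a hypothetical shape chosen for illustration: positions as a `symbol -> size` dict, journal entries as dicts with `size` and `trade_id`, and "contradiction" reduced to a size mismatch.

```python
def recover_at_startup(
    exchange_positions: dict[str, float],
    local_slots: dict[str, dict],
    journal: dict[str, dict],
) -> tuple[dict[str, dict], list[str]]:
    """Rebuild the slot view from BingX; return (slots, stale_symbols)."""
    slots: dict[str, dict] = {}
    for symbol, size in exchange_positions.items():
        slot = {"size": size, "trade_id": None}
        entry = journal.get(symbol)
        # Restore journal metadata only when it agrees with the exchange.
        if entry is not None and entry["size"] == size:
            slot["trade_id"] = entry["trade_id"]
        slots[symbol] = slot
    # Local slots with no exchange position are stale and are not carried over.
    stale = sorted(set(local_slots) - set(exchange_positions))
    return slots, stale
```

The returned `stale` list is there so the caller can emit a correction event for each cleared slot instead of dropping it silently.
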
### 14.4 Keep the event trail append-only

If a state needs correction, emit a new event.
Do not rewrite history.

## 15. ASCII Failure Modes

### 15.1 Ghost close

```text
EXIT_REQUESTED
   |
   v
EXIT_SENT
   |
   +--> local snapshot says CLOSED
   |
   +--> BingX still shows position OPEN
   |
   v
BUG: local UI looks flat, exchange is not flat
```

### 15.2 Missing terminal row

```text
EXIT_ACKNOWLEDGED
   |
   v
POSITION_CLOSED on BingX
   |
   v
trade_events row missing
   |
   v
BUG: replay/debug cannot prove the close
```

### 15.3 Duplicate ledger row

```text
trade_events insert
   |
   +--> duplicate insert for same trade_id
   |
   v
BUG: replay capital is overstated unless deduped
```

## 16. Acceptance Criteria

The simplification is acceptable only if all of the following hold:

1. BLUE algorithm behavior is preserved exactly.
2. PINK trades can be compared to BLUE trades using the same structures.
3. Every order action is visible in the trail.
4. Every close can be traced to BingX terminal state.
5. TUI never invents a close.
6. Capital replay can be reconstructed from `status_snapshots` plus deduped trade rows.
7. BingX remains the authoritative open-position source.

## 17. Implementation Boundaries

The following are the expected boundaries for any implementation work:

- Launcher layer
  - namespace wiring only
- Actor layer
  - engine-slot projection and adapter ingress
- Execution facade layer
  - BLUE-compatible action normalization
  - order lifecycle event emission
- BingX execution layer
  - order submit / poll / reconcile / snapshot
- Journal layer
  - durable bridge into ClickHouse
- Observability layer
  - derived display only

The return path should be treated as a translation boundary, not a policy boundary.
Its ideal steady state is nearly invisible.

Any new BingX-specific behavior should go in the execution or adapter-ingress path, not in the engine decision logic.

## 18. Recommended Simplified Architecture

```text
[decision]
    |
    v
[intent]
    |
    v
[trade FSM]
    |
    v
[execution adapter]
    |
    v
[BingX order/position]
    |
    v
[event normalization]
    |
    v
[ClickHouse account + trade ledger]
    |
    v
[TUI / replay]
```

This is the simplest version that still preserves BLUE faithfulness and auditability.

## 19. Open Questions

These are implementation questions, not design blockers:

- Should PINK `trade_events` remain fully separate from BLUE-compatible schema, or only namespace-tagged?
- Should the TUI use `account_events` or `position_state` as the primary open-trade panel source?
- Should `position_state` become the canonical lifecycle table for all live strategies, or only PINK first?
- Should any exchange callback normalization be shared with BLUE, or remain PINK-only until parity is proven?

## 20. Final Decision

The target simplification is:

- one engine
- one exchange authority
- one append-only audit trail
- one derived TUI
- one replay path

Anything that introduces a second truth source should be removed or demoted.

Reconciliation, if the term is retained at all, should mean only the thin adapter-side normalization of BingX facts into canonical events and account projection. It should not exist as a policy layer.

79
prod/docs/PINK_DITAV2_FAULT_TAXONOMY.md
Normal file
@@ -0,0 +1,79 @@
# PINK-on-DITAv2 Fault Taxonomy & Operator Response

## Fault Classes

### RATE_LIMITED
**Kernel code**: `KernelDiagnosticCode.RATE_LIMITED`
**Severity**: WARNING
**Recovery**: Automatic; the kernel retries on the next step cycle.

Operator action: none required unless persistent (>10 min). If persistent, check BingX API limits at `/openApi/swap/v2/user/balance` directly. Reduce poll frequency via `DOLPHIN_PINK_POLL_INTERVAL_SEC` (default 1.0s).

### ORDER_REJECTED
**Kernel code**: `KernelDiagnosticCode.ORDER_REJECTED`
**Entry reject**: Slot returns to IDLE. Decision engine will re-evaluate on next cycle.
**Exit reject**: Slot stays in EXIT_WORKING. Decision engine will retry exit.

Operator action: check that the instrument is tradeable on BingX VST. Symbol precision changes or contract suspensions can cause rejects. Inspect `outcome.details` for venue reason text.

### EXIT_ORDER_REJECTED
**Kernel code**: `KernelDiagnosticCode.EXIT_ORDER_REJECTED`
**Slot state**: EXIT_WORKING. The kernel will retry via `process_intent(EXIT)` on the next step where the decision engine produces an exit signal.

Operator action: if the position remains open past `DOLPHIN_MAX_HOLD_BARS` (default 250), manually flatten via `pink_ctl.py` or direct BingX REST.

### CANCEL_REJECTED
**Kernel code**: `KernelDiagnosticCode.CANCEL_REJECTED`
**Slot state**: Unchanged. Cancel is retried on the next cycle.

Operator action: check open orders on BingX. If the order filled between cancel attempt and rejection, the slot will converge on the next reconcile cycle.

### NO_ACTIVE_EXIT_ORDER
**Kernel code**: `KernelDiagnosticCode.NO_ACTIVE_EXIT_ORDER`
**Cause**: Exit intent processed but no working exit order exists (usually because it filled between decision and execution).

Operator action: none. The fill event will converge the slot to CLOSED on the next `on_venue_event` or reconcile.

### STALE_STATE_RECONCILE
**Kernel code**: `KernelDiagnosticCode.STALE_STATE_RECONCILING`
**Slot state**: STALE_STATE_RECONCILING. Normal event progression is blocked until reconciliation completes.

Operator action: if the slot stays in this state for >30s, the exchange snapshot may be inconsistent. Run `pink_ctl.py restart` to force full restart reconcile.

### DUPLICATE_EVENT
**Kernel code**: `KernelDiagnosticCode.DUPLICATE_EVENT`
**Severity**: INFO
**Effect**: Event is dropped. No capital or state change. Idempotency via `seen_event_ids` on the slot.

Operator action: none.

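The `seen_event_ids` idempotency check could work roughly like this. The `Slot` dataclass and `apply_fill` function are illustrative stand-ins and not the kernel's actual types; only the `seen_event_ids` name and the `DUPLICATE_EVENT` code come from the text above.

```python
from dataclasses import dataclass, field


@dataclass
class Slot:
    """Hypothetical slot: realized PnL plus the set of venue event ids already applied."""
    realized_pnl: float = 0.0
    seen_event_ids: set = field(default_factory=set)


def apply_fill(slot: Slot, event_id: str, pnl_delta: float) -> str:
    """Apply a fill's economic effect once; drop replays of the same event_id."""
    if event_id in slot.seen_event_ids:
        return "DUPLICATE_EVENT"  # dropped: no capital or state change
    slot.seen_event_ids.add(event_id)
    slot.realized_pnl += pnl_delta
    return "APPLIED"
```

Recording the id before applying the delta means a replayed event can never reach the accounting step.
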
### RATE_LIMITED (persistent cycle)
**Detection**: Consecutive RATE_LIMITED outcomes with no successful exchange interaction.
**Anomaly row origin**: `ditav2_kernel`

Operator action: check exchange API status. If the rate limit window is known, set `DITA_V2_RATE_LIMIT_COOLDOWN_SEC` in env.

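One way to detect a persistent cycle is a streak counter that resets on any outcome that is not rate-limited. This is a sketch only: the class name and the threshold value are hypothetical, and the document does not state what threshold the kernel uses.

```python
class RateLimitStreak:
    """Count consecutive rate-limited outcomes; any other outcome resets the streak."""

    def __init__(self, threshold: int = 5) -> None:
        self.threshold = threshold  # hypothetical value, not taken from the kernel
        self.count = 0

    def observe(self, rate_limited: bool) -> bool:
        """Record one outcome; return True only on the outcome that reaches the threshold."""
        if not rate_limited:
            self.count = 0
            return False
        self.count += 1
        return self.count == self.threshold
```

Returning `True` only at the moment the threshold is reached lets the caller write a single anomaly row per streak instead of one per retry.
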
## Diagnostic Surface

All fault codes appear in:
- `KernelOutcome.diagnostic_code` (programmatic)
- `KernelOutcome.severity` (INFO/WARNING/ERROR/CRITICAL)
- `KernelOutcome.details` (structured payload with reason, retry_after_ms, etc.)

## Log Paths

- Runtime: `/tmp/dolphin_logs/supervisor/dolphin_live_pink.log`
- Kernel: `/tmp/dolphin_logs/supervisor/dolphin_live_pink-error.log`

## Recovery Tools

```bash
# Check DITAv2 health
python /mnt/dolphinng5_predict/prod/ops/pink_ctl.py ditav2-status

# Full restart reconcile
python /mnt/dolphinng5_predict/prod/ops/pink_ctl.py restart

# Namespace isolation check
python /mnt/dolphinng5_predict/prod/ops/pink_ctl.py mode-verify
```

470
prod/docs/PINK_DITAV2_FILE_BY_FILE_REFACTOR_GUIDE.md
Normal file
@@ -0,0 +1,470 @@
# PINK -> DITAv2 Refactor Guide (File-by-File, Implementation-Ready)

## MANDATORY READ ORDER (Before Any Code Change)

Read these documents in this exact order before touching code:

1. `/mnt/dolphinng5_predict/prod/docs/SYSTEM_BIBLE_v7.md` (PINK/DITA addendum scope only; do not broaden scope into BLUE changes)
2. `/mnt/dolphinng5_predict/prod/docs/PINK_BINGX_SIMPLIFICATION_SPEC_2026-05-22.md`
3. `/mnt/dolphinng5_predict/prod/docs/DITA_V2_KERNEL_REFERENCE.md`
4. `/mnt/dolphinng5_predict/prod/docs/DITA_V2_OPERATOR_PLAYBOOK.md`
5. `/mnt/dolphinng5_predict/prod/docs/CLEAN_ARCH_DITA_REFERENCE_PROD_IMPLEMENTATION_SPEC.md`

Do not begin implementation until these are read and the PINK-only boundary is explicit.

## 0) Scope and Goal

This guide is for refactoring **PINK only** to execute trades through **DITAv2 exclusively** (where DITAv2 facilities exist), while preserving:

1. the shared BLUE/PINK signal and trading algorithm semantics,
2. existing PINK observability contracts (Hazelcast, ClickHouse, TUI),
3. strict non-impact on BLUE.

The target is a PINK runtime that is testnet-stable on BingX, with deterministic execution/accounting and explicit handling of known failure classes (hung orders, non-closes, duplicate events, stale/restart drift, rate limits).

---

## 1) Hard Invariants (Must Hold Throughout)

1. **BLUE untouched**:
   - No behavior changes in BLUE runtime paths.
   - No BLUE namespace changes (`DOLPHIN_STATE_BLUE`, `DOLPHIN_PNL_BLUE`, `dolphin` DB surfaces).

2. **Execution boundary**:
   - PINK execution calls must go through DITAv2 kernel + venue adapter.
   - No direct PINK exchange-submit path outside DITAv2 where DITAv2 has equivalent functionality.

3. **Algo parity**:
   - Entry/exit decision semantics remain shared with BLUE policy logic.
   - DITAv2 is execution/risk-state substrate, not strategy rewrite.

4. **Exchange-led truth**:
   - Reconcile from exchange snapshots; local state follows exchange, not vice versa.

5. **Accounting determinism**:
   - No double-application of realized PnL.
   - Multi-leg closes apply capital deltas exactly once per economic leg.

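The accounting-determinism invariant can be illustrated with a small ledger that turns cumulative realized PnL into isolated per-leg deltas. This assumes, purely for illustration, that each exit leg reports a cumulative realized figure; the `LegLedger` class is hypothetical and not part of the PINK codebase.

```python
class LegLedger:
    """Turn cumulative realized PnL reports into isolated per-leg capital deltas."""

    def __init__(self) -> None:
        self._applied = 0.0  # cumulative PnL already applied to capital

    def reset_on_enter(self) -> None:
        """A new entry starts a fresh trade, so cumulative tracking restarts."""
        self._applied = 0.0

    def on_exit_leg(self, cumulative_realized: float) -> float:
        """Return only the not-yet-applied portion, then mark it applied."""
        delta = cumulative_realized - self._applied
        self._applied = cumulative_realized
        return delta
```

Replaying the same cumulative figure yields a delta of zero, so a repeated report cannot move capital twice.
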
---

## 2) Pre-Refactor Safety Baseline

## 2.1 Files to snapshot before edits

- `/mnt/dolphinng5_predict/prod/launch_dolphin_pink.py`
- `/mnt/dolphinng5_predict/prod/clean_arch/runtime/pink_direct.py`
- `/mnt/dolphinng5_predict/prod/ops/pink_ctl.py`
- `/mnt/dolphinng5_predict/prod/configs/pink.yml`
- `/mnt/dolphinng5_predict/prod/supervisor/dolphin-supervisord.conf`

## 2.2 Baseline behavior capture (mandatory)

Capture and store:

1. PINK entry -> partial exit -> final exit behavior.
2. PINK state transitions for cancel/reject/reconcile.
3. ClickHouse deltas:
   - `dolphin_pink.trade_events`
   - `dolphin_pink.position_state`
   - `dolphin_pink.account_events`
   - `dolphin_pink.v7_decision_events`
4. Hazelcast deltas:
   - `DOLPHIN_STATE_PINK`
   - `DOLPHIN_PNL_PINK`
5. TUI fields used by `dolphin_status_pink.py`.

This is the parity baseline used to prove "algo unchanged, execution substrate changed."

---

## 3) File-by-File Refactor Plan

## 3.1 Runtime entrypoint and boundary

### File: `/mnt/dolphinng5_predict/prod/launch_dolphin_pink.py`

### Objective
Convert launcher wiring so PINK execution is DITAv2-native by default.

### Required edits
1. Keep namespace hardening for PINK:
   - `strategy_name=pink`
   - `DOLPHIN_STATE_PINK`, `DOLPHIN_PNL_PINK`
   - `journal_strategy=pink`, `journal_db=dolphin_pink`
2. Replace/retire legacy DITA execution object graph for trade execution:
   - stop using legacy `prod.clean_arch.dita.*` execution path as primary.
   - construct DITAv2 bundle (`prod.clean_arch.dita_v2.launcher`).
3. Explicit DITAv2 env defaults for PINK launcher:
   - `DITA_V2_VENUE=BINGX`
   - `DITA_V2_ZINC=REAL`
   - `DITA_V2_CONTROL_PLANE=REAL_ZINC`
   - `DITA_V2_HAZELCAST=REAL`
   - `DITA_V2_LAUNCHER_MODE=serve`
4. Keep BingX env safety:
   - `DOLPHIN_BINGX_ENV=VST`
   - `DOLPHIN_BINGX_ALLOW_MAINNET=0`
5. Continue loading `BINGX_API_KEY`/`BINGX_SECRET_KEY` from `.env` contract.

### Acceptance checks
1. PINK launcher starts and uses DITAv2 bundle path.
2. No BLUE state map/DB writes from this path.
3. PINK still exposes expected runtime metadata in HZ/CH.

---

### File: `/mnt/dolphinng5_predict/prod/clean_arch/runtime/pink_direct.py`

### Objective
Replace legacy execution orchestration with DITAv2 intent/event orchestration while preserving decision semantics.

### Required edits
1. Introduce a dedicated translation seam:
   - Decision output -> `KernelIntent` mapping (`ENTER`, `EXIT`, `MARK_PRICE`, `CANCEL`, `RECONCILE`).
2. Route execution through:
   - `ExecutionKernel.process_intent(...)`
   - `ExecutionKernel.on_venue_event(...)` for reconcile/event ingestion.
3. Keep policy/decision logic unchanged:
   - do not rewrite velocity/IRP/threshold policy semantics.
4. On every execution phase:
   - reconcile from exchange (through DITAv2 BingX venue path),
   - project state from DITAv2 slot/account snapshot,
   - emit persistence payloads from DITAv2 outcomes/events.
5. Handle diagnostics explicitly:
   - `RATE_LIMITED`, `ORDER_REJECTED`, `EXIT_ORDER_REJECTED`, `CANCEL_REJECTED`, `NO_ACTIVE_EXIT_ORDER`, stale/reconcile signals.
6. Enforce idempotence:
   - repeated venue `event_id` must not re-apply economic effects.

### Acceptance checks
1. Slot/FSM states are deterministic for nominal and rejection paths.
2. No hung local state when exchange is flat.
3. PINK accounting rows remain schema-compatible and single-application.

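The translation seam in required edit 1 might be shaped like the sketch below. The real `KernelIntent` signature is not shown in this guide, so `SketchIntent`, `IntentKind`, and the decision dict fields (`action`, `symbol`, `side`, `size`) are hypothetical stand-ins; only the five intent kind names come from the text.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class IntentKind(Enum):
    ENTER = "ENTER"
    EXIT = "EXIT"
    MARK_PRICE = "MARK_PRICE"
    CANCEL = "CANCEL"
    RECONCILE = "RECONCILE"


@dataclass(frozen=True)
class SketchIntent:
    """Stand-in for the kernel's intent type; field names are assumptions."""
    kind: IntentKind
    symbol: str
    side: Optional[str] = None
    size: Optional[float] = None


def decision_to_intent(decision: dict, slot_open: bool) -> Optional[SketchIntent]:
    """Pure translation: no thresholds or signal math are evaluated here."""
    action = decision["action"]
    if action == "ENTER" and not slot_open:
        return SketchIntent(IntentKind.ENTER, decision["symbol"], decision["side"], decision["size"])
    if action == "EXIT" and slot_open:
        return SketchIntent(IntentKind.EXIT, decision["symbol"])
    return None  # nothing to send for this decision
```

Keeping the function free of policy logic is the point of the seam: the decision engine stays unchanged and only the shape of its output is adapted.
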
---

### File: `/mnt/dolphinng5_predict/prod/clean_arch/adapters/bingx_direct.py`

### Objective
Keep exchange edge behavior normalized for DITAv2 and resilient to rate limits.

### Required edits
1. Preserve/extend mapping of BingX throttle responses to `RATE_LIMITED`.
2. Ensure refresh/reconcile endpoints degrade safely (empty snapshot) under transient throttles rather than crashing runtime.
3. Preserve `reduceOnly` semantics for exits and close-out operations.
4. Ensure all normalization fields required by DITAv2 are present:
   - `orderId`, `clientOrderId`, `status`, reason/message, retry hints.

### Acceptance checks
1. Adapter never causes runtime crash on nominal exchange throttling.
2. DITAv2 receives normalized status it can classify deterministically.

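Required edits 1 and 2 can be sketched as a throttle classifier plus a refresh wrapper that degrades instead of raising. The assumptions are loud here: throttling is detected only via HTTP status 429, the `msg` body field and the `Throttled` exception are hypothetical, and the `degraded` flag is an illustrative addition so callers can tell an empty-because-throttled snapshot from a genuinely flat account.

```python
from typing import Callable, Optional


class Throttled(Exception):
    """Hypothetical exception an HTTP layer raises when the venue throttles a call."""

    def __init__(self, retry_after_ms: Optional[int] = None) -> None:
        super().__init__("rate limited")
        self.retry_after_ms = retry_after_ms


def classify_response(http_status: int, body: dict) -> dict:
    """Normalize a raw response into a status the kernel can classify."""
    reason = body.get("msg", "")  # assumed field name for the venue's message text
    if http_status == 429:
        return {"status": "RATE_LIMITED", "retryable": True, "reason": reason}
    if http_status == 200:
        return {"status": "OK", "retryable": False, "reason": reason}
    return {"status": "ERROR", "retryable": False, "reason": reason}


def safe_refresh(fetch_positions: Callable[[], list]) -> dict:
    """Return an empty, explicitly degraded snapshot instead of crashing on a throttle."""
    try:
        return {"positions": fetch_positions(), "degraded": False, "retry_after_ms": None}
    except Throttled as exc:
        return {"positions": [], "degraded": True, "retry_after_ms": exc.retry_after_ms}
```

Without a marker like `degraded`, an empty snapshot returned under throttling could be mistaken for "exchange is flat", which would conflict with the exchange-led truth invariant.
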
---

### File: `/mnt/dolphinng5_predict/prod/clean_arch/dita_v2/bingx_venue.py`

### Objective
Guarantee PINK gets first-class DITAv2 venue events for all exchange reactions.

### Required edits
1. Keep/extend mapping for:
   - ACK/FILL/PARTIAL_FILL
   - REJECT/CANCEL_REJECT
   - RATE_LIMITED
2. Ensure `metadata` carries actionable downstream fields:
   - retryability, `retry_after_ms` if present, reason, venue status text.
3. Ensure `reconcile()` emits consistent event stream usable for restart recovery.

### Acceptance checks
1. No "unknown event kind" on observed BingX payloads.
2. Reconcile events are sufficient to converge slot state after restart.

---

## 3.2 PINK persistence and observability compatibility

### File: `/mnt/dolphinng5_predict/prod/clean_arch/persistence/pink_clickhouse.py`

### Objective
Keep PINK tables contract-compatible while sourcing execution truth from DITAv2.

### Required edits
1. Ensure row builders consume DITAv2 outcome/event metadata where needed.
2. Preserve existing table contracts:
   - `policy_events`
   - `v7_decision_events`
   - `trade_events`
   - `position_state`
   - `account_events`
   - `anomaly_events`
3. Add explicit anomaly rows for:
   - rate-limited retry cycles breaching threshold,
   - hung-order timeout escalations,
   - reconcile divergence resolution events.

### Acceptance checks
1. No schema drift breaking existing PINK dashboards/TUI.
2. Capital/event rows reconcile to exchange-led lifecycle.

---

### File: `/mnt/dolphinng5_predict/prod/clickhouse/pink/*.sql`

### Objective
Ensure schema supports DITAv2 diagnostic characterization without breaking old readers.

### Required edits
1. Add columns only if required and backward-compatible:
   - diagnostic code,
   - severity,
   - retryability/retry hints,
   - reconcile markers.
2. Do not remove or repurpose existing columns read by current tooling.

### Acceptance checks
1. Existing readers still run.
2. New DITAv2 fault/diagnostic fields are queryable.

---

### File: `/mnt/dolphinng5_predict/prod/ops/pink_ctl.py`

### Objective
Make PINK operator tooling DITAv2-aware.

### Required edits
1. Keep PINK namespace isolation checks as-is.
2. Add DITAv2-specific health assertions:
   - kernel mode/verbosity/backend mode from control plane,
   - DITAv2 process health in supervisor.
|
||||
3. Add a command (or output block) for live smoke execution status.
|
||||
|
||||
### Acceptance checks
|
||||
1. `status`, `healthcheck`, `mode-verify` remain PINK-only.
|
||||
2. Tool can detect DITAv2 miswiring immediately.
|
||||
|
||||
---
|
||||
|
||||
### File: `/mnt/dolphinng5_predict/prod/supervisor/dolphin-supervisord.conf`
|
||||
|
||||
### Objective
|
||||
Ensure PINK runs supervised with DITAv2-backed runtime, BLUE unaffected.
|
||||
|
||||
### Required edits
|
||||
1. Keep BLUE programs unchanged.
|
||||
2. Ensure `dolphin_pink` program points to refactored PINK launcher path.
|
||||
3. Keep clear comments that PINK is VST/testnet and isolated.
|
||||
|
||||
### Acceptance checks
|
||||
1. `supervisorctl status` shows BLUE and PINK independently healthy.
|
||||
2. Stopping/restarting PINK does not impact BLUE services.
|
||||
|
||||
---
|
||||
|
||||
## 3.3 Test harness and execution quality
|
||||
|
||||
### File: `/mnt/dolphinng5_predict/prod/tests/test_pink_bingx_dita_live_e2e.py`
|
||||
|
||||
### Objective
|
||||
Primary live testnet acceptance suite for PINK-on-DITAv2.
|
||||
|
||||
### Required edits
|
||||
1. Ensure it drives DITAv2 path only.
|
||||
2. Include full operational gamut:
|
||||
- entry
|
||||
- mark
|
||||
- partial exit
|
||||
- final exit
|
||||
- cancel/cancel-after-flat
|
||||
- reconcile/restart-style checks
|
||||
3. Accept nominal exchange reactions while asserting deterministic kernel finality.
|
||||
4. Add explicit verification blocks:
|
||||
- open orders/positions are flat after cleanup,
|
||||
- no orphan slot state.
|
||||
|
||||
### Acceptance checks
|
||||
1. Suite passes reliably with rate-limit-respectful cadence.
|
||||
2. No residual exposure after test completion.
|
||||
|
||||
---
|
||||
|
||||
### File: `/mnt/dolphinng5_predict/prod/tests/test_pink_direct_runtime.py`
|
||||
|
||||
### Objective
|
||||
Kernel integration correctness in non-live conditions.
|
||||
|
||||
### Required edits
|
||||
1. Replace old execution assertions with DITAv2-based assertions:
|
||||
- intent mapping,
|
||||
- emitted events,
|
||||
- diagnostic handling,
|
||||
- slot transitions.
|
||||
2. Add tests for duplicate event replay and stale-state reconcile.
|
||||
|
||||
### Acceptance checks
|
||||
1. Runtime behavior is deterministic under mock/fuzzed event schedules.
|
||||
2. No double-booking of capital in partial/full close chains.
|
||||
|
||||
---
|
||||
|
||||
### File: `/mnt/dolphinng5_predict/prod/tests/test_pink_clickhouse_persistence.py`
|
||||
|
||||
### Objective
|
||||
Prevent accounting/persistence regressions.
|
||||
|
||||
### Required edits
|
||||
1. Validate per-leg and terminal close semantics from DITAv2 outcomes.
|
||||
2. Validate anomaly/diagnostic row emission for non-nominal conditions.
|
||||
|
||||
### Acceptance checks
|
||||
1. Capital deltas and position-state terminality are consistent.
|
||||
2. Replay/restart write paths remain coherent.
|
||||
|
||||
---
|
||||
|
||||
### New test files to add
|
||||
|
||||
1. `/mnt/dolphinng5_predict/prod/tests/test_pink_ditav2_kernel_bridge.py`
|
||||
- Decision->KernelIntent mapping table tests.
|
||||
|
||||
2. `/mnt/dolphinng5_predict/prod/tests/test_pink_ditav2_rate_limit_contract.py`
|
||||
- Retryable warning classification + downstream emission tests.
|
||||
|
||||
3. `/mnt/dolphinng5_predict/prod/tests/test_pink_ditav2_restart_reconcile.py`
|
||||
- crash/restart reconcile convergence tests.
|
||||
|
||||
4. `/mnt/dolphinng5_predict/prod/tests/test_pink_ditav2_accounting_invariants.py`
|
||||
- multi-leg non-double-book proofs.
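
As an illustration of what a multi-leg non-double-book invariant might check, here is a minimal sketch. It assumes a long position with a single entry price; `Slot` and `exit_leg` are hypothetical names, not the kernel's slot model:

```python
from dataclasses import dataclass


@dataclass
class Slot:
    """Toy long-only position slot used to illustrate per-leg accounting."""

    size: float
    entry_price: float
    realized: float = 0.0
    closed: bool = False

    def exit_leg(self, requested: float, price: float) -> float:
        """Apply one exit leg and return its isolated (non-cumulative) PnL delta."""
        if self.closed:
            raise RuntimeError("slot already closed: EXIT rejected")
        qty = min(requested, self.size)  # clamp: never sell more than remains
        delta = (price - self.entry_price) * qty
        self.size -= qty
        self.realized += delta
        if self.size == 0:
            self.closed = True  # terminal close happens exactly once
        return delta


slot = Slot(size=10.0, entry_price=100.0)
leg1 = slot.exit_leg(4.0, 101.0)   # partial exit
leg2 = slot.exit_leg(10.0, 102.0)  # overshoot request, clamped to the remaining 6.0
```

The invariant under test is that the per-leg deltas sum to the slot's realized total and that a closed slot rejects any further exit.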
|
||||
|
||||
---
|
||||
|
||||
## 3.4 Documentation and runbooks
|
||||
|
||||
### Files to update
|
||||
|
||||
1. `/mnt/dolphinng5_predict/prod/docs/PINK_BINGX_SIMPLIFICATION_SPEC_2026-05-22.md`
|
||||
2. `/mnt/dolphinng5_predict/prod/docs/DITA_V2_KERNEL_REFERENCE.md`
|
||||
3. `/mnt/dolphinng5_predict/prod/docs/DITA_V2_OPERATOR_PLAYBOOK.md`
|
||||
4. `/mnt/dolphinng5_predict/prod/docs/SYSTEM_BIBLE_v7.md` (addendum only)
|
||||
|
||||
### Required doc updates
|
||||
1. Explicit statement: PINK execution boundary is DITAv2.
|
||||
2. Exact live smoke and healthcheck commands.
|
||||
3. Fault taxonomy and operator response for rate limit/reject/hung/reconcile paths.
|
||||
4. BLUE non-impact proof checklist.
|
||||
|
||||
---
|
||||
|
||||
## 4) Implementation Sequence (Strict Order)
|
||||
|
||||
1. Freeze BLUE + baseline capture.
|
||||
2. Launcher boundary wiring (`launch_dolphin_pink.py`).
|
||||
3. Runtime bridge (`pink_direct.py`) to DITAv2 intents/events.
|
||||
4. Persistence projection alignment (`pink_clickhouse.py` + SQL if needed).
|
||||
5. Operator/control updates (`pink_ctl.py`, supervisor stanza check).
|
||||
6. Non-live tests (unit/integration/fsm).
|
||||
7. Mock E2E and chaos/fuzz.
|
||||
8. Live BingX testnet basic cycles.
|
||||
9. Live BingX testnet chaos/fuzz.
|
||||
10. Soak and finalize docs/runbook.
|
||||
|
||||
Do not reorder. Live testing is not allowed before the accounting invariants pass.
|
||||
|
||||
---
|
||||
|
||||
## 5) Mandatory Validation Matrix
|
||||
|
||||
## 5.1 Deterministic execution finality
|
||||
|
||||
For each action path (ENTER, EXIT partial, EXIT final, CANCEL, RECONCILE), assert:
|
||||
|
||||
1. deterministic final slot state,
|
||||
2. deterministic diagnostic code on failure paths,
|
||||
3. deterministic account/capital projection effect.
|
||||
|
||||
## 5.2 Known failure class coverage
|
||||
|
||||
1. Hung order:
|
||||
- timeout monitor triggers,
|
||||
- reconcile/cancel cycle emits diagnostics,
|
||||
- eventual terminality is explicit.
|
||||
|
||||
2. Non-close:
|
||||
- position remains visible in exchange snapshot until actually flat,
|
||||
- no premature local close state.
|
||||
|
||||
3. Duplicate/replayed events:
|
||||
- no duplicate capital/PnL application.
|
||||
|
||||
4. Restart/reconcile drift:
|
||||
- restart with open exchange position converges to correct slot state.
|
||||
|
||||
5. Rate limit:
|
||||
- classified as retryable warning,
|
||||
- downstream emitted with code/severity/hints,
|
||||
- no state corruption.
|
||||
|
||||
## 5.3 Namespace isolation
|
||||
|
||||
1. No `pink` strategy rows in `dolphin` or `dolphin_prodgreen`.
|
||||
2. No PINK writes to BLUE HZ maps.
|
||||
3. PINK stop/start/restart has zero BLUE impact.
|
||||
|
||||
---
|
||||
|
||||
## 6) Cutover and Rollback
|
||||
|
||||
## 6.1 Cutover gates
|
||||
|
||||
All must be true:
|
||||
|
||||
1. Non-live suite green.
|
||||
2. Mock E2E + chaos/fuzz green.
|
||||
3. Live testnet basic and chaos/fuzz green.
|
||||
4. No unresolved hung/non-close cases in soak window.
|
||||
5. Accounting parity checks pass.
|
||||
|
||||
## 6.2 Rollback trigger conditions
|
||||
|
||||
Rollback immediately if any:
|
||||
|
||||
1. unresolved exposure after cleanup,
|
||||
2. non-deterministic capital drift,
|
||||
3. repeated stale/reconcile divergence,
|
||||
4. contamination of BLUE/PRODGREEN namespaces.
|
||||
|
||||
## 6.3 Rollback action
|
||||
|
||||
1. Stop PINK only.
|
||||
2. Revert PINK launcher/runtime to pre-refactor revision.
|
||||
3. Keep forensic artifacts (CH/HZ rows, logs, diagnostics) for postmortem.
|
||||
|
||||
---
|
||||
|
||||
## 7) Operational Commands (Post-Refactor)
|
||||
|
||||
1. PINK control:
|
||||
```bash
python /mnt/dolphinng5_predict/prod/ops/pink_ctl.py status
python /mnt/dolphinng5_predict/prod/ops/pink_ctl.py healthcheck
python /mnt/dolphinng5_predict/prod/ops/pink_ctl.py mode-verify
```
|
||||
|
||||
2. DITAv2 live smoke command (rate-limit respectful suite):
|
||||
```bash
python /mnt/dolphinng5_predict/prod/ops/dita_v2_live_bingx_smoke.py --symbol TRXUSDT
```
|
||||
|
||||
3. Dry-run (no orders):
|
||||
```bash
python /mnt/dolphinng5_predict/prod/ops/dita_v2_live_bingx_smoke.py --dry-run --symbol TRXUSDT
```
|
||||
|
||||
---
|
||||
|
||||
## 8) Definition of Done
|
||||
|
||||
1. PINK uses DITAv2 execution facilities exclusively where available.
|
||||
2. Shared BLUE/PINK strategy semantics are preserved.
|
||||
3. BLUE is behaviorally unaffected.
|
||||
4. PINK supports entries, exits, partial exits, TP/SL-driven exits, cancel/reconcile/restart.
|
||||
5. Accounting is deterministic and restart-safe.
|
||||
6. Live testnet E2E + chaos/fuzz passes with exchange-side verification.
|
||||
608
prod/docs/PINK_PODMAN_QUADLET_REARCH_SPEC_2026-05-19.md
Normal file
@@ -0,0 +1,608 @@
|
||||
# PINK Re-Architecture Specification (Implementation Blueprint)
|
||||
|
||||
Status: Approved-for-coding spec (no code in this document)
|
||||
Date: 2026-05-19
|
||||
Owner: Runtime/Infra
|
||||
Target: Add isolated `PINK` testnet execution system with identical trading algorithm behavior to BLUE, while keeping BLUE undisturbed.
|
||||
|
||||
---
|
||||
|
||||
## 1. Executive Decision
|
||||
|
||||
### 1.1 Decision
|
||||
Build `PINK` as an **isolated sidecar system** with dedicated namespaces and control surfaces, then optionally migrate that sidecar’s infra onto Podman+Quadlet+systemd.
|
||||
|
||||
### 1.2 Why this decision
|
||||
- BLUE must remain undisturbed.
|
||||
- Current codebase hard-routes many `prod*` paths into PRODGREEN sinks; a naive clone collides.
|
||||
- BingX account journaling currently dominates data volume and must be controlled explicitly.
|
||||
|
||||
### 1.3 Non-negotiable invariant
|
||||
The **trading algorithm logic must remain identical to BLUE** (signal math, thresholds, decision state machine semantics).
|
||||
|
||||
---
|
||||
|
||||
## 2. Hard Constraints (Must Hold)
|
||||
|
||||
1. No behavior change in core trading logic vs BLUE.
|
||||
2. No write contamination across BLUE/GREEN/PINK CH databases.
|
||||
3. No write contamination across BLUE/GREEN/PINK Hazelcast maps.
|
||||
4. BLUE process manager and lifecycle remain unchanged during PINK buildout.
|
||||
5. PINK must run BingX in VST/testnet mode only until explicit go-live gate.
|
||||
6. Any infra re-architecture must be introduced to PINK first, never by replacing BLUE in-place.
|
||||
|
||||
---
|
||||
|
||||
## 3. Current-State Evidence (Reference Anchors)
|
||||
|
||||
### 3.1 Supervisord-first doctrine
|
||||
- `prod/docs/SYSTEM_BIBLE_v7.md` states all dolphin services are supervisord-managed and warns against dual-management races.
|
||||
- See:
|
||||
- `/mnt/dolphinng5_predict/prod/docs/SYSTEM_BIBLE_v7.md:11`
|
||||
- `/mnt/dolphinng5_predict/prod/docs/SYSTEM_BIBLE_v7.md:1339`
|
||||
- `/mnt/dolphinng5_predict/prod/docs/SYSTEM_BIBLE_v7.md:3377`
|
||||
|
||||
### 3.2 Namespace split already in doctrine
|
||||
- BLUE: `dolphin`, `DOLPHIN_STATE_BLUE`, `DOLPHIN_PNL_BLUE`
|
||||
- PRODGREEN: `dolphin_prodgreen`, `DOLPHIN_STATE_PRODGREEN`, `DOLPHIN_PNL_PRODGREEN`
|
||||
- See:
|
||||
- `/mnt/dolphinng5_predict/prod/docs/SYSTEM_BIBLE_v7.md:11`
|
||||
- `/mnt/dolphinng5_predict/prod/docs/SYSTEM_BIBLE_v7.md:14`
|
||||
|
||||
### 3.3 Hardcoded routing that collides with new strategy names
|
||||
- BLUE trader hardcoded map keys:
|
||||
- `/mnt/dolphinng5_predict/prod/nautilus_event_trader.py:1740`
|
||||
- `/mnt/dolphinng5_predict/prod/nautilus_event_trader.py:1741`
|
||||
- `/mnt/dolphinng5_predict/prod/nautilus_event_trader.py:1850`
|
||||
- `/mnt/dolphinng5_predict/prod/nautilus_event_trader.py:2806`
|
||||
- `DolphinActor` routes `strategy.startswith("prod")` to PRODGREEN sink:
|
||||
- `/mnt/dolphinng5_predict/nautilus_dolphin/nautilus_dolphin/nautilus/dolphin_actor.py:179`
|
||||
- `/mnt/dolphinng5_predict/nautilus_dolphin/nautilus_dolphin/nautilus/dolphin_actor.py:180`
|
||||
- BingX execution hardcodes PRODGREEN strategy/db:
|
||||
- `/mnt/dolphinng5_predict/prod/bingx/execution.py:263`
|
||||
- `/mnt/dolphinng5_predict/prod/bingx/execution.py:532`
|
||||
- BingX journal maps `prod*` -> `dolphin_prodgreen`:
|
||||
- `/mnt/dolphinng5_predict/prod/bingx/journal.py:90`
|
||||
- `/mnt/dolphinng5_predict/prod/bingx/journal.py:91`
|
||||
|
||||
### 3.4 Current BingX poll cadence (main source of account-event volume)
|
||||
- Poll loops:
|
||||
- open orders loop
|
||||
- positions loop
|
||||
- account loop
|
||||
- See:
|
||||
- `/mnt/dolphinng5_predict/prod/bingx/execution.py:707`
|
||||
- `/mnt/dolphinng5_predict/prod/bingx/execution.py:723`
|
||||
- `/mnt/dolphinng5_predict/prod/bingx/execution.py:732`
|
||||
- `/mnt/dolphinng5_predict/prod/bingx/execution.py:741`
|
||||
- Default intervals:
|
||||
- `/mnt/dolphinng5_predict/prod/bingx/config.py:58`
|
||||
- `/mnt/dolphinng5_predict/prod/bingx/config.py:59`
|
||||
- `/mnt/dolphinng5_predict/prod/bingx/config.py:60`
|
||||
|
||||
### 3.5 Data volumes measured (14 complete days; 2026-05-05 to 2026-05-18)
|
||||
- BLUE-like CH outgoing payload estimate: ~4.17 MB/day avg, ~12.01 MB/day p95-day.
|
||||
- BLUE-like HZ outgoing payload estimate: ~100.03 MB/day avg, ~301.53 MB/day p95-day.
|
||||
- PRODGREEN-style BingX `account_events` stream estimate: ~7.41 GB/day avg, ~18.57 GB/day p95-day.
|
||||
|
||||
---
|
||||
|
||||
## 4. Scope
|
||||
|
||||
## 4.1 In scope
|
||||
1. Introduce first-class `PINK` namespace contract across CH/HZ/control-plane.
|
||||
2. Preserve algorithm semantics exactly.
|
||||
3. Isolate PINK execution in BingX VST.
|
||||
4. Add explicit friction/cost characterization outputs.
|
||||
5. Add infra spec for Podman+Quadlet+systemd deployment of PINK stack.
|
||||
|
||||
## 4.2 Out of scope
|
||||
1. Any change to signal formula/thresholds/risk decision logic.
|
||||
2. Any BLUE teardown or manager migration in this phase.
|
||||
3. Any LIVE mainnet enablement for PINK.
|
||||
|
||||
---
|
||||
|
||||
## 5. Naming and Namespace Contract
|
||||
|
||||
## 5.1 Strategy naming
|
||||
- Strategy name for new instance: `pink` (lowercase).
|
||||
- Disallowed for this phase: names with `prod` prefix (e.g., `prodpink`) because current routing treats `prod*` specially.
|
||||
|
||||
## 5.2 ClickHouse namespace
|
||||
- New DB: `dolphin_pink`.
|
||||
- Required tables (minimum):
|
||||
- `trade_events`
|
||||
- `trade_reconstruction`
|
||||
- `trade_exit_legs`
|
||||
- `v7_decision_events`
|
||||
- `adaptive_exit_shadow`
|
||||
- `account_events`
|
||||
- `status_snapshots`
|
||||
- Optional parity tables if needed by downstream tooling:
|
||||
- `sc_threshold_advisor_shadow`
|
||||
- `sc_bucket_gauge_shadow`
|
||||
- `inverse_ars_bounce_shadow`
|
||||
|
||||
## 5.3 Hazelcast namespace
|
||||
- Maps:
|
||||
- `DOLPHIN_STATE_PINK`
|
||||
- `DOLPHIN_PNL_PINK`
|
||||
- Control-plane runtime command queue key:
|
||||
- `pink_runtime_commands`
|
||||
- Capital mirror key:
|
||||
- `pink_capital_update_latest`
|
||||
|
||||
## 5.4 Trader identity
|
||||
- Trader ID default:
|
||||
- `DOLPHIN-PINK-001`
|
||||
|
||||
---
|
||||
|
||||
## 6. Required File-Level Changes (Coding Agent Worklist)
|
||||
|
||||
Important: This section is prescriptive. Implement all items unless explicitly marked optional.
|
||||
|
||||
## 6.1 Sink/routing abstraction
|
||||
|
||||
### 6.1.1 `prod/ch_writer.py`
|
||||
Current state exposes only `_writer`, `_writer_green`, `_writer_prodgreen` and corresponding functions.
|
||||
- Source anchor: `/mnt/dolphinng5_predict/prod/ch_writer.py:302`
|
||||
|
||||
Required:
|
||||
1. Add `_writer_pink = _CHWriter(db="dolphin_pink")`.
|
||||
2. Add `ch_put_pink(table: str, row: dict) -> None`.
|
||||
3. Do not modify behavior of existing sink functions.
|
||||
|
||||
Acceptance:
|
||||
- A unit test asserts that writes issued via `ch_put_pink` target `dolphin_pink` only.
|
||||
|
||||
### 6.1.2 `prod/bingx/journal.py`
|
||||
Current `_db_for_strategy` routes `prod*` to `dolphin_prodgreen`.
|
||||
- Anchor: `/mnt/dolphinng5_predict/prod/bingx/journal.py:88`
|
||||
|
||||
Required:
|
||||
1. Replace ad-hoc prefix routing with explicit strategy->db map.
|
||||
2. Add explicit `pink -> dolphin_pink` mapping.
|
||||
3. Keep existing `blue`, `green`, `prodgreen` compatibility.
|
||||
4. Update sink selection to include `ch_put_pink`.
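
A minimal sketch of explicit strategy-to-database routing, using only the database names this document states (`dolphin`, `dolphin_prodgreen`, `dolphin_pink`). The `green` entry is omitted because its database name is not given here, and `db_for_strategy` is a hypothetical stand-in for `_db_for_strategy`. Raising on unknown names is a design choice of this sketch; the real change would need to decide how legacy `prod*` names are handled.

```python
STRATEGY_DB = {
    "blue": "dolphin",
    "prodgreen": "dolphin_prodgreen",
    "pink": "dolphin_pink",
    # "green": add the existing GREEN database name here (not stated in this document)
}


def db_for_strategy(strategy: str) -> str:
    """Resolve a strategy name to its ClickHouse database via an explicit map."""
    key = strategy.strip().lower()
    try:
        return STRATEGY_DB[key]
    except KeyError:
        # No prefix fallback: an unmapped name fails loudly instead of
        # silently landing in another strategy's database.
        raise ValueError(f"no ClickHouse database mapped for strategy {strategy!r}") from None
```

With this shape, a name such as `prodpink` raises instead of being routed to `dolphin_prodgreen` by prefix.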
|
||||
|
||||
Acceptance:
|
||||
- For `strategy='pink'`, both journal snapshot writes and lookup reads use only `dolphin_pink`.
|
||||
|
||||
### 6.1.3 `prod/bingx/execution.py`
|
||||
Current code hardcodes:
|
||||
- `self._journal_strategy = "prodgreen"`
|
||||
- account-events insert URL database `dolphin_prodgreen`
|
||||
- Anchors:
|
||||
- `/mnt/dolphinng5_predict/prod/bingx/execution.py:263`
|
||||
- `/mnt/dolphinng5_predict/prod/bingx/execution.py:532`
|
||||
|
||||
Required:
|
||||
1. Add config-driven `journal_strategy` and `journal_db` fields.
|
||||
2. Default for existing prodgreen path remains unchanged.
|
||||
3. PINK launcher passes `journal_strategy='pink'`, `journal_db='dolphin_pink'`.
|
||||
4. Remove any remaining hardcoded `dolphin_prodgreen` in account-event path.
|
||||
|
||||
Acceptance:
|
||||
- No writes from PINK execution appear in `dolphin_prodgreen.account_events`.
|
||||
|
||||
## 6.2 Actor and launcher namespace configurability
|
||||
|
||||
### 6.2.1 `prod/launch_dolphin_live.py`
|
||||
Current defaults are prodgreen-centric:
|
||||
- state/pnl maps and strategy name.
|
||||
- Anchors:
|
||||
- `/mnt/dolphinng5_predict/prod/launch_dolphin_live.py:78`
|
||||
- `/mnt/dolphinng5_predict/prod/launch_dolphin_live.py:79`
|
||||
- `/mnt/dolphinng5_predict/prod/launch_dolphin_live.py:132`
|
||||
|
||||
Required:
|
||||
1. Introduce generic env-driven namespace fields:
|
||||
- `DOLPHIN_STRATEGY_NAME`
|
||||
- `DOLPHIN_STATE_MAP`
|
||||
- `DOLPHIN_PNL_MAP`
|
||||
- `DOLPHIN_ADAPTIVE_EXIT_DB`
|
||||
- `DOLPHIN_V7_JOURNAL_DB`
|
||||
2. Keep prodgreen defaults backward-compatible.
|
||||
3. Add dedicated PINK launcher module or mode wrapper with PINK defaults.
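
One way to sketch the env-driven namespace contract is shown below. The environment variable names come from the list above; the PINK default values for `DOLPHIN_ADAPTIVE_EXIT_DB` and `DOLPHIN_V7_JOURNAL_DB` are assumptions inferred from the `dolphin_pink` namespace, and `load_namespace` is a hypothetical helper.

```python
import os
from typing import Dict, Mapping

# Assumed PINK defaults; the two *_DB values are inferred, not confirmed.
PINK_DEFAULTS: Dict[str, str] = {
    "DOLPHIN_STRATEGY_NAME": "pink",
    "DOLPHIN_STATE_MAP": "DOLPHIN_STATE_PINK",
    "DOLPHIN_PNL_MAP": "DOLPHIN_PNL_PINK",
    "DOLPHIN_ADAPTIVE_EXIT_DB": "dolphin_pink",
    "DOLPHIN_V7_JOURNAL_DB": "dolphin_pink",
}


def load_namespace(env: Mapping[str, str], defaults: Mapping[str, str]) -> Dict[str, str]:
    """Resolve each namespace field from the environment, falling back to defaults."""
    return {key: env.get(key, default) for key, default in defaults.items()}


# A PINK launcher wrapper could resolve its namespace like this:
pink_namespace = load_namespace(os.environ, PINK_DEFAULTS)
```

Passing the environment as a parameter keeps the resolver testable without mutating `os.environ`.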
|
||||
|
||||
Acceptance:
|
||||
- Running PINK launcher without overrides lands in PINK namespaces only.
|
||||
|
||||
### 6.2.2 `nautilus_dolphin/nautilus/.../dolphin_actor.py`
|
||||
Current default + routing:
|
||||
- `strategy_name='prodgreen'`
|
||||
- `startswith("prod")` sink logic
|
||||
- state/pnl defaults map to PRODGREEN
|
||||
- Anchors:
|
||||
- `/mnt/dolphinng5_predict/nautilus_dolphin/nautilus_dolphin/nautilus/dolphin_actor.py:179`
|
||||
- `/mnt/dolphinng5_predict/nautilus_dolphin/nautilus_dolphin/nautilus/dolphin_actor.py:180`
|
||||
- `/mnt/dolphinng5_predict/nautilus_dolphin/nautilus_dolphin/nautilus/dolphin_actor.py:181`
|
||||
- `/mnt/dolphinng5_predict/nautilus_dolphin/nautilus_dolphin/nautilus/dolphin_actor.py:185`
|
||||
- `/mnt/dolphinng5_predict/nautilus_dolphin/nautilus_dolphin/nautilus/dolphin_actor.py:189`
|
||||
|
||||
Required:
|
||||
1. Replace prefix-based sink selection with explicit strategy mapping.
|
||||
2. Add first-class `pink` mapping for CH sink + default shadow db.
|
||||
3. Keep old strategy names functional.
|
||||
4. Ensure aliases do not include BLUE keys in PINK mode.
|
||||
|
||||
Acceptance:
|
||||
- Actor in `pink` mode never writes to `DOLPHIN_STATE_PRODGREEN`, `DOLPHIN_PNL_PRODGREEN`, or `dolphin_prodgreen`.
|
||||
|
||||
## 6.3 Control-plane keys and capital surfaces
|
||||
|
||||
### 6.3.1 `prod/nautilus_event_trader.py` (if PINK reuses this path)
|
||||
Current BLUE hardcoding includes:
|
||||
- `DOLPHIN_STATE_BLUE`, `DOLPHIN_PNL_BLUE`, `blue_runtime_commands`
|
||||
- Anchors:
|
||||
- `/mnt/dolphinng5_predict/prod/nautilus_event_trader.py:1740`
|
||||
- `/mnt/dolphinng5_predict/prod/nautilus_event_trader.py:1741`
|
||||
- `/mnt/dolphinng5_predict/prod/nautilus_event_trader.py:1850`
|
||||
- `/mnt/dolphinng5_predict/prod/nautilus_event_trader.py:2806`
|
||||
|
||||
Required (only if this file is used for PINK runtime):
|
||||
1. Parameterize map names and runtime queue key.
|
||||
2. Preserve BLUE defaults exactly.
|
||||
3. Add PINK equivalents via env/config.
|
||||
|
||||
Acceptance:
|
||||
- `SET_CAPITAL` / `CAPITAL_UPDATE` for PINK only affects PINK state surfaces.
|
||||
|
||||
Note: Preferred approach is to keep BLUE runtime on this file untouched and run PINK through launcher/actor path first.
|
||||
|
||||
## 6.4 Ops scripts and tooling
|
||||
|
||||
### 6.4.1 `prod/ops/prodgreen_ctl.py`
|
||||
Current script is hardcoded to PRODGREEN namespaces.
|
||||
- Anchors:
|
||||
- `/mnt/dolphinng5_predict/prod/ops/prodgreen_ctl.py:23`
|
||||
- `/mnt/dolphinng5_predict/prod/ops/prodgreen_ctl.py:24`
|
||||
- `/mnt/dolphinng5_predict/prod/ops/prodgreen_ctl.py:42`
|
||||
|
||||
Required:
|
||||
1. Create `pink_ctl.py` OR generalize into namespace-aware ctl tool.
|
||||
2. Required commands: status, healthcheck, start, stop, restart, mode-verify.
|
||||
3. Must not invoke BLUE program names by default.
|
||||
|
||||
Acceptance:
|
||||
- `pink_ctl status` reports PINK CH/HZ surfaces only.
|
||||
|
||||
---
|
||||
|
||||
## 7. ClickHouse Schema Plan for `dolphin_pink`
|
||||
|
||||
## 7.1 Strategy
|
||||
Clone `prodgreen` schema set as baseline for PINK to preserve execution-profile columns.
|
||||
|
||||
Reference DDLs:
|
||||
- `/mnt/dolphinng5_predict/prod/clickhouse/prodgreen/00_create_database.sql`
|
||||
- `/mnt/dolphinng5_predict/prod/clickhouse/prodgreen/account_events.sql`
|
||||
- `/mnt/dolphinng5_predict/prod/clickhouse/prodgreen/status_snapshots.sql`
|
||||
- `/mnt/dolphinng5_predict/prod/clickhouse/prodgreen/trade_events.sql`
|
||||
- `/mnt/dolphinng5_predict/prod/clickhouse/prodgreen/v7_decision_events.sql`
|
||||
- `/mnt/dolphinng5_predict/prod/clickhouse/prodgreen/adaptive_exit_shadow.sql`
|
||||
- `/mnt/dolphinng5_predict/prod/clickhouse/prodgreen/02_create_trade_reconstruction.sql`
|
||||
- `/mnt/dolphinng5_predict/prod/clickhouse/prodgreen/03_create_trade_exit_legs.sql`
|
||||
|
||||
## 7.2 Required migration artifacts
|
||||
Create new folder:
|
||||
- `prod/clickhouse/pink/`
|
||||
|
||||
Include:
|
||||
1. `00_create_database.sql` -> `CREATE DATABASE IF NOT EXISTS dolphin_pink;`
|
||||
2. Full table DDL scripts mirroring prodgreen table structures.
|
||||
3. Apply script with idempotent checks.
|
||||
|
||||
## 7.3 Guardrails
|
||||
1. No schema mutation to existing `dolphin` or `dolphin_prodgreen` in this phase.
|
||||
2. No historical retagging/movement required for initial PINK bring-up.
|
||||
|
||||
---
|
||||
|
||||
## 8. Hazelcast Map and Key Contract
|
||||
|
||||
## 8.1 Required map names
|
||||
- `DOLPHIN_STATE_PINK`
|
||||
- `DOLPHIN_PNL_PINK`
|
||||
|
||||
## 8.2 Required keys in `DOLPHIN_STATE_PINK`
|
||||
- `engine_snapshot`
|
||||
- `capital_checkpoint`
|
||||
- `latest_nautilus`
|
||||
- optional replay/control keys mirrored from the BLUE contract, if the PINK runtime supports the same capital workflows
|
||||
|
||||
## 8.3 Control-plane keys
|
||||
- Runtime command queue: `pink_runtime_commands`
|
||||
- Latest capital update mirror: `pink_capital_update_latest`
|
||||
|
||||
## 8.4 Isolation validation rule
|
||||
A PINK process must never read/write keys under `DOLPHIN_STATE_BLUE` or `DOLPHIN_PNL_BLUE` except explicitly allowed read-only analytics queries.
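
The isolation rule could be enforced with a small guard in front of map access. The following is a sketch only; `check_pink_access`, `NamespaceViolation`, and the `analytics_allowed` flag are hypothetical names, and how the "explicitly allowed" exemption is granted is left open:

```python
BLUE_MAPS = frozenset({"DOLPHIN_STATE_BLUE", "DOLPHIN_PNL_BLUE"})


class NamespaceViolation(RuntimeError):
    """Raised when a PINK process touches a BLUE map outside the allowed case."""


def check_pink_access(map_name: str, *, write: bool, analytics_allowed: bool = False) -> None:
    """Raise unless the access is to a non-BLUE map or an allowed read-only query."""
    if map_name not in BLUE_MAPS:
        return
    if write or not analytics_allowed:
        mode = "write" if write else "read"
        raise NamespaceViolation(f"PINK {mode} on {map_name} is not permitted")
```

Writes to BLUE maps are rejected unconditionally; only reads can be exempted, and only when the caller opts in explicitly.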
|
||||
|
||||
---
|
||||
|
||||
## 9. BingX VST Behavior Contract
|
||||
|
||||
## 9.1 Environment
|
||||
- `DOLPHIN_BINGX_ENV=VST`
|
||||
- `DOLPHIN_BINGX_ALLOW_MAINNET=0`
|
||||
|
||||
## 9.2 Expected data venue / exec venue
|
||||
- Initial recommended mode:
|
||||
- data venue: BINANCE (same sensing stream as BLUE)
|
||||
- exec venue: BINGX VST
|
||||
|
||||
## 9.3 Leverage/sizing mode
|
||||
- Use existing sizing-mode mechanisms.
|
||||
- No strategy-logic change permitted.
|
||||
|
||||
---
|
||||
|
||||
## 10. Data Resource Budget and Controls
|
||||
|
||||
## 10.1 Baseline estimates (from measured data)
|
||||
|
||||
### CH + HZ for BLUE-like write path
|
||||
- CH: ~4.17 MB/day avg, ~12.01 MB/day p95-day
|
||||
- HZ: ~100.03 MB/day avg, ~301.53 MB/day p95-day
|
||||
|
||||
### BingX journal risk stream
|
||||
- `account_events`: ~7.41 GB/day avg, ~18.57 GB/day p95-day if current high-rate snapshots remain.
|
||||
|
||||
## 10.2 Mandatory control for `account_events`
|
||||
Implement at least one, preferably multiple:
|
||||
1. Snapshot delta suppression beyond fingerprint-only (field-level sampling and minimum emission interval).
|
||||
2. `ACCOUNT_REFRESH` write interval floor (e.g., min 2s, then tune).
|
||||
3. Optionally route high-granularity snapshots to a separate debug table; production `account_events` should be rate-limited.
|
||||
4. Configurable hard cap alert on rows/minute.
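
Controls 1 and 2 can be combined in one small gate. The sketch below assumes snapshots are JSON-serializable dictionaries; `SnapshotEmitter` is a hypothetical name, and the 2-second floor is only the example value mentioned above:

```python
import hashlib
import json
from typing import Any, Dict, Optional


class SnapshotEmitter:
    """Suppress unchanged snapshots and enforce a minimum emission interval."""

    def __init__(self, min_interval_s: float = 2.0) -> None:
        self.min_interval_s = min_interval_s
        self._last_fingerprint: Optional[str] = None
        self._last_emit_ts: Optional[float] = None

    @staticmethod
    def fingerprint(snapshot: Dict[str, Any]) -> str:
        """Stable hash of the snapshot content (key order does not matter)."""
        canonical = json.dumps(snapshot, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def should_emit(self, snapshot: Dict[str, Any], now_s: float) -> bool:
        """Return True only for a changed snapshot outside the interval floor."""
        fp = self.fingerprint(snapshot)
        if fp == self._last_fingerprint:
            return False  # unchanged content: suppress
        if self._last_emit_ts is not None and now_s - self._last_emit_ts < self.min_interval_s:
            return False  # changed, but still inside the interval floor
        self._last_fingerprint = fp
        self._last_emit_ts = now_s
        return True
```

A change that arrives inside the floor is not recorded as emitted, so the next poll after the floor expires still sees it as new and writes it.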
|
||||
|
||||
## 10.3 Acceptance thresholds
|
||||
1. The sustained PINK `account_events` rate must stay at or below the agreed cap (initial policy: <= 5 rows/sec averaged over 15 min, unless debug mode is explicitly enabled).
|
||||
2. Alert if the rate exceeds the cap for more than 3 consecutive windows.
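
A sketch of the threshold check, assuming row counts are collected once per 15-minute window. At 5 rows/sec a window may hold at most 4500 rows. `should_alert` is a hypothetical name:

```python
from typing import Iterable


def should_alert(
    rows_per_window: Iterable[int],
    max_rows_per_sec: float = 5.0,
    window_seconds: int = 15 * 60,
    max_consecutive: int = 3,
) -> bool:
    """Return True once more than max_consecutive windows in a row exceed the cap."""
    streak = 0
    for rows in rows_per_window:
        if rows / window_seconds > max_rows_per_sec:
            streak += 1
            if streak > max_consecutive:
                return True
        else:
            streak = 0  # a compliant window resets the streak
    return False
```

Three breaching windows do not alert; the fourth consecutive one does, which matches "more than 3 consecutive windows".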
|
||||
|
||||
---
|
||||
|
||||
## 11. Observability and ROI/Friction Outputs
|
||||
|
||||
## 11.1 Required KPI outputs
|
||||
1. Realized ROI (closed trades).
|
||||
2. Open-equity ROI (mark-to-market).
|
||||
3. Cost-adjusted ROI.
|
||||
4. Latency decomposition:
|
||||
- decision->submit
|
||||
- submit->ack
|
||||
- ack->first_fill
|
||||
- first_fill->done
|
||||
5. Slippage decomposition (bps against decision/arrival references).
|
||||
6. Fee/funding components.
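
The latency and slippage decompositions are simple differences over lifecycle timestamps and prices. A minimal sketch, assuming millisecond integer timestamps keyed by stage name and a sign convention where positive slippage is adverse; both function names are hypothetical:

```python
from typing import Dict

STAGES = ("decision", "submit", "ack", "first_fill", "done")


def latency_breakdown_ms(ts_ms: Dict[str, int]) -> Dict[str, int]:
    """Split total latency into the four stage-to-stage intervals."""
    return {f"{a}->{b}": ts_ms[b] - ts_ms[a] for a, b in zip(STAGES, STAGES[1:])}


def slippage_bps(reference_price: float, fill_price: float, side: str) -> float:
    """Slippage in basis points against a reference; positive means adverse."""
    sign = 1.0 if side.upper() == "BUY" else -1.0
    return sign * (fill_price - reference_price) / reference_price * 1e4


timestamps = {"decision": 1000, "submit": 1004, "ack": 1030, "first_fill": 1100, "done": 1250}
breakdown = latency_breakdown_ms(timestamps)
```

The same `slippage_bps` call can be made once against the decision-time price and once against the arrival price to produce the two references named in item 5.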
|
||||
|
||||
## 11.2 Storage location
|
||||
- PINK metrics rows in `dolphin_pink.trade_events` payload columns and/or dedicated execution quality table.
|
||||
|
||||
## 11.3 TUI policy
|
||||
Current TUI is BLUE-hardcoded in places (`DOLPHIN_STATE_BLUE`, `dolphin.trade_events`, `blue_runtime_commands`).
|
||||
- Anchors:
|
||||
- `/mnt/dolphinng5_predict/Observability/dolphin_status.py:513`
|
||||
- `/mnt/dolphinng5_predict/Observability/dolphin_status.py:558`
|
||||
- `/mnt/dolphinng5_predict/Observability/dolphin_status.py:1164`
|
||||
- `/mnt/dolphinng5_predict/Observability/dolphin_status.py:212`
|
||||
|
||||
Required:
|
||||
1. Do not break BLUE TUI.
|
||||
2. Add either:
|
||||
- separate `dolphin_status_pink.py`, or
|
||||
- namespace-parameterized TUI mode.
|
||||
|
||||
---
|
||||
|
||||
## 12. Podman + Quadlet + systemd Adoption Plan
|
||||
|
||||
## 12.1 Strategy
|
||||
Apply only to PINK stack first.
|
||||
|
||||
## 12.2 Preflight checks (must pass before coding)
|
||||
1. Podman availability on host (`podman --version`).
|
||||
2. systemd user/service model chosen (rootless preferred unless operationally blocked).
|
||||
3. Persistent volume paths and permissions validated.
|
||||
4. ClickHouse config/users mounts are at parity with the current docker-compose pattern.
|
||||
|
||||
Current host note: Podman is not installed (`which podman` returned no result).
|
||||
|
||||
## 12.3 Unit boundaries
|
||||
- BLUE stays under supervisord + current docker compose infra.
|
||||
- PINK gets independent unit set.
|
||||
- Do not dual-manage same runtime process with supervisord and systemd.
|
||||
|
||||
## 12.4 Quadlet file set for PINK
|
||||
Create under dedicated path (example):
|
||||
- `datastack-pink.pod`
|
||||
- `hazelcast-pink.container` (or reuse cluster only if explicitly designed shared)
|
||||
- `clickhouse-pink.container` (or shared CH with separate DB if accepted)
|
||||
- `prefect-pink.container` (if needed)
|
||||
- `pink-worker.container`
|
||||
|
||||
## 12.5 Shared vs dedicated infra policy
|
||||
Decision required before implementation:
|
||||
1. Option A (preferred first): shared HZ+CH infra, isolated logical namespaces.
|
||||
2. Option B: dedicated PINK HZ/CH containers.
|
||||
|
||||
Given HZ volatility risk and operational complexity, start with Option A unless a strict physical isolation requirement is imposed.
|
||||
|
||||
---
|
||||
|
||||
## 13. Algorithm Identity Assurance (Critical)
|
||||
|
||||
## 13.1 Required parity harness
|
||||
Implement deterministic parity checks between BLUE decision path and PINK decision path on identical input replay.
|
||||
|
||||
## 13.2 Comparison granularity
|
||||
At each scan/bar, compare a hash of the following tuple:
|
||||
- signal fired boolean
|
||||
- selected asset
|
||||
- side
|
||||
- leverage intent
|
||||
- entry/exit action
|
||||
- reason code
|
||||
- bars_held progression
|
||||
|
||||
No tolerance except for fields explicitly dependent on execution venue acknowledgements.
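
A sketch of the per-bar comparison, assuming each decision is reduced to the seven venue-independent fields listed above. `DecisionTuple`, `decision_hash`, and `first_divergence` are hypothetical names, and the sample values are illustrative only:

```python
import hashlib
import json
from dataclasses import asdict, dataclass, replace
from typing import Optional, Sequence


@dataclass(frozen=True)
class DecisionTuple:
    signal_fired: bool
    asset: str
    side: str
    leverage_intent: float
    action: str
    reason_code: str
    bars_held: int


def decision_hash(decision: DecisionTuple) -> str:
    """Hash the canonical JSON form so equal decisions give equal digests."""
    canonical = json.dumps(asdict(decision), sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def first_divergence(blue: Sequence[DecisionTuple], pink: Sequence[DecisionTuple]) -> Optional[int]:
    """Return the index of the first differing bar, or None if the runs match."""
    for index, (b, p) in enumerate(zip(blue, pink)):
        if decision_hash(b) != decision_hash(p):
            return index
    if len(blue) != len(pink):
        return min(len(blue), len(pink))  # one run ended early
    return None


base = DecisionTuple(True, "TRXUSDT", "SHORT", 2.0, "ENTER", "SIGNAL", 0)
flipped = replace(base, side="LONG")
```

Fields that depend on venue acknowledgements are deliberately left out of the tuple, so the zero-tolerance comparison covers pure strategy decisions only.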
|
||||
|
||||
## 13.3 Fail criteria
|
||||
Any divergence in pure strategy decisions is a release blocker.
|
||||
|
||||
---
|
||||
|
||||
## 14. Test Plan (Implementation Exit Criteria)
|
||||
|
||||
## 14.1 Unit tests
|
||||
1. Routing tests for strategy->DB and strategy->HZ map.
|
||||
2. Sink tests (`ch_put_pink` path).
|
||||
3. Control key tests (`pink_runtime_commands`).
|
||||
4. Account-event rate-limit logic tests.
|
||||
|
||||
## 14.2 Integration tests
|
||||
1. Start PINK in VST and verify:
|
||||
- CH writes only into `dolphin_pink.*`
|
||||
- HZ writes only into `DOLPHIN_STATE_PINK` / `DOLPHIN_PNL_PINK`
|
||||
2. Verify no new rows in `dolphin_prodgreen.account_events` during PINK-only test run.
|
||||
3. Verify BLUE process and metrics unaffected.
|
||||
|
||||
## 14.3 Soak tests
|
||||
1. 24h soak with PINK live in VST.
|
||||
2. Monitor:
|
||||
- row rates
|
||||
- CH insert error rates
|
||||
- HZ heartbeat age
|
||||
- control-plane responsiveness
|
||||
|
||||
## 14.4 Regression tests
|
||||
Run existing relevant suites for:
|
||||
- bingx journaling/accounting
|
||||
- actor routing
|
||||
- launch paths
|
||||
- MHS basic health checks confirming BLUE is unaffected
|
||||
|
||||
---
|
||||
|
||||
## 15. Deployment Sequence (Phased)
|
||||
|
||||
## Phase 0: Namespace groundwork
|
||||
1. Add sink and routing abstractions.
|
||||
2. Add PINK CH schema migration artifacts.
|
||||
3. Add PINK launcher and env contract.
|
||||
|
||||
Gate 0:
|
||||
- Compile/tests pass.
|
||||
- Static grep verifies no hardcoded fallback from `pink` to `prodgreen`.
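
The static check in Gate 0 could be approximated with a literal scan over PINK-only modules. This is a sketch: the list of forbidden literals is an assumption drawn from the routing issues described in this document, and `find_prodgreen_references` is a hypothetical helper.

```python
from typing import List, Tuple

# Assumed markers of a prodgreen fallback; extend as needed.
FORBIDDEN_LITERALS = (
    "dolphin_prodgreen",
    "DOLPHIN_STATE_PRODGREEN",
    "DOLPHIN_PNL_PRODGREEN",
    'startswith("prod")',
)


def find_prodgreen_references(source: str) -> List[Tuple[int, str]]:
    """Return (line number, literal) pairs for every forbidden literal found."""
    hits: List[Tuple[int, str]] = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for literal in FORBIDDEN_LITERALS:
            if literal in line:
                hits.append((lineno, literal))
    return hits
```

Such a scan is only meaningful for files that should be PINK-exclusive; shared modules legitimately keep their PRODGREEN mappings for backward compatibility.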
|
||||
|
||||
## Phase 1: PINK logical bring-up (same infra)
|
||||
1. Start PINK process under current management (or controlled runner) with VST.
|
||||
2. Verify strict namespace isolation.
|
||||
3. Run parity harness with replay feed.
|
||||
|
||||
Gate 1:
|
||||
- No contamination.
|
||||
- Parity pass.
|
||||
|
||||
## Phase 2: Data-volume control tuning
|
||||
1. Tune account-event emission controls.
|
||||
2. Verify row-rate caps and KPI completeness.
|
||||
|
||||
Gate 2:
|
||||
- Resource budgets stable.
|
||||
|
||||
## Phase 3: Optional Podman+Quadlet packaging for PINK
|
||||
1. Build PINK quadlet units.
|
||||
2. Validate independent lifecycle.
|
||||
3. Keep BLUE unchanged.
|
||||
|
||||
Gate 3:
|
||||
- PINK can be fully operated without impacting BLUE.
|
||||
|
||||
---
|
||||
|
||||
## 16. Rollback Plan
|
||||
|
||||
## 16.1 Soft rollback
|
||||
1. Stop PINK process/unit only.
|
||||
2. Leave BLUE untouched.
|
||||
3. Preserve PINK CH/HZ artifacts for postmortem.
|
||||
|
||||
## 16.2 Hard rollback
|
||||
1. Revert routing patches that introduced PINK mapping.
|
||||
2. Keep PINK DB as historical archive or drop only after approval.
|
||||
|
||||
## 16.3 Explicit no-rollback targets
|
||||
Do not alter BLUE capital/state surfaces during PINK rollback.
|
||||
|
||||
---
|
||||
|
||||
## 17. Security and Safety
|
||||
|
||||
1. PINK VST keys isolated from BLUE credentials.
|
||||
2. Mainnet stays disabled unless a separate approval gate sets `DOLPHIN_BINGX_ALLOW_MAINNET=1`.
|
||||
3. Validate no accidental propagation of PINK credentials into shared logs.
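A minimal sketch of enforcing the mainnet gate from item 2 at startup. The `DOLPHIN_BINGX_ALLOW_MAINNET` variable is from this spec; the function names and the `"vst"` / `"mainnet"` mode strings are assumptions made for illustration.

```python
def mainnet_allowed(env) -> bool:
    """True only when the approval flag is set to exactly "1"."""
    return env.get("DOLPHIN_BINGX_ALLOW_MAINNET") == "1"


def check_venue_mode(requested: str, env) -> str:
    """Return the requested venue mode, refusing mainnet without the flag.

    `requested` is a hypothetical mode string: "vst" or "mainnet".
    """
    if requested == "mainnet" and not mainnet_allowed(env):
        raise PermissionError(
            "mainnet requested but DOLPHIN_BINGX_ALLOW_MAINNET is not set to 1"
        )
    return requested
```

In a launcher this would typically be called with `os.environ` before any venue client is constructed, so a misconfigured process exits before it can contact the exchange.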
|
||||
|
||||
---
|
||||
|
||||
## 18. Deliverables Checklist (Coding Agent Must Produce)
|
||||
|
||||
1. Code changes implementing explicit strategy/namespace routing for PINK.
|
||||
2. `dolphin_pink` CH schema files in `prod/clickhouse/pink/`.
|
||||
3. PINK launcher/config entrypoint.
|
||||
4. PINK ops control script or generalized namespace-aware ctl tool.
|
||||
5. Unit + integration tests for routing/isolation.
|
||||
6. Parity harness and parity report artifact.
|
||||
7. Data-rate monitor/report for `account_events` and major tables.
|
||||
8. Optional: Quadlet unit files for PINK stack (if Phase 3 in scope).
|
||||
|
||||
---
|
||||
|
||||
## 19. Coding Prohibitions (Strict)
|
||||
|
||||
1. Do not alter algorithm constants or decision logic behavior.
|
||||
2. Do not remove or repurpose BLUE maps/tables.
|
||||
3. Do not bind PINK to names beginning with `prod` in this phase.
|
||||
4. Do not change BLUE process manager/runtime flow as part of PINK implementation.
|
||||
|
||||
---
|
||||
|
||||
## 20. Open Decisions Requiring Explicit Operator Choice
|
||||
|
||||
1. PINK infra physical model:
|
||||
- shared CH/HZ vs dedicated CH/HZ.
|
||||
2. PINK manager in early phases:
|
||||
- supervised process first vs direct Quadlet rollout.
|
||||
3. Account-event rate cap values:
|
||||
- initial thresholds and alert policy.
|
||||
|
||||
If decisions are not provided, default choices are:
|
||||
- shared CH/HZ with strict logical isolation,
|
||||
- supervised PINK process before Quadlet migration,
|
||||
- account-events cap <= 5 rows/sec sustained (debug off).
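The default cap of 5 rows/sec could be enforced with a sliding-window limiter along these lines. This is a sketch, not the shipped rate-limit logic: `RowRateCap` is a hypothetical class, and what happens to suppressed rows (drop, coalesce, or defer) is a policy decision the spec leaves open.

```python
from collections import deque


class RowRateCap:
    """Sliding one-second window limiter for account-event row emission."""

    def __init__(self, max_rows_per_sec: int = 5):
        self.max_rows = max_rows_per_sec
        self._stamps = deque()  # emission times inside the current window

    def allow(self, now: float) -> bool:
        """Return True and record the emission if the cap permits a row at `now`."""
        # Evict timestamps that are one second old or older.
        while self._stamps and self._stamps[0] <= now - 1.0:
            self._stamps.popleft()
        if len(self._stamps) < self.max_rows:
            self._stamps.append(now)
            return True
        return False
```

Passing `now` explicitly (for example from `time.monotonic()`) keeps the limiter deterministic in unit tests.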
|
||||
|
||||
---
|
||||
|
||||
## 21. Minimal Go/No-Go Matrix
|
||||
|
||||
Go only if all true:
|
||||
1. Strategy parity = exact pass.
|
||||
2. Namespace contamination tests = zero leaks.
|
||||
3. Data-rate caps respected during soak.
|
||||
4. BLUE observability and trade loop unchanged.
|
||||
|
||||
No-Go if any true:
|
||||
1. `pink` rows appear in `dolphin_prodgreen` or `dolphin` unexpectedly.
|
||||
2. BLUE map/table write rates deviate materially from baseline.
|
||||
3. Decision parity drifts.
|
||||
4. VST safety flags not enforced.
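The matrix reduces to a simple rule: every Go condition must hold and no No-Go condition may hold. A sketch, with hypothetical check names standing in for the items above:

```python
def go_decision(go_checks: dict, no_go_checks: dict) -> bool:
    """Return True only if all Go conditions hold and no No-Go condition holds.

    Both arguments map a check name to a boolean result.
    """
    return all(go_checks.values()) and not any(no_go_checks.values())


# Example inputs; the keys are illustrative labels for the listed conditions.
go = {
    "strategy_parity_exact": True,
    "zero_namespace_leaks": True,
    "rate_caps_respected": True,
    "blue_unchanged": True,
}
no_go = {
    "pink_rows_in_foreign_db": False,
    "blue_write_rates_shifted": False,
    "decision_parity_drift": False,
    "vst_flags_not_enforced": False,
}
print(go_decision(go, no_go))
```

Keeping the two lists separate makes the report explicit about which side of the matrix caused a No-Go.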
|
||||
|
||||
---
|
||||
|
||||
## 22. Final Operator Notes
|
||||
|
||||
- This spec intentionally separates **architecture modernization** from **algorithm behavior**.
|
||||
- PINK is the safe proving ground for infra re-architecture.
|
||||
- BLUE remains production reference and must not be structurally disturbed until PINK completes parity + soak + resource gates.
|
||||
|
||||