PINK DITAv2 Sprint 2-3: accounting parity + multi-leg groundwork
Sprint 2 (accounting + observability parity, PINK scope):
- Verified pink_clickhouse.py writes the 8 BLUE-legacy row families at
  matching schema and that capital authority in pink_direct.step() is
  solely kernel.account (no balance-poll overwrite in the hot loop).
- Report: prod/clean_arch/dita_v2/SPRINT2_ACCOUNTING_PARITY.md.

Sprint 3 offline groundwork (no exchange contact):
- Add _write_trade_exit_leg to pink_clickhouse.py: one BLUE-schema-faithful
  trade_exit_legs row per exit leg, with isolated (non-cumulative) per-leg
  deltas tracked via _leg_state (reset on ENTER). Closes the docstring gap.
- New offline suite test_pink_multi_exit_groundwork.py (3 passed):
  * Flaw 4: two-leg exit closes once, realized accrues per leg, closed
    slot rejects further EXIT (no double-close).
  * Overshoot invariant: a final EXIT requesting more than the remaining
    size CLAMPS (size to 0, no oversell), retiring the Sprint 0
    cumulative-ratio risk empirically.
  * trade_exit_legs delta + full BLUE column-set assertions.
- Persistence regression after edits: 10 passed.

BLUE untouched: no changes to dolphin.* / DOLPHIN_*_BLUE / nautilus_event_trader.py.
Live VST multi-leg run remains deferred pending explicit authorization.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
79
prod/docs/PINK_DITAV2_FAULT_TAXONOMY.md
Normal file
@@ -0,0 +1,79 @@
# PINK-on-DITAv2 Fault Taxonomy & Operator Response

## Fault Classes

### RATE_LIMITED
**Kernel code**: `KernelDiagnosticCode.RATE_LIMITED`
**Severity**: WARNING
**Recovery**: Automatic. The kernel retries on the next step cycle.

Operator action: none required unless the condition persists (>10 min). If it does, check BingX API limits directly at `/openApi/swap/v2/user/balance`, and reduce poll frequency by raising `DOLPHIN_PINK_POLL_INTERVAL_SEC` (default 1.0s).
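
The "persists (>10 min)" rule can be checked mechanically from a stream of step outcomes. The following is a minimal sketch, assuming the caller sees one diagnostic code string (or `None`) per step; `RateLimitStreak` and `observe` are hypothetical names and not part of the kernel.

```python
from dataclasses import dataclass
from typing import Optional

PERSIST_THRESHOLD_SEC = 600.0  # ">10 min" from the operator guidance


@dataclass
class RateLimitStreak:
    """Hypothetical helper: tracks how long RATE_LIMITED has been continuous."""

    started_at: Optional[float] = None  # time of the first RATE_LIMITED in the streak

    def observe(self, code: Optional[str], now: float) -> bool:
        """Record one step outcome; return True once the streak exceeds the threshold."""
        if code != "RATE_LIMITED":
            self.started_at = None  # any other outcome ends the streak
            return False
        if self.started_at is None:
            self.started_at = now
        return (now - self.started_at) > PERSIST_THRESHOLD_SEC
```

Passing `now` explicitly (for example from `time.monotonic()`) keeps the helper testable without sleeping.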

### ORDER_REJECTED
**Kernel code**: `KernelDiagnosticCode.ORDER_REJECTED`
**Entry reject**: Slot returns to IDLE. The decision engine re-evaluates on the next cycle.
**Exit reject**: Slot stays in EXIT_WORKING. The decision engine retries the exit.

Operator action: check that the instrument is tradeable on BingX VST. Symbol precision changes or contract suspensions can cause rejects. Inspect `outcome.details` for the venue's reason text.
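
The two reject rules amount to a small transition function. A sketch under stated assumptions: `SlotState`, `IntentKind`, and `state_after_reject` are illustrative names, and only the states mentioned in this section are modelled; the real kernel types may differ.

```python
from enum import Enum


class SlotState(Enum):
    # Only the states named in this section; the real kernel likely has more.
    IDLE = "IDLE"
    EXIT_WORKING = "EXIT_WORKING"


class IntentKind(Enum):
    ENTER = "ENTER"
    EXIT = "EXIT"


def state_after_reject(intent: IntentKind) -> SlotState:
    """Slot state after the venue rejects an order, per the two rules in this section."""
    if intent is IntentKind.ENTER:
        return SlotState.IDLE      # entry reject: slot is free again
    return SlotState.EXIT_WORKING  # exit reject: position still open, exit is retried
```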

### EXIT_ORDER_REJECTED
**Kernel code**: `KernelDiagnosticCode.EXIT_ORDER_REJECTED`
**Slot state**: EXIT_WORKING. The kernel retries via `process_intent(EXIT)` on the next step where the decision engine produces an exit signal.

Operator action: if the position remains open past `DOLPHIN_MAX_HOLD_BARS` (default 250), flatten it manually via `pink_ctl.py` or direct BingX REST.
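
The escalation threshold can be expressed as a one-line check. This sketch assumes the operator tooling knows how many bars the position has been held and that "past" means strictly greater than the limit; `needs_manual_flatten` is a hypothetical helper, not an existing `pink_ctl.py` command.

```python
import os
from typing import Mapping


def needs_manual_flatten(bars_held: int, env: Mapping[str, str] = os.environ) -> bool:
    """True when an open position has outlived DOLPHIN_MAX_HOLD_BARS (default 250)."""
    # Assumption: the variable holds a plain integer string.
    max_hold = int(env.get("DOLPHIN_MAX_HOLD_BARS", "250"))
    return bars_held > max_hold
```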

### CANCEL_REJECTED
**Kernel code**: `KernelDiagnosticCode.CANCEL_REJECTED`
**Slot state**: Unchanged. The cancel is retried on the next cycle.

Operator action: check open orders on BingX. If the order filled between the cancel attempt and the rejection, the slot will converge on the next reconcile cycle.

### NO_ACTIVE_EXIT_ORDER
**Kernel code**: `KernelDiagnosticCode.NO_ACTIVE_EXIT_ORDER`
**Cause**: An exit intent was processed but no working exit order exists (usually because it filled between decision and execution).

Operator action: none. The fill event will converge the slot to CLOSED on the next `on_venue_event` or reconcile.

### STALE_STATE_RECONCILING
**Kernel code**: `KernelDiagnosticCode.STALE_STATE_RECONCILING`
**Slot state**: STALE_STATE_RECONCILING. Normal event progression is blocked until reconciliation completes.

Operator action: if the slot stays in this state for >30s, the exchange snapshot may be inconsistent. Run `pink_ctl.py restart` to force a full restart reconcile.

### DUPLICATE_EVENT
**Kernel code**: `KernelDiagnosticCode.DUPLICATE_EVENT`
**Severity**: INFO
**Effect**: The event is dropped, with no capital or state change. Idempotency is enforced via `seen_event_ids` on the slot.

Operator action: none.
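
Deduplication by event ID can be pictured as a set-membership check before any state change. The sketch below is illustrative only: `seen_event_ids` and `on_venue_event` are names from this document, while the `Slot` class shape, the method signature, and the `"APPLIED"` return value are assumptions.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class Slot:
    """Hypothetical slot holding only what deduplication needs."""

    seen_event_ids: Set[str] = field(default_factory=set)
    applied: int = 0  # counts events that actually changed state

    def on_venue_event(self, event_id: str) -> str:
        if event_id in self.seen_event_ids:
            return "DUPLICATE_EVENT"  # dropped: no capital or state change
        self.seen_event_ids.add(event_id)
        self.applied += 1             # stand-in for the real state transition
        return "APPLIED"
```

Replaying the same fill twice therefore changes state once, which is why no operator action is needed.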

### RATE_LIMITED (persistent cycle)
**Detection**: Consecutive RATE_LIMITED outcomes with no successful exchange interaction.
**Anomaly row origin**: `ditav2_kernel`

Operator action: check exchange API status. If the rate limit window is known, set `DITA_V2_RATE_LIMIT_COOLDOWN_SEC` in env.

## Diagnostic Surface

All fault codes appear in:
- `KernelOutcome.diagnostic_code` (programmatic)
- `KernelOutcome.severity` (INFO/WARNING/ERROR/CRITICAL)
- `KernelOutcome.details` (structured payload with reason, retry_after_ms, etc.)
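
A consumer of these three fields might route each outcome to a log level by severity. This is a sketch only: the `KernelOutcome` dataclass below is a stand-in carrying just the fields listed above, `log_outcome` is a hypothetical helper, and treating an unknown severity as ERROR is a design assumption.

```python
import logging
from dataclasses import dataclass, field
from typing import Any, Dict

# Maps the four listed severities onto stdlib logging levels.
_LEVELS = {
    "INFO": logging.INFO,
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
    "CRITICAL": logging.CRITICAL,
}


@dataclass
class KernelOutcome:
    """Stand-in with the three fields listed above; the real class may differ."""

    diagnostic_code: str
    severity: str
    details: Dict[str, Any] = field(default_factory=dict)


def log_outcome(outcome: KernelOutcome, logger: logging.Logger) -> int:
    """Log one outcome at the level matching its severity; return that level."""
    level = _LEVELS.get(outcome.severity, logging.ERROR)  # unknown severity: be loud
    logger.log(level, "%s details=%s", outcome.diagnostic_code, outcome.details)
    return level
```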

## Log Paths

- Runtime: `/tmp/dolphin_logs/supervisor/dolphin_live_pink.log`
- Kernel: `/tmp/dolphin_logs/supervisor/dolphin_live_pink-error.log`

## Recovery Tools

```bash
# Check DITAv2 health
python /mnt/dolphinng5_predict/prod/ops/pink_ctl.py ditav2-status

# Full restart reconcile
python /mnt/dolphinng5_predict/prod/ops/pink_ctl.py restart

# Namespace isolation check
python /mnt/dolphinng5_predict/prod/ops/pink_ctl.py mode-verify
```