siloqy/prod/docs/PINK_DITAV2_FAULT_TAXONOMY.md
Codex d4b73b236a PINK DITAv2 Sprint 2-3: accounting parity + multi-leg groundwork
Sprint 2 (accounting + observability parity, PINK scope):
- Verified pink_clickhouse.py writes the 8 BLUE-legacy row families at
  matching schema and that capital authority in pink_direct.step() is
  solely kernel.account (no balance-poll overwrite in the hot loop).
- Report: prod/clean_arch/dita_v2/SPRINT2_ACCOUNTING_PARITY.md.

Sprint 3 offline groundwork (no exchange contact):
- Add _write_trade_exit_leg to pink_clickhouse.py: one BLUE-schema-faithful
  trade_exit_legs row per exit leg, with isolated (non-cumulative) per-leg
  deltas tracked via _leg_state (reset on ENTER). Closes the docstring gap.
- New offline suite test_pink_multi_exit_groundwork.py (3 passed):
  * Flaw 4 — two-leg exit closes once, realized accrues per leg, closed
    slot rejects further EXIT (no double-close).
  * Overshoot invariant — a final EXIT requesting more than the remaining
    size CLAMPS (size to 0, no oversell), retiring the Sprint 0 cumulative-
    ratio risk empirically.
  * trade_exit_legs delta + full BLUE column-set assertions.
- Persistence regression after edits: 10 passed.

BLUE untouched: no changes to dolphin.* / DOLPHIN_*_BLUE / nautilus_event_trader.py.
Live VST multi-leg run remains deferred pending explicit authorization.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-30 19:21:45 +02:00


# PINK-on-DITAv2 Fault Taxonomy & Operator Response
## Fault Classes
### RATE_LIMITED
**Kernel code**: `KernelDiagnosticCode.RATE_LIMITED`
**Severity**: WARNING
**Recovery**: Automatic — kernel retries on next step cycle.
Operator action: none required unless the condition persists for more than 10 minutes. If it does, check BingX API limits by querying `/openApi/swap/v2/user/balance` directly, and reduce poll frequency by raising `DOLPHIN_PINK_POLL_INTERVAL_SEC` (default 1.0 s).
### ORDER_REJECTED
**Kernel code**: `KernelDiagnosticCode.ORDER_REJECTED`
**Entry reject**: Slot returns to IDLE. Decision engine will re-evaluate on next cycle.
**Exit reject**: Slot stays in EXIT_WORKING. Decision engine will retry exit.
Operator action: check that the instrument is tradeable on BingX VST. Symbol precision changes or contract suspensions can cause rejects. Inspect `outcome.details` for the venue's reason text.
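
The entry and exit reject paths above can be sketched as a small transition function. This is a minimal illustration, not the kernel's code: `SlotState`, the `ENTRY_WORKING` member, and `state_after_order_reject` are hypothetical names. Only `IDLE` and `EXIT_WORKING` come from this document.

```python
from enum import Enum


class SlotState(Enum):
    # Hypothetical enum: only IDLE and EXIT_WORKING are named in this document.
    IDLE = "IDLE"
    ENTRY_WORKING = "ENTRY_WORKING"  # assumed name for a slot with a working entry order
    EXIT_WORKING = "EXIT_WORKING"


def state_after_order_reject(state_before: SlotState) -> SlotState:
    """Return the slot state expected after an ORDER_REJECTED outcome."""
    if state_before is SlotState.EXIT_WORKING:
        # Exit reject: the position is still open, so the slot keeps working the exit.
        return SlotState.EXIT_WORKING
    # Entry reject: nothing was opened, so the slot is free for re-evaluation.
    return SlotState.IDLE
```

In both cases the decision engine is what drives the next attempt. The function only describes where the slot rests in between.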
### EXIT_ORDER_REJECTED
**Kernel code**: `KernelDiagnosticCode.EXIT_ORDER_REJECTED`
**Slot state**: EXIT_WORKING. The kernel will retry via `process_intent(EXIT)` on the next step where the decision engine produces an exit signal.
Operator action: if position remains open past `DOLPHIN_MAX_HOLD_BARS` (default 250), manually flatten via `pink_ctl.py` or direct BingX REST.
### CANCEL_REJECTED
**Kernel code**: `KernelDiagnosticCode.CANCEL_REJECTED`
**Slot state**: Unchanged. Cancel is retried on the next cycle.
Operator action: check open orders on BingX. If the order filled between cancel attempt and rejection, the slot will converge on the next reconcile cycle.
### NO_ACTIVE_EXIT_ORDER
**Kernel code**: `KernelDiagnosticCode.NO_ACTIVE_EXIT_ORDER`
**Cause**: Exit intent processed but no working exit order exists (usually because it filled between decision and execution).
Operator action: none — the fill event will converge the slot to CLOSED on the next `on_venue_event` or reconcile.
### STALE_STATE_RECONCILING
**Kernel code**: `KernelDiagnosticCode.STALE_STATE_RECONCILING`
**Slot state**: STALE_STATE_RECONCILING. Normal event progression is blocked until reconciliation completes.
Operator action: if the slot stays in this state for more than 30 s, the exchange snapshot may be inconsistent. Run `pink_ctl.py restart` to force a full restart reconcile.
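
A monitoring script could flag the 30-second condition with a check like the one below. This is a sketch under assumptions: the helper name `reconcile_overdue` and the idea of recording the time the slot entered the state are hypothetical. The document does not say how the kernel tracks this.

```python
import time
from typing import Optional

# Threshold taken from the operator guidance above.
STALE_RECONCILE_LIMIT_SEC = 30.0


def reconcile_overdue(
    entered_at: float,
    now: Optional[float] = None,
    limit_sec: float = STALE_RECONCILE_LIMIT_SEC,
) -> bool:
    """True if the slot has been reconciling for strictly longer than limit_sec.

    entered_at and now are monotonic-clock readings in seconds.
    """
    current = time.monotonic() if now is None else now
    return (current - entered_at) > limit_sec
```

Using a monotonic clock rather than wall time keeps the check robust against system clock adjustments.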
### DUPLICATE_EVENT
**Kernel code**: `KernelDiagnosticCode.DUPLICATE_EVENT`
**Severity**: INFO
**Effect**: The event is dropped, with no capital or state change. Idempotency is enforced via `seen_event_ids` on the slot.
Operator action: none.
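
The idempotency guard can be pictured as a set-membership check before any state change. The sketch below assumes `seen_event_ids` is a set of string IDs. The `Slot` dataclass, the `applied` counter, the `"APPLIED"` return value, and `apply_event_once` are hypothetical stand-ins for the real slot and event handler.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class Slot:
    # seen_event_ids is named in this document; its concrete type is assumed.
    seen_event_ids: Set[str] = field(default_factory=set)
    applied: int = 0  # hypothetical counter standing in for real state changes


def apply_event_once(slot: Slot, event_id: str) -> str:
    """Apply an event at most once per slot, keyed by event_id."""
    if event_id in slot.seen_event_ids:
        # Duplicate: drop it without touching capital or slot state.
        return "DUPLICATE_EVENT"
    slot.seen_event_ids.add(event_id)
    slot.applied += 1
    return "APPLIED"
```

Replaying the same event ID a second time returns `DUPLICATE_EVENT` and leaves `applied` unchanged, which is the property that makes venue redelivery harmless.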
### RATE_LIMITED (persistent cycle)
**Detection**: Consecutive RATE_LIMITED outcomes with no successful exchange interaction.
**Anomaly row origin**: `ditav2_kernel`
Operator action: check exchange API status. If the rate limit window is known, set `DITA_V2_RATE_LIMIT_COOLDOWN_SEC` in the environment.
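
The detection rule above amounts to a streak counter that only a successful exchange interaction resets. A minimal sketch follows. The class name `RateLimitStreak`, the default threshold of 5, and the `exchange_ok` flag are all assumptions made for illustration. The document does not state the actual threshold.

```python
from typing import Optional


class RateLimitStreak:
    """Counts consecutive RATE_LIMITED outcomes between successful exchange calls."""

    def __init__(self, threshold: int = 5) -> None:
        self.threshold = threshold  # hypothetical; the real value is not documented here
        self.streak = 0

    def observe(self, diagnostic_code: Optional[str], exchange_ok: bool) -> bool:
        """Record one outcome; return True once the streak reaches the threshold."""
        if diagnostic_code == "RATE_LIMITED":
            self.streak += 1
        elif exchange_ok:
            # Only a successful exchange interaction clears the streak.
            self.streak = 0
        return self.streak >= self.threshold
```

Outcomes that are neither rate-limited nor a successful exchange interaction leave the streak untouched, matching the "no successful exchange interaction" wording.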
## Diagnostic Surface
All fault codes appear in:
- `KernelOutcome.diagnostic_code` (programmatic)
- `KernelOutcome.severity` (INFO/WARNING/ERROR/CRITICAL)
- `KernelOutcome.details` (structured payload with reason, retry_after_ms, etc.)
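
A consumer of these three fields might triage outcomes roughly as follows. This is a sketch, not the kernel's API: `OutcomeView` is a hypothetical stand-in that mirrors only the three attributes listed above, and the assumption that `details` is a dict with an optional `retry_after_ms` integer is inferred from the bullet text.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional

# Severity levels as listed above, lowest to highest.
SEVERITY_ORDER = ("INFO", "WARNING", "ERROR", "CRITICAL")


@dataclass
class OutcomeView:
    # Hypothetical mirror of the KernelOutcome fields named in this section.
    diagnostic_code: str
    severity: str
    details: Dict[str, Any] = field(default_factory=dict)


def needs_operator(outcome: OutcomeView, min_severity: str = "ERROR") -> bool:
    """True if the outcome's severity is at or above min_severity."""
    return SEVERITY_ORDER.index(outcome.severity) >= SEVERITY_ORDER.index(min_severity)


def retry_delay_sec(outcome: OutcomeView) -> Optional[float]:
    """Convert details['retry_after_ms'] to seconds, or None if absent."""
    ms = outcome.details.get("retry_after_ms")
    return None if ms is None else ms / 1000.0
```

The `min_severity` default of `ERROR` is an illustrative choice. An operator dashboard may well want to surface persistent `WARNING` outcomes too.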
## Log Paths
- Runtime: `/tmp/dolphin_logs/supervisor/dolphin_live_pink.log`
- Kernel: `/tmp/dolphin_logs/supervisor/dolphin_live_pink-error.log`
## Recovery Tools
```bash
# Check DITAv2 health
python /mnt/dolphinng5_predict/prod/ops/pink_ctl.py ditav2-status
# Full restart reconcile
python /mnt/dolphinng5_predict/prod/ops/pink_ctl.py restart
# Namespace isolation check
python /mnt/dolphinng5_predict/prod/ops/pink_ctl.py mode-verify
```