siloqy/prod/docs/PINK_DITAV2_FAULT_TAXONOMY.md
Codex d4b73b236a PINK DITAv2 Sprint 2-3: accounting parity + multi-leg groundwork
Sprint 2 (accounting + observability parity, PINK scope):
- Verified pink_clickhouse.py writes the 8 BLUE-legacy row families at
  matching schema and that capital authority in pink_direct.step() is
  solely kernel.account (no balance-poll overwrite in the hot loop).
- Report: prod/clean_arch/dita_v2/SPRINT2_ACCOUNTING_PARITY.md.

Sprint 3 offline groundwork (no exchange contact):
- Add _write_trade_exit_leg to pink_clickhouse.py: one BLUE-schema-faithful
  trade_exit_legs row per exit leg, with isolated (non-cumulative) per-leg
  deltas tracked via _leg_state (reset on ENTER). Closes the docstring gap.
- New offline suite test_pink_multi_exit_groundwork.py (3 passed):
  * Flaw 4 — two-leg exit closes once, realized accrues per leg, closed
    slot rejects further EXIT (no double-close).
  * Overshoot invariant — a final EXIT requesting more than the remaining
    size CLAMPS (size to 0, no oversell), retiring the Sprint 0 cumulative-
    ratio risk empirically.
  * trade_exit_legs delta + full BLUE column-set assertions.
- Persistence regression after edits: 10 passed.

BLUE untouched: no changes to dolphin.* / DOLPHIN_*_BLUE / nautilus_event_trader.py.
Live VST multi-leg run remains deferred pending explicit authorization.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-30 19:21:45 +02:00


PINK-on-DITAv2 Fault Taxonomy & Operator Response

Fault Classes

RATE_LIMITED

Kernel code: KernelDiagnosticCode.RATE_LIMITED
Severity: WARNING
Recovery: Automatic — kernel retries on next step cycle.

Operator action: none required unless the condition persists for more than 10 minutes. If it does, check BingX API limits directly at /openApi/swap/v2/user/balance, and reduce poll frequency by raising DOLPHIN_PINK_POLL_INTERVAL_SEC (default 1.0s).

ORDER_REJECTED

Kernel code: KernelDiagnosticCode.ORDER_REJECTED
Entry reject: Slot returns to IDLE. Decision engine will re-evaluate on next cycle.
Exit reject: Slot stays in EXIT_WORKING. Decision engine will retry exit.

Operator action: check that the instrument is tradeable on BingX VST. Symbol precision changes or contract suspensions can cause rejects. Inspect outcome.details for the venue's reason text.
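The entry/exit asymmetry described above can be sketched as a small transition function. This is an illustration only, not the kernel's code: the `SlotState` enum, its `ENTRY_WORKING` member, and `state_after_reject` are hypothetical names. Only IDLE and EXIT_WORKING come from this document.

```python
from enum import Enum


class SlotState(Enum):
    # IDLE and EXIT_WORKING are named in this document. ENTRY_WORKING is an
    # assumed name for "entry order in flight", used only for illustration.
    IDLE = "IDLE"
    ENTRY_WORKING = "ENTRY_WORKING"
    EXIT_WORKING = "EXIT_WORKING"


def state_after_reject(state: SlotState) -> SlotState:
    """Return the slot state after the venue rejects the working order."""
    if state is SlotState.ENTRY_WORKING:
        # Entry reject: nothing was opened, so the slot is free again.
        return SlotState.IDLE
    if state is SlotState.EXIT_WORKING:
        # Exit reject: the position is still open, so the exit stays pending.
        return SlotState.EXIT_WORKING
    # Any other state is left unchanged in this sketch.
    return state
```

The design point is that a rejected exit must never release the slot, because the position it guards is still live on the venue.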

EXIT_ORDER_REJECTED

Kernel code: KernelDiagnosticCode.EXIT_ORDER_REJECTED
Slot state: EXIT_WORKING. The kernel will retry via process_intent(EXIT) on the next step where the decision engine produces an exit signal.

Operator action: if the position remains open past DOLPHIN_MAX_HOLD_BARS (default 250), manually flatten via pink_ctl.py or direct BingX REST.
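A monitoring script could apply the hold-bar threshold roughly as follows. This is a sketch under assumptions: `needs_manual_flatten` and `bars_held` are hypothetical names, and reading DOLPHIN_MAX_HOLD_BARS from the environment with a fallback of 250 is assumed from the default quoted above rather than taken from the runtime.

```python
import os
from typing import Mapping


def needs_manual_flatten(bars_held: int,
                         env: Mapping[str, str] = os.environ) -> bool:
    """True once a position has been held past the configured bar limit."""
    # Assumption: the limit is an integer env var; 250 is the documented default.
    max_hold_bars = int(env.get("DOLPHIN_MAX_HOLD_BARS", "250"))
    # "Past" the limit is read here as strictly greater than it.
    return bars_held > max_hold_bars
```

Passing `env` explicitly keeps the check testable without touching the process environment.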

CANCEL_REJECTED

Kernel code: KernelDiagnosticCode.CANCEL_REJECTED
Slot state: Unchanged. Cancel is retried on the next cycle.

Operator action: check open orders on BingX. If the order filled between the cancel attempt and the rejection, the slot will converge on the next reconcile cycle.

NO_ACTIVE_EXIT_ORDER

Kernel code: KernelDiagnosticCode.NO_ACTIVE_EXIT_ORDER
Cause: Exit intent processed but no working exit order exists (usually because it filled between decision and execution).

Operator action: none — the fill event will converge the slot to CLOSED on the next on_venue_event or reconcile.

STALE_STATE_RECONCILING

Kernel code: KernelDiagnosticCode.STALE_STATE_RECONCILING
Slot state: STALE_STATE_RECONCILING. Normal event progression is blocked until reconciliation completes.

Operator action: if the slot stays in this state for more than 30s, the exchange snapshot may be inconsistent. Run pink_ctl.py restart to force a full restart reconcile.
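The 30-second rule can be checked mechanically with a small watchdog. The sketch below is hypothetical: `StaleWatchdog` and `observe` are illustrative names, and it assumes the caller can read the slot state as a string on each cycle.

```python
import time
from typing import Optional

STALE_LIMIT_SEC = 30.0  # threshold taken from the operator guidance above


class StaleWatchdog:
    """Tracks how long a slot has stayed in STALE_STATE_RECONCILING."""

    def __init__(self, limit_sec: float = STALE_LIMIT_SEC) -> None:
        self.limit_sec = limit_sec
        self._since: Optional[float] = None  # time the stale state was first seen

    def observe(self, slot_state: str, now: Optional[float] = None) -> bool:
        """Record one observation; return True when the limit is exceeded."""
        now = time.monotonic() if now is None else now
        if slot_state != "STALE_STATE_RECONCILING":
            self._since = None  # any other state resets the timer
            return False
        if self._since is None:
            self._since = now
        return (now - self._since) > self.limit_sec
```

A True result is the cue to run the restart command. The watchdog itself takes no action.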

DUPLICATE_EVENT

Kernel code: KernelDiagnosticCode.DUPLICATE_EVENT
Severity: INFO
Effect: Event is dropped. No capital or state change. Idempotency via seen_event_ids on the slot.

Operator action: none.
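Idempotency through a per-slot set of seen event IDs can be illustrated as below. Only `seen_event_ids` is named in this document. The `Slot` dataclass, the `applied` list, `apply_event`, and the "APPLIED" return value are hypothetical stand-ins for whatever the kernel actually does with a fresh event.

```python
from dataclasses import dataclass, field
from typing import Any, List, Set


@dataclass
class Slot:
    seen_event_ids: Set[str] = field(default_factory=set)
    applied: List[Any] = field(default_factory=list)  # illustrative state only


def apply_event(slot: Slot, event_id: str, payload: Any) -> str:
    """Apply an event once; repeat deliveries are dropped without side effects."""
    if event_id in slot.seen_event_ids:
        # Duplicate: no capital or state change, only a diagnostic.
        return "DUPLICATE_EVENT"
    slot.seen_event_ids.add(event_id)
    slot.applied.append(payload)
    return "APPLIED"  # hypothetical success marker
```

Recording the ID before mutating state means a redelivered fill can never be counted twice.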

RATE_LIMITED (persistent cycle)

Detection: Consecutive RATE_LIMITED outcomes with no successful exchange interaction.
Anomaly row origin: ditav2_kernel

Operator action: check exchange API status. If the rate limit window is known, set DITA_V2_RATE_LIMIT_COOLDOWN_SEC in the environment.
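Detecting a persistent cycle amounts to counting consecutive RATE_LIMITED outcomes and resetting on anything else. The sketch below is an assumption-laden illustration: `RateLimitStreak`, `record`, and the threshold of 5 are hypothetical, and it treats every non-RATE_LIMITED outcome as a successful exchange interaction, which the kernel may define more narrowly.

```python
from typing import Optional


class RateLimitStreak:
    """Counts back-to-back RATE_LIMITED outcomes."""

    def __init__(self, threshold: int = 5) -> None:
        self.threshold = threshold  # hypothetical value, not from the source
        self.count = 0

    def record(self, diagnostic_code: Optional[str]) -> bool:
        """Record one outcome; return True when the streak is persistent."""
        if diagnostic_code == "RATE_LIMITED":
            self.count += 1
        else:
            # Assumption: any other outcome breaks the streak.
            self.count = 0
        return self.count >= self.threshold
```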

Diagnostic Surface

All fault codes appear in:

  • KernelOutcome.diagnostic_code (programmatic)
  • KernelOutcome.severity (INFO/WARNING/ERROR/CRITICAL)
  • KernelOutcome.details (structured payload with reason, retry_after_ms, etc.)
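The three fields listed above could be consumed by a logging hook along these lines. The `KernelOutcome` dataclass here is a hypothetical stand-in with only the three documented attributes, not the real class, and `log_outcome` is an illustrative helper.

```python
import logging
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class KernelOutcome:
    # Stand-in shape: only the three attributes named in this document.
    diagnostic_code: str
    severity: str
    details: Dict[str, Any] = field(default_factory=dict)


_LEVELS = {
    "INFO": logging.INFO,
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
    "CRITICAL": logging.CRITICAL,
}


def log_outcome(outcome: KernelOutcome, logger: logging.Logger) -> int:
    """Log an outcome at the level matching its severity; return that level."""
    # Unknown severities are logged as ERROR so they are not silently lost.
    level = _LEVELS.get(outcome.severity, logging.ERROR)
    logger.log(
        level,
        "%s reason=%s retry_after_ms=%s",
        outcome.diagnostic_code,
        outcome.details.get("reason"),
        outcome.details.get("retry_after_ms"),
    )
    return level
```

For example, `log_outcome(KernelOutcome("RATE_LIMITED", "WARNING", {"retry_after_ms": 500}), logging.getLogger("pink"))` emits a WARNING record carrying the retry hint.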

Log Paths

  • Runtime: /tmp/dolphin_logs/supervisor/dolphin_live_pink.log
  • Kernel: /tmp/dolphin_logs/supervisor/dolphin_live_pink-error.log

Recovery Tools

# Check DITAv2 health
python /mnt/dolphinng5_predict/prod/ops/pink_ctl.py ditav2-status

# Full restart reconcile
python /mnt/dolphinng5_predict/prod/ops/pink_ctl.py restart

# Namespace isolation check
python /mnt/dolphinng5_predict/prod/ops/pink_ctl.py mode-verify