Sprint 2 (accounting + observability parity, PINK scope):
- Verified pink_clickhouse.py writes the 8 BLUE-legacy row families at
  matching schema and that capital authority in pink_direct.step() is
  solely kernel.account (no balance-poll overwrite in the hot loop).
- Report: prod/clean_arch/dita_v2/SPRINT2_ACCOUNTING_PARITY.md.

Sprint 3 offline groundwork (no exchange contact):
- Add _write_trade_exit_leg to pink_clickhouse.py: one BLUE-schema-faithful
  trade_exit_legs row per exit leg, with isolated (non-cumulative) per-leg
  deltas tracked via _leg_state (reset on ENTER). Closes the docstring gap.
- New offline suite test_pink_multi_exit_groundwork.py (3 passed):
  * Flaw 4: two-leg exit closes once, realized accrues per leg, closed
    slot rejects further EXIT (no double-close).
  * Overshoot invariant: a final EXIT requesting more than the remaining
    size CLAMPS (size to 0, no oversell), retiring the Sprint 0
    cumulative-ratio risk empirically.
  * trade_exit_legs delta + full BLUE column-set assertions.
- Persistence regression after edits: 10 passed.

BLUE untouched: no changes to dolphin.* / DOLPHIN_*_BLUE / nautilus_event_trader.py.
Live VST multi-leg run remains deferred pending explicit authorization.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
# PINK-on-DITAv2 Fault Taxonomy & Operator Response

## Fault Classes

### RATE_LIMITED
**Kernel code**: `KernelDiagnosticCode.RATE_LIMITED`
**Severity**: WARNING
**Recovery**: Automatic. The kernel retries on the next step cycle.

Operator action: none required unless the condition persists (>10 min). If it does, check BingX API limits directly at `/openApi/swap/v2/user/balance`, and reduce poll frequency by raising `DOLPHIN_PINK_POLL_INTERVAL_SEC` (default 1.0s).

### ORDER_REJECTED
**Kernel code**: `KernelDiagnosticCode.ORDER_REJECTED`
**Entry reject**: Slot returns to IDLE. The decision engine will re-evaluate on the next cycle.
**Exit reject**: Slot stays in EXIT_WORKING. The decision engine will retry the exit.

Operator action: check that the instrument is tradeable on BingX VST. Symbol precision changes or contract suspensions can cause rejects. Inspect `outcome.details` for the venue's reason text.

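
The two reject rules above can be read as a small transition function. The sketch below is illustrative only: `SlotState`, `IntentKind`, and `state_after_reject` are hypothetical names, and only the states and intents mentioned in this document are modelled.

```python
from enum import Enum


class SlotState(Enum):
    # Only the states named in this section; the real kernel likely has more.
    IDLE = "IDLE"
    EXIT_WORKING = "EXIT_WORKING"


class IntentKind(Enum):
    ENTER = "ENTER"
    EXIT = "EXIT"


def state_after_reject(intent: IntentKind) -> SlotState:
    """Slot state after an ORDER_REJECTED outcome, per the two rules above."""
    if intent is IntentKind.ENTER:
        return SlotState.IDLE  # entry reject: the slot is free again
    return SlotState.EXIT_WORKING  # exit reject: keep working the exit
```

In both cases the retry is driven by the decision engine on a later cycle, so the function itself never loops or sleeps.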
### EXIT_ORDER_REJECTED
**Kernel code**: `KernelDiagnosticCode.EXIT_ORDER_REJECTED`
**Slot state**: EXIT_WORKING. The kernel will retry via `process_intent(EXIT)` on the next step where the decision engine produces an exit signal.

Operator action: if the position remains open past `DOLPHIN_MAX_HOLD_BARS` (default 250), manually flatten via `pink_ctl.py` or direct BingX REST.

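
As a minimal sketch of the escalation rule above, the check below flags a slot for manual flattening once it has been held past the configured bar limit while still in EXIT_WORKING. The function names and the `bars_held` argument are assumptions made for illustration; only the environment variable name and its default of 250 come from this document.

```python
import os


def max_hold_bars() -> int:
    """Read DOLPHIN_MAX_HOLD_BARS, falling back to the documented default."""
    return int(os.environ.get("DOLPHIN_MAX_HOLD_BARS", "250"))


def needs_manual_flatten(slot_state: str, bars_held: int) -> bool:
    """Hypothetical check: True when an exit is still working past the limit."""
    return slot_state == "EXIT_WORKING" and bars_held > max_hold_bars()
```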
### CANCEL_REJECTED
**Kernel code**: `KernelDiagnosticCode.CANCEL_REJECTED`
**Slot state**: Unchanged. Cancel is retried on the next cycle.

Operator action: check open orders on BingX. If the order filled between the cancel attempt and the rejection, the slot will converge on the next reconcile cycle.

### NO_ACTIVE_EXIT_ORDER
**Kernel code**: `KernelDiagnosticCode.NO_ACTIVE_EXIT_ORDER`
**Cause**: Exit intent processed but no working exit order exists (usually because it filled between decision and execution).

Operator action: none. The fill event will converge the slot to CLOSED on the next `on_venue_event` or reconcile.

### STALE_STATE_RECONCILE
**Kernel code**: `KernelDiagnosticCode.STALE_STATE_RECONCILING`
**Slot state**: STALE_STATE_RECONCILING. Normal event progression is blocked until reconciliation completes.

Operator action: if the slot stays in this state for >30s, the exchange snapshot may be inconsistent. Run `pink_ctl.py restart` to force a full restart reconcile.

### DUPLICATE_EVENT
**Kernel code**: `KernelDiagnosticCode.DUPLICATE_EVENT`
**Severity**: INFO
**Effect**: Event is dropped. No capital or state change. Idempotency is enforced via `seen_event_ids` on the slot.

Operator action: none.

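
The idempotency rule above can be sketched as a set-membership check before any state change. This is an illustration under assumptions: the `Slot` class, the `applied` counter, and the `"APPLIED"` return value are hypothetical; only `seen_event_ids`, `on_venue_event`, and the `DUPLICATE_EVENT` code are named in this document, and the real `on_venue_event` signature may differ.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class Slot:
    """Hypothetical slot holding only the dedup set named in this document."""
    seen_event_ids: Set[str] = field(default_factory=set)
    applied: int = 0  # counts events that actually changed state


def on_venue_event(slot: Slot, event_id: str) -> str:
    """Apply an event once; a repeated id is dropped with no state change."""
    if event_id in slot.seen_event_ids:
        return "DUPLICATE_EVENT"
    slot.seen_event_ids.add(event_id)
    slot.applied += 1
    return "APPLIED"
```

Because the id is recorded before the event takes effect, a replayed event after a reconnect is dropped without touching capital or state.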
### RATE_LIMITED (persistent cycle)
**Detection**: Consecutive RATE_LIMITED outcomes with no successful exchange interaction.
**Anomaly row origin**: `ditav2_kernel`

Operator action: check exchange API status. If the rate limit window is known, set `DITA_V2_RATE_LIMIT_COOLDOWN_SEC` in the environment.

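
One way to picture the detection rule above is a counter that grows on each RATE_LIMITED outcome and resets on anything else. The class name and the default `threshold` of 5 are hypothetical; this document does not state how many consecutive outcomes count as persistent. Treating every non-RATE_LIMITED outcome as ending the streak is also an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PersistentRateLimitDetector:
    """Hypothetical detector for a run of consecutive RATE_LIMITED outcomes."""
    threshold: int = 5  # assumed value, not taken from the kernel
    consecutive: int = 0

    def observe(self, diagnostic_code: Optional[str]) -> bool:
        """Feed one outcome code; return True while the streak is persistent."""
        if diagnostic_code == "RATE_LIMITED":
            self.consecutive += 1
        else:
            self.consecutive = 0  # assumption: any other outcome ends the streak
        return self.consecutive >= self.threshold
```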
## Diagnostic Surface

All fault codes appear in:
- `KernelOutcome.diagnostic_code` (programmatic)
- `KernelOutcome.severity` (INFO/WARNING/ERROR/CRITICAL)
- `KernelOutcome.details` (structured payload with reason, retry_after_ms, etc.)

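
To show how the three fields above might be consumed together, here is a minimal stand-in for `KernelOutcome`. It is not the real class: the field types, the `Severity` enum, the `summarize` helper, and the exact keys in `details` are assumptions, apart from `reason` and `retry_after_ms`, which this document mentions.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Dict, Optional


class Severity(Enum):
    INFO = "INFO"
    WARNING = "WARNING"
    ERROR = "ERROR"
    CRITICAL = "CRITICAL"


@dataclass(frozen=True)
class KernelOutcome:
    """Stand-in carrying only the three fields listed above."""
    diagnostic_code: Optional[str]
    severity: Severity
    details: Dict[str, Any] = field(default_factory=dict)


def summarize(outcome: KernelOutcome) -> str:
    """Build a one-line, log-friendly summary of an outcome."""
    reason = outcome.details.get("reason", "n/a")
    retry = outcome.details.get("retry_after_ms")
    suffix = f" retry_after_ms={retry}" if retry is not None else ""
    return f"[{outcome.severity.value}] {outcome.diagnostic_code}: {reason}{suffix}"
```

A helper of this shape keeps operator-facing log lines uniform across fault classes, whatever the payload in `details` contains.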
## Log Paths

- Runtime: `/tmp/dolphin_logs/supervisor/dolphin_live_pink.log`
- Kernel: `/tmp/dolphin_logs/supervisor/dolphin_live_pink-error.log`

## Recovery Tools

```bash
# Check DITAv2 health
python /mnt/dolphinng5_predict/prod/ops/pink_ctl.py ditav2-status

# Full restart reconcile
python /mnt/dolphinng5_predict/prod/ops/pink_ctl.py restart

# Namespace isolation check
python /mnt/dolphinng5_predict/prod/ops/pink_ctl.py mode-verify
```