docs: VIBRISS spec (+ §10.6 cascade/adaptive-TP paramsets), PINK accounting fix spec, BLUE incident docs

VIBRISS_PARAMETER_GOVERNANCE_SPEC §10.6: ob_cascade.count_threshold
(currently cascade_count>0 = ONE asset widens every TP x1.40),
tp_widen_factor, withdrawal_velocity_threshold as governance candidates;
adaptive/Dynamic-TP threshold marked fit for VIBRISS governance; TP_FLOOR
joint-policy reward requirement.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Codex
2026-06-12 15:04:15 +02:00
parent f4ff1cd9b7
commit c3a18f693a
4 changed files with 3653 additions and 0 deletions

@@ -0,0 +1,182 @@
# Critical Violet Design: BLUE hydration bug
Date: 2026-06-11
## Summary
This incident is a BLUE hydration / restore bug on the XTZUSDT short trade `863c21da`.
The important facts are:
1. The XTZ trade was real and opened at `2026-06-11 17:22:12.678265+00:00`.
2. The trade did **not** close via TP, SL, or MAX_HOLD before hydration.
3. The restore path later rebuilt the open slot from `position_state` and `trade_reconstruction`.
4. The restored state had a chain-token mismatch, but the engine continued with the derived token instead of hard-failing.
5. A later hydrate-time stop was recorded at `2026-06-11 18:35:52.789008+00:00` with `STOP_LOSS`.
6. The ledger shows the next trade was admitted while XTZ was still officially open, which violates the single-slot invariant.
## Trade identity
- trade_id: `863c21da`
- asset: `XTZUSDT`
- side: `SHORT`
- entry price: `0.2276`
- entry notional: `56484.4305702418`
- leverage: `6.374647927191287`
- entry bar: `238`
- tp_base_pct: `0.002`
- tp_effective_pct: `0.0019999655500463724`
## Ledger evidence
### Open record
`dolphin.trade_reconstruction` contains the canonical open record:
- ts: `2026-06-11 17:22:12.911989`
- event_type: `OPEN`
- event_id: `863c21da:open`
- chain_token: `26852fa25fb5cdaa3b4c354d5e3eea93e27bce0ebdcd0da896d4f981642eeeb2`
The payload confirms:
- `entry_ts = 1781198532678265`
- `entry_bar = 238`
- `retraction_legs = 0`
- `realized_pnl_legs_total = 0.0`
- `chain_mode = LIVE`
- `chain_kind = ROOT`
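As a quick cross-check, the payload's `entry_ts` can be decoded and compared with the open time quoted in the summary. This is a minimal sketch that assumes `entry_ts` is integer microseconds since the Unix epoch; the helper name `micros_to_utc` is illustrative and not part of the engine.

```python
from datetime import datetime, timedelta, timezone


def micros_to_utc(ts_us: int) -> datetime:
    """Convert integer microseconds since the Unix epoch to an aware UTC datetime.

    Assumption: the payload stores entry_ts in microseconds (not ms or ns).
    """
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    return epoch + timedelta(microseconds=ts_us)


entry = micros_to_utc(1781198532678265)
print(entry.isoformat(sep=" "))  # 2026-06-11 17:22:12.678265+00:00
```

Under that assumption the decoded value agrees with the open time stated in the summary.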
### No close before hydrate
`dolphin.trade_exit_legs` has no rows for `863c21da`.
`dolphin.trade_events` also has no close row for `863c21da`.
So there is no official TP, SL, or MAX_HOLD exit recorded before the restore/hydration event.
### Decision tape before hydrate
`dolphin.v7_decision_events` shows the trade was live and being evaluated:
- `2026-06-11 17:22:13.274556` `HOLD`
- `2026-06-11 17:22:23.124863` `HOLD`
- `2026-06-11 17:22:45.232894` `HOLD`
- `2026-06-11 17:23:28.274004` `HOLD`
- `2026-06-11 17:24:43.182413` `RETRACT / V7_RISK_DOMINANT`
The best favorable excursion in the pre-hydrate tape was only about `+0.065905094%`, which is far below the fixed TP threshold of roughly `0.2%` (`tp_base_pct = 0.002`).
## Restore / hydration behavior
At restore time the engine logged:
- `chain token mismatch on restore: trade=863c21da stored=26852fa25fb5 derived=98875e225e9e — continuing with derived token`
- `position_state RESTORED: XTZUSDT SHORT entry=0.2276 notional=56484 bars_held≈0 trade=863c21da`
The restore path in [`prod/nautilus_event_trader.py`](../nautilus_event_trader.py) does the following:
- reads `position_state`
- reconstructs `restored_entry_bar = max(0, self.bar_idx - stored_bars)`
- loads reconstruction data from `dolphin.trade_reconstruction`
- rebuilds chain state from the persisted payload
- if the stored chain token differs from the derived token, it logs the mismatch and continues with the derived token
Relevant code:
- `_chain_state_from_reconstruction(...)` around lines `3315-3348`
- restore from `position_state` around lines `1944-2058`
As written, the chain-token check is a logging validator, not a hard guardrail: a mismatch is reported but never blocks the restore.
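The effect of the `restored_entry_bar` formula on a stale snapshot can be sketched as follows. Only the formula, `entry_bar = 238`, and `market_state_max_hold_bars = 102` come from this document; the bar index at restore and the stale `stored_bars` value are hypothetical, and the sketch assumes the bar index stays on the same scale across the restart.

```python
MAX_HOLD_BARS = 102   # market_state_max_hold_bars for this trade
TRUE_ENTRY_BAR = 238  # entry_bar from the open-chain payload


def restored_entry_bar(bar_idx: int, stored_bars: int) -> int:
    """The documented restore formula: max(0, bar_idx - stored_bars)."""
    return max(0, bar_idx - stored_bars)


# Hypothetical values, for illustration only.
bar_idx_at_restore = 600
stale_stored_bars = 0  # a stale or lossy position_state snapshot

entry_bar = restored_entry_bar(bar_idx_at_restore, stale_stored_bars)
apparent_bars_held = bar_idx_at_restore - entry_bar    # what the engine sees
true_bars_held = bar_idx_at_restore - TRUE_ENTRY_BAR   # what the ledger implies

print(apparent_bars_held, apparent_bars_held >= MAX_HOLD_BARS)
print(true_bars_held, true_bars_held >= MAX_HOLD_BARS)
```

With a stale `stored_bars`, the restored slot looks freshly opened, so a max-hold check keyed on the restored entry bar would not fire even though the ledger-implied hold time is already past the limit.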
## Single-slot violation
The next distinct open trade in the reconstruction ledger is:
- ts: `2026-06-11 17:50:50.420620`
- trade_id: `43494ade`
- asset: `TRXUSDT`
- side: `SHORT`
That means the system admitted a new trade while XTZ was still officially open in the ledger.
On a single-slot engine, that should not happen.
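A check of this kind can be run over ledger rows offline. The sketch below is an assumption-laden illustration: `LedgerEvent` and `single_slot_violations` are hypothetical names, and the rows are reduced to timestamp, trade id, and event type rather than the real `dolphin.trade_reconstruction` schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LedgerEvent:
    ts: str          # ISO-like timestamp; same format sorts lexicographically
    trade_id: str
    event_type: str  # "OPEN" or "CLOSE"


def single_slot_violations(events):
    """Return (still_open_trade, new_trade, ts) for every OPEN that arrives
    while a different trade has no recorded CLOSE."""
    open_trade = None
    violations = []
    for ev in sorted(events, key=lambda e: e.ts):
        if ev.event_type == "OPEN":
            if open_trade is not None and open_trade != ev.trade_id:
                violations.append((open_trade, ev.trade_id, ev.ts))
            open_trade = ev.trade_id
        elif ev.event_type == "CLOSE" and ev.trade_id == open_trade:
            open_trade = None
    return violations


events = [
    LedgerEvent("2026-06-11 17:22:12.911989", "863c21da", "OPEN"),
    LedgerEvent("2026-06-11 17:50:50.420620", "43494ade", "OPEN"),
]
print(single_slot_violations(events))
```

For the two open records cited in this report, the check reports one violation: `43494ade` opened while `863c21da` had no close.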
## What would have happened without hydration
This is the conservative conclusion from the tape:
- The trade did not hit TP on the observed pre-hydrate tape.
- The trade did not have an official close row before hydration.
- The tape does not contain a clean uninterrupted decision path beyond the first pre-hydrate window.
The best-supported natural outcome from the observed tape is the live `RETRACT` state at `2026-06-11 17:24:43.182413`, where the engine still considered the slot active and the trade had only reached `bars_held = 14`.
At that point:
- `current_price = 0.22765000000000002`
- `pnl_pct = -0.021968365` (expressed in percent units here, i.e. about `-0.022%`)
- `reason = V7_RISK_DOMINANT`
If that retract state had been executed immediately, the estimated trade PnL would have been:
- `-12.4087058758423` USDT on the recorded notional
- trade ROI: `-0.021968365%`
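The retract estimate follows from the entry price, the tape price, and the recorded notional. A minimal sketch of the arithmetic, ignoring fees and slippage (an assumption; the document does not state how costs would apply to a hypothetical retract):

```python
def short_pnl_fraction(entry_price: float, current_price: float) -> float:
    """Unlevered PnL of a short as a fraction of notional (no fees)."""
    return (entry_price - current_price) / entry_price


entry_price = 0.2276
current_price = 0.22765000000000002
notional = 56484.4305702418

frac = short_pnl_fraction(entry_price, current_price)
pnl_pct = frac * 100.0      # about -0.02197 (percent)
pnl_usdt = frac * notional  # about -12.41 USDT

print(f"pnl_pct={pnl_pct:.6f}%  pnl_usdt={pnl_usdt:.4f}")
```

The result reproduces the `-0.021968365%` and roughly `-12.41` USDT figures above to within rounding.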
The max-hold clock would also have forced a decision long before the 18:35 restore:
- trade-specific `market_state_max_hold_bars = 102`
- live tape reached `bars_held = 14` by `17:24:43`
- at an ~11 second cadence, the max-hold boundary would have arrived around `17:40-17:41`
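The boundary estimate can be reproduced from the two tape timestamps. This sketch assumes a uniform bar cadence and that `bars_held` counts from zero at entry; neither is confirmed by the tape itself.

```python
from datetime import datetime

entry_ts = datetime(2026, 6, 11, 17, 22, 12, 678265)
retract_ts = datetime(2026, 6, 11, 17, 24, 43, 182413)
bars_held_at_retract = 14
max_hold_bars = 102

# Average bar duration observed between entry and the RETRACT decision.
cadence = (retract_ts - entry_ts) / bars_held_at_retract

# Projected time at which bars_held would reach the max-hold limit.
boundary = entry_ts + cadence * max_hold_bars

print(f"cadence ~ {cadence.total_seconds():.2f}s, boundary ~ {boundary:%H:%M:%S}")
```

The observed cadence comes out slightly under 11 seconds, which puts the projected boundary inside the 17:40 minute, well before both the 17:50 TRX open and the 18:35 stop.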
So the 18:35 stop-loss is not the natural continuation of the original entry. It is a restore-time artifact on top of a stale open slot.
What is observable is the hydrated-path close that actually got booked:
- exit ts: `2026-06-11 18:35:52.789008+00:00`
- exit reason: `STOP_LOSS`
- exit price: `0.23526757499999998`
- realized pnl_pct: `-0.033056485743551446`
- realized net_pnl: `-1913.155101369921`
That realized stop corresponds to:
- a booked loss on the position of about `3.3056%` (the realized `pnl_pct` above)
- account-level ROI of about `-2.726636%` using capital before exit (`70165.39`)
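The account-level figure is the booked net PnL divided by capital before exit. A short sketch of that arithmetic, using only the values quoted above:

```python
realized_pnl_fraction = -0.033056485743551446  # realized pnl_pct, stored as a fraction
net_pnl = -1913.155101369921                   # USDT
capital_before_exit = 70165.39                 # USDT

position_loss_pct = realized_pnl_fraction * 100.0
account_roi_pct = net_pnl / capital_before_exit * 100.0

print(f"position: {position_loss_pct:.4f}%  account: {account_roi_pct:.6f}%")
```

Note that the realized `pnl_pct` is stored as a fraction, whereas the pre-hydrate tape value is quoted in percent units, so the two should not be compared without converting.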
## Root cause
The bug is the restore path itself:
1. The open trade state was preserved in `trade_reconstruction`.
2. The current `position_state` snapshot was lossy or stale enough to rehydrate with `bars_held≈0`.
3. The chain token mismatch was detected, but the code explicitly continues with the derived token.
4. The engine therefore recovered continuity without enforcing strict equality between the live open chain and the reconstructed state.
That combination makes orphaned trades possible after a bad hydrate.
## Operational impact
- The XTZ short remained open in the ledger with no formal close.
- The engine later allowed a new trade while the slot should still have been occupied.
- Capital accounting diverged from the true live slot history.
- The restore path masked the inconsistency instead of stopping the recovery.
## Recommended fix direction
1. Treat a chain-token mismatch on restore as a hard failure for BLUE when a live open slot exists.
2. Preserve the original `entry_bar` and bar counter from the open-chain payload instead of reconstructing them from the current `position_state` row when the two disagree materially.
3. Refuse to admit a new trade until the single-slot invariant is proven flat.
4. Add a regression test for:
   - open XTZ trade
   - stale `position_state`
   - chain-token mismatch
   - no new trade admission while the open slot remains unresolved
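The intended behavior of items 1, 3, and 4 can be sketched against a toy model. Everything here is hypothetical: `SlotEngine`, `ChainTokenMismatch`, `restore`, and `admit` are illustrative names, not the API of `nautilus_event_trader.py`. Only the trade ids and the token prefixes are taken from the logs above.

```python
class ChainTokenMismatch(RuntimeError):
    """Raised when the stored and derived chain tokens disagree on restore."""


class SlotEngine:
    """Toy single-slot engine; stands in for the real trader (hypothetical)."""

    def __init__(self) -> None:
        self.open_trade_id = None
        self.unresolved = False

    def restore(self, trade_id: str, stored_token: str, derived_token: str) -> None:
        # The slot is treated as occupied whether or not the tokens agree.
        self.open_trade_id = trade_id
        if stored_token != derived_token:
            # Hard failure instead of "continuing with derived token".
            self.unresolved = True
            raise ChainTokenMismatch(
                f"trade={trade_id} stored={stored_token} derived={derived_token}"
            )

    def admit(self, trade_id: str) -> bool:
        # Refuse admission until the slot is proven flat.
        if self.unresolved or self.open_trade_id is not None:
            return False
        self.open_trade_id = trade_id
        return True


def test_mismatch_blocks_new_admission() -> None:
    engine = SlotEngine()
    try:
        engine.restore("863c21da", stored_token="26852fa25fb5", derived_token="98875e225e9e")
    except ChainTokenMismatch:
        pass
    else:
        raise AssertionError("restore should hard-fail on token mismatch")
    assert engine.admit("43494ade") is False
    assert engine.open_trade_id == "863c21da"


test_mismatch_blocks_new_admission()
```

The design choice in this sketch is that a mismatch leaves the slot marked occupied and unresolved rather than flat, so a later admission attempt is rejected until an operator or a reconciliation step clears it.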
## Bottom line
XTZ was a real open trade.
It never got a clean pre-hydrate exit.
The restore path tolerated chain drift and rebuilt a misleading open state.
The best-supported outcome without hydration is the 17:24 retract, roughly flat to slightly negative (about `-0.022%`).
The realized hydrated-path loss was `-3.3056485743551446%` on the position and `-2.726636%` of capital before exit, but that is a restore artifact, not the natural end of the original trade.