Seven uncommitted production fixes to BLUE's main runner that the LIVE
process has already been running since the 2026-06-15 17:23 restart (file
mtime 17:17, pid started 17:23). Each fix answers a documented incident;
committing now so they survive in history and a stray checkout can't
silently revert running-config code on the next restart.
1. bars_held = max(0, int(...)) at BOTH journal sites (terminal + sub-day).
CH column is UInt16 — a negative value poisons the spool with a
head-of-line jam (incident 2026-06-12: bars_held=-106).
2. entry_bar = int(restored_entry_bar) at BOTH reconstruction sites; NEVER
from chain_meta. trade_reconstruction payloads carry the DEAD session's
bar counter, so the old override reinstated the stale clock frame the
re-anchor exists to fix → negative bars_held → same UInt16 spool poison
(zombie-trade resurrections, incident 2026-06-12). restored_entry_bar
already encodes hold continuity via stored_bars in THIS session's frame.
3. capital parse handles list/ledger-style payloads: when the restore blob
is a list of update rows, take the latest dict row instead of falling
through to {} and losing the capital anchor.
4. _connect_hz routes the `hazelcast` logger to stderr at INFO. The
silent-HZ-death investigation found ZERO client log lines because
nothing routed them; without this the reactor's health is invisible.
5. _dump_blackbox(reason): forensic thread dump before a watchdog restart —
lifecycle.is_running, active_connections, every thread's stack, and a
flag when any hazelcast/reactor-named thread is MISSING (= reactor died,
the prime suspect for the silent 40min–8h client deaths). print()-only,
CIFS-safe. _watchdog_restart calls it first.
6. _drain_runtime_commands / _process_runtime_commands gain
`*, allow_retract=True`; the heartbeat path drains with
allow_retract=False and re-queues any RETRACT commands. A RETRACT can
force a terminal close that must run through the scan-thread close
finalizer, so the heartbeat must not race it.
7. +import traceback (for the black-box stack dumps).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
TP_FLOOR (LINK 5e05eeeb, -$1,248.71): once the BASE 0.20% TP is crossed,
regression to base exits — caps the left tail of the OB cascade x1.40
TP-widening (which is logged per decision now: dynamic_tp_pct,
tp_mod_factor, cascade_count, ob_regime_signal, tp_floor_armed on
v7_decision_events). Class default OFF (champion parity); live ON via
DOLPHIN_TP_FLOOR.
Malformed-OPEN Option A (causal fix): POSITION_DUST_NOTIONAL_USD shared by
the full-close decision and the single _ps_write_open lifecycle gate (OPEN
rows can never round to zero size on disk); retract terminal leg writes its
trade_exit_legs + trade_reconstruction rows; restore reject-exhaustion halts
for unknown-corruption classes and flat-continues only for the documented
zero-size tombstone class; chain-token mismatch emits a CHAIN_TOKEN_MISMATCH
journal event; restored entry_bar preserves bars_held continuity (negative
entry_bar allowed, Int32) in both CH and HZ restore paths.
Tests: test_tp_floor.py 16/16 incl. LINK golden replay;
test_malformed_open_distal.py 11/11. Suites before/after identical except
one PRE-EXISTING failure fixed (full-close zero-size-row test).
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>