DOLPHIN/prod/docs/SYSTEM_BIBLE.md

# DOLPHIN-NAUTILUS SYSTEM BIBLE
## Doctrinal Reference — As Running 2026-04-05

**Version**: v6.0 — NG8 Linux Scanner + TUI v3 Live Observability + Test Footer CI
**Previous version**: v5.0 — Supervisord-First Architecture + MHS v3 + OBF Universe (2026-03-30)
**Previous version**: v4.1 — Multi-Speed Event-Driven Architecture (2026-03-25)
**CI gate (Nautilus)**: 46/46 tests green
**CI gate (MHS)**: 111/111 tests green (unit + E2E + race + Hypothesis)
**CI gate (ACB)**: 118/118 tests green
**Execution**: Binance Futures (USDT-M) verified via `verify_testnet_creds.py` and `binance_test.py`
**Status**: Supervisord-managed. MHS v3 live. OBF universe 540 assets. RM_META=0.975–1.000 [GREEN].
**NG8**: Linux-native eigenscan service. Fixes NG7 double-output bug. Bit-for-bit schema-identical to doctrinal NG5.
**TUI v3**: Live event-driven observability terminal. All panels hooked to HZ entry listeners. Zero origin-system load.

### What changed since v5.0 (2026-04-05 — THIS VERSION)

| Area | Change |
|---|---|
| **NG8 Linux Scanner — NEW** | `- Dolphin NG8/ng8_scanner.py` — Linux-native eigenscan service replacing Windows NG7. Fixes double-output bug. Single `enhance()` call processes all 4 windows (w50/150/300/750) in one pass → exactly one Arrow file + one HZ write per scan_number. See §27. |
| **Arrow Writer Shim — NEW** | `- Dolphin NG8/arrow_writer.py` — thin re-export so `dolphin_correlation_arb512_with_eigen_tracking.py` imports correctly on Linux (Windows had this file natively). |
| **TUI v3 — NEW** | `Observability/TUI/dolphin_tui_v3.py` — full live observability terminal. All panels event-driven via HZ entry listeners. Zero origin-system load. Replaces mocked TUI v2. See §28. |
| **Test Footer CI Hook — NEW** | `run_logs/test_results_latest.json` + `write_test_results()` API in TUI v3. Test scripts push results; TUI footer displays live. See §28.4 and `TEST_REPORTING.md`. |
| **NG7 Double-Output — Root Cause Confirmed** | Windows NG7 ran two independent tracker cycles (fast w50/w150 + slow w300/w750) sharing the same scan_number counter → two Arrow files + two HZ writes per scan, second file arriving ~3 min late with stale prices. NG8 eliminates this by design. |

---

### What changed since v4.1 (2026-03-30 — PREVIOUS)

| Area | Change |
|---|---|
| **Process Manager: Systemd → Supervisord** | ALL dolphin services migrated exclusively to supervisord. No service is managed by both. `dolphin-supervisord.conf` is the single source of process truth. See §16, §26. |
| **"Random Killer" Root Cause Fixed** | `meta_health_daemon_v2.py` had been running under systemd for 4 days calling `systemctl restart` on supervisord-managed services every 5s. Dual-management race caused random service kills. Stopped + disabled. |
| **MHS v3 — Complete Rewrite** | `meta_health_service_v3.py` — product formula bug fixed (zero-collapse replaced by weighted sum), recovery via supervisorctl not systemctl, `RECOVERY_COOLDOWN_CRITICAL_S=10s` (was 600s), non-blocking daemon thread recovery. See §24.5. |
| **OBF Universe Service — NEW** | `obf_universe_service.py` — 540 USDT perp assets on 3 WebSocket connections, zero REST weight, 60s health snapshots → HZ `obf_universe_latest`. Supervisord `autostart=true`. See §26. |
| **OBF Retention Fix** | `obf_persistence.py` `MAX_FILE_AGE_DAYS = 0` (was 7 — was deleting all backtesting data). Data now accumulates indefinitely for backtesting. |
| **Test Suite: MHS** | NEW `prod/tests/test_mhs_v3.py` — 111 tests: unit, live integration, E2E kill/revive, race conditions, 13 Hypothesis property tests. |
| **HZ Schema additions** | `DOLPHIN_FEATURES["obf_universe_latest"]`, `DOLPHIN_META_HEALTH["latest"]`. See §15. |
| **Supervisord groups** | `dolphin_data` group: exf_fetcher, acb_processor, obf_universe, meta_health (all autostart=true). `dolphin` group: nautilus_trader, scan_bridge, clean_arch_trader (autostart=false). |

### What changed since v4 (2026-03-24)

| Area | Change |
|---|---|
| **Multi-Speed Architecture** | NEW multi-layer frequency isolation: OBF (0.1s), Scan (5s), ExtF (varied), Health (5s), Daily batch. See §24. |
| **Event-Driven Nautilus** | NEW `nautilus_event_trader.py` — Hz entry listener for <1ms scan-to-trade latency. Not a Prefect flow — long-running systemd daemon. See §24.2. |
| **MHS v2** | ENHANCED `meta_health_daemon_v2.py` — Full 5-sensor monitoring (M1-M5), per-subsystem data freshness tracking, automated recovery. See §24.3. |
| **Resource Safety** | NEW systemd resource limits: MemoryMax=2G, CPUQuota=200%, TasksMax=50 per service. Prevents process explosion. |
| **Scan Bridge Hardening** | Deployment concurrency limit=1, work pool concurrency=1, cgroups integration. See §24.1. |
| **Systemd Service Mesh** | NEW services: `dolphin-nautilus-trader.service`, updated `meta_health_daemon.service`. Systemd-managed, not Prefect-managed. |
| **Incident Response** | Post-mortem: 2026-03-24 kernel deadlock from 60+ uncontrolled Prefect processes. Fixed via concurrency controls. |

### What changed since v3 (2026-03-22)

| Area | Change |
|---|---|
| **Clean Architecture** | NEW hexagonal architecture in `prod/clean_arch/` — Ports, Adapters, Core separation. Adapter-agnostic business logic. |
| **Hazelcast DataFeed** | NEW `HazelcastDataFeed` adapter implementing `DataFeedPort` — reads from DolphinNG6 via Hazelcast (single source of truth). |
| **Scan Bridge Service** | NEW `scan_bridge_service.py` — Linux Arrow file watcher that pushes to Hazelcast. Uses file mtime (not scan #) to handle NG6 restarts. **Phase 2: Prefect daemon integration complete** — auto-restart, health monitoring, unified logging. **18 unit tests** in `tests/test_scan_bridge_prefect_daemon.py`.
| **Paper Trading Engine** | NEW `paper_trade.py` — Clean architecture trading CLI with 23 round-trip trades executed in testing. |
| **Market Data** | Live data flowing: 50 assets, BTC @ $71,281.03, velocity divergence signals active. |

### What changed since v2 (2026-03-22)

| Area | Change |
|---|---|
| **Binance Futures** | Switched system focus from Spot to Perpetuals; updated API endpoints (`fapi.binance.com`); added `recvWindow` for signature stability. |
| **Friction Management** | **SP Bypass Logic**: Alpha engines now support disabling internal fees/slippage to allow Nautilus to handle costs natively. Prevents double-counting. |
| **Paper Trading** | NEW `launch_paper_portfolio.py` — uses Sandbox matching with live Binance data; includes realistic Tier 0 friction (0.02/0.05). |
| **Session Logging** | NEW `TradeLoggerActor` — independent CSV/JSON audit trails for every session. |

| Area | Change |
|---|---|
| **DolphinActor** | Refactored to step_bar() API (incremental, not batch); threading.Lock on ACB; _GateSnap stale-state detection; replay vs live mode; bar_idx tracking |
| **OBF Subsystem** | Sprint 1 hardening complete: circuit breaker, stall watchdog, crossed-book guard, dark streak, first flush 60s, fire-and-forget HZ pushes, dynamic asset discovery |
| **nautilus_prefect_flow.py** | NEW — Prefect-supervised BacktestEngine daily flow; champion SHA256 hash check; HZ heartbeats; capital continuity; HIBERNATE guard |
| **Test suite** | +35 DolphinActor tests (test_dolphin_actor.py); total 46 Nautilus + ~120 OBF |
| **prod/docs/** | All prod .md files consolidated; SYSTEM_FILE_MAP.md; NAUTILUS_DOLPHIN_SPEC.md added |
| **0.1s resolution** | Assessed: BLOCKED by 3 hard blockers (see §22) |
| **Capital Sync** | NEW — DolphinActor now syncs initial_capital with Nautilus Portfolio balance on_start. |
| **Verification** | NEW — `TODO_CHECK_SIGNAL_PATHS.md` systematic test spec for local agents. |
| **MC-Forewarner** | Now wired in `DolphinActor.on_start()` — both flows run full gold-performance stack; `_MC_BASE_CFG` + `_MC_MODELS_DIR_DEFAULT` as frozen module constants; empty-parquet early-return bug fixed in `on_bar` replay path |

---

## TABLE OF CONTENTS

1. [System Philosophy](#1-system-philosophy)
2. [Physical Architecture](#2-physical-architecture)
2a. [Clean Architecture Layer (NEW v4)](#2a-clean-architecture-layer)
3. [Data Layer](#3-data-layer)
4. [Signal Layer — vel_div & DC](#4-signal-layer)
5. [Asset Selection — IRP](#5-asset-selection-irp)
6. [Position Sizing — AlphaBetSizer](#6-position-sizing)
7. [Exit Management](#7-exit-management)
8. [Fee & Slippage Model](#8-fee--slippage-model)
9. [OB Intelligence Layer](#9-ob-intelligence-layer)
10. [ACB v6 — Adaptive Circuit Breaker](#10-acb-v6)
11. [Survival Stack — Posture Control](#11-survival-stack)
12. [MC-Forewarner Envelope Gate](#12-mc-forewarner-envelope-gate)
13. [NDAlphaEngine — Full Bar Loop](#13-ndalpha-engine-full-bar-loop)
14. [DolphinActor — Nautilus Integration](#14-dolphin-actor)
15. [Hazelcast — Full IMap Schema](#15-hazelcast-full-imap-schema)
16. [Production Daemon Topology & HZ Bridge](#16-production-daemon-topology)
17. [Prefect Orchestration Layer](#17-prefect-orchestration-layer)
18. [CI Test Suite](#18-ci-test-suite)
19. [Parameter Reference](#19-parameter-reference)
20. [OBF Sprint 1 Hardening](#20-obf-sprint-1-hardening)
21. [Known Research TODOs](#21-known-research-todos)
22. [0.1s Resolution — Readiness Assessment](#22-01s-resolution-readiness-assessment)
23. [Signal Path Verification Specification](#23-signal-path-verification)
24. [Multi-Speed Event-Driven Architecture (v4.1)](#24-multi-speed-event-driven-architecture)
25. [Numerical Precision Policy](#25-numerical-precision-policy)
26. [Supervisord Architecture & OBF Universe (v5.0)](#26-supervisord-architecture--obf-universe)
27. [NG8 Linux Eigenscan Service (v6.0)](#27-ng8-linux-eigenscan-service)
28. [TUI v3 — Live Observability Terminal (v6.0)](#28-tui-v3-live-observability-terminal)

---

## 1. SYSTEM PHILOSOPHY

DOLPHIN-NAUTILUS is a **SHORT-only** (champion configuration) systematic trading engine targeting crypto perpetual futures on Binance.

**Core thesis**: When crypto market correlation matrices show accelerating eigenvalue-velocity divergence (`vel_div < -0.02`), the market is entering an instability regime. Shorting during early instability onset and exiting at fixed take-profit captures the mean-reversion from panic to normalization.

**Design constraints**:
- Zero signal re-implementation in the Nautilus layer. All alpha logic lives in `NDAlphaEngine`.
- 512-bit arithmetic for correlation matrix processing (separate NG3 pipeline; not in hot path of this engine).
- Champion parameters are FROZEN. They were validated via exhaustive VBT backtest on `dolphin_vbt_real.py`.
- The Nautilus actor is a thin wire, not a strategy. It routes parquet data → NDAlphaEngine → HZ result.

**Champion performance** (ACBv6 + IRP + DC + OB, full-stack 55-day Dec31–Feb25):
- ROI: +54.67% | PF: 1.141 | Sharpe: 2.84 | Max DD: 15.80% | WR: 49.5% | Trades: 2145
- Log: `run_logs/summary_20260307_163401.json`

> **Data correction note (2026-03-07)**: An earlier reference showed ROI=+57.18%, PF=1.149,
> Sharpe=3.00. Those figures came from a stale `vbt_cache/2026-02-25.parquet` that was built
> mid-day — missing 435 scans and carrying corrupt vel_div on 492 rows for the final day of the
> window. ALGO-3 parity testing caught the mismatch (max_diff=1.22 vs tolerance 1e-10).
> The parquet was rebuilt from live NG3 JSON (`build_parquet_cache(dates=['2026-02-25'], force=True)`).
> The stale file is preserved as `2026-02-25.parquet.STALE_20260307` for replicability.
> The corrected numbers above are the canonical reference. The ~2.5pp ROI drop reflects real
> late-day trades on Feb 25 that the stale parquet had silently omitted.

---

## 2. PHYSICAL ARCHITECTURE

```
┌──────────────────────────────────────────────────────────────────────┐
│  DATA SOURCES                                                         │
│  NG3 Scanner (Win) → /mnt/ng6_data/eigenvalues/ (SMB DolphinNG6_Data)│
│  Binance WS → 5s OHLCV bars + live order book (48+ USDT perpetuals) │
│  VBT Cache  → vbt_cache_klines/*.parquet (DOLPHIN-local + /mnt/dolphin)│
└────────────────────────┬─────────────────────────────────────────────┘
                         │
┌────────────────────────▼─────────────────────────────────────────────┐
│  HAZELCAST IN-MEMORY GRID  (localhost:5701, cluster: "dolphin")       │
│  *** SYSTEM MEMORY — primary real-time data bus ***                  │
│  DOLPHIN_SAFETY          → posture + Rm (CP AtomicRef / IMap)        │
│  DOLPHIN_FEATURES        → acb_boost {boost,beta}, latest_eigen_scan │
│  DOLPHIN_PNL_BLUE/GREEN  → per-date trade results                    │
│  DOLPHIN_STATE_BLUE      → capital continuity (latest + per-run)     │
│  DOLPHIN_HEARTBEAT       → liveness pulses (nautilus_prefect_flow)   │
│  DOLPHIN_OB              → order book snapshots                       │
│  DOLPHIN_FEATURES_SHARD_00..09 → 400-asset OBF sharded store         │
└────────────────────────┬─────────────────────────────────────────────┘
                         │
┌────────────────────────▼─────────────────────────────────────────────┐
│  PREFECT ORCHESTRATION  (localhost:4200, work-pool: dolphin)          │
│  paper_trade_flow.py        00:05 UTC — NDAlphaEngine direct         │
│  nautilus_prefect_flow.py   00:10 UTC — BacktestEngine + DolphinActor│
│  obf_prefect_flow.py        Continuous ~500ms — OB ingestion         │
│  mc_forewarner_flow.py      Daily — MC gate prediction               │
│  exf_fetcher_flow.py        Periodic — ExF macro data fetch          │
└────────────────────────┬─────────────────────────────────────────────┘
                         │
┌────────────────────────▼─────────────────────────────────────────────┐
│  SUPERVISORD (v5.0 — sole process manager)                            │
│  Config: prod/supervisor/dolphin-supervisord.conf                     │
│  Socket: /tmp/dolphin-supervisor.sock                                 │
│                                                                       │
│  dolphin_data group (autostart=true):                                 │
│  ├── exf_fetcher_flow.py      — ExF live daemon                      │
│  ├── acb_processor_service.py — ACB boost + HZ write (CP lock)       │
│  ├── obf_universe_service.py  — 540-asset OBF universe (NEW v5.0)    │
│  └── meta_health_service_v3.py — MHS watchdog (NEW v5.0)             │
│                                                                       │
│  dolphin group (autostart=false):                                     │
│  ├── nautilus_event_trader.py — HZ entry listener trader              │
│  ├── scan_bridge_service.py   — Arrow → HZ scan bridge               │
│  └── clean_arch/main.py       — Clean architecture trader            │
└────────────────────────┬─────────────────────────────────────────────┘
                         │
┌────────────────────────▼─────────────────────────────────────────────┐
│  NAUTILUS TRADING ENGINE  (siloqy-env, nautilus_trader 1.219.0)      │
│  BacktestEngine + DolphinActor(Strategy) → NDAlphaEngine             │
│  on_bar() fires per date tick; step_bar() iterates parquet rows      │
│  HZ ACB listener → pending-flag → applied at top of next on_bar()   │
│  TradingNode (launcher.py) → future live exchange connectivity        │
└──────────────────────────────────────────────────────────────────────┘
```

**Key invariant v2**: `DolphinActor.on_bar()` receives one synthetic bar per date in paper mode, which triggers `engine.begin_day()` then iterates through all parquet rows via `step_bar()`. In live mode, one real bar → one `step_bar()` call. The `_processed_dates` guard is replaced by date-boundary detection comparing `current_date` to the bar's timestamp date.

---

## 2a. CLEAN ARCHITECTURE LAYER (NEW v4)

### 2a.1 Overview

The Clean Architecture layer provides a **hexagonal** (ports & adapters) implementation for paper trading, ensuring core business logic is independent of infrastructure concerns.

```
┌─────────────────────────────────────────────────────────────────────────┐
│                     CLEAN ARCHITECTURE (prod/clean_arch/)               │
├─────────────────────────────────────────────────────────────────────────┤
│  PORTS (Interfaces)                                                      │
│  ├── DataFeedPort          → Abstract market data source                │
│  └── TradingPort           → Abstract order execution                   │
├─────────────────────────────────────────────────────────────────────────┤
│  ADAPTERS (Infrastructure)                                               │
│  ├── HazelcastDataFeed     → Reads from DOLPHIN_FEATURES map            │
│  └── PaperTradeExecutor    → Simulated execution (no real orders)       │
├─────────────────────────────────────────────────────────────────────────┤
│  CORE (Business Logic)                                                   │
│  ├── TradingEngine         → Position sizing, signal processing         │
│  ├── SignalProcessor       → Eigenvalue-based signal generation         │
│  └── PortfolioManager      → PnL tracking, position management          │
└─────────────────────────────────────────────────────────────────────────┘
```

### 2a.2 Key Design Principles

**Dependency Rule**: Dependencies only point inward. Core knows nothing about Hazelcast, Arrow files, or Binance.

**Single Source of Truth**: All data comes from Hazelcast `DOLPHIN_FEATURES.latest_eigen_scan`, written atomically by DolphinNG6.

**File Timestamp vs Scan Number**: The Scan Bridge uses file modification time (mtime) instead of scan numbers because DolphinNG6 resets counters on restarts.
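
The mtime rule can be illustrated with a minimal sketch. It assumes the Arrow files sit in one directory and match `*.arrow`; the function name and glob pattern are hypothetical and are not taken from `scan_bridge_service.py`.

```python
from pathlib import Path
from typing import Optional


def newest_scan_file(scan_dir: Path, pattern: str = "*.arrow") -> Optional[Path]:
    """Return the most recently modified scan file, or None if none exist.

    Ordering by st_mtime stays correct after a scanner restart, whereas
    ordering by an embedded scan number would not (the counter resets).
    """
    # ASSUMPTION: flat directory and "*.arrow" naming are illustrative only.
    candidates = [p for p in scan_dir.glob(pattern) if p.is_file()]
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

With this ordering, a file named `scan_000001.arrow` written after a restart still wins over an older `scan_009999.arrow`.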

### 2a.3 Components

| Component | File | Purpose |
|-----------|------|---------|
| `DataFeedPort` | `ports/data_feed.py` | Abstract interface for market data |
| `HazelcastDataFeed` | `adapters/hazelcast_feed.py` | Hz implementation of DataFeedPort |
| `TradingEngine` | `core/trading_engine.py` | Pure business logic |
| `Scan Bridge` | `../scan_bridge_service.py` | Arrow → Hazelcast bridge |
| `Paper Trader` | `paper_trade.py` | CLI trading session |

### 2a.4 Data Flow

```
DolphinNG6 → Arrow Files (/mnt/ng6_data/arrow_scans/) → Scan Bridge → Hazelcast → HazelcastDataFeed → TradingEngine
     (5s)                                               (watchdog)    (SSOT)      (Adapter)          (Core)
                                                          ↑
                                                   (Prefect daemon
                                                    supervises)
```

**Management**: The scan bridge runs as a Prefect-supervised daemon (`scan_bridge_prefect_daemon.py`):
- Health checks every 30 seconds
- Automatic restart on crash or stale data (>60s)
- Centralized logging via Prefect UI
- Deployed to `dolphin-daemon-pool`

### 2a.5 MarketSnapshot Structure

```python
MarketSnapshot(
    timestamp=datetime,
    symbol="BTCUSDT",
    price=71281.03,              # From asset_prices[0]
    eigenvalues=[...],           # From asset_loadings (50 values)
    velocity_divergence=-0.0058, # vel_div field
    scan_number=7315
)
```

### 2a.6 Current Status

- **Assets Tracked**: 50 (BTC, ETH, BNB, etc.)
- **BTC Price**: $71,281.03
- **Test Trades**: 23 round-trip trades executed
- **Strategy**: Mean reversion on velocity divergence
- **Data Latency**: ~5 seconds (DolphinNG6 pulse)
- **Bridge Management**: Prefect daemon (auto-restart, health checks every 30s)

### 2a.7 Testing

**Unit Tests:** `prod/tests/test_scan_bridge_prefect_daemon.py` (18 tests)

| Test Category | Count | Description |
|--------------|-------|-------------|
| ScanBridgeProcess | 6 | Process lifecycle (start, stop, restart) |
| Hazelcast Freshness | 6 | Data age detection (fresh, stale, warning) |
| Health Check Task | 3 | Prefect task health validation |
| Integration | 3 | Real Hz connection, process lifecycle |

**Run Tests:**
```bash
cd /mnt/dolphinng5_predict/prod
source /home/dolphin/siloqy_env/bin/activate
pytest tests/test_scan_bridge_prefect_daemon.py -v
```

**Test Coverage:**
- ✅ Process start/stop/restart
- ✅ Graceful and force kill
- ✅ Fresh/stale/warning data detection
- ✅ Hazelcast connection error handling
- ✅ Health check state transitions

---

## 3. DATA LAYER

### 3.1 vbt_cache_klines Parquet Schema

Location: `C:\Users\Lenovo\Documents\- DOLPHIN NG HD HCM TSF Predict\vbt_cache_klines\YYYY-MM-DD.parquet`

| Column | Type | Description |
|--------|------|-------------|
| `vel_div` | float64 | Eigenvalue velocity divergence: `v50_vel − v750_vel` (primary signal) |
| `v50_lambda_max_velocity` | float64 | Short-window (50-bar) lambda_max rate of change |
| `v150_lambda_max_velocity` | float64 | 150-bar window lambda velocity |
| `v300_lambda_max_velocity` | float64 | 300-bar window lambda velocity |
| `v750_lambda_max_velocity` | float64 | Long-window (750-bar) macro eigenvalue velocity |
| `instability_50` | float64 | General market instability index (50-bar) |
| `instability_150` | float64 | General market instability index (150-bar) |
| `BTCUSDT` … `STXUSDT` | float64 | Per-asset close prices (48 assets in current dataset) |

Each file: 1,439 rows (1 per 5-second bar over 24h), 57 columns.

### 3.2 NG3 Eigenvalue Data

Location: `C:\Users\Lenovo\Documents\- Dolphin NG HD (NG3)\correlation_arb512\`

```
eigenvalues/
  YYYY-MM-DD/
    scan_NNNNNN__Indicators.npz   ← ACBv6 external factors (funding, dvol, fng, taker)
    scan_NNNNNN__scan_global.npz  ← lambda_vel_w750 for dynamic beta
matrices/
  YYYY-MM-DD/
    scan_NNNNNN_w50_HHMMSS.arb512.pkl.zst  ← 512-bit correlation matrix (unused in hot path)
```

NPZ files loaded by `AdaptiveCircuitBreaker._load_external_factors()` (max 10 scans per date, median-aggregated).
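
The median aggregation can be sketched on already-loaded factor dictionaries. This is an illustration only: the helper name is hypothetical, and taking the first `max_scans` entries is an assumption, since the actual selection inside `_load_external_factors()` is not described here.

```python
from statistics import median
from typing import Dict, List


def aggregate_factors(scans: List[Dict[str, float]], max_scans: int = 10) -> Dict[str, float]:
    """Median-aggregate each factor across at most `max_scans` scans."""
    used = scans[:max_scans]  # ASSUMPTION: first N scans; the real selection may differ
    keys = set().union(*(s.keys() for s in used)) if used else set()
    # A factor missing from some scans is aggregated over the scans that have it.
    return {k: median(s[k] for s in used if k in s) for k in sorted(keys)}
```

The median makes the daily factor value robust to a single corrupt or outlier scan.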

---

## 4. SIGNAL LAYER

### 4.1 Primary Signal: vel_div Threshold Gate

**Source**: `alpha_signal_generator.py`, `AlphaSignalGenerator.generate()`

**SHORT signal condition**:
```
vel_div < VEL_DIV_THRESHOLD (-0.02)
```

**LONG signal condition** (green posture, not champion):
```
vel_div > LONG_THRESHOLD (0.01)
```

**Confidence calculation** (SHORT path):
```python
ratio = clamp((threshold - vel_div) / (threshold - extreme), 0, 1)
# With threshold = -0.02 and extreme = -0.05:
#   ratio = clamp((-0.02 - vel_div) / (-0.02 - (-0.05)), 0, 1)
#         = clamp((-0.02 - vel_div) / 0.03, 0, 1)
confidence = 0.50 + ratio * 0.40   # range: [0.50, 0.90]
```

`is_extreme = (vel_div <= -0.05)`
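
A runnable version of the SHORT-path confidence mapping, as a minimal sketch. The function and constant names are illustrative and are not claimed to match `AlphaSignalGenerator`; the caller is assumed to have already applied the `vel_div < -0.02` gate.

```python
VEL_DIV_THRESHOLD = -0.02  # SHORT entry threshold (from this section)
VEL_DIV_EXTREME = -0.05    # "extreme" level (from this section)


def clamp(x: float, lo: float, hi: float) -> float:
    return max(lo, min(hi, x))


def short_confidence(vel_div: float) -> float:
    """Map vel_div linearly onto [0.50, 0.90] between threshold and extreme."""
    ratio = clamp(
        (VEL_DIV_THRESHOLD - vel_div) / (VEL_DIV_THRESHOLD - VEL_DIV_EXTREME),
        0.0,
        1.0,
    )
    return 0.50 + ratio * 0.40
```

A reading exactly at the threshold yields 0.50, the midpoint `-0.035` yields about 0.70, and anything at or beyond `-0.05` saturates at 0.90.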

### 4.2 Direction Confirmation (DC) — Layer 6

**Source**: `alpha_signal_generator.py`, `check_dc_nb()` (numba JIT)

```python
# Looks back dc_lookback_bars (default 7) bars on the selected asset price
p0 = price[n - lookback - 1]
p1 = price[n - 1]
chg_bps = (p1 - p0) / p0 * 10000.0

if chg_bps < -min_magnitude_bps:    # default 0.75 bps; falling price → SHORT OK
    return CONFIRM
if chg_bps > min_magnitude_bps:     # rising price contradicts a SHORT
    return CONTRADICT
return NEUTRAL
```

**dc_skip_contradicts = True** (champion): CONTRADICT returns null signal (skip entry).
**Effect on leverage**: DC has `dc_leverage_boost=1.0` (no boost in champion). CONTRADICT kills entry.
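
The DC check can be exercised in plain Python without numba. The sketch below is illustrative: the integer return codes and the neutral result for insufficient history are assumptions, not confirmed behavior of `check_dc_nb()`.

```python
from typing import Sequence

CONFIRM, NEUTRAL, CONTRADICT = 1, 0, -1  # ASSUMPTION: illustrative codes


def check_dc(prices: Sequence[float], lookback: int = 7,
             min_magnitude_bps: float = 0.75) -> int:
    """Direction confirmation for a SHORT entry on one asset's price series."""
    n = len(prices)
    if n < lookback + 1:
        return NEUTRAL  # ASSUMPTION: too little history is treated as neutral
    p0 = prices[n - lookback - 1]
    p1 = prices[n - 1]
    if p0 <= 0:
        return NEUTRAL  # guard against division by zero on bad data
    chg_bps = (p1 - p0) / p0 * 10000.0
    if chg_bps < -min_magnitude_bps:
        return CONFIRM      # price fell: supports the SHORT
    if chg_bps > min_magnitude_bps:
        return CONTRADICT   # price rose: champion config skips the entry
    return NEUTRAL
```

With `dc_skip_contradicts = True`, a caller would drop the entry whenever this returns `CONTRADICT`.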

### 4.3 OB Sub-2: Per-Asset Imbalance Confirmation

When `ob_engine` is wired in (`use_ob_edge=True`):
```python
eff_imb = -ob_signal.imbalance_ma5  # For SHORT: sell pressure = positive eff_imb

if eff_imb > 0.10:                              # OB confirms → confidence boost ≤+15%
    ob_adj = 1 + min(0.15, eff_imb * persistence * 0.5)
    confidence *= ob_adj
elif eff_imb < -0.15 and persistence > 0.60:   # Strong persistent OB contradiction → HARD SKIP
    return null_signal
elif eff_imb < -0.10:                           # Moderate → soft dampen confidence
    ob_adj = max(0.85, 1 - abs(eff_imb) * persistence * 0.4)
    confidence *= ob_adj
```
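
The three OB branches can be written as one self-contained function. This is a sketch under stated assumptions: the function name is hypothetical, and `None` stands in for the null signal.

```python
from typing import Optional


def apply_ob_confirmation(confidence: float, imbalance_ma5: float,
                          persistence: float) -> Optional[float]:
    """Adjust SHORT confidence by order-book imbalance; None means skip entry."""
    eff_imb = -imbalance_ma5  # for SHORT, sell pressure shows up as positive eff_imb
    if eff_imb > 0.10:
        # OB confirms: boost capped at +15%
        return confidence * (1.0 + min(0.15, eff_imb * persistence * 0.5))
    if eff_imb < -0.15 and persistence > 0.60:
        return None  # strong, persistent contradiction: hard skip
    if eff_imb < -0.10:
        # moderate contradiction: dampen, floored at 0.85x
        return confidence * max(0.85, 1.0 - abs(eff_imb) * persistence * 0.4)
    return confidence  # imbalance inside the dead band: unchanged
```

Note the asymmetry: confirmation can add at most 15% to confidence, while a sufficiently persistent contradiction removes the trade entirely.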

---

## 5. ASSET SELECTION — IRP

### 5.1 Overview

**Source**: `alpha_asset_selector.py`, `AlphaAssetSelector.rank_assets()` + numba kernels

IRP = **Impulse Response Profiling**. Ranks all available assets by historical behavior over the last 50 bars in the regime direction. Selects the asset with the highest ARS (Asset Ranking Score) that passes all filters.

**Enabled by**: `use_asset_selection=True` (production default).

### 5.2 Numba Kernel: compute_irp_nb

```python
# Input: price_segment (last 50 prices), direction (-1 or +1)
dir_returns[i] = (price[i+1] - price[i]) * direction   # directional returns

cumulative = cumsum(dir_returns)
mfe = max(cumulative)          # Maximum Favorable Excursion
mae = abs(min(cumulative, 0))  # Maximum Adverse Excursion
efficiency  = mfe / (mae + 1e-6)
alignment   = count(dir_returns > 0) / n_ret
noise       = variance(dir_returns)
latency     = bars_to_reach_10pct_of_mfe   # (default: 50 if mfe==0)
```

### 5.3 Numba Kernel: compute_ars_nb

```
ARS = 0.5 * log1p(efficiency) + 0.35 * alignment - 0.15 * noise * 1000
```
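
The IRP metrics and the ARS formula can be checked end to end with a plain-Python sketch. Several details are assumptions made for illustration: MFE is floored at zero, noise uses population variance, and latency counts bars starting from 1. The numba kernels may differ on each of these.

```python
import math
from itertools import accumulate
from statistics import pvariance
from typing import Dict, Sequence


def compute_irp(prices: Sequence[float], direction: int) -> Dict[str, float]:
    """Impulse-response metrics for one asset in one direction (-1 or +1)."""
    dir_returns = [(b - a) * direction for a, b in zip(prices, prices[1:])]
    if not dir_returns:
        raise ValueError("need at least two prices")
    cumulative = list(accumulate(dir_returns))
    mfe = max(max(cumulative), 0.0)       # ASSUMPTION: floored at 0
    mae = abs(min(min(cumulative), 0.0))
    latency = 50                          # default when mfe == 0
    if mfe > 0:
        target = 0.10 * mfe
        # ASSUMPTION: 1-based count of bars until 10% of MFE is reached
        latency = next(i + 1 for i, c in enumerate(cumulative) if c >= target)
    return {
        "mfe": mfe,
        "mae": mae,
        "efficiency": mfe / (mae + 1e-6),
        "alignment": sum(1 for r in dir_returns if r > 0) / len(dir_returns),
        "noise": pvariance(dir_returns),  # ASSUMPTION: population variance
        "latency": latency,
    }


def compute_ars(m: Dict[str, float]) -> float:
    """Asset Ranking Score from IRP metrics, per the formula above."""
    return 0.5 * math.log1p(m["efficiency"]) + 0.35 * m["alignment"] - 0.15 * m["noise"] * 1000
```

For a steadily falling series scored in the SHORT direction, alignment is 1.0, noise is 0, and latency is 1 bar, so the ARS is dominated by the efficiency term.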

### 5.4 Numba Kernel: rank_assets_irp_nb

For each asset:
1. Compute IRP in DIRECT direction (regime_direction)
2. Compute IRP in INVERSE direction (-regime_direction)
3. Take whichever gives higher ARS (allows inverse selection)
4. Apply filter gates:
   - `noise > 500` → skip
   - `latency > 20` → skip (must reach 10% MFE within 20 bars)
   - `alignment < 0.20` → skip
5. Bubble-sort by ARS descending (numba nopython)

### 5.5 AlphaAssetSelector Python Wrapper

```python
# Build 2D array (max_len × n_assets), right-aligned
valid = rank_assets_irp_nb(prices_2d, idx=max_len, regime_direction, ...)
# Walk ranked list:
for r in rankings:
    if min_irp_alignment > 0 and r.metrics.alignment < min_irp_alignment:
        continue          # alignment gate (default 0.45)
    if r.action != expected_action:
        continue          # direction gate
    if ob_engine and ob_placement.depth_quality < 0.20:
        continue          # OB depth gate (try next asset)
    trade_asset = r.asset
    break
# No match → return None (no fallback to BTCUSDT when IRP enabled)
```

**OB Sub-1**: ARS adjusted ±5%/10% by per-asset OB depth quality before sorting.

---

## 6. POSITION SIZING

### 6.1 Signal Strength Score

**Source**: `alpha_bet_sizer.py`, `compute_sizing_nb()` (numba JIT)

```python
# SHORT path (vel_div < threshold):
if vel_div <= extreme (-0.05):
    strength_score = 1.0
else:
    strength_score = (threshold - vel_div) / (threshold - extreme)
    # = (-0.02 - vel_div) / 0.03
    strength_score = clamp(strength_score, 0.0, 1.0)
```

### 6.2 Dynamic Leverage (Cubic Convex)

```python
scaled_score  = strength_score ** leverage_convexity    # leverage_convexity = 3.0
eff_leverage  = min_leverage + scaled_score * (max_leverage - min_leverage)
#             = 0.5 + strength_score**3 * 4.5           # range: [0.5, 5.0] before ACB
```
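
Sections 6.1 and 6.2 combine into two small functions. The sketch below uses the champion defaults quoted above; the function names are illustrative and are not the numba kernel names.

```python
def strength_score(vel_div: float, threshold: float = -0.02,
                   extreme: float = -0.05) -> float:
    """Linear 0..1 score between threshold and extreme (SHORT path)."""
    if vel_div <= extreme:
        return 1.0
    raw = (threshold - vel_div) / (threshold - extreme)
    return max(0.0, min(1.0, raw))


def dynamic_leverage(score: float, min_leverage: float = 0.5,
                     max_leverage: float = 5.0, convexity: float = 3.0) -> float:
    """Cubic-convex leverage: stays low until the signal is close to extreme."""
    return min_leverage + (score ** convexity) * (max_leverage - min_leverage)
```

The convexity matters in practice: a half-strength signal (score 0.5) maps to only 1.0625x leverage, not the 2.75x a linear ramp would give.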

### 6.3 Signal Bucket Classification

```python
# Numba: get_signal_bucket_nb(vel_div, threshold=-0.02, extreme=-0.05)
if vel_div <= extreme * 1.5:               bucket = 0  # "extreme"  (<= -0.075)
elif vel_div <= extreme:                   bucket = 1  # "strong"   (<= -0.05)
elif vel_div <= (threshold + extreme) / 2: bucket = 2  # "moderate" (<= -0.035)
else:                                      bucket = 3  # "weak"
```
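
The bucket boundaries can be verified with a plain-Python equivalent. This is a sketch; the function name and the label table are illustrative.

```python
BUCKET_NAMES = {0: "extreme", 1: "strong", 2: "moderate", 3: "weak"}


def signal_bucket(vel_div: float, threshold: float = -0.02,
                  extreme: float = -0.05) -> int:
    """Classify a SHORT signal into one of four strength buckets."""
    if vel_div <= extreme * 1.5:
        return 0
    if vel_div <= extreme:
        return 1
    if vel_div <= (threshold + extreme) / 2.0:
        return 2
    return 3
```

For example, `signal_bucket(-0.04)` falls in bucket 2 ("moderate") because `-0.04` lies between `-0.05` and the midpoint `-0.035`.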

### 6.4 Alpha Layers (Layer 7)

**Bucket Boost** — adaptive win-rate feedback:
```python
# get_bucket_boost_nb: per-bucket win rate → multiplier
wr > 0.60 → 1.3x  |  wr > 0.55 → 1.1x  |  wr < 0.40 → 0.7x  |  wr < 0.45 → 0.85x
```

**Streak Multiplier** — recent 5-trade loss streak:
```python
# get_streak_mult_nb
losses_in_last_5 >= 4 → 0.5x  |  >= 3 → 0.7x  |  <= 1 → 1.1x
```

**Trend Multiplier** — vel_div acceleration:
```python
# get_trend_mult_nb(vd_trend = vel_div_history[-1] - vel_div_history[-10])
vd_trend < -0.01 → 1.3x (deepening instability)
vd_trend >  0.01 → 0.7x (recovering)
```
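
The three multipliers can be sketched as ordinary functions. The thresholds come from the tables above; the neutral value of 1.0 when no rule matches, and the neutral result for a history shorter than 10 samples, are assumptions.

```python
from typing import Sequence


def bucket_boost(win_rate: float) -> float:
    """Per-bucket win rate to size multiplier; rules are checked in order."""
    if win_rate > 0.60:
        return 1.3
    if win_rate > 0.55:
        return 1.1
    if win_rate < 0.40:
        return 0.7
    if win_rate < 0.45:
        return 0.85
    return 1.0  # ASSUMPTION: neutral when no rule matches


def streak_mult(losses_in_last_5: int) -> float:
    """Scale down after recent losses, slightly up when there are few."""
    if losses_in_last_5 >= 4:
        return 0.5
    if losses_in_last_5 >= 3:
        return 0.7
    if losses_in_last_5 <= 1:
        return 1.1
    return 1.0  # ASSUMPTION: exactly 2 losses is neutral


def trend_mult(vel_div_history: Sequence[float]) -> float:
    """Boost when vel_div is deepening, dampen when it is recovering."""
    if len(vel_div_history) < 10:
        return 1.0  # ASSUMPTION: not enough history is neutral
    vd_trend = vel_div_history[-1] - vel_div_history[-10]
    if vd_trend < -0.01:
        return 1.3
    if vd_trend > 0.01:
        return 0.7
    return 1.0
```

The three results are multiplied together in the effective-fraction computation that follows.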

**Effective Fraction computation**:
```python
confidence = 0.70 if is_extreme else 0.55
conf_mult  = confidence / 0.95
extreme_boost = 2.0 if is_extreme else 1.0

base_frac   = 0.02 + strength_score * (base_fraction - 0.02)
eff_fraction = base_frac * conf_mult * extreme_boost * trend_mult * bucket_boost * streak_mult
eff_fraction = clamp(eff_fraction, 0.02, base_fraction)   # base_fraction = 0.20
```

**Final notional**:
```python
notional = capital * eff_fraction * final_leverage
```
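
A worked version of the fraction and notional arithmetic, as a sketch. The function names and keyword defaults are illustrative, and the multipliers default to 1.0 so the base behavior is visible.

```python
def effective_fraction(strength_score: float, is_extreme: bool,
                       trend_mult: float = 1.0, bucket_boost: float = 1.0,
                       streak_mult: float = 1.0, base_fraction: float = 0.20,
                       min_fraction: float = 0.02) -> float:
    """Capital fraction for one trade, clamped to [min_fraction, base_fraction]."""
    confidence = 0.70 if is_extreme else 0.55
    conf_mult = confidence / 0.95
    extreme_boost = 2.0 if is_extreme else 1.0
    base_frac = min_fraction + strength_score * (base_fraction - min_fraction)
    eff = base_frac * conf_mult * extreme_boost * trend_mult * bucket_boost * streak_mult
    return max(min_fraction, min(base_fraction, eff))


def position_notional(capital: float, eff_fraction: float,
                      final_leverage: float) -> float:
    """Notional exposure for the trade."""
    return capital * eff_fraction * final_leverage
```

A half-strength, non-extreme signal with neutral multipliers gives `0.11 * 0.55 / 0.95`, roughly 6.4% of capital. An extreme signal at full strength hits the 20% cap because the 2.0x extreme boost pushes the raw value above `base_fraction`.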

### 6.5 ACB + MC Size Multiplier

```python
# regime_size_mult is recomputed every bar via _update_regime_size_mult(vel_div)
if day_beta > 0:
    strength_cubic = clamp((threshold - vel_div) / (threshold - extreme), 0, 1) ** convexity
    regime_size_mult = day_base_boost * (1.0 + day_beta * strength_cubic) * day_mc_scale
else:
    regime_size_mult = day_base_boost * day_mc_scale

# Applied to leverage ceiling:
clamped_max_leverage = min(base_max_leverage * regime_size_mult * market_ob_mult,
                           abs_max_leverage)   # abs_max_leverage = 6.0
raw_leverage = size_result["leverage"] * dc_lev_mult * regime_size_mult * market_ob_mult

# STALKER posture hard cap:
if posture == 'STALKER': clamped_max_leverage = min(clamped_max_leverage, 2.0)

final_leverage = clamp(raw_leverage, min_leverage, clamped_max_leverage)   # min_leverage = 0.5
```
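
The multiplier and the leverage clamp can be run in isolation. The sketch below assumes `base_max_leverage = 5.0` (the `max_leverage` from §6.2) and uses an empty string for any posture other than `STALKER`; both are assumptions for illustration.

```python
def clamp(x: float, lo: float, hi: float) -> float:
    return max(lo, min(hi, x))


def regime_size_mult(vel_div: float, day_base_boost: float, day_beta: float,
                     day_mc_scale: float = 1.0, threshold: float = -0.02,
                     extreme: float = -0.05, convexity: float = 3.0) -> float:
    """ACB boost, optional beta term scaled by cubic signal strength, MC scale."""
    if day_beta > 0:
        strength_cubic = clamp((threshold - vel_div) / (threshold - extreme), 0.0, 1.0) ** convexity
        return day_base_boost * (1.0 + day_beta * strength_cubic) * day_mc_scale
    return day_base_boost * day_mc_scale


def final_leverage(sized_leverage: float, regime_mult: float,
                   dc_lev_mult: float = 1.0, market_ob_mult: float = 1.0,
                   posture: str = "", base_max_leverage: float = 5.0,
                   abs_max_leverage: float = 6.0, min_leverage: float = 0.5) -> float:
    """Apply the regime multiplier to both the leverage and its ceiling."""
    ceiling = min(base_max_leverage * regime_mult * market_ob_mult, abs_max_leverage)
    if posture == "STALKER":
        ceiling = min(ceiling, 2.0)  # STALKER hard cap
    raw = sized_leverage * dc_lev_mult * regime_mult * market_ob_mult
    return clamp(raw, min_leverage, ceiling)
```

With `day_base_boost=1.2`, `day_beta=0.5`, and an extreme signal, the multiplier is 1.8, so a sized leverage of 3.0 becomes 5.4 under the 6.0 absolute ceiling, or 2.0 under STALKER.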

---

## 7. EXIT MANAGEMENT

### 7.1 Exit Priority Order (champion)

**Source**: `alpha_exit_manager.py`, `AlphaExitManager.evaluate()`

1. **FIXED_TP**: `pnl_pct >= 0.0095` (95 basis points)
2. **STOP_LOSS**: `pnl_pct <= -1.0` (DISABLED in practice — 100% loss never triggers before TP/max_hold)
3. **OB DURESS exits** (when ob_engine != None):
   - Cascade Detection: `cascade_count > 0` → widen TP ×1.40, halve max_hold
   - Liquidity Withdrawal: `regime_signal == 1` → hard SL 10%, TP ×0.60
4. **vel_div adverse-turn exits** (`vd_enabled=False` by default — disabled pending calibration)
5. **MAX_HOLD**: `bars_held >= 120` (= 600 seconds)

### 7.2 OB Dynamic Exit Parameter Adjustment

```python
if cascade_count > 0:
    dynamic_tp_pct  *= 1.40
    dynamic_max_hold = int(max_hold_bars * 0.50)    # take profit fast before snap-back

elif regime_signal == 1:   # LIQUIDITY WITHDRAWAL STRESS
    dynamic_sl_pct  = 0.10                          # hard 10% stop (tail protection)
    if pnl_pct > 0.0:
        dynamic_tp_pct *= 0.60                      # take profit sooner under stress
    if eff_imb < -0.10:                             # OB actively opposing
        dynamic_max_hold = int(max_hold_bars * 0.40)

elif regime_signal == -1 and eff_imb > 0.15:        # CALM + FAVORABLE
    dynamic_max_hold = int(max_hold_bars * 1.50)    # let winners run

# Per-asset withdrawal (micro-level):
if withdrawal_velocity < -0.20 and not (cascade_count > 0 or regime_signal == 1):
    dynamic_max_hold = min(dynamic_max_hold, int(max_hold_bars * 0.40))
    if pnl_pct > 0.0: dynamic_tp_pct *= 0.75
```
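
The branch logic above can be tested as a pure function. This is a sketch under stated assumptions: the `ExitParams` container and the function signature are hypothetical, and "not in cascade/stress" is read as `cascade_count == 0` and `regime_signal != 1`.

```python
from dataclasses import dataclass


@dataclass
class ExitParams:
    tp_pct: float
    sl_pct: float
    max_hold: int


def adjust_exit_params(tp_pct: float, sl_pct: float, max_hold_bars: int,
                       pnl_pct: float, cascade_count: int, regime_signal: int,
                       eff_imb: float, withdrawal_velocity: float) -> ExitParams:
    """Return TP, SL, and max-hold after order-book duress adjustments."""
    tp, sl, hold = tp_pct, sl_pct, max_hold_bars
    in_cascade = cascade_count > 0
    in_stress = (not in_cascade) and regime_signal == 1
    if in_cascade:
        tp *= 1.40
        hold = int(max_hold_bars * 0.50)
    elif in_stress:
        sl = 0.10                      # hard 10% stop for tail protection
        if pnl_pct > 0.0:
            tp *= 0.60
        if eff_imb < -0.10:            # book actively opposing the position
            hold = int(max_hold_bars * 0.40)
    elif regime_signal == -1 and eff_imb > 0.15:
        hold = int(max_hold_bars * 1.50)
    # Per-asset withdrawal applies only outside cascade and stress regimes.
    if withdrawal_velocity < -0.20 and not (in_cascade or in_stress):
        hold = min(hold, int(max_hold_bars * 0.40))
        if pnl_pct > 0.0:
            tp *= 0.75
    return ExitParams(tp, sl, hold)
```

Starting from the champion values (`tp_pct=0.0095`, `sl_pct=1.0`, `max_hold_bars=120`), a cascade yields a 1.33% TP with a 60-bar hold, while a per-asset withdrawal in a neutral regime cuts the hold to 48 bars.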

### 7.3 Sub-day ACB Force Exit

When HZ listener fires an ACB update mid-day:
```python
# In update_acb_boost(boost, beta):
if old_boost >= 1.25 and boost < 1.10:
    evaluate_subday_exits()   # → _execute_exit("SUBDAY_ACB_NORMALIZATION", ...)
```

The 1.25 → 1.10 threshold pair is ARBITRARY (not backtested) and is marked as a research TODO. It is safe under the pending-flag pattern: the exit fires on the next bar, not mid-loop.

### 7.4 Slippage on Exit

```python
# SHORT position exit:
exit_price = current_price * (1.0 + slip)   # slippage against us when covering short
# STOP_LOSS:   slip = 0.0005 (5 bps — market order fill)
# FIXED_TP:   slip = 0.0002 (2 bps — likely limit fill)
# All others: slip = 0.0002
```
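
The exit slippage table reduces to a one-line lookup. A minimal sketch, with a hypothetical function name and exit reasons passed as plain strings:

```python
def short_exit_price(current_price: float, exit_reason: str) -> float:
    """Fill price when covering a SHORT; slippage raises the price paid."""
    # 5 bps for STOP_LOSS (market order), 2 bps for every other exit reason.
    slip = 0.0005 if exit_reason == "STOP_LOSS" else 0.0002
    return current_price * (1.0 + slip)
```

Covering at a quoted 100.00 therefore fills at about 100.05 on a stop-loss and about 100.02 on a take-profit or max-hold exit.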

---

## 8. FEE & SLIPPAGE MODEL

### 8.1 SmartPlacer Fee Model (Layer 3)

**Source**: `esf_alpha_orchestrator.py`, `_execute_exit()`

Blended taker/maker fee rates based on historical SP fill statistics. **IMPORTANT**: In production/paper sessions using Nautilus friction, these MUST be disabled via `use_sp_fees=False`.

```python
# Entry fee (ONLY applied if use_sp_fees=True):
entry_fee = (0.0002 * sp_maker_entry_rate + 0.0005 * (1 - sp_maker_entry_rate)) * notional
# With sp_maker_entry_rate = 0.62:
#         = (0.0002 * 0.62 + 0.0005 * 0.38) * notional
#         = (0.000124 + 0.000190) * notional
#         = 0.000314 * notional   (3.14 bps)
```
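
The blended entry fee can be computed directly. This sketch uses the maker and taker rates and the 0.62 maker share quoted above; the function name is illustrative.

```python
MAKER_FEE = 0.0002  # 2 bps
TAKER_FEE = 0.0005  # 5 bps


def blended_entry_fee(notional: float, sp_maker_entry_rate: float = 0.62,
                      use_sp_fees: bool = True) -> float:
    """Entry fee in quote currency; zero when Nautilus handles friction."""
    if not use_sp_fees:
        return 0.0
    rate = MAKER_FEE * sp_maker_entry_rate + TAKER_FEE * (1.0 - sp_maker_entry_rate)
    return rate * notional
```

On a 10,000 USDT notional this comes to about 3.14 USDT. With `use_sp_fees=False` the engine charges nothing and leaves commissions to Nautilus.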

### 8.2 SP Slippage Refund (Layer 3)

The SP slippage refund is likewise disabled when `use_sp_slippage=False` is passed to the engine. Both SP adjustments were used to "re-approximate" fills in low-fidelity simulations. In paper/live trading, the matching engine provides the fill price directly.

### 8.3 Production-Grade Native Friction (Nautilus)

In `launch_paper_portfolio.py` and live production flows:
1. **Engine Bypass**: `use_sp_fees = False`, `use_sp_slippage = False`.
2. **Nautilus Node Side**: Commissions are applied by the kernel via `CommissionConfig`.
3. **Execution**: Slippage is realized via the spread in the Nautilus Sandbox (Paper) or on the exchange order book (Live).

### 8.4 Independent Session Logging

Every high-fidelity session now deploys a `TradeLoggerActor` that independently captures:
- `logs/paper_trading/settings_<ts>.json`: Full configuration metadata.
- `logs/paper_trading/trades_<ts>.csv`: Every execution event.

### 8.5 OB Edge (Layer 4)

```python
# With real OB engine:
if ob_placement.depth_quality > 0.5:
    pnl_pct_raw += ob_placement.fill_probability * ob_edge_bps * 1e-4

# Without OB engine (legacy Monte Carlo fallback):
if rng.random() < ob_confirm_rate:          # ob_confirm_rate default: 0.40
    pnl_pct_raw += ob_edge_bps * 1e-4   # default: +5 bps
```

**Net PnL**:
```python
gross_pnl = pnl_pct_raw * notional
net_pnl   = gross_pnl - entry_fee - exit_fee
capital  += net_pnl
```

---

## 9. OB INTELLIGENCE LAYER

**Source**: `ob_features.py`, `ob_provider.py`, `hz_ob_provider.py`

The OB layer is wired in via `engine.set_ob_engine(ob_engine)` which propagates to signal_gen, asset_selector, and exit_manager. It is OPTIONAL — the engine degrades gracefully to legacy Monte Carlo when `ob_engine=None`.

### 9.1 OB Signals Per Asset

```python
ob_signal = ob_engine.get_signal(asset, timestamp)
# Fields:
#   imbalance_ma5        — 5-bar MA of bid/ask size imbalance ([-1, +1])
#   imbalance_persistence — fraction of last N bars sustaining sign
#   withdrawal_velocity  — rate of depth decay (negative = book thinning)
```

### 9.2 OB Macro (Market-Wide)

```python
ob_macro = ob_engine.get_macro()
# Fields:
#   cascade_count    — number of assets in liquidation cascade
#   regime_signal    — (-1=calm/trending, 0=neutral, +1=withdrawal stress)
```

### 9.3 OB Placement Quality

```python
ob_placement = ob_engine.get_placement(asset, timestamp)
# Fields:
#   depth_quality    — book depth score ([0, 2+]; >1 = deep book)
#   fill_probability — probability of maker fill at entry price
```

### 9.4 OB Sub-Systems Summary

| Sub | Location | Effect |
|-----|----------|--------|
| OB-1 | AlphaAssetSelector | ARS adjusted ±5%/10% by depth quality |
| OB-2 | AlphaSignalGenerator | Confidence boost/dampen by imbalance; hard skip if persistent contradiction |
| OB-3 | NDAlphaEngine._try_entry | Market-wide imbalance multiplier on final leverage (±20%/15%) |
| OB-4 | AdaptiveCircuitBreaker | Macro withdrawal stress modulates ACBv6 dynamic beta (+25% max) |
| OB-5 | AlphaExitManager | Dynamic TP/SL/max_hold based on cascade/withdrawal/calm regime |

---

## 10. ACB v6 — ADAPTIVE CIRCUIT BREAKER

### 10.1 Architecture (3-Scale Confirmation)

**Source**: `adaptive_circuit_breaker.py`, `AdaptiveCircuitBreaker`

```
Scale 1 (Daily):  External macro factors → log_0.5 base_boost
Scale 2 (Per-bar): vel_div signal strength → meta-boost multiplier
Scale 3 (Macro):  w750 eigenvalue velocity → dynamic beta switch
```

### 10.2 Scale 1 — External Factor Signals

Loaded from NG3 `scan_*__Indicators.npz` files (median of first 10 scans per date):

| Factor | Source | Bearish Threshold | Signal Weight |
|--------|--------|------------------|---------------|
| `funding_btc` | Funding rate | < -0.0001 (very) / < 0.0 (mild) | 1.0 / 0.5 |
| `dvol_btc` | BTC implied vol | > 80 (extreme) / > 55 (elevated) | 1.0 / 0.5 |
| `fng` | Fear & Greed Index | < 25 (extreme fear) / < 40 (fear) | 1.0 / 0.5 (requires confirmation) |
| `taker` | Taker buy ratio | < 0.80 (selling) / < 0.90 (mild) | 1.0 / 0.5 |

**Signal counting**:
```python
signals = sum(individual_signal_weights)   # float, e.g. 2.5
```

**Base boost formula**:
```python
if signals >= 1.0:
    base_boost = 1.0 + 0.5 * log1p(signals)
    # signals=1 → 1.347x | signals=2 → 1.549x | signals=3 → 1.693x
else:
    base_boost = 1.0
```
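
The quoted boost values can be reproduced with the standard library. A minimal sketch, assuming `signals` is the float sum of the per-factor weights from the table; the function name simply mirrors the variable above.

```python
from math import log1p

def base_boost(signals: float) -> float:
    """Log-shaped boost: 1.0 below one full signal, else 1 + 0.5 * ln(1 + signals)."""
    if signals >= 1.0:
        return 1.0 + 0.5 * log1p(signals)
    return 1.0

# Example: one strong and one mild bearish factor (1.0 + 0.5 = 1.5 signals).
example_signals = sum([1.0, 0.5])
for s in (0.5, 1.0, example_signals, 2.0, 3.0):
    print(f"signals={s:.1f} -> base_boost={base_boost(s):.3f}")
```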

### 10.3 Scale 3 — Dynamic Beta

```python
# Preloaded: w750_threshold = np.percentile(all_w750_vels, 60)
if w750_vel >= w750_threshold:
    beta = BETA_HIGH = 0.8    # aggressive meta-boost during macro acceleration
else:
    beta = BETA_LOW  = 0.2    # conservative during calm macro
```

### 10.4 Scale 2 — Per-Bar Meta-Boost

```python
# Computed every bar inside _update_regime_size_mult(vel_div):
strength_cubic = clamp((threshold - vel_div) / (threshold - extreme), 0, 1) ** 3
# leverage_convexity = 3 → cubic

if day_beta > 0:
    regime_size_mult = base_boost * (1.0 + day_beta * strength_cubic) * mc_scale
else:
    regime_size_mult = base_boost * mc_scale
```
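
A runnable sketch of the per-bar multiplier. The `threshold` and `extreme` defaults below are placeholder values chosen only for illustration (this section does not state them), and `clamp` is a local helper.

```python
def clamp(x: float, lo: float, hi: float) -> float:
    return max(lo, min(hi, x))

def regime_size_mult(vel_div: float, base_boost: float, beta: float,
                     mc_scale: float = 1.0,
                     threshold: float = -0.02,   # ASSUMED placeholder value
                     extreme: float = -0.05,     # ASSUMED placeholder value
                     convexity: int = 3) -> float:
    """Scale base_boost by a convex function of signal strength."""
    # 0 at the entry threshold, 1 at (or beyond) the extreme level.
    strength = clamp((threshold - vel_div) / (threshold - extreme), 0.0, 1.0)
    strength_convex = strength ** convexity      # convexity=3 gives the cubic
    if beta > 0:
        return base_boost * (1.0 + beta * strength_convex) * mc_scale
    return base_boost * mc_scale

at_threshold = regime_size_mult(-0.02, base_boost=1.0, beta=0.8)
halfway = regime_size_mult(-0.035, base_boost=1.0, beta=0.8)
at_extreme = regime_size_mult(-0.05, base_boost=1.0, beta=0.8)
print(at_threshold, halfway, at_extreme)
```

Because of the cubic, a signal halfway between threshold and extreme earns only 12.5% of the available meta-boost, so most of the extra size is reserved for the strongest readings.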

### 10.5 Sub-Day ACB Update (HZ Listener)

The `acb_processor_service.py` re-runs ACB computation mid-day when new NG3 scan data arrives and writes `{boost, beta}` to `DOLPHIN_FEATURES` IMap.

`_on_acb_event()` in `DolphinActor` stores the payload in `self._pending_acb` (written under `_acb_lock` since v2; see §14.2). It is applied at the start of the next `on_bar()` iteration:

```python
# In on_bar() — BEFORE processing:
if _pending_acb is not None and engine is not None:
    engine.update_acb_boost(_pending_acb['boost'], _pending_acb['beta'])
    _pending_acb = None
```

---

## 11. SURVIVAL STACK — POSTURE CONTROL

### 11.1 Overview

**Source**: `survival_stack.py`, `SurvivalStack`

Computes a continuous Risk Multiplier `Rm ∈ [0, 1]` from 5 sensor categories. Maps to discrete posture {APEX, STALKER, TURTLE, HIBERNATE}.

### 11.2 Five Sensor Categories

**Cat1 — Binary Invariant** (kill switch):
```python
if hz_nodes < 1 or heartbeat_age_s > 30:
    return 0.0   # Total system failure → HIBERNATE immediately
return 1.0
```

**Cat2 — Structural** (MC-Forewarner + data staleness):
```python
base = {OK: 1.0, ORANGE: 0.5, RED: 0.1}[mc_status]
decay = exp(-max(0, staleness_hours - 6) / 3)
f_structural = base * decay   # Exponential decay after 6h stale
```

**Cat3 — Microstructure** (OB depth/fill quality):
```python
if ob_stale:
    return 0.5
score = min(depth_quality, fill_prob)
return clamp(0.3 + 0.7 * score, 0.3, 1.0)
```

**Cat4 — Environmental** (DVOL spike impulse):
```python
if dvol_spike and t_since_spike_min == 0:
    return 0.3   # Instant degradation at spike
return 0.3 + 0.7 * (1 - exp(-t_since_spike_min / 60))  # 60-min recovery tau
```

**Cat5 — Capital** (sigmoid drawdown constraint):
```python
# Rm5 ≈ 0.97 at DD=0%, ≈ 0.89 at DD=5%, = 0.5 at DD=12%, ≈ 0.08 at DD=20%
return 1 / (1 + exp(30 * (drawdown - 0.12)))
```
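
To get a feel for the shape of these sensor curves, the closed-form categories can be evaluated directly. A sketch under the formulas above; the function names are illustrative, and `cat4_environmental` assumes a spike has already occurred at `t = 0`.

```python
from math import exp

def cat2_structural(mc_status: str, staleness_hours: float) -> float:
    """MC status base value, decayed exponentially once data is >6h stale."""
    base = {"OK": 1.0, "ORANGE": 0.5, "RED": 0.1}[mc_status]
    return base * exp(-max(0.0, staleness_hours - 6.0) / 3.0)

def cat4_environmental(t_since_spike_min: float) -> float:
    """Recovery from 0.3 toward 1.0 with a 60-minute time constant."""
    return 0.3 + 0.7 * (1.0 - exp(-t_since_spike_min / 60.0))

def cat5_capital(drawdown: float) -> float:
    """Sigmoid centred on a 12% drawdown."""
    return 1.0 / (1.0 + exp(30.0 * (drawdown - 0.12)))

for hours in (0, 6, 9, 12):
    print(f"Cat2 OK, stale {hours:>2}h: {cat2_structural('OK', hours):.3f}")
for minutes in (0, 30, 60, 180):
    print(f"Cat4 {minutes:>3} min after spike: {cat4_environmental(minutes):.3f}")
for dd in (0.0, 0.05, 0.12, 0.20):
    print(f"Cat5 at DD={dd:.0%}: {cat5_capital(dd):.3f}")
```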

### 11.3 Hierarchical Combination

```python
f_environment = min(f_structural, f_ext)     # worst of Cat2/Cat4
f_execution   = f_micro                       # Cat3
r_target      = Cat1 * Cat5 * f_environment * f_execution

# Correlated sensor collapse penalty:
degraded = count([f_structural < 0.8, f_micro < 0.8, f_ext < 0.8])
if degraded >= 2:
    r_target *= 0.5
```
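
The combination step can be written as a pure function, which makes the collapse penalty easy to test in isolation. A sketch only; `combine_sensors` is a hypothetical name, and the arguments are the category outputs described in §11.2.

```python
def combine_sensors(cat1: float, cat5: float,
                    f_structural: float, f_micro: float, f_ext: float) -> float:
    """Hierarchical product with a penalty when two or more sensors degrade."""
    f_environment = min(f_structural, f_ext)   # worst of Cat2 / Cat4
    f_execution = f_micro                      # Cat3
    r_target = cat1 * cat5 * f_environment * f_execution

    # Correlated sensor collapse: halve the target if >= 2 sensors are below 0.8.
    degraded = sum(f < 0.8 for f in (f_structural, f_micro, f_ext))
    if degraded >= 2:
        r_target *= 0.5
    return r_target

healthy = combine_sensors(1.0, 1.0, 1.0, 1.0, 1.0)
two_degraded = combine_sensors(1.0, 1.0, 0.5, 0.7, 1.0)
killed = combine_sensors(0.0, 1.0, 1.0, 1.0, 1.0)
print(healthy, two_degraded, killed)
```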

### 11.4 Bounded Recovery Dynamics

```python
# Fast attack (instant degradation), slow recovery (5%/minute max):
if r_target < last_r_total:
    r_final = r_target          # immediate drop
else:
    alpha = min(1.0, 0.02 * dt_min)
    step  = min(alpha * (r_target - last_r_total), 0.05 * dt_min)
    r_final = last_r_total + step
```
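
A direct transcription of the recovery rule into a testable function, assuming `dt_min` is the elapsed time in minutes since the last update. `bounded_recovery` is an illustrative name.

```python
def bounded_recovery(r_target: float, last_r_total: float, dt_min: float) -> float:
    """Drop immediately; recover toward the target in small bounded steps."""
    if r_target < last_r_total:
        return r_target                                   # fast attack
    alpha = min(1.0, 0.02 * dt_min)                       # fraction of the gap closed
    step = min(alpha * (r_target - last_r_total), 0.05 * dt_min)
    return last_r_total + step

dropped = bounded_recovery(0.2, 0.9, dt_min=1.0)
recovering = bounded_recovery(1.0, 0.5, dt_min=1.0)
print(dropped, recovering)
```

One observation from working through the constants: for `r` values in [0, 1] the gap is at most 1, so `alpha * gap` never exceeds `0.02 * dt_min` while `alpha` is unsaturated, and once `alpha` saturates at 1 (`dt_min >= 50`) the cap is at least 2.5. The `0.05 * dt_min` cap therefore appears not to bind in this sketch, and the effective recovery rate is about 2% of the remaining gap per minute.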

### 11.5 Posture Mapping

**NOTE: Thresholds are deliberately TIGHTER than mathematical spec (safety buffer).**

```python
if   Rm >= 0.90: posture = APEX       # Full trading, no constraints
elif Rm >= 0.75: posture = STALKER    # Max leverage capped at 2.0x
elif Rm >= 0.50: posture = TURTLE     # regime_dd_halt = True (no new entries)
else:            posture = HIBERNATE  # Force-close open positions, no new entries
```

### 11.6 Hysteresis

```python
# Down: requires hysteresis_down=2 consecutive bars at lower level
# Up:   requires hysteresis_up=5 consecutive bars at higher level
# Prevents flip-flopping around thresholds
```
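
One way to realize the mapping plus hysteresis is a small streak counter. This is a sketch under stated assumptions, not the `SurvivalStack` implementation: it assumes the posture jumps straight to the latest raw level once the required streak is reached, and that any bar at the current level resets the streak. `PostureHysteresis` and `raw_posture` are hypothetical names.

```python
POSTURES = ["HIBERNATE", "TURTLE", "STALKER", "APEX"]   # ascending risk appetite

def raw_posture(rm: float) -> str:
    """Threshold mapping from §11.5, without hysteresis."""
    if rm >= 0.90:
        return "APEX"
    if rm >= 0.75:
        return "STALKER"
    if rm >= 0.50:
        return "TURTLE"
    return "HIBERNATE"

class PostureHysteresis:
    def __init__(self, initial="APEX", hysteresis_down=2, hysteresis_up=5):
        self.current = initial
        self.down_n = hysteresis_down
        self.up_n = hysteresis_up
        self._streak = 0
        self._direction = 0          # -1 = downgrade pending, +1 = upgrade pending

    def update(self, rm: float) -> str:
        raw = raw_posture(rm)
        delta = POSTURES.index(raw) - POSTURES.index(self.current)
        direction = (delta > 0) - (delta < 0)
        if direction == 0:           # back at the current level: streak broken
            self._streak, self._direction = 0, 0
            return self.current
        if direction != self._direction:
            self._direction, self._streak = direction, 0
        self._streak += 1
        needed = self.up_n if direction > 0 else self.down_n
        if self._streak >= needed:   # enough consecutive bars: switch posture
            self.current = raw
            self._streak, self._direction = 0, 0
        return self.current

h = PostureHysteresis()
down = [h.update(0.60) for _ in range(2)]    # two bars at TURTLE level
up = [h.update(0.95) for _ in range(5)]      # five bars at APEX level
print(down, up)
```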

### 11.7 Posture → Engine Effect

| Posture | Engine Effect |
|---------|--------------|
| APEX | No posture constraint (leverage ceiling = `min(base_max_leverage × regime_size_mult × market_ob_mult, abs_max_leverage=6.0)`; see §13.3) |
| STALKER | `clamped_max_leverage = min(..., 2.0)` in `_try_entry` |
| TURTLE | `regime_dd_halt = True` → `process_bar` skips entry block |
| HIBERNATE | `_manage_position` forces EXIT("HIBERNATE_HALT"), `regime_dd_halt = True` |

---

## 12. MC-FOREWARNER ENVELOPE GATE

**Source**: Called via `engine.set_mc_forewarner(forewarner, mc_base_cfg)`

Runs daily at start of `process_day()`:

```python
mc_cfg = {**mc_base_cfg, 'max_leverage': base_max_leverage * day_base_boost}
mc_report = forewarner.assess_config_dict(mc_cfg)

mc_red    = mc_report.catastrophic_probability > 0.25 or mc_report.envelope_score < -1.0
mc_orange = (not mc_red) and (mc_report.envelope_score < 0 or mc_report.catastrophic_probability > 0.10)

day_mc_status = 'RED' if mc_red else ('ORANGE' if mc_orange else 'OK')
day_mc_scale  = 0.5 if mc_orange else 1.0

if mc_red or posture in ['TURTLE', 'HIBERNATE']:
    regime_dd_halt = True   # No new entries for the entire day
```
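
The three-way classification can be isolated into a pure function for unit testing. A minimal sketch using the thresholds above; `classify_mc` is a hypothetical helper and takes the two report fields as plain floats.

```python
def classify_mc(catastrophic_probability: float, envelope_score: float):
    """Return (status, mc_scale) using the RED / ORANGE thresholds above."""
    if catastrophic_probability > 0.25 or envelope_score < -1.0:
        return "RED", 1.0        # scale is moot: entries are halted for the day
    if envelope_score < 0 or catastrophic_probability > 0.10:
        return "ORANGE", 0.5     # trade at half size
    return "OK", 1.0

print(classify_mc(0.05, 0.4))    # comfortably inside the envelope
print(classify_mc(0.15, 0.4))    # elevated catastrophic probability
print(classify_mc(0.05, -1.5))   # far outside the envelope
```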

---

## 13. NDAlphaEngine — FULL BAR LOOP

### 13.1 process_day() Flow

```python
def process_day(date_str, df, asset_columns, vol_regime_ok, direction, posture):
    # --- Per-day initialization ---
    regime_direction = direction or -1     # -1=SHORT champion
    regime_dd_halt   = False
    day_posture      = posture

    # --- ACB: Scale 1 + Scale 3 (once per day) ---
    if acb:
        info = acb.get_dynamic_boost_for_date(date_str, ob_engine)
        day_base_boost = info['boost']    # log_0.5 curve
        day_beta       = info['beta']     # dynamic beta (w750 percentile switch)

    # --- MC-Forewarner gate (once per day) ---
    if forewarner:
        mc_report = forewarner.assess_config_dict(...)
        if mc_red or posture in [TURTLE, HIBERNATE]:
            regime_dd_halt = True   # kill new entries for day

    # === BAR LOOP ===
    for ri, row in df.iterrows():
        vd       = row['vel_div']            # primary signal
        v50_vel  = row['v50_lambda_max_velocity']
        v750_vel = row['v750_lambda_max_velocity']
        prices   = {col: row[col] for col in asset_columns if row[col] > 0}

        # Append to price_histories (trimmed back to the last 200 entries once length exceeds 500)
        for ac, p in prices.items():
            price_histories[ac].append(p)
            if len(price_histories[ac]) > 500:
                price_histories[ac] = price_histories[ac][-200:]

        vol_ok = vol_regime_ok[ri] if vol_regime_ok else (bid >= 100)

        # ACB Scale 2: regime_size_mult updated every bar
        _update_regime_size_mult(vd)

        process_bar(global_bar_idx, vd, prices, vol_ok, price_histories, v50_vel, v750_vel)
        global_bar_idx += 1

    return {date, pnl, capital, boost, beta, mc_status, trades}
```

### 13.2 process_bar() Flow

```python
def process_bar(bar_idx, vel_div, prices, vol_regime_ok, price_histories, v50_vel, v750_vel):
    bar_count += 1
    vel_div_history.append(vel_div)   # trimmed to 200

    # === EXIT MANAGEMENT (always first) ===
    if position is not None:
        exit_info = _manage_position(bar_idx, prices, vel_div, v50_vel, v750_vel)
        # → AlphaExitManager.evaluate() → if EXIT: _execute_exit()

    # === ENTRY (only when no position) ===
    if position is None and bar_idx > last_exit_bar and not regime_dd_halt:
        if bar_count >= lookback and vol_regime_ok:   # lookback = 100
            entry_info = _try_entry(bar_idx, vel_div, prices, price_histories, v50_vel, v750_vel)
```

### 13.3 _try_entry() Flow

```python
def _try_entry(bar_idx, vel_div, prices, price_histories, v50_vel, v750_vel):
    if capital <= 0: return None

    # 1. IRP Asset Selection (Layer 2)
    if use_asset_selection:
        market_data = {a: history[-50:] for a, history in price_histories.items() if len(history) >= 50}
        rankings = asset_selector.rank_assets(market_data, regime_direction)
        trade_asset = first_asset_passing_all_gates(rankings)
        if trade_asset is None: return None   # strict: no fallback
    else:
        trade_asset = "BTCUSDT"   # fallback when IRP disabled

    # 2. Signal Generation + DC (Layer 6)
    signal = signal_gen.generate(vel_div, vel_div_history,
                                  price_histories[trade_asset],
                                  regime_direction, trade_asset)
    if not signal.is_valid: return None   # vel_div or DC killed it

    # 3. Position Sizing (Layers 7-8)
    size = bet_sizer.calculate_size(capital, vel_div, signal.vel_div_trend, regime_direction)

    # 4. OB Sub-3: Cross-asset market multiplier
    market_ob_mult = ob_engine.get_market_multiplier(...)  # ±20%

    # 5. ACB leverage ceiling enforcement
    clamped_max = min(base_max_leverage * regime_size_mult * market_ob_mult, abs_max_leverage)   # abs_max_leverage = 6.0
    if posture == STALKER: clamped_max = min(clamped_max, 2.0)
    final_leverage = clamp(size.leverage * regime_size_mult * market_ob_mult, min_lev, clamped_max)

    # 6. Notional and entry
    notional = capital * size.fraction * final_leverage
    entry_price = prices[trade_asset]

    # 7. Create position
    position = NDPosition(trade_asset, regime_direction, entry_price,
                          notional, final_leverage, ...)
    exit_manager.setup_position(trade_id, entry_price, direction, bar_idx, v50_vel, v750_vel)
```
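
Step 5 (leverage ceiling enforcement) is the part most worth testing in isolation. The sketch below follows the clamping order shown above; the `min_lev` default is a placeholder because this section does not state its value, and `final_leverage_for` is a hypothetical name.

```python
def final_leverage_for(size_leverage: float, regime_size_mult: float,
                       market_ob_mult: float, base_max_leverage: float,
                       posture: str,
                       min_lev: float = 0.5,            # ASSUMED placeholder
                       abs_max_leverage: float = 6.0) -> float:
    """Apply the ACB ceiling, the STALKER cap, then clamp the sized leverage."""
    clamped_max = min(base_max_leverage * regime_size_mult * market_ob_mult,
                      abs_max_leverage)
    if posture == "STALKER":
        clamped_max = min(clamped_max, 2.0)
    raw = size_leverage * regime_size_mult * market_ob_mult
    return max(min_lev, min(raw, clamped_max))

apex = final_leverage_for(3.0, 1.5, 1.0, base_max_leverage=5.0, posture="APEX")
stalker = final_leverage_for(3.0, 1.5, 1.0, base_max_leverage=5.0, posture="STALKER")
capped = final_leverage_for(5.0, 1.8, 1.0, base_max_leverage=5.0, posture="APEX")
print(apex, stalker, capped)
```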

---

## 14. DOLPHIN ACTOR — NAUTILUS INTEGRATION

**Source**: `nautilus_dolphin/nautilus_dolphin/nautilus/dolphin_actor.py`
**Base**: `nautilus_trader.trading.strategy.Strategy` (Rust/Cython core)
**Lines**: 338

### 14.1 Lifecycle (v2 — step_bar API)

```
__init__:
    dolphin_config, engine=None, hz_client=None
    current_date=None, posture='APEX', _processed_dates=set()
    _pending_acb: dict|None = None
    _acb_lock = threading.Lock()           ← v2: explicit lock (not GIL reliance)
    _stale_state_events = 0
    _day_data = None, _bar_idx_today = 0

on_start():
    1. _connect_hz() → hazelcast.HazelcastClient(cluster_name="dolphin", cluster_members=["localhost:5701"])
    2. _read_posture() → DOLPHIN_SAFETY (CP AtomicRef, map fallback)
    3. _setup_acb_listener() → add_entry_listener(DOLPHIN_FEATURES["acb_boost"])
    4. create_boost_engine(mode=boost_mode, **engine_kwargs) → NDAlphaEngine
    5. MC-Forewarner injection (gold-performance stack — always active):
           mc_models_dir = config.get('mc_models_dir', _MC_MODELS_DIR_DEFAULT)
           if Path(mc_models_dir).exists():
               forewarner = DolphinForewarner(models_dir=mc_models_dir)
               engine.set_mc_forewarner(forewarner, _MC_BASE_CFG)
           ← graceful degradation: logs warning + continues if models missing
           ← disable explicitly: set mc_models_dir=None/'' in config

on_bar(bar):
    ① Drain ACB under _acb_lock:
        pending = _pending_acb; _pending_acb = None  ← atomic swap
        if pending: engine.update_acb_boost(boost, beta)

    ② Date boundary:
        date_str = datetime.fromtimestamp(bar.ts_event/1e9, UTC).strftime('%Y-%m-%d')
        if current_date != date_str:
            if current_date: engine.end_day()
            current_date = date_str
            posture = _read_posture()
            _bar_idx_today = 0
            engine.begin_day(date_str, posture=posture, direction=±1)
            if not live_mode: _load_parquet_data(date_str) → _day_data

    ③ HIBERNATE guard: if posture=='HIBERNATE': return  ← hard skip, no step_bar

    ④ Feature extraction:
        live_mode=False → if _day_data empty: return  ← early exit, no step_bar with zeros
                          elif _bar_idx_today >= len(df): return  ← end-of-day
                          else: row = df.iloc[_bar_idx_today], vol_regime_ok = (idx>=100)
        live_mode=True  → _get_latest_hz_scan(), staleness check (>10s → warning),
                          dedup on scan_number

    ⑤ _GateSnap BEFORE: (acb_boost, acb_beta, posture, mc_gate_open)

    ⑥ engine.pre_bar_proxy_update(inst50, v750_vel)  ← if ProxyBoostEngine

    ⑦ result = engine.step_bar(bar_idx, vel_div, prices, v50_vel, v750_vel, vol_regime_ok)
       _bar_idx_today += 1

    ⑧ _GateSnap AFTER: compare → if changed: stale_state_events++, result['stale_state']=True

    ⑨ _write_result_to_hz(date_str, result)

on_stop():
    _processed_dates.clear()
    _stale_state_events = 0
    if hz_client: hz_client.shutdown()
```

### 14.2 Thread Safety: ACB Pending-Flag Pattern (v2)

**CRITICAL**: HZ entry listeners run on HZ client pool threads, NOT the Nautilus event loop.

```python
# HZ listener thread — parse outside lock, assign inside lock:
def _on_acb_event(self, event):
    try:
        val = event.value
        if val:
            parsed = json.loads(val)          # CPU work OUTSIDE lock
            with self._acb_lock:
                self._pending_acb = parsed    # atomic write under lock
    except Exception as e:
        self.log.error(f"ACB event parse error: {e}")

# Nautilus event loop — drain under lock, apply outside lock:
def on_bar(self, bar):
    with self._acb_lock:
        pending = self._pending_acb
        self._pending_acb = None              # atomic consume under lock
    if pending is not None and self.engine is not None:
        boost = float(pending.get('boost', 1.0))
        beta  = float(pending.get('beta',  0.0))
        self.engine.update_acb_boost(boost, beta)
```

**v2 vs v1**: v1 relied on GIL for safety (bare dict assignment). v2 uses explicit `threading.Lock` — correct even if GIL is removed in future Python versions. Lock hold time is minimized to a single pointer swap.
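
The pattern can be exercised outside Nautilus and Hazelcast with plain threads. The following is a self-contained sketch; `PendingSlot` is a hypothetical stand-in for the `_pending_acb` / `_acb_lock` pair, and the writer threads stand in for HZ listener callbacks.

```python
import json
import threading

class PendingSlot:
    """Single-slot mailbox: writers overwrite, the reader drains atomically."""

    def __init__(self):
        self._lock = threading.Lock()
        self._value = None

    def publish(self, raw_json: str) -> None:
        parsed = json.loads(raw_json)      # CPU work outside the lock
        with self._lock:
            self._value = parsed           # only the assignment is locked

    def drain(self):
        with self._lock:
            value, self._value = self._value, None   # swap under the lock
        return value

slot = PendingSlot()
payloads = [json.dumps({"boost": 1.0 + i / 10, "beta": 0.2}) for i in range(5)]
writers = [threading.Thread(target=slot.publish, args=(p,)) for p in payloads]
for t in writers:
    t.start()
for t in writers:
    t.join()

first = slot.drain()    # whichever update landed last
second = slot.drain()   # nothing pending any more
print(first, second)
```

Note the latest-wins semantics: if several ACB updates arrive between two bars, only the most recent one is applied and the intermediate ones are dropped.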

### 14.3 _GateSnap — Stale-State Detection

New in v2. Detects when ACB boost, posture, or MC gate changes between the pre-step and post-step snapshot:

```python
_GateSnap = namedtuple('_GateSnap', ['acb_boost', 'acb_beta', 'posture', 'mc_gate_open'])

before = _GateSnap(engine._day_base_boost, engine._day_beta, posture, engine._mc_gate_open)
result = engine.step_bar(...)
after  = _GateSnap(engine._day_base_boost, engine._day_beta, _read_posture(), engine._mc_gate_open)

if before != after:
    changed_fields = [f for f in _GateSnap._fields if getattr(before, f) != getattr(after, f)]
    self._stale_state_events += 1
    self.log.warning(f"[STALE_STATE] gate changed mid-eval: {changed_fields}")
    result['stale_state'] = True   # flagged in HZ write — DO NOT use for live orders
```

### 14.4 Replay vs Live Mode

| | Replay Mode (live_mode=False) | Live Mode (live_mode=True) |
|---|---|---|
| Data source | `vbt_cache_klines/YYYY-MM-DD.parquet` | `DOLPHIN_FEATURES["latest_eigen_scan"]` (HZ) |
| Per-bar iteration | `df.iloc[_bar_idx_today]` | One bar = one HZ scan fetch |
| vol_regime_ok | `bar_idx >= 100` (warmup) | From scan dict |
| Stale guard | — | `abs(now_ns - scan_ts_ns) > 10s` → warning |
| Dedup | — | `scan_num == last_scan_number` → skip |
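
The two live-mode guards in the table can be sketched as one small function. Field names `scan_number` and `timestamp_ns` follow the `latest_eigen_scan` schema in §15.1; `accept_scan` itself is a hypothetical helper, and the real actor logs a warning rather than returning a flag.

```python
import time

STALE_AFTER_NS = 10 * 1_000_000_000   # 10 s staleness threshold

def accept_scan(scan: dict, last_scan_number, now_ns=None):
    """Return (process, stale): skip duplicates, flag (but keep) stale scans."""
    now_ns = time.time_ns() if now_ns is None else now_ns
    if scan["scan_number"] == last_scan_number:
        return False, False                       # dedup: already processed
    stale = abs(now_ns - scan["timestamp_ns"]) > STALE_AFTER_NS
    return True, stale                            # stale scans only warn

duplicate = accept_scan({"scan_number": 7, "timestamp_ns": 0}, 7, now_ns=0)
fresh = accept_scan({"scan_number": 8, "timestamp_ns": 0}, 7, now_ns=1_000_000_000)
stale = accept_scan({"scan_number": 8, "timestamp_ns": 0}, 7, now_ns=11_000_000_000)
print(duplicate, fresh, stale)
```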

### 14.5 Data Loading (Replay)

```python
def _load_parquet_data(date_str):
    path = HCM_DIR / "vbt_cache_klines" / f"{date_str}.parquet"
    df = pd.read_parquet(path)
    meta_cols = {vel_div, scan_number, v50_..., v750_..., instability_50, instability_150}
    asset_columns = [c for c in df.columns if c not in meta_cols]
    return df, asset_columns, None   # vol_regime_ok deferred to on_bar warmup check
```

### 14.6 Posture Reading

Primary: `HZ CP Subsystem AtomicReference('DOLPHIN_SAFETY')` — linearizable.
Fallback: `HZ IMap('DOLPHIN_SAFETY').get('latest')` — eventually consistent.
Default when HZ unavailable: `'APEX'` (non-fatal degradation).

### 14.7 Result Writing

```python
def _write_result_to_hz(self, date_str, result):
    if not self.hz_client: return   # silent noop
    imap_pnl = self.hz_client.get_map('DOLPHIN_PNL_BLUE').blocking()
    imap_pnl.put(date_str, json.dumps(result))
    if result.get('stale_state'):
        self.log.error("[STALE_STATE] DO NOT use for live order submission")
    # result: {date, pnl, capital, boost, beta, mc_status, trades, stale_state?}
```

### 14.8 Important Notes for Callers

- **`actor.log` is read-only** (Rust-backed Cython property). Never try to assign `actor.log = MagicMock()` in tests — use the real Nautilus logger instead.
- **`actor.posture`** is a regular Python attribute (writable in tests).
- **`actor.engine`** is set in `on_start()`. Tests can set directly after `__init__`.

---

## 15. HAZELCAST — FULL IMAP SCHEMA

Hazelcast is the **system memory**. All subsystem state flows through it. Every consumer must treat HZ maps as authoritative real-time sources.

**Infrastructure**: Hazelcast 5.3, Docker (`prod/docker-compose.yml`), `localhost:5701`, cluster `"dolphin"`.
**CP Subsystem**: Enabled — required for ACB atomic operations.
**Management Center**: `http://localhost:8080`.
**Python client**: `hazelcast-python-client 5.6.0` (siloqy-env).

### 15.1 Complete IMap Reference

| Map | Key | Value | Writer | Reader(s) | Notes |
|---|---|---|---|---|---|
| `DOLPHIN_SAFETY` | `"latest"` | JSON `{posture, Rm, sensors, ...}` | `system_watchdog_service.py` | `DolphinActor`, `paper_trade_flow`, `nautilus_prefect_flow` | CP AtomicRef preferred; IMap fallback |
| `DOLPHIN_FEATURES` | `"acb_boost"` | JSON `{boost, beta}` | `acb_processor_service.py` | `DolphinActor` (HZ entry listener) | Triggers `_on_acb_event` |
| `DOLPHIN_FEATURES` | `"latest_eigen_scan"` | JSON `{vel_div, scan_number, asset_prices, timestamp_ns, w50_velocity, w750_velocity, instability_50}` | Eigenvalue scanner bridge | `DolphinActor` (live mode) | Dedup on scan_number |
| `DOLPHIN_PNL_BLUE` | `"YYYY-MM-DD"` | JSON daily result `{pnl, capital, trades, boost, beta, mc_status, posture, stale_state?}` | `paper_trade_flow`, `DolphinActor._write_result_to_hz`, `nautilus_prefect_flow` | Analytics | stale_state=True means DO NOT use for live orders |
| `DOLPHIN_PNL_GREEN` | `"YYYY-MM-DD"` | JSON daily result | `paper_trade_flow` (green) | Analytics | GREEN config only |
| `DOLPHIN_STATE_BLUE` | `"latest"` | JSON `{strategy, capital, date, pnl, trades, peak_capital, drawdown, engine_state, updated_at}` | `paper_trade_flow` | `paper_trade_flow` (capital restore) | Full engine_state for position continuity |
| `DOLPHIN_STATE_BLUE` | `"latest_nautilus"` | JSON `{strategy, capital, date, pnl, trades, posture, param_hash, engine, updated_at}` | `nautilus_prefect_flow` | `nautilus_prefect_flow` (capital restore) | param_hash = champion SHA256[:16] |
| `DOLPHIN_STATE_BLUE` | `"state_{strategy}_{date}"` | JSON per-run snapshot | `paper_trade_flow` | Recovery | Full historical per-run snapshots |
| `DOLPHIN_HEARTBEAT` | `"nautilus_flow_heartbeat"` | JSON `{ts, iso, run_date, phase, flow}` | `nautilus_prefect_flow` (heartbeat_task) | External monitoring | Written at flow_start, engine_start, flow_end |
| `DOLPHIN_HEARTBEAT` | `"probe_ts"` | Timestamp string | `nautilus_prefect_flow` (hz_probe_task) | Liveness check | Written at HZ probe time |
| `DOLPHIN_OB` | per-asset key | JSON OB snapshot | `obf_prefect_flow` | `HZOBProvider` | Raw OB map |
| `DOLPHIN_FEATURES_SHARD_00` | symbol | JSON OB feature dict `{imbalance, fill_probability, depth_quality, regime_signal, ...}` | `obf_prefect_flow` | `HZOBProvider` | shard routing (see §15.2) |
| `DOLPHIN_FEATURES_SHARD_01..09` | symbol | Same schema | `obf_prefect_flow` | `HZOBProvider` | — |
| `DOLPHIN_SIGNALS` | signal key | Signal distribution | `signal_bridge.py` | Strategy consumers | — |
| `DOLPHIN_FEATURES` | `"obf_universe_latest"` | JSON `{_snapshot_utc, _n_assets, assets: {symbol: {spread_bps, depth_1pct_usd, depth_quality, fill_probability, imbalance, best_bid, best_ask, n_bid_levels, n_ask_levels}}}` | `obf_universe_service.py` | MHS v3 (M5 coherence), Asset Picker | 540 USDT perps; 60s push cadence. NEW v5.0 |
| `DOLPHIN_META_HEALTH` | `"latest"` | JSON `{rm_meta, status, m4_control_plane, m1_data_infra, m1_trader, m2_heartbeat, m3_data_freshness, m5_coherence, service_status, hz_key_status, timestamp}` | `meta_health_service_v3.py` | External monitoring, MHS tests | GREEN/DEGRADED/CRITICAL/DEAD. NEW v5.0 |

### 15.2 OBF Shard Routing

```python
SHARD_COUNT = 10
shard_idx = sum(ord(c) for c in symbol) % SHARD_COUNT
imap_name = f"DOLPHIN_FEATURES_SHARD_{shard_idx:02d}"   # ..._00 through ..._09
```

Routing is **stable** (sum-of-ord, not `hash()`) — deterministic across Python versions and process restarts. 400+ assets distribute evenly across 10 shards.
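
The routing rule is easy to verify by hand or in a test. A small sketch; `shard_map_name` is an illustrative wrapper around the two lines above, and the symbol list is only an example.

```python
from collections import Counter

SHARD_COUNT = 10

def shard_map_name(symbol: str) -> str:
    """Deterministic shard routing: sum of code points modulo SHARD_COUNT."""
    shard_idx = sum(ord(c) for c in symbol) % SHARD_COUNT
    return f"DOLPHIN_FEATURES_SHARD_{shard_idx:02d}"

symbols = ["BTCUSDT", "ETHUSDT", "SOLUSDT", "XRPUSDT", "DOGEUSDT"]
routing = {s: shard_map_name(s) for s in symbols}
load = Counter(routing.values())      # assets per shard, for eyeballing balance
print(routing)
print(load)
```

By contrast, Python's built-in `hash()` for strings is salted per process unless `PYTHONHASHSEED` is fixed, so it would route the same symbol to different shards after a restart.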

### 15.3 ShardedFeatureStore API

**Source**: `hz_sharded_feature_store.py`, `ShardedFeatureStore`

```python
store = ShardedFeatureStore(hz_client)
store.put('BTCUSDT', 'vel_div', -0.03)   # routes to a shard by symbol (see §15.2)
val  = store.get('BTCUSDT', 'vel_div')
store.delete('BTCUSDT', 'vel_div')
# Internal key format: "vel_div_BTCUSDT"
```

Near cache config: TTL=300s, invalidate_on_change=True, LRU eviction, max_size=5000 per shard.

### 15.4 HZOBProvider — Dynamic Asset Discovery

```python
# On connect (lazy), discovers which assets are present in any shard:
for shard_idx in range(SHARD_COUNT):
    key_set = client.get_map(f"DOLPHIN_FEATURES_SHARD_{shard_idx:02d}").blocking().key_set()
    discovered_assets.update(key_set)
```

No static asset list required — adapts automatically as OBF flow adds/removes assets.

### 15.5 CP Subsystem (ACB Processor)

`acb_processor_service.py` uses `HZ CP FencedLock` to prevent simultaneous ACB writes from multiple instances. CP Subsystem must be enabled in `docker-compose.yml`. All writers must use the same CP lock name to get protection.

### 15.6 OBF Circuit Breaker (HZ Push)

After 5 consecutive HZ push failures, OBF flow opens a circuit breaker and switches to file-only mode (`ob_cache/latest_ob_features.json`). Consumers should prefer the JSON file during HZ outages.
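
The failure-counting behaviour can be sketched as follows. This is an assumption-laden illustration, not the OBF flow's code: `PushCircuitBreaker` is a hypothetical class, and it has no reset path because the text above does not describe how the breaker closes again.

```python
class PushCircuitBreaker:
    """Opens after N consecutive push failures; a success resets the count."""

    def __init__(self, max_failures: int = 5):
        self.max_failures = max_failures
        self.consecutive_failures = 0
        self.open = False            # open = stop pushing to HZ, write file only

    def record(self, success: bool) -> bool:
        if success:
            self.consecutive_failures = 0
        else:
            self.consecutive_failures += 1
            if self.consecutive_failures >= self.max_failures:
                self.open = True
        return self.open

breaker = PushCircuitBreaker()
# Four failures, one success, then five failures in a row.
outcomes = [False] * 4 + [True] + [False] * 5
states = [breaker.record(ok) for ok in outcomes]
print(states)
```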

---

## 16. PRODUCTION DAEMON TOPOLOGY

> **v5.0 NOTE**: ALL services are managed exclusively by **supervisord**. No service is managed by systemd. The `meta_health_daemon.service`, `dolphin-nautilus-trader.service`, and `dolphin-scan-bridge.service` systemd units are stopped and disabled. Any attempt to re-enable them will create a dual-management race condition ("random killer" bug — see §26.1).

### 16.1 Supervisord Config

**File**: `/mnt/dolphinng5_predict/prod/supervisor/dolphin-supervisord.conf`
**Socket**: `/tmp/dolphin-supervisor.sock`
**PYTHONPATH** (dolphin_data group): `/mnt/dolphinng5_predict:/mnt/dolphinng5_predict/nautilus_dolphin:/mnt/dolphinng5_predict/prod`

```bash
# Status check
supervisorctl -c /mnt/dolphinng5_predict/prod/supervisor/dolphin-supervisord.conf status

# Restart a service
supervisorctl -c /mnt/dolphinng5_predict/prod/supervisor/dolphin-supervisord.conf restart dolphin_data:meta_health
```

### 16.2 dolphin_data Group (autostart=true — data pipeline)

| Program | File | Purpose | startsecs |
|---|---|---|---|
| `exf_fetcher` | `exf_fetcher_flow.py --warmup 15` | ExF live daemon: funding/dvol/fng/taker → HZ `exf_latest` | 20 |
| `acb_processor` | `acb_processor_service.py` | ACBv6 daily boost + dynamic beta → HZ `acb_boost` (CP FencedLock) | 10 |
| `obf_universe` | `obf_universe_service.py` | 540-asset OBF universe L2 health → HZ `obf_universe_latest` | 15 |
| `meta_health` | `meta_health_service_v3.py` | MHS v3 watchdog — monitors all data services, auto-restarts | 5 |

### 16.3 dolphin Group (autostart=false — trading, started manually)

| Program | File | Purpose | Notes |
|---|---|---|---|
| `nautilus_trader` | `nautilus_event_trader.py` | HZ entry listener trader | Start only during trading hours |
| `scan_bridge` | `scan_bridge_service.py` | Arrow → HZ scan bridge | Start when DolphinNG6 is active |
| `clean_arch_trader` | `clean_arch/main.py` | Clean architecture trader | Experimental |

### 16.4 ACB Processor (`acb_processor_service.py`)

**Purpose**: ACBv6 daily boost + dynamic beta from NG3 NPZ files → HZ `DOLPHIN_FEATURES["acb_boost"]`.
**HZ**: CP FencedLock prevents simultaneous writes.

### 16.5 OBF Universe (`obf_universe_service.py`) — NEW v5.0

**Purpose**: L2 health monitor for all 540 USDT perpetuals → HZ `DOLPHIN_FEATURES["obf_universe_latest"]`.
**Coverage**: 540 active USDT perps, 3 WS connections (200/200/140 streams).
**Stream**: `{symbol}@depth5@500ms` — zero REST weight.
**Cadence**: 60s health snapshots; 300s Parquet flush.
**Storage**: `/mnt/ng6_data/ob_universe/` (Hive partitioned; `MAX_FILE_AGE_DAYS=0` — never pruned).
**See §26.2 for full schema.**

### 16.6 Meta Health Service v3 (`meta_health_service_v3.py`) — NEW v5.0

**Purpose**: 5-sensor weighted health monitor + auto-recovery for all data pipeline services.
**Recovery**: `supervisorctl restart` via daemon thread. `RECOVERY_COOLDOWN_CRITICAL_S=10s`.
**Output**: `DOLPHIN_META_HEALTH["latest"]` + `/mnt/dolphinng5_predict/run_logs/meta_health.json`.
**See §26.3 for full specification.**

### 16.7 ExF Daemon (`exf_fetcher_flow.py`)

**Purpose**: External factors — funding rate, DVOL, Fear&Greed, taker ratio → HZ `DOLPHIN_FEATURES["exf_latest"]`.
**Field**: `_pushed_at` (Unix timestamp) is the canonical freshness field.

### 16.8 MC-Forewarner Flow (`mc_forewarner_flow.py`)

**Purpose**: Prefect-orchestrated daily ML assessment. Outcome: OK / ORANGE / RED → HZ.
**Effect**: ORANGE → `day_mc_scale=0.5`. RED → `regime_dd_halt=True`.

### 16.9 paper_trade_flow.py (Primary — 00:05 UTC)

**Purpose**: Daily NDAlphaEngine run. Loads klines, wires ACB+OB+MC, runs `begin_day/step_bar/end_day`.
**Direction**: `direction = -1` (SHORT, blue).

### 16.10 Daemon Start Sequence

```
1. docker-compose up -d          ← Hazelcast 5701, ManCenter 8080, Prefect 4200
2. supervisord (auto)            ← starts dolphin_data group automatically on boot
   └── exf_fetcher, acb_processor, obf_universe, meta_health start in parallel

3. (Manual when needed):
   supervisorctl start dolphin:nautilus_trader   ← HZ entry listener
   supervisorctl start dolphin:scan_bridge       ← when DolphinNG6 active

4. Prefect deployments (daily, scheduled):
   paper_trade_flow.py           ← 00:05 UTC
   nautilus_prefect_flow.py      ← 00:10 UTC
   mc_forewarner_flow.py         ← daily
```

### 16.11 Monitoring Endpoints

| Service | URL / Command |
|---|---|
| Hazelcast Management Center | `http://localhost:8080` |
| Prefect UI | `http://localhost:4200` |
| Supervisord status | `supervisorctl -c /mnt/dolphinng5_predict/prod/supervisor/dolphin-supervisord.conf status` |
| MHS health JSON | `cat /mnt/dolphinng5_predict/run_logs/meta_health.json` |
| Daily PnL | `HZ IMap DOLPHIN_PNL_BLUE[YYYY-MM-DD]` |
| ACB State | `HZ IMap DOLPHIN_FEATURES["acb_boost"]` |
| OBF Universe | `HZ IMap DOLPHIN_FEATURES["obf_universe_latest"]` |

---

## 17. PREFECT ORCHESTRATION LAYER

**Version**: Prefect 3.6.22 (siloqy-env)
**Server**: `http://localhost:4200/api`
**Work pool**: `dolphin` (process type)
**Worker command**: `prefect worker start --pool dolphin --type process`

### 17.1 Registered Deployments

| Deployment | Flow | Schedule | Config |
|---|---|---|---|
| `dolphin-paper-blue` | `paper_trade_flow.py` | `5 0 * * *` (00:05 UTC) | `configs/blue.yml` |
| `dolphin-paper-green` | `paper_trade_flow.py` | `5 0 * * *` (00:05 UTC) | `configs/green.yml` |
| `dolphin-nautilus-blue` | `nautilus_prefect_flow.py` | `10 0 * * *` (00:10 UTC) | `configs/blue.yml` |

### 17.2 nautilus_prefect_flow.py — Nautilus BacktestEngine Supervisor

New in v2. Tasks in execution order:

```
hz_probe_task             retries=3  timeout=30s   — verify HZ reachable; abort on failure
validate_champion_params  retries=0  timeout=10s   — SHA256 hash vs FROZEN params; ValueError on drift
load_bar_data_task        retries=2  timeout=120s  — load vbt_cache_klines parquet; validate vel_div col
read_posture_task         retries=2  timeout=20s   — read DOLPHIN_SAFETY
restore_capital_task      retries=2  timeout=20s   — restore capital from DOLPHIN_STATE_BLUE
  → HIBERNATE? skip engine, write result, heartbeat, return
run_nautilus_backtest_task retries=0 timeout=600s  — BacktestEngine + DolphinActor full cycle
write_hz_result_task      retries=3  timeout=30s   — DOLPHIN_PNL_BLUE + DOLPHIN_STATE_BLUE write
heartbeat_task            retries=0  timeout=15s   — phase=flow_end
```

**Champion integrity**: `_CHAMPION_HASH = sha256(json.dumps(_CHAMPION_PARAMS, sort_keys=True))[:16]`. Computed at import time. Any config drift triggers `ValueError` before engine starts.

**Capital continuity**: Restores from `DOLPHIN_STATE_BLUE["latest_nautilus"]`. Falls back to `initial_capital` (25,000 USDT) if absent.
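
The champion integrity check can be reproduced with `hashlib`. A sketch, assuming the hash is the first 16 hex characters of SHA-256 over the key-sorted JSON dump; the parameter names and values below are made up for illustration and are not the frozen champion parameters.

```python
import hashlib
import json

def champion_hash(params: dict) -> str:
    """First 16 hex chars of SHA-256 over a canonical (key-sorted) JSON dump."""
    canonical = json.dumps(params, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]

def validate_params(params: dict, frozen_hash: str) -> None:
    actual = champion_hash(params)
    if actual != frozen_hash:
        raise ValueError(f"champion param drift: {actual} != {frozen_hash}")

frozen = {"max_hold_bars": 120, "fixed_tp_pct": 0.0095}     # HYPOTHETICAL values
frozen_hash = champion_hash(frozen)

validate_params({"fixed_tp_pct": 0.0095, "max_hold_bars": 120}, frozen_hash)  # key order is irrelevant
try:
    validate_params({**frozen, "max_hold_bars": 121}, frozen_hash)
    drift_detected = False
except ValueError:
    drift_detected = True
print(frozen_hash, drift_detected)
```

Sorting keys before hashing is what makes the check insensitive to dict ordering while still catching any change in a value.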

### 17.3 paper_trade_flow.py — Task Reference

| Task | Retries | Purpose |
|---|---|---|
| `load_config` | 0 | YAML config load |
| `load_day_scans` | 2 | Parquet (preferred) or JSON fallback; vel_div validation |
| `run_engine_day` | 0 | begin_day/step_bar×N/end_day; returns daily stats |
| `write_hz_state` | 3 | DOLPHIN_STATE_BLUE + DOLPHIN_PNL_BLUE persist |
| `log_pnl` | 0 | Disk JSONL append (`paper_logs/{color}/`) |

### 17.4 Registration Commands

```bash
source /home/dolphin/siloqy_env/bin/activate
export PREFECT_API_URL=http://localhost:4200/api   # export so the python child processes see it

python prod/paper_trade_flow.py --register        # blue + green paper deployments
python prod/nautilus_prefect_flow.py --register   # nautilus blue deployment
```

### 17.5 Manual Run

```bash
# Paper trade:
python prod/paper_trade_flow.py --config prod/configs/blue.yml --date 2026-03-21

# Nautilus supervisor:
python prod/nautilus_prefect_flow.py --date 2026-03-21

# Dry-run (data + param validation, no engine):
python prod/nautilus_prefect_flow.py --date 2026-03-21 --dry-run
```

---

## 18. CI TEST SUITE

### 18.1 Test Suites Overview

| Suite | Location | Runner | Gate |
|-------|----------|--------|------|
| Nautilus bootstrap | `nautilus_dolphin/tests/test_0_nautilus_bootstrap.py` | `pytest nautilus_dolphin/tests/test_0_nautilus_bootstrap.py -v` | 11/11 |
| DolphinActor | `nautilus_dolphin/tests/test_dolphin_actor.py` | `pytest nautilus_dolphin/tests/test_dolphin_actor.py -v` | 35/35 |
| OBF unit tests | `tests/test_obf_unit.py` | `pytest tests/test_obf_unit.py -v` | ~120/~120 |
| Legacy CI | `ci/` directory | `pytest ci/ -v` | 14/14 |
| ACB + HZ status | `prod/tests/test_acb_hz_status_integrity.py` | `pytest prod/tests/test_acb_hz_status_integrity.py -v` | 118/118 |
| **MHS v3** | `prod/tests/test_mhs_v3.py` | `pytest prod/tests/test_mhs_v3.py -v` | **111/111** |

**Total: 46 Nautilus + ~120 OBF + 14 legacy CI + 118 ACB/HZ + 111 MHS = ~409 tests green.**

**Run all prod tests**:
```bash
source /home/dolphin/siloqy_env/bin/activate
cd /mnt/dolphinng5_predict
python -m pytest prod/tests/ -v --tb=short
```

### 18.2 Nautilus Bootstrap Tests (11 tests)

`test_0_nautilus_bootstrap.py` — foundation sanity checks:
- Nautilus import, catalog construction, Bar/BarType creation
- DolphinActor instantiation without full kernel (uses `__new__` + `__init__` pattern)
- Champion config loading from blue.yml
- HZ connectivity probe (skip if HZ unavailable)
- BacktestEngine construction with DolphinActor registered

### 18.3 DolphinActor Tests (35 tests, 8 classes)

`test_dolphin_actor.py` — full behavioral coverage:

| Class | Tests | What It Covers |
|-------|-------|----------------|
| `TestChampionParamInvariants` | 6 | Config loading, SHA256 hash stability, frozen param values, blue.yml parity |
| `TestACBPendingFlagThreadSafety` | 5 | Lock acquisition, JSON parse outside lock, dict assign inside lock, concurrent event safety |
| `TestHibernatePostureGuard` | 3 | HIBERNATE skips engine entirely, APEX/STALKER/TURTLE pass through, posture gate logic |
| `TestDateChangeHandling` | 5 | Date rollover triggers end_day/begin_day, once-per-date guard, bar_idx reset |
| `TestHZUnavailableDegradation` | 4 | HZ down → engine continues with stale OB features; heartbeat errors silenced; file fallback |
| `TestReplayModeBarTracking` | 3 | bar_idx increments per step_bar call; total_bars_processed correct; replay vs live mode flag |
| `TestOnStopCleanup` | 4 | on_stop writes final HZ result; HZ down on stop is non-fatal; engine state serialized |
| `TestStaleStateGuard` | 5 | _GateSnap detects mid-eval posture/acb changes; snap mismatch triggers abort; re-eval on next bar |

**Critical implementation note**: `actor.log` is a Cython/Rust-backed read-only property on `Actor`.
Do NOT attempt `actor.log = MagicMock()` — raises `AttributeError: attribute 'log' of ... objects is not writable`.
The real Nautilus logger is initialized by `super().__init__()` and works in test context.

### 18.4 Legacy CI Tests (14 tests)

**Location**: `ci/` directory. Runner: `pytest ci/ -v`

| File | Tests | What It Covers |
|------|-------|----------------|
| `test_13_nautilus_integration.py` | 6 | Actor import, instantiation, on_bar, HIBERNATE posture, once-per-day guard, ACB thread safety |
| `test_14_long_system.py` | 3 | Multi-day run, capital persistence, trade count |
| `test_15_acb_reactive.py` | 1 | ACB boost update applied correctly mid-day |
| `test_16_scaling.py` | 4 | Memory footprint <4GB (50 assets), shard routing (400 symbols), 400-asset no-crash, 400-asset with IRP |

### 18.5 Key Test Patterns

**ACB pending-flag pattern** (ThreadSafety test):
```python
# JSON parse OUTSIDE lock, dict assign INSIDE lock
with patch.object(actor.engine, 'update_acb_boost') as mock_update:
    actor._on_acb_event(event)
    assert actor._pending_acb['boost'] == 1.35
    mock_update.assert_not_called()  # engine NOT called from listener thread
```

**Date rollover pattern** (DateChange test):
```python
# Fires 3 bars on same date → assert begin_day.call_count == 1
# Fires 1 bar on next date  → assert begin_day.call_count == 2, end_day.call_count == 1
```
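
The rollover behaviour those assertions describe can be sketched as below. This is a simplified stand-in, not the `DolphinActor` code: `CountingEngine`, `DayRollover` and the method signatures are hypothetical.

```python
class CountingEngine:
    """Test double that only counts lifecycle calls."""

    def __init__(self):
        self.begin_calls = 0
        self.end_calls = 0
        self.steps = []

    def begin_day(self, date_str):
        self.begin_calls += 1

    def end_day(self):
        self.end_calls += 1

    def step_bar(self, bar_idx):
        self.steps.append(bar_idx)


class DayRollover:
    """Fires end_day/begin_day once per date change and resets bar_idx."""

    def __init__(self, engine):
        self.engine = engine
        self.current_date = None
        self.bar_idx = 0

    def on_bar(self, date_str):
        if date_str != self.current_date:
            if self.current_date is not None:
                self.engine.end_day()  # close the previous day first
            self.engine.begin_day(date_str)
            self.current_date = date_str
            self.bar_idx = 0
        self.engine.step_bar(self.bar_idx)
        self.bar_idx += 1
```

Comparing against the stored date is what makes the guard once-per-date: further bars on the same date fall straight through to `step_bar`.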

**_GateSnap stale-state detection**:
```python
# Snap taken at start of step_bar; posture changes mid-eval → abort, retry next bar
snap = actor._gate_snap  # namedtuple(acb_boost, acb_beta, posture, mc_gate_open)
```
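
A minimal sketch of the snapshot-and-compare idea, assuming the gate state can be re-read after evaluation. `evaluate_bar`, `read_gates` and `decide` are hypothetical names; only the namedtuple fields come from the comment above.

```python
from collections import namedtuple

GateSnap = namedtuple("GateSnap", ["acb_boost", "acb_beta", "posture", "mc_gate_open"])


def evaluate_bar(read_gates, decide):
    """Return decide(snap), or None if any gate changed during evaluation."""
    before = read_gates()      # snap taken at the start of the bar
    decision = decide(before)  # evaluation uses only the snapped values
    after = read_gates()       # re-read once evaluation is complete
    if after != before:
        return None            # stale state: abort, re-evaluate on the next bar
    return decision
```

Tuple equality compares all four fields at once, so a change to any single gate is enough to abort.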

---

## 19. PARAMETER REFERENCE

### 19.1 Champion Parameters (Frozen)

| Parameter | Value | Layer | Notes |
|-----------|-------|-------|-------|
| `vel_div_threshold` | -0.02 | Signal gate | PRIMARY entry threshold |
| `vel_div_extreme` | -0.05 | Signal/sizing | Extreme regime: full size |
| `min_leverage` | 0.5 | Sizing | Floor leverage |
| `max_leverage` | 5.0 | Sizing | Base ceiling (before ACB) |
| `abs_max_leverage` | 6.0 | Sizing | Hard geometric ceiling |
| `leverage_convexity` | 3.0 | Sizing | Cubic convex curve |
| `fraction` | 0.20 | Sizing | Max capital fraction per trade |
| `fixed_tp_pct` | 0.0095 | Exit | 95 bps take-profit |
| `stop_pct` | 1.0 | Exit | Effectively disabled |
| `max_hold_bars` | 120 | Exit | 600 seconds |
| `dc_lookback_bars` | 7 | DC | 35 seconds price momentum |
| `dc_min_magnitude_bps` | 0.75 | DC | Minimum BTC momentum |
| `dc_skip_contradicts` | True | DC | Hard skip on contradiction |
| `min_irp_alignment` | 0.45 | IRP | Alignment gate |
| `sp_maker_entry_rate` | 0.62 | Fees | 62% maker fill at entry |
| `sp_maker_exit_rate` | 0.50 | Fees | 50% maker fill at exit |
| `ob_edge_bps` | 5.0 | OB | Legacy MC OB edge |
| `ob_confirm_rate` | 0.40 | OB | Legacy MC confirmation rate |
| `lookback` | 100 | Warmup | Bars before first entry allowed |
| `seed` | 42 | RNG | Deterministic numpy RandomState |

### 19.2 ACBv6 Parameters (Frozen — Validated)

| Parameter | Value | Notes |
|-----------|-------|-------|
| `BETA_HIGH` | 0.8 | w750 above p60 threshold |
| `BETA_LOW` | 0.2 | w750 below p60 threshold |
| `W750_THRESHOLD_PCT` | 60 | Percentile switch point |
| `FUNDING_VERY_BEARISH` | -0.0001 | 1.0 signal |
| `DVOL_EXTREME` | 80 | 1.0 signal |
| `FNG_EXTREME_FEAR` | 25 | 1.0 signal (needs confirmation) |
| `TAKER_SELLING` | 0.8 | 1.0 signal |

### 19.3 Survival Stack Thresholds (Deliberately Tight)

| Posture | Rm Threshold | vs. Math Spec |
|---------|-------------|---------------|
| APEX | ≥ 0.90 | Tighter — spec was 0.85 |
| STALKER | ≥ 0.75 | Tighter — spec was 0.70 |
| TURTLE | ≥ 0.50 | Tighter — spec was 0.45 |
| HIBERNATE | < 0.50 | — |

**Do NOT loosen these without quantitative justification.**
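
For reference, the table maps to a simple threshold ladder. The function name `posture_for_rm` is illustrative; the cut-offs are the ones listed above.

```python
def posture_for_rm(rm: float) -> str:
    """Map a survival-stack Rm value to a posture using the tightened thresholds."""
    if rm >= 0.90:
        return "APEX"
    if rm >= 0.75:
        return "STALKER"
    if rm >= 0.50:
        return "TURTLE"
    return "HIBERNATE"
```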

---

## 20. OBF SPRINT 1 HARDENING

**Completed**: 2026-03-22. All 25 items in `AGENT_TODO_PRIORITY_FIXES_AND_TODOS.md` addressed.

### 20.1 P0/P1/P2 Hardening (Production Safety)

| Item | Change | Severity |
|------|--------|----------|
| Circuit breaker | 5 consecutive HZ push failures → exponential backoff + file-only fallback | P0 |
| Crossed-book guard | Ask ≤ bid on incoming feed → discard snapshot, log warning, continue | P0 |
| Dark streak detector | N consecutive zero-volume bars → emit STALE_DATA warning | P1 |
| First flush delay | No OB features published until 60s after startup (warmup) | P1 |
| Stall watchdog | No new bar for `STALL_TIMEOUT` seconds → alert + optional restart | P1 |
| Fire-and-forget HZ push | HZ write moved to background thread; hot loop never blocks on HZ | P2 |
| Dynamic asset discovery | `hzobprovider` discovers active symbols from HZ at runtime; no hardcoded list | P2 |
| Per-timestamp macro map | `latest_macro_at_ts` keyed by bar timestamp; resolves stale-read race on fast replays | P2 |
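
The circuit-breaker row can be illustrated with a small state holder. This is a sketch under assumptions: the class name, the backoff base and cap, and the rule that a failed probe doubles the backoff are not taken from `obf_prefect_flow.py`. Only the 5-failure threshold is.

```python
class HzPushBreaker:
    """Opens after `threshold` consecutive failures, then backs off exponentially."""

    def __init__(self, threshold=5, base_backoff_s=1.0, max_backoff_s=60.0):
        self.threshold = threshold
        self.base_backoff_s = base_backoff_s
        self.max_backoff_s = max_backoff_s
        self.failures = 0      # consecutive failures since the last success
        self.trips = 0         # how many times the breaker has opened in a row
        self.open_until = 0.0  # HZ pushes are skipped (file-only) before this time

    def allow(self, now: float) -> bool:
        """True when an HZ push may be attempted (closed, or half-open probe)."""
        return now >= self.open_until

    def record_success(self) -> None:
        self.failures = 0
        self.trips = 0

    def record_failure(self, now: float) -> None:
        self.failures += 1
        if self.failures >= self.threshold:
            backoff = min(self.base_backoff_s * 2 ** self.trips, self.max_backoff_s)
            self.open_until = now + backoff
            self.trips += 1
```

While `allow()` returns False the caller writes only to the local JSON fallback; the first push attempted after the backoff expires acts as the half-open probe.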

### 20.2 P3 Infrastructure Items

| Item | Status |
|------|--------|
| `scripts/verify_parquet_archive.py` — validates all daily parquet files for schema and row count | DONE |
| `ob_cache/SCHEMA.md` — authoritative JSON schema for `latest_ob_features.json` | DONE |
| P3-1 / P3-5 / P3-6 — out of scope for sprint 1, deferred | SKIPPED |

### 20.3 OBF Architecture Post-Sprint

```
Binance WS feed
    ↓
obf_prefect_flow.py (hot loop, ~100ms cadence)
    ├── Crossed-book guard → discard if ask ≤ bid
    ├── Dark streak detector → N zero-vol bars
    ├── First flush delay → 60s warmup
    ├── Feature compute (depth imbalance, spread, vwap, pressure ratio)
    ├── Per-timestamp macro map update
    ├── Fire-and-forget HZ push (background thread)
    │       └── Circuit breaker (5 failures → file-only)
    └── ob_cache/latest_ob_features.json (local fallback)
```

### 20.4 OBF Live Data Gap — KNOWN LIMITATION (2026-03-26)

> **CRITICAL DATA QUALITY CAVEAT**: `nautilus_event_trader.py` (live event trader) is currently wired to `MockOBProvider` with static per-asset imbalance biases (BTC=-0.086, ETH=-0.092, BNB=+0.05, SOL=+0.05). All four OBF functional dimensions compute and produce real outputs — but with frozen, market-unresponsive inputs. The OB cascade regime will always be CALM (no depth drain in mock data).
>
> `HZOBProvider` (`/mnt/dolphinng5_predict/nautilus_dolphin/nautilus_dolphin/nautilus/hz_ob_provider.py`) exists and is format-compatible with `obf_prefect_flow.py`'s HZ output, but `OBFeatureEngine` has no live streaming path — only `preload_date()` (batch/backtest). A `step_live()` method must be added before the switch.
>
> **Acceptable for**: paper trading
> **NOT acceptable for**: live capital deployment
>
> **Full spec**: `/mnt/dolphinng5_predict/prod/docs/AGENT_SPEC_OBF_LIVE_SWITCHOVER.md`

### 20.5 Test Coverage

`tests/test_obf_unit.py` — ~120 unit tests covering all hardening items:
- Circuit breaker state machine (CLOSED → OPEN → HALF-OPEN)
- Crossed-book guard triggers on malformed data
- Dark streak threshold detection
- Warmup period gating
- Background thread non-blocking behavior
- Asset discovery via HZ key scan

---

## 21. KNOWN RESEARCH TODOs

| ID | Description | Priority |
|----|-------------|----------|
| TODO-1 | Calibrate `vd_enabled` adverse-turn exits (currently disabled). Requires analysis of trade vel_div distribution at entry vs. subsequent bars. True invalidation threshold likely ~+0.02 sustained for N=3 bars. | MEDIUM |
| TODO-2 | Validate SUBDAY_ACB force-exit threshold (`old_boost >= 1.25 and boost < 1.10`). Currently ARBITRARY — agent-chosen, not backtest-derived. | MEDIUM |
| TODO-3 | MIG8: Binance live adapter (real order execution). OUT OF SCOPE until after 30-day paper trading validation. | LOW |
| TODO-4 | 48-hour chaos test with all daemons running simultaneously. Watch for: KeyError, stale-read anomalies, concurrent HZ writer collisions. | HIGH (before live capital) |
| TODO-5 | Memory profiler with IRP enabled at 400 assets (current 71 MB measurement was without IRP). Projected ~600 MB — verify. | LOW |
| TODO-6 | TF-spread recovery exits (`tf_enabled=False`). Requires sweep of tf_exhaust_ratio and tf_flip_ratio vs. champion backtest. | LOW |
| TODO-7 | GREEN (LONG) posture paper validation. LONG thresholds (long_threshold=0.01, long_extreme=0.04) not yet production-validated. | MEDIUM |
| TODO-8 | ~~ML-MC Forewarner injection into `nautilus_prefect_flow.py`.~~ **DONE 2026-03-22** — wired in `DolphinActor.on_start()` for both flows. | CLOSED |
| TODO-9 | Live TradingNode integration (launcher.py exists; Binance adapter config incomplete). Requires 30-day clean paper run first. | LOW |

---

## 22. 0.1S RESOLUTION — READINESS ASSESSMENT

**Assessment date**: 2026-03-22. **Status: BLOCKED — 3 hard blockers.**

The current system processes 5s OHLCV bars. Upgrading to 0.1s tick resolution requires resolving all three blockers below before any code changes.

### 22.1 Blocker 1 — Async HZ Push

**Problem**: The OBF hot loop fires at ~100ms cadence. The push itself is fire-and-forget, but the per-bar HZ write is still synchronous inside the feature compute path. At 0.1s resolution that write latency would exceed the bar cadence, so the HZ write queue would grow until the process runs out of memory.

**Required**: Full async HZ client (`hazelcast-python-client` async API or aiohazelcast). Currently all HZ operations are synchronous blocking calls. Estimated effort: 2–3 days of refactor + regression testing.

### 22.2 Blocker 2 — `get_depth` Timeout

**Problem**: `get_depth()` in `HZOBProvider` issues a synchronous HZ `IMap.get()` call with a 500ms timeout. At 0.1s resolution, each bar would wait up to 500ms for OB depth data — 5× the bar cadence. This makes 0.1s resolution impossible without an in-process depth cache.

**Required**: Pre-fetched depth cache (e.g., local dict refreshed by a background subscriber), making `get_depth()` a pure in-process read with <1µs latency. Estimated effort: 1–2 days.
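
One possible shape for such a cache is sketched below. `DepthCache` and its method names are hypothetical; in practice `on_depth_update` would be driven by a background subscriber (for example an HZ entry listener), which is not shown here.

```python
import threading


class DepthCache:
    """In-process depth store: writers run on a background thread, reads never hit HZ."""

    def __init__(self):
        self._lock = threading.Lock()
        self._depth = {}

    def on_depth_update(self, symbol: str, depth: dict) -> None:
        """Called by the background subscriber whenever new depth arrives."""
        with self._lock:
            self._depth[symbol] = depth

    def get_depth(self, symbol: str):
        """Pure local read; returns None until the first update for `symbol`."""
        with self._lock:
            return self._depth.get(symbol)
```

The hot loop then calls `get_depth()` without any network round trip; staleness handling (how old a cached entry may be) is left out of this sketch.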

### 22.3 Blocker 3 — Lookback Recalibration

**Problem**: All champion parameters that reference "bars" were validated against 5s bars:
- `lookback=100` (100 × 5s = 500s warmup)
- `max_hold_bars=120` (120 × 5s = 600s max hold)
- `dc_lookback_bars=7` (7 × 5s = 35s DC window)

At 0.1s resolution, the same bar counts would mean 10s warmup, 12s max hold, 0.7s DC window — **completely invalidating champion params**. All params must be re-validated from scratch via VBT backtest at 0.1s resolution.
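
The arithmetic can be checked directly. The helper names below are illustrative, and the rescaled bar counts only show the size of the mismatch; they are not validated parameters.

```python
BAR_PARAMS = {"lookback": 100, "max_hold_bars": 120, "dc_lookback_bars": 7}


def wall_clock_seconds(bars: int, bar_seconds: float) -> float:
    """Duration covered by `bars` bars of length `bar_seconds`."""
    return bars * bar_seconds


def bars_for_same_duration(bars_at_5s: int, new_bar_seconds: float) -> int:
    """Bar count needed at the new resolution to span the same wall-clock time."""
    return round(bars_at_5s * 5.0 / new_bar_seconds)


for name, bars in BAR_PARAMS.items():
    print(name, wall_clock_seconds(bars, 5.0), wall_clock_seconds(bars, 0.1),
          bars_for_same_duration(bars, 0.1))
```

Keeping the same wall-clock windows would need 50× the bar counts (for example `lookback` 100 → 5000), which is exactly the regime that has never been backtested.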

**Required**: Full backtest sweep at 0.1s. Estimated effort: 1–2 weeks of compute + validation time. This is a research milestone, not an engineering task.

### 22.4 Assessment Summary

| Blocker | Effort | Dependency |
|---------|--------|------------|
| Async HZ push | 2–3 days engineering | None — can start now |
| `get_depth` cache | 1–2 days engineering | None — can start now |
| Lookback recalibration | 1–2 weeks research | Requires blockers 1+2 resolved first |

**Recommendation**: Do NOT attempt 0.1s resolution until after 30-day paper trading validation at 5s. The engineering blockers can be prototyped in parallel, but champion params cannot be certified until post-paper-run stability is confirmed.

## 23. SIGNAL PATH VERIFICATION SPECIFICATION

Testing the asynchronous, multi-scale signal path requires systematic validation of the data bridge and cross-layer trigger logic.

### 23.1 Verification Flow
A local agent (Prefect or standalone) should verify:
1. **Micro Ingestion**: 100ms OB features sharded across 10 HZ maps.
2. **Regime Bridge**: NG5 Arrow scan detection by `scan_hz_bridge.py` and push to `latest_eigen_scan`.
3. **Strategy Reactivity**: `DolphinActor.on_bar` (5s) pulling HZ data and verifying `scan_number` idempotency.
4. **Macro Safety**: Survival Stack Rm-computation pushing `APEX/STALKER/TURTLE/HIBERNATE` posture to `DOLPHIN_SAFETY`.
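
For step 3, the idempotency property under test can be stated as a tiny gate. This is a sketch, assuming a scan payload is a dict with a `scan_number` field; `ScanGate` is a hypothetical name and the real actor may track more state.

```python
class ScanGate:
    """Accepts each scan_number once; a repeated delivery of the same scan is ignored."""

    def __init__(self):
        self._last_scan_number = None

    def accept(self, scan: dict) -> bool:
        n = scan["scan_number"]
        if n == self._last_scan_number:
            return False  # duplicate delivery of the scan already processed
        self._last_scan_number = n
        return True
```

A verification agent can push the same payload to `latest_eigen_scan` twice and assert that only the first delivery results in processing.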

### 23.2 Reference Document
Full test instructions, triggers, and expected values are defined in:
`TODO_CHECK_SIGNAL_PATHS.md` (Project Root)

---

*End of DOLPHIN-NAUTILUS System Bible v3.0 — 2026-03-23*
*Champion: SHORT only (APEX posture, blue configuration)*
*Automation: Prefect-supervised paper trading active.*
*Status: Capital Sync enabled; Friction SP-bypass active; TradeLogger running.*
*Do NOT deploy real capital until 30-day paper run is clean.*

## 24. MULTI-SPEED EVENT-DRIVEN ARCHITECTURE

**Version**: v4.1 Addition — 2026-03-25
**Status**: DEPLOYED (Production)
**Author**: Kimi Code CLI Agent
**Related**: `AGENT_READ_ARCHITECTURAL_CHANGES_SPEC.md` (detailed specification)

### 24.1 Overview

The DOLPHIN system has been re-architected from a **single-speed batch-oriented Prefect deployment** to a **multi-speed, event-driven, multi-worker architecture** with proper resource isolation and self-healing capabilities.

**Problem Solved**: 2026-03-24 system outage caused by uncontrolled Prefect process explosion (60+ `prefect.engine` zombies → resource exhaustion → kernel deadlock).

**Solution**: Frequency isolation + concurrency limits + systemd resource constraints + event-driven architecture.

### 24.2 Architecture Layers

| Layer | Frequency | Component | Pattern | Status |
|-------|-----------|-----------|---------|--------|
| L1 | <1ms | Nautilus Event Trader | Hz Entry Listener | ✅ Active (PID 159402) |
| L2 | 1-10s | Scan Bridge | File watcher → Hz | ✅ Active (PID 158929) |
| L3 | Varied | ExtF Indicators | Scheduled per-indicator | ⚠️ Not running (NG6 down) |
| L4 | ~5s | Meta Health Service | 5-sensor monitoring | ✅ Active (PID 160052) |
| L5 | Daily | Paper/Nautilus Flows | Prefect scheduled | ✅ Scheduled |

### 24.3 Nautilus Event-Driven Trader

**Purpose**: Millisecond-latency trading via Hazelcast event listener (not polling).

**Implementation**:
```python
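import json

def on_scan_update(event):
    # Runs on the HZ client's listener thread, once per scan update
    scan = json.loads(event.value)
    signal = compute_signal(scan, ob_data, extf_data)
    if signal.valid:
        execute_trade(signal)  # <10ms total latency

# Hz Entry Listener Pattern (handler must be defined before registration)
features_map.add_entry_listener(
    include_value=True,  # defaults to False, in which case event.value is None
    key='latest_eigen_scan',
    updated_func=on_scan_update  # Called per scan
)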

**Service**: `dolphin-nautilus-trader.service`
**Resource Limits**: MemoryMax=2G, CPUQuota=200%, TasksMax=50
**Hz Input**: `DOLPHIN_FEATURES["latest_eigen_scan"]`
**Hz Output**: `DOLPHIN_PNL_BLUE[YYYY-MM-DD]`, `DOLPHIN_STATE_BLUE`

### 24.4 Scan Bridge Service

**Purpose**: Detect Arrow scan files from DolphinNG6, push to Hz.

**Deployment**: `scan-bridge-flow/scan-bridge` (Prefect)
**Concurrency**: Strictly limited to 1
**Safety Mechanisms**:
- Work pool concurrency limit: 1
- Deployment concurrency limit: 1
- File mtime-based detection (handles NG6 restarts)

**Current Status**: Running directly (PID 158929) due to Prefect worker scheduling issues.

### 24.5 Meta Health Service v3 (MHS) — REWRITTEN v5.0

> **MHS v2 is retired.** `meta_health_daemon_v2.py` was calling `systemctl restart` on supervisord-managed processes — this was the "random killer" bug. v3 is the canonical implementation.

**File**: `meta_health_service_v3.py`
**Supervisord**: `dolphin_data:meta_health` (`autostart=true`)

#### 24.5.1 Five-Sensor Model (Weighted Sum — NOT product)

| Sensor | Weight | Metric | Thresholds |
|--------|--------|--------|------------|
| M4 | 0.35 | Control Plane (HZ port 5701 + Prefect 4200) | HZ=0.8w, Prefect=0.2w |
| M1 | 0.35 | Process Integrity (supervisord status) | data services scored separately from trader |
| M3 | 0.20 | Data Freshness (HZ key timestamps) | >30s=stale(0.5), >120s=dead(0.0) |
| M5 | 0.10 | Data Coherence (boost range, OBF coverage) | OBF<200 assets=0.5 |
| M2 | — | Heartbeat (informational only) | Not in rm_meta |
| M1_trader | — | Trader process (informational only) | Not in rm_meta (may be intentionally stopped) |

#### 24.5.2 Rm_meta Formula

```python
# FIX-1: Weighted sum — no single sensor can zero rm_meta (v2 bug fixed)
rm_meta = (0.35*m4 + 0.35*m1_data + 0.20*m3 + 0.10*m5) / 1.0

# Status thresholds (evaluated top to bottom)
#   rm > 0.85  → GREEN
#   rm > 0.60  → DEGRADED
#   rm > 0.30  → CRITICAL
#   rm ≤ 0.30  → DEAD; recovery triggered (only for STOPPED critical_data services)
```
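
A runnable version of the formula and status ladder, as a sketch: the sensor key names and the functions `rm_meta` and `status_for` are illustrative, while the weights and cut-offs are the ones given above.

```python
WEIGHTS = {"m4": 0.35, "m1_data": 0.35, "m3": 0.20, "m5": 0.10}


def rm_meta(sensors: dict) -> float:
    """Weighted sum of sensor scores in [0, 1], normalized by the total weight."""
    return sum(w * sensors[k] for k, w in WEIGHTS.items()) / sum(WEIGHTS.values())


def status_for(rm: float) -> str:
    if rm > 0.85:
        return "GREEN"
    if rm > 0.60:
        return "DEGRADED"
    if rm > 0.30:
        return "CRITICAL"
    return "DEAD"
```

With every other sensor at 1.0, zeroing `m1_data` gives 0.65 (DEGRADED), whereas the v2 product formula would have collapsed to 0.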

#### 24.5.3 Recovery Policy

```python
# FIX-2: supervisorctl restart, NOT systemctl (v2 bug fixed)
# FIX-3: 10s cooldown for critical services (was 600s)
# FIX-4: Non-blocking daemon thread (hung subprocess won't block check loop)
# FIX-5: Per-service cooldown (independent buckets per program)
# FIX-6: Only STOPPED critical_data services are restarted. Trader never auto-restarted.

RECOVERY_COOLDOWN_CRITICAL_S  = 10.0   # exf, acb, obf_universe
RECOVERY_COOLDOWN_DEFAULT_S   = 300.0  # nautilus_trader, scan_bridge (informational only)
CHECK_INTERVAL_S              = 10.0
```

#### 24.5.4 Monitored Services

| supervisord program | critical_data | Auto-restarted by MHS |
|---|---|---|
| `dolphin_data:exf_fetcher` | ✅ | ✅ (10s cooldown) |
| `dolphin_data:acb_processor` | ✅ | ✅ (10s cooldown) |
| `dolphin_data:obf_universe` | ✅ | ✅ (10s cooldown) |
| `dolphin:nautilus_trader` | ❌ | ❌ (informational) |
| `dolphin:scan_bridge` | ❌ | ❌ (informational) |

#### 24.5.5 Monitored HZ Sources

| Key | Map | Timestamp Field | Notes |
|---|---|---|---|
| `exf_latest` | `DOLPHIN_FEATURES` | `_pushed_at` | Unix float |
| `acb_boost` | `DOLPHIN_FEATURES` | (none — presence only) | — |
| `latest_eigen_scan` | `DOLPHIN_FEATURES` | `timestamp` | ISO string |
| `obf_universe_latest` | `DOLPHIN_FEATURES` | `_snapshot_utc` | Unix float |

**Output**: `DOLPHIN_META_HEALTH["latest"]` — JSON health report, also written to `run_logs/meta_health.json`

### 24.6 Safety Mechanisms

#### 24.6.1 Concurrency Controls (Root Cause Fix)

| Level | Mechanism | Value | Prevents |
|-------|-----------|-------|----------|
| Work Pool | `concurrency_limit` | 1 | Multiple simultaneous runs |
| Deployment | `prefect concurrency-limit` | 1 (tag-based) | Tag-based overflow |
| Systemd | `TasksMax` | 50 | Process fork bombs |
| Systemd | `MemoryMax` | 2G | OOM conditions |
| Systemd | `CPUQuota` | 200% | CPU starvation |

#### 24.6.2 Recovery Procedures

| Scenario | Trigger | Action |
|----------|---------|--------|
| Critical data service STOPPED | rm CRITICAL/DEAD + service STOPPED | `supervisorctl restart <program>` (async, 10s cooldown) |
| Data staleness | M3 < 0.5 | Alert only (external data dependency) |
| Control plane down | M4 < 0.5 | Alert (MHS can't self-heal HZ) |
| Trader stopped | m1_trader < 1.0 | Informational only — NEVER auto-restarted |

### 24.7 Data Flow: Scan-to-Trade

```
DolphinNG6 → Arrow File → Scan Bridge → Hz → Entry Listener → Nautilus → Trade
   (Win)      (SMB)       (5s poll)   (μs)    (<1ms)        (<1ms)    (<10ms)

Target: <10ms from NG6 scan to trade execution
Current: Waiting for NG6 restart to validate
```

### 24.8 Service Status (v5.0 — As Running 2026-03-30)

| supervisord program | Status | Notes |
|---|---|---|
| `dolphin_data:exf_fetcher` | ✅ RUNNING | Pushes exf_latest every ~60s |
| `dolphin_data:acb_processor` | ✅ RUNNING | Pushes acb_boost on NG3 data |
| `dolphin_data:obf_universe` | ✅ RUNNING | 512/540 assets healthy at launch |
| `dolphin_data:meta_health` | ✅ RUNNING | RM_META≈0.975 [GREEN] |
| `dolphin:nautilus_trader` | ⚙️ STOPPED (manual) | Start when trading |
| `dolphin:scan_bridge` | ⚙️ STOPPED (manual) | Start when DolphinNG6 active |
| hazelcast | ✅ (docker) | Port 5701 |
| prefect-server | ✅ (docker) | Port 4200 |

**RETIRED (stopped + disabled)**:
- `dolphin-nautilus-trader.service` (systemd) — was causing dual-management
- `dolphin-scan-bridge.service` (systemd) — was causing dual-management
- `meta_health_daemon.service` (systemd) — was calling `systemctl restart` on supervisord processes (root cause of random killer bug)

### 24.9 Known Issues (v5.0)

| Issue | Status | Notes |
|-------|--------|-------|
| NG6 down (no scan data) | External dependency | `latest_eigen_scan` key absent; MHS reports this cleanly |
| OBF shard store (400 assets) vs universe (540) | Architecture gap | Shard store is used by trading engine; universe is health-only |

### 24.10 Operational Commands

```bash
CONF=/mnt/dolphinng5_predict/prod/supervisor/dolphin-supervisord.conf

# Status
supervisorctl -c $CONF status

# Restart a service
supervisorctl -c $CONF restart dolphin_data:exf_fetcher

# Start the trader
supervisorctl -c $CONF start dolphin:nautilus_trader

# View MHS health
cat /mnt/dolphinng5_predict/run_logs/meta_health.json

# Tail the MHS log (written by supervisord)
tail -f /mnt/dolphinng5_predict/prod/supervisor/logs/meta_health.log
```

### 24.11 File Locations

| Component | Path |
|-----------|------|
| Nautilus Trader | `/mnt/dolphinng5_predict/prod/nautilus_event_trader.py` |
| MHS v3 | `/mnt/dolphinng5_predict/prod/meta_health_service_v3.py` |
| MHS v2 (retired) | `/mnt/dolphinng5_predict/prod/meta_health_daemon_v2.py` |
| OBF Universe | `/mnt/dolphinng5_predict/prod/obf_universe_service.py` |
| Scan Bridge | `/mnt/dolphinng5_predict/prod/scan_bridge_service.py` |
| Supervisord Conf | `/mnt/dolphinng5_predict/prod/supervisor/dolphin-supervisord.conf` |
| MHS Logs | `/mnt/dolphinng5_predict/prod/supervisor/logs/meta_health.log` |
| MHS Health JSON | `/mnt/dolphinng5_predict/run_logs/meta_health.json` |
| OBF Universe Data | `/mnt/ng6_data/ob_universe/` (Hive partitioned Parquet) |

---

## 25. NUMERICAL PRECISION POLICY

### 25.1 Principle

**Must use 512-bit native precision when possible, if performance allows.**

This system has `python-flint` v0.8.0 installed on all compute nodes.  All critical
financial math (vol computation, returns, rolling std) MUST use `flint.arb` at 512-bit
unless performance constraints make it prohibitive.

### 25.2 Configuration

```python
from flint import arb, ctx
ctx.prec = 512  # 512-bit mantissa; set once per process
```

### 25.3 Performance Benchmark (2026-03-28)

| Mode         | dvol/day  | Total 56d | Overhead |
|--------------|-----------|-----------|---------- |
| numpy float64 | ~94 ms   | ~5.3 s    | baseline  |
| arb 512-bit   | ~295 ms  | ~16.5 s   | +11 s     |

+11 seconds on a ~718 second total run (1.5% overhead) — acceptable.

### 25.4 Validated Equivalence

Benchmark result against full 56-day window (6154 rows/day sampled):
- NaNs in arb result: **0**
- vol_ok=True bars match float64: **MATCH=True**
- Divergent bars: **0**

float64 and 512-bit produce identical `vol_ok` decisions for this signal at current
BTC price magnitudes.  The 512-bit path is used as the primary path to prevent
precision erosion from future edge cases (extreme micro-volatility, very large
or very small price moves).

### 25.5 Implementation Pattern

```python
def _compute_dvol_arb512(prices, n_rows, threshold):
    """Primary: 512-bit arb.  Returns None if flint unavailable (fall back to float64)."""
    try:
        from flint import arb, ctx
        ctx.prec = 512
    except ImportError:
        return None
    # ... arb rolling std ...

# Call site:
vol_ok_mask = _compute_dvol_arb512(btc, n_rows, VOL_P60_THRESHOLD)
if vol_ok_mask is None:
    # float64 fallback — guards only; should not be reached on production nodes
    ...
```

### 25.6 Scope

| Computation | Precision | File |
|-------------|-----------|------|
| Rolling 50-bar dvol (vol_ok) | arb 512-bit | `nautilus_native_continuous.py` |
| All other paths | numpy float64 | — |

Future additions (returns, leverage math, position sizing) should follow the same
pattern: 512-bit primary, float64 last-resort guard.

---

## 26. SUPERVISORD ARCHITECTURE & OBF UNIVERSE (v5.0)

### 26.1 The "Random Killer" Bug — Root Cause & Fix

**Incident**: Services were being unexpectedly killed and restarted at seemingly random intervals. The system appeared healthy according to supervisord but processes would die without obvious cause.

**Root cause** (diagnosed 2026-03-30):
1. `meta_health_daemon_v2.py` had been running under `meta_health_daemon.service` (systemd) for 4+ days.
2. MHS v2's process patterns (`exf_prefect_final`, `esof_prefect_flow`) did not match any running process → M1=0 → `rm_meta = M1*M2*M3*M4*M5 = 0` always → status="DEAD".
3. MHS v2 recovery action: `systemctl restart <service>` — called every 5s.
4. But the services were supervisord-managed, not systemd-managed. `systemctl restart` on a supervisord process:
   - Sends SIGTERM to the process (it dies)
   - Supervisord detects the death and autostarts a new instance
   - Creates brief duplicate processes, interleaved with MHS v2's next kill cycle
5. Additionally, `dolphin-nautilus-trader.service` (systemd) AND supervisord were both managing `nautilus_event_trader.py` simultaneously — two PIDs running at once.

**Fix applied**:
```bash
systemctl stop meta_health_daemon.service && systemctl disable meta_health_daemon.service
systemctl stop dolphin-nautilus-trader.service && systemctl disable dolphin-nautilus-trader.service
systemctl stop dolphin-scan-bridge.service && systemctl disable dolphin-scan-bridge.service
```

**Permanent guard**: `test_mhs_v3.py::TestKillAndRevive::test_no_systemd_units_active_for_managed_services` asserts no conflicting systemd units are active.

### 26.2 OBF Universe Service

**Purpose**: Lightweight L2 order book health monitor for ALL 540 active USDT perpetuals on Binance Futures.

**Why**: Asset Picker needs OB health scores for the full universe (540 assets) to make informed selection decisions, not just the 400 assets covered by the existing OBF shard store.

**Design**: Push streams (zero REST weight), no polling.

```
wss://fstream.binance.com/ws
  Connection 1: 200 symbols × @depth5@500ms
  Connection 2: 200 symbols × @depth5@500ms
  Connection 3: 140 symbols × @depth5@500ms
  (total: 540, Binance limit: 300/conn)
```
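
The connection layout is a plain chunking of the symbol list. A minimal sketch, with `shard_symbols` and `stream_names` as illustrative helpers and placeholder symbols:

```python
def shard_symbols(symbols, max_per_conn=200):
    """Split the universe into per-connection chunks of at most max_per_conn symbols."""
    return [symbols[i:i + max_per_conn] for i in range(0, len(symbols), max_per_conn)]


def stream_names(shard):
    """Binance stream names are lowercase: <symbol>@depth5@500ms."""
    return [f"{s.lower()}@depth5@500ms" for s in shard]


universe = [f"SYM{i}USDT" for i in range(540)]  # placeholder symbols
print([len(shard) for shard in shard_symbols(universe)])
```

For 540 symbols this yields three shards of 200, 200 and 140, matching the diagram.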

**Computed metrics per asset** (every 60s snapshot):

| Field | Description |
|---|---|
| `spread_bps` | (ask - bid) / mid × 10000 |
| `depth_1pct_usd` | Total USD volume within 1% of mid on both sides |
| `depth_quality` | Normalized depth score [0,1] |
| `fill_probability` | Estimated probability of fill at mid |
| `imbalance` | (bid_vol - ask_vol) / (bid_vol + ask_vol) |
| `best_bid`, `best_ask` | L1 prices |
| `n_bid_levels`, `n_ask_levels` | Depth5 levels received |
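
The two closed-form fields can be computed as follows. This is a sketch: it assumes `mid` is the arithmetic mean of best bid and best ask, and that an empty book yields an imbalance of 0.0; neither detail is confirmed by the service code.

```python
def spread_bps(best_bid: float, best_ask: float) -> float:
    """(ask - bid) / mid × 10000, with mid assumed to be (bid + ask) / 2."""
    mid = (best_bid + best_ask) / 2.0
    return (best_ask - best_bid) / mid * 10_000


def imbalance(bid_vol: float, ask_vol: float) -> float:
    """(bid_vol - ask_vol) / (bid_vol + ask_vol), in [-1, 1]."""
    total = bid_vol + ask_vol
    if total == 0:
        return 0.0  # assumption: treat an empty book as balanced
    return (bid_vol - ask_vol) / total
```

For example, a 99.95 / 100.05 book has a spread of about 10 bps, and 3 units of bid volume against 1 of ask volume gives an imbalance of 0.5.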

**HZ output** (`DOLPHIN_FEATURES["obf_universe_latest"]`):
```json
{
  "_snapshot_utc": 1743350400.0,
  "_n_assets": 512,
  "assets": {
    "BTCUSDT": {"spread_bps": 0.42, "depth_quality": 0.91, ...},
    "ETHUSDT": {...},
    ...
  }
}
```

**Parquet storage**: `/mnt/ng6_data/ob_universe/` (Hive: `date=YYYY-MM-DD/part-NNN.parquet`)
- `MAX_FILE_AGE_DAYS = 0` — never pruned, accumulates for backtesting
- Flush cadence: every 300s

**Key constants**:
```python
SNAPSHOT_INTERVAL_S  = 60    # HZ push cadence
MAX_STREAMS_PER_CONN = 200   # Binance limit respected
FLUSH_INTERVAL_S     = 300   # Parquet write cadence
```

### 26.3 MHS v3 — Full Architecture Reference

**File**: `prod/meta_health_service_v3.py`
**Tests**: `prod/tests/test_mhs_v3.py` (111 tests, including Hypothesis property tests)

#### 26.3.1 Constants

```python
CHECK_INTERVAL_S              = 10.0   # main loop cadence
DATA_STALE_S                  = 30.0   # age threshold for stale (score=0.5)
DATA_DEAD_S                   = 120.0  # age threshold for dead (score=0.0)
RECOVERY_COOLDOWN_CRITICAL_S  = 10.0   # critical data infra restart cooldown
RECOVERY_COOLDOWN_DEFAULT_S   = 300.0  # informational services (never restarted)
```
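
How the two age thresholds turn into an M3 score can be sketched as below. The helper names are illustrative, and treating timezone-naive ISO strings as UTC is an assumption; the timestamp formats (Unix float or ISO string) follow §24.5.5.

```python
from datetime import datetime, timezone

DATA_STALE_S = 30.0
DATA_DEAD_S = 120.0


def age_seconds(ts, now: float) -> float:
    """Age of a key whose timestamp is a Unix float or an ISO-8601 string."""
    if isinstance(ts, str):
        dt = datetime.fromisoformat(ts)
        if dt.tzinfo is None:
            dt = dt.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
        ts = dt.timestamp()
    return now - ts


def freshness_score(age_s: float) -> float:
    if age_s > DATA_DEAD_S:
        return 0.0  # dead
    if age_s > DATA_STALE_S:
        return 0.5  # stale
    return 1.0      # fresh
```

Passing `now` explicitly rather than calling `time.time()` inside keeps the scoring deterministic under test.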

#### 26.3.2 Weighted Sensor Formula

```python
SENSOR_WEIGHTS = {
    "m4_control_plane":  0.35,   # HZ port 5701 (×0.8) + Prefect 4200 (×0.2)
    "m1_data_infra":     0.35,   # fraction of critical_data services RUNNING
    "m3_data_freshness": 0.20,   # average freshness score across HZ keys
    "m5_coherence":      0.10,   # ACB boost range validity + OBF coverage
}
# m1_trader and m2_heartbeat: emitted but NOT in rm_meta (may be intentionally stopped)

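# `sensors` (illustrative name) maps each SENSOR_WEIGHTS key to its score in [0, 1]
rm_meta = sum(w * sensors[k] for k, w in SENSOR_WEIGHTS.items()) / sum(SENSOR_WEIGHTS.values())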
```

#### 26.3.3 Recovery Logic

```python
def _restart_via_supervisorctl(self, program: str):
    """
    - Checks per-service cooldown (10s critical, 300s default)
    - Commits timestamp BEFORE spawning thread (prevents double-fire)
    - Runs in daemon thread — never blocks the check loop
    - Uses: supervisorctl -c <conf> restart <program>
    - NEVER calls systemctl
    """
```
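
The cooldown and double-fire guard described in that docstring can be sketched as follows. `RestartGate` is a hypothetical class, not the MHS v3 code: it uses a single cooldown for all programs and takes the restart action as a callback, where the real service would shell out to `supervisorctl -c <conf> restart <program>`.

```python
import threading
import time


class RestartGate:
    """Per-program cooldown; the timestamp is committed before the worker thread starts."""

    def __init__(self, run_restart, cooldown_s=10.0, clock=time.monotonic):
        self._run_restart = run_restart  # e.g. a wrapper around supervisorctl
        self._cooldown_s = cooldown_s
        self._clock = clock
        self._last = {}                  # program -> time of last accepted request
        self._lock = threading.Lock()

    def request(self, program: str) -> bool:
        now = self._clock()
        with self._lock:
            last = self._last.get(program)
            if last is not None and now - last < self._cooldown_s:
                return False             # still cooling down for this program
            self._last[program] = now    # commit BEFORE spawning (prevents double-fire)
        threading.Thread(target=self._run_restart, args=(program,), daemon=True).start()
        return True
```

Because the check and the timestamp write happen under one lock, ten concurrent requests for the same program result in exactly one restart, while other programs keep independent cooldown buckets.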

#### 26.3.4 Test Suite Summary

| Class | Tests | Coverage |
|---|---|---|
| `TestSupervisordStatusParsing` | 7 | parsing all supervisorctl output variants |
| `TestM1ProcessIntegrity` | 7 | scoring with mocked sv_status, psutil fallback |
| `TestM3DataFreshnessScoring` | 7 | stale/dead thresholds, ISO timestamps |
| `TestRmMetaFormula` | 10 | weighted sum, product-formula regression guard |
| `TestRecoveryGating` | 5 | cooldown, thread isolation |
| `TestRecoveryNeverKillsRunning` | 6 | running services never restarted |
| `TestM4ControlPlane` | 4 | port checks with mocked socket |
| `TestM5Coherence` | 7 | boost range, OBF coverage thresholds |
| `TestLiveIntegration` | 10 | live HZ + supervisord (skip if unavailable) |
| `TestKillAndRevive` | 9 | E2E: stop service → MHS detects → restarts within 30s |
| `TestServiceRegistry` | 7 | invariants: cooldown ≤ 10s, check interval ≤ 15s |
| `TestRaceConditions` | 5 | 10 concurrent restarts same service → only 1 fires |
| `TestEdgeCases` | 14 | garbage JSON, future timestamps, NaN sensors |
| `TestHypothesisProperties` | 13 | 300–500 examples each: rm∈[0,1], monotone sensors, status valid |

**Run**:
```bash
source /home/dolphin/siloqy_env/bin/activate
cd /mnt/dolphinng5_predict
python -m pytest prod/tests/test_mhs_v3.py -v --tb=short   # ~5 minutes (E2E tests)
```

### 26.4 OBF Persistence Fix

**File**: `prod/obf_persistence.py`

**Bug (v4.1)**: `MAX_FILE_AGE_DAYS = 7` — every daily cleanup run deleted all OBF Parquet data older than 7 days, destroying the entire backtesting dataset.

**Fix (v5.0)**:
```python
MAX_FILE_AGE_DAYS = 0   # 0 = disabled — never prune, accumulate for backtesting

def _cleanup_old_partitions(self):
    """0 = disabled."""
    if not MAX_FILE_AGE_DAYS or not self.base_dir.exists():
        return
    ...
```

Data now accumulates indefinitely in `/mnt/ng6_data/ob_features/` (existing OBF) and `/mnt/ng6_data/ob_universe/` (new universe service).

---

## 27. NG8 LINUX EIGENSCAN SERVICE

**File**: `- Dolphin NG8/ng8_scanner.py`
**Status**: Built, smoke-tested. Replaces Windows NG7 eigenscan.
**Run**: `source /home/dolphin/siloqy_env/bin/activate && cd "/mnt/dolphinng5_predict/- Dolphin NG8" && python3 ng8_scanner.py`

### 27.1 Root Cause: NG7 Double-Output Bug

Windows NG7 maintained two independent tracker cycles:
- **Fast cycle** (w50, w150): completed ~11s after scan start → wrote Arrow file 1, HZ write 1
- **Slow cycle** (w300, w750): completed ~3 min later with **stale BTC price** → wrote Arrow file 2, HZ write 2

Both cycles shared the same `scan_number` counter. Result: two Arrow files per logical scan, the second containing stale prices from 3 minutes earlier. The scan bridge de-duplicated by file mtime (file 1 is always the useful one).
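
A minimal sketch of mtime-based de-duplication, assuming the bridge keeps the earliest file per `scan_number`. The function name and the `(filename, mtime)` input shape are hypothetical; the filename pattern follows the `scan_NNNNNN_HHMMSS.arrow` convention used for Arrow scans.

```python
import re

_SCAN_RE = re.compile(r"^scan_(\d+)_\d{6}\.arrow$")


def pick_first_per_scan(files):
    """Keep the earliest-written file for each scan number.

    files: iterable of (filename, mtime) pairs.
    Returns {scan_number: filename}; non-matching names are skipped.
    """
    best = {}
    for name, mtime in files:
        match = _SCAN_RE.match(name)
        if match is None:
            continue
        scan_number = int(match.group(1))
        if scan_number not in best or mtime < best[scan_number][1]:
            best[scan_number] = (name, mtime)
    return {n: name for n, (name, _) in best.items()}
```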

### 27.2 NG8 Fix: Single `enhance()` Pass

`DolphinCorrelationEnhancerArb512.enhance()` processes all four windows (50, 150, 300, 750) in a single sequential loop. NG8 calls this once per scan cycle:

```python
result = self.engine.enhance(price_data, PRIORITY_SYMBOLS, now)
# result.multi_window_results has all four windows populated
# Exactly one Arrow write + one HZ write follows
```

`use_arrow=False` is passed to the engine constructor so the engine does **not** perform its own internal Arrow write — `ng8_scanner.py` owns that write exclusively.

### 27.3 Schema Contract (Doctrinal NG5)

Arrow IPC schema is defined in `ng7_arrow_writer_original.py` → `SCAN_SCHEMA` (27 fields, `SCHEMA_VERSION="5.0.0"`). `arrow_writer.py` is a thin re-export shim:

```python
# arrow_writer.py
from ng7_arrow_writer_original import (
    ArrowEigenvalueWriter, ArrowScanReader, write_scan_arrow, read_scan_arrow,
)
```

**NEVER** modify `arrow_writer.py` schema — edit `ng7_arrow_writer_original.py`.

Key schema fields:

| Field | Type | Description |
|---|---|---|
| `scan_number` | int64 | monotonic counter, resumes from last Arrow file on restart |
| `timestamp_ns` | int64 | Unix nanoseconds at scan start |
| `w50_lambda_max` … `w750_instability` | float64 × 16 | per-window eigenstats |
| `vel_div` | float64 | velocity divergence (cross-window signal) |
| `regime_signal` | float64 | -1 / 0 / +1 |
| `instability_composite` | float64 | composite of w50…w750 instability |
| `assets` / `prices` / `loadings` | utf8 | JSON-serialised |
| `schema_version` | utf8 | "5.0.0" |

### 27.4 Storage

```
Arrow files : /mnt/dolphinng6_data/arrow_scans/YYYY-MM-DD/scan_NNNNNN_HHMMSS.arrow
ArrowEigenvalueWriter storage_root = /mnt/dolphinng6_data   # writer appends arrow_scans/ internally
```

**Critical**: pass `get_arrow_scans_path().parent` (= `/mnt/dolphinng6_data`) — NOT `get_arrow_scans_path()` — or the writer creates `arrow_scans/arrow_scans/` double-nesting.
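
The double-nesting is easy to reproduce with plain path arithmetic. In the sketch below, `get_arrow_scans_path()` is a stand-in for the project helper of the same name and `writer_output_dir()` is a hypothetical model of the writer appending `arrow_scans/` to its `storage_root`.

```python
from pathlib import PurePosixPath


def get_arrow_scans_path() -> PurePosixPath:
    # Stand-in for the real helper; the path is taken from the storage layout.
    return PurePosixPath("/mnt/dolphinng6_data/arrow_scans")


def writer_output_dir(storage_root: PurePosixPath) -> PurePosixPath:
    # Models the documented writer behaviour: arrow_scans/ is appended internally.
    return storage_root / "arrow_scans"


correct_dir = writer_output_dir(get_arrow_scans_path().parent)
wrong_dir = writer_output_dir(get_arrow_scans_path())
# correct_dir -> /mnt/dolphinng6_data/arrow_scans
# wrong_dir   -> /mnt/dolphinng6_data/arrow_scans/arrow_scans
```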

### 27.5 Hazelcast Output

Map: `DOLPHIN_FEATURES` → key `latest_eigen_scan`

**NG8 flat payload** (written by NG8, differs from NG7 nested payload):
```python
{
    "scan_number":         int,
    "timestamp":           "ISO-8601",
    "bridge_ts":           float,          # Unix epoch at HZ write
    "vel_div":             float,
    "w50_velocity":        float,
    "w150_velocity":       float,
    "w300_velocity":       float,
    "w750_velocity":       float,
    "eigenvalue_gradients": {...},
    "multi_window_results": {...},         # full per-window stats
}
```

TUI v3 `_eigen_from_scan()` normalises both NG7 nested and NG8 flat formats transparently.

### 27.6 Scan Number Continuity

On startup, `_load_last_scan_number(arrow_scans_dir)` scans all `scan_NNNNNN_*.arrow` filenames for the highest N and resumes from N+1. Prevents counter reset gaps after service restart.
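
The resume logic can be sketched as a filename scan. This is an illustrative version, not the code in `ng8_scanner.py`: the recursive search through date subdirectories and the return value of 0 when no files exist are assumptions.

```python
import re
from pathlib import Path

_SCAN_FILE_RE = re.compile(r"^scan_(\d+)_.*\.arrow$")


def load_last_scan_number(arrow_scans_dir: Path) -> int:
    """Return the highest scan number found under arrow_scans_dir, or 0."""
    highest = 0
    if not arrow_scans_dir.exists():
        return highest
    # Files live in YYYY-MM-DD subdirectories, so search recursively.
    for path in arrow_scans_dir.rglob("scan_*.arrow"):
        match = _SCAN_FILE_RE.match(path.name)
        if match:
            highest = max(highest, int(match.group(1)))
    return highest
```

The scanner would then continue from `load_last_scan_number(arrow_scans_dir) + 1`.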

### 27.7 Symbol List

50 symbols matching doctrinal NG3/NG5/NG7 `PRIORITY_SYMBOLS`. Do NOT change this list without a full schema migration — historical correlation matrices are computed on this exact universe.

### 27.8 Supervisord Integration (Pending)

Add to `dolphin-supervisord.conf`:
```ini
[program:ng8_scanner]
command=/home/dolphin/siloqy_env/bin/python3 ng8_scanner.py
directory=/mnt/dolphinng5_predict/- Dolphin NG8
autostart=false        ; manual start until NG7 Windows is formally retired
autorestart=true
stderr_logfile=/var/log/dolphin/ng8_scanner.err.log
stdout_logfile=/var/log/dolphin/ng8_scanner.out.log
```

Set `autostart=true` only after confirming Windows NG7 is shut down — dual-write to the same HZ key is safe (last-write-wins) but creates confusing Arrow audit trails.

---

## 28. TUI v3 — LIVE OBSERVABILITY TERMINAL

**File**: `Observability/TUI/dolphin_tui_v3.py`
**Run**: `source /home/dolphin/siloqy_env/bin/activate && cd /mnt/dolphinng5_predict/Observability/TUI && python3 dolphin_tui_v3.py`
**Framework**: Textual 8.1.1 (siloqy_env)
**Bindings**: `q` quit · `r` force-refresh · `l` log panel · `t` toggle test footer

### 28.1 Architecture: Zero Load on Origin System

All data flows via **Hazelcast entry listeners** (push model):

```
HZ maps ──push──► _State (thread-safe dict) ──call_from_thread──► Textual asyncio loop
                                                                         │
                                               set_interval(1s) ────────┘
```

`IMap.add_entry_listener(include_value=True, added_func=fn, updated_func=fn)` fires callbacks from the HZ internal thread pool on any map change. No polling of origin systems.

Prefect is the **only** polled source — 60s interval via `run_worker(prefect_poll_loop())`.
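
The push model above can be sketched without a live cluster. `State` and `make_listener` are hypothetical names; the only assumption about Hazelcast is that the event object passed to the callback exposes `key` and `value` attributes, which a `SimpleNamespace` imitates here.

```python
import threading
from types import SimpleNamespace


class State:
    """Thread-safe store shared between HZ callback threads and the UI loop."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}

    def put(self, map_name, key, value):
        with self._lock:
            self._data[(map_name, key)] = value

    def get(self, map_name, key, default=None):
        with self._lock:
            return self._data.get((map_name, key), default)


def make_listener(state, map_name):
    """Build a callback suitable for both the added and updated hooks."""
    def on_entry(event):
        # Runs on a Hazelcast thread: only store the value, do no UI work here.
        state.put(map_name, event.key, event.value)
    return on_entry


state = State()
listener = make_listener(state, "DOLPHIN_META_HEALTH")
# Simulated event; in the TUI this would arrive from the entry listener.
listener(SimpleNamespace(key="latest", value='{"rm_meta": 0.975}'))
```

Keeping the callback this small means the HZ thread never touches Textual widgets directly; the UI loop reads from `State` on its own schedule.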

### 28.2 Panel Map

| Panel | HZ Source | Update Trigger |
|---|---|---|
| **Header** | `DOLPHIN_HEARTBEAT` | HZ listener |
| **Trader** | `DOLPHIN_STATE_BLUE`, `DOLPHIN_FEATURES/latest_eigen_scan`, `DOLPHIN_HEARTBEAT` | HZ listener |
| **SysHealth (M1–M5)** | `DOLPHIN_META_HEALTH/latest` | HZ listener |
| **AlphaEngine** | `DOLPHIN_FEATURES/latest_eigen_scan` | HZ listener (eigenscan) |
| **Scan** | `DOLPHIN_FEATURES/latest_eigen_scan` | HZ listener (eigenscan) |
| **ExtF** | `DOLPHIN_FEATURES/ext_features_latest` | HZ listener |
| **OBF** | `DOLPHIN_FEATURES/obf_features_latest` | HZ listener |
| **Capital** | `DOLPHIN_STATE_BLUE`, `DOLPHIN_SAFETY` | HZ listener |
| **Prefect** | Prefect SDK | 60s poll |
| **ACB** | `DOLPHIN_FEATURES/acb_state_latest` | HZ listener |
| **MC-Forewarner** | `DOLPHIN_FEATURES/mc_forewarner_latest` | HZ listener (shows "awaiting HZ data" until the first write) |
| **Test Footer** | `run_logs/test_results_latest.json` | File read on mount + `t` toggle |

### 28.3 HZ Maps Listened

```
DOLPHIN_FEATURES:    latest_eigen_scan, ext_features_latest, obf_features_latest,
                     acb_state_latest, mc_forewarner_latest
DOLPHIN_META_HEALTH: latest
DOLPHIN_SAFETY:      latest
DOLPHIN_STATE_BLUE:  latest
DOLPHIN_HEARTBEAT:   latest
```

### 28.4 Test Results Footer

The footer reads `run_logs/test_results_latest.json` under the project root, i.e. `/mnt/dolphinng5_predict/run_logs/test_results_latest.json`.

**Schema**:
```json
{
  "_run_at": "2026-04-05T12:00:00",
  "data_integrity":  {"passed": 15, "total": 15, "status": "PASS"},
  "finance_fuzz":    {"passed": null, "total": null, "status": "N/A"},
  "signal_fill":     {"passed": null, "total": null, "status": "N/A"},
  "degradation":     {"passed": 12, "total": 12, "status": "PASS"},
  "actor":           {"passed": null, "total": null, "status": "N/A"}
}
```

**Write API** (exported from `dolphin_tui_v3.py`):
```python
from dolphin_tui_v3 import write_test_results

write_test_results({
    "data_integrity": {"passed": 15, "total": 15, "status": "PASS"},
    "finance_fuzz":   {"passed":  8, "total":  8, "status": "PASS"},
    # ... remaining categories
})
```

`write_test_results()` atomically writes `_run_at` (current UTC ISO timestamp) + the provided category dict. The TUI footer auto-refreshes on next mount or `t` keypress.
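
One common way to get an atomic write is to write a temporary file in the same directory and rename it over the target. The sketch below illustrates that pattern under the assumption that `write_test_results()` works along these lines; the function name `write_test_results_sketch` and the explicit `path` parameter are hypothetical.

```python
import json
import os
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def write_test_results_sketch(results: dict, path: Path) -> None:
    """Write results plus a _run_at stamp so readers never see a partial file."""
    payload = {
        "_run_at": datetime.now(timezone.utc).isoformat(timespec="seconds"),
        **results,
    }
    path.parent.mkdir(parents=True, exist_ok=True)
    # Temp file must live in the same directory so the rename stays atomic.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(payload, fh, indent=2)
        os.replace(tmp_name, path)       # atomic swap on the same filesystem
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```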

Full integration documentation: `prod/docs/TEST_REPORTING.md`.

### 28.5 NG7 / NG8 Dual Format Normalisation

`_eigen_from_scan(scan)` handles both live HZ formats:

```python
def _eigen_from_scan(scan):
    # NG7 nested: scan["result"]["multi_window_results"]["50"]["velocity"]
    # NG8 flat:   scan["multi_window_results"]["50"]["velocity"]
    result = scan.get("result", scan)
    mwr = result.get("multi_window_results", {})
    for w in (50, 150, 300, 750):
        row = mwr.get(w) or mwr.get(str(w)) or {}
        ...
```
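
To make the normalisation concrete, the sketch below completes the loop for a single field. `eigen_velocities` is a hypothetical helper that returns only the per-window `velocity`; the real `_eigen_from_scan()` presumably extracts more than that.

```python
def eigen_velocities(scan: dict) -> dict:
    """Return {window: velocity} for both NG7 nested and NG8 flat payloads."""
    # NG7 wraps everything in "result"; NG8 puts it at the top level.
    result = scan.get("result", scan)
    mwr = result.get("multi_window_results", {})
    velocities = {}
    for w in (50, 150, 300, 750):
        # JSON round-trips turn int keys into strings, so try both.
        row = mwr.get(w) or mwr.get(str(w)) or {}
        velocities[w] = row.get("velocity")
    return velocities


# Illustrative payloads, not captured from a live cluster.
ng7_scan = {"result": {"multi_window_results": {"50": {"velocity": 0.1}}}}
ng8_scan = {"multi_window_results": {50: {"velocity": 0.1}}}
```

Windows missing from the payload map to `None` rather than raising, so a partially populated scan still renders.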

### 28.6 MC-Forewarner Integration

**Status: DEPLOYED AND RUNNING** — `prod/mc_forewarner_flow.py`, Prefect schedule `0 */4 * * *` (every 4 hours UTC).

MC-Forewarner writes to `DOLPHIN_FEATURES` key `mc_forewarner_latest`. The TUI entry listener fires on each write and populates the full MC footer panel: `catastrophic_prob` Digits + ProgressBar, `envelope_score` bar, prob sparkline history, `source` label (`REAL_MODEL` / `FALLBACK_NO_DATA` / `FALLBACK_ERROR`).

If the TUI starts between 4-hour runs and HZ has never been written to (e.g., fresh HZ instance), the footer shows `"awaiting HZ data (runs every 4h via Prefect)"` in yellow. This is a cold-start state only — once the first Prefect run completes the key persists in HZ indefinitely (no TTL).

**MC payload schema**:
```json
{
  "status":            "GREEN | ORANGE | RED",
  "catastrophic_prob": 0.07,
  "envelope_score":    0.91,
  "source":            "REAL_MODEL | FALLBACK_NO_DATA | FALLBACK_ERROR",
  "timestamp":         "2026-04-05T14:00:00+00:00"
}
```

**Thresholds**: GREEN `prob < 0.10` · ORANGE `0.10 ≤ prob < 0.30` · RED `prob ≥ 0.30`

**Models path**: `nautilus_dolphin/mc_results/models/*.pkl`. If no model files are present, the flow falls back to `FALLBACK_NO_DATA` (ORANGE, prob=0.20, env=0.80), a fixed conservative posture rather than a random value.
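
The threshold bands map to a status as sketched below. `mc_status` is a hypothetical name, and treating 0.10 as ORANGE and 0.30 as RED follows the `≥ 0.30` rule for RED; the fallback probability of 0.20 lands in ORANGE under the same rule.

```python
def mc_status(catastrophic_prob: float) -> str:
    """Classify a catastrophic probability into GREEN / ORANGE / RED."""
    if catastrophic_prob >= 0.30:
        return "RED"
    if catastrophic_prob >= 0.10:
        return "ORANGE"
    return "GREEN"
```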

### 28.7 Pending: DOLPHIN_PNL_BLUE

The Trader panel contains placeholder text `"read DOLPHIN_PNL_BLUE (not yet wired)"`. Open positions and session PnL data should be sourced from this map when Nautilus live trading is active.

---

*End of DOLPHIN-NAUTILUS System Bible v6.0 — 2026-04-05*
*Champion: SHORT only (APEX posture, blue configuration)*
*Process manager: Supervisord exclusively (systemd units retired).*
*MHS v3: Active, RM_META≈0.975 [GREEN], 10s critical recovery cooldown.*
*OBF Universe: 540 assets live, zero REST weight WS push streams.*
*NG8 Scanner: Built, smoke-tested. Awaiting NG7 Windows retirement before autostart.*
*TUI v3: Live event-driven observability. All HZ panels hot. MC-Forewarner footer live (4h Prefect cadence). Test footer CI-ready.*
*Test gates: 409+ tests green across all suites.*
*Do NOT deploy real capital until 30-day paper run is clean.*