# Dolphin Data Layer Reference ## Overview The Dolphin system has a three-tier data architecture: ``` ┌─────────────────────────────────────────────────────────────────────┐ │ LIVE HOT PATH (sub-second) │ │ Hazelcast 5.3 (RAM-only) — single source of truth for all services │ │ Port: 5701 | Cluster: dolphin │ └─────────────────────────────────────────────────────────────────────┘ │ ch_writer (async fire-and-forget) ▼ ┌─────────────────────────────────────────────────────────────────────┐ │ WARM STORE (analytics, dashboards, recovery) │ │ ClickHouse 24.3 (MergeTree) — structured historical data │ │ Port: 8123 (HTTP) / 9000 (TCP) | DB: dolphin, dolphin_green │ └─────────────────────────────────────────────────────────────────────┘ │ periodic dumps / cache builds ▼ ┌─────────────────────────────────────────────────────────────────────┐ │ COLD STORE (backtesting, training, research) │ │ Parquet / Arrow / JSON files on disk under /mnt/ │ └─────────────────────────────────────────────────────────────────────┘ ``` --- ## 1. ClickHouse ### Connection | Param | Value | |-------|-------| | URL | `http://localhost:8123` | | User | `dolphin` | | Password | `dolphin_ch_2026` | | DB (blue) | `dolphin` | | DB (green) | `dolphin_green` | | Auth headers | `X-ClickHouse-User` / `X-ClickHouse-Key` | ### Quick Query ```bash # CLI (from host) curl -s "http://localhost:8123/?database=dolphin" \ -H "X-ClickHouse-User: dolphin" \ -H "X-ClickHouse-Key: dolphin_ch_2026" \ -d "SELECT count() FROM trade_events FORMAT JSON" # Python (urllib) import urllib.request, json, base64 def ch_query(sql): """Execute ClickHouse query, return parsed JSON result.""" url = "http://localhost:8123/" req = urllib.request.Request(url, data=(sql + "\nFORMAT JSON").encode()) req.add_header("X-ClickHouse-User", "dolphin") req.add_header("X-ClickHouse-Key", "dolphin_ch_2026") resp = urllib.request.urlopen(req, timeout=10) return json.loads(resp.read().decode()) result = ch_query("SELECT * FROM dolphin.trade_events ORDER BY ts DESC LIMIT 10") for row in result["data"]: print(row) ``` ### Insert Pattern ```python # Async fire-and-forget (production — from ch_writer.py) from ch_writer import ch_put, ts_us ch_put("trade_events", { "ts": ts_us(), # DateTime64(6) microsecond precision "date": "2026-04-15", "strategy": "blue", "asset": "BTCUSDT", "side": "SHORT", "entry_price": 84500.0, "exit_price": 84200.0, "quantity": 0.01, "pnl": 3.0, "pnl_pct": 0.00355, "exit_reason": "FIXED_TP", "vel_div_entry": -0.0319, "leverage": 9.0, "bars_held": 45, }) # Direct insert (for one-off scripts) import urllib.request, json body = (json.dumps(row) + "\n").encode() url = "http://localhost:8123/?database=dolphin&query=INSERT+INTO+trade_events+FORMAT+JSONEachRow" req = urllib.request.Request(url, data=body, method="POST") req.add_header("X-ClickHouse-User", "dolphin") req.add_header("X-ClickHouse-Key", "dolphin_ch_2026") urllib.request.urlopen(req, timeout=5) ``` ### `dolphin` Database — Tables **`trade_events`** — Closed trades (471 rows, 2026-03-31 → ongoing) | Column | Type | Description | |--------|------|-------------| | `ts` | DateTime64(6, UTC) | Trade close timestamp (microsecond) | | `date` | Date | Trade date (partition key) | | `strategy` | LowCardinality(String) | "blue" or "green" | | `asset` | LowCardinality(String) | e.g. "ENJUSDT", "LTCUSDT" | | `side` | LowCardinality(String) | "SHORT" (always in champion) | | `entry_price` | Float64 | Entry price | | `exit_price` | Float64 | Exit price | | `quantity` | Float64 | Position size in asset units | | `pnl` | Float64 | Profit/loss in USDT | | `pnl_pct` | Float32 | PnL as fraction of notional | | `exit_reason` | LowCardinality(String) | See exit reasons below | | `vel_div_entry` | Float32 | Velocity divergence at entry | | `boost_at_entry` | Float32 | ACB boost at entry time | | `beta_at_entry` | Float32 | ACB beta at entry time | | `posture` | LowCardinality(String) | System posture at entry | | `leverage` | Float32 | Applied leverage | | `regime_signal` | Int8 | Regime classification | | `capital_before` | Float64 | Capital before trade | | `capital_after` | Float64 | Capital after trade | | `peak_capital` | Float64 | Peak capital at time | | `drawdown_at_entry` | Float32 | Drawdown pct at entry | | `open_positions_count` | UInt8 | Open positions (always 0 or 1) | | `scan_uuid` | String | UUIDv7 trace ID | | `bars_held` | UInt16 | Number of bars held | Engine: MergeTree | Partition: `toYYYYMM(ts)` | Order: `(ts, asset)` **Exit reasons observed in production**: | Exit Reason | Meaning | |-------------|---------| | `MAX_HOLD` | Held for max_hold_bars (125) without TP or stop hit | | `FIXED_TP` | Take-profit target (95bps) reached | | `HIBERNATE_HALT` | Posture changed to HIBERNATE, position closed | | `SUBDAY_ACB_NORMALIZATION` | ACB day-reset forced position close | **`eigen_scans`** — Processed eigenscans (68k rows, ~11s cadence) | Column | Type | Description | |--------|------|-------------| | `ts` | DateTime64(6, UTC) | Scan timestamp | | `scan_number` | UInt32 | Monotonic scan counter | | `vel_div` | Float32 | Velocity divergence (v50 - v750) | | `w50_velocity` | Float32 | 50-window correlation velocity | | `w750_velocity` | Float32 | 750-window correlation velocity | | `instability_50` | Float32 | Instability measure | | `scan_to_fill_ms` | Float32 | Latency: scan → fill | | `step_bar_ms` | Float32 | Latency: step_bar computation | | `scan_uuid` | String | UUIDv7 trace ID | Engine: MergeTree | Partition: `toYYYYMM(ts)` | Order: `(ts, scan_number)` | TTL: 10 years **`status_snapshots`** — System status (686k rows, ~10s cadence, TTL 180 days) | Column | Type | Description | |--------|------|-------------| | `ts` | DateTime64(3, UTC) | Snapshot timestamp | | `capital` | Float64 | Current capital | | `roi_pct` | Float32 | Return on investment % | | `dd_pct` | Float32 | Current drawdown % | | `trades_executed` | UInt16 | Total trades count | | `posture` | LowCardinality(String) | APEX / TURTLE / HIBERNATE | | `rm` | Float32 | Risk metric | | `vel_div` | Float32 | Latest velocity divergence | | `vol_ok` | UInt8 | Volatility within bounds (0/1) | | `phase` | LowCardinality(String) | Trading phase | | `mhs_status` | LowCardinality(String) | Meta health status | | `boost` | Float32 | ACB boost | | `cat5` | Float32 | Category 5 risk metric | Engine: MergeTree | Order: `ts` | TTL: 180 days **`posture_events`** — Posture transitions (92 rows) | Column | Type | Description | |--------|------|-------------| | `ts` | DateTime64(6, UTC) | Transition timestamp | | `posture` | LowCardinality(String) | New posture (APEX/TURTLE/HIBERNATE) | | `rm` | Float32 | Risk metric at transition | | `prev_posture` | LowCardinality(String) | Previous posture | | `trigger` | String | JSON with Cat1-Cat4 values that triggered transition | | `scan_uuid` | String | UUIDv7 trace ID | Postures (ordered by risk): `APEX` (full risk) → `TURTLE` (reduced) → `HIBERNATE` (minimal) **`acb_state`** — Adaptive Circuit Breaker (26k rows, ~30s cadence) | Column | Type | Description | |--------|------|-------------| | `ts` | DateTime64(6, UTC) | Timestamp | | `boost` | Float32 | Leverage boost multiplier (≥1.0) | | `beta` | Float32 | Risk scaling factor | | `signals` | Float32 | Signal quality metric | **`meta_health`** — Meta Health Service v3 (78k rows, ~10s cadence) | Column | Type | Description | |--------|------|-------------| | `ts` | DateTime64(6, UTC) | Timestamp | | `status` | LowCardinality(String) | GREEN / YELLOW / RED | | `rm_meta` | Float32 | Aggregate health score (0–1) | | `m1_data_infra` | Float32 | Data infrastructure health | | `m1_trader` | Float32 | Trader process health | | `m2_heartbeat` | Float32 | Heartbeat freshness | | `m3_data_freshness` | Float32 | Scan data freshness | | `m4_control_plane` | Float32 | Control plane (HZ/Prefect) health | | `m5_coherence` | Float32 | State coherence across services | | `m6_test_integrity` | Float32 | Test gate pass status | | `service_status` | String | JSON service states | | `hz_key_status` | String | HZ key freshness | **`exf_data`** — External Factors (1.56M rows, ~0.5s cadence) | Column | Type | Description | |--------|------|-------------| | `ts` | DateTime64(6, UTC) | Timestamp | | `funding_rate` | Float32 | BTC funding rate | | `dvol` | Float32 | Deribit DVOL (implied volatility) | | `fear_greed` | Float32 | Fear & Greed index | | `taker_ratio` | Float32 | Taker buy/sell ratio | **`obf_universe`** — Order Book Features (821M rows, ~500ms cadence, 542 symbols) | Column | Type | Description | |--------|------|-------------| | `ts` | DateTime64(3, UTC) | Timestamp (millisecond) | | `symbol` | LowCardinality(String) | Trading pair | | `spread_bps` | Float32 | Bid-ask spread in basis points | | `depth_1pct_usd` | Float64 | USD depth at 1% from mid | | `depth_quality` | Float32 | Book quality metric | | `fill_probability` | Float32 | Estimated fill probability | | `imbalance` | Float32 | Bid/ask imbalance | | `best_bid` | Float64 | Best bid price | | `best_ask` | Float64 | Best ask price | | `n_bid_levels` | UInt8 | Number of bid levels | | `n_ask_levels` | UInt8 | Number of ask levels | Engine: MergeTree | Partition: `toYYYYMM(ts)` | Order: `(symbol, ts)` **`supervisord_state`** — Process manager state (138k rows) | Column | Type | Description | |--------|------|-------------| | `ts` | DateTime64(6, UTC) | Timestamp | | `process_name` | LowCardinality(String) | Service name | | `group_name` | LowCardinality(String) | `dolphin_data` or `dolphin` | | `state` | LowCardinality(String) | RUNNING / STOPPED / EXITED | | `pid` | UInt32 | Process ID | | `uptime_s` | UInt32 | Uptime in seconds | | `exit_status` | Int16 | Exit code | | `source` | LowCardinality(String) | Source of state change | **`service_lifecycle`** — Service start/stop events (62 rows) | Column | Type | Description | |--------|------|-------------| | `ts` | DateTime64(6, UTC) | Timestamp | | `service` | LowCardinality(String) | Service name | | `event` | LowCardinality(String) | START / EXIT | | `reason` | String | NORMAL_START / SIGTERM / NORMAL_EXIT | | `exit_code` | Int16 | Exit code | | `signal_num` | Int16 | Signal number | | `pid` | UInt32 | Process ID | **`system_stats`** — Host system metrics (35k rows) | Column | Type | Description | |--------|------|-------------| | `ts` | DateTime64(3, UTC) | Timestamp | | `mem_used_gb` | Float32 | Memory used (GB) | | `mem_available_gb` | Float32 | Memory available (GB) | | `mem_pct` | Float32 | Memory usage % | | `load_1m` / `load_5m` / `load_15m` | Float32 | Load averages | | `net_rx_mb_s` | Float32 | Network receive (MB/s) | | `net_tx_mb_s` | Float32 | Network transmit (MB/s) | | `net_iface` | LowCardinality(String) | Network interface | ### `dolphin` Database — Views **`v_trade_summary_30d`** — 30-day rolling trade stats ```sql SELECT * FROM dolphin.v_trade_summary_30d -- Returns: strategy, n_trades, wins, win_rate_pct, total_pnl, -- avg_pnl_pct, median_pnl_pct, max_dd_seen_pct ``` **`v_current_posture`** — Latest posture state ```sql SELECT * FROM dolphin.v_current_posture -- Returns: posture, rm, trigger, ts ``` **`v_process_health`** — Current process states ```sql SELECT * FROM dolphin.v_process_health ORDER BY group_name, process_name -- Returns: process_name, group_name, state, pid, uptime_s, last_seen ``` **`v_scan_latency_1h`** — Last hour scan latency percentiles ```sql SELECT * FROM dolphin.v_scan_latency_1h -- Returns: p50_ms, p95_ms, p99_ms, n_scans, window_start ``` **`v_system_stats_1h`** — Last hour system metrics (5-min buckets) ```sql SELECT * FROM dolphin.v_system_stats_1h -- Returns: bucket, mem_pct_avg, load_avg, net_rx_peak, net_tx_peak ``` **`v_scan_causal_chain`** — Trace scans → trades by scan_uuid ### `dolphin_green` Database Mirror of `dolphin` with tables: `acb_state`, `account_events`, `daily_pnl`, `eigen_scans`, `exf_data`, `meta_health`, `obf_universe`, `posture_events`, `service_lifecycle`, `status_snapshots`, `supervisord_state`, `system_stats`, `trade_events` (325 rows, 2026-04-12 → ongoing). ### Useful Queries ```sql -- Last 20 trades with key metrics SELECT ts, asset, side, entry_price, exit_price, pnl, pnl_pct, exit_reason, leverage, vel_div_entry, bars_held, posture FROM dolphin.trade_events ORDER BY ts DESC LIMIT 20; -- Today's P&L summary SELECT count() as trades, sum(pnl) as total_pnl, countIf(pnl>0) as wins, round(countIf(pnl>0)/count()*100,1) as win_rate FROM dolphin.trade_events WHERE date = today(); -- Exit reason distribution SELECT exit_reason, count() as n, round(sum(pnl),2) as total_pnl, round(countIf(pnl>0)/count()*100,1) as win_rate FROM dolphin.trade_events GROUP BY exit_reason ORDER BY n DESC; -- Per-asset performance SELECT asset, count() as trades, round(sum(pnl),2) as pnl, round(countIf(pnl>0)/count()*100,1) as wr, round(avg(leverage),1) as avg_lev FROM dolphin.trade_events GROUP BY asset ORDER BY pnl DESC; -- Capital curve (from status snapshots) SELECT ts, capital, roi_pct, dd_pct, posture, vel_div FROM dolphin.status_snapshots WHERE ts >= today() - INTERVAL 7 DAY ORDER BY ts; -- Scan-to-trade latency distribution SELECT quantile(0.5)(scan_to_fill_ms) as p50, quantile(0.95)(scan_to_fill_ms) as p95, quantile(0.99)(scan_to_fill_ms) as p99 FROM dolphin.eigen_scans WHERE ts >= now() - INTERVAL 1 HOUR; -- Leverage distribution SELECT round(leverage,1) as lev, count() as n FROM dolphin.trade_events GROUP BY lev ORDER BY lev; -- Scan rate per hour SELECT toStartOfHour(ts) as hour, count() as scans FROM dolphin.eigen_scans WHERE ts >= now() - INTERVAL 24 HOUR GROUP BY hour ORDER BY hour; ``` --- ## 2. Hazelcast ### Connection | Param | Value | |-------|-------| | Host | `localhost:5701` | | Cluster | `dolphin` | | Python client | `hazelcast-python-client` 5.x | | Management UI | `http://localhost:8080` | ```python import hazelcast client = hazelcast.HazelcastClient( cluster_name="dolphin", cluster_members=["localhost:5701"], connection_timeout=5.0, ) ``` **WARNING**: Hazelcast is RAM-only. Never restart the container — all state is lost on restart. ### IMap Reference #### `DOLPHIN_FEATURES` (547 entries) — Central data bus | Key | Type | Writer | Description | |-----|------|--------|-------------| | `latest_eigen_scan` | JSON string | scan_bridge | Latest eigenvalue scan with scan_number, vel_div, regime, asset data | | `exf_latest` | JSON string | exf_fetcher | External factors: funding rates, OI, L/S ratio, taker, basis, etc. | | `acb_boost` | JSON string | acb_processor | ACB boost/beta/signals with CP lock | | `mc_forewarner_latest` | JSON string | mc_forewarning | Monte Carlo risk envelope status | | `asset__ob` | JSON string | obf_universe | Per-asset order book snapshot | **`latest_eigen_scan` structure**: ```json { "scan_number": 134276, "timestamp": 1776269558.4, "file_mtime": 1776269558.4, "result": { "type": "scan", "timestamp": "2026-04-15 18:12:33", "asset": "BTCUSDT", "regime": "BEAR", "bull_pct": 42.86, "bear_pct": 57.14, "sentiment": "BEARISH", "total_symbols": 50, "correlation_symbol": "...", "vel_div": -0.0319, "...": "..." }, "target_asset": "BTCUSDT", "version": "NG7", "_ng7_metadata": { "scan_number": 134276, "uuid": "...", ... } } ``` **`exf_latest` structure**: ```json { "funding_btc": -5.085e-05, "funding_btc_lagged": -5.085e-05, "funding_eth": 1.648e-05, "oi_btc": 97583.527, "oi_eth": 2246343.627, "ls_btc": 0.8218, "ls_eth": 1.4067, "taker": 0.5317, "taker_lagged": 2.1506, "basis": -0.07784355, "imbalance_btc": -0.786, "dvol": 52.34, "fear_greed": 45.0 } ``` **`asset__ob` structure**: ```json { "best_bid": 84500.0, "best_ask": 84501.0, "spread_bps": 1.18, "depth_1pct_usd": 50000.0, "depth_quality": 0.85, "fill_probability": 0.95, "imbalance": 0.03, "n_bid_levels": 5, "n_ask_levels": 5 } ``` #### `DOLPHIN_STATE_BLUE` (2 entries) — Blue strategy runtime state | Key | Description | |-----|-------------| | `capital_checkpoint` | `{"capital": 25705.50, "ts": 1776269557.97}` | | `engine_snapshot` | Full engine state (see below) | **`engine_snapshot` structure**: ```json { "capital": 25705.50, "open_positions": [], "algo_version": "v2_gold_fix_v50-v750", "last_scan_number": 134276, "last_vel_div": 0.0201, "vol_ok": true, "posture": "APEX", "scans_processed": 6377, "trades_executed": 71, "bar_idx": 4655, "timestamp": "2026-04-15T16:12:37", "leverage_soft_cap": 8.0, "leverage_abs_cap": 9.0, "open_notional": 0.0, "current_leverage": 0.0 } ``` #### `DOLPHIN_PNL_BLUE` (3 entries) — Daily P&L Keyed by date string: `"2026-04-15"` → `{"portfolio_capital": 20654.01, "engine_capital": 20654.01}` #### `DOLPHIN_STATE_GREEN` (1 entry) — Green strategy state Same structure as blue: `capital_checkpoint`. #### `DOLPHIN_META_HEALTH` (1 entry) Key: `"latest"` → `{"rm_meta": 0.94, "status": "GREEN", "m1_data_infra": 1.0, "m1_trader": 1.0, ...}` ### Read/Write Patterns ```python # Read from HZ features = client.get_map("DOLPHIN_FEATURES").blocking() scan = json.loads(features.get("latest_eigen_scan")) # Write to HZ features.put("exf_latest", json.dumps(payload)) # Atomic update with CP lock (used by ACB) lock = client.cp_subsystem.get_lock("acb_update_lock").blocking() lock.lock() try: features.put("acb_boost", json.dumps(acb_data)) finally: lock.unlock() # HZ warmup after restart (reconstruct from ClickHouse) from hz_warmup import _ch_query latest = _ch_query("SELECT * FROM dolphin.acb_state ORDER BY ts DESC LIMIT 1") features.put("acb_boost", json.dumps(latest[0])) ``` --- ## 3. File-Based Data ### Parquet (VBT Cache) **Location**: `/mnt/dolphinng5_predict/vbt_cache_synth_15M/` Daily parquet files (`YYYY-MM-DD.parquet`) containing scan data with columns: `vel_div`, `v50_vel`, `v150_vel`, `v750_vel`, asset prices, BTC price, and derived features. Used by CI test fixtures and backtesting. **Read**: ```python import pandas as pd df = pd.read_parquet("/mnt/dolphinng5_predict/vbt_cache_synth_15M/2026-02-25.parquet") ``` ### Arrow Scans (Live Pipeline) **Location**: `/mnt/dolphinng6_data/arrow_scans//*.arrow` PyArrow IPC files written by NG8 scanner. Each file = one eigenscan. Consumed by scan_bridge_service → pushed to Hazelcast. ### Eigenvalue JSON **Location**: `/mnt/dolphinng6_data/eigenvalues//*.json` Per-scan JSON files with eigenvalue data: scan_number, eigenvalues array, regime, bull/bear percentages. ### Correlation Matrices **Location**: `/mnt/dolphinng6_data/matrices//` ZST-compressed 50×50 correlation matrices: `scan_NNNNNN_wWWW_HHMMSS.arb512.pkl.zst` ### Session Logs **Location**: `/mnt/dolphinng5_predict/session_logs/` Trade session logs: `session_YYYYMMDD_HHMMSS.jsonl` (JSON Lines) and `session_YYYYMMDD.md` (human-readable). ### Run Logs **Location**: `/mnt/dolphinng5_predict/run_logs/` Engine run summaries and backtest parity logs. Key file: `run_logs/summary_*.json`. --- ## 4. Data Flow ``` ┌─────────────┐ │ NG8 Scanner │ (Linux, /mnt/dolphinng6_data/) └──────┬──────┘ │ writes .arrow files ▼ ┌─────────────┐ │ Scan Bridge │ (supervisord, dolphin group) └──────┬──────┘ │ HZ put("latest_eigen_scan") ▼ ┌──── DOLPHIN_FEATURES (HZ) ────┐ │ │ ┌─────────▼──────────┐ ┌─────────▼──────────┐ │ nautilus_event_ │ │ clean_arch/ │ │ trader (prod path) │ │ main.py (alt path) │ └─────────┬──────────┘ └────────────────────┘ │ NDAlphaEngine.step_bar() │ ┌─────────▼──────────┐ │ ch_put("trade_ │ ← async fire-and-forget │ events", {...}) │ └─────────┬──────────┘ │ ▼ ┌──────────────────────┐ │ ClickHouse (dolphin) │ ← queries, dashboards, HZ warmup └──────────────────────┘ Parallel writers to HZ: exf_fetcher → "exf_latest" acb_processor → "acb_boost" (with CP lock) obf_universe → "asset_*_ob" meta_health → DOLPHIN_META_HEALTH["latest"] mc_forewarning → "mc_forewarner_latest" nautilus_trader→ DOLPHIN_STATE_BLUE["engine_snapshot"] DOLPHIN_PNL_BLUE[date_str] ``` --- ## 5. Current System State (Live Snapshot) | Metric | Value | |--------|-------| | Blue capital | $25,705.50 | | Blue ROI | +5.92% (from $25,000) | | Blue trades today | 71 total | | Posture | APEX | | MHS status | GREEN (rm_meta=0.94) | | ACB boost | 1.4581 / beta=0.80 | | Latest scan | #134276 | | Latest vel_div | +0.0201 | | Scan cadence | ~11s | | Scan→fill latency | ~10–27ms | | Process health | All RUNNING (uptime ~22h) | ### Supervisord Groups | Group | Services | Autostart | |-------|----------|-----------| | `dolphin_data` | exf_fetcher, acb_processor, obf_universe, meta_health, system_stats | Yes | | `dolphin` | nautilus_trader, scan_bridge, clean_arch_trader, paper_portfolio, dolphin_live | No (manual) |