Includes core prod + GREEN/BLUE subsystems: - prod/ (BLUE harness, configs, scripts, docs) - nautilus_dolphin/ (GREEN Nautilus-native impl + dvae/ preserved) - adaptive_exit/ (AEM engine + models/bucket_assignments.pkl) - Observability/ (EsoF advisor, TUI, dashboards) - external_factors/ (EsoF producer) - mc_forewarning_qlabs_fork/ (MC regime/envelope) Excludes runtime caches, logs, backups, and reproducible artifacts per .gitignore.
23 KiB
Executable File
Dolphin Data Layer Reference
Overview
The Dolphin system has a three-tier data architecture:
┌─────────────────────────────────────────────────────────────────────┐
│ LIVE HOT PATH (sub-second) │
│ Hazelcast 5.3 (RAM-only) — single source of truth for all services │
│ Port: 5701 | Cluster: dolphin │
└─────────────────────────────────────────────────────────────────────┘
│ ch_writer (async fire-and-forget)
▼
┌─────────────────────────────────────────────────────────────────────┐
│ WARM STORE (analytics, dashboards, recovery) │
│ ClickHouse 24.3 (MergeTree) — structured historical data │
│ Port: 8123 (HTTP) / 9000 (TCP) | DB: dolphin, dolphin_green │
└─────────────────────────────────────────────────────────────────────┘
│ periodic dumps / cache builds
▼
┌─────────────────────────────────────────────────────────────────────┐
│ COLD STORE (backtesting, training, research) │
│ Parquet / Arrow / JSON files on disk under /mnt/ │
└─────────────────────────────────────────────────────────────────────┘
1. ClickHouse
Connection
| Param | Value |
|---|---|
| URL | http://localhost:8123 |
| User | dolphin |
| Password | dolphin_ch_2026 |
| DB (blue) | dolphin |
| DB (green) | dolphin_green |
| Auth headers | X-ClickHouse-User / X-ClickHouse-Key |
Quick Query
# CLI (from host)
curl -s "http://localhost:8123/?database=dolphin" \
-H "X-ClickHouse-User: dolphin" \
-H "X-ClickHouse-Key: dolphin_ch_2026" \
-d "SELECT count() FROM trade_events FORMAT JSON"
# Python (urllib)
import urllib.request, json, base64
def ch_query(sql):
"""Execute ClickHouse query, return parsed JSON result."""
url = "http://localhost:8123/"
req = urllib.request.Request(url, data=(sql + "\nFORMAT JSON").encode())
req.add_header("X-ClickHouse-User", "dolphin")
req.add_header("X-ClickHouse-Key", "dolphin_ch_2026")
resp = urllib.request.urlopen(req, timeout=10)
return json.loads(resp.read().decode())
result = ch_query("SELECT * FROM dolphin.trade_events ORDER BY ts DESC LIMIT 10")
for row in result["data"]:
print(row)
Insert Pattern
# Async fire-and-forget (production — from ch_writer.py)
from ch_writer import ch_put, ts_us
ch_put("trade_events", {
"ts": ts_us(), # DateTime64(6) microsecond precision
"date": "2026-04-15",
"strategy": "blue",
"asset": "BTCUSDT",
"side": "SHORT",
"entry_price": 84500.0,
"exit_price": 84200.0,
"quantity": 0.01,
"pnl": 3.0,
"pnl_pct": 0.00355,
"exit_reason": "FIXED_TP",
"vel_div_entry": -0.0319,
"leverage": 9.0,
"bars_held": 45,
})
# Direct insert (for one-off scripts)
import urllib.request, json
body = (json.dumps(row) + "\n").encode()
url = "http://localhost:8123/?database=dolphin&query=INSERT+INTO+trade_events+FORMAT+JSONEachRow"
req = urllib.request.Request(url, data=body, method="POST")
req.add_header("X-ClickHouse-User", "dolphin")
req.add_header("X-ClickHouse-Key", "dolphin_ch_2026")
urllib.request.urlopen(req, timeout=5)
dolphin Database — Tables
trade_events — Closed trades (471 rows, 2026-03-31 → ongoing)
| Column | Type | Description |
|---|---|---|
ts |
DateTime64(6, UTC) | Trade close timestamp (microsecond) |
date |
Date | Trade date (partition key) |
strategy |
LowCardinality(String) | "blue" or "green" |
asset |
LowCardinality(String) | e.g. "ENJUSDT", "LTCUSDT" |
side |
LowCardinality(String) | "SHORT" (always in champion) |
entry_price |
Float64 | Entry price |
exit_price |
Float64 | Exit price |
quantity |
Float64 | Position size in asset units |
pnl |
Float64 | Profit/loss in USDT |
pnl_pct |
Float32 | PnL as fraction of notional |
exit_reason |
LowCardinality(String) | See exit reasons below |
vel_div_entry |
Float32 | Velocity divergence at entry |
boost_at_entry |
Float32 | ACB boost at entry time |
beta_at_entry |
Float32 | ACB beta at entry time |
posture |
LowCardinality(String) | System posture at entry |
leverage |
Float32 | Applied leverage |
regime_signal |
Int8 | Regime classification |
capital_before |
Float64 | Capital before trade |
capital_after |
Float64 | Capital after trade |
peak_capital |
Float64 | Peak capital at time |
drawdown_at_entry |
Float32 | Drawdown pct at entry |
open_positions_count |
UInt8 | Open positions (always 0 or 1) |
scan_uuid |
String | UUIDv7 trace ID |
bars_held |
UInt16 | Number of bars held |
Engine: MergeTree | Partition: toYYYYMM(ts) | Order: (ts, asset)
Exit reasons observed in production:
| Exit Reason | Meaning |
|---|---|
MAX_HOLD |
Held for max_hold_bars (125) without TP or stop hit |
FIXED_TP |
Take-profit target (95bps) reached |
HIBERNATE_HALT |
Posture changed to HIBERNATE, position closed |
SUBDAY_ACB_NORMALIZATION |
ACB day-reset forced position close |
eigen_scans — Processed eigenscans (68k rows, ~11s cadence)
| Column | Type | Description |
|---|---|---|
ts |
DateTime64(6, UTC) | Scan timestamp |
scan_number |
UInt32 | Monotonic scan counter |
vel_div |
Float32 | Velocity divergence (v50 - v750) |
w50_velocity |
Float32 | 50-window correlation velocity |
w750_velocity |
Float32 | 750-window correlation velocity |
instability_50 |
Float32 | Instability measure |
scan_to_fill_ms |
Float32 | Latency: scan → fill |
step_bar_ms |
Float32 | Latency: step_bar computation |
scan_uuid |
String | UUIDv7 trace ID |
Engine: MergeTree | Partition: toYYYYMM(ts) | Order: (ts, scan_number) | TTL: 10 years
status_snapshots — System status (686k rows, ~10s cadence, TTL 180 days)
| Column | Type | Description |
|---|---|---|
ts |
DateTime64(3, UTC) | Snapshot timestamp |
capital |
Float64 | Current capital |
roi_pct |
Float32 | Return on investment % |
dd_pct |
Float32 | Current drawdown % |
trades_executed |
UInt16 | Total trades count |
posture |
LowCardinality(String) | APEX / TURTLE / HIBERNATE |
rm |
Float32 | Risk metric |
vel_div |
Float32 | Latest velocity divergence |
vol_ok |
UInt8 | Volatility within bounds (0/1) |
phase |
LowCardinality(String) | Trading phase |
mhs_status |
LowCardinality(String) | Meta health status |
boost |
Float32 | ACB boost |
cat5 |
Float32 | Category 5 risk metric |
Engine: MergeTree | Order: ts | TTL: 180 days
posture_events — Posture transitions (92 rows)
| Column | Type | Description |
|---|---|---|
ts |
DateTime64(6, UTC) | Transition timestamp |
posture |
LowCardinality(String) | New posture (APEX/TURTLE/HIBERNATE) |
rm |
Float32 | Risk metric at transition |
prev_posture |
LowCardinality(String) | Previous posture |
trigger |
String | JSON with Cat1-Cat4 values that triggered transition |
scan_uuid |
String | UUIDv7 trace ID |
Postures (ordered by risk): APEX (full risk) → TURTLE (reduced) → HIBERNATE (minimal)
acb_state — Adaptive Circuit Breaker (26k rows, ~30s cadence)
| Column | Type | Description |
|---|---|---|
ts |
DateTime64(6, UTC) | Timestamp |
boost |
Float32 | Leverage boost multiplier (≥1.0) |
beta |
Float32 | Risk scaling factor |
signals |
Float32 | Signal quality metric |
meta_health — Meta Health Service v3 (78k rows, ~10s cadence)
| Column | Type | Description |
|---|---|---|
ts |
DateTime64(6, UTC) | Timestamp |
status |
LowCardinality(String) | GREEN / YELLOW / RED |
rm_meta |
Float32 | Aggregate health score (0–1) |
m1_data_infra |
Float32 | Data infrastructure health |
m1_trader |
Float32 | Trader process health |
m2_heartbeat |
Float32 | Heartbeat freshness |
m3_data_freshness |
Float32 | Scan data freshness |
m4_control_plane |
Float32 | Control plane (HZ/Prefect) health |
m5_coherence |
Float32 | State coherence across services |
m6_test_integrity |
Float32 | Test gate pass status |
service_status |
String | JSON service states |
hz_key_status |
String | HZ key freshness |
exf_data — External Factors (1.56M rows, ~0.5s cadence)
| Column | Type | Description |
|---|---|---|
ts |
DateTime64(6, UTC) | Timestamp |
funding_rate |
Float32 | BTC funding rate |
dvol |
Float32 | Deribit DVOL (implied volatility) |
fear_greed |
Float32 | Fear & Greed index |
taker_ratio |
Float32 | Taker buy/sell ratio |
obf_universe — Order Book Features (821M rows, ~500ms cadence, 542 symbols)
| Column | Type | Description |
|---|---|---|
ts |
DateTime64(3, UTC) | Timestamp (millisecond) |
symbol |
LowCardinality(String) | Trading pair |
spread_bps |
Float32 | Bid-ask spread in basis points |
depth_1pct_usd |
Float64 | USD depth at 1% from mid |
depth_quality |
Float32 | Book quality metric |
fill_probability |
Float32 | Estimated fill probability |
imbalance |
Float32 | Bid/ask imbalance |
best_bid |
Float64 | Best bid price |
best_ask |
Float64 | Best ask price |
n_bid_levels |
UInt8 | Number of bid levels |
n_ask_levels |
UInt8 | Number of ask levels |
Engine: MergeTree | Partition: toYYYYMM(ts) | Order: (symbol, ts)
supervisord_state — Process manager state (138k rows)
| Column | Type | Description |
|---|---|---|
ts |
DateTime64(6, UTC) | Timestamp |
process_name |
LowCardinality(String) | Service name |
group_name |
LowCardinality(String) | dolphin_data or dolphin |
state |
LowCardinality(String) | RUNNING / STOPPED / EXITED |
pid |
UInt32 | Process ID |
uptime_s |
UInt32 | Uptime in seconds |
exit_status |
Int16 | Exit code |
source |
LowCardinality(String) | Source of state change |
service_lifecycle — Service start/stop events (62 rows)
| Column | Type | Description |
|---|---|---|
ts |
DateTime64(6, UTC) | Timestamp |
service |
LowCardinality(String) | Service name |
event |
LowCardinality(String) | START / EXIT |
reason |
String | NORMAL_START / SIGTERM / NORMAL_EXIT |
exit_code |
Int16 | Exit code |
signal_num |
Int16 | Signal number |
pid |
UInt32 | Process ID |
system_stats — Host system metrics (35k rows)
| Column | Type | Description |
|---|---|---|
ts |
DateTime64(3, UTC) | Timestamp |
mem_used_gb |
Float32 | Memory used (GB) |
mem_available_gb |
Float32 | Memory available (GB) |
mem_pct |
Float32 | Memory usage % |
load_1m / load_5m / load_15m |
Float32 | Load averages |
net_rx_mb_s |
Float32 | Network receive (MB/s) |
net_tx_mb_s |
Float32 | Network transmit (MB/s) |
net_iface |
LowCardinality(String) | Network interface |
dolphin Database — Views
v_trade_summary_30d — 30-day rolling trade stats
SELECT * FROM dolphin.v_trade_summary_30d
-- Returns: strategy, n_trades, wins, win_rate_pct, total_pnl,
-- avg_pnl_pct, median_pnl_pct, max_dd_seen_pct
v_current_posture — Latest posture state
SELECT * FROM dolphin.v_current_posture
-- Returns: posture, rm, trigger, ts
v_process_health — Current process states
SELECT * FROM dolphin.v_process_health ORDER BY group_name, process_name
-- Returns: process_name, group_name, state, pid, uptime_s, last_seen
v_scan_latency_1h — Last hour scan latency percentiles
SELECT * FROM dolphin.v_scan_latency_1h
-- Returns: p50_ms, p95_ms, p99_ms, n_scans, window_start
v_system_stats_1h — Last hour system metrics (5-min buckets)
SELECT * FROM dolphin.v_system_stats_1h
-- Returns: bucket, mem_pct_avg, load_avg, net_rx_peak, net_tx_peak
v_scan_causal_chain — Trace scans → trades by scan_uuid
dolphin_green Database
Mirror of dolphin with tables: acb_state, account_events, daily_pnl, eigen_scans, exf_data, meta_health, obf_universe, posture_events, service_lifecycle, status_snapshots, supervisord_state, system_stats, trade_events (325 rows, 2026-04-12 → ongoing).
Useful Queries
-- Last 20 trades with key metrics
SELECT ts, asset, side, entry_price, exit_price, pnl, pnl_pct,
exit_reason, leverage, vel_div_entry, bars_held, posture
FROM dolphin.trade_events ORDER BY ts DESC LIMIT 20;
-- Today's P&L summary
SELECT count() as trades, sum(pnl) as total_pnl,
countIf(pnl>0) as wins, round(countIf(pnl>0)/count()*100,1) as win_rate
FROM dolphin.trade_events WHERE date = today();
-- Exit reason distribution
SELECT exit_reason, count() as n, round(sum(pnl),2) as total_pnl,
round(countIf(pnl>0)/count()*100,1) as win_rate
FROM dolphin.trade_events GROUP BY exit_reason ORDER BY n DESC;
-- Per-asset performance
SELECT asset, count() as trades, round(sum(pnl),2) as pnl,
round(countIf(pnl>0)/count()*100,1) as wr, round(avg(leverage),1) as avg_lev
FROM dolphin.trade_events GROUP BY asset ORDER BY pnl DESC;
-- Capital curve (from status snapshots)
SELECT ts, capital, roi_pct, dd_pct, posture, vel_div
FROM dolphin.status_snapshots
WHERE ts >= today() - INTERVAL 7 DAY
ORDER BY ts;
-- Scan-to-trade latency distribution
SELECT quantile(0.5)(scan_to_fill_ms) as p50,
quantile(0.95)(scan_to_fill_ms) as p95,
quantile(0.99)(scan_to_fill_ms) as p99
FROM dolphin.eigen_scans WHERE ts >= now() - INTERVAL 1 HOUR;
-- Leverage distribution
SELECT round(leverage,1) as lev, count() as n
FROM dolphin.trade_events GROUP BY lev ORDER BY lev;
-- Scan rate per hour
SELECT toStartOfHour(ts) as hour, count() as scans
FROM dolphin.eigen_scans
WHERE ts >= now() - INTERVAL 24 HOUR
GROUP BY hour ORDER BY hour;
2. Hazelcast
Connection
| Param | Value |
|---|---|
| Host | localhost:5701 |
| Cluster | dolphin |
| Python client | hazelcast-python-client 5.x |
| Management UI | http://localhost:8080 |
import hazelcast
client = hazelcast.HazelcastClient(
cluster_name="dolphin",
cluster_members=["localhost:5701"],
connection_timeout=5.0,
)
WARNING: Hazelcast is RAM-only. Never restart the container — all state is lost on restart.
IMap Reference
DOLPHIN_FEATURES (547 entries) — Central data bus
| Key | Type | Writer | Description |
|---|---|---|---|
latest_eigen_scan |
JSON string | scan_bridge | Latest eigenvalue scan with scan_number, vel_div, regime, asset data |
exf_latest |
JSON string | exf_fetcher | External factors: funding rates, OI, L/S ratio, taker, basis, etc. |
acb_boost |
JSON string | acb_processor | ACB boost/beta/signals with CP lock |
mc_forewarner_latest |
JSON string | mc_forewarning | Monte Carlo risk envelope status |
asset_<SYMBOL>_ob |
JSON string | obf_universe | Per-asset order book snapshot |
latest_eigen_scan structure:
{
"scan_number": 134276,
"timestamp": 1776269558.4,
"file_mtime": 1776269558.4,
"result": {
"type": "scan",
"timestamp": "2026-04-15 18:12:33",
"asset": "BTCUSDT",
"regime": "BEAR",
"bull_pct": 42.86,
"bear_pct": 57.14,
"sentiment": "BEARISH",
"total_symbols": 50,
"correlation_symbol": "...",
"vel_div": -0.0319,
"...": "..."
},
"target_asset": "BTCUSDT",
"version": "NG7",
"_ng7_metadata": { "scan_number": 134276, "uuid": "...", ... }
}
exf_latest structure:
{
"funding_btc": -5.085e-05,
"funding_btc_lagged": -5.085e-05,
"funding_eth": 1.648e-05,
"oi_btc": 97583.527,
"oi_eth": 2246343.627,
"ls_btc": 0.8218,
"ls_eth": 1.4067,
"taker": 0.5317,
"taker_lagged": 2.1506,
"basis": -0.07784355,
"imbalance_btc": -0.786,
"dvol": 52.34,
"fear_greed": 45.0
}
asset_<SYMBOL>_ob structure:
{
"best_bid": 84500.0,
"best_ask": 84501.0,
"spread_bps": 1.18,
"depth_1pct_usd": 50000.0,
"depth_quality": 0.85,
"fill_probability": 0.95,
"imbalance": 0.03,
"n_bid_levels": 5,
"n_ask_levels": 5
}
DOLPHIN_STATE_BLUE (2 entries) — Blue strategy runtime state
| Key | Description |
|---|---|
capital_checkpoint |
{"capital": 25705.50, "ts": 1776269557.97} |
engine_snapshot |
Full engine state (see below) |
engine_snapshot structure:
{
"capital": 25705.50,
"open_positions": [],
"algo_version": "v2_gold_fix_v50-v750",
"last_scan_number": 134276,
"last_vel_div": 0.0201,
"vol_ok": true,
"posture": "APEX",
"scans_processed": 6377,
"trades_executed": 71,
"bar_idx": 4655,
"timestamp": "2026-04-15T16:12:37",
"leverage_soft_cap": 8.0,
"leverage_abs_cap": 9.0,
"open_notional": 0.0,
"current_leverage": 0.0
}
DOLPHIN_PNL_BLUE (3 entries) — Daily P&L
Keyed by date string: "2026-04-15" → {"portfolio_capital": 20654.01, "engine_capital": 20654.01}
DOLPHIN_STATE_GREEN (1 entry) — Green strategy state
Same structure as blue: capital_checkpoint.
DOLPHIN_META_HEALTH (1 entry)
Key: "latest" → {"rm_meta": 0.94, "status": "GREEN", "m1_data_infra": 1.0, "m1_trader": 1.0, ...}
Read/Write Patterns
# Read from HZ
features = client.get_map("DOLPHIN_FEATURES").blocking()
scan = json.loads(features.get("latest_eigen_scan"))
# Write to HZ
features.put("exf_latest", json.dumps(payload))
# Atomic update with CP lock (used by ACB)
lock = client.cp_subsystem.get_lock("acb_update_lock").blocking()
lock.lock()
try:
features.put("acb_boost", json.dumps(acb_data))
finally:
lock.unlock()
# HZ warmup after restart (reconstruct from ClickHouse)
from hz_warmup import _ch_query
latest = _ch_query("SELECT * FROM dolphin.acb_state ORDER BY ts DESC LIMIT 1")
features.put("acb_boost", json.dumps(latest[0]))
3. File-Based Data
Parquet (VBT Cache)
Location: /mnt/dolphinng5_predict/vbt_cache_synth_15M/
Daily parquet files (YYYY-MM-DD.parquet) containing scan data with columns: vel_div, v50_vel, v150_vel, v750_vel, asset prices, BTC price, and derived features. Used by CI test fixtures and backtesting.
Read:
import pandas as pd
df = pd.read_parquet("/mnt/dolphinng5_predict/vbt_cache_synth_15M/2026-02-25.parquet")
Arrow Scans (Live Pipeline)
Location: /mnt/dolphinng6_data/arrow_scans/<date>/*.arrow
PyArrow IPC files written by NG8 scanner. Each file = one eigenscan. Consumed by scan_bridge_service → pushed to Hazelcast.
Eigenvalue JSON
Location: /mnt/dolphinng6_data/eigenvalues/<date>/*.json
Per-scan JSON files with eigenvalue data: scan_number, eigenvalues array, regime, bull/bear percentages.
Correlation Matrices
Location: /mnt/dolphinng6_data/matrices/<date>/
ZST-compressed 50×50 correlation matrices: scan_NNNNNN_wWWW_HHMMSS.arb512.pkl.zst
Session Logs
Location: /mnt/dolphinng5_predict/session_logs/
Trade session logs: session_YYYYMMDD_HHMMSS.jsonl (JSON Lines) and session_YYYYMMDD.md (human-readable).
Run Logs
Location: /mnt/dolphinng5_predict/run_logs/
Engine run summaries and backtest parity logs. Key file: run_logs/summary_*.json.
4. Data Flow
┌─────────────┐
│ NG8 Scanner │ (Linux, /mnt/dolphinng6_data/)
└──────┬──────┘
│ writes .arrow files
▼
┌─────────────┐
│ Scan Bridge │ (supervisord, dolphin group)
└──────┬──────┘
│ HZ put("latest_eigen_scan")
▼
┌──── DOLPHIN_FEATURES (HZ) ────┐
│ │
┌─────────▼──────────┐ ┌─────────▼──────────┐
│ nautilus_event_ │ │ clean_arch/ │
│ trader (prod path) │ │ main.py (alt path) │
└─────────┬──────────┘ └────────────────────┘
│ NDAlphaEngine.step_bar()
│
┌─────────▼──────────┐
│ ch_put("trade_ │ ← async fire-and-forget
│ events", {...}) │
└─────────┬──────────┘
│
▼
┌──────────────────────┐
│ ClickHouse (dolphin) │ ← queries, dashboards, HZ warmup
└──────────────────────┘
Parallel writers to HZ:
exf_fetcher → "exf_latest"
acb_processor → "acb_boost" (with CP lock)
obf_universe → "asset_*_ob"
meta_health → DOLPHIN_META_HEALTH["latest"]
mc_forewarning → "mc_forewarner_latest"
nautilus_trader→ DOLPHIN_STATE_BLUE["engine_snapshot"]
DOLPHIN_PNL_BLUE[date_str]
5. Current System State (Live Snapshot)
| Metric | Value |
|---|---|
| Blue capital | $25,705.50 |
| Blue ROI | +5.92% (from $25,000) |
| Blue trades today | 71 total |
| Posture | APEX |
| MHS status | GREEN (rm_meta=0.94) |
| ACB boost | 1.4581 / beta=0.80 |
| Latest scan | #134276 |
| Latest vel_div | +0.0201 |
| Scan cadence | ~11s |
| Scan→fill latency | ~10–27ms |
| Process health | All RUNNING (uptime ~22h) |
Supervisord Groups
| Group | Services | Autostart |
|---|---|---|
dolphin_data |
exf_fetcher, acb_processor, obf_universe, meta_health, system_stats | Yes |
dolphin |
nautilus_trader, scan_bridge, clean_arch_trader, paper_portfolio, dolphin_live | No (manual) |