ClickHouse Observability Layer
Deployed: 2026-04-06
CH Version: 24.3-alpine
Ports: HTTP :8123, Native :9000
OTel Collector: OTLP gRPC :4317 / HTTP :4318
Play UI: http://100.105.170.6:8123/play
Architecture
Dolphin services → ch_put() → ch_writer.py (async batch) → dolphin-clickhouse:8123
NG7 laptop → ng_otel_writer.py (OTel SDK) → dolphin-otelcol:4317 → dolphin-clickhouse
/proc poller → system_stats_service.py → dolphin.system_stats
supervisord → supervisord_ch_listener.py (eventlistener) → dolphin.supervisord_state
All writes are fire-and-forget: ch_writer batches rows in a background thread and silently drops them when its queue is full, so the OBF hot loop (100ms) is never blocked.
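The fire-and-forget behaviour described above can be sketched as a bounded queue drained by a background thread. This is a minimal illustration, not the code in prod/ch_writer.py: the class name `FireAndForgetWriter`, the `flush` callback, the queue size, the batch size, and the `dropped` counter are all assumptions made for the example.

```python
import queue
import threading


class FireAndForgetWriter:
    """Hypothetical stand-in for ch_writer: bounded queue plus background flusher."""

    def __init__(self, flush, max_queue=10_000, batch_size=500, interval_s=1.0):
        self._flush = flush                      # callable(table, rows); assumed to do the insert
        self._q = queue.Queue(maxsize=max_queue)
        self._batch_size = batch_size
        self._interval_s = interval_s
        self.dropped = 0                         # rows discarded because the queue was full
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def put(self, table, row):
        """Never blocks the caller: enqueue the row, or drop it if the queue is full."""
        try:
            self._q.put_nowait((table, row))
        except queue.Full:
            self.dropped += 1

    def _flush_once(self):
        """Take up to batch_size rows, group them by table, flush; True if any row was taken."""
        batches = {}
        for _ in range(self._batch_size):
            try:
                table, row = self._q.get_nowait()
            except queue.Empty:
                break
            batches.setdefault(table, []).append(row)
        for table, rows in batches.items():
            try:
                self._flush(table, rows)
            except Exception:
                pass                             # a failed flush is not propagated to producers
        return bool(batches)

    def _run(self):
        # Wake every interval_s until close() is called, then drain what is left.
        while not self._stop.wait(self._interval_s):
            self._flush_once()
        while self._flush_once():
            pass

    def close(self):
        self._stop.set()
        self._thread.join()
```

With this shape, `put()` costs one non-blocking queue operation on the hot path, and any back-pressure from ClickHouse shows up as dropped rows rather than as latency in the caller.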
Tables
| Table | Source | Rate | Retention |
|---|---|---|---|
| eigen_scans | nautilus_event_trader | ~8/min | 10yr |
| posture_events | meta_health_service_v3 | few/day | forever |
| acb_state | acb_processor_service | ~5/day | forever |
| daily_pnl | paper_trade_flow | 1/day | forever |
| trade_events | DolphinActor (pending) | ~40/day | 10yr |
| obf_universe | obf_universe_service | 540/min | forever |
| obf_fast_intrade | DolphinActor (pending) | 100ms×assets | 5yr |
| exf_data | exf_fetcher_flow | ~1/min | forever |
| meta_health | meta_health_service_v3 | ~1/10s | forever |
| account_events | DolphinActor (pending) | rare | forever |
| supervisord_state | supervisord_ch_listener | push+60s poll | forever |
| system_stats | system_stats_service | 1/30s | forever |
OTel tables (otel_logs, otel_traces, otel_metrics_*) are auto-created by the collector for NG7 instrumentation.
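To get a feel for relative table growth, the nominal rates above can be converted to rows per day. The sketch below assumes each service runs around the clock at its listed rate, which the table does not state; approximate rates ("~") are taken at face value.

```python
# Nominal rates from the table, expressed as rows per minute.
ROWS_PER_MIN = {
    "eigen_scans": 8,      # ~8/min
    "obf_universe": 540,   # 540/min
    "exf_data": 1,         # ~1/min
    "meta_health": 6,      # ~1/10s
    "system_stats": 2,     # 1/30s
}

MIN_PER_DAY = 24 * 60


def rows_per_day(per_min):
    """Rows per day, assuming the source runs 24h at a constant rate."""
    return per_min * MIN_PER_DAY


# Print the tables from fastest-growing to slowest-growing.
for table, per_min in sorted(ROWS_PER_MIN.items(), key=lambda kv: -kv[1]):
    print(f"{table:<14} {rows_per_day(per_min):>9,} rows/day")
```

Under that assumption obf_universe dominates at 777,600 rows/day, roughly 67 times the volume of eigen_scans.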
Distributed Trace ID
scan_uuid (UUIDv7) is the causal trace root across all tables:
eigen_scans.scan_uuid ← NG7 generates one per scan
│
├── obf_fast_intrade.scan_uuid (100ms OBF while in-trade)
├── trade_events.scan_uuid (entry + exit rows)
└── posture_events.scan_uuid (if scan triggered posture re-eval)
NG7 migration: replace uuid.uuid4() with uuid7() from ch_writer.py. The change is drop-in because the value keeps the same String format.
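For readers unfamiliar with UUIDv7, the sketch below builds one following the RFC 9562 bit layout (48-bit millisecond timestamp, version, variant, random bits). It is not the uuid7() shipped in ch_writer.py, whose implementation is not shown here; the name `uuid7_sketch` and the optional `ts_ms` argument are assumptions for illustration.

```python
import os
import time
import uuid


def uuid7_sketch(ts_ms=None):
    """Return a UUIDv7 string: 48-bit Unix ms timestamp followed by random bits."""
    if ts_ms is None:
        ts_ms = time.time_ns() // 1_000_000
    rand = int.from_bytes(os.urandom(10), "big")   # 80 random bits
    rand_a = rand >> 68                            # top 12 bits
    rand_b = rand & ((1 << 62) - 1)                # low 62 bits
    value = (ts_ms & ((1 << 48) - 1)) << 80        # bits 127..80: timestamp
    value |= 0x7 << 76                             # bits 79..76: version 7
    value |= rand_a << 64                          # bits 75..64: rand_a
    value |= 0b10 << 62                            # bits 63..62: RFC variant
    value |= rand_b                                # bits 61..0: rand_b
    return str(uuid.UUID(int=value))


print(uuid7_sketch())
```

Like str(uuid.uuid4()), the result is a 36-character hyphenated string, so it fits the same String column. Because the timestamp occupies the leading bits, values generated in different milliseconds sort in creation order.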
Key Queries (CH Play)
-- Current system state
SELECT * FROM dolphin.v_current_posture;
-- Scan latency last hour
SELECT * FROM dolphin.v_scan_latency_1h;
-- Trade summary last 30 days
SELECT * FROM dolphin.v_trade_summary_30d;
-- Process health
SELECT * FROM dolphin.v_process_health;
-- System resources (5min buckets, last hour)
SELECT * FROM dolphin.v_system_stats_1h ORDER BY bucket;
-- Full causal chain for a scan
SELECT event_type, ts, detail, value1, value2
FROM dolphin.v_scan_causal_chain
WHERE trace_id = '<scan_uuid>'
ORDER BY ts;
-- Scans that preceded losing trades
SELECT e.scan_number, e.vel_div, t.asset, t.pnl, t.exit_reason
FROM dolphin.trade_events t
JOIN dolphin.eigen_scans e ON e.scan_uuid = t.scan_uuid
WHERE t.pnl < 0 AND t.exit_price > 0
ORDER BY t.pnl ASC LIMIT 20;
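The same queries can be issued outside the Play UI through ClickHouse's HTTP interface on port 8123. The sketch below only builds the request for the causal-chain query, using ClickHouse's `param_<name>` query parameters so the scan_uuid is bound rather than pasted into the SQL. The helper name `build_query_request` and the choice of JSONEachRow output are assumptions; supply the password from the Credentials section.

```python
import urllib.parse
import urllib.request

CH_URL = "http://100.105.170.6:8123/"   # HTTP port listed at the top of this document

CAUSAL_CHAIN_SQL = """
SELECT event_type, ts, detail, value1, value2
FROM dolphin.v_scan_causal_chain
WHERE trace_id = {trace_id:String}
ORDER BY ts
"""


def build_query_request(sql, params=None, user="dolphin", password=""):
    """Build (but do not send) a ClickHouse HTTP POST with bound query parameters."""
    qs = {"default_format": "JSONEachRow"}
    for name, value in (params or {}).items():
        qs[f"param_{name}"] = str(value)   # fills the {name:Type} placeholder in the SQL
    url = CH_URL + "?" + urllib.parse.urlencode(qs)
    headers = {"X-ClickHouse-User": user, "X-ClickHouse-Key": password}
    return urllib.request.Request(url, data=sql.encode("utf-8"), headers=headers, method="POST")


req = build_query_request(CAUSAL_CHAIN_SQL, {"trace_id": "<scan_uuid>"})
# To execute: urllib.request.urlopen(req, timeout=5).read()
```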
Files
| File | Purpose |
|---|---|
| prod/ch_writer.py | Shared singleton: from ch_writer import ch_put, ts_us, uuid7 |
| prod/system_stats_service.py | /proc poller, runs under supervisord:system_stats |
| prod/supervisord_ch_listener.py | supervisord eventlistener |
| prod/ng_otel_writer.py (on NG7) | OTel drop-in for remote machines |
| prod/clickhouse/config.xml | CH server config (40% RAM cap, async_insert) |
| prod/clickhouse/users.xml | dolphin user, wait_for_async_insert=0 |
| prod/otelcol/config.yaml | OTel Collector → dolphin.otel_* |
| /root/ch-setup/schema.sql | Full DDL (idempotent, re-runnable) |
Credentials
- User: dolphin/dolphin_ch_2026
- OTel DSN: http://dolphin_uptrace_token@100.105.170.6:14318/1 (if Uptrace is ever deployed)
Pending (when DolphinActor is wired)
- trade_events: add ch_put("trade_events", {...}) at entry and exit
- obf_fast_intrade: add in the OBF 100ms tick (only when n_open_positions > 0)
- account_events: STARTUP/SHUTDOWN/END_DAY hooks
- daily_pnl: end-of-day in paper_trade_flow / nautilus_prefect_flow
- See prod/service_integration.py for exact copy-paste snippets
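As a rough picture of what the pending hooks might look like, the sketch below stubs ch_put so it runs on its own and shows the two guarded call sites. It is not taken from prod/service_integration.py: the function names, the `event` and `price` columns, and the snapshot fields are hypothetical, and only the ch_put(table, row) call shape and the n_open_positions guard come from the list above.

```python
SENT = []  # captures rows so the sketch can run without ClickHouse


def ch_put(table, row):
    """Stub standing in for ch_writer.ch_put; the real one enqueues for async insert."""
    SENT.append((table, row))


def on_trade_event(scan_uuid, asset, event, price):
    """Hypothetical hook called at both entry and exit, tagged with the originating scan."""
    ch_put("trade_events", {
        "scan_uuid": scan_uuid,   # causal trace root shared with eigen_scans
        "asset": asset,
        "event": event,           # "ENTRY" or "EXIT" (assumed column)
        "price": price,           # assumed column
    })


def on_obf_tick(scan_uuid, n_open_positions, snapshot):
    """Hypothetical 100ms tick hook: write only while at least one position is open."""
    if n_open_positions > 0:
        ch_put("obf_fast_intrade", {"scan_uuid": scan_uuid, **snapshot})


on_obf_tick("scan-1", 0, {"asset": "BTCUSDT"})   # flat: nothing written
on_trade_event("scan-1", "BTCUSDT", "ENTRY", 100.0)
on_obf_tick("scan-1", 1, {"asset": "BTCUSDT"})   # in trade: one row written
```

Carrying scan_uuid on every row is what lets v_scan_causal_chain join the trade and order-book rows back to the scan that triggered them.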