DOLPHIN Paper Trading — Production Bringup Guide
Purpose: Step-by-step ops guide for standing up the Prefect + Hazelcast paper trading stack.
Audience: Operations agent or junior dev. No research decisions required.
State as of: 2026-03-06
Assumes: Windows 11, Docker Desktop installed, Siloqy venv exists at C:\Users\Lenovo\Documents\- Siloqy\
Architecture Overview
[ARB512 Scanner] ─► eigenvalues/YYYY-MM-DD/ ─► [paper_trade_flow.py]
                                                         |
                                             [NDAlphaEngine (Python)]
                                                         |
                                          ┌──────────────┴──────────────┐
                                  [Hazelcast IMap]            [paper_logs/*.jsonl]
                                          |
                                 [Prefect UI :4200]
                                  [HZ-MC UI :8080]
Components:
- docker-compose.yml: Hazelcast 5.3 (port 5701) + HZ Management Center (port 8080) + Prefect Server (port 4200)
- paper_trade_flow.py: Prefect flow, runs daily at 00:05 UTC
- configs/blue.yml: Champion SHORT config (frozen, production)
- configs/green.yml: Bidirectional config (STATUS: PENDING, LONG validation still in progress)
- Python venv: C:\Users\Lenovo\Documents\- Siloqy\
Data flow: Prefect triggers daily → reads yesterday's Arrow/NPZ scans from eigenvalues dir → NDAlphaEngine processes → writes P&L to Hazelcast IMap + local JSONL log.
Step 1: Prerequisites Check
Open a terminal (Git Bash or PowerShell).
# 1a. Verify Docker Desktop is installed
docker --version
# Expected: Docker version 29.x.x
# 1b. Verify Python venv
"/c/Users/Lenovo/Documents/- Siloqy/Scripts/python.exe" --version
# Expected: Python 3.11.x or 3.12.x
# 1c. Verify working directories exist
ls "/c/Users/Lenovo/Documents/- DOLPHIN NG HD HCM TSF Predict/prod/"
# Expected: configs/ docker-compose.yml paper_trade_flow.py BRINGUP_GUIDE.md
ls "/c/Users/Lenovo/Documents/- DOLPHIN NG HD HCM TSF Predict/prod/configs/"
# Expected: blue.yml green.yml
Step 2: Install Python Dependencies
Run once. Takes ~2-5 minutes.
"/c/Users/Lenovo/Documents/- Siloqy/Scripts/pip.exe" install \
hazelcast-python-client \
prefect \
pyyaml \
pyarrow \
numpy \
pandas
Verify:
"/c/Users/Lenovo/Documents/- Siloqy/Scripts/python.exe" -c "import hazelcast; import prefect; import yaml; print('OK')"
Step 3: Start Docker Desktop
Docker Desktop must be running before starting containers.
Option A (GUI): Double-click Docker Desktop from Start menu. Wait for the whale icon in the system tray to stop animating (~30-60 seconds).
Option B (command):
Start-Process "C:\Program Files\Docker\Docker\Docker Desktop.exe"
# Wait ~60 seconds, then verify:
docker ps
Verify Docker is ready:
docker info | grep "Server Version"
# Expected: a line such as "Server Version: 29.x.x"; the number should match the client version from Step 1a
Step 4: Start the Infrastructure Stack
cd "/c/Users/Lenovo/Documents/- DOLPHIN NG HD HCM TSF Predict/prod"
docker compose up -d
Expected output:
[+] Running 3/3
- Container dolphin-hazelcast Started
- Container dolphin-hazelcast-mc Started
- Container dolphin-prefect Started
Verify all containers healthy:
docker compose ps
# All 3 should show "healthy" or "running"
Wait ~30 seconds for Hazelcast to initialize, then verify:
curl http://localhost:5701/hazelcast/health/ready
# Expected: {"message":"Hazelcast is ready!"}
curl http://localhost:4200/api/health
# Expected: {"status":"healthy"}
UIs:
- Prefect UI: http://localhost:4200
- Hazelcast MC: http://localhost:8080
  - Default cluster: dolphin (auto-connects to hazelcast:5701)
Step 5: Register Prefect Deployments
Run once to register the blue and green scheduled deployments.
cd "/c/Users/Lenovo/Documents/- DOLPHIN NG HD HCM TSF Predict/prod"
"/c/Users/Lenovo/Documents/- Siloqy/Scripts/python.exe" paper_trade_flow.py --register
Expected output:
Registered: dolphin-paper-blue
Registered: dolphin-paper-green
Verify in Prefect UI: http://localhost:4200 → Deployments → should show 2 deployments with CronSchedule "5 0 * * *".
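The cron expression "5 0 * * *" fires once a day at 00:05. As a quick cross-check of the "next run" time shown in the Prefect UI, the sketch below computes the next 00:05 UTC after a given instant. `next_daily_run` is a hypothetical helper, not part of paper_trade_flow.py, and it assumes the schedule is evaluated in UTC as stated in the Architecture Overview.

```python
from datetime import datetime, time, timedelta, timezone


def next_daily_run(now: datetime, hour: int = 0, minute: int = 5) -> datetime:
    """Return the next HH:MM UTC strictly after `now` (hypothetical helper)."""
    now = now.astimezone(timezone.utc)
    candidate = datetime.combine(now.date(), time(hour, minute), tzinfo=timezone.utc)
    if candidate <= now:
        # Today's slot has already passed, so the next run is tomorrow.
        candidate += timedelta(days=1)
    return candidate


if __name__ == "__main__":
    print(next_daily_run(datetime.now(timezone.utc)).isoformat())
```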
Step 6: Start the Prefect Worker
The Prefect worker polls for scheduled runs. Run in a separate terminal (keep it open, or run as a service).
"/c/Users/Lenovo/Documents/- Siloqy/Scripts/prefect.exe" worker start --pool "dolphin"
OR (if prefect CLI not in PATH):
"/c/Users/Lenovo/Documents/- Siloqy/Scripts/python.exe" -m prefect worker start --pool "dolphin"
Leave this terminal running. It will pick up the 00:05 UTC scheduled runs.
Step 7: Manual Test Run
Before relying on the schedule, test with a known good date (a date that has scan data).
cd "/c/Users/Lenovo/Documents/- DOLPHIN NG HD HCM TSF Predict/prod"
"/c/Users/Lenovo/Documents/- Siloqy/Scripts/python.exe" paper_trade_flow.py \
--date 2026-03-05 \
--config configs/blue.yml
Expected output (abbreviated):
=== BLUE paper trade: 2026-03-05 ===
Loaded N scans for 2026-03-05 | cols=XX
2026-03-05: PnL=+XX.XX T=X boost=1.XXx MC=OK
HZ write OK → DOLPHIN_PNL_BLUE[2026-03-05]
=== DONE: blue 2026-03-05 | PnL=+XX.XX | Capital=25,XXX.XX ===
Verify data written to Hazelcast:
- Open http://localhost:8080 → Maps → DOLPHIN_PNL_BLUE → should contain entry for 2026-03-05
Verify log file written:
ls "/c/Users/Lenovo/Documents/- DOLPHIN NG HD HCM TSF Predict/prod/paper_logs/blue/"
cat "/c/Users/Lenovo/Documents/- DOLPHIN NG HD HCM TSF Predict/prod/paper_logs/blue/paper_pnl_2026-03.jsonl"
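If a shell is not convenient, the same check can be done from Python. The sketch below builds the monthly log file name and parses the last non-empty line as JSON. Both function names are hypothetical, and no assumption is made about which fields a log record contains.

```python
import json
from pathlib import Path
from typing import Optional


def monthly_log_path(log_dir: Path, year: int, month: int) -> Path:
    """Build the paper_pnl_YYYY-MM.jsonl path used in this guide."""
    return log_dir / f"paper_pnl_{year:04d}-{month:02d}.jsonl"


def last_jsonl_entry(path: Path) -> Optional[dict]:
    """Return the last non-empty line of a JSONL file parsed as JSON, or None."""
    last = None
    with path.open("r", encoding="utf-8") as fh:
        for line in fh:
            if line.strip():
                last = line
    return json.loads(last) if last is not None else None
```

For example, `last_jsonl_entry(monthly_log_path(Path("paper_logs/blue"), 2026, 3))` would return the most recent record for March 2026, provided the file exists.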
Step 8: Scan Data Source Verification
The flow reads scan files from:
C:\Users\Lenovo\Documents\- Dolphin NG HD (NG3)\correlation_arb512\eigenvalues\YYYY-MM-DD\
Each date directory should contain scan_*__Indicators.npz or scan_*.arrow files.
ls "/c/Users/Lenovo/Documents/- Dolphin NG HD (NG3)/correlation_arb512/eigenvalues/" | tail -5
# Expected: recent date directories like 2026-03-05, 2026-03-04, etc.
ls "/c/Users/Lenovo/Documents/- Dolphin NG HD (NG3)/correlation_arb512/eigenvalues/2026-03-05/"
# Expected: scan_NNNN__Indicators.npz files
If a date directory is missing, the flow logs a warning and writes pnl=0 for that day (non-critical).
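A minimal sketch of that lookup, assuming only the two file patterns named above: `find_scan_files` is a hypothetical helper (the real flow may discover files differently), and it returns an empty list for a missing date directory, which is the case the flow treats as pnl=0.

```python
from pathlib import Path

# File patterns taken from this guide.
SCAN_PATTERNS = ("scan_*__Indicators.npz", "scan_*.arrow")


def find_scan_files(eigen_root: Path, date_str: str) -> list[Path]:
    """Return the sorted scan files for one YYYY-MM-DD directory, or [] if absent."""
    day_dir = eigen_root / date_str
    if not day_dir.is_dir():
        return []
    files: set[Path] = set()
    for pattern in SCAN_PATTERNS:
        files.update(day_dir.glob(pattern))
    return sorted(files)
```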
Step 9: Daily Operations
Normal daily flow (automated):
- ARB512 scanner (extended_main.py) writes scans to eigenvalues/YYYY-MM-DD/ throughout the day
- At 00:05 UTC, Prefect triggers dolphin-paper-blue and dolphin-paper-green
- Each flow reads yesterday's scans, runs the engine, writes to HZ + JSONL log
- Monitor via Prefect UI and HZ-MC
Check today's run result:
# Latest P&L log entry:
tail -1 "/c/Users/Lenovo/Documents/- DOLPHIN NG HD HCM TSF Predict/prod/paper_logs/blue/paper_pnl_$(date +%Y-%m).jsonl"
Check HZ state:
- http://localhost:8080 → Maps → DOLPHIN_STATE_BLUE → key "latest"
- Should show:
{"capital": XXXXX, "strategy": "blue", "last_date": "YYYY-MM-DD", ...}
Step 10: Restart After Reboot
After Windows restarts:
# 1. Start Docker Desktop (GUI or command — see Step 3)
# 2. Restart containers
cd "/c/Users/Lenovo/Documents/- DOLPHIN NG HD HCM TSF Predict/prod"
docker compose up -d
# 3. Restart Prefect worker (in a dedicated terminal)
"/c/Users/Lenovo/Documents/- Siloqy/Scripts/python.exe" -m prefect worker start --pool "dolphin"
Deployments and HZ data persist (docker volumes: hz_data, prefect_data).
Troubleshooting
"No scan dir for YYYY-MM-DD"
- The ARB512 scanner may not have run for that date
- Check: ls "C:\Users\Lenovo\Documents\- Dolphin NG HD (NG3)\correlation_arb512\eigenvalues\YYYY-MM-DD\"
- Non-critical: flow logs pnl=0 and continues
"HZ write failed (not critical)"
- Hazelcast container not running or not yet healthy
- Run: docker compose ps → check dolphin-hazelcast shows "healthy"
- Run: docker compose restart hazelcast
"ModuleNotFoundError: No module named 'hazelcast'"
- Dependencies not installed in Siloqy venv
- Rerun Step 2
"error during connect: open //./pipe/dockerDesktopLinuxEngine"
- Docker Desktop not running
- Start Docker Desktop (see Step 3), wait 60 seconds, retry
Prefect worker not picking up runs
- Verify worker is running with --pool "dolphin" (matches work_queue_name in deployments)
- Check Prefect UI → Work Pools → should show "dolphin" pool as online
Green deployment errors on bidirectional config
- Green is PENDING LONG validation. If direction: bidirectional causes engine errors, temporarily set green.yml direction: short_only until LONG system is validated.
Key File Locations
| File | Path |
|---|---|
| Prefect flow | prod/paper_trade_flow.py |
| Blue config | prod/configs/blue.yml |
| Green config | prod/configs/green.yml |
| Docker stack | prod/docker-compose.yml |
| Blue P&L logs | prod/paper_logs/blue/paper_pnl_YYYY-MM.jsonl |
| Green P&L logs | prod/paper_logs/green/paper_pnl_YYYY-MM.jsonl |
| Scan data source | C:\Users\Lenovo\Documents\- Dolphin NG HD (NG3)\correlation_arb512\eigenvalues\ |
| NDAlphaEngine | HCM\nautilus_dolphin\nautilus_dolphin\nautilus\esf_alpha_orchestrator.py |
| MC-Forewarner models | HCM\nautilus_dolphin\mc_results\models\ |
Current Status (2026-03-06)
| Item | Status |
|---|---|
| Docker stack | Built — needs Docker Desktop running |
| Python deps (HZ + Prefect) | Installing (pip background job) |
| Blue config | Frozen champion SHORT — ready |
| Green config | PENDING — LONG validation running (b79rt78uv) |
| Prefect deployments | Not yet registered (run Step 5 after deps install) |
| Manual test run | Not yet done (run Step 7) |
| vol_p60 calibration | Hardcoded 0.000099 (pre-calibrated from 55-day window) — acceptable |
| Engine state persistence | Implemented — engine capital and open positions serialize to Hazelcast STATE IMap |
Engine State Persistence
The NDAlphaEngine is instantiated fresh during each daily Prefect run, but its internal state is loaded from the Hazelcast DOLPHIN_STATE_BLUE/GREEN maps. Both capital and any active position spanning midnight are accurately tracked and restored.
Impact for paper trading: P&L and cumulative capital growth track correctly across days.
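The save/restore cycle can be sketched as follows. This is an illustration only: a plain dict stands in for the Hazelcast map, the "latest" key and the capital/strategy/last_date fields follow the example in Step 9, and the `open_positions` field name, the JSON encoding, and both function names are assumptions rather than the engine's actual schema.

```python
import json
from typing import Any, MutableMapping, Optional

STATE_KEY = "latest"  # key shown for DOLPHIN_STATE_BLUE in Step 9


def save_state(state_map: MutableMapping[str, str], *, capital: float, strategy: str,
               last_date: str, open_positions: list) -> None:
    """Serialize end-of-day engine state (hypothetical field layout)."""
    state_map[STATE_KEY] = json.dumps({
        "capital": capital,
        "strategy": strategy,
        "last_date": last_date,
        "open_positions": open_positions,  # assumed field name
    })


def load_state(state_map: MutableMapping[str, str], *, initial_capital: float) -> dict:
    """Restore the previous state, or start fresh if nothing was stored yet."""
    raw: Optional[Any] = state_map.get(STATE_KEY)
    if raw is None:
        return {"capital": initial_capital, "open_positions": [], "last_date": None}
    return json.loads(raw)
```

Each daily run would call `load_state` before processing and `save_state` afterwards, so a position that is still open at midnight is carried into the next day.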
Guide written 2026-03-08. Status updated.
Appendix D: Live Operations Monitoring — DEV "Realized Slippage"
Purpose: Track whether ExF latency (~10ms) is causing unacceptable fill slippage vs backtest assumptions.
Background
- Backtest friction assumptions: 8-10 bps round-trip (2bps entry + 2bps exit + fees)
- ExF latency-induced drift: ~0.055 bps (normal vol), ~0.17 bps (high vol events)
- Current Python implementation is sufficient (latency << friction assumptions)
Metric Definition
realized_slippage_bps = abs(fill_price - signal_price) / signal_price * 10000
Monitoring Thresholds
| Threshold | Action |
|---|---|
| < 2 bps | ✅ Nominal — within backtest assumptions |
| 2-5 bps | ⚠️ Watch — approaching friction limits |
| > 5 bps | 🚨 ALERT — investigate latency/market impact issues |
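The metric and the threshold table can be expressed directly in code. In this sketch the function names and the lowercase band labels are illustrative, and a value of exactly 5 bps is treated as "watch" because the table only alerts above 5.

```python
def realized_slippage_bps(fill_price: float, signal_price: float) -> float:
    """abs(fill - signal) / signal, expressed in basis points."""
    if signal_price <= 0:
        raise ValueError("signal_price must be positive")
    return abs(fill_price - signal_price) / signal_price * 10_000


def classify_slippage(bps: float) -> str:
    """Map a slippage value to the bands in the threshold table."""
    if bps < 2:
        return "nominal"
    if bps <= 5:
        return "watch"
    return "alert"
```

For instance, a signal at 100.00 filled at 100.03 is roughly 3 bps of slippage, which lands in the "watch" band.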
Implementation Notes
- Log signal_price (price at signal generation) vs fill_price (actual execution)
- Track per-trade slippage in paper_logs
- Alert if 24h moving average exceeds 5 bps
- If consistently > 5 bps → escalate to Java/Chronicle Queue port for <100μs latency
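One way to implement the 24h moving-average alert is a time-bounded window over per-trade samples. The class below is a sketch under stated assumptions: `SlippageWindow` is a hypothetical name, the average is an unweighted mean per trade, and samples are expected to arrive in chronological order.

```python
from collections import deque
from datetime import datetime, timedelta


class SlippageWindow:
    """Rolling mean of per-trade slippage over a fixed time window."""

    def __init__(self, window: timedelta = timedelta(hours=24), alert_bps: float = 5.0):
        self.window = window
        self.alert_bps = alert_bps
        self._samples: deque = deque()  # (timestamp, slippage_bps), oldest first

    def add(self, ts: datetime, bps: float) -> None:
        """Record one trade and drop samples that fell out of the window."""
        self._samples.append((ts, bps))
        cutoff = ts - self.window
        while self._samples and self._samples[0][0] <= cutoff:
            self._samples.popleft()

    def average(self) -> float:
        if not self._samples:
            return 0.0
        return sum(bps for _, bps in self._samples) / len(self._samples)

    def should_alert(self) -> bool:
        return self.average() > self.alert_bps
```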
TODO
- Add slippage tracking to paper_trade_flow.py trade logging
- Create Grafana/Prefect alert for slippage > 5 bps
- Document slippage post-trade analysis pipeline
Last updated: 2026-03-17
Appendix E: External Factors (ExF) System v2.0
Date: 2026-03-17
Purpose: Complete production guide for the External Factors real-time data pipeline
Components: exf_fetcher_flow.py, exf_persistence.py, exf_integrity_monitor.py
Architecture
┌─────────────────────────────────────────────────────────────────────────────┐
│                        EXTERNAL FACTORS SYSTEM v2.0                         │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  ┌──────────────────┐     ┌──────────────────┐     ┌──────────────────┐     │
│  │  Data Providers  │     │  Data Providers  │     │  Data Providers  │     │
│  │    (Binance)     │     │    (Deribit)     │     │   (FRED/Macro)   │     │
│  │  - funding_btc   │     │  - dvol_btc      │     │  - vix           │     │
│  │  - basis         │     │  - dvol_eth      │     │  - dxy           │     │
│  │  - spread        │     │  - fund_dbt_btc  │     │  - us10y         │     │
│  │  - imbal_*       │     │                  │     │                  │     │
│  └────────┬─────────┘     └────────┬─────────┘     └────────┬─────────┘     │
│           │                        │                        │               │
│           └────────────────────────┼────────────────────────┘               │
│                                    ▼                                        │
│   ┌──────────────────────────────────────────────────────────────────┐      │
│   │                RealTimeExFService (28 indicators)                │      │
│   │  - Per-indicator async polling at native rate                    │      │
│   │  - Rate limiting per provider (Binance 20/s, FRED 2/s, etc)      │      │
│   │  - In-memory cache with <1ms read latency                        │      │
│   │  - Daily history rotation for lag support                        │      │
│   └────────────────────────────────┬─────────────────────────────────┘      │
│                                    │                                        │
│            ┌───────────────────────┼───────────────────────┐                │
│            │                       │                       │                │
│            ▼                       ▼                       ▼                │
│   ┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐       │
│   │    HOT PATH     │     │  OFF HOT PATH   │     │   MONITORING    │       │
│   │ (0.5s interval) │     │ (5 min interval)│     │ (60s interval)  │       │
│   │                 │     │                 │     │                 │       │
│   │    Hazelcast    │     │ Disk Persistence│     │ Integrity Check │       │
│   │ DOLPHIN_FEATURES│     │ NPZ Format      │     │ HZ vs Disk      │       │
│   │ ['exf_latest']  │     │ /mnt/ng6_data/  │     │ Staleness Check │       │
│   │                 │     │ eigenvalues/    │     │ ACB Validation  │       │
│   │ Instant access  │     │ Durability      │     │ Alert on drift  │       │
│   │ for Alpha Engine│     │ for Backtests   │     │                 │       │
│   └─────────────────┘     └─────────────────┘     └─────────────────┘       │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
Component Reference
| Component | File | Purpose | Update Rate |
|---|---|---|---|
| RealTimeExFService | realtime_exf_service.py | Fetches 28 indicators from 8 providers | Per-indicator native rate |
| ExF Fetcher Flow | exf_fetcher_flow.py | Prefect flow orchestrating HZ push | 0.5s (500ms) |
| ExF Persistence | exf_persistence.py | Disk writer (NPZ format) | 5 minutes |
| ExF Integrity Monitor | exf_integrity_monitor.py | Data validation & alerts | 60 seconds |
Indicators (28 Total)
| Category | Indicators | Count |
|---|---|---|
| Binance Derivatives | funding_btc, funding_eth, oi_btc, oi_eth, ls_btc, ls_eth, ls_top, taker, basis | 9 |
| Microstructure | imbal_btc, imbal_eth, spread | 3 |
| Deribit | dvol_btc, dvol_eth, fund_dbt_btc, fund_dbt_eth | 4 |
| Macro (FRED) | vix, dxy, us10y, sp500, fedfunds | 5 |
| Sentiment | fng | 1 |
| On-chain | hashrate | 1 |
| DeFi | tvl | 1 |
| Liquidations | liq_vol_24h, liq_long_ratio, liq_z_score, liq_percentile | 4 |
ACB-Critical Indicators (9 Required for _acb_ready=True)
These indicators MUST be present and fresh for the Adaptive Circuit Breaker to function:
ACB_KEYS = [
"funding_btc", "funding_eth", # Binance funding rates
"dvol_btc", "dvol_eth", # Deribit volatility indices
"fng", # Fear & Greed
"vix", # VIX (market fear)
"ls_btc", # Long/Short ratio
"taker", # Taker buy/sell ratio
"oi_btc", # Open interest
]
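A readiness check over these keys might look like the sketch below. It treats an indicator as missing when the key is absent, null, or NaN. How "fresh" is judged is not specified in this guide, so the optional per-key staleness limits are an assumption, as are the function names.

```python
import math
from typing import Mapping, Optional

ACB_KEYS = (
    "funding_btc", "funding_eth", "dvol_btc", "dvol_eth",
    "fng", "vix", "ls_btc", "taker", "oi_btc",
)


def missing_acb_keys(values: Mapping[str, Optional[float]]) -> list[str]:
    """Return ACB keys that are absent, None, or NaN in `values`."""
    missing = []
    for key in ACB_KEYS:
        value = values.get(key)
        if value is None or (isinstance(value, float) and math.isnan(value)):
            missing.append(key)
    return missing


def acb_ready(values: Mapping[str, Optional[float]],
              staleness_s: Optional[Mapping[str, float]] = None,
              max_staleness_s: Optional[Mapping[str, float]] = None) -> bool:
    """True if all ACB keys are present and, where a limit is given, fresh."""
    if missing_acb_keys(values):
        return False
    if staleness_s and max_staleness_s:
        for key, limit in max_staleness_s.items():  # hypothetical per-key limits
            if key in ACB_KEYS and staleness_s.get(key, 0.0) > limit:
                return False
    return True
```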
Data Flow
- Fetch: RealTimeExFService polls each provider at native rate
- Cache: Values stored in memory with staleness tracking
- HZ Push (every 0.5s): Hot path to Hazelcast for Alpha Engine
- Persistence (every 5min): Background flush to NPZ on disk
- Integrity Check (every 60s): Validate HZ vs disk consistency
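The three cadences above can be pictured as independent periodic tasks on one event loop. The following asyncio sketch is illustrative only: the counters stand in for the real HZ push, NPZ flush, and integrity check, and `run_every` and `run_pipeline` are hypothetical names, not functions from exf_fetcher_flow.py.

```python
import asyncio
from typing import Callable


async def run_every(interval_s: float, action: Callable[[], None],
                    stop: asyncio.Event) -> None:
    """Call `action` immediately, then once per interval until `stop` is set."""
    while not stop.is_set():
        action()
        try:
            # Sleep for one interval, but wake early on shutdown.
            await asyncio.wait_for(stop.wait(), timeout=interval_s)
        except asyncio.TimeoutError:
            pass


async def run_pipeline(duration_s: float, hz_push_s: float = 0.5,
                       persist_s: float = 300.0, integrity_s: float = 60.0) -> dict:
    """Run the three loops side by side and report how often each fired."""
    stop = asyncio.Event()
    counts = {"hz_push": 0, "persist": 0, "integrity": 0}

    def counter(name: str) -> Callable[[], None]:
        def bump() -> None:
            counts[name] += 1  # placeholder for the real work
        return bump

    tasks = [
        asyncio.create_task(run_every(hz_push_s, counter("hz_push"), stop)),
        asyncio.create_task(run_every(persist_s, counter("persist"), stop)),
        asyncio.create_task(run_every(integrity_s, counter("integrity"), stop)),
    ]
    await asyncio.sleep(duration_s)
    stop.set()
    await asyncio.gather(*tasks)
    return counts
```

Keeping the disk flush and the integrity check in their own tasks means a slow write does not delay the 0.5s hot path, as long as the slow work itself does not block the event loop.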
File Locations (Linux)
| Data Type | Path |
|---|---|
| Persistence root | /mnt/ng6_data/eigenvalues/ |
| Daily directory | /mnt/ng6_data/eigenvalues/{YYYY-MM-DD}/ |
| ExF snapshots | /mnt/ng6_data/eigenvalues/{YYYY-MM-DD}/extf_snapshot_{timestamp}__Indicators.npz |
| Checksum files | /mnt/ng6_data/eigenvalues/{YYYY-MM-DD}/extf_snapshot_{timestamp}__Indicators.npz.sha256 |
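A snapshot can be checked against its sidecar checksum with the standard library. The exact layout of the .sha256 file is not documented here, so this sketch assumes the hex digest is the first whitespace-separated token (the format `sha256sum` writes). `verify_snapshot` is a hypothetical helper.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large snapshots need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_snapshot(npz_path: Path) -> bool:
    """Compare a snapshot with its <name>.npz.sha256 sidecar (assumed format)."""
    sidecar = npz_path.with_name(npz_path.name + ".sha256")
    tokens = sidecar.read_text(encoding="utf-8").split()
    if not tokens:
        return False
    return sha256_of(npz_path) == tokens[0].lower()
```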
NPZ File Format
{
# Metadata (JSON string in _metadata array)
"_metadata": json.dumps({
"_timestamp_utc": "2026-03-17T12:00:00+00:00",
"_version": "1.0",
"_service": "ExFPersistence",
"_staleness_s": json.dumps({"basis": 0.2, "funding_btc": 3260.0, ...}),
}),
# Numeric indicators (each as float64 array)
"basis": np.array([0.01178]),
"spread": np.array([0.00143]),
"funding_btc": np.array([7.53e-06]),
"vix": np.array([24.06]),
...
}
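Reading a snapshot back is a matter of decoding the `_metadata` string and unwrapping the one-element arrays. The loader below is a sketch based only on the layout shown above. `load_exf_snapshot` is a hypothetical name, and it assumes every non-metadata entry is a one-element numeric array.

```python
import json

import numpy as np


def load_exf_snapshot(source):
    """Load one ExF snapshot from a path or file-like object.

    Returns (metadata_dict, {indicator_name: float}).
    """
    with np.load(source) as npz:
        metadata = json.loads(npz["_metadata"].item())
        staleness = metadata.get("_staleness_s")
        if isinstance(staleness, str):
            # The layout above stores staleness as a nested JSON string.
            metadata["_staleness_s"] = json.loads(staleness)
        indicators = {
            name: float(npz[name][0])
            for name in npz.files
            if name != "_metadata"
        }
    return metadata, indicators
```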
Running the ExF System
Option 1: Standalone (Development/Testing)
cd /root/extf_docs
# Test mode (no persistence, no monitoring)
python exf_fetcher_flow.py --no-persist --no-monitor --warmup 15
# With persistence (production)
python exf_fetcher_flow.py --warmup 30
# Run integration tests
python test_exf_integration.py --duration 30 --test all
Option 2: Prefect Deployment (Production)
# Deploy to Prefect
cd /mnt/dolphinng5_predict/prod
prefect deployment build exf_fetcher_flow.py:exf_fetcher_flow \
--name "exf-live" \
--pool dolphin \
--cron "*/5 * * * *" # Or run continuously
# Start worker
prefect worker start --pool dolphin
Monitoring & Alerting
Health Status
The integrity monitor exposes health status via get_health_status():
{
"timestamp": "2026-03-17T12:00:00+00:00",
"overall": "healthy", # healthy | degraded | critical
"hz_connected": True,
"persist_connected": True,
"indicators_present": 28,
"indicators_expected": 28,
"acb_ready": True,
"stale_count": 2,
"alerts_active": 0,
}
Alert Thresholds
| Condition | Severity | Action |
|---|---|---|
| ACB-critical indicator missing | CRITICAL | Alpha engine may fail |
| Hazelcast disconnected | CRITICAL | Real-time data unavailable |
| Indicator stale > 120s | WARNING | Check provider API |
| HZ/disk divergence > 3 indicators | WARNING | Investigate sync issue |
| Overall health = degraded | WARNING | Monitor closely |
| Overall health = critical | CRITICAL | Page on-call engineer |
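The first four rows of the table can be read as rules that produce alerts, with the overall status derived from the worst severity. That derivation is an assumption, as are the function names. The sketch also applies the single 120s limit to every indicator, which is a simplification: the sample health payload above reports "healthy" with a stale_count of 2, so the real monitor may tolerate some staleness or use per-indicator limits.

```python
from typing import Mapping

STALE_WARN_S = 120.0  # "Indicator stale > 120s"
DIVERGENCE_WARN = 3   # "HZ/disk divergence > 3 indicators"


def evaluate_alerts(*, hz_connected: bool, missing_acb: list[str],
                    staleness_s: Mapping[str, float],
                    diverged: int) -> list[tuple[str, str]]:
    """Return (severity, message) pairs for the conditions in the table."""
    alerts = []
    if not hz_connected:
        alerts.append(("CRITICAL", "Hazelcast disconnected"))
    if missing_acb:
        alerts.append(("CRITICAL", "ACB-critical indicator missing: " + ", ".join(missing_acb)))
    for name, age in sorted(staleness_s.items()):
        if age > STALE_WARN_S:
            alerts.append(("WARNING", f"{name} stale for {age:.0f}s"))
    if diverged > DIVERGENCE_WARN:
        alerts.append(("WARNING", f"HZ/disk divergence on {diverged} indicators"))
    return alerts


def overall_health(alerts: list[tuple[str, str]]) -> str:
    """Assumed rule: any CRITICAL -> critical, any WARNING -> degraded."""
    severities = {severity for severity, _ in alerts}
    if "CRITICAL" in severities:
        return "critical"
    if "WARNING" in severities:
        return "degraded"
    return "healthy"
```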
Troubleshooting
Issue: _acb_ready=False
Symptoms: Health check shows acb_ready: False
Diagnosis: One or more ACB-critical indicators missing
# Check which indicators are missing
python3 << 'EOF'
import hazelcast, json
client = hazelcast.HazelcastClient(cluster_name='dolphin', cluster_members=['localhost:5701'])
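raw = client.get_map("DOLPHIN_FEATURES").get("exf_latest").result()
data = json.loads(raw) if raw else {}  # empty dict if exf_latest has not been written yet
acb_keys = ["funding_btc", "funding_eth", "dvol_btc", "dvol_eth", "fng", "vix", "ls_btc", "taker", "oi_btc"]
# A key counts as missing if it is absent, null, or NaN (NaN != NaN)
missing = [k for k in acb_keys if data.get(k) is None or data[k] != data[k]]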
print(f"Missing ACB indicators: {missing}")
print(f"Present: {[k for k in acb_keys if k not in missing]}")
client.shutdown()
EOF
Common Causes:
- Deribit API down (dvol_btc, dvol_eth)
- Alternative.me API down (fng)
- FRED API key expired (vix)
Fix: Check provider status, verify API keys in realtime_exf_service.py
Issue: No disk persistence
Symptoms: files_written: 0 in persistence stats
Diagnosis:
# Check mount
ls -la /mnt/ng6_data/eigenvalues/
# Check permissions
touch /mnt/ng6_data/eigenvalues/write_test && rm /mnt/ng6_data/eigenvalues/write_test
# Check disk space
df -h /mnt/ng6_data/
Fix:
# Remount if needed
sudo mount -t cifs //100.119.158.61/DolphinNG6_Data /mnt/ng6_data -o credentials=/root/.dolphin_creds
Issue: High staleness
Symptoms: Staleness > 120s for critical indicators
Diagnosis:
# Check fetcher process
ps aux | grep exf_fetcher
# Check logs
journalctl -u exf-fetcher -n 100
# Manual fetch test
curl -s "https://fapi.binance.com/fapi/v1/premiumIndex?symbol=BTCUSDT" | head -c 200
curl -s "https://www.deribit.com/api/v2/public/get_volatility_index_data?currency=BTC&resolution=3600&count=1" | head -c 200
Fix: Restart fetcher, check network connectivity, verify API rate limits not exceeded
TODO (Future Enhancements)
- Expand indicators: Add 50+ additional indicators from CoinMetrics, Glassnode, etc.
- Fix dead indicators: Repair broken parsers (see DEAD_INDICATORS in service)
- Adaptive lag: Switch from uniform lag=1 to per-indicator optimal lags (needs 80+ days data)
- Intra-day ACB: Move from daily to continuous ACB calculation
- Arrow format: Dual output NPZ + Arrow for better performance
- Redundancy: Multiple provider failover for critical indicators
Data Retention
| Data Type | Retention | Cleanup |
|---|---|---|
| Hazelcast cache | Real-time only (no history) | N/A |
| Disk snapshots (NPZ) | 7 days | Automatic |
| Logs | 30 days | Manual/Logrotate |
| Backfill data | Permanent | Never |
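To illustrate the 7-day snapshot retention, the sketch below lists which snapshot files would be eligible for cleanup without deleting anything. It is not the actual cleanup job: `expired_snapshots` is a hypothetical helper, the "strictly older than 7 days" cutoff is an assumption, and only extf_snapshot files (plus their .sha256 sidecars) are selected so that scans and backfill data in the same directories are left alone.

```python
from datetime import date, timedelta
from pathlib import Path


def expired_snapshots(root: Path, today: date, keep_days: int = 7) -> list[Path]:
    """List ExF snapshot files in date directories older than the retention window."""
    cutoff = today - timedelta(days=keep_days)
    expired: list[Path] = []
    for day_dir in root.iterdir():
        if not day_dir.is_dir():
            continue
        try:
            day = date.fromisoformat(day_dir.name)
        except ValueError:
            continue  # not a YYYY-MM-DD directory
        if day < cutoff:
            # Matches both the .npz snapshot and its .npz.sha256 sidecar.
            expired.extend(day_dir.glob("extf_snapshot_*__Indicators.npz*"))
    return sorted(expired)
```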
Last updated: 2026-03-17