# DOLPHIN Paper Trading: Production Bringup Guide

**Purpose**: Step-by-step ops guide for standing up the Prefect + Hazelcast paper trading stack.
**Audience**: Operations agent or junior dev. No research decisions required.
**State as of**: 2026-03-06
**Assumes**: Windows 11, Docker Desktop installed, Siloqy venv exists at `C:\Users\Lenovo\Documents\- Siloqy\`

---

## Architecture Overview

```
[ARB512 Scanner] ─► eigenvalues/YYYY-MM-DD/ ─► [paper_trade_flow.py]
                                                         |
                                             [NDAlphaEngine (Python)]
                                                         |
                                          ┌──────────────┴──────────────┐
                                  [Hazelcast IMap]            [paper_logs/*.jsonl]
                                          |
                                 [Prefect UI :4200]
                                  [HZ-MC UI :8080]
```

**Components:**
- `docker-compose.yml`: Hazelcast 5.3 (port 5701) + HZ Management Center (port 8080) + Prefect Server (port 4200)
- `paper_trade_flow.py`: Prefect flow, runs daily at 00:05 UTC
- `configs/blue.yml`: Champion SHORT config (frozen, production)
- `configs/green.yml`: Bidirectional config (STATUS: PENDING, LONG validation still in progress)
- Python venv: `C:\Users\Lenovo\Documents\- Siloqy\`

**Data flow**: Prefect triggers daily → reads yesterday's Arrow/NPZ scans from eigenvalues dir → NDAlphaEngine processes → writes P&L to Hazelcast IMap + local JSONL log.

---

## Step 1: Prerequisites Check

Open a terminal (Git Bash or PowerShell).

```bash
# 1a. Verify Docker Desktop is installed
docker --version
# Expected: Docker version 29.x.x

# 1b. Verify Python venv
"/c/Users/Lenovo/Documents/- Siloqy/Scripts/python.exe" --version
# Expected: Python 3.11.x or 3.12.x

# 1c. Verify working directories exist
ls "/c/Users/Lenovo/Documents/- DOLPHIN NG HD HCM TSF Predict/prod/"
# Expected: configs/ docker-compose.yml paper_trade_flow.py BRINGUP_GUIDE.md

ls "/c/Users/Lenovo/Documents/- DOLPHIN NG HD HCM TSF Predict/prod/configs/"
# Expected: blue.yml green.yml
```

---

## Step 2: Install Python Dependencies

Run once. Takes ~2-5 minutes.

```bash
"/c/Users/Lenovo/Documents/- Siloqy/Scripts/pip.exe" install \
    hazelcast-python-client \
    prefect \
    pyyaml \
    pyarrow \
    numpy \
    pandas
```

**Verify** (imports every package installed above):
```bash
"/c/Users/Lenovo/Documents/- Siloqy/Scripts/python.exe" -c "import hazelcast, prefect, yaml, pyarrow, numpy, pandas; print('OK')"
```

---

## Step 3: Start Docker Desktop

Docker Desktop must be running before starting containers.

**Option A (GUI):** Double-click Docker Desktop from Start menu. Wait for the whale icon in the system tray to stop animating (~30-60 seconds).

**Option B (command):**
```powershell
Start-Process "C:\Program Files\Docker\Docker\Docker Desktop.exe"
# Wait ~60 seconds, then verify:
docker ps
```

**Verify Docker is ready:**
```bash
docker info | grep "Server Version"
# Expected: Server Version: 29.x.x (should match the version reported in Step 1a)
```

---

## Step 4: Start the Infrastructure Stack

```bash
cd "/c/Users/Lenovo/Documents/- DOLPHIN NG HD HCM TSF Predict/prod"
docker compose up -d
```

**Expected output:**
```
[+] Running 3/3
 - Container dolphin-hazelcast     Started
 - Container dolphin-hazelcast-mc  Started
 - Container dolphin-prefect       Started
```

**Verify all containers healthy:**
```bash
docker compose ps
# All 3 should show "healthy" or "running"
```

**Wait ~30 seconds for Hazelcast to initialize, then verify:**
```bash
curl http://localhost:5701/hazelcast/health/ready
# Expected: {"message":"Hazelcast is ready!"}

curl http://localhost:4200/api/health
# Expected: {"status":"healthy"}
```

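The two `curl` checks can also be scripted so that a bringup script waits for the stack instead of sleeping for a fixed time. The following is a minimal sketch, not part of the repository: the function names `http_ok` and `wait_until_ready` are hypothetical, and it only assumes that both endpoints answer with HTTP 200 once ready. It does not inspect the response body.

```python
import http.client
import time
import urllib.request
from typing import Callable, Dict

# Endpoints taken from the curl checks in this step.
ENDPOINTS: Dict[str, str] = {
    "hazelcast": "http://localhost:5701/hazelcast/health/ready",
    "prefect": "http://localhost:4200/api/health",
}


def http_ok(url: str, timeout_s: float = 3.0) -> bool:
    """Return True if the URL answers with HTTP 200, False on any error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # URLError, HTTPError and socket timeouts all derive from OSError.
        return False


def wait_until_ready(
    endpoints: Dict[str, str],
    probe: Callable[[str], bool] = http_ok,
    attempts: int = 10,
    delay_s: float = 3.0,
) -> Dict[str, bool]:
    """Poll each endpoint until all respond or the attempts run out."""
    status = {name: False for name in endpoints}
    for attempt in range(attempts):
        for name, url in endpoints.items():
            if not status[name]:  # do not re-probe services already up
                status[name] = probe(url)
        if all(status.values()):
            break
        if attempt < attempts - 1:
            time.sleep(delay_s)
    return status


if __name__ == "__main__":
    for name, ok in wait_until_ready(ENDPOINTS).items():
        print(f"{name}: {'ready' if ok else 'NOT ready'}")
```

The `probe` parameter exists only so the polling logic can be exercised without a running stack.
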
**UIs:**
- Prefect UI: http://localhost:4200
- Hazelcast MC: http://localhost:8080
  - Default cluster: `dolphin` (auto-connects to hazelcast:5701)

---

## Step 5: Register Prefect Deployments

Run once to register the blue and green scheduled deployments.

```bash
cd "/c/Users/Lenovo/Documents/- DOLPHIN NG HD HCM TSF Predict/prod"
"/c/Users/Lenovo/Documents/- Siloqy/Scripts/python.exe" paper_trade_flow.py --register
```

**Expected output:**
```
Registered: dolphin-paper-blue
Registered: dolphin-paper-green
```

**Verify in Prefect UI:** http://localhost:4200 → Deployments → should show 2 deployments with CronSchedule "5 0 * * *".

---

## Step 6: Start the Prefect Worker

The Prefect worker polls for scheduled runs. Run in a separate terminal (keep it open, or run as a service).

```bash
"/c/Users/Lenovo/Documents/- Siloqy/Scripts/prefect.exe" worker start --pool "dolphin"
```

**OR** (if `prefect` CLI not in PATH):
```bash
"/c/Users/Lenovo/Documents/- Siloqy/Scripts/python.exe" -m prefect worker start --pool "dolphin"
```

Leave this terminal running. It will pick up the 00:05 UTC scheduled runs.

---

## Step 7: Manual Test Run

Before relying on the schedule, test with a known good date (a date that has scan data).

```bash
cd "/c/Users/Lenovo/Documents/- DOLPHIN NG HD HCM TSF Predict/prod"
"/c/Users/Lenovo/Documents/- Siloqy/Scripts/python.exe" paper_trade_flow.py \
    --date 2026-03-05 \
    --config configs/blue.yml
```

**Expected output (abbreviated):**
```
=== BLUE paper trade: 2026-03-05 ===
Loaded N scans for 2026-03-05 | cols=XX
2026-03-05: PnL=+XX.XX T=X boost=1.XXx MC=OK
HZ write OK → DOLPHIN_PNL_BLUE[2026-03-05]
=== DONE: blue 2026-03-05 | PnL=+XX.XX | Capital=25,XXX.XX ===
```

**Verify data written to Hazelcast:**
- Open http://localhost:8080 → Maps → DOLPHIN_PNL_BLUE → should contain entry for 2026-03-05

**Verify log file written:**
```bash
ls "/c/Users/Lenovo/Documents/- DOLPHIN NG HD HCM TSF Predict/prod/paper_logs/blue/"
cat "/c/Users/Lenovo/Documents/- DOLPHIN NG HD HCM TSF Predict/prod/paper_logs/blue/paper_pnl_2026-03.jsonl"
```

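For a quicker check than reading the raw file, the log can be parsed and the last record printed. This is a minimal sketch that assumes only what the `.jsonl` extension implies (one JSON object per line); the helper name `read_jsonl` is hypothetical and no particular field names are assumed. Run it from the `prod/` directory.

```python
import json
from pathlib import Path
from typing import List


def read_jsonl(path: Path) -> List[dict]:
    """Parse a JSON Lines file, skipping blank lines."""
    records = []
    with path.open("r", encoding="utf-8") as fh:
        for line_no, line in enumerate(fh, start=1):
            line = line.strip()
            if not line:
                continue
            try:
                records.append(json.loads(line))
            except json.JSONDecodeError as exc:
                # Report the offending line so a truncated write is easy to spot.
                raise ValueError(f"{path}:{line_no}: invalid JSON ({exc.msg})") from exc
    return records


if __name__ == "__main__":
    log = Path("paper_logs/blue/paper_pnl_2026-03.jsonl")  # relative to prod/
    if log.exists():
        rows = read_jsonl(log)
        print(f"{len(rows)} records; last: {rows[-1] if rows else None}")
    else:
        print(f"{log} not found")
```
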

---

## Step 8: Scan Data Source Verification

The flow reads scan files from:
```
C:\Users\Lenovo\Documents\- Dolphin NG HD (NG3)\correlation_arb512\eigenvalues\YYYY-MM-DD\
```

Each date directory should contain `scan_*__Indicators.npz` or `scan_*.arrow` files.

```bash
ls "/c/Users/Lenovo/Documents/- Dolphin NG HD (NG3)/correlation_arb512/eigenvalues/" | tail -5
# Expected: recent date directories like 2026-03-05, 2026-03-04, etc.

ls "/c/Users/Lenovo/Documents/- Dolphin NG HD (NG3)/correlation_arb512/eigenvalues/2026-03-05/"
# Expected: scan_NNNN__Indicators.npz files
```

If a date directory is missing, the flow logs a warning and writes pnl=0 for that day (non-critical).

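The same check can be done from Python, which is handy when verifying many dates at once. The sketch below is illustrative only and is not the lookup code used by `paper_trade_flow.py`; the helper names `previous_utc_date` and `list_scans` are hypothetical. It computes the date that a 00:05 UTC run would process and lists the files matching the two patterns named above, returning an empty list when the directory is missing.

```python
from datetime import datetime, timedelta, timezone
from pathlib import Path
from typing import List, Optional

# Scan root from this guide; adjust for your machine.
SCAN_ROOT = Path(
    r"C:\Users\Lenovo\Documents\- Dolphin NG HD (NG3)\correlation_arb512\eigenvalues"
)


def previous_utc_date(now: Optional[datetime] = None) -> str:
    """Return yesterday's UTC date as YYYY-MM-DD."""
    now = now or datetime.now(timezone.utc)
    return (now.astimezone(timezone.utc) - timedelta(days=1)).strftime("%Y-%m-%d")


def list_scans(root: Path, date_str: str) -> List[Path]:
    """Return sorted scan files for one date; empty list if the directory is missing."""
    day_dir = root / date_str
    if not day_dir.is_dir():
        return []
    files = list(day_dir.glob("scan_*__Indicators.npz"))
    files += list(day_dir.glob("scan_*.arrow"))
    return sorted(files)


if __name__ == "__main__":
    day = previous_utc_date()
    scans = list_scans(SCAN_ROOT, day)
    print(f"{day}: {len(scans)} scan files")
```
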

---

## Step 9: Daily Operations

**Normal daily flow (automated):**
1. ARB512 scanner (`extended_main.py`) writes scans to `eigenvalues/YYYY-MM-DD/` throughout the day
2. At 00:05 UTC, Prefect triggers dolphin-paper-blue and dolphin-paper-green
3. Each flow reads yesterday's scans, runs the engine, writes to HZ + JSONL log
4. Monitor via Prefect UI and HZ-MC

**Check today's run result:**
```bash
# Latest P&L log entry. Runs are scheduled in UTC, so use the UTC month.
# On the 1st of a month, yesterday's entry may still be in the previous month's file.
tail -1 "/c/Users/Lenovo/Documents/- DOLPHIN NG HD HCM TSF Predict/prod/paper_logs/blue/paper_pnl_$(date -u +%Y-%m).jsonl"
```

**Check HZ state:**
- http://localhost:8080 → Maps → DOLPHIN_STATE_BLUE → key "latest"
- Should show: `{"capital": XXXXX, "strategy": "blue", "last_date": "YYYY-MM-DD", ...}`

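The state entry can also be read without the Management Center UI. This is a hedged sketch using the standard `hazelcast-python-client` calls (`HazelcastClient`, `get_map(...).blocking()`, `get`); the helper names `parse_state` and `fetch_latest_state` are hypothetical, and it assumes the value under `"latest"` is either a JSON string or a dict, which this guide does not specify.

```python
import json
from typing import Any, Dict, Optional


def parse_state(raw: Optional[Any]) -> Dict[str, Any]:
    """Decode the value stored under "latest"; accepts a JSON string or a dict."""
    if raw is None:
        return {}
    if isinstance(raw, (bytes, str)):
        return json.loads(raw)
    return dict(raw)


def fetch_latest_state(strategy: str = "blue") -> Dict[str, Any]:
    """Read DOLPHIN_STATE_<STRATEGY>["latest"] from the local cluster."""
    # Imported lazily so parse_state can be used without the client installed.
    import hazelcast

    client = hazelcast.HazelcastClient(
        cluster_name="dolphin", cluster_members=["localhost:5701"]
    )
    try:
        state_map = client.get_map(f"DOLPHIN_STATE_{strategy.upper()}").blocking()
        return parse_state(state_map.get("latest"))
    finally:
        client.shutdown()


if __name__ == "__main__":
    state = fetch_latest_state("blue")
    print(state.get("last_date"), state.get("capital"))
```
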

---

## Step 10: Restart After Reboot

After Windows restarts:

```bash
# 1. Start Docker Desktop (GUI or command, see Step 3)

# 2. Restart containers
cd "/c/Users/Lenovo/Documents/- DOLPHIN NG HD HCM TSF Predict/prod"
docker compose up -d

# 3. Restart Prefect worker (in a dedicated terminal)
"/c/Users/Lenovo/Documents/- Siloqy/Scripts/python.exe" -m prefect worker start --pool "dolphin"
```

Deployments and HZ data persist (docker volumes: hz_data, prefect_data).

---

## Troubleshooting

### "No scan dir for YYYY-MM-DD"
- The ARB512 scanner may not have run for that date
- Check (Git Bash): `ls "/c/Users/Lenovo/Documents/- Dolphin NG HD (NG3)/correlation_arb512/eigenvalues/YYYY-MM-DD/"`
- Non-critical: flow logs pnl=0 and continues

### "HZ write failed (not critical)"
- Hazelcast container not running or not yet healthy
- Run: `docker compose ps` → check dolphin-hazelcast shows "healthy"
- Run: `docker compose restart hazelcast`

### "ModuleNotFoundError: No module named 'hazelcast'"
- Dependencies not installed in Siloqy venv
- Rerun Step 2

### "error during connect: open //./pipe/dockerDesktopLinuxEngine"
- Docker Desktop not running
- Start Docker Desktop (see Step 3), wait 60 seconds, retry

### Prefect worker not picking up runs
- Verify worker is running with `--pool "dolphin"` (matches `work_queue_name` in deployments)
- Check Prefect UI → Work Pools → should show "dolphin" pool as online

### Green deployment errors on bidirectional config
- Green is PENDING LONG validation. If `direction: bidirectional` causes engine errors,
  temporarily set `direction: short_only` in `green.yml` until the LONG system is validated.

---

## Key File Locations

| File | Path |
|---|---|
| Prefect flow | `prod/paper_trade_flow.py` |
| Blue config | `prod/configs/blue.yml` |
| Green config | `prod/configs/green.yml` |
| Docker stack | `prod/docker-compose.yml` |
| Blue P&L logs | `prod/paper_logs/blue/paper_pnl_YYYY-MM.jsonl` |
| Green P&L logs | `prod/paper_logs/green/paper_pnl_YYYY-MM.jsonl` |
| Scan data source | `C:\Users\Lenovo\Documents\- Dolphin NG HD (NG3)\correlation_arb512\eigenvalues\` |
| NDAlphaEngine | `HCM\nautilus_dolphin\nautilus_dolphin\nautilus\esf_alpha_orchestrator.py` |
| MC-Forewarner models | `HCM\nautilus_dolphin\mc_results\models\` |

---

## Current Status (2026-03-06)

| Item | Status |
|---|---|
| Docker stack | Built; needs Docker Desktop running |
| Python deps (HZ + Prefect) | Installing (pip background job) |
| Blue config | Frozen champion SHORT; ready |
| Green config | PENDING: LONG validation running (b79rt78uv) |
| Prefect deployments | Not yet registered (run Step 5 after deps install) |
| Manual test run | Not yet done (run Step 7) |
| vol_p60 calibration | Hardcoded 0.000099 (pre-calibrated from 55-day window); acceptable |
| Engine state persistence | Implemented: engine capital and open positions serialize to Hazelcast STATE IMap |

### Engine State Persistence

The NDAlphaEngine is instantiated fresh during each daily Prefect run, but its internal state is loaded from the Hazelcast `DOLPHIN_STATE_BLUE` / `DOLPHIN_STATE_GREEN` maps. Both `capital` and any active `position` spanning midnight are restored before the run starts.

**Impact for paper trading**: P&L and cumulative capital growth track correctly across days.

---

*Guide written 2026-03-08. Status updated.*

---

## Appendix D: Live Operations Monitoring - DEV "Realized Slippage"

**Purpose**: Track whether ExF latency (~10ms) is causing unacceptable fill slippage vs backtest assumptions.

### Background
- Backtest friction assumptions: **8-10 bps** round-trip (2 bps entry + 2 bps exit + fees)
- ExF latency-induced drift: **~0.055 bps** (normal vol), **~0.17 bps** (high vol events)
- Current Python implementation is sufficient: the latency-induced drift is far smaller than the friction assumptions

### Metric Definition
```python
realized_slippage_bps = abs(fill_price - signal_price) / signal_price * 10000
```

### Monitoring Thresholds

| Threshold | Action |
|-----------|--------|
| **< 2 bps** | ✅ Nominal: within backtest assumptions |
| **2-5 bps** | ⚠️ Watch: approaching friction limits |
| **> 5 bps** | 🚨 **ALERT**: investigate latency/market impact issues |

### Implementation Notes
- Log `signal_price` (price at signal generation) vs `fill_price` (actual execution)
- Track per-trade slippage in paper_logs
- Alert if 24h moving average exceeds 5 bps
- If consistently > 5 bps → escalate to Java/Chronicle Queue port for <100μs latency

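The metric, the threshold bands, and the 24h moving average can be combined into a small tracker. This is a minimal sketch under stated assumptions, not the tracking code planned for `paper_trade_flow.py`: the names `classify` and `RollingSlippage` are hypothetical, the average is a plain mean over trades in the trailing window, and a value of exactly 5 bps is treated as "watch" because the table only alerts above 5.

```python
from collections import deque
from datetime import datetime, timedelta
from typing import Deque, Tuple

WATCH_BPS = 2.0  # lower edge of the "Watch" band
ALERT_BPS = 5.0  # strictly above this is "ALERT"


def realized_slippage_bps(signal_price: float, fill_price: float) -> float:
    """Absolute slippage between signal and fill, in basis points."""
    if signal_price <= 0:
        raise ValueError("signal_price must be positive")
    return abs(fill_price - signal_price) / signal_price * 10_000


def classify(bps: float) -> str:
    """Map a slippage value onto the monitoring bands."""
    if bps < WATCH_BPS:
        return "nominal"
    if bps <= ALERT_BPS:
        return "watch"
    return "alert"


class RollingSlippage:
    """Mean per-trade slippage over a trailing time window (24h by default)."""

    def __init__(self, window: timedelta = timedelta(hours=24)) -> None:
        self.window = window
        self._samples: Deque[Tuple[datetime, float]] = deque()

    def add(self, ts: datetime, bps: float) -> float:
        """Record one trade and return the current windowed mean."""
        self._samples.append((ts, bps))
        cutoff = ts - self.window
        # Drop trades that fell out of the window (timestamps arrive in order).
        while self._samples and self._samples[0][0] < cutoff:
            self._samples.popleft()
        return sum(b for _, b in self._samples) / len(self._samples)
```

A caller would feed each fill into `RollingSlippage.add` and raise the alert when `classify` of the returned mean is `"alert"`.
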
### TODO
- [ ] Add slippage tracking to `paper_trade_flow.py` trade logging
- [ ] Create Grafana/Prefect alert for slippage > 5 bps
- [ ] Document slippage post-trade analysis pipeline

---

*Last updated: 2026-03-17*

---

## Appendix E: External Factors (ExF) System v2.0

**Date**: 2026-03-17
**Purpose**: Complete production guide for the External Factors real-time data pipeline
**Components**: `exf_fetcher_flow.py`, `exf_persistence.py`, `exf_integrity_monitor.py`

### Architecture

```
┌─────────────────────────────────────────────────────────────────────────────┐
│                        EXTERNAL FACTORS SYSTEM v2.0                         │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  ┌──────────────────┐     ┌──────────────────┐     ┌──────────────────┐     │
│  │  Data Providers  │     │  Data Providers  │     │  Data Providers  │     │
│  │    (Binance)     │     │    (Deribit)     │     │   (FRED/Macro)   │     │
│  │  - funding_btc   │     │  - dvol_btc      │     │  - vix           │     │
│  │  - basis         │     │  - dvol_eth      │     │  - dxy           │     │
│  │  - spread        │     │  - fund_dbt_btc  │     │  - us10y         │     │
│  │  - imbal_*       │     │                  │     │                  │     │
│  └────────┬─────────┘     └────────┬─────────┘     └────────┬─────────┘     │
│           │                        │                        │               │
│           └────────────────────────┼────────────────────────┘               │
│                                    ▼                                        │
│   ┌──────────────────────────────────────────────────────────────────┐      │
│   │                RealTimeExFService (28 indicators)                │      │
│   │  - Per-indicator async polling at native rate                    │      │
│   │  - Rate limiting per provider (Binance 20/s, FRED 2/s, etc)      │      │
│   │  - In-memory cache with <1ms read latency                        │      │
│   │  - Daily history rotation for lag support                        │      │
│   └────────────────────────────────┬─────────────────────────────────┘      │
│                                    │                                        │
│            ┌───────────────────────┼───────────────────────┐                │
│            │                       │                       │                │
│            ▼                       ▼                       ▼                │
│   ┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐       │
│   │    HOT PATH     │     │  OFF HOT PATH   │     │   MONITORING    │       │
│   │  (0.5s interval)│     │ (5 min interval)│     │  (60s interval) │       │
│   │                 │     │                 │     │                 │       │
│   │ Hazelcast       │     │ Disk Persistence│     │ Integrity Check │       │
│   │ DOLPHIN_FEATURES│     │ NPZ Format      │     │ HZ vs Disk      │       │
│   │ ['exf_latest']  │     │ /mnt/ng6_data/  │     │ Staleness Check │       │
│   │                 │     │   eigenvalues/  │     │ ACB Validation  │       │
│   │ Instant access  │     │ Durability      │     │ Alert on drift  │       │
│   │ for Alpha Engine│     │ for Backtests   │     │                 │       │
│   └─────────────────┘     └─────────────────┘     └─────────────────┘       │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
```

### Component Reference

| Component | File | Purpose | Update Rate |
|-----------|------|---------|-------------|
| RealTimeExFService | `realtime_exf_service.py` | Fetches 28 indicators from 8 providers | Per-indicator native rate |
| ExF Fetcher Flow | `exf_fetcher_flow.py` | Prefect flow orchestrating HZ push | 0.5s (500ms) |
| ExF Persistence | `exf_persistence.py` | Disk writer (NPZ format) | 5 minutes |
| ExF Integrity Monitor | `exf_integrity_monitor.py` | Data validation & alerts | 60 seconds |

### Indicators (28 Total)

| Category | Indicators | Count |
|----------|-----------|-------|
| **Binance Derivatives** | funding_btc, funding_eth, oi_btc, oi_eth, ls_btc, ls_eth, ls_top, taker, basis | 9 |
| **Microstructure** | imbal_btc, imbal_eth, spread | 3 |
| **Deribit** | dvol_btc, dvol_eth, fund_dbt_btc, fund_dbt_eth | 4 |
| **Macro (FRED)** | vix, dxy, us10y, sp500, fedfunds | 5 |
| **Sentiment** | fng | 1 |
| **On-chain** | hashrate | 1 |
| **DeFi** | tvl | 1 |
| **Liquidations** | liq_vol_24h, liq_long_ratio, liq_z_score, liq_percentile | 4 |

### ACB-Critical Indicators (9 Required for `_acb_ready=True`)

These indicators **MUST** be present and fresh for the Adaptive Circuit Breaker to function:

```python
ACB_KEYS = [
    "funding_btc", "funding_eth",  # Binance funding rates
    "dvol_btc", "dvol_eth",        # Deribit volatility indices
    "fng",                         # Fear & Greed
    "vix",                         # VIX (market fear)
    "ls_btc",                      # Long/Short ratio
    "taker",                       # Taker buy/sell ratio
    "oi_btc",                      # Open interest
]
```

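A readiness check over these keys could look like the sketch below. It is an illustration, not the service's actual `_acb_ready` logic: the function names and the optional `max_staleness_s` parameter are hypothetical, and since this guide does not state a freshness limit for ACB readiness, staleness is only checked when the caller supplies one. Keys without a staleness entry are not flagged.

```python
import math
from typing import List, Mapping, Optional

ACB_KEYS = [
    "funding_btc", "funding_eth",
    "dvol_btc", "dvol_eth",
    "fng", "vix", "ls_btc", "taker", "oi_btc",
]


def missing_acb_keys(
    snapshot: Mapping[str, object],
    staleness_s: Optional[Mapping[str, float]] = None,
    max_staleness_s: Optional[float] = None,
) -> List[str]:
    """Return ACB keys that are absent, None, NaN, or (optionally) too stale."""
    bad = []
    for key in ACB_KEYS:
        value = snapshot.get(key)
        if value is None or (isinstance(value, float) and math.isnan(value)):
            bad.append(key)
        elif staleness_s is not None and max_staleness_s is not None:
            if staleness_s.get(key, 0.0) > max_staleness_s:
                bad.append(key)
    return bad


def acb_ready(snapshot: Mapping[str, object]) -> bool:
    """True when every ACB-critical indicator has a usable value."""
    return not missing_acb_keys(snapshot)
```
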
### Data Flow

1. **Fetch**: `RealTimeExFService` polls each provider at native rate
2. **Cache**: Values stored in memory with staleness tracking
3. **HZ Push** (every 0.5s): Hot path to Hazelcast for Alpha Engine
4. **Persistence** (every 5min): Background flush to NPZ on disk
5. **Integrity Check** (every 60s): Validate HZ vs disk consistency

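The three cadences in steps 3 to 5 are independent, so a slow disk flush should not delay the hot-path push. One way to express that, shown here as a sketch only and not as the structure of `exf_fetcher_flow.py`, is one asyncio task per cadence. The names `run_every` and `run_pipeline` and the placeholder jobs are hypothetical.

```python
import asyncio
from typing import Awaitable, Callable, Dict

Job = Callable[[], Awaitable[None]]

# Intervals from the data-flow steps above, in seconds.
INTERVALS_S: Dict[str, float] = {"hz_push": 0.5, "persist": 300.0, "integrity": 60.0}


async def run_every(interval_s: float, job: Job, stop: asyncio.Event) -> None:
    """Call `job`, then wait `interval_s` (or until `stop` is set), and repeat."""
    while not stop.is_set():
        await job()
        try:
            await asyncio.wait_for(stop.wait(), timeout=interval_s)
        except asyncio.TimeoutError:
            pass  # interval elapsed without a stop request


async def run_pipeline(
    jobs: Dict[str, Job], intervals_s: Dict[str, float], duration_s: float
) -> None:
    """Run every job on its own cadence for `duration_s` seconds."""
    stop = asyncio.Event()
    tasks = [
        asyncio.create_task(run_every(intervals_s[name], job, stop))
        for name, job in jobs.items()
    ]
    await asyncio.sleep(duration_s)
    stop.set()
    await asyncio.gather(*tasks)


if __name__ == "__main__":
    def make_job(name: str) -> Job:
        async def job() -> None:
            print(f"{name} tick")  # placeholder for the real push/flush/check
        return job

    demo_jobs = {name: make_job(name) for name in INTERVALS_S}
    asyncio.run(run_pipeline(demo_jobs, INTERVALS_S, duration_s=2.0))
```

Each job runs once immediately, so the demo prints several `hz_push` ticks and a single tick for the two slower jobs.
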
### File Locations (Linux)

| Data Type | Path |
|-----------|------|
| Persistence root | `/mnt/ng6_data/eigenvalues/` |
| Daily directory | `/mnt/ng6_data/eigenvalues/{YYYY-MM-DD}/` |
| ExF snapshots | `/mnt/ng6_data/eigenvalues/{YYYY-MM-DD}/extf_snapshot_{timestamp}__Indicators.npz` |
| Checksum files | `/mnt/ng6_data/eigenvalues/{YYYY-MM-DD}/extf_snapshot_{timestamp}__Indicators.npz.sha256` |

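A snapshot can be checked against its `.sha256` sidecar with the standard library alone. The sketch below assumes the sidecar holds the hex digest as its first whitespace-separated token (the `sha256sum` convention); this guide does not document the sidecar's exact contents, so treat that as an assumption. The function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hex SHA-256 of a file, read in chunks to keep memory flat."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_snapshot(npz_path: Path) -> bool:
    """Compare a snapshot with its .sha256 sidecar; False if the sidecar is missing."""
    sidecar = npz_path.with_name(npz_path.name + ".sha256")
    if not sidecar.is_file():
        return False
    # Assumption: first token in the sidecar is the hex digest.
    parts = sidecar.read_text(encoding="utf-8").split()
    return bool(parts) and parts[0].lower() == sha256_of(npz_path)
```

Usage: `verify_snapshot(Path("/mnt/ng6_data/eigenvalues/2026-03-17/<snapshot file>.npz"))`, with the file name taken from a directory listing.
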
### NPZ File Format

```python
{
    # Metadata (JSON string in _metadata array)
    "_metadata": json.dumps({
        "_timestamp_utc": "2026-03-17T12:00:00+00:00",
        "_version": "1.0",
        "_service": "ExFPersistence",
        "_staleness_s": json.dumps({"basis": 0.2, "funding_btc": 3260.0, ...}),
    }),

    # Numeric indicators (each as float64 array)
    "basis": np.array([0.01178]),
    "spread": np.array([0.00143]),
    "funding_btc": np.array([7.53e-06]),
    "vix": np.array([24.06]),
    ...
}
```

### Running the ExF System

#### Option 1: Standalone (Development/Testing)

```bash
cd /root/extf_docs

# Test mode (no persistence, no monitoring)
python exf_fetcher_flow.py --no-persist --no-monitor --warmup 15

# With persistence (production)
python exf_fetcher_flow.py --warmup 30

# Run integration tests
python test_exf_integration.py --duration 30 --test all
```

#### Option 2: Prefect Deployment (Production)

```bash
# Deploy to Prefect
cd /mnt/dolphinng5_predict/prod
# Schedule every 5 minutes as shown, or drop --cron and run the flow continuously.
# --apply registers the deployment with the server (Prefect 2.x CLI); without it
# the command only writes a deployment YAML file.
prefect deployment build exf_fetcher_flow.py:exf_fetcher_flow \
    --name "exf-live" \
    --pool dolphin \
    --cron "*/5 * * * *" \
    --apply

# Start worker
prefect worker start --pool dolphin
```

### Monitoring & Alerting

#### Health Status

The integrity monitor exposes health status via `get_health_status()`:

```python
{
    "timestamp": "2026-03-17T12:00:00+00:00",
    "overall": "healthy",  # healthy | degraded | critical
    "hz_connected": True,
    "persist_connected": True,
    "indicators_present": 28,
    "indicators_expected": 28,
    "acb_ready": True,
    "stale_count": 2,
    "alerts_active": 0,
}
```

#### Alert Thresholds

| Condition | Severity | Action |
|-----------|----------|--------|
| ACB-critical indicator missing | **CRITICAL** | Alpha engine may fail |
| Hazelcast disconnected | **CRITICAL** | Real-time data unavailable |
| Indicator stale > 120s | **WARNING** | Check provider API |
| HZ/disk divergence > 3 indicators | **WARNING** | Investigate sync issue |
| Overall health = degraded | **WARNING** | Monitor closely |
| Overall health = critical | **CRITICAL** | Page on-call engineer |

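The table can be read as a set of rules over the health dictionary. The sketch below is one hedged interpretation, not the monitor's real alerting code: `evaluate_alerts` is a hypothetical name, the health keys are the ones shown in the `get_health_status()` example, and the lists of stale and diverged indicators are passed in separately because that example only exposes counts.

```python
from typing import Dict, List, Sequence, Tuple

Alert = Tuple[str, str]  # (severity, message)


def evaluate_alerts(
    health: Dict[str, object],
    stale_indicators: Sequence[str] = (),
    diverged_indicators: Sequence[str] = (),
) -> List[Alert]:
    """Apply the alert-threshold table to one health snapshot."""
    alerts: List[Alert] = []
    if not health.get("acb_ready", False):
        alerts.append(("CRITICAL", "ACB-critical indicator missing"))
    if not health.get("hz_connected", False):
        alerts.append(("CRITICAL", "Hazelcast disconnected"))
    for name in stale_indicators:  # caller lists indicators stale for > 120s
        alerts.append(("WARNING", f"Indicator stale > 120s: {name}"))
    if len(diverged_indicators) > 3:
        count = len(diverged_indicators)
        alerts.append(("WARNING", f"HZ/disk divergence on {count} indicators"))
    overall = health.get("overall")
    if overall == "degraded":
        alerts.append(("WARNING", "Overall health degraded"))
    elif overall == "critical":
        alerts.append(("CRITICAL", "Overall health critical"))
    return alerts
```

Any `CRITICAL` entry in the returned list corresponds to a row whose action is immediate, such as paging the on-call engineer.
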
### Troubleshooting

#### Issue: `_acb_ready=False`

**Symptoms**: Health check shows `acb_ready: False`
**Diagnosis**: One or more ACB-critical indicators missing

```bash
# Check which indicators are missing
python3 << 'EOF'
import hazelcast, json
client = hazelcast.HazelcastClient(cluster_name='dolphin', cluster_members=['localhost:5701'])
raw = client.get_map("DOLPHIN_FEATURES").get("exf_latest").result()
data = json.loads(raw) if raw else {}  # key absent: treat every indicator as missing
acb_keys = ["funding_btc", "funding_eth", "dvol_btc", "dvol_eth", "fng", "vix", "ls_btc", "taker", "oi_btc"]
# Missing = absent, null, or NaN (NaN is the only value that is not equal to itself)
missing = [k for k in acb_keys if data.get(k) is None or data[k] != data[k]]
print(f"Missing ACB indicators: {missing}")
print(f"Present: {[k for k in acb_keys if k not in missing]}")
client.shutdown()
EOF
```

**Common Causes**:
- Deribit API down (dvol_btc, dvol_eth)
- Alternative.me API down (fng)
- FRED API key expired (vix)

**Fix**: Check provider status, verify API keys in `realtime_exf_service.py`

---

#### Issue: No disk persistence

**Symptoms**: `files_written: 0` in persistence stats
**Diagnosis**:

```bash
# Check mount
ls -la /mnt/ng6_data/eigenvalues/

# Check permissions
touch /mnt/ng6_data/eigenvalues/write_test && rm /mnt/ng6_data/eigenvalues/write_test

# Check disk space
df -h /mnt/ng6_data/
```

**Fix**:
```bash
# Remount if needed
sudo mount -t cifs //100.119.158.61/DolphinNG6_Data /mnt/ng6_data -o credentials=/root/.dolphin_creds
```

---

#### Issue: High staleness

**Symptoms**: Staleness > 120s for critical indicators
**Diagnosis**:

```bash
# Check fetcher process
ps aux | grep exf_fetcher

# Check logs
journalctl -u exf-fetcher -n 100

# Manual fetch test
curl -s "https://fapi.binance.com/fapi/v1/premiumIndex?symbol=BTCUSDT" | head -c 200
curl -s "https://www.deribit.com/api/v2/public/get_volatility_index_data?currency=BTC&resolution=3600&count=1" | head -c 200
```

**Fix**: Restart fetcher, check network connectivity, verify API rate limits not exceeded

### TODO (Future Enhancements)

- [ ] **Expand indicators**: Add 50+ additional indicators from CoinMetrics, Glassnode, etc.
- [ ] **Fix dead indicators**: Repair broken parsers (see `DEAD_INDICATORS` in service)
- [ ] **Adaptive lag**: Switch from uniform lag=1 to per-indicator optimal lags (needs 80+ days data)
- [ ] **Intra-day ACB**: Move from daily to continuous ACB calculation
- [ ] **Arrow format**: Dual output NPZ + Arrow for better performance
- [ ] **Redundancy**: Multiple provider failover for critical indicators

### Data Retention

| Data Type | Retention | Cleanup |
|-----------|-----------|---------|
| Hazelcast cache | Real-time only (no history) | N/A |
| Disk snapshots (NPZ) | 7 days | Automatic |
| Logs | 30 days | Manual/Logrotate |
| Backfill data | Permanent | Never |

---

*Last updated: 2026-03-17*