DOLPHIN/prod/AGENT_TODO_FIX_NDTRADER.md
hjnormey 01c19662cb initial: import DOLPHIN baseline 2026-04-21 from dolphinng5_predict working tree
Includes core prod + GREEN/BLUE subsystems:
- prod/ (BLUE harness, configs, scripts, docs)
- nautilus_dolphin/ (GREEN Nautilus-native impl + dvae/ preserved)
- adaptive_exit/ (AEM engine + models/bucket_assignments.pkl)
- Observability/ (EsoF advisor, TUI, dashboards)
- external_factors/ (EsoF producer)
- mc_forewarning_qlabs_fork/ (MC regime/envelope)

Excludes runtime caches, logs, backups, and reproducible artifacts per .gitignore.
2026-04-21 16:58:38 +02:00


AGENT TASK: Fix nautilus_event_trader.py — Wire NDAlphaEngine to Live Hazelcast Feed

File to rewrite: /mnt/dolphinng5_predict/prod/nautilus_event_trader.py
Python env: /home/dolphin/siloqy_env/bin/python3 (always use this, never bare python3)
Working dir: /mnt/dolphinng5_predict/prod/


1. Background — What This System Is

DOLPHIN is a SHORT-only systematic crypto trading system running on Binance perpetual futures. The signal source is a Windows C++ eigenvalue scanner (NG5) that runs every 5 seconds, computing multi-window correlation eigenvalue decompositions across 50 crypto assets. Those scans are written as Apache Arrow IPC files to a Windows SMB share, then bridged to Hazelcast by scan_bridge_service.py running on Linux.

The live trading daemon (nautilus_event_trader.py) listens to Hazelcast for new scans and must route them through the REAL NDAlphaEngine trading core to decide whether to enter/exit positions. The current file is a stub — it uses a placeholder signal field that doesn't exist in the scan data, allows LONG direction (the system is SHORT-only), and never touches NDAlphaEngine.


2. Data Schema You Will Receive

Every scan in DOLPHIN_FEATURES["latest_eigen_scan"] is a JSON dict with these fields (confirmed from live Arrow scans, schema_version=5.0.0):

{
    # Eigenvalue / velocity fields
    "scan_number":           int,       # monotonically increasing (resets on NG5 restart)
    "timestamp_iso":         str,       # "2026-03-25T14:27:25.143712" (Windows local time)
    "timestamp_ns":          int,       # nanoseconds epoch
    "schema_version":        str,       # "5.0.0"

    "w50_lambda_max":        float,     # dominant eigenvalue, 50-bar window
    "w50_velocity":          float,     # dλ/dt, 50-bar window  ← v50_vel arg to step_bar
    "w50_rotation":          float,
    "w50_instability":       float,

    "w150_lambda_max":       float,
    "w150_velocity":         float,
    "w150_rotation":         float,
    "w150_instability":      float,

    "w300_lambda_max":       float,
    "w300_velocity":         float,
    "w300_rotation":         float,
    "w300_instability":      float,

    "w750_lambda_max":       float,
    "w750_velocity":         float,     # ← v750_vel arg to step_bar
    "w750_rotation":         float,
    "w750_instability":      float,

    "vel_div":               float,     # = w50_velocity - w750_velocity
                                        # THIS IS THE PRIMARY ENTRY GATE
                                        # Entry threshold: < -0.02
                                        # Extreme threshold: < -0.05

    "regime_signal":         int,       # -1 (short bias), 0 (neutral), +1 (long bias)
    "instability_composite": float,

    # Asset data (JSON strings in Arrow, already parsed to Python by scan_bridge)
    "assets":                list,      # list of 50 asset names e.g. ["BTCUSDT", ...]
    "asset_prices":          list,      # list of 50 current prices (same order as assets)
    "asset_loadings":        list,      # eigenvector loadings per asset

    "data_quality_score":    float,     # 1.0 = all good
    "missing_asset_count":   int,       # 0 = all assets present

    # Added by scan_bridge
    "bridge_ts":             str,       # UTC ISO timestamp when bridged
    "file_mtime":            float,     # file modification time
}

Critical: The field scan.get('signal', 0) used in the current stub does NOT exist in real scan data. The real signal is scan['vel_div'].
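A minimal sketch of a scan sanity check that could run before anything reaches the engine. The helper name `parse_scan`, the `REQUIRED_FIELDS` tuple, and the tolerance are illustrative assumptions, not part of the existing codebase; the field names and the `vel_div = w50_velocity - w750_velocity` relation come from the schema above.

```python
import json
import math

# Fields this sketch treats as mandatory (names taken from the schema above).
REQUIRED_FIELDS = ("scan_number", "vel_div", "w50_velocity", "w750_velocity",
                   "assets", "asset_prices")


def parse_scan(raw, tol=1e-6):
    """Decode a scan payload and return (scan, problems).

    `tol` is an assumed tolerance for the vel_div consistency check.
    """
    scan = json.loads(raw) if isinstance(raw, (str, bytes)) else dict(raw)
    problems = [f"missing field: {k}" for k in REQUIRED_FIELDS if k not in scan]
    if problems:
        return scan, problems
    # All three velocity-derived fields must be finite numbers.
    for k in ("vel_div", "w50_velocity", "w750_velocity"):
        if not isinstance(scan[k], (int, float)) or not math.isfinite(scan[k]):
            problems.append(f"non-finite {k}")
    # vel_div is documented as w50_velocity - w750_velocity.
    if not problems:
        expected = scan["w50_velocity"] - scan["w750_velocity"]
        if not math.isclose(scan["vel_div"], expected, abs_tol=tol):
            problems.append("vel_div != w50_velocity - w750_velocity")
    return scan, problems
```

A payload shaped like the old stub's input (only a `signal` key) would come back with a list of missing fields, which makes the schema mismatch visible in the log instead of silently defaulting to 0.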


3. NDAlphaEngine — How to Use It

3a. Engine Construction (do this ONCE at startup, not per scan)

import sys
sys.path.insert(0, '/mnt/dolphinng5_predict')
sys.path.insert(0, '/mnt/dolphinng5_predict/nautilus_dolphin')

from nautilus_dolphin.nautilus.proxy_boost_engine import create_d_liq_engine
from nautilus_dolphin.nautilus.adaptive_circuit_breaker import AdaptiveCircuitBreaker
from nautilus_dolphin.nautilus.ob_features import OBFeatureEngine
from nautilus_dolphin.nautilus.ob_provider import MockOBProvider

# Champion engine config — FROZEN, do not change these values
ENGINE_KWARGS = dict(
    initial_capital=25000.0,        # starting paper capital
    vel_div_threshold=-0.02,        # entry gate
    vel_div_extreme=-0.05,          # extreme regime
    min_leverage=0.5, max_leverage=5.0, leverage_convexity=3.0,
    fraction=0.20, fixed_tp_pct=0.0095, stop_pct=1.0, max_hold_bars=120,
    use_direction_confirm=True, dc_lookback_bars=7, dc_min_magnitude_bps=0.75,
    dc_skip_contradicts=True, dc_leverage_boost=1.0, dc_leverage_reduce=0.5,
    use_asset_selection=True, min_irp_alignment=0.45,
    use_sp_fees=True, use_sp_slippage=True,
    sp_maker_entry_rate=0.62, sp_maker_exit_rate=0.50,
    use_ob_edge=True, ob_edge_bps=5.0, ob_confirm_rate=0.40,
    lookback=100, use_alpha_layers=True, use_dynamic_leverage=True, seed=42,
)

eng = create_d_liq_engine(**ENGINE_KWARGS)
# eng is a LiquidationGuardEngine (subclass of NDAlphaEngine)
# eng.base_max_leverage = 8.0, eng.abs_max_leverage = 9.0  (D_LIQ gold spec)

3b. OBF Setup (mock for prototype — real OBF can be wired later)

# Mock OB provider with gold-spec asset biases
# ASSETS_50 must hold the 50 asset names before MockOBProvider is built;
# take them from the first scan's scan['assets'] list.
ASSETS_50 = []

mock_ob = MockOBProvider(
    imbalance_bias=-0.09, depth_scale=1.0, assets=ASSETS_50,
    imbalance_biases={
        'BTCUSDT': -0.086, 'ETHUSDT': -0.092,
        'BNBUSDT': +0.05,  'SOLUSDT': +0.05,
    },
)
ob_eng = OBFeatureEngine(mock_ob)
ob_eng.preload_date('mock', ASSETS_50)
eng.set_ob_engine(ob_eng)

3c. ACBv6 Setup (optional but important for dynamic leverage)

import logging
from pathlib import Path

logger = logging.getLogger(__name__)

# ACB uses the eigenvalues dir on SMB
EIGEN_DIR = '/mnt/dolphinng6_data/eigenvalues'

acb = AdaptiveCircuitBreaker()
try:
    # Listing the directory sits inside the try so an unmounted SMB share
    # degrades to "no ACB" instead of crashing startup.
    date_strings = sorted(d.name for d in Path(EIGEN_DIR).iterdir() if d.is_dir())
    acb.preload_w750(date_strings)
    eng.set_acb(acb)
    logger.info("ACBv6 loaded")
except Exception as e:
    logger.warning(f"ACB preload failed: {e}; running without")

3d. MC Forewarner Setup

MC_MODELS_DIR = '/mnt/dolphinng5_predict/nautilus_dolphin/mc_results/models'
MC_BASE_CFG = {
    'trial_id': 0, 'vel_div_threshold': -0.020, 'vel_div_extreme': -0.050,
    'use_direction_confirm': True, 'dc_lookback_bars': 7,
    'dc_min_magnitude_bps': 0.75, 'dc_skip_contradicts': True,
    'dc_leverage_boost': 1.00, 'dc_leverage_reduce': 0.50,
    'vd_trend_lookback': 10, 'min_leverage': 0.50, 'max_leverage': 5.00,
    'leverage_convexity': 3.00, 'fraction': 0.20, 'use_alpha_layers': True,
    'use_dynamic_leverage': True, 'fixed_tp_pct': 0.0095, 'stop_pct': 1.00,
    'max_hold_bars': 120, 'use_sp_fees': True, 'use_sp_slippage': True,
    'sp_maker_entry_rate': 0.62, 'sp_maker_exit_rate': 0.50,
    'use_ob_edge': True, 'ob_edge_bps': 5.00, 'ob_confirm_rate': 0.40,
    'ob_imbalance_bias': -0.09, 'ob_depth_scale': 1.00,
    'use_asset_selection': True, 'min_irp_alignment': 0.45, 'lookback': 100,
    'acb_beta_high': 0.80, 'acb_beta_low': 0.20, 'acb_w750_threshold_pct': 60,
}

from pathlib import Path
if Path(MC_MODELS_DIR).exists():
    from mc.mc_ml import DolphinForewarner
    forewarner = DolphinForewarner(models_dir=MC_MODELS_DIR)
    eng.set_mc_forewarner(forewarner, MC_BASE_CFG)

3e. begin_day() — Must be called at start of each trading day

The engine must be initialised for the current date before any step_bar() calls. In live mode, call this once per UTC calendar day:

from datetime import datetime, timezone

today = datetime.now(timezone.utc).strftime('%Y-%m-%d')
eng.begin_day(today, posture='APEX')
# posture can come from DOLPHIN_SAFETY HZ map key 'posture'

3f. step_bar() — Called on every scan

This is the heart of the rewrite. For each incoming scan:

result = eng.step_bar(
    bar_idx=bar_counter,          # increment by 1 per scan
    vel_div=scan['vel_div'],       # PRIMARY SIGNAL — float
    prices=prices_dict,            # dict: {"BTCUSDT": 84000.0, "ETHUSDT": 2100.0, ...}
    vol_regime_ok=vol_ok,          # bool — see §4 for how to compute
    v50_vel=scan['w50_velocity'],  # float
    v750_vel=scan['w750_velocity'] # float
)
# result['entry'] is not None → a new trade was opened
# result['exit']  is not None → an open trade was closed

Building prices_dict from scan:

prices_dict = dict(zip(scan['assets'], scan['asset_prices']))
# e.g. {"BTCUSDT": 84230.5, "ETHUSDT": 2143.2, ...}
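The one-line `zip` silently truncates if the two lists differ in length. A slightly more defensive variant is sketched below; `build_prices` is a hypothetical helper, and dropping missing or non-positive prices (rather than rejecting the whole scan) is an assumption about what the engine tolerates.

```python
def build_prices(scan):
    """Build {asset: price} from a scan, refusing mismatched lists."""
    assets = scan.get("assets") or []
    prices = scan.get("asset_prices") or []
    if len(assets) != len(prices):
        raise ValueError(
            f"assets ({len(assets)}) and asset_prices ({len(prices)}) differ in length")
    # Assumption: entries with a missing or non-positive price are dropped.
    return {a: float(p) for a, p in zip(assets, prices)
            if p is not None and float(p) > 0.0}
```

Raising on a length mismatch makes the caller decide whether to skip the scan; whether skipping or failing open is preferable depends on how the engine treats gaps.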

4. vol_regime_ok — How to Compute in Live Mode

In backtesting, vol_ok is computed from a rolling 50-bar std of BTC returns vs a static threshold calibrated from the first 2 parquet files (vol_p60 ≈ 0.00026414).

In live mode, maintain a rolling buffer of BTC prices and compute it per scan:

from collections import deque
import numpy as np

BTC_VOL_WINDOW = 50
VOL_P60_THRESHOLD = 0.00026414  # gold calibration constant — do not change

btc_prices = deque(maxlen=BTC_VOL_WINDOW + 2)

def compute_vol_ok(scan: dict) -> bool:
    """Return True if current BTC vol regime exceeds gold threshold."""
    prices = dict(zip(scan.get('assets', []), scan.get('asset_prices', [])))
    btc_price = prices.get('BTCUSDT')
    if btc_price is None:
        return True  # fail open (don't gate on missing data)
    btc_prices.append(btc_price)
    if len(btc_prices) < BTC_VOL_WINDOW:
        return True  # not enough history yet — fail open
    arr = np.array(btc_prices)
    dvol = float(np.std(np.diff(arr) / arr[:-1]))
    return dvol > VOL_P60_THRESHOLD
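The same logic can be wrapped in a small class so that the price buffer lives on an instance instead of at module level, which fits the `_compute_vol_ok(self, scan)` method planned for the rewrite. This is a sketch: the class name `RollingVolGate` is hypothetical, and it mirrors the function above rather than the backtest code, which has not been checked here.

```python
from collections import deque

import numpy as np


class RollingVolGate:
    """Rolling BTC volatility gate mirroring compute_vol_ok() above."""

    def __init__(self, threshold=0.00026414, window=50):
        self.threshold = threshold
        self.window = window
        self._prices = deque(maxlen=window + 2)

    def update(self, btc_price):
        """Feed one BTC price; return True if the vol regime is OK to trade."""
        if btc_price is None:
            return True  # fail open on missing data
        self._prices.append(float(btc_price))
        if len(self._prices) < self.window:
            return True  # fail open until the buffer has filled
        arr = np.asarray(self._prices, dtype=float)
        returns = np.diff(arr) / arr[:-1]
        return float(np.std(returns)) > self.threshold
```

With a perfectly flat price series the standard deviation is zero, so the gate closes as soon as the buffer fills; that is a quick way to confirm the warm-up and threshold behaviour in a unit test.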

5. Day Rollover Handling

The engine must call begin_day() exactly once per UTC day. Track the current date and call it when the date changes:

import logging
from datetime import datetime, timezone

logger = logging.getLogger(__name__)
current_day = None  # last UTC date passed to begin_day()

def maybe_rollover_day(eng, posture='APEX'):
    global current_day
    today = datetime.now(timezone.utc).strftime('%Y-%m-%d')
    if today != current_day:
        eng.begin_day(today, posture=posture)
        current_day = today
        logger.info(f"begin_day({today}) called")
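For testing, the rollover logic is easier to exercise when the clock is injectable and the state is not a module global. The sketch below assumes only the `begin_day(date_str, posture=...)` call shown above; `DayRoller` and its `clock` parameter are hypothetical names.

```python
from datetime import datetime, timezone


class DayRoller:
    """Calls engine.begin_day() once per UTC calendar day."""

    def __init__(self, engine, clock=None):
        self.engine = engine
        # `clock` returns an aware UTC datetime; injectable for tests.
        self.clock = clock or (lambda: datetime.now(timezone.utc))
        self.current_day = None

    def maybe_rollover(self, posture="APEX"):
        """Return True if begin_day() was called on this invocation."""
        today = self.clock().strftime("%Y-%m-%d")
        if today == self.current_day:
            return False
        self.engine.begin_day(today, posture=posture)
        self.current_day = today
        return True
```

A test can then step a fake clock across midnight and assert that `begin_day()` fired exactly twice, without waiting for a real day boundary.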

6. Hazelcast Keys Reference

All keys are in the DOLPHIN_FEATURES map unless noted.

| Key | Map | Content |
|---|---|---|
| latest_eigen_scan | DOLPHIN_FEATURES | Latest scan dict (see §2) |
| exf_latest | DOLPHIN_FEATURES | External factors: funding rates, OI, etc. |
| obf_latest | DOLPHIN_FEATURES | OBF consolidated features (may be empty if OBF daemon down) |
| posture | DOLPHIN_SAFETY | String: APEX / CAUTION / TURTLE / HIBERNATE |
| latest_trade | DOLPHIN_PNL_BLUE | Last trade record written by trader |

Important: Do not rely on event.client inside the Hazelcast entry listener callback; it is not reliably available there. Instead, create ONE persistent hz_client at startup and reuse it throughout. Pass the map reference into the callback via a closure or a class attribute.
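One way to hold a map reference as a class attribute while also caching the posture read is sketched below. `PostureCache` is a hypothetical helper; it only assumes the object passed in has a `get(key)` method, as the blocking map proxy from `hz_client.get_map("DOLPHIN_SAFETY").blocking()` does. The `APEX` fallback mirrors the default used in the rollover helper and is an assumption, not a safety recommendation.

```python
import time


class PostureCache:
    """Caches the 'posture' key of a map-like object for ttl_s seconds."""

    def __init__(self, safety_map, ttl_s=60.0, default="APEX", now=time.monotonic):
        self._map = safety_map      # obtained once at startup, reused forever
        self._ttl = ttl_s
        self._default = default
        self._now = now             # injectable clock for tests
        self._value = None
        self._read_at = 0.0

    def get(self):
        t = self._now()
        if self._value is None or t - self._read_at >= self._ttl:
            try:
                raw = self._map.get("posture")
            except Exception:
                raw = None          # keep the previous value on read failure
            if raw:
                self._value = str(raw)
            elif self._value is None:
                self._value = self._default
            self._read_at = t
        return self._value
```

The scan callback would then call `posture_cache.get()` on every scan and hit Hazelcast at most once per TTL window.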


7. What the Rewritten File Must Do

Replace the entire compute_signal() and execute_trade() functions. The new architecture is:

Startup:
  1. Create NDAlphaEngine (create_d_liq_engine)
  2. Wire OBF (MockOBProvider)
  3. Wire ACBv6 (preload from eigenvalues dir)
  4. Wire MC Forewarner
  5. call begin_day() for today
  6. Connect Hz client (single persistent connection)
  7. Register entry listener on DOLPHIN_FEATURES['latest_eigen_scan']

Per scan (on_scan_update callback):
  1. Deserialise scan JSON
  2. Deduplicate by file_mtime (skip if <= last seen file_mtime); do not rely on scan_number alone, it resets on NG5 restart (see Test D)
  3. Call maybe_rollover_day() — handles midnight seamlessly
  4. Build prices_dict from scan['assets'] + scan['asset_prices']
  5. compute vol_ok via rolling BTC vol buffer
  6. Read posture from Hz DOLPHIN_SAFETY (cached, refresh every ~60s)
  7. Call eng.step_bar(bar_idx, vel_div, prices_dict, vol_ok, v50_vel, v750_vel)
  8. Inspect result:
       - result['entry'] is not None → log trade entry to DOLPHIN_PNL_BLUE
       - result['exit']  is not None → log trade exit + PnL to DOLPHIN_PNL_BLUE
  9. Push engine state snapshot to Hz:
       DOLPHIN_STATE_BLUE['engine_snapshot'] = {
           capital, open_positions, last_scan, vel_div, vol_ok, posture, ...
       }
  10. Log summary line to stdout + TRADE_LOG

Shutdown (SIGTERM / SIGINT):
  - Call eng.end_day() to get daily summary
  - Push final state to Hz
  - Disconnect Hz client cleanly
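The core of the per-scan path (deduplicate, build prices, call `step_bar()` under a lock, advance the bar counter) might be sketched as follows. `ScanPipeline` is a hypothetical class; it deduplicates on `file_mtime` as the Test D note recommends, treats a missing `file_mtime` as 0.0 (an assumption), and leaves posture, vol_ok computation, and Hz writes to the caller.

```python
import json
import threading


class ScanPipeline:
    """Serialises scans into engine.step_bar() calls."""

    def __init__(self, engine):
        self.engine = engine
        self._lock = threading.Lock()       # the engine is not thread-safe
        self._last_mtime = float("-inf")
        self.bar_idx = 0                    # independent of scan_number

    def on_scan(self, raw, vol_ok=True):
        """Return the step_bar() result, or None for a duplicate/stale scan."""
        scan = json.loads(raw) if isinstance(raw, (str, bytes)) else raw
        mtime = float(scan.get("file_mtime", 0.0))
        with self._lock:
            if mtime <= self._last_mtime:
                return None
            self._last_mtime = mtime
            prices = dict(zip(scan["assets"], scan["asset_prices"]))
            result = self.engine.step_bar(
                bar_idx=self.bar_idx,
                vel_div=scan["vel_div"],
                prices=prices,
                vol_regime_ok=vol_ok,
                v50_vel=scan["w50_velocity"],
                v750_vel=scan["w750_velocity"],
            )
            self.bar_idx += 1
        return result
```

Holding the lock across both the dedup check and the `step_bar()` call means two listener threads delivering the same scan cannot both pass the check.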

8. Critical Invariants — Do NOT Violate

  1. SHORT-ONLY system. eng.regime_direction is always -1. Never pass direction=1 to begin_day(). Never allow LONG trades.

  2. No set_esoteric_hazard_multiplier() call. This is the gold path — calling it would reduce base_max_leverage from 8.0 to 6.0 (incorrect). Leave it uncalled.

  3. Never call eng.process_day(). That function is for batch backtesting (reads a full parquet). In live mode, use begin_day() + step_bar() per scan.

  4. bar_idx must be a simple incrementing integer (0, 1, 2, ...) reset to 0 at each begin_day() call, or kept global across days — either works. Do NOT use scan_number as bar_idx (scan_number resets on NG5 restart).

  5. Thread safety: The Hz listener fires in a background thread. The engine is NOT thread-safe. Use a threading.Lock() around all eng.step_bar() calls.

  6. Keep the Hz client persistent. Creating a new HazelcastClient per scan is slow and leaks connections. One client at startup, reused throughout.


9. File Structure for the Rewrite

nautilus_event_trader.py
├── Imports + sys.path setup
├── Constants (HZ keys, paths, ENGINE_KWARGS, MC_BASE_CFG, VOL_P60_THRESHOLD)
├── class DolphinLiveTrader:
│   ├── __init__(self)            → creates engine, wires OBF/ACB/MC, inits state
│   ├── _build_engine(self)       → create_d_liq_engine + wire sub-systems
│   ├── _connect_hz(self)         → single persistent HazelcastClient
│   ├── _read_posture(self)       → cached read from DOLPHIN_SAFETY
│   ├── _rollover_day(self)       → call eng.begin_day() when date changes
│   ├── _compute_vol_ok(self, scan) → rolling BTC vol vs VOL_P60_THRESHOLD
│   ├── on_scan(self, event)      → main callback (deduplicate, step_bar, log)
│   ├── _log_trade(self, result)  → push to DOLPHIN_PNL_BLUE
│   ├── _push_state(self)         → push engine snapshot to DOLPHIN_STATE_BLUE
│   └── run(self)                 → register Hz listener, keep-alive loop
└── main()                        → instantiate DolphinLiveTrader, call run()
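The keep-alive loop in `run()` and the SIGTERM / SIGINT shutdown path could be built around a `threading.Event`, as in this sketch. `KeepAlive` and its `on_shutdown` callback are hypothetical; in the real trader the callback would be the place to call `eng.end_day()`, push the final state, and shut down the Hz client.

```python
import signal
import threading


class KeepAlive:
    """Blocks the main thread until SIGTERM/SIGINT, then runs a cleanup hook."""

    def __init__(self, on_shutdown):
        self._stop = threading.Event()
        self._on_shutdown = on_shutdown

    def install(self):
        # Must be called from the main thread (a signal-module requirement).
        for sig in (signal.SIGTERM, signal.SIGINT):
            signal.signal(sig, self._handle)

    def _handle(self, signum, frame):
        # Only set a flag here; the real cleanup runs in run().
        self._stop.set()

    def run(self, poll_s=1.0):
        try:
            while not self._stop.wait(poll_s):
                pass  # listener threads do the work; main thread just idles
        finally:
            self._on_shutdown()
```

Keeping the signal handler down to setting a flag avoids running engine or network code inside the handler itself.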

10. Testing Instructions

Test A — Dry-run against live Hz data (no trades)

# First, check live data is flowing:
/home/dolphin/siloqy_env/bin/python3 - << 'EOF'
import hazelcast, json
hz = hazelcast.HazelcastClient(cluster_name="dolphin", cluster_members=["127.0.0.1:5701"])
m = hz.get_map("DOLPHIN_FEATURES").blocking()
s = json.loads(m.get("latest_eigen_scan"))
print(f"scan_number={s['scan_number']}  vel_div={s['vel_div']:.4f}  assets={len(s['assets'])}")
hz.shutdown()
EOF

# Run the trader in dry-run mode (the rewritten trader must read the DRY_RUN environment variable and skip Hz writes when it is true):
DRY_RUN=true /home/dolphin/siloqy_env/bin/python3 /mnt/dolphinng5_predict/prod/nautilus_event_trader.py

Expected output per scan:

[2026-03-25T14:32:00+00:00] Scan #52  vel_div=-0.0205  vol_ok=True  posture=APEX
[2026-03-25T14:32:00+00:00]   step_bar → entry=None  exit=None  capital=$25000.00

When vel_div is below -0.02, vol_ok=True, and the engine's other entry filters (e.g. direction confirmation) also pass:

[2026-03-25T14:32:10+00:00] Scan #55  vel_div=-0.0312  vol_ok=True  posture=APEX
[2026-03-25T14:32:10+00:00]   step_bar → ENTRY SHORT BTCUSDT @ 84230.5  leverage=3.2x
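The `DRY_RUN=true` variable used in Test A has to be read somewhere in the rewritten trader. A minimal sketch, assuming the accepted truthy spellings below (the source only shows `true`); `dry_run_enabled` is a hypothetical helper name.

```python
import os


def dry_run_enabled(env=None):
    """Return True if DRY_RUN is set to a truthy value in the environment."""
    env = os.environ if env is None else env
    # Assumed truthy spellings; anything else (or unset) means live writes.
    return env.get("DRY_RUN", "").strip().lower() in {"1", "true", "yes"}
```

The write helpers (`_log_trade`, `_push_state`) could check this flag once at startup and return early when it is set.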

Test B — Verify engine state after 10 scans

/home/dolphin/siloqy_env/bin/python3 - << 'EOF'
import hazelcast, json
hz = hazelcast.HazelcastClient(cluster_name="dolphin", cluster_members=["127.0.0.1:5701"])
snap = hz.get_map("DOLPHIN_STATE_BLUE").blocking().get("engine_snapshot")
if snap:
    s = json.loads(snap)
    print(f"capital={s.get('capital')}  open_pos={s.get('open_positions')}  scans={s.get('scan_count')}")
else:
    print("No snapshot yet")
hz.shutdown()
EOF

Test C — Verify SHORT-only invariant

After running for a few minutes, check the trade log:

grep "direction" /tmp/nautilus_event_trader.log | grep -v SHORT
# Should return ZERO lines. Any LONG trade is a bug.

Test D — Simulate NG5 restart (scan_number reset)

NG5 restarts produce a spike vel_div = -18.92 followed by scan_number resetting to a low value. The deduplication logic must handle this:

# The dedup check must use mtime (file_mtime) NOT scan_number alone,
# because scan_number resets. Use the bridge_ts or file_mtime as the
# true monotonic ordering. Refer to scan_bridge_service.py's handler.last_mtime
# for the same pattern.

After a restart, the first scan's vel_div will be a large negative spike (-18.92 seen in historical data). The engine should see this as a potential entry signal — this is acceptable behaviour for a prototype. A production fix would add a restart-detection filter, but that is OUT OF SCOPE for this prototype.
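The restart scenario can be reproduced offline with a small deduplicator that orders scans by `file_mtime` and merely counts backward jumps in `scan_number`. This is a sketch under stated assumptions: `ScanDeduper` is a hypothetical class, scans without `file_mtime` are rejected, and no filtering of the post-restart spike is attempted (that stays out of scope).

```python
class ScanDeduper:
    """Accepts a scan only if its file_mtime is newer than the last accepted one."""

    def __init__(self):
        self.last_mtime = None
        self.last_scan_number = None
        self.restarts = 0  # times scan_number went backwards while mtime advanced

    def accept(self, scan):
        mtime = scan.get("file_mtime")
        if mtime is None:
            return False  # assumption: no ordering key means the scan is unusable
        if self.last_mtime is not None and mtime <= self.last_mtime:
            return False  # duplicate or out-of-order delivery
        num = scan.get("scan_number")
        if (num is not None and self.last_scan_number is not None
                and num < self.last_scan_number):
            self.restarts += 1  # looks like an NG5 restart; still accepted
        self.last_mtime = mtime
        self.last_scan_number = num
        return True
```

Feeding it scan numbers 100, 101, 101 (same mtime), then 1 with a newer mtime should accept the first two, drop the duplicate, and accept the post-restart scan while incrementing `restarts`.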

Test E — systemd service restart

systemctl restart dolphin-nautilus-trader
sleep 5
systemctl status dolphin-nautilus-trader
journalctl -u dolphin-nautilus-trader --no-pager -n 20

The service unit is at /etc/systemd/system/dolphin-nautilus-trader.service. After rewriting the file, restart the service to pick up the change.


11. Out of Scope for This Prototype

  • Real Nautilus order submission (BTCUSD FX instrument mock is acceptable)
  • Live Binance fills or execution feedback
  • OBF live streaming (MockOBProvider is fine)
  • ExtF integration (ignore exf_latest for now — the engine works without it)
  • Position sizing beyond what NDAlphaEngine does internally

The goal of this prototype is: real vel_div → real NDAlphaEngine → real trade decisions logged to Hazelcast. The path from signal to engine must be correct.


12. Key File Locations

| File | Purpose |
|---|---|
| /mnt/dolphinng5_predict/prod/nautilus_event_trader.py | File to rewrite |
| /mnt/dolphinng5_predict/nautilus_dolphin/nautilus_dolphin/nautilus/esf_alpha_orchestrator.py | NDAlphaEngine source (step_bar line 241, begin_day line 793) |
| /mnt/dolphinng5_predict/nautilus_dolphin/nautilus_dolphin/nautilus/proxy_boost_engine.py | create_d_liq_engine factory (line 443) |
| /mnt/dolphinng5_predict/nautilus_dolphin/nautilus_dolphin/nautilus/adaptive_circuit_breaker.py | ACBv6 (preload_w750 line 336) |
| /mnt/dolphinng5_predict/nautilus_dolphin/nautilus_dolphin/nautilus/ob_features.py | OBFeatureEngine |
| /mnt/dolphinng5_predict/nautilus_dolphin/nautilus_dolphin/nautilus/ob_provider.py | MockOBProvider |
| /mnt/dolphinng5_predict/nautilus_dolphin/mc/mc_ml.py | DolphinForewarner |
| /mnt/dolphinng5_predict/prod/vbt_nautilus_56day_backtest.py | Reference implementation, same engine wiring pattern |
| /mnt/dolphinng6_data/arrow_scans/ | Live Arrow scan files from NG5 (SMB mount) |
| /mnt/dolphinng6_data/eigenvalues/ | Historical eigenvalue data for ACB preload |
| /etc/systemd/system/dolphin-nautilus-trader.service | systemd unit, restart after changes |