DOLPHIN/nautilus_dolphin/CURRENT_IMPLEMENTATION_ISSUES.txt
hjnormey 01c19662cb initial: import DOLPHIN baseline 2026-04-21 from dolphinng5_predict working tree
Includes core prod + GREEN/BLUE subsystems:
- prod/ (BLUE harness, configs, scripts, docs)
- nautilus_dolphin/ (GREEN Nautilus-native impl + dvae/ preserved)
- adaptive_exit/ (AEM engine + models/bucket_assignments.pkl)
- Observability/ (EsoF advisor, TUI, dashboards)
- external_factors/ (EsoF producer)
- mc_forewarning_qlabs_fork/ (MC regime/envelope)

Excludes runtime caches, logs, backups, and reproducible artifacts per .gitignore.
2026-04-21 16:58:38 +02:00


NOTE: ALL THE ISSUES BELOW ARE PUTATIVE. Any further work based on this audit must first PAINSTAKINGLY ascertain the validity of both the issue and the proposed fix(es).
---
# **COMPREHENSIVE TECHNICAL AUDIT**
## NautilusTrader DOLPHIN Implementation
---
## **EXECUTIVE SUMMARY**
| Component | Status | Critical Issues | Risk Level |
|-----------|--------|-----------------|------------|
| Signal Bridge | ⚠️ NEEDS WORK | 3 issues | MEDIUM |
| Strategy | ✅ SOLID | 1 issue | LOW |
| SmartExecAlgorithm | ❌ INCOMPLETE | 4 issues | HIGH |
| Volatility Detector | ✅ CORRECT | 0 issues | LOW |
| Circuit Breaker | ⚠️ NEEDS WORK | 2 issues | MEDIUM |
| Validation Framework | ⚠️ PLACEHOLDER | 2 issues | MEDIUM |
| Signal Generator | ⚠️ NEEDS WORK | 3 issues | MEDIUM |
---
## **DETAILED ANALYSIS BY COMPONENT**
### **1. SIGNAL BRIDGE ACTOR** (`signal_bridge.py`)
#### ✅ **What's Correct**
```python
# CORRECT: Nanosecond timestamp handling
def _parse_timestamp_ns(self, ts) -> int:
    if ts > 1e15:
        return int(ts)  # Already nanoseconds
    elif ts > 1e12:
        return int(ts * 1_000)  # milliseconds to ns
```
The timestamp parsing handles multiple formats correctly - this is critical for Nautilus compatibility.
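For reference, a standalone version of the same heuristic can be exercised directly. This sketch is an assumption-laden extension of the snippet: the microsecond and second branches are not in the audited code, and the cutoffs are tightened because a bare `1e15` threshold would classify present-day microsecond timestamps (~1.7e15) as nanoseconds — worth double-checking against the inputs the bridge actually receives.

```python
def parse_timestamp_ns(ts) -> int:
    """Normalize an epoch timestamp of unknown unit to nanoseconds.

    Sketch only: the microsecond and second branches are assumed
    completions, and the cutoffs are tightened relative to the
    audited snippet so microsecond stamps are not misread as ns.
    """
    if ts > 1e17:       # nanoseconds (~1.7e18 today)
        return int(ts)
    elif ts > 1e14:     # microseconds (~1.7e15 today)
        return int(ts * 1_000)
    elif ts > 1e11:     # milliseconds (~1.7e12 today)
        return int(ts * 1_000_000)
    else:               # seconds (~1.7e9 today)
        return int(ts * 1_000_000_000)
```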
```python
# CORRECT: Redis Streams (not pub/sub)
messages = await self._redis.xread(
    {self.stream_key: self._last_id},
    count=10,
    block=50,
)
```
Using Redis Streams with `xread` is the correct choice: unlike pub/sub, entries are durable and the consumer can resume from the last-seen entry ID after a restart. Note this gives at-least-once delivery, not exactly-once; exactly-once behavior would additionally require consumer groups plus idempotent signal handling.
#### ❌ **CRITICAL ISSUES**
**Issue #1: Missing `clock` attribute causes crash**
```python
# Line 244 - WILL CRASH in production
age_ns = self.clock.timestamp_ns() - signal_ts
```
**Problem**: The `Actor` base class doesn't automatically expose `self.clock`. In Nautilus, you access the clock via `self._clock` (protected) or use `self.clock` only after `on_start()` has been called.
**Fix**:
```python
import time

def _validate_signal(self, signal: dict) -> bool:
    # Use the correct clock access pattern
    current_ns = self._clock.timestamp_ns() if hasattr(self, '_clock') else time.time_ns()
```
**Issue #2: No reconnection logic for Redis**
```python
# Line 155-157 - Connection error doesn't reconnect
except redis.ConnectionError as e:
    self.log.error(f"Redis connection error: {e}")
    await asyncio.sleep(1)  # Backoff before retry
```
**Problem**: After a connection error, `self._redis` is in a broken state. You need to re-establish the connection.
**Fix**:
```python
except redis.ConnectionError as e:
    self.log.error(f"Redis connection error: {e}")
    await self._reconnect_redis()

async def _reconnect_redis(self):
    """Reconnect to Redis with exponential backoff."""
    for attempt in range(5):
        try:
            # redis.asyncio.from_url() builds the client synchronously;
            # ping() forces an actual connection attempt
            self._redis = redis.from_url(self.redis_url)
            await self._redis.ping()
            self.log.info(f"Redis reconnected after {attempt + 1} attempt(s)")
            return
        except Exception:
            await asyncio.sleep(min(2 ** attempt, 30))
    self.log.error("Failed to reconnect to Redis after 5 attempts")
```
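For reference, the backoff above yields delays of 1, 2, 4, 8, and 16 seconds across the five attempts; the 30-second cap only engages if the attempt count is raised:

```python
# Delays produced by min(2 ** attempt, 30) over five reconnect attempts
delays = [min(2 ** attempt, 30) for attempt in range(5)]
print(delays)  # [1, 2, 4, 8, 16]
```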
**Issue #3: `self.is_running` doesn't exist on Actor**
```python
# Line 141 - AttributeError in production
while self.is_running:
```
**Problem**: Nautilus `Actor` uses different lifecycle management. Check `self._state` or use a local flag.
**Fix**:
```python
def on_start(self):
    self._running = True
    # ...

async def _consume_stream(self):
    while self._running:  # Use local flag
        ...
```
---
### **2. EXECUTION STRATEGY** (`strategy.py`)
#### ✅ **What's Correct**
```python
# CORRECT: Grid 5F champion parameters
self.irp_alignment_min = config.get('irp_alignment_min', 0.45)
self.momentum_magnitude_min = config.get('momentum_magnitude_min', 0.000075)
self.leverage_convexity = config.get('leverage_convexity', 3.0)
self.tp_bps = config.get('tp_bps', 99)
```
All champion strategy parameters are correctly implemented.
```python
# CORRECT: Signal data extraction
def on_signal(self, signal):
    signal_data = signal.value if hasattr(signal, 'value') else signal
```
Properly handles Nautilus `Signal` object.
```python
# CORRECT: Using actual price from signal (for validation)
signal_price = signal_data.get('price')
if signal_price and signal_price > 0:
    price = float(signal_price)
    price_source = "signal"
```
This is **excellent** - using the actual price from the eigenvalue JSON for validation backtests.
#### ⚠️ **ISSUES**
**Issue #1: Missing `register_exec_algorithm` method**
```python
# Line 291-302 - This method doesn't exist in Nautilus 1.219.0
self.register_exec_algorithm(
    SmartExecAlgorithm,
    config={...},
    exec_algorithm_id="SMART_EXEC",
)
```
**Problem**: Nautilus doesn't have `register_exec_algorithm` on Strategy. You register exec algorithms at the `TradingNode` level, not the strategy level.
**Fix**: Move registration to the main node setup:
```python
# In main.py or node setup
node.add_exec_algorithm(SmartExecAlgorithm(config={...}))
```
**Issue #2: Nautilus 1.219.0 Logger bug workaround is fragile**
```python
# Line 264-274 - Workaround may not work in all cases
try:
    super().__init__(config)
except TypeError as e:
    # Workaround: Nautilus 1.219.0 Logger expects str but gets StrategyId
    class SimpleLogger:
        def info(self, msg): print(f"[INFO] {msg}")
```
This is a known Nautilus bug, but the workaround may cause issues with proper logging integration. Consider upgrading to Nautilus 1.220.0+ where this was fixed.
---
### **3. SMART EXEC ALGORITHM** (`smart_exec_algorithm.py`)
#### ❌ **CRITICAL ISSUES - THIS COMPONENT IS INCOMPLETE**
**Issue #1: `on_order` should be `on_submit`**
```python
# Line 239 - Wrong method name
def on_order(self, order):
```
**Problem**: In Nautilus `ExecAlgorithm`, the method is `on_submit(command: SubmitOrder)`, not `on_order(order)`. This will never be called.
**Fix**:
```python
def on_submit(self, command: SubmitOrder):
    order = command.order
    # ... rest of logic
```
**Issue #2: No actual limit order creation**
```python
# Lines 254-268 - This doesn't create any orders!
def _handle_entry(self, order, instrument, tags):
    limit_price = tags.get('limit_price')
    if limit_price:
        self._pending_entries[order.id] = {...}  # Just stores it
        self.log.info(f"Entry limit order submitted: {order.id}")  # Lies!
```
**Problem**: The code logs "limit order submitted" but never actually creates or submits a limit order. It just stores metadata.
**What it SHOULD do**:
```python
def _handle_entry(self, order, instrument, tags):
    quote = self.cache.quote_tick(instrument.id)
    bid = float(quote.bid)
    ask = float(quote.ask)
    spread = ask - bid

    # Calculate limit price (1% of the spread inside the touch)
    if order.side == OrderSide.SELL:
        limit_price = bid + (spread * 0.01)
    else:
        limit_price = ask - (spread * 0.01)

    # CREATE the limit order
    limit_order = self.order_factory.limit(
        instrument_id=instrument.id,
        order_side=order.side,
        quantity=order.quantity,
        price=Price(limit_price, precision=instrument.price_precision),
        time_in_force=TimeInForce.GTD,
        expire_time_ns=self.clock.timestamp_ns() + (25 * 1_000_000_000),
        post_only=True,
        tags={**tags, 'fill_type': 'maker'},
    )

    # SUBMIT it
    self.submit_order(limit_order)
```
**Issue #3: No fallback timer logic**
```python
# Lines 65-74 - Metrics tracked but no timers scheduled
self._metrics = {
    'entries_maker': 0,
    'entries_taker': 0,
    ...
}
```
**Problem**: There's no code to schedule the fallback from maker to taker after 25 seconds. The algorithm just tracks metrics but doesn't execute the maker→taker transition.
**Fix**:
```python
def _handle_entry(self, order, instrument, tags):
    # ... create and submit limit order ...

    # Schedule fallback timer
    self.clock.set_timer(
        name=f"entry_fallback_{limit_order.client_order_id}",
        interval_ns=25_000_000_000,  # 25 seconds
        callback=self._on_entry_fallback,
        callback_data={'original_order': order, 'limit_order_id': limit_order.client_order_id},
    )

def _on_entry_fallback(self, timer):
    data = timer.callback_data
    limit_order = self.cache.order(data['limit_order_id'])
    if limit_order and not limit_order.is_closed:
        # Cancel the resting limit order
        self.cancel_order(limit_order)
        # Submit a market order instead (taker fallback)
        original = data['original_order']
        market_order = self.order_factory.market(
            instrument_id=original.instrument_id,
            order_side=original.side,
            quantity=original.quantity,
            tags={'type': 'entry', 'fill_type': 'taker', 'fallback': True},
        )
        self.submit_order(market_order)
```
**Issue #4: Missing `order_factory` attribute**
The `ExecAlgorithm` doesn't have `self.order_factory`. You need to use:
```python
from nautilus_trader.model.orders import LimitOrder, MarketOrder
# Create orders directly
limit_order = LimitOrder(...)
```
---
### **4. VOLATILITY DETECTOR** (`volatility_detector.py`)
#### ✅ **CORRECT IMPLEMENTATION**
This is the most critical component (-18% PF impact if disabled). The implementation is correct:
```python
# CORRECT: Dual condition check
def is_high_regime(self) -> bool:
    vol_array = np.array(list(self._volatility_history))
    p50 = np.percentile(vol_array, 50)
    p75 = np.percentile(vol_array, 75)
    return (self._current_vol > p50) and (self._current_vol > p75)
```
Both conditions (elevated AND high percentile) are correctly enforced.
```python
# CORRECT: Annualization factor for 5-second bars
vol = np.std(list(self._returns)) * np.sqrt(252 * 12 * 720)
```
The annualization is correct: 252 trading days × 12 hours/day × 720 bars/hour.
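As a sanity check on that factor: 252 × 12 × 720 = 2,177,280 five-second bars per "year" under these assumptions, giving an annualization multiplier of roughly 1475.6:

```python
import math

bars_per_hour = 3600 // 5   # 5-second bars
hours_per_day = 12          # the detector's trading-day assumption
trading_days = 252

periods_per_year = trading_days * hours_per_day * bars_per_hour
annualization = math.sqrt(periods_per_year)
print(periods_per_year, round(annualization, 2))  # 2177280 1475.56
```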
**Minor Observation**: The permissive default (`return True` when insufficient data) is appropriate for production but could mask issues during paper trading. Consider logging when this happens.
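A minimal sketch of that logging suggestion, with an assumed `min_samples` warm-up threshold (the excerpt does not show the detector's actual one):

```python
import logging
import numpy as np

log = logging.getLogger("volatility_detector")

def is_high_regime(current_vol, history, min_samples=100) -> bool:
    """Dual-condition regime check with a *logged* permissive default."""
    if len(history) < min_samples:
        # Permissive default preserved, but now visible in the logs
        log.warning("volatility history %d/%d samples; defaulting to True",
                    len(history), min_samples)
        return True
    vol_array = np.asarray(history)
    p50 = np.percentile(vol_array, 50)
    p75 = np.percentile(vol_array, 75)
    return bool((current_vol > p50) and (current_vol > p75))
```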
---
### **5. CIRCUIT BREAKER** (`circuit_breaker.py`)
#### ⚠️ **ISSUES**
**Issue #1: `log_info` and `log_alert` are not overridden**
```python
# Lines 216-222 - These just print, not log
def log_info(self, message: str):
    print(f"[CircuitBreaker] {message}")
```
**Problem**: In production, these should integrate with Nautilus logging. The strategy passes a logger reference via `strategy` parameter but it's not used.
**Fix**:
```python
class CircuitBreakerManager:
    def __init__(self, ..., strategy=None):
        self.strategy = strategy

    def log_info(self, message: str):
        if self.strategy and hasattr(self.strategy, 'log'):
            self.strategy.log.info(message)
        else:
            print(f"[CircuitBreaker] {message}")
```
**Issue #2: Daily loss limit uses wrong calculation**
```python
# Lines 131-138 - Loss percentage inverted
loss_pct = (-self._day_pnl / self._starting_balance) * 100
if loss_pct >= self.daily_loss_limit_pct:
```
**Problem**: If `day_pnl` is negative (loss), `-day_pnl` is positive. But the formula should be:
```python
# CORRECT formula
pnl_pct = (self._day_pnl / self._starting_balance) * 100 # Negative for loss
if pnl_pct <= -self.daily_loss_limit_pct: # Check if below -10%
```
Or alternatively:
```python
current_balance = self._starting_balance + self._day_pnl
loss_pct = ((self._starting_balance - current_balance) / self._starting_balance) * 100
```
---
### **6. VALIDATION FRAMEWORK** (`backtest_runner.py`, `comparator.py`)
#### ⚠️ **PLACEHOLDER IMPLEMENTATION**
**Issue #1: Backtest is simulated, not actual**
```python
# Lines 168-241 - This is a placeholder!
def _simulate_backtest(self, data: Dict, config: Optional[Dict] = None) -> Dict[str, Any]:
    # Placeholder simulation - generates synthetic results
    # In real implementation, use Nautilus backtest runner
```
**Critical Gap**: This doesn't actually run a Nautilus backtest. It generates synthetic trades with `is_win = i % 2 == 0`.
**What's needed**:
```python
from nautilus_trader.backtest.node import BacktestNode
from nautilus_trader.backtest.config import BacktestConfig

def run_validation_period(self, ...):
    # NOTE: API sketch only -- verify the exact BacktestNode/config
    # interface for the pinned Nautilus version before relying on it
    node = BacktestNode()
    # Add strategy
    node.add_strategy(DolphinExecutionStrategy(config))
    # Add data
    for bar in data['bars']:
        node.add_data(bar)
    # Run
    result = node.run()
    return result
```
**Issue #2: Comparator doesn't validate signal timing**
The comparator checks trade counts and prices but doesn't validate that signals were generated at the correct timestamps. This could miss timing-related bugs.
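One low-effort extension would be to compare paired signals' timestamps against a tolerance; the function name and the 1-second tolerance below are illustrative, not from the codebase:

```python
def timing_mismatches(expected, actual, tol_ns=1_000_000_000):
    """Return indices where paired (ts_ns, side) signals disagree on side
    or drift apart by more than tol_ns. Illustrative sketch only."""
    mismatches = []
    for i, ((ts_e, side_e), (ts_a, side_a)) in enumerate(zip(expected, actual)):
        if side_e != side_a or abs(ts_a - ts_e) > tol_ns:
            mismatches.append(i)
    return mismatches

expected = [(1_000_000_000, "BUY"), (5_000_000_000, "SELL")]
replayed = [(1_200_000_000, "BUY"), (8_000_000_000, "SELL")]
print(timing_mismatches(expected, replayed))  # [1]: second signal drifted 3s
```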
---
### **7. SIGNAL GENERATOR** (`generator.py`)
#### ⚠️ **ISSUES**
**Issue #1: Per-asset vel_div not implemented**
```python
# Lines 222-243 - Uses SAME vel_div for ALL assets
def _extract_vel_div(self, scan_data: Dict) -> Dict[str, float]:
    windows = scan_data.get('windows', {})
    w50 = windows.get('50', {}).get('tracking_data', {})
    w150 = windows.get('150', {}).get('tracking_data', {})
    if w50 and w150:
        v50 = w50.get('lambda_max_velocity', 0)
        v150 = w150.get('lambda_max_velocity', 0)
        # BUG: Applies SAME vel_div to all assets
        for asset in prices.keys():
            vel_div[asset] = v50 - v150  # <-- ALL ASSETS GET THE SAME VALUE
```
**Problem**: The velocity divergence should be **per-asset**, not global. Looking at the JSON structure from the design doc, eigenvalue data should have per-asset velocities:
```json
{
  "assets": {
    "BTCUSDT": {
      "v50": 0.0234,
      "v150": 0.0468,
      "vel_div": -0.0234
    },
    "ETHUSDT": {
      "v50": 0.0156,
      "v150": 0.0312,
      "vel_div": -0.0156
    }
  }
}
```
**Fix**:
```python
def _extract_vel_div(self, scan_data: Dict) -> Dict[str, float]:
    vel_div = {}
    assets = scan_data.get('assets', {})
    for asset, data in assets.items():
        v50 = data.get('v50', 0)
        v150 = data.get('v150', 0)
        vel_div[asset] = v50 - v150
    return vel_div
```
**Issue #2: Strength calculation is arbitrary**
```python
# Line 170 - Arbitrary scaling
strength = min(abs(vel_div) * 20, 1.0) # Scale: -0.05 -> 1.0
```
The strength calculation should match the VBT backtest's actual strength calculation, not an arbitrary formula.
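For reference, the current mapping is linear in `|vel_div|` and saturates once `|vel_div| >= 0.05`; whether that slope and cap match the VBT calculation has to be checked against the VBT code itself:

```python
def strength(vel_div: float) -> float:
    # Current mapping from generator.py: linear, capped at 1.0
    return min(abs(vel_div) * 20, 1.0)

for v in (0.0, -0.01, -0.05, -0.10):
    print(v, strength(v))
# 0.0 and -0.01 map to 0.0 and 0.2; -0.05 and -0.10 both saturate at 1.0
```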
**Issue #3: Missing external factors integration**
The generator doesn't incorporate the external market factors (funding rates, DVOL, etc.) that were analyzed to improve circuit breaker effectiveness. The signal should include:
```python
signal = {
    ...,
    'funding_btc': external_data.get('funding_btc'),
    'dvol_btc': external_data.get('dvol_btc'),
    'fear_greed': external_data.get('fng'),
}
```
---
## **MISSING COMPONENTS**
Based on the design document, the following components are referenced but not provided:
| Component | Reference | Status |
|-----------|-----------|--------|
| `PositionManager` | strategy.py:53, 94-98 | ❌ NOT PROVIDED |
| `MetricsMonitor` | strategy.py:58, 113-116 | ❌ NOT PROVIDED |
| `ThresholdConfig` | strategy.py:114 | ❌ NOT PROVIDED |
| `AdaptiveCircuitBreaker` | strategy.py:55-57 | ❌ NOT PROVIDED |
| `ACBPositionSizer` | strategy.py:57, 109 | ❌ NOT PROVIDED |
| `SignalEnricher` | generator.py:10, 54 | ❌ NOT PROVIDED |
| `RedisSignalPublisher` | generator.py:11, 55 | ❌ NOT PROVIDED |
| `BacktestDataLoader` | backtest_runner.py:9, 64 | ❌ NOT PROVIDED |
---
## **RECOMMENDATIONS BY PRIORITY**
### **P0 - CRITICAL (Must fix before any testing)**
1. **Fix SmartExecAlgorithm** - It doesn't actually execute orders. Rewrite `on_submit`, `_handle_entry`, and add fallback timer logic.
2. **Fix per-asset vel_div extraction** - Currently all assets get the same signal, which defeats the purpose of asset selection.
3. **Add missing components** - PositionManager is imported but never provided. The strategy will crash on `self.position_manager.on_bar()`.
### **P1 - HIGH (Fix before paper trading)**
4. **Fix Signal Bridge clock access** - Will crash when checking signal freshness.
5. **Fix Circuit Breaker daily loss calculation** - Inverted formula could cause false positives or miss real losses.
6. **Implement actual Nautilus backtest** - Current validation is placeholder.
### **P2 - MEDIUM (Fix before production)**
7. **Add Redis reconnection logic** - After a connection error, the bridge keeps retrying on a broken client and never recovers.
8. **Integrate external factors** - The analysis showed 18% improvement in CB effectiveness, but this isn't used.
9. **Move exec algorithm registration to node level** - Current approach won't work.
---
## **FINAL VERDICT**
The implementation demonstrates **strong architectural thinking** and **correct parameter implementation** for the Grid 5F champion strategy. The volatility detector is correctly implemented, and the signal bridge's Redis Streams approach is sound.
However, **the SmartExecAlgorithm is non-functional** - it logs actions but doesn't execute them. This is a critical gap that makes paper trading impossible. Additionally, several imported components are missing, which will cause runtime crashes.
**Estimated work to production-ready**:
- SmartExecAlgorithm rewrite: 4-6 hours
- Missing components: 3-4 hours
- Integration testing: 2-3 hours
- **Total: 10-13 hours of focused development**