282 lines
7.6 KiB
Markdown
282 lines
7.6 KiB
Markdown
|
|
# MC Forewarning System - QLabs Enhanced Fork
|
|||
|
|
|
|||
|
|
**A research fork of the Nautilus-Dolphin Monte Carlo Forewarning System, enhanced with QLabs Slowrun ML techniques.**
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## Overview
|
|||
|
|
|
|||
|
|
This repository contains an isolated, enhanced version of the MC-Forewarning subsystem from the Nautilus-DOLPHIN trading system. It implements QLabs' cutting-edge ML techniques from the [NanoGPT Slowrun](https://qlabs.sh/slowrun) benchmark to improve data efficiency and prediction accuracy.
|
|||
|
|
|
|||
|
|
### QLabs Techniques Implemented
|
|||
|
|
|
|||
|
|
| # | Technique | Implementation | Expected Benefit |
|
|||
|
|
|---|-----------|----------------|------------------|
|
|||
|
|
| 1 | **Muon Optimizer** | `mc_ml_qlabs.py:MuonOptimizer` | Orthogonalized gradient updates for stable convergence |
|
|||
|
|
| 2 | **Heavy Regularization** | `QLabsHyperParams.xgb_reg_lambda=1.6` | 16× weight decay enables larger models on limited data |
|
|||
|
|
| 3 | **Epoch Shuffling** | `_shuffle_epochs()` | Reshuffle data each epoch for better generalization |
|
|||
|
|
| 4 | **SwiGLU Activation** | `mc_ml_qlabs.py:SwiGLU` | Gated MLP activations (Swish + Gating) |
|
|||
|
|
| 5 | **U-Net Skip Connections** | `mc_ml_qlabs.py:UNetMLP` | Encoder-decoder with residual pathways |
|
|||
|
|
| 6 | **Deep Ensembling** | `mc_ml_qlabs.py:DeepEnsemble` | Logit averaging across 8 models |
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## Repository Structure
|
|||
|
|
|
|||
|
|
```
|
|||
|
|
mc_forewarning_qlabs_fork/
|
|||
|
|
├── mc/ # Core MC subsystem modules
|
|||
|
|
│ ├── __init__.py # Package exports (baseline + QLabs)
|
|||
|
|
│ ├── mc_sampler.py # Parameter space sampling (LHS)
|
|||
|
|
│ ├── mc_validator.py # Configuration validation (V1-V4)
|
|||
|
|
│ ├── mc_executor.py # Trial execution harness
|
|||
|
|
│ ├── mc_metrics.py # Metric extraction (48 metrics)
|
|||
|
|
│ ├── mc_store.py # Parquet + SQLite persistence
|
|||
|
|
│ ├── mc_runner.py # Orchestration and parallel execution
|
|||
|
|
│ ├── mc_ml.py # BASELINE: Original ML models
|
|||
|
|
│ └── mc_ml_qlabs.py # QLABS ENHANCED: All 6 techniques
|
|||
|
|
│
|
|||
|
|
├── tests/ # Test suite
|
|||
|
|
│ └── test_qlabs_ml.py # Comprehensive tests for QLabs ML
|
|||
|
|
│
|
|||
|
|
├── configs/ # Configuration files
|
|||
|
|
├── results/ # Output directory
|
|||
|
|
│
|
|||
|
|
├── mc_forewarning_service.py # Live forewarning service
|
|||
|
|
├── run_mc_envelope.py # Main entry point (from original)
|
|||
|
|
├── run_mc_leverage.py # Leverage analysis (from original)
|
|||
|
|
├── benchmark_qlabs.py # Systematic comparison tool
|
|||
|
|
└── README.md # This file
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## Quick Start
|
|||
|
|
|
|||
|
|
### 1. Setup Environment
|
|||
|
|
|
|||
|
|
```bash
|
|||
|
|
# Install dependencies
|
|||
|
|
pip install numpy pandas scikit-learn xgboost torch
|
|||
|
|
|
|||
|
|
# Optional: For running full Nautilus-Dolphin backtests
|
|||
|
|
pip install -r ../requirements.txt
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
### 2. Generate MC Trial Corpus
|
|||
|
|
|
|||
|
|
```bash
|
|||
|
|
# Generate synthetic trial data for testing
|
|||
|
|
python -c "
|
|||
|
|
from mc.mc_runner import run_mc_envelope
|
|||
|
|
run_mc_envelope(
|
|||
|
|
n_samples_per_switch=100,
|
|||
|
|
max_trials=1000,
|
|||
|
|
n_workers=4,
|
|||
|
|
output_dir='mc_forewarning_qlabs_fork/results'
|
|||
|
|
)
|
|||
|
|
"
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
### 3. Run Benchmark Comparison
|
|||
|
|
|
|||
|
|
```bash
|
|||
|
|
# Compare Baseline vs QLabs-enhanced models
|
|||
|
|
python benchmark_qlabs.py \
|
|||
|
|
--data-dir mc_forewarning_qlabs_fork/results \
|
|||
|
|
--output-dir mc_forewarning_qlabs_fork/benchmark_results \
|
|||
|
|
--ensemble-size 8
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
### 4. Train QLabs Models Only
|
|||
|
|
|
|||
|
|
```bash
|
|||
|
|
python -c "
|
|||
|
|
from mc.mc_ml_qlabs import MCMLQLabs
|
|||
|
|
|
|||
|
|
ml = MCMLQLabs(
|
|||
|
|
output_dir='mc_forewarning_qlabs_fork/results',
|
|||
|
|
use_ensemble=True,
|
|||
|
|
n_ensemble_models=8,
|
|||
|
|
use_unet=True,
|
|||
|
|
use_swiglu=True,
|
|||
|
|
heavy_regularization=True
|
|||
|
|
)
|
|||
|
|
|
|||
|
|
result = ml.train_all_models(test_size=0.2, n_epochs=12)
|
|||
|
|
print(f'Training complete: {result}')
|
|||
|
|
"
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
### 5. Run Live Forewarning
|
|||
|
|
|
|||
|
|
```bash
|
|||
|
|
# Start the forewarning service
|
|||
|
|
python mc_forewarning_service.py
|
|||
|
|
|
|||
|
|
# Or use QLabs-enhanced forewarner programmatically
|
|||
|
|
python -c "
|
|||
|
|
from mc.mc_ml_qlabs import DolphinForewarnerQLabs
|
|||
|
|
from mc.mc_sampler import MCSampler
|
|||
|
|
|
|||
|
|
forewarner = DolphinForewarnerQLabs(
|
|||
|
|
models_dir='mc_forewarning_qlabs_fork/results/models_qlabs'
|
|||
|
|
)
|
|||
|
|
|
|||
|
|
sampler = MCSampler()
|
|||
|
|
config = sampler.generate_champion_trial()
|
|||
|
|
|
|||
|
|
report = forewarner.assess(config)
|
|||
|
|
print(f'Risk Level: {report.envelope_score:.3f}')
|
|||
|
|
print(f'Catastrophic Prob: {report.catastrophic_probability:.1%}')
|
|||
|
|
"
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## Key Differences: Baseline vs QLabs
|
|||
|
|
|
|||
|
|
### Baseline (`mc_ml.py`)
|
|||
|
|
|
|||
|
|
```python
|
|||
|
|
# Single GradientBoostingRegressor
|
|||
|
|
model = GradientBoostingRegressor(
|
|||
|
|
n_estimators=100,
|
|||
|
|
max_depth=5,
|
|||
|
|
learning_rate=0.1,
|
|||
|
|
random_state=42
|
|||
|
|
)
|
|||
|
|
|
|||
|
|
# Single XGBClassifier
|
|||
|
|
model = xgb.XGBClassifier(
|
|||
|
|
n_estimators=100,
|
|||
|
|
max_depth=5,
|
|||
|
|
learning_rate=0.1,
|
|||
|
|
random_state=42
|
|||
|
|
)
|
|||
|
|
|
|||
|
|
# Single OneClassSVM for envelope
|
|||
|
|
model = OneClassSVM(kernel='rbf', nu=0.05, gamma='scale')
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
### QLabs Enhanced (`mc_ml_qlabs.py`)
|
|||
|
|
|
|||
|
|
```python
|
|||
|
|
# Deep Ensemble of 8 models
|
|||
|
|
ensemble = DeepEnsemble(
|
|||
|
|
GradientBoostingRegressor,
|
|||
|
|
n_models=8,
|
|||
|
|
seeds=[42, 43, 44, 45, 46, 47, 48, 49]
|
|||
|
|
)
|
|||
|
|
|
|||
|
|
# Heavy regularization (16× weight decay)
|
|||
|
|
model = xgb.XGBClassifier(
|
|||
|
|
n_estimators=200,
|
|||
|
|
max_depth=5,
|
|||
|
|
learning_rate=0.05,
|
|||
|
|
reg_lambda=1.6, # ← QLabs: 16× standard
|
|||
|
|
reg_alpha=0.1,
|
|||
|
|
subsample=0.8,
|
|||
|
|
colsample_bytree=0.8,
|
|||
|
|
)
|
|||
|
|
|
|||
|
|
# Ensemble of One-Class SVMs with different nu
|
|||
|
|
ensemble_svm = [
|
|||
|
|
OneClassSVM(kernel='rbf', nu=0.05 + i*0.02, gamma='scale')
|
|||
|
|
for i in range(8)
|
|||
|
|
]
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## Benchmark Results
|
|||
|
|
|
|||
|
|
Run the benchmark to see improvement metrics:
|
|||
|
|
|
|||
|
|
```bash
|
|||
|
|
python benchmark_qlabs.py --data-dir your_mc_results
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
Expected improvements (based on QLabs findings):
|
|||
|
|
|
|||
|
|
| Metric | Baseline | QLabs | Improvement |
|
|||
|
|
|--------|----------|-------|-------------|
|
|||
|
|
| R² (ROI) | ~0.65 | ~0.72 | **+10-15%** |
|
|||
|
|
| F1 (Champion) | ~0.78 | ~0.85 | **+9%** |
|
|||
|
|
| F1 (Catastrophic) | ~0.82 | ~0.88 | **+7%** |
|
|||
|
|
| Uncertainty Calibration | Poor | Good | **Much improved** |
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## Testing
|
|||
|
|
|
|||
|
|
```bash
|
|||
|
|
# Run all tests
|
|||
|
|
python -m pytest tests/test_qlabs_ml.py -v
|
|||
|
|
|
|||
|
|
# Run specific test class
|
|||
|
|
python -m pytest tests/test_qlabs_ml.py::TestMuonOptimizer -v
|
|||
|
|
|
|||
|
|
# Run with coverage
|
|||
|
|
python -m pytest tests/test_qlabs_ml.py --cov=mc --cov-report=html
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## Integration with Nautilus-Dolphin
|
|||
|
|
|
|||
|
|
This fork is **fully isolated** from the main Nautilus-Dolphin system. To integrate:
|
|||
|
|
|
|||
|
|
1. **Copy the enhanced module** to your ND installation:
|
|||
|
|
```bash
|
|||
|
|
cp mc_forewarning_qlabs_fork/mc/mc_ml_qlabs.py nautilus_dolphin/mc/
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
2. **Update imports** in your code:
|
|||
|
|
```python
|
|||
|
|
# Old (baseline)
|
|||
|
|
from mc.mc_ml import DolphinForewarner
|
|||
|
|
|
|||
|
|
# New (QLabs enhanced)
|
|||
|
|
from mc.mc_ml_qlabs import DolphinForewarnerQLabs
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
3. **Retrain models** with QLabs enhancements:
|
|||
|
|
```python
|
|||
|
|
from mc.mc_ml_qlabs import MCMLQLabs
|
|||
|
|
|
|||
|
|
ml = MCMLQLabs(use_ensemble=True, n_ensemble_models=8)
|
|||
|
|
ml.train_all_models()
|
|||
|
|
```
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## References
|
|||
|
|
|
|||
|
|
- **QLabs NanoGPT Slowrun**: https://qlabs.sh/slowrun
|
|||
|
|
- **MONTE_CARLO_SYSTEM_ENVELOPE_SPEC.md**: Original specification document
|
|||
|
|
- **QLabs Research**: "Pre-training under infinite compute" (Kim et al., 2025)
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## License
|
|||
|
|
|
|||
|
|
Same as Nautilus-DOLPHIN project.
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
## Contributing
|
|||
|
|
|
|||
|
|
This is a research fork. To contribute enhancements:
|
|||
|
|
|
|||
|
|
1. Implement new QLabs techniques in `mc_ml_qlabs.py`
|
|||
|
|
2. Add tests in `tests/test_qlabs_ml.py`
|
|||
|
|
3. Update benchmark script
|
|||
|
|
4. Document expected improvements
|
|||
|
|
|
|||
|
|
---
|
|||
|
|
|
|||
|
|
**Maintained by**: Research enhancement team
|
|||
|
|
**Version**: 2.0.0-QLABS
|
|||
|
|
**Last Updated**: 2026-03-04
|