initial: import DOLPHIN baseline 2026-04-21 from dolphinng5_predict working tree
Includes core prod + GREEN/BLUE subsystems:
- prod/ (BLUE harness, configs, scripts, docs)
- nautilus_dolphin/ (GREEN Nautilus-native impl + dvae/ preserved)
- adaptive_exit/ (AEM engine + models/bucket_assignments.pkl)
- Observability/ (EsoF advisor, TUI, dashboards)
- external_factors/ (EsoF producer)
- mc_forewarning_qlabs_fork/ (MC regime/envelope)

Excludes runtime caches, logs, backups, and reproducible artifacts per .gitignore.
427
prod/services/INDUSTRIAL_FRAMEWORKS.md
Executable file
@@ -0,0 +1,427 @@
# Industrial-Grade Service Frameworks

## 🏆 Recommendation: Supervisor

**Supervisor** is a de facto standard for process management in Python deployments.

### Why Supervisor?
- ✅ **Battle-tested**: Widely deployed in production systems
- ✅ **Mature**: 20+ years of development
- ✅ **Simple**: INI-style configuration
- ✅ **Reliable**: Restarts crashed processes and captures their logs automatically
- ✅ **Web UI**: Built-in web interface for monitoring
- ✅ **API**: XML-RPC API for programmatic control

---

## Quick Start: Supervisor

```bash
# 1. Start supervisor and all services
cd /mnt/dolphinng5_predict/prod/supervisor
./supervisorctl.sh start

# 2. Check status
./supervisorctl.sh status

# 3. View logs
./supervisorctl.sh logs exf
./supervisorctl.sh logs ob_streamer

# 4. Restart a service
./supervisorctl.sh ctl restart exf

# 5. Stop everything
./supervisorctl.sh stop
```

---

## Alternative: Circus (Mozilla)

**Circus** is Mozilla's Python process & socket manager.

### Pros:
- ✅ Python-native (easier to extend)
- ✅ Built-in statistics (CPU, memory per process)
- ✅ Socket management
- ✅ Web dashboard

### Cons:
- ❌ Less widely used than Supervisor
- ❌ Smaller community

```bash
# Install
pip install circus

# Run
circusd circus.ini
```

---

## Alternative: Honcho (Python Foreman)

**Honcho** is a Python port of Ruby's Foreman.

### Pros:
- ✅ Very simple (Procfile-based)
- ✅ Good for development
- ✅ Easy to understand

### Cons:
- ❌ Fewer production features
- ❌ No auto-restart on crash

```bash
# Procfile (one "name: command" line per process)
exf: python -m external_factors.realtime_exf_service
ob: python -m services.ob_stream_service
watchdog: python -m services.system_watchdog_service

# Run (from the directory containing the Procfile)
honcho start
```

---

## Comparison Table

| Feature | Supervisor | Circus | Honcho | Custom Code |
|---------|-----------|--------|--------|-------------|
| Auto-restart | ✅ | ✅ | ❌ | ✅ (if built) |
| Web UI | ✅ | ✅ | ❌ | ❌ |
| Log rotation | ✅ | ✅ | ❌ | ⚠️ (manual) |
| Resource limits | ⚠️ (`minfds`/`minprocs` on the daemon only) | ✅ | ❌ | ⚠️ (partial) |
| API | ✅ XML-RPC | ✅ | ❌ | ❌ |
| Maturity | ⭐⭐⭐ | ⭐⭐ | ⭐⭐ | ⭐ |
| Ease of use | ⭐⭐ | ⭐⭐ | ⭐⭐⭐ | ⭐⭐ |

---

## Our Setup: Supervisor

**Location**: `/mnt/dolphinng5_predict/prod/supervisor/`

**Config**: `dolphin-supervisord.conf`

**Services managed**:
- `exf` - External Factors (0.5s)
- `ob_streamer` - Order Book (0.5s)
- `watchdog` - Survival Stack (10s)
- `mc_forewarner` - MC-Forewarner (4h)

**Features enabled**:
- Auto-restart with backoff
- Separate stdout/stderr logs
- Log rotation (50MB, 10 backups)
- Process groups
- Event listeners (alerts)

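The event-listener feature relies on Supervisor's listener protocol: the listener prints `READY`, reads one header line of `key:value` tokens from stdin, reads `len` bytes of payload, and acknowledges with a `RESULT` block. The sketch below shows that loop for `PROCESS_STATE_EXITED` events. The `alert()` hook is a hypothetical placeholder, and this is not necessarily how the configured listener is implemented.

```python
import sys


def parse_tokens(line):
    """Parse a 'key:value key:value' header or payload line into a dict."""
    return dict(token.split(":", 1) for token in line.split())


def alert(message):
    """Hypothetical alert hook; here it only writes to stderr."""
    sys.stderr.write(message + "\n")
    sys.stderr.flush()


def handle_event(headers, payload):
    """Return an alert message for unexpected exits, else None."""
    if headers.get("eventname") != "PROCESS_STATE_EXITED":
        return None
    # The first payload line carries tokens such as processname and expected.
    info = parse_tokens(payload.split("\n", 1)[0])
    if info.get("expected") == "0":
        return "process %s exited unexpectedly" % info.get("processname")
    return None


def main(stdin=sys.stdin, stdout=sys.stdout):
    while True:
        stdout.write("READY\n")        # tell supervisord we can accept an event
        stdout.flush()
        header_line = stdin.readline()
        if not header_line:            # supervisord closed the pipe
            break
        headers = parse_tokens(header_line)
        payload = stdin.read(int(headers["len"]))
        message = handle_event(headers, payload)
        if message:
            alert(message)
        stdout.write("RESULT 2\nOK")   # acknowledge the event
        stdout.flush()
```

Calling `main()` at the bottom of the file turns this into a script that can serve as the `command=` of an `[eventlistener:x]` section subscribed to `PROCESS_STATE_EXITED`. Anything the listener wants to log must go to stderr, because stdout is reserved for the protocol.
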
---

## Integration with Existing Code

Your existing service code works **unchanged** with Supervisor:

```python
import time


# Your existing service (works with Supervisor)
class ExFService:
    def run(self):
        while True:
            self.fetch_indicators()  # existing method, defined elsewhere in the service
            self.push_to_hz()        # existing method, defined elsewhere in the service
            time.sleep(0.5)


# Supervisor handles:
# - Starting it
# - Restarting if it crashes
# - Logging stdout/stderr
# - Monitoring
```

No code changes needed!

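Because Supervisor captures whatever the process writes to stdout and stderr, a service only needs a stream handler; rotation and file handling stay in the Supervisor config. The sketch below emits one JSON object per line. The field names and the `exf` logger name are illustrative assumptions, not the project's actual logging setup.

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Render each record as a single-line JSON object."""

    def format(self, record):
        entry = {
            "ts": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "msg": record.getMessage(),
        }
        if record.exc_info:
            entry["exc"] = self.formatException(record.exc_info)
        return json.dumps(entry)


def make_logger(name, stream=None):
    """Return a logger that writes JSON lines to `stream` (stdout by default)."""
    handler = logging.StreamHandler(stream if stream is not None else sys.stdout)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.handlers = [handler]     # replace any handlers from earlier calls
    logger.setLevel(logging.INFO)
    logger.propagate = False        # avoid duplicate output via the root logger
    return logger


log = make_logger("exf")
log.info("pushed %d indicators", 12)
```

`StreamHandler` flushes after every record, so these lines reach Supervisor's log files promptly. Plain `print()` output is block-buffered when stdout is not a terminal and may need `python -u` or `PYTHONUNBUFFERED=1` to show up in time.
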
---

## Web Dashboard

Supervisor includes a web interface, enabled through the `[inet_http_server]` section:

```ini
[inet_http_server]
; 0.0.0.0 listens on all interfaces; use 127.0.0.1:9001 to allow local access only
port=0.0.0.0:9001
; placeholder credentials, replace before use
username=user
password=pass
```

Then visit `http://localhost:9001` and log in with the configured credentials.

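The same HTTP server also exposes Supervisor's XML-RPC API at `/RPC2`, so process status can be polled from Python with the standard library alone. This is a minimal sketch that assumes the placeholder `user`/`pass` credentials and port 9001 from the snippet above; `summarize()` and `not_running()` are hypothetical helpers.

```python
import xmlrpc.client

# Assumed to match the [inet_http_server] settings shown above.
RPC_URL = "http://user:pass@localhost:9001/RPC2"


def summarize(process_info):
    """Map 'group:name' to the state name for entries from getAllProcessInfo()."""
    return {
        "%s:%s" % (p["group"], p["name"]): p["statename"]
        for p in process_info
    }


def not_running(process_info):
    """Return the sorted names of processes whose state is not RUNNING."""
    return sorted(
        name for name, state in summarize(process_info).items()
        if state != "RUNNING"
    )


def fetch_process_info(url=RPC_URL):
    """Query a live supervisord; raises if it is unreachable."""
    server = xmlrpc.client.ServerProxy(url)
    return server.supervisor.getAllProcessInfo()
```

With supervisord running, `not_running(fetch_process_info())` returns the services that need attention, which is enough for a simple external health check.
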
---

## Summary

| Use Case | Recommendation |
|----------|---------------|
| **Production trading system** | **Supervisor** ✅ |
| Development/Testing | Honcho |
| Need sockets + stats | Circus |
| Maximum control | Custom + systemd |

We recommend **Supervisor** for Dolphin production.

---

# CHANGE LOG - All Modifications Made

## Session: 2026-03-25 (Current Session)

### 1. Supervisor Installation

**Command executed:**
```bash
pip install supervisor
```

**Result:** Supervisor 4.3.0 installed

---

### 2. Directory Structure Created

```
/mnt/dolphinng5_predict/prod/supervisor/
├── dolphin-supervisord.conf    # Main supervisor configuration
├── supervisorctl.sh            # Control wrapper script
├── logs/                       # Log directory (created)
└── run/                        # PID/socket directory (created)
```

---

### 3. Configuration File: dolphin-supervisord.conf

**Location:** `/mnt/dolphinng5_predict/prod/supervisor/dolphin-supervisord.conf`

**Contents:**
- `[supervisord]` section with logging, pidfile, environment
- `[unix_http_server]` for supervisorctl communication
- `[rpcinterface:supervisor]` for API
- `[supervisorctl]` client configuration
- `[program:exf]` - External Factors service (0.5s)
- `[program:ob_streamer]` - Order Book Streamer (0.5s)
- `[program:watchdog]` - Survival Stack Watchdog (10s)
- `[program:mc_forewarner]` - MC-Forewarner (4h)
- `[eventlistener:crashmail]` - Alert on crashes
- `[group:dolphin]` - Group all programs

**Key settings:**
- `autostart=true` - All services start with supervisor
- `autorestart=true` - Restart a service whenever it exits
- `startretries=3` - Up to 3 start attempts before a service is marked `FATAL`
- `stdout_logfile_maxbytes=50MB` - Rotate stdout logs at 50MB
- `rlimit_as=512MB` - Intended memory limit per service (not among Supervisor's documented `[program:x]` options, so verify that it takes effect)

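For orientation, the key settings above map onto a `[program:x]` section roughly as follows. This sketch builds such a section with `configparser`. The command is the one shown in the Honcho Procfile earlier in this document; the log file names are assumptions, and the real `dolphin-supervisord.conf` may differ.

```python
import configparser
import io

# Log directory listed in the change log; file names below are assumptions.
LOG_DIR = "/mnt/dolphinng5_predict/prod/supervisor/logs"


def program_section(name, command):
    """Return the settings for one [program:<name>] section."""
    return {
        "command": command,
        "autostart": "true",
        "autorestart": "true",
        "startretries": "3",
        "stdout_logfile": "%s/%s.log" % (LOG_DIR, name),
        "stderr_logfile": "%s/%s.err.log" % (LOG_DIR, name),
        "stdout_logfile_maxbytes": "50MB",
        "stdout_logfile_backups": "10",
    }


def render(programs):
    """Render a {name: command} mapping as Supervisor-style INI text."""
    # interpolation=None keeps any '%' in values literal on the Python side
    config = configparser.ConfigParser(interpolation=None)
    for name, command in programs.items():
        config["program:" + name] = program_section(name, command)
    buffer = io.StringIO()
    config.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()


print(render({"exf": "python -m external_factors.realtime_exf_service"}))
```

Generating the sections from one function keeps restart and rotation settings identical across services; a hand-edited file works just as well for four programs.
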
---

### 4. Control Script: supervisorctl.sh

**Location:** `/mnt/dolphinng5_predict/prod/supervisor/supervisorctl.sh`

**Commands implemented:**
- `start` - Start supervisord and all services
- `stop` - Stop all services and supervisord
- `restart` - Restart all services
- `status` - Show service status
- `logs [service]` - Show logs (last 50 lines)
- `ctl [cmd]` - Pass through to supervisorctl

**Usage:**
```bash
./supervisorctl.sh start
./supervisorctl.sh status
./supervisorctl.sh logs exf
```

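The wrapper's behaviour can be pictured as a mapping from its subcommands to `supervisord` and `supervisorctl` invocations. The sketch below only builds the argument lists and does not reproduce the actual script; the log file naming and the use of `tail -n 50` for `logs` are assumptions.

```python
import os

BASE = "/mnt/dolphinng5_predict/prod/supervisor"
CONF = os.path.join(BASE, "dolphin-supervisord.conf")


def build_command(action, *args):
    """Translate a wrapper subcommand into an argv list (assumed mapping)."""
    ctl = ["supervisorctl", "-c", CONF]
    if action == "start":
        # Programs with autostart=true come up together with the daemon.
        return ["supervisord", "-c", CONF]
    if action == "stop":
        # 'shutdown' stops all programs and then supervisord itself.
        return ctl + ["shutdown"]
    if action == "restart":
        return ctl + ["restart", "all"]
    if action == "status":
        return ctl + ["status"]
    if action == "logs":
        service = args[0]
        # Log file name is an assumption; see stdout_logfile in the config.
        return ["tail", "-n", "50", os.path.join(BASE, "logs", service + ".log")]
    if action == "ctl":
        return ctl + list(args)  # pass-through to supervisorctl
    raise ValueError("unknown action: %s" % action)
```

A caller would hand the result to `subprocess.run(build_command("status"))`. Keeping command construction separate from execution makes the mapping easy to test without a running daemon.
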
---

### 5. Python Libraries Installed

**Via pip:**
- `supervisor==4.3.0` - Main process manager
- `tenacity==9.1.4` - Retry logic (previously installed)
- `schedule==1.2.2` - Task scheduling (previously installed)

**System packages checked:**
- `supervisor.noarch` available via dnf (not installed, using pip)

---

### 6. Alternative Architectures (Previously Created)

#### 6.1 Custom Supervisor (Pure Python)

**Location:** `/mnt/dolphinng5_predict/prod/services/supervisor.py`

**Features:**
- `ServiceComponent` base class
- `DolphinSupervisor` manager
- Thread-based component management
- Built-in health monitoring
- Example components: ExF, OB, Watchdog, MC

**Status:** Available but NOT primary (Supervisor preferred)

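To illustrate what thread-based component management with health monitoring can look like, here is a minimal sketch that reuses the `ServiceComponent` and `DolphinSupervisor` names from the list above. The method names (`tick`, `health`) and the choice to keep looping after a failed tick are assumptions; the code in `supervisor.py` may be organised differently.

```python
import threading
import time


class ServiceComponent:
    """Runs tick() every `interval` seconds in its own thread."""

    def __init__(self, name, interval):
        self.name = name
        self.interval = interval
        self.last_ok = None            # monotonic time of the last successful tick
        self.errors = 0
        self._stop_event = threading.Event()
        self._thread = None

    def tick(self):
        raise NotImplementedError

    def _loop(self):
        while not self._stop_event.is_set():
            try:
                self.tick()
                self.last_ok = time.monotonic()
            except Exception:
                self.errors += 1       # keep the loop alive after a failed tick
            self._stop_event.wait(self.interval)

    def start(self):
        self._stop_event.clear()
        self._thread = threading.Thread(target=self._loop, name=self.name, daemon=True)
        self._thread.start()

    def stop(self):
        self._stop_event.set()
        if self._thread is not None:
            self._thread.join()


class DolphinSupervisor:
    """Starts, stops, and health-checks a set of components."""

    def __init__(self, components):
        self.components = list(components)

    def start(self):
        for component in self.components:
            component.start()

    def stop(self):
        for component in self.components:
            component.stop()

    def health(self, max_age):
        """Map each component name to True if it ticked within max_age seconds."""
        now = time.monotonic()
        return {
            c.name: c.last_ok is not None and now - c.last_ok <= max_age
            for c in self.components
        }
```

The trade-off against Supervisor is visible here: all components share one process, so a crash of the interpreter or a blocked GIL takes every component down together, whereas Supervisor isolates each service in its own process.
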
---

#### 6.2 Systemd User Services

**Location:** `~/.config/systemd/user/`

**Files created:**
- `dolphin-exf.service` - External Factors
- `dolphin-ob.service` - Order Book
- `dolphin-watchdog.service` - Watchdog
- `dolphin-mc.service` + `dolphin-mc.timer` - MC-Forewarner
- `dolphin-supervisor.service` - Custom supervisor (optional)
- `dolphin-test.service` - Test service

**Control script:** `/mnt/dolphinng5_predict/prod/services/service_manager.py`

---

### 7. Service Base Class (Boilerplate)

**Location:** `/mnt/dolphinng5_predict/prod/services/service_base.py`

**Features:**
- `ServiceBase` abstract class
- Automatic retries with tenacity
- Structured JSON logging
- Health check endpoints
- Graceful shutdown handling
- Systemd notify support
- `run_scheduled()` helper

**Status:** Available for custom implementations

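Graceful shutdown matters under Supervisor because stopping a program sends `SIGTERM` by default and escalates to `SIGKILL` once `stopwaitsecs` has elapsed. The sketch below shows one way a base class might combine a signal handler with a fixed-interval loop. Only the names `ServiceBase` and `run_scheduled()` come from the list above; the signatures and the `run_once()` and `on_shutdown()` hooks are assumptions and do not reproduce `service_base.py`.

```python
import signal
import threading
from abc import ABC, abstractmethod


class ServiceBase(ABC):
    """Runs run_once() on a fixed interval until asked to stop."""

    def __init__(self, interval):
        self.interval = interval
        self._stop_event = threading.Event()

    @abstractmethod
    def run_once(self):
        """One unit of work, e.g. fetch indicators and push them."""

    def request_stop(self, signum=None, frame=None):
        # The (signum, frame) signature is what signal.signal() expects of a handler.
        self._stop_event.set()

    def install_signal_handlers(self):
        # signal.signal() must be called from the main thread.
        signal.signal(signal.SIGTERM, self.request_stop)
        signal.signal(signal.SIGINT, self.request_stop)

    def run_scheduled(self):
        while not self._stop_event.is_set():
            self.run_once()
            # wait() returns early when a stop is requested, unlike time.sleep()
            self._stop_event.wait(self.interval)
        self.on_shutdown()

    def on_shutdown(self):
        """Override to flush buffers or close connections."""
```

A concrete service would subclass `ServiceBase`, implement `run_once()`, then call `install_signal_handlers()` followed by `run_scheduled()` from its entry point. Using `Event.wait()` for the interval means even the 4h service reacts to a stop request immediately.
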
---

### 8. Documentation Files Created

| File | Location | Purpose |
|------|----------|---------|
| `INDUSTRIAL_FRAMEWORKS.md` | `/mnt/dolphinng5_predict/prod/services/` | This document - framework comparison |
| `ARCHITECTURE_CHOICE.md` | `/mnt/dolphinng5_predict/prod/services/` | Architecture options comparison |
| `README.md` | `/mnt/dolphinng5_predict/prod/services/` | General services documentation |
| `dolphin-supervisord.conf` | `/mnt/dolphinng5_predict/prod/supervisor/` | Supervisor configuration |

---

### 9. kimi.json Updated

**Change:** Associated session with ops directory

**Before:**
```json
{
  "path": "/mnt/dolphinng5_predict/prod/ops",
  "kaos": "local",
  "last_session_id": null
}
```

**After:**
```json
{
  "path": "/mnt/dolphinng5_predict/prod/ops",
  "kaos": "local",
  "last_session_id": "c23a69c5-ba4a-41c4-8624-05114e8fd9ea"
}
```

---

### 10. Session Backup

**Session backed up:** `c23a69c5-ba4a-41c4-8624-05114e8fd9ea`
- **Original location:** `~/.kimi/sessions/9330f053b5f85e950222ed1fed8f6f02/`
- **Backup location 1:** `/mnt/dolphinng5_predict/prod/ops/kimi_session_backup/`
- **Backup location 2:** `/mnt/vids/`
- **Markdown transcript:** `KIMI_Session_Rearch_Services-Prefect.md` (684KB)

---

## Summary: What to Use

### For Production Trading System:

**Recommended: SUPERVISOR**
```bash
cd /mnt/dolphinng5_predict/prod/supervisor
./supervisorctl.sh start
./supervisorctl.sh status
```

**Why:** Battle-tested, 20+ years, web UI, API, log rotation

### For Simplicity / No Extra Deps:

**Alternative: SYSTEMD --user**
```bash
systemctl --user start dolphin-exf
systemctl --user start dolphin-ob
systemctl --user start dolphin-watchdog
```

**Why:** Built-in, no pip installs, OS-integrated

### For Full Control:

**Alternative: Custom Python**
```bash
systemctl --user start dolphin-supervisor  # Custom one
```

**Why:** Educational, customizable, no external deps

---

## Files Modified/Created Summary

### New Directories:
1. `/mnt/dolphinng5_predict/prod/supervisor/`
2. `/mnt/dolphinng5_predict/prod/supervisor/logs/`
3. `/mnt/dolphinng5_predict/prod/supervisor/run/`
4. `/mnt/dolphinng5_predict/prod/ops/kimi_session_backup/`

### New Files:
1. `/mnt/dolphinng5_predict/prod/supervisor/dolphin-supervisord.conf`
2. `/mnt/dolphinng5_predict/prod/supervisor/supervisorctl.sh`
3. `/mnt/dolphinng5_predict/prod/services/INDUSTRIAL_FRAMEWORKS.md` (this file)
4. `/mnt/dolphinng5_predict/prod/services/ARCHITECTURE_CHOICE.md`
5. `/mnt/dolphinng5_predict/prod/services/supervisor.py` (custom impl)
6. `/mnt/dolphinng5_predict/prod/services/service_base.py` (boilerplate)
7. `/mnt/dolphinng5_predict/prod/services/service_manager.py` (systemd ctl)
8. `/mnt/dolphinng5_predict/prod/ops/KIMI_Session_Rearch_Services-Prefect.md`
9. `/mnt/dolphinng5_predict/prod/ops/SESSION_INFO.txt`
10. `/mnt/dolphinng5_predict/prod/ops/resume_session.sh`

### Modified Files:
1. `~/.config/systemd/user/dolphin-*.service` (6 services)
2. `~/.config/systemd/user/dolphin-mc.timer`
3. `~/.kimi/kimi.json` (session association)

---

## Current Status

✅ **Supervisor 4.3.0** installed and configured
✅ **6 systemd user services** configured (backup option)
✅ **Custom supervisor** available (educational)
✅ **Service base class** with retries/logging (boilerplate)
✅ **All documentation** complete
✅ **Session backed up** to multiple locations

**Ready for:** Production deployment