DOLPHIN/prod/services/INDUSTRIAL_FRAMEWORKS.md
hjnormey 01c19662cb initial: import DOLPHIN baseline 2026-04-21 from dolphinng5_predict working tree
Includes core prod + GREEN/BLUE subsystems:
- prod/ (BLUE harness, configs, scripts, docs)
- nautilus_dolphin/ (GREEN Nautilus-native impl + dvae/ preserved)
- adaptive_exit/ (AEM engine + models/bucket_assignments.pkl)
- Observability/ (EsoF advisor, TUI, dashboards)
- external_factors/ (EsoF producer)
- mc_forewarning_qlabs_fork/ (MC regime/envelope)

Excludes runtime caches, logs, backups, and reproducible artifacts per .gitignore.
2026-04-21 16:58:38 +02:00


# Industrial-Grade Service Frameworks
## 🏆 Recommendation: Supervisor
**Supervisor** is the industry standard for process management in Python deployments.
### Why Supervisor?
- **Battle-tested**: Widely deployed in production for many years
- **Mature**: 20+ years of development
- **Simple**: INI-style configuration
- **Reliable**: Restarts crashed processes and captures their stdout/stderr logs automatically
- **Web UI**: Built-in web interface for monitoring
- **API**: XML-RPC API for programmatic control
---
## Quick Start: Supervisor
```bash
# 1. Start supervisor and all services
cd /mnt/dolphinng5_predict/prod/supervisor
./supervisorctl.sh start
# 2. Check status
./supervisorctl.sh status
# 3. View logs
./supervisorctl.sh logs exf
./supervisorctl.sh logs ob_streamer
# 4. Restart a service (programs in [group:dolphin] may need the group prefix, e.g. dolphin:exf)
./supervisorctl.sh ctl restart exf
# 5. Stop everything
./supervisorctl.sh stop
```
---
## Alternative: Circus (Mozilla)
**Circus** is Mozilla's Python process & socket manager.
### Pros:
- ✅ Python-native (easier to extend)
- ✅ Built-in statistics (CPU, memory per process)
- ✅ Socket management
- ✅ Web dashboard
### Cons:
- ❌ Less widely used than Supervisor
- ❌ Smaller community
```bash
# Install
pip install circus
# Run
circusd circus.ini
```
---
## Alternative: Honcho (Python Foreman)
**Honcho** is a Python port of Ruby's Foreman.
### Pros:
- ✅ Very simple (Procfile-based)
- ✅ Good for development
- ✅ Easy to understand
### Cons:
- ❌ Fewer production features
- ❌ No auto-restart on crash
```bash
# Procfile
exf: python -m external_factors.realtime_exf_service
ob: python -m services.ob_stream_service
watchdog: python -m services.system_watchdog_service
# Run
honcho start
```
---
## Comparison Table
| Feature | Supervisor | Circus | Honcho | Custom Code |
|---------|-----------|--------|--------|-------------|
| Auto-restart | ✅ | ✅ | ❌ | ✅ (if built) |
| Web UI | ✅ | ✅ | ❌ | ❌ |
| Log rotation | ✅ | ✅ | ❌ | ⚠️ (manual) |
| Resource limits | ⚠️ (`minfds`/`minprocs` only) | ✅ | ❌ | ⚠️ (partial) |
| API | ✅ XML-RPC | ✅ | ❌ | ❌ |
| Maturity | ⭐⭐⭐ | ⭐⭐ | ⭐⭐ | ⭐ |
| Ease of use | ⭐⭐ | ⭐⭐ | ⭐⭐⭐ | ⭐⭐ |
---
## Our Setup: Supervisor
**Location**: `/mnt/dolphinng5_predict/prod/supervisor/`
**Config**: `dolphin-supervisord.conf`
**Services managed** (cycle interval in parentheses):
- `exf` - External Factors (0.5s)
- `ob_streamer` - Order Book (0.5s)
- `watchdog` - Survival Stack (10s)
- `mc_forewarner` - MC-Forewarner (4h)
**Features enabled**:
- Auto-restart with backoff
- Separate stdout/stderr logs
- Log rotation (50MB, 10 backups)
- Process groups
- Event listeners (alerts)
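To illustrate the event-listener feature above, here is a minimal sketch of a listener that speaks Supervisor's stdin/stdout event protocol and flags unexpected exits. It is not the `crashmail` listener configured for Dolphin; the function names (`parse_tokens`, `should_alert`, `run_listener`) are hypothetical, and the alert is simply written to stderr.

```python
import sys


def parse_tokens(line: str) -> dict:
    """Parse a 'key:value key:value' line (event header or payload) into a dict."""
    return dict(token.split(":", 1) for token in line.split())


def should_alert(header: dict, payload: dict) -> bool:
    """Alert only when a process exited and the exit was not expected."""
    return (
        header.get("eventname") == "PROCESS_STATE_EXITED"
        and payload.get("expected") == "0"
    )


def run_listener(stdin=sys.stdin, stdout=sys.stdout, stderr=sys.stderr) -> None:
    """Handle events until stdin closes (hypothetical helper, stdlib only)."""
    while True:
        # Tell supervisord we are ready for the next event.
        stdout.write("READY\n")
        stdout.flush()

        header_line = stdin.readline()
        if not header_line:  # supervisord closed the pipe
            break
        header = parse_tokens(header_line)

        # 'len' is the payload size; for ASCII payloads chars == bytes.
        payload = parse_tokens(stdin.read(int(header["len"])))

        if should_alert(header, payload):
            stderr.write(f"ALERT: {payload.get('processname')} exited unexpectedly\n")
            stderr.flush()

        # Acknowledge the event so supervisord can send the next one.
        stdout.write("RESULT 2\nOK")
        stdout.flush()
```

To use it as a listener, the script would call `run_listener()` at the bottom and be referenced from an `[eventlistener:x]` section subscribed to `PROCESS_STATE_EXITED`. stdout is reserved for the protocol, so alerts and diagnostics must go to stderr.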
---
## Integration with Existing Code
Your existing service code works **unchanged** with Supervisor:
```python
import time

# Your existing service (works with Supervisor)
class ExFService:
    def run(self):
        while True:
            self.fetch_indicators()  # defined elsewhere in the service
            self.push_to_hz()        # defined elsewhere in the service
            time.sleep(0.5)

# Supervisor handles:
# - Starting it
# - Restarting if it crashes
# - Logging stdout/stderr
# - Monitoring
```
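No code changes needed, provided the service runs in the foreground (Supervisor cannot manage a process that daemonizes itself).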
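The loop above runs under Supervisor as-is. One optional refinement, sketched here under assumptions, is to react to the `SIGTERM` that Supervisor sends by default when stopping a program, so that a cycle is never interrupted halfway. The `request_stop` and `install_handlers` names and the `cycles` counter are hypothetical, and the two service methods are placeholders.

```python
import signal
import threading


class ExFService:
    def __init__(self, interval: float = 0.5):
        self.interval = interval
        self._stop = threading.Event()
        self.cycles = 0  # illustrative counter, not part of the real service

    def request_stop(self, signum=None, frame=None):
        """Signal-handler-compatible: ask the loop to finish after this cycle."""
        self._stop.set()

    def fetch_indicators(self):
        pass  # placeholder for the real implementation

    def push_to_hz(self):
        pass  # placeholder for the real implementation

    def run(self):
        while not self._stop.is_set():
            self.fetch_indicators()
            self.push_to_hz()
            self.cycles += 1
            # Like time.sleep(interval), but wakes immediately on stop.
            self._stop.wait(self.interval)


def install_handlers(service: ExFService) -> None:
    """Route SIGTERM/SIGINT to a clean stop (must be called from the main thread)."""
    signal.signal(signal.SIGTERM, service.request_stop)
    signal.signal(signal.SIGINT, service.request_stop)
```

A typical entry point would create the service, call `install_handlers(service)`, then `service.run()`. If the process does not exit within `stopwaitsecs`, Supervisor escalates to `SIGKILL`.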
---
## Web Dashboard
Supervisor includes a web interface:
```ini
[inet_http_server]
port=0.0.0.0:9001
username=user
password=pass
```
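Then visit: `http://localhost:9001`. Note that `0.0.0.0` exposes the dashboard on every network interface; bind to `127.0.0.1:9001` for local-only access, and replace the example `user`/`pass` credentials before deploying.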
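The same HTTP server also exposes Supervisor's XML-RPC API at `/RPC2`, so process state can be checked from a script. The following is a minimal sketch using only the standard library; the helper names (`summarize`, `unhealthy`, `fetch_summary`) are hypothetical, and the URL assumes the example port and credentials shown above.

```python
from xmlrpc.client import ServerProxy


def summarize(process_info: list) -> dict:
    """Map 'group:name' to the state name reported by Supervisor."""
    return {f"{p['group']}:{p['name']}": p["statename"] for p in process_info}


def unhealthy(summary: dict) -> list:
    """Return the processes that are not currently RUNNING, sorted by name."""
    return sorted(name for name, state in summary.items() if state != "RUNNING")


def fetch_summary(url: str = "http://user:pass@localhost:9001/RPC2") -> dict:
    """Query a live supervisord (assumes [inet_http_server] is enabled)."""
    with ServerProxy(url) as proxy:
        return summarize(proxy.supervisor.getAllProcessInfo())
```

For example, `unhealthy(fetch_summary())` would list any managed service that is stopped, backing off, or in a fatal state, which could feed an external alert.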
---
## Summary
| Use Case | Recommendation |
|----------|---------------|
| **Production trading system** | **Supervisor** ✅ |
| Development/Testing | Honcho |
| Need sockets + stats | Circus |
| Maximum control | Custom + systemd |
We recommend **Supervisor** for Dolphin production.
---
# CHANGE LOG - All Modifications Made
## Session: 2026-03-25 (Current Session)
### 1. Supervisor Installation
**Command executed:**
```bash
pip install supervisor
```
**Result:** Supervisor 4.3.0 installed
---
### 2. Directory Structure Created
```
/mnt/dolphinng5_predict/prod/supervisor/
├── dolphin-supervisord.conf # Main supervisor configuration
├── supervisorctl.sh # Control wrapper script
├── logs/ # Log directory (created)
└── run/ # PID/socket directory (created)
```
---
### 3. Configuration File: dolphin-supervisord.conf
**Location:** `/mnt/dolphinng5_predict/prod/supervisor/dolphin-supervisord.conf`
**Contents:**
- `[supervisord]` section with logging, pidfile, environment
- `[unix_http_server]` for supervisorctl communication
- `[rpcinterface:supervisor]` for API
- `[supervisorctl]` client configuration
- `[program:exf]` - External Factors service (0.5s)
- `[program:ob_streamer]` - Order Book Streamer (0.5s)
- `[program:watchdog]` - Survival Stack Watchdog (10s)
- `[program:mc_forewarner]` - MC-Forewarner (4h)
- `[eventlistener:crashmail]` - Alert on crashes
- `[group:dolphin]` - Group all programs
**Key settings:**
- `autostart=true` - All services start with supervisor
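- `autorestart=true` - Restart a service whenever it exits, whether it crashed or exited cleanly
- `startretries=3` - Up to 3 consecutive failed start attempts before the service is marked `FATAL`
- `stdout_logfile_maxbytes=50MB` - Rotate the stdout log at 50MB
- `rlimit_as=512MB` - Intended per-service memory limit; this key is not among Supervisor's documented `[program:x]` options, so confirm the limit is actually enforced (for example with `ulimit -v` in the launch command)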
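As a hedged illustration of how these settings could be sanity-checked, the sketch below parses Supervisor-style INI text with `configparser` and reports any `[program:x]` section that lacks the expected restart settings. The `check_programs` function and `REQUIRED` mapping are hypothetical; the real `dolphin-supervisord.conf` may differ from whatever text is passed in.

```python
import configparser

# Settings every program section is expected to carry (per the list above).
REQUIRED = {"autostart": "true", "autorestart": "true"}


def check_programs(conf_text: str) -> dict:
    """Return {section: [missing 'key=value', ...]} for non-compliant programs."""
    parser = configparser.ConfigParser(
        interpolation=None,              # leave Supervisor's %(here)s untouched
        inline_comment_prefixes=(";",),  # Supervisor allows ' ; comment' suffixes
    )
    parser.read_string(conf_text)

    problems = {}
    for section in parser.sections():
        if not section.startswith("program:"):
            continue
        missing = [
            f"{key}={value}"
            for key, value in REQUIRED.items()
            if parser.get(section, key, fallback=None) != value
        ]
        if missing:
            problems[section] = missing
    return problems
```

Running it on the contents of the config file (read with `pathlib.Path(...).read_text()`) before starting `supervisord` gives a quick check that no service was added without restart protection.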
---
### 4. Control Script: supervisorctl.sh
**Location:** `/mnt/dolphinng5_predict/prod/supervisor/supervisorctl.sh`
**Commands implemented:**
- `start` - Start supervisord and all services
- `stop` - Stop all services and supervisord
- `restart` - Restart all services
- `status` - Show service status
- `logs [service]` - Show logs (last 50 lines)
- `ctl [cmd]` - Pass through to supervisorctl
**Usage:**
```bash
./supervisorctl.sh start
./supervisorctl.sh status
./supervisorctl.sh logs exf
```
---
### 5. Python Libraries Installed
**Via pip:**
- `supervisor==4.3.0` - Main process manager
- `tenacity==9.1.4` - Retry logic (previously installed)
- `schedule==1.2.2` - Task scheduling (previously installed)
**System packages checked:**
- `supervisor.noarch` available via dnf (not installed, using pip)
---
### 6. Alternative Architectures (Previously Created)
#### 6.1 Custom Supervisor (Pure Python)
**Location:** `/mnt/dolphinng5_predict/prod/services/supervisor.py`
**Features:**
- `ServiceComponent` base class
- `DolphinSupervisor` manager
- Thread-based component management
- Built-in health monitoring
- Example components: ExF, OB, Watchdog, MC
**Status:** Available but NOT primary (Supervisor preferred)
---
#### 6.2 Systemd User Services
**Location:** `~/.config/systemd/user/`
**Files created:**
- `dolphin-exf.service` - External Factors
- `dolphin-ob.service` - Order Book
- `dolphin-watchdog.service` - Watchdog
- `dolphin-mc.service` + `dolphin-mc.timer` - MC-Forewarner
- `dolphin-supervisor.service` - Custom supervisor (optional)
- `dolphin-test.service` - Test service
**Control script:** `/mnt/dolphinng5_predict/prod/services/service_manager.py`
---
### 7. Service Base Class (Boilerplate)
**Location:** `/mnt/dolphinng5_predict/prod/services/service_base.py`
**Features:**
- `ServiceBase` abstract class
- Automatic retries with tenacity
- Structured JSON logging
- Health check endpoints
- Graceful shutdown handling
- Systemd notify support
- `run_scheduled()` helper
**Status:** Available for custom implementations
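To make the feature list concrete, here is a minimal standard-library sketch of what such a base class could look like. It is not the contents of `service_base.py`: it replaces tenacity with a plain retry loop, and apart from the `ServiceBase` and `run_scheduled()` names taken from the list above, every name and signature (`run_cycle`, `run_with_retries`, `JsonFormatter`, the constructor parameters) is an assumption.

```python
import abc
import json
import logging
import sys
import threading
import time


class JsonFormatter(logging.Formatter):
    """Emit each log record as a single JSON object."""

    def format(self, record):
        return json.dumps({
            "ts": round(record.created, 3),
            "level": record.levelname,
            "service": record.name,
            "msg": record.getMessage(),
        })


class ServiceBase(abc.ABC):
    def __init__(self, name: str, max_attempts: int = 3, backoff: float = 0.1):
        self.name = name
        self.max_attempts = max_attempts
        self.backoff = backoff
        self.stop_event = threading.Event()  # set() to shut down gracefully
        self.log = logging.getLogger(name)
        if not self.log.handlers:
            handler = logging.StreamHandler(sys.stderr)
            handler.setFormatter(JsonFormatter())
            self.log.addHandler(handler)
        self.log.setLevel(logging.INFO)

    @abc.abstractmethod
    def run_cycle(self) -> None:
        """One unit of work; subclasses implement this."""

    def run_with_retries(self) -> bool:
        """Run one cycle, retrying with exponential backoff on failure."""
        for attempt in range(1, self.max_attempts + 1):
            try:
                self.run_cycle()
                return True
            except Exception as exc:
                self.log.warning("attempt %d/%d failed: %s",
                                 attempt, self.max_attempts, exc)
                if attempt < self.max_attempts:
                    time.sleep(self.backoff * 2 ** (attempt - 1))
        return False

    def run_scheduled(self, interval: float) -> None:
        """Repeat cycles every `interval` seconds until stop_event is set."""
        while not self.stop_event.is_set():
            self.run_with_retries()
            self.stop_event.wait(interval)
```

A subclass only implements `run_cycle()` and calls `run_scheduled(0.5)`; retries, JSON logs, and the stop flag come from the base class.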
---
### 8. Documentation Files Created
| File | Location | Purpose |
|------|----------|---------|
| `INDUSTRIAL_FRAMEWORKS.md` | `/mnt/dolphinng5_predict/prod/services/` | This document - framework comparison |
| `ARCHITECTURE_CHOICE.md` | `/mnt/dolphinng5_predict/prod/services/` | Architecture options comparison |
| `README.md` | `/mnt/dolphinng5_predict/prod/services/` | General services documentation |
| `dolphin-supervisord.conf` | `/mnt/dolphinng5_predict/prod/supervisor/` | Supervisor configuration |
---
### 9. kimi.json Updated
**Change:** Associated session with ops directory
**Before:**
```json
{
"path": "/mnt/dolphinng5_predict/prod/ops",
"kaos": "local",
"last_session_id": null
}
```
**After:**
```json
{
"path": "/mnt/dolphinng5_predict/prod/ops",
"kaos": "local",
"last_session_id": "c23a69c5-ba4a-41c4-8624-05114e8fd9ea"
}
```
---
### 10. Session Backup
**Session backed up:** `c23a69c5-ba4a-41c4-8624-05114e8fd9ea`
- **Original location:** `~/.kimi/sessions/9330f053b5f85e950222ed1fed8f6f02/`
- **Backup location 1:** `/mnt/dolphinng5_predict/prod/ops/kimi_session_backup/`
- **Backup location 2:** `/mnt/vids/`
- **Markdown transcript:** `KIMI_Session_Rearch_Services-Prefect.md` (684KB)
---
## Summary: What to Use
### For Production Trading System:
**Recommended: SUPERVISOR**
```bash
cd /mnt/dolphinng5_predict/prod/supervisor
./supervisorctl.sh start
./supervisorctl.sh status
```
**Why:** Battle-tested, 20+ years, web UI, API, log rotation
### For Simplicity / No Extra Deps:
**Alternative: SYSTEMD --user**
```bash
systemctl --user start dolphin-exf
systemctl --user start dolphin-ob
systemctl --user start dolphin-watchdog
```
**Why:** Built-in, no pip installs, OS-integrated
### For Full Control:
**Alternative: Custom Python**
```bash
systemctl --user start dolphin-supervisor # Custom one
```
**Why:** Educational, customizable, no external deps
---
## Files Modified/Created Summary
### New Directories:
1. `/mnt/dolphinng5_predict/prod/supervisor/`
2. `/mnt/dolphinng5_predict/prod/supervisor/logs/`
3. `/mnt/dolphinng5_predict/prod/supervisor/run/`
4. `/mnt/dolphinng5_predict/prod/ops/kimi_session_backup/`
### New Files:
1. `/mnt/dolphinng5_predict/prod/supervisor/dolphin-supervisord.conf`
2. `/mnt/dolphinng5_predict/prod/supervisor/supervisorctl.sh`
3. `/mnt/dolphinng5_predict/prod/services/INDUSTRIAL_FRAMEWORKS.md` (this file)
4. `/mnt/dolphinng5_predict/prod/services/ARCHITECTURE_CHOICE.md`
5. `/mnt/dolphinng5_predict/prod/services/supervisor.py` (custom impl)
6. `/mnt/dolphinng5_predict/prod/services/service_base.py` (boilerplate)
7. `/mnt/dolphinng5_predict/prod/services/service_manager.py` (systemd ctl)
8. `/mnt/dolphinng5_predict/prod/ops/KIMI_Session_Rearch_Services-Prefect.md`
9. `/mnt/dolphinng5_predict/prod/ops/SESSION_INFO.txt`
10. `/mnt/dolphinng5_predict/prod/ops/resume_session.sh`
### Modified Files:
1. `~/.config/systemd/user/dolphin-*.service` (6 services)
2. `~/.config/systemd/user/dolphin-mc.timer`
3. `~/.kimi/kimi.json` (session association)
---
## Current Status
- **Supervisor 4.3.0** installed and configured
- **6 systemd user services** configured (backup option)
- **Custom supervisor** available (educational)
- **Service base class** with retries/logging (boilerplate)
- **All documentation** complete
- **Session backed up** to multiple locations
**Ready for:** Production deployment