# Industrial-Grade Service Frameworks

## 🏆 Recommendation: Supervisor

**Supervisor** is a long-established, widely used process manager for Python deployments.

### Why Supervisor?
- ✅ **Battle-tested**: Widely deployed in production systems
- ✅ **Mature**: 20+ years of development
- ✅ **Simple**: INI-style configuration
- ✅ **Reliable**: Restarts crashed processes and captures their logs automatically
- ✅ **Web UI**: Built-in web interface for monitoring
- ✅ **API**: XML-RPC API for programmatic control
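The XML-RPC API can be reached with Python's standard library alone. The following is a minimal sketch, assuming supervisord exposes an `[inet_http_server]` on port 9001; the URL, the `user`/`pass` credentials, and the helper names `summarize` and `fetch_process_info` are illustrative assumptions, not part of the Dolphin code. `supervisor.getAllProcessInfo` is the RPC method that returns one dict per managed process. To use it, call `summarize(fetch_process_info())` against a running supervisord.

```python
import xmlrpc.client


def summarize(process_infos):
    """Return one 'name: STATE' line per process, sorted by name."""
    lines = []
    for info in sorted(process_infos, key=lambda p: p["name"]):
        lines.append(f"{info['name']}: {info['statename']}")
    return lines


def fetch_process_info(url="http://user:pass@localhost:9001/RPC2"):
    """Query a running supervisord; URL and credentials are assumptions."""
    proxy = xmlrpc.client.ServerProxy(url)
    return proxy.supervisor.getAllProcessInfo()
```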

---

## Quick Start: Supervisor

```bash
# 1. Start supervisor and all services
cd /mnt/dolphinng5_predict/prod/supervisor
./supervisorctl.sh start

# 2. Check status
./supervisorctl.sh status

# 3. View logs
./supervisorctl.sh logs exf
./supervisorctl.sh logs ob_streamer

# 4. Restart a service
./supervisorctl.sh ctl restart exf

# 5. Stop everything
./supervisorctl.sh stop
```
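For scripted health checks, the status step can be wrapped in Python. This is a rough sketch, assuming `./supervisorctl.sh status` prints the usual `supervisorctl status` columns (name, state, details); the function names and the sample process names are hypothetical. The exit code is deliberately not checked, because `supervisorctl status` can return a non-zero code when a process is not running.

```python
import subprocess


def parse_status(output):
    """Map each process name to its state from `supervisorctl status` text."""
    states = {}
    for line in output.splitlines():
        parts = line.split()
        if len(parts) >= 2:
            states[parts[0]] = parts[1]
    return states


def not_running(states):
    """Return the sorted names of processes that are not in RUNNING state."""
    return sorted(name for name, state in states.items() if state != "RUNNING")


def read_status(script="./supervisorctl.sh"):
    """Run the wrapper's status command (assumed to print supervisorctl output)."""
    result = subprocess.run([script, "status"], capture_output=True, text=True)
    return parse_status(result.stdout)
```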

---

## Alternative: Circus (Mozilla)

**Circus** is Mozilla's Python process & socket manager.

### Pros:
- ✅ Python-native (easier to extend)
- ✅ Built-in statistics (CPU, memory per process)
- ✅ Socket management
- ✅ Web dashboard

### Cons:
- ❌ Less widely used than Supervisor
- ❌ Smaller community

```bash
# Install
pip install circus

# Run
circusd circus.ini
```
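The `circus.ini` passed to `circusd` is not shown in this document. As a hedged illustration only, the sketch below generates a minimal one with a `[watcher:NAME]` section per service; the module paths are borrowed from the Honcho Procfile later in this document, and the option values (`check_delay`, `numprocesses`) are assumptions rather than a tested Dolphin configuration.

```python
import configparser
import io


def build_circus_ini(watchers):
    """Render a minimal circus.ini with one [watcher:NAME] section per service."""
    config = configparser.ConfigParser()
    config["circus"] = {"check_delay": "5"}
    for name, cmd in watchers.items():
        config[f"watcher:{name}"] = {"cmd": cmd, "numprocesses": "1"}
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


# Commands taken from the Procfile example; adjust to the real entry points.
ini_text = build_circus_ini({
    "exf": "python -m external_factors.realtime_exf_service",
    "ob": "python -m services.ob_stream_service",
})
print(ini_text)
```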

---

## Alternative: Honcho (Python Foreman)

**Honcho** is a Python port of Ruby's Foreman.

### Pros:
- ✅ Very simple (Procfile-based)
- ✅ Good for development
- ✅ Easy to understand

### Cons:
- ❌ Fewer production features
- ❌ No auto-restart on crash

```bash
# Procfile (file contents, not shell commands)
exf: python -m external_factors.realtime_exf_service
ob: python -m services.ob_stream_service
watchdog: python -m services.system_watchdog_service

# Run (from the directory containing the Procfile)
honcho start
```
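Because Honcho does not restart a crashed process, a deployment built on it would need its own restart logic. The sketch below illustrates the general idea of retrying with a growing delay, loosely modeled on Supervisor's `startretries` behavior; it is not Supervisor's implementation, and `run_with_retries` and its parameters are hypothetical names. The `sleep` argument is injectable so the delay can be replaced in tests.

```python
import subprocess
import sys
import time


def run_with_retries(cmd, startretries=3, sleep=time.sleep):
    """Run cmd, restarting after a non-zero exit with a growing delay.

    Returns the number of attempts made (one initial run plus retries).
    """
    attempts = 0
    while True:
        attempts += 1
        exit_code = subprocess.call(cmd)
        if exit_code == 0:
            return attempts  # clean exit, nothing to restart
        if attempts > startretries:
            return attempts  # give up, comparable to Supervisor's FATAL state
        sleep(attempts)  # wait 1 s, 2 s, 3 s, ... before the next attempt
```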

---

## Comparison Table

| Feature | Supervisor | Circus | Honcho | Custom Code |
|---------|-----------|--------|--------|-------------|
| Auto-restart | ✅ | ✅ | ❌ | ✅ (if built) |
| Web UI | ✅ | ✅ | ❌ | ❌ |
| Log rotation | ✅ | ✅ | ❌ | ⚠️ (manual) |
| Resource limits | ⚠️ (external, e.g. `ulimit`) | ✅ | ❌ | ⚠️ (partial) |
| API | ✅ XML-RPC | ✅ | ❌ | ❌ |
| Maturity | ⭐⭐⭐ | ⭐⭐ | ⭐⭐ | ⭐ |
| Ease of use | ⭐⭐ | ⭐⭐ | ⭐⭐⭐ | ⭐⭐ |

---

## Our Setup: Supervisor

**Location**: `/mnt/dolphinng5_predict/prod/supervisor/`

**Config**: `dolphin-supervisord.conf`

**Services managed** (loop interval in parentheses):
- `exf` - External Factors (0.5s)
- `ob_streamer` - Order Book (0.5s)
- `watchdog` - Survival Stack (10s)
- `mc_forewarner` - MC-Forewarner (4h)

**Features enabled**:
- Auto-restart with backoff
- Separate stdout/stderr logs
- Log rotation (50MB, 10 backups)
- Process groups
- Event listeners (alerts)
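To make the event-listener feature concrete, here is a minimal sketch of a listener that follows Supervisor's documented listener protocol: write `READY`, read a header line of `key:value` tokens, read `len` characters of payload, then acknowledge with `RESULT 2\nOK`. It is a stand-in for illustration, not the `crashmail` listener configured for Dolphin, and the alert action (a line on stderr) is an assumption. Running `listen()` as the listener's command would start the loop.

```python
import sys


def parse_tokens(line):
    """Parse a 'key:value key:value' line into a dict."""
    return dict(token.split(":", 1) for token in line.split())


def listen(stdin=sys.stdin, stdout=sys.stdout, stderr=sys.stderr):
    """Handle events until supervisord closes stdin."""
    while True:
        stdout.write("READY\n")  # tell supervisord we can accept an event
        stdout.flush()
        header_line = stdin.readline()
        if not header_line:
            return  # stdin closed, supervisord is shutting down
        headers = parse_tokens(header_line)
        body = stdin.read(int(headers["len"]))
        if headers.get("eventname") == "PROCESS_STATE_EXITED":
            payload = parse_tokens(body)
            if payload.get("expected") == "0":  # exit code was not expected
                stderr.write(f"ALERT: {payload['processname']} crashed\n")
                stderr.flush()
        stdout.write("RESULT 2\nOK")  # acknowledge so the next event is sent
        stdout.flush()
```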

---

## Integration with Existing Code

Your existing service code works **unchanged** with Supervisor, provided it runs in the foreground (Supervisor cannot manage a process that daemonizes itself):

```python
import time

# Your existing service (works with Supervisor).
# fetch_indicators() and push_to_hz() are the service's own methods, omitted here.
class ExFService:
    def run(self):
        while True:
            self.fetch_indicators()
            self.push_to_hz()
            time.sleep(0.5)

# Supervisor handles:
# - Starting it
# - Restarting if it crashes
# - Logging stdout/stderr
# - Monitoring
```

No code changes needed!

---

## Web Dashboard

Supervisor includes a web interface:

```ini
[inet_http_server]
port=0.0.0.0:9001
username=user
password=pass
```

Then visit: `http://localhost:9001`. Note that `port=0.0.0.0:9001` listens on every network interface; bind to `127.0.0.1:9001` if only local access is needed, and replace the placeholder `user`/`pass` credentials before production use.

---

## Summary

| Use Case | Recommendation |
|----------|---------------|
| **Production trading system** | **Supervisor** ✅ |
| Development/Testing | Honcho |
| Need sockets + stats | Circus |
| Maximum control | Custom + systemd |

We recommend **Supervisor** for Dolphin production.

---

# CHANGE LOG - All Modifications Made

## Session: 2026-03-25 (Current Session)

### 1. Supervisor Installation

**Command executed:**
```bash
pip install supervisor
```

**Result:** Supervisor 4.3.0 installed

---

### 2. Directory Structure Created

```
/mnt/dolphinng5_predict/prod/supervisor/
├── dolphin-supervisord.conf    # Main supervisor configuration
├── supervisorctl.sh            # Control wrapper script
├── logs/                       # Log directory (created)
└── run/                        # PID/socket directory (created)
```

---

### 3. Configuration File: dolphin-supervisord.conf

**Location:** `/mnt/dolphinng5_predict/prod/supervisor/dolphin-supervisord.conf`

**Contents:**
- `[supervisord]` section with logging, pidfile, environment
- `[unix_http_server]` for supervisorctl communication
- `[rpcinterface:supervisor]` for API
- `[supervisorctl]` client configuration
- `[program:exf]` - External Factors service (0.5s)
- `[program:ob_streamer]` - Order Book Streamer (0.5s)
- `[program:watchdog]` - Survival Stack Watchdog (10s)
- `[program:mc_forewarner]` - MC-Forewarner (4h)
- `[eventlistener:crashmail]` - Alert on crashes
- `[group:dolphin]` - Group all programs

**Key settings:**
- `autostart=true` - Services start when supervisord starts
- `autorestart=true` - Restart a service whenever it exits
- `startretries=3` - Up to 3 consecutive failed start attempts before the service is marked FATAL
- `stdout_logfile_maxbytes=50MB` - Rotate stdout logs at 50 MB
- `rlimit_as=512MB` - Intended memory limit per service (not a documented Supervisor program option; verify that it is actually enforced)
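A quick way to confirm that every `[program:*]` section carries the settings listed above is to parse the file with `configparser`. This is a sketch under assumptions: the `REQUIRED` mapping and the `check_programs` helper are hypothetical, and the check only compares literal values rather than reproducing Supervisor's own parsing. In practice the text would come from reading `dolphin-supervisord.conf`.

```python
import configparser

# Settings this document says each program section should carry.
REQUIRED = {"autostart": "true", "autorestart": "true"}


def check_programs(conf_text):
    """Return {section: [problems]} for every [program:*] section."""
    # RawConfigParser avoids interpolation errors on values such as %(ENV_X)s.
    parser = configparser.RawConfigParser(inline_comment_prefixes=(";",))
    parser.read_string(conf_text)
    problems = {}
    for section in parser.sections():
        if not section.startswith("program:"):
            continue
        issues = [
            f"{key} should be {expected}"
            for key, expected in REQUIRED.items()
            if parser.get(section, key, fallback="").lower() != expected
        ]
        if issues:
            problems[section] = issues
    return problems
```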

---

### 4. Control Script: supervisorctl.sh

**Location:** `/mnt/dolphinng5_predict/prod/supervisor/supervisorctl.sh`

**Commands implemented:**
- `start` - Start supervisord and all services
- `stop` - Stop all services and supervisord
- `restart` - Restart all services
- `status` - Show service status
- `logs [service]` - Show logs (last 50 lines)
- `ctl [cmd]` - Pass through to supervisorctl

**Usage:**
```bash
./supervisorctl.sh start
./supervisorctl.sh status
./supervisorctl.sh logs exf
```

---

### 5. Python Libraries Installed

**Via pip:**
- `supervisor==4.3.0` - Main process manager
- `tenacity==9.1.4` - Retry logic (previously installed)
- `schedule==1.2.2` - Task scheduling (previously installed)

**System packages checked:**
- `supervisor.noarch` available via dnf (not installed, using pip)

---

### 6. Alternative Architectures (Previously Created)

#### 6.1 Custom Supervisor (Pure Python)

**Location:** `/mnt/dolphinng5_predict/prod/services/supervisor.py`

**Features:**
- `ServiceComponent` base class
- `DolphinSupervisor` manager
- Thread-based component management
- Built-in health monitoring
- Example components: ExF, OB, Watchdog, MC

**Status:** Available but NOT primary (Supervisor preferred)

---

#### 6.2 Systemd User Services

**Location:** `~/.config/systemd/user/`

**Files created:**
- `dolphin-exf.service` - External Factors
- `dolphin-ob.service` - Order Book
- `dolphin-watchdog.service` - Watchdog
- `dolphin-mc.service` + `dolphin-mc.timer` - MC-Forewarner
- `dolphin-supervisor.service` - Custom supervisor (optional)
- `dolphin-test.service` - Test service

**Control script:** `/mnt/dolphinng5_predict/prod/services/service_manager.py`
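The contents of the unit files are not reproduced in this document. Purely as an illustration of what such a user unit might look like, the sketch below renders one from a template; the working directory, the `Restart=` policy, and the `render_user_unit` helper are assumptions, not the files actually created. The result would be written to `~/.config/systemd/user/<name>.service` and followed by `systemctl --user daemon-reload`.

```python
def render_user_unit(description, module, workdir):
    """Render a systemd user unit that runs `python -m <module>`."""
    return "\n".join([
        "[Unit]",
        f"Description={description}",
        "",
        "[Service]",
        f"WorkingDirectory={workdir}",
        # ExecStart needs an absolute path, hence /usr/bin/env.
        f"ExecStart=/usr/bin/env python -m {module}",
        "Restart=on-failure",
        "RestartSec=5",
        "",
        "[Install]",
        "WantedBy=default.target",
        "",
    ])


unit = render_user_unit(
    "Dolphin Order Book streamer (example)",
    "services.ob_stream_service",
    "/mnt/dolphinng5_predict/prod",  # assumed working directory
)
print(unit)
```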

---

### 7. Service Base Class (Boilerplate)

**Location:** `/mnt/dolphinng5_predict/prod/services/service_base.py`

**Features:**
- `ServiceBase` abstract class
- Automatic retries with tenacity
- Structured JSON logging
- Health check endpoints
- Graceful shutdown handling
- Systemd notify support
- `run_scheduled()` helper

**Status:** Available for custom implementations
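The source of `service_base.py` is not shown here, so the following is only a sketch of how two of the listed features, graceful shutdown and a `run_scheduled()` loop, could fit together. The class layout, method names other than `run_scheduled`, and the `interval` parameter are hypothetical. Waiting on a `threading.Event` instead of calling `time.sleep` lets the loop exit promptly when a stop is requested.

```python
import signal
import threading
from abc import ABC, abstractmethod


class ServiceBase(ABC):
    """Hypothetical base class: run one step per interval until asked to stop."""

    def __init__(self, interval):
        self.interval = interval
        self._stop = threading.Event()

    def install_signal_handlers(self):
        # Supervisor and systemd both send SIGTERM by default when stopping.
        signal.signal(signal.SIGTERM, self.request_stop)
        signal.signal(signal.SIGINT, self.request_stop)

    def request_stop(self, signum=None, frame=None):
        self._stop.set()

    @abstractmethod
    def step(self):
        """Do one unit of work."""

    def run_scheduled(self):
        while not self._stop.is_set():
            self.step()
            self._stop.wait(self.interval)  # sleeps, but wakes early on stop
```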

---

### 8. Documentation Files Created

| File | Location | Purpose |
|------|----------|---------|
| `INDUSTRIAL_FRAMEWORKS.md` | `/mnt/dolphinng5_predict/prod/services/` | This document - framework comparison |
| `ARCHITECTURE_CHOICE.md` | `/mnt/dolphinng5_predict/prod/services/` | Architecture options comparison |
| `README.md` | `/mnt/dolphinng5_predict/prod/services/` | General services documentation |
| `dolphin-supervisord.conf` | `/mnt/dolphinng5_predict/prod/supervisor/` | Supervisor configuration |

---

### 9. kimi.json Updated

**Change:** Associated session with ops directory

**Before:**
```json
{
  "path": "/mnt/dolphinng5_predict/prod/ops",
  "kaos": "local",
  "last_session_id": null
}
```

**After:**
```json
{
  "path": "/mnt/dolphinng5_predict/prod/ops",
  "kaos": "local",
  "last_session_id": "c23a69c5-ba4a-41c4-8624-05114e8fd9ea"
}
```

---

### 10. Session Backup

**Session backed up:** `c23a69c5-ba4a-41c4-8624-05114e8fd9ea`
- **Original location:** `~/.kimi/sessions/9330f053b5f85e950222ed1fed8f6f02/`
- **Backup location 1:** `/mnt/dolphinng5_predict/prod/ops/kimi_session_backup/`
- **Backup location 2:** `/mnt/vids/`
- **Markdown transcript:** `KIMI_Session_Rearch_Services-Prefect.md` (684 KB)

---

## Summary: What to Use

### For Production Trading System:

**Recommended: SUPERVISOR**
```bash
cd /mnt/dolphinng5_predict/prod/supervisor
./supervisorctl.sh start
./supervisorctl.sh status
```

**Why:** Battle-tested, 20+ years, web UI, API, log rotation

### For Simplicity / No Extra Deps:

**Alternative: SYSTEMD --user**
```bash
systemctl --user start dolphin-exf
systemctl --user start dolphin-ob
systemctl --user start dolphin-watchdog
```

**Why:** Built-in, no pip installs, OS-integrated

### For Full Control:

**Alternative: Custom Python**
```bash
systemctl --user start dolphin-supervisor  # Custom one
```

**Why:** Educational, customizable, no external deps

---

## Files Modified/Created Summary

### New Directories:
1. `/mnt/dolphinng5_predict/prod/supervisor/`
2. `/mnt/dolphinng5_predict/prod/supervisor/logs/`
3. `/mnt/dolphinng5_predict/prod/supervisor/run/`
4. `/mnt/dolphinng5_predict/prod/ops/kimi_session_backup/`

### New Files:
1. `/mnt/dolphinng5_predict/prod/supervisor/dolphin-supervisord.conf`
2. `/mnt/dolphinng5_predict/prod/supervisor/supervisorctl.sh`
3. `/mnt/dolphinng5_predict/prod/services/INDUSTRIAL_FRAMEWORKS.md` (this file)
4. `/mnt/dolphinng5_predict/prod/services/ARCHITECTURE_CHOICE.md`
5. `/mnt/dolphinng5_predict/prod/services/supervisor.py` (custom impl)
6. `/mnt/dolphinng5_predict/prod/services/service_base.py` (boilerplate)
7. `/mnt/dolphinng5_predict/prod/services/service_manager.py` (systemd ctl)
8. `/mnt/dolphinng5_predict/prod/ops/KIMI_Session_Rearch_Services-Prefect.md`
9. `/mnt/dolphinng5_predict/prod/ops/SESSION_INFO.txt`
10. `/mnt/dolphinng5_predict/prod/ops/resume_session.sh`

### Modified Files:
1. `~/.config/systemd/user/dolphin-*.service` (6 services)
2. `~/.config/systemd/user/dolphin-mc.timer`
3. `~/.kimi/kimi.json` (session association)

---

## Current Status

- ✅ **Supervisor 4.3.0** installed and configured
- ✅ **6 systemd user services** configured (backup option)
- ✅ **Custom supervisor** available (educational)
- ✅ **Service base class** with retries/logging (boilerplate)
- ✅ **All documentation** complete
- ✅ **Session backed up** to multiple locations

**Ready for:** Production deployment