# Service Architecture Options

## Option 1: Single Supervisor (Recommended for You)

**One systemd service → Manages multiple internal components**

```
dolphin-supervisor.service
├── ExF Component (thread)
├── OB Component (thread)
├── Watchdog Component (thread)
└── MC Component (thread)
```

**Pros:**

- One systemd unit to manage
- Components share memory efficiently
- Centralized health monitoring
- Built-in restart per component
- Lower system overhead

**Cons:**

- Single point of failure: if the supervisor process crashes, all components stop
- Less isolation between components

**Use when:** Components are tightly coupled and share data

**Commands:**

```bash
systemctl --user start dolphin-supervisor
journalctl --user -u dolphin-supervisor -f
```
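
The single-supervisor layout above can be sketched in Python as one process that runs each component in its own thread and restarts any thread that exits. This is a minimal illustration, not the actual `dolphin-supervisor` implementation: the `Component` and `Supervisor` classes, the `poll_interval` and `max_polls` parameters, and the convention that each component is a function taking a stop event are all assumptions made for the example.

```python
import threading
from typing import Callable, Dict, Optional

# Hypothetical contract: a component is a function that runs until `stop` is set.
ComponentFn = Callable[[threading.Event], None]


class Component:
    """Wraps one component function and the thread it currently runs in."""

    def __init__(self, name: str, target: ComponentFn) -> None:
        self.name = name
        self.target = target
        self.restarts = 0
        self.thread: Optional[threading.Thread] = None

    def _run(self, stop: threading.Event) -> None:
        try:
            self.target(stop)
        except Exception as exc:  # a crash ends only this thread
            print(f"[{self.name}] crashed: {exc!r}")

    def start(self, stop: threading.Event) -> None:
        self.thread = threading.Thread(
            target=self._run, args=(stop,), name=self.name, daemon=True
        )
        self.thread.start()

    def alive(self) -> bool:
        return self.thread is not None and self.thread.is_alive()


class Supervisor:
    """Starts every component, then polls and restarts any that have exited."""

    def __init__(self, poll_interval: float = 1.0) -> None:
        self.components: Dict[str, Component] = {}
        self.stop = threading.Event()
        self.poll_interval = poll_interval

    def add(self, name: str, target: ComponentFn) -> None:
        self.components[name] = Component(name, target)

    def run(self, max_polls: Optional[int] = None) -> None:
        for comp in self.components.values():
            comp.start(self.stop)
        polls = 0
        while not self.stop.is_set():
            for comp in self.components.values():
                if not comp.alive():
                    comp.restarts += 1
                    comp.start(self.stop)
            polls += 1
            if max_polls is not None and polls >= max_polls:
                break
            self.stop.wait(self.poll_interval)
        # Shutdown: signal every component and wait briefly for each thread.
        self.stop.set()
        for comp in self.components.values():
            if comp.thread is not None:
                comp.thread.join(timeout=1.0)


# Usage (names are illustrative):
#   sup = Supervisor()
#   sup.add("exf", run_exf)   # run_exf(stop) loops until stop.is_set()
#   sup.run()
```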

---

## Option 2: Multiple Separate Services

**Each component = separate systemd service**

```
dolphin-exf.service
└── ExF Component

dolphin-ob.service
└── OB Component

dolphin-watchdog.service
└── Watchdog Component
```

**Pros:**

- Full isolation between components
- Independent restart/failure domains
- Can set different resource limits per service
- systemd handles restarts, logging, and dependency ordering natively

**Cons:**

- More systemd units to manage
- Higher memory overhead (separate processes)
- IPC needed for shared data

**Use when:** Components are independent and need strong isolation

**Commands:**

```bash
./service_manager.py start
./service_manager.py status
```
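
With separate units, one action has to be fanned out across every service. The sketch below shows one way a manager script could do that. It is not the real `service_manager.py`, whose internals are not shown here: the `build_commands` and `run_action` helpers and the `dry_run` flag are hypothetical, and only the unit names come from the diagram above. With `dry_run=True` (the default) it prints the commands instead of running them.

```python
import subprocess
from typing import Dict, List, Sequence

# Unit names taken from the Option 2 diagram.
UNITS = ("dolphin-exf", "dolphin-ob", "dolphin-watchdog")
ALLOWED_ACTIONS = {"start", "stop", "restart", "status"}


def build_commands(action: str, units: Sequence[str] = UNITS) -> List[List[str]]:
    """Return one `systemctl --user <action> <unit>.service` command per unit."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")
    return [["systemctl", "--user", action, f"{unit}.service"] for unit in units]


def run_action(
    action: str, units: Sequence[str] = UNITS, dry_run: bool = True
) -> Dict[str, int]:
    """Apply `action` to every unit and return a {unit: exit code} mapping."""
    results: Dict[str, int] = {}
    for unit, cmd in zip(units, build_commands(action, units)):
        if dry_run:
            # Only show what would be executed.
            print(" ".join(cmd))
            results[unit] = 0
        else:
            # check=False: one failing unit must not abort the others.
            results[unit] = subprocess.run(cmd, check=False).returncode
    return results
```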

---

## Option 3: Hybrid (Single Supervisor + Critical Services Separate)

```
dolphin-supervisor.service
├── ExF Component
├── OB Component
└── MC Component (scheduled)

dolphin-watchdog.service (separate - critical!)
└── Watchdog Component
```
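
Running the watchdog in its own service only helps if it can observe the supervisor from outside the process. The document does not say how that check is done, so the sketch below assumes one common approach: the supervisor touches a heartbeat file periodically and the watchdog treats a stale or missing file as a failure. The helper names and the `max_age_s` threshold are hypothetical.

```python
import time
from pathlib import Path
from typing import Optional


def touch_heartbeat(path: Path) -> None:
    """Supervisor side: create the file or refresh its modification time."""
    path.touch()


def heartbeat_age(path: Path, now: Optional[float] = None) -> float:
    """Seconds since the last heartbeat; infinity if the file does not exist."""
    now = time.time() if now is None else now
    try:
        return now - path.stat().st_mtime
    except FileNotFoundError:
        return float("inf")


def supervisor_is_healthy(
    path: Path, max_age_s: float = 10.0, now: Optional[float] = None
) -> bool:
    """Watchdog side: healthy only if a heartbeat arrived within max_age_s."""
    return heartbeat_age(path, now) <= max_age_s
```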

**Use when:** One component is critical/safety-related

---

## Recommendation

For your Dolphin system, **Option 1 (Single Supervisor)** is likely best because:

1. **Tight coupling**: ExF, OB, and Watchdog all need Hazelcast
2. **Data sharing**: Components share state in process memory, so no IPC layer is needed
3. **Simplicity**: One command to start/stop everything
4. **Resource efficiency**: Lower overhead than separate processes

The supervisor handles:

- Auto-restart of failed components
- Health monitoring
- Structured logging
- Graceful shutdown
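
Two of the responsibilities listed above, structured logging and graceful shutdown, can be sketched with the standard library alone. This is an illustration under assumptions, not the supervisor's actual code: the JSON field names, the `component` record attribute, and the `install_shutdown_handlers` helper are invented for the example. It relies on systemd sending `SIGTERM` on `systemctl stop` by default, and the handlers must be installed from the main thread.

```python
import json
import logging
import signal
import threading


class JsonFormatter(logging.Formatter):
    """Emit each log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps(
            {
                "ts": self.formatTime(record),
                "level": record.levelname,
                # `component` is a hypothetical extra attribute set by callers.
                "component": getattr(record, "component", "supervisor"),
                "msg": record.getMessage(),
            }
        )


def install_shutdown_handlers(stop: threading.Event) -> None:
    """Set `stop` on SIGTERM/SIGINT so component loops can exit cleanly."""

    def _handler(signum, frame):
        stop.set()

    for sig in (signal.SIGTERM, signal.SIGINT):
        signal.signal(sig, _handler)


# Usage (illustrative):
#   handler = logging.StreamHandler()
#   handler.setFormatter(JsonFormatter())
#   logging.getLogger("dolphin").addHandler(handler)
#   logging.getLogger("dolphin").info("started", extra={"component": "exf"})
```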

---

## Quick Start: Single Supervisor

```bash
# 1. Enable and start
cd /mnt/dolphinng5_predict/prod/services
systemctl --user enable dolphin-supervisor
systemctl --user start dolphin-supervisor

# 2. Check status
systemctl --user status dolphin-supervisor

# 3. View logs
journalctl --user -u dolphin-supervisor -f

# 4. Stop
systemctl --user stop dolphin-supervisor
```