# TEST_REPORTING.md
## How Automated Test Suites Update the TUI Live Footer
**Audience**: developers writing or extending dolphin test suites
**Last updated**: 2026-04-05
---
## Overview
`dolphin_tui_v5.py` displays a live test-results footer showing the latest automated test run
per category. The footer is driven by a single JSON file:
```
/mnt/dolphinng5_predict/run_logs/test_results_latest.json
```
Any test script, pytest fixture, or CI runner can update this file by calling the
`write_test_results()` function exported from `dolphin_tui_v5.py`, or by writing the JSON
directly with the correct schema.
---
## JSON Schema
```json
{
  "_run_at": "2026-04-05T14:30:00",
  "data_integrity": {"passed": 15, "total": 15, "status": "PASS"},
  "finance_fuzz": {"passed": 8, "total": 8, "status": "PASS"},
  "signal_fill": {"passed": 6, "total": 6, "status": "PASS"},
  "degradation": {"passed": 12, "total": 12, "status": "PASS"},
  "actor": {"passed": 46, "total": 46, "status": "PASS"}
}
```
### Fields
| Field | Type | Description |
|---|---|---|
| `_run_at` | ISO-8601 string | UTC timestamp of the run. Set automatically by `write_test_results()`. |
| `data_integrity` | CategoryResult | Arrow schema + HZ key integrity tests |
| `finance_fuzz` | CategoryResult | Financial edge cases: negative capital, zero price, NaN signals |
| `signal_fill` | CategoryResult | Signal path continuity: vel_div → posture → order |
| `degradation` | CategoryResult | Graceful-degradation tests: missing HZ keys, stale data |
| `actor` | CategoryResult | DolphinActor unit + integration tests |
### CategoryResult
```json
{
  "passed": 15,     // int or null if not yet run
  "total": 15,      // int or null if not yet run
  "status": "PASS"  // "PASS" | "FAIL" | "N/A"
}
```
Use `"status": "N/A"` and `null` counts for categories not yet automated.
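A writer that builds the JSON by hand may want to check it against the schema before saving. The following is a minimal sketch, not part of `dolphin_tui_v5.py`: the names `validate_category` and `validate_payload` are hypothetical, and rejecting unknown category names is an assumption rather than documented TUI behaviour.

```python
VALID_STATUSES = {"PASS", "FAIL", "N/A"}
CATEGORIES = ("data_integrity", "finance_fuzz", "signal_fill", "degradation", "actor")


def validate_category(result):
    """Return a list of problems found in one CategoryResult dict (empty if valid)."""
    problems = []
    status = result.get("status")
    if status not in VALID_STATUSES:
        problems.append(f"bad status: {status!r}")
    for key in ("passed", "total"):
        value = result.get(key)
        # bool is a subclass of int in Python, so reject it explicitly
        if value is not None and (
            isinstance(value, bool) or not isinstance(value, int) or value < 0
        ):
            problems.append(f"{key} must be a non-negative int or None")
    passed, total = result.get("passed"), result.get("total")
    if isinstance(passed, int) and isinstance(total, int) and passed > total:
        problems.append("passed exceeds total")
    return problems


def validate_payload(payload):
    """Validate each category in a payload; keys starting with '_' are skipped."""
    problems = {}
    for name, result in payload.items():
        if name.startswith("_"):
            continue
        if name not in CATEGORIES:
            # Assumption: names outside the five documented categories are errors
            problems[name] = ["unknown category"]
            continue
        found = validate_category(result)
        if found:
            problems[name] = found
    return problems
```

An empty dict from `validate_payload()` means the payload matches the schema described above.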
---
## Python API — Preferred Method
```python
# At the end of a test run (pytest session, script, etc.)
import sys
sys.path.insert(0, "/mnt/dolphinng5_predict/Observability/TUI")
from dolphin_tui_v5 import write_test_results
write_test_results({
    "data_integrity": {"passed": 15, "total": 15, "status": "PASS"},
    "finance_fuzz": {"passed": 8, "total": 8, "status": "PASS"},
    "signal_fill": {"passed": 6, "total": 6, "status": "PASS"},
    "degradation": {"passed": 12, "total": 12, "status": "PASS"},
    "actor": {"passed": 46, "total": 46, "status": "PASS"},
})
```
`write_test_results()` does the following atomically:
1. Injects `"_run_at"` = current UTC ISO timestamp
2. Merges provided categories with any existing file (missing categories are preserved)
3. Writes to `run_logs/test_results_latest.json`
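For readers writing the JSON directly, the three steps can be approximated as below. This is a sketch under stated assumptions, not the actual body of `write_test_results()`: the function name `write_results_sketch`, the explicit `path` argument, the handling of a corrupt file, and the temp-file-plus-rename approach to atomicity are all illustrative choices.

```python
import json
import os
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def write_results_sketch(results, path):
    """Merge `results` into the JSON file at `path` and replace it atomically."""
    path = Path(path)
    merged = {}
    if path.exists():
        try:
            merged = json.loads(path.read_text())
        except json.JSONDecodeError:
            merged = {}  # assumption: a corrupt file is discarded, not repaired
    merged.update(results)  # categories absent from `results` are preserved
    # Note: this yields a "+00:00" suffix, unlike the naive timestamps shown above
    merged["_run_at"] = datetime.now(timezone.utc).isoformat(timespec="seconds")
    path.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temp file in the same directory, then rename over the target,
    # so a reader never observes a half-written file.
    fd, tmp_name = tempfile.mkstemp(dir=str(path.parent), suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as fh:
            json.dump(merged, fh, indent=2)
        os.replace(tmp_name, path)
    except BaseException:
        os.unlink(tmp_name)
        raise
    return merged
```

The temp file is created next to the target because `os.replace()` is only atomic when source and destination are on the same filesystem.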
---
## pytest Integration — conftest.py Pattern
Add a `pytest_sessionfinish` hook (a hook, not a fixture) in `conftest.py` at the repo root or the relevant test package:
```python
# conftest.py
import sys

sys.path.insert(0, "/mnt/dolphinng5_predict/Observability/TUI")
from dolphin_tui_v5 import write_test_results

_outcomes = {}  # nodeid -> True if the test passed


def pytest_runtest_logreport(report):
    """Record per-test outcome; a failure in any phase marks the test failed."""
    if report.when == "call":
        _outcomes[report.nodeid] = report.passed
    elif report.failed:  # error during setup or teardown
        _outcomes[report.nodeid] = False


def pytest_sessionfinish(session, exitstatus):
    """Push test results to TUI footer after every pytest run."""
    summary = {}
    for item in session.items:
        cat = _category_for(item)
        counts = summary.setdefault(cat, {"passed": 0, "total": 0})
        counts["total"] += 1
        if _outcomes.get(item.nodeid, False):  # no recorded pass counts as not passed
            counts["passed"] += 1
    for counts in summary.values():
        counts["status"] = "PASS" if counts["passed"] == counts["total"] else "FAIL"
    write_test_results(summary)


def _category_for(item) -> str:
    """Map a test item to a footer category based on module path."""
    path = str(item.fspath)
    if "data_integrity" in path:
        return "data_integrity"
    if "finance" in path:
        return "finance_fuzz"
    if "signal" in path:
        return "signal_fill"
    if "degradation" in path:
        return "degradation"
    return "actor"  # default bucket, also used for paths containing "actor"
```
Per-item outcomes are captured in `pytest_runtest_logreport` rather than inferred from session state, so skipped and expected-failure (xfail) tests are not counted as passed. The `stats` dictionary of pytest's built-in `terminalreporter` plugin is an alternative source for the same counts.
---
## Shell / CI Script Pattern
For shell-level CI (Prefect flows, bash scripts):
```bash
#!/bin/bash
source /home/dolphin/siloqy_env/bin/activate
# Run the suite
python -m pytest prod/tests/test_data_integrity.py -v --tb=short
EXIT=$?
# Derive the footer status from pytest's exit code (0 means every test passed)
STATUS=$( [ $EXIT -eq 0 ] && echo "PASS" || echo "FAIL" )
# Pass/total counts are not parsed from the pytest output here, so they are pushed as null
python3 - <<EOF
import sys
sys.path.insert(0, "/mnt/dolphinng5_predict/Observability/TUI")
from dolphin_tui_v5 import write_test_results
write_test_results({
    "data_integrity": {"passed": None, "total": None, "status": "$STATUS"}
})
EOF
```
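If real counts are wanted from a shell-driven run, one option is to have pytest write a JUnit XML report with its `--junitxml=<path>` flag and derive the counts from that file. The helper below is a sketch: the name `category_result_from_junit` is hypothetical, and excluding skipped tests from `total` is an assumption about how the footer should count them.

```python
import xml.etree.ElementTree as ET


def category_result_from_junit(xml_text):
    """Build a CategoryResult dict from the text of a pytest --junitxml report."""
    root = ET.fromstring(xml_text)
    tests = failed = skipped = 0
    # The root may be <testsuites> or a single <testsuite>; iter() covers both.
    for suite in root.iter("testsuite"):
        tests += int(suite.get("tests", 0))
        failed += int(suite.get("failures", 0)) + int(suite.get("errors", 0))
        skipped += int(suite.get("skipped", 0))
    total = tests - skipped  # assumption: skipped tests are left out of the counts
    if total == 0:
        return {"passed": None, "total": None, "status": "N/A"}
    status = "PASS" if failed == 0 else "FAIL"
    return {"passed": total - failed, "total": total, "status": status}
```

The returned dict can be passed to `write_test_results()` under the relevant category key in place of the `None` counts.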
---
## Test Categories — Definitions
### `data_integrity`
Verifies structural correctness of data at system boundaries:
- Arrow IPC files match `SCAN_SCHEMA` (27 fields, `schema_version="5.0.0"`)
- HZ keys present and non-empty after scanner startup
- JSON payloads deserialise without error
- Scan number monotonically increases
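The monotonicity check in the last bullet could be expressed as a small helper like the one below. This is a sketch only: `first_non_increasing` is a hypothetical name, and treating a repeated scan number as a violation (strict increase) is an assumption.

```python
def first_non_increasing(scan_numbers):
    """Return the index of the first scan number that does not exceed its predecessor.

    Returns None when the sequence is strictly increasing (or has fewer than two items).
    """
    previous = None
    for index, number in enumerate(scan_numbers):
        if previous is not None and number <= previous:
            return index
        previous = number
    return None
```

A test in this category could then assert that the helper returns `None` for the scan numbers read back from a run.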
### `finance_fuzz`
Financial edge-case property tests (Hypothesis or manual):
- AlphaBetSizer with `capital=0`, `capital<0`, `price=0`, `vel_div=NaN`
- ACB boost clamped to `[0.5, 2.0]` under all inputs
- Position sizing never produces `quantity < 0`
- Fee model: slippage + commission never exceeds gross PnL on minimum position
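As an illustration of the clamp property, a plain-Python edge-case test might look like the sketch below. `clamp_boost` is a hypothetical stand-in, not the real ACB code, and mapping NaN to a neutral boost of 1.0 is an assumption; only the `[0.5, 2.0]` range comes from the list above.

```python
import math

BOOST_MIN, BOOST_MAX = 0.5, 2.0  # range stated for the ACB boost


def clamp_boost(raw):
    """Hypothetical stand-in for the ACB boost clamp; not the real implementation."""
    if math.isnan(raw):
        return 1.0  # assumption: NaN falls back to a neutral boost
    return min(BOOST_MAX, max(BOOST_MIN, raw))


def test_boost_stays_in_range():
    """Feed extreme and non-finite inputs and check the output range."""
    edge_inputs = [0.0, -1.0, 1e308, -1e308, math.inf, -math.inf, math.nan, 0.5, 2.0]
    for raw in edge_inputs:
        boost = clamp_boost(raw)
        assert BOOST_MIN <= boost <= BOOST_MAX, f"{raw!r} -> {boost!r}"
```

In a real suite the stand-in would be replaced by an import of the function under test, and the fixed input list could be swapped for a Hypothesis strategy.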
### `signal_fill`
End-to-end signal path:
- `vel_div < -0.02` → posture becomes APEX → order generated
- `vel_div >= 0` → no new orders
- Signal correctly flows through `NDAlphaEngine → DolphinActor → NautilusOrder`
- Dedup: same `scan_number` never generates two orders
### `degradation`
Graceful degradation under missing/stale inputs:
- TUI renders without crash when any HZ key is absent
- `mc_forewarner_latest` absent → "not deployed" rendered, no exception
- `ext_features_latest` fields `None` → `_exf_str()` substitutes `"?"`
- Scanner starts with no prior Arrow files (scan_number starts at 1)
- MHS missing subsystem → RM_META excludes it gracefully
### `actor`
DolphinActor Nautilus integration:
- `on_bar()` incremental step (not batch)
- Threading lock on ACB prevents race
- `_GateSnap` stale-state detection fires within 1 bar
- Capital sync on `on_start()` matches Nautilus portfolio balance
- MC-Forewarner wired and returning envelope gate signal
---
## File Location Contract
The file path is **hardcoded** relative to the TUI module:
```python
_RESULTS_FILE = Path(__file__).parent.parent.parent / "run_logs" / "test_results_latest.json"
# resolves to: /mnt/dolphinng5_predict/run_logs/test_results_latest.json
```
Do not move `run_logs/` or rename the file — the TUI footer will silently show stale data.
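The resolution of the three `.parent` steps can be checked with a short sketch. The module location used here is inferred from the `sys.path` entries shown elsewhere in this document, and `PurePosixPath` is used so the example runs without the file existing.

```python
from pathlib import PurePosixPath

# Module location inferred from the sys.path.insert() calls in this document
module_file = PurePosixPath("/mnt/dolphinng5_predict/Observability/TUI/dolphin_tui_v5.py")

# Successive parents: .../Observability/TUI -> .../Observability -> /mnt/dolphinng5_predict
results_file = module_file.parent.parent.parent / "run_logs" / "test_results_latest.json"
print(results_file)
```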
---
## TUI Footer Refresh
The footer is read once on `on_mount()`. To force a live reload:
- Press **`t`** — toggles footer visibility (hide/show re-reads file)
- Press **`r`** — forces full panel refresh
The footer does **not** auto-watch the file for changes (no inotify). Press `t` twice after
a test run to see updated results without restarting the TUI.
---
## Bootstrap File
`run_logs/test_results_latest.json` ships with a bootstrap entry (prior manual run results)
so the footer is never blank on first launch:
```json
{
  "_run_at": "2026-04-05T00:00:00",
  "data_integrity": {"passed": 15, "total": 15, "status": "PASS"},
  "finance_fuzz": {"passed": null, "total": null, "status": "N/A"},
  "signal_fill": {"passed": null, "total": null, "status": "N/A"},
  "degradation": {"passed": 12, "total": 12, "status": "PASS"},
  "actor": {"passed": null, "total": null, "status": "N/A"}
}
```
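Any consumer of this file has to cope with `null` counts. The sketch below shows one way to render such a dict as a single text line; `format_footer` and its layout are purely illustrative and do not reflect how `dolphin_tui_v5.py` actually draws the footer.

```python
CATEGORY_ORDER = ("data_integrity", "finance_fuzz", "signal_fill", "degradation", "actor")


def format_footer(results):
    """Render one line of text from a results dict; the layout is illustrative only."""
    parts = []
    for name in CATEGORY_ORDER:
        entry = results.get(name) or {}  # a missing category is treated as not run
        passed, total = entry.get("passed"), entry.get("total")
        status = entry.get("status", "N/A")
        # null counts (None after json.load) are shown as a placeholder
        counts = "-/-" if passed is None or total is None else f"{passed}/{total}"
        parts.append(f"{name} {counts} {status}")
    return f"[{results.get('_run_at', '?')}] " + " | ".join(parts)
```

Passing the result of `json.load()` on the bootstrap file to `format_footer()` yields a line in which the three `N/A` categories show `-/-` instead of counts.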
---
*See also: `SYSTEM_BIBLE.md` §28.4 — TUI test footer architecture*