siloqy/DATA_LOCATIONS.md
HJ Normey 351ce2044d chore: safety snapshot 2026-03-05 — HCM infrastructure before 2y klines experiment
Captures critical infrastructure surrounding the nautilus_dolphin core package:
- dolphin_vbt_real.py: VBT vectorized backtest engine (6008 lines)
- dolphin_paper_trade_adaptive_cb_v2.py: champion runner (champion_5x_f20)
- _update_vbt_cache.py / update_VBT_parquet_cache.bat: cache builder
- external_factors/: ExF system (all 85 indicator fetchers + NPZ cache)
- mc_forewarning_qlabs_fork/: QLabs-enhanced MC-Forewarner research fork
- DATA_LOCATIONS.md: source-of-truth path registry
- .gitignore: excludes vbt_cache*, backfilled_data, .venv, models, etc.

Note: nautilus_dolphin/ has own git repo (inner) — safety snapshot committed there separately.
Champion state: WR=49.3%, ROI=+44.89%, PF=1.123, DD=14.95%, Sharpe=2.50 (55d, full-stack, abs_max_lev=6.0).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-05 23:51:30 +01:00

DOLPHIN NG HD Data Locations

Production Data

Location: C:\Users\Lenovo\Documents\- Dolphin NG HD (NG3)\correlation_arb512

Directory Structure

correlation_arb512/
├── matrices/
│   ├── 2025-12-26_SKIP/
│   ├── 2025-12-27_SKIP/
│   ├── ...
│   ├── 2025-12-31/
│   ├── 2026-01-01/
│   │   ├── scan_016875_w50_000003.arb512.pkl.zst
│   │   ├── scan_016875_w150_000003.arb512.pkl.zst
│   │   ├── scan_016875_w300_000003.arb512.pkl.zst
│   │   ├── scan_016875_w750_000003.arb512.pkl.zst
│   │   └── ...
│   ├── 2026-01-02/
│   ├── 2026-01-03/
│   └── 2026-01-04/
│
├── eigenvalues/
│   ├── 2025-12-26_SKIP/
│   ├── ...
│   ├── 2026-01-01/
│   │   ├── scan_016875_000003.json
│   │   ├── scan_016876_000014.json
│   │   └── ...
│   └── ...
│
├── eigenvectors/
│   └── [dated directories with eigenvector data]
│
└── metadata/
    └── [dated directories with metadata]

File Naming Convention

Eigenvalue JSON: scan_NNNNNN_HHMMSS.json

  • NNNNNN: 6-digit scan number
  • HHMMSS: 6-digit timestamp (hours, minutes, seconds)

Matrix ZST: scan_NNNNNN_wWWW_HHMMSS.arb512.pkl.zst

  • NNNNNN: 6-digit scan number (matches eigenvalue)
  • WWW: Window size (50, 150, 300, or 750; not zero-padded, e.g. w50)
  • HHMMSS: 6-digit timestamp (matches eigenvalue)
  • .arb512.pkl.zst: Zstandard-compressed pickle with 512-bit arb precision
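
The naming convention above can be checked mechanically. The following is a minimal sketch, not part of the project: the names ScanFile and parse_scan_filename are hypothetical, and the regular expressions are inferred only from the patterns and example file names in this document (in particular, the window size is assumed to be an unpadded integer).

```python
import re
from typing import NamedTuple, Optional

# Patterns inferred from the documented convention; treat them as assumptions.
EIGEN_RE = re.compile(r"^scan_(?P<scan>\d{6})_(?P<ts>\d{6})\.json$")
MATRIX_RE = re.compile(
    r"^scan_(?P<scan>\d{6})_w(?P<window>\d+)_(?P<ts>\d{6})\.arb512\.pkl\.zst$"
)


class ScanFile(NamedTuple):
    """Hypothetical record describing one parsed file name."""
    kind: str               # "matrix" or "eigenvalue"
    scan: int               # scan number without leading zeros
    window: Optional[int]   # window size, None for eigenvalue files
    timestamp: str          # HHMMSS kept as text to preserve leading zeros


def parse_scan_filename(name: str) -> Optional[ScanFile]:
    """Return the parsed fields, or None if the name fits neither pattern."""
    m = MATRIX_RE.match(name)
    if m:
        return ScanFile("matrix", int(m["scan"]), int(m["window"]), m["ts"])
    m = EIGEN_RE.match(name)
    if m:
        return ScanFile("eigenvalue", int(m["scan"]), None, m["ts"])
    return None


if __name__ == "__main__":
    print(parse_scan_filename("scan_016875_w150_000003.arb512.pkl.zst"))
    print(parse_scan_filename("scan_016875_000003.json"))
```

Keeping the timestamp as a string avoids losing leading zeros such as 000003.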

SKIP Directories

Directories with a _SKIP suffix must be excluded from processing. They contain data that failed validation or was otherwise marked for exclusion.
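
One way to apply this rule when walking a data directory is sketched below. The helper name iter_dated_dirs is hypothetical, and the sketch assumes every non-SKIP subdirectory of the given root (for example matrices/ or eigenvalues/) should be processed.

```python
from pathlib import Path
from typing import Iterator


def iter_dated_dirs(root: Path) -> Iterator[Path]:
    """Yield subdirectories of `root` in name order, skipping *_SKIP ones."""
    for entry in sorted(root.iterdir()):
        # Plain files are ignored; only directories without the suffix pass.
        if entry.is_dir() and not entry.name.endswith("_SKIP"):
            yield entry
```

For example, iterating over iter_dated_dirs(data_root / "matrices") with the tree shown earlier would visit 2025-12-31/ and later directories but not 2025-12-26_SKIP/.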


Test Data (Current Project)

Location: C:\Users\Lenovo\Documents\- DOLPHIN NG HD HCM TSF Predict

Test data should mirror the production structure but hold only a subset of the data:

- DOLPHIN NG HD HCM TSF Predict/
├── matrices/
│   ├── [root level files - legacy format]
│   └── 2026-01-03/
├── eigenvalues/
│   ├── 2026-01-01/
│   └── 2026-01-03/
└── ...

Note: Test data scan numbers may not match between directories. Always verify pairing before running pipelines.
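
A pairing check along these lines could be run before a pipeline. This is only a sketch under stated assumptions: check_pairing and scan_numbers are hypothetical helpers, "paired" here means the same 6-digit scan number appears in both matrices/<date>/ and eigenvalues/<date>/, and the file name patterns are taken from the naming convention in this document.

```python
import re
from pathlib import Path
from typing import Dict, Set

# Patterns assumed from the documented file naming convention.
EIGEN_RE = re.compile(r"^scan_(\d{6})_\d{6}\.json$")
MATRIX_RE = re.compile(r"^scan_(\d{6})_w\d+_\d{6}\.arb512\.pkl\.zst$")


def scan_numbers(directory: Path, pattern: re.Pattern) -> Set[int]:
    """Collect scan numbers from file names in `directory` matching `pattern`."""
    if not directory.is_dir():
        return set()
    return {
        int(m.group(1))
        for p in directory.iterdir()
        if (m := pattern.match(p.name))
    }


def check_pairing(root: Path, date: str) -> Dict[str, Set[int]]:
    """Compare scan numbers between matrices/<date> and eigenvalues/<date>."""
    matrices = scan_numbers(root / "matrices" / date, MATRIX_RE)
    eigen = scan_numbers(root / "eigenvalues" / date, EIGEN_RE)
    return {
        "paired": matrices & eigen,
        "matrix_only": matrices - eigen,
        "eigenvalue_only": eigen - matrices,
    }
```

A non-empty matrix_only or eigenvalue_only set for a date indicates that the two directories do not line up and the pipeline input should be reviewed first.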


Quick Reference

Environment   Path
Production    C:\Users\Lenovo\Documents\- Dolphin NG HD (NG3)\correlation_arb512
Test/Dev      C:\Users\Lenovo\Documents\- DOLPHIN NG HD HCM TSF Predict

  • ZST_Compressed_Matrix_DOLPHIN_format_spec.md - Detailed format specification for .arb512.pkl.zst files
  • run_joint_encoder_pipeline.py - Pipeline using this data

Last updated: 2026-01-10