# DOLPHIN NG HD Data Locations

## Production Data

**Location**: `C:\Users\Lenovo\Documents\- Dolphin NG HD (NG3)\correlation_arb512`

### Directory Structure

```
correlation_arb512/
├── matrices/
│   ├── 2025-12-26_SKIP/
│   ├── 2025-12-27_SKIP/
│   ├── ...
│   ├── 2025-12-31/
│   ├── 2026-01-01/
│   │   ├── scan_016875_w50_000003.arb512.pkl.zst
│   │   ├── scan_016875_w150_000003.arb512.pkl.zst
│   │   ├── scan_016875_w300_000003.arb512.pkl.zst
│   │   ├── scan_016875_w750_000003.arb512.pkl.zst
│   │   └── ...
│   ├── 2026-01-02/
│   ├── 2026-01-03/
│   └── 2026-01-04/
│
├── eigenvalues/
│   ├── 2025-12-26_SKIP/
│   ├── ...
│   ├── 2026-01-01/
│   │   ├── scan_016875_000003.json
│   │   ├── scan_016876_000014.json
│   │   └── ...
│   └── ...
│
├── eigenvectors/
│   └── [dated directories with eigenvector data]
│
└── metadata/
    └── [dated directories with metadata]
```

### File Naming Convention

**Eigenvalue JSON**: `scan_NNNNNN_HHMMSS.json`

- `NNNNNN`: 6-digit scan number
- `HHMMSS`: Timestamp (HHMMSS format)

**Matrix ZST**: `scan_NNNNNN_wWWW_HHMMSS.arb512.pkl.zst`

- `NNNNNN`: 6-digit scan number (matches eigenvalue)
- `WWW`: Window size (50, 150, 300, 750)
- `HHMMSS`: Timestamp
- `.arb512.pkl.zst`: Blosc-compressed pickle with 512-bit arb precision

### SKIP Directories

Directories with `_SKIP` suffix should be excluded from processing. These contain data that failed validation or is marked for exclusion.

---

## Test Data (Current Project)

**Location**: `C:\Users\Lenovo\Documents\- DOLPHIN NG HD HCM TSF Predict`

Test data should mirror production structure with partial data:

```
- DOLPHIN NG HD HCM TSF Predict/
├── matrices/
│   ├── [root level files - legacy format]
│   └── 2026-01-03/
├── eigenvalues/
│   ├── 2026-01-01/
│   └── 2026-01-03/
└── ...
```

**Note**: Test data scan numbers may not match between directories. Always verify pairing before running pipelines.

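
One way to verify pairing is to parse the file names described under File Naming Convention and compare the two directories for a given date. The sketch below is illustrative only and is not taken from `run_joint_encoder_pipeline.py`. The helper names (`parse_eigen_name`, `parse_matrix_name`, `dated_dirs`, `check_pairing`) are hypothetical, and it assumes that a "pair" means the same scan number and timestamp, and that every eigenvalue file should have matrices for all four window sizes. It only inspects file names and never opens or decompresses a file.

```python
import re
from pathlib import Path

# Patterns follow the naming convention documented above.
EIGEN_RE = re.compile(r"^scan_(\d{6})_(\d{6})\.json$")
MATRIX_RE = re.compile(r"^scan_(\d{6})_w(\d+)_(\d{6})\.arb512\.pkl\.zst$")

# Assumption: each scan is expected to have all four documented windows.
EXPECTED_WINDOWS = {50, 150, 300, 750}


def parse_eigen_name(name):
    """Return (scan, timestamp) as strings, or None if the name does not match."""
    m = EIGEN_RE.match(name)
    return (m.group(1), m.group(2)) if m else None


def parse_matrix_name(name):
    """Return (scan, window, timestamp), or None if the name does not match."""
    m = MATRIX_RE.match(name)
    return (m.group(1), int(m.group(2)), m.group(3)) if m else None


def dated_dirs(parent):
    """List subdirectories of `parent`, excluding those with a _SKIP suffix."""
    return sorted(
        p for p in parent.iterdir()
        if p.is_dir() and not p.name.endswith("_SKIP")
    )


def check_pairing(root, date):
    """Map (scan, timestamp) -> sorted list of missing window sizes.

    Only eigenvalue files that lack one or more matrices appear in the result.
    A missing matrices/<date> directory is treated as "no matrices";
    a missing eigenvalues/<date> directory raises FileNotFoundError.
    """
    root = Path(root)
    mat_dir = root / "matrices" / date
    eig_dir = root / "eigenvalues" / date

    # Collect the window sizes present for each (scan, timestamp).
    windows = {}
    if mat_dir.is_dir():
        for f in mat_dir.iterdir():
            parsed = parse_matrix_name(f.name)
            if parsed:
                scan, window, ts = parsed
                windows.setdefault((scan, ts), set()).add(window)

    # Compare every eigenvalue file against the expected window set.
    problems = {}
    for f in eig_dir.iterdir():
        key = parse_eigen_name(f.name)
        if key is None:
            continue  # ignore files that do not follow the convention
        missing = EXPECTED_WINDOWS - windows.get(key, set())
        if missing:
            problems[key] = sorted(missing)
    return problems
```

Calling `check_pairing(root, "2026-01-03")` with `root` set to the test data location returns an empty dict when every eigenvalue file for that date has all four matrices. Legacy-format files at the root of `matrices/` are not examined, since this sketch only looks inside the dated subdirectory.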

---

## Quick Reference

| Environment | Path |
|-------------|------|
| **Production** | `C:\Users\Lenovo\Documents\- Dolphin NG HD (NG3)\correlation_arb512` |
| **Test/Dev** | `C:\Users\Lenovo\Documents\- DOLPHIN NG HD HCM TSF Predict` |

---

## Related Documentation

- **ZST_Compressed_Matrix_DOLPHIN_format_spec.md** - Detailed format specification for `.arb512.pkl.zst` files
- **run_joint_encoder_pipeline.py** - Pipeline using this data

---

*Last updated: 2026-01-10*