DOLPHIN/external_factors/EsoFactors_Test_Prompt.md


Now, before proceeding: if the new esoteric factor engine is modeled after the established factor engine, it should have fallback, filling, and backfilling capabilities. Please backfill all 55 days, backfill only, ending in backfill mode, and backfill all the esoteric factors. I am assuming the data sources are either computable or historically retrievable; where they are not, fetch the information from other sources so that the mainstay can be backfilled. When you're done with that, without altering any of the main algorithms, look for any and all correlations between the output of the esoteric factors and drawdown or run-up, profitability (or not) of the particular day, and other system variables. In short, correlate all against all, looking for any strong confirmations or correlations.
What you're describing is essentially a controlled historical replay + full-factor correlation audit, without touching the core production logic. I'll structure this in clean operational terms.
---
1⃣ Esoteric Factor Engine — Full 55-Day Backfill Mode
Assumptions:
Esoteric engine mirrors the established factor engine architecture.
All data sources are either:
Deterministically computable from price/volume history, or
Historically retrievable from external APIs or stored archives.
No modification to factor formulas.
---
A. Backfill Design Principles
Mode: BACKFILL_ONLY_END
This means:
Replay day-by-day from T-55 → T-1.
Compute all esoteric factors exactly as if live.
Write values to historical store.
No strategy decisions executed.
No incremental learning updates.
No state mutation except factor history buffers.
---
B. Fallback Hierarchy
For each esoteric factor:
1. Primary historical source
(official data store / archive)
2. Secondary API historical endpoint
3. Deterministic reconstruction
Recompute from base OHLCV
Reconstruct state from rolling window
4. Synthetic proxy fallback
Only if mathematically derivable
Must be flagged as fallback_level = 3 (levels are zero-indexed: 0 = primary source, 3 = synthetic proxy)
Log the fallback level for each factor/day.
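
The hierarchy above can be sketched as an ordered list of sources that are tried in turn. This is a minimal illustration, not the engine's actual code: the `Source` type, `FactorValue`, `resolve_factor`, and the four stand-in source functions are all hypothetical names, and the levels are assumed to be zero-indexed so that the synthetic proxy lands on `fallback_level = 3`.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence

# Assumption: a source takes a day key and returns a value, or None when it has no data.
Source = Callable[[str], Optional[float]]


@dataclass
class FactorValue:
    value: Optional[float]
    fallback_level: Optional[int]  # 0 = primary ... 3 = synthetic proxy; None = all failed


def resolve_factor(day: str, sources: Sequence[Source]) -> FactorValue:
    """Try each source in priority order and record which level succeeded."""
    for level, source in enumerate(sources):
        try:
            value = source(day)
        except Exception:
            value = None  # a failing source falls through to the next level
        if value is not None:
            return FactorValue(value, level)
    return FactorValue(None, None)


# Hypothetical stand-in sources, for illustration only.
def primary_archive(day: str) -> Optional[float]:
    return None  # simulated archive miss


def secondary_api(day: str) -> Optional[float]:
    raise TimeoutError("simulated API outage")


def reconstruct_from_ohlcv(day: str) -> Optional[float]:
    return 0.42  # pretend this was recomputed from base OHLCV


def synthetic_proxy(day: str) -> Optional[float]:
    return 0.0


result = resolve_factor(
    "2024-01-05",
    [primary_archive, secondary_api, reconstruct_from_ohlcv, synthetic_proxy],
)
print(result)
```

Storing the level next to the value is what later makes it possible to check whether a correlation survives when only low-fallback days are kept.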
---
C. Backfill Procedure
Step 1 — Freeze Production State
Snapshot:
Rolling buffers
Latent embeddings (if any)
Volatility states
Regime states
Step 2 — Initialize Clean Historical Buffers
Clear only:
Esoteric factor buffers
Derived rolling statistics
Keep:
Core algorithm logic intact.
---
Step 3 — Replay Loop (55 days)
For each day d from T-55 to T-1:
1. Load OHLCV
2. Fetch or reconstruct all auxiliary inputs
3. Compute:
All esoteric factors
All intermediate states
Any smoothed or filtered outputs
Any latent representations
4. Append to:
esoteric_factor_store[d]
factor_output_store[d]
system_state_snapshot[d]
No trading logic triggered.
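
As a rough sketch of the replay loop, assuming calendar days rather than trading days and using hypothetical `load_ohlcv` and `compute_factors` callables in place of the real engine, the loop only reads inputs and appends factor values. Nothing in it calls strategy or learning code.

```python
from datetime import date, timedelta
from typing import Callable, Dict, Optional

Bars = Dict[str, float]
Factors = Dict[str, float]


def replay_backfill(
    t: date,
    n_days: int,
    load_ohlcv: Callable[[date], Optional[Bars]],
    compute_factors: Callable[[date, Bars], Factors],
) -> Dict[date, Factors]:
    """Replay T-n_days .. T-1 in chronological order and collect factor values."""
    store: Dict[date, Factors] = {}
    for offset in range(n_days, 0, -1):  # T-55, T-54, ..., T-1
        d = t - timedelta(days=offset)
        bars = load_ohlcv(d)
        if bars is None:
            continue  # assumption: skip days with no data instead of filling them
        store[d] = compute_factors(d, bars)
    return store


# Hypothetical stand-ins so the sketch runs on its own.
def fake_ohlcv(d: date) -> Bars:
    base = 100.0 + d.toordinal() % 7
    return {"open": base, "high": base + 2.0, "low": base - 1.0,
            "close": base + 0.5, "volume": 1000.0}


def range_factor(d: date, bars: Bars) -> Factors:
    return {"range_pct": (bars["high"] - bars["low"]) / bars["close"]}


t = date(2024, 3, 1)
store = replay_backfill(t, 55, fake_ohlcv, range_factor)
print(len(store), min(store), max(store))
```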
---
Step 4 — Integrity Check
After replay:
Check no NaNs
Check window completeness
Check rolling windows are properly warmed up
Confirm fallback levels < threshold (ideally mostly 0/1)
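
Some of these checks are easy to automate. The sketch below assumes the backfilled values live in plain nested dictionaries (`{day: {factor: value}}`) and that "window completeness" means every expected day is present; `integrity_report` and the `max_level` threshold are illustrative names, and the rolling-window warm-up check is left out because it depends on each factor's window length.

```python
import math
from typing import Dict, Hashable, List, Optional


def integrity_report(
    factor_store: Dict[Hashable, Dict[str, Optional[float]]],
    fallback_levels: Dict[Hashable, Dict[str, int]],
    expected_days: List[Hashable],
    max_level: int = 1,
) -> dict:
    """Collect missing days, NaN/None cells, and cells above the fallback threshold."""
    missing_days = [d for d in expected_days if d not in factor_store]
    nan_cells = [
        (d, name)
        for d, row in factor_store.items()
        for name, v in row.items()
        if v is None or (isinstance(v, float) and math.isnan(v))
    ]
    degraded = [
        (d, name, lvl)
        for d, row in fallback_levels.items()
        for name, lvl in row.items()
        if lvl > max_level
    ]
    return {
        "missing_days": missing_days,
        "nan_cells": nan_cells,
        "degraded": degraded,
        "ok": not (missing_days or nan_cells or degraded),
    }


# Tiny made-up example: one NaN, one level-3 fallback, one missing day.
factors = {"d1": {"f": 0.1, "g": float("nan")}, "d2": {"f": 0.2, "g": 0.3}}
levels = {"d1": {"f": 0, "g": 3}, "d2": {"f": 1, "g": 0}}
report = integrity_report(factors, levels, ["d1", "d2", "d3"])
print(report)
```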
---
2⃣ Correlation Audit — All Against All
Now comes the analytical part.
We compute correlations between:
A. Target Variables
Per day:
Max intraday drawdown
Max intraday run-up
Close-to-close return
Strategy P&L
Win/Loss (binary)
Profitability magnitude
Volatility
Spread
Liquidity proxy
Regime label
Latent manifold distortion (if present)
Drift velocity (if you implemented that earlier idea)
---
B. Factor Groups
1. Core factors
2. Esoteric factors
3. Combined outputs
4. Latent embeddings (if available)
5. Signal strength
6. Conviction metrics
7. Confidence weighting
8. Any internal risk throttles
---
C. Correlation Types to Compute
You want more than Pearson.
Compute:
1. Pearson correlation
2. Spearman rank correlation
3. Kendall tau
4. Mutual information
5. Distance correlation
6. Rolling correlation (7, 14, 30 days)
7. Lagged correlation (±1, ±2, ±3 days)
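
To make three of these concrete (Pearson, Spearman, and lagged correlation), here is a small standard-library sketch. The function names are illustrative, and the sign convention for `lag` (positive means the factor leads the target) is an assumption, not something the audit prescribes. In practice a library such as pandas or SciPy would normally be used instead.

```python
import math
from typing import Callable, List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; returns NaN if either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def ranks(v: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(v)), key=lambda i: v[i])
    r = [0.0] * len(v)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and v[order[j + 1]] == v[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def lagged(x: Sequence[float], y: Sequence[float], lag: int,
           corr: Callable = pearson) -> float:
    """Correlate x[t] with y[t + lag]; positive lag means x leads y."""
    if lag > 0:
        return corr(x[:-lag], y[lag:])
    if lag < 0:
        return corr(x[-lag:], y[:lag])
    return corr(x, y)


# Made-up series: the target is the factor shifted by one day.
factor = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0]
target = [0.0] + factor[:-1]
for lag in (-1, 0, 1):
    print(lag, lagged(factor, target, lag))
```

With only 55 daily observations, each extra day of lag removes a data point, so long lags quickly become noisy.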
---
D. Binary Outcome Testing
For profitability:
Logistic regression coefficients
Point-biserial correlation
Information coefficient (IC)
t-stat significance
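
For the win/loss target, the point-biserial correlation and its t-statistic can be computed directly. The sketch below uses the standard formulas r = (M1 - M0) / s * sqrt(p * (1 - p)), with s the population standard deviation, and t = r * sqrt((n - 2) / (1 - r^2)). The function name and sample data are made up for illustration.

```python
import math
from typing import Sequence, Tuple


def point_biserial(binary: Sequence[int], values: Sequence[float]) -> Tuple[float, float]:
    """Return (r, t) for a 0/1 outcome against a continuous factor."""
    n = len(values)
    wins = [v for b, v in zip(binary, values) if b == 1]
    losses = [v for b, v in zip(binary, values) if b == 0]
    if not wins or not losses:
        raise ValueError("need at least one win and one loss")
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)  # population SD
    if sd == 0:
        raise ValueError("factor is constant")
    p = len(wins) / n
    r = (sum(wins) / len(wins) - sum(losses) / len(losses)) / sd * math.sqrt(p * (1 - p))
    denom = 1 - r * r
    # A perfect separation gives |r| = 1, where the t-statistic is unbounded.
    t = math.copysign(math.inf, r) if denom <= 0 else r * math.sqrt((n - 2) / denom)
    return r, t


# Made-up example: higher factor values on winning days.
win_flags = [1, 1, 1, 0, 0, 0]
factor_values = [3.0, 4.0, 5.0, 0.0, 1.0, 2.0]
r, t = point_biserial(win_flags, factor_values)
print(round(r, 4), round(t, 3))
```

The t-statistic has n - 2 degrees of freedom, so with 55 days it should be compared against a t-distribution with 53 degrees of freedom.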
---
E. Cross-Correlation Matrix
You compute:
corr_matrix = corr(all_factors ∪ all_targets)
Then:
Extract |corr| > 0.6
Flag p < 0.05
Flag stable correlations across rolling windows
Flag correlations that persist across fallback levels
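
A minimal sketch of the first two filters follows, assuming all factors and targets are held as equal-length lists in one dictionary. Note the substitution: p-values here come from a permutation test rather than the usual parametric t-test, so that the example needs only the standard library. `flag_pairs`, `permutation_p`, and the sample series are hypothetical.

```python
import math
import random
from itertools import combinations
from typing import Dict, List, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def permutation_p(x: Sequence[float], y: Sequence[float],
                  n_perm: int = 2000, seed: int = 0) -> float:
    """Two-sided permutation p-value for the Pearson correlation."""
    rng = random.Random(seed)
    observed = abs(pearson(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson(x, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def flag_pairs(series: Dict[str, List[float]], min_abs_corr: float = 0.6,
               alpha: float = 0.05) -> List[Tuple[str, str, float, float]]:
    """Return (name_a, name_b, r, p) for pairs with |r| > min_abs_corr and p < alpha."""
    flagged = []
    for a, b in combinations(series, 2):
        r = pearson(series[a], series[b])
        if math.isnan(r) or abs(r) <= min_abs_corr:
            continue
        p = permutation_p(series[a], series[b])
        if p < alpha:
            flagged.append((a, b, round(r, 3), p))
    return sorted(flagged, key=lambda row: -abs(row[2]))


# Made-up data: one factor that tracks drawdown, one unrelated series.
days = range(20)
wiggle = [float((i * 7) % 5 - 2) for i in days]
series = {
    "eso_factor": [float(i) for i in days],
    "max_drawdown": [-2.0 * i + wiggle[i] for i in days],
    "unrelated": wiggle,
}
flagged = flag_pairs(series)
print(flagged)
```

Because every pair is tested, some pairs will pass `p < 0.05` by chance alone, which is one reason the stability and fallback-level checks in the list above matter.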
---
3⃣ What You're Actually Looking For
This audit answers:
1. Do esoteric factors anticipate drawdown?
2. Do they amplify run-up?
3. Are they redundant with core factors?
4. Are they orthogonal alpha?
5. Do they correlate only in stressed regimes?
6. Do they degrade performance in low-vol days?
7. Do latent geometry distortions align with profitability?
---
4⃣ Interpretation Layer
You classify correlations into:
Strong Confirmations
Stable across windows
Significant
Not present in core-only model
Conditional Correlations
Appear only in high volatility
Appear only in drawdown clusters
Spurious / Structural
Correlate due to shared base data
Dangerous
Correlate negatively with profitability
Increase drawdown magnitude
---
5⃣ Deliverables You Should Generate
1. Heatmap of full correlation matrix
2. Ranked factor impact table
3. Stability score per factor
4. Redundancy map (clustered)
5. Regime-conditional breakdown
6. Factor → drawdown predictive ranking
7. Factor → run-up predictive ranking
---
6⃣ Critical Warning
Do NOT:
Change algorithm weights.
Remove factors.
Normalize differently.
Retrain anything.
This is purely diagnostic.
---
7⃣ What This Tells You Strategically
If strong correlation emerges between:
Esoteric manifold distortion and drawdown
→ you've built a stress sensor.
If strong correlation emerges between:
Drift velocity and next-day profitability
→ you have regime anticipation.
If esoteric factors are mostly redundant
→ compress the engine.
If orthogonal and stable
→ you've added real signal depth.