Includes core prod + GREEN/BLUE subsystems: - prod/ (BLUE harness, configs, scripts, docs) - nautilus_dolphin/ (GREEN Nautilus-native impl + dvae/ preserved) - adaptive_exit/ (AEM engine + models/bucket_assignments.pkl) - Observability/ (EsoF advisor, TUI, dashboards) - external_factors/ (EsoF producer) - mc_forewarning_qlabs_fork/ (MC regime/envelope) Excludes runtime caches, logs, backups, and reproducible artifacts per .gitignore.
3754 lines
42 KiB
Plaintext
Executable File
3754 lines
42 KiB
Plaintext
Executable File
Aeronautical-Grade High-Frequency
|
||
Trading
|
||
|
||
System
|
||
|
||
Specification:
|
||
|
||
Upgrading
|
||
|
||
the
|
||
|
||
DOLPHIN-NAUTILUS
|
||
|
||
Distributed
|
||
|
||
Reactive
|
||
|
||
Mesh
|
||
|
||
The quantitative trading landscape has entered an era where the distinction between financial
|
||
infrastructure
|
||
|
||
and
|
||
|
||
aerospace
|
||
|
||
avionics
|
||
|
||
is
|
||
|
||
no
|
||
|
||
longer
|
||
|
||
a
|
||
|
||
matter
|
||
|
||
of
|
||
|
||
analogy,
|
||
|
||
but
|
||
|
||
of
|
||
|
||
technical
|
||
|
||
convergence.
|
||
|
||
The
|
||
|
||
DOLPHIN-NAUTILUS
|
||
|
||
platform,
|
||
|
||
in
|
||
|
||
its
|
||
|
||
current
|
||
|
||
research-driven
|
||
|
||
state,
|
||
|
||
has
|
||
|
||
achieved
|
||
|
||
high-fidelity
|
||
|
||
signal
|
||
|
||
generation
|
||
|
||
and
|
||
|
||
a
|
||
|
||
defense-in-depth
|
||
|
||
execution
|
||
|
||
architecture.
|
||
1
|
||
|
||
However,
|
||
|
||
to
|
||
|
||
transition
|
||
|
||
from
|
||
|
||
a
|
||
|
||
research
|
||
|
||
stack
|
||
|
||
to
|
||
|
||
an
|
||
|
||
aeronautics-grade
|
||
|
||
high-frequency
|
||
|
||
trading
|
||
|
||
(HFT)
|
||
|
||
system,
|
||
|
||
a
|
||
|
||
fundamental
|
||
|
||
architectural
|
||
|
||
metamorphosis
|
||
|
||
is
|
||
|
||
required.
|
||
|
||
This
|
||
|
||
specification
|
||
|
||
defines
|
||
|
||
the
|
||
|
||
desiderata
|
||
|
||
for
|
||
|
||
a
|
||
|
||
Level
|
||
|
||
4
|
||
|
||
system,
|
||
|
||
shifting
|
||
|
||
from
|
||
|
||
monolithic
|
||
|
||
Python
|
||
|
||
daemons
|
||
|
||
to
|
||
|
||
a
|
||
|
||
distributed
|
||
|
||
reactive
|
||
|
||
mesh
|
||
|
||
that
|
||
|
||
integrates
|
||
|
||
Python’s
|
||
|
||
flexibility
|
||
|
||
with
|
||
|
||
the
|
||
|
||
low-latency
|
||
|
||
guarantees
|
||
|
||
of
|
||
|
||
Hazelcast,
|
||
|
||
the
|
||
|
||
orchestration
|
||
|
||
rigor
|
||
|
||
of
|
||
|
||
Prefect,
|
||
|
||
and
|
||
|
||
the
|
||
|
||
performance-critical
|
||
|
||
Rust
|
||
|
||
core
|
||
|
||
of
|
||
|
||
Nautilus-Trader.
|
||
1
|
||
The current system relies on price action heuristics and lagging indicators, whereas the
|
||
upgraded
|
||
|
||
Alpha
|
||
|
||
Engine
|
||
|
||
operates
|
||
|
||
on
|
||
|
||
the
|
||
|
||
physics
|
||
|
||
of
|
||
|
||
the
|
||
|
||
order
|
||
|
||
book,
|
||
|
||
utilizing
|
||
|
||
eigenvalue
|
||
|
||
entropy
|
||
|
||
(v50,
|
||
|
||
v750)
|
||
|
||
and
|
||
|
||
plunge
|
||
|
||
depth
|
||
|
||
(vel_div)
|
||
|
||
to
|
||
|
||
anticipate
|
||
|
||
structural
|
||
|
||
shifts.
|
||
1
|
||
The defensive layers must
|
||
evolve
|
||
|
||
from
|
||
|
||
static
|
||
|
||
penalty
|
||
|
||
tables
|
||
|
||
to
|
||
|
||
a
|
||
|
||
multiplicative
|
||
|
||
survival
|
||
|
||
controller
|
||
|
||
driven
|
||
|
||
by
|
||
|
||
control
|
||
|
||
theory,
|
||
|
||
ensuring
|
||
|
||
that
|
||
|
||
risk
|
||
|
||
exposure
|
||
|
||
is
|
||
|
||
a
|
||
|
||
continuous
|
||
|
||
function
|
||
|
||
of
|
||
|
||
the
|
||
|
||
system’s
|
||
|
||
epistemic
|
||
|
||
confidence
|
||
|
||
in
|
||
|
||
its
|
||
|
||
world
|
||
|
||
model.
|
||
1
|
||
This transition addresses the "Variance Drain" problem that frequently
|
||
compromises
|
||
|
||
retail-grade
|
||
|
||
systems
|
||
|
||
by
|
||
|
||
implementing
|
||
|
||
a
|
||
|
||
geometric
|
||
|
||
sweet
|
||
|
||
spot
|
||
|
||
for
|
||
|
||
leverage
|
||
|
||
at
|
||
|
||
6.0x,
|
||
|
||
beyond
|
||
|
||
which
|
||
|
||
variance
|
||
|
||
drag
|
||
|
||
exceeds
|
||
|
||
arithmetic
|
||
|
||
growth.
|
||
1
|
||
I. Current System Baseline and Strategic Deficiencies
|
||
The current operational state, as documented in the Siloqy environment, serves as the point of
|
||
departure
|
||
|
||
for
|
||
|
||
this
|
||
|
||
aeronautical
|
||
|
||
upgrade.
|
||
|
||
The
|
||
|
||
baseline
|
||
|
||
system,
|
||
|
||
described
|
||
|
||
in
|
||
|
||
the
|
||
|
||
production
|
||
|
||
bring-up
|
||
|
||
guide,
|
||
|
||
operates
|
||
|
||
on
|
||
|
||
a
|
||
|
||
Windows
|
||
|
||
11
|
||
|
||
platform
|
||
|
||
using
|
||
|
||
Docker
|
||
|
||
Desktop
|
||
|
||
for
|
||
|
||
containerization.
|
||
1
|
||
|
||
While
|
||
|
||
functional
|
||
|
||
for
|
||
|
||
paper
|
||
|
||
trading,
|
||
|
||
this
|
||
|
||
environment
|
||
|
||
lacks
|
||
|
||
the
|
||
|
||
deterministic
|
||
|
||
latency
|
||
|
||
profiles
|
||
|
||
required
|
||
|
||
for
|
||
|
||
institutional-grade
|
||
|
||
HFT.
|
||
|
||
System Attribute Current State (Baseline) Desideratum (Aeronautical Grade)
|
||
|
||
Operating System Windows 11
|
||
1
|
||
Tuned Linux with Real-Time Kernel
|
||
2
|
||
Networking Stack Standard Kernel TCP/IP Kernel Bypass (DPDK/RDMA)
|
||
3
|
||
Execution Latency ~15ms (JSON/NPZ I/O)
|
||
1
|
||
< 100 microseconds (In-Memory)
|
||
1
|
||
Data Sharing File-based (JSON, NPZ, Arrow)
|
||
1
|
||
Distributed In-Memory Data Grid (IMDG)
|
||
1
|
||
Capital Tracking Fresh re-instantiation ($25,000)
|
||
1
|
||
Persistent Multi-Session Ledger Integrity
|
||
1
|
||
Risk Logic Static Threshold Penalty Ladder
|
||
1
|
||
Multiplicative PID Survival Controller
|
||
1
|
||
The existing system faces significant RAM and CPU limitations when monitoring the current
|
||
target
|
||
|
||
of
|
||
|
||
50
|
||
|
||
assets,
|
||
|
||
with
|
||
|
||
a
|
||
|
||
strategic
|
||
|
||
goal
|
||
|
||
of
|
||
|
||
expanding
|
||
|
||
to
|
||
|
||
400
|
||
|
||
assets.
|
||
1
|
||
The Python Global
|
||
Interpreter
|
||
|
||
Lock
|
||
|
||
(GIL)
|
||
|
||
and
|
||
|
||
garbage
|
||
|
||
collection
|
||
|
||
(GC)
|
||
|
||
pauses
|
||
|
||
create
|
||
|
||
"micro-stutter,"
|
||
|
||
introducing
|
||
|
||
unacceptable
|
||
|
||
jitter
|
||
|
||
in
|
||
|
||
the
|
||
|
||
5s/6s
|
||
|
||
eigenvalue
|
||
|
||
scan
|
||
|
||
intervals.
|
||
1
|
||
Furthermore, the current "HELL" test
|
||
indicates
|
||
|
||
that
|
||
|
||
while
|
||
|
||
the
|
||
|
||
system
|
||
|
||
remains
|
||
|
||
profitable
|
||
|
||
under
|
||
|
||
friction,
|
||
|
||
it
|
||
|
||
relies
|
||
|
||
on
|
||
|
||
a
|
||
|
||
stochastic
|
||
|
||
maker-rebate
|
||
|
||
model
|
||
|
||
that
|
||
|
||
may
|
||
|
||
not
|
||
|
||
withstand
|
||
|
||
real-world
|
||
|
||
execution
|
||
|
||
reality
|
||
|
||
without
|
||
|
||
the
|
||
|
||
integration
|
||
|
||
of
|
||
|
||
the
|
||
|
||
SmartPlacer
|
||
|
||
price-making
|
||
|
||
logic
|
||
|
||
into
|
||
|
||
a
|
||
|
||
lower-latency
|
||
|
||
environment.
|
||
1
|
||
The dependency on hardcoded values, such as the volatility calibration constant of 0.000099
|
||
and
|
||
|
||
the
|
||
|
||
fixed
|
||
|
||
initial
|
||
|
||
capital
|
||
|
||
of
|
||
|
||
25,000,
|
||
|
||
suggests
|
||
|
||
a
|
||
|
||
lack
|
||
|
||
of
|
||
|
||
dynamic
|
||
|
||
adaptation
|
||
|
||
to
|
||
|
||
non-stationary
|
||
|
||
market
|
||
|
||
regimes.
|
||
1
|
||
The EsoF (Esoteric Factors) service, which tracks lunar cycles and temporal
|
||
harmonics,
|
||
|
||
provides
|
||
|
||
a
|
||
|
||
unique
|
||
|
||
"Meta-Brake"
|
||
|
||
for
|
||
|
||
sizing,
|
||
|
||
yet
|
||
|
||
its
|
||
|
||
integration
|
||
|
||
is
|
||
|
||
currently
|
||
|
||
anecdotal
|
||
|
||
rather
|
||
|
||
than
|
||
|
||
statistical,
|
||
|
||
necessitating
|
||
|
||
a
|
||
|
||
robust
|
||
|
||
backfilling
|
||
|
||
operation
|
||
|
||
before
|
||
|
||
live-market
|
||
|
||
deployment.
|
||
1
|
||
II. Reliability Frameworks: DO-178C and 10 CFR 50
|
||
Appendix
|
||
|
||
B
|
||
|
||
To achieve aeronautical-grade reliability, the DOLPHIN-NAUTILUS system must adopt software
|
||
integrity
|
||
|
||
standards
|
||
|
||
from
|
||
|
||
the
|
||
|
||
aviation
|
||
|
||
and
|
||
|
||
nuclear
|
||
|
||
sectors.
|
||
|
||
These
|
||
|
||
frameworks
|
||
|
||
ensure
|
||
|
||
that
|
||
|
||
safety-critical
|
||
|
||
components—such
|
||
|
||
as
|
||
|
||
kill-switches
|
||
|
||
and
|
||
|
||
ledger
|
||
|
||
checksums—are
|
||
|
||
isolated
|
||
|
||
from
|
||
|
||
lower-priority
|
||
|
||
tasks
|
||
|
||
like
|
||
|
||
P&L
|
||
|
||
logging.
|
||
9
|
||
1. Design Assurance Level (DAL) Mapping
|
||
|
||
The system architecture is partitioned into functions of varying criticality. Adhering to the RTCA
|
||
DO-178C
|
||
|
||
standard,
|
||
|
||
each
|
||
|
||
component
|
||
|
||
is
|
||
|
||
assigned
|
||
|
||
a
|
||
|
||
Design
|
||
|
||
Assurance
|
||
|
||
Level
|
||
|
||
(DAL)
|
||
|
||
based
|
||
|
||
on
|
||
|
||
the
|
||
|
||
severity
|
||
|
||
of
|
||
|
||
a
|
||
|
||
failure.
|
||
9
|
||
|
||
DAL Level Failure Category Trading System Mapping
|
||
Verification Rigor
|
||
DAL A Catastrophic Order Execution, Ledger Integrity, Kill-Switches
|
||
100% Structural Coverage, Formal Proofs
|
||
12
|
||
DAL B Hazardous Pre-trade Risk Gating, MC-Forewarner, ACB v6
|
||
Boundary Analysis, Decision Coverage 13
|
||
DAL C Major Alpha Signal Generator, Eigenvalue Entropy
|
||
Functional Requirements Testing
|
||
14
|
||
DAL D Minor Esoteric Factor Services (EsoF), DVOL Analytics
|
||
System-Level Testing
|
||
DAL E No Effect Historical Backfilling Progress, Dashboards
|
||
Documentation of Intent
|
||
This partitioning prevents a "Major" failure in the Alpha Signal Generator from interfering with
|
||
the
|
||
|
||
"Catastrophic"
|
||
|
||
safety
|
||
|
||
path
|
||
|
||
of
|
||
|
||
the
|
||
|
||
Kill-Switch.
|
||
|
||
In
|
||
|
||
an
|
||
|
||
HFT
|
||
|
||
context,
|
||
|
||
this
|
||
|
||
is
|
||
|
||
achieved
|
||
|
||
through
|
||
|
||
hardware-level
|
||
|
||
isolation
|
||
|
||
and
|
||
|
||
the
|
||
|
||
use
|
||
|
||
of
|
||
|
||
the
|
||
|
||
separation
|
||
|
||
kernel
|
||
|
||
architecture
|
||
|
||
common
|
||
|
||
in
|
||
|
||
modular
|
||
|
||
avionics.
|
||
15
|
||
2. Nuclear Quality Assurance Criteria
|
||
The implementation of state management within Hazelcast must comply with the quality
|
||
assurance
|
||
|
||
requirements
|
||
|
||
of
|
||
|
||
10
|
||
|
||
CFR
|
||
|
||
50
|
||
|
||
Appendix
|
||
|
||
B.
|
||
16
|
||
These eighteen criteria establish a
|
||
"Defense-in-Depth"
|
||
|
||
strategy
|
||
|
||
for
|
||
|
||
the
|
||
|
||
system's
|
||
|
||
distributed
|
||
|
||
state
|
||
|
||
machine,
|
||
|
||
ensuring
|
||
|
||
that
|
||
|
||
no
|
||
|
||
single
|
||
|
||
component
|
||
|
||
failure
|
||
|
||
can
|
||
|
||
cause
|
||
|
||
an
|
||
|
||
out-of-control
|
||
|
||
execution
|
||
|
||
state.
|
||
10
|
||
Critical criteria applied to the HFT transition include Criterion III (Design Control), which
|
||
|
||
mandates that the MC-Forewarner safety envelope be validated against 4,500+ Monte Carlo
|
||
simulations
|
||
|
||
to
|
||
|
||
ensure
|
||
|
||
geometric
|
||
|
||
safety.
|
||
1
|
||
Criterion XI (Test Control) requires the execution of
|
||
high-friction
|
||
|
||
"HELL"
|
||
|
||
tests—validating
|
||
|
||
system
|
||
|
||
survival
|
||
|
||
through
|
||
|
||
48%
|
||
|
||
entry
|
||
|
||
and
|
||
|
||
35%
|
||
|
||
exit
|
||
|
||
fill
|
||
|
||
scenarios—before
|
||
|
||
the
|
||
|
||
system
|
||
|
||
is
|
||
|
||
permitted
|
||
|
||
to
|
||
|
||
move
|
||
|
||
from
|
||
|
||
paper
|
||
|
||
to
|
||
|
||
live
|
||
|
||
trading.
|
||
1
|
||
Criterion XVI
|
||
(Corrective
|
||
|
||
Action)
|
||
|
||
is
|
||
|
||
embodied
|
||
|
||
in
|
||
|
||
the
|
||
|
||
Survival
|
||
|
||
Controller,
|
||
|
||
which
|
||
|
||
monitors
|
||
|
||
for
|
||
|
||
information
|
||
|
||
decay
|
||
|
||
and
|
||
|
||
automatically
|
||
|
||
contracts
|
||
|
||
the
|
||
|
||
reachable
|
||
|
||
state
|
||
|
||
space
|
||
|
||
(leverage)
|
||
|
||
when
|
||
|
||
sensors
|
||
|
||
drift.
|
||
1
|
||
III. The Data Plane: Hazelcast Distributed IMDG
|
||
The move to Hazelcast transitions the DOLPHIN-NAUTILUS system into a distributed reactive
|
||
mesh.
|
||
|
||
By
|
||
|
||
utilizing
|
||
|
||
an
|
||
|
||
In-Memory
|
||
|
||
Data
|
||
|
||
Grid
|
||
|
||
(IMDG),
|
||
|
||
the
|
||
|
||
system
|
||
|
||
overcomes
|
||
|
||
the
|
||
|
||
I/O
|
||
|
||
bottlenecks
|
||
|
||
and
|
||
|
||
RAM
|
||
|
||
limitations
|
||
|
||
that
|
||
|
||
currently
|
||
|
||
constrain
|
||
|
||
asset
|
||
|
||
monitoring
|
||
|
||
to
|
||
|
||
a
|
||
|
||
subset
|
||
|
||
of
|
||
|
||
the
|
||
|
||
target
|
||
|
||
universe.
|
||
1
|
||
1. Global Feature Store and Distributed State
|
||
The core of the data plane is the DOLPHIN_FEATURES map. This distributed map stores
|
||
eigenvalue
|
||
|
||
results,
|
||
|
||
order
|
||
|
||
book
|
||
|
||
imbalance
|
||
|
||
metrics,
|
||
|
||
and
|
||
|
||
confidence
|
||
|
||
multipliers
|
||
|
||
as
|
||
|
||
serialized
|
||
|
||
Protobuf
|
||
|
||
or
|
||
|
||
Msgpack
|
||
|
||
objects.
|
||
1
|
||
The primary advantage of this architecture is the elimination of file-based communication.
|
||
Instead
|
||
|
||
of
|
||
|
||
reading
|
||
|
||
from
|
||
|
||
disk-based
|
||
|
||
JSON
|
||
|
||
or
|
||
|
||
NPZ
|
||
|
||
files,
|
||
|
||
the
|
||
|
||
Alpha
|
||
|
||
Engine
|
||
|
||
and
|
||
|
||
Nautilus-Agent
|
||
|
||
perform
|
||
|
||
map.get()
|
||
|
||
operations
|
||
|
||
directly
|
||
|
||
from
|
||
|
||
memory.
|
||
1
|
||
For a scaling target of 400 assets, this
|
||
horizontal
|
||
|
||
distribution
|
||
|
||
allows
|
||
|
||
multiple
|
||
|
||
Dolphin
|
||
|
||
workers
|
||
|
||
to
|
||
|
||
process
|
||
|
||
subsets
|
||
|
||
of
|
||
|
||
the
|
||
|
||
asset
|
||
|
||
universe
|
||
|
||
simultaneously
|
||
|
||
without
|
||
|
||
competing
|
||
|
||
for
|
||
|
||
a
|
||
|
||
single
|
||
|
||
process’s
|
||
|
||
RAM.
|
||
1
|
||
Near Cache optimization is essential for achieving the required latency profile. By maintaining a
|
||
local
|
||
|
||
cache
|
||
|
||
of
|
||
|
||
the
|
||
|
||
distributed
|
||
|
||
data
|
||
|
||
inside
|
||
|
||
the
|
||
|
||
Python
|
||
|
||
client’s
|
||
|
||
memory,
|
||
|
||
read
|
||
|
||
latency
|
||
|
||
is
|
||
|
||
reduced
|
||
|
||
from
|
||
|
||
network
|
||
|
||
speeds
|
||
|
||
to
|
||
|
||
nanosecond
|
||
|
||
local
|
||
|
||
lookups.
|
||
1
|
||
However, the use of Near Cache
|
||
introduces
|
||
|
||
eventual
|
||
|
||
consistency
|
||
|
||
trade-offs.
|
||
|
||
For
|
||
|
||
time-sensitive
|
||
|
||
risk
|
||
|
||
multipliers,
|
||
|
||
invalidation
|
||
|
||
must
|
||
|
||
be
|
||
|
||
instantaneous,
|
||
|
||
requiring
|
||
|
||
the
|
||
|
||
map.invalidation.batch.enabled
|
||
|
||
property
|
||
|
||
to
|
||
|
||
be
|
||
|
||
set
|
||
|
||
to
|
||
|
||
false.
|
||
19
|
||
2. Stream Processing and Atomic Updates
|
||
Hazelcast Jet provides the compute engine for Phase MIG6 of the migration.
|
||
1
|
||
By treating 5s
|
||
scan
|
||
|
||
data
|
||
|
||
and
|
||
|
||
Binance
|
||
|
||
WebSocket
|
||
|
||
feeds
|
||
|
||
as
|
||
|
||
continuous
|
||
|
||
streams,
|
||
|
||
the
|
||
|
||
system
|
||
|
||
eliminates
|
||
|
||
the
|
||
|
||
latency
|
||
|
||
overhead
|
||
|
||
of
|
||
|
||
polling.
|
||
1
|
||
● Entry Processors for Risk Gating: The Adaptive Cut-off (ACB) v6 logic must be
|
||
implemented
|
||
|
||
as
|
||
|
||
a
|
||
|
||
Hazelcast
|
||
|
||
Entry
|
||
|
||
Processor.
|
||
|
||
Unlike
|
||
|
||
traditional
|
||
|
||
request-response
|
||
|
||
cycles,
|
||
|
||
an
|
||
|
||
Entry
|
||
|
||
Processor
|
||
|
||
executes
|
||
|
||
the
|
||
|
||
risk
|
||
|
||
calculation
|
||
|
||
directly
|
||
|
||
on
|
||
|
||
the
|
||
|
||
cluster
|
||
|
||
node
|
||
|
||
that
|
||
|
||
owns
|
||
|
||
the
|
||
|
||
data
|
||
|
||
partition.
|
||
21
|
||
This provides "Nuclear-grade reliability" by ensuring that leverage cuts are
|
||
applied
|
||
|
||
atomically
|
||
|
||
before
|
||
|
||
an
|
||
|
||
order
|
||
|
||
ever
|
||
|
||
leaves
|
||
|
||
the
|
||
|
||
execution
|
||
|
||
layer.
|
||
1
|
||
● Pipeline Transformations: The Alpha Rank Score (ARS) calculation is migrated into a Jet
|
||
|
||
pipeline. The transform operators apply live-weighted priority queuing, penalizing thin
|
||
assets
|
||
|
||
by
|
||
|
||
up
|
||
|
||
to
|
||
|
||
-10%
|
||
|
||
and
|
||
|
||
rewarding
|
||
|
||
liquid
|
||
|
||
ones
|
||
|
||
based
|
||
|
||
on
|
||
|
||
real-time
|
||
|
||
order
|
||
|
||
book
|
||
|
||
depth
|
||
|
||
quality.
|
||
1
|
||
3. High-Performance Tuning for Hazelcast Python Clients
|
||
To support 100,000+ events per second with millisecond-level latency, the Hazelcast cluster
|
||
must
|
||
|
||
be
|
||
|
||
tuned
|
||
|
||
for
|
||
|
||
a
|
||
|
||
high-traffic,
|
||
|
||
low-jitter
|
||
|
||
environment.
|
||
23
|
||
|
||
Parameter Recommended Setting Rationale
|
||
Memory Management High-Density Memory Store (Off-Heap)
|
||
Bypasses Java GC for large datasets
|
||
24
|
||
Serialization Compact or Portable Serialization
|
||
Enables cross-language schema evolution and fast access
|
||
24
|
||
Threading pool-size = core count Minimizes context switching overhead
|
||
24
|
||
TCP Buffers socket.receive.buffer.size = 128KB+
|
||
Prevents throttling during market bursts
|
||
27
|
||
Swappiness vm.swappiness = 0 Eliminates disk-swapping latency spikes
|
||
27
|
||
The system must also implement the "slow operation detector"
|
||
(hazelcast.slow.operation.detector.threshold.millis)
|
||
|
||
set
|
||
|
||
to
|
||
|
||
a
|
||
|
||
1ms
|
||
|
||
threshold
|
||
|
||
during
|
||
|
||
development
|
||
|
||
to
|
||
|
||
identify
|
||
|
||
any
|
||
|
||
EntryProcessors
|
||
|
||
that
|
||
|
||
might
|
||
|
||
block
|
||
|
||
partition
|
||
|
||
threads
|
||
|
||
during
|
||
|
||
critical
|
||
|
||
execution
|
||
|
||
paths.
|
||
24
|
||
IV. The Management Plane: Prefect Orchestration
|
||
In an aeronautics-grade system, orchestration is not merely about scheduling; it is about
|
||
maintaining
|
||
|
||
situational
|
||
|
||
awareness
|
||
|
||
of
|
||
|
||
the
|
||
|
||
operational
|
||
|
||
envelope.
|
||
|
||
Prefect
|
||
|
||
serves
|
||
|
||
as
|
||
|
||
the
|
||
|
||
management
|
||
|
||
plane,
|
||
|
||
or
|
||
|
||
"Air
|
||
|
||
Traffic
|
||
|
||
Control,"
|
||
|
||
overseeing
|
||
|
||
the
|
||
|
||
macro
|
||
|
||
layers
|
||
|
||
and
|
||
|
||
ensuring
|
||
|
||
state-aware
|
||
|
||
failure
|
||
|
||
handling.
|
||
1
|
||
1. SITARA: Situation Awareness and Task Reliability
|
||
The ExF and EsoF engines, currently standalone services, are migrated to Prefect Flows. This
|
||
|
||
provides full observability into sensor health.
|
||
1
|
||
Currently, a Binance WebSocket drop might go
|
||
undetected
|
||
|
||
for
|
||
|
||
minutes;
|
||
|
||
Prefect's
|
||
|
||
state-handling
|
||
|
||
ensures
|
||
|
||
that
|
||
|
||
a
|
||
|
||
"Fail-Safe"
|
||
|
||
signal
|
||
|
||
is
|
||
|
||
emitted
|
||
|
||
to
|
||
|
||
the
|
||
|
||
Nautilus-Agent
|
||
|
||
immediately
|
||
|
||
upon
|
||
|
||
service
|
||
|
||
stall.
|
||
1
|
||
The "Watchdog" Flow is the heartbeat of the management plane. Running every 10 seconds, it
|
||
checks
|
||
|
||
the
|
||
|
||
"Last
|
||
|
||
Updated"
|
||
|
||
timestamps
|
||
|
||
of
|
||
|
||
the
|
||
|
||
health
|
||
|
||
maps
|
||
|
||
in
|
||
|
||
Hazelcast
|
||
|
||
and
|
||
|
||
computes
|
||
|
||
a
|
||
|
||
global
|
||
|
||
Health Score ( ).
|
||
1
|
||
This score is used by the AlphaBetSizer to throttles aggression when the world model begins to blur.
|
||
1
|
||
2. Task Granularity and Latency Hazards
|
||
A critical architectural constraint in Phase 1 is the avoidance of task overhead in the "Hot Loop."
|
||
Prefect
|
||
|
||
is
|
||
|
||
a
|
||
|
||
"task-heavy"
|
||
|
||
orchestrator,
|
||
|
||
and
|
||
|
||
wrapping
|
||
|
||
the
|
||
|
||
AlphaSignalGenerator
|
||
|
||
or
|
||
|
||
SmartPlacer
|
||
|
||
inside
|
||
|
||
a
|
||
|
||
Prefect
|
||
|
||
Task
|
||
|
||
would
|
||
|
||
destroy
|
||
|
||
the
|
||
|
||
5s
|
||
|
||
scan
|
||
|
||
budget
|
||
|
||
due
|
||
|
||
to
|
||
|
||
state-tracking
|
||
|
||
overhead.
|
||
1
|
||
The strategic desideratum is to use Prefect for data ingestion and meta-adaptive updates (the
|
||
"Slow
|
||
|
||
Thinking")
|
||
|
||
while
|
||
|
||
maintaining
|
||
|
||
the
|
||
|
||
execution
|
||
|
||
agent
|
||
|
||
in
|
||
|
||
a
|
||
|
||
low-latency
|
||
|
||
process
|
||
|
||
(the
|
||
|
||
"Fast
|
||
|
||
Doing").
|
||
1
|
||
The Alpha Engine reads its current state from Hazelcast, which is updated by Prefect
|
||
at
|
||
|
||
a
|
||
|
||
lower
|
||
|
||
frequency,
|
||
|
||
reducing
|
||
|
||
signal-to-execution
|
||
|
||
latency
|
||
|
||
by
|
||
|
||
200-500ms
|
||
|
||
compared
|
||
|
||
to
|
||
|
||
the
|
||
|
||
current
|
||
|
||
disk-based
|
||
|
||
polling.
|
||
1
|
||
V. The Execution Plane: Nautilus-Trader and Rust
|
||
Integration
|
||
|
||
Nautilus-Trader serves as the system's "Cockpit." Its hybrid architecture—Python API with a
|
||
Rust
|
||
|
||
core—is
|
||
|
||
the
|
||
|
||
primary
|
||
|
||
mechanism
|
||
|
||
for
|
||
|
||
achieving
|
||
|
||
microsecond-level
|
||
|
||
execution
|
||
|
||
while
|
||
|
||
retaining
|
||
|
||
access
|
||
|
||
to
|
||
|
||
Python's
|
||
|
||
rich
|
||
|
||
ecosystem
|
||
|
||
of
|
||
|
||
ML
|
||
|
||
and
|
||
|
||
esoteric
|
||
|
||
libraries.
|
||
23
|
||
1. The Nautilus-Agent Actor Model
|
||
The execution logic is encapsulated within a Nautilus-Agent, implemented as an Actor within
|
||
the
|
||
|
||
platform
|
||
|
||
kernel.
|
||
1
|
||
This agent communicates with the Hazelcast/Prefect layer via a sidecar
|
||
process
|
||
|
||
on
|
||
|
||
each
|
||
|
||
cluster
|
||
|
||
node.
|
||
1
|
||
● AsyncDataEngine Integration: The Agent uses an AsyncDataEngine to subscribe to
|
||
market
|
||
|
||
data
|
||
|
||
topics.
|
||
6
|
||
By piping a Hazelcast Topic into the Nautilus kernel, the system reacts
|
||
to
|
||
|
||
change
|
||
|
||
the
|
||
|
||
millisecond
|
||
|
||
it
|
||
|
||
appears
|
||
|
||
in
|
||
|
||
memory,
|
||
|
||
eliminating
|
||
|
||
the
|
||
|
||
need
|
||
|
||
for
|
||
|
||
periodic
|
||
|
||
polling
|
||
|
||
of
|
||
|
||
the
|
||
|
||
order
|
||
|
||
book
|
||
|
||
state.
|
||
1
|
||
● Rust Networking: The platform's core components are written in Rust, leveraging the
|
||
tokio
|
||
|
||
runtime
|
||
|
||
for
|
||
|
||
asynchronous
|
||
|
||
networking.
|
||
28
|
||
This ensures that order submissions,
|
||
modifications,
|
||
|
||
and
|
||
|
||
cancellations
|
||
|
||
are
|
||
|
||
handled
|
||
|
||
with
|
||
|
||
nanosecond
|
||
|
||
resolution
|
||
|
||
and
|
||
|
||
thread
|
||
|
||
safety.
|
||
28
|
||
|
||
|
||
2. Zero-Copy Data Handling with Apache Arrow
|
||
For a 400-asset universe, the cost of serializing and deserializing order book matrices is
|
||
prohibitive.
|
||
|
||
The
|
||
|
||
system
|
||
|
||
must
|
||
|
||
standardize
|
||
|
||
on
|
||
|
||
the
|
||
|
||
Apache
|
||
|
||
Arrow
|
||
|
||
format,
|
||
|
||
which
|
||
|
||
serves
|
||
|
||
as
|
||
|
||
the
|
||
|
||
"lingua
|
||
|
||
franca"
|
||
|
||
for
|
||
|
||
high-performance
|
||
|
||
data
|
||
|
||
exchange.
|
||
1
|
||
Arrow's columnar memory model allows Python services to exchange data with JVM-based
|
||
Hazelcast
|
||
|
||
or
|
||
|
||
Rust-based
|
||
|
||
Nautilus
|
||
|
||
without
|
||
|
||
conversion
|
||
|
||
overhead.
|
||
30
|
||
By using Arrow IPC, the
|
||
system
|
||
|
||
achieves
|
||
|
||
a
|
||
|
||
zero-copy
|
||
|
||
architecture
|
||
|
||
where
|
||
|
||
multiple
|
||
|
||
components
|
||
|
||
share
|
||
|
||
references
|
||
|
||
to
|
||
|
||
a
|
||
|
||
common
|
||
|
||
memory
|
||
|
||
region
|
||
|
||
rather
|
||
|
||
than
|
||
|
||
duplicating
|
||
|
||
bytes.
|
||
30
|
||
This approach reduces CPU usage
|
||
and
|
||
|
||
eliminates
|
||
|
||
the
|
||
|
||
latency
|
||
|
||
spikes
|
||
|
||
associated
|
||
|
||
with
|
||
|
||
Python's
|
||
|
||
object-heavy
|
||
|
||
data
|
||
|
||
formats.
|
||
30
|
||
3. Tick-to-Trade Optimization
|
||
Optimization Technique Target Latency Implementation
|
||
uvloop < 1ms High-performance event loop for Linux/macOS
|
||
32
|
||
CPU Pinning Microseconds Isolates critical threads from OS interrupts
|
||
2
|
||
Kernel Bypass 1-5 Microseconds Bypasses network stack via DPDK/RDMA
|
||
3
|
||
Lock-Free Buffers Nanoseconds Uses LMAX Disruptor for internal messaging
|
||
33
|
||
The ultimate execution goal is the attainment of "Asymptotic Latency," where the time from
|
||
market
|
||
|
||
event
|
||
|
||
detection
|
||
|
||
to
|
||
|
||
order
|
||
|
||
placement
|
||
|
||
is
|
||
|
||
limited
|
||
|
||
only
|
||
|
||
by
|
||
|
||
the
|
||
|
||
physical
|
||
|
||
constraints
|
||
|
||
of
|
||
|
||
the
|
||
|
||
in-memory
|
||
|
||
grid
|
||
|
||
and
|
||
|
||
the
|
||
|
||
network
|
||
|
||
path
|
||
|
||
to
|
||
|
||
the
|
||
|
||
exchange.
|
||
1
|
||
VI. Control Theory-Driven Risk Management: The
|
||
Survival
|
||
|
||
Stack
|
||
|
||
The hallmark of an aeronautical-grade system is its ability to fail gracefully. The
|
||
DOLPHIN-NAUTILUS
|
||
|
||
survival
|
||
|
||
stack
|
||
|
||
replaces
|
||
|
||
binary
|
||
|
||
"Cease
|
||
|
||
Fire"
|
||
|
||
gates
|
||
|
||
with
|
||
|
||
a
|
||
|
||
continuous
|
||
|
||
Risk
|
||
|
||
Engine
|
||
|
||
driven
|
||
|
||
by
|
||
|
||
control
|
||
|
||
theory.
|
||
1
|
||
1. The Epistemic Risk Controller
|
||
In this model, the system's ability to trade is a function of its confidence in its world model. Risk
|
||
|
||
is managed by a multiplicative differential controller, where the total risk multiplier ( ) is the product of survival functions across five categories.
|
||
1
|
||
|
||
This approach respects the non-stationarity of risk. Instead of punishing the system with a flat
|
||
percentage
|
||
|
||
cap,
|
||
|
||
it
|
||
|
||
contracts
|
||
|
||
the
|
||
|
||
reachable
|
||
|
||
state
|
||
|
||
space
|
||
|
||
proportional
|
||
|
||
to
|
||
|
||
information
|
||
|
||
decay.
|
||
1
|
||
Category 1: Invariants (Existential/Binary) This is the only binary layer, residing in the Hazelcast Quorum and Nautilus heartbeat. If data corruption (checksum failure), ledger desync, or heartbeat loss is detected, is hard-set to 0 at the hardware interrupt level. There is no damping for this category; response is
|
||
instantaneous
|
||
|
||
(<10ms).
|
||
1
|
||
Category 2: Structural Integrity (Envelope-Aware)
|
||
This category monitors the MC-Forewarner update latency ( ). Instead of a 70% flat cap, the system uses an exponential decay function.
|
||
1
|
||
|
||
If the safety envelope is fresh (<30s), the multiplier is 1.0x. As staleness grows, uncertainty
|
||
increases
|
||
|
||
exponentially,
|
||
|
||
reducing
|
||
|
||
risk
|
||
|
||
to
|
||
|
||
0.10x
|
||
|
||
after
|
||
|
||
2
|
||
|
||
hours
|
||
|
||
of
|
||
|
||
blindness.
|
||
1
|
||
Category 3: Microstructure Confidence (Execution-Aware)
|
||
Input includes order book jitter ( ), latency ( ), and depth decay. As microstructure confidence falls, the system shifts its operational posture from aggressive taker/maker mix to
|
||
passive-only
|
||
|
||
quoting
|
||
|
||
to
|
||
|
||
minimize
|
||
|
||
adverse
|
||
|
||
selection.
|
||
1
|
||
Category 4: Environmental Entropy (Exogenous) This category responds to external shocks, such as a sudden DVOL spike or a taker ratio crash.
|
||
It
|
||
|
||
follows
|
||
|
||
an
|
||
|
||
impulse-decay
|
||
|
||
model:
|
||
|
||
on
|
||
|
||
a
|
||
|
||
shock
|
||
|
||
event,
|
||
|
||
risk
|
||
|
||
drops
|
||
|
||
to
|
||
|
||
0.3
|
||
|
||
instantly
|
||
|
||
and
|
||
|
||
decays
|
||
|
||
back
|
||
|
||
to
|
||
|
||
1.0
|
||
|
||
over
|
||
|
||
a
|
||
|
||
60-minute
|
||
|
||
cooling
|
||
|
||
period,
|
||
|
||
provided
|
||
|
||
no
|
||
|
||
new
|
||
|
||
shocks
|
||
|
||
occur,
|
||
|
||
preventing
|
||
|
||
whipsaw
|
||
|
||
over-trading.
|
||
1
|
||
Category 5: Capital Stress (Fiscal-Aware) Portfolio drawdown (DD) is mapped to risk using a continuous sigmoid function, preventing the
|
||
|
||
|
||
"cliffs" that stop system recovery.
|
||
1
|
||
|
||
2. Operational Postures and Mode Switching
|
||
The system dynamically shifts its posture based on levels, changing quote width, size, and inventory bounds.
|
||
1
|
||
|
||
Posture Rtotal Operational Action
|
||
APEX
|
||
Full 6.0x leverage allowed; aggressive taker entries
|
||
1
|
||
STALKER
|
||
Limit-only entries; max 2.0x leverage ceiling
|
||
1
|
||
TURTLE
|
||
Passive quoting; wide spreads; exit-only mode
|
||
1
|
||
HIBERNATE
|
||
Cancel all open orders; all stop
|
||
1
|
||
3. Damping and Hysteresis Logic
|
||
To prevent "Pilot-Induced Oscillation" where the system vibrates due to minor network lag, the
|
||
survival
|
||
|
||
controller
|
||
|
||
implements
|
||
|
||
three
|
||
|
||
filters
|
||
|
||
1
|
||
: ● Safety Deadband: Sensors have a "nominal range" where the multiplier remains exactly
|
||
1.0
|
||
|
||
(e.g.,
|
||
|
||
a
|
||
|
||
200ms
|
||
|
||
WebSocket
|
||
|
||
lag
|
||
|
||
is
|
||
|
||
ignored).
|
||
1
|
||
● Fast-Attack / Slow-Recovery: The system follows health drops immediately for safety
|
||
but
|
||
|
||
recovers
|
||
|
||
slowly
|
||
|
||
(e.g.,
|
||
|
||
1%
|
||
|
||
per
|
||
|
||
minute)
|
||
|
||
to
|
||
|
||
ensure
|
||
|
||
confidence
|
||
|
||
is
|
||
|
||
earned
|
||
|
||
back
|
||
|
||
through
|
||
|
||
stability.
|
||
1
|
||
● Schmitt Trigger Gates: Posture switches require a gap (e.g., drop to STALKER at 85%
|
||
health,
|
||
|
||
return
|
||
|
||
to
|
||
|
||
APEX
|
||
|
||
at
|
||
|
||
92%)
|
||
|
||
to
|
||
|
||
prevent
|
||
|
||
toggling
|
||
|
||
at
|
||
|
||
threshold
|
||
|
||
boundaries.
|
||
1
|
||
VII. Formal Verification and Traceability Protocols
|
||
Aeronautics-grade systems require more than just empirical testing; they require mathematical
|
||
proof
|
||
|
||
of
|
||
|
||
correctness.
|
||
|
||
The
|
||
|
||
core
|
||
|
||
logic
|
||
|
||
of
|
||
|
||
the
|
||
|
||
survival
|
||
|
||
controller
|
||
|
||
and
|
||
|
||
state
|
||
|
||
transitions
|
||
|
||
must
|
||
|
||
be
|
||
|
||
|
||
|
||
formally verified.
|
||
38
|
||
1. TLA+ Modeling of Distributed State
|
||
Temporal Logic of Actions (TLA+) is used to model the system design above the code level.
|
||
40
|
||
|
||
This
|
||
|
||
allows
|
||
|
||
engineers
|
||
|
||
to
|
||
|
||
prove
|
||
|
||
that
|
||
|
||
the
|
||
|
||
distributed
|
||
|
||
state
|
||
|
||
machine
|
||
|
||
can
|
||
|
||
never
|
||
|
||
violate
|
||
|
||
safety
|
||
|
||
requirements,
|
||
|
||
such
|
||
|
||
as
|
||
|
||
concurrent
|
||
|
||
access
|
||
|
||
to
|
||
|
||
a
|
||
|
||
single
|
||
|
||
inventory
|
||
|
||
resource
|
||
|
||
by
|
||
|
||
multiple
|
||
|
||
Nautilus
|
||
|
||
workers.
|
||
41
|
||
TLA+ catch errors in concurrent logic—which are notoriously difficult to test—before
|
||
a
|
||
|
||
single
|
||
|
||
line
|
||
|
||
of
|
||
|
||
Python
|
||
|
||
is
|
||
|
||
written.
|
||
40
|
||
2. Rocq and Proofs of Correctness
|
||
The Coq theorem prover (now Rocq) provides the highest level of assurance, enabling
|
||
machine-checked
|
||
|
||
proofs
|
||
|
||
that
|
||
|
||
the
|
||
|
||
survival
|
||
|
||
controller's
|
||
|
||
implementation
|
||
|
||
exactly
|
||
|
||
matches
|
||
|
||
its
|
||
|
||
mathematical
|
||
|
||
specification.
|
||
12
|
||
This is particularly critical for the Category 1 Invariants, where a
|
||
proof
|
||
|
||
of
|
||
|
||
"no-panic"
|
||
|
||
execution
|
||
|
||
is
|
||
|
||
required.
|
||
12
|
||
3. Trace Validation and Formal Auditing
|
||
Trace validation bridges the gap between high-level specs and Rust implementation. By
|
||
analyzing
|
||
|
||
execution
|
||
|
||
traces,
|
||
|
||
the
|
||
|
||
system
|
||
|
||
verifies
|
||
|
||
that
|
||
|
||
the
|
||
|
||
actual
|
||
|
||
sequence
|
||
|
||
of
|
||
|
||
events
|
||
|
||
in
|
||
|
||
the
|
||
|
||
Nautilus
|
||
|
||
matching
|
||
|
||
engine
|
||
|
||
conforms
|
||
|
||
to
|
||
|
||
the
|
||
|
||
formal
|
||
|
||
model.
|
||
43
|
||
This creates a "nuclear-grade" audit
|
||
trail,
|
||
|
||
ensuring
|
||
|
||
full
|
||
|
||
traceability
|
||
|
||
of
|
||
|
||
every
|
||
|
||
trading
|
||
|
||
decision
|
||
|
||
back
|
||
|
||
to
|
||
|
||
the
|
||
|
||
sensor
|
||
|
||
inputs
|
||
|
||
and
|
||
|
||
risk
|
||
|
||
multipliers.
|
||
17
|
||
VIII. Migration Path: From BRING_UP_GUIDE to Level 4
|
||
The migration is a multi-phase operation designed to maintain paper trading functionality
|
||
throughout
|
||
|
||
the
|
||
|
||
transition.
|
||
1
|
||
Phase MIG1: Prefect Orchestration and Standardized Workflows
|
||
The initial phase focuses on the management plane. standalone daemons for EsoF and ExF are
|
||
ported
|
||
|
||
to
|
||
|
||
Prefect
|
||
|
||
Flows.
|
||
|
||
Caching
|
||
|
||
strategies
|
||
|
||
using
|
||
|
||
cache_key_fn
|
||
|
||
are
|
||
|
||
implemented
|
||
|
||
to
|
||
|
||
preserve
|
||
|
||
CPU
|
||
|
||
during
|
||
|
||
periods
|
||
|
||
of
|
||
|
||
esoteric
|
||
|
||
signal
|
||
|
||
stability.
|
||
1
|
||
situational awareness is established through
|
||
Prefect
|
||
|
||
"Pulse"
|
||
|
||
alerts
|
||
|
||
monitoring
|
||
|
||
scan
|
||
|
||
interval
|
||
|
||
drift.
|
||
1
|
||
Phase MIG2: Initial Hazelcast IMDG Incorporation
|
||
The memory plane is introduced. file-based I/O for eigenvalues and order book snapshots is
|
||
replaced
|
||
|
||
with
|
||
|
||
the
|
||
|
||
DOLPHIN_FEATURES
|
||
|
||
map.
|
||
1
|
||
A Hazelcast member is deployed locally, and the
|
||
Nautilus-Agent
|
||
|
||
is
|
||
|
||
configured
|
||
|
||
to
|
||
|
||
use
|
||
|
||
Near
|
||
|
||
Cache
|
||
|
||
with
|
||
|
||
NearCacheConfig.
|
||
1
|
||
This phase solves the
|
||
immediate
|
||
|
||
RAM/CPU
|
||
|
||
bottleneck
|
||
|
||
for
|
||
|
||
the
|
||
|
||
50-asset
|
||
|
||
set.
|
||
1
|
||
|
||
Phase MIG3: Arrow-Standardized Hybrid Storage
|
||
Data storage is unified using Apache Arrow. The "Hot" state (Hazelcast) and "Cold" state
|
||
(Parquet
|
||
|
||
on
|
||
|
||
disk)
|
||
|
||
are
|
||
|
||
synchronized.
|
||
|
||
This
|
||
|
||
ensures
|
||
|
||
that
|
||
|
||
the
|
||
|
||
Vectorized
|
||
|
||
Backtesting
|
||
|
||
(VBT)
|
||
|
||
engine
|
||
|
||
can
|
||
|
||
swap
|
||
|
||
live
|
||
|
||
memory
|
||
|
||
streams
|
||
|
||
for
|
||
|
||
historical
|
||
|
||
files
|
||
|
||
seamlessly,
|
||
|
||
maintaining
|
||
|
||
100%
|
||
|
||
research-to-live
|
||
|
||
parity.
|
||
1
|
||
Phase MIG4: Epistemic Risk Controller Implementation
|
||
The static risk logic is replaced by the category-based Multiplicative Survival Controller. The
|
||
AlphaBetSizer
|
||
|
||
is
|
||
|
||
modified
|
||
|
||
to
|
||
|
||
pull
|
||
|
||
smoothed
|
||
|
||
health
|
||
|
||
scores
|
||
|
||
from
|
||
|
||
the
|
||
|
||
SYSTEM_HEALTH
|
||
|
||
map
|
||
|
||
in
|
||
|
||
local
|
||
|
||
RAM,
|
||
|
||
achieving
|
||
|
||
<5
|
||
|
||
microsecond
|
||
|
||
sizing
|
||
|
||
calculations.
|
||
1
|
||
Phase MIG5: Horizontal Scaling to 400 Assets
|
||
Leveraging the horizontal scalability of Hazelcast, the asset universe is expanded to 400
|
||
instruments.
|
||
1
|
||
Eight different Prefect/Python workers are deployed, each handling a subset of
|
||
50
|
||
|
||
assets
|
||
|
||
and
|
||
|
||
writing
|
||
|
||
to
|
||
|
||
the
|
||
|
||
same
|
||
|
||
distributed
|
||
|
||
IMDG
|
||
|
||
map,
|
||
|
||
bypassing
|
||
|
||
the
|
||
|
||
Python
|
||
|
||
GIL
|
||
|
||
constraints.
|
||
1
|
||
Phase MIG6: Inverse Migration (Data-to-Compute)
|
||
Compute logic is moved into the data layer. Adaptive cut-off and signal generation are
|
||
migrated
|
||
|
||
to
|
||
|
||
Hazelcast
|
||
|
||
Jet
|
||
|
||
pipelines.
|
||
1
|
||
The system transitions from a "polling" model to a "push"
|
||
model,
|
||
|
||
where
|
||
|
||
taker
|
||
|
||
ratio
|
||
|
||
spikes
|
||
|
||
or
|
||
|
||
eigenvalue
|
||
|
||
plunge
|
||
|
||
events
|
||
|
||
trigger
|
||
|
||
immediate
|
||
|
||
leverage
|
||
|
||
adjustments
|
||
|
||
in
|
||
|
||
the
|
||
|
||
execution
|
||
|
||
layer.
|
||
1
|
||
Phase MIG7: Total Dolphin-Nautilus Backend Migration
|
||
The final phase involves a wholesale migration of the Nautilus-Trader backend to Hazelcast
|
||
infra.
|
||
1
|
||
Nautilus becomes a native Hazelcast client, with the Agent sidecar process achieving
|
||
"Asymptotic
|
||
|
||
Latency"
|
||
|
||
by
|
||
|
||
executing
|
||
|
||
the
|
||
|
||
entire
|
||
|
||
trade
|
||
|
||
loop
|
||
|
||
off-heap
|
||
|
||
and
|
||
|
||
memory-local.
|
||
1
|
||
IX. Quantitative Desiderata and Performance Ceiling
|
||
The success of the upgrade is measured against the quantitative performance improvements
|
||
observed
|
||
|
||
in
|
||
|
||
the
|
||
|
||
V4
|
||
|
||
integrated
|
||
|
||
stack.
|
||
1
|
||
1. Geometric Growth Optimization
|
||
The system identified a "6.0x Knee," where the geometric growth rate (GGR) is maximized.
|
||
Beyond this point, the variance drag ( ) begins to erode the arithmetic mean ( ).
|
||
1
|
||
|
||
|
||
|
||
The objective of the Variance Amputation OB filter is to reduce daily variance by ~15.35%,
|
||
effectively
|
||
|
||
raising
|
||
|
||
the
|
||
|
||
GGR
|
||
|
||
ceiling.
|
||
1
|
||
2. Quantitative Performance Targets
|
||
Metric Baseline (V2) Aeronautics Grade (V4 + MC + OB)
|
||
ROI (55-day window) ~9.3% +85.72%
|
||
1
|
||
Max Drawdown 12.0% 6.68% (at 5x) / 16.2% (with OB)
|
||
1
|
||
Profit Factor 1.07 1.17 - 1.40
|
||
1
|
||
Sharpe Ratio 1.15 3.0 - 12.3 (Regime Dependent)
|
||
1
|
||
Win-Rate 41.2% 49.2% (Trend-Breakout localized ceiling)
|
||
1
|
||
The "HELL" test provides the ultimate friction-adjusted metric, ensuring profitability even when
|
||
entry
|
||
|
||
fill
|
||
|
||
rates
|
||
|
||
drop
|
||
|
||
to
|
||
|
||
48%,
|
||
|
||
a
|
||
|
||
scenario
|
||
|
||
that
|
||
|
||
causes
|
||
|
||
a
|
||
|
||
53%
|
||
|
||
loss
|
||
|
||
in
|
||
|
||
the
|
||
|
||
baseline
|
||
|
||
system.
|
||
1
|
||
X. Technical Dependencies and Desiderata
|
||
The upgraded system requires a meticulously managed dependency stack to ensure
|
||
compatibility
|
||
|
||
across
|
||
|
||
the
|
||
|
||
JVM,
|
||
|
||
Python
|
||
|
||
interpreter,
|
||
|
||
and
|
||
|
||
Rust
|
||
|
||
kernel.
|
||
|
||
Dependency Minimum Version Critical Feature
|
||
Python 3.12.x+ User API, Vectorized Backtesting (VBT)
|
||
1
|
||
Rust Stable (Latest) Matching Engine Core,
|
||
|
||
|
||
Adapter Networking
|
||
28
|
||
Hazelcast Platform 5.6.0+ Distributed Jet, CP Subsystem, Compact Serializers
|
||
24
|
||
Prefect 3.0 GA State-Aware Flow Orchestration, Pulse Automations
|
||
47
|
||
Nautilus-Trader 1.224.0+ Rust Matcher, Parquet Data Catalog, nanosecond Clock 29
|
||
Apache Arrow Latest Zero-copy IPC, Columnar Memory Format
|
||
30
|
||
Numba Latest JIT-optimization for Alpha engine hot loops
|
||
1
|
||
The system must migrate from the current Windows-based Siloqy environment to a tuned
|
||
Linux
|
||
|
||
distribution
|
||
|
||
(e.g.,
|
||
|
||
RHEL
|
||
|
||
or
|
||
|
||
Ubuntu
|
||
|
||
with
|
||
|
||
a
|
||
|
||
real-time
|
||
|
||
kernel)
|
||
|
||
to
|
||
|
||
support
|
||
|
||
the
|
||
|
||
required
|
||
|
||
kernel-bypass
|
||
|
||
networking
|
||
|
||
and
|
||
|
||
CPU
|
||
|
||
isolation
|
||
|
||
protocols.
|
||
1
|
||
XI. Nuanced Conclusions and Strategic
|
||
Recommendations
|
||
|
||
The transition of the DOLPHIN-NAUTILUS stack to a Level 4 architecture is not merely a
|
||
technical
|
||
|
||
upgrade;
|
||
|
||
it
|
||
|
||
is
|
||
|
||
a
|
||
|
||
fundamental
|
||
|
||
shift
|
||
|
||
toward
|
||
|
||
an
|
||
|
||
aeronautics-standard
|
||
|
||
engineering
|
||
|
||
culture.
|
||
|
||
By
|
||
|
||
treating
|
||
|
||
the
|
||
|
||
trading
|
||
|
||
platform
|
||
|
||
as
|
||
|
||
a
|
||
|
||
mission-critical
|
||
|
||
flight-control
|
||
|
||
computer,
|
||
|
||
the
|
||
|
||
system
|
||
|
||
moves
|
||
|
||
beyond
|
||
|
||
the
|
||
|
||
limitations
|
||
|
||
of
|
||
|
||
deterministic
|
||
|
||
scripts
|
||
|
||
and
|
||
|
||
binary
|
||
|
||
failure
|
||
|
||
modes.
|
||
|
||
The survival controller ensures that the system "breathes" with the market, contracting risk in
|
||
response
|
||
|
||
to
|
||
|
||
information
|
||
|
||
decay
|
||
|
||
and
|
||
|
||
expanding
|
||
|
||
it
|
||
|
||
only
|
||
|
||
when
|
||
|
||
sensor
|
||
|
||
confidence
|
||
|
||
is
|
||
|
||
verified.
|
||
|
||
The
|
||
|
||
reliance
|
||
|
||
on
|
||
|
||
off-heap
|
||
|
||
memory
|
||
|
||
through
|
||
|
||
Hazelcast
|
||
|
||
and
|
||
|
||
zero-copy
|
||
|
||
data
|
||
|
||
handling
|
||
|
||
via
|
||
|
||
Apache
|
||
|
||
Arrow
|
||
|
||
overcomes
|
||
|
||
the
|
||
|
||
architectural
|
||
|
||
limitations
|
||
|
||
of
|
||
|
||
Python,
|
||
|
||
allowing
|
||
|
||
the
|
||
|
||
platform
|
||
|
||
to
|
||
|
||
scale
|
||
|
||
to
|
||
|
||
400
|
||
|
||
assets
|
||
|
||
without
|
||
|
||
sacrificing
|
||
|
||
the
|
||
|
||
microsecond-level
|
||
|
||
reaction
|
||
|
||
times
|
||
|
||
needed
|
||
|
||
for
|
||
|
||
HFT.
|
||
|
||
The ultimate recommendation for the implementation team is to adhere strictly to the phased
|
||
roadmap,
|
||
|
||
validating
|
||
|
||
each
|
||
|
||
category
|
||
|
||
of
|
||
|
||
the
|
||
|
||
risk
|
||
|
||
engine
|
||
|
||
against
|
||
|
||
the
|
||
|
||
high-friction
|
||
|
||
"HELL"
|
||
|
||
models.
|
||
|
||
By
|
||
|
||
bridging
|
||
|
||
the
|
||
|
||
gap
|
||
|
||
between
|
||
|
||
Python's
|
||
|
||
"thinking"
|
||
|
||
and
|
||
|
||
Hazelcast/Rust's
|
||
|
||
"doing,"
|
||
|
||
the
|
||
|
||
DOLPHIN-NAUTILUS
|
||
|
||
platform
|
||
|
||
attains
|
||
|
||
the
|
||
|
||
level
|
||
|
||
of
|
||
|
||
nuclear-grade
|
||
|
||
reliability
|
||
|
||
and
|
||
|
||
aeronautical
|
||
|
||
precision
|
||
|
||
required
|
||
|
||
to
|
||
|
||
dominate
|
||
|
||
the
|
||
|
||
contemporary
|
||
|
||
high-frequency
|
||
|
||
landscape.
|
||
|
||
The
|
||
|
||
goal
|
||
|
||
is
|
||
|
||
the
|
||
|
||
|
||
attainment of Asymptotic Latency—a state where performance is limited only by the laws of
|
||
physics
|
||
|
||
and
|
||
|
||
the
|
||
|
||
stability
|
||
|
||
of
|
||
|
||
the
|
||
|
||
distributed
|
||
|
||
reactive
|
||
|
||
mesh.
|
||
|
||
Works cited
|
||
1. BRINGUP_GUIDE.md 2. High-Frequency Trading Software Development Guide - Appinventiv, accessed
|
||
March
|
||
|
||
6,
|
||
|
||
2026,
|
||
https://appinventiv.com/blog/high-frequency-trading-software-development-guide/ 3. Kernel Bypass in HFT: How to Reduce Latency in Linux | QuantVPS, accessed March 6, 2026, https://www.quantvps.com/blog/kernel-bypass-in-hft 4. HFTPerformance: An Open-Source Framework for High-Frequency Trading
|
||
System
|
||
|
||
Benchmarking
|
||
|
||
and
|
||
|
||
Optimization
|
||
|
||
-
|
||
|
||
Medium,
|
||
|
||
accessed
|
||
|
||
March
|
||
|
||
6,
|
||
|
||
2026,
|
||
https://medium.com/@gwrx2005/hftperformance-an-open-source-framework-for-high-frequency-trading-system-benchmarking-and-803031fe7157 5. Hazelcast vs Spark: Detailed Performance Comparison 2024 - Timeplus, accessed March 6, 2026, https://www.timeplus.com/post/hazelcast-vs-spark 6. Architecture - NautilusTrader Documentation, accessed March 6, 2026, https://nautilustrader.io/docs/latest/concepts/architecture/ 7. An Optimal PID Based Trading Strategy under the log-Normal Stock Market
|
||
Characterization
|
||
|
||
-
|
||
|
||
UPV,
|
||
|
||
accessed
|
||
|
||
March
|
||
|
||
6,
|
||
|
||
2026,
|
||
https://personales.upv.es/thinkmind/dl/journals/soft/soft_v16_n12_2023/soft_v16_n12_2023_10.pdf 8. Mastering High-Frequency Trading: A Comprehensive Guide to Architecture,
|
||
Technology,
|
||
|
||
and
|
||
|
||
Best
|
||
|
||
Practices
|
||
|
||
-
|
||
|
||
Sachin
|
||
|
||
Chitre,
|
||
|
||
accessed
|
||
|
||
March
|
||
|
||
6,
|
||
|
||
2026,
|
||
https://growth-guru.medium.com/mastering-high-frequency-trading-a-comprehensive-guide-to-architecture-technology-and-best-8774c9942fac 9. Embedded Software Design Standards in Aviation, Military, Medical, and Space
|
||
Systems
|
||
|
||
-
|
||
|
||
Real
|
||
|
||
Time
|
||
|
||
Consulting,
|
||
|
||
accessed
|
||
|
||
March
|
||
|
||
6,
|
||
|
||
2026,
|
||
https://real-time-consulting.com/wp-content/uploads/2025/07/Embedded-Software-Design-Standards-Across-Multiple-Disiplines.pdf 10. Regulatory Guide 1.152, Revision 3, "Criteria for Use of Computers in Safety
|
||
Systems
|
||
|
||
of
|
||
|
||
Nuclear
|
||
|
||
Power
|
||
|
||
Plants.",
|
||
|
||
accessed
|
||
|
||
March
|
||
|
||
6,
|
||
|
||
2026,
|
||
https://www.nrc.gov/docs/ml1028/ml102870022.pdf 11. Mixed Criticality Systems - A Review - University of York, accessed March 6, 2026, https://www-users.york.ac.uk/~ab38/review.pdf 12. formal-land/rocq-of-rust: Formal verification tool for Rust: check 100% of
|
||
execution
|
||
|
||
cases
|
||
|
||
of
|
||
|
||
your
|
||
|
||
programs
|
||
|
||
to
|
||
|
||
make
|
||
|
||
safer
|
||
|
||
applications.
|
||
|
||
-
|
||
|
||
GitHub,
|
||
|
||
accessed
|
||
March 6, 2026, https://github.com/formal-land/rocq-of-rust 13. Avionics Systems, Software, and Hardware Verification/Testing - ENSCO, Inc.,
|
||
accessed
|
||
|
||
March
|
||
|
||
6,
|
||
|
||
2026,
|
||
https://www.ensco.com/aerospace/avionics-systems-software-and-hardware-verification-testing 14. Aerospace Software Testing | Rapita Systems, accessed March 6, 2026,
|
||
|
||
https://www.rapitasystems.com/aerospace-software-testing 15. LYNX MOSA.ic™: Modular Avionics Software Framework for Mission-Critical
|
||
Systems,
|
||
|
||
accessed
|
||
|
||
March
|
||
|
||
6,
|
||
|
||
2026,
|
||
https://www.lynx.com/solutions/safe-and-secure-operating-environment 16. Appendix B to Part 50—Quality Assurance Criteria for Nuclear Power Plants and
|
||
Fuel
|
||
|
||
Reprocessing
|
||
|
||
Plants,
|
||
|
||
accessed
|
||
|
||
March
|
||
|
||
6,
|
||
|
||
2026,
|
||
https://www.nrc.gov/reading-rm/doc-collections/cfr/part050/part050-appb 17. Mechanistic Interpretability of LoRA-Adapted Language Models for Nuclear
|
||
Reactor
|
||
|
||
Safety
|
||
|
||
Applications
|
||
|
||
-
|
||
|
||
arXiv,
|
||
|
||
accessed
|
||
|
||
March
|
||
|
||
6,
|
||
|
||
2026,
|
||
https://arxiv.org/html/2507.09931v2 18. Hazelcast Jet: Low-latency Stream Processing at the 99.99th percentile - TU Delft,
|
||
accessed
|
||
|
||
March
|
||
|
||
6,
|
||
|
||
2026,
|
||
https://repository.tudelft.nl/file/File_a0937bef-756e-4d2a-9e72-c2cc40e75f41 19. Near Cache - Hazelcast Documentation, accessed March 6, 2026, https://docs.hazelcast.com/hazelcast/5.2/performance/near-cache 20. Near Cache | Hazelcast Documentation, accessed March 6, 2026, https://docs.hazelcast.com/hazelcast/5.6/cluster-performance/near-cache 21. An Easy Performance Improvement with EntryProcessor - Hazelcast, accessed
|
||
March
|
||
|
||
6,
|
||
|
||
2026,
|
||
https://hazelcast.com/blog/an-easy-performance-improvement-with-entryprocessor/ 22. Entry Processor - Hazelcast Documentation, accessed March 6, 2026, https://docs.hazelcast.com/hazelcast/5.4/computing/entry-processor 23. Chapter 1: Introduction to NautilusTrader - DEV Community, accessed March 6,
|
||
2026,
|
||
https://dev.to/henry_lin_3ac6363747f45b4/chapter-1-introduction-to-nautilustrader-5552 24. Performance Tips | Hazelcast Documentation, accessed March 6, 2026, https://docs.hazelcast.com/hazelcast/5.6/cluster-performance/performance-tips 25. Hazelcast Performance Tuning: Best Practices - Mach41, accessed March 6, 2026, https://mach41.com/hazelcast-performance-tuning-best-practices-mach41/ 26. Serialization — Hazelcast Python Client 5.6.0 documentation, accessed March 6, 2026, https://hazelcast.readthedocs.io/en/stable/serialization.html 27. Performance Tuning | Hazelcast Documentation, accessed March 6, 2026, https://docs.hazelcast.com/hazelcast/5.5/cluster-performance/performance-tuning 28. GitHub - nautechsystems/nautilus_trader: A high-performance algorithmic
|
||
trading
|
||
|
||
platform
|
||
|
||
and
|
||
|
||
event-driven
|
||
|
||
backtester,
|
||
|
||
accessed
|
||
|
||
March
|
||
|
||
6,
|
||
|
||
2026,
|
||
https://github.com/nautechsystems/nautilus_trader 29. NautilusTrader: The fastest, most reliable open-source trading engine, accessed March 6, 2026, https://nautilustrader.io/ 30. Zero-Copy Data Processing in Python Using Apache Arrow | by Majidbasharat -
|
||
Medium,
|
||
|
||
accessed
|
||
|
||
March
|
||
|
||
6,
|
||
|
||
2026,
|
||
https://medium.com/@majidbasharat21/zero-copy-data-processing-in-python-using-apache-arrow-831beb90c59d
|
||
|
||
31. How to optimize data transfer with Apache Arrow? - Tencent Cloud, accessed March 6, 2026, https://www.tencentcloud.com/techpedia/126167 32. Overview - NautilusTrader Documentation, accessed March 6, 2026, https://nautilustrader.io/docs/latest/concepts/overview/ 33. High Frequency Trading Platforms: Architecture, Speed & Infrastructure Explained
|
||
(2026)
|
||
|
||
|
|
||
|
||
QuantVPS,
|
||
|
||
accessed
|
||
|
||
March
|
||
|
||
6,
|
||
|
||
2026,
|
||
https://www.quantvps.com/blog/high-frequency-trading-platform 34. Inside High-Frequency Trading Systems: The Race to Zero Latency - Level Up
|
||
Coding,
|
||
|
||
accessed
|
||
|
||
March
|
||
|
||
6,
|
||
|
||
2026,
|
||
https://levelup.gitconnected.com/inside-high-frequency-trading-systems-the-race-to-zero-latency-faa638d0c180 35. Dynamic Position Sizing and Risk Management in Volatile Markets, accessed
|
||
March
|
||
|
||
6,
|
||
|
||
2026,
|
||
https://internationaltradinginstitute.com/blog/dynamic-position-sizing-and-risk-management-in-volatile-markets/ 36. High-Frequency Trader, accessed March 6, 2026, http://people.ece.cornell.edu/land/courses/ece5760/FinalProjects/s2024/wb273_sal267_rak277/docs/index.html 37. PID Control of Systems with Hysteresis - UWSpace - University of Waterloo,
|
||
accessed
|
||
|
||
March
|
||
|
||
6,
|
||
|
||
2026,
|
||
https://uwspace.uwaterloo.ca/bitstreams/d35e3289-929b-4d9d-a132-87c2de1d354c/download 38. Formal Verification of Rust Programs by Functional Translation - Son HO, accessed March 6, 2026, https://www.sonho.fr/papers/thesis-manuscript.pdf 39. Gut check: TLA+ = state machine simulation + combinatorial generative testing? -
|
||
Reddit,
|
||
|
||
accessed
|
||
|
||
March
|
||
|
||
6,
|
||
|
||
2026,
|
||
https://www.reddit.com/r/tlaplus/comments/ag5ong/gut_check_tla_state_machine_simulation/ 40. Software Verification (with TLA+) - Hosted By One.com | Webhosting made
|
||
simple,
|
||
|
||
accessed
|
||
|
||
March
|
||
|
||
6,
|
||
|
||
2026,
|
||
https://usercontent.one/wp/gamess.dk/wp-content/uploads/2024/09/Formal-Software-Verification.pdf?media=1762157214 41. Verification by way of refinement: a case study in the use of Coq and TLA in the
|
||
design
|
||
|
||
of
|
||
|
||
a
|
||
|
||
safety
|
||
|
||
critical
|
||
|
||
system
|
||
|
||
-
|
||
|
||
OSTI.GOV,
|
||
|
||
accessed
|
||
|
||
March
|
||
|
||
6,
|
||
|
||
2026,
|
||
https://www.osti.gov/servlets/purl/1367058 42. Building A "Simple" Distributed System - Formal Verification - Jack Vanlightly,
|
||
accessed
|
||
|
||
March
|
||
|
||
6,
|
||
|
||
2026,
|
||
https://jack-vanlightly.com/blog/2019/1/27/building-a-simple-distributed-system-formal-verification 43. Converos: Practical Model Checking for Verifying Rust OS Kernel Concurrency -
|
||
USENIX,
|
||
|
||
accessed
|
||
|
||
March
|
||
|
||
6,
|
||
|
||
2026,
|
||
https://www.usenix.org/system/files/atc25-tang.pdf 44. Digital Instrumentation and Control Systems and Other Advanced Digital
|
||
Technologies
|
||
|
||
for
|
||
|
||
Enhancing
|
||
|
||
Nuclear
|
||
|
||
Power
|
||
|
||
Plant
|
||
|
||
Performance,
|
||
|
||
accessed
|
||
|
||
March
|
||
|
||
6,
|
||
|
||
2026,
|
||
|
||
|
||
https://www-pub.iaea.org/MTCD/publications/PDF/p15731-PUB2116_web.pdf 45. Quickstart | NautilusTrader Documentation, accessed March 6, 2026, https://nautilustrader.io/docs/nightly/getting_started/quickstart/ 46. Announcing Hazelcast Platform 5.6.0: Enhanced Control and Resilience, accessed March 6, 2026, https://hazelcast.com/blog/announcing-hazelcast-platform-5-6-0/ 47. < Marvin> how you differentiate Prefect from Temporal Prefect Community
|
||
#ask-marvin,
|
||
|
||
accessed
|
||
|
||
March
|
||
|
||
6,
|
||
|
||
2026,
|
||
https://linen.prefect.io/t/30251233/ulva73b9p-how-you-differentiate-prefect-from-temporal 48. The Race to Zero Latency: How to Optimize Code for High-Frequency Trading in
|
||
Quant
|
||
|
||
Firms
|
||
|
||
|
|
||
|
||
by
|
||
|
||
Muhammed
|
||
|
||
Nihal
|
||
|
||
|
|
||
|
||
Medium,
|
||
|
||
accessed
|
||
|
||
March
|
||
|
||
6,
|
||
|
||
2026,
|
||
https://medium.com/@nihal.143/the-race-to-zero-latency-how-to-optimize-code-for-high-frequency-trading-quant-firms-362f828f9c16 |