ouroboros-leios-sim
29 commits this week (Apr 15 - Apr 22, 2026)
Skip event buffering when no output file is requested
The deterministic event sorting pipeline (added in 54389c5ec) was
cloning and buffering every simulation event even when no -o output
file was given.  At T=0.250 with 1500 nodes this accumulated 7M+
OutputEvent structs (~10 GB) at peak, causing RSS to balloon from
~21 GB (actual node state) to 59 GB and OOM.

Guard the clone/buffer/flush path with a has_output check.  RSS at
slot 656 dropped from 59 GB to 28 GB — matching tracked node state
plus normal allocator overhead.
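
The guard amounts to the pattern below; a minimal, self-contained sketch with made-up names (the real pipeline also sorts and flushes the buffer):

```rust
// Illustrative types only; the point is that the clone/buffer path is
// gated on whether an output file was requested at all.
#[derive(Clone)]
struct OutputEvent {
    time_s: f64,
    payload: String,
}

struct EventSink {
    has_output: bool,
    buffer: Vec<OutputEvent>,
}

impl EventSink {
    fn record(&mut self, event: &OutputEvent) {
        // Previously every event was cloned and retained here even with no
        // -o file, which is where the ~10 GB of OutputEvents came from.
        if self.has_output {
            self.buffer.push(event.clone());
        }
    }
}

fn main() {
    let mut sink = EventSink { has_output: false, buffer: Vec::new() };
    sink.record(&OutputEvent { time_s: 0.25, payload: "TXGenerated".into() });
    assert!(sink.buffer.is_empty()); // nothing buffered when no output file
}
```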

Also adds EventMonitor and LivenessMonitor stats logging every 60
slots for ongoing memory diagnostics.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Add network queue stats instrumentation
Expose per-shard connection queue statistics (total/active connections,
queued messages, queued bytes) via a shared NetworkStatsCollector.
Each shard's sequential engine updates its counters at slot boundaries;
the node's existing log_memory_stats reads the aggregate.
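
A rough sketch of that shape, with assumed field names rather than the actual NetworkStatsCollector API:

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

// Shared, lock-free counters: each shard owns an Arc clone and updates its
// contribution at slot boundaries; the logging side reads the aggregate.
#[derive(Default)]
struct NetworkStatsCollector {
    active_connections: AtomicU64,
    queued_messages: AtomicU64,
    queued_bytes: AtomicU64,
}

fn main() {
    let stats = Arc::new(NetworkStatsCollector::default());

    // A shard's sequential engine updates its counters at a slot boundary.
    let shard_view = Arc::clone(&stats);
    shard_view.active_connections.store(12, Ordering::Relaxed);
    shard_view.queued_messages.store(0, Ordering::Relaxed);
    shard_view.queued_bytes.store(0, Ordering::Relaxed);

    // log_memory_stats reads the aggregate every 60 slots.
    println!(
        "Network: {} active conns, {} queued msgs, {} queued bytes",
        stats.active_connections.load(Ordering::Relaxed),
        stats.queued_messages.load(Ordering::Relaxed),
        stats.queued_bytes.load(Ordering::Relaxed),
    );
}
```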

Output appears every 60 slots alongside Memory stats, covering all
shards.  Initial profiling showed zero queued messages in turbo mode
(zero-latency clusters bypass bandwidth queues), ruling out network
queues as the cause of the gap between ~40 GB RSS and ~20 GB of
tracked node state.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Fix topology connectivity with minimal reciprocal links
Replace the full symmetrization (which nearly doubled link count from
39k to 59k) with a targeted fixup: for each node not listed as
anyone's producer, add a single reciprocal link back from its first
producer.  This adds only 432 links (one per BP) vs ~20k before.

BPs were the only nodes needing fixup — they pick 2 relay producers
but no relay was picking them back, making them invisible to the
sim's consumer-edge BFS.  Relays cross-reference each other enough
to be naturally reachable.
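
The fixup logic is small; sketched here in Rust with illustrative names (the actual generator is a Python script):

```rust
use std::collections::{HashMap, HashSet};

// producers[node] = the nodes that `node` pulls blocks from.
fn add_reciprocal_links(producers: &mut HashMap<String, Vec<String>>) -> usize {
    // Every node that appears as someone's producer is reachable over
    // consumer edges; the rest need one link back from their first producer.
    let produced: HashSet<String> = producers.values().flatten().cloned().collect();

    let fixups: Vec<(String, String)> = producers
        .iter()
        .filter(|(node, prods)| !produced.contains(*node) && !prods.is_empty())
        .map(|(node, prods)| (prods[0].clone(), node.clone()))
        .collect();

    let added = fixups.len();
    for (first_producer, orphan) in fixups {
        producers.entry(first_producer).or_default().push(orphan);
    }
    added
}

fn main() {
    let mut topo = HashMap::from([
        ("bp-0".to_string(), vec!["relay-0".to_string(), "relay-1".to_string()]),
        ("relay-0".to_string(), vec!["relay-1".to_string()]),
        ("relay-1".to_string(), vec!["relay-0".to_string()]),
    ]);
    // bp-0 is nobody's producer, so relay-0 gains one link back to it.
    assert_eq!(add_reciprocal_links(&mut topo), 1);
}
```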

Re-generated topology: 38,943 links (vs 59,268 symmetric, 38,511
original asymmetric).

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Fix generate-topology to produce bidirectional links
The sim's connectivity BFS traverses consumer edges (reverse of
producers). Unidirectional producer links left nodes unreachable,
causing "Graph must be fully connected!" errors. Symmetrize all
links so every A→B producer also creates B→A.

Also rename generate_topology.py → generate-topology.py and
summarize_topology.py → summarize-topology.py so their names match
the hyphenated style of the repo's shell scripts.

Re-generated topology-v2-expanded-1500.yaml (59,268 links, fully connected).

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Retain EB-critical TXs on peer backlog overflow
Problem
-------
When a node's peer TX backlog hits its cap (e.g. 10,000), incoming TXs
are silently dropped from self.txs.  If a dropped TX is referenced by a
pending Endorser Block, the EB's validation scan (try_validating_eb)
finds has_tx() = false and the EB is never marked all_txs_seen.  The EB
then misses its vote window and is orphaned by the next Ranking Block
(WrongEB).  Because the TX is never re-offered by peers, the one-shot
missing_txs trigger — already consumed by acknowledge_tx — cannot
re-fire, leaving the EB permanently stuck.

Under Poisson-clustered RB production (e.g. seed 4 at 0.200 MB/s), this
cascade produced 48 EBs with 19 uncertified (40%), 23M peer TX drops,
and a mean of only 348 votes/EB (well below the 450 quorum).

Fix
---
Two changes in propagate_tx():

1. Move the mempool insertion check (try_add_to_mempool) BEFORE
   acknowledge_tx, so that missing_txs has not yet been consumed at the
   point where we decide whether to drop.

2. When PeerBacklogFull fires, check whether the TX is referenced by a
   pending EB (self.leios.missing_txs.contains_key).  If yes, keep the
   TX in self.txs (skip the backlog, but preserve has_tx = true) and
   fall through to acknowledge_tx normally.  If no, drop as before.

This retains only EB-critical TXs — bounded by (pending_EBs × EB_size),
typically a few thousand entries and ~3 MB of HashMap overhead per node.
Non-critical TXs are still dropped, preserving the memory cap's purpose.
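
A minimal sketch of the retention rule, using hypothetical field and method names that mirror the description above (the real propagate_tx also handles mempool admission and relay bookkeeping):

```rust
use std::collections::{HashMap, HashSet};

struct Node {
    txs: HashMap<u64, Vec<u8>>,          // retained TX bodies (has_tx source)
    peer_backlog: HashSet<u64>,          // capped per-peer TX backlog
    backlog_cap: usize,
    missing_txs: HashMap<u64, Vec<u64>>, // TX id -> pending EBs waiting on it
}

impl Node {
    fn propagate_tx(&mut self, tx_id: u64, body: Vec<u8>) {
        // 1. Decide admission BEFORE acknowledge_tx, so the one-shot
        //    missing_txs trigger has not been consumed yet.
        let backlog_full = self.peer_backlog.len() >= self.backlog_cap;
        if backlog_full && !self.missing_txs.contains_key(&tx_id) {
            return; // non-critical TX: drop as before, cap preserved
        }
        if !backlog_full {
            self.peer_backlog.insert(tx_id);
        }
        // 2. EB-critical TXs skip the backlog but keep has_tx == true.
        self.txs.insert(tx_id, body);
        self.acknowledge_tx(tx_id);
    }

    fn acknowledge_tx(&mut self, tx_id: u64) {
        // One-shot trigger: wake any pending EB waiting on this TX.
        self.missing_txs.remove(&tx_id);
    }
}

fn main() {
    let mut node = Node {
        txs: HashMap::new(),
        peer_backlog: HashSet::new(),
        backlog_cap: 0, // force the overflow path
        missing_txs: HashMap::from([(7, vec![1])]), // EB 1 is waiting on TX 7
    };
    node.propagate_tx(7, vec![0u8; 1500]);
    assert!(node.txs.contains_key(&7)); // retained despite the full backlog
}
```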

Effect on seed 4 sequential 0.200/wfa-ls (worst-case seed)
-----------------------------------------------------------
                  EBs  uncert  mean   WrongEB  drops   peak RSS
caps (before):    48   19      348    1138     23.2M   ~20 GB
caps-retain:      45    8      470    1330      5.9M   ~24 GB
nocaps (ref):     46    8      473    1516      0      ~35 GB

Uncertified EBs:  19 → 8  (40% → 18%)
Mean votes/EB:    348 → 470  (near nocaps 473)
Peer TX drops:    23.2M → 5.9M  (−74%)
Peak RSS:         ~20 → ~24 GB  (+20%, well below nocaps ~35 GB)

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Add no-caps parameter file and baseline voting results
parameters/no-caps.yaml disables all three memory caps for diagnostic
experiments (peer backlog, generated backlog, TX max age).

voting_results.csv captures the full 4-way matrix at 0.200/wfa-ls:
{turbo,sequential} × {caps,nocaps} × seeds 0-4. Key findings:

- Seed 4 is the stress seed: caps cause 40% uncertified (seq) vs 17%
  without caps. Root cause is a race in propagate_tx where
  acknowledge_tx consumes the one-shot missing_txs trigger before
  PeerBacklogFull drops the TX.
- Seeds 1,3 are cap-insensitive (well-spaced RBs).
- No-caps converges all seeds to 16-22% uncertified.
- Stale rows (pre-rayon-fix, pre-seed-wiring) labelled as such.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Fix rayon non-determinism: remove .filter() from parallel dispatch
rayon's filter() on an indexed parallel iterator produces an unindexed
iterator whose collect() does NOT preserve element order — the output
Vec order depends on work-stealing scheduling, which varies per process.
Moving the empty-work check into .map() keeps the iterator indexed, so
collect() is deterministic regardless of rayon thread scheduling.
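
A minimal sketch of the indexed pattern, with illustrative work items standing in for the real per-shard dispatch:

```rust
use rayon::prelude::*;

fn main() {
    let batches: Vec<Vec<u32>> = vec![vec![1, 2], vec![], vec![3], vec![], vec![4, 5]];

    // Handling "no work" inside .map() keeps the parallel iterator indexed,
    // so collect() places each result at its input position.
    let results: Vec<Option<u32>> = batches
        .par_iter()
        .map(|batch| {
            if batch.is_empty() {
                None // empty-work check lives inside the map
            } else {
                Some(batch.iter().sum())
            }
        })
        .collect();

    // results[i] always corresponds to batches[i], regardless of scheduling.
    assert_eq!(results, vec![Some(3), None, Some(3), None, Some(9)]);
}
```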

This was the root cause of the bistable attractor at 0.200/wfa-ls: the
same seed+config could land on either 28/8 (healthy) or 81/49
(pathological) depending on how rayon happened to schedule work in a
given process launch.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Write per-run sim logs, and poll-sim picks latest by default
cip-voting-options.sh was piping every run through `tee /dev/stderr`,
which reopens /proc/self/fd/2 on each invocation; on Linux that yields
a fresh open file description at offset 0, so successive seeds in a
-S sweep overwrote the combined log from byte 0 — only the in-flight
seed ever survived on disk.

Now each run tees to /tmp/sim-T<T>-<mode>-<engine>-seed<N>.log so
every seed retains its full log. poll-sim.sh defaults to the latest
/tmp/sim-*.log when no path is given, so the normal /loop monitor
workflow keeps working without changes.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Add -P/--extra-params and scripts/poll-sim.sh
cip-voting-options.sh gains a repeatable -P/--extra-params flag that
layers additional YAML parameter files on top of the existing config
chain (applied last so they override everything). Useful for quick
experiments — e.g., `-P /tmp/coarse-timestamp.yaml` to bump
timestamp-resolution-ms without touching the committed parameter set.

poll-sim.sh prints a concise one-line status of a running sim-cli plus
the log tail, intended for use from /loop or cron to watch a
long-running benchmark without blocking Claude's thread on sleep.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Make multi-shard sequential engine deterministic
Cross-shard message delivery order in the sequential engine previously
depended on OS thread scheduling of peer shards, so runs with
shard_count > 1 produced different event sequences across runs. Fixing
this required five coordinated changes:

1. **Deterministic cross-shard merge**: tag every CrossShardMsg with
   `source_shard` and a per-sender monotonic `seq`. Receiving shards
   buffer incoming messages into a `BinaryHeap` keyed on
   `(send_time, source_shard, seq)` and only deliver those whose
   send_time is strictly less than the minimum of every peer's
   advertised `shared_time`. Under that rule, no future message can
   arrive with an earlier send_time, so delivery order is a pure
   function of sent messages (the messages themselves are produced
   deterministically per-shard); a sketch of this merge follows the list.

2. **Strict CMB ceiling**: the block condition changes from
   `timestamp > ceiling` to `timestamp >= ceiling`. At the boundary
   `timestamp == ceiling`, a peer might still be about to send a
   message whose `delivery_time == timestamp`; using strict less-than
   ensures every message with `delivery_time <= timestamp` is already
   on the mpsc by the time we process `timestamp`.

3. **Content-derived sort at pop**: BinaryHeap pop order for
   equal-timestamp events is a function of push history, which under
   multi-shard can vary across runs (cross-shard pushes from drain
   interleave with intra-shard pushes from apply_batch_output). Collect
   all events at the current timestamp into a Vec and sort by
   `GlobalEvent::sort_key()` before processing, so the order is a pure
   function of event content.

4. **Ceiling-aware termination**: replace the
   primary-shard-cancels-on-SlotBoundary scheme with an independent
   per-shard termination check that only breaks when the local queue
   has no events with `ts < end_time` AND the CMB ceiling is also
   `>= end_time`. Every shard stops at the same simulation time,
   independent of token-cancellation propagation races.

5. **Second drain before popping**: run drain_cross_shard_safe a second
   time after the ceiling check passes. The top-of-loop drain may run
   before the peer has advanced enough for send_time=`timestamp - eps`
   messages to be deliverable; the post-ceiling-check drain catches
   them, preventing a cross-shard delivery from landing in a later
   iteration and splitting a timestamp's events across batches.
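
A minimal sketch of the merge rule from change 1, with illustrative types (the real engine drains an mpsc and tracks each peer's advertised shared_time):

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

// Derived Ord compares fields in declaration order, i.e. the merge key
// (send_time, source_shard, seq); payload omitted in this sketch.
#[derive(Debug, PartialEq, Eq, PartialOrd, Ord)]
struct CrossShardMsg {
    send_time_ns: u64,
    source_shard: u32,
    seq: u64,
}

struct ShardInbox {
    heap: BinaryHeap<Reverse<CrossShardMsg>>, // min-heap on the merge key
}

impl ShardInbox {
    fn push(&mut self, msg: CrossShardMsg) {
        self.heap.push(Reverse(msg));
    }

    /// Release every buffered message whose send_time is strictly below the
    /// minimum shared_time advertised by peer shards; no later arrival can
    /// precede these, so the order is a pure function of what was sent.
    fn deliverable(&mut self, min_peer_shared_time_ns: u64) -> Vec<CrossShardMsg> {
        let mut out = Vec::new();
        while self
            .heap
            .peek()
            .is_some_and(|msg| msg.0.send_time_ns < min_peer_shared_time_ns)
        {
            out.push(self.heap.pop().unwrap().0);
        }
        out
    }
}

fn main() {
    let mut inbox = ShardInbox { heap: BinaryHeap::new() };
    inbox.push(CrossShardMsg { send_time_ns: 200, source_shard: 1, seq: 0 });
    inbox.push(CrossShardMsg { send_time_ns: 100, source_shard: 2, seq: 5 });
    // Only messages older than every peer's shared_time are released.
    let ready = inbox.deliverable(150);
    assert_eq!(ready.len(), 1);
    assert_eq!(ready[0].send_time_ns, 100);
}
```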

New test `test_sequential_multi_shard_deterministic` compares per-node
event trajectories across two runs under shard_count=2. It passes 500/500
in release mode; before the fix it failed in ~100% of runs, ~25% with
only the sort fix, ~2% after adding the termination fix, and 0% once
the second drain was added.

All 55 sim-core tests pass.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Fix TX generation over-rate from f64 truncation
`TxGeneratorCore::generate` computed inter-tx delay as
`config.frequency_ms.sample() as u64 * shard_count as u64` and passed
it to `Duration::from_millis`. The `as u64` cast truncated each
sample: a configured 7.5 ms became 7 ms, producing TXs ~7% faster
than requested. For the 0.200/wfa-ls single-shard run this meant
128,572 TXs over 900s (~214 KB/s) instead of the intended ~120,000
TXs (~200 KB/s).

Only affects configurations with sub-ms precision and no batching.
Turbo is largely unaffected (1 ms resolution, 10 ms tx-batch-window
collapses the fractional delay anyway).

Switch to `Duration::from_secs_f64`, preserving sub-millisecond
precision via nanosecond-resolution Duration. Clamp to `.max(0.0)` so
distributions that can sample negative (e.g., Normal) keep the old
"treat negative as zero delay" behaviour rather than panicking in
`from_secs_f64`.
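
Sketch of the fixed delay computation, assuming a sampler that returns milliseconds as f64 (names are illustrative):

```rust
use std::time::Duration;

fn inter_tx_delay(sample_ms: f64, shard_count: u64) -> Duration {
    let delay_ms = sample_ms * shard_count as f64;
    // from_secs_f64 panics on negative input, so keep the old
    // "negative sample means zero delay" behaviour explicitly.
    Duration::from_secs_f64((delay_ms / 1000.0).max(0.0))
}

fn main() {
    // The old `as u64` cast turned 7.5 ms into 7 ms, ~7% too fast;
    // sub-millisecond precision now survives as nanoseconds.
    println!("{:?}", inter_tx_delay(7.5, 1));  // 7.5ms
    assert_eq!(inter_tx_delay(-1.0, 1), Duration::ZERO); // clamped, no panic
}
```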

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
De-RNG Linear Leios completely: withhold attacker + TxGeneratorCore
Migrate every remaining stateful-RNG use reachable from Linear Leios:

- linear_leios.rs generate_withheld_txs: `self.rng.random_bool(p)` is
  replaced with `rng.draw_bool(node, slot, DrawSite::WithholdDecision,
  p)`. The distribution sample for `txs_to_generate` and the per-tx
  `new_tx` body generation use `Rng::seeded_chacha(node, slot, site)`
  to produce one-shot ChaChaRngs seeded from context — this keeps the
  rand_distr / `new_tx` machinery unchanged while removing the
  cross-call stateful coupling.

- tx.rs TxGeneratorCore: replaces its `ChaChaRng` with the stateless
  `SimRng` plus a monotonic `next_tx_idx: u64`. Each TX is generated
  from a one-shot ChaChaRng seeded from
  `("tx_generator", tx_idx)` — so the generated TX stream is a pure
  function of the master seed regardless of per-node or network-timing
  behaviour. Propagates the `SimRng` type through TransactionProducer
  and its callers in sim/sequential.rs and sharding/shard.rs; the
  master-RNG `.next_u64()` consumption is preserved to keep any
  remaining downstream draws on stracciatella/leios variants seeded
  the same way they were.

- Drops `rng: ChaChaRng` field from `LinearLeiosNode`. The NodeImpl
  trait signature still takes a `ChaChaRng` for the other variants, so
  LinearLeiosNode::new accepts it as `_rng` and discards.

New Rng methods: `seeded_chacha(node, slot, site)` for context-tied
one-shot ChaChaRng seeding, and `seeded_chacha_from<K: Hash>(&K)` for
sim-wide (non-node-tied) draws like the TX generator.
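
A minimal sketch of what such a context-seeded helper can look like (assumed hashing scheme; the real Rng also mixes in the DrawSite and would use a stable hash rather than DefaultHasher):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

use rand::{RngCore, SeedableRng};
use rand_chacha::ChaChaRng;

// The one-shot RNG is a pure function of (master_seed, context), so the
// resulting draw cannot depend on call order or per-node history.
fn seeded_chacha<K: Hash>(master_seed: u64, context: &K) -> ChaChaRng {
    let mut hasher = DefaultHasher::new();
    master_seed.hash(&mut hasher);
    context.hash(&mut hasher);
    ChaChaRng::seed_from_u64(hasher.finish())
}

fn main() {
    // e.g. the TX generator seeds each TX body from ("tx_generator", tx_idx),
    // so the TX stream depends only on the master seed.
    let mut rng = seeded_chacha(42, &("tx_generator", 7u64));
    println!("tx 7 draw: {}", rng.next_u64());
}
```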

All 54 sim-core tests pass; clippy clean for Linear Leios and
TxGeneratorCore.

Stracciatella and full-Leios variants retain their stateful `self.rng`
for now — they build fine but are out of scope for the current
determinism investigation.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Migrate Linear Leios mempool shuffle to stateless Rng
Replace `candidates.shuffle(&mut self.rng)` in
LinearLeiosNode::sample_from_mempool with Rng::context_shuffle, which
performs Fisher-Yates using DrawSite::MempoolSwap { call, idx } for
each swap. The `call` discriminator distinguishes independent shuffle
invocations at the same (node, slot): the RB-body sample uses call=0,
the EB-body sample uses call=1, so they don't collide.
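
A minimal sketch of a context-keyed Fisher-Yates of this kind, with stand-in names (the real swap draw goes through Rng and DrawSite::MempoolSwap):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Stand-in for a draw keyed on DrawSite::MempoolSwap { call, idx }: the swap
// target is a pure function of (node, slot, call, idx). The modulo here is
// biased; the real Rng would use a proper bounded draw.
fn draw_swap_index(node: u64, slot: u64, call: u32, idx: usize) -> usize {
    let mut h = DefaultHasher::new();
    (node, slot, call, idx as u64).hash(&mut h);
    (h.finish() % (idx as u64 + 1)) as usize
}

// Fisher-Yates where every swap comes from the context draw above, so the
// permutation depends only on (node, slot, call) and the item count.
fn context_shuffle<T>(items: &mut [T], node: u64, slot: u64, call: u32) {
    for idx in (1..items.len()).rev() {
        let j = draw_swap_index(node, slot, call, idx);
        items.swap(idx, j);
    }
}

fn main() {
    let mut rb_body = vec![1, 2, 3, 4, 5];
    let mut eb_body = vec![1, 2, 3, 4, 5];
    context_shuffle(&mut rb_body, 7, 120, 0); // RB-body sample: call = 0
    context_shuffle(&mut eb_body, 7, 120, 1); // EB-body sample: call = 1
    println!("rb: {rb_body:?}  eb: {eb_body:?}");
}
```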

DrawSite::MempoolSwap gains a `call: u32` field. Three new rng tests
cover: deterministic-per-context, distinct-calls-yield-distinct-perms,
multiset-preservation.

Threads `slot` and `shuffle_call` through sample_from_mempool's
signature. Both call sites (RB path, EB path) in try_generate_rb pass
the active slot and their assigned call index.

Note: the default `leios-mempool-sampling-strategy: ordered-by-id`
means the shuffle branch doesn't fire in the current benchmark; this
is structural cleanup so Linear Leios contains no remaining
stateful-RNG uses on its hot VRF / sampling path.

Stracciatella and full Leios variants still use stateful `self.rng` for
their shuffle paths; those will be migrated in a follow-up.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>