Snapshot of the ouroboros-leios-sim determinism investigation, April 13-20, 2026
[Hourly commit-activity chart. Non-zero buckets: Apr 13: 1-2 PM (2), 7-8 PM (1); Apr 14: 10-11 AM (2), 11 AM-12 PM (2), 12-1 PM (1); Apr 16: 8-9 AM (2), 9-10 AM (3), 10-11 AM (2), 12-1 PM (3), 1-2 PM (1); Apr 17: 2-3 PM (3), 7-8 PM (1), 8-9 PM (1); Apr 18: 5-6 AM (1), 10-11 AM (1); Apr 20: 1-2 AM (1).]
27 commits this week (Apr 13-20, 2026)
Retain EB-critical TXs on peer backlog overflow
Problem
-------
When a node's peer TX backlog hits its cap (e.g. 10,000), incoming TXs
are silently dropped from self.txs. If a dropped TX is referenced by a
pending Endorser Block, the EB's validation scan (try_validating_eb)
finds has_tx() = false and the EB is never marked all_txs_seen. The EB
then misses its vote window and is orphaned by the next Ranking Block
(WrongEB). Because the TX is never re-offered by peers, the one-shot
missing_txs trigger — already consumed by acknowledge_tx — cannot
re-fire, leaving the EB permanently stuck.
Under Poisson-clustered RB production (e.g. seed 4 at 0.200 MB/s), this
cascade produced 48 EBs with 19 uncertified (40%), 23M peer TX drops,
and a mean of only 348 votes/EB (well below the 450 quorum).
Fix
---
Two changes in propagate_tx():
1. Move the mempool insertion check (try_add_to_mempool) BEFORE
acknowledge_tx, so that missing_txs has not yet been consumed at the
point where we decide whether to drop.
2. When PeerBacklogFull fires, check whether the TX is referenced by a
pending EB (self.leios.missing_txs.contains_key). If yes, keep the
TX in self.txs (skip the backlog, but preserve has_tx = true) and
fall through to acknowledge_tx normally. If no, drop as before.
This retains only EB-critical TXs — bounded by (pending_EBs × EB_size),
typically a few thousand entries and ~3 MB of HashMap overhead per node.
Non-critical TXs are still dropped, preserving the memory cap's purpose.
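A minimal sketch of the reordered drop decision, under assumed names
(TxId, peer_backlog, backlog_cap are illustrative stand-ins; the
simulator's real signatures differ):

```rust
use std::collections::{HashMap, VecDeque};

type TxId = u64; // hypothetical stand-ins for the simulator's types
type EbId = u64;

struct Node {
    txs: HashMap<TxId, Vec<u8>>,           // store consulted by has_tx()
    peer_backlog: VecDeque<TxId>,          // capped peer TX backlog
    backlog_cap: usize,                    // e.g. 10,000
    missing_txs: HashMap<TxId, Vec<EbId>>, // pending-EB references (one-shot)
}

impl Node {
    fn propagate_tx(&mut self, id: TxId, body: Vec<u8>) {
        if self.peer_backlog.len() >= self.backlog_cap {
            // Backlog full: retain the TX only if a pending EB references
            // it, so has_tx() stays true for try_validating_eb's scan.
            if self.missing_txs.contains_key(&id) {
                self.txs.insert(id, body); // keep the TX, skip the backlog
            } else {
                return; // non-critical TX: drop as before
            }
        } else {
            self.peer_backlog.push_back(id);
            self.txs.insert(id, body);
        }
        // Runs AFTER the drop decision, so the one-shot missing_txs
        // trigger has not yet been consumed when we decide to drop.
        self.acknowledge_tx(id);
    }

    fn acknowledge_tx(&mut self, id: TxId) {
        self.missing_txs.remove(&id); // one-shot: fires at most once per TX
    }
}
```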
Effect on seed 4 sequential 0.200/wfa-ls (worst-case seed)
-----------------------------------------------------------
                EBs  uncert  mean votes/EB  WrongEB  drops  peak RSS
caps (before):   48      19            348     1138  23.2M    ~20 GB
caps-retain:     45       8            470     1330   5.9M    ~24 GB
nocaps (ref):    46       8            473     1516      0    ~35 GB
Uncertified EBs: 19 → 8 (40% → 18%)
Mean votes/EB: 348 → 470 (near nocaps 473)
Peer TX drops: 23.2M → 5.9M (−74%)
Peak RSS: ~20 → ~24 GB (+20%, well below nocaps ~35 GB)
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Add no-caps parameter file and baseline voting results
parameters/no-caps.yaml disables all three memory caps for diagnostic
experiments (peer backlog, generated backlog, TX max age).
voting_results.csv captures the full {turbo,sequential} × {caps,nocaps}
matrix at 0.200/wfa-ls across seeds 0-4. Key findings:
- Seed 4 is the stress seed: caps cause 40% uncertified (seq) vs 17%
without caps. Root cause is a race in propagate_tx where
acknowledge_tx consumes the one-shot missing_txs trigger before
PeerBacklogFull drops the TX.
- Seeds 1,3 are cap-insensitive (well-spaced RBs).
- No-caps converges all seeds to 16-22% uncertified.
- Stale rows (pre-rayon-fix, pre-seed-wiring) labelled as such.
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Add -L/--label to cip-voting-options.sh for tagging CSV rows
Adds a label column (position 5, between seed and time_seconds) to
distinguish experiment configurations (e.g. "caps", "nocaps") without
relying on memory of which rows came from which invocation.
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Add RSS to poll-sim.sh process status line
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Wire seed parameter through RawParameters to SimConfiguration
The seed field existed on SimConfiguration but was hardcoded to 0 in
build(). Adding it to RawParameters (with #[serde(default)]) lets it be
set via -p YAML files, which the -S/--seed flag in cip-voting-options.sh
already generates.
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
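A minimal sketch of the wiring, assuming serde's derive and a reduced
field set:

```rust
use serde::Deserialize;

#[derive(Deserialize)]
struct RawParameters {
    // Absent from older YAML files => defaults to 0, the previously
    // hardcoded value, so existing configs keep their behaviour.
    #[serde(default)]
    seed: u64,
    // ... other parameters elided
}

struct SimConfiguration {
    seed: u64,
}

impl RawParameters {
    fn build(self) -> SimConfiguration {
        SimConfiguration { seed: self.seed } // was: seed: 0
    }
}
```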
Fix rayon non-determinism: remove .filter() from parallel dispatch
rayon's filter() on an indexed parallel iterator produces an unindexed
iterator whose collect() does NOT preserve element order — the output
Vec order depends on work-stealing scheduling, which varies per process.
Moving the empty-work check into .map() keeps the iterator indexed, so
collect() is deterministic regardless of rayon thread scheduling.
This was the root cause of the bistable attractor at 0.200/wfa-ls: the
same seed+config could land on either 28/8 (healthy) or 81/49
(pathological) depending on how rayon happened to schedule work in a
given process launch.
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
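A sketch of the before/after pattern described above, with hypothetical
node and work-item names:

```rust
use rayon::prelude::*;

struct Node;
impl Node {
    fn has_work(&self) -> bool { true }
    fn process_batch(&mut self) -> u64 { 0 }
}

fn dispatch(nodes: &mut [Node]) -> Vec<Option<u64>> {
    // Before (order-unstable per the commit): .filter() drops the
    // indexed property, so collect() order depended on scheduling.
    //   nodes.par_iter_mut()
    //        .filter(|n| n.has_work())
    //        .map(|n| Some(n.process_batch()))
    //        .collect()
    //
    // After: fold the empty-work check into .map() so the iterator
    // stays indexed; output[i] always corresponds to nodes[i].
    nodes
        .par_iter_mut()
        .map(|n| if n.has_work() { Some(n.process_batch()) } else { None })
        .collect()
}
```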
Write per-run sim logs, and poll-sim picks latest by default
cip-voting-options.sh was piping every run through `tee /dev/stderr`,
which reopens /proc/self/fd/2 on each invocation; on Linux that gives a
fresh offset-0 open file description, so successive seeds in a -S sweep
overwrote the combined log from byte 0 — only the in-flight seed ever
survived on disk.
Now each run tees to /tmp/sim-T<T>-<mode>-<engine>-seed<N>.log, so every
seed retains its full log. poll-sim.sh defaults to the latest
/tmp/sim-*.log when no path is given, so the normal /loop monitor
workflow keeps working without changes.
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Add -S/--seed to cip-voting-options.sh for multi-seed sweeps
Seed is the innermost loop so a partial run still yields a complete seed distribution for each (throughput, mode) cell. CSV grows a seed column (position 4); existing rows should be backfilled with seed=0. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Add -P/--extra-params and scripts/poll-sim.sh
cip-voting-options.sh gains a repeatable -P/--extra-params flag that
layers additional YAML parameter files on top of the existing config
chain (applied last, so they override everything). Useful for quick
experiments — e.g. `-P /tmp/coarse-timestamp.yaml` to bump
timestamp-resolution-ms without touching the committed parameter set.
poll-sim.sh prints a concise one-line status of a running sim-cli plus
the log tail, intended for use from /loop or cron to watch a
long-running benchmark without blocking Claude's thread on sleep.
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Make multi-shard sequential engine deterministic
Cross-shard message delivery order in the sequential engine previously
depended on OS thread scheduling of peer shards, so runs with
shard_count > 1 produced different event sequences across runs. Fixing
this required five coordinated changes:
1. **Deterministic cross-shard merge**: tag every CrossShardMsg with
   `source_shard` and a per-sender monotonic `seq`. Receiving shards
   buffer incoming messages into a `BinaryHeap` keyed on
   `(send_time, source_shard, seq)` and only deliver those whose
   send_time is strictly less than the minimum of every peer's
   advertised `shared_time`. Under that rule, no future message can
   arrive with an earlier send_time, so delivery order is a pure
   function of sent messages (the messages themselves are produced
   deterministically per-shard).
2. **Strict CMB ceiling**: the block condition changes from
   `timestamp > ceiling` to `timestamp >= ceiling`. At the boundary
   `timestamp == ceiling`, a peer might still be about to send a
   message whose `delivery_time == timestamp`; using strict less-than
   (processing only timestamps strictly below the ceiling) ensures
   every message with `delivery_time <= timestamp` is already on the
   mpsc by the time we process `timestamp`.
3. **Content-derived sort at pop**: BinaryHeap pop order for
   equal-timestamp events is a function of push history, which under
   multi-shard can vary across runs (cross-shard pushes from drain
   interleave with intra-shard pushes from apply_batch_output).
   Collect all events at the current timestamp into a Vec and sort by
   `GlobalEvent::sort_key()` before processing, so the order is a pure
   function of event content.
4. **Ceiling-aware termination**: replace the
   primary-shard-cancels-on-SlotBoundary scheme with an independent
   per-shard termination check that only breaks when the local queue
   has no events with `ts < end_time` AND the CMB ceiling is also
   `>= end_time`. Every shard stops at the same simulation time,
   independent of token-cancellation propagation races.
5. **Second drain before popping**: run drain_cross_shard_safe a
   second time after the ceiling check passes. The top-of-loop drain
   may run before the peer has advanced enough for
   send_time=`timestamp - eps` messages to be deliverable; the
   post-ceiling-check drain catches them, preventing a cross-shard
   delivery from landing in a later iteration and splitting a
   timestamp's events across batches.
New test `test_sequential_multi_shard_deterministic` compares per-node
event trajectories across two runs under shard_count=2. Passes 500/500
in release mode (was failing in ~100% of runs before the fix, ~25% with
only the sort fix, 2% with the termination fix, 0% with the second
drain). All 55 sim-core tests pass.
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
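A self-contained sketch of change (1), the deterministic cross-shard
merge. Pending and Inbox are assumed names; the real CrossShardMsg
carries more fields:

```rust
use std::cmp::{Ordering, Reverse};
use std::collections::BinaryHeap;

// Messages sort by (send_time, source_shard, seq), as in change (1).
#[derive(PartialEq, Eq)]
struct Pending {
    send_time: u64,
    source_shard: u32,
    seq: u64,        // per-sender monotonic counter
    payload: String, // stand-in for the CrossShardMsg body
}

impl Ord for Pending {
    fn cmp(&self, other: &Self) -> Ordering {
        (self.send_time, self.source_shard, self.seq)
            .cmp(&(other.send_time, other.source_shard, other.seq))
    }
}
impl PartialOrd for Pending {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

#[derive(Default)]
struct Inbox {
    buffered: BinaryHeap<Reverse<Pending>>, // min-heap on the sort key
}

impl Inbox {
    // Release only messages whose send_time is strictly below the
    // minimum shared_time advertised by every peer shard: under that
    // rule no future message can arrive with an earlier send_time, so
    // delivery order is a pure function of the sent messages.
    fn deliverable(&mut self, min_peer_shared_time: u64) -> Vec<Pending> {
        let mut out = Vec::new();
        while let Some(Reverse(top)) = self.buffered.peek() {
            if top.send_time >= min_peer_shared_time {
                break;
            }
            out.push(self.buffered.pop().unwrap().0);
        }
        out
    }
}
```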
Fix TX generation over-rate from f64 truncation
`TxGeneratorCore::generate` computed inter-tx delay as
`config.frequency_ms.sample() as u64 * shard_count as u64` and passed it
to `Duration::from_millis`. The `as u64` cast truncated each sample: a
configured 7.5 ms became 7 ms, producing TXs ~7% faster than requested.
For the 0.200/wfa-ls single-shard run this meant 128,572 TXs over 900s
(~214 KB/s) instead of the intended ~120,000 TXs (~200 KB/s). Only
affects configurations with sub-ms precision and no batching. Turbo is
largely unaffected (1 ms resolution; the 10 ms tx-batch-window collapses
the fractional delay anyway).
Switch to `Duration::from_secs_f64`, preserving sub-millisecond
precision via nanosecond-resolution Duration. Clamp to `.max(0.0)` so
distributions that can sample negative (e.g. Normal) keep the old
"treat negative as zero delay" behaviour rather than panicking in
`from_secs_f64`.
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
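A sketch of the corrected delay computation (function name assumed;
the clamp must precede from_secs_f64, which panics on negative input):

```rust
use std::time::Duration;

// Before: Duration::from_millis(sample_ms as u64 * shard_count as u64)
// truncated 7.5 ms to 7 ms. After: keep sub-ms precision and clamp
// negative samples (possible under e.g. Normal) to zero delay.
fn inter_tx_delay(sample_ms: f64, shard_count: u32) -> Duration {
    let ms = (sample_ms * shard_count as f64).max(0.0);
    Duration::from_secs_f64(ms / 1000.0)
}
```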
De-RNG Linear Leios completely: withhold attacker + TxGeneratorCore
Migrate every remaining stateful-RNG use reachable from Linear Leios:
- linear_leios.rs generate_withheld_txs: `self.rng.random_bool(p)` is
replaced with `rng.draw_bool(node, slot, DrawSite::WithholdDecision,
p)`. The distribution sample for `txs_to_generate` and the per-tx
`new_tx` body generation use `Rng::seeded_chacha(node, slot, site)`
to produce one-shot ChaChaRngs seeded from context — this keeps the
rand_distr / `new_tx` machinery unchanged while removing the
cross-call stateful coupling.
- tx.rs TxGeneratorCore: replaces its `ChaChaRng` with the stateless
`SimRng` plus a monotonic `next_tx_idx: u64`. Each TX is generated
from a one-shot ChaChaRng seeded from
`("tx_generator", tx_idx)` — so the generated TX stream is a pure
function of the master seed regardless of per-node or network-timing
behaviour. Propagates the `SimRng` type through TransactionProducer
and its callers in sim/sequential.rs and sharding/shard.rs; the
master-RNG `.next_u64()` consumption is preserved to keep any
remaining downstream draws on stracciatella/leios variants seeded
the same way they were.
- Drops `rng: ChaChaRng` field from `LinearLeiosNode`. The NodeImpl
trait signature still takes a `ChaChaRng` for the other variants, so
LinearLeiosNode::new accepts it as `_rng` and discards.
New Rng methods: `seeded_chacha(node, slot, site)` for context-tied
one-shot ChaChaRng seeding, and `seeded_chacha_from<K: Hash>(&K)` for
sim-wide (non-node-tied) draws like the TX generator.
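A hypothetical sketch of the one-shot seeding helper (std's
DefaultHasher stands in for brevity; the real rng module hashes with
its own SplitMixHasher):

```rust
use rand::SeedableRng;
use rand_chacha::ChaCha20Rng;
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// One-shot ChaCha seeded purely from (master_seed, context): no state
// is carried across calls, so draw counts can never desynchronise.
fn seeded_chacha_from<K: Hash>(master_seed: u64, context: &K) -> ChaCha20Rng {
    let mut h = DefaultHasher::new();
    master_seed.hash(&mut h);
    context.hash(&mut h);
    ChaCha20Rng::seed_from_u64(h.finish())
}
```

A TX body then comes from `seeded_chacha_from(seed, &("tx_generator",
tx_idx))`: a pure function of the master seed and the TX index alone.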
All 54 sim-core tests pass; clippy clean for Linear Leios and
TxGeneratorCore.
Stracciatella and full-Leios variants retain their stateful `self.rng`
for now — they build fine but are out of scope for the current
determinism investigation.
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Migrate Linear Leios mempool shuffle to stateless Rng
Replace `candidates.shuffle(&mut self.rng)` in
LinearLeiosNode::sample_from_mempool with Rng::context_shuffle, which
performs Fisher-Yates using DrawSite::MempoolSwap { call, idx } for
each swap. The `call` discriminator distinguishes independent shuffle
invocations at the same (node, slot): the RB-body sample uses call=0,
the EB-body sample uses call=1, so they don't collide.
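A sketch of a context-keyed Fisher-Yates in the spirit of
context_shuffle, with the draw abstracted as a closure (the real Rng
derives each swap from DrawSite::MempoolSwap { call, idx }):

```rust
// Each swap index is a pure function of (call, idx) rather than of
// advancing RNG state; `call` separates independent shuffles at the
// same (node, slot). Modulo bias is ignored here for brevity.
fn context_shuffle<T, F>(items: &mut [T], call: u32, mut draw_u64: F)
where
    F: FnMut(u32, u32) -> u64, // (call, swap idx) -> pure random u64
{
    for i in (1..items.len()).rev() {
        let j = (draw_u64(call, i as u32) % (i as u64 + 1)) as usize;
        items.swap(i, j);
    }
}
```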
DrawSite::MempoolSwap gains a `call: u32` field. Three new rng tests
cover: deterministic-per-context, distinct-calls-yield-distinct-perms,
multiset-preservation.
Threads `slot` and `shuffle_call` through sample_from_mempool's
signature. Both call sites (RB path, EB path) in try_generate_rb pass
the active slot and their assigned call index.
Note: the default `leios-mempool-sampling-strategy: ordered-by-id`
means the shuffle branch doesn't fire in the current benchmark; this
is structural cleanup so Linear Leios contains no remaining
stateful-RNG uses on its hot VRF / sampling path.
Stracciatella and full Leios variants still use stateful `self.rng` for
their shuffle paths; those will be migrated in a follow-up.
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Add stateless context-derived RNG primitive; migrate VRF/lottery
The simulator's stateful ChaChaRng-per-node design is fragile: RNG
consumption count per node depends on control flow (e.g., "did this
node receive an EB in time to vote"), which depends on network timing.
Any microsecond-scale timing drift changes the number of RNG draws on a
node, desynchronising its RNG state, and every downstream random
decision on that node diverges — a macro-amplifier that turns upstream
timing blips into EB-scale outcome drift.
It's also unrealistic. Cardano's real VRF is stateless per slot:
vrf_output = f(key, nonce || slot) is a pure function that doesn't
"advance" with each use.
Introduce a stateless oracle: every random draw becomes a pure function
of (global_seed, context). The new `sim-core/src/rng` module provides:
- DrawSite enum naming every call site (RbLottery, VoteVrf, MempoolSwap,
TxGen{Node,Body,Frequency}, TxConflict, Withhold*, test/lottery site
variants). Discriminant plus variant fields are hashed into the
context, so distinct call sites never collide.
- Rng::draw_{u64,range,f64_01,bool}, all pure functions of
  (seed, node, slot, site); see the sketch after this list.
- SplitMixHasher — portable deterministic hasher: endian-pinned writes
(to_le_bytes in every write_uNN), splitmix64-style mixing, splitmix
finalizer. Not cryptographic; fine for a sim (no adversarial inputs)
and ~ns per draw.
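A minimal sketch of the draw primitives, assuming the standard
splitmix64 constants (the module's SplitMixHasher may differ in
detail):

```rust
// Standard splitmix64 finalizer: cheap, well-mixed, not cryptographic.
fn splitmix64(mut x: u64) -> u64 {
    x = x.wrapping_add(0x9E3779B97F4A7C15);
    x = (x ^ (x >> 30)).wrapping_mul(0xBF58476D1CE4E5B9);
    x = (x ^ (x >> 27)).wrapping_mul(0x94D049BB133111EB);
    x ^ (x >> 31)
}

// Every draw is a pure function of (seed, node, slot, site): the same
// context always yields the same value, so RNG state can never drift
// with control flow or network timing.
fn draw_u64(seed: u64, node: u64, slot: u64, site_disc: u64) -> u64 {
    let mut h = seed;
    for word in [node, slot, site_disc] {
        h = splitmix64(h ^ word);
    }
    h
}

fn draw_f64_01(seed: u64, node: u64, slot: u64, site: u64) -> f64 {
    // Top 53 bits -> uniform in [0, 1)
    (draw_u64(seed, node, slot, site) >> 11) as f64 / (1u64 << 53) as f64
}

fn draw_bool(seed: u64, node: u64, slot: u64, site: u64, p: f64) -> bool {
    draw_f64_01(seed, node, slot, site) < p
}
```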
Ten unit tests in rng::tests cover: determinism, different-seed
differentiation, 500-context collision check, 600-trial-index
distinctness, site-variant-on-same-(node,slot) distinctness, range/
probability sanity, endian-independence, and golden vectors pinning the
hash output (tested to catch accidental hash-function changes).
Migrate the VRF/lottery call paths for all three node variants:
- sim/lottery.rs: LotteryConfig::run signature changes from
`(kind, success_rate, &mut ChaChaRng)` to
`(kind, success_rate, &Rng, NodeId, slot, DrawSite)`. MockLotteryResults
(tests) unchanged: still keyed by LotteryKind.
- sim/linear_leios.rs: run_vrf threads slot+site through; RB lottery
uses DrawSite::RbLottery; vote VRF enumerates its (up to) 600 trials
as DrawSite::VoteVrf { eb_id, trial }.
- sim/stracciatella.rs: inline run_vrf (bypasses LotteryConfig) migrated
similarly. DrawSites: RbLottery, EbLottery{pipeline, trial},
VoteVrfPipeline{pipeline, trial}.
- sim/leios.rs: inline run_vrf migrated. DrawSites: IbLottery, EbLottery,
VoteVrfPipeline, RbLottery.
Nodes still hold a ChaChaRng for mempool shuffle, withhold-TX attack,
TxGeneratorCore, and new_tx body randomness. These are migrated in
follow-up phases. The critical VRF path — the macro-amplifier that
cascades network-timing non-determinism into per-node RNG-state
desynchronisation — is now structurally deterministic by construction.
All 51 sim-core tests pass.
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Sharpen sequential determinism test by coupling events to per-node RNG
The previous determinism microtest would pass even with a lingering
non-determinism source downstream of the bandwidth-queue fix, because
TestNode ignored its seeded ChaChaRng and its event payloads didn't
depend on any accumulated per-node state. Any timing-induced drift in
message-delivery order across runs was undetectable.
Extend TestNode to roll self.rng.random::<u64>() on each Ping and
Heartbeat receipt and weave the roll into the event payload (and into
the returned Pong reply). Event content is now tied to accumulated
per-node RNG state, so any non-determinism in message-delivery order or
count desynchronises the RNG and surfaces as a differing roll=... field
in a compared event.
Add test_sequential_deterministic_bw_under_rayon, which exercises the
rayon-parallel path (parallel_threshold=1) under bandwidth contention
and asserts that per-node event trajectories (timestamp-sorted) match
across runs. The existing test_sequential_deterministic runs serial;
this one catches any rayon-visible shared-state non-determinism.
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
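A sketch of the sharpened receipt handler, with assumed names (the
real TestNode lives in the sequential engine's test module):

```rust
use rand::{Rng, SeedableRng};
use rand_chacha::ChaCha20Rng;

// Every receipt advances per-node RNG state and weaves the roll into
// the recorded event, so one out-of-order or missing delivery
// desynchronises the RNG and shows up as a differing roll=... field.
struct TestNode {
    rng: ChaCha20Rng,
    events: Vec<String>,
}

impl TestNode {
    fn new(seed: u64) -> Self {
        Self { rng: ChaCha20Rng::seed_from_u64(seed), events: Vec::new() }
    }

    // Returns the roll so the caller can weave it into the Pong reply.
    fn on_ping(&mut self, from: u32) -> u64 {
        let roll: u64 = self.rng.random();
        self.events.push(format!("ping from={from} roll={roll}"));
        roll
    }
}
```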
Convert cip-voting-options.sh to named params, add engine selector
Positional args had grown unwieldy. Rewrite with flag parsing:
-t/--topology, -T/--throughput, -m/--mode, -e/--engine, -s/--slots,
--quorum-fraction, --stake-fraction.
Add an `--engine` selector that writes an on-the-fly override file:
- actor: the default (tokio async); single-shard, non-deterministic
- sequential: single-shard sequential DES (deterministic)
- turbo: sequential DES with 6 shards (non-deterministic, fast)
Add `engine` as a CSV column so runs from different engines can live in
the same file and be pivoted cleanly.
Add determinism-run.sh / determinism-check.sh as a simple 3-run harness
for spot-checking single-shard-sequential determinism against the
0.200/wfa-ls scenario. determinism-run.sh runs the benchmark 3× and
writes progress to /tmp/det-run-state; determinism-check.sh prints a
concise status summary (safe to poll from /loop or cron).
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Fix single-shard sequential engine non-determinism via BTreeMap bandwidth queues
Connection::split_bytes_amongst_queues iterated `bandwidth_queues` (a
`HashMap` with std RandomState) and distributed a `bytes % queues`
remainder by walking a stable-sorted vec. When two mini-protocols had
equal queued bytes, the stable sort preserved HashMap iteration order,
so the +1 byte landed on a non-deterministic protocol. Under bandwidth
contention this shifted message arrival timestamps, and the divergence
cascaded into different EB certification outcomes across otherwise
identical runs.
Switch `bandwidth_queues` to `BTreeMap` and widen the `TProtocol` bound
from `Hash` to `Ord`. Add `PartialOrd, Ord` to the production
`MiniProtocol` derive; propagate the `Ord` bound through `Network`,
`NetworkCoordinator`, and `sharding::shard`. Tie-break is now by
`TProtocol`'s Ord order (Tx < Block < IB < EB < Vote) — a stable,
documentable bias strictly better than the previous stable-but-random
behaviour.
Add `test_sequential_deterministic_under_bandwidth_contention`, which
forces two mini-protocols to queue simultaneously on bandwidth-capped
links and asserts bit-identical event streams (timestamps included).
The pre-existing `test_sequential_deterministic` is kept as the
no-bandwidth lane.
Note: multi-shard sequential remains non-deterministic (std_mpsc
cross-shard message interleaving); a comment flags this.
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
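A simplified sketch of the deterministic split. The real
split_bytes_amongst_queues weights shares by queued bytes; this keeps
only the tie-break behaviour and assumes a non-empty queue map:

```rust
use std::collections::BTreeMap;

// With a BTreeMap, iteration follows TProtocol's Ord order, so when
// shares tie, the +1 remainder bytes land on the same protocols in
// every run instead of following HashMap iteration order.
fn split_bytes<P: Ord + Copy>(queues: &BTreeMap<P, u64>, bytes: u64) -> BTreeMap<P, u64> {
    let n = queues.len() as u64; // assumed non-zero
    let share = bytes / n;
    let mut remainder = bytes % n;
    queues
        .keys()
        .map(|&p| {
            let extra = if remainder > 0 { remainder -= 1; 1 } else { 0 };
            (p, share + extra)
        })
        .collect()
}
```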
Merge pull request #858 from input-output-hk/fix/antithesis-moog-poll
fix: handle transient errors and reduce poll frequency in wait-for-test
Add configurable committee selection algorithms for linear leios
Add committee-selection-algorithm config with three modes:
- wfa-ls (default): existing VRF lottery matching CIP-0164 wFA+LS
- everyone: every node votes unconditionally (1 vote each)
- top-stake-fraction: nodes covering the top N% of cumulative stake vote
This enables traffic analysis comparing the CIP's VRF-based scheme
against simpler alternatives. Vote bundle sizes, CPU times, diffusion,
and threshold checking are unchanged — only the selection mechanism
differs.
Includes a benchmark script (scripts/cip-voting-options.sh) that runs
the CIP topology under turbo mode across all three committee modes.
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
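The three modes as a hypothetical config enum, reduced to the
selection decision only (vote weights and thresholds unchanged, per
the commit):

```rust
enum CommitteeSelection {
    WfaLs,                 // CIP-0164 wFA+LS VRF lottery (default)
    Everyone,              // every node votes unconditionally
    TopStakeFraction(f64), // nodes covering the top N% of cumulative stake
}

// `cumulative_stake_rank` is this node's position in the stake CDF
// (0.0 = start of the richest node); `vrf_won` is the existing lottery
// result, which only the default mode consults.
fn may_vote(sel: &CommitteeSelection, vrf_won: bool, cumulative_stake_rank: f64) -> bool {
    match sel {
        CommitteeSelection::WfaLs => vrf_won,
        CommitteeSelection::Everyone => true,
        CommitteeSelection::TopStakeFraction(f) => cumulative_stake_rank < *f,
    }
}
```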
Fix stale node_lookup reference in TransactionProducer::run()
The TxGeneratorCore refactor (8a4da350) moved node selection logic into
TxGeneratorCore but left a reference to the removed `node_lookup` local.
Replace it with `self.sinks`, which serves the same empty-check purpose.
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Merge pull request #855 from input-output-hk/ci/poll-antithesis
ci(antithesis): poll for test results after submission
fix: only use MOOG_GITHUB_PAT when executing moog
Signed-off-by: Chris Gianelloni <[email protected]>
ci(antithesis): poll for test results after submission
Add a two-phase polling script (wait-for-test.sh) that monitors
Antithesis test status via Moog:
- Phase 1: poll every 10s until the test is accepted or rejected
- Phase 2: poll every 60s until the test finishes
- Final check: exit 0 on success, 1 on failure/unknown
Also hoists Moog env vars to job level, auto-increments the try counter
per commit, and caps the job at 180 minutes. Adapted from
cardano-foundation/cardano-node-antithesis.
Signed-off-by: Chris Gianelloni <[email protected]>