Home / Input Output / ouroboros-leios
98 commits this week Jun 04, 2026 - Jun 11, 2026
docs: repoint sim-rs links to the leios-tools repo
sim-rs/, net-rs/ and shared-rs/ now live in their own repository at
cardano-scaling/leios-tools. Update the remaining monorepo docs that
linked into sim-rs as a local path so they point at the new home:

- Markdown links → https://github.com/cardano-scaling/leios-tools/tree/main/sim-rs
  (and /blob/main/... for specific files).
- Embedded architecture diagrams → raw.githubusercontent.com/.../main/sim-rs/docs/...

Docs that live under net-rs/, sim-rs/ and shared-rs/ are left out — they
travel with their own repo. Prose mentions, git tags, output filenames,
docker image names, and shell-command relative paths are intentionally
left untouched (a URL can't be a symlink target or local build output).
Logbook.md is a historical journal and is left as-is.

Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>
net-core: encode LeiosFetch bitmaps as indefinite-length
The leios-prototype CDDL defines `bitmaps` as

    ; indefinite-length map from 64-tx window index to 64-bit presence
    bitmaps = { * base.word16 => base.word64 }

and the reference relay's parser enforces the indefinite-length form:
sending the definite-length variant triggers an immediate TCP RST
within ~25–45 ms of `MsgLeiosBlockTxsRequest`, which we'd been
working around in our dev-relay configs with
`fetch_policy.eb_txs = "no_fetch"`.

Switch `encode_bitmap` from `e.map(len)` (definite-length) to
`e.begin_map()` + `e.end()` (indefinite-length).  Same pattern as
TxSubmission's indefinite-length inner lists.

Verified against the live Leios dev relay (magic 164,
`leios-node.play.dev.cardano.org`): with the fix in, the relay
accepts `MsgLeiosBlockTxsRequest` and streams back the EB-tx list
without RSTing.  First test: `count=512` transactions delivered for
one EB, zero connection resets.
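
To make the wire-level difference concrete, here is a minimal sketch of the two map framings using a hand-rolled, std-only CBOR writer. The helper names (`put_head`, `encode_bitmaps_indefinite`, `encode_bitmaps_definite`) are hypothetical stand-ins, not the project's `encode_bitmap` or its CBOR library.

```rust
/// Write a CBOR head for `major` (0 = unsigned int, 5 = map) with argument `v`,
/// using the shortest encoding.
fn put_head(out: &mut Vec<u8>, major: u8, v: u64) {
    let m = major << 5;
    match v {
        0..=23 => out.push(m | v as u8),
        24..=0xff => {
            out.push(m | 24);
            out.push(v as u8);
        }
        0x100..=0xffff => {
            out.push(m | 25);
            out.extend_from_slice(&(v as u16).to_be_bytes());
        }
        0x1_0000..=0xffff_ffff => {
            out.push(m | 26);
            out.extend_from_slice(&(v as u32).to_be_bytes());
        }
        _ => {
            out.push(m | 27);
            out.extend_from_slice(&v.to_be_bytes());
        }
    }
}

/// Indefinite-length map: 0xBF, then key/value pairs, then the 0xFF "break".
fn encode_bitmaps_indefinite(bitmaps: &[(u16, u64)]) -> Vec<u8> {
    let mut out = vec![0xbf];
    for &(window, bits) in bitmaps {
        put_head(&mut out, 0, u64::from(window));
        put_head(&mut out, 0, bits);
    }
    out.push(0xff);
    out
}

/// Definite-length map, shown only for comparison: the pair count is in the head.
fn encode_bitmaps_definite(bitmaps: &[(u16, u64)]) -> Vec<u8> {
    let mut out = Vec::new();
    put_head(&mut out, 5, bitmaps.len() as u64);
    for &(window, bits) in bitmaps {
        put_head(&mut out, 0, u64::from(window));
        put_head(&mut out, 0, bits);
    }
    out
}
```

For the single entry `(0, 1)` the indefinite form is `bf 00 01 ff` and the definite form is `a1 00 01`: same pairs, different framing.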

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
praos: surface upstream gappy ChainSync via two complementary WARNs
Diagnosing a wedged catch-up against the public Leios dev relay
required grepping across the orphan / fork-mismatch INFO traffic and
inferring the cause from cache state.  Two new WARNs hand the
diagnosis directly to an operator skimming logs:

- **ChainSync ingress contiguity check** in `record_peer_tip`: when an
  arriving header's `prev_hash` doesn't match the previously-announced
  one's hash, log the (block_no, hash) pair on each side and the
  implied skipped-block count.  Throttled per peer
  (`GAP_WARNING_INTERVAL = 10 s`) so a sustained non-contiguous forward
  doesn't flood the log.  This is the direct signal — the WARN fires
  the moment upstream commits the offence.

- **Stuck-validation rollup** in `retry_select_chain`: when validation
  has been frozen for `STUCK_THRESHOLD = 30 s` and some peer offers a
  strictly-better tip, emit one rollup line summarising stuck duration,
  adopted vs best-peer block_no, the count of entries in that peer's
  replay whose parent_hash we don't have locally, and the peer-chain
  size.  Throttled to one fire per `STUCK_WARNING_INTERVAL = 60 s`.
  This covers the general "stuck for any reason" case and stays
  informative when the ingress check has gone quiet under its
  per-peer cooldown.

Both lines were verified against the dev relay: ingress fires within
~30 s of catch-up reaching the wedge boundary (with the exact missing
block hash prefix in the message), and the rollup fires 30 s later
with `unreachable_parent_hashes > 0`, both throttled correctly under
sustained wedge load.
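
The ingress contiguity check can be sketched roughly as follows. This is an illustration under assumptions, not the code in `record_peer_tip`: the `IngressCheck` and `Header` types, the `record` method, and the message format are hypothetical, and rollbacks are ignored for brevity. Only `GAP_WARNING_INTERVAL` and `last_gap_warning_at` come from the commit messages in this log.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

type Hash = [u8; 32];
type PeerId = u64;

const GAP_WARNING_INTERVAL: Duration = Duration::from_secs(10);

#[derive(Clone, Copy)]
struct Header {
    block_no: u64,
    hash: Hash,
    prev_hash: Hash,
}

#[derive(Default)]
struct IngressCheck {
    last_header: HashMap<PeerId, Header>,
    last_gap_warning_at: HashMap<PeerId, Instant>,
}

/// First four bytes of a hash as hex, for log lines.
fn short(h: &Hash) -> String {
    h[..4].iter().map(|b| format!("{b:02x}")).collect()
}

impl IngressCheck {
    /// Record a newly announced header; returns the WARN text when one is due.
    fn record(&mut self, peer: PeerId, h: Header, now: Instant) -> Option<String> {
        // No previous header from this peer: nothing to compare against.
        let prev = self.last_header.insert(peer, h)?;
        if h.prev_hash == prev.hash {
            return None; // contiguous
        }
        // Per-peer throttle so a sustained gappy forward does not flood the log.
        if let Some(&at) = self.last_gap_warning_at.get(&peer) {
            if now.duration_since(at) < GAP_WARNING_INTERVAL {
                return None;
            }
        }
        self.last_gap_warning_at.insert(peer, now);
        let skipped = h.block_no.saturating_sub(prev.block_no).saturating_sub(1);
        Some(format!(
            "non-contiguous ChainSync from peer {peer}: prev=({}, {}) next=({}, {}) skipped={skipped}",
            prev.block_no,
            short(&prev.hash),
            h.block_no,
            short(&h.hash)
        ))
    }

    /// Drop per-peer state on disconnect so the maps cannot grow without bound.
    fn peer_disconnected(&mut self, peer: PeerId) {
        self.last_header.remove(&peer);
        self.last_gap_warning_at.remove(&peer);
    }
}
```

Passing `now` in as a parameter keeps the throttle deterministic under test.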

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
praos: fix stuck-rollup parent check + clean up per-peer state
- maybe_emit_stuck_warning was counting any peer-chain parent missing from
  adopted_ancestors as "unreachable", but the diagnostic doc says "neither
  in chain_tree nor block_cache".  A peer chain that forks off into a
  branch we hold but haven't adopted would false-positive the WARN.  Check
  the whole chain_tree (every fork we know about) instead.

- record_peer_disconnected wasn't removing the per-peer last_gap_warning_at
  throttle entry, so the map grew without bound under reconnect churn if
  PeerIds are monotonically assigned.  Same lifecycle as the orphan
  cooldown that already gets cleared here.

- Drop a duplicate `#[allow(clippy::too_many_arguments)]` on
  on_tip_advanced (one was enough).

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
praos: trust the ChainSync intersection anchor as the common ancestor
After reconnecting to a peer on a divergent fork (the round-robin
relay-backend handover), select_chain could perpetually fail to find a
common ancestor: the per-peer fragment is too short to reach the fork
point and `adopted_ancestors` may be truncated by pruning, so the
strictly-better peer is classified OrphanCandidate forever and the tip
freezes (the original relay wedge: "orphan — no common ancestor found",
looping on re-intersection).

`find_intersection` already probes back to genesis, so its anchor is the
authoritative common ancestor. Add a final fallback in select_chain: when
no ancestor is found via the fragment, trust the peer's intersection
anchor if it is a real block we hold and the resulting reorg is within k.
The bounded (<= k) reorg then proceeds via the existing
WaitingForBlocks/range-fetch path. Reorgs deeper than k are refused
(Praos finality — settled blocks). Origin/genesis anchors are excluded:
a switch sharing only genesis needs a full re-sync from block 1 that the
range fetch can't anchor at Origin, and isn't the round-robin case
(those backends share real history at the fork point).

Tests: select_chain_trusts_intersection_anchor_within_k (verified to fail
without the fallback) and select_chain_refuses_reorg_deeper_than_k. 308
shared-consensus tests pass.
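
The fallback rule described above can be summarised in a small decision function. This is a sketch under assumptions: the `Anchor` and `Fallback` types and the `anchor_fallback` signature are hypothetical and do not mirror the actual `select_chain` code.

```rust
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum Anchor {
    /// Intersection at Origin/genesis: excluded from the fallback.
    Origin,
    /// Intersection at a real block.
    Block { block_no: u64 },
}

#[derive(PartialEq, Eq, Debug)]
enum Fallback {
    /// Trust the anchor; the reorg is bounded by k.
    UseAnchor { rollback_depth: u64 },
    /// Reorg deeper than k: refused (settled blocks).
    RefuseTooDeep,
    /// Anchor is Origin or a block we do not hold: keep the orphan verdict.
    NoAnchor,
}

fn anchor_fallback(anchor: Anchor, anchor_held_locally: bool, adopted_tip_no: u64, k: u64) -> Fallback {
    match anchor {
        Anchor::Origin => Fallback::NoAnchor,
        Anchor::Block { .. } if !anchor_held_locally => Fallback::NoAnchor,
        Anchor::Block { block_no } => {
            let depth = adopted_tip_no.saturating_sub(block_no);
            if depth <= k {
                Fallback::UseAnchor { rollback_depth: depth }
            } else {
                Fallback::RefuseTooDeep
            }
        }
    }
}
```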

Co-Authored-By: Claude Opus 4.8 <[email protected]>
behaviour: add DeepReorg — deliberate deep self-reorg for chaos testing
Adds a config-driven producer-side behaviour that periodically abandons
a chain suffix and forks, so downstream followers must recover from a
deep RollBackward. Reusable as deep-reorg / reorg-resilience tooling
(and the harness for the deep-rollback recovery work).

- shared-consensus:
  - ChainTree::remove_above(block_number) — drop an abandoned suffix and
    recompute best_tip.
  - PraosState::force_rollback(depth) — re-anchor the adopted chain
    `depth` blocks back, prune the suffix, emit InjectRollback so the
    served chain mirrors it. No-op below depth/with no tip. Unit-tested.
  - Behaviour::praos_reorg(slot) -> Option<depth> hook (default None;
    composite returns first Some); LeiosState::ask_praos_reorg consults it.
  - BehaviourSpec::DeepReorg { every_slots, depth } + registry wiring +
    behaviours::DeepReorg (fires once per `every_slots`-aligned slot).

- net-node: PraosConsensus::force_rollback diffuses the rollback;
  Consensus::maybe_force_reorg consults the behaviour each slot; the main
  slot loop calls it before production so the producer forks.

Config: `[behaviour] kind="deep-reorg", every_slots=N, depth=D`.

303 shared-consensus tests pass. Verified live (producer+follower): the
behaviour fires and forks; an honest follower recovers cleanly from
single-producer deep reorgs (the relay wedge needs additional
conditions — multi-peer/reconnect — still under investigation).
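
A rough sketch of how such a hook could compose, assuming a trait with a default-`None` method and a composite that returns the first `Some`. The `Composite` and `Honest` types and the `slot % every_slots == 0` alignment rule are assumptions made for illustration; the real `Behaviour` trait has more hooks than shown here.

```rust
trait Behaviour {
    /// Depth to roll back at `slot`, if this behaviour wants a reorg.
    fn praos_reorg(&self, _slot: u64) -> Option<u64> {
        None
    }
}

/// A behaviour that never asks for a reorg (uses the default hook).
struct Honest;
impl Behaviour for Honest {}

struct DeepReorg {
    every_slots: u64,
    depth: u64,
}

impl Behaviour for DeepReorg {
    fn praos_reorg(&self, slot: u64) -> Option<u64> {
        // Assumed alignment rule: fire on slots divisible by `every_slots`.
        // `every_slots == 0` is treated as "never" to avoid a division by zero.
        if self.every_slots != 0 && slot % self.every_slots == 0 {
            Some(self.depth)
        } else {
            None
        }
    }
}

/// Composite: the first child that returns `Some` wins.
struct Composite(Vec<Box<dyn Behaviour>>);

impl Behaviour for Composite {
    fn praos_reorg(&self, slot: u64) -> Option<u64> {
        self.0.iter().find_map(|b| b.praos_reorg(slot))
    }
}
```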

Co-Authored-By: Claude Opus 4.8 <[email protected]>
mux: RunningMux::abort must cancel the muxer/demuxer, not just the supervisor
`RunningMux::abort()` only aborted the supervisor task. The supervisor
merely *watches* the muxer (egress/writer) and demuxer (ingress/reader)
and aborts the survivor when one fails — it doesn't own the bearer. So
aborting only the supervisor left the muxer and demuxer running, holding
the TCP read/write halves open: the peer never saw EOF and only
disconnected via the 60s keepalive timeout.

Store the muxer/demuxer AbortHandles in RunningMux and cancel all three
in abort(), so tearing a connection down closes the socket promptly and
the remote observes the disconnect immediately. Surfaced while building
a server-side peer-reset behaviour, but affects every abort-driven
teardown (peer Disconnect, shutdown, supervisor cleanup).

Co-Authored-By: Claude Opus 4.8 <[email protected]>
behaviour: short-circuit DropInboundPeers at probability >= 1.0
`u64 / u64::MAX as f64` lands in [0, 1] inclusive: `u64::MAX as f64` rounds
to 2^64, and a hash equal to u64::MAX yields a draw of exactly 1.0.  The
`draw < probability` test then refused to drop on that one-in-2^64 hash
even when the operator asked for "always drop" semantics with
probability=1.0.

Mirror the `probability <= 0.0` short-circuit at the top of the function
so probability=1.0 unconditionally returns true, and fix the comment to
say [0, 1] rather than [0, 1).
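
The edge case is easy to reproduce in isolation. The sketch below assumes the draw is derived from a `u64` hash passed in by the caller; the function name `should_drop` and its signature are hypothetical.

```rust
/// Decide whether to drop, given a pre-computed 64-bit hash and a probability.
fn should_drop(hash: u64, probability: f64) -> bool {
    if probability <= 0.0 {
        return false; // "never drop"
    }
    if probability >= 1.0 {
        return true; // "always drop", regardless of the draw
    }
    // `u64::MAX as f64` rounds up to 2^64, so the draw lies in [0, 1] inclusive.
    let draw = hash as f64 / u64::MAX as f64;
    draw < probability
}
```

Without the second short-circuit, `should_drop(u64::MAX, 1.0)` would evaluate `1.0 < 1.0` and return `false`.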

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
behaviour: add DropInboundPeers — server-side connection-reset chaos
Adds a config-driven behaviour that randomly resets accepted (inbound)
peer connections, so the remote reconnects and re-runs ChainSync
intersection from scratch. Mimics a relay that RSTs inbound peers — the
reconnect-handover trigger for deep-rollback recovery testing. Compose
with DeepReorg so each reconnect re-intersects against a reorged chain.

- shared-consensus:
  - Behaviour::drop_inbound_peers(slot) -> bool hook (default false;
    composite ORs children); LeiosState::ask_drop_inbound_peers.
  - BehaviourSpec::DropInboundPeers { probability } + registry wiring +
    behaviours::DropInboundPeers (deterministic per-(seed, slot) draw).
- net-core: NetworkCommand::DropInboundPeers; coordinator resets every
  accepted peer (ip_guard.is_some()) via PeerCommand::Disconnect (relies
  on the mux-teardown fix to close the socket promptly).
- net-node: Consensus::should_drop_inbound_peers; the slot loop issues
  the command when the behaviour fires.

Config: `[behaviour] kind="drop-inbound-peers", probability=P`.

Verified live (producer with DeepReorg + DropInboundPeers): the follower
disconnects promptly (mux error, not 60s keepalive) and reconnects each
drop, re-intersecting against the reorged chain.

Co-Authored-By: Claude Opus 4.8 <[email protected]>
shared-consensus: clear pre-existing clippy 1.92 lints
- t22.rs: dedent inline list items to 2 spaces (overindented); drop the
  redundant `let decision = …; decision` binding (`let_and_return`).
- praos.rs: rewrap PraosStateSizes equivocation_bytes_estimate doc so the
  list bullet sits flush with the prose.
- behaviour/selection.rs: replace `out.get(&2).is_none()` with
  `!out.contains_key(&2)` (`unnecessary_get_then_check`).

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
praos: rate-limit the chain_tree gap-bridge to stop re-fetch amplification
retry_select_chain bridges [adopted_tip -> best_tip] gated only on the
moving `gap_point`. As a live peer keeps producing, best_tip advances
every tick, so the per-block in-flight dedup never trips and the whole
range re-fetches continuously (observed ~44 blocks/s of re-fetch with the
tip frozen). Rate-limit the bridge per adopted tip (BRIDGE_COOLDOWN,
10s), reset as soon as the adopted tip advances (real progress).

(An earlier version of this commit also tried to "un-starve" the oldest
in-flight gap block on a short timer; that was reverted — it can't tell a
superseded fetch from a slow-but-in-progress one, so during normal
catch-up it re-issued the whole range and amplified block traffic ~47x.
The 15s IN_FLIGHT_TTL already re-tries genuinely stuck blocks.)

308 shared-consensus tests pass.
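
One way to picture the per-adopted-tip cooldown is sketched below. The `BridgeLimiter` type and its `allow` method are hypothetical; only the `BRIDGE_COOLDOWN` name and its 10 s value come from the message above.

```rust
use std::time::{Duration, Instant};

type Hash = [u8; 32];

const BRIDGE_COOLDOWN: Duration = Duration::from_secs(10);

#[derive(Default)]
struct BridgeLimiter {
    /// Adopted tip at the time of the last bridge, and when it was issued.
    last: Option<(Hash, Instant)>,
}

impl BridgeLimiter {
    /// Returns true if a gap-bridge fetch may be issued now.
    fn allow(&mut self, adopted_tip: Hash, now: Instant) -> bool {
        if let Some((tip, at)) = self.last {
            // Same adopted tip and still cooling down: suppress the re-fetch.
            if tip == adopted_tip && now.duration_since(at) < BRIDGE_COOLDOWN {
                return false;
            }
        }
        // Either the adopted tip advanced (real progress) or the cooldown expired.
        self.last = Some((adopted_tip, now));
        true
    }
}
```

Keying on the adopted tip rather than on the moving `best_tip` is what stops the range from being re-requested on every tick.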

Co-Authored-By: Claude Opus 4.8 <[email protected]>
coordinator: throttle re-intersection per address to stop fork-loop spin
A peer stuck on an unreconcilable fork re-intersects in a tight loop
(the original relay wedge spun at ~1.4 orphan/re-intersect per second).
The per-peer-id orphan cooldown in shared-consensus doesn't survive the
reconnect handovers that assign a fresh PeerId each time.

Rate-limit `NetworkCommand::ReIntersect` in the coordinator keyed by peer
*address* (stable across reconnects) with exponential backoff (1s → 30s).
All re-intersections — within a connection and across reconnect handovers
— flow through this command, so one throttle covers both. The first
attempt for an address always passes (legitimate single re-intersects are
never blocked); rapid repeats back off; a peer that goes quiet resets.

Measured on the round-robin divergent-backend repro: orphan/re-intersect
events dropped from ~26/90s (spin) to 3, and the follower makes progress
instead of freezing.
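
A minimal sketch of an address-keyed throttle with exponential backoff follows. The type and method names are hypothetical, and the `QUIET_RESET` threshold is an assumption: the message says a quiet peer resets but does not give the interval.

```rust
use std::collections::HashMap;
use std::net::SocketAddr;
use std::time::{Duration, Instant};

const MIN_BACKOFF: Duration = Duration::from_secs(1);
const MAX_BACKOFF: Duration = Duration::from_secs(30);
/// Assumed value: how long an address must stay quiet before backoff resets.
const QUIET_RESET: Duration = Duration::from_secs(60);

struct Entry {
    last_allowed: Instant,
    backoff: Duration,
}

#[derive(Default)]
struct ReIntersectThrottle {
    by_addr: HashMap<SocketAddr, Entry>,
}

impl ReIntersectThrottle {
    /// Returns true if a re-intersection for `addr` may proceed now.
    fn allow(&mut self, addr: SocketAddr, now: Instant) -> bool {
        if let Some(e) = self.by_addr.get_mut(&addr) {
            let elapsed = now.duration_since(e.last_allowed);
            if elapsed < e.backoff {
                return false; // rapid repeat: back off
            }
            e.backoff = if elapsed >= QUIET_RESET {
                MIN_BACKOFF // peer went quiet: reset
            } else {
                (e.backoff * 2).min(MAX_BACKOFF) // double, capped at 30 s
            };
            e.last_allowed = now;
            return true;
        }
        // The first attempt for an address always passes.
        self.by_addr.insert(addr, Entry { last_allowed: now, backoff: MIN_BACKOFF });
        true
    }
}
```

Because the key is the socket address, the state survives reconnects that hand out a fresh `PeerId`.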

Co-Authored-By: Claude Opus 4.8 <[email protected]>
praos: evict abandoned suffix from block cache on force_rollback
`force_rollback` re-anchored chain state and pruned the chain tree
suffix, but left the abandoned blocks in `block_cache` / `validated` /
`in_flight_validation`.  The dedup at the top of `on_block_received`
short-circuits when a hash is in any of those, so a peer re-offering an
abandoned block after a deliberate self-reorg was silently dropped —
`chain_tree` never re-acquired it and the node was pinned on its dead
fork, defeating post-chaos recovery.

Mirror the k-prune retention pattern from `on_block_applied` in the
opposite direction: retain `block_cache` to entries with `block_no <=
target_bn`, then drop `validated` / `in_flight_validation` /
`header_first_seen` to hashes still in `block_cache`.

Test extended to assert the suffix is gone from cache + validated.
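
The retention pattern can be sketched with plain `retain` calls. The `Caches` struct, the `CachedBlock` type, and the value type of `header_first_seen` are simplified stand-ins for illustration; only the field names come from the message above.

```rust
use std::collections::{HashMap, HashSet};

type Hash = [u8; 32];

struct CachedBlock {
    block_no: u64,
}

#[derive(Default)]
struct Caches {
    block_cache: HashMap<Hash, CachedBlock>,
    validated: HashSet<Hash>,
    in_flight_validation: HashSet<Hash>,
    header_first_seen: HashMap<Hash, u64>,
}

impl Caches {
    /// Evict everything above `target_bn` so re-offered blocks are not deduplicated away.
    fn evict_above(&mut self, target_bn: u64) {
        // Step 1: keep only blocks at or below the rollback target.
        self.block_cache.retain(|_, b| b.block_no <= target_bn);
        // Step 2: keep side tables consistent with what is left in the cache.
        let cache = &self.block_cache;
        self.validated.retain(|h| cache.contains_key(h));
        self.in_flight_validation.retain(|h| cache.contains_key(h));
        self.header_first_seen.retain(|h, _| cache.contains_key(h));
    }
}
```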

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
praos: heal non-contiguous candidate chains instead of looping on orphan
A passive follower could wedge permanently once at the live tip: after a
deep rollback the peer re-announces its chain, but our cached copy has a
gap (the peer fragment skipped a header, and that gap block was never
fetched because it never appeared in `missing` — it isn't in the
fragment).  `select_chain` then classified the strictly-better candidate
as a non-contiguous / fork-mismatch orphan, cleared the fragment, and
requested re-intersection — which the peer answers by rolling back to our
tip and re-announcing, looping ~1.4x/s with the tip frozen.

Primary fix (gap-fill): when a strictly-better candidate's cached chain
is non-contiguous, fetch the contiguous range [common ancestor ->
candidate tip] from the announcing peer instead of giving up.  A range
request needs only the two endpoints; the peer streams every intermediate
block, healing the gap so a later pass switches.  Genesis-rooted
ancestors still fall back to the orphan verdict (can't anchor a range at
Origin).

Secondary safety net (fork-tip prune): the periodic gap-bridge could also
fixate on an abandoned fork tip left far ahead in chain_tree after the
rollback — no connected peer offers it, so the bridge re-issued a peerless
(peer_count=0) no-op fetch forever.  `ChainTree::remove_fork_tip` drops
such an unreachable best_tip and recomputes best_tip; retry_select_chain
prunes rather than emit a peerless fetch.

Tests: select_chain_heals_noncontiguous_cached_candidate_via_range_fetch
and retry_prunes_unreachable_best_tip_instead_of_peerless_fetch (both
verified to fail without the respective change).  300 pass.
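
The gap-fill decision can be illustrated with a small classifier over a cached candidate chain. All names here (`Point`, `CachedHeader`, `Verdict`, `classify`) are hypothetical and the real `select_chain` tracks considerably more state; the sketch only shows why a range request needs just the two endpoints.

```rust
type Hash = [u8; 32];

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Point {
    slot: u64,
    hash: Hash,
}

#[derive(Clone, Copy)]
struct CachedHeader {
    point: Point,
    prev_hash: Hash,
}

#[derive(Debug, PartialEq, Eq)]
enum Verdict {
    /// Every header links to its predecessor: safe to switch.
    Contiguous,
    /// A link is missing: ask the peer for the whole range to heal the gap.
    FetchRange { from: Point, to: Point },
    /// No real block to anchor a range at (ancestor is Origin), or nothing cached.
    Orphan,
}

/// `ancestor` is `None` when the common ancestor is Origin/genesis.
fn classify(ancestor: Option<Point>, chain: &[CachedHeader]) -> Verdict {
    let tip = match chain.last() {
        Some(t) => t,
        None => return Verdict::Orphan,
    };
    let mut expected_prev = ancestor.map(|p| p.hash);
    let mut contiguous = true;
    for h in chain {
        if let Some(expected) = expected_prev {
            if h.prev_hash != expected {
                contiguous = false;
                break;
            }
        }
        expected_prev = Some(h.point.hash);
    }
    match (contiguous, ancestor) {
        (true, _) => Verdict::Contiguous,
        (false, Some(from)) => Verdict::FetchRange { from, to: tip.point },
        (false, None) => Verdict::Orphan,
    }
}
```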

Co-Authored-By: Claude Opus 4.8 <[email protected]>
sim-core: clear pre-existing clippy 1.92 lints
- tcp_connection.rs: `mem::replace(_, Default::default())` → `mem::take`.
- config.rs: `for (_, spec) in &out` → `for spec in out.values()`.
- network/connection.rs: allow `clippy::large_enum_variant` on
  `ConnectionKind` — the two variants share a uniform interface and
  Box-ing the TCP variant would add an indirection on every connection
  access in the sim hot path.  Also allow `clippy::items_after_test_module`
  on the file's `mod tests` block; `ConnectionKind` legitimately lives
  after it and moving the 400-line test module to the file end would be
  pure churn.
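
For readers unfamiliar with the first two rewrites, here is a small standalone illustration. The function names are hypothetical and unrelated to the sim-core code.

```rust
use std::collections::BTreeMap;
use std::mem;

/// `mem::take(buf)` is equivalent to `mem::replace(buf, Default::default())`:
/// it returns the old contents and leaves an empty Vec behind.
fn drain_buffer(buf: &mut Vec<u8>) -> Vec<u8> {
    mem::take(buf)
}

/// Iterating `out.values()` avoids destructuring `(_, spec)` just to discard the key.
fn total_weight(out: &BTreeMap<String, u32>) -> u32 {
    let mut total = 0;
    for weight in out.values() {
        total += weight;
    }
    total
}
```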

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
net-node: configs for the public Leios dev relay
Split the same way `mainnet.toml` + `follow-real-relay.toml` are: a
base config carrying the dev network's wire parameters (magic 164,
genesis 2026-05-29T00:00:00Z, slot length 1 s, Leios enabled), and a
per-node overlay that puts net-node in stake-0 follower mode against
`leios-node.play.dev.cardano.org:3001`.

Used both to surface and reproduce the at-tip catch-up wedge
described in this PR's "Known limitation" section, and as the
reproduction recipe in the relay-side bug report. The `no_fetch`
eb_txs policy in the overlay sidesteps the relay's RST on
MsgLeiosBlockTxsRequest (cardano-scaling/cardano-blueprint#68).

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
praos: update gap_between_ancestor_and_replay test assertion
The test's docstring already described the desired behavior — "should
return WaitingForBlocks so the gap blocks get fetched" — but the
assertion still matched the older `OrphanCandidate` path that the test
was written to deprecate.  The behavior flipped to WaitingForBlocks
once PR #931's `praos: fetch only the frontier gap during catch-up`
landed (and stayed there under this branch's
`praos: heal non-contiguous candidate chains` commit), so the
assertion is now actually exercised — update it to match.

Assert the four fields that matter for the contiguous-range fetch:
`ancestor == block 3 hash`, `anchor_point == block 3 point` (so the
BlockFetch range pins to a real block, not Origin), `missing` carries
the gap tip, and `tip_block_no` reflects the peer's announced height.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>