Home / Input Output / ouroboros-leios
65 commits this week Jun 01, 2026 - Jun 08, 2026
sim-core: update Praos lottery call site after main's API rename
PR #924 moved the lottery threshold from a free function
`lottery::rb_win_threshold(rate, stake)` returning an absolute count
in `[0, total_stake)` to a method
`LotteryParams::new(f).rb_win_threshold(stake, total_stake)` returning
a `[0, 2^64)` threshold (spec-faithful `φ(σ) = 1 − (1−f)^σ` scaled by
2^64).  net-rs/production.rs was updated on main, but sim-core was
missed because the merge auto-resolved here on an unrelated hunk
(PR #928's LeiosElectionInfo telemetry arm).

Switch the call to the new method form and pair it with a direct
`Rng::draw_u64` (uniform `[0, 2^64)`) instead of `draw_range(...,
total_stake)`, mirroring how net-rs/production.rs draws and compares.
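
A minimal sketch of why the draw source has to change together with the threshold scale. `SplitMix64`, the modulo-based `draw_range`, and the strict `<` comparison are assumptions for illustration; they are not the simulator's actual `Rng` or comparison.

```rust
/// Hypothetical stand-in for the simulator's RNG (SplitMix64 generator).
pub struct SplitMix64(pub u64);

impl SplitMix64 {
    /// Uniform draw over the full `[0, 2^64)` range.
    pub fn draw_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Draw in `[0, n)`; modulo bias is ignored in this sketch.
    pub fn draw_range(&mut self, n: u64) -> u64 {
        self.draw_u64() % n
    }
}

/// Assumed comparison shape: the slot is won when the draw is below the threshold.
pub fn wins(draw: u64, threshold: u64) -> bool {
    draw < threshold
}

/// Counts wins over `trials` draws produced by `draw`.
pub fn count_wins(trials: u32, threshold: u64, mut draw: impl FnMut() -> u64) -> u32 {
    (0..trials).filter(|_| wins(draw(), threshold)).count() as u32
}
```

With a threshold of about 5% of 2^64, a stale `draw_range(total_stake)` draw is always far below the threshold, so every trial wins; a full-range `draw_u64` wins about 5% of the time.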

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
net-core: trace Leios client EB/vote activity with full hashes
Add client-side tracing to the per-peer LeiosNotify and LeiosFetch
sub-tasks so a follower's logs can be cross-referenced against a peer's
(e.g. a relay's) server-side logs:

  - leios_notify: EB offered / EB txs offered (slot + full eb_hash),
    votes received (count; per-vote slot/eb_hash/voter_id/sig at debug),
    EB announcement (header size).
  - leios_fetch: requesting EB / EB received (slot, full eb_hash,
    manifest bytes); requesting EB txs / received (requested index
    count + first indices); request-failed warnings carry the same
    fields so a disconnect is attributable to a specific EB.

EB hashes are logged in full (32 bytes) — the natural correlation key
against a relay's logs.  info level for the per-EB events, debug for
per-vote detail.

Co-Authored-By: Claude Opus 4.8 <[email protected]>
shared-consensus: remove the now-dead VoteFetchPolicy
Votes are delivered inline (previous commit), so the vote offer→fetch
machinery has no callers.  Remove it across the workspace:

  - shared-consensus fetch.rs: drop the VoteFetchPolicy trait, its
    LowestRttFirst/BroadcastN/NoFetch impls, VoteCandidateLookup, and
    the CandidateTracker vote_offers / pending_vote_fetches fields plus
    note_vote_offered / vote_candidates / start_vote_fetch /
    finish_vote_fetch.  state_sizes() drops its two vote counters.
    VoteId stays as a generic (slot, voter_id) tuple (sim-rs keys its
    own vote state on it).
  - shared-consensus leios.rs: drop the vote_policy field, the
    with_fetch vote_policy param, and set_vote_policy.
  - net-node / sim-rs: drop the `votes` fetch-policy config field,
    into_vote_policy, and the set_vote_policy wiring; drop the
    fetch_policy.votes doc from mainnet.toml.

EB / EB-tx fetch policies are untouched.  Tests: shared-consensus 297,
net-core 322, net-node 112, net-cli 57 pass.

Co-Authored-By: Claude Opus 4.8 <[email protected]>
net-core: log the resolved peer IP per connection
Carry the concrete socket address the TCP connection landed on
(`resolved_addr`) on `Connection`/`DuplexConnection`, and log it with
the peer_id at connect time:

  peer connected (duplex) peer_id=peer-0
    host=leios-node.play.dev.cardano.org:3001 resolved_ip=52.29.179.71:3001

The host is often a round-robin DNS name, so each (re)connection can
land on a different backend.  Since the per-peer Leios trace logs are
keyed by peer_id, this one line lets a given EB's activity be
attributed to the specific IP that served it.

Co-Authored-By: Claude Opus 4.8 <[email protected]>
praos: fetch only the frontier gap during catch-up
A follower far behind a peer re-fetched the entire not-yet-validated
backlog on every ChainSync roll-forward: `issue_fetch_internal` built a
range `[anchor .. missing.last()]` and deduped only on the `to` endpoint,
while `in_flight` recorded only that single endpoint. Each new tip
therefore re-issued an overlapping range, so blocks_received grew ~30x
the chain length and block bandwidth climbed super-linearly during deep
catch-up.

Track BlockFetch ranges per block in `in_flight` (matching the per-block
removal already done in on_block_received), filter `missing` against it
so only the frontier gap is requested, and start the range at the first
still-needed block when an in-flight prefix was filtered.
on_block_fetch_failed now clears the whole [from,to] slot range so a
failed range can be retried.

Verified offline (local producer -> stake-0 follower): blocks_received
== validated == tip (1x) vs the prior 30x. New regression test
deep_catch_up_does_not_refetch_in_flight_blocks.
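
The per-block bookkeeping described above can be sketched as follows. This is an illustration under simplifying assumptions: blocks are identified by block number, `missing` is ascending, and the `InFlight` type and `next_range` name are hypothetical rather than the crate's API.

```rust
use std::collections::BTreeSet;

/// Tracks which blocks already have a BlockFetch request outstanding.
/// Blocks are plain block numbers here; the real code works on points/slots.
#[derive(Default)]
pub struct InFlight {
    blocks: BTreeSet<u64>,
}

impl InFlight {
    /// Returns the `(from, to)` range still worth requesting, or `None` when
    /// every missing block is already in flight. `missing` must be ascending.
    pub fn next_range(&mut self, missing: &[u64]) -> Option<(u64, u64)> {
        let needed: Vec<u64> = missing
            .iter()
            .copied()
            .filter(|b| !self.blocks.contains(b))
            .collect();
        let (from, to) = (*needed.first()?, *needed.last()?);
        // Record every requested block, not just the `to` endpoint.
        self.blocks.extend(needed);
        Some((from, to))
    }

    /// A delivered block is no longer in flight.
    pub fn on_block_received(&mut self, block: u64) {
        self.blocks.remove(&block);
    }

    /// A failed range clears all of its blocks so they can be retried.
    pub fn on_block_fetch_failed(&mut self, from: u64, to: u64) {
        self.blocks.retain(|b| *b < from || *b > to);
    }
}
```

After requesting blocks 1..=10, a new tip at 12 yields only the range (11, 12) instead of re-requesting 1..=12.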

Co-Authored-By: Claude Opus 4.8 <[email protected]>
leios: deliver votes inline, remove the offer→fetch round-trip
CIP-0164's prototype (cardano-blueprint leios-prototype) delivers Leios
votes inline in LeiosNotify rather than offering vote ids to be pulled
over LeiosFetch.  This reworks net-rs + shared-consensus to match and
makes the prototype vote shape drive quorum natively.

Wire (net-core):
  - msgLeiosBlockOffer = [2, point, eb_size]  (gained eb_size: word32)
  - msgLeiosVotes      = [4, [vote, ...]]      (replaces msgLeiosVotesOffer)
      vote = [slot, eb_hash, voter_id: word16, vote_signature: bool]
  Decoders now read the declared array length and skip trailing
  elements, so a future field addition degrades gracefully instead of
  desyncing the mux stream (the bug that dropped the dev-relay link).
  Captured dev-relay frames are pinned as codec test vectors.  Removed
  the LeiosFetch votes-request/delivery messages (gone from the spec).
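
A sketch of the "read the declared length, skip the rest" decoding shape for a single vote. It uses a hand-rolled, definite-length-only CBOR reader so that it is self-contained; the actual codec is built on minicbor, and the `Cursor` type and its methods are hypothetical names.

```rust
use std::convert::TryFrom;

/// Minimal CBOR reader covering only definite-length items (not a full codec).
pub struct Cursor<'a> {
    buf: &'a [u8],
    pos: usize,
}

#[derive(Debug, PartialEq)]
pub struct Vote {
    pub slot: u64,
    pub eb_hash: Vec<u8>,
    pub voter_id: u16,
    pub vote_signature: bool,
}

impl<'a> Cursor<'a> {
    pub fn new(buf: &'a [u8]) -> Self {
        Cursor { buf, pos: 0 }
    }

    fn byte(&mut self) -> Option<u8> {
        let b = *self.buf.get(self.pos)?;
        self.pos += 1;
        Some(b)
    }

    /// Reads an item head and returns (major type, argument).
    fn head(&mut self) -> Option<(u8, u64)> {
        let b = self.byte()?;
        let width = match b & 0x1f {
            info @ 0..=23 => return Some((b >> 5, info as u64)),
            24 => 1,
            25 => 2,
            26 => 4,
            27 => 8,
            _ => return None, // indefinite lengths are out of scope here
        };
        let mut arg = 0u64;
        for _ in 0..width {
            arg = (arg << 8) | self.byte()? as u64;
        }
        Some((b >> 5, arg))
    }

    fn take(&mut self, len: u64) -> Option<&'a [u8]> {
        let end = self.pos.checked_add(usize::try_from(len).ok()?)?;
        let bytes = self.buf.get(self.pos..end)?;
        self.pos = end;
        Some(bytes)
    }

    /// Reads a head and requires the given major type.
    fn expect(&mut self, major: u8) -> Option<u64> {
        match self.head()? {
            (m, arg) if m == major => Some(arg),
            _ => None,
        }
    }

    /// Skips one complete item, recursing into arrays, maps and tags.
    pub fn skip(&mut self) -> Option<()> {
        let (major, arg) = self.head()?;
        match major {
            2 | 3 => { self.take(arg)?; }
            4 => for _ in 0..arg { self.skip()?; },
            5 => for _ in 0..arg { self.skip()?; self.skip()?; },
            6 => self.skip()?,
            _ => {} // ints and simple values are fully consumed by `head`
        }
        Some(())
    }

    /// vote = [slot, eb_hash, voter_id, vote_signature, ...future fields]
    pub fn vote(&mut self) -> Option<Vote> {
        let len = self.expect(4)?;
        if len < 4 {
            return None;
        }
        let slot = self.expect(0)?;
        let hash_len = self.expect(2)?;
        let eb_hash = self.take(hash_len)?.to_vec();
        let voter_id = u16::try_from(self.expect(0)?).ok()?;
        let vote_signature = match self.byte()? {
            0xf4 => false,
            0xf5 => true,
            _ => return None,
        };
        for _ in 4..len {
            self.skip()?; // tolerate trailing elements instead of desyncing
        }
        Some(Vote { slot, eb_hash, voter_id, vote_signature })
    }
}
```

Because trailing elements are consumed rather than rejected, the cursor ends on the next item boundary even when a peer sends a longer vote array.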

shared-consensus:
  - New canonical types::Vote (re-exported); CBOR codec stays in the
    net I/O layer so this crate remains format-agnostic.
  - Deterministic u16<->node_id voter index (Elections::voter_index /
    voter_id_at: position in the sorted stake registry, identical on
    every node) lets inline votes resolve to a committee weight.
  - on_votes_received now takes Vec<Vote> and feeds aggregation
    directly (mocked bool signatures need no ledger validation step);
    removed on_votes_offered, LeiosEffect::FetchLeiosVotes /
    ValidateVotes, and the vote behaviour-offer hooks.
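
A sketch of a deterministic voter index of the kind described for `Elections::voter_index` / `voter_id_at`. It assumes node ids are strings and that the registry is sorted by node id; the real sort key and types are not stated in this message, and `VoterIndex` is a hypothetical name.

```rust
use std::convert::TryFrom;

/// Deterministic mapping between node ids and compact u16 voter indices.
pub struct VoterIndex {
    sorted_ids: Vec<String>,
}

impl VoterIndex {
    /// Builds the index from a stake registry given in any order.
    pub fn new<I: IntoIterator<Item = String>>(registry: I) -> Self {
        let mut sorted_ids: Vec<String> = registry.into_iter().collect();
        sorted_ids.sort();
        sorted_ids.dedup();
        VoterIndex { sorted_ids }
    }

    /// Position of `node_id` in the sorted registry, if it fits in a u16.
    pub fn voter_index(&self, node_id: &str) -> Option<u16> {
        let pos = self
            .sorted_ids
            .binary_search_by(|id| id.as_str().cmp(node_id))
            .ok()?;
        u16::try_from(pos).ok()
    }

    /// Inverse lookup: the node id stored at `index`.
    pub fn voter_id_at(&self, index: u16) -> Option<&str> {
        self.sorted_ids.get(usize::from(index)).map(String::as_str)
    }
}
```

Two nodes that load the same registry in different orders derive the same index for every pool, which is what lets a bare `voter_id: word16` on the wire resolve to a committee weight.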

Plumbing / node:
  - PeerEvent/NetworkEvent::LeiosVotesReceived carry Vec<Vote>; the
    coordinator dedups (peer, slot, eb_hash, voter_id) and re-injects
    into LeiosStore for epidemic gossip.  Self-vote production emits a
    structured Vote with the node's own index.  Removed the opaque
    VoteBody codec and the vote ledger-validation command/outcome.

Verified: shared-consensus 298, net-core 322, net-node 112, net-cli 57
tests pass.  Interop vs the live Leios dev relay (magic 164) no longer
drops on LeiosNotify decode.  A 30-node StakeCentile cluster reaches
stake-weighted LeiosQuorumReached purely from inline votes, no fetch.

Out of scope (follow-ups): LeiosFetch is a further CDDL revision behind
(endorser_block is now a {hash => size} map, done moved to [9],
block-txs gained bitmaps); the vestigial VoteFetchPolicy.

Co-Authored-By: Claude Opus 4.8 <[email protected]>
net: align LeiosFetch with the prototype CDDL
Bring the LeiosFetch mini-protocol up to the cardano-blueprint
leios-prototype CDDL, fixing the live `leios_fetch block: expected
bytes, got map` disconnect when fetching an EB from the dev relay.

Wire changes:
  - msgLeiosBlock = [1, endorser_block] where
    endorser_block = { tx_hash => tx_size } (a CBOR map, previously
    decoded as an opaque bytestring).  The codec now captures/splices
    the map's raw CBOR verbatim (minicbor input()/writer_mut()), so the
    `block: Vec<u8>` plumbing is unchanged — the bytes simply *are* the
    manifest map.
  - msgLeiosBlockTxs = [3, point, bitmaps, tx_list] (was [3, tx_list]);
    the server now echoes the request's point + bitmap.  Each tx in
    tx_list is carried as raw CBOR (opaque pass-through), so a
    structured `tx.tx` from a real peer round-trips.
  - Removed the block-range sub-protocol (msgLeiosBlockRangeRequest [6],
    msgLeiosNext/LastBlockAndTxsInRange [7]/[8], StBlockRange,
    fetch_block_range) — gone from the spec.

net-node: encode/decode_overflow_eb now (de)serialise the
`{ hash => size }` manifest map (sizes 0 on the produce path; the EB
blob's slot is dropped — it was never read).  Callers/tests updated.

Verified against the live Leios dev relay (magic 164): an EB is fetched,
the manifest decodes, an election is created (phase CertEligible), and
an EB-tx fetch is issued — zero CBOR decode errors.  Tests: net-core
318, net-node 112, shared-consensus 297, net-cli green.

Known follow-up: the relay sends a TCP RST shortly after our
MsgLeiosBlockTxsRequest (bitmap EB-tx fetch); no decode error on our
side, connection reconnects and resyncs.  Separate behavioural item.

Co-Authored-By: Claude Opus 4.8 <[email protected]>
shared-consensus: never anchor a BlockFetch range at Origin
A standard Cardano BlockFetch server resets the bearer when it
receives a MsgRequestRange whose lower bound is the genesis point:
Origin carries no block, so the range cannot be resolved.

On first connect the ChainSync intersection is at Origin, and both
FetchBlockRange emit sites could surface it as the range's `from`:

  - issue_fetch_internal used `anchor_point` (the intersection) as the
    lower bound, which is Some(Origin) for a genesis intersection.
  - the gap-bridge retry fell back to `unwrap_or(Point::Origin)` when
    there was no adopted tip.

Both now require a real block point: issue_fetch_internal falls back
to the oldest missing block when the anchor is Origin, and the
gap-bridge only fires when an adopted tip provides a lower bound.

Found via interop testing net-node against a live Leios dev relay
(leios-node.play.dev.cardano.org): the relay sent a TCP RST
immediately after our RequestRange(Origin, <first header>). With the
fix the node syncs cleanly (handshake -> ChainSync -> BlockFetch ->
validation, no resets). Mainnet never exercised this path because its
Byron-era genesis headers aren't parsed, so no fetch was ever issued
from Origin.
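
The two guards can be sketched as below. `Point`, `range_from` and `gap_bridge_range` are simplified, hypothetical stand-ins for the crate's types and emit sites, and `missing` is assumed to be ordered oldest first.

```rust
#[derive(Clone, Debug, PartialEq)]
pub enum Point {
    Origin,
    Block { slot: u64, hash: [u8; 32] },
}

impl Point {
    pub fn is_block(&self) -> bool {
        matches!(self, Point::Block { .. })
    }
}

/// Picks the lower bound of a BlockFetch range, never returning Origin.
pub fn range_from(anchor: Option<&Point>, missing: &[Point]) -> Option<Point> {
    match anchor {
        Some(p) if p.is_block() => Some(p.clone()),
        // Origin (or no anchor): fall back to the oldest missing real block.
        _ => missing.iter().find(|p| p.is_block()).cloned(),
    }
}

/// The gap-bridge only fires when an adopted tip provides a real lower bound.
pub fn gap_bridge_range(adopted_tip: Option<&Point>, best_tip: &Point) -> Option<(Point, Point)> {
    match adopted_tip {
        Some(from) if from.is_block() => Some((from.clone(), best_tip.clone())),
        _ => None,
    }
}
```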

Co-Authored-By: Claude Opus 4.8 <[email protected]>
shared-consensus, net-node: spec-faithful Praos lottery threshold
Replace the linear approximation `(stake * f) as u64` with the exact
Praos formula `φ(σ) = 1 − (1 − f)^σ` quantized to a `[0, 2^64)` u64
threshold, and switch the call site to integer comparison.  The old
truncation locked any node with `stake × f < 1` out of the lottery
entirely — e.g. an equal-stake 100-node cluster at default
`total_stake = 1000, f = 0.05` had every node at threshold zero, so no
RBs were ever produced.

Internally `φ` is computed via `exp_m1` to keep full f64 precision
when `φ` is small (the low-stake regime); quantization to u64 is the
only place precision is lost, and is well below the `1/2^64` draw
resolution.  Integer-comparison shape is forward-compatible with a
real VRF — when one lands, only the draw source changes.
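
A sketch of the threshold computation under the formula above, using the standard library's `ln_1p` and `exp_m1`. The free-function signature and the `linear_threshold` helper are assumptions for illustration, not the `LotteryParams` API.

```rust
/// 2^64 as an f64 (exactly representable).
const TWO_POW_64: f64 = 18_446_744_073_709_551_616.0;

/// Praos leader threshold phi(sigma) = 1 - (1 - f)^sigma, scaled to `[0, 2^64)`.
pub fn rb_win_threshold(f: f64, stake: u64, total_stake: u64) -> u64 {
    if total_stake == 0 {
        return 0;
    }
    let sigma = stake as f64 / total_stake as f64;
    // (1 - f)^sigma = exp(sigma * ln(1 - f)), so phi = -expm1(sigma * ln1p(-f)).
    let phi = -(sigma * (-f).ln_1p()).exp_m1();
    // Float-to-int `as` casts saturate, so phi = 1 maps to u64::MAX.
    (phi * TWO_POW_64) as u64
}

/// The linear approximation this commit replaces, kept for comparison.
pub fn linear_threshold(f: f64, stake: u64) -> u64 {
    (stake as f64 * f) as u64
}
```

For the equal-stake example (stake 10 of 1000, `f = 0.05`), the linear form truncates 0.5 to zero while the scaled `φ` stays strictly positive.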

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
net-node: refresh leios quorum tests for the ceil() threshold
Two `consensus::leios::tests` fixtures (`quorum_emits_cert_formed_telemetry`,
`eb_certifiable_slot_targets_specific_hash`) used 7 voters in
`EveryoneVotes` mode, which cleared the old floor-based quorum
threshold but not the ceil-based one introduced in `33db4bba0`.
Both registries have 10 unit-weight pools, so the τ = 0.75 quorum is
now `ceil(7.5) = 8` — bump the test voter set to 8 and update the
pinned `voted_weight` / `voters` assertions to match.
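
The arithmetic behind the fixture change, as a small sketch. The function names are hypothetical, and the `>=` comparison is inferred from the fact that 7 voters cleared the floor-based threshold of 7.

```rust
/// Floor-based quorum weight (the older rule, per this message).
pub fn quorum_floor(total_weight: u64, tau: f64) -> u64 {
    (total_weight as f64 * tau).floor() as u64
}

/// Ceil-based quorum weight: the smallest integer weight >= tau * total.
pub fn quorum_ceil(total_weight: u64, tau: f64) -> u64 {
    (total_weight as f64 * tau).ceil() as u64
}

/// Assumed comparison: quorum is reached once voted weight meets the threshold.
pub fn reaches_quorum(voted_weight: u64, threshold: u64) -> bool {
    voted_weight >= threshold
}
```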

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
praos: rate-limit the chain_tree gap-bridge to stop re-fetch amplification
retry_select_chain bridges [adopted_tip -> best_tip] gated only on the
moving `gap_point`. As a live peer keeps producing, best_tip advances
every tick, so the per-block in-flight dedup never trips and the whole
range re-fetches continuously (observed ~44 blocks/s of re-fetch with the
tip frozen). Rate-limit the bridge per adopted tip (BRIDGE_COOLDOWN,
10s), reset as soon as the adopted tip advances (real progress).
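
A sketch of a per-adopted-tip cooldown of this shape. `BridgeLimiter` and `allow` are hypothetical names, and tips are plain block numbers here; only the `BRIDGE_COOLDOWN` name and its 10s value come from this message.

```rust
use std::time::{Duration, Instant};

pub const BRIDGE_COOLDOWN: Duration = Duration::from_secs(10);

/// Rate-limits the gap-bridge per adopted tip.
#[derive(Default)]
pub struct BridgeLimiter {
    last: Option<(u64, Instant)>,
}

impl BridgeLimiter {
    /// Returns true when a bridge fetch may be issued for `adopted_tip` at `now`.
    pub fn allow(&mut self, adopted_tip: u64, now: Instant) -> bool {
        if let Some((tip, at)) = self.last {
            // Same adopted tip and still cooling down: suppress the re-fetch.
            if tip == adopted_tip && now.saturating_duration_since(at) < BRIDGE_COOLDOWN {
                return false;
            }
        }
        // Either the tip advanced (real progress) or the cooldown expired.
        self.last = Some((adopted_tip, now));
        true
    }
}
```

Keying the cooldown on the adopted tip rather than on the moving `gap_point` means a peer that keeps producing cannot re-arm the bridge on every tick.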

(An earlier version of this commit also tried to "un-starve" the oldest
in-flight gap block on a short timer; that was reverted — it can't tell a
superseded fetch from a slow-but-in-progress one, so during normal
catch-up it re-issued the whole range and amplified block traffic ~47x.
The 15s IN_FLIGHT_TTL already re-tries genuinely stuck blocks.)

308 shared-consensus tests pass.

Co-Authored-By: Claude Opus 4.8 <[email protected]>
coordinator: throttle re-intersection per address to stop fork-loop spin
A peer stuck on an unreconcilable fork re-intersects in a tight loop
(the original relay wedge spun at ~1.4 orphan/re-intersect per second).
The per-peer-id orphan cooldown in shared-consensus doesn't survive the
reconnect handovers that assign a fresh PeerId each time.

Rate-limit `NetworkCommand::ReIntersect` in the coordinator keyed by peer
*address* (stable across reconnects) with exponential backoff (1s → 30s).
All re-intersections — within a connection and across reconnect handovers
— flow through this command, so one throttle covers both. The first
attempt for an address always passes (legitimate single re-intersects are
never blocked); rapid repeats back off; a peer that goes quiet resets.
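
A sketch of an address-keyed throttle with exponential backoff. The 1s and 30s bounds come from this message; the 60s quiet-reset window, the doubling rule, and all type and method names are assumptions.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;
use std::net::SocketAddr;
use std::time::{Duration, Instant};

const MIN_BACKOFF: Duration = Duration::from_secs(1);
const MAX_BACKOFF: Duration = Duration::from_secs(30);
/// Assumed: an address that stays quiet this long starts over at MIN_BACKOFF.
const QUIET_RESET: Duration = Duration::from_secs(60);

struct Backoff {
    last_allowed: Instant,
    delay: Duration,
}

/// Throttles re-intersection per peer address (stable across reconnects).
#[derive(Default)]
pub struct ReIntersectThrottle {
    by_addr: HashMap<SocketAddr, Backoff>,
}

impl ReIntersectThrottle {
    /// Returns true when a re-intersection for `addr` may proceed at `now`.
    pub fn allow(&mut self, addr: SocketAddr, now: Instant) -> bool {
        match self.by_addr.entry(addr) {
            // The first attempt for an address always passes.
            Entry::Vacant(slot) => {
                slot.insert(Backoff { last_allowed: now, delay: MIN_BACKOFF });
                true
            }
            Entry::Occupied(mut slot) => {
                let state = slot.get_mut();
                let since = now.saturating_duration_since(state.last_allowed);
                if since < state.delay {
                    return false; // rapid repeat: still backing off
                }
                state.delay = if since >= QUIET_RESET {
                    MIN_BACKOFF
                } else {
                    std::cmp::min(state.delay * 2, MAX_BACKOFF)
                };
                state.last_allowed = now;
                true
            }
        }
    }
}
```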

Measured on the round-robin divergent-backend repro: orphan/re-intersect
events dropped from ~26/90s (spin) to 3, and the follower makes progress
instead of freezing.

Co-Authored-By: Claude Opus 4.8 <[email protected]>
praos: trust the ChainSync intersection anchor as the common ancestor
After reconnecting to a peer on a divergent fork (the round-robin
relay-backend handover), select_chain could perpetually fail to find a
common ancestor: the per-peer fragment is too short to reach the fork
point and `adopted_ancestors` may be truncated by pruning, so the
strictly-better peer is classified OrphanCandidate forever and the tip
freezes (the original relay wedge: "orphan — no common ancestor found",
looping on re-intersection).

`find_intersection` already probes back to genesis, so its anchor is the
authoritative common ancestor. Add a final fallback in select_chain: when
no ancestor is found via the fragment, trust the peer's intersection
anchor if it is a real block we hold and the resulting reorg is within k.
The bounded (<= k) reorg then proceeds via the existing
WaitingForBlocks/range-fetch path. Reorgs deeper than k are refused
(Praos finality — settled blocks). Origin/genesis anchors are excluded:
a switch sharing only genesis needs a full re-sync from block 1 that the
range fetch can't anchor at Origin, and isn't the round-robin case
(those backends share real history at the fork point).
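
The fallback's decision rule, reduced to a sketch. `Anchor`, `AnchorVerdict` and `anchor_fallback` are hypothetical names, and measuring reorg depth as a block-number difference is an assumption.

```rust
#[derive(Debug, PartialEq)]
pub enum AnchorVerdict {
    /// Use the anchor as the common ancestor; `depth` is the reorg depth in blocks.
    Reorg { depth: u64 },
    /// Leave the candidate classified as an orphan.
    Orphan,
}

pub struct Anchor {
    /// `None` models an Origin/genesis anchor.
    pub block_number: Option<u64>,
    /// Whether we hold the anchor block locally.
    pub held_locally: bool,
}

/// Decides whether the ChainSync intersection anchor may serve as the ancestor.
pub fn anchor_fallback(anchor: &Anchor, adopted_tip_number: u64, k: u64) -> AnchorVerdict {
    let number = match anchor.block_number {
        Some(n) if anchor.held_locally => n,
        // Origin anchors and blocks we do not hold are never trusted.
        _ => return AnchorVerdict::Orphan,
    };
    let depth = adopted_tip_number.saturating_sub(number);
    if depth <= k {
        AnchorVerdict::Reorg { depth }
    } else {
        AnchorVerdict::Orphan // deeper than k: settled blocks
    }
}
```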

Tests: select_chain_trusts_intersection_anchor_within_k (verified to fail
without the fallback) and select_chain_refuses_reorg_deeper_than_k. 308
shared-consensus tests pass.

Co-Authored-By: Claude Opus 4.8 <[email protected]>
behaviour: add DropInboundPeers — server-side connection-reset chaos
Adds a config-driven behaviour that randomly resets accepted (inbound)
peer connections, so the remote reconnects and re-runs ChainSync
intersection from scratch. Mimics a relay that RSTs inbound peers — the
reconnect-handover trigger for deep-rollback recovery testing. Compose
with DeepReorg so each reconnect re-intersects against a reorged chain.

- shared-consensus:
  - Behaviour::drop_inbound_peers(slot) -> bool hook (default false;
    composite ORs children); LeiosState::ask_drop_inbound_peers.
  - BehaviourSpec::DropInboundPeers { probability } + registry wiring +
    behaviours::DropInboundPeers (deterministic per-(seed, slot) draw).
- net-core: NetworkCommand::DropInboundPeers; coordinator resets every
  accepted peer (ip_guard.is_some()) via PeerCommand::Disconnect (relies
  on the mux-teardown fix to close the socket promptly).
- net-node: Consensus::should_drop_inbound_peers; the slot loop issues
  the command when the behaviour fires.

Config: `[behaviour] kind="drop-inbound-peers", probability=P`.
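
A sketch of a deterministic per-(seed, slot) draw. The SplitMix64-style mixing function is an assumed hash chosen for illustration; the behaviour's real draw is not shown in this message, and `any_child_drops` only models the composite OR.

```rust
/// Mixes (seed, slot) into a u64 with the SplitMix64 finalizer (an assumed hash).
fn mix(seed: u64, slot: u64) -> u64 {
    let mut z = seed ^ slot.wrapping_mul(0x9E37_79B9_7F4A_7C15);
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}

pub struct DropInboundPeers {
    pub seed: u64,
    pub probability: f64,
}

impl DropInboundPeers {
    /// Same (seed, slot) always gives the same answer, so runs are reproducible.
    pub fn drop_inbound_peers(&self, slot: u64) -> bool {
        // Map the top 53 bits of the hash to a float in [0, 1).
        let u = (mix(self.seed, slot) >> 11) as f64 / (1u64 << 53) as f64;
        u < self.probability
    }
}

/// A composite behaviour ORs its children.
pub fn any_child_drops(children: &[DropInboundPeers], slot: u64) -> bool {
    children.iter().any(|c| c.drop_inbound_peers(slot))
}
```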

Verified live (producer with DeepReorg + DropInboundPeers): the follower
disconnects promptly (mux error, not 60s keepalive) and reconnects each
drop, re-intersecting against the reorged chain.

Co-Authored-By: Claude Opus 4.8 <[email protected]>
mux: RunningMux::abort must cancel the muxer/demuxer, not just the supervisor
`RunningMux::abort()` only aborted the supervisor task. The supervisor
merely *watches* the muxer (egress/writer) and demuxer (ingress/reader)
and aborts the survivor when one fails — it doesn't own the bearer. So
aborting only the supervisor left the muxer and demuxer running, holding
the TCP read/write halves open: the peer never saw EOF and only
disconnected via the 60s keepalive timeout.

Store the muxer/demuxer AbortHandles in RunningMux and cancel all three
in abort(), so tearing a connection down closes the socket promptly and
the remote observes the disconnect immediately. Surfaced while building
a server-side peer-reset behaviour, but affects every abort-driven
teardown (peer Disconnect, shutdown, supervisor cleanup).

Co-Authored-By: Claude Opus 4.8 <[email protected]>
behaviour: add DeepReorg — deliberate deep self-reorg for chaos testing
Adds a config-driven producer-side behaviour that periodically abandons
a chain suffix and forks, so downstream followers must recover from a
deep RollBackward. Reusable as deep-reorg / reorg-resilience tooling
(and the harness for the deep-rollback recovery work).

- shared-consensus:
  - ChainTree::remove_above(block_number) — drop an abandoned suffix and
    recompute best_tip.
  - PraosState::force_rollback(depth) — re-anchor the adopted chain
    `depth` blocks back, prune the suffix, emit InjectRollback so the
    served chain mirrors it. No-op when the chain is shorter than
    `depth` or there is no tip. Unit-tested.
  - Behaviour::praos_reorg(slot) -> Option<depth> hook (default None;
    composite returns first Some); LeiosState::ask_praos_reorg consults it.
  - BehaviourSpec::DeepReorg { every_slots, depth } + registry wiring +
    behaviours::DeepReorg (fires once per `every_slots`-aligned slot).

- net-node: PraosConsensus::force_rollback diffuses the rollback;
  Consensus::maybe_force_reorg consults the behaviour each slot; the main
  slot loop calls it before production so the producer forks.

Config: `[behaviour] kind="deep-reorg", every_slots=N, depth=D`.
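
A sketch of the trigger and the suffix removal. The `Chain` type models a contiguous chain as a set of block numbers, firing on slots that are multiples of `every_slots` is an assumption about what "aligned" means, and the no-op condition (chain not longer than `depth`) is likewise assumed.

```rust
use std::collections::BTreeSet;

/// Fires on slots aligned to `every_slots`, returning the reorg depth.
pub fn praos_reorg(slot: u64, every_slots: u64, depth: u64) -> Option<u64> {
    if every_slots != 0 && slot % every_slots == 0 {
        Some(depth)
    } else {
        None
    }
}

/// A contiguous chain reduced to its block numbers.
#[derive(Default)]
pub struct Chain {
    numbers: BTreeSet<u64>,
}

impl Chain {
    pub fn from_numbers<I: IntoIterator<Item = u64>>(numbers: I) -> Self {
        Chain { numbers: numbers.into_iter().collect() }
    }

    pub fn best_tip(&self) -> Option<u64> {
        self.numbers.iter().next_back().copied()
    }

    /// Drops every block above `block_number`; returns how many were removed.
    pub fn remove_above(&mut self, block_number: u64) -> usize {
        match block_number.checked_add(1) {
            Some(first_removed) => self.numbers.split_off(&first_removed).len(),
            None => 0,
        }
    }

    /// Re-anchors `depth` blocks back; no-op with no tip or too short a chain.
    pub fn force_rollback(&mut self, depth: u64) -> Option<u64> {
        let tip = self.best_tip()?;
        if depth == 0 || (self.numbers.len() as u64) <= depth {
            return None;
        }
        let new_tip = tip - depth;
        self.remove_above(new_tip);
        Some(new_tip)
    }
}
```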

303 shared-consensus tests pass. Verified live (producer+follower): the
behaviour fires and forks; an honest follower recovers cleanly from
single-producer deep reorgs (the relay wedge needs additional
conditions — multi-peer/reconnect — still under investigation).

Co-Authored-By: Claude Opus 4.8 <[email protected]>
praos: heal non-contiguous candidate chains instead of looping on orphan
A passive follower could wedge permanently once at the live tip: after a
deep rollback the peer re-announces its chain, but our cached copy has a
gap (the peer fragment skipped a header, and that gap block was never
fetched because it never appeared in `missing` — it isn't in the
fragment).  `select_chain` then classified the strictly-better candidate
as a non-contiguous / fork-mismatch orphan, cleared the fragment, and
requested re-intersection — which the peer answers by rolling back to our
tip and re-announcing, looping ~1.4x/s with the tip frozen.

Primary fix (gap-fill): when a strictly-better candidate's cached chain
is non-contiguous, fetch the contiguous range [common ancestor ->
candidate tip] from the announcing peer instead of giving up.  A range
request needs only the two endpoints; the peer streams every intermediate
block, healing the gap so a later pass switches.  Genesis-rooted
ancestors still fall back to the orphan verdict (can't anchor a range at
Origin).
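
The gap-fill decision can be sketched as below. `Header`, `Action` and `plan` are hypothetical, single-byte hashes stand in for real block hashes, and a `None` ancestor models a genesis-rooted (Origin) ancestor.

```rust
#[derive(Clone, Debug, PartialEq)]
pub struct Header {
    pub number: u64,
    pub hash: u8,
    pub prev_hash: u8,
}

#[derive(Debug, PartialEq)]
pub enum Action {
    /// The cached chain links up; the switch can proceed.
    Switch,
    /// Heal the gap: request the whole range between the two endpoints.
    FetchRange { from: u8, to: u8 },
    /// Nothing to anchor a range on; keep the orphan verdict.
    Orphan,
}

/// True when every header links to its predecessor, starting at `ancestor`.
pub fn is_contiguous(ancestor: &Header, chain: &[Header]) -> bool {
    let mut prev = ancestor;
    for h in chain {
        if h.prev_hash != prev.hash || h.number != prev.number + 1 {
            return false;
        }
        prev = h;
    }
    true
}

/// Chooses what to do with a strictly-better candidate's cached chain.
pub fn plan(ancestor: Option<&Header>, cached: &[Header]) -> Action {
    let tip = match cached.last() {
        Some(t) => t,
        None => return Action::Orphan,
    };
    match ancestor {
        // Genesis-rooted ancestor: a range cannot be anchored at Origin.
        None => Action::Orphan,
        Some(a) if is_contiguous(a, cached) => Action::Switch,
        Some(a) => Action::FetchRange { from: a.hash, to: tip.hash },
    }
}
```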

Secondary safety net (fork-tip prune): the periodic gap-bridge could also
fixate on an abandoned fork tip left far ahead in chain_tree after the
rollback — no connected peer offers it, so the bridge re-issued a peerless
(peer_count=0) no-op fetch forever.  `ChainTree::remove_fork_tip` drops
such an unreachable best_tip and recomputes best_tip; retry_select_chain
prunes rather than emit a peerless fetch.

Tests: select_chain_heals_noncontiguous_cached_candidate_via_range_fetch
and retry_prunes_unreachable_best_tip_instead_of_peerless_fetch (both
verified to fail without the respective change).  300 pass.

Co-Authored-By: Claude Opus 4.8 <[email protected]>