ouroboros-leios-sim
185 commits this week Apr 30, 2026 - May 07, 2026
net-rs: fix fork-mismatch sticking in chain selection
When the contiguity guard detects that all replay blocks have bodies
but chain_tree.ancestors() doesn't reach the common ancestor, this
means the replay chain goes through a different fork than the ancestor
(stale PeerChain entries from an abandoned fork mixed with new ones).

Previously this returned WaitingForBlocks and issued a range fetch
that could never succeed (no blocks connect two different forks),
causing nodes to loop forever on the same failing fetch.

Now returns OrphanCandidate instead, and the OrphanCandidate handler
always clears PeerChain entries and requests re-intersection. This
forces ChainSync to rebuild from a fresh intersection point,
resolving the stale-entry contamination.

Cluster test (p=0.2, 25 nodes, 20 min): 24/25 nodes at tip with
289 blocks and 5104 fork switches, zero gap-to-ancestor events.
Previous behavior: 6-10 permanently stuck nodes.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
net-rs: pull-model TxSubmission with tx validation
Replace the push-based SubmitTransaction broadcast (which flooded
per-peer command channels, causing peer evictions) with a demand-driven
pull model: TxSubmission clients signal TxsRequested when the peer's
server asks for tx IDs; the application then peeks the mempool and
responds with ProvideTxs routed to that specific peer.

Also adds pre-mempool tx validation: received transactions go through
a concurrent validation delay (tx_validation_ms) before entering the
mempool, modeling Cardano Phase 1 validation.

net-core changes:
- run_client takes optional request_sender for pull signaling
- spawn_txsubmission bridges requests as PeerEvent::TxsRequested
- SubmitTransaction replaced with ProvideTxs in commands/events
- Coordinator routes ProvideTxs to specific peer (no broadcast)

net-node changes:
- Mempool.peek_up_to for non-destructive reads (tx dissemination)
- spawn_tx_validator with semaphore-gated concurrent validation
- Tx generator only pushes to mempool (no network broadcast)
- Main loop handles TxsRequested → peek mempool → ProvideTxs

Cluster-verified: 25 nodes, p=0.05, tx_rate=2.0 — zero peer evictions,
100% EB propagation, votes flooding, quorum reached on 2/4 EBs.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
net-rs: receivers re-serve EB txs via mempool resolver
Adds a TxBodyResolver trait to net-core. LeiosStore now has two paths
for MsgLeiosBlockTxsRequest: block_txs (full bodies, producer side)
and eb_tx_hashes + resolver (manifest only, receiver side). The
receiver path drops indices the resolver cannot supply, so partial
responses are now first-class.

net-node implements MempoolTxBodyResolver over SharedMempool and
threads it through network::start into the coordinator's
LeiosStore. After fetching and decoding an EB, LeiosConsensus emits
RecordLeiosEbManifest so the coordinator's store can serve downstream
peers' bitmap requests by resolving each tx_hash against the mempool.

This closes the epidemic-flood gap: receivers can now satisfy
MsgLeiosBlockTxsRequest for any EB whose manifest they have cached,
without keeping a duplicate copy of the bodies.

Tests cover: producer path takes precedence, resolver fallback,
resolver partial response, server-handler wire integration, coordinator
RecordLeiosEbManifest dispatch, and mempool get_body_by_id round-trip.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
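
The receiver-side resolver path described here can be sketched as
follows — a minimal stand-in with u64 hashes and a map-backed resolver;
the actual TxBodyResolver trait and MempoolTxBodyResolver signatures in
net-core/net-node may differ:

```rust
use std::collections::HashMap;

/// Stand-in for the net-core TxBodyResolver trait (u64 replaces a real
/// tx hash type; signature is an assumption).
trait TxBodyResolver {
    fn resolve(&self, tx_hash: u64) -> Option<Vec<u8>>;
}

/// Map-backed resolver, a stand-in for MempoolTxBodyResolver.
struct MapResolver(HashMap<u64, Vec<u8>>);

impl TxBodyResolver for MapResolver {
    fn resolve(&self, tx_hash: u64) -> Option<Vec<u8>> {
        self.0.get(&tx_hash).cloned()
    }
}

/// Receiver-side path: serve the requested manifest indices, dropping
/// any index the resolver cannot supply, so a partial response goes
/// out as-is instead of failing the whole request.
fn serve_from_manifest<R: TxBodyResolver>(
    eb_tx_hashes: &[u64],
    requested_indices: &[usize],
    resolver: &R,
) -> Vec<(usize, Vec<u8>)> {
    requested_indices
        .iter()
        .filter(|&&i| i < eb_tx_hashes.len())
        .filter_map(|&i| resolver.resolve(eb_tx_hashes[i]).map(|body| (i, body)))
        .collect()
}
```

Indices the resolver cannot supply are simply absent from the result,
which is what makes partial responses first-class.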
net-rs: spec-faithful WFA+LS voting integration
Splits CommitteeSelection::WfaLs into per-epoch persistent voters and
per-EB non-persistent voters, each backed by stake-weighted lotteries.
Vote bodies carry no explicit weight; aggregators derive weight from
external state, mirroring CIP-0164.

WfaLs now has { persistent_voters, non_persistent_voters } (defaults
480 + 120, matching sim-rs e30087cdf). Per-startup wFA committee is
allocated identically by every node from the stake registry + a
shared seed (genesis_time_unix), so each node knows its own seat
count and every other pool's without communication.

Per-EB NPV: each pool computes a deterministic eligibility signature
from (voter_id, eb_hash, eb_slot) — modeling a CIP-0164 VRF output —
and seeds a per-stake-unit Bernoulli lottery from it. The signature
is what travels on the wire; the count of wins is reconstructed
independently by every aggregator from the signature plus the
voter's ledger-resolved stake.

A pool may emit up to two bodies per EB: one PV (if it holds ≥1 seat
in the persistent committee) and one NPV (if it won ≥1 lottery trial).

Quorum threshold is now weight-based:
  Σ weight ≥ quorum_weight_fraction × expected_committee_size
where weight is committee[voter_id] for PV and count_npv_wins(...) for
NPV. expected_committee_size = Σ committee_seats + n_npv (e.g. 600
under defaults). EveryoneVotes / StakeCentile keep simpler unit-weight
semantics with no NPV path.

VoteDecision enum and decide_vote method removed; replaced by per-mode
committee construction at startup plus signature-driven NPV at vote
time. Telemetry: voted_stake → voted_weight on LeiosQuorumReached and
LeiosElectionExpired.

Cluster-verified: WfaLs at 25-node uniform stake produces 19 PV seats
per node (480/25), expected committee 600, quorum at 450 reached
reliably; RbCertifiedEb fires.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
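
The weight-based quorum rule can be sketched as a pure check. The enum
and names below are illustrative, not the actual net-rs types, and the
0.75 fraction is an assumption consistent with quorum at 450 of an
expected committee of 600:

```rust
/// Illustrative vote weight: PV votes weigh their persistent-committee
/// seat count, NPV votes weigh their reconstructed lottery-win count.
enum VoteWeight {
    Persistent { seats: u64 },
    NonPersistent { npv_wins: u64 },
}

fn weight(v: &VoteWeight) -> u64 {
    match v {
        VoteWeight::Persistent { seats } => *seats,
        VoteWeight::NonPersistent { npv_wins } => *npv_wins,
    }
}

/// Σ weight ≥ quorum_weight_fraction × expected_committee_size
fn quorum_reached(
    votes: &[VoteWeight],
    quorum_weight_fraction: f64,
    expected_committee_size: u64,
) -> bool {
    let total: u64 = votes.iter().map(weight).sum();
    total as f64 >= quorum_weight_fraction * expected_committee_size as f64
}
```

At 25-node uniform stake (19 PV seats each), 24 of 25 PV bodies carry
24 × 19 = 456 weight, clearing a 450 threshold.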
net-rs: make LeiosStore stats logging configurable, default off
Add a stats_log_interval knob to LeiosStore: when non-zero, every Nth
bump_version logs the current map sizes (blocks/block_txs/eb_tx_hashes/
votes/notifications/max_slot/cutoff). Default 0 disables.

Plumb through CoordinatorConfig.leios_store_stats_log_interval and a
matching net-node config field so it can be enabled from TOML or
--node-set transactions.leios_store_stats_log_interval=50.

Useful for memory-leak diagnostics — confirmed slot-window retention
is working as designed (entries evicted past max_slot - retention).

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
net-rs: refactor validator into Ledger trait + sequential actor
The previous validator was a stateless stub: each validate_block call
spawned a detached task that slept and reported success. That can't
host a real validating ledger, where every apply mutates internal
state and depends on the previous apply having completed.

Replaces it with:

- A `Ledger` trait (`apply` / `rollback` / `tip`), `#[async_trait]` so
  `Box<dyn Ledger>` works for runtime selection. The target real impl
  is Acropolis (message-based, async Go/NoGo voting), so the trait is
  async-first.
- `LedgerCommand` (Apply / Rollback) and `LedgerOutcome` (Applied /
  RolledBack / ApplyFailed / RollbackFailed).
- A single long-lived actor task that owns a `Box<dyn Ledger>` and
  processes one `LedgerCommand` at a time. Strict sequential order is
  enforced by construction — the consensus layer no longer needs to
  serialize calls.
- `FakeLedger`: stateless stand-in matching today's behaviour. `apply`
  sleeps for the configured per-block delay (preserving the cluster
  test's wall-clock validation cost) and updates `head`. `rollback`
  updates `head` with no I/O.
- A temporary `ValidationComplete` shim wraps the actor's outcome
  receiver so the consensus layer's existing `on_validation_complete`
  entry point keeps working. This shim disappears once consensus
  consumes `LedgerOutcome` directly (rollback + failure outcomes).

Adds `async-trait` to net-node deps.

Tests:
- delay_computation (regression on the original formula)
- validate_block_completes
- apply_then_apply_processes_in_order (the critical sequential invariant)
- fake_ledger_tracks_head_through_apply_and_rollback
- apply_failure_reported_and_actor_continues (ApplyFailed surfaces and
  the actor keeps draining its queue)

No consensus or main.rs changes — call sites still see the same
`Validator::new` / `validate_block` / `ValidationComplete` shape. 25-node
cluster smoke test reaches block 12 with all nodes within lag 1, no
behavioural regression. 62 net-node tests + 298 net-core tests pass.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
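
The actor's sequential invariant can be illustrated with a plain
std::thread + mpsc sketch — the real code is tokio-based with an
#[async_trait] Ledger, so treat the names and signatures below as
stand-ins for the shape described above:

```rust
use std::sync::mpsc;
use std::thread;

enum LedgerCommand { Apply(u64), Rollback(u64) }

#[derive(Debug, PartialEq)]
enum LedgerOutcome { Applied(u64), RolledBack(u64) }

trait Ledger: Send {
    fn apply(&mut self, block: u64) -> LedgerOutcome;
    fn rollback(&mut self, to: u64) -> LedgerOutcome;
}

/// Stateless stand-in: only tracks `head`, like the FakeLedger above
/// (minus the configured per-block sleep).
struct FakeLedger { head: u64 }

impl Ledger for FakeLedger {
    fn apply(&mut self, block: u64) -> LedgerOutcome {
        self.head = block;
        LedgerOutcome::Applied(block)
    }
    fn rollback(&mut self, to: u64) -> LedgerOutcome {
        self.head = to;
        LedgerOutcome::RolledBack(to)
    }
}

/// Single long-lived actor owning the ledger: strict sequential order
/// is enforced by construction — one command at a time, outcomes
/// reported in the same order.
fn spawn_ledger_actor(
    mut ledger: Box<dyn Ledger>,
) -> (mpsc::Sender<LedgerCommand>, mpsc::Receiver<LedgerOutcome>) {
    let (cmd_tx, cmd_rx) = mpsc::channel::<LedgerCommand>();
    let (out_tx, out_rx) = mpsc::channel();
    thread::spawn(move || {
        for cmd in cmd_rx {
            let outcome = match cmd {
                LedgerCommand::Apply(b) => ledger.apply(b),
                LedgerCommand::Rollback(t) => ledger.rollback(t),
            };
            if out_tx.send(outcome).is_err() {
                break; // receiver dropped: shut the actor down
            }
        }
    });
    (cmd_tx, out_rx)
}
```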
net-rs: revert from-endpoint check in FetchBlockRange routing
The fragment.contains(&from) check added in 8065d81cc causes a
fetch-fail-retry loop: after a block is fetched, its point is
removed from all peer fragments (coordinator line 326). If a
later FetchBlockRange uses that fetched block as `from`, no peer's
fragment contains it, so the fetch fails immediately and retries
forever.

Revert to checking only `to`. The server-side get_range already
handles unknown `from` by returning the prefix up to `to` — this
is intentional fork-aware behavior (chain_store.rs line 235-240).

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
net-rs: split consensus into praos/leios submodules
Split the monolithic consensus.rs into consensus/{mod,praos,leios}.rs so
that Praos longest-chain logic and Leios EB/vote handling are logically
separate. mod.rs is a thin facade dispatching events to either sublayer;
PraosConsensus is the old struct verbatim minus the Leios match arms;
LeiosConsensus now owns the offer-to-fetch routing and keeps its own
in-flight map instead of sharing Praos's via a synthetic-key hack.
No behaviour change; adds 4 tests for LeiosConsensus routing.
net-rs: make telemetry sinks async to propagate backpressure
HttpEventSink::emit and HttpStatsSink::emit each spawned a fresh tokio
task per event, with no bound on in-flight tasks. Under heavy event
load (e.g. once the TxSubmission codec fix made tx propagation actually
work), spawn rate exceeded drain rate and tasks accumulated unboundedly,
each pinning a JSON payload, a reqwest::Client clone, and the in-flight
POST future.

Switch the EventSink/StatsSink traits to async (#[async_trait]) and
have HTTP sinks .await the POST inline. record() and emit_stats() are
now async; record_network_event() too. A slow aggregator now backpressures
the caller chain naturally instead of leaking spawned tasks.

New regression test (http_event_sink_does_not_spawn_per_emit) sanity-checks
that emit doesn't leave background tasks behind.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
net-rs: retry partial EB tx responses on a different peer
Closes the partial-response loop. When a server returns fewer bodies
than the bitmap requested, consensus identifies the still-missing
manifest indices and re-issues FetchLeiosBlockTxs; the coordinator's
leios_tracker excludes peers it has already tried, so the retry
naturally lands on a different candidate. The cycle terminates when
the request is fully satisfied or the offering peer set is exhausted.

leios_tracker: txs_attempts: HashMap<(slot, hash), HashSet<PeerId>>
tracks peers asked for each EB's txs across the retry sequence.
pick_txs_fetch_peer filters candidates by this set and records its
pick. update_slot prunes alongside the other slot-keyed sets;
remove_peer drops disconnected peers.

LeiosConsensus: EbTxMatchOutcome gains remaining_bitmap. After
matching, the still-missing indices are stored back into
pending_eb_tx_fetches so the next response is verified against the
exact remaining set. New retry_eb_tx_fetch issues the follow-up
command; LeiosBlockTxsReceived drops the in-flight gate so the retry
path is unblocked.

Tests: tracker exclusion (3), partial-then-retry two-stage flow,
empty-bitmap retry is a no-op.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
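
The peer-exclusion step reduces to a few lines — this is a simplified
stand-in for pick_txs_fetch_peer, with u32 peer ids and the tried set
passed in directly rather than kept in the tracker:

```rust
use std::collections::HashSet;

/// Pick an offering peer not yet tried for this EB's txs and record the
/// pick, so the next retry naturally lands on a different candidate.
/// Returns None when the offering peer set is exhausted.
fn pick_txs_fetch_peer(offering: &[u32], tried: &mut HashSet<u32>) -> Option<u32> {
    let pick = offering.iter().copied().find(|p| !tried.contains(p))?;
    tried.insert(pick);
    Some(pick)
}
```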
net-rs: bitmap helpers for LeiosFetch tx selection
Pure helpers over the CIP-0164 sparse `BTreeMap<u16, u64>` bitmap
used by `MsgLeiosBlockTxsRequest`: `from_indices`, `select_all`,
`contains`, `iter_indices`. Empty bitmap selects no transactions,
matching the wire-format semantics; `select_all(n)` produces the
"every tx" bitmap for callers that want the previous behavior.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
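
A possible shape for these helpers, assuming 64-bit words keyed by a
u16 word index with bit i of word w standing for tx index w*64+i — the
exact CIP-0164 packing and bit order may differ:

```rust
use std::collections::BTreeMap;

/// Sparse bitmap: word index -> 64-bit word.
type Bitmap = BTreeMap<u16, u64>;

fn from_indices(indices: &[u32]) -> Bitmap {
    let mut bm = Bitmap::new();
    for &i in indices {
        *bm.entry((i / 64) as u16).or_insert(0) |= 1u64 << (i % 64);
    }
    bm
}

fn contains(bm: &Bitmap, index: u32) -> bool {
    bm.get(&((index / 64) as u16))
        .map_or(false, |&w| w & (1u64 << (index % 64)) != 0)
}

/// Set indices in ascending order (BTreeMap iterates words sorted).
fn iter_indices(bm: &Bitmap) -> impl Iterator<Item = u32> + '_ {
    bm.iter().flat_map(|(&word, &bits)| {
        (0u32..64)
            .filter(move |&b| bits & (1u64 << b) != 0)
            .map(move |b| word as u32 * 64 + b)
    })
}

/// The "every tx" bitmap for callers wanting the previous behavior.
fn select_all(n: u32) -> Bitmap {
    from_indices(&(0..n).collect::<Vec<u32>>())
}
```

An empty Bitmap yields no indices, matching the wire-format semantics
that an empty bitmap selects no transactions.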
net-rs: stop clearing PeerChain entries on stale anchor
clear_entries() on stale anchor OrphanCandidate caused a
pathological loop: wipe to 0 entries, next TipAdvanced adds 1
entry, walk fails (1 entry can't overlap), OrphanCandidate again,
clear, repeat. The re-intersect signal never reaches ChainSync
(blocked in MsgAwaitReply), so clearing just makes things worse.

Keep existing entries so the walk has material to work with and
the tried set properly skips exhausted peers to try others.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
net-rs: use local chain for ChainSync intersection instead of jumping to tip
spawn_chainsync was asking the peer to intersect at Origin and then jumping
to the peer's current tip, so RollForward only streamed headers *after*
the peer's tip at connect time. The per-peer PeerChain in praos consensus
never saw the history between the common ancestor and the peer's tip,
select_chain_once classified every peer as OrphanCandidate, and nodes
stayed stuck on their own forks instead of switching to the canonical chain.

Thread Arc<ChainStore> into PeerTaskConfig / spawn_chainsync and use
ChainStore::intersection_candidates (new, exponentially-spaced points from
the local tip down to Origin) as find_intersection candidates. RollForward
then streams every header from the common ancestor onward, PeerChain is
populated contiguously, and fork switches proceed.

Also add a diagnostic info! log in select_chain_once when it returns
OrphanCandidate (peer_chain_len, oldest/newest block numbers,
oldest_prev_in_ancestors) so future regressions of this class are loud.

Regression test in consensus::praos asserts both halves: a peer_chain
with only the tip yields OrphanCandidate, and the same peer_chain
populated contiguously from the common ancestor yields WaitingForBlocks.

Verified on the 25-node sample cluster: all nodes advance in lockstep,
zero orphan log entries after startup.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
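
ChainStore::intersection_candidates plausibly follows the classic
doubling-gap scheme; this sketch is an assumption about its shape
(block numbers standing in for points), not the actual implementation:

```rust
/// Exponentially-spaced candidate block numbers from the local tip down
/// to Origin: tip, tip-1, tip-3, tip-7, ... (the gap doubles each step),
/// always ending at 0 so the peer can intersect at genesis.
fn intersection_candidates(tip_block_no: u64) -> Vec<u64> {
    let mut points = Vec::new();
    let mut n = tip_block_no;
    let mut step = 1u64;
    loop {
        points.push(n);
        if n == 0 {
            break; // Origin reached
        }
        n = n.saturating_sub(step);
        step = step.saturating_mul(2);
    }
    points
}
```

Recent points are dense (cheap when the peer is close), while coverage
still reaches Origin in O(log tip) candidates.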
net-rs: mainnet-shaped stake distribution strategy
Analysis of data/simulation/pseudo-mainnet/topology-v2.yaml:
- 750 nodes, 534 (71.2%) zero-stake relays
- 216 pools, top-200 ≈ 95% cumulative
- Pool stakes log-rank slope ≈ -0.06 (saturation-flat, not heavy-tail)

Add "mainnet-shaped" strategy that captures the dominant shape:
relay_fraction = 0.71 of nodes get stake 0; remaining pools split the
total uniformly (matches Cardano's k=500 saturation cap, where pools
cluster near equal stake). Pools occupy low indices.

Per-pool stake skew isn't modeled — the slope is gentle enough that
uniform-among-pools is within 5% on the prefix-sum metrics that
matter for committee selection.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
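
A sketch of the strategy (names hypothetical; integer division leaves a
small remainder unassigned, which the real implementation may handle
differently):

```rust
/// "mainnet-shaped" allocation: relay_fraction of nodes get zero stake,
/// the remaining pools split the total uniformly, pools at low indices.
fn mainnet_shaped_stake(total_stake: u64, n_nodes: usize, relay_fraction: f64) -> Vec<u64> {
    let n_relays = (n_nodes as f64 * relay_fraction).round() as usize;
    let n_pools = n_nodes.saturating_sub(n_relays).max(1);
    let per_pool = total_stake / n_pools as u64;
    (0..n_nodes)
        .map(|i| if i < n_pools { per_pool } else { 0 })
        .collect()
}
```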
net-rs: periodic select_chain retry for convergence after production stops
When block production stops (p=0), the network previously froze
completely — select_chain only ran on incoming network events, so
with no new blocks there were no events to trigger retries of
stale fetches or pending_validation deadlocks.

Add retry_select_chain() to PraosConsensus: evicts stale in_flight
entries and re-runs select_chain when there is actionable state
(stale evictions, in-flight fetches, or pending_validation entries).
Called every 5 slots (~5s) from the main loop's slot tick.

Cluster testing confirms: after setting p=0, most nodes now converge
to the majority tip. Three nodes remain stuck due to a separate
pending_validation ordering issue (blocks arrive but parents are
missing — needs further investigation).

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
net-rs: restore adopted_tip guard on genesis fallback
The unconditional genesis fallback (oldest.prev_hash == None ->
Origin) caused spurious 93+ block Origin replays that deadlocked
the channel pipeline. A node at block 115 doing normal single-block
switches would hit stale block-1 entries in PeerChain after a peer
rollback, triggering a massive Origin replay.

Restore the guard: Origin rollback only for fresh nodes with no
adopted chain (adopted_tip_hash.is_none()). An adopted node seeing
prev_hash=None means stale PeerChain entries survived a rollback,
not a genuine genesis-diverged fork. The re-intersect mechanism
handles stale state via OrphanCandidate.

The original problem this guard removal solved (commit 9a44d1031)
was actually caused by pending_validation deadlocks, now removed
in commit 472ded2bd.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
net-rs: vote aggregation with per-EB quorum detection
Parse validated vote bodies to extract endorser_block_hash and
voter_id, attribute votes to their EB election, detect quorum (≥3
unique voters for MVP). VotesValidated outcome now carries vote_data.

Verified in 25-node cluster: 75 quorum events across all nodes.
Added scripts/leios-check.sh for Leios cluster diagnostics.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
con-rs: lift Tip; extract LeiosState sans-IO state machine
Two changes bundled, both extending con-rs's coverage to the Leios
side of consensus.

1. `types::Tip` lifted from net-core (data + Display + minicbor codec).
   net-core re-exports via `pub use con_rs::Tip` so existing call
   sites keep working.

2. `leios::LeiosState` mirrors `praos::PraosState`: a sans-IO state
   machine that owns the per-EB election state (via `Elections`), the
   per-EB tx-hash manifest cache, the in-flight Leios-fetch tracking,
   and the local node's voting configuration.  It emits
   `Vec<LeiosEffect>` describing what the I/O layer should do —
   request an EB body, request EB transactions, ask for votes, record
   a manifest, hand a block / votes to the validator, emit a vote,
   raise telemetry.

   The state never builds wire-format vote bodies: when an election
   enters the Voting phase and the local node is eligible (PV via
   `persistent_seats`, NPV via the wfa lottery on the eligibility
   signature), it emits an `EmitVote` effect carrying logical args
   (PV flag, NPV signature) and the I/O layer encodes the body.
   Same principle as `praos`: con-rs is wire-format-agnostic.

   `EbTxMatchOutcome` and the bitmap-helpers move along.  `ValidatedVote`
   is a borrowed view used to feed decoded vote bodies into the state
   machine without copying.

Net-node delta:
- `consensus/leios/mod.rs` becomes a thin wrapper holding `state`,
  `commands`, `validator`, `mempool`, plus the telemetry buffer.
  All event handlers translate `NetworkEvent` / `LedgerOutcome` into
  state-machine calls and dispatch the returned effects.
- `consensus/leios/voting.rs` is deleted; vote-body construction
  (`VoteBody::stub_persistent` / `stub_non_persistent` / `encode`)
  happens in the wrapper's `emit_vote` helper, triggered by
  `LeiosEffect::EmitVote`.
- `bitmap_for_missing_txs` (mempool query for EB-tx bitmap) stays in
  net-node — mempool semantics will get their own trait later.

Comment hygiene: cleaned up consumer-specific phrasing in con-rs doc
comments (no more references to "net-node" or "sim-rs" — con-rs is
generic Cardano consensus, not glue).

Verified: 68 con-rs tests pass (was 58 — +10 new LeiosState tests),
524 net-rs tests pass (was 528 — the 4 voting.rs tests retired in
favour of con-rs equivalents), cargo clippy --all-targets shows 11
warnings in net-rs (baseline) and 0 in con-rs.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
net-rs: decouple chain selection from fetch decisions
Block arrival now drives chain selection directly: on_block_received
calls try_switch_to(this_block), which walks chain_tree backward to
find the common ancestor with the adopted chain and switches if all
intermediate blocks are cached. No peer chain consultation needed.

Fetch decisions are separate: evaluate_and_fetch examines peer chains
to determine what blocks to request, handles OrphanCandidate
re-intersection, and issues FetchBlockRange commands.

This separation means a node that receives blocks from any source
can immediately apply them without depending on peer chain state
that may be stale or fragmented after rollbacks.

Cluster tested: 25 nodes at p=0.2 for 20 minutes (350+ blocks),
zero stuck nodes. Previously 7-13 nodes would get permanently stuck.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
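
try_switch_to's backward walk can be sketched like this, with u64
stand-ins for hashes and chain_tree reduced to a parent map (the real
signatures differ):

```rust
use std::collections::{HashMap, HashSet};

/// Walk backward from the received block until a block on the adopted
/// chain is found; switch only if every intermediate body is cached.
/// Returns the blocks to apply, oldest first, or None if anything is
/// missing — no peer chain consultation needed.
fn try_switch_to(
    new_tip: u64,
    adopted: &HashSet<u64>,
    prev: &HashMap<u64, u64>, // hash -> prev_hash (chain_tree links)
    cached: &HashSet<u64>,    // block bodies we hold
) -> Option<Vec<u64>> {
    let mut path = Vec::new();
    let mut cur = new_tip;
    while !adopted.contains(&cur) {
        if !cached.contains(&cur) {
            return None; // body not fetched yet
        }
        path.push(cur);
        cur = *prev.get(&cur)?; // gap: common ancestor unreachable
    }
    path.reverse(); // oldest-first apply order
    Some(path)
}
```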
net-rs: fix shadow select_chain to use validated set (phase 2.2 follow-up)
Initial cluster run showed every shadow decision was "would switch":
the walk checked chain_tree.contains, which under the current design
returns true for unvalidated peer-announced headers too. That's wrong
for the shadow semantics — the "common ancestor" has to be a block we
have validated, not just heard about.

Fix: walk against self.validated. Also handle the genesis case: when
the peer's oldest entry has prev_hash=None, treat the synthetic all-
zero hash as a shared anchor (both our chain and theirs root at
genesis). Without this, any time two nodes self-produced different
first blocks the shadow would declare the peer orphaned.

After the fix, a fresh cluster run shows realistic decision mix:
  8311 would fetch   (normal steady-state, missing_len=1)
    68 would switch  (all blocks validated, tip replacement only)
    12 no common ancestor (genuine orphans)

Added test: shadow_genesis_root_chain_treated_as_waiting.

Tests: 298 net-core + 75 net-node passing.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
net-rs: track per-peer announced txs in mempool, stop re-announcing
The TxsRequested handler in main.rs called mempool.peek_up_to (non-
consuming) and returned the same head-of-mempool txs every cycle.
With the codec fix unblocking real tx flow, this hot loop re-cloned
and re-shipped the same txs to each peer hundreds of times per second
— ~115 MB/s of body memcpy per node.

Move per-peer state into the mempool: peek_unannounced_for_peer marks
each tx as advertised to the given peer; subsequent calls skip those
ids. Push/drain_up_to/drain_all/capacity-eviction prune the affected
ids from every peer set, bounding total state by mempool size.
forget_peer drops the entry on disconnect.

The handler in main.rs becomes a thin call into the mempool. New
unit tests cover the per-peer independence, lazy pruning on tx
removal, and forget_peer.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
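
The per-peer announcement bookkeeping can be sketched as follows — a
simplified Mempool with u64 tx ids; the real type also prunes peer sets
on drain and capacity eviction, omitted here:

```rust
use std::collections::{HashMap, HashSet, VecDeque};

struct Mempool {
    txs: VecDeque<(u64, Vec<u8>)>,         // (tx id, body), FIFO order
    announced: HashMap<u32, HashSet<u64>>, // peer -> ids already advertised
}

impl Mempool {
    fn new() -> Self {
        Self { txs: VecDeque::new(), announced: HashMap::new() }
    }

    fn push(&mut self, id: u64, body: Vec<u8>) {
        self.txs.push_back((id, body));
    }

    /// Return up to `limit` tx ids not yet advertised to `peer`, marking
    /// them as advertised so repeat calls don't re-ship the same txs.
    fn peek_unannounced_for_peer(&mut self, peer: u32, limit: usize) -> Vec<u64> {
        let seen = self.announced.entry(peer).or_default();
        let fresh: Vec<u64> = self
            .txs
            .iter()
            .map(|(id, _)| *id)
            .filter(|id| !seen.contains(id))
            .take(limit)
            .collect();
        seen.extend(fresh.iter().copied());
        fresh
    }

    /// Drop per-peer state on disconnect.
    fn forget_peer(&mut self, peer: u32) {
        self.announced.remove(&peer);
    }
}
```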
net-rs: guard Origin anchor fallback for fresh nodes only
The anchor fallback had an unconditional Origin path
(anchor.hash == [0u8;32]) that bypassed the genesis guard
restored in the previous commit. An Origin intersection
(from a prior ChainSync) would trigger massive replays
even for adopted nodes.

Apply the same guard: Origin anchor only accepted when
adopted_tip_hash is None (fresh node).

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
net-rs: contiguity walk falls back to block_cache
select_chain_once's contiguity guard called chain_tree.ancestors(last_hash)
to verify that a peer's replay chain reaches the picked common ancestor.
The walk terminates at the first block whose parent is not in chain_tree.

on_block_received inserts every fetched block into both chain_tree and
block_cache — but the chain_tree insert is skipped when the header has no
parsed info AND chain_tree doesn't already know the block (so block_no=0).
That leaves the block in block_cache without a chain_tree entry, and the
next contiguity walk terminates early, firing `fork mismatch (replay
doesn't reach ancestor)` → OrphanCandidate. With the cooldown cap this no
longer infects other peers, but individual nodes under sustained fork
load can slowly get stuck on this false mismatch.

Fix: add a hybrid walker that follows prev_hash links using chain_tree
first and block_cache as a fallback. The walk terminates at a genuine
gap (neither store has the parent) or at a genesis child (prev_hash=None)
— both distinguished via a new HybridWalk.reached_origin flag so the
genesis-reached check in select_chain_once still works.

The walk is a new private method on PraosConsensus in selection.rs:
walk_ancestors_hybrid(start_hash) -> HybridWalk.

4 new unit tests exercise:
- chain_tree-only case (back-compat with pre-fix behaviour)
- block_cache fallback (tree has tip + anchor, middle only in cache)
- gap termination (parent in neither store → reached_origin=false)
- start_only_in_cache (start block only in block_cache)

Cluster verification at p=0.2: 24/25 nodes stayed healthy for ~55 min
(vs previous build which had 4 stuck by T+60min). The one stuck node
(node-4) hit a separate mux-level ingress-overflow bug during catch-up
fetches, not the contiguity walk — tracked separately.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
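
walk_ancestors_hybrid reduced to its essence — hashes as u64, both
stores as parent maps where prev_hash=None marks a genesis child (names
per the commit, shapes assumed):

```rust
use std::collections::HashMap;

struct HybridWalk {
    chain: Vec<u64>,
    reached_origin: bool,
}

/// Follow prev_hash links, preferring chain_tree and falling back to
/// block_cache. Terminates at a genuine gap (neither store knows the
/// parent, reached_origin=false) or at a genesis child (prev_hash=None,
/// reached_origin=true).
fn walk_ancestors_hybrid(
    start: u64,
    chain_tree: &HashMap<u64, Option<u64>>,
    block_cache: &HashMap<u64, Option<u64>>,
) -> HybridWalk {
    let mut chain = Vec::new();
    let mut cur = Some(start);
    while let Some(h) = cur {
        match chain_tree.get(&h).or_else(|| block_cache.get(&h)) {
            None => return HybridWalk { chain, reached_origin: false }, // genuine gap
            Some(prev) => {
                chain.push(h);
                cur = *prev; // None = genesis child: loop ends below
            }
        }
    }
    HybridWalk { chain, reached_origin: true }
}
```

A block present only in block_cache (the skipped-chain_tree-insert case
above) no longer terminates the walk early.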