Fix partialFanout to use the correct snapshot signature
Signed-off-by: Sasha Bogicevic <[email protected]>
Signed-off-by: Sasha Bogicevic <[email protected]>
The info banner sat inside the genesis() function body, so it fired once per `wb genesis <op>` dispatch.
The configs live under net-node/configs/, but the snippet wrote just configs/...; running it from the net-rs/ workspace root (the canonical location per CLAUDE.md) would fail to find the files. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
- maybe_emit_stuck_warning was counting any peer-chain parent missing from adopted_ancestors as "unreachable", but the diagnostic doc says "neither in chain_tree nor block_cache". A peer chain that forks off into a branch we hold but haven't adopted would false-positive the WARN. Check the whole chain_tree (every fork we know about) instead.
- record_peer_disconnected wasn't removing the per-peer last_gap_warning_at throttle entry, so the map grew without bound under reconnect churn if PeerIds are monotonically assigned. Same lifecycle as the orphan cooldown that already gets cleared here.
- Drop a duplicate `#[allow(clippy::too_many_arguments)]` on on_tip_advanced (one was enough).

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
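The first two fixes can be sketched roughly as follows. This is a minimal illustration, not the node's actual code: the `State` struct, the `Hash` and `PeerId` aliases, the `unreachable_parents` helper, and the use of plain hash sets are all assumptions; only the names `chain_tree`, `block_cache`, `last_gap_warning_at`, and `record_peer_disconnected` come from the message above.

```rust
use std::collections::{HashMap, HashSet};
use std::time::Instant;

type Hash = [u8; 32]; // hypothetical stand-in for the real block hash type
type PeerId = u64;    // hypothetical stand-in for the real peer id type

/// Hypothetical, simplified view of the relevant node state.
struct State {
    chain_tree: HashSet<Hash>,  // every block on any fork we know about
    block_cache: HashSet<Hash>, // blocks received but not yet in the tree
    last_gap_warning_at: HashMap<PeerId, Instant>,
}

impl State {
    /// Count peer-chain parents held in neither chain_tree nor block_cache,
    /// matching the "unreachable" definition quoted from the diagnostic doc.
    fn unreachable_parents(&self, peer_parents: &[Hash]) -> usize {
        peer_parents
            .iter()
            .filter(|h| !self.chain_tree.contains(*h) && !self.block_cache.contains(*h))
            .count()
    }

    /// Drop the per-peer throttle entry so the map cannot grow under churn.
    fn record_peer_disconnected(&mut self, peer: PeerId) {
        self.last_gap_warning_at.remove(&peer);
    }
}
```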
`u64 / u64::MAX as f64` lands in [0, 1] inclusive: `u64::MAX as f64` rounds to 2^64, and a hash equal to u64::MAX yields a draw of exactly 1.0. The `draw < probability` test then refused to drop on that one-in-2^64 hash even when the operator asked for "always drop" semantics with probability=1.0. Mirror the `probability <= 0.0` short-circuit at the top of the function so probability=1.0 unconditionally returns true, and fix the comment to say [0, 1] rather than [0, 1). Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
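A minimal sketch of the described short-circuit, assuming a free function named `should_drop` that takes the hash and the configured probability (the real function's name and signature are not given in this message):

```rust
/// Hypothetical helper: decide whether to drop, given a 64-bit hash
/// and an operator-supplied probability.
fn should_drop(hash: u64, probability: f64) -> bool {
    // "Never drop" short-circuit.
    if probability <= 0.0 {
        return false;
    }
    // "Always drop" short-circuit, mirroring the one above.
    if probability >= 1.0 {
        return true;
    }
    // `u64::MAX as f64` rounds up to 2^64, so the draw lies in [0, 1]
    // inclusive: a hash of u64::MAX produces exactly 1.0.
    let draw = hash as f64 / u64::MAX as f64;
    draw < probability
}
```

Without the second short-circuit, `should_drop(u64::MAX, 1.0)` would evaluate `1.0 < 1.0` and return `false`.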
- tcp_connection.rs: `mem::replace(_, Default::default())` → `mem::take`.
- config.rs: `for (_, spec) in &out` → `for spec in out.values()`.
- network/connection.rs: allow `clippy::large_enum_variant` on `ConnectionKind`: the two variants share a uniform interface and Box-ing the TCP variant would add an indirection on every connection access in the sim hot path. Also allow `clippy::items_after_test_module` on the file's `mod tests` block; `ConnectionKind` legitimately lives after it and moving the 400-line test module to the file end would be pure churn.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
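For illustration, the first two rewrites look roughly like this in isolation. The function names and the `BTreeMap<String, u32>` shape are hypothetical; the actual types in tcp_connection.rs and config.rs are not shown in this message.

```rust
use std::collections::BTreeMap;
use std::mem;

/// Hypothetical example: hand back the buffered bytes and leave an empty
/// Vec behind. Equivalent to `mem::replace(buf, Default::default())`.
fn drain_buffer(buf: &mut Vec<u8>) -> Vec<u8> {
    mem::take(buf)
}

/// Hypothetical example: iterate values directly instead of destructuring
/// `(_, spec)` pairs and discarding the key.
fn total_weight(out: &BTreeMap<String, u32>) -> u32 {
    let mut total = 0;
    for spec in out.values() {
        total += *spec;
    }
    total
}
```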
- aggregator.rs: drop `len.clone()` on a `usize` (Copy).
- server.rs: allow `clippy::too_many_arguments` on `start()`; the nine args are all distinct top-level cluster wiring (port, channels, shared state) and grouping them into a config struct would be churn for marginal benefit.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
- t22.rs: dedent inline list items to 2 spaces (overindented); drop the redundant `let decision = …; decision` binding (`let_and_return`).
- praos.rs: rewrap PraosStateSizes equivocation_bytes_estimate doc so the list bullet sits flush with the prose.
- behaviour/selection.rs: replace `out.get(&2).is_none()` with `!out.contains_key(&2)` (`unnecessary_get_then_check`).

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
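The two code-level rewrites, shown on hypothetical stand-in functions (the real bodies in t22.rs and behaviour/selection.rs are elided in this message, so `is_absent`, `decide`, and the threshold are illustrative only):

```rust
use std::collections::HashMap;

/// Hypothetical example of the `unnecessary_get_then_check` fix:
/// ask the map directly instead of `out.get(&key).is_none()`.
fn is_absent(out: &HashMap<u32, &str>, key: u32) -> bool {
    !out.contains_key(&key)
}

/// Hypothetical example of dropping a bind-then-return:
/// previously `let decision = score > 10; decision`.
fn decide(score: u32) -> bool {
    score > 10
}
```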
Signed-off-by: Roland Kuhn <[email protected]>
fixes #937 Signed-off-by: Roland Kuhn <[email protected]>
fixes #936 Signed-off-by: Roland Kuhn <[email protected]>
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Clippy 1.92 lints the manual form; introduced on this branch via DeepReorg behaviour and net-node's state-size log gate. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Rebase onto main brought in the yaml topology source (PR #915), which adds `serde_yaml` (+ `indexmap`) to `net-cluster`'s dependency graph. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
`force_rollback` re-anchored chain state and pruned the chain tree suffix, but left the abandoned blocks in `block_cache` / `validated` / `in_flight_validation`. The dedup at the top of `on_block_received` short-circuits when a hash is in any of those, so a peer re-offering an abandoned block after a deliberate self-reorg was silently dropped; `chain_tree` never re-acquired it and the node was pinned on its dead fork, defeating post-chaos recovery.

Mirror the k-prune retention pattern from `on_block_applied` in the opposite direction: retain `block_cache` to entries with `block_no <= target_bn`, then drop `validated` / `in_flight_validation` / `header_first_seen` to hashes still in `block_cache`. Test extended to assert the suffix is gone from cache + validated.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
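The retention step might look like the sketch below. The `Caches` struct, the `u64` hash stand-in, the value types, and the method name `prune_after_rollback` are assumptions made for illustration; only the field names and the `block_no <= target_bn` rule come from the message above.

```rust
use std::collections::{HashMap, HashSet};

type Hash = u64; // hypothetical stand-in for the real block hash type

/// Hypothetical, simplified view of the caches touched by the fix.
struct Caches {
    block_cache: HashMap<Hash, u64>, // hash -> block_no
    validated: HashSet<Hash>,
    in_flight_validation: HashSet<Hash>,
    header_first_seen: HashMap<Hash, u64>, // hash -> timestamp (illustrative)
}

impl Caches {
    /// Forget every block above the rollback target so a peer can re-offer it.
    fn prune_after_rollback(&mut self, target_bn: u64) {
        // Keep only cached blocks at or below the rollback target.
        self.block_cache.retain(|_, block_no| *block_no <= target_bn);
        // Then restrict the secondary sets to hashes that survived.
        let cache = &self.block_cache;
        self.validated.retain(|h| cache.contains_key(h));
        self.in_flight_validation.retain(|h| cache.contains_key(h));
        self.header_first_seen.retain(|h, _| cache.contains_key(h));
    }
}
```

Deriving the secondary sets from the surviving `block_cache` keys keeps the rule in one place, so the dedup check cannot see a hash the cache no longer holds.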
The gRPC server built its tonic transport with no HTTP/2 tuning, inheriting hyper/h2 defaults (64 KiB per-stream flow-control window, adaptive windowing off) with no way for operators to influence them.

Add optional HTTP/2 fields to GrpcConfig (http2_adaptive_window, http2_initial_stream_window_size, http2_initial_connection_window_size, http2_max_frame_size, http2_max_concurrent_streams), all backwards-compatible optional fields, and plumb them into the server builder via a dedicated apply_http2_tuning helper.

Adaptive windowing defaults to on. NOTE: this is a behavior change for existing deployments (windows now auto-size to the BDP instead of being pinned at 64 KiB). Adaptive and explicit fixed windows are mutually exclusive in hyper; setting both logs a warning rather than silently dropping the fixed sizes.

Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>
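The adaptive-versus-fixed conflict handling can be sketched without the tonic dependency. Everything here is hypothetical except the three field names: `Http2Tuning`, `WindowMode`, and `resolve_window_mode` are invented for illustration, and the choice that adaptive mode wins when both are set is an assumption, not something this message states.

```rust
/// Hypothetical mirror of three of the optional HTTP/2 fields.
#[derive(Default)]
struct Http2Tuning {
    http2_adaptive_window: Option<bool>,
    http2_initial_stream_window_size: Option<u32>,
    http2_initial_connection_window_size: Option<u32>,
}

#[derive(Debug, PartialEq)]
enum WindowMode {
    Adaptive,
    Fixed { stream: Option<u32>, connection: Option<u32> },
}

/// Resolve the effective window mode and, if the settings conflict,
/// a warning for the caller to log. Assumes adaptive wins on conflict.
fn resolve_window_mode(cfg: &Http2Tuning) -> (WindowMode, Option<&'static str>) {
    // Adaptive windowing defaults to on when the field is omitted.
    let adaptive = cfg.http2_adaptive_window.unwrap_or(true);
    let has_fixed = cfg.http2_initial_stream_window_size.is_some()
        || cfg.http2_initial_connection_window_size.is_some();
    if adaptive {
        let warning = has_fixed
            .then_some("adaptive window is on; fixed window sizes will not take effect");
        (WindowMode::Adaptive, warning)
    } else {
        let mode = WindowMode::Fixed {
            stream: cfg.http2_initial_stream_window_size,
            connection: cfg.http2_initial_connection_window_size,
        };
        (mode, None)
    }
}
```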
Signed-off-by: Sasha Bogicevic <[email protected]>
Diagnosing a wedged catch-up against the public Leios dev relay required grepping across the orphan / fork-mismatch INFO traffic and inferring the cause from cache state. Two new WARNs hand the diagnosis directly to an operator skimming logs:

- **ChainSync ingress contiguity check** in `record_peer_tip`: when an arriving header's `prev_hash` doesn't match the previously-announced one's hash, log the (block_no, hash) pair on each side and the implied skipped-block count. Throttled per peer (`GAP_WARNING_INTERVAL = 10 s`) so a sustained non-contiguous forward doesn't flood the log. This is the direct signal: the WARN fires the moment upstream commits the offence.
- **Stuck-validation rollup** in `retry_select_chain`: when validation has been frozen for `STUCK_THRESHOLD = 30 s` and some peer offers a strictly-better tip, emit one rollup line summarising stuck duration, adopted vs best-peer block_no, the count of entries in that peer's replay whose parent_hash we don't have locally, and the peer-chain size. Throttled to one fire per `STUCK_WARNING_INTERVAL = 60 s`. This covers the general "stuck for any reason" case and stays informative when the ingress check has gone quiet under its per-peer cooldown.

Both lines were verified against the dev relay: ingress fires within ~30 s of catch-up reaching the wedge boundary (with the exact missing block hash prefix in the message), and the rollup fires 30 s later with `unreachable_parent_hashes > 0`, both throttled correctly under sustained wedge load.

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
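A rough sketch of the ingress contiguity check with its per-peer throttle. The `Header` and `Ingress` types, the `u64` hash and peer-id stand-ins, and the return-value convention are all assumptions; only `record_peer_tip`, `GAP_WARNING_INTERVAL`, and `last_gap_warning_at` are names taken from the commit messages. The clock is passed in so the throttle can be exercised deterministically.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

const GAP_WARNING_INTERVAL: Duration = Duration::from_secs(10);

/// Hypothetical header; u64 stands in for the real hash type.
#[derive(Clone, Copy)]
struct Header {
    block_no: u64,
    hash: u64,
    prev_hash: u64,
}

/// Hypothetical per-peer ingress state.
#[derive(Default)]
struct Ingress {
    last_header: HashMap<u64, Header>,           // peer -> last announced header
    last_gap_warning_at: HashMap<u64, Instant>,  // peer -> last WARN time
}

impl Ingress {
    /// Returns Some(skipped_blocks) when a gap WARN should be emitted now.
    fn record_peer_tip(&mut self, peer: u64, hdr: Header, now: Instant) -> Option<u64> {
        // The first header from a peer has nothing to be contiguous with.
        let prev = self.last_header.insert(peer, hdr)?;
        if hdr.prev_hash == prev.hash {
            return None; // contiguous forward
        }
        // Suppress the WARN if this peer was warned about recently.
        let throttled = self
            .last_gap_warning_at
            .get(&peer)
            .is_some_and(|t| now.duration_since(*t) < GAP_WARNING_INTERVAL);
        if throttled {
            return None;
        }
        self.last_gap_warning_at.insert(peer, now);
        // Blocks strictly between the two announced headers.
        Some(hdr.block_no.saturating_sub(prev.block_no).saturating_sub(1))
    }
}
```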