sidecar composer: instrument convergence template for Antithesis
Applies the /antithesis-tests skill to the flagship test. Before, the
cardano_node_master testnet had:
- One no-op serial_driver (sleep 60), inherited from moog with no
test semantics. Antithesis's 'All commands were run to completion
at least once' property requires it to finish on some timeline, which
it sometimes did not at duration=1h (~30% flake rate).
- One eventually_ command with a 12-hour polling loop. Antithesis's
eventually_ is meant to be a short post-fault recovery probe that
exits quickly. A 12h loop means the script never 'runs to
completion' in any reasonable duration, so it was flagged under both
the 'started' and 'run to completion' properties.
- No finally_ command at all.
- No SDK assertions anywhere — the report had nothing to score.
After:
- components/sidecar/composer/convergence/helper_sdk_lib.sh
Sourceable bash emitter for Antithesis's Fallback SDK JSONL.
Offers sdk_reachable / sdk_unreachable / sdk_sometimes / sdk_always.
'helper_' prefix → ignored by the composer, usable by siblings.
- components/sidecar/composer/convergence/eventually_converged.sh
Rewritten per the canonical pattern: sleep 15s to settle after
fault injection stops, then up to 10×2s retries (15s + 20s = 35s
worst-case wait) querying the tip on every producer; exit 0 on
convergence, 1 on failure. Emits sdk_reachable on entry,
sdk_sometimes on success, sdk_unreachable on genuine
non-convergence. Replaces the 12h polling loop.
- components/sidecar/composer/convergence/finally_tips_agree.sh
New. Runs after drivers complete naturally. 15s settle + 5×2s
tip-agreement retries. sdk_always on the final-state invariant:
every producer agrees on the tip after workload ends. A failing
sdk_always here means the test created a permanent fork.
- components/sidecar/composer/convergence/serial_driver_tip_agreement.sh
New. Replaces serial_driver_sleep.sh. Samples the tip on every
producer 3× under active fault injection and emits sdk_sometimes
for either outcome (tips agree or tips diverge). Gives the serial
driver real exclusive-access semantics (it blocks parallel drivers
from writing to the chain while sampling), and every sample is a
reportable data point.
- components/sidecar/composer/convergence/serial_driver_sleep.sh — deleted.
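For reviewers unfamiliar with the Fallback SDK, here is a minimal
sketch of what a sourceable emitter along the lines of
helper_sdk_lib.sh might look like. It is not the committed file. The
argument order of the sdk_* helpers, the _sdk_emit helper, the /tmp
fallback, and the exact JSON field set are assumptions based on a
reading of Antithesis's assertion schema and should be checked
against its documentation. It needs only bash and jq.

```shell
#!/usr/bin/env bash
# Sketch only: field names and the sdk.jsonl location are assumptions
# to verify against the Antithesis assertion schema.
SDK_JSONL="${ANTITHESIS_OUTPUT_DIR:-/tmp}/sdk.jsonl"

# _sdk_emit ASSERT_TYPE DISPLAY_TYPE MUST_HIT CONDITION ID MESSAGE [DETAILS_JSON]
# (hypothetical helper) Appends one antithesis_assert object as a JSONL line.
_sdk_emit() {
  local details="${7:-}"
  [ -n "$details" ] || details='{}'
  jq -nc \
    --arg assert_type "$1" --arg display_type "$2" \
    --argjson must_hit "$3" --argjson condition "$4" \
    --arg id "$5" --arg message "$6" --argjson details "$details" \
    '{antithesis_assert: {
        hit: true, must_hit: $must_hit,
        assert_type: $assert_type, display_type: $display_type,
        condition: $condition, id: $id, message: $message,
        location: {class: "", function: "", file: "",
                   begin_line: 0, begin_column: 0},
        details: $details}}' >> "$SDK_JSONL"
}

# Assumed calling conventions; the message doubles as the assertion id:
#   sdk_reachable MESSAGE [DETAILS_JSON]
#   sdk_unreachable MESSAGE [DETAILS_JSON]
#   sdk_sometimes CONDITION MESSAGE [DETAILS_JSON]   (CONDITION is true|false)
#   sdk_always CONDITION MESSAGE [DETAILS_JSON]
sdk_reachable()   { _sdk_emit reachability Reachable   true  true  "$1" "$1" "${2:-}"; }
sdk_unreachable() { _sdk_emit reachability Unreachable false false "$1" "$1" "${2:-}"; }
sdk_sometimes()   { _sdk_emit sometimes    Sometimes   true  "$1" "$2" "$2" "${3:-}"; }
sdk_always()      { _sdk_emit always       Always      true  "$1" "$2" "$2" "${3:-}"; }
```

A sibling script would source the file and call, for example,
sdk_sometimes true "tips converged after faults" '{"attempt":3}'.
Building the line with jq rather than printf keeps quoting in
messages and details from producing malformed JSON.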
Locally verified: each script runs to completion on a healthy
compose, all three emit well-formed antithesis_assert JSONL, and
wall time per script is eventually=20s / finally=20s / serial=48s on
a green cluster. There is no timeout wrapper around cardano-cli ping:
the sidecar image is minimal (coreutils + bash + jq), and 'timeout'
was swallowing cardano-cli's stdout.
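The settle-then-retry shape shared by eventually_converged.sh and
finally_tips_agree.sh can be sketched as below. This is an
illustration, not the committed code: SETTLE_SECS, RETRIES,
RETRY_DELAY, tips_agree and wait_converged are hypothetical names,
and the per-producer tip query is deliberately left out. With the
defaults shown, the worst case is 15s + 10×2s = 35s of sleeping, plus
whatever the check itself takes.

```shell
#!/usr/bin/env bash
# Sketch only: all names here are illustrative assumptions.
SETTLE_SECS="${SETTLE_SECS:-15}"   # pause after fault injection stops
RETRIES="${RETRIES:-10}"           # maximum number of check attempts
RETRY_DELAY="${RETRY_DELAY:-2}"    # seconds slept after a failed attempt

# tips_agree TIP...
# Succeeds only if at least one tip is given and all tips are the same
# non-empty string.
tips_agree() {
  [ "$#" -gt 0 ] || return 1
  local first="$1" tip
  for tip in "$@"; do
    [ -n "$tip" ] && [ "$tip" = "$first" ] || return 1
  done
}

# wait_converged CMD [ARGS...]
# Sleeps SETTLE_SECS, then runs CMD up to RETRIES times, sleeping
# RETRY_DELAY after each failure. Returns 0 on the first success and
# 1 if every attempt fails, so the caller always exits in bounded time.
wait_converged() {
  sleep "$SETTLE_SECS"
  local attempt
  for (( attempt = 1; attempt <= RETRIES; attempt++ )); do
    "$@" && return 0
    sleep "$RETRY_DELAY"
  done
  return 1
}
```

A caller would wrap its own tip collection in a function and run
wait_converged on it, then map the return code to the script's exit
status and to the matching assertion. Bounding the loop is what lets
the command run to completion within a short test duration.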