Cardano Foundation / antithesis
Commits per hour, May 09 - May 16 (hours with zero commits omitted):
May 10, 1-2 PM (3)
May 11, 8-9 AM (5)
May 12, 12-1 PM (3)
May 12, 2-3 PM (1)
12 commits this week (May 09, 2026 - May 16, 2026)
fix(tx-generator): install bash signal trap so composer scripts honour the exit-0 contract
The "Always: Commands finish with zero exit code" property has been
flagging tx-generator/parallel_driver_refill.sh at ~0.12% in recent
Antithesis runs (6/4908 on the 2026-05-11 Cardano Foundation Test
run). In every failing example the SDK's antithesis_random Rust
binary writes
  Error: Os { code: 32, kind: BrokenPipe, message: "Broken pipe" }
to the script's stderr immediately before the script exits non-zero:
the parent bash interpreter receives SIGPIPE and dies with 141
(128 + 13, the SIGPIPE signal number).

Install sdk_install_signal_trap (from the now-shared composer-sdk
helper) at the top of every tx-generator composer script. The trap
converts in-bash SIGPIPE/SIGTERM/SIGINT into a Sometimes-optional
observation + exit 0, mirroring the defense asteria-game has carried
since df7ef80.
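
A minimal sketch of what an absorbing trap of this kind could look like. The names demo_emit and demo_install_signal_trap are hypothetical stand-ins, not the contents of the shared helper. The demo uses SIGTERM rather than SIGPIPE so that it behaves the same in environments that start the shell with SIGPIPE already ignored.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the SDK emit helper: logs the observation to stderr.
demo_emit() { printf 'observation: %s\n' "$1" >&2; }

# Hypothetical stand-in for sdk_install_signal_trap: absorb signals delivered
# to the bash interpreter itself and exit 0 instead of 128 + N.
demo_install_signal_trap() {
  local sig
  for sig in TERM INT PIPE; do
    trap "demo_emit 'absorbed SIG$sig'; exit 0" "$sig"
  done
}

# Without a trap, a shell killed by SIGTERM is reported as 128 + 15 = 143
# (SIGPIPE, signal 13, gives the 141 seen in the findings by the same rule).
bash -c 'kill -TERM $$'
rc_untrapped=$?

# With the trap installed, the same signal becomes an observation plus exit 0.
# The trailing "exit 99" only runs if the trap failed to fire.
( demo_install_signal_trap; kill -TERM "$BASHPID"; exit 99 )
rc_trapped=$?

echo "untrapped=$rc_untrapped trapped=$rc_trapped"
```

In a real composer script the install call would sit at the top of the file, before any pipeline that can receive a signal.
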
refactor(composer-sdk): share helper across tx-generator and asteria-game
Extract emit + signal-handling helpers into
components/composer-sdk/helper_sdk_common.sh, wired into both
component subflakes as a path: input. asteria-game/helper_sdk.sh and
tx-generator/helper_sdk_lib.sh become thin shims that source the
shared file alongside any component-specific helpers
(control_socket_request for tx-generator).

No behavior change for asteria-game; tx-generator gains the
sdk_install_signal_trap / sdk_run_signal_safe[_fn] /
sdk_sometimes_optional primitives previously available only on the
asteria side. Callers of those primitives in tx-generator are added in
the follow-up commit.
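
The shim layout described above could look roughly like the sketch below. The file names mirror this commit, but the function bodies are placeholders invented for illustration, and the relative-path lookup is an assumption: the real files are wired through a flake path: input.

```shell
#!/usr/bin/env bash
# Illustrative layout only; function bodies are placeholders, not the real helpers.
demo_root=$(mktemp -d)
mkdir -p "$demo_root/composer-sdk" "$demo_root/tx-generator"

# Shared file: helpers common to every component.
cat > "$demo_root/composer-sdk/helper_sdk_common.sh" <<'EOF'
sdk_sometimes_optional() { printf 'sometimes_optional %s\n' "$*"; }
EOF

# Thin shim: source the shared file, then add component-specific helpers.
# Locating the shared file relative to the shim is an assumption of this sketch.
cat > "$demo_root/tx-generator/helper_sdk_lib.sh" <<'EOF'
shim_dir=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)
. "$shim_dir/../composer-sdk/helper_sdk_common.sh"
control_socket_request() { printf 'request %s\n' "$*"; }
EOF

# A composer script only ever sources the shim.
. "$demo_root/tx-generator/helper_sdk_lib.sh"
shared_out=$(sdk_sometimes_optional false)
local_out=$(control_socket_request ping)
rm -rf "$demo_root"

echo "$shared_out / $local_out"
```
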
chore(asteria-game): rename composer/stub → composer/asteria-game (drop stale label)
Closes #144.

The directory was lifted verbatim from components/asteria-stub/ in
fbb8982 ("asteria-game: testnet split + lift PR #67 source +
idempotent bootstrap (#100)") when the real asteria-game implementation
arrived. The "stub" label stuck despite the scripts inside no longer
being placeholders — they are the canonical drivers for heartbeat,
alive probes, asteria bootstrap / player / consistency / admin
singleton, plus their SDK observability.

The rest of the repo follows <component>/composer/<purpose>/:
  components/adversary/composer/chain-sync-client/  ← descriptive
  components/tx-generator/composer/tx-generator/    ← descriptive
  components/asteria-game/composer/stub/            ← stale (was)

This PR fixes that. The label was leaking into the Antithesis
report — every finding showed up as `stub/parallel_driver_X.sh` and
every assertion as `stub heartbeat ticked` etc. — which read as
"placeholder code being flaky" rather than "real driver". That
misreading actually happened during triage of #142.

Changes:
- git mv components/asteria-game/composer/stub
       components/asteria-game/composer/asteria-game
- All SDK assertion IDs in the renamed scripts: replace the "stub "
  prefix with "asteria_game " (matching tx-generator's snake_case
  component-prefix convention). 21 assertion IDs across 7 scripts.
- Prose comments in the renamed scripts: "stub script" / "sibling
  stubs" → plain wording, since the rename makes the original
  framing wrong.
- Path references in:
    docs/components/asteria-player.md
    docs/testnets/cardano-node-master.md
    testnets/cardano_node_master/docker-compose.yaml
    testnets/cardano_node_adversary/docker-compose.yaml
    testnets/asteria_game/docker-compose.yaml
  all updated for /opt/antithesis/test/v1/asteria-game/ and the
  source dir.

components/asteria-stub/ (the legacy component with the unbounded
socat reference baseline) is left untouched.

Antithesis identity reset:
- composer commands: now identified as
  asteria-game/parallel_driver_heartbeat.sh etc., distinct from the
  archived stub/* identities. History bar restarts for these.
- SDK assertions: same — asteria_game heartbeat ticked is a fresh
  identity vs the archived stub heartbeat ticked.
- One-time cost; the payoff is permanently readable triage output.

Local smoke (renamed paths):
- 10 concurrent shells × 5 emits → 50 valid JSON lines
- SIGTERM mid-sleep → exit 0 with trap-emitted observation
- timeout --kill-after=2 escalation → exit 137 → absorbed
fix(asteria-game): add --kill-after=2 to all stub timeout-wrapped binaries
Closes #145.

After #143 landed, six of seven previously-failing stubs went green
on the first post-merge cron of 8690faa, but
stub/parallel_driver_asteria_player.sh still tripped the
Always:zero-exit-code property with one new finding (3h 19m run,
faults on). Decoded examples:

  example 1  fail  rc=1  runtime 27.27 s
  example 2  fail  rc=1  runtime 28.65 s
  example 3  fail  rc=1  runtime 47.31 s
  example 4  pass  rc=0  runtime  3.02 s

The wrapper is `sdk_run_signal_safe ... timeout 12 /bin/asteria-game`.
Plain `timeout 12` sends SIGTERM at the 12 s deadline but does not
escalate to SIGKILL — that requires --kill-after. The Haskell
binary catches SIGTERM, runs slow N2C cleanup that fails on a
torn socket, then exits rc=1 (Haskell default unhandled-exception
code). sdk_run_signal_safe deliberately propagates non-signal
exits so the property fires.

Adding --kill-after=2 escalates to SIGKILL 2 s after SIGTERM. The
kernel-killed exit (137) is in sdk_run_signal_safe's absorb set
along with 124/129/143/255, so the script terminates deterministically
inside (deadline + 2) seconds and the property stays green.
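
A minimal sketch of the absorb-set idea, assuming a hypothetical wrapper named demo_run_signal_safe. It is not the real sdk_run_signal_safe; only the list of absorbed codes is taken from the text above.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper illustrating an "absorb set"; not the real sdk_run_signal_safe.
demo_run_signal_safe() {
  local rc=0
  "$@" || rc=$?
  case "$rc" in
    # 124 = timeout deadline hit, 129/143 = 128 + SIGHUP/SIGTERM,
    # 137 = 128 + SIGKILL, 255 = taken from the commit message as-is.
    124|129|137|143|255)
      printf 'absorbed rc=%s\n' "$rc" >&2
      return 0 ;;
    # Everything else (including 0) is passed through unchanged, so a genuine
    # failure such as rc=1 still trips the property.
    *) return "$rc" ;;
  esac
}

# A child killed by SIGKILL reports 137, which the wrapper absorbs.
demo_run_signal_safe bash -c 'kill -KILL $$'
rc_killed=$?

# A child that fails on its own reports 1, which the wrapper propagates.
demo_run_signal_safe bash -c 'exit 1'
rc_failed=$?

echo "killed=$rc_killed failed=$rc_failed"
```
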

Three sibling stubs use the same wrapper pattern but haven't tripped
yet; all four are fixed together to address the failure mode rather
than the run-by-run symptom:

  parallel_driver_asteria_player.sh   timeout 12 → timeout --kill-after=2 12
  anytime_asteria_admin_singleton.sh  timeout 12 → timeout --kill-after=2 12
  finally_asteria_consistency.sh      timeout 30 → timeout --kill-after=2 30
  serial_driver_asteria_bootstrap.sh  timeout 25 → timeout --kill-after=2 25

Local smoke (process that catches SIGTERM and runs slow cleanup):
  - plain timeout 1   → exit 124 sometimes, child rc otherwise (leaky)
  - --kill-after=2 1  → exit 137 reliably
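
The escalation can be reproduced with a short script along these lines. It assumes GNU coreutils timeout and uses a child that simply ignores SIGTERM as a stand-in for the slow-cleanup binary; the 1 s deadline and 1 s grace period are shortened values chosen for the demo.

```shell
#!/usr/bin/env bash
# Stand-in for a binary whose shutdown outlasts the deadline: it ignores
# SIGTERM and keeps running for 3 s.
stubborn='trap "" TERM; sleep 3'

# Plain timeout: SIGTERM is sent at 1 s, but nothing forces the child to stop,
# so the command only returns once the child finishes on its own.
start=$SECONDS
timeout 1 bash -c "$stubborn"
rc_plain=$?
elapsed_plain=$(( SECONDS - start ))

# With --kill-after: SIGKILL follows 1 s after the ignored SIGTERM.
timeout --kill-after=1 1 bash -c "$stubborn"
rc_escalated=$?

echo "plain: rc=$rc_plain after ${elapsed_plain}s; escalated: rc=$rc_escalated"
```

The point of the comparison is the bound on wall-clock time: only the second form is guaranteed to return within deadline plus grace period.
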

No semantic change for healthy runs; failure path now terminates
predictably and is absorbed.
chore(testnet): bump asteria-game pin to 444a1a5 to activate signal-trap absorbers
Mirrors the 6be8939 → 290a8ed3 pattern: ship the script fix in one
commit, then bump the docker-compose pin in a follow-up so
publish-images.yaml rebuilds asteria-game with the new scripts and
the next Antithesis run actually exercises the fix.

Updates both testnets that pin asteria-game:
- testnets/cardano_node_master/docker-compose.yaml
- testnets/cardano_node_adversary/docker-compose.yaml

Re #142.
fix(asteria-game): absorb in-bash signals + flock sdk.jsonl appends so stub/*.sh honour exit-0 contract under fault injection
Closes #142.

Run try-10 of commit 290a8ed3 reported 7 NEW findings, all under
"Always: Commands finish with zero exit code" against stub/*.sh. Try-11
on the same commit and the same image digests was clean. Same code,
different scheduling — the stubs flake when Antithesis fault injection
delivers a signal to the bash interpreter (not the wrapped binary), or
when concurrent parallel_driver invocations race on /tmp/sdk.jsonl.

Three changes in helper_sdk.sh:

- _sdk_emit now wraps its >> /tmp/sdk.jsonl append with `flock -x` on
  the open append-FD, so two concurrent shells can't interleave at the
  syscall level under FS-fault injection.
- New `sdk_install_signal_trap` installs absorbing traps on
  SIGTERM/SIGINT/SIGPIPE that emit `sdk_sometimes_optional false` and
  exit 0; sourced once at the top of every stub script.
- New `sdk_run_signal_safe_fn` extends `sdk_run_signal_safe` to wrap
  shell-function bodies, not just single-binary launches — needed for
  the heartbeat/eventually_alive/finally_alive stubs whose work is a
  printf|timeout 1 socat|jq pipeline rather than one binary.
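
A minimal sketch of a locked append, assuming util-linux flock is available. The helper name demo_emit, the FD number 9 and the temp file are illustrative choices, not the real _sdk_emit. The loop at the end mirrors the 10 shells × 5 emits smoke test.

```shell
#!/usr/bin/env bash
# Hypothetical emit helper: append one JSON line while holding an exclusive
# lock on the already-open append file descriptor.
demo_emit() {
  local file=$1 line=$2
  {
    flock -x 9                  # block until this shell owns the lock on FD 9
    printf '%s\n' "$line" >&9
  } 9>>"$file"                  # FD 9 is closed here, which releases the lock
}

out=$(mktemp)

# 10 concurrent subshells, 5 emits each.
for shell_id in 1 2 3 4 5 6 7 8 9 10; do
  (
    for n in 1 2 3 4 5; do
      demo_emit "$out" "{\"shell\":$shell_id,\"emit\":$n}"
    done
  ) &
done
wait

# Count total lines and lines that do not look like one intact record.
line_count=$(( $(wc -l < "$out") ))
bad_lines=$(grep -cv '^{"shell":[0-9]*,"emit":[0-9]}$' "$out")
rm -f "$out"

echo "lines=$line_count malformed=$bad_lines"
```
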

Per-stub:

- parallel_driver_heartbeat.sh, eventually_alive.sh, finally_alive.sh:
  body extracted into a local `_xxx_body` function, run through the
  new fn-wrapper. Variable names lower-cased to local style.
- anytime_asteria_admin_singleton.sh, finally_asteria_consistency.sh,
  parallel_driver_asteria_player.sh, serial_driver_asteria_bootstrap.sh:
  add the signal trap as defense-in-depth around the existing
  sdk_run_signal_safe binary wrap.
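
The body-function pattern might be sketched as follows. demo_run_signal_safe_fn and the two body functions are hypothetical; the use of a subshell with pipefail is an assumption of this sketch, and `sleep` stands in for the socat call so the example has no external dependencies.

```shell
#!/usr/bin/env bash
# Hypothetical function wrapper, loosely modelled on the description of
# sdk_run_signal_safe_fn above; the real helper may differ.
demo_run_signal_safe_fn() {
  local fn=$1 rc=0
  shift
  # Run the body in a subshell with pipefail so a timeout in the middle of a
  # pipeline is visible in the status without affecting the caller's options.
  ( set -o pipefail; "$fn" "$@" ) || rc=$?
  case "$rc" in
    124|129|137|143|255) return 0 ;;   # timeout and signal exits are absorbed
    *) return "$rc" ;;                 # anything else is propagated
  esac
}

# Body in the "_xxx_body" style: a pipeline rather than a single binary.
# The middle stage hits its 1 s deadline, so the pipeline status is 124.
_probe_body() {
  printf 'ping\n' | timeout 1 sleep 3 | cat
}

# A body that fails for a non-signal reason.
_fail_body() {
  false | cat
}

demo_run_signal_safe_fn _probe_body
rc_probe=$?
demo_run_signal_safe_fn _fail_body
rc_fail=$?

echo "probe=$rc_probe fail=$rc_fail"
```
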

Smoke-tested standalone: 10 concurrent shells × 5 emits each produces
exactly 50 valid JSON lines (no race / loss); SIGTERM mid-sleep yields
exit 0 with the trap-emitted observation; a `timeout` exit status of 124
via the fn-wrapper yields exit 0 with `must_hit:false`.

Real verification is the next Antithesis run on this branch.