01 Issue
Bug · Shipped · Swamp CLI
Assignees: stack72

#296 swamp extension push deadlocks when invoked from inside a swamp workflow step

Opened by bixu · 5/8/2026 · Shipped 5/8/2026

Description

swamp extension push (and any structural command that calls requireInitializedRepo) hangs indefinitely when invoked as a subprocess from inside a swamp workflow run step. The child process polls forever in waitForPerModelLocks, waiting on per-model locks that are held by its own ancestor: the parent swamp workflow run.

This makes it impossible to use a swamp workflow to orchestrate publishing, which was previously a working pattern.

Steps to reproduce

  1. Define a swamp workflow with a step that runs a command/shell model.
  2. In that step's shell, invoke swamp extension push <manifest> as a subprocess.
  3. Run the workflow with swamp workflow run <name>.
  4. The child push gets to its own requireInitializedRepo call, then logs:
    [INF] datastore·lock: Waiting for N per-model lock(s) to be released
    and never proceeds. N is the number of model references in the workflow.

Real-world example, from the hivemq/swamp-extensions publish workflow:

  • Run 25510761754: killed at the GHA 6h max.
  • Run 25545885311: killed at our self-imposed 5m timeout, with a verbose trace log showing the lock-wait message.

Root cause

Reading the swamp source at 20260506.233640.0-sha.5729ac50:

  1. swamp workflow run extracts model references from the workflow YAML and acquires per-model locks for all of them via acquireModelLocks(...) at cli/commands/workflow_run.ts:195. Locks are held for the entire run.
  2. The child swamp extension push calls requireInitializedRepo at cli/commands/extension_push.ts:195.
  3. requireInitializedRepo calls waitForPerModelLocks twice, at cli/repo_context.ts:434 and :452.
  4. waitForPerModelLocks polls every 1s for non-stale lock files (repo_context.ts:704-714) and never breaks while any are held. It cannot distinguish locks held by the parent process from locks held by an unrelated writer. Because the parent keeps its locks until the child step exits, and the child will not proceed until those locks are released, neither process can make progress.
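
The cycle described above can be illustrated with a small in-memory model. This is only a sketch of the reported behaviour, not swamp's code: ModelLock, pollUntilReleased, the model names, and pid 100 are hypothetical, and the maxPolls bound exists only so the demo terminates (the loop in the report has no such bound).

```typescript
// Hypothetical in-memory stand-in for per-model locks; swamp's real locks are files.
interface ModelLock {
  model: string;
  holderPid: number;
}

// Mimics the polling described in step 4: any held lock counts as blocking,
// no matter who holds it. maxPolls is a demo-only bound so this returns.
function pollUntilReleased(
  held: ModelLock[],
  maxPolls: number,
): { released: boolean; polls: number } {
  for (let polls = 1; polls <= maxPolls; polls++) {
    if (held.length === 0) return { released: true, polls };
    // The real loop would sleep about 1s here before checking again.
  }
  return { released: false, polls: maxPolls };
}

// The parent workflow run (hypothetical pid 100) locks every referenced model.
const parentPid = 100;
const heldLocks: ModelLock[] = ["model-a", "model-b"].map((model) => ({
  model,
  holderPid: parentPid,
}));

// The child push waits on the same locks. The parent releases them only after
// the child exits, so the set never shrinks while the child is polling.
const outcome = pollUntilReleased(heldLocks, 5);
console.log(outcome.released ? "locks released" : "still waiting after " + outcome.polls + " polls");
```

Nothing in the polling loop looks at holderPid, which is why the child cannot tell that it is waiting on its own ancestor.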

Environment

  • swamp version: 20260506.233640.0-sha.5729ac50
  • OS: Ubuntu (GitHub Actions runner, also reproduces locally on macOS)
  • The behaviour appeared between 2026-05-05 (last green publish run) and 2026-05-07 (first hang). Earlier swamp versions did not deadlock for this pattern.

Affected components

  • cli/commands/workflow_run.ts
  • cli/commands/extension_push.ts
  • cli/repo_context.ts (requireInitializedRepo, waitForPerModelLocks)

Impact

CI/CD pipelines that wrap swamp extension push (or any other structural command) in a swamp workflow are broken. The hang is silent until the runner's max-execution timeout, so depending on the platform you get either a 6h cancellation (default GHA) or whatever your team has configured.

Proposed fixes (any of)

  1. Parent-process annotation. Have swamp workflow run set an env var (e.g. SWAMP_PARENT_LOCK_HOLDER=<pid+lockfile-list>) before invoking child shells, and have waitForPerModelLocks skip locks owned by ancestors.
  2. Bounded wait. Give waitForPerModelLocks a configurable timeout (default several minutes) so deadlocks fail loudly instead of hanging indefinitely.
  3. Release-around-shell. Have swamp workflow run release per-model locks before invoking each step's command/shell body (which runs user code that may recurse into swamp), reacquire after.
  4. Opt-out flag on push. Add --skip-wait-for-locks (or similar) to swamp extension push for trusted CI contexts that know they are running inside a parent swamp.
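
A minimal sketch of how fixes 1 and 2 might combine, under stated assumptions: each lock records the PID of its holder, and the child already has a set of ancestor PIDs (for example parsed from the suggested SWAMP_PARENT_LOCK_HOLDER variable, whose format is not defined in this report). LockInfo, WaitOptions, blockingLocks, and waitForLocksBounded are hypothetical names, not swamp APIs.

```typescript
// Hypothetical lock record; assumes the holder's PID is stored with each lock.
interface LockInfo {
  model: string;
  holderPid: number;
}

interface WaitOptions {
  ancestorPids: ReadonlySet<number>; // fix 1: locks held by these PIDs are skipped
  timeoutMs: number; // fix 2: upper bound on the total wait
  pollMs: number; // delay between polls
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Locks that genuinely block this process: everything not held by an ancestor.
function blockingLocks(locks: LockInfo[], ancestorPids: ReadonlySet<number>): LockInfo[] {
  return locks.filter((lock) => !ancestorPids.has(lock.holderPid));
}

// Polls listLocks() until no blocking lock remains; throws once the timeout
// has elapsed so a deadlock fails loudly instead of hanging.
async function waitForLocksBounded(
  listLocks: () => LockInfo[],
  opts: WaitOptions,
): Promise<void> {
  const deadline = Date.now() + opts.timeoutMs;
  while (true) {
    const blocking = blockingLocks(listLocks(), opts.ancestorPids);
    if (blocking.length === 0) return;
    if (Date.now() >= deadline) {
      const names = blocking.map((l) => `${l.model} (pid ${l.holderPid})`).join(", ");
      throw new Error(`Timed out after ${opts.timeoutMs}ms waiting for locks: ${names}`);
    }
    await sleep(opts.pollMs);
  }
}

// Both locks are held by the parent workflow run (hypothetical pid 100).
const locks: LockInfo[] = [
  { model: "model-a", holderPid: 100 },
  { model: "model-b", holderPid: 100 },
];
console.log(blockingLocks(locks, new Set([100])).length); // prints 0
```

With the parent's PID in ancestorPids the child proceeds immediately; with an empty set the same call would reject after timeoutMs and name the locks it was waiting on. Skipping ancestor locks assumes the parent does not write to those models while the child runs, which is a trade-off this sketch does not address.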

Workaround

Move the publish loop out of the swamp workflow into plain CI bash. See hivemq/swamp-extensions PR #68: the GHA workflow now does auth-write, swamp auth whoami, change-detection, and swamp extension push directly, with no swamp workflow run wrapping the push.

Related: feature request #295 (collective-scoped auth keys / OIDC) is independent but reduces the surface area for this kind of CI-side workaround pile-up.

02 Bog Flow
✓ OPEN · ✓ TRIAGED · ✓ IN PROGRESS · ✓ SHIPPED
+ 1 MORE · ASSIGNED · + 5 MORE · REVIEW · + 3 MORE · PR_MERGED · SHIPPED

Shipped

5/8/2026, 3:39:19 PM


03 Sludge Pulse
stack72 assigned stack72 · 5/8/2026, 12:39:31 PM
