#296 swamp extension push deadlocks when invoked from inside a swamp workflow step
Opened by bixu · 5/8/2026 · Shipped 5/8/2026
Description
swamp extension push (and any structural command that calls requireInitializedRepo) hangs indefinitely when invoked as a subprocess from inside a swamp workflow run step. The child process polls forever in waitForPerModelLocks, waiting on per-model locks that are held by its own ancestor: the parent swamp workflow run.
This makes it impossible to use a swamp workflow to orchestrate publishing, which was previously a working pattern.
Steps to reproduce
- Define a swamp workflow with a step that runs a command/shell model.
- In that step's shell, invoke swamp extension push <manifest> as a subprocess.
- Run the workflow with swamp workflow run <name>.
- The child push gets to its own requireInitializedRepo call, then logs:
  [INF] datastore·lock: Waiting for N per-model lock(s) to be released
  and never proceeds. N is the number of model references in the workflow.
Real-world example (hivemq/swamp-extensions publish workflow):
- Run 25510761754: killed at the GHA 6h max.
- Run 25545885311: killed at our self-imposed 5m timeout, with a verbose trace log showing the lock-wait message.
Root cause
Reading the swamp source at 20260506.233640.0-sha.5729ac50:
- swamp workflow run extracts model references from the workflow YAML and acquires per-model locks for all of them (cli/commands/workflow_run.ts:195, acquireModelLocks(...)). Locks are held for the entire run.
- The child swamp extension push calls requireInitializedRepo (cli/commands/extension_push.ts:195).
- requireInitializedRepo calls waitForPerModelLocks twice (cli/repo_context.ts:434 and :452).
- waitForPerModelLocks polls every 1s for non-stale lock files (repo_context.ts:704-714) and never breaks while any are held. It cannot distinguish locks held by the parent process from locks held by an unrelated writer.
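The circular wait described above can be illustrated with a small model. This is a simplified sketch, not the swamp source: the World type and the tick and childFinishesWithin functions are hypothetical names, and the model assumes only what the analysis states, namely that the parent holds its per-model locks until the step's child exits, and the child cannot exit until no locks are held.

```typescript
// Hypothetical model of the deadlock; none of these names exist in swamp.
interface World {
  locks: Set<string>;   // models whose lock is currently held by the parent run
  childExited: boolean; // whether the child `swamp extension push` has finished
}

// One simulated poll interval (1s in the real loop).
function tick(world: World): void {
  // Child: proceeds only when no per-model lock is held by anyone.
  if (world.locks.size === 0) {
    world.childExited = true;
  }
  // Parent: releases its locks only after the step's child process has exited.
  if (world.childExited) {
    world.locks.clear();
  }
}

// Returns true if the child finishes within maxTicks simulated polls.
function childFinishesWithin(models: string[], maxTicks: number): boolean {
  const world: World = { locks: new Set(models), childExited: false };
  for (let i = 0; i < maxTicks; i++) {
    tick(world);
    if (world.childExited) {
      return true;
    }
  }
  return false;
}
```

With at least one model reference in the workflow, childFinishesWithin returns false for any tick budget, because each side waits on the other. With no model references the child proceeds on the first poll, which fits the hang being tied to the N per-model locks in the log message.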
Environment
- swamp version: 20260506.233640.0-sha.5729ac50
- OS: Ubuntu (GitHub Actions runner, also reproduces locally on macOS)
- The behaviour appeared between 2026-05-05 (last green publish run) and 2026-05-07 (first hang). Earlier swamp versions did not deadlock for this pattern.
Affected components
- cli/commands/workflow_run.ts
- cli/commands/extension_push.ts
- cli/repo_context.ts (requireInitializedRepo, waitForPerModelLocks)
Impact
CI/CD pipelines that wrap swamp extension push (or any other structural command) in a swamp workflow are broken. The hang is silent until the runner's max-execution timeout, so depending on the platform you get either a 6h cancellation (default GHA) or whatever your team has configured.
Proposed fixes (any of)
- Parent-process annotation. Have swamp workflow run set an env var (e.g. SWAMP_PARENT_LOCK_HOLDER=<pid+lockfile-list>) before invoking child shells, and have waitForPerModelLocks skip locks owned by ancestors.
- Bounded wait. Give waitForPerModelLocks a configurable timeout (default several minutes) so deadlocks fail loudly instead of hanging indefinitely.
- Release-around-shell. Have swamp workflow run release per-model locks before invoking each step's command/shell body (which runs user code that may recurse into swamp), reacquire after.
- Opt-out flag on push. Add --skip-wait-for-locks (or similar) to swamp extension push for trusted CI contexts that know they are running inside a parent swamp.
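A rough sketch of how the first two options could combine into a single per-poll decision. Everything here is an assumption for illustration: the HeldLock shape, the parseAncestorPids and decideWait helpers, and the comma-separated PID format for the env var are hypothetical, and the real lock files may not record a holder PID at all.

```typescript
// Hypothetical sketch; these names and the env var format are not from swamp.
interface HeldLock {
  model: string;     // model the lock file protects
  holderPid: number; // PID recorded in the lock file (assumed to be available)
}

type WaitDecision = "proceed" | "wait" | "timeout";

// Assumed format: comma-separated ancestor PIDs, e.g. "4121,4102".
// Missing or malformed entries are ignored.
function parseAncestorPids(raw: string | undefined): Set<number> {
  const pids = new Set<number>();
  for (const part of (raw ?? "").split(",")) {
    const pid = Number.parseInt(part.trim(), 10);
    if (Number.isInteger(pid) && pid > 0) {
      pids.add(pid);
    }
  }
  return pids;
}

// Called once per poll: ignore locks held by ancestors, and give up
// with an explicit "timeout" once the configured budget is spent.
function decideWait(
  locks: HeldLock[],
  ancestorPids: Set<number>,
  waitedMs: number,
  timeoutMs: number,
): WaitDecision {
  const foreign = locks.filter((lock) => !ancestorPids.has(lock.holderPid));
  if (foreign.length === 0) {
    return "proceed";
  }
  return waitedMs >= timeoutMs ? "timeout" : "wait";
}
```

In this shape the polling loop would call decideWait on each iteration and exit non-zero with a clear message on "timeout". Skipping ancestor-held locks fixes the nested case, while the bounded wait still fails loudly for any deadlock the annotation does not cover.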
Workaround
Move the publish loop out of the swamp workflow into plain CI bash. See hivemq/swamp-extensions PR #68: the GHA workflow now does auth-write, swamp auth whoami, change-detection, and swamp extension push directly, with no swamp workflow run wrapping the push.
Related: feature request #295 (collective-scoped auth keys / OIDC) is independent but reduces the surface area for this kind of CI-side workaround pile-up.