Labels: Bug · Open · Swamp CLI
Assignees: None

#187 swamp CLI cold-start latency (~5–7 s per invocation) compounds in workflows that shell out

Opened by john · 4/29/2026

Summary

Each invocation of swamp from a cold shell takes ~5–7 seconds before the actual command runs. Most of this is binary startup (Deno-compiled executable) + extension/repo discovery. Comparable CLIs (kubectl, helm) cold-start in under 100 ms.

Why it matters

Compounds in two common scenarios:

1. Agent-driven workflows. An LLM agent (claude, etc.) issuing 50–100 swamp commands per task burns 4–10 minutes purely on CLI cold-starts. We've measured this in benchmark runs at https://github.com/systeminit/swamp-benchmark — 30–35% of wall-clock time on the swamp variant is CLI startup, not actual work. The freeform/kubectl variant is 6× faster largely for this reason.

2. Workflows that shell out. I hit this directly today while building the bootstrap step for @john/k8s's debug workflow. The bootstrap shells out to swamp model create 8 times to set up per-namespace instances. Sequentially:

for kind in pod service deployment event configmap pvc secret netpol; do
  swamp model create "@john/$kind" "${ns}-${kind}" --global-arg "namespace=${ns}" --json
done

This took >60 seconds (8 × ~7 s startup, plus the actual work) and hit the workflow step's 60 000 ms default timeout. The workaround was to parallelise (background each iteration with & and then wait), which fits inside the timeout — but only because we have 8 cores; it won't scale.
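For reference, the parallel workaround looks roughly like the following. The swamp call is stubbed out here so the sketch runs standalone; the real bootstrap invokes the actual binary:

```shell
# Stub so this sketch runs anywhere — the real bootstrap calls the swamp binary.
swamp() { echo "created $4"; }

ns="demo"
for kind in pod service deployment event configmap pvc secret netpol; do
  swamp model create "@john/$kind" "${ns}-${kind}" --global-arg "namespace=${ns}" --json &
done
wait   # total wall time ≈ one cold start instead of eight
echo "bootstrap done"
```

This trades the 8× sequential startup cost for concurrent startups, which is why it only helps up to the core count.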

Reproduction

# Cold cache
sync && sudo purge   # macOS: drop FS cache
time swamp --version
# Typical: 4.5–6.5s real time (mostly CPU)

# Subsequent invocations are still ~3–4s because the binary doesn't
# share state across processes (no daemon).
time swamp --version
time swamp --version

vs kubectl version --client: ~50 ms cold, ~30 ms warm.
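For scripting the measurement (e.g. in CI), a plain shell timing helper also works; note that date +%s%N is GNU coreutils (macOS date lacks %N), and the || true keeps the sketch running even where swamp isn't on PATH:

```shell
# Millisecond timing without `time`, using nanosecond epoch stamps (GNU date).
t0=$(date +%s%N)
swamp --version >/dev/null 2>&1 || true
t1=$(date +%s%N)
echo "swamp --version: $(( (t1 - t0) / 1000000 )) ms"
```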

Where the time goes (from a quick instrumented run)

  • ~1.5s: Deno runtime spin-up (compiled binary unpack/init)
  • ~1.0s: extension source discovery (walking .swamp/, .swamp-sources.yaml, pulled-extensions/)
  • ~1.0s: model registry construction (loading extensions/models/**/*.ts bundles)
  • ~1.0–2.0s: command parser/cliffy init + telemetry context
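The breakdown above came from a quick instrumented run; a minimal sketch of that kind of phase timing follows. The function names in comments are illustrative, not the actual swamp internals:

```typescript
// Phase-timing sketch: record elapsed milliseconds between named marks.
const phases: Record<string, number> = {};
let last = performance.now();

function mark(phase: string): void {
  const now = performance.now();
  phases[phase] = Math.round((now - last) * 100) / 100;
  last = now;
}

// In swamp's startup path these marks would bracket the real steps:
mark("runtime-init");        // after the Deno runtime is up
// discoverExtensions();     // walk .swamp/, .swamp-sources.yaml, pulled-extensions/
mark("extension-discovery");
// buildModelRegistry();     // load extensions/models/**/*.ts bundles
mark("registry-build");
// initCliAndTelemetry();    // cliffy parser + telemetry context
mark("cli-init");

console.table(phases);
```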

Suggested mitigations (independent of each other)

  1. Persistent daemon — swamp serve already exists for the workflow API. Extend it (or add a new lightweight swamp daemon) to serve a Unix socket; CLI invocations proxy through the socket and complete in <100 ms when the daemon is up, falling back to a cold start when it isn't. For precedent: git and gh get by without a daemon, but tools with heavy startup use exactly this pattern (Gradle's daemon, Bazel's server, nix-daemon).

  2. Lazy registry loading — defer extension/model bundle parsing until a command actually needs it. swamp --version should not need to walk extensions/models/; today it does.

  3. Pre-compiled bundle cache — model TypeScript files get re-bundled per invocation. If the source hashes match the cache, skip re-bundle. Cache lives in .swamp/cache/.

  4. Drop telemetry init from the hot path — the post-success telemetry flush already happens after the command completes, but the context build (telemetryCtx in src/cli/mod.ts) happens early in startup. It could be deferred until the flush itself (the catch/finally).
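A minimal sketch of the client side of mitigation (1). The socket path and the line-delimited JSON protocol are assumptions made up for illustration — swamp serve's actual API would dictate the real shape:

```typescript
import { connect } from "node:net";

const SOCKET_PATH = ".swamp/daemon.sock"; // assumed location

// Try the daemon first; the promise rejects if no daemon is listening.
function runViaDaemon(args: string[]): Promise<string> {
  return new Promise((resolve, reject) => {
    const sock = connect(SOCKET_PATH);
    let out = "";
    sock.on("error", reject); // ENOENT/ECONNREFUSED → no daemon running
    sock.on("connect", () => sock.write(JSON.stringify({ args }) + "\n"));
    sock.on("data", (chunk) => (out += chunk.toString()));
    sock.on("end", () => resolve(out));
  });
}

// Entry point: fast path via the socket, today's cold start otherwise.
async function dispatch(args: string[]): Promise<"daemon" | "cold"> {
  try {
    await runViaDaemon(args);
    return "daemon";
  } catch {
    // coldStart(args) — the existing slow path
    return "cold";
  }
}
```

The fallback means the daemon is purely an optimisation: nothing breaks when it isn't running, invocations are just slow, as today.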

Empirical context

Benchmarked at https://github.com/systeminit/swamp-benchmark (k8s-debug-v2 challenge). Repro of the bootstrap timeout case: try running @john/k8s's @john/debug-namespace-deep workflow on a fresh repo with the bootstrap job enabled. v2026.04.29.1 hit the timeout sequentially; v2026.04.29.2 works around with parallelism but it's a band-aid.

I think (1) is the highest-leverage — it'd benefit every CLI invocation, not just the workflow-shells-out case.
