01 Issue
Feature · Shipped · Swamp CLI
Assignees: stack72

#254 Cross-process concurrency stress for W2 lifecycle services

Opened by stack72 · 5/5/2026 · Shipped 5/6/2026

Context

W2 (swamp-club#231 / PR systeminit/swamp#1310) introduced `InstallExtensionService`, `RemoveExtensionService`, and `UpgradeExtensionService`, the new owners of the catalog-write surface. swamp-club#234 added a 50-process concurrent stress test for the data-delete path. The new lifecycle services have no equivalent stress test.

Architecturally the lifecycle services are concurrency-safe: `saveAll` is one SQLite transaction with WAL on, lockfile mutations use the existing advisory-lock retry path, and per-extension atomicity is the contract. But "architecturally safe" ≠ "verified under stress".
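To make the per-extension atomicity contract concrete, here is a minimal in-memory sketch of an all-or-nothing write. It is only a model: `InMemoryCatalog`, `CatalogRow`, and the `failAt` fault-injection parameter are hypothetical names invented for illustration, and the real `saveAll` is backed by a SQLite transaction whose signature is not shown in this issue.

```typescript
// Hypothetical in-memory model of an all-or-nothing catalog write.
// None of these names or shapes come from the swamp codebase.
type CatalogRow = { extension: string; type: string };

class InMemoryCatalog {
  private rows: CatalogRow[] = [];

  // Replace every row for one extension, or change nothing at all.
  // `failAt` injects a fault while writing the row at that index.
  saveAll(extension: string, next: CatalogRow[], failAt = -1): void {
    // Stage the change on a copy so a failure leaves `this.rows` untouched.
    const staged = this.rows.filter((r) => r.extension !== extension);
    next.forEach((row, i) => {
      if (i === failAt) throw new Error(`injected fault at row ${i}`);
      staged.push(row);
    });
    // Single commit point: readers see the old rows or the new rows, never a mix.
    this.rows = staged;
  }

  // Defensive copy so callers cannot mutate the committed state.
  snapshot(): CatalogRow[] {
    return this.rows.map((r) => ({ ...r }));
  }
}
```

A single-process model like this says nothing about cross-process interleavings, which is the gap the stress test is meant to cover.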

Proposed test

`integration/lifecycle_concurrent_stress_test.ts` (new file), mirroring swamp-club#234's pattern at `integration/data_delete_test.ts`:

  • Spawn N (e.g. 50) concurrent `swamp` processes against the SAME repo, each running one of: `extension pull `, `extension rm `, `extension pull `, `extension update`.
  • Assert that after all processes finish:
    • The catalog has no orphan rows (every row has a matching lockfile entry, every lockfile entry has matching rows).
    • No `DuplicateTypeError` surfaced for benign cases (e.g. two processes both trying to install the SAME extension).
    • The lockfile is well-formed JSON.
    • The on-disk `.swamp/pulled-extensions/` tree is consistent with the lockfile (no orphans, no dangling references).
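The post-run assertions above could be expressed as a single pure check over a snapshot of the three stores. This is a sketch under stated assumptions: the `Snapshot` shape, the function names, and in particular the lockfile layout (a JSON object with an `extensions` map keyed by extension name) are hypothetical, since the real schema is not shown in this issue.

```typescript
// Hypothetical snapshot of repo state after all processes exit.
type Snapshot = {
  catalogExtensions: string[]; // owning extension name of each catalog row
  lockfileText: string; // raw lockfile contents
  pulledDirs: string[]; // directory names under .swamp/pulled-extensions/
};

// Assumption: the lockfile is a JSON object with an "extensions" map keyed by name.
function parseLockedNames(text: string): Set<string> | null {
  try {
    const parsed = JSON.parse(text);
    const exts = parsed?.extensions;
    if (typeof exts !== "object" || exts === null || Array.isArray(exts)) {
      return null;
    }
    return new Set(Object.keys(exts));
  } catch {
    return null; // not well-formed JSON
  }
}

// Returns a list of human-readable problems; an empty list means consistent.
function findInconsistencies(s: Snapshot): string[] {
  const locked = parseLockedNames(s.lockfileText);
  if (locked === null) return ["lockfile is not well-formed"];

  const catalog = new Set(s.catalogExtensions);
  const onDisk = new Set(s.pulledDirs);
  const problems: string[] = [];

  catalog.forEach((name) => {
    if (!locked.has(name)) problems.push(`catalog orphan: ${name}`);
  });
  locked.forEach((name) => {
    if (!catalog.has(name)) problems.push(`lockfile entry without catalog rows: ${name}`);
    if (!onDisk.has(name)) problems.push(`dangling lockfile reference: ${name}`);
  });
  onDisk.forEach((name) => {
    if (!locked.has(name)) problems.push(`on-disk orphan: ${name}`);
  });
  return problems;
}
```

Returning every problem rather than failing on the first one makes a flaky run easier to diagnose, because a single race often leaves more than one store out of step.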

The test should run for several iterations to surface race conditions.
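One way to make repeated iterations useful is to derive each iteration's process mix from a seed, so that a failing iteration can be replayed exactly. The sketch below is an assumption about how that might look, not the approach taken in swamp-club#234: `planIteration` and its parameters are hypothetical, and passing an extension name as the argument to `extension pull` and `extension rm` is a guess at the argument form.

```typescript
// Small deterministic PRNG in the style of mulberry32, returning values in [0, 1).
function seededRandom(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// The three subcommands named in the proposal.
const OPS = ["pull", "rm", "update"] as const;

// Build one argv per process for a single iteration.
// Assumption: pull and rm take an extension name; update takes no argument.
function planIteration(
  seed: number,
  processes: number,
  extensions: string[],
): string[][] {
  if (extensions.length === 0) {
    throw new Error("need at least one extension name");
  }
  const rand = seededRandom(seed);
  const plan: string[][] = [];
  for (let i = 0; i < processes; i++) {
    // Draw both values every time so the random stream stays aligned across ops.
    const op = OPS[Math.floor(rand() * OPS.length)];
    const ext = extensions[Math.floor(rand() * extensions.length)];
    plan.push(op === "update" ? ["extension", "update"] : ["extension", op, ext]);
  }
  return plan;
}
```

A driver would then loop over seeds, spawn one `swamp` process per argv in the plan against the same repo, wait for all of them to exit, and log the seed alongside any failed assertion.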

Acceptance

  • New stress test mirrors the swamp-club#234 pattern.
  • Test passes consistently across 50+ runs locally and in CI.
  • Documented in `design/extension.md` under the "Crash-state recovery" section as a load-bearing concurrency claim.

Why post-merge for W2

W2 ships the lifecycle services without this stress test because:

  1. The architecture-agent W2 review explicitly tagged this as soak-territory, not a merge gate.
  2. The unit-level `FaultingStubRepository` tests pin the SQLite txn rollback semantics — concurrency is downstream of those.
  3. No production user has reported concurrency issues on the existing pull/rm flow, which is what the new services replace.

That said, this issue is the explicit follow-up that tracks the gap. It is worth landing within the W2 release cycle.

02 Bog Flow

Shipped

5/6/2026, 3:24:43 PM


03 Sludge Pulse
stack72 assigned stack72 · 5/5/2026, 8:47:27 PM
