Skip to content

fix: Stop pre-creating outputs/ and type/ directories in model index#202

Merged
github-actions[bot] merged 1 commit intomainfrom
lazy-model-dirs
Feb 6, 2026
Merged

fix: Stop pre-creating outputs/ and type/ directories in model index#202
github-actions[bot] merged 1 commit intomainfrom
lazy-model-dirs

Conversation

@adamhjk
Copy link
Copy Markdown
Contributor

@adamhjk adamhjk commented Feb 6, 2026

Summary

  • Stop eagerly creating empty type/ and outputs/ directories when indexing models
  • type/ is now created on-demand by ensureDir in indexTagBasedData when there's actual tag-based data to symlink
  • outputs/ is now only created when method output directories exist to symlink

Test Plan

  • All 1517 tests pass (deno run test)
  • Removed integration test assertions for empty type/ directory existence (definitions with no data no longer create it)
  • deno check, deno lint, deno fmt all pass

Only create type/ and outputs/ directories when there is actual content
to put in them, rather than eagerly creating empty directories during
model indexing.
Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review Summary

This PR looks good. The changes correctly implement lazy creation of type/ and outputs/ directories - they are now only created when there's actual data to symlink, rather than being eagerly created for every model.

No blocking issues found.

Review Details

Code Quality ✅

  • TypeScript strict mode compliant - no any types
  • Uses named exports as required
  • Clean, focused changes with clear comments explaining the lazy creation behavior

Domain-Driven Design ✅

  • The SymlinkRepoIndexService is appropriately placed in the infrastructure layer
  • Changes maintain the existing structure and only affect the timing of directory creation
  • No new domain concepts introduced - this is purely an implementation detail

Test Coverage ✅

  • Integration tests updated appropriately - removed assertions for empty type/ directories
  • The test file name and description at line 708 (indexes definitions with type/ directory) is slightly misleading now since the test no longer checks for type/ directory existence, but this is a minor documentation concern
  • PR description confirms all 1517 tests pass

Security ✅

  • No security concerns - this is purely filesystem optimization
  • No new inputs or external data handling

Logic Review ✅

  • indexTagBasedData (lines 428-534) correctly calls ensureDir(tagDir) at line 474 only when dataItems.length > 0, ensuring type/ subdirectories are created on-demand
  • outputs/ directory (lines 412-416) is now created inside the if (await this.exists(methodsDir)) block, which is correct
  • The typeDir variable is now scoped inside the if (this.unifiedDataRepo) block (lines 400-403), which is cleaner

Suggestion (non-blocking)

Consider renaming the test "Integration: repo index rebuild indexes definitions with type/ directory" to better reflect what it now tests (e.g., "Integration: repo index rebuild indexes definitions correctly"), since the type/ directory assertions were removed.


LGTM 👍

@github-actions github-actions Bot merged commit 0570652 into main Feb 6, 2026
4 checks passed
@github-actions github-actions Bot deleted the lazy-model-dirs branch February 6, 2026 20:18
dhansak79 pushed a commit to dhansak79/swamp that referenced this pull request May 7, 2026
…ysteminit#1275)

## Summary

`swamp extension install`, `pull --force`, and `update` now remove
source files, bundle files, and skill directories declared in a prior
version's lockfile entry but absent from the current version. Closes
systeminit#202 and is the root-cause fix for the apparent catalog issue in systeminit#201 —
orphan source files were being re-indexed by the catalog rebuild,
producing duplicate `bundle_types` rows that resolved
non-deterministically.

### How

Single fix point in `installExtension()`
(`src/libswamp/extensions/pull.ts`) — all three commands funnel through
it:

1. **Snapshot** the prior lockfile entry's `files[]` BEFORE extraction.
2. After extraction completes, **diff** the snapshot against the new
`extractedFiles[]` via the new exported `computeOrphanDiff` helper.
3. **Prune** the diff via `pruneOrphanFiles` (recursive remove +
empty-parent cleanup, tolerates `NotFound`, returns the actually-removed
list — ground truth).
4. **Then** write the new lockfile entry. The prune-then-write ordering
means a kill mid-prune leaves the lockfile pointing at the OLD version,
so the next install retries the diff.

### User transparency

A new `orphans-pruned` event flows through every command stream
(pull/install/update) carrying the **actually removed** paths. Both log
and json modes render it. Users always see what was removed:

```
Removed 2 file(s) no longer in @hivemq/harvester/kubeconfig@2026.05.01.39:
  .swamp/pulled-extensions/@hivemq/harvester/kubeconfig/models/harvester/fetch_kubeconfig.ts
  .swamp/bundles/738c72f8/harvester/fetch_kubeconfig.js
```

### Doctor

`swamp doctor extensions` gains filesystem orphan detection across every
per-extension root the lockfile references — `pulled-extensions/<name>/`
and the bundle namespaces (`bundles/`, `vault-bundles/`,
`driver-bundles/`, `datastore-bundles/`, `report-bundles/`). A new
`extractTopLevelRoot` helper in `layout.ts` derives the roots from each
tracked path. Skills are tracked as directory paths only and skipped
from the inner-file walk; legacy paths (gen-1/gen-2) are skipped because
the extensionInstall migration handles them.

**Important:** orphans render as **warnings**, NOT failures.
`overallStatus` is unchanged by orphan presence so existing CI gates
that key on `overallStatus === \"fail\"` keep working through the
transition. A future release may promote orphans to failure once routine
install/pull/update calls have drained pre-existing dirt.

### Type signature refinements

To make the prune list flow through every command path, two
test-seam-style types were tightened:

- `InstallExtensionFn` — `Promise<unknown>` → `Promise<InstallResult |
undefined>`.
- `ExtensionUpdateDeps.installExtension` — `Promise<void>` →
`Promise<InstallResult | undefined>`.
- `extensionInstall` (`install.ts:152`) now captures the `InstallResult`
instead of discarding it.
- The CLI wrapper at `extension_update.ts:127` now returns the awaited
result.

The TypeScript compiler caught every dependent change automatically.

### Behavior change

Behavior of `extension install/pull/update` changes from \"additive\"
(write new files, leave old in place) to \"sync to manifest\" (write new
+ remove orphan). If a user has hand-edited files inside
`.swamp/pulled-extensions/<name>/` or in skill directories managed by an
extension, those files will be removed on the next install/pull/update.
The new orphans-pruned event surfaces every removal so the change is
visible at every interaction surface.

### Bonus fix

`sweepLegacyPaths` had a latent same-family bug: `Deno.remove` without
`{ recursive: true }` would fail on a non-empty directory entry (e.g.
legacy skill dirs). Fixed.

## Test Plan

- [x] Unit tests for `pruneOrphanFiles` (file removal, recursive
directory removal for skill dirs, NotFound tolerance, empty-parent
cleanup, ground-truth return value).
- [x] Unit tests for `computeOrphanDiff` (empty inputs, identical sets,
the canonical issue-202 transition, all-files-dropped, order stability).
- [x] Unit tests for `extractTopLevelRoot` (per-extension scoped/flat
subtrees, every bundle kind, skill paths return null, gen-1/gen-2 return
null).
- [x] `extensionInstall` orphans-pruned event firing/non-firing tests
via the existing `InstallExtensionFn` seam.
- [x] `extensionUpdate` orphans-pruned event firing/non-firing tests via
the now-typed `installExtension` deps.
- [x] `doctorExtensions` orphan detection across per-extension subtrees
AND bundle namespaces, plus skill-dir-skipping and
missing-lockfile-as-no-op.
- [x] `doctorExtensions` orphans-do-NOT-flip-overallStatus test.
- [x] Doctor renderer warnings-section test (orphans render as warnings,
OVERALL: PASS preserved).
- [x] Doctor JSON renderer surfaces `orphanFiles` field.
- [x] Manual end-to-end verification at `/tmp/swamp-repro-issue-202`:
- Pull `@hivemq/harvester/kubeconfig@2026.04.21.2` → installs
`fetch_kubeconfig.ts` + bundle.
- Pull `@hivemq/harvester/kubeconfig@2026.05.01.39 --force` → emits the
orphans-pruned event listing both old paths; on disk only
`kubeconfig.ts` and `kubeconfig.js` remain.
  - `swamp doctor extensions` → 5 passed, 0 orphans, OVERALL: PASS.
- `_extension_catalog.db.bundle_types` has exactly **one** row for
harvester (the duplicate-row state from systeminit#201 is gone).
- [x] `deno check`, `deno lint`, `deno fmt`, `deno run test` (5116
passed), `deno run compile`.

## Follow-ups

Tracked separately, not in this PR:

- File an issue against `systeminit/swamp` for a process-level install
lock — concurrent installs can now race to remove files another process
just wrote (worse than today's duplicate-files race).
- File a UAT gap in `systeminit/swamp-uat` (label `cli`) for the
version-bump-drops-a-file scenario.
- File a docs issue in `systeminit/swamp-club` if the existing manual at
`content/manual/reference/` covers `extension install/pull/update`
semantics.

## Risks & non-mitigations

- **Hand-edited files in `pulled-extensions/<name>/` will disappear** on
the next install/pull/update. Mitigated by the always-on
`orphans-pruned` event in command output, but a fully-automated install
that swallows output won't see it. Documented as residual.
- **Verbose output for large prunes** (extension drops 100 files in a
refactor → 100 lines). Acceptable for now; address as follow-up if
feedback surfaces noise.
- **Concurrent install race** filed as separate follow-up issue (worse
failure mode than today's race, but rare and self-correcting).
- **Narrow window**: prune succeeds, lockfile-write fails → user
recovers via `pull --force`. Documented in code comment.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant