Autonomous AI development kit.
npm install -g sfk
# Then use anywhere
sfk # Uses PRD.md with your configured engine
sfk --claude # Use Claude Code
sfk --codex # Use Codex CLI
sfk --model sonnet # Override model

Ralph runs an AI coding assistant in a loop, feeding it tasks from a PRD (Product Requirements Document) and tracking progress across iterations. Each iteration:
- Reads PRD.md to find the first incomplete task
- Reads progress file from ~/.sfk/progress/ to learn from previous iterations
- Implements exactly ONE task
- Verifies test files were created/modified
- Runs tests to verify the implementation
- If tests pass: marks task complete, commits, and logs progress
- If tests fail: logs failure details for the next iteration
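
The first step of each iteration, finding the first incomplete task, can be illustrated with a small sketch. This is not SFK's actual parser: the `firstIncompleteTask` helper and the `PrdTask` shape are hypothetical, and the sketch assumes tasks use the `- [ ]` / `- [x]` checkbox format shown in this README.

```typescript
// Hypothetical helper for illustration; SFK's real PRD parsing may differ.
interface PrdTask {
  line: number; // 1-based line number inside PRD.md
  text: string; // task description without the checkbox
}

// Return the first unchecked "- [ ]" item, or null when no unchecked task remains.
function firstIncompleteTask(prd: string): PrdTask | null {
  const unchecked = /^\s*-\s\[ \]\s+(.+)$/;
  let lineNumber = 0;
  for (const line of prd.split(/\r?\n/)) {
    lineNumber += 1;
    const match = unchecked.exec(line);
    if (match) {
      return { line: lineNumber, text: (match[1] ?? "").trim() };
    }
  }
  return null;
}

const examplePrd = [
  "## Tasks",
  "- [x] Implement user authentication",
  "- [ ] Add database migrations",
  "- [ ] Set up API endpoints",
].join("\n");

// Logs the "Add database migrations" task, found on line 3.
console.log(firstIncompleteTask(examplePrd));
```

Returning `null` when nothing is unchecked gives the caller a natural signal to move on to the final test-suite check.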
1. Configure SFK:

       mkdir -p ~/.sfk
       cp config.example ~/.sfk/config
       $EDITOR ~/.sfk/config

2. Create a PRD.md in your project with tasks:

       ## Tasks
       - [ ] Implement user authentication
       - [ ] Add database migrations
       - [ ] Set up API endpoints

3. Run Ralph:

       sfk

Ralph will work through each task, running tests and committing progress automatically.
sfk # Uses PRD.md and your ~/.sfk/config settings
sfk --opencode # Explicit OpenCode
sfk --claude # Use Claude Code
sfk --codex # Use Codex CLI
sfk --model big-pickle # Override model
sfk --max-iterations 20 # Custom iteration limit
sfk --skip-commit # Don't auto-commit
sfk --no-tests # Skip test verification (not recommended)
sfk --prd tasks.md # Use different PRD file
sfk -v # Verbose output
sfk --help # Show all options

SFK uses a single INI config file under your home directory:
| Location | Purpose |
|---|---|
| ~/.sfk/config | Required SFK configuration, created as a commented example on install |

Settings are resolved in this order of precedence, highest first:
1. CLI arguments (--model, --engine, etc.)
2. Global config (~/.sfk/config)
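
Reading the two sources above as a precedence order, with CLI arguments overriding the global config, the merge could look like the sketch below. The `Settings` shape and `resolveSettings` function are hypothetical and cover only a few of the settings mentioned in this README.

```typescript
// Hypothetical settings shape; SFK's internal types are not documented here.
interface Settings {
  engine?: string;
  model?: string;
  maxIterations?: number;
}

// A CLI value wins; anything the CLI leaves undefined falls back to ~/.sfk/config.
function resolveSettings(cli: Settings, globalConfig: Settings): Settings {
  return {
    engine: cli.engine ?? globalConfig.engine,
    model: cli.model ?? globalConfig.model,
    maxIterations: cli.maxIterations ?? globalConfig.maxIterations,
  };
}

// Equivalent to running `sfk --model sonnet` against a config that sets all three values.
const resolved = resolveSettings(
  { model: "sonnet" },
  { engine: "claude", model: "opus", maxIterations: 10 },
);
console.log(resolved); // model comes from the CLI, the rest from the config
```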
Created automatically on npm install -g sfk as a commented example. SFK exits with a friendly setup message until you uncomment and set the required values. Excerpt:
[engine]
# type = opencode
[models]
# Only the selected engine model is required.
# claude = sonnet
# codex = gpt-5-codex
# opencode-primary = big-pickle
# effort = high # low|medium|high|xhigh
[ralph]
# max-iterations = 10
# sleep-seconds = 2
# skip-commit = false
[willie]
# max-iterations = 0
# push-after-fix = false

Edit ~/.sfk/config and uncomment the values you want to use. Only the model for the selected engine is required:
mkdir -p ~/.sfk
$EDITOR ~/.sfk/config

Minimal OpenCode example:
[engine]
type = opencode
[models]
claude = sonnet
codex = gpt-5-codex
opencode-primary = big-pickle
effort = high
[rate-limits]
soft-retries = 3
soft-wait = 30
[ralph]
max-iterations = 10
sleep-seconds = 2
skip-commit = false
push-after-commit = false
skip-test-verify = false
max-consecutive-failures = 3
audit-after-complete = false
[willie]
max-iterations = 0
push-after-fix = false

Each agent inherits the configured engine model unless you set a per-agent override:
[engine]
type = opencode
[models]
opencode-primary = opencode/glm-5-free
[ralph]
# inherits opencode/glm-5-free
effort = high # use xhigh for codex/opencode if desired
[willie]
# inherits opencode/glm-5-free
effort = high # Claude rejects xhigh

Or override the model for a specific agent:
[willie]
model = opus # willie uses opus instead

See config.example for all available settings with documentation.
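
The inheritance rule described above (an agent uses the engine model unless its own section sets `model`) can be sketched as a single lookup. The `ModelConfig` shape and `modelFor` function are hypothetical names for illustration, not SFK's internals.

```typescript
type Agent = "ralph" | "willie";

// Hypothetical in-memory view of the parsed config.
interface ModelConfig {
  engineModel: string; // e.g. the value of opencode-primary under [models]
  agentOverrides: Partial<Record<Agent, string>>; // optional `model` keys under [ralph] / [willie]
}

// Use the agent's own model when set, otherwise inherit the engine model.
function modelFor(agent: Agent, config: ModelConfig): string {
  return config.agentOverrides[agent] ?? config.engineModel;
}

const modelConfig: ModelConfig = {
  engineModel: "opencode/glm-5-free",
  agentOverrides: { willie: "opus" },
};

console.log(modelFor("ralph", modelConfig)); // inherits the engine model
console.log(modelFor("willie", modelConfig)); // uses the per-agent override
```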
Effort levels are configured once and translated per engine at runtime:
- Claude uses CLAUDE_CODE_EFFORT_LEVEL
- Codex uses model_reasoning_effort
- OpenCode uses --variant
Supported levels:
- Claude: low, medium, high
- Codex/OpenCode: low, medium, high, xhigh
If you choose an unsupported level for the selected engine, sfk exits with an error.
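
As a rough illustration of the validate-then-translate behavior described above, the sketch below checks a level against the engine's supported list and returns the engine-specific setting name. The function name, return shape, and error text are assumptions; how SFK actually passes the value to each CLI is not shown in this README.

```typescript
type Engine = "claude" | "codex" | "opencode";

// Supported levels per engine, as listed above.
const SUPPORTED_EFFORT: Record<Engine, readonly string[]> = {
  claude: ["low", "medium", "high"],
  codex: ["low", "medium", "high", "xhigh"],
  opencode: ["low", "medium", "high", "xhigh"],
};

// Engine-specific setting that receives the effort level, as listed above.
const EFFORT_TARGET: Record<Engine, string> = {
  claude: "CLAUDE_CODE_EFFORT_LEVEL",
  codex: "model_reasoning_effort",
  opencode: "--variant",
};

// Hypothetical helper: reject unsupported levels, otherwise name the target setting.
function translateEffort(engine: Engine, level: string): { target: string; value: string } {
  if (SUPPORTED_EFFORT[engine].indexOf(level) === -1) {
    // The wording of SFK's real error message is not documented; this text is illustrative.
    throw new Error(`effort "${level}" is not supported by ${engine}`);
  }
  return { target: EFFORT_TARGET[engine], value: level };
}

console.log(translateEffort("codex", "xhigh")); // target is model_reasoning_effort
```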
Ralph automatically detects your test command based on project files:
| Detected File | Test Command |
|---|---|
| package.json with test script | npm test / bun test / pnpm test / yarn test |
| vitest.config.ts | npx vitest run |
| jest.config.ts | npx jest |
| pytest.ini or pyproject.toml | pytest |
| go.mod | go test ./... |
| Cargo.toml | cargo test |
If auto-detection fails or you need a custom command, set test-cmd under [ralph] in ~/.sfk/config.
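
A minimal sketch of this kind of file-based detection follows. It is an illustration under assumptions: the rule order is a guess, the package.json row is omitted because it depends on reading the scripts and choosing a package manager, and `detectTestCommand` is a hypothetical name. A configured test-cmd is treated as taking priority over detection.

```typescript
// Ordered detection rules; the first match wins. The ordering is an assumption.
const TEST_RULES: ReadonlyArray<{ files: string[]; command: string }> = [
  { files: ["vitest.config.ts"], command: "npx vitest run" },
  { files: ["jest.config.ts"], command: "npx jest" },
  { files: ["pytest.ini", "pyproject.toml"], command: "pytest" },
  { files: ["go.mod"], command: "go test ./..." },
  { files: ["Cargo.toml"], command: "cargo test" },
];

// projectFiles: file names found in the project root.
// configuredCmd: the test-cmd value under [ralph], if any.
function detectTestCommand(projectFiles: string[], configuredCmd?: string): string | null {
  if (configuredCmd) return configuredCmd;
  for (const rule of TEST_RULES) {
    if (rule.files.some((file) => projectFiles.indexOf(file) !== -1)) {
      return rule.command;
    }
  }
  return null; // nothing detected; the user would need to set test-cmd
}

console.log(detectTestCommand(["go.mod", "README.md"])); // go test ./...
```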
| File | Description |
|---|---|
| PRD.md | Task list with checkbox format (required) |
| ~/.sfk/progress/progress-<project>.log | Centralized progress tracking across iterations |
| AGENTS.md | Reusable patterns for the codebase (optional) |
- One task per iteration - Ensures atomic, testable changes
- Enforced test writing - Verifies test files were actually created/modified
- Test-gated completion - Runs test suite after each iteration, blocks progress on failure
- Double verification - PRD.md check + final test suite before declaring complete
- Progress persistence - Learnings survive across iterations in ~/.sfk/progress/
- External logging - Per-run logs at ~/.sfk/logs/<project>/<agent>/<timestamp>.log
- Auto-commit - Commits changes automatically with descriptive messages
- Automatic fallback - Switches to fallback model on rate limits (OpenCode)
- Skip commits - Test PRDs without polluting git history
- Configurable - Per-agent model and effort settings
Task 1 -> AI implements -> tests written? -> NO -> retry iteration
-> YES -> run tests -> FAIL -> retry
-> PASS -> Task 2 -> ...
All [x] -> final test suite -> PASS -> done
-> FAIL -> keep iterating
The script independently verifies:
- Test files were created/modified (*.test.ts, *.spec.ts, etc.)
- Full test suite passes after each iteration
- Final test suite passes before declaring complete
This prevents the AI from marking tasks complete without actually writing tests.
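
The gate described above can be sketched as two small checks: did the iteration touch any test files, and did the suite pass? The file pattern below is an assumption extrapolated from the `*.test.ts` and `*.spec.ts` examples, and the function names are hypothetical.

```typescript
// Assumed pattern: covers *.test.ts, *.spec.ts and close JS/TS variants only.
const TEST_FILE_PATTERN = /\.(test|spec)\.[cm]?[jt]sx?$/;

// Keep only the changed files that look like test files.
function touchedTestFiles(changedFiles: string[]): string[] {
  return changedFiles.filter((file) => TEST_FILE_PATTERN.test(file));
}

type Outcome = "retry" | "next-task";

// Mirrors the flow above: no tests written means retry, failing tests mean retry.
function iterationOutcome(changedFiles: string[], testsPassed: boolean): Outcome {
  if (touchedTestFiles(changedFiles).length === 0) return "retry";
  return testsPassed ? "next-task" : "retry";
}

console.log(iterationOutcome(["src/auth.ts"], true)); // retry
console.log(iterationOutcome(["src/auth.ts", "src/auth.test.ts"], true)); // next-task
```

Running both checks outside the AI's control is what keeps a task from being marked complete on the assistant's word alone.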
| Engine | CLI | Model Setting |
|---|---|---|
| OpenCode | opencode | models.opencode-primary |
| Claude | claude | models.claude |
| Codex | codex | models.codex |
If a rate limit is detected and opencode-fallback is configured, Ralph automatically switches to the fallback model and retries.
| Code | Meaning |
|---|---|
| 0 | All tasks completed successfully |
| 1 | Max iterations reached or error occurred |
- Node.js 18+ or Bun
- OpenCode CLI (opencode command) - for OpenCode engine
- Claude CLI (claude command) - for Claude engine
- Codex CLI (codex command) - for Codex engine
# Clone and install
git clone https://github.com/dominicnunez/springfield.git
cd springfield/cli
bun install
# Run in dev mode
npx tsx src/index.ts --help
# Build binaries (requires Bun)
bun run build:all

Git hooks are installed by bun install in cli/.
- pre-commit runs bun run hook:pre-commit
- pre-push runs bun run hook:pre-push
Commit messages should use conventional commits. Bodies are optional; when
present, write them as plain prose, wrap longer explanations across multiple
lines, use ! for breaking changes, and avoid labels like Why:.
If you use Nix, enter the shell first with nix develop, then run cd cli && bun install.
Willie is Ralph's counterpart: Ralph builds, Willie audits. It runs a continuous loop of audit → validate → fix until the codebase is clean.
sfk audit # auto-detect source path, then audit that path
sfk audit cli/src # audit cli/src and its subpaths only
sfk audit --max-iterations 3 # limit to 3 iterations
sfk audit --step validate # start from validate step
sfk audit --step fix # start from fix step

Willie resolves the audit prompt in this order:
1. --audit-prompt <path> CLI flag
2. audit/prompt.md in the project root
3. ~/.sfk/audit-prompt.md (global)
4. Built-in default prompt (security, bugs, performance, code quality)
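
The lookup order above amounts to a first-match search. The sketch below takes the file-existence check as a parameter so it stays self-contained; `resolveAuditPrompt` and `PromptSource` are hypothetical names, and expansion of `~` to the home directory is left out.

```typescript
type PromptSource =
  | { kind: "file"; path: string }
  | { kind: "builtin" };

// cliPath: value of --audit-prompt, if given.
// fileExists: injected check so the sketch does not touch the real filesystem.
function resolveAuditPrompt(
  cliPath: string | undefined,
  fileExists: (path: string) => boolean,
): PromptSource {
  if (cliPath) return { kind: "file", path: cliPath };
  // Project-level prompt first, then the global one.
  const candidates = ["audit/prompt.md", "~/.sfk/audit-prompt.md"];
  for (const path of candidates) {
    if (fileExists(path)) return { kind: "file", path };
  }
  return { kind: "builtin" };
}

// Only the global prompt exists in this example, so it is selected.
console.log(resolveAuditPrompt(undefined, (p) => p === "~/.sfk/audit-prompt.md"));
```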
Each iteration runs three steps:
1. Audit: Opus scans the requested source path, or first identifies the source path when none is provided. It may selectively consult categorized mirrored files under audit/design/, audit/misreads/, and audit/risks/, and SFK filters any still-matching exceptions out of audit/report.md before validation.
2. Validate: Opus reads the actual code at each finding and removes only true false positives or correct-by-design findings from the report.
3. Fix: Opus applies proper long-term fixes, commits each semantic group, pushes only when [willie] push-after-fix = true, and uses categorized mirrored files only for genuine misreads, design decisions, or non-remediable risks.
The loop exits when an audit produces zero findings.
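
The control flow of the loop can be sketched with the three steps injected as callbacks. This is a simplified, synchronous illustration: `runAuditLoop` and `WillieSteps` are hypothetical names, and treating `max-iterations = 0` as "no limit" is an assumption based on the default shown in the config excerpt.

```typescript
interface WillieSteps {
  audit: () => number; // returns the number of findings in the report
  validate: () => void; // prunes false positives from the report
  fix: () => void; // applies and commits fixes
}

function runAuditLoop(
  steps: WillieSteps,
  maxIterations: number,
): { clean: boolean; iterations: number } {
  let iterations = 0;
  // Assumption: maxIterations === 0 means the loop is unbounded.
  while (maxIterations === 0 || iterations < maxIterations) {
    iterations += 1;
    if (steps.audit() === 0) return { clean: true, iterations };
    steps.validate();
    steps.fix();
  }
  return { clean: false, iterations };
}

// Simulated run: three findings, then one, then a clean audit.
const findingsPerAudit = [3, 1, 0];
let auditCount = 0;
const result = runAuditLoop(
  {
    audit: () => findingsPerAudit[auditCount++] ?? 0,
    validate: () => {},
    fix: () => {},
  },
  0,
);
console.log(result); // clean after the third audit
```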
All audit artifacts live in the audit/ directory:
| File | Description |
|---|---|
| audit/prompt.md | Custom audit instructions (optional) |
| audit/report.md | Generated findings |
| audit/misreads/ | False positives, stored in mirrored source paths such as audit/misreads/src/auth.md for src/auth.ts |
| audit/design/ | Correct-by-design tradeoffs, stored in mirrored source paths such as audit/design/src/auth.md for src/auth.ts |
| audit/risks/ | Accepted risks and non-remediable constraints, stored in mirrored source paths such as audit/risks/src/auth.md for src/auth.ts |
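
The mirrored-path convention in the table above can be expressed as a small mapping function. This sketch infers the rule from the single `src/auth.ts` example (replace the final extension with `.md`); `mirroredExceptionPath` is a hypothetical name, and how SFK handles files with multiple dots or no extension is not documented here.

```typescript
type ExceptionCategory = "misreads" | "design" | "risks";

// Map a source file to its mirrored exception file under audit/<category>/.
function mirroredExceptionPath(category: ExceptionCategory, sourcePath: string): string {
  // Drop the final extension, leaving dotfiles such as ".env" untouched.
  const withoutExtension = sourcePath.replace(/([^/])\.[^./]+$/, "$1");
  return `audit/${category}/${withoutExtension}.md`;
}

console.log(mirroredExceptionPath("misreads", "src/auth.ts")); // audit/misreads/src/auth.md
console.log(mirroredExceptionPath("risks", "src/auth.ts")); // audit/risks/src/auth.md
```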
Exception entries identify only the source line because the category and source file are encoded in the mirrored file path. If one issue spans multiple files, add an entry to each affected file's mirrored category file:
### Plain language description
**Line:** `42` — optional context
**Reason:** Explanation (can be multiple lines)

If you're upgrading from an older version, create the new config at ~/.sfk/config using the INI format shown above. See config.example for the full reference.