# Ralph Analysis: Autonomous AI Agent Loop System
## Executive Summary
**Ralph** is an autonomous AI agent loop system designed to run AI coding tools (Amp or Claude Code) repeatedly until all Product Requirements Document (PRD) items are complete. Based on [Geoffrey Huntley's Ralph pattern](https://ghuntley.com/ralph/), it represents a paradigm for autonomous software development where each iteration spawns a fresh AI instance with clean context, relying on git history, a progress log, and a structured PRD JSON file for persistence between runs.
The core idea is to break work into small, independently completable stories, run AI agents in a loop, and let structured persistence artifacts carry context forward. This works around AI context limits by treating each iteration as a stateless worker that reads from and writes to well-defined artifacts.
---
## Architecture Overview
### High-Level Flow
```
┌──────────────────────────────────────────────────────────────────┐
│ SETUP PHASE │
├──────────────────────────────────────────────────────────────────┤
│ 1. User writes a PRD (markdown) │
│ 2. Convert PRD to prd.json (structured user stories) │
│ 3. Run ralph.sh (starts autonomous loop) │
└──────────────────────────────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────────────┐
│ EXECUTION LOOP │
├──────────────────────────────────────────────────────────────────┤
│ 4. AI picks highest priority story where passes: false │
│ 5. Implements the story (writes code, runs tests) │
│ 6. Commits changes (if tests pass) │
│ 7. Updates prd.json (sets passes: true) │
│ 8. Logs learnings to progress.txt │
│ 9. Updates AGENTS.md/CLAUDE.md with reusable patterns │
│ 10. Check: More stories? → Loop back to step 4 │
└──────────────────────────────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────────────┐
│ COMPLETION │
├──────────────────────────────────────────────────────────────────┤
│ Output: COMPLETE and exit │
└──────────────────────────────────────────────────────────────────┘
```
### Core Components
| Component | Purpose | Persistence |
|-----------|---------|-------------|
| `ralph.sh` | Bash loop that spawns fresh AI instances | N/A (orchestrator) |
| `prd.json` | Task list with status tracking | Git-tracked JSON |
| `progress.txt` | Append-only learnings log | Git-tracked text |
| `AGENTS.md` / `CLAUDE.md` | Reusable patterns for future iterations | Git-tracked markdown |
| `prompt.md` / `CLAUDE.md` | Instruction templates for Amp and Claude Code, respectively | Static config |
| Skills (`prd`, `ralph`) | PRD generation and conversion helpers | Static config |
---
## Key Features
### 1. **Stateless Iteration Model**
Each iteration spawns a completely fresh AI instance with no memory of previous work. Context is rebuilt from:
- Git history (what was committed)
- `progress.txt` (learnings and context)
- `prd.json` (which stories are done)
**Key insight**: This sidesteps the AI context window limit by treating each run as independent, with structured artifacts serving as the "memory."
### 2. **Structured Task Management (prd.json)**
```json
{
  "project": "MyApp",
  "branchName": "ralph/task-priority",
  "description": "Task Priority System - Add priority levels to tasks",
  "userStories": [
    {
      "id": "US-001",
      "title": "Add priority field to database",
      "description": "As a developer, I need to store task priority...",
      "acceptanceCriteria": [
        "Add priority column to tasks table",
        "Typecheck passes"
      ],
      "priority": 1,
      "passes": false,
      "notes": ""
    }
  ]
}
```
**Design decisions:**
- Priority-based ordering handles dependencies, provided priorities are assigned in dependency order (see Dependency-Ordered Execution)
- `passes: false/true` provides clear completion tracking
- Acceptance criteria are verifiable (not vague)
- Stories are sized to fit within one context window
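To make the selection rule concrete, here is a minimal Python sketch of how an iteration might pick its next story and record completion. It assumes that a lower `priority` number means higher priority (the example above uses `1`), which the source does not state explicitly; `next_story` and `mark_passed` are hypothetical helper names, not part of Ralph.

```python
from typing import Optional


def next_story(prd: dict) -> Optional[dict]:
    """Return the incomplete story with the lowest priority number, or None.

    Assumption: priority 1 is the most urgent story.
    """
    pending = [s for s in prd.get("userStories", []) if not s.get("passes", False)]
    if not pending:
        return None
    return min(pending, key=lambda s: s["priority"])


def mark_passed(prd: dict, story_id: str, notes: str = "") -> None:
    """Set passes=true on the story with the given id (mutates prd in place)."""
    for story in prd["userStories"]:
        if story["id"] == story_id:
            story["passes"] = True
            if notes:
                story["notes"] = notes
            return
    raise KeyError(f"unknown story id: {story_id}")
```

A caller would load `prd.json` with `json.load`, call `next_story`, and write the file back after `mark_passed`.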
### 3. **Progressive Learning System**
The dual-file learning system distinguishes between:
**`progress.txt`** - Append-only chronological log:
```
## [Date/Time] - [Story ID]
- What was implemented
- Files changed
- **Learnings for future iterations:**
  - Patterns discovered
  - Gotchas encountered
  - Useful context
---
```
**`AGENTS.md` / `CLAUDE.md`** - Consolidated reusable patterns:
```
## Codebase Patterns
- Use `sql` template for aggregations
- Always use `IF NOT EXISTS` for migrations
- Export types from actions.ts for UI components
```
**Key insight**: Chronological learnings for debugging, consolidated patterns for quick reference.
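The chronological log can be sketched as a formatter plus an append-only writer. This is an illustrative Python sketch, not Ralph's implementation: the function names, the timestamp format, and the "Files changed:" label are assumptions modeled on the template above.

```python
from datetime import datetime
from pathlib import Path
from typing import List, Optional


def format_entry(story_id: str, implemented: List[str], files_changed: List[str],
                 learnings: List[str], when: Optional[datetime] = None) -> str:
    """Render one progress.txt entry in the template's shape."""
    when = when or datetime.now()
    lines = [f"## {when:%Y-%m-%d %H:%M} - {story_id}"]
    lines += [f"- {item}" for item in implemented]
    lines.append("- Files changed: " + ", ".join(files_changed))
    lines.append("- **Learnings for future iterations:**")
    lines += [f"  - {item}" for item in learnings]
    lines.append("---")
    return "\n".join(lines) + "\n"


def append_entry(path: Path, entry: str) -> None:
    """Open in append mode so earlier entries are never overwritten."""
    with path.open("a", encoding="utf-8") as handle:
        handle.write(entry)
```

Opening the file in `"a"` mode is what enforces the append-only property at the code level; nothing here prevents another process from truncating the file.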
### 4. **Branch-Based Run Isolation**
- Each feature uses a dedicated branch (`ralph/feature-name`)
- When starting a new feature, previous runs are archived to `archive/YYYY-MM-DD-feature-name/`
- Clean separation between features prevents context pollution
### 5. **Quality Feedback Loops**
Ralph requires feedback loops to function:
- Typecheck catches type errors
- Tests verify behavior
- CI must stay green (broken code compounds)
Stories must include verifiable acceptance criteria like "Typecheck passes" and "Tests pass."
### 6. **Browser Verification for UI Stories**
Frontend stories include "Verify in browser using dev-browser skill" as an acceptance criterion, so UI changes are checked visually rather than only compiled.
### 7. **Stop Condition Protocol**
The loop terminates when all stories have `passes: true`. The AI outputs:
```
COMPLETE
```
This magic string is grep'd by `ralph.sh` to detect completion.
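A plain substring match is loose: it would also fire on output that merely contains the marker inside another word. The sketch below, in Python, shows the match itself and one possible safeguard, which is to cross-check `prd.json` before trusting the signal. The cross-check is an assumption for illustration and is not described as Ralph's behavior; the function names are hypothetical.

```python
COMPLETION_MARKER = "COMPLETE"  # the string shown above


def signals_completion(output: str) -> bool:
    """Substring match, similar in spirit to `grep -q`."""
    return COMPLETION_MARKER in output


def all_stories_pass(prd: dict) -> bool:
    """True only if there is at least one story and every story passes."""
    stories = prd.get("userStories", [])
    return bool(stories) and all(s.get("passes") is True for s in stories)


def run_is_complete(output: str, prd: dict) -> bool:
    """Hypothetical stricter check: marker present AND task list agrees."""
    return signals_completion(output) and all_stories_pass(prd)
```

Treating an empty story list as "not complete" is a design choice in this sketch; it avoids declaring success on a malformed or empty PRD.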
### 8. **Multi-Tool Support**
Ralph supports both Amp and Claude Code:
```bash
./ralph.sh --tool amp [max_iterations] # Default
./ralph.sh --tool claude [max_iterations]
```
Each tool has its own prompt template (`prompt.md` for Amp, `CLAUDE.md` for Claude Code).
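One way to model the tool switch is a small lookup table from tool name to command and prompt file. The Python sketch below uses only the flags this document mentions (`--dangerously-allow-all`, `--dangerously-skip-permissions`); feeding the prompt on stdin for Claude Code is an assumption, and the real invocation may need additional flags. `ToolConfig`, `resolve_tool`, and `run_tool` are hypothetical names.

```python
import subprocess
from dataclasses import dataclass
from pathlib import Path
from typing import Tuple


@dataclass(frozen=True)
class ToolConfig:
    argv: Tuple[str, ...]   # command line to launch the tool
    prompt_file: str        # template piped to the tool on stdin


TOOLS = {
    "amp": ToolConfig(("amp", "--dangerously-allow-all"), "prompt.md"),
    "claude": ToolConfig(("claude", "--dangerously-skip-permissions"), "CLAUDE.md"),
}


def resolve_tool(name: str = "amp") -> ToolConfig:
    """Look up a tool by name; Amp is the default, as in ralph.sh."""
    try:
        return TOOLS[name]
    except KeyError:
        raise ValueError(f"unsupported tool: {name!r}") from None


def run_tool(config: ToolConfig, script_dir: Path) -> str:
    """Run one fresh instance and return combined stdout/stderr (not called here)."""
    prompt = (script_dir / config.prompt_file).read_text(encoding="utf-8")
    result = subprocess.run(list(config.argv), input=prompt, text=True,
                            stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
                            check=False)
    return result.stdout
```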
### 9. **Skills System for PRD Workflow**
Two skills automate PRD creation:
**`prd` skill**: Generates structured PRDs with clarifying questions
- Asks 3-5 essential questions with lettered options (for quick "1A, 2C, 3B" responses)
- Creates markdown PRD with user stories, functional requirements, non-goals
**`ralph` skill**: Converts markdown PRDs to JSON
- Enforces story sizing (completable in one iteration)
- Orders by dependencies (schema → backend → UI)
- Adds standard criteria ("Typecheck passes", "Verify in browser")
---
## Notable Patterns and Design Decisions
### 1. **Single Story Per Iteration**
**Design**: Each AI run handles exactly ONE user story, never more.
**Rationale**:
- Ensures complete focus on a single task
- Prevents context exhaustion mid-feature
- Creates clean commit boundaries
- Simplifies failure recovery (retry a single story, not multiple)
### 2. **Append-Only Progress Log**
**Design**: `progress.txt` is append-only, never overwritten.
**Rationale**:
- Preserves full history for debugging
- Enables pattern discovery over time
- Prevents accidental loss of learnings
- Supports consolidation into AGENTS.md when patterns emerge
### 3. **Story Sizing Rules**
**Design**: Stories must be small enough for one context window.
**Right-sized examples:**
- Add a database column and migration
- Add a UI component to an existing page
- Update a server action with new logic
- Add a filter dropdown to a list
**Too big (must split):**
- "Build the entire dashboard"
- "Add authentication"
- "Refactor the API"
**Rule of thumb**: If you can't describe the change in 2-3 sentences, it's too big.
### 4. **Dependency-Ordered Execution**
**Design**: Stories execute in priority order, so earlier stories must not depend on later ones.
**Correct order:**
1. Schema/database changes (migrations)
2. Server actions / backend logic
3. UI components that use the backend
4. Dashboard/summary views that aggregate data
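The ordering above can be turned into priority numbers mechanically. The following Python sketch assumes each story carries a hypothetical `layer` tag (`prd.json` has no such field in the source) and assigns priorities by layer, keeping the original order within a layer.

```python
from typing import Dict, List

# Layer names are illustrative labels for the four steps listed above.
LAYER_ORDER = ["schema", "backend", "ui", "dashboard"]


def assign_priorities(stories: List[Dict]) -> List[Dict]:
    """Return copies of the stories with priority 1..n in dependency order."""
    rank = {layer: index for index, layer in enumerate(LAYER_ORDER)}
    # sorted() is stable, so stories in the same layer keep their input order.
    ordered = sorted(stories, key=lambda s: rank[s["layer"]])
    return [{**story, "priority": position}
            for position, story in enumerate(ordered, start=1)]
```

A layer tag only approximates real dependencies; two backend stories that depend on each other would still need manual ordering.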
### 5. **Commit Discipline**
**Design**: Only commit when tests pass, with structured messages.
```
feat: [Story ID] - [Story Title]
```
**Rationale**: Clean git history provides context recovery for future iterations.
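As a rough illustration of the gate, the Python sketch below runs a list of check commands and commits only if all of them exit with status 0. The check commands, the `git add -A` staging step, and the helper names are assumptions; in Ralph the AI agent performs these steps itself based on the prompt.

```python
import subprocess
from typing import Callable, Dict, List


def commit_message(story: Dict) -> str:
    """Fill the message template with the story's id and title."""
    return f"feat: {story['id']} - {story['title']}"


def commit_if_green(story: Dict, check_commands: List[List[str]],
                    run: Callable = subprocess.run) -> bool:
    """Commit only when every check command exits with status 0."""
    for command in check_commands:
        if run(command, check=False).returncode != 0:
            return False  # leave the working tree uncommitted
    run(["git", "add", "-A"], check=True)
    run(["git", "commit", "-m", commit_message(story)], check=True)
    return True
```

The `run` parameter exists so the function can be exercised without a real repository, for example with a stub that records the commands it receives.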
### 6. **Verifiable Acceptance Criteria**
**Design**: Every criterion must be testable, never vague.
**Good**: "Button shows confirmation dialog before deleting"
**Bad**: "Works correctly", "Good UX", "Handles edge cases"
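A simple lint can catch the most obvious violations before a run starts. This Python sketch flags the vague phrases listed above and checks for the standard "Typecheck passes" criterion; the phrase list and the `lint_story` helper are illustrative assumptions, and a phrase match cannot prove that a criterion is actually testable.

```python
from typing import Dict, List

# Taken from the "Bad" examples above; extend per project.
VAGUE_PHRASES = ("works correctly", "good ux", "handles edge cases")


def lint_story(story: Dict) -> List[str]:
    """Return a list of human-readable problems; empty means no findings."""
    problems = []
    criteria = story.get("acceptanceCriteria", [])
    for criterion in criteria:
        lowered = criterion.lower()
        if any(phrase in lowered for phrase in VAGUE_PHRASES):
            problems.append(f"vague criterion: {criterion!r}")
    if not any("typecheck passes" in c.lower() for c in criteria):
        problems.append("missing 'Typecheck passes' criterion")
    return problems
```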
### 7. **Archiving Previous Runs**
**Design**: When `branchName` changes, archive previous `prd.json` and `progress.txt` to `archive/YYYY-MM-DD-feature-name/`.
**Rationale**: Clean separation between features, preserves history for reference.
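The archiving step might look like the following Python sketch. It assumes the caller knows the new branch name before overwriting `prd.json`, and it derives the folder's feature name by stripping the `ralph/` prefix from the old branch; how `ralph.sh` actually detects the change is not covered here, and `archive_before_switch` is a hypothetical name.

```python
import json
import shutil
from datetime import date
from pathlib import Path
from typing import Optional


def archive_before_switch(run_dir: Path, new_branch: str,
                          today: Optional[date] = None) -> Optional[Path]:
    """Copy prd.json and progress.txt to archive/ if the branch is changing.

    Returns the archive folder, or None when nothing needed archiving.
    """
    prd_path = run_dir / "prd.json"
    if not prd_path.exists():
        return None
    old_branch = json.loads(prd_path.read_text(encoding="utf-8")).get("branchName")
    if not old_branch or old_branch == new_branch:
        return None
    feature = old_branch.split("/", 1)[-1]  # "ralph/task-priority" -> "task-priority"
    stamp = (today or date.today()).isoformat()  # YYYY-MM-DD
    dest = run_dir / "archive" / f"{stamp}-{feature}"
    dest.mkdir(parents=True, exist_ok=True)
    for name in ("prd.json", "progress.txt"):
        source = run_dir / name
        if source.exists():
            shutil.copy2(source, dest / name)
    return dest
```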
---
## Context Management Strategy
Context management is Ralph's most distinctive aspect.
### Between Runs (Persistence)
| Mechanism | What It Carries | Format |
|-----------|-----------------|--------|
| Git commits | Code changes, file structure | Versioned files |
| `prd.json` | Task completion status | Structured JSON |
| `progress.txt` | Learnings, gotchas, patterns | Structured text |
| `AGENTS.md` | Consolidated reusable patterns | Markdown |
### Within a Run (Instructions)
The AI receives:
1. Instructions from `prompt.md` or `CLAUDE.md`
2. The `prd.json` file content
3. The `progress.txt` file (especially Codebase Patterns section)
4. Access to read any file via AI tool capabilities
### Context Recovery Pattern
Each iteration:
1. Reads `progress.txt` Codebase Patterns section first (quick reference)
2. Reads `prd.json` to find next incomplete story
3. Checks git branch matches expected branch
4. Implements story
5. Appends learnings to `progress.txt`
6. Optionally consolidates patterns to AGENTS.md
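Step 1 of this pattern, reading the Codebase Patterns section first, can be sketched as a small text extractor. The Python below assumes the section is introduced by a `## Codebase Patterns` heading and ends at the next `## ` heading; that layout is inferred from the templates in this document, and `codebase_patterns` is a hypothetical helper.

```python
def codebase_patterns(progress_text: str) -> str:
    """Return the body of the '## Codebase Patterns' section, or '' if absent."""
    body = []
    inside = False
    for line in progress_text.splitlines():
        if line.startswith("## "):
            if inside:
                break  # next heading ends the section
            inside = line.strip() == "## Codebase Patterns"
            continue
        if inside and line.strip() != "---":  # skip entry separators
            body.append(line)
    return "\n".join(body).strip()
```

Reading only this section keeps the quick reference short even as the chronological entries below it grow.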
---
## Agent Orchestration Model
### Single-Agent Loop (Not Multi-Agent)
Ralph is NOT a multi-agent system. It's a single-agent loop where:
- One AI instance runs at a time
- Each instance is independent (no inter-agent communication)
- Coordination happens via file-based state (prd.json, progress.txt)
### Orchestration via Bash Script
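At its core, `ralph.sh` is a simple bash loop (simplified here, Amp variant shown):
```bash
for i in $(seq 1 "$MAX_ITERATIONS"); do
  # Pipe the prompt to a fresh Amp instance; `|| true` keeps the loop alive on a non-zero exit
  OUTPUT=$(cat prompt.md | amp --dangerously-allow-all 2>&1 | tee /dev/stderr) || true
  # Stop as soon as the agent prints the completion marker
  if echo "$OUTPUT" | grep -q "COMPLETE"; then
    echo "Ralph completed all tasks!"
    exit 0
  fi
  sleep 2  # brief pause before the next iteration
done
# The loop ended without the completion marker: report failure
echo "Ralph reached max iterations ($MAX_ITERATIONS)."
exit 1
```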
**Key points:**
- Uses `--dangerously-allow-all` (Amp) or `--dangerously-skip-permissions` (Claude) for autonomous operation
- Outputs are piped through `tee` for visibility
- Completion detected via grep for magic string
- 2-second sleep between iterations
---
## Error Handling and Recovery
### Implicit Error Handling
Ralph has minimal explicit error handling. Instead:
- If tests fail, the story isn't committed
- If the AI can't complete a story, it logs learnings and the next iteration retries
- If max iterations are reached, the script exits with an error
- Human intervention is expected for complex failures
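These implicit rules can be expressed as a bounded loop that swallows per-iteration failures and returns a non-zero status when the budget runs out. The Python sketch below is an analogue written for illustration, not a port of `ralph.sh`; `run_once` stands in for "spawn one fresh AI instance and return its output", and the injected `sleep` makes the pause testable.

```python
import time
from typing import Callable


def run_loop(run_once: Callable[[int], str], max_iterations: int = 10,
             pause_seconds: float = 2.0,
             sleep: Callable[[float], None] = time.sleep) -> int:
    """Return 0 when the completion marker appears, 1 when iterations run out."""
    for iteration in range(1, max_iterations + 1):
        try:
            output = run_once(iteration)
        except Exception as exc:
            # A crashed iteration is not fatal; the next one simply retries.
            output = f"iteration {iteration} failed: {exc}"
        if "COMPLETE" in output:
            return 0
        sleep(pause_seconds)
    return 1  # budget exhausted; a human is expected to step in
```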
### Recovery via Progress Log
Failed attempts are documented in `progress.txt`:
```
## [Date/Time] - [Story ID]
- Attempted to implement X
- Failed because Y
- **Learnings:**
  - Don't do Z
  - Instead try W
---
```
The next iteration reads these learnings and avoids the same mistakes.
---
## Configuration and Customization
### Per-Project Customization
After copying the prompt template to your project:
- Add project-specific quality check commands
- Include codebase conventions
- Add common gotchas for your stack
### Amp Auto-Handoff Configuration
For large stories that approach context limits:
```json
{
  "amp.experimental.autoHandoff": { "context": 90 }
}
```
This enables automatic handoff when context fills up.
### Iteration Limits
```bash
./ralph.sh [max_iterations] # Default: 10
```
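For a reimplementation outside bash, the same command-line shape (an optional `--tool` flag plus an optional positional iteration count defaulting to 10) could be parsed as in this Python sketch. It mirrors the usage lines shown in this document and is not how `ralph.sh` itself parses arguments.

```python
import argparse
from typing import List, Optional


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse `[--tool amp|claude] [max_iterations]` with the documented defaults."""
    parser = argparse.ArgumentParser(prog="ralph")
    parser.add_argument("--tool", choices=("amp", "claude"), default="amp",
                        help="AI coding tool to run each iteration")
    parser.add_argument("max_iterations", nargs="?", type=int, default=10,
                        help="upper bound on loop iterations")
    return parser.parse_args(argv)
```

For example, `parse_args(["--tool", "claude", "25"])` yields `tool="claude"` and `max_iterations=25`, while `parse_args([])` falls back to Amp and 10 iterations.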
---
## Comparison to Typical Orchestration Approaches
| Aspect | Ralph | Typical Orchestration |
|--------|-------|----------------------|
| **Memory** | File-based (git, JSON, text) | In-memory state, databases |
| **Coordination** | Sequential loop | Often parallel/concurrent |
| **Agent Communication** | Via files | Direct messaging, queues |
| **Complexity** | Simple bash script (~100 LOC) | Often complex frameworks |
| **Failure Recovery** | Retry from last good state | Explicit retry logic, checkpoints |
| **Context Management** | Fresh context per iteration | Persistent context, context windows |
| **Task Decomposition** | Pre-planned user stories | Often dynamic planning |
| **Human Oversight** | Minimal during run | Often requires approval gates |
### Key Differentiators
1. **Simplicity**: Ralph is a bash script, not a framework
2. **Statelessness**: Each iteration is independent
3. **Git-Native**: Uses git as the primary state store
4. **AI-Tool Agnostic**: Works with both Amp and Claude Code
5. **Human-Readable Artifacts**: All state is in human-readable files
---
## Implications for Makima
### Features to Consider Adopting
1. **Structured PRD-to-JSON workflow** with skills
2. **Append-only progress logging** for context between runs
3. **Story sizing enforcement** (completable in one context window)
4. **Dependency-ordered task execution**
5. **Branch-based run isolation** with archiving
6. **Consolidated patterns file** (AGENTS.md equivalent)
7. **Magic string completion protocol** (`COMPLETE`)
8. **Verifiable acceptance criteria** enforcement
9. **Browser verification** for UI stories
### Optional Features (Flag-Controlled)
1. `--max-iterations` limit
2. `--auto-handoff` for context management
3. `--archive-previous` for run isolation
4. `--require-tests` for quality gates
5. `--single-story-per-run` mode
### Opinionated Features
1. Task decomposition must result in context-window-sized stories
2. Progress logs must be append-only
3. All commits must pass quality checks
4. Acceptance criteria must be verifiable
5. Dependencies must be ordered correctly
---
## Appendix: File Structure Reference
```
project/
├── scripts/ralph/
│   ├── ralph.sh          # Main loop script
│   ├── prompt.md         # Amp instructions
│   ├── CLAUDE.md         # Claude Code instructions
│   ├── prd.json          # Active task list
│   ├── progress.txt      # Append-only learnings
│   └── archive/          # Previous run archives
│       └── YYYY-MM-DD-feature-name/
│           ├── prd.json
│           └── progress.txt
├── skills/
│   ├── prd/
│   │   └── SKILL.md      # PRD generation skill
│   └── ralph/
│       └── SKILL.md      # PRD-to-JSON conversion skill
└── AGENTS.md             # Codebase-wide patterns
```
---
## References
- [Ralph GitHub Repository](https://github.com/snarktank/ralph)
- [Geoffrey Huntley's Ralph Article](https://ghuntley.com/ralph/)
- [Amp Documentation](https://ampcode.com/manual)
- [Claude Code Documentation](https://docs.anthropic.com/en/docs/claude-code)