# Ralph Analysis: Autonomous AI Agent Loop System

## Executive Summary

**Ralph** is an autonomous AI agent loop that runs an AI coding tool (Amp or Claude Code) repeatedly until every item in a Product Requirements Document (PRD) is complete. Based on [Geoffrey Huntley's Ralph pattern](https://ghuntley.com/ralph/), it spawns a fresh AI instance with clean context on each iteration and relies on git history, a progress log, and a structured PRD JSON file for persistence between runs.

The core idea is simple: break work into small, independently completable stories, run AI agents in a loop, and let structured persistence mechanisms carry context forward. This works around AI context limits by treating each iteration as a stateless worker that reads from and writes to well-defined artifacts.

---

## Architecture Overview

### High-Level Flow

```
┌──────────────────────────────────────────────────────────────────┐
│                         SETUP PHASE                              │
├──────────────────────────────────────────────────────────────────┤
│  1. User writes a PRD (markdown)                                 │
│  2. Convert PRD to prd.json (structured user stories)            │
│  3. Run ralph.sh (starts autonomous loop)                        │
└──────────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌──────────────────────────────────────────────────────────────────┐
│                         EXECUTION LOOP                           │
├──────────────────────────────────────────────────────────────────┤
│  4. AI picks highest priority story where passes: false          │
│  5. Implements the story (writes code, runs tests)               │
│  6. Commits changes (if tests pass)                              │
│  7. Updates prd.json (sets passes: true)                         │
│  8. Logs learnings to progress.txt                               │
│  9. Updates AGENTS.md/CLAUDE.md with reusable patterns           │
│ 10. Check: More stories? → Loop back to step 4                   │
└──────────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌──────────────────────────────────────────────────────────────────┐
│                         COMPLETION                               │
├──────────────────────────────────────────────────────────────────┤
│  Output: <promise>COMPLETE</promise> and exit                    │
└──────────────────────────────────────────────────────────────────┘
```

### Core Components

| Component | Purpose | Persistence |
|-----------|---------|-------------|
| `ralph.sh` | Bash loop that spawns fresh AI instances | N/A (orchestrator) |
| `prd.json` | Task list with status tracking | Git-tracked JSON |
| `progress.txt` | Append-only learnings log | Git-tracked text |
| `AGENTS.md` / `CLAUDE.md` | Reusable patterns for future iterations | Git-tracked markdown |
| `prompt.md` | Instructions template for Amp | Static config |
| Skills (`prd`, `ralph`) | PRD generation and conversion helpers | Static config |

---

## Key Features

### 1. **Stateless Iteration Model**

Each iteration spawns a completely fresh AI instance with no memory of previous work. Context is rebuilt from:
- Git history (what was committed)
- `progress.txt` (learnings and context)
- `prd.json` (which stories are done)

**Key insight**: This sidesteps the AI context window limit by treating each run as independent, with structured artifacts serving as the "memory."

### 2. **Structured Task Management (prd.json)**

```json
{
  "project": "MyApp",
  "branchName": "ralph/task-priority",
  "description": "Task Priority System - Add priority levels to tasks",
  "userStories": [
    {
      "id": "US-001",
      "title": "Add priority field to database",
      "description": "As a developer, I need to store task priority...",
      "acceptanceCriteria": [
        "Add priority column to tasks table",
        "Typecheck passes"
      ],
      "priority": 1,
      "passes": false,
      "notes": ""
    }
  ]
}
```

**Design decisions:**
- Priority-based ordering handles dependencies, provided priorities are assigned in dependency order
- `passes: false/true` provides clear completion tracking
- Acceptance criteria are verifiable (not vague)
- Stories are sized to fit within one context window
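
The fields above are enough to select and update stories from the shell. The following is a minimal sketch, not Ralph's own code: it assumes `jq` is installed and that a lower `priority` number means "run earlier" (as the `"priority": 1` example suggests). The function names are hypothetical. In Ralph itself the AI agent reads and edits `prd.json`; these helpers only illustrate the data model.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for querying and updating prd.json with jq.
# Assumption: a lower "priority" number means the story runs earlier.

# Print the id of the first incomplete story, or nothing if all stories pass.
next_story_id() {
  jq -r '[.userStories[] | select(.passes == false)]
         | sort_by(.priority)
         | (.[0].id // empty)' "$1"
}

# Print how many stories still have passes: false.
remaining_stories() {
  jq '[.userStories[] | select(.passes == false)] | length' "$1"
}

# Set passes: true for one story, rewriting the file via a temp file.
mark_story_passed() {
  local prd_file="$1" story_id="$2" tmp
  tmp=$(mktemp) || return 1
  if jq --arg id "$story_id" \
       '(.userStories[] | select(.id == $id) | .passes) = true' \
       "$prd_file" > "$tmp"; then
    mv "$tmp" "$prd_file"
  else
    rm -f "$tmp"
    return 1
  fi
}
```

For example, `next_story_id prd.json` prints the id of the story the next iteration would pick, and `mark_story_passed prd.json US-001` flips that story's flag. Writing through a temporary file keeps a failed `jq` run from truncating the task list.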

### 3. **Progressive Learning System**

The dual-file learning system distinguishes between:

**`progress.txt`** - Append-only chronological log:
```
## [Date/Time] - [Story ID]
- What was implemented
- Files changed
- **Learnings for future iterations:**
  - Patterns discovered
  - Gotchas encountered
  - Useful context
---
```

**`AGENTS.md` / `CLAUDE.md`** - Consolidated reusable patterns:
```
## Codebase Patterns
- Use `sql<number>` template for aggregations
- Always use `IF NOT EXISTS` for migrations
- Export types from actions.ts for UI components
```

**Key insight**: Chronological learnings for debugging, consolidated patterns for quick reference.

### 4. **Branch-Based Run Isolation**

- Each feature uses a dedicated branch (`ralph/feature-name`)
- When starting a new feature, previous runs are archived to `archive/YYYY-MM-DD-feature-name/`
- Clean separation between features prevents context pollution

### 5. **Quality Feedback Loops**

Ralph requires feedback loops to function:
- Typecheck catches type errors
- Tests verify behavior
- CI must stay green, because broken code compounds across iterations

Stories must include verifiable acceptance criteria like "Typecheck passes" and "Tests pass."

### 6. **Browser Verification for UI Stories**

Frontend stories include "Verify in browser using dev-browser skill" as an acceptance criterion. This checks that UI changes render as intended, not just that the code compiles.

### 7. **Stop Condition Protocol**

The loop terminates when all stories have `passes: true`. The AI outputs:
```
<promise>COMPLETE</promise>
```

`ralph.sh` greps each iteration's output for this marker string to detect completion.

### 8. **Multi-Tool Support**

Ralph supports both Amp and Claude Code:
```bash
./ralph.sh --tool amp [max_iterations]   # Default
./ralph.sh --tool claude [max_iterations]
```

Each tool has its own prompt template (`prompt.md` for Amp, `CLAUDE.md` for Claude Code).
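
The usage above implies a small amount of argument handling. A hedged sketch of how it could look follows; `parse_args` is a hypothetical name, and the only details taken from this document are the `--tool` flag, the two tool names, and the defaults (`amp`, 10 iterations). The stricter error handling is an assumption for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of argument handling for the usage shown above.
# Sets TOOL and MAX_ITERATIONS; the defaults match the documented ones.
parse_args() {
  TOOL="amp"
  MAX_ITERATIONS=10
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --tool)
        [ "$#" -ge 2 ] || { echo "error: --tool needs a value" >&2; return 2; }
        TOOL="$2"
        shift 2
        ;;
      --tool=*)
        TOOL="${1#*=}"
        shift
        ;;
      ''|*[!0-9]*)
        echo "error: unexpected argument '$1'" >&2
        return 2
        ;;
      *)
        MAX_ITERATIONS="$1"   # all digits: treat as the iteration limit
        shift
        ;;
    esac
  done
  case "$TOOL" in
    amp|claude) ;;
    *) echo "error: unknown tool '$TOOL' (expected amp or claude)" >&2; return 2 ;;
  esac
}
```

Calling `parse_args --tool claude 5` leaves `TOOL=claude` and `MAX_ITERATIONS=5`; calling it with no arguments leaves the defaults in place.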

### 9. **Skills System for PRD Workflow**

Two skills automate PRD creation:

**`prd` skill**: Generates structured PRDs with clarifying questions
- Asks 3-5 essential questions with lettered options (for quick "1A, 2C, 3B" responses)
- Creates markdown PRD with user stories, functional requirements, non-goals

**`ralph` skill**: Converts markdown PRDs to JSON
- Enforces story sizing (completable in one iteration)
- Orders by dependencies (schema → backend → UI)
- Adds standard criteria ("Typecheck passes", "Verify in browser")

---

## Notable Patterns and Design Decisions

### 1. **Single Story Per Iteration**

**Design**: Each AI run handles exactly ONE user story, never more.

**Rationale**:
- Ensures complete focus on a single task
- Prevents context exhaustion mid-feature
- Creates clean commit boundaries
- Simplifies failure recovery (retry a single story, not multiple)

### 2. **Append-Only Progress Log**

**Design**: `progress.txt` is append-only, never overwritten.

**Rationale**:
- Preserves full history for debugging
- Enables pattern discovery over time
- Prevents accidental loss of learnings
- Supports consolidation into AGENTS.md when patterns emerge
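
The append-only rule is easy to enforce mechanically. As an illustration only (in Ralph the agent writes the log itself, and `append_progress` is a hypothetical helper), an entry in the format shown earlier could be written like this:

```bash
#!/usr/bin/env bash
# Hypothetical helper that appends one entry in the progress.txt format.
# Uses ">>" only, so earlier entries are never rewritten.
append_progress() {
  local file="$1" story_id="$2" summary="$3" learning="$4"
  {
    printf '## %s - %s\n' "$(date '+%Y-%m-%d %H:%M')" "$story_id"
    printf -- '- %s\n' "$summary"
    printf -- '- **Learnings for future iterations:**\n'
    printf -- '  - %s\n' "$learning"
    printf -- '---\n'
  } >> "$file"
}
```

A call such as `append_progress progress.txt US-001 "Added priority column" "Use IF NOT EXISTS for migrations"` adds one block to the end of the file and leaves everything before it untouched.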

### 3. **Story Sizing Rules**

**Design**: Stories must be small enough for one context window.

**Right-sized examples:**
- Add a database column and migration
- Add a UI component to an existing page
- Update a server action with new logic
- Add a filter dropdown to a list

**Too big (must split):**
- "Build the entire dashboard"
- "Add authentication"
- "Refactor the API"

**Rule of thumb**: If you can't describe the change in 2-3 sentences, it's too big.

### 4. **Dependency-Ordered Execution**

**Design**: Stories execute in priority order, so earlier stories cannot depend on later ones.

**Correct order:**
1. Schema/database changes (migrations)
2. Server actions / backend logic
3. UI components that use the backend
4. Dashboard/summary views that aggregate data
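
One way to sanity-check an ordering before starting a run is to print the stories in execution order and confirm that no two share a priority. This is a sketch under assumptions: it requires `jq`, treats a lower `priority` number as "runs first", and uses hypothetical function names. Treating duplicate priorities as a problem is this sketch's choice, not a rule stated by Ralph.

```bash
#!/usr/bin/env bash
# Hypothetical checks on story ordering; assumes jq is installed and that a
# lower "priority" number runs first.

# Print stories in the order the loop would pick them: "<priority> <id> <title>".
execution_order() {
  jq -r '.userStories | sort_by(.priority)[] | "\(.priority) \(.id) \(.title)"' "$1"
}

# Succeed only if no two stories share a priority (ambiguous order otherwise).
priorities_unique() {
  jq -e '[.userStories[].priority] | length == (unique | length)' "$1" >/dev/null
}
```

Reading the output of `execution_order prd.json` top to bottom should match the schema, backend, UI, dashboard sequence listed above.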

### 5. **Commit Discipline**

**Design**: Only commit when tests pass, with structured messages.

```
feat: [Story ID] - [Story Title]
```

**Rationale**: Clean git history provides context recovery for future iterations.
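
The "commit only when checks pass" rule can be sketched as a small gate. The function names and the example check commands (`npm run typecheck`, `npm test`) are assumptions for illustration; each project would substitute its own quality checks, and in Ralph the agent performs these steps itself.

```bash
#!/usr/bin/env bash
# Hypothetical quality gate: run every check, commit only if all succeed.

# Build the commit subject in the "feat: [Story ID] - [Story Title]" format.
commit_message() {
  printf 'feat: %s - %s\n' "$1" "$2"
}

# Run each argument as a shell command; stop at the first failure.
run_checks() {
  local check
  for check in "$@"; do
    if ! sh -c "$check"; then
      echo "check failed: $check" >&2
      return 1
    fi
  done
}

# Usage: commit_if_green US-001 "Add priority field" "npm run typecheck" "npm test"
commit_if_green() {
  local story_id="$1" title="$2"
  shift 2
  run_checks "$@" || return 1
  git add -A && git commit -m "$(commit_message "$story_id" "$title")"
}
```

Because `run_checks` returns at the first failing command, a red typecheck or test run leaves the working tree uncommitted for the next iteration to fix.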

### 6. **Verifiable Acceptance Criteria**

**Design**: Every criterion must be testable, never vague.

**Good**: "Button shows confirmation dialog before deleting"
**Bad**: "Works correctly", "Good UX", "Handles edge cases"
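
A crude lint can catch the vague phrasings listed above before a run starts. This is a hypothetical sketch, not part of Ralph: it reads one criterion per line on stdin and matches only the three example phrases, so it illustrates the idea rather than providing a complete check.

```bash
#!/usr/bin/env bash
# Hypothetical lint: read acceptance criteria (one per line) on stdin and
# print the ones containing a known-vague phrase. Exit status 1 means
# at least one vague criterion was found.
flag_vague_criteria() {
  if grep -iE 'works correctly|good ux|handles edge cases'; then
    return 1
  fi
  return 0
}
```

Assuming `jq` is available, the criteria can be fed in with `jq -r '.userStories[].acceptanceCriteria[]' prd.json | flag_vague_criteria`.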

### 7. **Archiving Previous Runs**

**Design**: When `branchName` changes, archive previous `prd.json` and `progress.txt` to `archive/YYYY-MM-DD-feature-name/`.

**Rationale**: Clean separation between features, preserves history for reference.
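
The archive step can be sketched as follows. This is an illustration under assumptions: `archive_previous_run` is a hypothetical function, the caller is assumed to supply the previous and current branch names, and it is assumed to run while the previous run's files are still in place.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the archive step. run_dir holds prd.json and
# progress.txt; the branch names are passed in by the caller.
archive_previous_run() {
  local run_dir="$1" last_branch="$2" current_branch="$3"
  local feature dest file

  # Nothing to archive on the first run or when the feature is unchanged.
  if [ -z "$last_branch" ] || [ "$last_branch" = "$current_branch" ]; then
    return 0
  fi

  feature="${last_branch#ralph/}"            # ralph/task-priority -> task-priority
  dest="$run_dir/archive/$(date +%Y-%m-%d)-$feature"
  mkdir -p "$dest" || return 1

  for file in prd.json progress.txt; do
    if [ -f "$run_dir/$file" ]; then
      cp "$run_dir/$file" "$dest/" || return 1
    fi
  done
  printf '%s\n' "$dest"
}
```

The function prints the archive folder it created, or nothing when no archive was needed, so a caller can log where the previous run went.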

---

## Context Management Strategy

Context management is Ralph's most distinctive aspect:

### Between Runs (Persistence)

| Mechanism | What It Carries | Format |
|-----------|-----------------|--------|
| Git commits | Code changes, file structure | Versioned files |
| `prd.json` | Task completion status | Structured JSON |
| `progress.txt` | Learnings, gotchas, patterns | Structured text |
| `AGENTS.md` | Consolidated reusable patterns | Markdown |

### Within a Run (Instructions)

The AI receives:
1. Instructions from `prompt.md` or `CLAUDE.md`
2. The `prd.json` file content
3. The `progress.txt` file (especially Codebase Patterns section)
4. Access to read any file via AI tool capabilities

### Context Recovery Pattern

Each iteration:
1. Reads `progress.txt` Codebase Patterns section first (quick reference)
2. Reads `prd.json` to find next incomplete story
3. Checks git branch matches expected branch
4. Implements story
5. Appends learnings to `progress.txt`
6. Optionally consolidates patterns to AGENTS.md

---

## Agent Orchestration Model

### Single-Agent Loop (Not Multi-Agent)

Ralph is NOT a multi-agent system. It's a single-agent loop where:
- One AI instance runs at a time
- Each instance is independent (no inter-agent communication)
- Coordination happens via file-based state (prd.json, progress.txt)

### Orchestration via Bash Script

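At its core, `ralph.sh` is a simple bash loop (simplified here to the Amp path):
```bash
for i in $(seq 1 "$MAX_ITERATIONS"); do
    # Pipe the prompt into a fresh Amp instance; "|| true" keeps a failed run from ending the loop
    OUTPUT=$(cat prompt.md | amp --dangerously-allow-all 2>&1 | tee /dev/stderr) || true

    # Stop as soon as the agent reports that every story passes
    if echo "$OUTPUT" | grep -q "<promise>COMPLETE</promise>"; then
        echo "Ralph completed all tasks!"
        exit 0
    fi

    sleep 2
done

echo "Reached max iterations ($MAX_ITERATIONS) without completing all tasks." >&2
exit 1
```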

**Key points:**
- Uses `--dangerously-allow-all` (Amp) or `--dangerously-skip-permissions` (Claude) for autonomous operation
- Outputs are piped through `tee` for visibility
- Completion detected via grep for magic string
- 2-second sleep between iterations
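
To experiment with the control flow without invoking a real agent, the loop can be wrapped in a function that takes the agent command as arguments. This is a sketch, not `ralph.sh` itself: `run_loop` and `SLEEP_SECONDS` are hypothetical names, and the stub commands in the usage note stand in for `amp` or `claude`.

```bash
#!/usr/bin/env bash
# Sketch of the loop with the agent command passed in as arguments, so the
# control flow can be dry-run with a stub. run_loop and SLEEP_SECONDS are
# hypothetical names, not part of ralph.sh.
run_loop() {
  local max_iterations="$1" i=1 output
  shift
  while [ "$i" -le "$max_iterations" ]; do
    echo "=== Iteration $i of $max_iterations ===" >&2
    # A non-zero exit from the agent must not end the loop, hence "|| true".
    output=$("$@" 2>&1) || true
    printf '%s\n' "$output" >&2
    if printf '%s\n' "$output" | grep -qF '<promise>COMPLETE</promise>'; then
      echo "completed after $i iteration(s)"
      return 0
    fi
    sleep "${SLEEP_SECONDS:-2}"
    i=$((i + 1))
  done
  echo "reached max iterations ($max_iterations) with stories remaining" >&2
  return 1
}
```

For instance, `run_loop 3 echo '<promise>COMPLETE</promise>'` returns success on the first iteration, while `run_loop 3 echo working` exhausts the limit and returns 1. Using `grep -F` treats the marker as a fixed string rather than a regular expression.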

---

## Error Handling and Recovery

### Implicit Error Handling

Ralph has minimal explicit error handling. Instead:
- If tests fail, the story isn't committed
- If the AI can't complete a story, it logs learnings and the next iteration retries
- If max iterations are reached, the script exits with an error
- Human intervention is expected for complex failures

### Recovery via Progress Log

Failed attempts are documented in `progress.txt`:
```
## [Date/Time] - [Story ID]
- Attempted to implement X
- Failed because Y
- **Learnings:**
  - Don't do Z
  - Instead try W
---
```

The next iteration reads these learnings, so it can avoid repeating the same mistakes.
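
When debugging a story that keeps failing, it helps to pull all of its entries out of the log. A minimal sketch, assuming entries follow the format above (a `## ... - <story id>` header and a closing `---` line); `story_history` is a hypothetical helper, not part of Ralph:

```bash
#!/usr/bin/env bash
# Hypothetical helper: print every progress.txt entry for one story id.
# Assumes entries start with "## <timestamp> - <story id>" and end with "---".
story_history() {
  local file="$1" story_id="$2"
  awk -v id="$story_id" '
    BEGIN { suffix = " - " id }
    /^## / {
      # Keep the entry only if the header line ends with " - <story id>".
      keep = (length($0) >= length(suffix) &&
              substr($0, length($0) - length(suffix) + 1) == suffix)
    }
    keep { print }
    /^---$/ { keep = 0 }
  ' "$file"
}
```

Running `story_history progress.txt US-001` prints each attempt at that story in chronological order, which makes repeated failures easy to spot.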

---

## Configuration and Customization

### Per-Project Customization

After copying the prompt template to your project, tailor it:
- Add project-specific quality check commands
- Include codebase conventions
- Add common gotchas for your stack

### Amp Auto-Handoff Configuration

For large stories that approach context limits:
```json
{
  "amp.experimental.autoHandoff": { "context": 90 }
}
```

This enables automatic handoff when context fills up.

### Iteration Limits

```bash
./ralph.sh [max_iterations]  # Default: 10
```

---

## Comparison to Typical Orchestration Approaches

| Aspect | Ralph | Typical Orchestration |
|--------|-------|----------------------|
| **Memory** | File-based (git, JSON, text) | In-memory state, databases |
| **Coordination** | Sequential loop | Often parallel/concurrent |
| **Agent Communication** | Via files | Direct messaging, queues |
| **Complexity** | Simple bash script (~100 LOC) | Often complex frameworks |
| **Failure Recovery** | Retry from last good state | Explicit retry logic, checkpoints |
| **Context Management** | Fresh context per iteration | Persistent context, context windows |
| **Task Decomposition** | Pre-planned user stories | Often dynamic planning |
| **Human Oversight** | Minimal during run | Often requires approval gates |

### Key Differentiators

1. **Simplicity**: Ralph is a bash script, not a framework
2. **Statelessness**: Each iteration is independent
3. **Git-Native**: Uses git as the primary state store
4. **AI-Tool Agnostic**: Works with both Amp and Claude Code
5. **Human-Readable Artifacts**: All state is in human-readable files

---

## Implications for Makima

### Features to Consider Adopting

1. **Structured PRD-to-JSON workflow** with skills
2. **Append-only progress logging** for context between runs
3. **Story sizing enforcement** (completable in one context window)
4. **Dependency-ordered task execution**
5. **Branch-based run isolation** with archiving
6. **Consolidated patterns file** (AGENTS.md equivalent)
7. **Magic string completion protocol** (`<promise>COMPLETE</promise>`)
8. **Verifiable acceptance criteria** enforcement
9. **Browser verification** for UI stories

### Optional Features (Flag-Controlled)

1. `--max-iterations` limit
2. `--auto-handoff` for context management
3. `--archive-previous` for run isolation
4. `--require-tests` for quality gates
5. `--single-story-per-run` mode

### Opinionated Features

1. Task decomposition must result in context-window-sized stories
2. Progress logs must be append-only
3. All commits must pass quality checks
4. Acceptance criteria must be verifiable
5. Dependencies must be ordered correctly

---

## Appendix: File Structure Reference

```
project/
├── scripts/ralph/
│   ├── ralph.sh            # Main loop script
│   ├── prompt.md           # Amp instructions
│   ├── CLAUDE.md           # Claude Code instructions
│   ├── prd.json            # Active task list
│   ├── progress.txt        # Append-only learnings
│   └── archive/            # Previous run archives
│       └── YYYY-MM-DD-feature-name/
│           ├── prd.json
│           └── progress.txt
├── skills/
│   ├── prd/
│   │   └── SKILL.md        # PRD generation skill
│   └── ralph/
│       └── SKILL.md        # PRD-to-JSON conversion skill
└── AGENTS.md               # Codebase-wide patterns
```

---

## References

- [Ralph GitHub Repository](https://github.com/snarktank/ralph)
- [Geoffrey Huntley's Ralph Article](https://ghuntley.com/ralph/)
- [Amp Documentation](https://ampcode.com/manual)
- [Claude Code Documentation](https://docs.anthropic.com/en/docs/claude-code)