# Ralph-Inspired Features for Makima
## Overview
This specification outlines features derived from the [ralph](https://github.com/snarktank/ralph) autonomous AI agent loop system that can be implemented in makima to reduce manual steering and improve context management between runs.
---
## Part 1: Opinionated Features (Always Enabled)
These features represent best practices that should be core to makima's behavior.
### 1.1 Structured Progress Logging
**Name:** `progress-log`
**Priority:** HIGH
**Description:**
Implement an append-only progress log file (`progress.txt` or similar) that persists learnings, patterns, and context across task iterations.
**Motivation:**
- Ralph's most powerful feature is its dual-file learning system
- Captures context that survives Claude's context window limits
- Enables pattern discovery over time
- Provides debugging history
**Current State in Makima:**
- `progress_summary` field exists but is per-task, not persistent
- Task events stored in database but not summarized
- No cross-task learning mechanism
**Implementation Approach:**
1. Add `progress.log` file to each task's worktree
2. Append structured entries at task completion:
```
## [Timestamp] - Task [ID]: [Name]
- Status: [done/failed]
- Files changed: [list]
- **Learnings:**
- [Pattern discovered]
- [Gotcha encountered]
---
```
3. Inject progress.log contents into new task prompts
4. Periodic consolidation into `AGENTS.md` equivalent
**Configuration:**
```toml
[progress_log]
enabled = true # Always on
max_entries_injected = 20 # Limit for prompt injection
consolidation_threshold = 50 # Trigger consolidation
```
**Integration Points:**
- `daemon/task/manager.rs` → `on_completion()` hook
- `daemon/process/claude.rs` → `inject_system_prompt()`
- New file: `daemon/task/progress_log.rs`
---
### 1.2 Context Recovery Pattern
**Name:** `context-recovery`
**Priority:** HIGH
**Description:**
Standardize how context is rebuilt when tasks resume or restart, ensuring Claude can quickly orient itself.
**Motivation:**
- Ralph's stateless model works because context recovery is systematic
- Each iteration reads from well-defined artifacts
- Reduces confusion and repeated work
**Current State in Makima:**
- `conversation_state` stored for resumption
- `--continue` flag relies on Claude's session state
- No structured "where we left off" pattern
**Implementation Approach:**
1. Create standard context recovery header for task prompts:
```
## Context Recovery
- Current branch: [branch name]
- Git status: [uncommitted changes summary]
- Last checkpoint: [timestamp, message]
- Progress log (recent): [last 5 entries]
- Current phase: [research/specify/plan/execute/review]
```
2. Auto-generate on task start/resume
3. Include in system prompt before user plan
**Integration Points:**
- `daemon/task/manager.rs` → `build_context_recovery()`
- `daemon/process/claude.rs` → Prepend to injected prompt
- New file: `daemon/task/context_recovery.rs`
---
### 1.3 Dependency-Ordered Task Execution
**Name:** `dependency-ordering`
**Priority:** MEDIUM
**Description:**
Enforce that tasks execute in dependency order: schema changes → backend → UI.
**Motivation:**
- Ralph explicitly orders stories: database → server → UI → dashboard
- Prevents tasks from failing due to missing dependencies
- Creates clean commit boundaries
**Current State in Makima:**
- Tasks have `priority` field but no dependency inference
- Supervisors manually order task creation
- No validation of execution order
**Implementation Approach:**
1. Add `depends_on: Vec<Uuid>` field to tasks
2. Validate dependencies before marking task as runnable
3. Auto-detect dependency patterns:
- Migration files → backend code
- Types/models → consumers
- APIs → UI components
4. Warn if a task seems out of order based on file patterns
**Configuration:**
```toml
[dependency_ordering]
enabled = true
auto_detect = true
warn_on_violation = true
```
**Integration Points:**
- `db/models.rs` → Task model extension
- `daemon/task/manager.rs` → `can_start_task()` validation
- New file: `daemon/task/dependency_analysis.rs`
---
### 1.4 Verifiable Acceptance Criteria
**Name:** `acceptance-criteria`
**Priority:** MEDIUM
**Description:**
Require that all tasks have verifiable (not vague) acceptance criteria, and automatically validate them.
**Motivation:**
- Ralph requires criteria like "Typecheck passes", "Tests pass"
- Prevents "done" status on incomplete work
- Provides clear success definition
**Current State in Makima:**
- `COMPLETION_GATE` signals readiness
- No structured criteria validation
- Manual interpretation of "ready"
**Implementation Approach:**
1. Parse task plans for acceptance criteria section
2. Identify verifiable vs vague criteria:
- **Good:** "All tests pass", "No TypeScript errors"
- **Bad:** "Works correctly", "Good UX"
3. Auto-append standard criteria if missing:
- "No uncommitted changes remain"
- "CI/linting passes" (if configured)
4. Validate criteria satisfaction before marking complete
**Configuration:**
```toml
[acceptance_criteria]
enabled = true
require_verifiable = true
auto_append_standard = ["no_uncommitted_changes", "tests_pass"]
```
**Integration Points:**
- `daemon/task/completion_gate.rs` → Extend validation
- `llm/task_output.rs` → Parse criteria from plan
- New file: `daemon/task/criteria_validator.rs`
---
### 1.5 Task Sizing Validation
**Name:** `task-sizing`
**Priority:** MEDIUM
**Description:**
Warn or prevent tasks that are likely too large to complete in one context window.
**Motivation:**
- Ralph's story sizing is crucial: "If you can't describe it in 2-3 sentences, it's too big"
- Large tasks exhaust context, require handoffs
- Smaller tasks = cleaner commits, easier recovery
**Current State in Makima:**
- No task size estimation
- `auto_handoff` exists but reactive
- Manual task breakdown by supervisors
**Implementation Approach:**
1. Estimate task complexity from plan text:
- Number of files mentioned
- Scope words ("entire", "all", "refactor")
- Estimated token count
2. Warn if task exceeds thresholds
3. Suggest breakdown for large tasks
**Thresholds:**
- Files mentioned > 10 → Warning
- Plan length > 500 words → Warning
- Scope words detected → Strong warning
**Configuration:**
```toml
[task_sizing]
enabled = true
max_files_mentioned = 10
max_plan_words = 500
warn_on_scope_words = ["entire", "all", "complete", "refactor"]
```
**Integration Points:**
- `daemon/task/manager.rs` → `validate_task_size()`
- Supervisor prompts → Include sizing guidance
- New file: `daemon/task/sizing_validator.rs`
---
### 1.6 Commit Discipline
**Name:** `commit-discipline`
**Priority:** HIGH
**Description:**
Enforce structured commit messages and only allow commits when quality checks pass.
**Motivation:**
- Ralph: "Only commit when tests pass"
- Clean git history aids context recovery
- Structured messages enable automation
**Current State in Makima:**
- Checkpoints create commits automatically
- No quality gate before commit
- Commit messages not standardized
**Implementation Approach:**
1. Standardize commit message format:
```
feat/fix/chore: [Task ID] - [Summary]
[Optional body]
Co-Authored-By: Claude <noreply@anthropic.com>
```
2. Run quality checks before checkpoint commit:
- TypeScript/lint (if configured)
- Tests (if configured)
3. Reject commit if checks fail, provide feedback
**Configuration:**
```toml
[commit_discipline]
enabled = true
require_tests = false # Optional
require_lint = false # Optional
message_format = "conventional" # conventional, simple
```
**Integration Points:**
- `daemon/worktree/manager.rs` → `create_checkpoint()`
- `daemon/task/manager.rs` → Pre-commit hooks
- New file: `daemon/task/commit_validator.rs`
---
## Part 2: Optional Features (Flag-Controlled)
These features provide advanced control and should be opt-in via CLI flags or configuration.
### 2.1 Maximum Iterations Limit
**Name:** `--max-iterations`
**Priority:** HIGH
**Flag:** `--max-iterations <N>` or `-i <N>`
**Description:**
Limit the number of autonomous loop iterations before stopping.
**Motivation:**
- Ralph uses `max_iterations` (default 10)
- Prevents runaway loops that waste tokens
- Provides predictable behavior
**Current State in Makima:**
- Circuit breaker has `iteration_count` limit (10)
- Not configurable at task/contract level
- No per-run override
**Implementation Approach:**
1. Add `--max-iterations` flag to contract/task creation
2. Store in task metadata
3. Check count in autonomous loop logic
4. Exit cleanly with message when limit reached
**CLI Usage:**
```bash
makima contract create --max-iterations 5 "Feature X"
makima supervisor spawn "Task" "Plan" --max-iterations 3
```
**Configuration:**
```toml
[autonomous_loop]
default_max_iterations = 10
hard_limit = 50 # Absolute maximum
```
**Integration Points:**
- `daemon/task/manager.rs` → Loop control
- `db/models.rs` → Task field
- CLI argument parsing
---
### 2.2 Single-Story-Per-Run Mode
**Name:** `--single-task`
**Priority:** MEDIUM
**Flag:** `--single-task` or `-1`
**Description:**
Execute exactly one task per Claude invocation, then stop (don't auto-continue).
**Motivation:**
- Ralph's model: one story per iteration
- Ensures complete focus
- Creates clean boundaries
- Simplifies failure recovery
**Current State in Makima:**
- Tasks can run multiple iterations
- Supervisor can spawn multiple concurrent tasks
- No single-task mode
**Implementation Approach:**
1. When `--single-task` enabled:
- Execute one task
- Parse completion gate
- Stop regardless of `ready` status
- Report status and exit
2. User reviews, then manually continues or adjusts
**CLI Usage:**
```bash
makima contract create --single-task "Feature X"
```
**Configuration:**
```toml
[execution]
single_task_mode = false # Default
```
**Integration Points:**
- `daemon/task/manager.rs` → Execution loop
- `server/handlers/contract_daemon.rs` → Contract options
- CLI flags
---
### 2.3 Archive Previous Runs
**Name:** `--archive-previous`
**Priority:** LOW
**Flag:** `--archive-previous` or `--archive`
**Description:**
When starting a new feature/contract, archive the previous run's artifacts.
**Motivation:**
- Ralph archives to `archive/YYYY-MM-DD-feature-name/`
- Clean separation between features
- Preserves history for reference
- Prevents context pollution
**Current State in Makima:**
- Worktrees are per-task but ephemeral
- No archiving mechanism
- Old task data in database but hard to access
**Implementation Approach:**
1. On contract creation with `--archive`:
- Find previous contract with same name/goal
- Copy key artifacts to `archive/` directory:
- progress.log
- Final checkpoint
- Summary document
2. Archive structure:
```
archive/
└── 2026-01-22-feature-name/
├── progress.log
├── summary.md
└── final-diff.patch
```
**CLI Usage:**
```bash
makima contract create --archive-previous "Feature X v2"
```
**Integration Points:**
- `daemon/worktree/manager.rs` → Archive logic
- `server/handlers/contracts.rs` → Archive on create
- New file: `daemon/archive/manager.rs`
---
### 2.4 Require Tests Quality Gate
**Name:** `--require-tests`
**Priority:** MEDIUM
**Flag:** `--require-tests` or `--tests`
**Description:**
Block task completion unless tests pass.
**Motivation:**
- Ralph: stories require "Tests pass" in acceptance criteria
- Ensures quality before merge
- Catches regressions early
**Current State in Makima:**
- Completion gate is self-reported by Claude
- No actual test execution
- Circuit breaker is reactive, not proactive
**Implementation Approach:**
1. Detect test framework from project:
- `package.json` scripts
- `pytest`, `cargo test`, etc.
2. Run tests before accepting completion
3. Parse test output for pass/fail
4. If failed:
- Don't mark complete
- Inject failure info into next prompt
- Increment failure counter
**CLI Usage:**
```bash
makima contract create --require-tests "Feature X"
```
**Configuration:**
```toml
[quality_gates]
require_tests = false
test_command = "npm test" # Auto-detected if not set
test_timeout_secs = 300
```
**Integration Points:**
- `daemon/task/completion_gate.rs` → Test validation
- `daemon/process/` → Test runner
- New file: `daemon/quality/test_runner.rs`
---
### 2.5 PRD Mode
**Name:** `--prd-mode`
**Priority:** MEDIUM
**Flag:** `--prd-mode` or `--prd`
**Description:**
Enable ralph-style PRD workflow with structured JSON task tracking.
**Motivation:**
- Ralph's `prd.json` provides clear task breakdown
- Structured format aids automation
- Priority-based execution
- Clear pass/fail tracking
**Current State in Makima:**
- Plans are free-form text
- Task status is in database, not file-based
- No structured PRD format
**Implementation Approach:**
1. When `--prd-mode` enabled:
- Create `prd.json` in worktree:
```json
{
"project": "Contract Name",
"branchName": "makima/feature",
"description": "Contract goal",
"userStories": [
{
"id": "US-001",
"title": "Story title",
"description": "As a...",
"acceptanceCriteria": ["Criterion 1"],
"priority": 1,
"passes": false,
"notes": ""
}
]
}
```
- Tasks update `passes` field on completion
- Supervisor reads PRD to find next incomplete story
2. Sync between database and `prd.json`
**CLI Usage:**
```bash
makima contract create --prd-mode "Feature X"
```
**Configuration:**
```toml
[prd_mode]
enabled = false
auto_generate_from_plan = true
sync_to_database = true
```
**Integration Points:**
- `daemon/task/manager.rs` → PRD sync
- New file: `daemon/prd/manager.rs`
- New file: `daemon/prd/models.rs`
---
### 2.6 Learning Mode
**Name:** `--learn`
**Priority:** LOW
**Flag:** `--learn` or `-l`
**Description:**
Enable cross-task learning that extracts patterns and improves future prompts.
**Motivation:**
- Ralph's AGENTS.md consolidates patterns
- Learning from success improves future runs
- Learning from failure prevents repeating mistakes
**Current State in Makima:**
- No cross-task learning
- Each task starts fresh
- Patterns not extracted or reused
**Implementation Approach:**
1. On task completion, extract:
- Files commonly modified together
- Commands that succeeded/failed
- Error patterns and solutions
2. Store in `learnings.db` (SQLite) per repository
3. Inject relevant learnings into future task prompts:
```
## Learned Patterns for this codebase:
- Always run `npm run typecheck` before commit
- The auth middleware is in src/middleware/auth.ts
- Database migrations require `npx prisma generate` after
```
**CLI Usage:**
```bash
makima contract create --learn "Feature X"
```
**Configuration:**
```toml
[learning]
enabled = false
extract_file_patterns = true
extract_command_patterns = true
max_learnings_injected = 10
```
**Integration Points:**
- `daemon/task/manager.rs` → Learning extraction
- `daemon/process/claude.rs` → Learning injection
- New file: `daemon/learning/extractor.rs`
- New file: `daemon/learning/database.rs`
---
### 2.7 Browser Verification for UI
**Name:** `--browser-verify`
**Priority:** LOW
**Flag:** `--browser-verify` or `--ui`
**Description:**
For UI-related tasks, require browser verification before completion.
**Motivation:**
- Ralph includes "Verify in browser" as acceptance criteria
- Visual verification catches issues that tests miss
- Ensures UI actually works, not just compiles
**Current State in Makima:**
- No browser verification
- UI tasks treated same as backend
- No visual testing integration
**Implementation Approach:**
1. Detect UI-related tasks from file patterns:
- `*.tsx`, `*.vue`, `*.svelte`
- `components/`, `pages/`, `views/`
2. When completing, prompt for verification:
- Launch dev server if needed
- Open browser to relevant URL
- Wait for user confirmation or screenshot analysis
3. Alternative: Integrate with Playwright for visual testing
**CLI Usage:**
```bash
makima contract create --browser-verify "Add login page"
```
**Configuration:**
```toml
[browser_verify]
enabled = false
auto_detect_ui_tasks = true
dev_server_command = "npm run dev"
base_url = "http://localhost:3000"
```
**Integration Points:**
- `daemon/task/completion_gate.rs` → Browser check
- New file: `daemon/quality/browser_verify.rs`
---
## Part 3: Implementation Priorities
### Phase 1: Foundation (High Priority)
1. **Structured Progress Logging** - Core to context management
2. **Context Recovery Pattern** - Enables stateless iterations
3. **Commit Discipline** - Ensures quality git history
4. **Maximum Iterations Limit** - Prevents runaway loops
### Phase 2: Quality (Medium Priority)
5. **Verifiable Acceptance Criteria** - Improves completion reliability
6. **Dependency-Ordered Execution** - Prevents out-of-order failures
7. **Task Sizing Validation** - Catches too-large tasks early
8. **Require Tests Quality Gate** - Ensures working code
9. **Single-Story Mode** - Ralph's core pattern
### Phase 3: Advanced (Low Priority)
10. **PRD Mode** - Full ralph-style workflow
11. **Learning Mode** - Cross-task intelligence
12. **Archive Previous** - Run isolation
13. **Browser Verification** - UI quality
---
## Part 4: Configuration Summary
### New Configuration File Sections
```toml
# makima-daemon.toml additions
[progress_log]
enabled = true
max_entries_injected = 20
consolidation_threshold = 50
[context_recovery]
enabled = true
include_git_status = true
include_recent_progress = true
[dependency_ordering]
enabled = true
auto_detect = true
warn_on_violation = true
[acceptance_criteria]
enabled = true
require_verifiable = true
auto_append_standard = ["no_uncommitted_changes"]
[task_sizing]
enabled = true
max_files_mentioned = 10
max_plan_words = 500
warn_on_scope_words = ["entire", "all", "complete", "refactor"]
[commit_discipline]
enabled = true
require_tests = false
require_lint = false
message_format = "conventional"
[autonomous_loop]
default_max_iterations = 10
hard_limit = 50
[execution]
single_task_mode = false
[quality_gates]
require_tests = false
test_command = "" # Auto-detected
test_timeout_secs = 300
[prd_mode]
enabled = false
auto_generate_from_plan = true
sync_to_database = true
[learning]
enabled = false
extract_file_patterns = true
extract_command_patterns = true
max_learnings_injected = 10
[browser_verify]
enabled = false
auto_detect_ui_tasks = true
dev_server_command = "npm run dev"
base_url = "http://localhost:3000"
```
### CLI Flag Summary
| Flag | Short | Feature | Default |
|------|-------|---------|---------|
| `--max-iterations` | `-i` | Iteration limit | 10 |
| `--single-task` | `-1` | One task per run | false |
| `--archive-previous` | `--archive` | Archive old runs | false |
| `--require-tests` | `--tests` | Test quality gate | false |
| `--prd-mode` | `--prd` | PRD-style workflow | false |
| `--learn` | `-l` | Cross-task learning | false |
| `--browser-verify` | `--ui` | UI verification | false |
---
## Part 5: Migration Path
### For Existing Contracts
1. Progress logging starts fresh (no historical data)
2. Context recovery applies to new tasks only
3. Existing tasks not affected by new validation
4. Can opt-in to optional features per-contract
### Backward Compatibility
- All opinionated features have graceful defaults
- Optional features are off by default
- No breaking changes to existing CLI/API
- Configuration is additive
---
## Conclusion
These features address the core challenges mentioned in the contract goal:
- **Manual steering** → Progress logging, context recovery, learning mode
- **Context between runs** → Structured persistence, progress.txt pattern
- **Handholding** → Verifiable criteria, commit discipline, quality gates
The opinionated features make makima more reliable out of the box, while optional features provide ralph-style workflows for users who want them.