Diffstat (limited to 'docs')
-rw-r--r-- docs/proposals/compound-engineering-analysis.md  | 300
-rw-r--r-- docs/proposals/feature-findings-tracking.md      | 504
-rw-r--r-- docs/proposals/feature-knowledge-accumulation.md | 539
-rw-r--r-- docs/proposals/feature-multi-agent-review.md     | 448
-rw-r--r-- docs/proposals/feature-plan-deepening.md         | 383
-rw-r--r-- docs/proposals/feature-task-templates.md         | 602
-rw-r--r-- docs/proposals/feature-workflow-presets.md       | 623
7 files changed, 3399 insertions, 0 deletions
diff --git a/docs/proposals/compound-engineering-analysis.md b/docs/proposals/compound-engineering-analysis.md
new file mode 100644
index 0000000..5a8c6da
--- /dev/null
+++ b/docs/proposals/compound-engineering-analysis.md
@@ -0,0 +1,300 @@
+# Compound Engineering Plugin — Analysis & Makima Feature Mapping
+
+> **Document Type:** Overview Analysis
+> **Status:** Proposal
+> **Date:** 2026-02-09
+> **Related Proposals:** [Multi-Agent Review](feature-multi-agent-review.md) · [Knowledge Accumulation](feature-knowledge-accumulation.md) · [Plan Deepening](feature-plan-deepening.md) · [Workflow Presets](feature-workflow-presets.md) · [Findings Tracking](feature-findings-tracking.md) · [Task Templates](feature-task-templates.md)
+
+---
+
+## Executive Summary
+
+The [Compound Engineering Plugin](https://github.com/EveryInc/compound-engineering-plugin) is a Claude Code plugin comprising **29 agents, 25 commands, 16 skills, and 1 MCP server**. Its core innovation is a self-reinforcing engineering loop where every unit of work makes subsequent work easier—not harder.
+
+This document analyzes the plugin's architecture, maps its capabilities against makima's existing features, identifies gaps, and proposes a phased adoption strategy. The compound engineering plugin excels at **within-session orchestration** (parallel review agents, plan deepening, knowledge capture), while makima excels at **cross-session orchestration** (contract lifecycle, worktree isolation, DAG-based directives). Combining the two would pair the plugin's review and learning loop with makima's persistent, isolated execution.
+
+---
+
+## Core Philosophy
+
+> *"Each unit of engineering work should make subsequent units easier—not harder."*
+
+The plugin operationalizes this through a four-phase feedback loop where the critical **Compound** step captures learnings that feed back into future planning:
+
+```
+┌─────────────────────────────────────────────────────────┐
+│                                                         │
+│   ┌──────────┐    ┌──────────┐    ┌──────────┐          │
+│   │          │    │          │    │          │          │
+│   │   PLAN   │───▶│   WORK   │───▶│  REVIEW  │          │
+│   │          │    │          │    │          │          │
+│   └──────────┘    └──────────┘    └──────────┘          │
+│        ▲                               │                │
+│        │          ┌──────────┐         │                │
+│        │          │          │         │                │
+│        └──────────│ COMPOUND │◀────────┘                │
+│                   │          │                          │
+│   Learnings fed   └──────────┘   Captures solutions,    │
+│   back into            │         patterns, failures     │
+│   future plans         ▼                                │
+│                 docs/solutions/                         │
+│                 ├── build-errors/                       │
+│                 ├── test-failures/                      │
+│                 ├── api-patterns/                       │
+│                 └── ...9 categories                     │
+│                                                         │
+└─────────────────────────────────────────────────────────┘
+```
+
+The loop maps onto makima's contract phases (**Research → Specify → Plan → Execute → Review**), with a proposed new **Compound** phase appended after Review.
+
+---
+
+## Plugin Architecture Overview
+
+### Agent Categories (29 Total)
+
+| Category | Count | Examples |
+|----------|-------|---------|
+| Review Agents | 12-15 | Security Sentinel, Performance Oracle, Architecture Strategist, Code Philosopher, Data Integrity Guardian, Error Resilience Analyzer, API Contract Validator, Dependency Health Checker, Test Coverage Analyzer, Documentation Completeness, Concurrency Safety |
+| Research Agents | 20-40 | Best practices, edge case analysis, dependency research, pattern matching |
+| Learning Agents | 5 | Context extractor, solution documenter, prevention strategist, categorizer, doc linker |
+| Pipeline Agents | ~5 | LFG orchestrator, SLFG parallelizer, phase coordinators |
+| Meta Agents | 2-3 | Agent creator, skill healer, template generator |
+
+### Command Categories (25 Total)
+
+| Category | Key Commands | Description |
+|----------|-------------|-------------|
+| Planning | `/plan`, `/deepen-plan` | Create and enhance implementation plans |
+| Execution | `/lfg`, `/slfg` | Full autonomous pipelines (serial/parallel) |
+| Review | `/parallel-review`, `/review` | Multi-agent code review |
+| Learning | `/compound`, `/search-learnings` | Capture and retrieve knowledge |
+| Meta | `/create-agent-skill`, `/heal-skill` | Self-improving tooling |
+| Findings | `/create-todo`, `/resolve-todo` | Structured issue tracking |
+
+### Skill Categories (16 Total)
+
+Skills provide specialized capabilities including code analysis, pattern detection, security scanning, performance profiling, and documentation generation.
+
+### MCP Server (1)
+
+Provides tool access for agents to interact with the file system, git, and external services during parallel execution.
+
+---
+
+## Agent-Native Architecture Concepts
+
+The compound engineering plugin embraces an **agent-native** design philosophy:
+
+1. **Parallel-First**: Tasks that can be parallelized are always parallelized (review agents, research agents, learning sub-agents)
+2. **Structured Output**: All agent outputs use YAML frontmatter + markdown, enabling machine parsing
+3. **Swarm Orchestration**: Groups of agents with synchronization gates (spawn N → wait for all → synthesize)
+4. **Self-Healing**: Meta-commands detect broken skills and auto-repair them
+5. **Progressive Enhancement**: Plans start simple, then are "deepened" with research results
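+
+The swarm pattern in item 3 (spawn N → wait for all → synthesize) can be sketched with scoped threads from the Rust standard library. This is a minimal illustration, not the plugin's implementation: `run_reviewer` and `review_swarm` are hypothetical names, and the "reviewer" is a stand-in that only flags `TODO` markers.
+
+```rust
+use std::thread;
+
+/// Hypothetical reviewer: a stand-in for a real agent call.
+/// It only reports lines that contain a TODO marker.
+fn run_reviewer(name: &str, diff: &str) -> Vec<String> {
+    diff.lines()
+        .filter(|line| line.contains("TODO"))
+        .map(|line| format!("{name}: {}", line.trim()))
+        .collect()
+}
+
+/// Spawn one thread per reviewer, wait for all of them, then merge the results.
+fn review_swarm(reviewers: &[&str], diff: &str) -> Vec<String> {
+    thread::scope(|scope| {
+        // Spawn N: every reviewer is started before any result is collected.
+        let handles: Vec<_> = reviewers
+            .iter()
+            .map(|name| scope.spawn(move || run_reviewer(name, diff)))
+            .collect();
+
+        // Gate: `join` blocks until the corresponding reviewer has finished.
+        let mut merged: Vec<String> = handles
+            .into_iter()
+            .flat_map(|handle| handle.join().expect("reviewer thread panicked"))
+            .collect();
+
+        // Synthesize: sort the findings and drop exact duplicates.
+        merged.sort();
+        merged.dedup();
+        merged
+    })
+}
+```
+
+The synchronization gate is the `join` loop: synthesis cannot start until every spawned reviewer has returned.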
+
+---
+
+## Mapping to Makima's Architecture
+
+### What Makima Already Has
+
+| Compound Engineering Feature | Makima Equivalent | Coverage |
+|------------------------------|-------------------|----------|
+| Plan → Work → Review loop | Contract phases (Research → Specify → Plan → Execute → Review) | ✅ Full |
+| Task orchestration | Supervisor/worker hierarchy with `spawn-task` | ✅ Full |
+| Parallel task execution | Multiple workers in separate worktrees | ✅ Full |
+| Task isolation | Git worktree per task | ✅ Full |
+| Phase transitions | `supervisor advance-phase` with phase guards | ✅ Full |
+| Pipeline orchestration | Directive system with DAG dependencies | ✅ Full |
+| User interaction during execution | `supervisor ask` with timeout/choices | ✅ Full |
+| Task continuation | `continue_from_task_id`, `--continue` flag | ✅ Full |
+| Branching/forking | `supervisor branch`, `task-fork`, `task-rewind` | ✅ Full |
+| Circuit breakers | CircuitBreaker (max iterations, stuck detection) | ✅ Full |
+| Completion gates | `<COMPLETION_GATE>` parsing in autonomous loop | ✅ Full |
+| Document management | Contract files with versioning, structured body | ✅ Full |
+
+### What Makima Is Missing (Gaps)
+
+| Compound Engineering Feature | Makima Gap | Priority | Proposal |
+|------------------------------|-----------|----------|----------|
+| Multi-agent parallel review | No automated review, no review task templates | **High** | [feature-multi-agent-review.md](feature-multi-agent-review.md) |
+| Compound learning / knowledge accumulation | No cross-contract knowledge capture | **High** | [feature-knowledge-accumulation.md](feature-knowledge-accumulation.md) |
+| Plan deepening with research agents | Single-pass planning, no research integration | **Medium** | [feature-plan-deepening.md](feature-plan-deepening.md) |
+| One-command pipelines (LFG/SLFG) | Manual orchestration per contract | **High** | [feature-workflow-presets.md](feature-workflow-presets.md) |
+| Structured findings/TODOs | Unstructured review output | **Medium** | [feature-findings-tracking.md](feature-findings-tracking.md) |
+| Reusable task/agent templates | Ad-hoc plans, no template reuse | **Medium** | [feature-task-templates.md](feature-task-templates.md) |
+
+---
+
+## Feature Set Summary
+
+| # | Feature | Priority | Complexity | Effort | Proposal |
+|---|---------|----------|------------|--------|----------|
+| 1 | Multi-Agent Parallel Review | High | Medium | 12-18 days | [Link](feature-multi-agent-review.md) |
+| 2 | Knowledge Accumulation | High | Medium | 10-15 days | [Link](feature-knowledge-accumulation.md) |
+| 3 | Plan Deepening | Medium | Low | 5-8 days | [Link](feature-plan-deepening.md) |
+| 4 | Workflow Presets | High | Medium | 10-15 days | [Link](feature-workflow-presets.md) |
+| 5 | Findings Tracking | Medium | Low | 7-10 days | [Link](feature-findings-tracking.md) |
+| 6 | Task Templates | Medium | Medium | 8-12 days | [Link](feature-task-templates.md) |
+| | **Total** | | | **52-78 days** | |
+
+---
+
+## Implementation Strategy
+
+### Recommended Phasing
+
+```
+Phase 1: Foundations (Weeks 1-4)
+├── Workflow Presets ────────── Enables one-command pipelines
+└── Findings Tracking ──────── Structured review output format
+
+Phase 2: Core Loop (Weeks 5-9)
+├── Multi-Agent Review ──────── Automated parallel review
+└── Knowledge Accumulation ──── Cross-contract learning
+
+Phase 3: Enhancement (Weeks 10-13)
+├── Plan Deepening ──────────── Research-enhanced planning
+└── Task Templates ──────────── Reusable patterns
+```
+
+**Rationale for ordering:**
+
+1. **Phase 1** builds infrastructure that Phase 2 depends on:
+ - Workflow Presets provide the pipeline framework that Review and Learning plug into
+ - Findings Tracking provides the structured output format that Review agents produce
+
+2. **Phase 2** implements the core compound loop:
+ - Multi-Agent Review produces structured findings
+ - Knowledge Accumulation closes the feedback loop
+
+3. **Phase 3** optimizes the system:
+ - Plan Deepening uses the knowledge base to enhance plans
+ - Task Templates codify proven patterns for reuse
+
+### Integration Points Between Features
+
+```
+ ┌─────────────────┐
+ │ Workflow Presets │
+ │ (orchestrator) │
+ └────────┬────────┘
+ │ triggers phases
+ ┌──────────────┼──────────────┐
+ ▼ ▼ ▼
+ ┌────────────┐ ┌──────────────┐ ┌───────────┐
+ │ Plan │ │ Multi-Agent │ │ Knowledge │
+ │ Deepening │ │ Review │ │ Accum. │
+ └─────┬──────┘ └──────┬───────┘ └─────┬─────┘
+ │ │ │
+ │ produces │ │
+ │ ▼ │
+ │ ┌──────────────┐ │
+ │ │ Findings │ │
+ │ │ Tracking │ │
+ │ └──────────────┘ │
+ │ │
+ └──────── feeds into ──────────────┘
+ │
+ ┌────┴─────┐
+ │ Task │
+ │ Templates│
+ └──────────┘
+ codifies patterns
+```
+
+---
+
+## Competitive Analysis
+
+### Compound Engineering Plugin Strengths
+
+| Strength | Detail |
+|----------|--------|
+| **Depth of review** | 12-15 specialized reviewers catch issues a single reviewer misses |
+| **Knowledge compounding** | Learnings are never lost; they compound over time |
+| **One-command pipelines** | `/lfg` runs full plan→work→review→compound cycle |
+| **Self-improvement** | Meta-commands create new agents/skills on demand |
+| **Swarm patterns** | Sophisticated parallel group management |
+
+### Makima Strengths
+
+| Strength | Detail |
+|----------|--------|
+| **True isolation** | Git worktrees provide real filesystem isolation, not just context isolation |
+| **Persistent orchestration** | Contracts survive across sessions; plugin agents are ephemeral |
+| **DAG execution** | Directives model complex dependency graphs natively |
+| **User interaction** | Rich question/answer system with timeouts and multi-select |
+| **Infrastructure** | Server-based architecture with WebSocket real-time communication |
+| **Checkpoint/recovery** | Full task rewind, fork, and patch-based recovery |
+| **Phase governance** | Phase guards require explicit user approval for transitions |
+
+### Combined Value Proposition
+
+| Dimension | Plugin Alone | Makima Alone | Combined |
+|-----------|-------------|-------------|----------|
+| Review quality | ⭐⭐⭐⭐⭐ | ⭐⭐ | ⭐⭐⭐⭐⭐ |
+| Task isolation | ⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
+| Knowledge retention | ⭐⭐⭐⭐ | ⭐ | ⭐⭐⭐⭐⭐ |
+| Persistent orchestration | ⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
+| Pipeline automation | ⭐⭐⭐⭐ | ⭐⭐ | ⭐⭐⭐⭐⭐ |
+| Self-improvement | ⭐⭐⭐⭐ | ⭐ | ⭐⭐⭐⭐ |
+
+---
+
+## Risk Analysis
+
+### Technical Risks
+
+| Risk | Impact | Likelihood | Mitigation |
+|------|--------|------------|------------|
+| Parallel review agents overwhelm system resources | High | Medium | Implement concurrency limits; use makima's existing CircuitBreaker |
+| Knowledge base grows unwieldy | Medium | High | Implement relevance decay, deduplication, and quality gates |
+| Workflow presets too rigid for diverse use cases | Medium | Medium | Support variable substitution and optional steps |
+| Review synthesis produces noisy/contradictory results | Medium | Medium | Weighted deduplication with priority-based conflict resolution |
+| Template proliferation creates maintenance burden | Low | Medium | Template versioning and deprecation lifecycle |
+
+### Organizational Risks
+
+| Risk | Impact | Likelihood | Mitigation |
+|------|--------|------------|------------|
+| Scope creep across all 6 features | High | High | Strict phasing; each feature is independently shippable |
+| Users don't adopt knowledge accumulation habits | Medium | Medium | Make it automatic (not opt-in); integrate with workflow presets |
+| Configuration complexity deters users | Medium | Medium | Sensible defaults; progressive disclosure of configuration |
+
+---
+
+## Success Metrics
+
+### Per-Feature Metrics
+
+| Feature | Key Metric | Target |
+|---------|-----------|--------|
+| Multi-Agent Review | Defects caught before merge | 40% increase vs single review |
+| Knowledge Accumulation | Knowledge reuse rate | >30% of new contracts reference existing learnings |
+| Plan Deepening | Plan revision rate after execution starts | <15% (down from estimated ~40%) |
+| Workflow Presets | Time from contract creation to first commit | 50% reduction |
+| Findings Tracking | Finding resolution rate | >85% of P1/P2 findings resolved |
+| Task Templates | Template reuse rate | >25% of tasks use templates after 3 months |
+
+### System-Level Metrics
+
+- **Cycle time**: Time from contract creation to completion — target 30% reduction
+- **Defect escape rate**: Issues found post-merge — target 50% reduction
+- **Knowledge density**: Learnings per contract — target >2.5 after 6 months
+- **User satisfaction**: Survey score — target >4.2/5.0
+
+---
+
+## Conclusion
+
+The compound engineering plugin represents a mature implementation of agent-native engineering workflows. Its greatest innovations—parallel multi-perspective review, knowledge compounding, and autonomous pipelines—address real gaps in makima's current capabilities.
+
+Makima's infrastructure advantages (true worktree isolation, persistent contracts, DAG-based directives, server architecture) provide a superior foundation for implementing these features. The proposed phased approach delivers incremental value while building toward the full compound engineering loop.
+
+The combined system would offer something neither tool provides alone: **persistent, isolated, knowledge-compounding engineering workflows with multi-agent review and one-command pipeline automation**.
+
+---
+
+*Next steps: Review individual feature proposals for detailed implementation plans.*
diff --git a/docs/proposals/feature-findings-tracking.md b/docs/proposals/feature-findings-tracking.md
new file mode 100644
index 0000000..bb8a68e
--- /dev/null
+++ b/docs/proposals/feature-findings-tracking.md
@@ -0,0 +1,504 @@
+# Feature Proposal: Structured Findings / Issues Tracking
+
+> **Priority:** Medium
+> **Complexity:** Low
+> **Estimated Effort:** 7-10 days
+> **Status:** Proposal
+> **Date:** 2026-02-09
+> **Dependencies:** None (standalone, but enhances [Multi-Agent Review](feature-multi-agent-review.md))
+> **Related:** [Overview Analysis](compound-engineering-analysis.md) · [Multi-Agent Review](feature-multi-agent-review.md) · [Workflow Presets](feature-workflow-presets.md)
+
+---
+
+## Problem Statement
+
+Currently, review outputs in makima are **unstructured text** in task conversation history:
+
+- **No standard format** for reporting issues found during review
+- **No severity classification** — all findings are treated equally
+- **No lifecycle tracking** — findings are either "in the review output" or "hopefully fixed"
+- **No verification** — there's no way to confirm a finding was actually resolved
+- **No aggregation** — findings from multiple review tasks can't be collected and deduplicated
+- **No blocking mechanism** — critical findings can't prevent phase transitions
+- **No metrics** — no data on how many findings are produced, resolved, or escaped
+
+This makes the review phase a documentation exercise rather than a quality gate.
+
+---
+
+## How Compound Engineering Solves This
+
+The compound engineering plugin uses **structured TODO/finding files** with YAML frontmatter and a defined lifecycle:
+
+### File Format
+
+````markdown
+---
+id: SEC-001
+status: open
+priority: P1
+category: security
+title: SQL injection in user search endpoint
+file: src/api/users.rs
+line: 47
+agent: security-sentinel
+created: 2026-02-09T10:30:00Z
+updated: 2026-02-09T10:30:00Z
+tags: [injection, input-validation, database]
+---
+
+# SQL Injection in User Search Endpoint
+
+## Finding
+The `search_users` handler directly interpolates the `query` parameter into
+a SQL string without parameterization.
+
+## Evidence
+```rust
+// src/api/users.rs:47
+let sql = format!("SELECT * FROM users WHERE name LIKE '%{}%'", query);
+```
+
+## Impact
+An attacker can execute arbitrary SQL queries, potentially:
+- Exfiltrating all user data
+- Modifying or deleting records
+- Escalating privileges
+
+## Recommendation
+Use parameterized queries:
+```rust
+let results = sqlx::query("SELECT * FROM users WHERE name LIKE $1")
+    .bind(format!("%{}%", query))
+    .fetch_all(&pool)
+    .await?;
+```
+
+## Resolution
+_Not yet resolved_
+````
+
+### File Naming Convention
+
+```
+findings/{issue_id}-{status}-{priority}-{description}.md
+```
+
+Example: `findings/SEC-001-open-P1-sql-injection-user-search.md`
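+
+A minimal sketch of how a file name following this convention could be derived. The function name and the slug rules (lowercase, runs of non-alphanumeric characters collapsed into a single hyphen) are assumptions for illustration; the plugin's actual rules are not documented here.
+
+```rust
+/// Build a file name of the form
+/// `findings/{issue_id}-{status}-{priority}-{description}.md`.
+/// The slug rules below are an assumption, not the plugin's documented behavior.
+fn finding_file_name(id: &str, status: &str, priority: &str, description: &str) -> String {
+    let mut slug = String::new();
+    for ch in description.chars() {
+        if ch.is_ascii_alphanumeric() {
+            slug.push(ch.to_ascii_lowercase());
+        } else if !slug.is_empty() && !slug.ends_with('-') {
+            // Collapse any run of separators or punctuation into one hyphen.
+            slug.push('-');
+        }
+    }
+    let slug = slug.trim_end_matches('-');
+    format!("findings/{id}-{status}-{priority}-{slug}.md")
+}
+```
+
+Because the status is part of the name, a status change under this convention implies renaming the file.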
+
+### Lifecycle
+
+```
+open ──▶ in-progress ──▶ resolved ──▶ verified
+  │                         │
+  └── wont-fix ◀────────────┘
+```
+
+---
+
+## Proposed Makima Implementation
+
+### 1. Finding Record Format
+
+Findings are stored as **contract files** with structured metadata and body:
+
+```rust
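+// Requires the `serde` feature of both `chrono` and `uuid`.
+use chrono::{DateTime, Utc};
+use serde::{Deserialize, Serialize};
+use uuid::Uuid;
+
+// Finding metadata (stored in file description as structured JSON)
+#[derive(Serialize, Deserialize)]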
+pub struct FindingMetadata {
+ pub id: String, // "SEC-001", auto-generated
+ pub status: FindingStatus, // open, in_progress, resolved, verified, wont_fix
+ pub severity: FindingSeverity, // P1 (critical), P2 (major), P3 (minor)
+ pub category: String, // security, performance, architecture, etc.
+ pub title: String, // Short description
+ pub file_path: Option<String>, // Affected file
+ pub line_number: Option<u32>, // Affected line
+ pub source_agent: Option<String>, // Which review agent found this
+ pub source_task_id: Option<Uuid>, // Task that produced this finding
+ pub assigned_to: Option<Uuid>, // Task assigned to resolve this
+ pub created_at: DateTime<Utc>,
+ pub updated_at: DateTime<Utc>,
+ pub resolved_at: Option<DateTime<Utc>>,
+ pub verified_at: Option<DateTime<Utc>>,
+ pub tags: Vec<String>,
+}
+
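+// Serialized in snake_case: open, in_progress, resolved, verified, wont_fix
+#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize)]
+#[serde(rename_all = "snake_case")]
+pub enum FindingStatus {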
+ Open,
+ InProgress,
+ Resolved,
+ Verified,
+ WontFix,
+}
+
+#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize)]
+pub enum FindingSeverity {
+ P1, // Critical — must fix before merge
+ P2, // Major — should fix, can defer with justification
+ P3, // Minor — nice to fix, can defer
+}
+```
+
+### 2. Supervisor Commands
+
+#### Create a Finding
+
+```bash
+# Create a finding from review output
+makima supervisor finding create \
+ --severity P1 \
+ --category security \
+ --title "SQL injection in user search endpoint" \
+ --file src/api/users.rs \
+ --line 47 \
+ --description "Direct string interpolation in SQL query"
+
+# Output: Created finding SEC-001 (P1/security)
+```
+
+#### List Findings
+
+```bash
+# List all findings for the current contract
+makima supervisor finding list
+# Output:
+# ID SEVERITY STATUS CATEGORY TITLE
+# SEC-001 P1 open security SQL injection in user search
+# PERF-001 P2 in-progress performance N+1 query in order listing
+# ARCH-001 P3 resolved architecture Handler accessing DB directly
+
+# Filter by severity
+makima supervisor finding list --severity P1
+
+# Filter by status
+makima supervisor finding list --status open
+
+# Summary only
+makima supervisor finding summary
+# Output:
+# Total: 12 findings
+# P1: 2 open, 1 resolved
+# P2: 3 open, 2 in-progress
+# P3: 4 resolved
+```
+
+#### Update Finding Status
+
+```bash
+# Mark as in-progress (assigned to a task)
+makima supervisor finding update SEC-001 --status in-progress --assigned-to <task-id>
+
+# Mark as resolved
+makima supervisor finding update SEC-001 --status resolved \
+ --resolution "Replaced with parameterized query in commit abc123"
+
+# Mark as verified (after re-review)
+makima supervisor finding update SEC-001 --status verified
+
+# Mark as won't fix
+makima supervisor finding update SEC-001 --status wont-fix \
+ --justification "Endpoint is internal-only, behind auth"
+```
+
+#### Auto-Create from Review Output
+
+```bash
+# Parse review agent output and create findings automatically
+makima supervisor finding parse-output --task-id <review-task-id>
+```
+
+This parses structured review output and creates individual finding records.
+
+### 3. Finding Lifecycle
+
+```
+┌────────────────────────────────────────────────────────────────────┐
+│                         Finding Lifecycle                          │
+│                                                                    │
+│   ┌──────┐    ┌─────────────┐    ┌──────────┐                      │
+│   │      │    │             │    │          │                      │
+│   │ OPEN │───▶│ IN-PROGRESS │───▶│ RESOLVED │                      │
+│   │      │    │             │    │          │                      │
+│   └──┬───┘    └─────────────┘    └────┬─────┘                      │
+│      │                                │                            │
+│      │        ┌─────────────┐    ┌────┴─────┐                      │
+│      │        │             │    │          │                      │
+│      └───────▶│  WONT-FIX   │    │ VERIFIED │                      │
+│               │             │    │          │                      │
+│               └─────────────┘    └──────────┘                      │
+│                                                                    │
+│   Triggers:                                                        │
+│   open ─▶ in_progress     : Task assigned to fix                   │
+│   in_progress ─▶ resolved : Fix committed                          │
+│   resolved ─▶ verified    : Re-review confirms fix                 │
+│   open ─▶ wont_fix        : Explicit decision with justification   │
+│   resolved ─▶ wont_fix    : Fix deemed unnecessary after review    │
+└────────────────────────────────────────────────────────────────────┘
+```
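+
+The trigger list above amounts to a small state machine. The sketch below encodes exactly those five transitions; `can_transition` is a hypothetical helper name, and the enum is redeclared locally so the example is self-contained.
+
+```rust
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+enum FindingStatus {
+    Open,
+    InProgress,
+    Resolved,
+    Verified,
+    WontFix,
+}
+
+/// Returns true only for the transitions named in the trigger list.
+fn can_transition(from: FindingStatus, to: FindingStatus) -> bool {
+    use FindingStatus::*;
+    matches!(
+        (from, to),
+        (Open, InProgress)          // task assigned to fix
+            | (InProgress, Resolved) // fix committed
+            | (Resolved, Verified)   // re-review confirms fix
+            | (Open, WontFix)        // explicit decision with justification
+            | (Resolved, WontFix)    // fix deemed unnecessary after review
+    )
+}
+```
+
+`Verified` and `WontFix` are terminal in this sketch. A failed verification that sends a finding back for another fix would need an extra transition that the trigger list does not name.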
+
+### 4. P1/P2/P3 Severity System
+
+| Severity | Name | Description | Merge Policy |
+|----------|------|-------------|--------------|
+| **P1** | Critical | Security vulnerabilities, data loss risks, crash bugs | **Blocks merge** — must be resolved before contract completion |
+| **P2** | Major | Performance issues, architectural concerns, significant tech debt | **Should fix** — can defer with explicit justification |
+| **P3** | Minor | Style issues, minor improvements, documentation gaps | **Nice to fix** — can defer freely |
+
+### 5. Merge Blocking
+
+When findings exist, phase transitions and merge operations check for blockers:
+
+```rust
+// In advance-phase handler
+async fn check_findings_gate(contract_id: Uuid) -> Result<bool> {
+    let findings = get_findings(contract_id).await?;
+    // Only P1 findings that are still open block the transition.
+    let open_p1s = findings
+        .iter()
+        .filter(|f| f.severity == FindingSeverity::P1 && f.status == FindingStatus::Open)
+        .count();
+
+    if open_p1s > 0 {
+        warn!("{} open P1 findings block phase transition", open_p1s);
+        return Ok(false);
+    }
+    Ok(true)
+}
+```
+
+### 6. Auto-Resolution Workflow
+
+When the Multi-Agent Review feature is available, findings drive an automated resolution cycle:
+
+```
+┌──────────┐     ┌───────────┐     ┌──────────┐     ┌──────────┐
+│  Review  │────▶│ Findings  │────▶│ Resolve  │────▶│  Verify  │
+│  Phase   │     │ Created   │     │ Tasks    │     │  Fixes   │
+│          │     │ (P1/P2/P3)│     │ Spawned  │     │  Pass?   │
+└──────────┘     └───────────┘     └──────────┘     └────┬─────┘
+                                                         │
+                                                    Yes  │  No
+                                                    ┌────┴────┐
+                                                    ▼         ▼
+                                              ┌──────────┐  Loop back
+                                              │ Findings │  to resolve
+                                              │ Verified │
+                                              └──────────┘
+```
+
+```bash
+# Auto-resolve: spawn tasks to fix each P1/P2 finding
+makima supervisor finding auto-resolve --severity P1,P2
+
+# This spawns one task per finding:
+# - Task plan includes the finding details and recommendation
+# - Task is assigned to the finding (finding.assigned_to = task.id)
+# - When task completes, finding status → resolved
+# - Verification task confirms the fix
+```
+
+---
+
+## Integration with Existing Makima Features
+
+### Contract Files
+
+Each finding is stored as a **contract file**:
+
+```rust
+File {
+ contract_id: Some(contract.id),
+ contract_phase: Some("review"),
+ name: "Finding: SEC-001 — SQL injection in user search",
+ description: Some(serde_json::to_string(&finding_metadata)?),
+ body: vec![
+ BodyElement::Heading { level: 1, text: finding.title },
+ BodyElement::Heading { level: 2, text: "Finding" },
+ BodyElement::Paragraph { text: finding.description },
+ BodyElement::Heading { level: 2, text: "Evidence" },
+ BodyElement::Code { language: Some("rust"), content: finding.evidence },
+ BodyElement::Heading { level: 2, text: "Recommendation" },
+ BodyElement::Paragraph { text: finding.recommendation },
+ ],
+}
+```
+
+### Phase Guards
+
+Findings integrate with existing phase guards:
+- Phase guard checks finding gate before allowing transition
+- User sees a summary of open findings when reviewing phase transition
+- P1 findings produce a warning that requires explicit override
+
+### Supervisor Questions
+
+When P1 findings block a transition, the supervisor can ask:
+
+```bash
+makima supervisor ask \
+ "2 P1 findings are still open. How would you like to proceed?" \
+ --choices "Fix findings first,Override and continue,Mark as won't-fix" \
+ --context "SEC-001: SQL injection (P1), PERF-001: Memory leak (P1)"
+```
+
+### Task Assignment
+
+Findings reference tasks:
+- `source_task_id`: The review task that discovered the finding
+- `assigned_to`: The task spawned to resolve the finding
+
+```bash
+# Spawn a fix task and assign the finding
+makima supervisor spawn "fix-sec-001" \
+ --plan "Fix SQL injection vulnerability in src/api/users.rs:47. Use parameterized queries."
+
+makima supervisor finding update SEC-001 \
+ --status in-progress \
+ --assigned-to <spawned-task-id>
+```
+
+### Autonomous Loop
+
+The autonomous loop can use findings as a completion gate condition:
+
+```xml
+<COMPLETION_GATE>
+ready: false
+reason: "2 P1 findings still open"
+progress: "Resolved 5/7 findings"
+blockers: ["SEC-001: SQL injection", "PERF-001: Memory leak"]
+</COMPLETION_GATE>
+```
+
+---
+
+## Implementation Plan
+
+### Phase 1: Core Finding System (3-4 days)
+
+| Task | Effort | Description |
+|------|--------|-------------|
+| Finding metadata schema | 0.5 days | FindingMetadata struct, validation |
+| `finding create` command | 1 day | Create finding as contract file |
+| `finding list/summary` commands | 0.5 days | Query and display findings |
+| `finding update` command | 0.5 days | Status transitions, validation |
+| Auto-ID generation | 0.5 days | Category-based IDs (SEC-001, PERF-002) |
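+
+A minimal sketch of category-based ID generation, assuming one counter per category prefix (consistent with the `SEC-001`, `PERF-001`, `ARCH-001` listing shown earlier). The `FindingIdGenerator` type and the fallback prefix rule are hypothetical. A real implementation would also need to persist the counters per contract.
+
+```rust
+use std::collections::HashMap;
+
+/// Map a category to its ID prefix. The three named prefixes follow the
+/// examples in this proposal; the fallback rule is an assumption.
+fn category_prefix(category: &str) -> String {
+    match category {
+        "security" => "SEC".to_string(),
+        "performance" => "PERF".to_string(),
+        "architecture" => "ARCH".to_string(),
+        // Assumed fallback: first four ASCII letters, uppercased.
+        other => other
+            .chars()
+            .filter(|c| c.is_ascii_alphabetic())
+            .take(4)
+            .collect::<String>()
+            .to_ascii_uppercase(),
+    }
+}
+
+/// Hands out sequential IDs, with an independent counter per prefix.
+struct FindingIdGenerator {
+    counters: HashMap<String, u32>,
+}
+
+impl FindingIdGenerator {
+    fn new() -> Self {
+        Self { counters: HashMap::new() }
+    }
+
+    fn next_id(&mut self, category: &str) -> String {
+        let prefix = category_prefix(category);
+        let counter = self.counters.entry(prefix.clone()).or_insert(0);
+        *counter += 1;
+        // Zero-pad to three digits, e.g. SEC-001.
+        format!("{}-{:03}", prefix, *counter)
+    }
+}
+```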
+
+### Phase 2: Integration (2-3 days)
+
+| Task | Effort | Description |
+|------|--------|-------------|
+| Phase guard integration | 0.5 days | Check P1 findings before transition |
+| `finding parse-output` | 1 day | Parse review task output into findings |
+| Merge blocking logic | 0.5 days | Block merge with open P1s |
+| Finding assignment to tasks | 0.5 days | Track resolution via task ID |
+
+### Phase 3: Automation & Polish (2-3 days)
+
+| Task | Effort | Description |
+|------|--------|-------------|
+| `finding auto-resolve` | 1 day | Spawn fix tasks per finding |
+| Verification workflow | 0.5 days | Re-review to verify fixes |
+| Finding reports | 0.5 days | Summary contract file |
+| Documentation | 0.5 days | User guide |
+| Tests | 0.5 days | Unit + integration |
+
+---
+
+## Configuration Examples
+
+### Finding Creation in Review Agent Output
+
+Review agents produce structured findings in their output:
+
+````markdown
+## FINDING: SQL Injection in User Search
+
+- **Severity**: P1
+- **Category**: security
+- **File**: src/api/users.rs
+- **Line**: 47
+- **Tags**: injection, input-validation, database
+
+### Description
+The `search_users` handler directly interpolates the `query` parameter...
+
+### Evidence
+```rust
+let sql = format!("SELECT * FROM users WHERE name LIKE '%{}%'", query);
+```
+
+### Recommendation
+Use parameterized queries with sqlx::query().bind()
+````
+
+The synthesis step parses these into formal Finding records.
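+
+A rough sketch of that parsing step, assuming the output follows the `## FINDING:` heading and `- **Key**: value` bullet layout shown above. `ParsedFinding` and `parse_findings` are hypothetical names. The sketch reads only the heading and the metadata bullets, and ignores the Description, Evidence, and Recommendation sections.
+
+```rust
+/// Metadata extracted from one `## FINDING:` block (hypothetical type).
+#[derive(Debug, Default, PartialEq)]
+struct ParsedFinding {
+    title: String,
+    severity: Option<String>,
+    category: Option<String>,
+    file: Option<String>,
+    line: Option<u32>,
+    tags: Vec<String>,
+}
+
+fn parse_findings(output: &str) -> Vec<ParsedFinding> {
+    let mut findings: Vec<ParsedFinding> = Vec::new();
+    for raw in output.lines() {
+        let line = raw.trim();
+        // A new heading starts a new finding.
+        if let Some(title) = line.strip_prefix("## FINDING:") {
+            findings.push(ParsedFinding {
+                title: title.trim().to_string(),
+                ..Default::default()
+            });
+            continue;
+        }
+        // Metadata bullets look like `- **Key**: value`; skip everything else.
+        let Some(current) = findings.last_mut() else { continue };
+        let Some(rest) = line.strip_prefix("- **") else { continue };
+        let Some((key, value)) = rest.split_once("**:") else { continue };
+        let value = value.trim();
+        match key {
+            "Severity" => current.severity = Some(value.to_string()),
+            "Category" => current.category = Some(value.to_string()),
+            "File" => current.file = Some(value.to_string()),
+            // A non-numeric line value is dropped rather than treated as an error.
+            "Line" => current.line = value.parse().ok(),
+            "Tags" => {
+                current.tags = value.split(',').map(|t| t.trim().to_string()).collect();
+            }
+            _ => {}
+        }
+    }
+    findings
+}
+```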
+
+### Merge Blocking Configuration
+
+```yaml
+# .makima/review-agents.yaml (or contract config)
+review:
+ findings:
+ merge_blocking_severity: P1 # P1 blocks merge
+ require_justification: P2 # P2 needs justification to defer
+ auto_resolve: true # Spawn fix tasks for P1/P2
+ auto_resolve_severity: P1,P2 # Which severities to auto-resolve
+ verification:
+ enabled: true # Re-review after resolution
+ re_review_agents: # Which agents verify fixes
+ - security-sentinel # Security findings verified by security agent
+```
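+
+One way to read the two severity thresholds in this configuration, sketched under the assumption that each threshold covers the named severity and anything more severe. `FindingsPolicy`, `DeferPolicy`, and `defer_policy` are hypothetical names, not existing makima types.
+
+```rust
+/// Declaration order gives P1 < P2 < P3, so "more severe" means "smaller".
+#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
+enum Severity {
+    P1,
+    P2,
+    P3,
+}
+
+#[derive(Debug, PartialEq, Eq)]
+enum DeferPolicy {
+    BlocksMerge,
+    NeedsJustification,
+    DeferFreely,
+}
+
+/// Mirrors `merge_blocking_severity` and `require_justification` from the YAML.
+struct FindingsPolicy {
+    merge_blocking_severity: Severity,
+    require_justification: Severity,
+}
+
+fn defer_policy(policy: &FindingsPolicy, severity: Severity) -> DeferPolicy {
+    if severity <= policy.merge_blocking_severity {
+        DeferPolicy::BlocksMerge
+    } else if severity <= policy.require_justification {
+        DeferPolicy::NeedsJustification
+    } else {
+        DeferPolicy::DeferFreely
+    }
+}
+```
+
+With the values shown in the YAML (`P1` and `P2`), this reproduces the merge policy column of the severity table: P1 blocks, P2 needs justification, P3 can be deferred freely.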
+
+### Finding Lifecycle Example
+
+```bash
+# 1. Review creates finding
+makima supervisor finding create --severity P1 --category security \
+ --title "SQL injection in user search" --file src/api/users.rs --line 47
+
+# 2. Auto-resolve spawns fix task
+makima supervisor finding auto-resolve --severity P1
+# → Spawns task "fix-SEC-001" with plan based on finding details
+
+# 3. Fix task completes, finding auto-updated
+# finding SEC-001: open → in-progress → resolved
+
+# 4. Verification re-reviews the fix
+makima supervisor finding verify SEC-001
+# → Spawns verification task targeting the specific file/line
+
+# 5. Verification passes
+# finding SEC-001: resolved → verified
+
+# 6. Phase transition allowed
+makima supervisor advance-phase compound -y
+```
+
+---
+
+## Open Questions
+
+1. **Finding storage**: Contract files vs. dedicated findings table in the database? Contract files are simpler but querying is less efficient.
+2. **Cross-contract findings**: Should findings persist across contracts? (e.g., a P2 deferred from one contract carries to the next)
+3. **Finding templates**: Should common finding types have templates? (e.g., "SQL injection" pre-fills category, severity, recommendation)
+4. **External integration**: Should findings be exportable to GitHub Issues, Jira, or other issue trackers?
+5. **Metric tracking**: How granular should finding metrics be? Per-contract? Per-repository? Per-category?
+6. **False positive handling**: How should agents indicate confidence level? Should low-confidence findings be automatically P3?
+
+---
+
+## Alternatives Considered
+
+| Alternative | Pros | Cons | Decision |
+|-------------|------|------|----------|
+| GitHub Issues integration | Rich UI, collaboration | External dependency; not all projects use GitHub | Deferred — consider as export target |
+| Plain text findings | Simple | Not queryable, no lifecycle | Rejected — defeats the purpose |
+| Dedicated findings DB table | Fast queries, rich indexing | New infrastructure, migration | Recommended for v2 |
+| Contract file-based | Uses existing infrastructure | Slower queries for large sets | Adopted for v1 |
+| Inline code comments | Close to code | Lost on next commit; hard to track | Rejected — not persistent |
+
+---
+
+## Priority & Complexity Assessment
+
+- **Priority: MEDIUM** — Structured findings transform the review phase from documentation to a quality gate. Essential for the Multi-Agent Review feature to produce actionable output.
+- **Complexity: LOW** — Finding records are simple structured data. Lifecycle state machine is straightforward. Main integration point (phase guards) already exists.
+- **Risk: LOW** — Purely additive feature. Worst case: findings exist but aren't used (same as today). Can be adopted incrementally.
diff --git a/docs/proposals/feature-knowledge-accumulation.md b/docs/proposals/feature-knowledge-accumulation.md
new file mode 100644
index 0000000..faef06a
--- /dev/null
+++ b/docs/proposals/feature-knowledge-accumulation.md
@@ -0,0 +1,539 @@
+# Feature Proposal: Knowledge Accumulation / Compound Learning System
+
+> **Priority:** High
+> **Complexity:** Medium
+> **Estimated Effort:** 10-15 days
+> **Status:** Proposal
+> **Date:** 2026-02-09
+> **Dependencies:** Contract Files system (existing)
+> **Related:** [Overview Analysis](compound-engineering-analysis.md) · [Plan Deepening](feature-plan-deepening.md) · [Workflow Presets](feature-workflow-presets.md)
+
+---
+
+## Problem Statement
+
+When a makima contract completes, the **knowledge generated during that contract is effectively lost**:
+
+- **Solutions to tricky problems** exist only in task conversation history, which is not searchable or surfaceable
+- **Patterns discovered** during one contract cannot inform future contracts
+- **Mistakes made** in one contract are likely to be repeated in similar future contracts
+- **Best practices** established during execution are not codified anywhere retrievable
+- **Contract files** capture deliverables but not the *meta-knowledge* about how those deliverables were produced
+
+This means every new contract starts from zero context, even when the team has solved similar problems before. Engineering effort does not compound.
+
+---
+
+## How Compound Engineering Solves This
+
+The compound engineering plugin implements a `/compound` command that runs **5 parallel sub-agents** immediately after review:
+
+```
+┌─────────────────────────────────────────────────────────┐
+│ /compound │
+│ │
+│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
+│ │ Context │ │ Solution │ │ Prevention │ │
+│ │ Extractor │ │ Documenter │ │ Strategist │ │
+│ └──────┬──────┘ └──────┬──────┘ └──────┬──────┘ │
+│ │ │ │ │
+│ ┌──────┴──────┐ ┌──────┴──────┐ │
+│ │ Doc │ │ Category │ │
+│ │ Linker │ │ Classifier │ │
+│ └──────┬──────┘ └──────┬──────┘ │
+│ │ │ │
+│ ▼ ▼ │
+│ ┌──────────────────────────────────────┐ │
+│ │ docs/solutions/[category]/file.md │ │
+│ │ │ │
+│ │ --- │ │
+│ │ category: build-errors │ │
+│ │ severity: medium │ │
+│ │ tags: [webpack, esm, cjs] │ │
+│ │ date: 2026-02-09 │ │
+│ │ contract: abc-123 │ │
+│ │ --- │ │
+│ │ │ │
+│ │ # Mixed ESM/CJS Import Resolution │ │
+│ │ │ │
+│ │ ## Problem │ │
+│ │ ... │ │
+│ │ ## Solution │ │
+│ │ ... │ │
+│ │ ## Prevention │ │
+│ │ ... │ │
+│ └──────────────────────────────────────┘ │
+└─────────────────────────────────────────────────────────┘
+```
+
+### 9 Auto-Detected Categories
+
+| Category | Description |
+|----------|-------------|
+| `build-errors` | Compilation, bundling, dependency resolution |
+| `test-failures` | Test setup, assertion patterns, mocking |
+| `api-patterns` | API design, endpoint structure, versioning |
+| `architecture-decisions` | Structural choices, trade-offs, patterns |
+| `performance-optimizations` | Speed, memory, caching strategies |
+| `security-practices` | Auth, input validation, secrets management |
+| `debugging-techniques` | Investigation methods, logging strategies |
+| `tooling-configurations` | Tool setup, config patterns, CI/CD |
+| `domain-knowledge` | Business logic, domain-specific patterns |
+
+---
+
+## Proposed Makima Implementation
+
+### 1. New "Compound" Phase
+
+Add an optional **compound** phase to the contract lifecycle, positioned after review:
+
+```
+Research → Specify → Plan → Execute → Review → Compound
+                                                  ▲
+                                             (new phase)
+```
+
+**Phase behavior:**
+- **Auto-triggered** after review phase completes (configurable)
+- **Short-lived** — typically completes in 1-3 minutes
+- Extracts learnings from the contract's execution and review
+- Stores them as searchable, categorized learning documents
+- Can be skipped via configuration for trivial contracts
+
+### 2. New Supervisor Command: `makima supervisor compound`
+
+```bash
+# Run compound learning for the current contract
+makima supervisor compound
+
+# Compound with specific focus areas
+makima supervisor compound --focus "security,performance"
+
+# Compound with explicit learnings
+makima supervisor compound --learning "The retry logic needed exponential backoff, not fixed delay"
+```
+
+**Implementation:**
+
+```bash
+# Under the hood, this spawns learning sub-agents
+makima supervisor spawn-group "compound" \
+ --tasks '[
+ {
+ "name": "context-extractor",
+ "plan": "Extract the problem context, constraints, and environment details from the contract execution history..."
+ },
+ {
+ "name": "solution-documenter",
+ "plan": "Document the solutions that were applied, including code patterns and configuration changes..."
+ },
+ {
+ "name": "prevention-strategist",
+ "plan": "Identify what could prevent this class of problem in the future..."
+ },
+ {
+ "name": "category-classifier",
+ "plan": "Classify these learnings into the appropriate category..."
+ },
+ {
+ "name": "doc-linker",
+ "plan": "Link these learnings to existing documentation and related learnings..."
+ }
+ ]'
+```
+
+### 3. Learning Document Schema
+
+Each learning is stored as a **contract file** with structured content and metadata:
+
+```yaml
+# Learning document metadata (stored in file description/metadata)
+learning:
+  category: "build-errors"              # One of 9 categories
+  severity: "medium"                    # low, medium, high, critical
+  tags: ["webpack", "esm", "cjs"]       # Free-form tags
+  source_contract_id: "abc-123"         # Contract that produced this learning
+  source_contract_name: "Fix webpack bundling"
+  repository: "github.com/org/repo"
+  date: "2026-02-09"
+  quality_score: 0.85                   # 0-1, set by quality gate
+  access_count: 0                       # Incremented on retrieval
+  last_accessed: null
+  relevance_decay: 0.95                 # Per-month decay factor
+```
+
+**Document body structure:**
+
+````markdown
+# Mixed ESM/CJS Import Resolution
+
+## Problem
+When upgrading to webpack 5, mixed ESM and CommonJS imports caused
+"Cannot use import statement outside a module" errors in production
+but not development.
+
+## Root Cause
+The `type: "module"` field in package.json applied ESM resolution
+globally, but several dependencies only provided CJS exports.
+
+## Solution
+1. Added `resolve.fullySpecified: false` to webpack config
+2. Used `@babel/plugin-transform-modules-commonjs` for CJS deps
+3. Created explicit `.cjs` extensions for config files
+
+## Code Pattern
+```javascript
+// webpack.config.cjs (note: .cjs extension)
+module.exports = {
+  resolve: {
+    fullySpecified: false,
+    extensions: ['.js', '.mjs', '.cjs', '.json']
+  }
+};
+```
+
+## Prevention
+- Add webpack build check to CI before merging
+- Document module system choice in project README
+- Use `resolve.fullySpecified: false` by default in webpack 5 projects
+
+## Related
+- docs/solutions/tooling-configurations/webpack-5-migration.md
+- Contract: "Initial Webpack 5 Migration" (2026-01-15)
+````
+
+### 4. Storage Architecture
+
+Learnings are stored in two complementary locations:
+
+#### A. Contract Files (Structured, Persistent)
+
+```rust
+// Each learning becomes a contract file
+File {
+ contract_id: Some(source_contract.id),
+ contract_phase: Some("compound"),
+ name: "Learning: Mixed ESM/CJS Import Resolution",
+ description: Some("category=build-errors; tags=webpack,esm,cjs; severity=medium"),
+ body: vec![
+ BodyElement::Heading { level: 1, text: "Mixed ESM/CJS Import Resolution" },
+ BodyElement::Heading { level: 2, text: "Problem" },
+ BodyElement::Paragraph { text: "..." },
+ // ... structured content
+ ],
+ repo_file_path: Some("docs/solutions/build-errors/mixed-esm-cjs-resolution.md"),
+ repo_sync_status: Some("synced"),
+}
+```
+
+#### B. Repository Files (Searchable, Portable)
+
+```
+docs/solutions/
+├── build-errors/
+│ ├── mixed-esm-cjs-resolution.md
+│ └── docker-multi-stage-cache.md
+├── test-failures/
+│ ├── async-test-timeout-patterns.md
+│ └── mock-service-worker-setup.md
+├── api-patterns/
+│ └── pagination-cursor-vs-offset.md
+├── architecture-decisions/
+│ └── event-sourcing-tradeoffs.md
+├── performance-optimizations/
+│ └── database-connection-pooling.md
+├── security-practices/
+│ └── jwt-refresh-token-rotation.md
+├── debugging-techniques/
+│ └── distributed-tracing-setup.md
+├── tooling-configurations/
+│ └── github-actions-cache-strategy.md
+└── domain-knowledge/
+ └── payment-processing-idempotency.md
+```
+
+### 5. Auto-Surface Relevant Learnings
+
+When a new contract is created, automatically search for relevant learnings:
+
+```bash
+# Supervisor plan template automatically includes:
+# "Search existing learnings relevant to this task"
+
+makima supervisor search-learnings "webpack bundling errors"
+makima supervisor search-learnings --category "build-errors" --tags "webpack"
+makima supervisor search-learnings --repo "github.com/org/repo"
+```
+
+**Search algorithm:**
+
+```
+Relevance Score =
+ keyword_match_score * 0.4
+ + category_match_score * 0.2
+ + tag_overlap_score * 0.2
+ + recency_score * 0.1 # Decays over time
+ + quality_score * 0.1 # Higher quality = more relevant
+```
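+
+As a rough illustration of the weighting above, the score could be computed as follows. This is a sketch in Rust, not makima's actual search code: the `ScoreComponents` struct and the `relevance_score` function are hypothetical names, and each component score is assumed to be normalized to the range 0 to 1 before weighting.
+
+```rust
+/// Component scores for one learning (hypothetical type).
+/// Each field is assumed to be normalized to [0.0, 1.0].
+#[derive(Debug, Clone, Copy)]
+pub struct ScoreComponents {
+    pub keyword_match: f64,
+    pub category_match: f64,
+    pub tag_overlap: f64,
+    pub recency: f64,
+    pub quality: f64,
+}
+
+/// Weighted sum using the weights from the formula above
+/// (0.4, 0.2, 0.2, 0.1, 0.1). Out-of-range inputs are clamped.
+pub fn relevance_score(c: ScoreComponents) -> f64 {
+    let clamp = |x: f64| x.clamp(0.0, 1.0);
+    0.4 * clamp(c.keyword_match)
+        + 0.2 * clamp(c.category_match)
+        + 0.2 * clamp(c.tag_overlap)
+        + 0.1 * clamp(c.recency)
+        + 0.1 * clamp(c.quality)
+}
+```
+
+Because the weights sum to 1.0, the result also stays in the 0 to 1 range, which makes it directly comparable to a `min_relevance` threshold.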
+
+**Integration with plan phase:**
+
+```
+┌──────────────┐       ┌───────────────────┐
+│ New Contract │──────▶│ Plan Phase        │
+│ Created      │       │                   │
+└──────────────┘       │ 1. Create plan    │
+                       │ 2. Search for     │◀── Learnings DB
+                       │    relevant       │
+                       │    learnings      │
+                       │ 3. Inject context │
+                       │    into plan      │
+                       └───────────────────┘
+```
+
+### 6. Quality Control
+
+#### Relevance Decay
+
+Learnings lose relevance over time unless accessed:
+
+```
+effective_relevance = quality_score * (decay_factor ^ months_since_creation)
+ + access_bonus * recent_access_count
+```
+
+- Default decay factor: 0.95/month (a learning retains about 54% of its quality score after 1 year, since 0.95^12 ≈ 0.54)
+- Access bonus: +0.05 per access (caps at +0.25)
+- Learnings below 0.3 effective relevance are archived
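+
+The decay rule above can be sketched in Rust as follows. `DecayConfig`, `effective_relevance`, and `should_archive` are hypothetical names chosen for illustration; the default values are the ones listed in the bullets above, and the access bonus is assumed to be capped before it is added to the decayed score.
+
+```rust
+/// Decay settings (hypothetical type); defaults mirror the bullets above.
+pub struct DecayConfig {
+    pub factor: f64,            // per-month decay factor
+    pub access_bonus: f64,      // bonus per recent access
+    pub max_access_bonus: f64,  // cap on the total access bonus
+    pub archive_threshold: f64, // archive below this value
+}
+
+impl Default for DecayConfig {
+    fn default() -> Self {
+        Self {
+            factor: 0.95,
+            access_bonus: 0.05,
+            max_access_bonus: 0.25,
+            archive_threshold: 0.3,
+        }
+    }
+}
+
+/// quality_score * factor^months + min(access_bonus * accesses, cap)
+pub fn effective_relevance(
+    cfg: &DecayConfig,
+    quality_score: f64,
+    months_since_creation: u32,
+    recent_access_count: u32,
+) -> f64 {
+    let decayed = quality_score * cfg.factor.powi(months_since_creation as i32);
+    let bonus = (cfg.access_bonus * f64::from(recent_access_count)).min(cfg.max_access_bonus);
+    decayed + bonus
+}
+
+/// A learning is archived once it drops below the threshold.
+pub fn should_archive(cfg: &DecayConfig, effective: f64) -> bool {
+    effective < cfg.archive_threshold
+}
+```
+
+Capping the bonus keeps a frequently accessed but low-quality learning from staying above the archive threshold indefinitely on access count alone.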
+
+#### Deduplication
+
+When a new learning is created, check for existing similar learnings:
+
+```
+similarity = cosine_similarity(new_learning_embedding, existing_learning_embedding)
+if similarity > 0.85:
+ merge_or_update(existing_learning, new_learning)
+elif similarity > 0.70:
+ link_as_related(new_learning, existing_learning)
+```
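+
+A minimal Rust sketch of the threshold logic above, assuming embeddings are plain `f64` vectors of equal length. The `DedupAction` enum and both function names are illustrative, and how the embeddings are produced is left open.
+
+```rust
+/// What to do with a new learning relative to an existing one (hypothetical type).
+#[derive(Debug, PartialEq)]
+pub enum DedupAction {
+    Merge,
+    LinkAsRelated,
+    StoreNew,
+}
+
+/// Cosine similarity of two vectors.
+/// Returns 0.0 for empty, mismatched, or zero-length vectors.
+pub fn cosine_similarity(a: &[f64], b: &[f64]) -> f64 {
+    if a.len() != b.len() || a.is_empty() {
+        return 0.0;
+    }
+    let dot: f64 = a.iter().zip(b).map(|(x, y)| x * y).sum();
+    let norm_a = a.iter().map(|x| x * x).sum::<f64>().sqrt();
+    let norm_b = b.iter().map(|x| x * x).sum::<f64>().sqrt();
+    if norm_a == 0.0 || norm_b == 0.0 {
+        return 0.0;
+    }
+    dot / (norm_a * norm_b)
+}
+
+/// Applies the 0.85 and 0.70 thresholds from the pseudocode above.
+pub fn dedup_action(similarity: f64) -> DedupAction {
+    if similarity > 0.85 {
+        DedupAction::Merge
+    } else if similarity > 0.70 {
+        DedupAction::LinkAsRelated
+    } else {
+        DedupAction::StoreNew
+    }
+}
+```
+
+Both comparisons are strict, so a similarity of exactly 0.85 links the learnings as related rather than merging them.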
+
+#### Quality Gate
+
+Before storing a learning, validate:
+
+| Check | Threshold | Action if Failed |
+|-------|-----------|------------------|
+| Has problem statement | Required | Reject |
+| Has solution | Required | Reject |
+| Has prevention strategy | Recommended | Warn, store with quality penalty |
+| Code examples present | Recommended | Warn, store with quality penalty |
+| Category valid | Required | Auto-classify |
+| Not duplicate | >0.85 similarity | Merge with existing |
+| Minimum length | >200 characters | Reject |
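+
+The required and recommended checks in the table could be expressed roughly like this. The sketch assumes the learning body is Markdown with `## Problem`, `## Solution`, and `## Prevention` headings, as in the document body example earlier. `LearningDraft`, `GateOutcome`, and `quality_gate` are hypothetical names, and the category and duplicate checks are omitted for brevity.
+
+```rust
+/// Result of the quality gate (hypothetical type).
+#[derive(Debug, PartialEq)]
+pub enum GateOutcome {
+    Reject(&'static str),
+    Accept { warnings: Vec<&'static str> },
+}
+
+/// A learning that has not been stored yet (hypothetical type).
+pub struct LearningDraft<'a> {
+    pub body: &'a str,
+    pub has_code_example: bool,
+}
+
+/// True if the body contains a level-2 heading with the given title.
+fn has_section(body: &str, title: &str) -> bool {
+    let heading = format!("## {}", title);
+    body.lines().any(|line| line.trim() == heading.as_str())
+}
+
+/// Required checks reject; recommended checks only add warnings.
+pub fn quality_gate(draft: &LearningDraft) -> GateOutcome {
+    if !has_section(draft.body, "Problem") {
+        return GateOutcome::Reject("missing problem statement");
+    }
+    if !has_section(draft.body, "Solution") {
+        return GateOutcome::Reject("missing solution");
+    }
+    // The table requires more than 200 characters.
+    if draft.body.chars().count() <= 200 {
+        return GateOutcome::Reject("body too short");
+    }
+    let mut warnings = Vec::new();
+    if !has_section(draft.body, "Prevention") {
+        warnings.push("no prevention strategy");
+    }
+    if !draft.has_code_example {
+        warnings.push("no code example");
+    }
+    GateOutcome::Accept { warnings }
+}
+```
+
+Returning warnings instead of a bare boolean leaves room for the caller to apply the quality penalty mentioned in the table.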
+
+---
+
+## Integration with Existing Makima Features
+
+### Contract Phases
+
+The compound phase integrates into the existing phase system:
+
+```rust
+// New phase variant
+enum ContractPhase {
+ Research,
+ Specify,
+ Plan,
+ Execute,
+ Review,
+ Compound, // NEW
+}
+```
+
+- Contracts with `contract_type: "specification"` get the full 6-phase cycle
+- Contracts with `contract_type: "simple"` can opt-in via config
+- Phase guard still applies: user must approve transition to compound
+
+### Contract Files
+
+Learnings are first-class contract files, leveraging existing:
+- Versioning system
+- Structured body format (`BodyElement` types)
+- Repository file sync (`repo_file_path`, `repo_sync_status`)
+- Phase association (`contract_phase: "compound"`)
+
+### Directive System
+
+For directive-based workflows, learnings can be captured per-step:
+
+```rust
+DirectiveStep {
+ name: "compound-step-3",
+ description: "Capture learnings from database migration step",
+ depends_on: [step_3_id, review_step_id],
+ task_plan: "Extract and document learnings from the completed migration...",
+}
+```
+
+### Supervisor CLI
+
+New commands integrate with existing CLI infrastructure:
+
+```bash
+# In supervisor context
+makima supervisor compound # Run compound phase
+makima supervisor search-learnings "query" # Search knowledge base
+makima supervisor list-learnings # List all learnings
+makima supervisor learning-stats # Knowledge base statistics
+```
+
+---
+
+## Implementation Plan
+
+### Phase 1: Core Infrastructure (4-5 days)
+
+| Task | Effort | Description |
+|------|--------|-------------|
+| Add `compound` phase to contract lifecycle | 1 day | New phase enum, transition rules |
+| Learning document schema | 1 day | Metadata structure, validation |
+| `supervisor compound` command | 1-2 days | Spawn learning sub-agents |
+| Repository file sync for learnings | 1 day | Write to `docs/solutions/` |
+
+### Phase 2: Search & Retrieval (3-5 days)
+
+| Task | Effort | Description |
+|------|--------|-------------|
+| `search-learnings` command | 1-2 days | Keyword + category search |
+| Auto-surface in plan phase | 1-2 days | Inject relevant learnings into plans |
+| Learning index | 1 day | Category/tag index for fast lookup |
+
+### Phase 3: Quality & Maintenance (3-5 days)
+
+| Task | Effort | Description |
+|------|--------|-------------|
+| Quality gate validation | 1 day | Pre-storage checks |
+| Relevance decay system | 1 day | Scheduled decay + access tracking |
+| Deduplication check | 1-2 days | Similarity detection and merging |
+| Documentation & defaults | 1 day | User guide, default categories |
+
+---
+
+## Configuration Examples
+
+### Enable Compound Phase (Contract-Level)
+
+```yaml
+# Contract configuration
+compound:
+  enabled: true
+  auto_trigger: true              # Auto-run after review completes
+  categories:                     # Override default categories
+    - build-errors
+    - test-failures
+    - api-patterns
+    - architecture-decisions
+    - performance-optimizations
+    - security-practices
+    - debugging-techniques
+    - tooling-configurations
+    - domain-knowledge
+  quality_gate:
+    min_length: 200
+    require_problem: true
+    require_solution: true
+    require_prevention: false
+  storage:
+    contract_files: true          # Store as contract files
+    repo_files: true              # Also write to docs/solutions/
+    repo_path: "docs/solutions"
+```
+
+### Repository-Level Configuration (`.makima/compound.yaml`)
+
+```yaml
+# .makima/compound.yaml
+version: 1
+compound:
+  # Default settings for all contracts in this repo
+  auto_trigger: true
+
+  # Custom categories for this project
+  categories:
+    - build-errors
+    - test-failures
+    - api-patterns
+    - payment-processing          # Custom domain category
+    - compliance-requirements     # Custom domain category
+
+  # Search settings
+  search:
+    max_results: 10
+    min_relevance: 0.3
+    include_archived: false
+
+  # Decay settings
+  decay:
+    factor: 0.95                  # Per month
+    archive_threshold: 0.3
+    access_bonus: 0.05
+    max_access_bonus: 0.25
+```
+
+### Searching Learnings
+
+```bash
+# Full-text search
+makima supervisor search-learnings "webpack ESM import error"
+
+# Category filter
+makima supervisor search-learnings --category build-errors
+
+# Tag filter
+makima supervisor search-learnings --tags webpack,esm
+
+# Repository filter
+makima supervisor search-learnings --repo github.com/org/repo
+
+# Combined
+makima supervisor search-learnings "import error" \
+ --category build-errors \
+ --tags webpack \
+ --min-relevance 0.5 \
+ --limit 5
+```
+
+---
+
+## Open Questions
+
+1. **Cross-repository knowledge**: Should learnings be scoped to a single repository or shared across all repositories for an owner?
+2. **Learning ownership**: Who owns a learning — the contract creator, the repository, or the organization?
+3. **Privacy**: Are learnings visible to all users, or scoped by access control?
+4. **Embedding model**: For similarity-based deduplication and search, which embedding model should be used? Trade-off between quality and cost.
+5. **Storage limits**: Should there be a cap on the number of learnings per repository/owner?
+6. **Manual curation**: Should users be able to manually create, edit, or delete learnings outside the compound phase?
+7. **Export/import**: Should learnings be exportable/importable across makima instances?
+
+---
+
+## Alternatives Considered
+
+| Alternative | Pros | Cons | Decision |
+|-------------|------|------|----------|
+| Store learnings only in contract files | Simple, uses existing infrastructure | Not easily searchable across contracts | Rejected — search is critical |
+| Store learnings only in repo files | Portable, version-controlled, greppable | Lost if repo deleted; no cross-repo search | Partial — use as secondary storage |
+| Use external knowledge base (e.g., vector DB) | Best search quality | Added infrastructure dependency | Deferred — consider for v2 |
+| Manual-only knowledge capture | No noise | Knowledge rarely captured | Rejected — must be automatic |
+| Full contract history indexing | Most complete | Massive storage, noise, privacy concerns | Rejected — signal-to-noise ratio too low |
+
+---
+
+## Priority & Complexity Assessment
+
+- **Priority: HIGH** — This is the defining feature of compound engineering. Without knowledge accumulation, every contract starts from scratch. This is the feature that creates compounding returns.
+- **Complexity: MEDIUM** — Core capture and storage is straightforward using existing contract files and repo sync. Search quality and relevance decay require iterative refinement.
+- **Risk: MEDIUM** — Primary risk is low adoption (users skip the compound phase), mitigated by auto-trigger. Secondary risk is knowledge-base noise, mitigated by quality gates.
diff --git a/docs/proposals/feature-multi-agent-review.md b/docs/proposals/feature-multi-agent-review.md
new file mode 100644
index 0000000..d678756
--- /dev/null
+++ b/docs/proposals/feature-multi-agent-review.md
@@ -0,0 +1,448 @@
+# Feature Proposal: Multi-Agent Parallel Review System
+
+> **Priority:** High
+> **Complexity:** Medium
+> **Estimated Effort:** 12-18 days
+> **Status:** Proposal
+> **Date:** 2026-02-09
+> **Dependencies:** [Findings Tracking](feature-findings-tracking.md) (recommended)
+> **Related:** [Overview Analysis](compound-engineering-analysis.md) · [Workflow Presets](feature-workflow-presets.md)
+
+---
+
+## Problem Statement
+
+Makima's contract lifecycle includes a **Review** phase, but it currently has:
+
+- **No automated review mechanism** — the review phase relies entirely on manual user inspection or a single supervisor task
+- **Single-perspective review** — even when a review task is spawned, it examines code from one viewpoint
+- **No structured review output** — findings are captured as unstructured text in task output
+- **No review templates** — each review must be configured from scratch
+- **No synthesis** — when multiple reviewers exist, there's no mechanism to deduplicate and prioritize findings
+
+For complex contracts touching security, performance, and architecture, a single-pass review consistently misses category-specific issues that specialized reviewers would catch.
+
+---
+
+## How Compound Engineering Solves This
+
+The compound engineering plugin spawns **12-15 specialized review agents in parallel**, each examining the code from a unique perspective:
+
+| Agent | Focus Area | Example Findings |
+|-------|-----------|-----------------|
+| Security Sentinel | Auth, injection, secrets, CSRF | SQL injection in user input handler |
+| Performance Oracle | N+1 queries, memory leaks, caching | Unbounded list growth in event handler |
+| Architecture Strategist | Coupling, SOLID, layering | Service directly accessing repository internals |
+| Code Philosopher | Readability, naming, complexity | Cyclomatic complexity > 15 in payment flow |
+| Data Integrity Guardian | Validation, constraints, migrations | Missing NOT NULL constraint on required field |
+| Error Resilience Analyzer | Error handling, retries, fallbacks | Unhandled timeout in external API call |
+| API Contract Validator | Breaking changes, versioning | Removed required field from response |
+| Dependency Health Checker | Vulnerabilities, licensing, freshness | CVE-2025-XXXX in transitive dependency |
+| Test Coverage Analyzer | Coverage gaps, edge cases, mocking | No tests for error path in checkout flow |
+| Documentation Completeness | Docs accuracy, examples, changelog | Public API endpoint undocumented |
+| Concurrency Safety | Race conditions, deadlocks, atomicity | Non-atomic read-modify-write on shared counter |
+
+After all agents complete, a **synthesis agent** deduplicates findings, resolves contradictions, and produces a prioritized report.
+
+```
+┌───────────────────────────────────────────────────────┐
+│ Review Orchestrator │
+│ │
+│ spawn-group "review" │
+│ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ │
+│ │Security │ │ Perf │ │ Arch │ │ Code │ │
+│ │Sentinel │ │ Oracle │ │Strategy │ │ Phil │ │
+│ └────┬────┘ └────┬────┘ └────┬────┘ └────┬────┘ │
+│ │ │ │ │ │
+│ ┌────┴────┐ ┌────┴────┐ ┌────┴────┐ ┌────┴────┐ │
+│ │ Data │ │ Error │ │ API │ │ Deps │ │
+│ │Guardian │ │Resilien.│ │Contract │ │ Health │ │
+│ └────┬────┘ └────┬────┘ └────┬────┘ └────┬────┘ │
+│ │ │ │ │ │
+│ ┌────┴────┐ ┌────┴────┐ ┌────┴────┐ │
+│ │ Test │ │ Docs │ │Concurr. │ │
+│ │Coverage │ │Complete │ │ Safety │ │
+│ └────┬────┘ └────┬────┘ └────┬────┘ │
+│ │ │ │ │
+│ wait-group "review" │
+│ ▼ ▼ ▼ │
+│ ┌──────────────────────────────────────────┐ │
+│ │ Synthesis Agent │ │
+│ │ - Deduplicate findings │ │
+│ │ - Resolve contradictions │ │
+│ │ - Prioritize by severity │ │
+│ │ - Generate summary report │ │
+│ └──────────────────────────────────────────┘ │
+│ │ │
+│ ▼ │
+│ Structured Findings │
+│ (P1 / P2 / P3) │
+└───────────────────────────────────────────────────────┘
+```
+
+---
+
+## Proposed Makima Implementation
+
+### 1. New Supervisor Commands
+
+#### `makima supervisor spawn-group`
+
+Spawns multiple tasks as a named group and returns immediately:
+
+```bash
+# Spawn a review group with 3 agents
+makima supervisor spawn-group "review" \
+  --tasks '[
+    {"name": "security-review", "plan": "Review for security vulnerabilities..."},
+    {"name": "performance-review", "plan": "Review for performance issues..."},
+    {"name": "architecture-review", "plan": "Review for architecture concerns..."}
+  ]' \
+  --share-worktree \
+  --read-only
+```
+
+**Key parameters:**
+- `--tasks` — JSON array of task definitions
+- `--share-worktree` — All tasks in the group share the supervisor's worktree (read-only access)
+- `--read-only` — Tasks cannot modify files, only produce output
+- `--max-concurrent N` — Limit parallel execution (default: unlimited)
+
+#### `makima supervisor wait-group`
+
+Waits for all tasks in a named group to complete:
+
+```bash
+# Wait for all review tasks, timeout after 10 minutes
+makima supervisor wait-group "review" --timeout 600
+
+# Returns JSON with all task results
+```
+
+**Output format:**
+```json
+{
+ "group": "review",
+ "status": "completed",
+ "tasks": [
+ {"name": "security-review", "status": "done", "output": "..."},
+ {"name": "performance-review", "status": "done", "output": "..."}
+ ],
+ "duration_seconds": 127
+}
+```
+
+#### `makima supervisor review`
+
+High-level command that orchestrates the full review pipeline:
+
+```bash
+# Run review with default agent config
+makima supervisor review
+
+# Run review with custom config
+makima supervisor review --config .makima/review-agents.yaml
+
+# Run only specific review categories
+makima supervisor review --only security,performance,architecture
+```
+
+### 2. Review Agent Configuration
+
+#### Repository-Level Configuration (`.makima/review-agents.yaml`)
+
+```yaml
+# .makima/review-agents.yaml
+version: 1
+review:
+  # Maximum number of concurrent review agents
+  max_concurrent: 8
+
+  # Timeout per agent (seconds)
+  agent_timeout: 300
+
+  # Auto-trigger review when phase transitions to 'review'
+  auto_trigger: true
+
+  # Finding severity that blocks merge
+  merge_blocking_severity: P1
+
+  agents:
+    - name: security-sentinel
+      enabled: true
+      plan: |
+        You are a Security Sentinel reviewing code changes.
+
+        Focus areas:
+        - Authentication and authorization flaws
+        - Injection vulnerabilities (SQL, XSS, command injection)
+        - Secret/credential exposure
+        - CSRF and session management
+        - Input validation gaps
+
+        Output format: One finding per section with severity (P1/P2/P3),
+        affected file/line, description, and suggested fix.
+      priority: critical  # Always runs
+
+    - name: performance-oracle
+      enabled: true
+      plan: |
+        You are a Performance Oracle reviewing code changes.
+
+        Focus areas:
+        - N+1 query patterns
+        - Memory leaks and unbounded growth
+        - Missing caching opportunities
+        - Algorithmic complexity issues
+        - Database index utilization
+
+        Output format: One finding per section with severity (P1/P2/P3),
+        affected file/line, description, and suggested fix.
+      priority: standard
+
+    - name: architecture-strategist
+      enabled: true
+      plan: |
+        You are an Architecture Strategist reviewing code changes.
+
+        Focus areas:
+        - SOLID principle violations
+        - Inappropriate coupling between modules
+        - Layering violations (e.g., handler accessing DB directly)
+        - Missing abstraction boundaries
+        - Inconsistency with existing patterns
+
+        Output format: One finding per section with severity (P1/P2/P3),
+        affected file/line, description, and suggested fix.
+      priority: standard
+
+    - name: test-coverage-analyzer
+      enabled: true
+      plan: |
+        You are a Test Coverage Analyzer reviewing code changes.
+
+        Focus areas:
+        - Missing test coverage for new code paths
+        - Untested error/edge cases
+        - Test quality (meaningful assertions vs superficial)
+        - Integration test gaps
+        - Mock appropriateness
+
+        Output format: One finding per section with severity (P1/P2/P3),
+        affected file/line, description, and suggested fix.
+      priority: standard
+
+    # Users can add custom agents here
+    - name: custom-domain-reviewer
+      enabled: false
+      plan: "Review for domain-specific business logic concerns..."
+      priority: optional
+```
+
+#### Contract-Level Override
+
+```yaml
+# In contract configuration or via CLI
+review:
+  agents:
+    # Disable agents not relevant to this contract
+    - name: concurrency-safety
+      enabled: false
+    # Add contract-specific reviewer
+    - name: migration-safety
+      enabled: true
+      plan: "Review database migrations for data loss risks..."
+```
+
+### 3. Synthesis Step
+
+After all review agents complete, a synthesis task:
+
+1. **Collects** all findings from group task outputs
+2. **Deduplicates** findings about the same issue from different perspectives
+3. **Resolves contradictions** (e.g., one agent says "add caching" while another says "caching adds complexity")
+4. **Prioritizes** by severity and cross-agent agreement
+5. **Produces** a structured review report as a contract file
+
+```bash
+# Synthesis is automatically run after wait-group completes
+makima supervisor synthesize-review "review" \
+ --output-format findings \
+ --create-contract-file
+```
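+
+Steps 1, 2, and 4 above can be sketched in Rust as follows. This is a simplification under stated assumptions: two findings count as duplicates only when they point at the same file and line (a real synthesis step would likely use a similarity measure), and the `Finding`, `MergedFinding`, and `synthesize` names are hypothetical rather than existing makima types.
+
+```rust
+use std::collections::BTreeMap;
+
+/// Derived ordering: P1 < P2 < P3, so the minimum is the most severe.
+#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
+pub enum Severity {
+    P1,
+    P2,
+    P3,
+}
+
+/// One finding as reported by a single review agent (hypothetical type).
+#[derive(Debug, Clone)]
+pub struct Finding {
+    pub agent: String,
+    pub file: String,
+    pub line: u32,
+    pub severity: Severity,
+    pub description: String,
+}
+
+/// A finding after deduplication across agents (hypothetical type).
+#[derive(Debug)]
+pub struct MergedFinding {
+    pub file: String,
+    pub line: u32,
+    pub severity: Severity,  // most severe rating across agents
+    pub agents: Vec<String>, // agents that reported this location
+    pub description: String, // description from the first reporter
+}
+
+/// Groups findings by (file, line), then sorts by severity and,
+/// within one severity, by how many agents agreed.
+pub fn synthesize(findings: Vec<Finding>) -> Vec<MergedFinding> {
+    let mut by_location: BTreeMap<(String, u32), MergedFinding> = BTreeMap::new();
+    for f in findings {
+        let entry = by_location
+            .entry((f.file.clone(), f.line))
+            .or_insert_with(|| MergedFinding {
+                file: f.file.clone(),
+                line: f.line,
+                severity: f.severity,
+                agents: Vec::new(),
+                description: f.description.clone(),
+            });
+        entry.severity = entry.severity.min(f.severity);
+        if !entry.agents.contains(&f.agent) {
+            entry.agents.push(f.agent);
+        }
+    }
+    let mut merged: Vec<MergedFinding> = by_location.into_values().collect();
+    merged.sort_by(|a, b| {
+        a.severity
+            .cmp(&b.severity)
+            .then(b.agents.len().cmp(&a.agents.len()))
+    });
+    merged
+}
+```
+
+Contradiction resolution (step 3) is deliberately left out, since it needs semantic judgment that a purely structural merge cannot provide.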
+
+### 4. Auto-Review Trigger
+
+When a contract's phase transitions to `review`:
+
+```rust
+// In phase transition handler
+if new_phase == "review" && contract.review_config.auto_trigger {
+    // Spawn review group automatically, using the contract's own review config
+    spawn_review_group(&contract, &contract.review_config).await?;
+}
+```
+
+---
+
+## Integration with Existing Makima Features
+
+### Supervisor/Worker Hierarchy
+
+Review agents are spawned as **worker tasks** under the supervisor, using existing `spawn-task` infrastructure. The new `spawn-group`/`wait-group` commands are syntactic sugar over batch `spawn-task` + `wait` calls.
+
+### Git Worktree Isolation
+
+Review agents share the supervisor's worktree in **read-only mode** (a new capability). This avoids creating N separate worktrees for review-only tasks. Implementation:
+- Reuse the `supervisor_worktree_task_id` parameter (already exists in SpawnTask)
+- New `read_only: true` flag to prevent file modifications
+- Workers see the same code state that triggered the review
+
+### Contract Files
+
+The synthesized review report is stored as a **contract file** attached to the review phase:
+```rust
+File {
+    contract_id: Some(contract.id),
+    contract_phase: Some("review"),
+    name: "Review Report — 2026-02-09",
+    body: vec![
+        BodyElement::Heading { level: 1, text: "Review Summary" },
+        BodyElement::Paragraph { text: "3 P1 findings, 7 P2 findings, 12 P3 findings" },
+        // ... structured findings
+    ],
+}
+```
+
+### Phase Guards
+
+If `phase_guard` is enabled and P1 findings exist, the phase transition out of Review (to Compound, or to completion) is blocked until P1s are resolved. This integrates with the existing `advance-phase` confirmation flow.
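+
+A sketch of the blocking check, assuming each finding carries a severity and an open/resolved status. The types and the `may_advance` function are hypothetical; `merge_blocking_severity` mirrors the configuration key shown earlier.
+
+```rust
+/// Derived ordering: P1 < P2 < P3, so "at least as severe as X" is `<= X`.
+#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
+pub enum Severity {
+    P1,
+    P2,
+    P3,
+}
+
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+pub enum FindingStatus {
+    Open,
+    Resolved,
+}
+
+/// Minimal finding record for the guard (hypothetical type).
+pub struct Finding {
+    pub severity: Severity,
+    pub status: FindingStatus,
+}
+
+/// Returns true when no open finding is at or above the blocking severity.
+pub fn may_advance(findings: &[Finding], merge_blocking_severity: Severity) -> bool {
+    !findings
+        .iter()
+        .any(|f| f.status == FindingStatus::Open && f.severity <= merge_blocking_severity)
+}
+```
+
+With the blocking severity set to `P1`, open P2 and P3 findings do not block the transition; raising it to `P2` would block on both P1 and P2.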
+
+### Completion Gates
+
+Each review agent uses the existing `<COMPLETION_GATE>` mechanism to signal when its review is complete:
+```xml
+<COMPLETION_GATE>
+ready: true
+reason: "Security review complete. Found 2 P1 and 3 P2 findings."
+progress: "Reviewed 47 files across 12 modules."
+</COMPLETION_GATE>
+```
+
+### Circuit Breaker
+
+The existing CircuitBreaker protects against review agents getting stuck. If a review agent loops without progress for 3 iterations, it's terminated and its partial findings are included in synthesis.
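+
+The no-progress rule can be illustrated with a small counter. This is not the existing CircuitBreaker implementation, which is not shown in this proposal. `NoProgressBreaker` is a hypothetical stand-in that assumes progress is reported as a string (for example the `progress` field of the completion gate) and that an unchanged string means no progress.
+
+```rust
+/// Trips after `max_stalled` consecutive iterations without a change
+/// in the reported progress string (hypothetical type).
+pub struct NoProgressBreaker {
+    max_stalled: u32,
+    stalled: u32,
+    last_progress: Option<String>,
+}
+
+impl NoProgressBreaker {
+    pub fn new(max_stalled: u32) -> Self {
+        Self {
+            max_stalled,
+            stalled: 0,
+            last_progress: None,
+        }
+    }
+
+    /// Records one iteration. Returns true when the agent should be terminated.
+    pub fn record(&mut self, progress: &str) -> bool {
+        if self.last_progress.as_deref() == Some(progress) {
+            // Same report as last time: count a stalled iteration.
+            self.stalled += 1;
+        } else {
+            // Progress changed: reset the counter and remember the new report.
+            self.stalled = 0;
+            self.last_progress = Some(progress.to_string());
+        }
+        self.stalled >= self.max_stalled
+    }
+}
+```
+
+With `NoProgressBreaker::new(3)`, the first report sets the baseline and the third consecutive repeat of it trips the breaker; any changed report resets the count.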
+
+---
+
+## Implementation Plan
+
+### Phase 1: Group Task Infrastructure (5-7 days)
+
+| Task | Effort | Description |
+|------|--------|-------------|
+| `spawn-group` command | 2 days | Batch task spawning with named groups |
+| `wait-group` command | 1 day | Wait for all tasks in group |
+| Group tracking in DB | 1 day | Task group table, membership, status |
+| Shared worktree (read-only) | 1-2 days | Workers share supervisor worktree |
+| Tests | 1 day | Unit + integration tests |
+
+### Phase 2: Review Agent System (4-6 days)
+
+| Task | Effort | Description |
+|------|--------|-------------|
+| Review config YAML parser | 1 day | Parse `.makima/review-agents.yaml` |
+| `supervisor review` command | 2 days | Orchestrate review pipeline |
+| Synthesis agent logic | 1-2 days | Deduplicate, prioritize, format |
+| Review report as contract file | 1 day | Store structured output |
+
+### Phase 3: Automation & Polish (3-5 days)
+
+| Task | Effort | Description |
+|------|--------|-------------|
+| Auto-trigger on phase transition | 1 day | Hook into `advance-phase` |
+| P1 merge blocking | 1 day | Phase guard integration |
+| Default review agent templates | 1-2 days | Ship 8-10 built-in agents |
+| Documentation | 1 day | User guide and config reference |
+
+---
+
+## Configuration Examples
+
+### Minimal Setup (Zero Config)
+
+```bash
+# Uses built-in review agents with default settings
+makima supervisor review
+```
+
+### Custom Review for a Specific Contract
+
+```bash
+# Override for this contract only
+makima supervisor review \
+ --only security,performance \
+ --merge-blocking P1 \
+ --timeout 300
+```
+
+### Full Custom Configuration
+
+```yaml
+# .makima/review-agents.yaml
+version: 1
+review:
+  max_concurrent: 6
+  agent_timeout: 300
+  auto_trigger: true
+  merge_blocking_severity: P1
+
+  synthesis:
+    dedup_threshold: 0.8          # Similarity score for deduplication
+    min_agreement: 2              # Findings flagged by 2+ agents get priority boost
+    output_format: "findings"     # "findings" | "report" | "both"
+    create_contract_file: true
+
+  agents:
+    - name: security-sentinel
+      enabled: true
+      priority: critical
+      plan: |
+        ...
+    - name: performance-oracle
+      enabled: true
+      priority: standard
+      plan: |
+        ...
+    # ... more agents
+```
+
+---
+
+## Open Questions
+
+1. **Shared worktree read-only enforcement**: Should this be enforced at the filesystem level (mount read-only) or via convention (instructions to the agent)?
+2. **Review scope**: Should review agents see all files or only changed files (git diff)?
+3. **Incremental review**: When new commits are added during review, should agents re-review or only review the delta?
+4. **Agent output parsing**: Should agents output structured YAML findings, or should the synthesis step parse natural language?
+5. **Cost control**: With 10+ parallel agents, how do we manage API costs? Should there be a budget ceiling per review?
+6. **Finding deduplication**: What similarity threshold should trigger deduplication? How to handle partial overlaps?
+
+---
+
+## Alternatives Considered
+
+| Alternative | Pros | Cons | Decision |
+|-------------|------|------|----------|
+| Single comprehensive review agent | Simple, no coordination overhead | Misses perspective-specific issues | Rejected — diminishes review quality |
+| Sequential reviews (one after another) | Simpler orchestration | 5-10x slower; later reviews can't benefit from earlier ones | Rejected — latency unacceptable |
+| External review tools integration | Leverage existing static analysis | Limited to tool capabilities; no semantic review | Complement — can integrate alongside agent review |
+| User-configured number of agents | Maximum flexibility | Analysis paralysis for new users | Adopted — sensible defaults + customization |
+
+---
+
+## Priority & Complexity Assessment
+
+- **Priority: HIGH** — Multi-agent review is the highest-impact feature from the compound engineering plugin. It directly improves code quality with no change to developer workflow.
+- **Complexity: MEDIUM** — The core `spawn-group`/`wait-group` pattern is straightforward. The synthesis step requires careful design. Shared worktree read-only mode is a new capability.
+- **Risk: LOW-MEDIUM** — Main risks are resource consumption (manageable with concurrency limits) and synthesis quality (improvable iteratively).
diff --git a/docs/proposals/feature-plan-deepening.md b/docs/proposals/feature-plan-deepening.md
new file mode 100644
index 0000000..c2d8aeb
--- /dev/null
+++ b/docs/proposals/feature-plan-deepening.md
@@ -0,0 +1,383 @@
+# Feature Proposal: Parallel Plan Deepening
+
+> **Priority:** Medium
+> **Complexity:** Low
+> **Estimated Effort:** 5-8 days
+> **Status:** Proposal
+> **Date:** 2026-02-09
+> **Dependencies:** [Knowledge Accumulation](feature-knowledge-accumulation.md) (recommended, not required)
+> **Related:** [Overview Analysis](compound-engineering-analysis.md) · [Multi-Agent Review](feature-multi-agent-review.md)
+
+---
+
+## Problem Statement
+
+Makima's planning phase currently suffers from **single-pass planning**:
+
+- A supervisor creates a plan based on its immediate analysis of the task
+- **No systematic research** is conducted before finalizing the plan
+- **Edge cases are discovered during execution**, requiring mid-stream plan changes
+- **Best practices are not consulted** — the plan relies solely on the model's training knowledge
+- **Existing project learnings** (if the knowledge accumulation feature exists) are not surfaced during planning
+- **Revision rate is high** — an estimated ~40% of plans require significant changes after execution begins
+
+The result: plans are shallow, execution discovers problems that planning should have caught, and contracts take longer than necessary.
+
+---
+
+## How Compound Engineering Solves This
+
+The compound engineering plugin's `/deepen-plan` command takes an existing plan and enhances it by spawning **20-40 parallel research agents**:
+
+```
+┌──────────────────────────────────────────────────────────────┐
+│                         /deepen-plan                         │
+│                                                              │
+│  Input: Initial plan (from /plan)                            │
+│                                                              │
+│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐      │
+│  │ Best     │  │ Edge     │  │ Dep.     │  │ Pattern  │      │
+│  │ Practice │  │ Case     │  │ Research │  │ Matching │      │
+│  │ Agent 1  │  │ Agent 1  │  │ Agent 1  │  │ Agent 1  │      │
+│  └────┬─────┘  └────┬─────┘  └────┬─────┘  └────┬─────┘      │
+│       │             │             │             │            │
+│  ┌────┴─────┐  ┌────┴─────┐  ┌────┴─────┐  ┌────┴─────┐      │
+│  │ Best     │  │ Edge     │  │ Security │  │ Existing │      │
+│  │ Practice │  │ Case     │  │ Concerns │  │ Learning │      │
+│  │ Agent 2  │  │ Agent 2  │  │ Agent    │  │ Agent    │      │
+│  └────┬─────┘  └────┬─────┘  └────┬─────┘  └────┬─────┘      │
+│       │             │             │             │            │
+│             ... (20-40 agents per plan item) ...             │
+│       │             │             │             │            │
+│       ▼             ▼             ▼             ▼            │
+│  ┌──────────────────────────────────────────────────┐        │
+│  │                 Synthesis Agent                  │        │
+│  │  - Merge research into plan                      │        │
+│  │  - Add edge case handling                        │        │
+│  │  - Insert best practice notes                    │        │
+│  │  - Flag risks and dependencies                   │        │
+│  └──────────────────────────────────────────────────┘        │
+│                           │                                  │
+│                           ▼                                  │
+│                Enhanced Plan (Deepened)                      │
+│                - Original steps preserved                    │
+│                - Edge cases added per step                   │
+│                - Best practices annotated                    │
+│                - Risks flagged                               │
+│                - Dependencies clarified                      │
+└──────────────────────────────────────────────────────────────┘
+```
+
+The key insight: **research is embarrassingly parallel**. Each plan item can be researched independently, and each research dimension (best practices, edge cases, security, etc.) is independent.
+
+---
+
+## Proposed Makima Implementation
+
+### 1. New Supervisor Command: `makima supervisor deepen-plan`
+
+```bash
+# Deepen the current contract's plan
+makima supervisor deepen-plan
+
+# Deepen with specific focus areas
+makima supervisor deepen-plan --focus "security,edge-cases,performance"
+
+# Deepen with explicit plan file reference
+makima supervisor deepen-plan --plan-file plan.md
+
+# Control parallelism
+makima supervisor deepen-plan --max-agents 10
+
+# Include knowledge base search (requires Knowledge Accumulation feature)
+makima supervisor deepen-plan --search-learnings
+```
+
+### 2. Research Agent Categories
+
+Each plan item is researched along multiple dimensions:
+
+| Agent Category | Purpose | Example Output |
+|----------------|---------|----------------|
+| **Best Practices** | Industry standards for the technology/pattern | "Use parameterized queries for all DB operations" |
+| **Edge Cases** | Boundary conditions and error scenarios | "Handle concurrent modification of shared resource" |
+| **Dependency Research** | Compatibility, versions, known issues | "Library X v3 has breaking changes from v2" |
+| **Security Concerns** | Security implications of the planned approach | "JWT stored in localStorage is vulnerable to XSS" |
+| **Performance Implications** | Performance characteristics and bottlenecks | "N+1 query risk with eager loading disabled" |
+| **Pattern Matching** | Similar patterns in the existing codebase | "Module Y already implements this pattern; follow its conventions" |
+| **Existing Learnings** | Prior solutions from knowledge base | "Similar issue solved in contract Z; see docs/solutions/..." |
+
+### 3. Deepening Flow
+
+```
+┌─────────────┐     ┌──────────────────┐     ┌────────────────┐
+│ Original    │     │ Research Phase   │     │ Enhanced Plan  │
+│ Plan        │────▶│                  │────▶│                │
+│             │     │ Per plan item:   │     │ Original +     │
+│ Step 1      │     │ - Best practices │     │ annotations    │
+│ Step 2      │     │ - Edge cases     │     │                │
+│ Step 3      │     │ - Dependencies   │     │ Step 1         │
+│ Step 4      │     │ - Security       │     │  ├ Edge cases  │
+│             │     │ - Performance    │     │  ├ Best pracs  │
+│             │     │ - Patterns       │     │  └ Risks       │
+│             │     │ - Learnings      │     │ Step 2         │
+│             │     │                  │     │  ├ Edge cases  │
+│             │     │ All in parallel  │     │  └ ...         │
+└─────────────┘     └──────────────────┘     └────────────────┘
+```
+
+**Implementation using existing infrastructure:**
+
+```bash
+# NOTE: `get-plan-items` is assumed to print one JSON object per line,
+# e.g. {"id": "3", "description": "Implement JWT authentication"}.
+
+# Steps 1-2: Parse plan into items and spawn one research group per item
+makima supervisor get-plan-items | while IFS= read -r item; do
+  id=$(jq -r '.id' <<<"$item")
+  desc=$(jq -r '.description' <<<"$item")
+
+  # Build the task list with jq so the description is JSON-escaped correctly
+  tasks=$(jq -cn --arg d "$desc" '[
+    {name: "best-practices", plan: ("Research best practices for: " + $d)},
+    {name: "edge-cases",     plan: ("Identify edge cases for: " + $d)},
+    {name: "security",       plan: ("Analyze security implications of: " + $d)},
+    {name: "performance",    plan: ("Assess performance implications of: " + $d)}
+  ]')
+
+  makima supervisor spawn-group "deepen-${id}" \
+    --tasks "$tasks" \
+    --share-worktree \
+    --read-only
+done
+
+# Step 3: Wait for all groups
+makima supervisor wait-group "deepen-*" --timeout 300
+
+# Step 4: Synthesize results into enhanced plan
+makima supervisor synthesize-plan
+```
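
The fan-out in Step 2, combined with a concurrency cap such as `--max-agents`, can be pictured as a cross product of plan items and research dimensions followed by batching. The sketch below is illustrative only: `ResearchTask`, `fan_out`, and `batches` are hypothetical names, not part of makima, and a real scheduler would probably start new agents as earlier ones finish rather than wait for whole batches.

```rust
/// One research task: a plan item paired with a research dimension.
/// (Hypothetical type, used only for this illustration.)
#[derive(Debug, Clone, PartialEq)]
struct ResearchTask {
    item_id: String,
    dimension: String,
}

/// Build the full item x dimension task list.
fn fan_out(item_ids: &[&str], dimensions: &[&str]) -> Vec<ResearchTask> {
    item_ids
        .iter()
        .flat_map(|item| {
            dimensions.iter().map(move |dim| ResearchTask {
                item_id: item.to_string(),
                dimension: dim.to_string(),
            })
        })
        .collect()
}

/// Split tasks into batches no larger than `max_concurrent`.
/// A cap of 0 is treated as 1 so that progress is always possible.
fn batches(tasks: &[ResearchTask], max_concurrent: usize) -> Vec<Vec<ResearchTask>> {
    tasks
        .chunks(max_concurrent.max(1))
        .map(|chunk| chunk.to_vec())
        .collect()
}
```

With three plan items and four dimensions this produces 12 tasks; a cap of 5 then gives batches of 5, 5, and 2.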
+
+### 4. Enhanced Plan Format
+
+The deepened plan augments each step with structured annotations:
+
+```markdown
+## Step 3: Implement JWT Authentication
+
+### Original Plan
+Add JWT-based authentication middleware to the API gateway.
+Generate tokens on login, validate on each request.
+
+### Research Findings
+
+#### Best Practices
+- Use RS256 (asymmetric) for microservices, HS256 for monoliths
+- Set short access token TTL (15 min) with refresh token rotation
+- Include only essential claims (sub, exp, iat, roles)
+- Never store sensitive data in JWT payload (it's base64, not encrypted)
+
+#### Edge Cases
+- Token expiry during long-running requests
+- Clock skew between services (use ±30s leeway)
+- Concurrent refresh token rotation (race condition)
+- Token size exceeding header limits (>8KB with many claims)
+
+#### Security Concerns
+- **P2**: JWT in localStorage is XSS-vulnerable; prefer httpOnly cookies
+- **P3**: Missing CSRF protection if using cookies
+- **P2**: No token revocation mechanism for compromised tokens
+
+#### Performance Notes
+- JWT validation is CPU-bound (RS256 ~1ms per validation)
+- Consider caching decoded tokens for repeated validation
+- Refresh token DB lookup adds latency (~5ms)
+
+#### Existing Learnings
+- See: docs/solutions/security-practices/jwt-refresh-token-rotation.md
+- Previous contract "Auth Service Refactor" used similar pattern
+
+### Risks
+- [ ] Clock skew handling not in original plan
+- [ ] Token revocation strategy needed
+- [ ] CSRF protection if using cookie storage
+```
+
+### 5. Integration with Knowledge Base
+
+When the Knowledge Accumulation feature is available, `deepen-plan` automatically includes a **learning search agent** for each plan item:
+
+```
+Research Agent: "Search existing learnings relevant to JWT authentication"
+
+Results:
+- docs/solutions/security-practices/jwt-refresh-token-rotation.md (relevance: 0.92)
+- docs/solutions/api-patterns/authentication-middleware-pattern.md (relevance: 0.78)
+- docs/solutions/debugging-techniques/token-expiry-debugging.md (relevance: 0.65)
+```
+
+These results are included in the deepened plan with direct links.
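
A minimal sketch of how search hits might be filtered and ordered before being linked into the deepened plan, assuming each hit carries a relevance score between 0 and 1 as in the example output. `Learning` and `select_learnings` are hypothetical names, and the threshold stands in for a minimum-relevance setting.

```rust
/// A knowledge-base hit: document path plus a relevance score in [0, 1].
/// (Hypothetical type, used only for this illustration.)
#[derive(Debug, Clone, PartialEq)]
struct Learning {
    path: String,
    relevance: f64,
}

/// Keep hits at or above `min_relevance`, most relevant first.
fn select_learnings(mut hits: Vec<Learning>, min_relevance: f64) -> Vec<Learning> {
    hits.retain(|hit| hit.relevance >= min_relevance);
    // total_cmp gives a total order on f64, so sorting cannot panic on NaN.
    hits.sort_by(|a, b| b.relevance.total_cmp(&a.relevance));
    hits
}
```

Applied to the three results above with a threshold of 0.7, only the 0.92 and 0.78 hits would be kept, in that order.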
+
+---
+
+## Integration with Existing Makima Features
+
+### Contract Phases
+
+Plan deepening occurs during the **Plan phase**, between initial plan creation and phase transition to Execute:
+
+```
+Plan Phase Timeline:
+ 1. Supervisor creates initial plan
+ 2. makima supervisor deepen-plan ← NEW
+ 3. User reviews deepened plan
+ 4. makima supervisor advance-phase execute
+```
+
+### Supervisor/Worker Hierarchy
+
+Research agents are spawned as **worker tasks** under the supervisor. They use the existing `spawn-task` infrastructure together with the `spawn-group`/`wait-group` commands proposed in the [Multi-Agent Review](feature-multi-agent-review.md) proposal.
+
+### Contract Files
+
+The deepened plan replaces or augments the plan document as a contract file:
+
+```rust
+File {
+    contract_id: contract.id,
+    contract_phase: "plan",
+    name: "Implementation Plan (Deepened)",
+    body: vec![
+        // Enhanced plan content with annotations
+    ],
+}
+```
+
+### Directive System
+
+For directive-based workflows, plan deepening can be added as a step:
+
+```rust
+DirectiveStep {
+    name: "deepen-plan",
+    description: "Enhance implementation plan with parallel research",
+    depends_on: [initial_plan_step_id],
+    task_plan: "Run deepen-plan on the initial plan...",
+}
+```
+
+### Phase Guards
+
+If `phase_guard` is enabled, the user reviews the deepened plan before approving transition to execute. This is the natural checkpoint for plan quality.
+
+---
+
+## Implementation Plan
+
+### Phase 1: Core Command (2-3 days)
+
+| Task | Effort | Description |
+|------|--------|-------------|
+| `deepen-plan` command | 1 day | Parse plan, spawn research groups |
+| Research agent templates | 0.5 days | Default prompts for each category |
+| Synthesis logic | 1 day | Merge research into annotated plan |
+| Plan file update | 0.5 days | Write deepened plan as contract file |
+
+### Phase 2: Knowledge Integration (1-2 days)
+
+| Task | Effort | Description |
+|------|--------|-------------|
+| Learning search agent | 0.5 days | Search knowledge base per plan item |
+| Result integration | 0.5 days | Include learning links in plan |
+| Fallback when no KB | 0.5 days | Graceful degradation without KB |
+
+### Phase 3: Configuration & Polish (2-3 days)
+
+| Task | Effort | Description |
+|------|--------|-------------|
+| Config file support | 0.5 days | `.makima/deepen.yaml` |
+| Focus area filtering | 0.5 days | `--focus` flag implementation |
+| Concurrency control | 0.5 days | `--max-agents` limit |
+| Documentation | 0.5 days | User guide |
+| Tests | 1 day | Unit + integration |
+
+---
+
+## Configuration Examples
+
+### Repository-Level Configuration
+
+```yaml
+# .makima/deepen.yaml
+version: 1
+deepen:
+  # Auto-deepen when plan is created
+  auto_trigger: false
+
+  # Maximum agents per plan item
+  max_agents_per_item: 5
+
+  # Total maximum concurrent agents
+  max_concurrent: 20
+
+  # Timeout per research agent (seconds)
+  agent_timeout: 120
+
+  # Research dimensions to include
+  dimensions:
+    - best-practices
+    - edge-cases
+    - security
+    - performance
+    - dependencies
+    - patterns
+    - learnings  # Requires Knowledge Accumulation
+
+  # Minimum plan items to trigger deepening
+  min_plan_items: 3
+
+  # Search learnings (requires Knowledge Accumulation)
+  search_learnings: true
+  search_min_relevance: 0.5
+```
+
+### Inline Usage
+
+```bash
+# Quick deepen with defaults
+makima supervisor deepen-plan
+
+# Focused deepen for security-sensitive work
+makima supervisor deepen-plan --focus security,edge-cases
+
+# Deepen with more agents for complex plans
+makima supervisor deepen-plan --max-agents 30
+
+# Deepen without knowledge base search
+makima supervisor deepen-plan --no-learnings
+```
+
+---
+
+## Open Questions
+
+1. **Plan format parsing**: How should the system parse existing plans to identify discrete items? Markdown headers? Numbered lists? YAML structure?
+2. **Research depth vs. cost**: 20-40 agents per deepening is expensive. Should there be a "lite" mode with fewer agents?
+3. **Deepening multiple times**: Can a plan be deepened iteratively? Should subsequent deepenings build on previous research?
+4. **User-provided context**: Should users be able to provide additional context (e.g., "this project uses PostgreSQL, not MySQL") to guide research?
+5. **Codebase analysis**: Should research agents analyze the existing codebase to find relevant patterns, or only reason from general knowledge?
+6. **Conflicting research**: When research agents disagree (e.g., one says "use Redis" and another says "avoid Redis"), how should the synthesis handle it?
+
+---
+
+## Alternatives Considered
+
+| Alternative | Pros | Cons | Decision |
+|-------------|------|------|----------|
+| Sequential research (one agent) | Simple, cheaper | Slow; misses multi-perspective insights | Rejected — parallel is core value |
+| Automatic deepening (always on) | No manual step | Adds latency to every plan; unnecessary for simple tasks | Optional auto-trigger |
+| Web search integration | Real-time information | Inconsistent quality; potential hallucination from web results | Deferred — consider for v2 |
+| User-provided research questions | Targeted research | Requires user to know what to ask | Complement — support alongside auto-research |
+| LLM-only research (no task spawning) | Simpler, no infrastructure | Limited by single context window; no parallelism | Rejected — defeats the purpose |
+
+---
+
+## Priority & Complexity Assessment
+
+- **Priority: MEDIUM.** Plan deepening significantly improves plan quality, but it is an enhancement to an already-functional planning workflow. The compound engineering plugin's data shows a ~40% reduction in plan revisions.
+- **Complexity: LOW.** This feature is largely a composition of existing and proposed primitives (task spawning, the proposed `spawn-group`/`wait-group` commands, plan file updates). The main new work is research agent prompts and synthesis logic.
+- **Risk: LOW.** In the worst case, deepened plans are only slightly better than the originals. No changes to existing system behavior are required, and the feature can be adopted incrementally.
diff --git a/docs/proposals/feature-task-templates.md b/docs/proposals/feature-task-templates.md
new file mode 100644
index 0000000..98abde9
--- /dev/null
+++ b/docs/proposals/feature-task-templates.md
@@ -0,0 +1,602 @@
+# Feature Proposal: Reusable Task Templates & Meta-Commands
+
+> **Priority:** Medium
+> **Complexity:** Medium
+> **Estimated Effort:** 8-12 days
+> **Status:** Proposal
+> **Date:** 2026-02-09
+> **Dependencies:** None (standalone, but complements [Workflow Presets](feature-workflow-presets.md))
+> **Related:** [Overview Analysis](compound-engineering-analysis.md) · [Workflow Presets](feature-workflow-presets.md) · [Multi-Agent Review](feature-multi-agent-review.md)
+
+---
+
+## Problem Statement
+
+Makima tasks are created with **ad-hoc plans** every time:
+
+- **No plan reuse** — even when spawning the same type of task (e.g., "add API endpoint"), the plan is written from scratch
+- **No standardization** — different supervisors produce different quality plans for the same task type
+- **No best practices encoding** — hard-won knowledge about how to structure certain tasks isn't captured
+- **No variable substitution** — plans can't be parameterized for reuse
+- **No validation** — there's no way to verify a plan includes required steps before execution
+- **No meta-creation** — the system cannot create its own task templates or improve its own capabilities
+
+The compound engineering plugin addresses this with meta-commands (`/create-agent-skill`, `/heal-skill`) that allow the system to create and repair its own specialized capabilities.
+
+---
+
+## How Compound Engineering Solves This
+
+### `/create-agent-skill`
+
+Creates new specialized agents and skills on demand:
+
+```bash
+/create-agent-skill "database migration reviewer"
+```
+
+This generates:
+1. An agent definition file with specialized prompts
+2. A skill file that exposes the agent as a command
+3. Registration in the agent/skill registry
+
+### `/heal-skill`
+
+When a skill breaks (e.g., after a dependency change), this meta-command:
+1. Analyzes the error
+2. Identifies the root cause
+3. Patches the skill definition
+4. Tests the fix
+
+The key insight: **the system should be able to improve and extend itself**.
+
+---
+
+## Proposed Makima Implementation
+
+### 1. Task Recipe Format
+
+Task recipes are parameterized plan templates with validation and metadata:
+
+```yaml
+# .makima/recipes/api-endpoint.yaml
+name: api-endpoint
+description: "Create a new REST API endpoint"
+version: 1
+author: "team"
+tags: [api, backend, rest]
+
+# Input variables
+variables:
+  endpoint_name:
+    required: true
+    description: "Name of the endpoint (e.g., 'users', 'orders')"
+    validation: "^[a-z][a-z0-9-]*$"
+
+  http_method:
+    required: true
+    description: "HTTP method"
+    enum: [GET, POST, PUT, PATCH, DELETE]
+    default: GET
+
+  resource_name:
+    required: true
+    description: "Name of the resource/model"
+
+  requires_auth:
+    required: false
+    default: true
+    description: "Whether the endpoint requires authentication"
+
+  database_table:
+    required: false
+    description: "Database table name (if applicable)"
+
+# Plan template with variable substitution
+plan: |
+  ## Task: Create {{ http_method }} /api/{{ endpoint_name }} Endpoint
+
+  ### Step 1: Define Route
+  Add the `{{ http_method }} /api/{{ endpoint_name }}` route to the router.
+  {% if requires_auth %}
+  Apply authentication middleware to this route.
+  {% endif %}
+
+  ### Step 2: Create Handler
+  Create the handler function for {{ endpoint_name }}.
+  {% if database_table %}
+  The handler should query the `{{ database_table }}` table.
+  {% endif %}
+
+  ### Step 3: Request/Response Models
+  Define request and response types for the {{ resource_name }} resource.
+  Include validation for all input fields.
+
+  ### Step 4: Error Handling
+  Implement proper error responses:
+  - 400 for validation errors
+  {% if requires_auth %}
+  - 401 for authentication failures
+  - 403 for authorization failures
+  {% endif %}
+  - 404 for not found
+  - 500 for server errors
+
+  ### Step 5: Tests
+  Write tests covering:
+  - Happy path
+  - Input validation
+  {% if requires_auth %}
+  - Authentication required
+  - Authorization check
+  {% endif %}
+  - Error cases
+  - Edge cases
+
+  ### Step 6: Documentation
+  Update API documentation with:
+  - Endpoint URL and method
+  - Request/response schemas
+  - Example requests and responses
+  - Error codes
+
+# Validation rules: checks that must pass before execution
+validation:
+  - check: "file_exists"
+    path: "src/api/mod.rs"
+    message: "API module must exist"
+  - check: "grep"
+    pattern: "Router"
+    path: "src/api/mod.rs"
+    message: "Router must be defined in API module"
+
+# Expected outputs
+outputs:
+  files:
+    - "src/api/{{ endpoint_name }}.rs"
+    - "src/api/{{ endpoint_name }}_test.rs"
+  tests:
+    - "cargo test {{ endpoint_name }}"
+
+# Metadata for recipe discovery
+metadata:
+  estimated_time: "30-60 minutes"
+  difficulty: "easy"
+  example_usage: |
+    makima recipe run api-endpoint \
+      --var endpoint_name=users \
+      --var http_method=GET \
+      --var resource_name=User \
+      --var database_table=users
+```
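
To make the `{{ variable }}` substitution concrete, here is a minimal sketch of a renderer that handles plain placeholders only. It is not makima's implementation: `render` is a hypothetical helper, and the `{% if %}` conditionals used in the recipe above are deliberately out of scope and would need a real template engine.

```rust
use std::collections::HashMap;

/// Replace every `{{ name }}` placeholder with its value from `vars`.
/// Returns an error for an unclosed placeholder or a variable with no value.
/// Conditionals such as `{% if %}` are not handled by this sketch.
fn render(template: &str, vars: &HashMap<&str, &str>) -> Result<String, String> {
    let mut out = String::with_capacity(template.len());
    let mut rest = template;
    while let Some(start) = rest.find("{{") {
        // Copy the literal text before the placeholder.
        out.push_str(&rest[..start]);
        let after_open = &rest[start + 2..];
        let end = after_open
            .find("}}")
            .ok_or_else(|| "unclosed placeholder".to_string())?;
        let name = after_open[..end].trim();
        let value = vars
            .get(name)
            .ok_or_else(|| format!("missing variable: {}", name))?;
        out.push_str(value);
        rest = &after_open[end + 2..];
    }
    // Copy whatever follows the last placeholder.
    out.push_str(rest);
    Ok(out)
}
```

Rendering `Create {{ http_method }} /api/{{ endpoint_name }}` with `http_method=GET` and `endpoint_name=users` yields `Create GET /api/users`. A placeholder with no supplied value produces an error rather than an empty string, so a missing variable cannot silently produce a malformed plan.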
+
+### 2. Recipe Registry
+
+Recipes are discovered from three sources (same hierarchy as workflow presets):
+
+| Level | Location | Scope |
+|-------|----------|-------|
+| Built-in | Shipped with makima | All users |
+| Repository | `.makima/recipes/` | All users of the repo |
+| User | `~/.makima/recipes/` | Single user |
+
+**Precedence**: User > Repository > Built-in (same name overrides)
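
The precedence rule can be sketched as follows, assuming discovery has already produced a flat list of recipes tagged with where they were found. `Source`, `Recipe`, and `resolve` are hypothetical names chosen for this illustration.

```rust
use std::collections::HashMap;

/// Where a recipe was discovered. Variant order encodes precedence:
/// later variants override earlier ones (Built-in < Repository < User).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Source {
    BuiltIn,
    Repository,
    User,
}

#[derive(Debug, Clone, PartialEq)]
struct Recipe {
    name: String,
    source: Source,
}

/// Collapse recipes from all sources into one entry per name,
/// keeping the copy from the highest-precedence source.
fn resolve(recipes: Vec<Recipe>) -> HashMap<String, Recipe> {
    let mut resolved: HashMap<String, Recipe> = HashMap::new();
    for recipe in recipes {
        let keep_existing = resolved
            .get(&recipe.name)
            .map_or(false, |existing| existing.source >= recipe.source);
        if !keep_existing {
            resolved.insert(recipe.name.clone(), recipe);
        }
    }
    resolved
}
```

If `api-endpoint` exists at all three levels, `resolve` keeps the user-level copy regardless of the order in which the recipes were discovered; a recipe that exists only as a built-in is returned unchanged.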
+
+### 3. Supervisor Commands
+
+#### List Available Recipes
+
+```bash
+makima recipe list
+
+# Output:
+# NAME DESCRIPTION SOURCE TAGS
+# api-endpoint Create a new REST API endpoint built-in api, backend
+# db-migration Create a database migration built-in database
+# react-component Create a React component built-in frontend, react
+# unit-test Create unit tests for a module built-in testing
+# bug-fix Structured bug fix workflow built-in debugging
+# custom-validator Create input validation module repo validation
+```
+
+#### Run a Recipe
+
+```bash
+# Run with explicit variables
+makima recipe run api-endpoint \
+ --var endpoint_name=users \
+ --var http_method=GET \
+ --var resource_name=User \
+ --var database_table=users
+
+# Run with interactive variable input
+makima recipe run api-endpoint
+
+# Preview the generated plan (dry run)
+makima recipe preview api-endpoint \
+  --var endpoint_name=users \
+  --var http_method=GET \
+  --var resource_name=User
+```
+
+#### Create a Recipe
+
+```bash
+# Create recipe from scratch
+makima recipe create --name "my-recipe" --edit
+
+# Generate recipe from a completed task (meta-creation)
+makima recipe create --from-task <task-id> --name "my-recipe"
+
+# Generate recipe from a plan file
+makima recipe create --from-plan plan.md --name "my-recipe"
+```
+
+#### Validate a Recipe
+
+```bash
+# Validate recipe file
+makima recipe validate .makima/recipes/my-recipe.yaml
+
+# Validate recipe variables
+makima recipe validate api-endpoint \
+  --var endpoint_name=users \
+  --var http_method=GET \
+  --var resource_name=User
+```
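
Variable validation could work roughly as sketched below, covering the `required`, `default`, and `enum` fields of the recipe format. `VarSpec` and `resolve_vars` are hypothetical names, and the regex-based `validation` field is left out because it would need a regex library.

```rust
use std::collections::HashMap;

/// Declaration of one recipe variable (a subset of the YAML fields).
struct VarSpec {
    name: &'static str,
    required: bool,
    default: Option<&'static str>,
    /// The `enum` list from the recipe, if any.
    allowed: Option<&'static [&'static str]>,
}

/// Merge user-supplied values with defaults and check `required` / `enum`.
/// Returns the resolved values, or every problem found.
fn resolve_vars(
    specs: &[VarSpec],
    supplied: &HashMap<String, String>,
) -> Result<HashMap<String, String>, Vec<String>> {
    let mut resolved = HashMap::new();
    let mut errors = Vec::new();
    for spec in specs {
        // A supplied value wins; otherwise fall back to the default.
        let value = supplied
            .get(spec.name)
            .cloned()
            .or_else(|| spec.default.map(str::to_string));
        match value {
            Some(v) => {
                if let Some(allowed) = spec.allowed {
                    if !allowed.contains(&v.as_str()) {
                        errors.push(format!("{}: '{}' is not one of {:?}", spec.name, v, allowed));
                        continue;
                    }
                }
                resolved.insert(spec.name.to_string(), v);
            }
            None if spec.required => {
                errors.push(format!("{}: required variable is missing", spec.name));
            }
            None => {}
        }
    }
    if errors.is_empty() { Ok(resolved) } else { Err(errors) }
}
```

Collecting all errors instead of stopping at the first one lets a command like `recipe validate` report every missing or invalid variable in a single run.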
+
+### 4. Meta-Commands: Self-Improving Templates
+
+The most powerful aspect of the compound engineering plugin is its ability to **create its own capabilities**. Makima can implement similar meta-commands:
+
+#### `makima recipe generate`
+
+The system analyzes completed tasks and suggests recipe templates:
+
+```bash
+# Analyze recent tasks and suggest recipes
+makima recipe generate --analyze-last 20
+
+# Output:
+# Detected patterns:
+# 1. "API endpoint creation" — 7 tasks followed similar pattern
+# Suggested recipe: api-endpoint (confidence: 0.89)
+# Variables: endpoint_name, http_method, resource_name
+#
+# 2. "Database migration" — 4 tasks followed similar pattern
+# Suggested recipe: db-migration (confidence: 0.76)
+# Variables: table_name, migration_type
+#
+# Generate these recipes? [y/N]
+```
+
+#### `makima recipe heal`
+
+When a recipe fails repeatedly, the system can analyze and fix it:
+
+```bash
+# Analyze recipe failures and suggest fixes
+makima recipe heal api-endpoint
+
+# Output:
+# Analyzed 3 recent failures of 'api-endpoint':
+# Root cause: Validation checks 'src/api/mod.rs' but project uses 'src/routes/mod.rs'
+# Suggested fix: Change validation path and plan references
+# Apply fix? [y/N]
+```
+
+#### `makima recipe evolve`
+
+Improve recipes based on review findings:
+
+```bash
+# Check if review findings suggest recipe improvements
+makima recipe evolve api-endpoint --from-findings
+
+# Output:
+# Review findings from tasks using 'api-endpoint' recipe:
+# - SEC-001: "Missing rate limiting" (3 occurrences)
+# - PERF-001: "Missing pagination" (2 occurrences)
+#
+# Suggested additions to recipe:
+# 1. Add "Rate Limiting" step after Step 1
+# 2. Add pagination to Step 2 for GET endpoints
+# Apply improvements? [y/N]
+```
+
+### 5. Built-In Recipes
+
+#### `api-endpoint`
+
+Creates a REST API endpoint with handler, models, validation, tests, and docs.
+
+#### `db-migration`
+
+Creates a database migration with up/down scripts, validation, and rollback plan.
+
+```yaml
+name: db-migration
+variables:
+  table_name: { required: true }
+  migration_type: { required: true, enum: [create-table, alter-table, add-index, seed-data] }
+plan: |
+  ## Create Database Migration: {{ migration_type }} on {{ table_name }}
+  ### Step 1: Create migration file
+  ### Step 2: Write up migration
+  ### Step 3: Write down migration (rollback)
+  ### Step 4: Test migration on clean database
+  ### Step 5: Test rollback
+  ### Step 6: Document migration in changelog
+```
+
+#### `react-component`
+
+Creates a React component with props, state, styling, and tests.
+
+#### `unit-test`
+
+Generates unit tests for an existing module by analyzing its public API.
+
+#### `bug-fix`
+
+Structured bug fix workflow: reproduce → root cause → fix → test → document.
+
+```yaml
+name: bug-fix
+variables:
+  bug_description: { required: true }
+  reproduction_steps: { required: false }
+  affected_area: { required: false }
+plan: |
+  ## Bug Fix: {{ bug_description }}
+
+  ### Step 1: Reproduce
+  {% if reproduction_steps %}
+  Follow these reproduction steps: {{ reproduction_steps }}
+  {% else %}
+  Identify and document reproduction steps.
+  {% endif %}
+
+  ### Step 2: Root Cause Analysis
+  Trace the code path to identify the root cause.
+  {% if affected_area %}
+  Start in: {{ affected_area }}
+  {% endif %}
+
+  ### Step 3: Implement Fix
+  Fix the root cause, not just the symptom.
+
+  ### Step 4: Write Regression Test
+  Create a test that would have caught this bug.
+
+  ### Step 5: Verify Fix
+  Run the reproduction steps and confirm the bug is fixed.
+  Run the full test suite to check for regressions.
+
+  ### Step 6: Document
+  Document what caused the bug and how it was fixed.
+```
+
+---
+
+## Integration with Existing Makima Features
+
+### Supervisor Task Spawning
+
+Recipes generate plans that are passed to `spawn-task`:
+
+```rust
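+// Recipe execution (assumes `variables` is a HashMap<String, String>)
+let plan = recipe.render_plan(&variables)?;
+// `get` returns an Option, so fall back to an empty label when the key is absent
+let label = variables.get("primary_var").map(String::as_str).unwrap_or("");
+let task = spawn_task(SpawnTaskRequest {
+    task_name: format!("{} ({})", recipe.name, label),
+    plan,
+    // ... other params from context
+})?;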
+```
+
+### Contract Files
+
+Recipe definitions can be stored as contract files for versioning:
+
+```rust
+File {
+    contract_id: None, // Global, not contract-specific
+    name: "Recipe: api-endpoint",
+    body: vec![
+        BodyElement::Code { language: Some("yaml"), content: recipe_yaml },
+    ],
+}
+```
+
+### Workflow Presets
+
+Recipes and presets are complementary:
+- **Presets** define the high-level workflow (which phases, what triggers)
+- **Recipes** define the low-level task plans (what each task does)
+
+A preset can reference recipes:
+
+```yaml
+# In a preset
+phases:
+  execute:
+    recipe: api-endpoint  # Use the api-endpoint recipe for this phase's tasks
+    recipe_vars:
+      endpoint_name: "{{ task_description }}"
+```
+
+### Knowledge Accumulation
+
+Recipes can be **evolved** based on learnings:
+- When compound learning captures a pattern, check if it maps to an existing recipe
+- If so, suggest recipe improvements
+- If not, suggest creating a new recipe
+
+### Directive System
+
+For directive-based workflows, recipes can be used as task plan sources:
+
+```rust
+DirectiveStep {
+    name: "create-users-endpoint",
+    task_plan: recipe.render_plan(&variables)?, // Generated from recipe
+    // ...
+}
+```
+
+---
+
+## Implementation Plan
+
+### Phase 1: Core Recipe System (3-4 days)
+
+| Task | Effort | Description |
+|------|--------|-------------|
+| Recipe YAML schema | 0.5 days | Define format, validation rules |
+| YAML parser with Jinja-like templating | 1 day | Variable substitution, conditionals |
+| `recipe list` command | 0.5 days | Discover and list recipes |
+| `recipe run` command | 1 day | Parse, validate, render, spawn task |
+| `recipe preview` command | 0.5 days | Dry-run display |
+
+### Phase 2: Recipe Management (2-3 days)
+
+| Task | Effort | Description |
+|------|--------|-------------|
+| Multi-level discovery | 0.5 days | Built-in, repo, user resolution |
+| `recipe create` command | 1 day | Create from scratch or from task |
+| `recipe validate` command | 0.5 days | YAML validation, variable check |
+| Built-in recipe definitions | 1 day | Write 5 default recipes |
+
+### Phase 3: Meta-Commands (3-5 days)
+
+| Task | Effort | Description |
+|------|--------|-------------|
+| `recipe generate` | 1.5 days | Pattern detection from task history |
+| `recipe heal` | 1 day | Failure analysis and auto-fix |
+| `recipe evolve` | 1 day | Improve recipes from findings/learnings |
+| Recipe versioning | 0.5 days | Version tracking, deprecation |
+| Documentation | 0.5 days | User guide, recipe authoring guide |
+
+---
+
+## Configuration Examples
+
+### Running a Recipe
+
+```bash
+# Simple usage
+makima recipe run api-endpoint \
+ --var endpoint_name=orders \
+ --var http_method=POST \
+ --var resource_name=Order \
+ --var requires_auth=true \
+ --var database_table=orders
+
+# This spawns a task with the rendered plan:
+# "## Task: Create POST /api/orders Endpoint
+# ### Step 1: Define Route
+# Add the POST /api/orders route to the router.
+# Apply authentication middleware to this route.
+# ..."
+```
+
+### Creating a Recipe from a Completed Task
+
+```bash
+# After completing a successful task
+makima recipe create --from-task abc-123 --name "graphql-resolver"
+
+# Analyzes the task's plan and execution to generate:
+# .makima/recipes/graphql-resolver.yaml
+# with variables extracted from repeated patterns
+```
+
+### Recipe with Validation
+
+```yaml
+# .makima/recipes/react-component.yaml
+name: react-component
+variables:
+  component_name:
+    required: true
+    validation: "^[A-Z][a-zA-Z]*$"  # PascalCase
+  use_typescript:
+    required: false
+    default: true
+  include_tests:
+    required: false
+    default: true
+  styling:
+    required: false
+    enum: [css-modules, styled-components, tailwind]
+    default: css-modules
+
+validation:
+  - check: "file_exists"
+    path: "src/components"
+    message: "Components directory must exist"
+  - check: "not_exists"
+    path: "src/components/{{ component_name }}"
+    message: "Component {{ component_name }} already exists"
+
+plan: |
+  ## Create React Component: {{ component_name }}
+
+  ### Step 1: Component File
+  Create `src/components/{{ component_name }}/{{ component_name }}.{{ 'tsx' if use_typescript else 'jsx' }}`
+  with the component skeleton.
+
+  ### Step 2: Styling
+  {% if styling == 'css-modules' %}
+  Create `{{ component_name }}.module.css` with base styles.
+  {% elif styling == 'styled-components' %}
+  Create styled components in the component file.
+  {% elif styling == 'tailwind' %}
+  Use Tailwind CSS classes directly in the component.
+  {% endif %}
+
+  {% if include_tests %}
+  ### Step 3: Tests
+  Create `{{ component_name }}.test.{{ 'tsx' if use_typescript else 'jsx' }}`
+  with tests for rendering, props, and user interactions.
+  {% endif %}
+
+  ### Step {{ '4' if include_tests else '3' }}: Export
+  Add {{ component_name }} to the components index file.
+
+outputs:
+  files:
+    - "src/components/{{ component_name }}/{{ component_name }}.{{ 'tsx' if use_typescript else 'jsx' }}"
+    - "src/components/{{ component_name }}/index.{{ 'ts' if use_typescript else 'js' }}"
+```
+
+---
+
+## Open Questions
+
+1. **Templating language**: Should we use a full Jinja2-like syntax or a simpler `{{ variable }}` substitution? Jinja adds power but also complexity.
+2. **Recipe dependencies**: Can recipes depend on other recipes? (e.g., "api-endpoint requires db-migration to have run first")
+3. **Recipe testing**: How do you test that a recipe produces valid plans? Should recipes have test cases?
+4. **Recipe marketplace**: Should there be a community registry for sharing recipes?
+5. **Pattern detection**: How sophisticated should `recipe generate` be? Simple plan comparison, or full semantic analysis?
+6. **Recipe scope**: Should recipes generate just plans, or also pre-create file scaffolding (like code generators)?
+7. **Backwards compatibility**: When a recipe is updated, what happens to tasks that were created with the old version?
+
+---
+
+## Alternatives Considered
+
+| Alternative | Pros | Cons | Decision |
+|-------------|------|------|----------|
+| Plan library (copy-paste) | Simple | No variables, no validation | Rejected — not reusable enough |
+| Code generators (scaffolding) | Creates actual files | Over-prescriptive; doesn't handle logic | Complement — recipes can reference generators |
+| LLM-only planning | Maximum flexibility | Inconsistent; no standardization | Current state — recipes improve on this |
+| Cookiecutter-style templates | Familiar | Wrong level (project-level vs task-level) | Rejected — different abstraction |
+| Hardcoded task types | Fast | Not extensible; limited variety | Rejected — need flexibility |
+
+---
+
+## Priority & Complexity Assessment
+
+- **Priority: MEDIUM** — Task templates improve consistency and speed but aren't required for makima to function. They become increasingly valuable as the system is used more (patterns emerge).
+- **Complexity: MEDIUM** — YAML parsing and variable substitution are straightforward. Meta-commands (generate, heal, evolve) require sophisticated analysis of task history and are the main complexity drivers.
+- **Risk: LOW-MEDIUM** — Core recipe system is low risk. Meta-commands (auto-generation, healing) involve AI-driven analysis that may produce variable quality. Mitigated by requiring human approval before applying changes.
diff --git a/docs/proposals/feature-workflow-presets.md b/docs/proposals/feature-workflow-presets.md
new file mode 100644
index 0000000..1468a8a
--- /dev/null
+++ b/docs/proposals/feature-workflow-presets.md
@@ -0,0 +1,623 @@
+# Feature Proposal: Workflow Presets / Pipeline Templates
+
+> **Priority:** High
+> **Complexity:** Medium
+> **Estimated Effort:** 10-15 days
+> **Status:** Proposal
+> **Date:** 2026-02-09
+> **Dependencies:** None (foundational feature)
+> **Related:** [Overview Analysis](compound-engineering-analysis.md) · [Multi-Agent Review](feature-multi-agent-review.md) · [Knowledge Accumulation](feature-knowledge-accumulation.md)
+
+---
+
+## Problem Statement
+
+Every makima contract currently requires **manual orchestration**:
+
+- Users must decide which contract type to use (simple, specification, execute)
+- Supervisors must manually spawn tasks, wait for results, advance phases
+- There are **no pre-built pipelines** for common workflows (full feature development, quick bug fix, refactoring, investigation)
+- The supervisor plan must encode the full orchestration logic every time
+- **Repetitive patterns** (plan → execute → test → review) are re-invented for each contract
+- New users face a steep learning curve to orchestrate contracts effectively
+
+The compound engineering plugin's `/lfg` (Let's F***ing Go) and `/slfg` (Super LFG) commands solve this with **one-command full pipelines** that chain all phases automatically.
+
+---
+
+## How Compound Engineering Solves This
+
+### LFG Pipeline (Serial)
+
+```bash
+/lfg "Implement user authentication"
+```
+
+Automatically chains:
+```
+Plan → Deepen Plan → Work → Review → Resolve Findings → Test → Compound → Done
+```
+
+### SLFG Pipeline (Parallel)
+
+```bash
+/slfg "Implement user authentication"
+```
+
+Same as LFG but parallelizes independent steps:
+```
+Plan ──▶ Deepen Plan ──▶ Work ──▶ ┌─ Review ─────┐ ──▶ Test ──▶ Compound
+                                  │ (parallel)   │
+                                  └──────────────┘
+```
+
+The key insight: **most engineering workflows follow predictable patterns** that can be templated and reused.
+
+---
+
+## Proposed Makima Implementation
+
+### 1. Preset Definition Format
+
+Presets are defined in YAML and describe a complete workflow:
+
+```yaml
+# .makima/presets/full-pipeline.yaml
+name: full-pipeline
+description: "Complete feature development pipeline with review and learning"
+contract_type: specification
+version: 1
+
+# Variables that can be substituted at runtime
+variables:
+  task_description:
+    required: true
+    description: "What to build"
+  repository:
+    required: false
+    description: "Target repository URL"
+  base_branch:
+    required: false
+    default: "main"
+    description: "Branch to work from"
+
+# Phase configuration
+phases:
+  research:
+    enabled: true
+    deliverables:
+      - id: research-notes
+        name: "Research Notes"
+        priority: required
+    supervisor_plan: |
+      Research the requirements for: {{ task_description }}
+      - Analyze the existing codebase for relevant patterns
+      - Identify dependencies and constraints
+      - Document findings as research notes
+
+  plan:
+    enabled: true
+    deliverables:
+      - id: plan-document
+        name: "Implementation Plan"
+        priority: required
+    supervisor_plan: |
+      Create an implementation plan for: {{ task_description }}
+      Based on the research findings.
+    # Auto-deepen plan (requires Plan Deepening feature)
+    deepen: true
+    deepen_focus:
+      - edge-cases
+      - security
+      - performance
+
+  execute:
+    enabled: true
+    deliverables:
+      - id: implementation
+        name: "Implementation"
+        priority: required
+    supervisor_plan: |
+      Execute the plan for: {{ task_description }}
+      Follow the deepened plan step by step.
+    # Spawn configuration
+    max_concurrent_tasks: 3
+    completion_action: "branch"
+
+  review:
+    enabled: true
+    deliverables:
+      - id: review-report
+        name: "Review Report"
+        priority: required
+    # Auto-review configuration (requires Multi-Agent Review feature)
+    auto_review: true
+    review_agents:
+      - security-sentinel
+      - performance-oracle
+      - architecture-strategist
+      - test-coverage-analyzer
+    merge_blocking_severity: P1
+
+  compound:
+    enabled: true
+    # Auto-compound (requires Knowledge Accumulation feature)
+    auto_compound: true
+    categories:
+      - architecture-decisions
+      - security-practices
+      - performance-optimizations
+
+# Hooks
+hooks:
+  on_phase_complete:
+    execute:
+      - run: "makima supervisor spawn 'run-tests' --plan 'Run the full test suite'"
+      - wait_for: "run-tests"
+  on_contract_complete:
+    - run: "makima supervisor compound"
+```
+
+### 2. Built-In Presets
+
+#### `full-pipeline` — Complete Feature Development
+
+```
+Research → Plan → Deepen → Execute → Test → Review → Resolve → Compound
+```
+
+Best for: New features, major changes, complex implementations.
+
+#### `quick-fix` — Rapid Bug Fix
+
+```
+Plan → Execute → Test → Done
+```
+
+Best for: Small bug fixes, typo corrections, config changes.
+
+```yaml
+# .makima/presets/quick-fix.yaml
+name: quick-fix
+description: "Fast bug fix with minimal ceremony"
+contract_type: simple
+
+phases:
+  plan:
+    enabled: true
+    deliverables:
+      - id: fix-plan
+        name: "Fix Plan"
+        priority: required
+    supervisor_plan: |
+      Quick analysis and fix plan for: {{ task_description }}
+      Keep it brief: identify the bug and the fix.
+
+  execute:
+    enabled: true
+    deliverables:
+      - id: fix
+        name: "Bug Fix"
+        priority: required
+    supervisor_plan: |
+      Fix the bug: {{ task_description }}
+      Run relevant tests after fixing.
+    completion_action: "branch"
+```
+
+#### `refactor` — Code Refactoring
+
+```
+Research → Plan → Deepen → Execute → Test → Review → Done
+```
+
+Best for: Code restructuring, pattern changes, dependency updates.
+
+```yaml
+# .makima/presets/refactor.yaml
+name: refactor
+description: "Systematic refactoring with safety checks"
+contract_type: specification
+
+phases:
+  research:
+    enabled: true
+    supervisor_plan: |
+      Analyze the codebase to understand the current structure for: {{ task_description }}
+      Document all files that will be affected.
+      Identify dependencies and potential breaking changes.
+
+  plan:
+    enabled: true
+    deepen: true
+    deepen_focus:
+      - edge-cases
+      - patterns
+    supervisor_plan: |
+      Create a step-by-step refactoring plan for: {{ task_description }}
+      Ensure each step maintains a working state (no big-bang changes).
+
+  execute:
+    enabled: true
+    supervisor_plan: |
+      Execute the refactoring plan for: {{ task_description }}
+      After each significant change, run tests to verify nothing is broken.
+    completion_action: "branch"
+
+  review:
+    enabled: true
+    auto_review: true
+    review_agents:
+      - architecture-strategist
+      - test-coverage-analyzer
+    merge_blocking_severity: P1
+```
+
+#### `investigation` — Research & Analysis
+
+```
+Plan → Investigate → Document → Done
+```
+
+Best for: Bug investigation, feasibility analysis, technology evaluation.
+
+```yaml
+# .makima/presets/investigation.yaml
+name: investigation
+description: "Research-focused workflow for analysis and documentation"
+contract_type: simple
+
+phases:
+  plan:
+    enabled: true
+    supervisor_plan: |
+      Plan the investigation for: {{ task_description }}
+      Define what questions need answering and what to examine.
+
+  execute:
+    enabled: true
+    deliverables:
+      - id: investigation-report
+        name: "Investigation Report"
+        priority: required
+    supervisor_plan: |
+      Investigate: {{ task_description }}
+      Document findings thoroughly.
+      Create actionable recommendations.
+    completion_action: "none"
+```
+
+### 3. Preset Discovery & Usage
+
+#### CLI Commands
+
+```bash
+# List available presets
+makima preset list
+# Output:
+#   NAME           DESCRIPTION                                SOURCE
+#   full-pipeline  Complete feature development pipeline      built-in
+#   quick-fix      Fast bug fix with minimal ceremony         built-in
+#   refactor       Systematic refactoring with safety checks  built-in
+#   investigation  Research-focused analysis workflow         built-in
+#   custom-deploy  Deployment pipeline with staging           .makima/presets/
+
+# Run a preset
+makima preset run full-pipeline \
+  --var task_description="Add user authentication with JWT" \
+  --var repository="github.com/org/repo"
+
+# Run with interactive variable input
+makima preset run full-pipeline
+
+# Preview what a preset will do (dry run)
+makima preset preview full-pipeline \
+  --var task_description="Add user authentication with JWT"
+
+# Create a new preset from an existing contract
+makima preset create --from-contract <contract-id> --name "my-workflow"
+
+# Validate a preset file
+makima preset validate .makima/presets/my-preset.yaml
+```
+
+#### Under the Hood
+
+When `makima preset run full-pipeline` executes:
+
+```
+1. Parse preset YAML
+2. Substitute variables
+3. Create contract with specified type
+4. Configure phases from preset
+5. Create supervisor task with generated plan
+6. Supervisor executes phases according to preset configuration
+7. Auto-triggers (review, compound) fire at appropriate phase transitions
+```
+
+```
+┌─────────────────────────────────────────────────────────┐
+│                      Preset Engine                      │
+│                                                         │
+│  ┌──────────┐    ┌──────────┐    ┌──────────────────┐   │
+│  │  Parse   │───▶│ Variable │───▶│ Create Contract  │   │
+│  │  YAML    │    │ Subst.   │    │ + Supervisor     │   │
+│  └──────────┘    └──────────┘    └────────┬─────────┘   │
+│                                           │             │
+│                                           ▼             │
+│  ┌───────────────────────────────────────────────────┐  │
+│  │ Phase Orchestration                               │  │
+│  │                                                   │  │
+│  │ research ──▶ plan ──▶ execute                     │  │
+│  │               │          │                        │  │
+│  │          deepen-plan     │                        │  │
+│  │          (if enabled)    ▼                        │  │
+│  │                       review ──▶ compound         │  │
+│  │                       (auto)      (auto)          │  │
+│  └───────────────────────────────────────────────────┘  │
+└─────────────────────────────────────────────────────────┘
+```
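+
+Steps 1 and 2 of the flow above (resolving variables, then substituting them into each `supervisor_plan`) could look roughly like this. It is a sketch under stated assumptions rather than the planned implementation: `VarSpec`, `resolve_vars`, and `substitute` are hypothetical names, only plain `{{ name }}` placeholders are handled, and the YAML is assumed to be parsed already.
+
+```rust
+use std::collections::HashMap;
+
+/// Hypothetical in-memory form of one entry under `variables:`.
+pub struct VarSpec {
+    pub required: bool,
+    pub default: Option<String>,
+}
+
+/// Merge `--var` values with declared defaults.
+/// Fails if a required variable has neither a supplied value nor a default.
+pub fn resolve_vars(
+    specs: &HashMap<String, VarSpec>,
+    supplied: &HashMap<String, String>,
+) -> Result<HashMap<String, String>, String> {
+    let mut resolved = HashMap::new();
+    for (name, spec) in specs {
+        match supplied.get(name).or(spec.default.as_ref()) {
+            Some(value) => {
+                resolved.insert(name.clone(), value.clone());
+            }
+            None if spec.required => {
+                return Err(format!("missing required variable: {name}"));
+            }
+            None => {} // optional and no default: leave unset
+        }
+    }
+    Ok(resolved)
+}
+
+/// Replace every `{{ name }}` placeholder in `template`.
+/// Unknown names and unclosed placeholders are reported as errors.
+pub fn substitute(template: &str, vars: &HashMap<String, String>) -> Result<String, String> {
+    let mut out = String::new();
+    let mut rest = template;
+    while let Some(start) = rest.find("{{") {
+        let after = &rest[start + 2..];
+        let end = after.find("}}").ok_or("unclosed '{{' in template")?;
+        let name = after[..end].trim();
+        let value = vars
+            .get(name)
+            .ok_or_else(|| format!("unknown variable: {name}"))?;
+        out.push_str(&rest[..start]);
+        out.push_str(value);
+        rest = &after[end + 2..];
+    }
+    out.push_str(rest);
+    Ok(out)
+}
+```
+
+Failing fast on unknown or missing variables is one possible answer to the validation-strictness question raised later in this proposal; a more lenient engine could leave unknown placeholders untouched instead.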
+
+### 4. Custom Preset Creation
+
+Presets are discovered at three levels:
+
+| Level | Location | Scope |
+|-------|----------|-------|
+| Built-in | Shipped with makima | All users |
+| Repository | `.makima/presets/` | All users of the repo |
+| User | `~/.makima/presets/` | Single user |
+
+**Precedence**: User > Repository > Built-in. When presets at different levels share a name, the higher-precedence one overrides the others.
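+
+The precedence rule can be sketched as a simple ordered lookup. The names below (`PresetSource`, `resolve_preset`) and the `.yaml` file naming are assumptions made for illustration; the `exists` check is passed in as a closure so the logic can be exercised without touching the filesystem.
+
+```rust
+use std::path::{Path, PathBuf};
+
+/// Where a preset was found (hypothetical type for this sketch).
+#[derive(Debug, PartialEq)]
+pub enum PresetSource {
+    User(PathBuf),
+    Repository(PathBuf),
+    BuiltIn,
+}
+
+/// Resolve `name` using User > Repository > Built-in precedence.
+pub fn resolve_preset(
+    name: &str,
+    home: &Path,
+    repo_root: &Path,
+    built_in: &[&str],
+    exists: impl Fn(&Path) -> bool,
+) -> Option<PresetSource> {
+    let file = format!("{name}.yaml");
+
+    // 1. User level: ~/.makima/presets/<name>.yaml
+    let user = home.join(".makima/presets").join(&file);
+    if exists(&user) {
+        return Some(PresetSource::User(user));
+    }
+
+    // 2. Repository level: <repo>/.makima/presets/<name>.yaml
+    let repo = repo_root.join(".makima/presets").join(&file);
+    if exists(&repo) {
+        return Some(PresetSource::Repository(repo));
+    }
+
+    // 3. Built-in presets shipped with the binary.
+    built_in.contains(&name).then_some(PresetSource::BuiltIn)
+}
+```
+
+In production the closure would typically be `|p| p.exists()`. Under this scheme a repository can shadow the built-in `quick-fix`, and an individual user can shadow both.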
+
+#### Creating from Existing Contract
+
+```bash
+# Analyze a successful contract and generate a preset from it
+makima preset create --from-contract abc-123 --name "my-api-workflow"
+
+# This generates:
+# ~/.makima/presets/my-api-workflow.yaml
+# with phases, timings, and patterns extracted from the contract
+```
+
+---
+
+## Integration with Existing Makima Features
+
+### Contract System
+
+Presets create contracts with the appropriate type:
+```rust
+// Preset specifies contract_type
+let contract = create_contract(CreateContractRequest {
+    name: format!("{} ({})", task_description, preset.name),
+    contract_type: preset.contract_type.clone(), // "simple", "specification", "execute"
+    phase: preset.first_enabled_phase(),
+    autonomous_loop: true,
+    phase_guard: preset.phase_guard,
+    // ...
+});
+```
+
+### Supervisor Plans
+
+The preset generates a comprehensive supervisor plan by combining phase-specific instructions:
+
+```rust
+let supervisor_plan = preset.generate_supervisor_plan(&variables);
+// This produces a plan like:
+// "You are orchestrating a full-pipeline workflow.
+// Phase 1 (Research): ...
+// Phase 2 (Plan): ...
+// ..."
+```
+
+### Directive System Integration
+
+For complex presets, phases can be modeled as directive steps with dependencies:
+
+```rust
+// Each phase becomes a directive step.
+// The explicit type annotation tells `collect()` which collection to build.
+let steps: Vec<DirectiveStep> = preset.phases.iter().map(|phase| {
+    DirectiveStep {
+        name: phase.name.clone(),
+        description: Some(phase.description.clone()),
+        task_plan: Some(phase.supervisor_plan.clone()),
+        depends_on: phase.dependencies(),
+        // ...
+    }
+}).collect();
+```
+
+This allows parallel phases (e.g., independent review agents) to execute concurrently while respecting dependencies.
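+
+One way to derive which steps may run concurrently is to group them into "waves", where each wave depends only on earlier waves. The sketch below assumes steps are given as plain `(name, depends_on)` pairs instead of the real `DirectiveStep` type, and `execution_waves` is a hypothetical helper, not an existing makima function.
+
+```rust
+use std::collections::HashSet;
+
+/// Group steps into waves. Every step in a wave depends only on steps from
+/// earlier waves, so the steps inside one wave can be executed in parallel.
+/// Returns an error for unknown dependencies or dependency cycles.
+pub fn execution_waves(steps: &[(&str, Vec<&str>)]) -> Result<Vec<Vec<String>>, String> {
+    let names: HashSet<&str> = steps.iter().map(|(n, _)| *n).collect();
+    for (name, deps) in steps {
+        if let Some(d) = deps.iter().find(|d| !names.contains(*d)) {
+            return Err(format!("step '{name}' depends on unknown step '{d}'"));
+        }
+    }
+
+    let mut done: HashSet<&str> = HashSet::new();
+    let mut waves: Vec<Vec<String>> = Vec::new();
+    while done.len() < steps.len() {
+        // A step is ready once all of its dependencies have completed.
+        let ready: Vec<&str> = steps
+            .iter()
+            .filter(|(n, deps)| !done.contains(n) && deps.iter().all(|d| done.contains(d)))
+            .map(|(n, _)| *n)
+            .collect();
+        if ready.is_empty() {
+            // Nothing can make progress: the remaining steps form a cycle.
+            return Err("dependency cycle detected".to_string());
+        }
+        done.extend(ready.iter().copied());
+        waves.push(ready.into_iter().map(String::from).collect());
+    }
+    Ok(waves)
+}
+```
+
+For a pipeline where two review steps both depend on `execute`, the two reviews land in the same wave and can be spawned together, while `compound` waits for both.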
+
+### Hooks System
+
+Presets define hooks that trigger at phase transitions:
+
+```yaml
+hooks:
+  on_phase_complete:
+    execute:
+      - run: "makima supervisor spawn 'tests' --plan 'Run test suite'"
+      - wait_for: "tests"
+  on_review_complete:
+    - condition: "findings.p1_count == 0"
+      run: "makima supervisor advance-phase compound -y"
+    - condition: "findings.p1_count > 0"
+      run: "makima supervisor ask 'P1 findings detected. Continue?' --choices 'Fix first,Continue anyway'"
+```
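+
+The `condition` strings above imply a small expression evaluator. A minimal sketch follows, assuming conditions are limited to the form `findings.p1_count <op> <integer>`; the grammar, the function names `eval_condition` and `select_hook`, and the "first match wins" rule are all assumptions, since the proposal does not specify them.
+
+```rust
+/// Evaluate a condition of the form `findings.p1_count <op> <integer>`.
+pub fn eval_condition(condition: &str, p1_count: u32) -> Result<bool, String> {
+    let parts: Vec<&str> = condition.split_whitespace().collect();
+    let [field, op, rhs] = parts.as_slice() else {
+        return Err(format!("expected '<field> <op> <value>', got '{condition}'"));
+    };
+    if *field != "findings.p1_count" {
+        return Err(format!("unsupported field: {field}"));
+    }
+    let rhs: u32 = rhs.parse().map_err(|_| format!("not an integer: {rhs}"))?;
+    match *op {
+        "==" => Ok(p1_count == rhs),
+        "!=" => Ok(p1_count != rhs),
+        ">" => Ok(p1_count > rhs),
+        ">=" => Ok(p1_count >= rhs),
+        "<" => Ok(p1_count < rhs),
+        "<=" => Ok(p1_count <= rhs),
+        other => Err(format!("unsupported operator: {other}")),
+    }
+}
+
+/// Return the `run` command of the first hook whose condition holds.
+/// Each hook is a `(condition, run)` pair.
+pub fn select_hook<'a>(
+    hooks: &[(&str, &'a str)],
+    p1_count: u32,
+) -> Result<Option<&'a str>, String> {
+    for (condition, run) in hooks {
+        if eval_condition(condition, p1_count)? {
+            return Ok(Some(*run));
+        }
+    }
+    Ok(None)
+}
+```
+
+Rejecting unknown fields and operators keeps a typo in a preset from silently skipping a hook; `makima preset validate` would be a natural place to run the same checks ahead of time.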
+
+### Autonomous Loop
+
+Presets work with the existing autonomous loop:
+- Each phase uses `<COMPLETION_GATE>` to signal completion
+- Circuit breaker prevents stuck phases
+- `autonomous_loop: true` on the contract enables automatic continuation
+
+---
+
+## Implementation Plan
+
+### Phase 1: Core Preset Engine (4-5 days)
+
+| Task | Effort | Description |
+|------|--------|-------------|
+| Preset YAML schema definition | 0.5 days | Define YAML format, validation rules |
+| YAML parser with variable substitution | 1 day | Parse presets, substitute `{{ variables }}` |
+| `preset list` command | 0.5 days | Discover and list available presets |
+| `preset run` command | 1.5 days | Create contract + supervisor from preset |
+| `preset preview` command | 0.5 days | Dry-run display |
+| Built-in preset definitions | 1 day | Write 4 default presets |
+
+### Phase 2: Custom Presets (3-5 days)
+
+| Task | Effort | Description |
+|------|--------|-------------|
+| User/repo preset discovery | 1 day | Multi-level preset resolution |
+| `preset create` command | 1.5 days | Generate preset from existing contract |
+| `preset validate` command | 0.5 days | Validate preset YAML |
+| Preset versioning | 1 day | Version field, migration support |
+
+### Phase 3: Integration & Polish (3-5 days)
+
+| Task | Effort | Description |
+|------|--------|-------------|
+| Hooks system | 1.5 days | Phase transition hooks |
+| Auto-trigger integration | 1 day | Wire to review/compound auto-triggers |
+| Directive system integration | 1 day | Complex presets as directive DAGs |
+| Documentation | 0.5 days | User guide, preset authoring guide |
+
+---
+
+## Configuration Examples
+
+### Running a Preset
+
+```bash
+# Simplest usage — one command to run a full pipeline
+makima preset run full-pipeline --var task_description="Add OAuth2 login"
+
+# This creates:
+# - Contract: "Add OAuth2 login (full-pipeline)"
+# - Supervisor task with complete phase orchestration
+# - Auto-review enabled
+# - Auto-compound enabled
+# - All phases configured with deliverables
+```
+
+### Creating a Custom Preset
+
+```yaml
+# .makima/presets/api-feature.yaml
+name: api-feature
+description: "API feature development with schema validation"
+contract_type: specification
+version: 1
+
+variables:
+  feature_name:
+    required: true
+    description: "Name of the API feature"
+  api_version:
+    required: false
+    default: "v1"
+    description: "API version"
+
+phases:
+  research:
+    enabled: true
+    supervisor_plan: |
+      Research existing API patterns in the codebase for {{ api_version }}.
+      Document the current API schema structure.
+      Identify relevant endpoints and data models for {{ feature_name }}.
+
+  plan:
+    enabled: true
+    deepen: true
+    deepen_focus:
+      - api-patterns
+      - security
+      - edge-cases
+    supervisor_plan: |
+      Plan the {{ feature_name }} API feature for {{ api_version }}.
+      Include: endpoint design, request/response schemas, validation rules,
+      error handling, and test cases.
+
+  execute:
+    enabled: true
+    max_concurrent_tasks: 2
+    supervisor_plan: |
+      Implement the {{ feature_name }} API feature.
+      Follow the plan. Create endpoints, handlers, validators, and tests.
+      Run tests after implementation.
+    completion_action: "branch"
+
+  review:
+    enabled: true
+    auto_review: true
+    review_agents:
+      - security-sentinel
+      - api-contract-validator
+      - test-coverage-analyzer
+    merge_blocking_severity: P1
+
+  compound:
+    enabled: true
+    auto_compound: true
+    categories:
+      - api-patterns
+      - security-practices
+```
+
+### Listing Presets
+
+```
+$ makima preset list
+
+BUILT-IN PRESETS
+  full-pipeline    Complete feature development pipeline with review and learning
+  quick-fix        Fast bug fix with minimal ceremony
+  refactor         Systematic refactoring with safety checks
+  investigation    Research-focused workflow for analysis and documentation
+
+REPOSITORY PRESETS (.makima/presets/)
+  api-feature      API feature development with schema validation
+  migration        Database migration with rollback plan
+
+USER PRESETS (~/.makima/presets/)
+  my-workflow      Custom workflow for frontend development
+```
+
+---
+
+## Open Questions
+
+1. **Preset inheritance**: Should presets be able to extend other presets? (e.g., `extends: full-pipeline` with overrides)
+2. **Conditional phases**: Should phases be conditionally enabled based on runtime conditions? (e.g., skip review for changes under 50 lines)
+3. **Preset parameter validation**: How strict should variable validation be? Should presets accept arbitrary variables or enforce a declared schema?
+4. **Preset sharing**: Should presets be sharable via a registry or marketplace?
+5. **Preset analytics**: Should we track which presets are most used and their success rates?
+6. **Rollback**: If a preset-driven workflow fails mid-phase, how should recovery work?
+7. **Interactive mode**: Should presets support interactive steps where the user provides input mid-pipeline?
+
+---
+
+## Alternatives Considered
+
+| Alternative | Pros | Cons | Decision |
+|-------------|------|------|----------|
+| Hardcoded pipelines | Simple, predictable | Not customizable; one-size-fits-all | Rejected — need flexibility |
+| Pure CLI scripting | Maximum flexibility | Not portable; error-prone; no validation | Rejected — too fragile |
+| GUI workflow builder | Visual, intuitive | High development cost; not scriptable | Deferred — consider for UI |
+| Contract type expansion | Minimal new concepts | Doesn't solve orchestration; just adds phase combos | Partial — presets use contract types |
+| Makefile-style approach | Familiar to developers | Wrong abstraction level; no variable substitution | Rejected — YAML is better fit |
+
+---
+
+## Priority & Complexity Assessment
+
+- **Priority: HIGH** — Workflow presets are the **gateway feature** that makes all other features accessible. Without presets, users must manually orchestrate review, deepening, and compounding. With presets, these features are activated with a single command.
+- **Complexity: MEDIUM** — YAML parsing and variable substitution are straightforward. Hooks system and directive integration add complexity. Main challenge is designing a preset schema that's flexible enough for diverse workflows without being overwhelming.
+- **Risk: LOW** — Presets are purely additive. They don't change existing behavior. Users can always fall back to manual orchestration.