Add compound engineering feature proposals for makima (#58)

Analyze the compound engineering plugin (https://github.com/EveryInc/compound-engineering-plugin) and propose 6 features inspired by its patterns for adoption into makima:

- Multi-agent parallel review system (spawn-group/wait-group)
- Knowledge accumulation / compound learning phase
- Parallel plan deepening with research agents
- Workflow presets / pipeline templates (LFG-style one-command pipelines)
- Structured findings tracking with severity and lifecycle
- Reusable task templates with meta-commands

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
author: soryu <soryu@soryu.co> 2026-02-09 16:51:59 +0000
committer: GitHub <noreply@github.com> 2026-02-09 16:51:59 +0000
commit: 76bb9da745f6c12c8e7e587a9096677bbf98f395 (patch)
tree: 5bd856d1018c6fab4700b625e5ffefb344200bf4 /docs
parent: 268cdce19b1e17128cb8806bee7e0ead1afaa95b (diff)
7 files changed, 3399 insertions, 0 deletions
diff --git a/docs/proposals/compound-engineering-analysis.md b/docs/proposals/compound-engineering-analysis.md
new file mode 100644
index 0000000..5a8c6da
--- /dev/null
+++ b/docs/proposals/compound-engineering-analysis.md
@@ -0,0 +1,300 @@
+# Compound Engineering Plugin — Analysis & Makima Feature Mapping
+
+> **Document Type:** Overview Analysis
+> **Status:** Proposal
+> **Date:** 2026-02-09
+> **Related Proposals:** [Multi-Agent Review](feature-multi-agent-review.md) · [Knowledge Accumulation](feature-knowledge-accumulation.md) · [Plan Deepening](feature-plan-deepening.md) · [Workflow Presets](feature-workflow-presets.md) · [Findings Tracking](feature-findings-tracking.md) · [Task Templates](feature-task-templates.md)
+
+---
+
+## Executive Summary
+
+The [Compound Engineering Plugin](https://github.com/EveryInc/compound-engineering-plugin) is a Claude Code plugin comprising **29 agents, 25 commands, 16 skills, and 1 MCP server**. Its core innovation is a self-reinforcing engineering loop where every unit of work makes subsequent work easier—not harder.
+
+This document analyzes the plugin's architecture, maps its capabilities against makima's existing features, identifies gaps, and proposes a phased adoption strategy. The compound engineering plugin excels at **within-session orchestration** (parallel review agents, plan deepening, knowledge capture), while makima excels at **cross-session orchestration** (contract lifecycle, worktree isolation, DAG-based directives). The two are complementary: adopting the plugin's patterns would add within-session depth on top of makima's cross-session persistence.
+
+---
+
+## Core Philosophy
+
+> *"Each unit of engineering work should make subsequent units easier—not harder."*
+
+The plugin operationalizes this through a four-phase feedback loop where the critical **Compound** step captures learnings that feed back into future planning:
+
+```
+┌─────────────────────────────────────────────────────────┐
+│                                                         │
+│   ┌──────────┐    ┌──────────┐    ┌──────────┐         │
+│   │          │    │          │    │          │         │
+│   │   PLAN   │───▶│   WORK   │───▶│  REVIEW  │         │
+│   │          │    │          │    │          │         │
+│   └──────────┘    └──────────┘    └──────────┘         │
+│        ▲                               │                │
+│        │          ┌──────────┐         │                │
+│        │          │          │         │                │
+│        └──────────│ COMPOUND │◀────────┘                │
+│                   │          │                          │
+│   Learnings fed   └──────────┘   Captures solutions,   │
+│   back into         │           patterns, failures      │
+│   future plans      ▼                                   │
+│              docs/solutions/                            │
+│              ├── build-errors/                          │
+│              ├── test-failures/                         │
+│              ├── api-patterns/                          │
+│              └── ...9 categories                        │
+│                                                         │
+└─────────────────────────────────────────────────────────┘
+```
+
+The Plan, Work, and Review steps map onto makima's existing contract phases, **Research → Specify → Plan → Execute → Review**; the proposal adds a new **Compound** phase after Review.
+
+---
+
+## Plugin Architecture Overview
+
+### Agent Categories (29 Total)
+
+| Category | Count | Examples |
+|----------|-------|---------|
+| Review Agents | 12-15 | Security Sentinel, Performance Oracle, Architecture Strategist, Code Philosopher, Data Integrity Guardian, Error Resilience Analyzer, API Contract Validator, Dependency Health Checker, Test Coverage Analyzer, Documentation Completeness, Concurrency Safety |
+| Research Agents | 20-40 | Best practices, edge case analysis, dependency research, pattern matching |
+| Learning Agents | 5 | Context extractor, solution documenter, prevention strategist, categorizer, doc linker |
+| Pipeline Agents | ~5 | LFG orchestrator, SLFG parallelizer, phase coordinators |
+| Meta Agents | 2-3 | Agent creator, skill healer, template generator |
+
+### Command Categories (25 Total)
+
+| Category | Key Commands | Description |
+|----------|-------------|-------------|
+| Planning | `/plan`, `/deepen-plan` | Create and enhance implementation plans |
+| Execution | `/lfg`, `/slfg` | Full autonomous pipelines (serial/parallel) |
+| Review | `/parallel-review`, `/review` | Multi-agent code review |
+| Learning | `/compound`, `/search-learnings` | Capture and retrieve knowledge |
+| Meta | `/create-agent-skill`, `/heal-skill` | Self-improving tooling |
+| Findings | `/create-todo`, `/resolve-todo` | Structured issue tracking |
+
+### Skill Categories (16 Total)
+
+Skills provide specialized capabilities including code analysis, pattern detection, security scanning, performance profiling, and documentation generation.
+
+### MCP Server (1)
+
+Provides tool access for agents to interact with the file system, git, and external services during parallel execution.
+
+---
+
+## Agent-Native Architecture Concepts
+
+The compound engineering plugin embraces an **agent-native** design philosophy:
+
+1. **Parallel-First**: Tasks that can be parallelized are always parallelized (review agents, research agents, learning sub-agents)
+2. **Structured Output**: All agent outputs use YAML frontmatter + markdown, enabling machine parsing
+3. **Swarm Orchestration**: Groups of agents with synchronization gates (spawn N → wait for all → synthesize)
+4. **Self-Healing**: Meta-commands detect broken skills and auto-repair them
+5. **Progressive Enhancement**: Plans start simple, then are "deepened" with research results
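+
+The spawn N → wait for all → synthesize gate from item 3 can be sketched with plain threads. This is a minimal illustration using only the Rust standard library, not the plugin's or makima's actual mechanism: `Review`, `run_reviewer`, and `review_in_parallel` are hypothetical names, and the canned reviewer logic stands in for real agents.
+
+```rust
+use std::thread;
+
+/// Hypothetical reviewer output: the agent name plus the issues it reported.
+struct Review {
+    agent: &'static str,
+    issues: Vec<String>,
+}
+
+/// Stand-in for a real review agent; returns canned issues for illustration.
+fn run_reviewer(agent: &'static str, diff: &str) -> Review {
+    let issues = match agent {
+        "security" if diff.contains("format!(\"SELECT") => {
+            vec!["possible SQL injection".to_string()]
+        }
+        "performance" if diff.contains(".clone()") => {
+            vec!["avoidable clone".to_string()]
+        }
+        _ => Vec::new(),
+    };
+    Review { agent, issues }
+}
+
+/// Spawn one thread per reviewer, wait for all of them, then synthesize.
+fn review_in_parallel(diff: &str, agents: &[&'static str]) -> Vec<String> {
+    // Spawn N: each reviewer thread gets its own copy of the diff.
+    let handles: Vec<_> = agents
+        .iter()
+        .map(|&agent| {
+            let diff = diff.to_string();
+            thread::spawn(move || run_reviewer(agent, &diff))
+        })
+        .collect();
+
+    // Wait for all: joining every handle is the synchronization gate.
+    let reviews: Vec<Review> = handles
+        .into_iter()
+        .map(|h| h.join().expect("reviewer thread panicked"))
+        .collect();
+
+    // Synthesize: flatten the per-agent results into one tagged report.
+    let report: Vec<String> = reviews
+        .iter()
+        .flat_map(|r| r.issues.iter().map(move |i| format!("[{}] {}", r.agent, i)))
+        .collect();
+    report
+}
+```
+
+Joining the handles in spawn order keeps the report deterministic even though the reviewers finish in arbitrary order.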
+
+---
+
+## Mapping to Makima's Architecture
+
+### What Makima Already Has
+
+| Compound Engineering Feature | Makima Equivalent | Coverage |
+|------------------------------|-------------------|----------|
+| Plan → Work → Review loop | Contract phases (Research → Specify → Plan → Execute → Review) | ✅ Full |
+| Task orchestration | Supervisor/worker hierarchy with `spawn-task` | ✅ Full |
+| Parallel task execution | Multiple workers in separate worktrees | ✅ Full |
+| Task isolation | Git worktree per task | ✅ Full |
+| Phase transitions | `supervisor advance-phase` with phase guards | ✅ Full |
+| Pipeline orchestration | Directive system with DAG dependencies | ✅ Full |
+| User interaction during execution | `supervisor ask` with timeout/choices | ✅ Full |
+| Task continuation | `continue_from_task_id`, `--continue` flag | ✅ Full |
+| Branching/forking | `supervisor branch`, `task-fork`, `task-rewind` | ✅ Full |
+| Circuit breakers | CircuitBreaker (max iterations, stuck detection) | ✅ Full |
+| Completion gates | `<COMPLETION_GATE>` parsing in autonomous loop | ✅ Full |
+| Document management | Contract files with versioning, structured body | ✅ Full |
+
+### What Makima Is Missing (Gaps)
+
+| Compound Engineering Feature | Makima Gap | Priority | Proposal |
+|------------------------------|-----------|----------|----------|
+| Multi-agent parallel review | No automated review, no review task templates | **High** | [feature-multi-agent-review.md](feature-multi-agent-review.md) |
+| Compound learning / knowledge accumulation | No cross-contract knowledge capture | **High** | [feature-knowledge-accumulation.md](feature-knowledge-accumulation.md) |
+| Plan deepening with research agents | Single-pass planning, no research integration | **Medium** | [feature-plan-deepening.md](feature-plan-deepening.md) |
+| One-command pipelines (LFG/SLFG) | Manual orchestration per contract | **High** | [feature-workflow-presets.md](feature-workflow-presets.md) |
+| Structured findings/TODOs | Unstructured review output | **Medium** | [feature-findings-tracking.md](feature-findings-tracking.md) |
+| Reusable task/agent templates | Ad-hoc plans, no template reuse | **Medium** | [feature-task-templates.md](feature-task-templates.md) |
+
+---
+
+## Feature Set Summary
+
+| # | Feature | Priority | Complexity | Effort | Proposal |
+|---|---------|----------|------------|--------|----------|
+| 1 | Multi-Agent Parallel Review | High | Medium | 12-18 days | [Link](feature-multi-agent-review.md) |
+| 2 | Knowledge Accumulation | High | Medium | 10-15 days | [Link](feature-knowledge-accumulation.md) |
+| 3 | Plan Deepening | Medium | Low | 5-8 days | [Link](feature-plan-deepening.md) |
+| 4 | Workflow Presets | High | Medium | 10-15 days | [Link](feature-workflow-presets.md) |
+| 5 | Findings Tracking | Medium | Low | 7-10 days | [Link](feature-findings-tracking.md) |
+| 6 | Task Templates | Medium | Medium | 8-12 days | [Link](feature-task-templates.md) |
+| | **Total** | | | **52-78 days** | |
+
+---
+
+## Implementation Strategy
+
+### Recommended Phasing
+
+```
+Phase 1: Foundations (Weeks 1-4)
+├── Workflow Presets ────────── Enables one-command pipelines
+└── Findings Tracking ──────── Structured review output format
+
+Phase 2: Core Loop (Weeks 5-9)
+├── Multi-Agent Review ──────── Automated parallel review
+└── Knowledge Accumulation ──── Cross-contract learning
+
+Phase 3: Enhancement (Weeks 10-13)
+├── Plan Deepening ──────────── Research-enhanced planning
+└── Task Templates ──────────── Reusable patterns
+```
+
+**Rationale for ordering:**
+
+1. **Phase 1** builds infrastructure that Phase 2 depends on:
+   - Workflow Presets provide the pipeline framework that Review and Learning plug into
+   - Findings Tracking provides the structured output format that Review agents produce
+
+2. **Phase 2** implements the core compound loop:
+   - Multi-Agent Review produces structured findings
+   - Knowledge Accumulation closes the feedback loop
+
+3. **Phase 3** optimizes the system:
+   - Plan Deepening uses the knowledge base to enhance plans
+   - Task Templates codify proven patterns for reuse
+
+### Integration Points Between Features
+
+```
+                    ┌─────────────────┐
+                    │ Workflow Presets │
+                    │  (orchestrator)  │
+                    └────────┬────────┘
+                             │ triggers phases
+              ┌──────────────┼──────────────┐
+              ▼              ▼              ▼
+     ┌────────────┐  ┌──────────────┐  ┌───────────┐
+     │   Plan     │  │  Multi-Agent │  │ Knowledge │
+     │ Deepening  │  │   Review     │  │ Accum.    │
+     └─────┬──────┘  └──────┬───────┘  └─────┬─────┘
+           │                │                 │
+           │         produces │                │
+           │                ▼                 │
+           │        ┌──────────────┐          │
+           │        │  Findings    │          │
+           │        │  Tracking    │          │
+           │        └──────────────┘          │
+           │                                  │
+           └──────── feeds into ──────────────┘
+                         │
+                    ┌────┴─────┐
+                    │   Task   │
+                    │ Templates│
+                    └──────────┘
+                  codifies patterns
+```
+
+---
+
+## Competitive Analysis
+
+### Compound Engineering Plugin Strengths
+
+| Strength | Detail |
+|----------|--------|
+| **Depth of review** | 12-15 specialized reviewers catch issues a single reviewer misses |
+| **Knowledge compounding** | Learnings are never lost; they compound over time |
+| **One-command pipelines** | `/lfg` runs full plan→work→review→compound cycle |
+| **Self-improvement** | Meta-commands create new agents/skills on demand |
+| **Swarm patterns** | Sophisticated parallel group management |
+
+### Makima Strengths
+
+| Strength | Detail |
+|----------|--------|
+| **True isolation** | Git worktrees provide real filesystem isolation, not just context isolation |
+| **Persistent orchestration** | Contracts survive across sessions; plugin agents are ephemeral |
+| **DAG execution** | Directives model complex dependency graphs natively |
+| **User interaction** | Rich question/answer system with timeouts and multi-select |
+| **Infrastructure** | Server-based architecture with WebSocket real-time communication |
+| **Checkpoint/recovery** | Full task rewind, fork, and patch-based recovery |
+| **Phase governance** | Phase guards require explicit user approval for transitions |
+
+### Combined Value Proposition
+
+| Dimension | Plugin Alone | Makima Alone | Combined |
+|-----------|-------------|-------------|----------|
+| Review quality | ⭐⭐⭐⭐⭐ | ⭐⭐ | ⭐⭐⭐⭐⭐ |
+| Task isolation | ⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
+| Knowledge retention | ⭐⭐⭐⭐ | ⭐ | ⭐⭐⭐⭐⭐ |
+| Persistent orchestration | ⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
+| Pipeline automation | ⭐⭐⭐⭐ | ⭐⭐ | ⭐⭐⭐⭐⭐ |
+| Self-improvement | ⭐⭐⭐⭐ | ⭐ | ⭐⭐⭐⭐ |
+
+---
+
+## Risk Analysis
+
+### Technical Risks
+
+| Risk | Impact | Likelihood | Mitigation |
+|------|--------|------------|------------|
+| Parallel review agents overwhelm system resources | High | Medium | Implement concurrency limits; use makima's existing CircuitBreaker |
+| Knowledge base grows unwieldy | Medium | High | Implement relevance decay, deduplication, and quality gates |
+| Workflow presets too rigid for diverse use cases | Medium | Medium | Support variable substitution and optional steps |
+| Review synthesis produces noisy/contradictory results | Medium | Medium | Weighted deduplication with priority-based conflict resolution |
+| Template proliferation creates maintenance burden | Low | Medium | Template versioning and deprecation lifecycle |
+
+### Organizational Risks
+
+| Risk | Impact | Likelihood | Mitigation |
+|------|--------|------------|------------|
+| Scope creep across all 6 features | High | High | Strict phasing; each feature is independently shippable |
+| Users don't adopt knowledge accumulation habits | Medium | Medium | Make it automatic (not opt-in); integrate with workflow presets |
+| Configuration complexity deters users | Medium | Medium | Sensible defaults; progressive disclosure of configuration |
+
+---
+
+## Success Metrics
+
+### Per-Feature Metrics
+
+| Feature | Key Metric | Target |
+|---------|-----------|--------|
+| Multi-Agent Review | Defects caught before merge | 40% increase vs single review |
+| Knowledge Accumulation | Knowledge reuse rate | >30% of new contracts reference existing learnings |
+| Plan Deepening | Plan revision rate after execution starts | <15% (down from estimated ~40%) |
+| Workflow Presets | Time from contract creation to first commit | 50% reduction |
+| Findings Tracking | Finding resolution rate | >85% of P1/P2 findings resolved |
+| Task Templates | Template reuse rate | >25% of tasks use templates after 3 months |
+
+### System-Level Metrics
+
+- **Cycle time**: Time from contract creation to completion — target 30% reduction
+- **Defect escape rate**: Issues found post-merge — target 50% reduction
+- **Knowledge density**: Learnings per contract — target >2.5 after 6 months
+- **User satisfaction**: Survey score — target >4.2/5.0
+
+---
+
+## Conclusion
+
+The compound engineering plugin represents a mature implementation of agent-native engineering workflows. Its greatest innovations—parallel multi-perspective review, knowledge compounding, and autonomous pipelines—address real gaps in makima's current capabilities.
+
+Makima's infrastructure advantages (true worktree isolation, persistent contracts, DAG-based directives, server architecture) provide a superior foundation for implementing these features. The proposed phased approach delivers incremental value while building toward the full compound engineering loop.
+
+The combined system would offer something neither tool provides alone: **persistent, isolated, knowledge-compounding engineering workflows with multi-agent review and one-command pipeline automation**.
+
+---
+
+*Next steps: Review individual feature proposals for detailed implementation plans.*
diff --git a/docs/proposals/feature-findings-tracking.md b/docs/proposals/feature-findings-tracking.md
new file mode 100644
index 0000000..bb8a68e
--- /dev/null
+++ b/docs/proposals/feature-findings-tracking.md
@@ -0,0 +1,504 @@
+# Feature Proposal: Structured Findings / Issues Tracking
+
+> **Priority:** Medium
+> **Complexity:** Low
+> **Estimated Effort:** 7-10 days
+> **Status:** Proposal
+> **Date:** 2026-02-09
+> **Dependencies:** None (standalone, but enhances [Multi-Agent Review](feature-multi-agent-review.md))
+> **Related:** [Overview Analysis](compound-engineering-analysis.md) · [Multi-Agent Review](feature-multi-agent-review.md) · [Workflow Presets](feature-workflow-presets.md)
+
+---
+
+## Problem Statement
+
+Currently, review outputs in makima are **unstructured text** in task conversation history:
+
+- **No standard format** for reporting issues found during review
+- **No severity classification** — all findings are treated equally
+- **No lifecycle tracking** — findings are either "in the review output" or "hopefully fixed"
+- **No verification** — there's no way to confirm a finding was actually resolved
+- **No aggregation** — findings from multiple review tasks can't be collected and deduplicated
+- **No blocking mechanism** — critical findings can't prevent phase transitions
+- **No metrics** — no data on how many findings are produced, resolved, or escaped
+
+This makes the review phase a documentation exercise rather than a quality gate.
+
+---
+
+## How Compound Engineering Solves This
+
+The compound engineering plugin uses **structured TODO/finding files** with YAML frontmatter and a defined lifecycle:
+
+### File Format
+
+````markdown
+---
+id: SEC-001
+status: open
+priority: P1
+category: security
+title: SQL injection in user search endpoint
+file: src/api/users.rs
+line: 47
+agent: security-sentinel
+created: 2026-02-09T10:30:00Z
+updated: 2026-02-09T10:30:00Z
+tags: [injection, input-validation, database]
+---
+
+# SQL Injection in User Search Endpoint
+
+## Finding
+The `search_users` handler directly interpolates the `query` parameter into
+a SQL string without parameterization.
+
+## Evidence
+```rust
+// src/api/users.rs:47
+let sql = format!("SELECT * FROM users WHERE name LIKE '%{}%'", query);
+```
+
+## Impact
+An attacker can execute arbitrary SQL queries, potentially:
+- Exfiltrating all user data
+- Modifying or deleting records
+- Escalating privileges
+
+## Recommendation
+Use parameterized queries:
+```rust
+let results = sqlx::query("SELECT * FROM users WHERE name LIKE $1")
+    .bind(format!("%{}%", query))
+    .fetch_all(&pool)
+    .await?;
+```
+
+## Resolution
+_Not yet resolved_
+````
+
+### File Naming Convention
+
+```
+findings/{issue_id}-{status}-{priority}-{description}.md
+```
+
+Example: `findings/SEC-001-open-P1-sql-injection-user-search.md`
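+
+As a rough sketch, the naming convention above can be produced by a small helper. The function names `slugify` and `finding_file_name` are hypothetical, and the slug rules (lowercase ASCII, runs of other characters collapsed to a single hyphen) are an assumption inferred from the one example shown.
+
+```rust
+/// Turn a free-form description into a lowercase, hyphen-separated slug.
+/// Assumed rule: ASCII letters and digits are kept, everything else
+/// collapses into a single hyphen.
+fn slugify(description: &str) -> String {
+    let mut slug = String::new();
+    for c in description.chars() {
+        if c.is_ascii_alphanumeric() {
+            slug.push(c.to_ascii_lowercase());
+        } else if !slug.is_empty() && !slug.ends_with('-') {
+            slug.push('-');
+        }
+    }
+    slug.trim_end_matches('-').to_string()
+}
+
+/// Build `findings/{issue_id}-{status}-{priority}-{description}.md`.
+fn finding_file_name(id: &str, status: &str, priority: &str, description: &str) -> String {
+    format!("findings/{}-{}-{}-{}.md", id, status, priority, slugify(description))
+}
+```
+
+Because the status is part of the file name, a status change presumably means renaming the file, which is one reason the makima design below keeps status in metadata instead.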
+
+### Lifecycle
+
+```
+open ──▶ in-progress ──▶ resolved ──▶ verified
+  │                         │
+  └── wont-fix ◀────────────┘
+```
+
+---
+
+## Proposed Makima Implementation
+
+### 1. Finding Record Format
+
+Findings are stored as **contract files** with structured metadata and body:
+
+```rust
+// Assumes the `serde` feature is enabled for both `chrono` and `uuid`.
+use chrono::{DateTime, Utc};
+use serde::{Deserialize, Serialize};
+use uuid::Uuid;
+
+// Finding metadata (stored in file description as structured JSON)
+#[derive(Debug, Clone, Serialize, Deserialize)]
+pub struct FindingMetadata {
+    pub id: String,                    // "SEC-001", auto-generated
+    pub status: FindingStatus,         // open, in_progress, resolved, verified, wont_fix
+    pub severity: FindingSeverity,     // P1 (critical), P2 (major), P3 (minor)
+    pub category: String,              // security, performance, architecture, etc.
+    pub title: String,                 // Short description
+    pub file_path: Option<String>,     // Affected file
+    pub line_number: Option<u32>,      // Affected line
+    pub source_agent: Option<String>,  // Which review agent found this
+    pub source_task_id: Option<Uuid>,  // Task that produced this finding
+    pub assigned_to: Option<Uuid>,     // Task assigned to resolve this
+    pub created_at: DateTime<Utc>,
+    pub updated_at: DateTime<Utc>,
+    pub resolved_at: Option<DateTime<Utc>>,
+    pub verified_at: Option<DateTime<Utc>>,
+    pub tags: Vec<String>,
+}
+
+// Serialized in snake_case to match the status names listed above
+#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize)]
+#[serde(rename_all = "snake_case")]
+pub enum FindingStatus {
+    Open,
+    InProgress,
+    Resolved,
+    Verified,
+    WontFix,
+}
+
+#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize)]
+pub enum FindingSeverity {
+    P1,  // Critical — must fix before merge
+    P2,  // Major — should fix, can defer with justification
+    P3,  // Minor — nice to fix, can defer
+}
+```
+
+### 2. Supervisor Commands
+
+#### Create a Finding
+
+```bash
+# Create a finding from review output
+makima supervisor finding create \
+  --severity P1 \
+  --category security \
+  --title "SQL injection in user search endpoint" \
+  --file src/api/users.rs \
+  --line 47 \
+  --description "Direct string interpolation in SQL query"
+
+# Output: Created finding SEC-001 (P1/security)
+```
+
+#### List Findings
+
+```bash
+# List all findings for the current contract
+makima supervisor finding list
+# Output:
+# ID       SEVERITY  STATUS       CATEGORY      TITLE
+# SEC-001  P1        open         security      SQL injection in user search
+# PERF-001 P2        in-progress  performance   N+1 query in order listing
+# ARCH-001 P3        resolved     architecture  Handler accessing DB directly
+
+# Filter by severity
+makima supervisor finding list --severity P1
+
+# Filter by status
+makima supervisor finding list --status open
+
+# Summary only
+makima supervisor finding summary
+# Output:
+# Total: 12 findings
+# P1: 2 open, 1 resolved
+# P2: 3 open, 2 in-progress
+# P3: 4 resolved
+```
+
+#### Update Finding Status
+
+```bash
+# Mark as in-progress (assigned to a task)
+makima supervisor finding update SEC-001 --status in-progress --assigned-to <task-id>
+
+# Mark as resolved
+makima supervisor finding update SEC-001 --status resolved \
+  --resolution "Replaced with parameterized query in commit abc123"
+
+# Mark as verified (after re-review)
+makima supervisor finding update SEC-001 --status verified
+
+# Mark as won't fix
+makima supervisor finding update SEC-001 --status wont-fix \
+  --justification "Endpoint is internal-only, behind auth"
+```
+
+#### Auto-Create from Review Output
+
+```bash
+# Parse review agent output and create findings automatically
+makima supervisor finding parse-output --task-id <review-task-id>
+```
+
+This parses structured review output and creates individual finding records.
+
+### 3. Finding Lifecycle
+
+```
+┌────────────────────────────────────────────────────────────┐
+│                    Finding Lifecycle                        │
+│                                                            │
+│  ┌──────┐    ┌─────────────┐    ┌──────────┐              │
+│  │      │    │             │    │          │              │
+│  │ OPEN │───▶│ IN-PROGRESS │───▶│ RESOLVED │              │
+│  │      │    │             │    │          │              │
+│  └──┬───┘    └─────────────┘    └────┬─────┘              │
+│     │                                │                     │
+│     │        ┌─────────────┐    ┌────┴─────┐              │
+│     │        │             │    │          │              │
+│     └───────▶│  WONT-FIX   │    │ VERIFIED │              │
+│              │             │    │          │              │
+│              └─────────────┘    └──────────┘              │
+│                                                            │
+│  Triggers:                                                 │
+│  open ─▶ in_progress : Task assigned to fix                │
+│  in_progress ─▶ resolved : Fix committed                   │
+│  resolved ─▶ verified : Re-review confirms fix             │
+│  open ─▶ wont_fix : Explicit decision with justification   │
+│  resolved ─▶ wont_fix : Fix deemed unnecessary after review│
+└────────────────────────────────────────────────────────────┘
+```
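+
+The five triggers above amount to a small transition table. A minimal sketch of validating a status change against it follows; the enum mirrors `FindingStatus` from section 1, while `can_transition` and `transition` are hypothetical helper names, not existing makima functions.
+
+```rust
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+enum FindingStatus {
+    Open,
+    InProgress,
+    Resolved,
+    Verified,
+    WontFix,
+}
+
+/// Returns true only for the five transitions listed under "Triggers".
+fn can_transition(from: FindingStatus, to: FindingStatus) -> bool {
+    use FindingStatus::*;
+    matches!(
+        (from, to),
+        (Open, InProgress)
+            | (InProgress, Resolved)
+            | (Resolved, Verified)
+            | (Open, WontFix)
+            | (Resolved, WontFix)
+    )
+}
+
+/// Apply a transition, rejecting anything the lifecycle does not allow.
+fn transition(from: FindingStatus, to: FindingStatus) -> Result<FindingStatus, String> {
+    if can_transition(from, to) {
+        Ok(to)
+    } else {
+        Err(format!("invalid transition: {:?} -> {:?}", from, to))
+    }
+}
+```
+
+Under this table `Verified` and `WontFix` are terminal, so reopening a finding would need an additional rule that this proposal does not define.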
+
+### 4. P1/P2/P3 Severity System
+
+| Severity | Name | Description | Merge Policy |
+|----------|------|-------------|--------------|
+| **P1** | Critical | Security vulnerabilities, data loss risks, crash bugs | **Blocks merge** — must be resolved before contract completion |
+| **P2** | Major | Performance issues, architectural concerns, significant tech debt | **Should fix** — can defer with explicit justification |
+| **P3** | Minor | Style issues, minor improvements, documentation gaps | **Nice to fix** — can defer freely |
+
+### 5. Merge Blocking
+
+When findings exist, phase transitions and merge operations check for blockers:
+
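+```rust
+// In advance-phase handler
+async fn check_findings_gate(contract_id: Uuid) -> Result<bool> {
+    let findings = get_findings(contract_id).await?;
+    // A P1 must be resolved before the contract can proceed, so findings
+    // that are still in progress block the transition just like open ones.
+    let blocking_p1s = findings.iter()
+        .filter(|f| {
+            matches!(f.severity, FindingSeverity::P1)
+                && matches!(f.status, FindingStatus::Open | FindingStatus::InProgress)
+        })
+        .count();
+
+    if blocking_p1s > 0 {
+        warn!("{} unresolved P1 findings block phase transition", blocking_p1s);
+        return Ok(false);
+    }
+    Ok(true)
+}
+```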
+
+### 6. Auto-Resolution Workflow
+
+When the Multi-Agent Review feature is available, findings drive an automated resolution cycle:
+
+```
+┌──────────┐     ┌───────────┐     ┌──────────┐     ┌──────────┐
+│  Review  │────▶│ Findings  │────▶│ Resolve  │────▶│ Verify   │
+│  Phase   │     │ Created   │     │ Tasks    │     │ Fixes    │
+│          │     │ (P1/P2/P3)│     │ Spawned  │     │ Pass?    │
+└──────────┘     └───────────┘     └──────────┘     └────┬─────┘
+                                                         │
+                                                    Yes  │  No
+                                                    ┌────┴────┐
+                                                    ▼         ▼
+                                              ┌──────────┐  Loop back
+                                              │ Findings │  to resolve
+                                              │ Verified │
+                                              └──────────┘
+```
+
+```bash
+# Auto-resolve: spawn tasks to fix each P1/P2 finding
+makima supervisor finding auto-resolve --severity P1,P2
+
+# This spawns one task per finding:
+# - Task plan includes the finding details and recommendation
+# - Task is assigned to the finding (finding.assigned_to = task.id)
+# - When task completes, finding status → resolved
+# - Verification task confirms the fix
+```
+
+---
+
+## Integration with Existing Makima Features
+
+### Contract Files
+
+Each finding is stored as a **contract file**:
+
+```rust
+File {
+    contract_id: Some(contract.id),
+    contract_phase: Some("review"),
+    name: "Finding: SEC-001 — SQL injection in user search",
+    description: Some(serde_json::to_string(&finding_metadata)?),
+    body: vec![
+        BodyElement::Heading { level: 1, text: finding.title },
+        BodyElement::Heading { level: 2, text: "Finding" },
+        BodyElement::Paragraph { text: finding.description },
+        BodyElement::Heading { level: 2, text: "Evidence" },
+        BodyElement::Code { language: Some("rust"), content: finding.evidence },
+        BodyElement::Heading { level: 2, text: "Recommendation" },
+        BodyElement::Paragraph { text: finding.recommendation },
+    ],
+}
+```
+
+### Phase Guards
+
+Findings integrate with existing phase guards:
+- Phase guard checks finding gate before allowing transition
+- User sees a summary of open findings when reviewing phase transition
+- P1 findings produce a warning that requires explicit override
+
+### Supervisor Questions
+
+When P1 findings block a transition, the supervisor can ask:
+
+```bash
+makima supervisor ask \
+  "2 P1 findings are still open. How would you like to proceed?" \
+  --choices "Fix findings first,Override and continue,Mark as won't-fix" \
+  --context "SEC-001: SQL injection (P1), PERF-001: Memory leak (P1)"
+```
+
+### Task Assignment
+
+Findings reference tasks:
+- `source_task_id`: The review task that discovered the finding
+- `assigned_to`: The task spawned to resolve the finding
+
+```bash
+# Spawn a fix task and assign the finding
+makima supervisor spawn "fix-sec-001" \
+  --plan "Fix SQL injection vulnerability in src/api/users.rs:47. Use parameterized queries."
+
+makima supervisor finding update SEC-001 \
+  --status in-progress \
+  --assigned-to <spawned-task-id>
+```
+
+### Autonomous Loop
+
+The autonomous loop can use findings as a completion gate condition:
+
+```xml
+<COMPLETION_GATE>
+ready: false
+reason: "2 P1 findings still open"
+progress: "Resolved 5/7 findings"
+blockers: ["SEC-001: SQL injection", "PERF-001: Memory leak"]
+</COMPLETION_GATE>
+```
+
+---
+
+## Implementation Plan
+
+### Phase 1: Core Finding System (3-4 days)
+
+| Task | Effort | Description |
+|------|--------|-------------|
+| Finding metadata schema | 0.5 days | FindingMetadata struct, validation |
+| `finding create` command | 1 day | Create finding as contract file |
+| `finding list/summary` commands | 0.5 days | Query and display findings |
+| `finding update` command | 0.5 days | Status transitions, validation |
+| Auto-ID generation | 0.5 days | Category-based IDs (SEC-001, PERF-002) |
+
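+The auto-ID generation task could look roughly like the sketch below, which keeps one counter per category prefix. `FindingIdAllocator` is a hypothetical type. Only the `SEC`, `PERF`, and `ARCH` prefixes appear in this proposal; the fallback for other categories and the per-contract scope of the counters are assumptions.
+
+```rust
+use std::collections::HashMap;
+
+/// Hypothetical ID allocator with one counter per category prefix.
+#[derive(Default)]
+struct FindingIdAllocator {
+    next: HashMap<String, u32>,
+}
+
+impl FindingIdAllocator {
+    /// Map a category to its ID prefix. The fallback (first four
+    /// alphanumeric characters, uppercased) is an assumption.
+    fn prefix(category: &str) -> String {
+        match category {
+            "security" => "SEC".to_string(),
+            "performance" => "PERF".to_string(),
+            "architecture" => "ARCH".to_string(),
+            other => other
+                .chars()
+                .filter(|c| c.is_ascii_alphanumeric())
+                .take(4)
+                .collect::<String>()
+                .to_ascii_uppercase(),
+        }
+    }
+
+    /// Return the next ID for a category, zero-padded to three digits.
+    fn allocate(&mut self, category: &str) -> String {
+        let prefix = Self::prefix(category);
+        let counter = self.next.entry(prefix.clone()).or_insert(0);
+        *counter += 1;
+        format!("{}-{:03}", prefix, *counter)
+    }
+}
+```
+
+A real implementation would need to persist the counters with the contract so that IDs stay unique across sessions.
+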
+### Phase 2: Integration (2-3 days)
+
+| Task | Effort | Description |
+|------|--------|-------------|
+| Phase guard integration | 0.5 days | Check P1 findings before transition |
+| `finding parse-output` | 1 day | Parse review task output into findings |
+| Merge blocking logic | 0.5 days | Block merge with open P1s |
+| Finding assignment to tasks | 0.5 days | Track resolution via task ID |
+
+### Phase 3: Automation & Polish (2-3 days)
+
+| Task | Effort | Description |
+|------|--------|-------------|
+| `finding auto-resolve` | 1 day | Spawn fix tasks per finding |
+| Verification workflow | 0.5 days | Re-review to verify fixes |
+| Finding reports | 0.5 days | Summary contract file |
+| Documentation | 0.5 days | User guide |
+| Tests | 0.5 days | Unit + integration |
+
+---
+
+## Configuration Examples
+
+### Finding Creation in Review Agent Output
+
+Review agents produce structured findings in their output:
+
+````markdown
+## FINDING: SQL Injection in User Search
+
+- **Severity**: P1
+- **Category**: security
+- **File**: src/api/users.rs
+- **Line**: 47
+- **Tags**: injection, input-validation, database
+
+### Description
+The `search_users` handler directly interpolates the `query` parameter...
+
+### Evidence
+```rust
+let sql = format!("SELECT * FROM users WHERE name LIKE '%{}%'", query);
+```
+
+### Recommendation
+Use parameterized queries with sqlx::query().bind()
+````
+
+The synthesis step parses these into formal Finding records.
+
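+A minimal sketch of that parsing step, assuming the `## FINDING:` heading and `- **Key**: value` bullets shown above are the whole contract. `ParsedFinding` and `parse_findings` are hypothetical names, and the sketch skips the Description, Evidence, and Recommendation sections.
+
+```rust
+/// Metadata recovered from one `## FINDING:` block (illustrative type).
+#[derive(Debug, Default, PartialEq)]
+struct ParsedFinding {
+    title: String,
+    severity: String,
+    category: String,
+    file: Option<String>,
+    line: Option<u32>,
+    tags: Vec<String>,
+}
+
+/// Parse the metadata of every `## FINDING:` block in a review output.
+fn parse_findings(output: &str) -> Vec<ParsedFinding> {
+    let mut findings: Vec<ParsedFinding> = Vec::new();
+    for raw in output.lines() {
+        let line = raw.trim();
+        // A new heading starts a new finding.
+        if let Some(title) = line.strip_prefix("## FINDING:") {
+            findings.push(ParsedFinding {
+                title: title.trim().to_string(),
+                ..Default::default()
+            });
+            continue;
+        }
+        // Metadata bullets look like `- **Key**: value`; skip anything else.
+        let Some(current) = findings.last_mut() else { continue };
+        let Some(rest) = line.strip_prefix("- **") else { continue };
+        let Some((key, value)) = rest.split_once("**:") else { continue };
+        let value = value.trim();
+        match key {
+            "Severity" => current.severity = value.to_string(),
+            "Category" => current.category = value.to_string(),
+            "File" => current.file = Some(value.to_string()),
+            "Line" => current.line = value.parse().ok(),
+            "Tags" => {
+                current.tags = value.split(',').map(|t| t.trim().to_string()).collect()
+            }
+            _ => {}
+        }
+    }
+    findings
+}
+```
+
+Unknown keys are ignored rather than rejected, so agents can add fields without breaking the parser; whether that leniency is desirable is a design choice left open here.
+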
+### Merge Blocking Configuration
+
+```yaml
+# .makima/review-agents.yaml (or contract config)
+review:
+  findings:
+    merge_blocking_severity: P1     # P1 blocks merge
+    require_justification: P2       # P2 needs justification to defer
+    auto_resolve: true              # Spawn fix tasks for P1/P2
+    auto_resolve_severity: P1,P2    # Which severities to auto-resolve
+    verification:
+      enabled: true                 # Re-review after resolution
+      re_review_agents:             # Which agents verify fixes
+        - security-sentinel         # Security findings verified by security agent
+```
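The `merge_blocking_severity` setting could be evaluated roughly as follows. This is a sketch under stated assumptions: `Finding`, `blocking_findings`, and `merge_allowed` are illustrative names, and reducing the finding lifecycle to a single `open` flag is a simplification that the proposal does not specify.

```rust
/// Severity levels from the proposal. Variant order matters: the derived
/// ordering gives P1 < P2 < P3, so "at least as severe as X" is `<= X`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Severity {
    P1,
    P2,
    P3,
}

/// Hypothetical finding record with only the fields the gate needs.
#[derive(Debug)]
struct Finding {
    id: String,
    severity: Severity,
    open: bool, // assumption: true until the finding is resolved
}

/// IDs of open findings at or above the configured blocking severity.
fn blocking_findings(findings: &[Finding], blocking: Severity) -> Vec<&str> {
    findings
        .iter()
        .filter(|f| f.open && f.severity <= blocking)
        .map(|f| f.id.as_str())
        .collect()
}

/// A merge is allowed only when no blocking findings remain.
fn merge_allowed(findings: &[Finding], blocking: Severity) -> bool {
    blocking_findings(findings, blocking).is_empty()
}
```

Returning the blocking IDs rather than only a boolean lets the caller report which findings are holding up the merge.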
+
+### Finding Lifecycle Example
+
+```bash
+# 1. Review creates finding
+makima supervisor finding create --severity P1 --category security \
+  --title "SQL injection in user search" --file src/api/users.rs --line 47
+
+# 2. Auto-resolve spawns fix task
+makima supervisor finding auto-resolve --severity P1
+# → Spawns task "fix-SEC-001" with plan based on finding details
+
+# 3. Fix task completes, finding auto-updated
+# finding SEC-001: open → in-progress → resolved
+
+# 4. Verification re-reviews the fix
+makima supervisor finding verify SEC-001
+# → Spawns verification task targeting the specific file/line
+
+# 5. Verification passes
+# finding SEC-001: resolved → verified
+
+# 6. Phase transition allowed
+makima supervisor advance-phase compound -y
+```
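The status changes in this walkthrough (open → in-progress → resolved → verified) can be modeled as a small state machine. The sketch below is illustrative only: `FindingStatus` and `transition` are assumed names, and paths the example does not show, such as reopening a finding after a failed verification, are deliberately left out.

```rust
/// Lifecycle states taken from the walkthrough above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum FindingStatus {
    Open,
    InProgress,
    Resolved,
    Verified,
}

impl FindingStatus {
    /// The next state on the happy path, or `None` once verified.
    fn next(self) -> Option<FindingStatus> {
        match self {
            FindingStatus::Open => Some(FindingStatus::InProgress),
            FindingStatus::InProgress => Some(FindingStatus::Resolved),
            FindingStatus::Resolved => Some(FindingStatus::Verified),
            FindingStatus::Verified => None,
        }
    }
}

/// Accepts only single-step forward transitions; anything else is an error.
fn transition(current: FindingStatus, target: FindingStatus) -> Result<FindingStatus, String> {
    if current.next() == Some(target) {
        Ok(target)
    } else {
        Err(format!("invalid transition: {:?} -> {:?}", current, target))
    }
}
```

Rejecting skipped steps means a finding cannot be marked verified without first passing through resolution.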
+
+---
+
+## Open Questions
+
+1. **Finding storage**: Contract files vs. dedicated findings table in the database? Contract files are simpler but querying is less efficient.
+2. **Cross-contract findings**: Should findings persist across contracts? (e.g., a P2 deferred from one contract carries to the next)
+3. **Finding templates**: Should common finding types have templates? (e.g., "SQL injection" pre-fills category, severity, recommendation)
+4. **External integration**: Should findings be exportable to GitHub Issues, Jira, or other issue trackers?
+5. **Metric tracking**: How granular should finding metrics be? Per-contract? Per-repository? Per-category?
+6. **False positive handling**: How should agents indicate confidence level? Should low-confidence findings be automatically P3?
+
+---
+
+## Alternatives Considered
+
+| Alternative | Pros | Cons | Decision |
+|-------------|------|------|----------|
+| GitHub Issues integration | Rich UI, collaboration | External dependency; not all projects use GitHub | Deferred — consider as export target |
+| Plain text findings | Simple | Not queryable, no lifecycle | Rejected — defeats the purpose |
+| Dedicated findings DB table | Fast queries, rich indexing | New infrastructure, migration | Recommended for v2 |
+| Contract file-based | Uses existing infrastructure | Slower queries for large sets | Adopted for v1 |
+| Inline code comments | Close to code | Lost on next commit; hard to track | Rejected — not persistent |
+
+---
+
+## Priority & Complexity Assessment
+
+- **Priority: MEDIUM** — Structured findings transform the review phase from documentation to a quality gate. Essential for the Multi-Agent Review feature to produce actionable output.
+- **Complexity: LOW** — Finding records are simple structured data. Lifecycle state machine is straightforward. Main integration point (phase guards) already exists.
+- **Risk: LOW** — Purely additive feature. Worst case: findings exist but aren't used (same as today). Can be adopted incrementally.
diff --git a/docs/proposals/feature-knowledge-accumulation.md b/docs/proposals/feature-knowledge-accumulation.md
new file mode 100644
index 0000000..faef06a
--- /dev/null
+++ b/docs/proposals/feature-knowledge-accumulation.md
@@ -0,0 +1,539 @@
+# Feature Proposal: Knowledge Accumulation / Compound Learning System
+
+> **Priority:** High
+> **Complexity:** Medium
+> **Estimated Effort:** 10-15 days
+> **Status:** Proposal
+> **Date:** 2026-02-09
+> **Dependencies:** Contract Files system (existing)
+> **Related:** [Overview Analysis](compound-engineering-analysis.md) · [Plan Deepening](feature-plan-deepening.md) · [Workflow Presets](feature-workflow-presets.md)
+
+---
+
+## Problem Statement
+
+When a makima contract completes, the **knowledge generated during that contract is effectively lost**:
+
+- **Solutions to tricky problems** exist only in task conversation history, which is not searchable or surfaceable
+- **Patterns discovered** during one contract cannot inform future contracts
+- **Mistakes made** in one contract are likely to be repeated in similar future contracts
+- **Best practices** established during execution are not codified anywhere retrievable
+- **Contract files** capture deliverables but not the *meta-knowledge* about how those deliverables were produced
+
+This means every new contract starts from zero context, even when the team has solved similar problems before. Engineering effort does not compound.
+
+---
+
+## How Compound Engineering Solves This
+
+The compound engineering plugin implements a `/compound` command that runs **5 parallel sub-agents** immediately after review:
+
+```
+┌─────────────────────────────────────────────────────────┐
+│                   /compound                              │
+│                                                         │
+│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐    │
+│  │  Context    │  │  Solution   │  │ Prevention  │    │
+│  │  Extractor  │  │ Documenter  │  │ Strategist  │    │
+│  └──────┬──────┘  └──────┬──────┘  └──────┬──────┘    │
+│         │                │                │            │
+│  ┌──────┴──────┐  ┌──────┴──────┐                      │
+│  │   Doc       │  │  Category   │                      │
+│  │   Linker    │  │  Classifier │                      │
+│  └──────┬──────┘  └──────┬──────┘                      │
+│         │                │                              │
+│         ▼                ▼                              │
+│  ┌──────────────────────────────────────┐              │
+│  │  docs/solutions/[category]/file.md   │              │
+│  │                                      │              │
+│  │  ---                                 │              │
+│  │  category: build-errors              │              │
+│  │  severity: medium                    │              │
+│  │  tags: [webpack, esm, cjs]           │              │
+│  │  date: 2026-02-09                    │              │
+│  │  contract: abc-123                   │              │
+│  │  ---                                 │              │
+│  │                                      │              │
+│  │  # Mixed ESM/CJS Import Resolution  │              │
+│  │                                      │              │
+│  │  ## Problem                          │              │
+│  │  ...                                 │              │
+│  │  ## Solution                         │              │
+│  │  ...                                 │              │
+│  │  ## Prevention                       │              │
+│  │  ...                                 │              │
+│  └──────────────────────────────────────┘              │
+└─────────────────────────────────────────────────────────┘
+```
+
+### 9 Auto-Detected Categories
+
+| Category | Description |
+|----------|-------------|
+| `build-errors` | Compilation, bundling, dependency resolution |
+| `test-failures` | Test setup, assertion patterns, mocking |
+| `api-patterns` | API design, endpoint structure, versioning |
+| `architecture-decisions` | Structural choices, trade-offs, patterns |
+| `performance-optimizations` | Speed, memory, caching strategies |
+| `security-practices` | Auth, input validation, secrets management |
+| `debugging-techniques` | Investigation methods, logging strategies |
+| `tooling-configurations` | Tool setup, config patterns, CI/CD |
+| `domain-knowledge` | Business logic, domain-specific patterns |
+
+---
+
+## Proposed Makima Implementation
+
+### 1. New "Compound" Phase
+
+Add an optional **compound** phase to the contract lifecycle, positioned after review:
+
+```
+Research → Specify → Plan → Execute → Review → Compound
+                                                  ▲
+                                            (new phase)
+```
+
+**Phase behavior:**
+- **Auto-triggered** after review phase completes (configurable)
+- **Short-lived** — typically completes in 1-3 minutes
+- Extracts learnings from the contract's execution and review
+- Stores them as searchable, categorized learning documents
+- Can be skipped via configuration for trivial contracts
+
+### 2. New Supervisor Command: `makima supervisor compound`
+
+```bash
+# Run compound learning for the current contract
+makima supervisor compound
+
+# Compound with specific focus areas
+makima supervisor compound --focus "security,performance"
+
+# Compound with explicit learnings
+makima supervisor compound --learning "The retry logic needed exponential backoff, not fixed delay"
+```
+
+**Implementation:**
+
+```bash
+# Under the hood, this spawns learning sub-agents
+makima supervisor spawn-group "compound" \
+  --tasks '[
+    {
+      "name": "context-extractor",
+      "plan": "Extract the problem context, constraints, and environment details from the contract execution history..."
+    },
+    {
+      "name": "solution-documenter",
+      "plan": "Document the solutions that were applied, including code patterns and configuration changes..."
+    },
+    {
+      "name": "prevention-strategist",
+      "plan": "Identify what could prevent this class of problem in the future..."
+    },
+    {
+      "name": "category-classifier",
+      "plan": "Classify these learnings into the appropriate category..."
+    },
+    {
+      "name": "doc-linker",
+      "plan": "Link these learnings to existing documentation and related learnings..."
+    }
+  ]'
+```
+
+### 3. Learning Document Schema
+
+Each learning is stored as a **contract file** with structured content and metadata:
+
+```yaml
+# Learning document metadata (stored in file description/metadata)
+learning:
+  category: "build-errors"          # One of 9 categories
+  severity: "medium"                # low, medium, high, critical
+  tags: ["webpack", "esm", "cjs"]   # Free-form tags
+  source_contract_id: "abc-123"     # Contract that produced this learning
+  source_contract_name: "Fix webpack bundling"
+  repository: "github.com/org/repo"
+  date: "2026-02-09"
+  quality_score: 0.85               # 0-1, set by quality gate
+  access_count: 0                   # Incremented on retrieval
+  last_accessed: null
+  relevance_decay: 0.95             # Per-month decay factor
+```
+
+**Document body structure:**
+
+````markdown
+# Mixed ESM/CJS Import Resolution
+
+## Problem
+When upgrading to webpack 5, mixed ESM and CommonJS imports caused
+"Cannot use import statement outside a module" errors in production
+but not development.
+
+## Root Cause
+The `type: "module"` field in package.json applied ESM resolution
+globally, but several dependencies only provided CJS exports.
+
+## Solution
+1. Added `resolve.fullySpecified: false` to webpack config
+2. Used `@babel/plugin-transform-modules-commonjs` for CJS deps
+3. Created explicit `.cjs` extensions for config files
+
+## Code Pattern
+```javascript
+// webpack.config.cjs (note: .cjs extension)
+module.exports = {
+  resolve: {
+    fullySpecified: false,
+    extensions: ['.js', '.mjs', '.cjs', '.json']
+  }
+};
+```
+
+## Prevention
+- Add webpack build check to CI before merging
+- Document module system choice in project README
+- Use `resolve.fullySpecified: false` by default in webpack 5 projects
+
+## Related
+- docs/solutions/tooling-configurations/webpack-5-migration.md
+- Contract: "Initial Webpack 5 Migration" (2026-01-15)
+````
+
+### 4. Storage Architecture
+
+Learnings are stored in two complementary locations:
+
+#### A. Contract Files (Structured, Persistent)
+
+```rust
+// Each learning becomes a contract file
+File {
+    contract_id: Some(source_contract.id),
+    contract_phase: Some("compound"),
+    name: "Learning: Mixed ESM/CJS Import Resolution",
+    description: Some("category=build-errors; tags=webpack,esm,cjs; severity=medium"),
+    body: vec![
+        BodyElement::Heading { level: 1, text: "Mixed ESM/CJS Import Resolution" },
+        BodyElement::Heading { level: 2, text: "Problem" },
+        BodyElement::Paragraph { text: "..." },
+        // ... structured content
+    ],
+    repo_file_path: Some("docs/solutions/build-errors/mixed-esm-cjs-resolution.md"),
+    repo_sync_status: Some("synced"),
+}
+```
+
+#### B. Repository Files (Searchable, Portable)
+
+```
+docs/solutions/
+├── build-errors/
+│   ├── mixed-esm-cjs-resolution.md
+│   └── docker-multi-stage-cache.md
+├── test-failures/
+│   ├── async-test-timeout-patterns.md
+│   └── mock-service-worker-setup.md
+├── api-patterns/
+│   └── pagination-cursor-vs-offset.md
+├── architecture-decisions/
+│   └── event-sourcing-tradeoffs.md
+├── performance-optimizations/
+│   └── database-connection-pooling.md
+├── security-practices/
+│   └── jwt-refresh-token-rotation.md
+├── debugging-techniques/
+│   └── distributed-tracing-setup.md
+├── tooling-configurations/
+│   └── github-actions-cache-strategy.md
+└── domain-knowledge/
+    └── payment-processing-idempotency.md
+```
+
+### 5. Auto-Surface Relevant Learnings
+
+When a new contract is created, automatically search for relevant learnings:
+
+```bash
+# Supervisor plan template automatically includes:
+# "Search existing learnings relevant to this task"
+
+makima supervisor search-learnings "webpack bundling errors"
+makima supervisor search-learnings --category "build-errors" --tags "webpack"
+makima supervisor search-learnings --repo "github.com/org/repo"
+```
+
+**Search algorithm:**
+
+```
+Relevance Score =
+    keyword_match_score * 0.4
+  + category_match_score * 0.2
+  + tag_overlap_score * 0.2
+  + recency_score * 0.1        # Decays over time
+  + quality_score * 0.1        # Higher quality = more relevant
+```
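The weighted sum above translates directly into code. The sketch below assumes every component score is already normalized to the range 0 to 1. `ScoreInputs` and `relevance_score` are illustrative names, and using Jaccard overlap for the tag component is one possible choice that the proposal does not prescribe.

```rust
use std::collections::HashSet;

/// Component scores, each assumed to lie in 0.0..=1.0.
struct ScoreInputs {
    keyword_match: f64,
    category_match: f64,
    tag_overlap: f64,
    recency: f64,
    quality: f64,
}

/// Weighted sum using the weights from the formula above (they total 1.0).
fn relevance_score(s: &ScoreInputs) -> f64 {
    s.keyword_match * 0.4
        + s.category_match * 0.2
        + s.tag_overlap * 0.2
        + s.recency * 0.1
        + s.quality * 0.1
}

/// Assumption: tag overlap as Jaccard similarity, |A ∩ B| / |A ∪ B|.
/// Returns 0.0 when both tag sets are empty.
fn tag_overlap(query_tags: &[&str], learning_tags: &[&str]) -> f64 {
    let a: HashSet<&str> = query_tags.iter().copied().collect();
    let b: HashSet<&str> = learning_tags.iter().copied().collect();
    let union = a.union(&b).count();
    if union == 0 {
        return 0.0;
    }
    a.intersection(&b).count() as f64 / union as f64
}
```

Because the weights sum to 1.0, the result stays in the same 0 to 1 range as its inputs, which keeps it comparable with a `min_relevance` style cutoff.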
+
+**Integration with plan phase:**
+
+```
+┌──────────────┐       ┌───────────────────┐
+│ New Contract │──────▶│ Plan Phase        │
+│ Created      │       │                   │
+└──────────────┘       │ 1. Create plan    │
+                       │ 2. Search for     │◀── Learnings DB
+                       │    relevant       │
+                       │    learnings      │
+                       │ 3. Inject context │
+                       │    into plan      │
+                       └───────────────────┘
+```
+
+### 6. Quality Control
+
+#### Relevance Decay
+
+Learnings lose relevance over time unless accessed:
+
+```
+effective_relevance = quality_score * (decay_factor ^ months_since_creation)
+                    + access_bonus * recent_access_count
+```
+
+- Default decay factor: 0.95/month (a learning retains about 54% of its relevance after 1 year, since 0.95^12 ≈ 0.54)
+- Access bonus: +0.05 per access (caps at +0.25)
+- Learnings below 0.3 effective relevance are archived
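As a rough illustration of the decay rule, the function below computes the effective relevance with the capped access bonus and checks it against the archive threshold. The function names are hypothetical, and the constants simply mirror the defaults listed above.

```rust
/// Effective relevance: decayed quality plus a capped access bonus.
fn effective_relevance(
    quality_score: f64,
    decay_factor: f64,
    months_since_creation: u32,
    recent_access_count: u32,
) -> f64 {
    const ACCESS_BONUS: f64 = 0.05; // per access
    const MAX_ACCESS_BONUS: f64 = 0.25; // cap on the total bonus

    let decayed = quality_score * decay_factor.powi(months_since_creation as i32);
    let bonus = (ACCESS_BONUS * recent_access_count as f64).min(MAX_ACCESS_BONUS);
    decayed + bonus
}

/// Learnings whose effective relevance falls below 0.3 are archived.
fn should_archive(effective: f64) -> bool {
    const ARCHIVE_THRESHOLD: f64 = 0.3;
    effective < ARCHIVE_THRESHOLD
}
```

For example, a learning with quality 0.85 that is never accessed drops below the archive threshold after two years at the default factor, while the same learning with five or more recent accesses stays above it.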
+
+#### Deduplication
+
+When a new learning is created, check for existing similar learnings:
+
+```
+similarity = cosine_similarity(new_learning_embedding, existing_learning_embedding)
+if similarity > 0.85:
+    merge_or_update(existing_learning, new_learning)
+elif similarity > 0.70:
+    link_as_related(new_learning, existing_learning)
+```
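A self-contained sketch of the similarity check, assuming embeddings are available as plain `f64` slices of equal length. How the embeddings are produced is out of scope here, and `DedupAction`, `cosine_similarity`, and `dedup_action` are illustrative names rather than existing makima functions.

```rust
/// Outcome of comparing a new learning against an existing one.
#[derive(Debug, PartialEq)]
enum DedupAction {
    Merge,         // similarity > 0.85
    LinkAsRelated, // 0.70 < similarity <= 0.85
    StoreAsNew,    // anything lower
}

/// Cosine similarity of two vectors. Returns `None` for empty,
/// mismatched-length, or zero-norm inputs, where it is undefined.
fn cosine_similarity(a: &[f64], b: &[f64]) -> Option<f64> {
    if a.is_empty() || a.len() != b.len() {
        return None;
    }
    let dot: f64 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f64>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f64>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a * norm_b))
}

/// Applies the two thresholds from the pseudocode above.
fn dedup_action(similarity: f64) -> DedupAction {
    if similarity > 0.85 {
        DedupAction::Merge
    } else if similarity > 0.70 {
        DedupAction::LinkAsRelated
    } else {
        DedupAction::StoreAsNew
    }
}
```

Returning `Option` forces the caller to decide what an undefined similarity means; storing the learning as new is probably the safest fallback.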
+
+#### Quality Gate
+
+Before storing a learning, validate:
+
+| Check | Threshold | Action if Failed |
+|-------|-----------|------------------|
+| Has problem statement | Required | Reject |
+| Has solution | Required | Reject |
+| Has prevention strategy | Recommended | Warn, store with quality penalty |
+| Code examples present | Recommended | Warn, store with quality penalty |
+| Category valid | Required | Auto-classify |
+| Not duplicate | Similarity ≤ 0.85 | Merge with existing |
+| Minimum length | >200 characters | Reject |
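The required and recommended checks in this table could be implemented along these lines. This is a sketch only: it detects sections by their `## ` headings in the Markdown body, the 0.1 penalty per failed recommended check is an assumed value that the proposal does not specify, and category validation and duplicate detection are left out.

```rust
/// Result of running the quality gate on a learning body.
#[derive(Debug, PartialEq)]
enum GateResult {
    Reject(String),
    Accept { quality_penalty: f64, warnings: Vec<String> },
}

/// True if the body contains a level-2 heading with exactly this title.
fn has_section(body: &str, title: &str) -> bool {
    let heading = format!("## {}", title);
    body.lines().any(|line| line.trim() == heading.as_str())
}

fn quality_gate(body: &str) -> GateResult {
    const MIN_LENGTH: usize = 200; // must be strictly exceeded
    const PENALTY: f64 = 0.1; // assumption: penalty per failed recommended check

    // Required checks: any failure rejects the learning.
    if !has_section(body, "Problem") {
        return GateResult::Reject("missing problem statement".to_string());
    }
    if !has_section(body, "Solution") {
        return GateResult::Reject("missing solution".to_string());
    }
    if body.chars().count() <= MIN_LENGTH {
        return GateResult::Reject("body too short".to_string());
    }

    // Recommended checks: warn and apply a quality penalty.
    let mut warnings = Vec::new();
    let mut quality_penalty = 0.0;
    if !has_section(body, "Prevention") {
        warnings.push("no prevention strategy".to_string());
        quality_penalty += PENALTY;
    }
    let fence = "`".repeat(3); // a Markdown code fence marker
    if !body.contains(fence.as_str()) {
        warnings.push("no code examples".to_string());
        quality_penalty += PENALTY;
    }
    GateResult::Accept { quality_penalty, warnings }
}
```

Running the required checks first means a rejected learning never accumulates warnings, which keeps the rejection reason unambiguous.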
+
+---
+
+## Integration with Existing Makima Features
+
+### Contract Phases
+
+The compound phase integrates into the existing phase system:
+
+```rust
+// New phase variant
+enum ContractPhase {
+    Research,
+    Specify,
+    Plan,
+    Execute,
+    Review,
+    Compound,  // NEW
+}
+```
+
+- Contracts with `contract_type: "specification"` get the full 6-phase cycle
+- Contracts with `contract_type: "simple"` can opt-in via config
+- Phase guard still applies: user must approve transition to compound
+
+### Contract Files
+
+Learnings are first-class contract files, leveraging existing:
+- Versioning system
+- Structured body format (`BodyElement` types)
+- Repository file sync (`repo_file_path`, `repo_sync_status`)
+- Phase association (`contract_phase: "compound"`)
+
+### Directive System
+
+For directive-based workflows, learnings can be captured per-step:
+
+```rust
+DirectiveStep {
+    name: "compound-step-3",
+    description: "Capture learnings from database migration step",
+    depends_on: [step_3_id, review_step_id],
+    task_plan: "Extract and document learnings from the completed migration...",
+}
+```
+
+### Supervisor CLI
+
+New commands integrate with existing CLI infrastructure:
+
+```bash
+# In supervisor context
+makima supervisor compound                    # Run compound phase
+makima supervisor search-learnings "query"    # Search knowledge base
+makima supervisor list-learnings              # List all learnings
+makima supervisor learning-stats              # Knowledge base statistics
+```
+
+---
+
+## Implementation Plan
+
+### Phase 1: Core Infrastructure (4-5 days)
+
+| Task | Effort | Description |
+|------|--------|-------------|
+| Add `compound` phase to contract lifecycle | 1 day | New phase enum, transition rules |
+| Learning document schema | 1 day | Metadata structure, validation |
+| `supervisor compound` command | 1-2 days | Spawn learning sub-agents |
+| Repository file sync for learnings | 1 day | Write to `docs/solutions/` |
+
+### Phase 2: Search & Retrieval (3-5 days)
+
+| Task | Effort | Description |
+|------|--------|-------------|
+| `search-learnings` command | 1-2 days | Keyword + category search |
+| Auto-surface in plan phase | 1-2 days | Inject relevant learnings into plans |
+| Learning index | 1 day | Category/tag index for fast lookup |
+
+### Phase 3: Quality & Maintenance (3-5 days)
+
+| Task | Effort | Description |
+|------|--------|-------------|
+| Quality gate validation | 1 day | Pre-storage checks |
+| Relevance decay system | 1 day | Scheduled decay + access tracking |
+| Deduplication check | 1-2 days | Similarity detection and merging |
+| Documentation & defaults | 1 day | User guide, default categories |
+
+---
+
+## Configuration Examples
+
+### Enable Compound Phase (Contract-Level)
+
+```yaml
+# Contract configuration
+compound:
+  enabled: true
+  auto_trigger: true        # Auto-run after review completes
+  categories:               # Override default categories
+    - build-errors
+    - test-failures
+    - api-patterns
+    - architecture-decisions
+    - performance-optimizations
+    - security-practices
+    - debugging-techniques
+    - tooling-configurations
+    - domain-knowledge
+  quality_gate:
+    min_length: 200
+    require_problem: true
+    require_solution: true
+    require_prevention: false
+  storage:
+    contract_files: true     # Store as contract files
+    repo_files: true         # Also write to docs/solutions/
+    repo_path: "docs/solutions"
+```
+
+### Repository-Level Configuration (`.makima/compound.yaml`)
+
+```yaml
+# .makima/compound.yaml
+version: 1
+compound:
+  # Default settings for all contracts in this repo
+  auto_trigger: true
+
+  # Custom categories for this project
+  categories:
+    - build-errors
+    - test-failures
+    - api-patterns
+    - payment-processing     # Custom domain category
+    - compliance-requirements # Custom domain category
+
+  # Search settings
+  search:
+    max_results: 10
+    min_relevance: 0.3
+    include_archived: false
+
+  # Decay settings
+  decay:
+    factor: 0.95             # Per month
+    archive_threshold: 0.3
+    access_bonus: 0.05
+    max_access_bonus: 0.25
+```
+
+### Searching Learnings
+
+```bash
+# Full-text search
+makima supervisor search-learnings "webpack ESM import error"
+
+# Category filter
+makima supervisor search-learnings --category build-errors
+
+# Tag filter
+makima supervisor search-learnings --tags webpack,esm
+
+# Repository filter
+makima supervisor search-learnings --repo github.com/org/repo
+
+# Combined
+makima supervisor search-learnings "import error" \
+  --category build-errors \
+  --tags webpack \
+  --min-relevance 0.5 \
+  --limit 5
+```
+
+---
+
+## Open Questions
+
+1. **Cross-repository knowledge**: Should learnings be scoped to a single repository or shared across all repositories for an owner?
+2. **Learning ownership**: Who owns a learning — the contract creator, the repository, or the organization?
+3. **Privacy**: Are learnings visible to all users, or scoped by access control?
+4. **Embedding model**: For similarity-based deduplication and search, which embedding model should be used? Trade-off between quality and cost.
+5. **Storage limits**: Should there be a cap on the number of learnings per repository/owner?
+6. **Manual curation**: Should users be able to manually create, edit, or delete learnings outside the compound phase?
+7. **Export/import**: Should learnings be exportable/importable across makima instances?
+
+---
+
+## Alternatives Considered
+
+| Alternative | Pros | Cons | Decision |
+|-------------|------|------|----------|
+| Store learnings only in contract files | Simple, uses existing infrastructure | Not easily searchable across contracts | Rejected — search is critical |
+| Store learnings only in repo files | Portable, version-controlled, greppable | Lost if repo deleted; no cross-repo search | Partial — use as secondary storage |
+| Use external knowledge base (e.g., vector DB) | Best search quality | Added infrastructure dependency | Deferred — consider for v2 |
+| Manual-only knowledge capture | No noise | Knowledge rarely captured | Rejected — must be automatic |
+| Full contract history indexing | Most complete | Massive storage, noise, privacy concerns | Rejected — signal-to-noise ratio too low |
+
+---
+
+## Priority & Complexity Assessment
+
+- **Priority: HIGH** — This is the defining feature of compound engineering. Without knowledge accumulation, every contract starts from scratch. This is the feature that creates compounding returns.
+- **Complexity: MEDIUM** — Core capture and storage is straightforward using existing contract files and repo sync. Search quality and relevance decay require iterative refinement.
+- **Risk: MEDIUM** — The primary risk is low adoption (users skip the compound phase), mitigated by auto-trigger. The secondary risk is knowledge-base noise, mitigated by quality gates.
diff --git a/docs/proposals/feature-multi-agent-review.md b/docs/proposals/feature-multi-agent-review.md
new file mode 100644
index 0000000..d678756
--- /dev/null
+++ b/docs/proposals/feature-multi-agent-review.md
@@ -0,0 +1,448 @@
+# Feature Proposal: Multi-Agent Parallel Review System
+
+> **Priority:** High
+> **Complexity:** Medium
+> **Estimated Effort:** 12-18 days
+> **Status:** Proposal
+> **Date:** 2026-02-09
+> **Dependencies:** [Findings Tracking](feature-findings-tracking.md) (recommended)
+> **Related:** [Overview Analysis](compound-engineering-analysis.md) · [Workflow Presets](feature-workflow-presets.md)
+
+---
+
+## Problem Statement
+
+Makima's contract lifecycle includes a **Review** phase, but it currently has:
+
+- **No automated review mechanism** — the review phase relies entirely on manual user inspection or a single supervisor task
+- **Single-perspective review** — even when a review task is spawned, it examines code from one viewpoint
+- **No structured review output** — findings are captured as unstructured text in task output
+- **No review templates** — each review must be configured from scratch
+- **No synthesis** — when multiple reviewers exist, there's no mechanism to deduplicate and prioritize findings
+
+For complex contracts touching security, performance, and architecture, a single-pass review consistently misses category-specific issues that specialized reviewers would catch.
+
+---
+
+## How Compound Engineering Solves This
+
+The compound engineering plugin spawns **12-15 specialized review agents in parallel**, each examining the code from a unique perspective:
+
+| Agent | Focus Area | Example Findings |
+|-------|-----------|-----------------|
+| Security Sentinel | Auth, injection, secrets, CSRF | SQL injection in user input handler |
+| Performance Oracle | N+1 queries, memory leaks, caching | Unbounded list growth in event handler |
+| Architecture Strategist | Coupling, SOLID, layering | Service directly accessing repository internals |
+| Code Philosopher | Readability, naming, complexity | Cyclomatic complexity > 15 in payment flow |
+| Data Integrity Guardian | Validation, constraints, migrations | Missing NOT NULL constraint on required field |
+| Error Resilience Analyzer | Error handling, retries, fallbacks | Unhandled timeout in external API call |
+| API Contract Validator | Breaking changes, versioning | Removed required field from response |
+| Dependency Health Checker | Vulnerabilities, licensing, freshness | CVE-2025-XXXX in transitive dependency |
+| Test Coverage Analyzer | Coverage gaps, edge cases, mocking | No tests for error path in checkout flow |
+| Documentation Completeness | Docs accuracy, examples, changelog | Public API endpoint undocumented |
+| Concurrency Safety | Race conditions, deadlocks, atomicity | Non-atomic read-modify-write on shared counter |
+
+After all agents complete, a **synthesis agent** deduplicates findings, resolves contradictions, and produces a prioritized report.
+
+```
+┌───────────────────────────────────────────────────────┐
+│                  Review Orchestrator                   │
+│                                                       │
+│  spawn-group "review"                                 │
+│  ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐   │
+│  │Security │ │ Perf    │ │  Arch   │ │  Code   │   │
+│  │Sentinel │ │ Oracle  │ │Strategy │ │  Phil   │   │
+│  └────┬────┘ └────┬────┘ └────┬────┘ └────┬────┘   │
+│       │           │           │           │         │
+│  ┌────┴────┐ ┌────┴────┐ ┌────┴────┐ ┌────┴────┐   │
+│  │  Data   │ │ Error   │ │  API    │ │  Deps   │   │
+│  │Guardian │ │Resilien.│ │Contract │ │ Health  │   │
+│  └────┬────┘ └────┬────┘ └────┬────┘ └────┬────┘   │
+│       │           │           │           │         │
+│  ┌────┴────┐ ┌────┴────┐ ┌────┴────┐               │
+│  │  Test   │ │  Docs   │ │Concurr. │               │
+│  │Coverage │ │Complete │ │ Safety  │               │
+│  └────┬────┘ └────┬────┘ └────┬────┘               │
+│       │           │           │                     │
+│  wait-group "review"                                 │
+│       ▼           ▼           ▼                     │
+│  ┌──────────────────────────────────────────┐       │
+│  │         Synthesis Agent                   │       │
+│  │  - Deduplicate findings                   │       │
+│  │  - Resolve contradictions                 │       │
+│  │  - Prioritize by severity                 │       │
+│  │  - Generate summary report                │       │
+│  └──────────────────────────────────────────┘       │
+│                      │                               │
+│                      ▼                               │
+│             Structured Findings                      │
+│             (P1 / P2 / P3)                           │
+└───────────────────────────────────────────────────────┘
+```
+
+---
+
+## Proposed Makima Implementation
+
+### 1. New Supervisor Commands
+
+#### `makima supervisor spawn-group`
+
+Spawns multiple tasks as a named group and returns immediately:
+
+```bash
+# Spawn a review group with 3 agents
+makima supervisor spawn-group "review" \
+  --tasks '[
+    {"name": "security-review", "plan": "Review for security vulnerabilities..."},
+    {"name": "performance-review", "plan": "Review for performance issues..."},
+    {"name": "architecture-review", "plan": "Review for architecture concerns..."}
+  ]' \
+  --share-worktree \
+  --read-only
+```
+
+**Key parameters:**
+- `--tasks` — JSON array of task definitions
+- `--share-worktree` — All tasks in the group share the supervisor's worktree (read-only access)
+- `--read-only` — Tasks cannot modify files, only produce output
+- `--max-concurrent N` — Limit parallel execution (default: unlimited)
+
+#### `makima supervisor wait-group`
+
+Waits for all tasks in a named group to complete:
+
+```bash
+# Wait for all review tasks, timeout after 10 minutes
+makima supervisor wait-group "review" --timeout 600
+
+# Returns JSON with all task results
+```
+
+**Output format:**
+```json
+{
+  "group": "review",
+  "status": "completed",
+  "tasks": [
+    {"name": "security-review", "status": "done", "output": "..."},
+    {"name": "performance-review", "status": "done", "output": "..."}
+  ],
+  "duration_seconds": 127
+}
+```
+
+#### `makima supervisor review`
+
+High-level command that orchestrates the full review pipeline:
+
+```bash
+# Run review with default agent config
+makima supervisor review
+
+# Run review with custom config
+makima supervisor review --config .makima/review-agents.yaml
+
+# Run only specific review categories
+makima supervisor review --only security,performance,architecture
+```
+
+### 2. Review Agent Configuration
+
+#### Repository-Level Configuration (`.makima/review-agents.yaml`)
+
+```yaml
+# .makima/review-agents.yaml
+version: 1
+review:
+  # Maximum number of concurrent review agents
+  max_concurrent: 8
+
+  # Timeout per agent (seconds)
+  agent_timeout: 300
+
+  # Auto-trigger review when phase transitions to 'review'
+  auto_trigger: true
+
+  # Finding severity that blocks merge
+  merge_blocking_severity: P1
+
+  agents:
+    - name: security-sentinel
+      enabled: true
+      plan: |
+        You are a Security Sentinel reviewing code changes.
+
+        Focus areas:
+        - Authentication and authorization flaws
+        - Injection vulnerabilities (SQL, XSS, command injection)
+        - Secret/credential exposure
+        - CSRF and session management
+        - Input validation gaps
+
+        Output format: One finding per section with severity (P1/P2/P3),
+        affected file/line, description, and suggested fix.
+      priority: critical  # Always runs
+
+    - name: performance-oracle
+      enabled: true
+      plan: |
+        You are a Performance Oracle reviewing code changes.
+
+        Focus areas:
+        - N+1 query patterns
+        - Memory leaks and unbounded growth
+        - Missing caching opportunities
+        - Algorithmic complexity issues
+        - Database index utilization
+
+        Output format: One finding per section with severity (P1/P2/P3),
+        affected file/line, description, and suggested fix.
+      priority: standard
+
+    - name: architecture-strategist
+      enabled: true
+      plan: |
+        You are an Architecture Strategist reviewing code changes.
+
+        Focus areas:
+        - SOLID principle violations
+        - Inappropriate coupling between modules
+        - Layering violations (e.g., handler accessing DB directly)
+        - Missing abstraction boundaries
+        - Inconsistency with existing patterns
+
+        Output format: One finding per section with severity (P1/P2/P3),
+        affected file/line, description, and suggested fix.
+      priority: standard
+
+    - name: test-coverage-analyzer
+      enabled: true
+      plan: |
+        You are a Test Coverage Analyzer reviewing code changes.
+
+        Focus areas:
+        - Missing test coverage for new code paths
+        - Untested error/edge cases
+        - Test quality (meaningful assertions vs superficial)
+        - Integration test gaps
+        - Mock appropriateness
+
+        Output format: One finding per section with severity (P1/P2/P3),
+        affected file/line, description, and suggested fix.
+      priority: standard
+
+    # Users can add custom agents here
+    - name: custom-domain-reviewer
+      enabled: false
+      plan: "Review for domain-specific business logic concerns..."
+      priority: optional
+```
+
+#### Contract-Level Override
+
+```yaml
+# In contract configuration or via CLI
+review:
+  agents:
+    # Disable agents not relevant to this contract
+    - name: concurrency-safety
+      enabled: false
+    # Add contract-specific reviewer
+    - name: migration-safety
+      enabled: true
+      plan: "Review database migrations for data loss risks..."
+```
+
+### 3. Synthesis Step
+
+After all review agents complete, a synthesis task:
+
+1. **Collects** all findings from group task outputs
+2. **Deduplicates** findings about the same issue from different perspectives
+3. **Resolves contradictions** (e.g., one agent says "add caching" while another says "caching adds complexity")
+4. **Prioritizes** by severity and cross-agent agreement
+5. **Produces** a structured review report as a contract file
+
+```bash
+# Synthesis is automatically run after wait-group completes
+makima supervisor synthesize-review "review" \
+  --output-format findings \
+  --create-contract-file
+```
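+
+The collect, deduplicate, and prioritize steps above can be sketched roughly as follows. This is a minimal illustration, not makima's implementation: the `Finding` and `Merged` types, the `synthesize` function, and the rule "same file and line means same issue" are all assumptions made for the example. Contradiction resolution is left out.
+
+```rust
+use std::collections::HashMap;
+
+// Variants are declared most-severe first, so the derived Ord gives P1 < P2 < P3.
+#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
+enum Severity { P1, P2, P3 }
+
+// Hypothetical shape of one agent's finding.
+#[derive(Debug, Clone)]
+struct Finding { agent: String, file: String, line: u32, severity: Severity, description: String }
+
+// Hypothetical merged finding: one entry per location, with every agent that flagged it.
+#[derive(Debug)]
+struct Merged { file: String, line: u32, severity: Severity, agents: Vec<String>, description: String }
+
+fn synthesize(findings: Vec<Finding>) -> Vec<Merged> {
+    // Deduplicate: findings at the same file and line are treated as one issue (assumption).
+    let mut by_location: HashMap<(String, u32), Merged> = HashMap::new();
+    for f in findings {
+        let entry = by_location.entry((f.file.clone(), f.line)).or_insert_with(|| Merged {
+            file: f.file.clone(),
+            line: f.line,
+            severity: f.severity,
+            agents: Vec::new(),
+            description: f.description.clone(),
+        });
+        // Keep the most severe rating and its description.
+        if f.severity < entry.severity {
+            entry.severity = f.severity;
+            entry.description = f.description.clone();
+        }
+        if !entry.agents.contains(&f.agent) {
+            entry.agents.push(f.agent);
+        }
+    }
+    // Prioritize: severity first, then cross-agent agreement, then location for stable output.
+    let mut merged: Vec<Merged> = by_location.into_values().collect();
+    merged.sort_by(|a, b| {
+        a.severity.cmp(&b.severity)
+            .then(b.agents.len().cmp(&a.agents.len()))
+            .then(a.file.cmp(&b.file))
+            .then(a.line.cmp(&b.line))
+    });
+    merged
+}
+
+// Made-up sample data for the demonstration.
+fn sample_findings() -> Vec<Finding> {
+    let f = |agent: &str, file: &str, line, severity, description: &str| Finding {
+        agent: agent.into(), file: file.into(), line, severity, description: description.into(),
+    };
+    vec![
+        f("performance-oracle", "src/db.rs", 42, Severity::P2, "Query runs inside a loop"),
+        f("architecture-strategist", "src/api.rs", 10, Severity::P3, "Handler accesses DB directly"),
+        f("security-sentinel", "src/db.rs", 42, Severity::P1, "SQL built by string concatenation"),
+    ]
+}
+
+fn main() {
+    for m in synthesize(sample_findings()) {
+        println!("{:?} {}:{} ({} agent(s)): {}", m.severity, m.file, m.line, m.agents.len(), m.description);
+    }
+}
+```
+
+A real synthesis step would likely compare descriptions by similarity rather than by exact location, as the `dedup_threshold` setting in the configuration examples suggests.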
+
+### 4. Auto-Review Trigger
+
+When a contract's phase transitions to `review`:
+
+```rust
+// In phase transition handler
+if new_phase == "review" && contract.review_config.auto_trigger {
+    // Spawn review group automatically
+    spawn_review_group(contract, review_config).await?;
+}
+```
+
+---
+
+## Integration with Existing Makima Features
+
+### Supervisor/Worker Hierarchy
+
+Review agents are spawned as **worker tasks** under the supervisor, using existing `spawn-task` infrastructure. The new `spawn-group`/`wait-group` commands are syntactic sugar over batch `spawn-task` + `wait` calls.
+
+### Git Worktree Isolation
+
+Review agents share the supervisor's worktree in **read-only mode** (a new capability). This avoids creating N separate worktrees for review-only tasks. Implementation:
+- Reuse the `supervisor_worktree_task_id` parameter (already exists in SpawnTask)
+- New `read_only: true` flag to prevent file modifications
+- Workers see the same code state that triggered the review
+
+### Contract Files
+
+The synthesized review report is stored as a **contract file** attached to the review phase:
+```rust
+File {
+    contract_id: contract.id,
+    contract_phase: "review",
+    name: "Review Report — 2026-02-09",
+    body: vec![
+        BodyElement::Heading { level: 1, text: "Review Summary" },
+        BodyElement::Paragraph { text: "3 P1 findings, 7 P2 findings, 12 P3 findings" },
+        // ... structured findings
+    ],
+}
+```
+
+### Phase Guards
+
+If `phase_guard` is enabled and P1 findings exist, the phase transition from Review to Execute (or Compound) is blocked until P1s are resolved. This integrates with the existing `advance-phase` confirmation flow.
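+
+A rough sketch of the guard check described above, assuming a hypothetical `Finding` type with a `resolved` flag and a hypothetical `can_leave_review` helper (neither is taken from makima's code):
+
+```rust
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+enum Severity { P1, P2, P3 }
+
+// Hypothetical finding record; only the fields the guard needs.
+struct Finding { severity: Severity, resolved: bool }
+
+// Returns Ok when the review phase may be left, or an error explaining the block.
+fn can_leave_review(phase_guard: bool, findings: &[Finding]) -> Result<(), String> {
+    if !phase_guard {
+        return Ok(()); // Guard disabled: never block.
+    }
+    let open_p1 = findings
+        .iter()
+        .filter(|f| f.severity == Severity::P1 && !f.resolved)
+        .count();
+    if open_p1 == 0 {
+        Ok(())
+    } else {
+        Err(format!("{open_p1} unresolved P1 finding(s) block the transition"))
+    }
+}
+
+fn main() {
+    let findings = [
+        Finding { severity: Severity::P1, resolved: false },
+        Finding { severity: Severity::P2, resolved: false },
+        Finding { severity: Severity::P3, resolved: true },
+    ];
+    println!("{:?}", can_leave_review(true, &findings));
+}
+```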
+
+### Completion Gates
+
+Each review agent uses the existing `<COMPLETION_GATE>` mechanism to signal when its review is complete:
+```xml
+<COMPLETION_GATE>
+ready: true
+reason: "Security review complete. Found 2 P1 and 3 P2 findings."
+progress: "Reviewed 47 files across 12 modules."
+</COMPLETION_GATE>
+```
+
+### Circuit Breaker
+
+The existing CircuitBreaker protects against review agents getting stuck. If a review agent loops without progress for 3 iterations, it's terminated and its partial findings are included in synthesis.
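+
+The "no progress for 3 iterations" rule can be illustrated with a small counter. The `StallDetector` type below is a hypothetical stand-in written for this example; it does not describe how makima's CircuitBreaker is actually implemented, and it assumes progress is reported as a comparable string.
+
+```rust
+// Hypothetical detector: trips after `max_stalled` consecutive iterations
+// that report the same progress marker as the previous one.
+struct StallDetector {
+    max_stalled: u32,
+    stalled: u32,
+    last_progress: Option<String>,
+}
+
+impl StallDetector {
+    fn new(max_stalled: u32) -> Self {
+        StallDetector { max_stalled, stalled: 0, last_progress: None }
+    }
+
+    // Records one iteration; returns true when the agent should be terminated.
+    fn record(&mut self, progress: &str) -> bool {
+        if self.last_progress.as_deref() == Some(progress) {
+            self.stalled += 1;
+        } else {
+            // Progress changed: reset the counter and remember the new marker.
+            self.stalled = 0;
+            self.last_progress = Some(progress.to_string());
+        }
+        self.stalled >= self.max_stalled
+    }
+}
+
+fn main() {
+    let mut detector = StallDetector::new(3);
+    for progress in ["Reviewed 10 files", "Reviewed 20 files", "Reviewed 20 files",
+                     "Reviewed 20 files", "Reviewed 20 files"] {
+        let tripped = detector.record(progress);
+        println!("{progress}: terminate = {tripped}");
+    }
+}
+```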
+
+---
+
+## Implementation Plan
+
+### Phase 1: Group Task Infrastructure (5-7 days)
+
+| Task | Effort | Description |
+|------|--------|-------------|
+| `spawn-group` command | 2 days | Batch task spawning with named groups |
+| `wait-group` command | 1 day | Wait for all tasks in group |
+| Group tracking in DB | 1 day | Task group table, membership, status |
+| Shared worktree (read-only) | 1-2 days | Workers share supervisor worktree |
+| Tests | 1 day | Unit + integration tests |
+
+### Phase 2: Review Agent System (4-6 days)
+
+| Task | Effort | Description |
+|------|--------|-------------|
+| Review config YAML parser | 1 day | Parse `.makima/review-agents.yaml` |
+| `supervisor review` command | 2 days | Orchestrate review pipeline |
+| Synthesis agent logic | 1-2 days | Deduplicate, prioritize, format |
+| Review report as contract file | 1 day | Store structured output |
+
+### Phase 3: Automation & Polish (3-5 days)
+
+| Task | Effort | Description |
+|------|--------|-------------|
+| Auto-trigger on phase transition | 1 day | Hook into `advance-phase` |
+| P1 merge blocking | 1 day | Phase guard integration |
+| Default review agent templates | 1-2 days | Ship 8-10 built-in agents |
+| Documentation | 1 day | User guide and config reference |
+
+---
+
+## Configuration Examples
+
+### Minimal Setup (Zero Config)
+
+```bash
+# Uses built-in review agents with default settings
+makima supervisor review
+```
+
+### Custom Review for a Specific Contract
+
+```bash
+# Override for this contract only
+makima supervisor review \
+  --only security,performance \
+  --merge-blocking P1 \
+  --timeout 300
+```
+
+### Full Custom Configuration
+
+```yaml
+# .makima/review-agents.yaml
+version: 1
+review:
+  max_concurrent: 6
+  agent_timeout: 300
+  auto_trigger: true
+  merge_blocking_severity: P1
+
+  synthesis:
+    dedup_threshold: 0.8        # Similarity score for deduplication
+    min_agreement: 2             # Findings flagged by 2+ agents get priority boost
+    output_format: "findings"    # "findings" | "report" | "both"
+    create_contract_file: true
+
+  agents:
+    - name: security-sentinel
+      enabled: true
+      priority: critical
+      plan: |
+        ...
+    - name: performance-oracle
+      enabled: true
+      priority: standard
+      plan: |
+        ...
+    # ... more agents
+```
+
+---
+
+## Open Questions
+
+1. **Shared worktree read-only enforcement**: Should this be enforced at the filesystem level (mount read-only) or via convention (instructions to the agent)?
+2. **Review scope**: Should review agents see all files or only changed files (git diff)?
+3. **Incremental review**: When new commits are added during review, should agents re-review or only review the delta?
+4. **Agent output parsing**: Should agents output structured YAML findings, or should the synthesis step parse natural language?
+5. **Cost control**: With 10+ parallel agents, how do we manage API costs? Should there be a budget ceiling per review?
+6. **Finding deduplication**: What similarity threshold should trigger deduplication? How to handle partial overlaps?
+
+---
+
+## Alternatives Considered
+
+| Alternative | Pros | Cons | Decision |
+|-------------|------|------|----------|
+| Single comprehensive review agent | Simple, no coordination overhead | Misses perspective-specific issues | Rejected — diminishes review quality |
+| Sequential reviews (one after another) | Simpler orchestration | 5-10x slower; later reviews can't benefit from earlier ones | Rejected — latency unacceptable |
+| External review tools integration | Leverage existing static analysis | Limited to tool capabilities; no semantic review | Complement — can integrate alongside agent review |
+| User-configured number of agents | Maximum flexibility | Analysis paralysis for new users | Adopted — sensible defaults + customization |
+
+---
+
+## Priority & Complexity Assessment
+
+- **Priority: HIGH** — Multi-agent review is the highest-impact feature from the compound engineering plugin. It directly improves code quality with no change to developer workflow.
+- **Complexity: MEDIUM** — The core `spawn-group`/`wait-group` pattern is straightforward. The synthesis step requires careful design. Shared worktree read-only mode is a new capability.
+- **Risk: LOW-MEDIUM** — Main risks are resource consumption (manageable with concurrency limits) and synthesis quality (improvable iteratively).
diff --git a/docs/proposals/feature-plan-deepening.md b/docs/proposals/feature-plan-deepening.md
new file mode 100644
index 0000000..c2d8aeb
--- /dev/null
+++ b/docs/proposals/feature-plan-deepening.md
@@ -0,0 +1,383 @@
+# Feature Proposal: Parallel Plan Deepening
+
+> **Priority:** Medium
+> **Complexity:** Low
+> **Estimated Effort:** 5-8 days
+> **Status:** Proposal
+> **Date:** 2026-02-09
+> **Dependencies:** [Knowledge Accumulation](feature-knowledge-accumulation.md) (recommended, not required)
+> **Related:** [Overview Analysis](compound-engineering-analysis.md) · [Multi-Agent Review](feature-multi-agent-review.md)
+
+---
+
+## Problem Statement
+
+Makima's planning phase currently suffers from **single-pass planning**:
+
+- A supervisor creates a plan based on its immediate analysis of the task
+- **No systematic research** is conducted before finalizing the plan
+- **Edge cases are discovered during execution**, requiring mid-stream plan changes
+- **Best practices are not consulted** — the plan relies solely on the model's training knowledge
+- **Existing project learnings** (if the knowledge accumulation feature exists) are not surfaced during planning
+- **Revision rate is high** — an estimated ~40% of plans require significant changes after execution begins
+
+The result: plans are shallow, execution discovers problems that planning should have caught, and contracts take longer than necessary.
+
+---
+
+## How Compound Engineering Solves This
+
+The compound engineering plugin's `/deepen-plan` command takes an existing plan and enhances it by spawning **20-40 parallel research agents**:
+
+```
+┌──────────────────────────────────────────────────────────────┐
+│                      /deepen-plan                             │
+│                                                              │
+│  Input: Initial plan (from /plan)                            │
+│                                                              │
+│  ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐       │
+│  │ Best     │ │ Edge     │ │ Dep.     │ │ Pattern  │       │
+│  │ Practice │ │ Case     │ │ Research │ │ Matching │       │
+│  │ Agent 1  │ │ Agent 1  │ │ Agent 1  │ │ Agent 1  │       │
+│  └────┬─────┘ └────┬─────┘ └────┬─────┘ └────┬─────┘       │
+│       │            │            │            │              │
+│  ┌────┴─────┐ ┌────┴─────┐ ┌────┴─────┐ ┌────┴─────┐       │
+│  │ Best     │ │ Edge     │ │ Security │ │ Existing │       │
+│  │ Practice │ │ Case     │ │ Concerns │ │ Learning │       │
+│  │ Agent 2  │ │ Agent 2  │ │ Agent    │ │ Agent    │       │
+│  └────┬─────┘ └────┬─────┘ └────┬─────┘ └────┬─────┘       │
+│       │            │            │            │              │
+│  ... (20-40 agents per plan item) ...                        │
+│       │            │            │            │              │
+│       ▼            ▼            ▼            ▼              │
+│  ┌──────────────────────────────────────────────────┐       │
+│  │              Synthesis Agent                      │       │
+│  │  - Merge research into plan                       │       │
+│  │  - Add edge case handling                         │       │
+│  │  - Insert best practice notes                     │       │
+│  │  - Flag risks and dependencies                    │       │
+│  └──────────────────────────────────────────────────┘       │
+│                        │                                     │
+│                        ▼                                     │
+│              Enhanced Plan (Deepened)                         │
+│              - Original steps preserved                      │
+│              - Edge cases added per step                      │
+│              - Best practices annotated                       │
+│              - Risks flagged                                  │
+│              - Dependencies clarified                         │
+└──────────────────────────────────────────────────────────────┘
+```
+
+The key insight: **research is embarrassingly parallel**. Each plan item can be researched independently, and each research dimension (best practices, edge cases, security, etc.) is independent.
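+
+To make the independence concrete, here is a minimal sketch that fans out one thread per (plan item, dimension) pair and joins the results. The `research` function is a placeholder that only formats a string, and `DIMENSIONS` and `deepen` are names invented for this example; real research agents would be spawned tasks, not threads.
+
+```rust
+use std::thread;
+
+// Hypothetical subset of the research dimensions named in this proposal.
+const DIMENSIONS: [&str; 3] = ["best-practices", "edge-cases", "security"];
+
+// Stand-in for a real research agent: it only formats a note.
+fn research(item: &str, dimension: &str) -> String {
+    format!("[{dimension}] notes for '{item}'")
+}
+
+// Runs one thread per (plan item, dimension) pair and groups results by item.
+fn deepen(plan_items: &[&str]) -> Vec<(String, Vec<String>)> {
+    thread::scope(|s| {
+        // Spawn everything first so all pairs run concurrently.
+        let handles: Vec<_> = plan_items
+            .iter()
+            .map(|&item| {
+                let per_dimension: Vec<_> = DIMENSIONS
+                    .iter()
+                    .map(|&dim| s.spawn(move || research(item, dim)))
+                    .collect();
+                (item, per_dimension)
+            })
+            .collect();
+        // Join in spawn order, so the output order is deterministic.
+        handles
+            .into_iter()
+            .map(|(item, per_dimension)| {
+                let notes = per_dimension
+                    .into_iter()
+                    .map(|h| h.join().expect("research thread panicked"))
+                    .collect::<Vec<String>>();
+                (item.to_string(), notes)
+            })
+            .collect()
+    })
+}
+
+fn main() {
+    for (item, notes) in deepen(&["Step 1", "Step 2"]) {
+        println!("{item}: {notes:?}");
+    }
+}
+```
+
+No task reads another task's output before the join, which is what lets the work scale with the number of agents.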
+
+---
+
+## Proposed Makima Implementation
+
+### 1. New Supervisor Command: `makima supervisor deepen-plan`
+
+```bash
+# Deepen the current contract's plan
+makima supervisor deepen-plan
+
+# Deepen with specific focus areas
+makima supervisor deepen-plan --focus "security,edge-cases,performance"
+
+# Deepen with explicit plan file reference
+makima supervisor deepen-plan --plan-file plan.md
+
+# Control parallelism
+makima supervisor deepen-plan --max-agents 10
+
+# Include knowledge base search (requires Knowledge Accumulation feature)
+makima supervisor deepen-plan --search-learnings
+```
+
+### 2. Research Agent Categories
+
+Each plan item is researched along multiple dimensions:
+
+| Agent Category | Purpose | Example Output |
+|----------------|---------|----------------|
+| **Best Practices** | Industry standards for the technology/pattern | "Use parameterized queries for all DB operations" |
+| **Edge Cases** | Boundary conditions and error scenarios | "Handle concurrent modification of shared resource" |
+| **Dependency Research** | Compatibility, versions, known issues | "Library X v3 has breaking changes from v2" |
+| **Security Concerns** | Security implications of the planned approach | "JWT stored in localStorage is vulnerable to XSS" |
+| **Performance Implications** | Performance characteristics and bottlenecks | "N+1 query risk with eager loading disabled" |
+| **Pattern Matching** | Similar patterns in the existing codebase | "Module Y already implements this pattern; follow its conventions" |
+| **Existing Learnings** | Prior solutions from knowledge base | "Similar issue solved in contract Z; see docs/solutions/..." |
+
+### 3. Deepening Flow
+
+```
+┌─────────────┐     ┌──────────────────┐     ┌────────────────┐
+│ Original    │     │ Research Phase    │     │ Enhanced Plan  │
+│ Plan        │────▶│                  │────▶│                │
+│             │     │ Per plan item:    │     │ Original +     │
+│ Step 1      │     │ - Best practices │     │ annotations    │
+│ Step 2      │     │ - Edge cases     │     │               │
+│ Step 3      │     │ - Dependencies   │     │ Step 1         │
+│ Step 4      │     │ - Security       │     │  ├ Edge cases  │
+│             │     │ - Performance    │     │  ├ Best pracs  │
+│             │     │ - Patterns       │     │  └ Risks       │
+│             │     │ - Learnings      │     │ Step 2         │
+│             │     │                  │     │  ├ Edge cases  │
+│             │     │ All in parallel  │     │  └ ...         │
+└─────────────┘     └──────────────────┘     └────────────────┘
+```
+
+**Implementation using existing infrastructure:**
+
+```bash
+# Step 1: Parse plan into items
+# (sketch: assumes one JSON object per line with "id" and "description" fields,
+#  and that jq is installed)
+plan_items=$(makima supervisor get-plan-items)
+
+# Step 2: For each item, spawn research agents as a group
+while IFS= read -r item; do
+  item_id=$(jq -r '.id' <<<"$item")
+  # Build the task list with jq so descriptions are JSON-escaped correctly
+  tasks=$(jq -c '[
+    {name: "best-practices", plan: ("Research best practices for: " + .description)},
+    {name: "edge-cases", plan: ("Identify edge cases for: " + .description)},
+    {name: "security", plan: ("Analyze security implications of: " + .description)},
+    {name: "performance", plan: ("Assess performance implications of: " + .description)}
+  ]' <<<"$item")
+  makima supervisor spawn-group "deepen-${item_id}" \
+    --tasks "$tasks" \
+    --share-worktree \
+    --read-only
+done <<<"$plan_items"
+
+# Step 3: Wait for all groups
+makima supervisor wait-group "deepen-*" --timeout 300
+
+# Step 4: Synthesize results into enhanced plan
+makima supervisor synthesize-plan
+```
+
+### 4. Enhanced Plan Format
+
+The deepened plan augments each step with structured annotations:
+
+```markdown
+## Step 3: Implement JWT Authentication
+
+### Original Plan
+Add JWT-based authentication middleware to the API gateway.
+Generate tokens on login, validate on each request.
+
+### Research Findings
+
+#### Best Practices
+- Use RS256 (asymmetric) for microservices, HS256 for monoliths
+- Set short access token TTL (15 min) with refresh token rotation
+- Include only essential claims (sub, exp, iat, roles)
+- Never store sensitive data in JWT payload (it's base64, not encrypted)
+
+#### Edge Cases
+- Token expiry during long-running requests
+- Clock skew between services (use ±30s leeway)
+- Concurrent refresh token rotation (race condition)
+- Token size exceeding header limits (>8KB with many claims)
+
+#### Security Concerns
+- **P2**: JWT in localStorage is XSS-vulnerable; prefer httpOnly cookies
+- **P3**: Missing CSRF protection if using cookies
+- **P2**: No token revocation mechanism for compromised tokens
+
+#### Performance Notes
+- JWT validation is CPU-bound (RS256 ~1ms per validation)
+- Consider caching decoded tokens for repeated validation
+- Refresh token DB lookup adds latency (~5ms)
+
+#### Existing Learnings
+- See: docs/solutions/security-practices/jwt-refresh-token-rotation.md
+- Previous contract "Auth Service Refactor" used similar pattern
+
+### Risks
+- [ ] Clock skew handling not in original plan
+- [ ] Token revocation strategy needed
+- [ ] CSRF protection if using cookie storage
+```
+
+### 5. Integration with Knowledge Base
+
+When the Knowledge Accumulation feature is available, `deepen-plan` automatically includes a **learning search agent** for each plan item:
+
+```
+Research Agent: "Search existing learnings relevant to JWT authentication"
+
+Results:
+- docs/solutions/security-practices/jwt-refresh-token-rotation.md (relevance: 0.92)
+- docs/solutions/api-patterns/authentication-middleware-pattern.md (relevance: 0.78)
+- docs/solutions/debugging-techniques/token-expiry-debugging.md (relevance: 0.65)
+```
+
+These results are included in the deepened plan with direct links.
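+
+One way to decide which search hits make it into the plan is a relevance cutoff followed by a descending sort. The sketch below assumes a hypothetical `Learning` type and `select_learnings` helper; the first three paths and scores come from the example above, and the fourth is made up to show the cutoff.
+
+```rust
+// Hypothetical search hit: a document path plus a relevance score in [0, 1].
+#[derive(Debug, Clone, PartialEq)]
+struct Learning { path: String, relevance: f64 }
+
+// Keeps hits at or above `min_relevance`, most relevant first.
+fn select_learnings(mut hits: Vec<Learning>, min_relevance: f64) -> Vec<Learning> {
+    hits.retain(|h| h.relevance >= min_relevance);
+    hits.sort_by(|a, b| b.relevance.total_cmp(&a.relevance));
+    hits
+}
+
+fn sample_hits() -> Vec<Learning> {
+    let hit = |path: &str, relevance| Learning { path: path.to_string(), relevance };
+    vec![
+        hit("docs/solutions/debugging-techniques/token-expiry-debugging.md", 0.65),
+        hit("docs/solutions/security-practices/jwt-refresh-token-rotation.md", 0.92),
+        hit("docs/solutions/misc/unrelated-note.md", 0.31), // made-up low-relevance hit
+        hit("docs/solutions/api-patterns/authentication-middleware-pattern.md", 0.78),
+    ]
+}
+
+fn main() {
+    for l in select_learnings(sample_hits(), 0.5) {
+        println!("- {} (relevance: {:.2})", l.path, l.relevance);
+    }
+}
+```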
+
+---
+
+## Integration with Existing Makima Features
+
+### Contract Phases
+
+Plan deepening occurs during the **Plan phase**, between initial plan creation and phase transition to Execute:
+
+```
+Plan Phase Timeline:
+  1. Supervisor creates initial plan
+  2. makima supervisor deepen-plan    ← NEW
+  3. User reviews deepened plan
+  4. makima supervisor advance-phase execute
+```
+
+### Supervisor/Worker Hierarchy
+
+Research agents are spawned as **worker tasks** under the supervisor. They use the existing `spawn-task` infrastructure together with the `spawn-group`/`wait-group` commands proposed in [Multi-Agent Review](feature-multi-agent-review.md).
+
+### Contract Files
+
+The deepened plan replaces or augments the plan document as a contract file:
+
+```rust
+File {
+    contract_id: contract.id,
+    contract_phase: "plan",
+    name: "Implementation Plan (Deepened)",
+    body: vec![
+        // Enhanced plan content with annotations
+    ],
+}
+```
+
+### Directive System
+
+For directive-based workflows, plan deepening can be added as a step:
+
+```rust
+DirectiveStep {
+    name: "deepen-plan",
+    description: "Enhance implementation plan with parallel research",
+    depends_on: vec![initial_plan_step_id],
+    task_plan: "Run deepen-plan on the initial plan...",
+}
+```
+
+### Phase Guards
+
+If `phase_guard` is enabled, the user reviews the deepened plan before approving transition to execute. This is the natural checkpoint for plan quality.
+
+---
+
+## Implementation Plan
+
+### Phase 1: Core Command (2-3 days)
+
+| Task | Effort | Description |
+|------|--------|-------------|
+| `deepen-plan` command | 1 day | Parse plan, spawn research groups |
+| Research agent templates | 0.5 days | Default prompts for each category |
+| Synthesis logic | 1 day | Merge research into annotated plan |
+| Plan file update | 0.5 days | Write deepened plan as contract file |
+
+### Phase 2: Knowledge Integration (1-2 days)
+
+| Task | Effort | Description |
+|------|--------|-------------|
+| Learning search agent | 0.5 days | Search knowledge base per plan item |
+| Result integration | 0.5 days | Include learning links in plan |
+| Fallback when no KB | 0.5 days | Graceful degradation without KB |
+
+### Phase 3: Configuration & Polish (2-3 days)
+
+| Task | Effort | Description |
+|------|--------|-------------|
+| Config file support | 0.5 days | `.makima/deepen.yaml` |
+| Focus area filtering | 0.5 days | `--focus` flag implementation |
+| Concurrency control | 0.5 days | `--max-agents` limit |
+| Documentation | 0.5 days | User guide |
+| Tests | 1 day | Unit + integration |
+
+---
+
+## Configuration Examples
+
+### Repository-Level Configuration
+
+```yaml
+# .makima/deepen.yaml
+version: 1
+deepen:
+  # Auto-deepen when plan is created
+  auto_trigger: false
+
+  # Maximum agents per plan item
+  max_agents_per_item: 5
+
+  # Total maximum concurrent agents
+  max_concurrent: 20
+
+  # Timeout per research agent (seconds)
+  agent_timeout: 120
+
+  # Research dimensions to include
+  dimensions:
+    - best-practices
+    - edge-cases
+    - security
+    - performance
+    - dependencies
+    - patterns
+    - learnings          # Requires Knowledge Accumulation
+
+  # Minimum plan items to trigger deepening
+  min_plan_items: 3
+
+  # Search learnings (requires Knowledge Accumulation)
+  search_learnings: true
+  search_min_relevance: 0.5
+```
+
+### Inline Usage
+
+```bash
+# Quick deepen with defaults
+makima supervisor deepen-plan
+
+# Focused deepen for security-sensitive work
+makima supervisor deepen-plan --focus security,edge-cases
+
+# Deepen with more agents for complex plans
+makima supervisor deepen-plan --max-agents 30
+
+# Deepen without knowledge base search
+makima supervisor deepen-plan --no-learnings
+```
+
+---
+
+## Open Questions
+
+1. **Plan format parsing**: How should the system parse existing plans to identify discrete items? Markdown headers? Numbered lists? YAML structure?
+2. **Research depth vs. cost**: 20-40 agents per deepening is expensive. Should there be a "lite" mode with fewer agents?
+3. **Deepening multiple times**: Can a plan be deepened iteratively? Should subsequent deepenings build on previous research?
+4. **User-provided context**: Should users be able to provide additional context (e.g., "this project uses PostgreSQL, not MySQL") to guide research?
+5. **Codebase analysis**: Should research agents analyze the existing codebase to find relevant patterns, or only reason from general knowledge?
+6. **Conflicting research**: When research agents disagree (e.g., one says "use Redis" and another says "avoid Redis"), how should the synthesis handle it?
+
+---
+
+## Alternatives Considered
+
+| Alternative | Pros | Cons | Decision |
+|-------------|------|------|----------|
+| Sequential research (one agent) | Simple, cheaper | Slow; misses multi-perspective insights | Rejected — parallel is core value |
+| Automatic deepening (always on) | No manual step | Adds latency to every plan; unnecessary for simple tasks | Optional auto-trigger |
+| Web search integration | Real-time information | Inconsistent quality; potential hallucination from web results | Deferred — consider for v2 |
+| User-provided research questions | Targeted research | Requires user to know what to ask | Complement — support alongside auto-research |
+| LLM-only research (no task spawning) | Simpler, no infrastructure | Limited by single context window; no parallelism | Rejected — defeats the purpose |
+
+---
+
+## Priority & Complexity Assessment
+
+- **Priority: MEDIUM** — Plan deepening significantly improves plan quality, but it is an enhancement to an already-functional planning workflow. The compound engineering plugin's data shows a ~40% reduction in plan revisions.
+- **Complexity: LOW** — This feature is largely a composition of existing primitives (task spawning, group waiting, plan file updates). The main new work is research agent prompts and synthesis logic.
+- **Risk: LOW** — Worst case is slightly better plans. No system changes required. Can be adopted incrementally.
diff --git a/docs/proposals/feature-task-templates.md b/docs/proposals/feature-task-templates.md
new file mode 100644
index 0000000..98abde9
--- /dev/null
+++ b/docs/proposals/feature-task-templates.md
@@ -0,0 +1,602 @@
+# Feature Proposal: Reusable Task Templates & Meta-Commands
+
+> **Priority:** Medium
+> **Complexity:** Medium
+> **Estimated Effort:** 8-12 days
+> **Status:** Proposal
+> **Date:** 2026-02-09
+> **Dependencies:** None (standalone, but complements [Workflow Presets](feature-workflow-presets.md))
+> **Related:** [Overview Analysis](compound-engineering-analysis.md) · [Workflow Presets](feature-workflow-presets.md) · [Multi-Agent Review](feature-multi-agent-review.md)
+
+---
+
+## Problem Statement
+
+Makima tasks are created with **ad-hoc plans** every time:
+
+- **No plan reuse** — even when spawning the same type of task (e.g., "add API endpoint"), the plan is written from scratch
+- **No standardization** — different supervisors produce different quality plans for the same task type
+- **No best practices encoding** — hard-won knowledge about how to structure certain tasks isn't captured
+- **No variable substitution** — plans can't be parameterized for reuse
+- **No validation** — there's no way to verify a plan includes required steps before execution
+- **No meta-creation** — the system cannot create its own task templates or improve its own capabilities
+
+The compound engineering plugin addresses this with meta-commands (`/create-agent-skill`, `/heal-skill`) that allow the system to create and repair its own specialized capabilities.
+
+---
+
+## How Compound Engineering Solves This
+
+### `/create-agent-skill`
+
+Creates new specialized agents and skills on demand:
+
+```bash
+/create-agent-skill "database migration reviewer"
+```
+
+This generates:
+1. An agent definition file with specialized prompts
+2. A skill file that exposes the agent as a command
+3. Registration in the agent/skill registry
+
+### `/heal-skill`
+
+When a skill breaks (e.g., after a dependency change), this meta-command:
+1. Analyzes the error
+2. Identifies the root cause
+3. Patches the skill definition
+4. Tests the fix
+
+The key insight: **the system should be able to improve and extend itself**.
+
+---
+
+## Proposed Makima Implementation
+
+### 1. Task Recipe Format
+
+Task recipes are parameterized plan templates with validation and metadata:
+
+```yaml
+# .makima/recipes/api-endpoint.yaml
+name: api-endpoint
+description: "Create a new REST API endpoint"
+version: 1
+author: "team"
+tags: [api, backend, rest]
+
+# Input variables
+variables:
+  endpoint_name:
+    required: true
+    description: "Name of the endpoint (e.g., 'users', 'orders')"
+    validation: "^[a-z][a-z0-9-]*$"
+
+  http_method:
+    required: true
+    description: "HTTP method"
+    enum: [GET, POST, PUT, PATCH, DELETE]
+    default: GET
+
+  resource_name:
+    required: true
+    description: "Name of the resource/model"
+
+  requires_auth:
+    required: false
+    default: true
+    description: "Whether the endpoint requires authentication"
+
+  database_table:
+    required: false
+    description: "Database table name (if applicable)"
+
+# Plan template with variable substitution
+plan: |
+  ## Task: Create {{ http_method }} /api/{{ endpoint_name }} Endpoint
+
+  ### Step 1: Define Route
+  Add the `{{ http_method }} /api/{{ endpoint_name }}` route to the router.
+  {% if requires_auth %}
+  Apply authentication middleware to this route.
+  {% endif %}
+
+  ### Step 2: Create Handler
+  Create the handler function for {{ endpoint_name }}.
+  {% if database_table %}
+  The handler should query the `{{ database_table }}` table.
+  {% endif %}
+
+  ### Step 3: Request/Response Models
+  Define request and response types for the {{ resource_name }} resource.
+  Include validation for all input fields.
+
+  ### Step 4: Error Handling
+  Implement proper error responses:
+  - 400 for validation errors
+  {% if requires_auth %}
+  - 401 for authentication failures
+  - 403 for authorization failures
+  {% endif %}
+  - 404 for not found
+  - 500 for server errors
+
+  ### Step 5: Tests
+  Write tests covering:
+  - Happy path
+  - Input validation
+  {% if requires_auth %}
+  - Authentication required
+  - Authorization check
+  {% endif %}
+  - Error cases
+  - Edge cases
+
+  ### Step 6: Documentation
+  Update API documentation with:
+  - Endpoint URL and method
+  - Request/response schemas
+  - Example requests and responses
+  - Error codes
+
+# Validation rules — checks that must pass before execution
+validation:
+  - check: "file_exists"
+    path: "src/api/mod.rs"
+    message: "API module must exist"
+  - check: "grep"
+    pattern: "Router"
+    path: "src/api/mod.rs"
+    message: "Router must be defined in API module"
+
+# Expected outputs
+outputs:
+  files:
+    - "src/api/{{ endpoint_name }}.rs"
+    - "src/api/{{ endpoint_name }}_test.rs"
+  tests:
+    - "cargo test {{ endpoint_name }}"
+
+# Metadata for recipe discovery
+metadata:
+  estimated_time: "30-60 minutes"
+  difficulty: "easy"
+  example_usage: |
+    makima recipe run api-endpoint \
+      --var endpoint_name=users \
+      --var http_method=GET \
+      --var resource_name=User \
+      --var database_table=users
+```
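+
+The `{{ variable }}` substitution used in the plan template can be sketched as a small renderer. This is an illustrative assumption about how rendering might work, not a description of makima's template engine: the `render` function is hypothetical, and it handles only `{{ name }}` placeholders, not the `{% if %}` blocks.
+
+```rust
+use std::collections::HashMap;
+
+// Replaces every `{{ name }}` placeholder with its value from `vars`.
+// Returns an error for an unclosed placeholder or an unknown variable.
+fn render(template: &str, vars: &HashMap<&str, &str>) -> Result<String, String> {
+    let mut out = String::new();
+    let mut rest = template;
+    while let Some(start) = rest.find("{{") {
+        out.push_str(&rest[..start]); // Copy the text before the placeholder.
+        let after = &rest[start + 2..];
+        let end = after.find("}}").ok_or_else(|| "unclosed '{{'".to_string())?;
+        let name = after[..end].trim();
+        let value = vars
+            .get(name)
+            .ok_or_else(|| format!("missing variable '{name}'"))?;
+        out.push_str(value);
+        rest = &after[end + 2..];
+    }
+    out.push_str(rest); // Copy whatever follows the last placeholder.
+    Ok(out)
+}
+
+fn main() {
+    let vars = HashMap::from([("http_method", "GET"), ("endpoint_name", "users")]);
+    let title = "## Task: Create {{ http_method }} /api/{{ endpoint_name }} Endpoint";
+    println!("{:?}", render(title, &vars));
+}
+```
+
+Failing on an unknown variable, rather than leaving the placeholder in place, keeps a half-rendered plan from reaching a worker task.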
+
+### 2. Recipe Registry
+
+Recipes are discovered from three sources (same hierarchy as workflow presets):
+
+| Level | Location | Scope |
+|-------|----------|-------|
+| Built-in | Shipped with makima | All users |
+| Repository | `.makima/recipes/` | All users of the repo |
+| User | `~/.makima/recipes/` | Single user |
+
+**Precedence**: User > Repository > Built-in (same name overrides)
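+
+The override rule can be sketched as a merge in which a higher-precedence source replaces a same-named recipe from a lower one. The `Source` and `Recipe` types and the `resolve` function are hypothetical names for this example.
+
+```rust
+use std::collections::HashMap;
+
+// Declared lowest precedence first, so the derived Ord gives BuiltIn < Repository < User.
+#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
+enum Source { BuiltIn, Repository, User }
+
+// Hypothetical recipe record; a real one would also carry the plan and variables.
+#[derive(Debug)]
+struct Recipe { name: String, source: Source }
+
+// Builds the registry; for each name, the recipe from the highest-precedence source wins.
+fn resolve(recipes: Vec<Recipe>) -> HashMap<String, Recipe> {
+    let mut registry: HashMap<String, Recipe> = HashMap::new();
+    for r in recipes {
+        // Keep the existing entry only if it has equal or higher precedence.
+        let keep_existing = registry
+            .get(&r.name)
+            .map_or(false, |existing| existing.source >= r.source);
+        if !keep_existing {
+            registry.insert(r.name.clone(), r);
+        }
+    }
+    registry
+}
+
+fn sample_recipes() -> Vec<Recipe> {
+    vec![
+        Recipe { name: "api-endpoint".into(), source: Source::User },
+        Recipe { name: "api-endpoint".into(), source: Source::BuiltIn },
+        Recipe { name: "custom-validator".into(), source: Source::Repository },
+    ]
+}
+
+fn main() {
+    for (name, recipe) in resolve(sample_recipes()) {
+        println!("{name}: {:?}", recipe.source);
+    }
+}
+```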
+
+### 3. Supervisor Commands
+
+#### List Available Recipes
+
+```bash
+makima recipe list
+
+# Output:
+# NAME              DESCRIPTION                          SOURCE     TAGS
+# api-endpoint      Create a new REST API endpoint        built-in   api, backend
+# db-migration      Create a database migration           built-in   database
+# react-component   Create a React component              built-in   frontend, react
+# unit-test         Create unit tests for a module        built-in   testing
+# bug-fix           Structured bug fix workflow            built-in   debugging
+# custom-validator  Create input validation module         repo       validation
+```
+
+#### Run a Recipe
+
+```bash
+# Run with explicit variables
+makima recipe run api-endpoint \
+  --var endpoint_name=users \
+  --var http_method=GET \
+  --var resource_name=User \
+  --var database_table=users
+
+# Run with interactive variable input
+makima recipe run api-endpoint
+
+# Preview the generated plan (dry run)
+makima recipe preview api-endpoint \
+  --var endpoint_name=users \
+  --var http_method=GET
+```
+
+#### Create a Recipe
+
+```bash
+# Create recipe from scratch
+makima recipe create --name "my-recipe" --edit
+
+# Generate recipe from a completed task (meta-creation)
+makima recipe create --from-task <task-id> --name "my-recipe"
+
+# Generate recipe from a plan file
+makima recipe create --from-plan plan.md --name "my-recipe"
+```
+
+#### Validate a Recipe
+
+```bash
+# Validate recipe file
+makima recipe validate .makima/recipes/my-recipe.yaml
+
+# Validate recipe variables
+makima recipe validate api-endpoint \
+  --var endpoint_name=users \
+  --var http_method=GET
+```
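+
+Variable validation could work along these lines: apply defaults, reject missing required variables, and check `enum` constraints. The `VarSpec` type and `resolve_vars` function are assumptions made for this sketch, and the regex-style `validation` pattern from the recipe format is not covered here.
+
+```rust
+use std::collections::HashMap;
+
+// Hypothetical, simplified view of one entry under `variables:` in a recipe.
+struct VarSpec {
+    required: bool,
+    allowed: Option<Vec<&'static str>>, // Corresponds to `enum:` in the recipe.
+    default: Option<&'static str>,
+}
+
+// Resolves the final variable values, or returns every validation error found.
+fn resolve_vars(
+    specs: &[(&'static str, VarSpec)],
+    given: &HashMap<&str, &str>,
+) -> Result<HashMap<String, String>, Vec<String>> {
+    let mut resolved = HashMap::new();
+    let mut errors = Vec::new();
+    for (name, spec) in specs {
+        // An explicitly given value wins; otherwise fall back to the default.
+        let value = given.get(*name).copied().or(spec.default);
+        match value {
+            None if spec.required => errors.push(format!("missing required variable '{name}'")),
+            None => {} // Optional and unset: nothing to record.
+            Some(v) => {
+                if let Some(allowed) = &spec.allowed {
+                    if !allowed.iter().any(|a| *a == v) {
+                        errors.push(format!("'{v}' is not allowed for '{name}'"));
+                        continue;
+                    }
+                }
+                resolved.insert(name.to_string(), v.to_string());
+            }
+        }
+    }
+    if errors.is_empty() { Ok(resolved) } else { Err(errors) }
+}
+
+// Two of the variables from the api-endpoint recipe.
+fn api_endpoint_specs() -> Vec<(&'static str, VarSpec)> {
+    vec![
+        ("endpoint_name", VarSpec { required: true, allowed: None, default: None }),
+        ("http_method", VarSpec {
+            required: true,
+            allowed: Some(vec!["GET", "POST", "PUT", "PATCH", "DELETE"]),
+            default: Some("GET"),
+        }),
+    ]
+}
+
+fn main() {
+    let given = HashMap::from([("endpoint_name", "users")]);
+    println!("{:?}", resolve_vars(&api_endpoint_specs(), &given));
+}
+```
+
+Collecting all errors before returning lets a user fix every bad `--var` in one pass.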
+
+### 4. Meta-Commands: Self-Improving Templates
+
+The most powerful aspect of the compound engineering plugin is its ability to **create its own capabilities**. Makima can implement similar meta-commands:
+
+#### `makima recipe generate`
+
+The system analyzes completed tasks and suggests recipe templates:
+
+```bash
+# Analyze recent tasks and suggest recipes
+makima recipe generate --analyze-last 20
+
+# Output:
+# Detected patterns:
+# 1. "API endpoint creation" — 7 tasks followed similar pattern
+#    Suggested recipe: api-endpoint (confidence: 0.89)
+#    Variables: endpoint_name, http_method, resource_name
+#
+# 2. "Database migration" — 4 tasks followed similar pattern
+#    Suggested recipe: db-migration (confidence: 0.76)
+#    Variables: table_name, migration_type
+#
+# Generate these recipes? [y/N]
+```
+
+#### `makima recipe heal`
+
+When a recipe fails repeatedly, the system can analyze and fix it:
+
+```bash
+# Analyze recipe failures and suggest fixes
+makima recipe heal api-endpoint
+
+# Output:
+# Analyzed 3 recent failures of 'api-endpoint':
+# Root cause: Step 1 references 'src/api/mod.rs' but project uses 'src/routes/mod.rs'
+# Suggested fix: Change validation path and plan references
+# Apply fix? [y/N]
+```
+
+#### `makima recipe evolve`
+
+Improve recipes based on review findings:
+
+```bash
+# Check if review findings suggest recipe improvements
+makima recipe evolve api-endpoint --from-findings
+
+# Output:
+# Review findings from tasks using 'api-endpoint' recipe:
+# - SEC-001: "Missing rate limiting" (3 occurrences)
+# - PERF-001: "Missing pagination" (2 occurrences)
+#
+# Suggested additions to recipe:
+# 1. Add "Rate Limiting" step after Step 1
+# 2. Add pagination to Step 2 for GET endpoints
+# Apply improvements? [y/N]
+```
+
+### 5. Built-In Recipes
+
+#### `api-endpoint`
+
+Creates a REST API endpoint with handler, models, validation, tests, and docs.
+
+#### `db-migration`
+
+Creates a database migration with up/down scripts, validation, and rollback plan.
+
+```yaml
+name: db-migration
+variables:
+  table_name: { required: true }
+  migration_type: { required: true, enum: [create-table, alter-table, add-index, seed-data] }
+plan: |
+  ## Create Database Migration: {{ migration_type }} on {{ table_name }}
+  ### Step 1: Create migration file
+  ### Step 2: Write up migration
+  ### Step 3: Write down migration (rollback)
+  ### Step 4: Test migration on clean database
+  ### Step 5: Test rollback
+  ### Step 6: Document migration in changelog
+```
+
+#### `react-component`
+
+Creates a React component with props, state, styling, and tests.
+
+#### `unit-test`
+
+Generates unit tests for an existing module by analyzing its public API.
+
+#### `bug-fix`
+
+Structured bug fix workflow: reproduce → root cause → fix → test → document.
+
+```yaml
+name: bug-fix
+variables:
+  bug_description: { required: true }
+  reproduction_steps: { required: false }
+  affected_area: { required: false }
+plan: |
+  ## Bug Fix: {{ bug_description }}
+
+  ### Step 1: Reproduce
+  {% if reproduction_steps %}
+  Follow these reproduction steps: {{ reproduction_steps }}
+  {% else %}
+  Identify and document reproduction steps.
+  {% endif %}
+
+  ### Step 2: Root Cause Analysis
+  Trace the code path to identify the root cause.
+  {% if affected_area %}
+  Start in: {{ affected_area }}
+  {% endif %}
+
+  ### Step 3: Implement Fix
+  Fix the root cause, not just the symptom.
+
+  ### Step 4: Write Regression Test
+  Create a test that would have caught this bug.
+
+  ### Step 5: Verify Fix
+  Run the reproduction steps and confirm the bug is fixed.
+  Run the full test suite to check for regressions.
+
+  ### Step 6: Document
+  Document what caused the bug and how it was fixed.
+```
+
+---
+
+## Integration with Existing Makima Features
+
+### Supervisor Task Spawning
+
+Recipes generate plans that are passed to `spawn-task`:
+
+```rust
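+// Recipe execution: render the plan, then hand it to the supervisor.
+let plan = recipe.render_plan(&variables)?;
+// `get` returns an Option, which has no Display impl, so unwrap it first
+// (assumes `variables` maps String -> String).
+let label = variables.get("primary_var").map(String::as_str).unwrap_or("unnamed");
+let task = spawn_task(SpawnTaskRequest {
+    task_name: format!("{} ({})", recipe.name, label),
+    plan,
+    // ... other params from context
+})?;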
+```
+
+### Contract Files
+
+Recipe definitions can be stored as contract files for versioning:
+
+```rust
+File {
+    contract_id: None, // Global, not contract-specific
+    name: "Recipe: api-endpoint",
+    body: vec![
+        BodyElement::Code { language: Some("yaml"), content: recipe_yaml },
+    ],
+}
+```
+
+### Workflow Presets
+
+Recipes and presets are complementary:
+- **Presets** define the high-level workflow (which phases, what triggers)
+- **Recipes** define the low-level task plans (what each task does)
+
+A preset can reference recipes:
+
+```yaml
+# In a preset
+phases:
+  execute:
+    recipe: api-endpoint    # Use the api-endpoint recipe for this phase's tasks
+    recipe_vars:
+      endpoint_name: "{{ task_description }}"
+```
+
+### Knowledge Accumulation
+
+Recipes can be **evolved** based on learnings:
+- When compound learning captures a pattern, check if it maps to an existing recipe
+- If so, suggest recipe improvements
+- If not, suggest creating a new recipe
+
+### Directive System
+
+For directive-based workflows, recipes can be used as task plan sources:
+
+```rust
+DirectiveStep {
+    name: "create-users-endpoint".to_string(),
+    task_plan: Some(recipe.render_plan(&variables)?),  // Generated from recipe
+    // ...
+}
+```
+
+---
+
+## Implementation Plan
+
+### Phase 1: Core Recipe System (3-4 days)
+
+| Task | Effort | Description |
+|------|--------|-------------|
+| Recipe YAML schema | 0.5 days | Define format, validation rules |
+| YAML parser with Jinja-like templating | 1 day | Variable substitution, conditionals |
+| `recipe list` command | 0.5 days | Discover and list recipes |
+| `recipe run` command | 1 day | Parse, validate, render, spawn task |
+| `recipe preview` command | 0.5 days | Dry-run display |
+
+### Phase 2: Recipe Management (2-3 days)
+
+| Task | Effort | Description |
+|------|--------|-------------|
+| Multi-level discovery | 0.5 days | Built-in, repo, user resolution |
+| `recipe create` command | 1 day | Create from scratch or from task |
+| `recipe validate` command | 0.5 days | YAML validation, variable check |
+| Built-in recipe definitions | 1 day | Write 5 default recipes |
+
+### Phase 3: Meta-Commands (3-5 days)
+
+| Task | Effort | Description |
+|------|--------|-------------|
+| `recipe generate` | 1.5 days | Pattern detection from task history |
+| `recipe heal` | 1 day | Failure analysis and auto-fix |
+| `recipe evolve` | 1 day | Improve recipes from findings/learnings |
+| Recipe versioning | 0.5 days | Version tracking, deprecation |
+| Documentation | 0.5 days | User guide, recipe authoring guide |
+
+---
+
+## Configuration Examples
+
+### Running a Recipe
+
+```bash
+# Simple usage
+makima recipe run api-endpoint \
+  --var endpoint_name=orders \
+  --var http_method=POST \
+  --var resource_name=Order \
+  --var requires_auth=true \
+  --var database_table=orders
+
+# This spawns a task with the rendered plan:
+# "## Task: Create POST /api/orders Endpoint
+#  ### Step 1: Define Route
+#  Add the POST /api/orders route to the router.
+#  Apply authentication middleware to this route.
+#  ..."
+```
+
+### Creating a Recipe from a Completed Task
+
+```bash
+# After completing a successful task
+makima recipe create --from-task abc-123 --name "graphql-resolver"
+
+# Analyzes the task's plan and execution to generate:
+# .makima/recipes/graphql-resolver.yaml
+# with variables extracted from repeated patterns
+```
+
+### Recipe with Validation
+
+```yaml
+# .makima/recipes/react-component.yaml
+name: react-component
+variables:
+  component_name:
+    required: true
+    validation: "^[A-Z][a-zA-Z]*$"  # PascalCase
+  use_typescript:
+    required: false
+    default: true
+  include_tests:
+    required: false
+    default: true
+  styling:
+    required: false
+    enum: [css-modules, styled-components, tailwind]
+    default: css-modules
+
+validation:
+  - check: "file_exists"
+    path: "src/components"
+    message: "Components directory must exist"
+  - check: "not_exists"
+    path: "src/components/{{ component_name }}"
+    message: "Component {{ component_name }} already exists"
+
+plan: |
+  ## Create React Component: {{ component_name }}
+
+  ### Step 1: Component File
+  Create `src/components/{{ component_name }}/{{ component_name }}.{{ 'tsx' if use_typescript else 'jsx' }}`
+  with the component skeleton.
+
+  ### Step 2: Styling
+  {% if styling == 'css-modules' %}
+  Create `{{ component_name }}.module.css` with base styles.
+  {% elif styling == 'styled-components' %}
+  Create styled components in the component file.
+  {% elif styling == 'tailwind' %}
+  Use Tailwind CSS classes directly in the component.
+  {% endif %}
+
+  {% if include_tests %}
+  ### Step 3: Tests
+  Create `{{ component_name }}.test.{{ 'tsx' if use_typescript else 'jsx' }}`
+  with tests for rendering, props, and user interactions.
+  {% endif %}
+
+  ### Step {{ '4' if include_tests else '3' }}: Export
+  Add {{ component_name }} to the components index file.
+
+outputs:
+  files:
+    - "src/components/{{ component_name }}/{{ component_name }}.{{ 'tsx' if use_typescript else 'jsx' }}"
+    - "src/components/{{ component_name }}/index.{{ 'ts' if use_typescript else 'js' }}"
+```
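+
+The `variables` block above implies a resolution step before rendering: apply defaults, reject missing required values, and enforce `enum` choices. A minimal Rust sketch of that step follows. It assumes a hypothetical `VarSpec` struct and `resolve_vars` function (neither is an existing makima API) and leaves out the regex `validation` field to stay within the standard library.
+
+```rust
+use std::collections::HashMap;
+
+/// Hypothetical in-memory form of one entry under `variables:`.
+struct VarSpec {
+    required: bool,
+    default: Option<&'static str>,
+    allowed: &'static [&'static str], // empty slice = no `enum` constraint
+}
+
+/// Fill in defaults and reject missing or out-of-range values.
+fn resolve_vars(
+    specs: &[(&'static str, VarSpec)],
+    given: &HashMap<String, String>,
+) -> Result<HashMap<String, String>, String> {
+    let mut resolved = HashMap::new();
+    for (name, spec) in specs {
+        let value = match (given.get(*name), spec.default) {
+            (Some(v), _) => v.clone(),
+            (None, Some(d)) => d.to_string(),
+            (None, None) if spec.required => {
+                return Err(format!("missing required variable '{}'", name));
+            }
+            (None, None) => continue, // optional and unset: leave undefined
+        };
+        if !spec.allowed.is_empty() && !spec.allowed.contains(&value.as_str()) {
+            return Err(format!("'{}' is not a valid value for '{}'", value, name));
+        }
+        resolved.insert(name.to_string(), value);
+    }
+    Ok(resolved)
+}
+
+fn main() {
+    let specs = [
+        ("component_name", VarSpec { required: true, default: None, allowed: &[] }),
+        ("styling", VarSpec {
+            required: false,
+            default: Some("css-modules"),
+            allowed: &["css-modules", "styled-components", "tailwind"],
+        }),
+    ];
+    let mut given = HashMap::new();
+    given.insert("component_name".to_string(), "UserCard".to_string());
+    let vars = resolve_vars(&specs, &given).unwrap();
+    println!("styling = {}", vars["styling"]);
+}
+```
+
+Running the resolution before any task is spawned would let `recipe validate` and `recipe preview` share the same checks as `recipe run`.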
+
+---
+
+## Open Questions
+
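+1. **Templating language**: Should we use a full Jinja2-like syntax or a simpler `{{ variable }}` substitution? Jinja adds power but also complexity. Note that the recipe examples in this proposal already use conditionals (`{% if %}`) and inline expressions, which plain substitution cannot express.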
+2. **Recipe dependencies**: Can recipes depend on other recipes? (e.g., "api-endpoint requires db-migration to have run first")
+3. **Recipe testing**: How do you test that a recipe produces valid plans? Should recipes have test cases?
+4. **Recipe marketplace**: Should there be a community registry for sharing recipes?
+5. **Pattern detection**: How sophisticated should `recipe generate` be? Simple plan comparison, or full semantic analysis?
+6. **Recipe scope**: Should recipes generate just plans, or also pre-create file scaffolding (like code generators)?
+7. **Backwards compatibility**: When a recipe is updated, what happens to tasks that were created with the old version?
+
+---
+
+## Alternatives Considered
+
+| Alternative | Pros | Cons | Decision |
+|-------------|------|------|----------|
+| Plan library (copy-paste) | Simple | No variables, no validation | Rejected — not reusable enough |
+| Code generators (scaffolding) | Creates actual files | Over-prescriptive; doesn't handle logic | Complement — recipes can reference generators |
+| LLM-only planning | Maximum flexibility | Inconsistent; no standardization | Current state — recipes improve on this |
+| Cookiecutter-style templates | Familiar | Wrong level (project-level vs task-level) | Rejected — different abstraction |
+| Hardcoded task types | Fast | Not extensible; limited variety | Rejected — need flexibility |
+
+---
+
+## Priority & Complexity Assessment
+
+- **Priority: MEDIUM** — Task templates improve consistency and speed but aren't required for makima to function. They become increasingly valuable as the system is used more (patterns emerge).
+- **Complexity: MEDIUM** — YAML parsing and variable substitution are straightforward. Meta-commands (generate, heal, evolve) require sophisticated analysis of task history and are the main complexity drivers.
+- **Risk: LOW-MEDIUM** — Core recipe system is low risk. Meta-commands (auto-generation, healing) involve AI-driven analysis whose output may vary in quality. This is mitigated by requiring human approval before any change is applied.
diff --git a/docs/proposals/feature-workflow-presets.md b/docs/proposals/feature-workflow-presets.md
new file mode 100644
index 0000000..1468a8a
--- /dev/null
+++ b/docs/proposals/feature-workflow-presets.md
@@ -0,0 +1,623 @@
+# Feature Proposal: Workflow Presets / Pipeline Templates
+
+> **Priority:** High
+> **Complexity:** Medium
+> **Estimated Effort:** 10-15 days
+> **Status:** Proposal
+> **Date:** 2026-02-09
+> **Dependencies:** None (foundational feature)
+> **Related:** [Overview Analysis](compound-engineering-analysis.md) · [Multi-Agent Review](feature-multi-agent-review.md) · [Knowledge Accumulation](feature-knowledge-accumulation.md)
+
+---
+
+## Problem Statement
+
+Every makima contract currently requires **manual orchestration**:
+
+- Users must decide which contract type to use (simple, specification, execute)
+- Supervisors must manually spawn tasks, wait for results, advance phases
+- There are **no pre-built pipelines** for common workflows (full feature development, quick bug fix, refactoring, investigation)
+- The supervisor plan must encode the full orchestration logic every time
+- **Repetitive patterns** (plan → execute → test → review) are re-invented for each contract
+- New users face a steep learning curve to orchestrate contracts effectively
+
+The compound engineering plugin's `/lfg` (Let's F***ing Go) and `/slfg` (Super LFG) commands solve this with **one-command full pipelines** that chain all phases automatically.
+
+---
+
+## How Compound Engineering Solves This
+
+### LFG Pipeline (Serial)
+
+```bash
+/lfg "Implement user authentication"
+```
+
+Automatically chains:
+```
+Plan → Deepen Plan → Work → Review → Resolve Findings → Test → Compound → Done
+```
+
+### SLFG Pipeline (Parallel)
+
+```bash
+/slfg "Implement user authentication"
+```
+
+Same as LFG but parallelizes independent steps:
+```
+Plan ──▶ Deepen Plan ──▶ Work ──▶ ┌─ Review ─────┐ ──▶ Test ──▶ Compound
+                                   │  (parallel)  │
+                                   └──────────────┘
+```
+
+The key insight: **most engineering workflows follow predictable patterns** that can be templated and reused.
+
+---
+
+## Proposed Makima Implementation
+
+### 1. Preset Definition Format
+
+Presets are defined in YAML and describe a complete workflow:
+
+```yaml
+# .makima/presets/full-pipeline.yaml
+name: full-pipeline
+description: "Complete feature development pipeline with review and learning"
+contract_type: specification
+version: 1
+
+# Variables that can be substituted at runtime
+variables:
+  task_description:
+    required: true
+    description: "What to build"
+  repository:
+    required: false
+    description: "Target repository URL"
+  base_branch:
+    required: false
+    default: "main"
+    description: "Branch to work from"
+
+# Phase configuration
+phases:
+  research:
+    enabled: true
+    deliverables:
+      - id: research-notes
+        name: "Research Notes"
+        priority: required
+    supervisor_plan: |
+      Research the requirements for: {{ task_description }}
+      - Analyze the existing codebase for relevant patterns
+      - Identify dependencies and constraints
+      - Document findings as research notes
+
+  plan:
+    enabled: true
+    deliverables:
+      - id: plan-document
+        name: "Implementation Plan"
+        priority: required
+    supervisor_plan: |
+      Create an implementation plan for: {{ task_description }}
+      Based on the research findings.
+    # Auto-deepen plan (requires Plan Deepening feature)
+    deepen: true
+    deepen_focus:
+      - edge-cases
+      - security
+      - performance
+
+  execute:
+    enabled: true
+    deliverables:
+      - id: implementation
+        name: "Implementation"
+        priority: required
+    supervisor_plan: |
+      Execute the plan for: {{ task_description }}
+      Follow the deepened plan step by step.
+    # Spawn configuration
+    max_concurrent_tasks: 3
+    completion_action: "branch"
+
+  review:
+    enabled: true
+    deliverables:
+      - id: review-report
+        name: "Review Report"
+        priority: required
+    # Auto-review configuration (requires Multi-Agent Review feature)
+    auto_review: true
+    review_agents:
+      - security-sentinel
+      - performance-oracle
+      - architecture-strategist
+      - test-coverage-analyzer
+    merge_blocking_severity: P1
+
+  compound:
+    enabled: true
+    # Auto-compound (requires Knowledge Accumulation feature)
+    auto_compound: true
+    categories:
+      - architecture-decisions
+      - security-practices
+      - performance-optimizations
+
+# Hooks
+hooks:
+  on_phase_complete:
+    execute:
+      - run: "makima supervisor spawn 'run-tests' --plan 'Run the full test suite'"
+      - wait_for: "run-tests"
+  on_contract_complete:
+    - run: "makima supervisor compound"
+```
+
+### 2. Built-In Presets
+
+#### `full-pipeline` — Complete Feature Development
+
+```
+Research → Plan → Deepen → Execute → Test → Review → Resolve → Compound
+```
+
+Best for: New features, major changes, complex implementations.
+
+#### `quick-fix` — Rapid Bug Fix
+
+```
+Plan → Execute → Test → Done
+```
+
+Best for: Small bug fixes, typo corrections, config changes.
+
+```yaml
+# .makima/presets/quick-fix.yaml
+name: quick-fix
+description: "Fast bug fix with minimal ceremony"
+contract_type: simple
+
+phases:
+  plan:
+    enabled: true
+    deliverables:
+      - id: fix-plan
+        name: "Fix Plan"
+        priority: required
+    supervisor_plan: |
+      Quick analysis and fix plan for: {{ task_description }}
+      Keep it brief — identify the bug and the fix.
+
+  execute:
+    enabled: true
+    deliverables:
+      - id: fix
+        name: "Bug Fix"
+        priority: required
+    supervisor_plan: |
+      Fix the bug: {{ task_description }}
+      Run relevant tests after fixing.
+    completion_action: "branch"
+```
+
+#### `refactor` — Code Refactoring
+
+```
+Research → Plan → Deepen → Execute → Test → Review → Done
+```
+
+Best for: Code restructuring, pattern changes, dependency updates.
+
+```yaml
+# .makima/presets/refactor.yaml
+name: refactor
+description: "Systematic refactoring with safety checks"
+contract_type: specification
+
+phases:
+  research:
+    enabled: true
+    supervisor_plan: |
+      Analyze the codebase to understand the current structure for: {{ task_description }}
+      Document all files that will be affected.
+      Identify dependencies and potential breaking changes.
+
+  plan:
+    enabled: true
+    deepen: true
+    deepen_focus:
+      - edge-cases
+      - patterns
+    supervisor_plan: |
+      Create a step-by-step refactoring plan for: {{ task_description }}
+      Ensure each step maintains a working state (no big-bang changes).
+
+  execute:
+    enabled: true
+    supervisor_plan: |
+      Execute the refactoring plan for: {{ task_description }}
+      After each significant change, run tests to verify nothing is broken.
+    completion_action: "branch"
+
+  review:
+    enabled: true
+    auto_review: true
+    review_agents:
+      - architecture-strategist
+      - test-coverage-analyzer
+    merge_blocking_severity: P1
+```
+
+#### `investigation` — Research & Analysis
+
+```
+Plan → Investigate → Document → Done
+```
+
+Best for: Bug investigation, feasibility analysis, technology evaluation.
+
+```yaml
+# .makima/presets/investigation.yaml
+name: investigation
+description: "Research-focused workflow for analysis and documentation"
+contract_type: simple
+
+phases:
+  plan:
+    enabled: true
+    supervisor_plan: |
+      Plan the investigation for: {{ task_description }}
+      Define what questions need answering and what to examine.
+
+  execute:
+    enabled: true
+    deliverables:
+      - id: investigation-report
+        name: "Investigation Report"
+        priority: required
+    supervisor_plan: |
+      Investigate: {{ task_description }}
+      Document findings thoroughly.
+      Create actionable recommendations.
+    completion_action: "none"
+```
+
+### 3. Preset Discovery & Usage
+
+#### CLI Commands
+
+```bash
+# List available presets
+makima preset list
+# Output:
+# NAME             DESCRIPTION                                  SOURCE
+# full-pipeline    Complete feature development pipeline         built-in
+# quick-fix        Fast bug fix with minimal ceremony            built-in
+# refactor         Systematic refactoring with safety checks     built-in
+# investigation    Research-focused analysis workflow             built-in
+# custom-deploy    Deployment pipeline with staging              .makima/presets/
+
+# Run a preset
+makima preset run full-pipeline \
+  --var task_description="Add user authentication with JWT" \
+  --var repository="github.com/org/repo"
+
+# Run with interactive variable input
+makima preset run full-pipeline
+
+# Preview what a preset will do (dry run)
+makima preset preview full-pipeline \
+  --var task_description="Add user authentication with JWT"
+
+# Create a new preset from an existing contract
+makima preset create --from-contract <contract-id> --name "my-workflow"
+
+# Validate a preset file
+makima preset validate .makima/presets/my-preset.yaml
+```
+
+#### Under the Hood
+
+When `makima preset run full-pipeline` executes:
+
+```
+1. Parse preset YAML
+2. Substitute variables
+3. Create contract with specified type
+4. Configure phases from preset
+5. Create supervisor task with generated plan
+6. Supervisor executes phases according to preset configuration
+7. Auto-triggers (review, compound) fire at appropriate phase transitions
+```
+
+```
+┌─────────────────────────────────────────────────────────┐
+│                    Preset Engine                         │
+│                                                         │
+│  ┌──────────┐    ┌──────────┐    ┌──────────────────┐  │
+│  │  Parse   │───▶│ Variable │───▶│ Create Contract  │  │
+│  │  YAML    │    │  Subst.  │    │ + Supervisor     │  │
+│  └──────────┘    └──────────┘    └────────┬─────────┘  │
+│                                           │             │
+│                  ┌────────────────────────┐│             │
+│                  │  Phase Orchestration   ││             │
+│                  │                        │▼             │
+│                  │  research ──▶ plan ──▶ execute       │
+│                  │                   │         │        │
+│                  │             deepen-plan     │        │
+│                  │             (if enabled)    │        │
+│                  │                             ▼        │
+│                  │                    review ──▶ compound│
+│                  │                    (auto)    (auto)  │
+│                  └────────────────────────────────────┘ │
+└─────────────────────────────────────────────────────────┘
+```
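+
+Step 2 above (variable substitution) could look roughly like the following. This is a sketch under assumptions: `substitute` is a hypothetical helper, it handles only plain `{{ name }}` placeholders, and it treats an undefined variable as an error rather than rendering an empty string.
+
+```rust
+use std::collections::HashMap;
+
+/// Replace every `{{ name }}` placeholder in `template` with its value.
+/// Returns an error naming the first variable that has no value.
+fn substitute(template: &str, vars: &HashMap<&str, &str>) -> Result<String, String> {
+    let mut out = String::with_capacity(template.len());
+    let mut rest = template;
+    while let Some(start) = rest.find("{{") {
+        // Copy the literal text that precedes the placeholder.
+        out.push_str(&rest[..start]);
+        let after_open = &rest[start + 2..];
+        let end = after_open
+            .find("}}")
+            .ok_or_else(|| "unclosed placeholder in template".to_string())?;
+        let name = after_open[..end].trim();
+        let value = vars
+            .get(name)
+            .ok_or_else(|| format!("undefined variable '{}'", name))?;
+        out.push_str(value);
+        rest = &after_open[end + 2..];
+    }
+    // Copy whatever follows the last placeholder.
+    out.push_str(rest);
+    Ok(out)
+}
+
+fn main() {
+    let mut vars = HashMap::new();
+    vars.insert("task_description", "Add OAuth2 login");
+    let plan = substitute("Research the requirements for: {{ task_description }}", &vars);
+    println!("{}", plan.unwrap());
+}
+```
+
+Failing on undefined variables keeps a typo in a `--var` name from silently producing a plan with a blank in it.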
+
+### 4. Custom Preset Creation
+
+Presets are resolved from three levels:
+
+| Level | Location | Scope |
+|-------|----------|-------|
+| Built-in | Shipped with makima | All users |
+| Repository | `.makima/presets/` | All users of the repo |
+| User | `~/.makima/presets/` | Single user |
+
+**Precedence**: User > Repository > Built-in. When presets at different levels share a name, the higher-precedence one overrides the others.
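+
+The precedence rule can be expressed as an ordering on levels. The sketch below assumes a hypothetical `Level` enum and a `resolve` function that operates on an already-discovered list of `(level, name)` pairs; real discovery would read the locations listed in the table.
+
+```rust
+/// Where a preset definition was found, declared from lowest to highest
+/// precedence so that the derived ordering matches the rule.
+#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
+enum Level {
+    BuiltIn,
+    Repository,
+    User,
+}
+
+/// Pick the definition of `name` from the highest-precedence level that has one.
+fn resolve<'a>(name: &str, found: &'a [(Level, &'a str)]) -> Option<(Level, &'a str)> {
+    found
+        .iter()
+        .filter(|(_, preset)| *preset == name)
+        .max_by_key(|(level, _)| *level)
+        .copied()
+}
+
+fn main() {
+    // Hypothetical discovery result: `refactor` exists at two levels.
+    let found = [
+        (Level::BuiltIn, "refactor"),
+        (Level::BuiltIn, "quick-fix"),
+        (Level::Repository, "api-feature"),
+        (Level::User, "refactor"),
+    ];
+    println!("{:?}", resolve("refactor", &found));
+    println!("{:?}", resolve("quick-fix", &found));
+}
+```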
+
+#### Creating from Existing Contract
+
+```bash
+# Analyze a successful contract and generate a preset from it
+makima preset create --from-contract abc-123 --name "my-api-workflow"
+
+# This generates:
+# ~/.makima/presets/my-api-workflow.yaml
+# with phases, timings, and patterns extracted from the contract
+```
+
+---
+
+## Integration with Existing Makima Features
+
+### Contract System
+
+Presets create contracts with the appropriate type:
+```rust
+// Preset specifies contract_type
+let contract = create_contract(CreateContractRequest {
+    name: format!("{} ({})", task_description, preset.name),
+    contract_type: preset.contract_type.clone(), // "simple", "specification", "execute"
+    phase: preset.first_enabled_phase(),
+    autonomous_loop: true,
+    phase_guard: preset.phase_guard,
+    // ...
+});
+```
+
+### Supervisor Plans
+
+The preset generates a comprehensive supervisor plan by combining phase-specific instructions:
+
+```rust
+let supervisor_plan = preset.generate_supervisor_plan(&variables);
+// This produces a plan like:
+// "You are orchestrating a full-pipeline workflow.
+//  Phase 1 (Research): ...
+//  Phase 2 (Plan): ...
+//  ..."
+```
+
+### Directive System Integration
+
+For complex presets, phases can be modeled as directive steps with dependencies:
+
+```rust
+// Each phase becomes a directive step
+let steps: Vec<DirectiveStep> = preset.phases.iter().map(|phase| {
+    DirectiveStep {
+        name: phase.name.clone(),
+        description: Some(phase.description.clone()),
+        task_plan: Some(phase.supervisor_plan.clone()),
+        depends_on: phase.dependencies(),
+        // ...
+    }
+}).collect();
+```
+
+This allows parallel phases (e.g., independent review agents) to execute concurrently while respecting dependencies.
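+
+One way to derive that concurrency from `depends_on` is to group steps into waves, where each wave depends only on earlier waves. The following is an illustrative sketch, not makima's scheduler; `execution_waves` and the step names are assumptions.
+
+```rust
+use std::collections::{BTreeMap, BTreeSet};
+
+/// Group steps into waves: every step in a wave has all of its dependencies
+/// satisfied by earlier waves, so steps within one wave can run in parallel.
+fn execution_waves<'a>(
+    deps: &BTreeMap<&'a str, Vec<&'a str>>,
+) -> Result<Vec<Vec<String>>, String> {
+    let mut done: BTreeSet<&'a str> = BTreeSet::new();
+    let mut waves = Vec::new();
+    while done.len() < deps.len() {
+        // A step is ready when it has not run yet and all its dependencies have.
+        let ready: Vec<&'a str> = deps
+            .iter()
+            .filter(|(step, needs)| {
+                !done.contains(*step) && needs.iter().all(|n| done.contains(n))
+            })
+            .map(|(step, _)| *step)
+            .collect();
+        if ready.is_empty() {
+            return Err("dependency cycle or unknown step".to_string());
+        }
+        done.extend(ready.iter().copied());
+        waves.push(ready.iter().map(|s| s.to_string()).collect());
+    }
+    Ok(waves)
+}
+
+fn main() {
+    let mut deps = BTreeMap::new();
+    deps.insert("research", vec![]);
+    deps.insert("plan", vec!["research"]);
+    deps.insert("execute", vec!["plan"]);
+    deps.insert("security-review", vec!["execute"]);
+    deps.insert("performance-review", vec!["execute"]);
+    deps.insert("compound", vec!["security-review", "performance-review"]);
+    for (i, wave) in execution_waves(&deps).unwrap().iter().enumerate() {
+        println!("wave {}: {}", i + 1, wave.join(", "));
+    }
+}
+```
+
+In this example the two review steps land in the same wave, which matches the SLFG-style parallel review described earlier.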
+
+### Hooks System
+
+Presets define hooks that trigger at phase transitions:
+
+```yaml
+hooks:
+  on_phase_complete:
+    execute:
+      - run: "makima supervisor spawn 'tests' --plan 'Run test suite'"
+      - wait_for: "tests"
+  on_review_complete:
+    - condition: "findings.p1_count == 0"
+      run: "makima supervisor advance-phase compound -y"
+    - condition: "findings.p1_count > 0"
+      run: "makima supervisor ask 'P1 findings detected. Continue?' --choices 'Fix first,Continue anyway'"
+```
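+
+The two review conditions above are complementary, so the hook reduces to a single branch on the P1 count. A small sketch, with `Severity`, `NextStep`, and `after_review` as hypothetical names rather than existing makima types:
+
+```rust
+/// Finding severities; P1 is assumed to be the merge-blocking level.
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+enum Severity {
+    P1,
+    P2,
+    P3,
+}
+
+/// What the review hook does once the review phase completes.
+#[derive(Debug, PartialEq, Eq)]
+enum NextStep {
+    AdvanceToCompound,
+    AskUser(&'static str),
+}
+
+/// Mirror the hook conditions: advance when there are no P1 findings,
+/// otherwise hand the decision to the user.
+fn after_review(findings: &[Severity]) -> NextStep {
+    let p1_count = findings.iter().filter(|s| **s == Severity::P1).count();
+    if p1_count == 0 {
+        NextStep::AdvanceToCompound
+    } else {
+        NextStep::AskUser("P1 findings detected. Continue?")
+    }
+}
+
+fn main() {
+    println!("{:?}", after_review(&[Severity::P2, Severity::P3]));
+    println!("{:?}", after_review(&[Severity::P1, Severity::P3]));
+}
+```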
+
+### Autonomous Loop
+
+Presets work with the existing autonomous loop:
+- Each phase uses `<COMPLETION_GATE>` to signal completion
+- Circuit breaker prevents stuck phases
+- `autonomous_loop: true` on the contract enables automatic continuation
+
+---
+
+## Implementation Plan
+
+### Phase 1: Core Preset Engine (4-5 days)
+
+| Task | Effort | Description |
+|------|--------|-------------|
+| Preset YAML schema definition | 0.5 days | Define YAML format, validation rules |
+| YAML parser with variable substitution | 1 day | Parse presets, substitute `{{ variables }}` |
+| `preset list` command | 0.5 days | Discover and list available presets |
+| `preset run` command | 1.5 days | Create contract + supervisor from preset |
+| `preset preview` command | 0.5 days | Dry-run display |
+| Built-in preset definitions | 1 day | Write 4 default presets |
+
+### Phase 2: Custom Presets (3-5 days)
+
+| Task | Effort | Description |
+|------|--------|-------------|
+| User/repo preset discovery | 1 day | Multi-level preset resolution |
+| `preset create` command | 1.5 days | Generate preset from existing contract |
+| `preset validate` command | 0.5 days | Validate preset YAML |
+| Preset versioning | 1 day | Version field, migration support |
+
+### Phase 3: Integration & Polish (3-5 days)
+
+| Task | Effort | Description |
+|------|--------|-------------|
+| Hooks system | 1.5 days | Phase transition hooks |
+| Auto-trigger integration | 1 day | Wire to review/compound auto-triggers |
+| Directive system integration | 1 day | Complex presets as directive DAGs |
+| Documentation | 0.5 days | User guide, preset authoring guide |
+
+---
+
+## Configuration Examples
+
+### Running a Preset
+
+```bash
+# Simplest usage — one command to run a full pipeline
+makima preset run full-pipeline --var task_description="Add OAuth2 login"
+
+# This creates:
+# - Contract: "Add OAuth2 login (full-pipeline)"
+# - Supervisor task with complete phase orchestration
+# - Auto-review enabled
+# - Auto-compound enabled
+# - All phases configured with deliverables
+```
+
+### Creating a Custom Preset
+
+```yaml
+# .makima/presets/api-feature.yaml
+name: api-feature
+description: "API feature development with schema validation"
+contract_type: specification
+version: 1
+
+variables:
+  feature_name:
+    required: true
+    description: "Name of the API feature"
+  api_version:
+    required: false
+    default: "v1"
+    description: "API version"
+
+phases:
+  research:
+    enabled: true
+    supervisor_plan: |
+      Research existing API patterns in the codebase for {{ api_version }}.
+      Document the current API schema structure.
+      Identify relevant endpoints and data models for {{ feature_name }}.
+
+  plan:
+    enabled: true
+    deepen: true
+    deepen_focus:
+      - api-patterns
+      - security
+      - edge-cases
+    supervisor_plan: |
+      Plan the {{ feature_name }} API feature for {{ api_version }}.
+      Include: endpoint design, request/response schemas, validation rules,
+      error handling, and test cases.
+
+  execute:
+    enabled: true
+    max_concurrent_tasks: 2
+    supervisor_plan: |
+      Implement the {{ feature_name }} API feature.
+      Follow the plan. Create endpoints, handlers, validators, and tests.
+      Run tests after implementation.
+    completion_action: "branch"
+
+  review:
+    enabled: true
+    auto_review: true
+    review_agents:
+      - security-sentinel
+      - api-contract-validator
+      - test-coverage-analyzer
+    merge_blocking_severity: P1
+
+  compound:
+    enabled: true
+    auto_compound: true
+    categories:
+      - api-patterns
+      - security-practices
+```
+
+### Listing Presets
+
+```
+$ makima preset list
+
+BUILT-IN PRESETS
+  full-pipeline     Complete feature development pipeline with review and learning
+  quick-fix         Fast bug fix with minimal ceremony
+  refactor          Systematic refactoring with safety checks
+  investigation     Research-focused analysis workflow
+
+REPOSITORY PRESETS (.makima/presets/)
+  api-feature       API feature development with schema validation
+  migration         Database migration with rollback plan
+
+USER PRESETS (~/.makima/presets/)
+  my-workflow       Custom workflow for frontend development
+```
+
+---
+
+## Open Questions
+
+1. **Preset inheritance**: Should presets be able to extend other presets? (e.g., `extends: full-pipeline` with overrides)
+2. **Conditional phases**: Should phases be conditionally enabled based on runtime conditions? (e.g., skip review for changes under 50 lines)
+3. **Preset parameters validation**: How strict should variable validation be? Allow arbitrary variables or enforce a schema?
+4. **Preset sharing**: Should presets be sharable via a registry or marketplace?
+5. **Preset analytics**: Should we track which presets are most used and their success rates?
+6. **Rollback**: If a preset-driven workflow fails mid-phase, how should recovery work?
+7. **Interactive mode**: Should presets support interactive steps where the user provides input mid-pipeline?
+
+---
+
+## Alternatives Considered
+
+| Alternative | Pros | Cons | Decision |
+|-------------|------|------|----------|
+| Hardcoded pipelines | Simple, predictable | Not customizable; one-size-fits-all | Rejected — need flexibility |
+| Pure CLI scripting | Maximum flexibility | Not portable; error-prone; no validation | Rejected — too fragile |
+| GUI workflow builder | Visual, intuitive | High development cost; not scriptable | Deferred — consider for UI |
+| Contract type expansion | Minimal new concepts | Doesn't solve orchestration; just adds phase combos | Partial — presets use contract types |
+| Makefile-style approach | Familiar to developers | Wrong abstraction level; no variable substitution | Rejected — YAML is better fit |
+
+---
+
+## Priority & Complexity Assessment
+
+- **Priority: HIGH** — Workflow presets are the **gateway feature** that makes all other features accessible. Without presets, users must manually orchestrate review, deepening, and compounding. With presets, these features are activated with a single command.
+- **Complexity: MEDIUM** — YAML parsing and variable substitution are straightforward. Hooks system and directive integration add complexity. Main challenge is designing a preset schema that's flexible enough for diverse workflows without being overwhelming.
+- **Risk: LOW** — Presets are purely additive. They don't change existing behavior. Users can always fall back to manual orchestration.