Diffstat (limited to 'docs/proposals/compound-engineering-analysis.md')
-rw-r--r-- docs/proposals/compound-engineering-analysis.md | 300
1 files changed, 300 insertions, 0 deletions
diff --git a/docs/proposals/compound-engineering-analysis.md b/docs/proposals/compound-engineering-analysis.md
new file mode 100644
index 0000000..5a8c6da
--- /dev/null
+++ b/docs/proposals/compound-engineering-analysis.md
@@ -0,0 +1,300 @@
+# Compound Engineering Plugin — Analysis & Makima Feature Mapping
+
+> **Document Type:** Overview Analysis
+> **Status:** Proposal
+> **Date:** 2026-02-09
+> **Related Proposals:** [Multi-Agent Review](feature-multi-agent-review.md) · [Knowledge Accumulation](feature-knowledge-accumulation.md) · [Plan Deepening](feature-plan-deepening.md) · [Workflow Presets](feature-workflow-presets.md) · [Findings Tracking](feature-findings-tracking.md) · [Task Templates](feature-task-templates.md)
+
+---
+
+## Executive Summary
+
+The [Compound Engineering Plugin](https://github.com/EveryInc/compound-engineering-plugin) is a Claude Code plugin comprising **29 agents, 25 commands, 16 skills, and 1 MCP server**. Its core innovation is a self-reinforcing engineering loop where every unit of work makes subsequent work easier—not harder.
+
+This document analyzes the plugin's architecture, maps its capabilities against makima's existing features, identifies gaps, and proposes a phased adoption strategy. The compound engineering plugin excels at **within-session orchestration** (parallel review agents, plan deepening, knowledge capture), while makima excels at **cross-session orchestration** (contract lifecycle, worktree isolation, DAG-based directives). Combining the two would pair the plugin's within-session depth with makima's persistence across sessions.
+
+---
+
+## Core Philosophy
+
+> *"Each unit of engineering work should make subsequent units easier—not harder."*
+
+The plugin operationalizes this through a four-phase feedback loop where the critical **Compound** step captures learnings that feed back into future planning:
+
+```
+┌─────────────────────────────────────────────────────────┐
+│ │
+│ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
+│ │ │ │ │ │ │ │
+│ │ PLAN │───▶│ WORK │───▶│ REVIEW │ │
+│ │ │ │ │ │ │ │
+│ └──────────┘ └──────────┘ └──────────┘ │
+│ ▲ │ │
+│ │ ┌──────────┐ │ │
+│ │ │ │ │ │
+│ └──────────│ COMPOUND │◀────────┘ │
+│ │ │ │
+│ Learnings fed └──────────┘ Captures solutions, │
+│ back into │ patterns, failures │
+│ future plans ▼ │
+│ docs/solutions/ │
+│ ├── build-errors/ │
+│ ├── test-failures/ │
+│ ├── api-patterns/ │
+│ └── ...9 categories │
+│ │
+└─────────────────────────────────────────────────────────┘
+```
+
+This loop maps onto makima's contract phases, **Research → Specify → Plan → Execute → Review**, with the plugin's Work step corresponding to Execute and a proposed new **Compound** phase added after Review.
+
+---
+
+## Plugin Architecture Overview
+
+### Agent Categories (29 Total)
+
+| Category | Count | Examples |
+|----------|-------|---------|
+| Review Agents | 12-15 | Security Sentinel, Performance Oracle, Architecture Strategist, Code Philosopher, Data Integrity Guardian, Error Resilience Analyzer, API Contract Validator, Dependency Health Checker, Test Coverage Analyzer, Documentation Completeness, Concurrency Safety |
+| Research Agents | 20-40 | Best practices, edge case analysis, dependency research, pattern matching |
+| Learning Agents | 5 | Context extractor, solution documenter, prevention strategist, categorizer, doc linker |
+| Pipeline Agents | ~5 | LFG orchestrator, SLFG parallelizer, phase coordinators |
+| Meta Agents | 2-3 | Agent creator, skill healer, template generator |
+
+### Command Categories (25 Total)
+
+| Category | Key Commands | Description |
+|----------|-------------|-------------|
+| Planning | `/plan`, `/deepen-plan` | Create and enhance implementation plans |
+| Execution | `/lfg`, `/slfg` | Full autonomous pipelines (serial/parallel) |
+| Review | `/parallel-review`, `/review` | Multi-agent code review |
+| Learning | `/compound`, `/search-learnings` | Capture and retrieve knowledge |
+| Meta | `/create-agent-skill`, `/heal-skill` | Self-improving tooling |
+| Findings | `/create-todo`, `/resolve-todo` | Structured issue tracking |
+
+### Skill Categories (16 Total)
+
+Skills provide specialized capabilities including code analysis, pattern detection, security scanning, performance profiling, and documentation generation.
+
+### MCP Server (1)
+
+Provides tool access for agents to interact with the file system, git, and external services during parallel execution.
+
+---
+
+## Agent-Native Architecture Concepts
+
+The compound engineering plugin embraces an **agent-native** design philosophy:
+
+1. **Parallel-First**: Tasks that can be parallelized are always parallelized (review agents, research agents, learning sub-agents)
+2. **Structured Output**: All agent outputs use YAML frontmatter + markdown, enabling machine parsing
+3. **Swarm Orchestration**: Groups of agents with synchronization gates (spawn N → wait for all → synthesize)
+4. **Self-Healing**: Meta-commands detect broken skills and auto-repair them
+5. **Progressive Enhancement**: Plans start simple, then are "deepened" with research results
+
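The structured-output and swarm-orchestration points above can be sketched together. This is a minimal illustration, not the plugin's implementation: `run_agent`, `swarm`, `parse_frontmatter`, and the `agent` and `severity` fields are hypothetical names, and the frontmatter parser handles only flat `key: value` pairs rather than full YAML.

```python
import asyncio


def parse_frontmatter(text: str) -> tuple[dict[str, str], str]:
    """Split '---'-delimited frontmatter from the markdown body.

    Simplified: supports flat 'key: value' lines only, not full YAML.
    """
    lines = [line.rstrip() for line in text.strip().splitlines()]
    if not lines or lines[0] != "---":
        return {}, text.strip()
    end = lines.index("---", 1)  # position of the closing delimiter
    meta = {}
    for line in lines[1:end]:
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    return meta, "\n".join(lines[end + 1:]).strip()


async def run_agent(name: str, severity: str, note: str) -> str:
    """Hypothetical stand-in for a review agent; a real one would call a model."""
    await asyncio.sleep(0)  # yield control to simulate asynchronous work
    return f"---\nagent: {name}\nseverity: {severity}\n---\n{note}\n"


async def swarm(specs):
    # Spawn N agents and wait for all of them (the synchronization gate).
    outputs = await asyncio.gather(*(run_agent(*spec) for spec in specs))
    # Synthesize: parse each structured output, then order by severity.
    findings = []
    for output in outputs:
        meta, body = parse_frontmatter(output)
        findings.append({**meta, "note": body})
    return sorted(findings, key=lambda finding: finding["severity"])


SPECS = [
    ("security", "P2", "Token is logged in plain text."),
    ("performance", "P1", "N+1 query in the list endpoint."),
]
report = asyncio.run(swarm(SPECS))
for finding in report:
    print(finding["severity"], finding["agent"], finding["note"])
```

Because every agent returns the same frontmatter shape, the synthesis step stays a plain parse-and-sort regardless of how many agents are spawned.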
+---
+
+## Mapping to Makima's Architecture
+
+### What Makima Already Has
+
+| Compound Engineering Feature | Makima Equivalent | Coverage |
+|------------------------------|-------------------|----------|
+| Plan → Work → Review loop | Contract phases (Research → Specify → Plan → Execute → Review) | ✅ Full |
+| Task orchestration | Supervisor/worker hierarchy with `spawn-task` | ✅ Full |
+| Parallel task execution | Multiple workers in separate worktrees | ✅ Full |
+| Task isolation | Git worktree per task | ✅ Full |
+| Phase transitions | `supervisor advance-phase` with phase guards | ✅ Full |
+| Pipeline orchestration | Directive system with DAG dependencies | ✅ Full |
+| User interaction during execution | `supervisor ask` with timeout/choices | ✅ Full |
+| Task continuation | `continue_from_task_id`, `--continue` flag | ✅ Full |
+| Branching/forking | `supervisor branch`, `task-fork`, `task-rewind` | ✅ Full |
+| Circuit breakers | CircuitBreaker (max iterations, stuck detection) | ✅ Full |
+| Completion gates | `<COMPLETION_GATE>` parsing in autonomous loop | ✅ Full |
+| Document management | Contract files with versioning, structured body | ✅ Full |
+
+### What Makima Is Missing (Gaps)
+
+| Compound Engineering Feature | Makima Gap | Priority | Proposal |
+|------------------------------|-----------|----------|----------|
+| Multi-agent parallel review | No automated review, no review task templates | **High** | [feature-multi-agent-review.md](feature-multi-agent-review.md) |
+| Compound learning / knowledge accumulation | No cross-contract knowledge capture | **High** | [feature-knowledge-accumulation.md](feature-knowledge-accumulation.md) |
+| Plan deepening with research agents | Single-pass planning, no research integration | **Medium** | [feature-plan-deepening.md](feature-plan-deepening.md) |
+| One-command pipelines (LFG/SLFG) | Manual orchestration per contract | **High** | [feature-workflow-presets.md](feature-workflow-presets.md) |
+| Structured findings/TODOs | Unstructured review output | **Medium** | [feature-findings-tracking.md](feature-findings-tracking.md) |
+| Reusable task/agent templates | Ad-hoc plans, no template reuse | **Medium** | [feature-task-templates.md](feature-task-templates.md) |
+
+---
+
+## Feature Set Summary
+
+| # | Feature | Priority | Complexity | Effort | Proposal |
+|---|---------|----------|------------|--------|----------|
+| 1 | Multi-Agent Parallel Review | High | Medium | 12-18 days | [Link](feature-multi-agent-review.md) |
+| 2 | Knowledge Accumulation | High | Medium | 10-15 days | [Link](feature-knowledge-accumulation.md) |
+| 3 | Plan Deepening | Medium | Low | 5-8 days | [Link](feature-plan-deepening.md) |
+| 4 | Workflow Presets | High | Medium | 10-15 days | [Link](feature-workflow-presets.md) |
+| 5 | Findings Tracking | Medium | Low | 7-10 days | [Link](feature-findings-tracking.md) |
+| 6 | Task Templates | Medium | Medium | 8-12 days | [Link](feature-task-templates.md) |
+| | **Total** | | | **52-78 days** | |
+
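As a quick check on the total row, the per-feature ranges can be summed directly. The conversion to weeks assumes a five-day working week and a single engineer working serially; neither assumption is stated in the table.

```python
# Effort ranges in days, copied from the Feature Set Summary table.
EFFORT_DAYS = {
    "Multi-Agent Parallel Review": (12, 18),
    "Knowledge Accumulation": (10, 15),
    "Plan Deepening": (5, 8),
    "Workflow Presets": (10, 15),
    "Findings Tracking": (7, 10),
    "Task Templates": (8, 12),
}

low = sum(lower for lower, _ in EFFORT_DAYS.values())
high = sum(upper for _, upper in EFFORT_DAYS.values())

# Assumption: 5 working days per week, one engineer, no parallel work.
WORKING_DAYS_PER_WEEK = 5
low_weeks = low / WORKING_DAYS_PER_WEEK
high_weeks = high / WORKING_DAYS_PER_WEEK

print(f"Total effort: {low}-{high} days")
print(f"Serial duration: {low_weeks:.1f}-{high_weeks:.1f} weeks")
```

Under those assumptions, the 13-week phasing in the Implementation Strategy section falls inside the resulting range.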
+---
+
+## Implementation Strategy
+
+### Recommended Phasing
+
+```
+Phase 1: Foundations (Weeks 1-4)
+├── Workflow Presets ────────── Enables one-command pipelines
+└── Findings Tracking ──────── Structured review output format
+
+Phase 2: Core Loop (Weeks 5-9)
+├── Multi-Agent Review ──────── Automated parallel review
+└── Knowledge Accumulation ──── Cross-contract learning
+
+Phase 3: Enhancement (Weeks 10-13)
+├── Plan Deepening ──────────── Research-enhanced planning
+└── Task Templates ──────────── Reusable patterns
+```
+
+**Rationale for ordering:**
+
+1. **Phase 1** builds infrastructure that Phase 2 depends on:
+ - Workflow Presets provide the pipeline framework that Review and Learning plug into
+ - Findings Tracking provides the structured output format that Review agents produce
+
+2. **Phase 2** implements the core compound loop:
+ - Multi-Agent Review produces structured findings
+ - Knowledge Accumulation closes the feedback loop
+
+3. **Phase 3** optimizes the system:
+ - Plan Deepening uses the knowledge base to enhance plans
+ - Task Templates codify proven patterns for reuse
+
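The ordering rationale can be checked by treating the features as a dependency graph and grouping them into waves, much as a DAG-based directive would. This is a sketch using Python's standard `graphlib`. The edges are one reading of the rationale above, and the `Task Templates` edge in particular is an assumption.

```python
from graphlib import TopologicalSorter

# Maps each feature to the features it depends on.
DEPENDS_ON = {
    "Workflow Presets": set(),
    "Findings Tracking": set(),
    "Multi-Agent Review": {"Workflow Presets", "Findings Tracking"},
    "Knowledge Accumulation": {"Workflow Presets"},
    "Plan Deepening": {"Knowledge Accumulation"},
    # Assumption: templates codify patterns drawn from the knowledge base.
    "Task Templates": {"Knowledge Accumulation"},
}


def phases(graph):
    """Group nodes into waves; each wave depends only on earlier waves."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


PHASES = phases(DEPENDS_ON)
for number, features in enumerate(PHASES, start=1):
    print(f"Phase {number}: {', '.join(features)}")
```

With these edges the waves reproduce the three recommended phases, which suggests the phasing is the earliest order the stated dependencies allow.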
+### Integration Points Between Features
+
+```
+ ┌─────────────────┐
+ │ Workflow Presets │
+ │ (orchestrator) │
+ └────────┬────────┘
+ │ triggers phases
+ ┌──────────────┼──────────────┐
+ ▼ ▼ ▼
+ ┌────────────┐ ┌──────────────┐ ┌───────────┐
+ │ Plan │ │ Multi-Agent │ │ Knowledge │
+ │ Deepening │ │ Review │ │ Accum. │
+ └─────┬──────┘ └──────┬───────┘ └─────┬─────┘
+ │ │ │
+ │ produces │ │
+ │ ▼ │
+ │ ┌──────────────┐ │
+ │ │ Findings │ │
+ │ │ Tracking │ │
+ │ └──────────────┘ │
+ │ │
+ └──────── feeds into ──────────────┘
+ │
+ ┌────┴─────┐
+ │ Task │
+ │ Templates│
+ └──────────┘
+ codifies patterns
+```
+
+---
+
+## Competitive Analysis
+
+### Compound Engineering Plugin Strengths
+
+| Strength | Detail |
+|----------|--------|
+| **Depth of review** | 12-15 specialized reviewers catch issues a single reviewer misses |
+| **Knowledge compounding** | Learnings are captured under `docs/solutions/` and fed back into future plans |
+| **One-command pipelines** | `/lfg` runs the full plan→work→review→compound cycle |
+| **Self-improvement** | Meta-commands create new agents/skills on demand |
+| **Swarm patterns** | Sophisticated parallel group management |
+
+### Makima Strengths
+
+| Strength | Detail |
+|----------|--------|
+| **True isolation** | Git worktrees provide real filesystem isolation, not just context isolation |
+| **Persistent orchestration** | Contracts survive across sessions; plugin agents are ephemeral |
+| **DAG execution** | Directives model complex dependency graphs natively |
+| **User interaction** | Rich question/answer system with timeouts and multi-select |
+| **Infrastructure** | Server-based architecture with WebSocket real-time communication |
+| **Checkpoint/recovery** | Full task rewind, fork, and patch-based recovery |
+| **Phase governance** | Phase guards require explicit user approval for transitions |
+
+### Combined Value Proposition
+
+| Dimension | Plugin Alone | Makima Alone | Combined |
+|-----------|-------------|-------------|----------|
+| Review quality | ⭐⭐⭐⭐⭐ | ⭐⭐ | ⭐⭐⭐⭐⭐ |
+| Task isolation | ⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
+| Knowledge retention | ⭐⭐⭐⭐ | ⭐ | ⭐⭐⭐⭐⭐ |
+| Persistent orchestration | ⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
+| Pipeline automation | ⭐⭐⭐⭐ | ⭐⭐ | ⭐⭐⭐⭐⭐ |
+| Self-improvement | ⭐⭐⭐⭐ | ⭐ | ⭐⭐⭐⭐ |
+
+---
+
+## Risk Analysis
+
+### Technical Risks
+
+| Risk | Impact | Likelihood | Mitigation |
+|------|--------|------------|------------|
+| Parallel review agents overwhelm system resources | High | Medium | Implement concurrency limits; use makima's existing CircuitBreaker |
+| Knowledge base grows unwieldy | Medium | High | Implement relevance decay, deduplication, and quality gates |
+| Workflow presets too rigid for diverse use cases | Medium | Medium | Support variable substitution and optional steps |
+| Review synthesis produces noisy/contradictory results | Medium | Medium | Weighted deduplication with priority-based conflict resolution |
+| Template proliferation creates maintenance burden | Low | Medium | Template versioning and deprecation lifecycle |
+
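The table does not define "weighted deduplication with priority-based conflict resolution", so the following is only one possible interpretation. Findings that share a location and issue label are merged, the most severe priority wins when agents disagree, and agent weights are summed into a score. The `Finding` fields, the weights, and the file names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    agent: str
    location: str   # hypothetical "file:line" identifier
    issue: str      # normalized issue label
    priority: int   # 1 = most severe
    weight: float   # trust placed in the reporting agent


def synthesize(findings):
    """Merge duplicate findings and rank the merged results."""
    merged = {}
    for finding in findings:
        key = (finding.location, finding.issue)
        entry = merged.setdefault(key, {
            "location": finding.location,
            "issue": finding.issue,
            "priority": finding.priority,
            "score": 0.0,
            "agents": [],
        })
        # Conflict resolution: the most severe reported priority wins.
        entry["priority"] = min(entry["priority"], finding.priority)
        # Weighting: agreement between agents raises the score.
        entry["score"] += finding.weight
        entry["agents"].append(finding.agent)
    # Most severe first, then highest combined weight.
    return sorted(merged.values(), key=lambda e: (e["priority"], -e["score"]))


RAW = [
    Finding("security", "auth.py:42", "plaintext-token", 1, 1.0),
    Finding("architecture", "auth.py:42", "plaintext-token", 2, 0.5),
    Finding("performance", "db.py:10", "n-plus-one", 2, 1.0),
]
RESULT = synthesize(RAW)
for entry in RESULT:
    print(entry["priority"], entry["location"], entry["issue"], entry["agents"])
```

Summing weights is the simplest scheme that rewards agreement. A real synthesis step would probably also need fuzzy matching, since independent agents rarely produce identical issue labels.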
+### Organizational Risks
+
+| Risk | Impact | Likelihood | Mitigation |
+|------|--------|------------|------------|
+| Scope creep across all 6 features | High | High | Strict phasing; each feature is independently shippable |
+| Users don't adopt knowledge accumulation habits | Medium | Medium | Make it automatic (not opt-in); integrate with workflow presets |
+| Configuration complexity deters users | Medium | Medium | Sensible defaults; progressive disclosure of configuration |
+
+---
+
+## Success Metrics
+
+### Per-Feature Metrics
+
+| Feature | Key Metric | Target |
+|---------|-----------|--------|
+| Multi-Agent Review | Defects caught before merge | 40% increase vs. a single-reviewer baseline |
+| Knowledge Accumulation | Knowledge reuse rate | >30% of new contracts reference existing learnings |
+| Plan Deepening | Plan revision rate after execution starts | <15% (down from an estimated ~40%) |
+| Workflow Presets | Time from contract creation to first commit | 50% reduction |
+| Findings Tracking | Finding resolution rate | >85% of P1/P2 findings resolved |
+| Task Templates | Template reuse rate | >25% of tasks use templates after 3 months |
+
+### System-Level Metrics
+
+- **Cycle time**: Time from contract creation to completion — target 30% reduction
+- **Defect escape rate**: Issues found post-merge — target 50% reduction
+- **Knowledge density**: Learnings per contract — target >2.5 after 6 months
+- **User satisfaction**: Survey score — target >4.2/5.0
+
+---
+
+## Conclusion
+
+The compound engineering plugin represents a mature implementation of agent-native engineering workflows. Its greatest innovations—parallel multi-perspective review, knowledge compounding, and autonomous pipelines—address real gaps in makima's current capabilities.
+
+Makima's infrastructure advantages (true worktree isolation, persistent contracts, DAG-based directives, server architecture) provide a superior foundation for implementing these features. The proposed phased approach delivers incremental value while building toward the full compound engineering loop.
+
+The combined system would offer something neither tool provides alone: **persistent, isolated, knowledge-compounding engineering workflows with multi-agent review and one-command pipeline automation**.
+
+---
+
+*Next steps: Review individual feature proposals for detailed implementation plans.*