Diffstat (limited to 'docs/proposals/compound-engineering-analysis.md')
| -rw-r--r-- | docs/proposals/compound-engineering-analysis.md | 300 |
1 files changed, 300 insertions, 0 deletions
diff --git a/docs/proposals/compound-engineering-analysis.md b/docs/proposals/compound-engineering-analysis.md
new file mode 100644
index 0000000..5a8c6da
--- /dev/null
+++ b/docs/proposals/compound-engineering-analysis.md
@@ -0,0 +1,300 @@
+# Compound Engineering Plugin — Analysis & Makima Feature Mapping
+
+> **Document Type:** Overview Analysis
+> **Status:** Proposal
+> **Date:** 2026-02-09
+> **Related Proposals:** [Multi-Agent Review](feature-multi-agent-review.md) · [Knowledge Accumulation](feature-knowledge-accumulation.md) · [Plan Deepening](feature-plan-deepening.md) · [Workflow Presets](feature-workflow-presets.md) · [Findings Tracking](feature-findings-tracking.md) · [Task Templates](feature-task-templates.md)
+
+---
+
+## Executive Summary
+
+The [Compound Engineering Plugin](https://github.com/EveryInc/compound-engineering-plugin) is a Claude Code plugin comprising **29 agents, 25 commands, 16 skills, and 1 MCP server**. Its core innovation is a self-reinforcing engineering loop where every unit of work makes subsequent work easier—not harder.
+
+This document analyzes the plugin's architecture, maps its capabilities against makima's existing features, identifies gaps, and proposes a phased adoption strategy. The compound engineering plugin excels at **within-session orchestration** (parallel review agents, plan deepening, knowledge capture), while makima excels at **cross-session orchestration** (contract lifecycle, worktree isolation, DAG-based directives). Combining both creates a uniquely powerful system.
+
+---
+
+## Core Philosophy
+
+> *"Each unit of engineering work should make subsequent units easier—not harder."*
+
+The plugin operationalizes this through a four-phase feedback loop where the critical **Compound** step captures learnings that feed back into future planning:
+
+```
+┌─────────────────────────────────────────────────────────┐
+│                                                         │
+│    ┌──────────┐    ┌──────────┐    ┌──────────┐         │
+│    │          │    │          │    │          │         │
+│    │   PLAN   │───▶│   WORK   │───▶│  REVIEW  │         │
+│    │          │    │          │    │          │         │
+│    └──────────┘    └──────────┘    └──────────┘         │
+│         ▲                               │               │
+│         │          ┌──────────┐         │               │
+│         │          │          │         │               │
+│         └──────────│ COMPOUND │◀────────┘               │
+│                    │          │                         │
+│  Learnings fed     └──────────┘  Captures solutions,    │
+│  back into              │        patterns, failures     │
+│  future plans           ▼                               │
+│                  docs/solutions/                        │
+│                  ├── build-errors/                      │
+│                  ├── test-failures/                     │
+│                  ├── api-patterns/                      │
+│                  └── ...9 categories                    │
+│                                                         │
+└─────────────────────────────────────────────────────────┘
+```
+
+This maps directly to makima's contract phases: **Research → Specify → Plan → Execute → Review** with a proposed new **Compound** phase inserted after Review.
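
To make the phase mapping above concrete, here is a minimal Python sketch. It assumes the five contract phases run in a fixed linear order and that the proposed Compound phase is simply appended after Review. The `Phase` enum, `next_phase`, and `run_contract` are illustrative names chosen for this sketch; they are not part of makima or the plugin.

```python
from enum import Enum
from typing import List, Optional


class Phase(Enum):
    """Contract phases named in the text, plus the proposed Compound phase."""

    RESEARCH = "research"
    SPECIFY = "specify"
    PLAN = "plan"
    EXECUTE = "execute"
    REVIEW = "review"
    COMPOUND = "compound"  # proposed addition, not an existing makima phase


# Assumed linear ordering; Enum iteration follows definition order.
PHASE_ORDER = list(Phase)


def next_phase(current: Phase) -> Optional[Phase]:
    """Return the phase after `current`, or None once Compound is done."""
    idx = PHASE_ORDER.index(current)
    return PHASE_ORDER[idx + 1] if idx + 1 < len(PHASE_ORDER) else None


def run_contract(learnings: List[str], new_learning: str) -> List[str]:
    """Walk one contract through every phase.

    The learning captured in Compound is returned with the earlier ones so
    that the next contract can start from them, which is the feedback edge
    drawn from COMPOUND back to PLAN in the diagram.
    """
    phase: Optional[Phase] = PHASE_ORDER[0]
    while phase is not None:
        if phase is Phase.COMPOUND:
            learnings = learnings + [new_learning]
        phase = next_phase(phase)
    return learnings
```

Calling `run_contract` twice and passing the first result into the second call shows the compounding: the second contract starts with whatever the first one captured.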
+
+---
+
+## Plugin Architecture Overview
+
+### Agent Categories (29 Total)
+
+| Category | Count | Examples |
+|----------|-------|---------|
+| Review Agents | 12-15 | Security Sentinel, Performance Oracle, Architecture Strategist, Code Philosopher, Data Integrity Guardian, Error Resilience Analyzer, API Contract Validator, Dependency Health Checker, Test Coverage Analyzer, Documentation Completeness, Concurrency Safety |
+| Research Agents | 20-40 | Best practices, edge case analysis, dependency research, pattern matching |
+| Learning Agents | 5 | Context extractor, solution documenter, prevention strategist, categorizer, doc linker |
+| Pipeline Agents | ~5 | LFG orchestrator, SLFG parallelizer, phase coordinators |
+| Meta Agents | 2-3 | Agent creator, skill healer, template generator |
+
+### Command Categories (25 Total)
+
+| Category | Key Commands | Description |
+|----------|-------------|-------------|
+| Planning | `/plan`, `/deepen-plan` | Create and enhance implementation plans |
+| Execution | `/lfg`, `/slfg` | Full autonomous pipelines (serial/parallel) |
+| Review | `/parallel-review`, `/review` | Multi-agent code review |
+| Learning | `/compound`, `/search-learnings` | Capture and retrieve knowledge |
+| Meta | `/create-agent-skill`, `/heal-skill` | Self-improving tooling |
+| Findings | `/create-todo`, `/resolve-todo` | Structured issue tracking |
+
+### Skill Categories (16 Total)
+
+Skills provide specialized capabilities including code analysis, pattern detection, security scanning, performance profiling, and documentation generation.
+
+### MCP Server (1)
+
+Provides tool access for agents to interact with the file system, git, and external services during parallel execution.
+
+---
+
+## Agent-Native Architecture Concepts
+
+The compound engineering plugin embraces an **agent-native** design philosophy:
+
+1. **Parallel-First**: Tasks that can be parallelized are always parallelized (review agents, research agents, learning sub-agents)
+2. **Structured Output**: All agent outputs use YAML frontmatter + markdown, enabling machine parsing
+3. **Swarm Orchestration**: Groups of agents with synchronization gates (spawn N → wait for all → synthesize)
+4. **Self-Healing**: Meta-commands detect broken skills and auto-repair them
+5. **Progressive Enhancement**: Plans start simple, then are "deepened" with research results
+
+---
+
+## Mapping to Makima's Architecture
+
+### What Makima Already Has
+
+| Compound Engineering Feature | Makima Equivalent | Coverage |
+|------------------------------|-------------------|----------|
+| Plan → Work → Review loop | Contract phases (Research → Specify → Plan → Execute → Review) | ✅ Full |
+| Task orchestration | Supervisor/worker hierarchy with `spawn-task` | ✅ Full |
+| Parallel task execution | Multiple workers in separate worktrees | ✅ Full |
+| Task isolation | Git worktree per task | ✅ Full |
+| Phase transitions | `supervisor advance-phase` with phase guards | ✅ Full |
+| Pipeline orchestration | Directive system with DAG dependencies | ✅ Full |
+| User interaction during execution | `supervisor ask` with timeout/choices | ✅ Full |
+| Task continuation | `continue_from_task_id`, `--continue` flag | ✅ Full |
+| Branching/forking | `supervisor branch`, `task-fork`, `task-rewind` | ✅ Full |
+| Circuit breakers | CircuitBreaker (max iterations, stuck detection) | ✅ Full |
+| Completion gates | `<COMPLETION_GATE>` parsing in autonomous loop | ✅ Full |
+| Document management | Contract files with versioning, structured body | ✅ Full |
+
+### What Makima Is Missing (Gaps)
+
+| Compound Engineering Feature | Makima Gap | Priority | Proposal |
+|------------------------------|-----------|----------|----------|
+| Multi-agent parallel review | No automated review, no review task templates | **High** | [feature-multi-agent-review.md](feature-multi-agent-review.md) |
+| Compound learning / knowledge accumulation | No cross-contract knowledge capture | **High** | [feature-knowledge-accumulation.md](feature-knowledge-accumulation.md) |
+| Plan deepening with research agents | Single-pass planning, no research integration | **Medium** | [feature-plan-deepening.md](feature-plan-deepening.md) |
+| One-command pipelines (LFG/SLFG) | Manual orchestration per contract | **High** | [feature-workflow-presets.md](feature-workflow-presets.md) |
+| Structured findings/TODOs | Unstructured review output | **Medium** | [feature-findings-tracking.md](feature-findings-tracking.md) |
+| Reusable task/agent templates | Ad-hoc plans, no template reuse | **Medium** | [feature-task-templates.md](feature-task-templates.md) |
+
+---
+
+## Feature Set Summary
+
+| # | Feature | Priority | Complexity | Effort | Proposal |
+|---|---------|----------|------------|--------|----------|
+| 1 | Multi-Agent Parallel Review | High | Medium | 12-18 days | [Link](feature-multi-agent-review.md) |
+| 2 | Knowledge Accumulation | High | Medium | 10-15 days | [Link](feature-knowledge-accumulation.md) |
+| 3 | Plan Deepening | Medium | Low | 5-8 days | [Link](feature-plan-deepening.md) |
+| 4 | Workflow Presets | High | Medium | 10-15 days | [Link](feature-workflow-presets.md) |
+| 5 | Findings Tracking | Medium | Low | 7-10 days | [Link](feature-findings-tracking.md) |
+| 6 | Task Templates | Medium | Medium | 8-12 days | [Link](feature-task-templates.md) |
+| | **Total** | | | **52-78 days** | |
+
+---
+
+## Implementation Strategy
+
+### Recommended Phasing
+
+```
+Phase 1: Foundations (Weeks 1-4)
+├── Workflow Presets ────────── Enables one-command pipelines
+└── Findings Tracking ──────── Structured review output format
+
+Phase 2: Core Loop (Weeks 5-9)
+├── Multi-Agent Review ──────── Automated parallel review
+└── Knowledge Accumulation ──── Cross-contract learning
+
+Phase 3: Enhancement (Weeks 10-13)
+├── Plan Deepening ──────────── Research-enhanced planning
+└── Task Templates ──────────── Reusable patterns
+```
+
+**Rationale for ordering:**
+
+1. **Phase 1** builds infrastructure that Phase 2 depends on:
+   - Workflow Presets provide the pipeline framework that Review and Learning plug into
+   - Findings Tracking provides the structured output format that Review agents produce
+
+2. **Phase 2** implements the core compound loop:
+   - Multi-Agent Review produces structured findings
+   - Knowledge Accumulation closes the feedback loop
+
+3. **Phase 3** optimizes the system:
+   - Plan Deepening uses the knowledge base to enhance plans
+   - Task Templates codify proven patterns for reuse
+
+### Integration Points Between Features
+
+```
+             ┌──────────────────┐
+             │ Workflow Presets │
+             │  (orchestrator)  │
+             └────────┬─────────┘
+                      │ triggers phases
+      ┌───────────────┼───────────────┐
+      ▼               ▼               ▼
+┌────────────┐ ┌──────────────┐ ┌───────────┐
+│    Plan    │ │ Multi-Agent  │ │ Knowledge │
+│ Deepening  │ │    Review    │ │  Accum.   │
+└─────┬──────┘ └──────┬───────┘ └─────┬─────┘
+      │               │               │
+      │     produces  │               │
+      │               ▼               │
+      │        ┌──────────────┐       │
+      │        │   Findings   │       │
+      │        │   Tracking   │       │
+      │        └──────────────┘       │
+      │                               │
+      └──────── feeds into ───────────┘
+                      │
+                 ┌────┴─────┐
+                 │   Task   │
+                 │ Templates│
+                 └──────────┘
+              codifies patterns
+```
+
+---
+
+## Competitive Analysis
+
+### Compound Engineering Plugin Strengths
+
+| Strength | Detail |
+|----------|--------|
+| **Depth of review** | 12-15 specialized reviewers catch issues a single reviewer misses |
+| **Knowledge compounding** | Learnings are never lost; they compound over time |
+| **One-command pipelines** | `/lfg` runs full plan→work→review→compound cycle |
+| **Self-improvement** | Meta-commands create new agents/skills on demand |
+| **Swarm patterns** | Sophisticated parallel group management |
+
+### Makima Strengths
+
+| Strength | Detail |
+|----------|--------|
+| **True isolation** | Git worktrees provide real filesystem isolation, not just context isolation |
+| **Persistent orchestration** | Contracts survive across sessions; plugin agents are ephemeral |
+| **DAG execution** | Directives model complex dependency graphs natively |
+| **User interaction** | Rich question/answer system with timeouts and multi-select |
+| **Infrastructure** | Server-based architecture with WebSocket real-time communication |
+| **Checkpoint/recovery** | Full task rewind, fork, and patch-based recovery |
+| **Phase governance** | Phase guards require explicit user approval for transitions |
+
+### Combined Value Proposition
+
+| Dimension | Plugin Alone | Makima Alone | Combined |
+|-----------|-------------|-------------|----------|
+| Review quality | ⭐⭐⭐⭐⭐ | ⭐⭐ | ⭐⭐⭐⭐⭐ |
+| Task isolation | ⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
+| Knowledge retention | ⭐⭐⭐⭐ | ⭐ | ⭐⭐⭐⭐⭐ |
+| Persistent orchestration | ⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
+| Pipeline automation | ⭐⭐⭐⭐ | ⭐⭐ | ⭐⭐⭐⭐⭐ |
+| Self-improvement | ⭐⭐⭐⭐ | ⭐ | ⭐⭐⭐⭐ |
+
+---
+
+## Risk Analysis
+
+### Technical Risks
+
+| Risk | Impact | Likelihood | Mitigation |
+|------|--------|------------|------------|
+| Parallel review agents overwhelm system resources | High | Medium | Implement concurrency limits; use makima's existing CircuitBreaker |
+| Knowledge base grows unwieldy | Medium | High | Implement relevance decay, deduplication, and quality gates |
+| Workflow presets too rigid for diverse use cases | Medium | Medium | Support variable substitution and optional steps |
+| Review synthesis produces noisy/contradictory results | Medium | Medium | Weighted deduplication with priority-based conflict resolution |
+| Template proliferation creates maintenance burden | Low | Medium | Template versioning and deprecation lifecycle |
+
+### Organizational Risks
+
+| Risk | Impact | Likelihood | Mitigation |
+|------|--------|------------|------------|
+| Scope creep across all 6 features | High | High | Strict phasing; each feature is independently shippable |
+| Users don't adopt knowledge accumulation habits | Medium | Medium | Make it automatic (not opt-in); integrate with workflow presets |
+| Configuration complexity deters users | Medium | Medium | Sensible defaults; progressive disclosure of configuration |
+
+---
+
+## Success Metrics
+
+### Per-Feature Metrics
+
+| Feature | Key Metric | Target |
+|---------|-----------|--------|
+| Multi-Agent Review | Defects caught before merge | 40% increase vs single review |
+| Knowledge Accumulation | Knowledge reuse rate | >30% of new contracts reference existing learnings |
+| Plan Deepening | Plan revision rate after execution starts | <15% (down from estimated ~40%) |
+| Workflow Presets | Time from contract creation to first commit | 50% reduction |
+| Findings Tracking | Finding resolution rate | >85% of P1/P2 findings resolved |
+| Task Templates | Template reuse rate | >25% of tasks use templates after 3 months |
+
+### System-Level Metrics
+
+- **Cycle time**: Time from contract creation to completion — target 30% reduction
+- **Defect escape rate**: Issues found post-merge — target 50% reduction
+- **Knowledge density**: Learnings per contract — target >2.5 after 6 months
+- **User satisfaction**: Survey score — target >4.2/5.0
+
+---
+
+## Conclusion
+
+The compound engineering plugin represents a mature implementation of agent-native engineering workflows. Its greatest innovations—parallel multi-perspective review, knowledge compounding, and autonomous pipelines—address real gaps in makima's current capabilities.
+
+Makima's infrastructure advantages (true worktree isolation, persistent contracts, DAG-based directives, server architecture) provide a superior foundation for implementing these features. The proposed phased approach delivers incremental value while building toward the full compound engineering loop.
+
+The combined system would offer something neither tool provides alone: **persistent, isolated, knowledge-compounding engineering workflows with multi-agent review and one-command pipeline automation**.
+
+---
+
+*Next steps: Review individual feature proposals for detailed implementation plans.*
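
As a cross-check on the Recommended Phasing and the Feature Set Summary table, the Python sketch below encodes the six features as a small dependency graph, checks that no feature is scheduled before something it builds on, and sums the effort ranges. The effort figures and phase numbers are copied from the tables in the proposal. The dependency edges are an assumption: they are one reading of the ordering rationale and the integration diagram, which the proposal does not state in this form. `FEATURES`, `build_order`, `phase_violations`, and `total_effort` are hypothetical names.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# feature -> (features it builds on, proposed phase, (min_days, max_days))
# Phases and effort come from the proposal; the dependency sets are ASSUMED.
FEATURES = {
    "Workflow Presets": (set(), 1, (10, 15)),
    "Findings Tracking": (set(), 1, (7, 10)),
    "Multi-Agent Review": ({"Workflow Presets", "Findings Tracking"}, 2, (12, 18)),
    "Knowledge Accumulation": ({"Workflow Presets"}, 2, (10, 15)),
    "Plan Deepening": ({"Knowledge Accumulation"}, 3, (5, 8)),
    "Task Templates": ({"Plan Deepening", "Knowledge Accumulation"}, 3, (8, 12)),
}


def build_order(features):
    """Return one valid build order, with dependencies before dependents."""
    graph = {name: deps for name, (deps, _, _) in features.items()}
    return list(TopologicalSorter(graph).static_order())


def phase_violations(features):
    """List (feature, dependency) pairs where the dependency lands in a later phase."""
    return [
        (name, dep)
        for name, (deps, phase, _) in features.items()
        for dep in deps
        if features[dep][1] > phase
    ]


def total_effort(features):
    """Sum the low and high ends of every feature's effort range, in days."""
    low = sum(effort[0] for _, _, effort in features.values())
    high = sum(effort[1] for _, _, effort in features.values())
    return low, high


if __name__ == "__main__":
    print(build_order(FEATURES))
    print(phase_violations(FEATURES))
    print(total_effort(FEATURES))
```

Under these assumed edges, `phase_violations` returns an empty list for the proposed schedule, and `total_effort` reproduces the table's 52-78 day total. If the individual proposals reveal different dependencies, only the sets in `FEATURES` need to change.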
