Add compound engineering feature proposals for makima (#58)

Analyze the compound engineering plugin (https://github.com/EveryInc/compound-engineering-plugin) and propose 6 features inspired by its patterns for adoption into makima: - Multi-agent parallel review system (spawn-group/wait-group) - Knowledge accumulation / compound learning phase - Parallel plan deepening with research agents - Workflow presets / pipeline templates (LFG-style one-command pipelines) - Structured findings tracking with severity and lifecycle - Reusable task templates with meta-commands Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
author: soryu <soryu@soryu.co> 2026-02-09 16:51:59 +0000
committer: GitHub <noreply@github.com> 2026-02-09 16:51:59 +0000
commit: 76bb9da745f6c12c8e7e587a9096677bbf98f395 (patch)
tree: 5bd856d1018c6fab4700b625e5ffefb344200bf4 /docs/proposals/feature-plan-deepening.md
parent: 268cdce19b1e17128cb8806bee7e0ead1afaa95b (diff)
download: soryu-76bb9da745f6c12c8e7e587a9096677bbf98f395.tar.gz
soryu-76bb9da745f6c12c8e7e587a9096677bbf98f395.zip
1 files changed, 383 insertions, 0 deletions
diff --git a/docs/proposals/feature-plan-deepening.md b/docs/proposals/feature-plan-deepening.md
new file mode 100644
index 0000000..c2d8aeb
--- /dev/null
+++ b/docs/proposals/feature-plan-deepening.md
@@ -0,0 +1,383 @@
+# Feature Proposal: Parallel Plan Deepening
+
+> **Priority:** Medium
+> **Complexity:** Low
+> **Estimated Effort:** 5-8 days
+> **Status:** Proposal
+> **Date:** 2026-02-09
+> **Dependencies:** [Knowledge Accumulation](feature-knowledge-accumulation.md) (recommended, not required)
+> **Related:** [Overview Analysis](compound-engineering-analysis.md) · [Multi-Agent Review](feature-multi-agent-review.md)
+
+---
+
+## Problem Statement
+
+Makima's planning phase currently suffers from **single-pass planning**:
+
+- A supervisor creates a plan based on its immediate analysis of the task
+- **No systematic research** is conducted before finalizing the plan
+- **Edge cases are discovered during execution**, requiring mid-stream plan changes
+- **Best practices are not consulted** — the plan relies solely on the model's training knowledge
+- **Existing project learnings** (if the knowledge accumulation feature exists) are not surfaced during planning
+- **Revision rate is high** — an estimated ~40% of plans require significant changes after execution begins
+
+The result: plans are shallow, execution discovers problems that planning should have caught, and contracts take longer than necessary.
+
+---
+
+## How Compound Engineering Solves This
+
+The compound engineering plugin's `/deepen-plan` command takes an existing plan and enhances it by spawning **20-40 parallel research agents**:
+
+```
+┌──────────────────────────────────────────────────────────────┐
+│                      /deepen-plan                             │
+│                                                              │
+│  Input: Initial plan (from /plan)                            │
+│                                                              │
+│  ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐       │
+│  │ Best     │ │ Edge     │ │ Dep.     │ │ Pattern  │       │
+│  │ Practice │ │ Case     │ │ Research │ │ Matching │       │
+│  │ Agent 1  │ │ Agent 1  │ │ Agent 1  │ │ Agent 1  │       │
+│  └────┬─────┘ └────┬─────┘ └────┬─────┘ └────┬─────┘       │
+│       │            │            │            │              │
+│  ┌────┴─────┐ ┌────┴─────┐ ┌────┴─────┐ ┌────┴─────┐       │
+│  │ Best     │ │ Edge     │ │ Security │ │ Existing │       │
+│  │ Practice │ │ Case     │ │ Concerns │ │ Learning │       │
+│  │ Agent 2  │ │ Agent 2  │ │ Agent    │ │ Agent    │       │
+│  └────┬─────┘ └────┬─────┘ └────┬─────┘ └────┬─────┘       │
+│       │            │            │            │              │
+│  ... (20-40 agents per plan item) ...                        │
+│       │            │            │            │              │
+│       ▼            ▼            ▼            ▼              │
+│  ┌──────────────────────────────────────────────────┐       │
+│  │              Synthesis Agent                      │       │
+│  │  - Merge research into plan                       │       │
+│  │  - Add edge case handling                         │       │
+│  │  - Insert best practice notes                     │       │
+│  │  - Flag risks and dependencies                    │       │
+│  └──────────────────────────────────────────────────┘       │
+│                        │                                     │
+│                        ▼                                     │
+│              Enhanced Plan (Deepened)                         │
+│              - Original steps preserved                      │
+│              - Edge cases added per step                      │
+│              - Best practices annotated                       │
+│              - Risks flagged                                  │
+│              - Dependencies clarified                         │
+└──────────────────────────────────────────────────────────────┘
+```
+
+The key insight: **research is embarrassingly parallel**. Each plan item can be researched independently, and each research dimension (best practices, edge cases, security, etc.) is independent.
+
+---
+
+## Proposed Makima Implementation
+
+### 1. New Supervisor Command: `makima supervisor deepen-plan`
+
+```bash
+# Deepen the current contract's plan
+makima supervisor deepen-plan
+
+# Deepen with specific focus areas
+makima supervisor deepen-plan --focus "security,edge-cases,performance"
+
+# Deepen with explicit plan file reference
+makima supervisor deepen-plan --plan-file plan.md
+
+# Control parallelism
+makima supervisor deepen-plan --max-agents 10
+
+# Include knowledge base search (requires Knowledge Accumulation feature)
+makima supervisor deepen-plan --search-learnings
+```
+
+### 2. Research Agent Categories
+
+Each plan item is researched along multiple dimensions:
+
+| Agent Category | Purpose | Example Output |
+|----------------|---------|----------------|
+| **Best Practices** | Industry standards for the technology/pattern | "Use parameterized queries for all DB operations" |
+| **Edge Cases** | Boundary conditions and error scenarios | "Handle concurrent modification of shared resource" |
+| **Dependency Research** | Compatibility, versions, known issues | "Library X v3 has breaking changes from v2" |
+| **Security Concerns** | Security implications of the planned approach | "JWT stored in localStorage is vulnerable to XSS" |
+| **Performance Implications** | Performance characteristics and bottlenecks | "N+1 query risk with eager loading disabled" |
+| **Pattern Matching** | Similar patterns in the existing codebase | "Module Y already implements this pattern; follow its conventions" |
+| **Existing Learnings** | Prior solutions from knowledge base | "Similar issue solved in contract Z; see docs/solutions/..." |
+
+### 3. Deepening Flow
+
+```
+┌─────────────┐     ┌──────────────────┐     ┌────────────────┐
+│ Original    │     │ Research Phase    │     │ Enhanced Plan  │
+│ Plan        │────▶│                  │────▶│                │
+│             │     │ Per plan item:    │     │ Original +     │
+│ Step 1      │     │ - Best practices │     │ annotations    │
+│ Step 2      │     │ - Edge cases     │     │               │
+│ Step 3      │     │ - Dependencies   │     │ Step 1         │
+│ Step 4      │     │ - Security       │     │  ├ Edge cases  │
+│             │     │ - Performance    │     │  ├ Best pracs  │
+│             │     │ - Patterns       │     │  └ Risks       │
+│             │     │ - Learnings      │     │ Step 2         │
+│             │     │                  │     │  ├ Edge cases  │
+│             │     │ All in parallel  │     │  └ ...         │
+└─────────────┘     └──────────────────┘     └────────────────┘
+```
+
+**Implementation using existing infrastructure:**
+
+```bash
+# Step 1: Parse plan into items
+plan_items=$(makima supervisor get-plan-items)
+
+# Step 2: For each item, spawn research agents as a group
+for item in $plan_items; do
+  makima supervisor spawn-group "deepen-${item.id}" \
+    --tasks "[
+      {\"name\": \"best-practices\", \"plan\": \"Research best practices for: ${item.description}\"},
+      {\"name\": \"edge-cases\", \"plan\": \"Identify edge cases for: ${item.description}\"},
+      {\"name\": \"security\", \"plan\": \"Analyze security implications of: ${item.description}\"},
+      {\"name\": \"performance\", \"plan\": \"Assess performance implications of: ${item.description}\"}
+    ]" \
+    --share-worktree \
+    --read-only
+done
+
+# Step 3: Wait for all groups
+makima supervisor wait-group "deepen-*" --timeout 300
+
+# Step 4: Synthesize results into enhanced plan
+makima supervisor synthesize-plan
+```
+
+### 4. Enhanced Plan Format
+
+The deepened plan augments each step with structured annotations:
+
+```markdown
+## Step 3: Implement JWT Authentication
+
+### Original Plan
+Add JWT-based authentication middleware to the API gateway.
+Generate tokens on login, validate on each request.
+
+### Research Findings
+
+#### Best Practices
+- Use RS256 (asymmetric) for microservices, HS256 for monoliths
+- Set short access token TTL (15 min) with refresh token rotation
+- Include only essential claims (sub, exp, iat, roles)
+- Never store sensitive data in JWT payload (it's base64, not encrypted)
+
+#### Edge Cases
+- Token expiry during long-running requests
+- Clock skew between services (use ±30s leeway)
+- Concurrent refresh token rotation (race condition)
+- Token size exceeding header limits (>8KB with many claims)
+
+#### Security Concerns
+- **P2**: JWT in localStorage is XSS-vulnerable; prefer httpOnly cookies
+- **P3**: Missing CSRF protection if using cookies
+- **P2**: No token revocation mechanism for compromised tokens
+
+#### Performance Notes
+- JWT validation is CPU-bound (RS256 ~1ms per validation)
+- Consider caching decoded tokens for repeated validation
+- Refresh token DB lookup adds latency (~5ms)
+
+#### Existing Learnings
+- See: docs/solutions/security-practices/jwt-refresh-token-rotation.md
+- Previous contract "Auth Service Refactor" used similar pattern
+
+### Risks
+- [ ] Clock skew handling not in original plan
+- [ ] Token revocation strategy needed
+- [ ] CSRF protection if using cookie storage
+```
+
+### 5. Integration with Knowledge Base
+
+When the Knowledge Accumulation feature is available, `deepen-plan` automatically includes a **learning search agent** for each plan item:
+
+```
+Research Agent: "Search existing learnings relevant to JWT authentication"
+
+Results:
+- docs/solutions/security-practices/jwt-refresh-token-rotation.md (relevance: 0.92)
+- docs/solutions/api-patterns/authentication-middleware-pattern.md (relevance: 0.78)
+- docs/solutions/debugging-techniques/token-expiry-debugging.md (relevance: 0.65)
+```
+
+These results are included in the deepened plan with direct links.
+
+---
+
+## Integration with Existing Makima Features
+
+### Contract Phases
+
+Plan deepening occurs during the **Plan phase**, between initial plan creation and phase transition to Execute:
+
+```
+Plan Phase Timeline:
+  1. Supervisor creates initial plan
+  2. makima supervisor deepen-plan    ← NEW
+  3. User reviews deepened plan
+  4. makima supervisor advance-phase execute
+```
+
+### Supervisor/Worker Hierarchy
+
+Research agents are spawned as **worker tasks** under the supervisor. Uses the existing `spawn-task` infrastructure with the proposed `spawn-group`/`wait-group` from the [Multi-Agent Review](feature-multi-agent-review.md) proposal.
+
+### Contract Files
+
+The deepened plan replaces or augments the plan document as a contract file:
+
+```rust
+File {
+    contract_id: contract.id,
+    contract_phase: "plan",
+    name: "Implementation Plan (Deepened)",
+    body: vec![
+        // Enhanced plan content with annotations
+    ],
+}
+```
+
+### Directive System
+
+For directive-based workflows, plan deepening can be added as a step:
+
+```rust
+DirectiveStep {
+    name: "deepen-plan",
+    description: "Enhance implementation plan with parallel research",
+    depends_on: [initial_plan_step_id],
+    task_plan: "Run deepen-plan on the initial plan...",
+}
+```
+
+### Phase Guards
+
+If `phase_guard` is enabled, the user reviews the deepened plan before approving transition to execute. This is the natural checkpoint for plan quality.
+
+---
+
+## Implementation Plan
+
+### Phase 1: Core Command (2-3 days)
+
+| Task | Effort | Description |
+|------|--------|-------------|
+| `deepen-plan` command | 1 day | Parse plan, spawn research groups |
+| Research agent templates | 0.5 days | Default prompts for each category |
+| Synthesis logic | 1 day | Merge research into annotated plan |
+| Plan file update | 0.5 days | Write deepened plan as contract file |
+
+### Phase 2: Knowledge Integration (1-2 days)
+
+| Task | Effort | Description |
+|------|--------|-------------|
+| Learning search agent | 0.5 days | Search knowledge base per plan item |
+| Result integration | 0.5 days | Include learning links in plan |
+| Fallback when no KB | 0.5 days | Graceful degradation without KB |
+
+### Phase 3: Configuration & Polish (2-3 days)
+
+| Task | Effort | Description |
+|------|--------|-------------|
+| Config file support | 0.5 days | `.makima/deepen.yaml` |
+| Focus area filtering | 0.5 days | `--focus` flag implementation |
+| Concurrency control | 0.5 days | `--max-agents` limit |
+| Documentation | 0.5 days | User guide |
+| Tests | 1 day | Unit + integration |
+
+---
+
+## Configuration Examples
+
+### Repository-Level Configuration
+
+```yaml
+# .makima/deepen.yaml
+version: 1
+deepen:
+  # Auto-deepen when plan is created
+  auto_trigger: false
+
+  # Maximum agents per plan item
+  max_agents_per_item: 5
+
+  # Total maximum concurrent agents
+  max_concurrent: 20
+
+  # Timeout per research agent (seconds)
+  agent_timeout: 120
+
+  # Research dimensions to include
+  dimensions:
+    - best-practices
+    - edge-cases
+    - security
+    - performance
+    - dependencies
+    - patterns
+    - learnings          # Requires Knowledge Accumulation
+
+  # Minimum plan items to trigger deepening
+  min_plan_items: 3
+
+  # Search learnings (requires Knowledge Accumulation)
+  search_learnings: true
+  search_min_relevance: 0.5
+```
+
+### Inline Usage
+
+```bash
+# Quick deepen with defaults
+makima supervisor deepen-plan
+
+# Focused deepen for security-sensitive work
+makima supervisor deepen-plan --focus security,edge-cases
+
+# Deepen with more agents for complex plans
+makima supervisor deepen-plan --max-agents 30
+
+# Deepen without knowledge base search
+makima supervisor deepen-plan --no-learnings
+```
+
+---
+
+## Open Questions
+
+1. **Plan format parsing**: How should the system parse existing plans to identify discrete items? Markdown headers? Numbered lists? YAML structure?
+2. **Research depth vs. cost**: 20-40 agents per deepening is expensive. Should there be a "lite" mode with fewer agents?
+3. **Deepening multiple times**: Can a plan be deepened iteratively? Should subsequent deepenings build on previous research?
+4. **User-provided context**: Should users be able to provide additional context (e.g., "this project uses PostgreSQL, not MySQL") to guide research?
+5. **Codebase analysis**: Should research agents analyze the existing codebase to find relevant patterns, or only reason from general knowledge?
+6. **Conflicting research**: When research agents disagree (e.g., one says "use Redis" and another says "avoid Redis"), how should the synthesis handle it?
+
+---
+
+## Alternatives Considered
+
+| Alternative | Pros | Cons | Decision |
+|-------------|------|------|----------|
+| Sequential research (one agent) | Simple, cheaper | Slow; misses multi-perspective insights | Rejected — parallel is core value |
+| Automatic deepening (always on) | No manual step | Adds latency to every plan; unnecessary for simple tasks | Optional auto-trigger |
+| Web search integration | Real-time information | Inconsistent quality; potential hallucination from web results | Deferred — consider for v2 |
+| User-provided research questions | Targeted research | Requires user to know what to ask | Complement — support alongside auto-research |
+| LLM-only research (no task spawning) | Simpler, no infrastructure | Limited by single context window; no parallelism | Rejected — defeats the purpose |
+
+---
+
+## Priority & Complexity Assessment
+
+- **Priority: MEDIUM** — Plan deepening significantly improves plan quality, but it's enhancement over an already-functional planning workflow. The compound engineering plugin's data shows ~40% plan revision reduction.
+- **Complexity: LOW** — This feature is largely a composition of existing primitives (task spawning, group waiting, plan file updates). The main new work is research agent prompts and synthesis logic.
+- **Risk: LOW** — Worst case is slightly better plans. No system changes required. Can be adopted incrementally.
author	soryu <soryu@soryu.co>	2026-02-09 16:51:59 +0000
committer	GitHub <noreply@github.com>	2026-02-09 16:51:59 +0000
commit	76bb9da745f6c12c8e7e587a9096677bbf98f395 (patch)
tree	5bd856d1018c6fab4700b625e5ffefb344200bf4 /docs/proposals/feature-plan-deepening.md
parent	268cdce19b1e17128cb8806bee7e0ead1afaa95b (diff)
download	soryu-76bb9da745f6c12c8e7e587a9096677bbf98f395.tar.gz soryu-76bb9da745f6c12c8e7e587a9096677bbf98f395.zip