author: soryu <soryu@soryu.co>, 2026-02-09 16:51:59 +0000
committer: GitHub <noreply@github.com>, 2026-02-09 16:51:59 +0000
commit: 76bb9da745f6c12c8e7e587a9096677bbf98f395
tree: 5bd856d1018c6fab4700b625e5ffefb344200bf4 (/docs/proposals/feature-plan-deepening.md)
parent: 268cdce19b1e17128cb8806bee7e0ead1afaa95b
Add compound engineering feature proposals for makima (#58)
Analyze the compound engineering plugin (https://github.com/EveryInc/compound-engineering-plugin) and propose 6 features inspired by its patterns for adoption into makima:

- Multi-agent parallel review system (spawn-group/wait-group)
- Knowledge accumulation / compound learning phase
- Parallel plan deepening with research agents
- Workflow presets / pipeline templates (LFG-style one-command pipelines)
- Structured findings tracking with severity and lifecycle
- Reusable task templates with meta-commands

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Diffstat (limited to 'docs/proposals/feature-plan-deepening.md')
-rw-r--r-- docs/proposals/feature-plan-deepening.md 383
1 files changed, 383 insertions, 0 deletions
diff --git a/docs/proposals/feature-plan-deepening.md b/docs/proposals/feature-plan-deepening.md
new file mode 100644
index 0000000..c2d8aeb
--- /dev/null
+++ b/docs/proposals/feature-plan-deepening.md
@@ -0,0 +1,383 @@
+# Feature Proposal: Parallel Plan Deepening
+
+> **Priority:** Medium
+> **Complexity:** Low
+> **Estimated Effort:** 5-8 days
+> **Status:** Proposal
+> **Date:** 2026-02-09
+> **Dependencies:** [Multi-Agent Review](feature-multi-agent-review.md) (for the proposed `spawn-group`/`wait-group` commands) · [Knowledge Accumulation](feature-knowledge-accumulation.md) (recommended, not required)
+> **Related:** [Overview Analysis](compound-engineering-analysis.md)
+
+---
+
+## Problem Statement
+
+Makima's planning phase currently suffers from **single-pass planning**:
+
+- A supervisor creates a plan based on its immediate analysis of the task
+- **No systematic research** is conducted before finalizing the plan
+- **Edge cases are discovered during execution**, requiring mid-stream plan changes
+- **Best practices are not consulted** — the plan relies solely on the model's training knowledge
+- **Existing project learnings** (if the knowledge accumulation feature exists) are not surfaced during planning
+- **Revision rate is high** — an estimated ~40% of plans require significant changes after execution begins
+
+The result: plans are shallow, execution discovers problems that planning should have caught, and contracts take longer than necessary.
+
+---
+
+## How Compound Engineering Solves This
+
+The compound engineering plugin's `/deepen-plan` command takes an existing plan and enhances it by spawning **20-40 parallel research agents**:
+
+```
+┌──────────────────────────────────────────────────────────────┐
+│                         /deepen-plan                         │
+│                                                              │
+│  Input: Initial plan (from /plan)                            │
+│                                                              │
+│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐      │
+│  │ Best     │  │ Edge     │  │ Dep.     │  │ Pattern  │      │
+│  │ Practice │  │ Case     │  │ Research │  │ Matching │      │
+│  │ Agent 1  │  │ Agent 1  │  │ Agent 1  │  │ Agent 1  │      │
+│  └────┬─────┘  └────┬─────┘  └────┬─────┘  └────┬─────┘      │
+│       │             │             │             │            │
+│  ┌────┴─────┐  ┌────┴─────┐  ┌────┴─────┐  ┌────┴─────┐      │
+│  │ Best     │  │ Edge     │  │ Security │  │ Existing │      │
+│  │ Practice │  │ Case     │  │ Concerns │  │ Learning │      │
+│  │ Agent 2  │  │ Agent 2  │  │ Agent    │  │ Agent    │      │
+│  └────┬─────┘  └────┬─────┘  └────┬─────┘  └────┬─────┘      │
+│       │             │             │             │            │
+│             ... (20-40 agents per plan item) ...             │
+│       │             │             │             │            │
+│       ▼             ▼             ▼             ▼            │
+│  ┌──────────────────────────────────────────────────┐        │
+│  │                 Synthesis Agent                  │        │
+│  │  - Merge research into plan                      │        │
+│  │  - Add edge case handling                        │        │
+│  │  - Insert best practice notes                    │        │
+│  │  - Flag risks and dependencies                   │        │
+│  └──────────────────────────────────────────────────┘        │
+│                            │                                 │
+│                            ▼                                 │
+│                 Enhanced Plan (Deepened)                     │
+│                 - Original steps preserved                   │
+│                 - Edge cases added per step                  │
+│                 - Best practices annotated                   │
+│                 - Risks flagged                              │
+│                 - Dependencies clarified                     │
+└──────────────────────────────────────────────────────────────┘
+```
+
+The key insight: **research is embarrassingly parallel**. Each plan item can be researched independently, and each research dimension (best practices, edge cases, security, etc.) is independent.
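A minimal sketch of that fan-out, written in Python purely for illustration (the proposal does not prescribe an implementation language). The `research` function is a hypothetical stand-in for a research agent, the dimension names are taken from the diagram above, and a thread pool plays the role of the parallel agents.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

# Illustrative dimension names, taken from the diagram above.
DIMENSIONS = ["best-practices", "edge-cases", "security", "patterns"]


def research(item: str, dimension: str) -> str:
    """Hypothetical stand-in for one research agent; a real agent would call a model."""
    return f"[{dimension}] findings for '{item}'"


def deepen(plan_items: list[str], max_workers: int = 8) -> dict[str, list[str]]:
    """Research every (item, dimension) pair independently, then group by item."""
    pairs = list(product(plan_items, DIMENSIONS))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order, so results line up with `pairs`.
        results = list(pool.map(lambda pair: research(*pair), pairs))
    findings: dict[str, list[str]] = {item: [] for item in plan_items}
    for (item, _dimension), result in zip(pairs, results):
        findings[item].append(result)
    return findings


if __name__ == "__main__":
    for item, notes in deepen(["Add login endpoint", "Add JWT middleware"]).items():
        print(item, notes)
```

Because no pair depends on another, the only coordination point is the final grouping step, which is where a synthesis agent would take over.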
+
+---
+
+## Proposed Makima Implementation
+
+### 1. New Supervisor Command: `makima supervisor deepen-plan`
+
+```bash
+# Deepen the current contract's plan
+makima supervisor deepen-plan
+
+# Deepen with specific focus areas
+makima supervisor deepen-plan --focus "security,edge-cases,performance"
+
+# Deepen with explicit plan file reference
+makima supervisor deepen-plan --plan-file plan.md
+
+# Control parallelism
+makima supervisor deepen-plan --max-agents 10
+
+# Include knowledge base search (requires Knowledge Accumulation feature)
+makima supervisor deepen-plan --search-learnings
+```
+
+### 2. Research Agent Categories
+
+Each plan item is researched along multiple dimensions:
+
+| Agent Category | Purpose | Example Output |
+|----------------|---------|----------------|
+| **Best Practices** | Industry standards for the technology/pattern | "Use parameterized queries for all DB operations" |
+| **Edge Cases** | Boundary conditions and error scenarios | "Handle concurrent modification of shared resource" |
+| **Dependency Research** | Compatibility, versions, known issues | "Library X v3 has breaking changes from v2" |
+| **Security Concerns** | Security implications of the planned approach | "JWT stored in localStorage is vulnerable to XSS" |
+| **Performance Implications** | Performance characteristics and bottlenecks | "N+1 query risk with eager loading disabled" |
+| **Pattern Matching** | Similar patterns in the existing codebase | "Module Y already implements this pattern; follow its conventions" |
+| **Existing Learnings** | Prior solutions from knowledge base | "Similar issue solved in contract Z; see docs/solutions/..." |
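As a rough illustration, the categories in this table could be turned into a task list with one prompt template per dimension. This Python sketch is an assumption, not makima's API: the template wording is invented for the example, `build_tasks` is a hypothetical helper, and the `name`/`plan` fields mirror the task JSON used in the spawn-group sketch in this proposal.

```python
import json

# One prompt template per research dimension. The wording is illustrative only.
TEMPLATES = {
    "best-practices": "Research best practices for: {desc}",
    "edge-cases": "Identify edge cases for: {desc}",
    "dependencies": "Research dependency compatibility and known issues for: {desc}",
    "security": "Analyze security implications of: {desc}",
    "performance": "Assess performance implications of: {desc}",
    "patterns": "Find similar patterns in the existing codebase for: {desc}",
    "learnings": "Search existing learnings relevant to: {desc}",
}


def build_tasks(desc, focus=None):
    """Return one task per dimension, optionally restricted to a set of focus areas."""
    names = [name for name in TEMPLATES if focus is None or name in focus]
    return [{"name": name, "plan": TEMPLATES[name].format(desc=desc)} for name in names]


if __name__ == "__main__":
    tasks = build_tasks("Implement JWT Authentication", focus={"security", "edge-cases"})
    print(json.dumps(tasks, indent=2))
```

Passing a `focus` set corresponds to the `--focus` flag: dimensions outside the set simply produce no task.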
+
+### 3. Deepening Flow
+
+```
+┌─────────────┐     ┌──────────────────┐     ┌────────────────┐
+│ Original    │     │ Research Phase   │     │ Enhanced Plan  │
+│ Plan        │────▶│                  │────▶│                │
+│             │     │ Per plan item:   │     │ Original +     │
+│ Step 1      │     │ - Best practices │     │ annotations    │
+│ Step 2      │     │ - Edge cases     │     │                │
+│ Step 3      │     │ - Dependencies   │     │ Step 1         │
+│ Step 4      │     │ - Security       │     │  ├ Edge cases  │
+│             │     │ - Performance    │     │  ├ Best pracs  │
+│             │     │ - Patterns       │     │  └ Risks       │
+│             │     │ - Learnings      │     │ Step 2         │
+│             │     │                  │     │  ├ Edge cases  │
+│             │     │ All in parallel  │     │  └ ...         │
+└─────────────┘     └──────────────────┘     └────────────────┘
+```
+
+**Implementation sketch using existing and proposed infrastructure:**
+
+```bash
+# Sketch only: get-plan-items, spawn-group, wait-group, and synthesize-plan are
+# proposed commands. Assumes get-plan-items prints one "<id><TAB><description>"
+# line per plan item, and that descriptions contain no double quotes or backslashes.
+
+# Steps 1-2: Parse the plan into items, then spawn one research group per item
+while IFS=$'\t' read -r id description; do
+  # stdin is redirected so spawn-group cannot consume the remaining plan items
+  makima supervisor spawn-group "deepen-${id}" \
+    --tasks "[
+      {\"name\": \"best-practices\", \"plan\": \"Research best practices for: ${description}\"},
+      {\"name\": \"edge-cases\", \"plan\": \"Identify edge cases for: ${description}\"},
+      {\"name\": \"security\", \"plan\": \"Analyze security implications of: ${description}\"},
+      {\"name\": \"performance\", \"plan\": \"Assess performance implications of: ${description}\"}
+    ]" \
+    --share-worktree \
+    --read-only < /dev/null
+done < <(makima supervisor get-plan-items)
+
+# Step 3: Wait for all groups (assumes wait-group accepts a glob pattern)
+makima supervisor wait-group "deepen-*" --timeout 300
+
+# Step 4: Synthesize results into enhanced plan
+makima supervisor synthesize-plan
+```
+
+### 4. Enhanced Plan Format
+
+The deepened plan augments each step with structured annotations:
+
+```markdown
+## Step 3: Implement JWT Authentication
+
+### Original Plan
+Add JWT-based authentication middleware to the API gateway.
+Generate tokens on login, validate on each request.
+
+### Research Findings
+
+#### Best Practices
+- Use RS256 (asymmetric) for microservices, HS256 for monoliths
+- Set short access token TTL (15 min) with refresh token rotation
+- Include only essential claims (sub, exp, iat, roles)
+- Never store sensitive data in JWT payload (it's base64, not encrypted)
+
+#### Edge Cases
+- Token expiry during long-running requests
+- Clock skew between services (use ±30s leeway)
+- Concurrent refresh token rotation (race condition)
+- Token size exceeding header limits (>8KB with many claims)
+
+#### Security Concerns
+- **P2**: JWT in localStorage is XSS-vulnerable; prefer httpOnly cookies
+- **P3**: Missing CSRF protection if using cookies
+- **P2**: No token revocation mechanism for compromised tokens
+
+#### Performance Notes
+- JWT validation is CPU-bound (RS256 ~1ms per validation)
+- Consider caching decoded tokens for repeated validation
+- Refresh token DB lookup adds latency (~5ms)
+
+#### Existing Learnings
+- See: docs/solutions/security-practices/jwt-refresh-token-rotation.md
+- Previous contract "Auth Service Refactor" used similar pattern
+
+### Risks
+- [ ] Clock skew handling not in original plan
+- [ ] Token revocation strategy needed
+- [ ] CSRF protection if using cookie storage
+```
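One way to produce this layout is to keep each step as structured data and render it to Markdown at the end. The sketch below is in Python for illustration only; `DeepenedStep` and `render` are hypothetical names, not part of makima, and the section headings follow the example above.

```python
from dataclasses import dataclass, field


@dataclass
class DeepenedStep:
    """Hypothetical container for one plan step plus its research annotations."""
    number: int
    title: str
    original: str
    findings: dict[str, list[str]] = field(default_factory=dict)  # section -> bullets
    risks: list[str] = field(default_factory=list)


def render(step: DeepenedStep) -> str:
    """Render a step in the annotated Markdown layout shown above."""
    lines = [f"## Step {step.number}: {step.title}", "", "### Original Plan", step.original, ""]
    if step.findings:
        lines += ["### Research Findings", ""]
        for section, bullets in step.findings.items():
            lines.append(f"#### {section}")
            lines += [f"- {bullet}" for bullet in bullets]
            lines.append("")
    if step.risks:
        lines.append("### Risks")
        lines += [f"- [ ] {risk}" for risk in step.risks]
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    step = DeepenedStep(
        3,
        "Implement JWT Authentication",
        "Add JWT-based authentication middleware to the API gateway.",
        {"Edge Cases": ["Clock skew between services"]},
        ["Token revocation strategy needed"],
    )
    print(render(step))
```

Keeping the original plan text in its own field makes it easy to guarantee that deepening only adds annotations and never rewrites the original step.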
+
+### 5. Integration with Knowledge Base
+
+When the Knowledge Accumulation feature is available, `deepen-plan` automatically includes a **learning search agent** for each plan item:
+
+```
+Research Agent: "Search existing learnings relevant to JWT authentication"
+
+Results:
+- docs/solutions/security-practices/jwt-refresh-token-rotation.md (relevance: 0.92)
+- docs/solutions/api-patterns/authentication-middleware-pattern.md (relevance: 0.78)
+- docs/solutions/debugging-techniques/token-expiry-debugging.md (relevance: 0.65)
+```
+
+These results are included in the deepened plan with direct links.
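A small sketch of how such results might be filtered and ranked before they are linked from the plan, in Python for illustration. The threshold corresponds to the `search_min_relevance` setting in the configuration example; the `limit` parameter and both function names are assumptions made for this example, not part of the proposal.

```python
def select_learnings(results, min_relevance=0.5, limit=3):
    """Keep (path, relevance) pairs at or above the threshold, most relevant first."""
    kept = [result for result in results if result[1] >= min_relevance]
    kept.sort(key=lambda result: result[1], reverse=True)
    return kept[:limit]


def as_links(results):
    """Format selected learnings as bullet lines for the deepened plan."""
    return [f"- See: {path} (relevance: {score:.2f})" for path, score in results]


if __name__ == "__main__":
    hits = [
        ("docs/solutions/debugging-techniques/token-expiry-debugging.md", 0.65),
        ("docs/solutions/security-practices/jwt-refresh-token-rotation.md", 0.92),
        ("docs/solutions/api-patterns/authentication-middleware-pattern.md", 0.78),
    ]
    print("\n".join(as_links(select_learnings(hits))))
```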
+
+---
+
+## Integration with Existing Makima Features
+
+### Contract Phases
+
+Plan deepening occurs during the **Plan phase**, between initial plan creation and phase transition to Execute:
+
+```
+Plan Phase Timeline:
+ 1. Supervisor creates initial plan
+ 2. makima supervisor deepen-plan ← NEW
+ 3. User reviews deepened plan
+ 4. makima supervisor advance-phase execute
+```
+
+### Supervisor/Worker Hierarchy
+
+Research agents are spawned as **worker tasks** under the supervisor. They use the existing `spawn-task` infrastructure together with the proposed `spawn-group`/`wait-group` commands from the [Multi-Agent Review](feature-multi-agent-review.md) proposal.
+
+### Contract Files
+
+The deepened plan replaces or augments the plan document as a contract file:
+
+```rust
+File {
+    contract_id: contract.id,
+    contract_phase: "plan",
+    name: "Implementation Plan (Deepened)",
+    body: vec![
+        // Enhanced plan content with annotations
+    ],
+}
+```
+
+### Directive System
+
+For directive-based workflows, plan deepening can be added as a step:
+
+```rust
+DirectiveStep {
+    name: "deepen-plan",
+    description: "Enhance implementation plan with parallel research",
+    depends_on: [initial_plan_step_id],
+    task_plan: "Run deepen-plan on the initial plan...",
+}
+```
+
+### Phase Guards
+
+If `phase_guard` is enabled, the user reviews the deepened plan before approving transition to execute. This is the natural checkpoint for plan quality.
+
+---
+
+## Implementation Plan
+
+### Phase 1: Core Command (2-3 days)
+
+| Task | Effort | Description |
+|------|--------|-------------|
+| `deepen-plan` command | 1 day | Parse plan, spawn research groups |
+| Research agent templates | 0.5 days | Default prompts for each category |
+| Synthesis logic | 1 day | Merge research into annotated plan |
+| Plan file update | 0.5 days | Write deepened plan as contract file |
+
+### Phase 2: Knowledge Integration (1-2 days)
+
+| Task | Effort | Description |
+|------|--------|-------------|
+| Learning search agent | 0.5 days | Search knowledge base per plan item |
+| Result integration | 0.5 days | Include learning links in plan |
+| Fallback when no KB | 0.5 days | Graceful degradation without KB |
+
+### Phase 3: Configuration & Polish (2-3 days)
+
+| Task | Effort | Description |
+|------|--------|-------------|
+| Config file support | 0.5 days | `.makima/deepen.yaml` |
+| Focus area filtering | 0.5 days | `--focus` flag implementation |
+| Concurrency control | 0.5 days | `--max-agents` limit |
+| Documentation | 0.5 days | User guide |
+| Tests | 1 day | Unit + integration |
+
+---
+
+## Configuration Examples
+
+### Repository-Level Configuration
+
+```yaml
+# .makima/deepen.yaml
+version: 1
+deepen:
+  # Auto-deepen when plan is created
+  auto_trigger: false
+
+  # Maximum agents per plan item
+  max_agents_per_item: 5
+
+  # Total maximum concurrent agents
+  max_concurrent: 20
+
+  # Timeout per research agent (seconds)
+  agent_timeout: 120
+
+  # Research dimensions to include
+  dimensions:
+    - best-practices
+    - edge-cases
+    - security
+    - performance
+    - dependencies
+    - patterns
+    - learnings  # Requires Knowledge Accumulation
+
+  # Minimum plan items to trigger deepening
+  min_plan_items: 3
+
+  # Search learnings (requires Knowledge Accumulation)
+  search_learnings: true
+  search_min_relevance: 0.5
+```
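The proposal does not specify how `max_agents_per_item`, `max_concurrent`, and `min_plan_items` interact. The Python sketch below shows one possible reading, stated as an assumption: each item gets at most `max_agents_per_item` dimensions (taken in listed order), the resulting jobs run in waves of at most `max_concurrent`, and plans with fewer than `min_plan_items` items are skipped. `plan_waves` is a hypothetical helper.

```python
def plan_waves(n_items, dimensions, max_agents_per_item=5, max_concurrent=20, min_plan_items=3):
    """Split (item index, dimension) jobs into waves that respect the configured limits.

    Assumption: dimensions beyond max_agents_per_item are dropped in listed order.
    """
    if n_items < min_plan_items:
        return []  # plan too small to be worth deepening
    per_item = dimensions[:max_agents_per_item]
    jobs = [(index, dimension) for index in range(n_items) for dimension in per_item]
    return [jobs[start:start + max_concurrent] for start in range(0, len(jobs), max_concurrent)]


if __name__ == "__main__":
    dims = ["best-practices", "edge-cases", "security", "performance",
            "dependencies", "patterns", "learnings"]
    for number, wave in enumerate(plan_waves(9, dims), start=1):
        print(f"wave {number}: {len(wave)} agents")
```

Under this reading, the default configuration lists seven dimensions but caps each item at five agents, so the last two listed dimensions would never run. Whether the cap should instead trigger prioritization or batching per item is left open here.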
+
+### Inline Usage
+
+```bash
+# Quick deepen with defaults
+makima supervisor deepen-plan
+
+# Focused deepen for security-sensitive work
+makima supervisor deepen-plan --focus security,edge-cases
+
+# Deepen with more agents for complex plans
+makima supervisor deepen-plan --max-agents 30
+
+# Deepen without knowledge base search
+makima supervisor deepen-plan --no-learnings
+```
+
+---
+
+## Open Questions
+
+1. **Plan format parsing**: How should the system parse existing plans to identify discrete items? Markdown headers? Numbered lists? YAML structure?
+2. **Research depth vs. cost**: 20-40 agents per deepening is expensive. Should there be a "lite" mode with fewer agents?
+3. **Deepening multiple times**: Can a plan be deepened iteratively? Should subsequent deepenings build on previous research?
+4. **User-provided context**: Should users be able to provide additional context (e.g., "this project uses PostgreSQL, not MySQL") to guide research?
+5. **Codebase analysis**: Should research agents analyze the existing codebase to find relevant patterns, or only reason from general knowledge?
+6. **Conflicting research**: When research agents disagree (e.g., one says "use Redis" and another says "avoid Redis"), how should the synthesis handle it?
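To make the first question concrete, here is a sketch of the Markdown-header option only, in Python for illustration. It assumes plans use `## Step N: Title` headings, as in the enhanced plan example; `parse_plan_items` is a hypothetical helper and does not settle which format makima should adopt.

```python
import re

# Assumed heading convention: "## Step <number>: <title>".
STEP_RE = re.compile(r"^## Step (\d+): (.+)$")


def parse_plan_items(markdown: str):
    """Split a Markdown plan into items, one per '## Step N: Title' heading."""
    items, current = [], None
    for line in markdown.splitlines():
        match = STEP_RE.match(line)
        if match:
            current = {"id": int(match.group(1)), "title": match.group(2).strip(), "body": []}
            items.append(current)
        elif current is not None:
            current["body"].append(line)  # text before the first step is ignored
    for item in items:
        item["description"] = "\n".join(item.pop("body")).strip()
    return items


if __name__ == "__main__":
    plan = "# Plan\n\n## Step 1: Add login endpoint\nPOST /login\n\n## Step 2: Add JWT middleware\nValidate on each request.\n"
    for item in parse_plan_items(plan):
        print(item["id"], item["title"], "->", item["description"])
```

A plan with no matching headings yields an empty list, which is one place where a fallback to numbered lists or YAML would have to be decided.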
+
+---
+
+## Alternatives Considered
+
+| Alternative | Pros | Cons | Decision |
+|-------------|------|------|----------|
+| Sequential research (one agent) | Simple, cheaper | Slow; misses multi-perspective insights | Rejected — parallel is core value |
+| Automatic deepening (always on) | No manual step | Adds latency to every plan; unnecessary for simple tasks | Optional auto-trigger |
+| Web search integration | Real-time information | Inconsistent quality; potential hallucination from web results | Deferred — consider for v2 |
+| User-provided research questions | Targeted research | Requires user to know what to ask | Complement — support alongside auto-research |
+| LLM-only research (no task spawning) | Simpler, no infrastructure | Limited by single context window; no parallelism | Rejected — defeats the purpose |
+
+---
+
+## Priority & Complexity Assessment
+
+- **Priority: MEDIUM.** Plan deepening significantly improves plan quality, but it is an enhancement over an already-functional planning workflow. The compound engineering plugin's data shows a ~40% reduction in plan revisions.
+- **Complexity: LOW.** This feature is largely a composition of existing and proposed primitives (task spawning, the proposed `spawn-group`/`wait-group` commands, plan file updates). The main new work is research agent prompts and synthesis logic.
+- **Risk: LOW.** In the worst case the deepened plan is only slightly better than the original. No changes to existing system behavior are required, and the feature can be adopted incrementally.