diff options
Diffstat (limited to 'docs/proposals/feature-plan-deepening.md')
| -rw-r--r-- | docs/proposals/feature-plan-deepening.md | 383 |
1 files changed, 383 insertions, 0 deletions
diff --git a/docs/proposals/feature-plan-deepening.md b/docs/proposals/feature-plan-deepening.md new file mode 100644 index 0000000..c2d8aeb --- /dev/null +++ b/docs/proposals/feature-plan-deepening.md @@ -0,0 +1,383 @@ +# Feature Proposal: Parallel Plan Deepening + +> **Priority:** Medium +> **Complexity:** Low +> **Estimated Effort:** 5-8 days +> **Status:** Proposal +> **Date:** 2026-02-09 +> **Dependencies:** [Knowledge Accumulation](feature-knowledge-accumulation.md) (recommended, not required) +> **Related:** [Overview Analysis](compound-engineering-analysis.md) · [Multi-Agent Review](feature-multi-agent-review.md) + +--- + +## Problem Statement + +Makima's planning phase currently suffers from **single-pass planning**: + +- A supervisor creates a plan based on its immediate analysis of the task +- **No systematic research** is conducted before finalizing the plan +- **Edge cases are discovered during execution**, requiring mid-stream plan changes +- **Best practices are not consulted** — the plan relies solely on the model's training knowledge +- **Existing project learnings** (if the knowledge accumulation feature exists) are not surfaced during planning +- **Revision rate is high** — an estimated ~40% of plans require significant changes after execution begins + +The result: plans are shallow, execution discovers problems that planning should have caught, and contracts take longer than necessary. + +--- + +## How Compound Engineering Solves This + +The compound engineering plugin's `/deepen-plan` command takes an existing plan and enhances it by spawning **20-40 parallel research agents**: + +``` +┌──────────────────────────────────────────────────────────────┐ +│ /deepen-plan │ +│ │ +│ Input: Initial plan (from /plan) │ +│ │ +│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ │ +│ │ Best │ │ Edge │ │ Dep. │ │ Pattern │ │ +│ │ Practice │ │ Case │ │ Research │ │ Matching │ │ +│ │ Agent 1 │ │ Agent 1 │ │ Agent 1 │ │ Agent 1 │ │ +│ └────┬─────┘ └────┬─────┘ └────┬─────┘ └────┬─────┘ │ +│ │ │ │ │ │ +│ ┌────┴─────┐ ┌────┴─────┐ ┌────┴─────┐ ┌────┴─────┐ │ +│ │ Best │ │ Edge │ │ Security │ │ Existing │ │ +│ │ Practice │ │ Case │ │ Concerns │ │ Learning │ │ +│ │ Agent 2 │ │ Agent 2 │ │ Agent │ │ Agent │ │ +│ └────┬─────┘ └────┬─────┘ └────┬─────┘ └────┬─────┘ │ +│ │ │ │ │ │ +│ ... (20-40 agents per plan item) ... │ +│ │ │ │ │ │ +│ ▼ ▼ ▼ ▼ │ +│ ┌──────────────────────────────────────────────────┐ │ +│ │ Synthesis Agent │ │ +│ │ - Merge research into plan │ │ +│ │ - Add edge case handling │ │ +│ │ - Insert best practice notes │ │ +│ │ - Flag risks and dependencies │ │ +│ └──────────────────────────────────────────────────┘ │ +│ │ │ +│ ▼ │ +│ Enhanced Plan (Deepened) │ +│ - Original steps preserved │ +│ - Edge cases added per step │ +│ - Best practices annotated │ +│ - Risks flagged │ +│ - Dependencies clarified │ +└──────────────────────────────────────────────────────────────┘ +``` + +The key insight: **research is embarrassingly parallel**. Each plan item can be researched independently, and each research dimension (best practices, edge cases, security, etc.) is independent. + +--- + +## Proposed Makima Implementation + +### 1. New Supervisor Command: `makima supervisor deepen-plan` + +```bash +# Deepen the current contract's plan +makima supervisor deepen-plan + +# Deepen with specific focus areas +makima supervisor deepen-plan --focus "security,edge-cases,performance" + +# Deepen with explicit plan file reference +makima supervisor deepen-plan --plan-file plan.md + +# Control parallelism +makima supervisor deepen-plan --max-agents 10 + +# Include knowledge base search (requires Knowledge Accumulation feature) +makima supervisor deepen-plan --search-learnings +``` + +### 2. Research Agent Categories + +Each plan item is researched along multiple dimensions: + +| Agent Category | Purpose | Example Output | +|----------------|---------|----------------| +| **Best Practices** | Industry standards for the technology/pattern | "Use parameterized queries for all DB operations" | +| **Edge Cases** | Boundary conditions and error scenarios | "Handle concurrent modification of shared resource" | +| **Dependency Research** | Compatibility, versions, known issues | "Library X v3 has breaking changes from v2" | +| **Security Concerns** | Security implications of the planned approach | "JWT stored in localStorage is vulnerable to XSS" | +| **Performance Implications** | Performance characteristics and bottlenecks | "N+1 query risk with eager loading disabled" | +| **Pattern Matching** | Similar patterns in the existing codebase | "Module Y already implements this pattern; follow its conventions" | +| **Existing Learnings** | Prior solutions from knowledge base | "Similar issue solved in contract Z; see docs/solutions/..." | + +### 3. Deepening Flow + +``` +┌─────────────┐ ┌──────────────────┐ ┌────────────────┐ +│ Original │ │ Research Phase │ │ Enhanced Plan │ +│ Plan │────▶│ │────▶│ │ +│ │ │ Per plan item: │ │ Original + │ +│ Step 1 │ │ - Best practices │ │ annotations │ +│ Step 2 │ │ - Edge cases │ │ │ +│ Step 3 │ │ - Dependencies │ │ Step 1 │ +│ Step 4 │ │ - Security │ │ ├ Edge cases │ +│ │ │ - Performance │ │ ├ Best pracs │ +│ │ │ - Patterns │ │ └ Risks │ +│ │ │ - Learnings │ │ Step 2 │ +│ │ │ │ │ ├ Edge cases │ +│ │ │ All in parallel │ │ └ ... │ +└─────────────┘ └──────────────────┘ └────────────────┘ +``` + +**Implementation using existing infrastructure:** + +```bash +# Step 1: Parse plan into items +plan_items=$(makima supervisor get-plan-items) + +# Step 2: For each item, spawn research agents as a group +for item in $plan_items; do + makima supervisor spawn-group "deepen-${item.id}" \ + --tasks "[ + {\"name\": \"best-practices\", \"plan\": \"Research best practices for: ${item.description}\"}, + {\"name\": \"edge-cases\", \"plan\": \"Identify edge cases for: ${item.description}\"}, + {\"name\": \"security\", \"plan\": \"Analyze security implications of: ${item.description}\"}, + {\"name\": \"performance\", \"plan\": \"Assess performance implications of: ${item.description}\"} + ]" \ + --share-worktree \ + --read-only +done + +# Step 3: Wait for all groups +makima supervisor wait-group "deepen-*" --timeout 300 + +# Step 4: Synthesize results into enhanced plan +makima supervisor synthesize-plan +``` + +### 4. Enhanced Plan Format + +The deepened plan augments each step with structured annotations: + +```markdown +## Step 3: Implement JWT Authentication + +### Original Plan +Add JWT-based authentication middleware to the API gateway. +Generate tokens on login, validate on each request. + +### Research Findings + +#### Best Practices +- Use RS256 (asymmetric) for microservices, HS256 for monoliths +- Set short access token TTL (15 min) with refresh token rotation +- Include only essential claims (sub, exp, iat, roles) +- Never store sensitive data in JWT payload (it's base64, not encrypted) + +#### Edge Cases +- Token expiry during long-running requests +- Clock skew between services (use ±30s leeway) +- Concurrent refresh token rotation (race condition) +- Token size exceeding header limits (>8KB with many claims) + +#### Security Concerns +- **P2**: JWT in localStorage is XSS-vulnerable; prefer httpOnly cookies +- **P3**: Missing CSRF protection if using cookies +- **P2**: No token revocation mechanism for compromised tokens + +#### Performance Notes +- JWT validation is CPU-bound (RS256 ~1ms per validation) +- Consider caching decoded tokens for repeated validation +- Refresh token DB lookup adds latency (~5ms) + +#### Existing Learnings +- See: docs/solutions/security-practices/jwt-refresh-token-rotation.md +- Previous contract "Auth Service Refactor" used similar pattern + +### Risks +- [ ] Clock skew handling not in original plan +- [ ] Token revocation strategy needed +- [ ] CSRF protection if using cookie storage +``` + +### 5. Integration with Knowledge Base + +When the Knowledge Accumulation feature is available, `deepen-plan` automatically includes a **learning search agent** for each plan item: + +``` +Research Agent: "Search existing learnings relevant to JWT authentication" + +Results: +- docs/solutions/security-practices/jwt-refresh-token-rotation.md (relevance: 0.92) +- docs/solutions/api-patterns/authentication-middleware-pattern.md (relevance: 0.78) +- docs/solutions/debugging-techniques/token-expiry-debugging.md (relevance: 0.65) +``` + +These results are included in the deepened plan with direct links. + +--- + +## Integration with Existing Makima Features + +### Contract Phases + +Plan deepening occurs during the **Plan phase**, between initial plan creation and phase transition to Execute: + +``` +Plan Phase Timeline: + 1. Supervisor creates initial plan + 2. makima supervisor deepen-plan ← NEW + 3. User reviews deepened plan + 4. makima supervisor advance-phase execute +``` + +### Supervisor/Worker Hierarchy + +Research agents are spawned as **worker tasks** under the supervisor. Uses the existing `spawn-task` infrastructure with the proposed `spawn-group`/`wait-group` from the [Multi-Agent Review](feature-multi-agent-review.md) proposal. + +### Contract Files + +The deepened plan replaces or augments the plan document as a contract file: + +```rust +File { + contract_id: contract.id, + contract_phase: "plan", + name: "Implementation Plan (Deepened)", + body: vec![ + // Enhanced plan content with annotations + ], +} +``` + +### Directive System + +For directive-based workflows, plan deepening can be added as a step: + +```rust +DirectiveStep { + name: "deepen-plan", + description: "Enhance implementation plan with parallel research", + depends_on: [initial_plan_step_id], + task_plan: "Run deepen-plan on the initial plan...", +} +``` + +### Phase Guards + +If `phase_guard` is enabled, the user reviews the deepened plan before approving transition to execute. This is the natural checkpoint for plan quality. + +--- + +## Implementation Plan + +### Phase 1: Core Command (2-3 days) + +| Task | Effort | Description | +|------|--------|-------------| +| `deepen-plan` command | 1 day | Parse plan, spawn research groups | +| Research agent templates | 0.5 days | Default prompts for each category | +| Synthesis logic | 1 day | Merge research into annotated plan | +| Plan file update | 0.5 days | Write deepened plan as contract file | + +### Phase 2: Knowledge Integration (1-2 days) + +| Task | Effort | Description | +|------|--------|-------------| +| Learning search agent | 0.5 days | Search knowledge base per plan item | +| Result integration | 0.5 days | Include learning links in plan | +| Fallback when no KB | 0.5 days | Graceful degradation without KB | + +### Phase 3: Configuration & Polish (2-3 days) + +| Task | Effort | Description | +|------|--------|-------------| +| Config file support | 0.5 days | `.makima/deepen.yaml` | +| Focus area filtering | 0.5 days | `--focus` flag implementation | +| Concurrency control | 0.5 days | `--max-agents` limit | +| Documentation | 0.5 days | User guide | +| Tests | 1 day | Unit + integration | + +--- + +## Configuration Examples + +### Repository-Level Configuration + +```yaml +# .makima/deepen.yaml +version: 1 +deepen: + # Auto-deepen when plan is created + auto_trigger: false + + # Maximum agents per plan item + max_agents_per_item: 5 + + # Total maximum concurrent agents + max_concurrent: 20 + + # Timeout per research agent (seconds) + agent_timeout: 120 + + # Research dimensions to include + dimensions: + - best-practices + - edge-cases + - security + - performance + - dependencies + - patterns + - learnings # Requires Knowledge Accumulation + + # Minimum plan items to trigger deepening + min_plan_items: 3 + + # Search learnings (requires Knowledge Accumulation) + search_learnings: true + search_min_relevance: 0.5 +``` + +### Inline Usage + +```bash +# Quick deepen with defaults +makima supervisor deepen-plan + +# Focused deepen for security-sensitive work +makima supervisor deepen-plan --focus security,edge-cases + +# Deepen with more agents for complex plans +makima supervisor deepen-plan --max-agents 30 + +# Deepen without knowledge base search +makima supervisor deepen-plan --no-learnings +``` + +--- + +## Open Questions + +1. **Plan format parsing**: How should the system parse existing plans to identify discrete items? Markdown headers? Numbered lists? YAML structure? +2. **Research depth vs. cost**: 20-40 agents per deepening is expensive. Should there be a "lite" mode with fewer agents? +3. **Deepening multiple times**: Can a plan be deepened iteratively? Should subsequent deepenings build on previous research? +4. **User-provided context**: Should users be able to provide additional context (e.g., "this project uses PostgreSQL, not MySQL") to guide research? +5. **Codebase analysis**: Should research agents analyze the existing codebase to find relevant patterns, or only reason from general knowledge? +6. **Conflicting research**: When research agents disagree (e.g., one says "use Redis" and another says "avoid Redis"), how should the synthesis handle it? + +--- + +## Alternatives Considered + +| Alternative | Pros | Cons | Decision | +|-------------|------|------|----------| +| Sequential research (one agent) | Simple, cheaper | Slow; misses multi-perspective insights | Rejected — parallel is core value | +| Automatic deepening (always on) | No manual step | Adds latency to every plan; unnecessary for simple tasks | Optional auto-trigger | +| Web search integration | Real-time information | Inconsistent quality; potential hallucination from web results | Deferred — consider for v2 | +| User-provided research questions | Targeted research | Requires user to know what to ask | Complement — support alongside auto-research | +| LLM-only research (no task spawning) | Simpler, no infrastructure | Limited by single context window; no parallelism | Rejected — defeats the purpose | + +--- + +## Priority & Complexity Assessment + +- **Priority: MEDIUM** — Plan deepening significantly improves plan quality, but it's enhancement over an already-functional planning workflow. The compound engineering plugin's data shows ~40% plan revision reduction. +- **Complexity: LOW** — This feature is largely a composition of existing primitives (task spawning, group waiting, plan file updates). The main new work is research agent prompts and synthesis logic. +- **Risk: LOW** — Worst case is slightly better plans. No system changes required. Can be adopted incrementally. |
