summaryrefslogblamecommitdiff
path: root/docs/proposals/feature-plan-deepening.md
blob: c2d8aeb26f05810149552dcbf22fcbb4d36d6a3a (plain) (tree)






























































































































































































































































































































































































                                                                                                                                                                                                                                       
# Feature Proposal: Parallel Plan Deepening

> **Priority:** Medium
> **Complexity:** Low
> **Estimated Effort:** 5-8 days
> **Status:** Proposal
> **Date:** 2026-02-09
> **Dependencies:** [Knowledge Accumulation](feature-knowledge-accumulation.md) (recommended, not required)
> **Related:** [Overview Analysis](compound-engineering-analysis.md) · [Multi-Agent Review](feature-multi-agent-review.md)

---

## Problem Statement

Makima's planning phase currently suffers from **single-pass planning**:

- A supervisor creates a plan based on its immediate analysis of the task
- **No systematic research** is conducted before finalizing the plan
- **Edge cases are discovered during execution**, requiring mid-stream plan changes
- **Best practices are not consulted** — the plan relies solely on the model's training knowledge
- **Existing project learnings** (if the knowledge accumulation feature exists) are not surfaced during planning
- **Revision rate is high** — an estimated ~40% of plans require significant changes after execution begins

The result: plans are shallow, execution discovers problems that planning should have caught, and contracts take longer than necessary.

---

## How Compound Engineering Solves This

The compound engineering plugin's `/deepen-plan` command takes an existing plan and enhances it by spawning **20-40 parallel research agents**:

```
┌──────────────────────────────────────────────────────────────┐
│                      /deepen-plan                             │
│                                                              │
│  Input: Initial plan (from /plan)                            │
│                                                              │
│  ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐       │
│  │ Best     │ │ Edge     │ │ Dep.     │ │ Pattern  │       │
│  │ Practice │ │ Case     │ │ Research │ │ Matching │       │
│  │ Agent 1  │ │ Agent 1  │ │ Agent 1  │ │ Agent 1  │       │
│  └────┬─────┘ └────┬─────┘ └────┬─────┘ └────┬─────┘       │
│       │            │            │            │              │
│  ┌────┴─────┐ ┌────┴─────┐ ┌────┴─────┐ ┌────┴─────┐       │
│  │ Best     │ │ Edge     │ │ Security │ │ Existing │       │
│  │ Practice │ │ Case     │ │ Concerns │ │ Learning │       │
│  │ Agent 2  │ │ Agent 2  │ │ Agent    │ │ Agent    │       │
│  └────┬─────┘ └────┬─────┘ └────┬─────┘ └────┬─────┘       │
│       │            │            │            │              │
│  ... (20-40 agents per plan item) ...                        │
│       │            │            │            │              │
│       ▼            ▼            ▼            ▼              │
│  ┌──────────────────────────────────────────────────┐       │
│  │              Synthesis Agent                      │       │
│  │  - Merge research into plan                       │       │
│  │  - Add edge case handling                         │       │
│  │  - Insert best practice notes                     │       │
│  │  - Flag risks and dependencies                    │       │
│  └──────────────────────────────────────────────────┘       │
│                        │                                     │
│                        ▼                                     │
│              Enhanced Plan (Deepened)                         │
│              - Original steps preserved                      │
│              - Edge cases added per step                      │
│              - Best practices annotated                       │
│              - Risks flagged                                  │
│              - Dependencies clarified                         │
└──────────────────────────────────────────────────────────────┘
```

The key insight: **research is embarrassingly parallel**. Each plan item can be researched independently, and each research dimension (best practices, edge cases, security, etc.) is independent.

---

## Proposed Makima Implementation

### 1. New Supervisor Command: `makima supervisor deepen-plan`

```bash
# Deepen the current contract's plan
makima supervisor deepen-plan

# Deepen with specific focus areas
makima supervisor deepen-plan --focus "security,edge-cases,performance"

# Deepen with explicit plan file reference
makima supervisor deepen-plan --plan-file plan.md

# Control parallelism
makima supervisor deepen-plan --max-agents 10

# Include knowledge base search (requires Knowledge Accumulation feature)
makima supervisor deepen-plan --search-learnings
```

### 2. Research Agent Categories

Each plan item is researched along multiple dimensions:

| Agent Category | Purpose | Example Output |
|----------------|---------|----------------|
| **Best Practices** | Industry standards for the technology/pattern | "Use parameterized queries for all DB operations" |
| **Edge Cases** | Boundary conditions and error scenarios | "Handle concurrent modification of shared resource" |
| **Dependency Research** | Compatibility, versions, known issues | "Library X v3 has breaking changes from v2" |
| **Security Concerns** | Security implications of the planned approach | "JWT stored in localStorage is vulnerable to XSS" |
| **Performance Implications** | Performance characteristics and bottlenecks | "N+1 query risk with eager loading disabled" |
| **Pattern Matching** | Similar patterns in the existing codebase | "Module Y already implements this pattern; follow its conventions" |
| **Existing Learnings** | Prior solutions from knowledge base | "Similar issue solved in contract Z; see docs/solutions/..." |

### 3. Deepening Flow

```
┌─────────────┐     ┌──────────────────┐     ┌────────────────┐
│ Original    │     │ Research Phase    │     │ Enhanced Plan  │
│ Plan        │────▶│                  │────▶│                │
│             │     │ Per plan item:    │     │ Original +     │
│ Step 1      │     │ - Best practices │     │ annotations    │
│ Step 2      │     │ - Edge cases     │     │               │
│ Step 3      │     │ - Dependencies   │     │ Step 1         │
│ Step 4      │     │ - Security       │     │  ├ Edge cases  │
│             │     │ - Performance    │     │  ├ Best pracs  │
│             │     │ - Patterns       │     │  └ Risks       │
│             │     │ - Learnings      │     │ Step 2         │
│             │     │                  │     │  ├ Edge cases  │
│             │     │ All in parallel  │     │  └ ...         │
└─────────────┘     └──────────────────┘     └────────────────┘
```

**Implementation using existing infrastructure:**

```bash
# Step 1: Parse plan into items
plan_items=$(makima supervisor get-plan-items)

# Step 2: For each item, spawn research agents as a group
for item in $plan_items; do
  makima supervisor spawn-group "deepen-${item.id}" \
    --tasks "[
      {\"name\": \"best-practices\", \"plan\": \"Research best practices for: ${item.description}\"},
      {\"name\": \"edge-cases\", \"plan\": \"Identify edge cases for: ${item.description}\"},
      {\"name\": \"security\", \"plan\": \"Analyze security implications of: ${item.description}\"},
      {\"name\": \"performance\", \"plan\": \"Assess performance implications of: ${item.description}\"}
    ]" \
    --share-worktree \
    --read-only
done

# Step 3: Wait for all groups
makima supervisor wait-group "deepen-*" --timeout 300

# Step 4: Synthesize results into enhanced plan
makima supervisor synthesize-plan
```

### 4. Enhanced Plan Format

The deepened plan augments each step with structured annotations:

```markdown
## Step 3: Implement JWT Authentication

### Original Plan
Add JWT-based authentication middleware to the API gateway.
Generate tokens on login, validate on each request.

### Research Findings

#### Best Practices
- Use RS256 (asymmetric) for microservices, HS256 for monoliths
- Set short access token TTL (15 min) with refresh token rotation
- Include only essential claims (sub, exp, iat, roles)
- Never store sensitive data in JWT payload (it's base64, not encrypted)

#### Edge Cases
- Token expiry during long-running requests
- Clock skew between services (use ±30s leeway)
- Concurrent refresh token rotation (race condition)
- Token size exceeding header limits (>8KB with many claims)

#### Security Concerns
- **P2**: JWT in localStorage is XSS-vulnerable; prefer httpOnly cookies
- **P3**: Missing CSRF protection if using cookies
- **P2**: No token revocation mechanism for compromised tokens

#### Performance Notes
- JWT validation is CPU-bound (RS256 ~1ms per validation)
- Consider caching decoded tokens for repeated validation
- Refresh token DB lookup adds latency (~5ms)

#### Existing Learnings
- See: docs/solutions/security-practices/jwt-refresh-token-rotation.md
- Previous contract "Auth Service Refactor" used similar pattern

### Risks
- [ ] Clock skew handling not in original plan
- [ ] Token revocation strategy needed
- [ ] CSRF protection if using cookie storage
```

### 5. Integration with Knowledge Base

When the Knowledge Accumulation feature is available, `deepen-plan` automatically includes a **learning search agent** for each plan item:

```
Research Agent: "Search existing learnings relevant to JWT authentication"

Results:
- docs/solutions/security-practices/jwt-refresh-token-rotation.md (relevance: 0.92)
- docs/solutions/api-patterns/authentication-middleware-pattern.md (relevance: 0.78)
- docs/solutions/debugging-techniques/token-expiry-debugging.md (relevance: 0.65)
```

These results are included in the deepened plan with direct links.

---

## Integration with Existing Makima Features

### Contract Phases

Plan deepening occurs during the **Plan phase**, between initial plan creation and phase transition to Execute:

```
Plan Phase Timeline:
  1. Supervisor creates initial plan
  2. makima supervisor deepen-plan    ← NEW
  3. User reviews deepened plan
  4. makima supervisor advance-phase execute
```

### Supervisor/Worker Hierarchy

Research agents are spawned as **worker tasks** under the supervisor. Uses the existing `spawn-task` infrastructure with the proposed `spawn-group`/`wait-group` from the [Multi-Agent Review](feature-multi-agent-review.md) proposal.

### Contract Files

The deepened plan replaces or augments the plan document as a contract file:

```rust
File {
    contract_id: contract.id,
    contract_phase: "plan",
    name: "Implementation Plan (Deepened)",
    body: vec![
        // Enhanced plan content with annotations
    ],
}
```

### Directive System

For directive-based workflows, plan deepening can be added as a step:

```rust
DirectiveStep {
    name: "deepen-plan",
    description: "Enhance implementation plan with parallel research",
    depends_on: [initial_plan_step_id],
    task_plan: "Run deepen-plan on the initial plan...",
}
```

### Phase Guards

If `phase_guard` is enabled, the user reviews the deepened plan before approving transition to execute. This is the natural checkpoint for plan quality.

---

## Implementation Plan

### Phase 1: Core Command (2-3 days)

| Task | Effort | Description |
|------|--------|-------------|
| `deepen-plan` command | 1 day | Parse plan, spawn research groups |
| Research agent templates | 0.5 days | Default prompts for each category |
| Synthesis logic | 1 day | Merge research into annotated plan |
| Plan file update | 0.5 days | Write deepened plan as contract file |

### Phase 2: Knowledge Integration (1-2 days)

| Task | Effort | Description |
|------|--------|-------------|
| Learning search agent | 0.5 days | Search knowledge base per plan item |
| Result integration | 0.5 days | Include learning links in plan |
| Fallback when no KB | 0.5 days | Graceful degradation without KB |

### Phase 3: Configuration & Polish (2-3 days)

| Task | Effort | Description |
|------|--------|-------------|
| Config file support | 0.5 days | `.makima/deepen.yaml` |
| Focus area filtering | 0.5 days | `--focus` flag implementation |
| Concurrency control | 0.5 days | `--max-agents` limit |
| Documentation | 0.5 days | User guide |
| Tests | 1 day | Unit + integration |

---

## Configuration Examples

### Repository-Level Configuration

```yaml
# .makima/deepen.yaml
version: 1
deepen:
  # Auto-deepen when plan is created
  auto_trigger: false

  # Maximum agents per plan item
  max_agents_per_item: 5

  # Total maximum concurrent agents
  max_concurrent: 20

  # Timeout per research agent (seconds)
  agent_timeout: 120

  # Research dimensions to include
  dimensions:
    - best-practices
    - edge-cases
    - security
    - performance
    - dependencies
    - patterns
    - learnings          # Requires Knowledge Accumulation

  # Minimum plan items to trigger deepening
  min_plan_items: 3

  # Search learnings (requires Knowledge Accumulation)
  search_learnings: true
  search_min_relevance: 0.5
```

### Inline Usage

```bash
# Quick deepen with defaults
makima supervisor deepen-plan

# Focused deepen for security-sensitive work
makima supervisor deepen-plan --focus security,edge-cases

# Deepen with more agents for complex plans
makima supervisor deepen-plan --max-agents 30

# Deepen without knowledge base search
makima supervisor deepen-plan --no-learnings
```

---

## Open Questions

1. **Plan format parsing**: How should the system parse existing plans to identify discrete items? Markdown headers? Numbered lists? YAML structure?
2. **Research depth vs. cost**: 20-40 agents per deepening is expensive. Should there be a "lite" mode with fewer agents?
3. **Deepening multiple times**: Can a plan be deepened iteratively? Should subsequent deepenings build on previous research?
4. **User-provided context**: Should users be able to provide additional context (e.g., "this project uses PostgreSQL, not MySQL") to guide research?
5. **Codebase analysis**: Should research agents analyze the existing codebase to find relevant patterns, or only reason from general knowledge?
6. **Conflicting research**: When research agents disagree (e.g., one says "use Redis" and another says "avoid Redis"), how should the synthesis handle it?

---

## Alternatives Considered

| Alternative | Pros | Cons | Decision |
|-------------|------|------|----------|
| Sequential research (one agent) | Simple, cheaper | Slow; misses multi-perspective insights | Rejected — parallel is core value |
| Automatic deepening (always on) | No manual step | Adds latency to every plan; unnecessary for simple tasks | Optional auto-trigger |
| Web search integration | Real-time information | Inconsistent quality; potential hallucination from web results | Deferred — consider for v2 |
| User-provided research questions | Targeted research | Requires user to know what to ask | Complement — support alongside auto-research |
| LLM-only research (no task spawning) | Simpler, no infrastructure | Limited by single context window; no parallelism | Rejected — defeats the purpose |

---

## Priority & Complexity Assessment

- **Priority: MEDIUM** — Plan deepening significantly improves plan quality, but it's enhancement over an already-functional planning workflow. The compound engineering plugin's data shows ~40% plan revision reduction.
- **Complexity: LOW** — This feature is largely a composition of existing primitives (task spawning, group waiting, plan file updates). The main new work is research agent prompts and synthesis logic.
- **Risk: LOW** — Worst case is slightly better plans. No system changes required. Can be adopted incrementally.