# Feature Proposal: Knowledge Accumulation / Compound Learning System
> **Priority:** High
> **Complexity:** Medium
> **Estimated Effort:** 10-15 days
> **Status:** Proposal
> **Date:** 2026-02-09
> **Dependencies:** Contract Files system (existing)
> **Related:** [Overview Analysis](compound-engineering-analysis.md) · [Plan Deepening](feature-plan-deepening.md) · [Workflow Presets](feature-workflow-presets.md)
---
## Problem Statement
When a makima contract completes, the **knowledge generated during that contract is effectively lost**:
- **Solutions to tricky problems** exist only in task conversation history, which is not searchable or surfaceable
- **Patterns discovered** during one contract cannot inform future contracts
- **Mistakes made** in one contract are likely to be repeated in similar future contracts
- **Best practices** established during execution are not codified anywhere retrievable
- **Contract files** capture deliverables but not the *meta-knowledge* about how those deliverables were produced
This means every new contract starts from zero context, even when the team has solved similar problems before. Engineering effort does not compound.
---
## How Compound Engineering Solves This
The compound engineering plugin implements a `/compound` command that runs **5 parallel sub-agents** immediately after review:
```
┌─────────────────────────────────────────────────────────┐
│                        /compound                        │
│                                                         │
│    ┌─────────────┐  ┌─────────────┐  ┌─────────────┐    │
│    │   Context   │  │  Solution   │  │ Prevention  │    │
│    │  Extractor  │  │ Documenter  │  │ Strategist  │    │
│    └──────┬──────┘  └──────┬──────┘  └──────┬──────┘    │
│           │                │                │           │
│    ┌──────┴──────┐  ┌──────┴──────┐                     │
│    │     Doc     │  │  Category   │                     │
│    │   Linker    │  │ Classifier  │                     │
│    └──────┬──────┘  └──────┬──────┘                     │
│           │                │                            │
│           ▼                ▼                            │
│        ┌──────────────────────────────────────┐         │
│        │  docs/solutions/[category]/file.md   │         │
│        │                                      │         │
│        │  ---                                 │         │
│        │  category: build-errors              │         │
│        │  severity: medium                    │         │
│        │  tags: [webpack, esm, cjs]           │         │
│        │  date: 2026-02-09                    │         │
│        │  contract: abc-123                   │         │
│        │  ---                                 │         │
│        │                                      │         │
│        │  # Mixed ESM/CJS Import Resolution   │         │
│        │                                      │         │
│        │  ## Problem                          │         │
│        │  ...                                 │         │
│        │  ## Solution                         │         │
│        │  ...                                 │         │
│        │  ## Prevention                       │         │
│        │  ...                                 │         │
│        └──────────────────────────────────────┘         │
└─────────────────────────────────────────────────────────┘
```
### 9 Auto-Detected Categories
| Category | Description |
|----------|-------------|
| `build-errors` | Compilation, bundling, dependency resolution |
| `test-failures` | Test setup, assertion patterns, mocking |
| `api-patterns` | API design, endpoint structure, versioning |
| `architecture-decisions` | Structural choices, trade-offs, patterns |
| `performance-optimizations` | Speed, memory, caching strategies |
| `security-practices` | Auth, input validation, secrets management |
| `debugging-techniques` | Investigation methods, logging strategies |
| `tooling-configurations` | Tool setup, config patterns, CI/CD |
| `domain-knowledge` | Business logic, domain-specific patterns |
---
## Proposed Makima Implementation
### 1. New "Compound" Phase
Add an optional **compound** phase to the contract lifecycle, positioned after review:
```
Research → Specify → Plan → Execute → Review → Compound
                                                  ▲
                                             (new phase)
```
**Phase behavior:**
- **Auto-triggered** after review phase completes (configurable)
- **Short-lived** — typically completes in 1-3 minutes
- Extracts learnings from the contract's execution and review
- Stores them as searchable, categorized learning documents
- Can be skipped via configuration for trivial contracts
### 2. New Supervisor Command: `makima supervisor compound`
```bash
# Run compound learning for the current contract
makima supervisor compound
# Compound with specific focus areas
makima supervisor compound --focus "security,performance"
# Compound with explicit learnings
makima supervisor compound --learning "The retry logic needed exponential backoff, not fixed delay"
```
**Implementation:**
```bash
# Under the hood, this spawns learning sub-agents
makima supervisor spawn-group "compound" \
  --tasks '[
    {
      "name": "context-extractor",
      "plan": "Extract the problem context, constraints, and environment details from the contract execution history..."
    },
    {
      "name": "solution-documenter",
      "plan": "Document the solutions that were applied, including code patterns and configuration changes..."
    },
    {
      "name": "prevention-strategist",
      "plan": "Identify what could prevent this class of problem in the future..."
    },
    {
      "name": "category-classifier",
      "plan": "Classify these learnings into the appropriate category..."
    },
    {
      "name": "doc-linker",
      "plan": "Link these learnings to existing documentation and related learnings..."
    }
  ]'
```
### 3. Learning Document Schema
Each learning is stored as a **contract file** with structured content and metadata:
```yaml
# Learning document metadata (stored in file description/metadata)
learning:
  category: "build-errors"          # One of 9 categories
  severity: "medium"                # low, medium, high, critical
  tags: ["webpack", "esm", "cjs"]   # Free-form tags
  source_contract_id: "abc-123"     # Contract that produced this learning
  source_contract_name: "Fix webpack bundling"
  repository: "github.com/org/repo"
  date: "2026-02-09"
  quality_score: 0.85               # 0-1, set by quality gate
  access_count: 0                   # Incremented on retrieval
  last_accessed: null
  relevance_decay: 0.95             # Per-month decay factor
```
**Document body structure:**
````markdown
# Mixed ESM/CJS Import Resolution
## Problem
When upgrading to webpack 5, mixed ESM and CommonJS imports caused
"Cannot use import statement outside a module" errors in production
but not development.
## Root Cause
The `type: "module"` field in package.json applied ESM resolution
globally, but several dependencies only provided CJS exports.
## Solution
1. Added `resolve.fullySpecified: false` to webpack config
2. Used `@babel/plugin-transform-modules-commonjs` for CJS deps
3. Created explicit `.cjs` extensions for config files
## Code Pattern
```javascript
// webpack.config.cjs (note: .cjs extension)
module.exports = {
  resolve: {
    fullySpecified: false,
    extensions: ['.js', '.mjs', '.cjs', '.json']
  }
};
```
## Prevention
- Add webpack build check to CI before merging
- Document module system choice in project README
- Use `resolve.fullySpecified: false` by default in webpack 5 projects
## Related
- docs/solutions/tooling-configurations/webpack-5-migration.md
- Contract: "Initial Webpack 5 Migration" (2026-01-15)
````
### 4. Storage Architecture
Learnings are stored in two complementary locations:
#### A. Contract Files (Structured, Persistent)
```rust
// Each learning becomes a contract file
File {
    contract_id: Some(source_contract.id),
    contract_phase: Some("compound"),
    name: "Learning: Mixed ESM/CJS Import Resolution",
    description: Some("category=build-errors; tags=webpack,esm,cjs; severity=medium"),
    body: vec![
        BodyElement::Heading { level: 1, text: "Mixed ESM/CJS Import Resolution" },
        BodyElement::Heading { level: 2, text: "Problem" },
        BodyElement::Paragraph { text: "..." },
        // ... structured content
    ],
    repo_file_path: Some("docs/solutions/build-errors/mixed-esm-cjs-resolution.md"),
    repo_sync_status: Some("synced"),
}
```
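
The example above packs the learning metadata into the `description` string as `key=value` pairs separated by `;`. Code that retrieves learnings would need to read that string back. The following is a minimal sketch that assumes exactly that format; `LearningMeta` and `parse_description` are hypothetical names for illustration, not existing makima APIs, and treating `tags` as optional is an assumption.

```rust
use std::collections::HashMap;

/// Hypothetical parsed form of a learning's `description` metadata.
#[derive(Debug, PartialEq)]
pub struct LearningMeta {
    pub category: String,
    pub severity: String,
    pub tags: Vec<String>,
}

/// Parses "category=...; tags=a,b,c; severity=..." into a `LearningMeta`.
/// Returns `None` if a pair lacks `=` or a required key is missing.
pub fn parse_description(desc: &str) -> Option<LearningMeta> {
    let mut fields: HashMap<&str, &str> = HashMap::new();
    for pair in desc.split(';') {
        if pair.trim().is_empty() {
            continue; // tolerate a trailing `;`
        }
        let (key, value) = pair.split_once('=')?;
        fields.insert(key.trim(), value.trim());
    }
    // Assumption: tags are optional; a missing key yields an empty list.
    let tags: Vec<String> = fields
        .get("tags")
        .map(|t| {
            t.split(',')
                .map(|s| s.trim().to_string())
                .filter(|s| !s.is_empty())
                .collect()
        })
        .unwrap_or_default();
    Some(LearningMeta {
        category: fields.get("category")?.to_string(),
        severity: fields.get("severity")?.to_string(),
        tags,
    })
}
```

A flat string keeps the existing `File` shape unchanged, at the cost of a parse step on every read; a dedicated metadata field would avoid that if the schema can be extended.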
#### B. Repository Files (Searchable, Portable)
```
docs/solutions/
├── build-errors/
│   ├── mixed-esm-cjs-resolution.md
│   └── docker-multi-stage-cache.md
├── test-failures/
│   ├── async-test-timeout-patterns.md
│   └── mock-service-worker-setup.md
├── api-patterns/
│   └── pagination-cursor-vs-offset.md
├── architecture-decisions/
│   └── event-sourcing-tradeoffs.md
├── performance-optimizations/
│   └── database-connection-pooling.md
├── security-practices/
│   └── jwt-refresh-token-rotation.md
├── debugging-techniques/
│   └── distributed-tracing-setup.md
├── tooling-configurations/
│   └── github-actions-cache-strategy.md
└── domain-knowledge/
    └── payment-processing-idempotency.md
```
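
The layout above implies a mapping from a learning's category and title to a file path. One possible way to derive it is sketched below. `slugify` and `learning_repo_path` are hypothetical helpers, and the slug rule (lowercase ASCII alphanumerics joined by single hyphens) is an assumption: the proposal does not define how file names are chosen, and the example file name `mixed-esm-cjs-resolution.md` is shorter than what this rule would produce from the full title.

```rust
/// Lowercases ASCII letters and digits and collapses every other run of
/// characters into a single hyphen. Assumed rule, not specified by the proposal.
pub fn slugify(title: &str) -> String {
    let mut slug = String::new();
    for c in title.chars() {
        if c.is_ascii_alphanumeric() {
            slug.push(c.to_ascii_lowercase());
        } else if !slug.is_empty() && !slug.ends_with('-') {
            slug.push('-');
        }
    }
    slug.trim_end_matches('-').to_string()
}

/// Builds `<repo_path>/<category>/<slug>.md`, matching the tree shown above.
pub fn learning_repo_path(repo_path: &str, category: &str, title: &str) -> String {
    format!(
        "{}/{}/{}.md",
        repo_path.trim_end_matches('/'),
        category,
        slugify(title)
    )
}
```

With `repo_path` set to `docs/solutions`, the category `build-errors` and the title "Mixed ESM/CJS Import Resolution" map to `docs/solutions/build-errors/mixed-esm-cjs-import-resolution.md`.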
### 5. Auto-Surface Relevant Learnings
When a new contract is created, automatically search for relevant learnings:
```bash
# Supervisor plan template automatically includes:
# "Search existing learnings relevant to this task"
makima supervisor search-learnings "webpack bundling errors"
makima supervisor search-learnings --category "build-errors" --tags "webpack"
makima supervisor search-learnings --repo "github.com/org/repo"
```
**Search algorithm:**
```
Relevance Score =
    keyword_match_score  * 0.4
  + category_match_score * 0.2
  + tag_overlap_score    * 0.2
  + recency_score        * 0.1   # Decays over time
  + quality_score        * 0.1   # Higher quality = more relevant
```
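
The weighted sum above could be implemented roughly as follows. This is a sketch under stated assumptions: each signal is taken to be normalized to the range 0 to 1, `SignalScores` and both function names are hypothetical, and computing tag overlap as Jaccard similarity is an assumption, since the proposal does not define how `tag_overlap_score` is calculated.

```rust
use std::collections::HashSet;

/// Per-signal scores; each is assumed to lie in [0.0, 1.0].
#[derive(Debug, Clone, Copy)]
pub struct SignalScores {
    pub keyword_match: f64,
    pub category_match: f64,
    pub tag_overlap: f64,
    pub recency: f64,
    pub quality: f64,
}

/// Weighted sum using the weights from the formula above (they sum to 1.0).
/// Inputs are clamped so one out-of-range signal cannot dominate the result.
pub fn relevance_score(s: &SignalScores) -> f64 {
    let clamp = |v: f64| v.clamp(0.0, 1.0);
    clamp(s.keyword_match) * 0.4
        + clamp(s.category_match) * 0.2
        + clamp(s.tag_overlap) * 0.2
        + clamp(s.recency) * 0.1
        + clamp(s.quality) * 0.1
}

/// Assumed definition of tag overlap: Jaccard similarity |A ∩ B| / |A ∪ B|.
pub fn tag_overlap(query_tags: &[&str], learning_tags: &[&str]) -> f64 {
    let a: HashSet<&str> = query_tags.iter().copied().collect();
    let b: HashSet<&str> = learning_tags.iter().copied().collect();
    let union = a.union(&b).count();
    if union == 0 {
        return 0.0; // no tags on either side: treat as no overlap
    }
    a.intersection(&b).count() as f64 / union as f64
}
```

Because the weights sum to 1.0 and every input is clamped, the result also stays between 0 and 1, so it is directly comparable with thresholds such as `min_relevance`.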
**Integration with plan phase:**
```
┌──────────────┐ ┌───────────────────┐
│ New Contract │──────▶│ Plan Phase        │
│ Created      │       │                   │
└──────────────┘       │ 1. Create plan    │
                       │ 2. Search for     │◀── Learnings DB
                       │    relevant       │
                       │    learnings      │
                       │ 3. Inject context │
                       │    into plan      │
                       └───────────────────┘
```
### 6. Quality Control
#### Relevance Decay
Learnings lose relevance over time unless accessed:
```
effective_relevance = quality_score * (decay_factor ^ months_since_creation)
                    + min(access_bonus * recent_access_count, max_access_bonus)
```
- Default decay factor: 0.95/month (a learning retains about 54% of its quality score after 1 year, since 0.95^12 ≈ 0.54, before any access bonus)
- Access bonus: +0.05 per access (caps at +0.25)
- Learnings below 0.3 effective relevance are archived
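
The decay rule and its defaults can be sketched as below. `DecayConfig`, `effective_relevance` and `should_archive` are hypothetical names; the default values come from the bullets above. Counting age in whole months and leaving the result uncapped at 1.0 are assumptions, as the proposal specifies neither.

```rust
/// Decay parameters; the defaults mirror the values listed above.
#[derive(Debug, Clone, Copy)]
pub struct DecayConfig {
    pub factor: f64,
    pub access_bonus: f64,
    pub max_access_bonus: f64,
    pub archive_threshold: f64,
}

impl Default for DecayConfig {
    fn default() -> Self {
        Self {
            factor: 0.95,
            access_bonus: 0.05,
            max_access_bonus: 0.25,
            archive_threshold: 0.3,
        }
    }
}

/// quality_score * factor^months, plus a per-access bonus capped at `max_access_bonus`.
pub fn effective_relevance(
    quality_score: f64,
    months_since_creation: u32,
    recent_access_count: u32,
    cfg: &DecayConfig,
) -> f64 {
    let decayed = quality_score * cfg.factor.powi(months_since_creation as i32);
    let bonus = (cfg.access_bonus * f64::from(recent_access_count)).min(cfg.max_access_bonus);
    decayed + bonus
}

/// A learning is archived once its effective relevance drops below the threshold.
pub fn should_archive(effective: f64, cfg: &DecayConfig) -> bool {
    effective < cfg.archive_threshold
}
```

Under these defaults, a learning with a perfect `quality_score` of 1.0 and no accesses stays above the 0.3 threshold for 23 months (0.95^23 ≈ 0.307) and is archived at 24 months (0.95^24 ≈ 0.292). Learnings with lower quality scores are archived sooner.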
#### Deduplication
When a new learning is created, check for existing similar learnings:
```
similarity = cosine_similarity(new_learning_embedding, existing_learning_embedding)
if similarity > 0.85:
    merge_or_update(existing_learning, new_learning)
elif similarity > 0.70:
    link_as_related(new_learning, existing_learning)
```
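
The threshold logic above can be sketched in Rust as follows. The names are hypothetical, embeddings are assumed to be plain `f64` slices of equal length, and the `StoreAsNew` outcome is an assumption for the case the pseudocode leaves implicit (similarity at or below 0.70). How embeddings are produced is an open question in this proposal and is out of scope here.

```rust
#[derive(Debug, PartialEq)]
pub enum DedupAction {
    MergeOrUpdate,
    LinkAsRelated,
    /// Assumed default when neither threshold is exceeded.
    StoreAsNew,
}

/// Cosine similarity of two embeddings: dot(a, b) / (|a| * |b|).
/// Returns `None` for empty, mismatched, or zero-length vectors.
pub fn cosine_similarity(a: &[f64], b: &[f64]) -> Option<f64> {
    if a.is_empty() || a.len() != b.len() {
        return None;
    }
    let dot: f64 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f64>().sqrt();
    let norm_b = b.iter().map(|y| y * y).sum::<f64>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a * norm_b))
}

/// Applies the 0.85 and 0.70 thresholds; both comparisons are strict,
/// matching the `>` in the pseudocode above.
pub fn dedup_action(similarity: f64) -> DedupAction {
    if similarity > 0.85 {
        DedupAction::MergeOrUpdate
    } else if similarity > 0.70 {
        DedupAction::LinkAsRelated
    } else {
        DedupAction::StoreAsNew
    }
}
```

In practice the new learning would be compared against every existing learning (or the nearest neighbours from an index), and the action taken for the highest similarity found.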
#### Quality Gate
Before storing a learning, validate:
| Check | Threshold | Action if Failed |
|-------|-----------|------------------|
| Has problem statement | Required | Reject |
| Has solution | Required | Reject |
| Has prevention strategy | Recommended | Warn, store with quality penalty |
| Code examples present | Recommended | Warn, store with quality penalty |
| Category valid | Required | Auto-classify |
| Not a duplicate | Similarity ≤ 0.85 to every existing learning | Merge with existing |
| Minimum length | >200 characters | Reject |
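
The reject and warn rows of the table can be sketched as a single validation function. `LearningDraft`, `GateOutcome` and `quality_gate` are hypothetical names, and the boolean fields assume that section detection has already happened upstream. Category auto-classification and duplicate merging are left out because they need other components, and the size of the quality penalty is not specified in the proposal, so the sketch only reports which warnings apply.

```rust
/// Hypothetical pre-storage view of a learning; field names are illustrative.
pub struct LearningDraft {
    pub body: String,
    pub has_problem: bool,
    pub has_solution: bool,
    pub has_prevention: bool,
    pub has_code_example: bool,
}

#[derive(Debug, PartialEq)]
pub enum GateOutcome {
    /// A required check failed; the learning is not stored.
    Rejected(&'static str),
    /// Stored; each warning corresponds to one "Recommended" check that failed.
    Accepted { warnings: Vec<&'static str> },
}

pub fn quality_gate(draft: &LearningDraft) -> GateOutcome {
    // Required checks: any failure rejects the learning.
    if !draft.has_problem {
        return GateOutcome::Rejected("missing problem statement");
    }
    if !draft.has_solution {
        return GateOutcome::Rejected("missing solution");
    }
    // The table reads ">200 characters", so exactly 200 is rejected.
    if draft.body.chars().count() <= 200 {
        return GateOutcome::Rejected("body too short");
    }
    // Recommended checks: collect warnings; the caller applies the penalty.
    let mut warnings = Vec::new();
    if !draft.has_prevention {
        warnings.push("no prevention strategy");
    }
    if !draft.has_code_example {
        warnings.push("no code example");
    }
    GateOutcome::Accepted { warnings }
}
```

Returning the warnings rather than an adjusted score keeps the penalty policy in one place, so it can be tuned without touching the checks.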
---
## Integration with Existing Makima Features
### Contract Phases
The compound phase integrates into the existing phase system:
```rust
// New phase variant
enum ContractPhase {
    Research,
    Specify,
    Plan,
    Execute,
    Review,
    Compound, // NEW
}
```
- Contracts with `contract_type: "specification"` get the full 6-phase cycle
- Contracts with `contract_type: "simple"` can opt-in via config
- Phase guard still applies: user must approve transition to compound
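
The transition rules described above might look like the following sketch. It assumes the linear phase order from the lifecycle diagram; `next_phase` and `compound_enabled` are hypothetical helpers, the enum is redeclared only to keep the example self-contained, and treating unknown contract types as not eligible is an assumption. User approval (the phase guard) is not modelled here.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ContractPhase {
    Research,
    Specify,
    Plan,
    Execute,
    Review,
    Compound,
}

/// "specification" contracts always get the compound phase; "simple"
/// contracts only when they opt in. Other types: assumed not eligible.
pub fn compound_enabled(contract_type: &str, opted_in: bool) -> bool {
    match contract_type {
        "specification" => true,
        "simple" => opted_in,
        _ => false,
    }
}

/// Returns the phase that follows `current`, or `None` when the contract is done.
/// Review leads to Compound only when the compound phase is enabled.
pub fn next_phase(current: ContractPhase, compound_enabled: bool) -> Option<ContractPhase> {
    use ContractPhase::*;
    match current {
        Research => Some(Specify),
        Specify => Some(Plan),
        Plan => Some(Execute),
        Execute => Some(Review),
        Review if compound_enabled => Some(Compound),
        Review | Compound => None,
    }
}
```

Keeping `Compound` as a terminal phase means contracts that skip it still finish at `Review`, so existing contracts are unaffected by the new variant.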
### Contract Files
Learnings are first-class contract files and reuse the existing:
- Versioning system
- Structured body format (`BodyElement` types)
- Repository file sync (`repo_file_path`, `repo_sync_status`)
- Phase association (`contract_phase: "compound"`)
### Directive System
For directive-based workflows, learnings can be captured per-step:
```rust
DirectiveStep {
    name: "compound-step-3",
    description: "Capture learnings from database migration step",
    depends_on: [step_3_id, review_step_id],
    task_plan: "Extract and document learnings from the completed migration...",
}
```
### Supervisor CLI
New commands integrate with existing CLI infrastructure:
```bash
# In supervisor context
makima supervisor compound # Run compound phase
makima supervisor search-learnings "query" # Search knowledge base
makima supervisor list-learnings # List all learnings
makima supervisor learning-stats # Knowledge base statistics
```
---
## Implementation Plan
### Phase 1: Core Infrastructure (4-5 days)
| Task | Effort | Description |
|------|--------|-------------|
| Add `compound` phase to contract lifecycle | 1 day | New phase enum, transition rules |
| Learning document schema | 1 day | Metadata structure, validation |
| `supervisor compound` command | 1-2 days | Spawn learning sub-agents |
| Repository file sync for learnings | 1 day | Write to `docs/solutions/` |
### Phase 2: Search & Retrieval (3-5 days)
| Task | Effort | Description |
|------|--------|-------------|
| `search-learnings` command | 1-2 days | Keyword + category search |
| Auto-surface in plan phase | 1-2 days | Inject relevant learnings into plans |
| Learning index | 1 day | Category/tag index for fast lookup |
### Phase 3: Quality & Maintenance (3-5 days)
| Task | Effort | Description |
|------|--------|-------------|
| Quality gate validation | 1 day | Pre-storage checks |
| Relevance decay system | 1 day | Scheduled decay + access tracking |
| Deduplication check | 1-2 days | Similarity detection and merging |
| Documentation & defaults | 1 day | User guide, default categories |
---
## Configuration Examples
### Enable Compound Phase (Contract-Level)
```yaml
# Contract configuration
compound:
  enabled: true
  auto_trigger: true              # Auto-run after review completes
  categories:                     # Override default categories
    - build-errors
    - test-failures
    - api-patterns
    - architecture-decisions
    - performance-optimizations
    - security-practices
    - debugging-techniques
    - tooling-configurations
    - domain-knowledge
  quality_gate:
    min_length: 200
    require_problem: true
    require_solution: true
    require_prevention: false
  storage:
    contract_files: true          # Store as contract files
    repo_files: true              # Also write to docs/solutions/
    repo_path: "docs/solutions"
```
### Repository-Level Configuration (`.makima/compound.yaml`)
```yaml
# .makima/compound.yaml
version: 1
compound:
  # Default settings for all contracts in this repo
  auto_trigger: true
  # Custom categories for this project
  categories:
    - build-errors
    - test-failures
    - api-patterns
    - payment-processing          # Custom domain category
    - compliance-requirements     # Custom domain category
  # Search settings
  search:
    max_results: 10
    min_relevance: 0.3
    include_archived: false
  # Decay settings
  decay:
    factor: 0.95                  # Per month
    archive_threshold: 0.3
    access_bonus: 0.05
    max_access_bonus: 0.25
```
### Searching Learnings
```bash
# Full-text search
makima supervisor search-learnings "webpack ESM import error"
# Category filter
makima supervisor search-learnings --category build-errors
# Tag filter
makima supervisor search-learnings --tags webpack,esm
# Repository filter
makima supervisor search-learnings --repo github.com/org/repo
# Combined
makima supervisor search-learnings "import error" \
  --category build-errors \
  --tags webpack \
  --min-relevance 0.5 \
  --limit 5
```
---
## Open Questions
1. **Cross-repository knowledge**: Should learnings be scoped to a single repository or shared across all repositories for an owner?
2. **Learning ownership**: Who owns a learning — the contract creator, the repository, or the organization?
3. **Privacy**: Are learnings visible to all users, or scoped by access control?
4. **Embedding model**: For similarity-based deduplication and search, which embedding model should be used? Trade-off between quality and cost.
5. **Storage limits**: Should there be a cap on the number of learnings per repository/owner?
6. **Manual curation**: Should users be able to manually create, edit, or delete learnings outside the compound phase?
7. **Export/import**: Should learnings be exportable/importable across makima instances?
---
## Alternatives Considered
| Alternative | Pros | Cons | Decision |
|-------------|------|------|----------|
| Store learnings only in contract files | Simple, uses existing infrastructure | Not easily searchable across contracts | Rejected — search is critical |
| Store learnings only in repo files | Portable, version-controlled, greppable | Lost if repo deleted; no cross-repo search | Partial — use as secondary storage |
| Use external knowledge base (e.g., vector DB) | Best search quality | Added infrastructure dependency | Deferred — consider for v2 |
| Manual-only knowledge capture | No noise | Knowledge rarely captured | Rejected — must be automatic |
| Full contract history indexing | Most complete | Massive storage, noise, privacy concerns | Rejected — signal-to-noise ratio too low |
---
## Priority & Complexity Assessment
- **Priority: HIGH** — This is the defining feature of compound engineering. Without knowledge accumulation, every contract starts from scratch. This is the feature that creates compounding returns.
- **Complexity: MEDIUM** — Core capture and storage is straightforward using existing contract files and repo sync. Search quality and relevance decay require iterative refinement.
- **Risk: MEDIUM** — The primary risk is low adoption (users skip the compound phase), mitigated by auto-trigger. The secondary risk is knowledge base noise, mitigated by quality gates.