# Feature Proposal: Structured Findings / Issues Tracking
> **Priority:** Medium
> **Complexity:** Low
> **Estimated Effort:** 7-10 days
> **Status:** Proposal
> **Date:** 2026-02-09
> **Dependencies:** None (standalone, but enhances [Multi-Agent Review](feature-multi-agent-review.md))
> **Related:** [Overview Analysis](compound-engineering-analysis.md) · [Multi-Agent Review](feature-multi-agent-review.md) · [Workflow Presets](feature-workflow-presets.md)
---
## Problem Statement
Currently, review outputs in makima are **unstructured text** in task conversation history:
- **No standard format** for reporting issues found during review
- **No severity classification** — all findings are treated equally
- **No lifecycle tracking** — findings are either "in the review output" or "hopefully fixed"
- **No verification** — there's no way to confirm a finding was actually resolved
- **No aggregation** — findings from multiple review tasks can't be collected and deduplicated
- **No blocking mechanism** — critical findings can't prevent phase transitions
- **No metrics** — no data on how many findings are produced, resolved, or escaped
This makes the review phase a documentation exercise rather than a quality gate.
---
## How Compound Engineering Solves This
The compound engineering plugin uses **structured TODO/finding files** with YAML frontmatter and a defined lifecycle:
### File Format
```markdown
---
id: SEC-001
status: open
priority: P1
category: security
title: SQL injection in user search endpoint
file: src/api/users.rs
line: 47
agent: security-sentinel
created: 2026-02-09T10:30:00Z
updated: 2026-02-09T10:30:00Z
tags: [injection, input-validation, database]
---
# SQL Injection in User Search Endpoint
## Finding
The `search_users` handler directly interpolates the `query` parameter into
a SQL string without parameterization.
## Evidence
```rust
// src/api/users.rs:47
let sql = format!("SELECT * FROM users WHERE name LIKE '%{}%'", query);
```
## Impact
An attacker can execute arbitrary SQL queries, potentially:
- Exfiltrating all user data
- Modifying or deleting records
- Escalating privileges
## Recommendation
Use parameterized queries:
```rust
let results = sqlx::query("SELECT * FROM users WHERE name LIKE $1")
.bind(format!("%{}%", query))
.fetch_all(&pool)
.await?;
```
## Resolution
_Not yet resolved_
```
### File Naming Convention
```
findings/{issue_id}-{status}-{priority}-{description}.md
```
Example: `findings/SEC-001-open-P1-sql-injection-user-search.md`
### Lifecycle
```
open ──▶ in-progress ──▶ resolved ──▶ verified
│ │
└── wont-fix ◀────────────┘
```
---
## Proposed Makima Implementation
### 1. Finding Record Format
Findings are stored as **contract files** with structured metadata and body:
```rust
// Finding metadata (stored in file description as structured JSON)
#[derive(Serialize, Deserialize)]
pub struct FindingMetadata {
pub id: String, // "SEC-001", auto-generated
pub status: FindingStatus, // open, in_progress, resolved, verified, wont_fix
pub severity: FindingSeverity, // P1 (critical), P2 (major), P3 (minor)
pub category: String, // security, performance, architecture, etc.
pub title: String, // Short description
pub file_path: Option<String>, // Affected file
pub line_number: Option<u32>, // Affected line
pub source_agent: Option<String>, // Which review agent found this
pub source_task_id: Option<Uuid>, // Task that produced this finding
pub assigned_to: Option<Uuid>, // Task assigned to resolve this
pub created_at: DateTime<Utc>,
pub updated_at: DateTime<Utc>,
pub resolved_at: Option<DateTime<Utc>>,
pub verified_at: Option<DateTime<Utc>>,
pub tags: Vec<String>,
}
pub enum FindingStatus {
Open,
InProgress,
Resolved,
Verified,
WontFix,
}
pub enum FindingSeverity {
P1, // Critical — must fix before merge
P2, // Major — should fix, can defer with justification
P3, // Minor — nice to fix, can defer
}
```
### 2. Supervisor Commands
#### Create a Finding
```bash
# Create a finding from review output
makima supervisor finding create \
--severity P1 \
--category security \
--title "SQL injection in user search endpoint" \
--file src/api/users.rs \
--line 47 \
--description "Direct string interpolation in SQL query"
# Output: Created finding SEC-001 (P1/security)
```
#### List Findings
```bash
# List all findings for the current contract
makima supervisor finding list
# Output:
# ID SEVERITY STATUS CATEGORY TITLE
# SEC-001 P1 open security SQL injection in user search
# PERF-001 P2 in-progress performance N+1 query in order listing
# ARCH-001 P3 resolved architecture Handler accessing DB directly
# Filter by severity
makima supervisor finding list --severity P1
# Filter by status
makima supervisor finding list --status open
# Summary only
makima supervisor finding summary
# Output:
# Total: 12 findings
# P1: 2 open, 1 resolved
# P2: 3 open, 2 in-progress
# P3: 4 resolved
```
#### Update Finding Status
```bash
# Mark as in-progress (assigned to a task)
makima supervisor finding update SEC-001 --status in-progress --assigned-to <task-id>
# Mark as resolved
makima supervisor finding update SEC-001 --status resolved \
--resolution "Replaced with parameterized query in commit abc123"
# Mark as verified (after re-review)
makima supervisor finding update SEC-001 --status verified
# Mark as won't fix
makima supervisor finding update SEC-001 --status wont-fix \
--justification "Endpoint is internal-only, behind auth"
```
#### Auto-Create from Review Output
```bash
# Parse review agent output and create findings automatically
makima supervisor finding parse-output --task-id <review-task-id>
```
This parses structured review output and creates individual finding records.
### 3. Finding Lifecycle
```
┌────────────────────────────────────────────────────────────┐
│ Finding Lifecycle │
│ │
│ ┌──────┐ ┌─────────────┐ ┌──────────┐ │
│ │ │ │ │ │ │ │
│ │ OPEN │───▶│ IN-PROGRESS │───▶│ RESOLVED │ │
│ │ │ │ │ │ │ │
│ └──┬───┘ └─────────────┘ └────┬─────┘ │
│ │ │ │
│ │ ┌─────────────┐ ┌────┴─────┐ │
│ │ │ │ │ │ │
│ └───────▶│ WONT-FIX │ │ VERIFIED │ │
│ │ │ │ │ │
│ └─────────────┘ └──────────┘ │
│ │
│ Triggers: │
│ open ─▶ in_progress : Task assigned to fix │
│ in_progress ─▶ resolved : Fix committed │
│ resolved ─▶ verified : Re-review confirms fix │
│ open ─▶ wont_fix : Explicit decision with justification │
│ resolved ─▶ wont_fix : Fix deemed unnecessary after review│
└────────────────────────────────────────────────────────────┘
```
### 4. P1/P2/P3 Severity System
| Severity | Name | Description | Merge Policy |
|----------|------|-------------|--------------|
| **P1** | Critical | Security vulnerabilities, data loss risks, crash bugs | **Blocks merge** — must be resolved before contract completion |
| **P2** | Major | Performance issues, architectural concerns, significant tech debt | **Should fix** — can defer with explicit justification |
| **P3** | Minor | Style issues, minor improvements, documentation gaps | **Nice to fix** — can defer freely |
### 5. Merge Blocking
When findings exist, phase transitions and merge operations check for blockers:
```rust
// In advance-phase handler
async fn check_findings_gate(contract_id: Uuid) -> Result<bool> {
let findings = get_findings(contract_id).await?;
let open_p1s = findings.iter()
.filter(|f| f.severity == P1 && f.status == Open)
.count();
if open_p1s > 0 {
warn!("{} open P1 findings block phase transition", open_p1s);
return Ok(false);
}
Ok(true)
}
```
### 6. Auto-Resolution Workflow
When the Multi-Agent Review feature is available, findings drive an automated resolution cycle:
```
┌──────────┐ ┌───────────┐ ┌──────────┐ ┌──────────┐
│ Review │────▶│ Findings │────▶│ Resolve │────▶│ Verify │
│ Phase │ │ Created │ │ Tasks │ │ Fixes │
│ │ │ (P1/P2/P3)│ │ Spawned │ │ Pass? │
└──────────┘ └───────────┘ └──────────┘ └────┬─────┘
│
Yes │ No
┌────┴────┐
▼ ▼
┌──────────┐ Loop back
│ Findings │ to resolve
│ Verified │
└──────────┘
```
```bash
# Auto-resolve: spawn tasks to fix each P1/P2 finding
makima supervisor finding auto-resolve --severity P1,P2
# This spawns one task per finding:
# - Task plan includes the finding details and recommendation
# - Task is assigned to the finding (finding.assigned_to = task.id)
# - When task completes, finding status → resolved
# - Verification task confirms the fix
```
---
## Integration with Existing Makima Features
### Contract Files
Each finding is stored as a **contract file**:
```rust
File {
contract_id: Some(contract.id),
contract_phase: Some("review"),
name: "Finding: SEC-001 — SQL injection in user search",
description: Some(serde_json::to_string(&finding_metadata)?),
body: vec![
BodyElement::Heading { level: 1, text: finding.title },
BodyElement::Heading { level: 2, text: "Finding" },
BodyElement::Paragraph { text: finding.description },
BodyElement::Heading { level: 2, text: "Evidence" },
BodyElement::Code { language: Some("rust"), content: finding.evidence },
BodyElement::Heading { level: 2, text: "Recommendation" },
BodyElement::Paragraph { text: finding.recommendation },
],
}
```
### Phase Guards
Findings integrate with existing phase guards:
- Phase guard checks finding gate before allowing transition
- User sees a summary of open findings when reviewing phase transition
- P1 findings produce a warning that requires explicit override
### Supervisor Questions
When P1 findings block a transition, the supervisor can ask:
```bash
makima supervisor ask \
"2 P1 findings are still open. How would you like to proceed?" \
--choices "Fix findings first,Override and continue,Mark as won't-fix" \
--context "SEC-001: SQL injection (P1), PERF-001: Memory leak (P1)"
```
### Task Assignment
Findings reference tasks:
- `source_task_id`: The review task that discovered the finding
- `assigned_to`: The task spawned to resolve the finding
```bash
# Spawn a fix task and assign the finding
makima supervisor spawn "fix-sec-001" \
--plan "Fix SQL injection vulnerability in src/api/users.rs:47. Use parameterized queries."
makima supervisor finding update SEC-001 \
--status in-progress \
--assigned-to <spawned-task-id>
```
### Autonomous Loop
The autonomous loop can use findings as a completion gate condition:
```xml
<COMPLETION_GATE>
ready: false
reason: "2 P1 findings still open"
progress: "Resolved 5/7 findings"
blockers: ["SEC-001: SQL injection", "PERF-001: Memory leak"]
</COMPLETION_GATE>
```
---
## Implementation Plan
### Phase 1: Core Finding System (3-4 days)
| Task | Effort | Description |
|------|--------|-------------|
| Finding metadata schema | 0.5 days | FindingMetadata struct, validation |
| `finding create` command | 1 day | Create finding as contract file |
| `finding list/summary` commands | 0.5 days | Query and display findings |
| `finding update` command | 0.5 days | Status transitions, validation |
| Auto-ID generation | 0.5 days | Category-based IDs (SEC-001, PERF-002) |
### Phase 2: Integration (2-3 days)
| Task | Effort | Description |
|------|--------|-------------|
| Phase guard integration | 0.5 days | Check P1 findings before transition |
| `finding parse-output` | 1 day | Parse review task output into findings |
| Merge blocking logic | 0.5 days | Block merge with open P1s |
| Finding assignment to tasks | 0.5 days | Track resolution via task ID |
### Phase 3: Automation & Polish (2-3 days)
| Task | Effort | Description |
|------|--------|-------------|
| `finding auto-resolve` | 1 day | Spawn fix tasks per finding |
| Verification workflow | 0.5 days | Re-review to verify fixes |
| Finding reports | 0.5 days | Summary contract file |
| Documentation | 0.5 days | User guide |
| Tests | 0.5 days | Unit + integration |
---
## Configuration Examples
### Finding Creation in Review Agent Output
Review agents produce structured findings in their output:
```markdown
## FINDING: SQL Injection in User Search
- **Severity**: P1
- **Category**: security
- **File**: src/api/users.rs
- **Line**: 47
- **Tags**: injection, input-validation, database
### Description
The `search_users` handler directly interpolates the `query` parameter...
### Evidence
```rust
let sql = format!("SELECT * FROM users WHERE name LIKE '%{}%'", query);
```
### Recommendation
Use parameterized queries with sqlx::query().bind()
```
The synthesis step parses these into formal Finding records.
### Merge Blocking Configuration
```yaml
# .makima/review-agents.yaml (or contract config)
review:
findings:
merge_blocking_severity: P1 # P1 blocks merge
require_justification: P2 # P2 needs justification to defer
auto_resolve: true # Spawn fix tasks for P1/P2
auto_resolve_severity: P1,P2 # Which severities to auto-resolve
verification:
enabled: true # Re-review after resolution
re_review_agents: # Which agents verify fixes
- security-sentinel # Security findings verified by security agent
```
### Finding Lifecycle Example
```bash
# 1. Review creates finding
makima supervisor finding create --severity P1 --category security \
--title "SQL injection in user search" --file src/api/users.rs --line 47
# 2. Auto-resolve spawns fix task
makima supervisor finding auto-resolve --severity P1
# → Spawns task "fix-SEC-001" with plan based on finding details
# 3. Fix task completes, finding auto-updated
# finding SEC-001: open → in-progress → resolved
# 4. Verification re-reviews the fix
makima supervisor finding verify SEC-001
# → Spawns verification task targeting the specific file/line
# 5. Verification passes
# finding SEC-001: resolved → verified
# 6. Phase transition allowed
makima supervisor advance-phase compound -y
```
---
## Open Questions
1. **Finding storage**: Contract files vs. dedicated findings table in the database? Contract files are simpler but querying is less efficient.
2. **Cross-contract findings**: Should findings persist across contracts? (e.g., a P2 deferred from one contract carries to the next)
3. **Finding templates**: Should common finding types have templates? (e.g., "SQL injection" pre-fills category, severity, recommendation)
4. **External integration**: Should findings be exportable to GitHub Issues, Jira, or other issue trackers?
5. **Metric tracking**: How granular should finding metrics be? Per-contract? Per-repository? Per-category?
6. **False positive handling**: How should agents indicate confidence level? Should low-confidence findings be automatically P3?
---
## Alternatives Considered
| Alternative | Pros | Cons | Decision |
|-------------|------|------|----------|
| GitHub Issues integration | Rich UI, collaboration | External dependency; not all projects use GitHub | Deferred — consider as export target |
| Plain text findings | Simple | Not queryable, no lifecycle | Rejected — defeats the purpose |
| Dedicated findings DB table | Fast queries, rich indexing | New infrastructure, migration | Recommended for v2 |
| Contract file-based | Uses existing infrastructure | Slower queries for large sets | Adopted for v1 |
| Inline code comments | Close to code | Lost on next commit; hard to track | Rejected — not persistent |
---
## Priority & Complexity Assessment
- **Priority: MEDIUM** — Structured findings transform the review phase from documentation to a quality gate. Essential for the Multi-Agent Review feature to produce actionable output.
- **Complexity: LOW** — Finding records are simple structured data. Lifecycle state machine is straightforward. Main integration point (phase guards) already exists.
- **Risk: LOW** — Purely additive feature. Worst case: findings exist but aren't used (same as today). Can be adopted incrementally.