| author | soryu <soryu@soryu.co> | 2026-02-02 02:34:50 +0000 |
|---|---|---|
| committer | soryu <soryu@soryu.co> | 2026-02-02 02:34:50 +0000 |
| commit | 151e9d87e117b7980e6aad522ac8f3633eeca87a | |
| tree | e80fb4301361b3b12e5abf8e442603db2d0622dc /.makima | |
| parent | a2c147ddd59f55a07b5be0c8970169726b55c876 | |
Make makima more opinionated and structured
Diffstat (limited to '.makima')
| -rw-r--r-- | .makima/specs/red-team-system.md | 748 |
1 files changed, 0 insertions, 748 deletions
diff --git a/.makima/specs/red-team-system.md b/.makima/specs/red-team-system.md deleted file mode 100644 index 31f4b78..0000000 --- a/.makima/specs/red-team-system.md +++ /dev/null @@ -1,748 +0,0 @@ -# Red Team System Specification - -## Overview - -The Red Team system is an adversarial review feature for makima contracts that provides real-time quality assurance during task execution. When enabled, a parallel "red team" task instance monitors the output of work tasks, verifying that implementations adhere to the contract requirements, repository standards, and the execution plan. - -### Goals - -1. **Quality Assurance**: Catch deviations from the plan before they compound -2. **Standards Compliance**: Ensure code follows repository conventions (CONTRIBUTING.md, linting rules, etc.) -3. **Contract Adherence**: Verify implementations match the specification and requirements -4. **Proactive Issue Detection**: Flag potential problems early, not after task completion - -### Non-Goals - -1. The red team should NOT write code or make commits -2. The red team should NOT be overly pedantic or block progress for minor style issues -3. The red team is NOT a replacement for code review - it's an early warning system - ---- - -## 1. 
Feature Overview - -### 1.1 Concept - -The Red Team operates as a parallel observer task that: -- Monitors all work task outputs in real-time via the broadcast system -- Has read-only access to task diffs and outputs -- Can access contract specifications, plans, and repository standards -- Can notify the supervisor when it detects issues requiring attention - -### 1.2 Relationship to Existing Components - -``` -┌─────────────────────────────────────────────────────────────┐ -│ Contract │ -│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │ -│ │ Supervisor │ │ Work Task 1 │ │ Work Task 2 │ │ -│ │ │<───│ │ │ │ │ -│ │ │<───│ │ │ │ │ -│ └──────────────┘ └──────────────┘ └──────────────┘ │ -│ ^ │ │ │ -│ │ outputs outputs │ -│ │ │ │ │ -│ [NOTIFY] v v │ -│ │ ┌─────────────────────────────┐ │ -│ └────────────│ Red Team Task │ │ -│ │ (Monitoring & Validation) │ │ -│ └─────────────────────────────┘ │ -└─────────────────────────────────────────────────────────────┘ -``` - -### 1.3 Task Type - -The Red Team task is a special task variant with the following characteristics: -- `is_red_team: true` flag on the Task model -- Has tool key for API access (like supervisor tasks) -- Does NOT have write permissions to the repository -- Subscribes to task output broadcasts -- Can use `makima red-team notify` command to alert supervisor - ---- - -## 2. Contract Configuration - -### 2.1 Contract Model Changes - -Add the following field to the `Contract` model in `makima/src/db/models.rs`: - -```rust -/// Contract record from the database -#[derive(Debug, Clone, FromRow, Serialize, ToSchema)] -#[serde(rename_all = "camelCase")] -pub struct Contract { - // ... existing fields ... - - /// Whether to spawn a red team task to monitor work tasks. - /// When enabled, a parallel task monitors outputs and can alert - /// the supervisor about potential issues. 
- #[serde(default)] - pub red_team_enabled: bool, - - /// Optional custom prompt/criteria for the red team to use - /// when evaluating task outputs. If not provided, uses default - /// quality criteria. - #[serde(skip_serializing_if = "Option::is_none")] - pub red_team_prompt: Option<String>, -} -``` - -### 2.2 CreateContractRequest Changes - -```rust -#[derive(Debug, Clone, Deserialize, ToSchema)] -#[serde(rename_all = "camelCase")] -pub struct CreateContractRequest { - // ... existing fields ... - - /// Enable red team monitoring for this contract. - /// When enabled, a parallel task monitors work task outputs - /// and can alert the supervisor about potential issues. - #[serde(default)] - pub red_team_enabled: Option<bool>, - - /// Optional custom criteria for the red team to evaluate. - /// Examples: "Focus on security vulnerabilities", - /// "Ensure all functions have tests", etc. - pub red_team_prompt: Option<String>, -} -``` - -### 2.3 CLI Flag for Contract Creation - -The daemon CLI should support red team enablement during contract creation: - -```bash -# Enable red team with default criteria -makima supervisor create --red-team "Contract Name" "Description" - -# Enable red team with custom review criteria -makima supervisor create --red-team --red-team-prompt "Focus on performance and memory usage" "Contract Name" "Description" -``` - ---- - -## 3. Red Team Task Lifecycle - -### 3.1 Spawning - -The red team task is spawned automatically when: -1. A contract has `red_team_enabled: true` -2. 
The first work task is spawned (not the supervisor itself) - -**Spawn Logic** (in `spawn_task` handler or supervisor spawn logic): - -```rust -// In spawn_task after creating a work task: -if contract.red_team_enabled && !is_supervisor_task { - // Check if red team task already exists - let existing_red_team = repository::get_red_team_task_for_contract(pool, contract_id).await?; - - if existing_red_team.is_none() { - // Spawn red team task - let red_team_task = spawn_red_team_task( - pool, - state, - contract_id, - owner_id, - contract.red_team_prompt.as_deref(), - ).await?; - - tracing::info!( - contract_id = %contract_id, - red_team_task_id = %red_team_task.id, - "Spawned red team task for contract" - ); - } -} -``` - -### 3.2 Task Properties - -When creating the red team task: - -```rust -CreateTaskRequest { - name: "Red Team Monitor".to_string(), - description: Some("Adversarial review task monitoring work task outputs".to_string()), - plan: generate_red_team_plan(contract, custom_prompt), - contract_id: Some(contract_id), - parent_task_id: None, // Not a child of supervisor - is_supervisor: false, - is_red_team: true, // NEW FIELD - // ... other fields ... -} -``` - -### 3.3 Lifespan - -The red team task: -- Lives for the duration of the **execute phase** -- Is automatically terminated when: - - The contract advances past the execute phase - - The contract is completed - - The contract is archived -- Can be paused/resumed along with other contract tasks -- Does NOT restart automatically after daemon failure (not critical path) - -### 3.4 Read-Only Enforcement - -The red team task: -- Has NO worktree of its own (or a read-only clone) -- Cannot use git operations (commit, branch, etc.) -- Can only READ files, not write them -- Has API access limited to read operations - ---- - -## 4. 
Red Team Notification CLI Command - -### 4.1 Command Specification - -New CLI command available only to red team tasks: - -```bash -makima red-team notify "<message>" -``` - -**Arguments:** -- `<message>`: A detailed description of the issue detected - -**Options:** -- `--severity <level>`: Issue severity: `low`, `medium`, `high`, `critical` (default: `medium`) -- `--task <task_id>`: The specific task this relates to (optional) -- `--file <path>`: The file path where the issue was detected (optional) -- `--context <text>`: Additional context about the issue (optional) - -**Example:** - -```bash -makima red-team notify "Task is adding console.log statements which violates the no-debug-logging rule in CONTRIBUTING.md" \ - --severity medium \ - --task 550e8400-e29b-41d4-a716-446655440000 \ - --file "src/api/handler.rs" -``` - -### 4.2 CLI Arguments Structure - -```rust -// In makima/src/daemon/cli/mod.rs - -/// Red Team subcommand - red team task commands. -#[derive(Subcommand, Debug)] -pub enum RedTeamCommand { - /// Send a notification to the supervisor about a detected issue. - /// Only available to red team tasks. - Notify(NotifyArgs), -} - -/// Arguments for red-team notify command. 
-#[derive(Args, Debug)] -pub struct NotifyArgs { - /// API URL - #[arg(long, env = "MAKIMA_API_URL", default_value = "https://api.makima.jp")] - pub api_url: String, - - /// API key for authentication - #[arg(long, env = "MAKIMA_API_KEY")] - pub api_key: String, - - /// Current task ID (must be a red team task) - #[arg(long, env = "MAKIMA_TASK_ID")] - pub task_id: Uuid, - - /// Contract ID - #[arg(long, env = "MAKIMA_CONTRACT_ID")] - pub contract_id: Uuid, - - /// The notification message - #[arg(index = 1)] - pub message: String, - - /// Severity level: low, medium, high, critical - #[arg(long, default_value = "medium")] - pub severity: String, - - /// Related task ID (optional) - #[arg(long)] - pub task: Option<Uuid>, - - /// Related file path (optional) - #[arg(long)] - pub file: Option<String>, - - /// Additional context (optional) - #[arg(long)] - pub context: Option<String>, -} -``` - -### 4.3 API Endpoint - -**POST** `/api/v1/mesh/red-team/notify` - -**Request Body:** -```json -{ - "message": "Issue description", - "severity": "medium", - "relatedTaskId": "uuid-optional", - "filePath": "src/path/optional.rs", - "context": "Additional context optional" -} -``` - -**Response:** -```json -{ - "notificationId": "uuid", - "delivered": true, - "supervisorTaskId": "uuid" -} -``` - -### 4.4 Notification Delivery - -When a red team notification is received: - -1. **Validate Caller**: Ensure the request comes from a valid red team task -2. **Find Supervisor**: Get the supervisor task for the contract -3. **Format Message**: Create an `[ACTION REQUIRED]` formatted message -4. 
**Send to Supervisor**: Inject the message into the supervisor's stdin via `SendMessage` command - -**Message Format:** - -``` -════════════════════════════════════════════════════════════════ -[RED TEAM ALERT] Severity: MEDIUM -════════════════════════════════════════════════════════════════ - -Issue: Task is adding console.log statements which violates the -no-debug-logging rule in CONTRIBUTING.md - -Related Task: 550e8400-e29b-41d4-a716-446655440000 -File: src/api/handler.rs - -Context: The CONTRIBUTING.md file explicitly states that debug -logging should use the tracing crate, not console.log or println! - -════════════════════════════════════════════════════════════════ -You can: -- Pause the related task to investigate -- Send feedback to the task to correct the issue -- Acknowledge this alert and continue monitoring -════════════════════════════════════════════════════════════════ -``` - -### 4.5 Supervisor Response Handling - -The supervisor can respond to red team notifications by: -1. **Pausing the task**: `makima supervisor pause <task_id>` -2. **Sending feedback**: `makima supervisor message <task_id> "Please use tracing instead of console.log"` -3. **Acknowledging**: Simply continue (the red team will keep monitoring) -4. **Dismissing**: Mark the alert as false positive (future consideration) - ---- - -## 5. 
Red Team Access Patterns - -### 5.1 Task Output Subscription - -The red team task subscribes to the `task_outputs` broadcast channel: - -```rust -// In red team task initialization -let mut task_output_rx = state.task_outputs.subscribe(); - -loop { - match task_output_rx.recv().await { - Ok(notification) => { - // Only process outputs from work tasks in our contract - if notification.contract_id == Some(self.contract_id) - && !notification.is_supervisor - && !notification.is_red_team { - self.analyze_output(notification).await; - } - } - Err(e) => { - tracing::warn!("Red team task output subscription error: {}", e); - } - } -} -``` - -### 5.2 Task Diff Access - -The red team can request diffs via the supervisor API: - -**GET** `/api/v1/mesh/supervisor/tasks/{task_id}/diff` - -This endpoint already exists and can be used by the red team (with tool key auth). - -### 5.3 Contract Information Access - -The red team can read: -- Contract plan and specifications (via contract files) -- Repository standards (CONTRIBUTING.md, .editorconfig, etc.) -- Task descriptions and plans - -**Existing endpoints used:** -- `GET /api/v1/contracts/{id}` - Contract details -- `GET /api/v1/contracts/{id}/files` - Contract files -- `GET /api/v1/files/{id}` - File content - -### 5.4 Repository File Access - -For repository standards, the red team uses the existing daemon file read capability: - -```bash -# Via makima CLI (from within the red team task) -makima supervisor read-file <task_id> "CONTRIBUTING.md" -makima supervisor read-file <task_id> ".editorconfig" -makima supervisor read-file <task_id> "rustfmt.toml" -``` - -Or direct filesystem access if the red team has a read-only worktree clone. - ---- - -## 6. System Prompt for Red Team Task - -The red team task receives a specialized system prompt that guides its behavior: - -```markdown -# Red Team Monitor - -You are an adversarial quality reviewer for a software development contract. 
Your role is to monitor work task outputs in real-time and flag potential issues BEFORE they compound into larger problems. - -## Your Mission - -Monitor all task outputs and verify: -1. **Plan Adherence**: Are tasks following the implementation plan? -2. **Code Quality**: Does the code meet repository standards? -3. **Contract Requirements**: Does the implementation match the specification? -4. **Best Practices**: Are there obvious anti-patterns or issues? - -## Access Available - -You have read-only access to: -- Task outputs (streamed in real-time) -- Task diffs (code changes) -- Contract specifications and plan documents -- Repository configuration files (CONTRIBUTING.md, linting configs, etc.) - -## How to Monitor - -1. **Subscribe to task outputs**: You'll receive outputs from all work tasks -2. **Analyze code changes**: Request diffs for completed tasks -3. **Cross-reference**: Compare outputs against the plan and specifications -4. **Report issues**: Use `makima red-team notify` when you detect problems - -## When to Notify - -NOTIFY the supervisor when you observe: -- **Critical**: Security vulnerabilities, data loss risks, breaking changes -- **High**: Significant deviations from the plan, major code quality issues -- **Medium**: Missing tests, suboptimal implementations, minor standard violations -- **Low**: Style inconsistencies, documentation gaps (use sparingly) - -## What NOT to Do - -- Do NOT nitpick minor style issues (that's what linters are for) -- Do NOT block progress for trivial concerns -- Do NOT write code or make changes yourself -- Do NOT notify for things that are already in progress and being addressed -- Do NOT create duplicate notifications for the same issue - -## Notification Format - -When notifying, always include: -1. A clear, concise description of the issue -2. The severity level (critical/high/medium/low) -3. The related task ID if applicable -4. The specific file or code location if known -5. 
Why this matters (reference to plan, spec, or standards) - -## Example Notification - -``` -makima red-team notify "Task is implementing authentication with plaintext password storage, which contradicts the security requirements in the specification document" \ - --severity critical \ - --task <task_id> \ - --file "src/auth/user.rs" \ - --context "Specification section 3.2 requires bcrypt hashing for all passwords" -``` - -## Custom Review Criteria - -{{#if red_team_prompt}} -Additional review criteria for this contract: -{{red_team_prompt}} -{{/if}} - -## Contract Context - -Contract: {{contract_name}} -Phase: {{contract_phase}} -Repository: {{repository_url}} - -Focus your monitoring on outputs that relate to the active work tasks. Prioritize issues that could affect the success of the contract or introduce technical debt. -``` - ---- - -## 7. API Changes Summary - -### 7.1 New Endpoints - -| Method | Path | Description | -|--------|------|-------------| -| POST | `/api/v1/mesh/red-team/notify` | Send notification from red team to supervisor | -| GET | `/api/v1/mesh/red-team/status` | Get red team task status for a contract | - -### 7.2 Modified Endpoints - -| Method | Path | Change | -|--------|------|--------| -| POST | `/api/v1/contracts` | Add `red_team_enabled` and `red_team_prompt` fields | -| GET | `/api/v1/contracts/{id}` | Include red team task info in response | - -### 7.3 New Request/Response Types - -**RedTeamNotifyRequest:** -```rust -#[derive(Debug, Deserialize, ToSchema)] -#[serde(rename_all = "camelCase")] -pub struct RedTeamNotifyRequest { - pub message: String, - #[serde(default = "default_severity")] - pub severity: String, - pub related_task_id: Option<Uuid>, - pub file_path: Option<String>, - pub context: Option<String>, -} -``` - -**RedTeamNotifyResponse:** -```rust -#[derive(Debug, Serialize, ToSchema)] -#[serde(rename_all = "camelCase")] -pub struct RedTeamNotifyResponse { - pub notification_id: Uuid, - pub delivered: bool, - pub 
supervisor_task_id: Uuid, -} -``` - -**RedTeamStatusResponse:** -```rust -#[derive(Debug, Serialize, ToSchema)] -#[serde(rename_all = "camelCase")] -pub struct RedTeamStatusResponse { - pub contract_id: Uuid, - pub red_team_task_id: Option<Uuid>, - pub status: Option<String>, - pub notifications_sent: i32, - pub last_activity: Option<DateTime<Utc>>, -} -``` - ---- - -## 8. Database Schema Changes - -### 8.1 Contracts Table - -```sql -ALTER TABLE contracts -ADD COLUMN red_team_enabled BOOLEAN NOT NULL DEFAULT FALSE, -ADD COLUMN red_team_prompt TEXT; -``` - -### 8.2 Tasks Table - -```sql -ALTER TABLE tasks -ADD COLUMN is_red_team BOOLEAN NOT NULL DEFAULT FALSE; -``` - -### 8.3 Red Team Notifications Table (New) - -```sql -CREATE TABLE red_team_notifications ( - id UUID PRIMARY KEY DEFAULT gen_random_uuid(), - contract_id UUID NOT NULL REFERENCES contracts(id) ON DELETE CASCADE, - red_team_task_id UUID NOT NULL REFERENCES tasks(id) ON DELETE CASCADE, - related_task_id UUID REFERENCES tasks(id) ON DELETE SET NULL, - - message TEXT NOT NULL, - severity VARCHAR(20) NOT NULL DEFAULT 'medium', - file_path TEXT, - context TEXT, - - -- Delivery status - delivered BOOLEAN NOT NULL DEFAULT FALSE, - delivered_at TIMESTAMP WITH TIME ZONE, - acknowledged BOOLEAN NOT NULL DEFAULT FALSE, - acknowledged_at TIMESTAMP WITH TIME ZONE, - - created_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW() -); - --- Indexes -CREATE INDEX idx_red_team_notifications_contract_id ON red_team_notifications(contract_id); -CREATE INDEX idx_red_team_notifications_red_team_task_id ON red_team_notifications(red_team_task_id); -CREATE INDEX idx_red_team_notifications_created_at ON red_team_notifications(created_at DESC); -``` - -### 8.4 Index for Red Team Task Lookup - -```sql -CREATE INDEX idx_tasks_contract_red_team ON tasks(contract_id, is_red_team) -WHERE is_red_team = TRUE; -``` - ---- - -## 9. 
Implementation Phases - -### Phase 1: Foundation (MVP) -- [ ] Add `red_team_enabled` and `red_team_prompt` to Contract model -- [ ] Add `is_red_team` to Task model -- [ ] Database migrations -- [ ] Basic red team task spawning logic -- [ ] `makima red-team notify` CLI command -- [ ] Red team notification API endpoint - -### Phase 2: Monitoring Infrastructure -- [ ] Task output subscription for red team -- [ ] Diff access for red team tasks -- [ ] Red team system prompt generation -- [ ] Notification delivery to supervisor - -### Phase 3: Polish & UX -- [ ] Red team status in contract view -- [ ] Notification history and acknowledgment -- [ ] TUI integration for red team alerts -- [ ] Frontend display of red team notifications - -### Phase 4: Future Enhancements -- [ ] Configurable notification thresholds -- [ ] Automatic pause on critical issues -- [ ] Red team notification digest/summary -- [ ] Integration with external code review tools - ---- - -## 10. Security Considerations - -### 10.1 Access Control - -- Red team tasks MUST only have read access -- Verify `is_red_team` flag before allowing notification API calls -- Red team cannot spawn tasks or modify contract state -- Tool key scope should be limited for red team tasks - -### 10.2 Abuse Prevention - -- Rate limit red team notifications (max 10 per minute per task) -- Prevent notification spam with deduplication -- Log all red team activities for audit - -### 10.3 Isolation - -- Red team task runs in separate worktree (or no worktree) -- Cannot affect work task execution directly -- Supervisor controls whether to act on notifications - ---- - -## 11. 
Testing Strategy - -### 11.1 Unit Tests - -- Contract model serialization with red team fields -- Red team task spawning conditions -- Notification message formatting - -### 11.2 Integration Tests - -- Full contract lifecycle with red team enabled -- Notification delivery to supervisor -- Red team output subscription - -### 11.3 E2E Tests - -- Create contract with `--red-team` flag -- Red team detects intentional violation -- Supervisor receives and responds to notification - ---- - -## 12. Success Metrics - -1. **Detection Rate**: Percentage of issues caught by red team before task completion -2. **False Positive Rate**: Percentage of notifications that are dismissed as not actionable -3. **Response Time**: Time between red team detection and supervisor acknowledgment -4. **Contract Success Rate**: Compare success rates for contracts with/without red team - ---- - -## Appendix A: Message Protocol - -### Task Output Notification Structure - -The red team subscribes to `TaskOutputNotification`: - -```rust -pub struct TaskOutputNotification { - pub task_id: Uuid, - pub owner_id: Option<Uuid>, - pub message_type: String, // "assistant", "tool_use", "tool_result", etc. - pub content: String, - pub tool_name: Option<String>, - pub tool_input: Option<serde_json::Value>, - pub is_error: Option<bool>, - pub cost_usd: Option<f64>, - pub duration_ms: Option<u64>, - pub is_partial: bool, -} -``` - -### Daemon Command for Supervisor Message - -```rust -DaemonCommand::SendMessage { - task_id: supervisor_id, - message: formatted_red_team_alert, -} -``` - ---- - -## Appendix B: Configuration Examples - -### Contract Creation with Red Team (API) - -```json -POST /api/v1/contracts -{ - "name": "Implement User Authentication", - "description": "Add OAuth2 authentication flow", - "contract_type": "specification", - "red_team_enabled": true, - "red_team_prompt": "Pay special attention to security best practices and OWASP guidelines. 
Flag any hardcoded secrets or insecure token handling." -} -``` - -### Contract Creation with Red Team (CLI) - -```bash -makima contract create \ - --type specification \ - --red-team \ - --red-team-prompt "Focus on API backwards compatibility and deprecation handling" \ - "API v2 Migration" \ - "Migrate public API from v1 to v2" -``` |
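The alert layout from section 4.4 of the deleted spec, together with the four severity levels from section 4.1, can be sketched with the standard library alone. This is an illustrative sketch, not code from the repository: the `Severity` enum, the `RedTeamAlert` struct, and `format_red_team_alert` are hypothetical names (the spec itself stores severity as a `String`), and the 64-character rule width is an assumption. The trailing "You can:" hint block from the spec is omitted for brevity.

```rust
use std::fmt;
use std::str::FromStr;

/// Hypothetical typed form of the severity levels named in section 4.1.
/// Variant order gives `Low < Medium < High < Critical` via `Ord`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Severity {
    Low,
    Medium,
    High,
    Critical,
}

impl FromStr for Severity {
    type Err = String;

    /// Parses case-insensitively and rejects anything outside the four levels.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.trim().to_ascii_lowercase().as_str() {
            "low" => Ok(Severity::Low),
            "medium" => Ok(Severity::Medium),
            "high" => Ok(Severity::High),
            "critical" => Ok(Severity::Critical),
            other => Err(format!("unknown severity: {other}")),
        }
    }
}

impl fmt::Display for Severity {
    /// Renders the upper-case label used in the alert header.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let label = match self {
            Severity::Low => "LOW",
            Severity::Medium => "MEDIUM",
            Severity::High => "HIGH",
            Severity::Critical => "CRITICAL",
        };
        f.write_str(label)
    }
}

/// Hypothetical alert payload mirroring the fields of `RedTeamNotifyRequest`.
/// IDs are plain strings here to avoid a dependency on the `uuid` crate.
pub struct RedTeamAlert<'a> {
    pub message: &'a str,
    pub severity: Severity,
    pub related_task_id: Option<&'a str>,
    pub file_path: Option<&'a str>,
    pub context: Option<&'a str>,
}

/// Builds the supervisor-facing alert text; optional fields are skipped
/// when absent instead of being printed as empty lines.
pub fn format_red_team_alert(alert: &RedTeamAlert<'_>) -> String {
    let rule = "═".repeat(64); // assumed width
    let mut lines: Vec<String> = vec![
        rule.clone(),
        format!("[RED TEAM ALERT] Severity: {}", alert.severity),
        rule.clone(),
        String::new(),
        format!("Issue: {}", alert.message),
        String::new(),
    ];
    if let Some(task) = alert.related_task_id {
        lines.push(format!("Related Task: {task}"));
    }
    if let Some(path) = alert.file_path {
        lines.push(format!("File: {path}"));
    }
    if let Some(ctx) = alert.context {
        lines.push(String::new());
        lines.push(format!("Context: {ctx}"));
    }
    lines.push(String::new());
    lines.push(rule);
    lines.join("\n")
}
```

Parsing the severity at the API boundary (for example `"high".parse::<Severity>()`) would reject unknown values before a row is stored, and the derived ordering would make a later threshold check such as `severity >= Severity::High` straightforward.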
