# Red Team System Specification

## Overview

The Red Team system is an adversarial review feature for makima contracts that provides real-time quality assurance during task execution. When enabled, a parallel "red team" task instance monitors the output of work tasks, verifying that implementations adhere to the contract requirements, repository standards, and the execution plan.

### Goals

1. **Quality Assurance**: Catch deviations from the plan before they compound
2. **Standards Compliance**: Ensure code follows repository conventions (CONTRIBUTING.md, linting rules, etc.)
3. **Contract Adherence**: Verify implementations match the specification and requirements
4. **Proactive Issue Detection**: Flag potential problems early, not after task completion

### Non-Goals

1. The red team should NOT write code or make commits
2. The red team should NOT be overly pedantic or block progress for minor style issues
3. The red team is NOT a replacement for code review - it's an early warning system

---

## 1. Feature Overview

### 1.1 Concept

The Red Team operates as a parallel observer task that:

- Monitors all work task outputs in real-time via the broadcast system
- Has read-only access to task diffs and outputs
- Can access contract specifications, plans, and repository standards
- Can notify the supervisor when it detects issues requiring attention

### 1.2 Relationship to Existing Components

```
┌─────────────────────────────────────────────────────────────┐
│                          Contract                           │
│  ┌──────────────┐    ┌──────────────┐    ┌──────────────┐   │
│  │  Supervisor  │    │ Work Task 1  │    │ Work Task 2  │   │
│  │              │<───│              │    │              │   │
│  │              │<───│              │    │              │   │
│  └──────────────┘    └──────────────┘    └──────────────┘   │
│         ^                   │                   │           │
│         │                outputs             outputs        │
│         │                   │                   │           │
│         │ [NOTIFY]          v                   v           │
│         │            ┌─────────────────────────────┐        │
│         └────────────│        Red Team Task        │        │
│                      │  (Monitoring & Validation)  │        │
│                      └─────────────────────────────┘        │
└─────────────────────────────────────────────────────────────┘
```

### 1.3 Task Type

The Red Team task is a special task variant with the following characteristics:

- `is_red_team: true` flag on the Task model
- Has a tool key for API access (like supervisor tasks)
- Does NOT have write permissions to the repository
- Subscribes to task output broadcasts
- Can use the `makima red-team notify` command to alert the supervisor

---

## 2. Contract Configuration

### 2.1 Contract Model Changes

Add the following fields to the `Contract` model in `makima/src/db/models.rs`:

```rust
/// Contract record from the database
#[derive(Debug, Clone, FromRow, Serialize, ToSchema)]
#[serde(rename_all = "camelCase")]
pub struct Contract {
    // ... existing fields ...

    /// Whether to spawn a red team task to monitor work tasks.
    /// When enabled, a parallel task monitors outputs and can alert
    /// the supervisor about potential issues.
    #[serde(default)]
    pub red_team_enabled: bool,

    /// Optional custom prompt/criteria for the red team to use
    /// when evaluating task outputs. If not provided, uses default
    /// quality criteria.
    #[serde(skip_serializing_if = "Option::is_none")]
    pub red_team_prompt: Option<String>,
}
```

### 2.2 CreateContractRequest Changes

```rust
#[derive(Debug, Clone, Deserialize, ToSchema)]
#[serde(rename_all = "camelCase")]
pub struct CreateContractRequest {
    // ... existing fields ...

    /// Enable red team monitoring for this contract.
    /// When enabled, a parallel task monitors work task outputs
    /// and can alert the supervisor about potential issues.
    #[serde(default)]
    pub red_team_enabled: Option<bool>,

    /// Optional custom criteria for the red team to evaluate.
    /// Examples: "Focus on security vulnerabilities",
    /// "Ensure all functions have tests", etc.
    pub red_team_prompt: Option<String>,
}
```

### 2.3 CLI Flag for Contract Creation

The daemon CLI should support red team enablement during contract creation:

```bash
# Enable red team with default criteria
makima supervisor create --red-team "Contract Name" "Description"

# Enable red team with custom review criteria
makima supervisor create --red-team --red-team-prompt "Focus on performance and memory usage" "Contract Name" "Description"
```

---

## 3. Red Team Task Lifecycle

### 3.1 Spawning

The red team task is spawned automatically when:

1. A contract has `red_team_enabled: true`
2. The first work task is spawned (not the supervisor itself)

**Spawn Logic** (in `spawn_task` handler or supervisor spawn logic):

```rust
// In spawn_task after creating a work task:
if contract.red_team_enabled && !is_supervisor_task {
    // Check if red team task already exists
    let existing_red_team =
        repository::get_red_team_task_for_contract(pool, contract_id).await?;

    if existing_red_team.is_none() {
        // Spawn red team task
        let red_team_task = spawn_red_team_task(
            pool,
            state,
            contract_id,
            owner_id,
            contract.red_team_prompt.as_deref(),
        )
        .await?;

        tracing::info!(
            contract_id = %contract_id,
            red_team_task_id = %red_team_task.id,
            "Spawned red team task for contract"
        );
    }
}
```

### 3.2 Task Properties

When creating the red team task:

```rust
CreateTaskRequest {
    name: "Red Team Monitor".to_string(),
    description: Some("Adversarial review task monitoring work task outputs".to_string()),
    plan: generate_red_team_plan(contract, custom_prompt),
    contract_id: Some(contract_id),
    parent_task_id: None, // Not a child of supervisor
    is_supervisor: false,
    is_red_team: true, // NEW FIELD
    // ... other fields ...
}
```

### 3.3 Lifespan

The red team task:

- Lives for the duration of the **execute phase**
- Is automatically terminated when:
  - The contract advances past the execute phase
  - The contract is completed
  - The contract is archived
- Can be paused/resumed along with other contract tasks
- Does NOT restart automatically after daemon failure (not critical path)

### 3.4 Read-Only Enforcement

The red team task:

- Has NO worktree of its own (or a read-only clone)
- Cannot use git operations (commit, branch, etc.)
- Can only READ files, not write them
- Has API access limited to read operations

---

## 4. Red Team Notification CLI Command

### 4.1 Command Specification

New CLI command available only to red team tasks:

```bash
makima red-team notify "<message>"
```

**Arguments:**

- `<message>`: A detailed description of the issue detected

**Options:**

- `--severity <level>`: Issue severity: `low`, `medium`, `high`, `critical` (default: `medium`)
- `--task <task_id>`: The specific task this relates to (optional)
- `--file <path>`: The file path where the issue was detected (optional)
- `--context <text>`: Additional context about the issue (optional)

**Example:**

```bash
makima red-team notify "Task is adding console.log statements which violates the no-debug-logging rule in CONTRIBUTING.md" \
  --severity medium \
  --task 550e8400-e29b-41d4-a716-446655440000 \
  --file "src/api/handler.rs"
```

### 4.2 CLI Arguments Structure

```rust
// In makima/src/daemon/cli/mod.rs

/// Red Team subcommand - red team task commands.
#[derive(Subcommand, Debug)]
pub enum RedTeamCommand {
    /// Send a notification to the supervisor about a detected issue.
    /// Only available to red team tasks.
    Notify(NotifyArgs),
}

/// Arguments for red-team notify command.
#[derive(Args, Debug)]
pub struct NotifyArgs {
    /// API URL
    #[arg(long, env = "MAKIMA_API_URL", default_value = "https://api.makima.jp")]
    pub api_url: String,

    /// API key for authentication
    #[arg(long, env = "MAKIMA_API_KEY")]
    pub api_key: String,

    /// Current task ID (must be a red team task)
    #[arg(long, env = "MAKIMA_TASK_ID")]
    pub task_id: Uuid,

    /// Contract ID
    #[arg(long, env = "MAKIMA_CONTRACT_ID")]
    pub contract_id: Uuid,

    /// The notification message
    #[arg(index = 1)]
    pub message: String,

    /// Severity level: low, medium, high, critical
    #[arg(long, default_value = "medium")]
    pub severity: String,

    /// Related task ID (optional)
    #[arg(long)]
    pub task: Option<Uuid>,

    /// Related file path (optional)
    #[arg(long)]
    pub file: Option<String>,

    /// Additional context (optional)
    #[arg(long)]
    pub context: Option<String>,
}
```

### 4.3 API Endpoint

**POST** `/api/v1/mesh/red-team/notify`

**Request Body:**

```json
{
  "message": "Issue description",
  "severity": "medium",
  "relatedTaskId": "uuid-optional",
  "filePath": "src/path/optional.rs",
  "context": "Additional context optional"
}
```

**Response:**

```json
{
  "notificationId": "uuid",
  "delivered": true,
  "supervisorTaskId": "uuid"
}
```

### 4.4 Notification Delivery

When a red team notification is received:

1. **Validate Caller**: Ensure the request comes from a valid red team task
2. **Find Supervisor**: Get the supervisor task for the contract
3. **Format Message**: Create a `[RED TEAM ALERT]` message in the format shown below
4. **Send to Supervisor**: Inject the message into the supervisor's stdin via the `SendMessage` command

**Message Format:**

```
════════════════════════════════════════════════════════════════
[RED TEAM ALERT] Severity: MEDIUM
════════════════════════════════════════════════════════════════

Issue: Task is adding console.log statements which violates the
no-debug-logging rule in CONTRIBUTING.md

Related Task: 550e8400-e29b-41d4-a716-446655440000
File: src/api/handler.rs

Context: The CONTRIBUTING.md file explicitly states that debug
logging should use the tracing crate, not console.log or println!

════════════════════════════════════════════════════════════════
You can:
- Pause the related task to investigate
- Send feedback to the task to correct the issue
- Acknowledge this alert and continue monitoring
════════════════════════════════════════════════════════════════
```

### 4.5 Supervisor Response Handling

The supervisor can respond to red team notifications by:

1. **Pausing the task**: `makima supervisor pause <task_id>`
2. **Sending feedback**: `makima supervisor message "Please use tracing instead of console.log"`
3. **Acknowledging**: Simply continue (the red team will keep monitoring)
4. **Dismissing**: Mark the alert as a false positive (future consideration)

---

## 5. Red Team Access Patterns

### 5.1 Task Output Subscription

The red team task subscribes to the `task_outputs` broadcast channel:

```rust
// In red team task initialization
// (assumes `task_outputs` is a `tokio::sync::broadcast` channel)
use tokio::sync::broadcast::error::RecvError;

let mut task_output_rx = state.task_outputs.subscribe();

loop {
    match task_output_rx.recv().await {
        Ok(notification) => {
            // Only process outputs from work tasks in our contract
            if notification.contract_id == Some(self.contract_id)
                && !notification.is_supervisor
                && !notification.is_red_team
            {
                self.analyze_output(notification).await;
            }
        }
        // The receiver fell behind and skipped messages; keep monitoring.
        Err(RecvError::Lagged(skipped)) => {
            tracing::warn!(
                "Red team task output subscription lagged, skipped {} messages",
                skipped
            );
        }
        // All senders were dropped; stop instead of spinning on the error.
        Err(RecvError::Closed) => break,
    }
}
```

### 5.2 Task Diff Access

The red team can request diffs via the supervisor API:

**GET** `/api/v1/mesh/supervisor/tasks/{task_id}/diff`

This endpoint already exists and can be used by the red team (with tool key auth).

### 5.3 Contract Information Access

The red team can read:

- Contract plan and specifications (via contract files)
- Repository standards (CONTRIBUTING.md, .editorconfig, etc.)
- Task descriptions and plans

**Existing endpoints used:**

- `GET /api/v1/contracts/{id}` - Contract details
- `GET /api/v1/contracts/{id}/files` - Contract files
- `GET /api/v1/files/{id}` - File content

### 5.4 Repository File Access

For repository standards, the red team uses the existing daemon file read capability:

```bash
# Via makima CLI (from within the red team task)
makima supervisor read-file "CONTRIBUTING.md"
makima supervisor read-file ".editorconfig"
makima supervisor read-file "rustfmt.toml"
```

Alternatively, the red team can read the filesystem directly if it has a read-only worktree clone.

---

## 6. System Prompt for Red Team Task

The red team task receives a specialized system prompt that guides its behavior:

````markdown
# Red Team Monitor

You are an adversarial quality reviewer for a software development contract. Your role is to monitor work task outputs in real-time and flag potential issues BEFORE they compound into larger problems.

## Your Mission

Monitor all task outputs and verify:

1. **Plan Adherence**: Are tasks following the implementation plan?
2. **Code Quality**: Does the code meet repository standards?
3. **Contract Requirements**: Does the implementation match the specification?
4. **Best Practices**: Are there obvious anti-patterns or issues?

## Access Available

You have read-only access to:

- Task outputs (streamed in real-time)
- Task diffs (code changes)
- Contract specifications and plan documents
- Repository configuration files (CONTRIBUTING.md, linting configs, etc.)

## How to Monitor

1. **Subscribe to task outputs**: You'll receive outputs from all work tasks
2. **Analyze code changes**: Request diffs for completed tasks
3. **Cross-reference**: Compare outputs against the plan and specifications
4. **Report issues**: Use `makima red-team notify` when you detect problems

## When to Notify

NOTIFY the supervisor when you observe:

- **Critical**: Security vulnerabilities, data loss risks, breaking changes
- **High**: Significant deviations from the plan, major code quality issues
- **Medium**: Missing tests, suboptimal implementations, minor standard violations
- **Low**: Style inconsistencies, documentation gaps (use sparingly)

## What NOT to Do

- Do NOT nitpick minor style issues (that's what linters are for)
- Do NOT block progress for trivial concerns
- Do NOT write code or make changes yourself
- Do NOT notify for things that are already in progress and being addressed
- Do NOT create duplicate notifications for the same issue

## Notification Format

When notifying, always include:

1. A clear, concise description of the issue
2. The severity level (critical/high/medium/low)
3. The related task ID if applicable
4. The specific file or code location if known
5. Why this matters (reference to plan, spec, or standards)

## Example Notification

```
makima red-team notify "Task is implementing authentication with plaintext password storage, which contradicts the security requirements in the specification document" \
  --severity critical \
  --task <task_id> \
  --file "src/auth/user.rs" \
  --context "Specification section 3.2 requires bcrypt hashing for all passwords"
```

## Custom Review Criteria

{{#if red_team_prompt}}
Additional review criteria for this contract:

{{red_team_prompt}}
{{/if}}

## Contract Context

Contract: {{contract_name}}
Phase: {{contract_phase}}
Repository: {{repository_url}}

Focus your monitoring on outputs that relate to the active work tasks. Prioritize issues that could affect the success of the contract or introduce technical debt.
````

---

## 7. API Changes Summary

### 7.1 New Endpoints

| Method | Path | Description |
|--------|------|-------------|
| POST | `/api/v1/mesh/red-team/notify` | Send notification from red team to supervisor |
| GET | `/api/v1/mesh/red-team/status` | Get red team task status for a contract |

### 7.2 Modified Endpoints

| Method | Path | Change |
|--------|------|--------|
| POST | `/api/v1/contracts` | Add `red_team_enabled` and `red_team_prompt` fields |
| GET | `/api/v1/contracts/{id}` | Include red team task info in response |

### 7.3 New Request/Response Types

**RedTeamNotifyRequest:**

```rust
#[derive(Debug, Deserialize, ToSchema)]
#[serde(rename_all = "camelCase")]
pub struct RedTeamNotifyRequest {
    pub message: String,
    #[serde(default = "default_severity")]
    pub severity: String,
    pub related_task_id: Option<Uuid>,
    pub file_path: Option<String>,
    pub context: Option<String>,
}
```

**RedTeamNotifyResponse:**

```rust
#[derive(Debug, Serialize, ToSchema)]
#[serde(rename_all = "camelCase")]
pub struct RedTeamNotifyResponse {
    pub notification_id: Uuid,
    pub delivered: bool,
    pub supervisor_task_id: Uuid,
}
```

**RedTeamStatusResponse:**

```rust
#[derive(Debug, Serialize, ToSchema)]
#[serde(rename_all = "camelCase")]
pub struct RedTeamStatusResponse {
    pub contract_id: Uuid,
    pub red_team_task_id: Option<Uuid>,
    pub status: Option<String>,
    pub notifications_sent: i32,
    pub last_activity: Option<DateTime<Utc>>,
}
```

---

## 8. Database Schema Changes

### 8.1 Contracts Table

```sql
ALTER TABLE contracts
    ADD COLUMN red_team_enabled BOOLEAN NOT NULL DEFAULT FALSE,
    ADD COLUMN red_team_prompt TEXT;
```

### 8.2 Tasks Table

```sql
ALTER TABLE tasks
    ADD COLUMN is_red_team BOOLEAN NOT NULL DEFAULT FALSE;
```

### 8.3 Red Team Notifications Table (New)

```sql
CREATE TABLE red_team_notifications (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    contract_id UUID NOT NULL REFERENCES contracts(id) ON DELETE CASCADE,
    red_team_task_id UUID NOT NULL REFERENCES tasks(id) ON DELETE CASCADE,
    related_task_id UUID REFERENCES tasks(id) ON DELETE SET NULL,
    message TEXT NOT NULL,
    severity VARCHAR(20) NOT NULL DEFAULT 'medium',
    file_path TEXT,
    context TEXT,

    -- Delivery status
    delivered BOOLEAN NOT NULL DEFAULT FALSE,
    delivered_at TIMESTAMP WITH TIME ZONE,
    acknowledged BOOLEAN NOT NULL DEFAULT FALSE,
    acknowledged_at TIMESTAMP WITH TIME ZONE,

    created_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW()
);

-- Indexes
CREATE INDEX idx_red_team_notifications_contract_id
    ON red_team_notifications(contract_id);
CREATE INDEX idx_red_team_notifications_red_team_task_id
    ON red_team_notifications(red_team_task_id);
CREATE INDEX idx_red_team_notifications_created_at
    ON red_team_notifications(created_at DESC);
```

### 8.4 Index for Red Team Task Lookup

```sql
CREATE INDEX idx_tasks_contract_red_team
    ON tasks(contract_id, is_red_team)
    WHERE is_red_team = TRUE;
```

---

## 9. Implementation Phases

### Phase 1: Foundation (MVP)

- [ ] Add `red_team_enabled` and `red_team_prompt` to Contract model
- [ ] Add `is_red_team` to Task model
- [ ] Database migrations
- [ ] Basic red team task spawning logic
- [ ] `makima red-team notify` CLI command
- [ ] Red team notification API endpoint

### Phase 2: Monitoring Infrastructure

- [ ] Task output subscription for red team
- [ ] Diff access for red team tasks
- [ ] Red team system prompt generation
- [ ] Notification delivery to supervisor

### Phase 3: Polish & UX

- [ ] Red team status in contract view
- [ ] Notification history and acknowledgment
- [ ] TUI integration for red team alerts
- [ ] Frontend display of red team notifications

### Phase 4: Future Enhancements

- [ ] Configurable notification thresholds
- [ ] Automatic pause on critical issues
- [ ] Red team notification digest/summary
- [ ] Integration with external code review tools

---

## 10. Security Considerations

### 10.1 Access Control

- Red team tasks MUST only have read access
- Verify `is_red_team` flag before allowing notification API calls
- Red team cannot spawn tasks or modify contract state
- Tool key scope should be limited for red team tasks

### 10.2 Abuse Prevention

- Rate limit red team notifications (max 10 per minute per task)
- Prevent notification spam with deduplication
- Log all red team activities for audit

### 10.3 Isolation

- Red team task runs in separate worktree (or no worktree)
- Cannot affect work task execution directly
- Supervisor controls whether to act on notifications

---

## 11. Testing Strategy

### 11.1 Unit Tests

- Contract model serialization with red team fields
- Red team task spawning conditions
- Notification message formatting

### 11.2 Integration Tests

- Full contract lifecycle with red team enabled
- Notification delivery to supervisor
- Red team output subscription

### 11.3 E2E Tests

- Create contract with `--red-team` flag
- Red team detects intentional violation
- Supervisor receives and responds to notification

---

## 12. Success Metrics

1. **Detection Rate**: Percentage of issues caught by red team before task completion
2. **False Positive Rate**: Percentage of notifications that are dismissed as not actionable
3. **Response Time**: Time between red team detection and supervisor acknowledgment
4. **Contract Success Rate**: Compare success rates for contracts with/without red team

---

## Appendix A: Message Protocol

### Task Output Notification Structure

The red team subscribes to `TaskOutputNotification`:

```rust
pub struct TaskOutputNotification {
    pub task_id: Uuid,
    pub owner_id: Option<Uuid>,
    pub message_type: String, // "assistant", "tool_use", "tool_result", etc.
    pub content: String,
    pub tool_name: Option<String>,
    pub tool_input: Option<serde_json::Value>,
    pub is_error: Option<bool>,
    pub cost_usd: Option<f64>,
    pub duration_ms: Option<u64>,
    pub is_partial: bool,
}
```

The filter in Section 5.1 also reads `contract_id`, `is_supervisor`, and `is_red_team`. These fields are not part of the structure listed above, so they must either be added to `TaskOutputNotification` or resolved by looking up the task from `task_id`.

### Daemon Command for Supervisor Message

```rust
DaemonCommand::SendMessage {
    task_id: supervisor_id,
    message: formatted_red_team_alert,
}
```

---

## Appendix B: Configuration Examples

### Contract Creation with Red Team (API)

```json
POST /api/v1/contracts
{
  "name": "Implement User Authentication",
  "description": "Add OAuth2 authentication flow",
  "contractType": "specification",
  "redTeamEnabled": true,
  "redTeamPrompt": "Pay special attention to security best practices and OWASP guidelines. Flag any hardcoded secrets or insecure token handling."
}
```

### Contract Creation with Red Team (CLI)

```bash
makima contract create \
  --type specification \
  --red-team \
  --red-team-prompt "Focus on API backwards compatibility and deprecation handling" \
  "API v2 Migration" \
  "Migrate public API from v1 to v2"
```
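
---

## Appendix C: Alert Formatting Sketch

The alert layout described in Section 4.4 could be produced by a small formatting function. The following is a minimal sketch using only the standard library, not the actual makima implementation: `Severity`, `RedTeamAlert`, and `format_red_team_alert` are hypothetical names, the 64-character rule width and the exact line breaks are assumptions, and task IDs are passed as plain strings so the example has no crate dependencies.

```rust
/// Severity levels accepted by `makima red-team notify` (Section 4.1).
/// Hypothetical enum; the spec itself stores severity as a `String`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Severity {
    Low,
    Medium,
    High,
    Critical,
}

impl Severity {
    /// Parses the CLI string form, ignoring ASCII case.
    /// Returns `None` for values outside the four documented levels.
    pub fn parse(s: &str) -> Option<Self> {
        match s.to_ascii_lowercase().as_str() {
            "low" => Some(Self::Low),
            "medium" => Some(Self::Medium),
            "high" => Some(Self::High),
            "critical" => Some(Self::Critical),
            _ => None,
        }
    }

    /// Upper-case label used in the alert header.
    pub fn label(self) -> &'static str {
        match self {
            Self::Low => "LOW",
            Self::Medium => "MEDIUM",
            Self::High => "HIGH",
            Self::Critical => "CRITICAL",
        }
    }
}

/// Hypothetical input type mirroring the fields of `RedTeamNotifyRequest`.
pub struct RedTeamAlert<'a> {
    pub message: &'a str,
    pub severity: Severity,
    pub related_task_id: Option<&'a str>,
    pub file_path: Option<&'a str>,
    pub context: Option<&'a str>,
}

/// Builds the text injected into the supervisor's stdin.
/// Optional fields are omitted entirely when they are `None`.
pub fn format_red_team_alert(alert: &RedTeamAlert<'_>) -> String {
    let rule = "═".repeat(64); // assumed width
    let mut out = String::new();

    out.push_str(&format!("{}\n", rule));
    out.push_str(&format!("[RED TEAM ALERT] Severity: {}\n", alert.severity.label()));
    out.push_str(&format!("{}\n", rule));
    out.push_str(&format!("Issue: {}\n", alert.message));

    if let Some(task) = alert.related_task_id {
        out.push_str(&format!("Related Task: {}\n", task));
    }
    if let Some(file) = alert.file_path {
        out.push_str(&format!("File: {}\n", file));
    }
    if let Some(context) = alert.context {
        out.push_str(&format!("Context: {}\n", context));
    }

    out.push_str(&format!("{}\n", rule));
    out.push_str("You can:\n");
    out.push_str("- Pause the related task to investigate\n");
    out.push_str("- Send feedback to the task to correct the issue\n");
    out.push_str("- Acknowledge this alert and continue monitoring\n");
    out.push_str(&format!("{}\n", rule));
    out
}
```

Parsing severity into an enum before formatting means an unknown value can be rejected at the API boundary instead of appearing verbatim in the supervisor's input. The returned string would then be passed as the `message` of `DaemonCommand::SendMessage`.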