author soryu <soryu@soryu.co> 2026-02-02 02:34:50 +0000
committer soryu <soryu@soryu.co> 2026-02-02 02:34:50 +0000
commit 151e9d87e117b7980e6aad522ac8f3633eeca87a
tree e80fb4301361b3b12e5abf8e442603db2d0622dc /.makima
parent a2c147ddd59f55a07b5be0c8970169726b55c876
Make makima more opinionated and structured
Diffstat (limited to '.makima')
-rw-r--r-- .makima/specs/red-team-system.md 748
1 files changed, 0 insertions, 748 deletions
diff --git a/.makima/specs/red-team-system.md b/.makima/specs/red-team-system.md
deleted file mode 100644
index 31f4b78..0000000
--- a/.makima/specs/red-team-system.md
+++ /dev/null
@@ -1,748 +0,0 @@
-# Red Team System Specification
-
-## Overview
-
-The Red Team system is an adversarial review feature for makima contracts that provides real-time quality assurance during task execution. When enabled, a parallel "red team" task instance monitors the output of work tasks, verifying that implementations adhere to the contract requirements, repository standards, and the execution plan.
-
-### Goals
-
-1. **Quality Assurance**: Catch deviations from the plan before they compound
-2. **Standards Compliance**: Ensure code follows repository conventions (CONTRIBUTING.md, linting rules, etc.)
-3. **Contract Adherence**: Verify implementations match the specification and requirements
-4. **Proactive Issue Detection**: Flag potential problems early, not after task completion
-
-### Non-Goals
-
-1. The red team should NOT write code or make commits
-2. The red team should NOT be overly pedantic or block progress for minor style issues
-3. The red team is NOT a replacement for code review - it's an early warning system
-
----
-
-## 1. Feature Overview
-
-### 1.1 Concept
-
-The Red Team operates as a parallel observer task that:
-- Monitors all work task outputs in real-time via the broadcast system
-- Has read-only access to task diffs and outputs
-- Can access contract specifications, plans, and repository standards
-- Can notify the supervisor when it detects issues requiring attention
-
-### 1.2 Relationship to Existing Components
-
-```
-┌─────────────────────────────────────────────────────────────┐
-│                          Contract                           │
-│  ┌──────────────┐    ┌──────────────┐  ┌──────────────┐     │
-│  │  Supervisor  │    │ Work Task 1  │  │ Work Task 2  │     │
-│  │              │<───│              │  │              │     │
-│  │              │<───│              │  │              │     │
-│  └──────────────┘    └──────────────┘  └──────────────┘     │
-│         ^                   │                 │             │
-│         │                outputs           outputs          │
-│         │                   │                 │             │
-│     [NOTIFY]                v                 v             │
-│         │             ┌─────────────────────────────┐       │
-│         └─────────────│        Red Team Task        │       │
-│                       │  (Monitoring & Validation)  │       │
-│                       └─────────────────────────────┘       │
-└─────────────────────────────────────────────────────────────┘
-```
-
-### 1.3 Task Type
-
-The Red Team task is a special task variant with the following characteristics:
-- `is_red_team: true` flag on the Task model
-- Has tool key for API access (like supervisor tasks)
-- Does NOT have write permissions to the repository
-- Subscribes to task output broadcasts
-- Can use `makima red-team notify` command to alert supervisor
-
----
-
-## 2. Contract Configuration
-
-### 2.1 Contract Model Changes
-
-Add the following field to the `Contract` model in `makima/src/db/models.rs`:
-
-```rust
-/// Contract record from the database
-#[derive(Debug, Clone, FromRow, Serialize, ToSchema)]
-#[serde(rename_all = "camelCase")]
-pub struct Contract {
-    // ... existing fields ...
-
-    /// Whether to spawn a red team task to monitor work tasks.
-    /// When enabled, a parallel task monitors outputs and can alert
-    /// the supervisor about potential issues.
-    #[serde(default)]
-    pub red_team_enabled: bool,
-
-    /// Optional custom prompt/criteria for the red team to use
-    /// when evaluating task outputs. If not provided, uses default
-    /// quality criteria.
-    #[serde(skip_serializing_if = "Option::is_none")]
-    pub red_team_prompt: Option<String>,
-}
-```
-
-### 2.2 CreateContractRequest Changes
-
-```rust
-#[derive(Debug, Clone, Deserialize, ToSchema)]
-#[serde(rename_all = "camelCase")]
-pub struct CreateContractRequest {
-    // ... existing fields ...
-
-    /// Enable red team monitoring for this contract.
-    /// When enabled, a parallel task monitors work task outputs
-    /// and can alert the supervisor about potential issues.
-    #[serde(default)]
-    pub red_team_enabled: Option<bool>,
-
-    /// Optional custom criteria for the red team to evaluate.
-    /// Examples: "Focus on security vulnerabilities",
-    /// "Ensure all functions have tests", etc.
-    pub red_team_prompt: Option<String>,
-}
-```
-
-### 2.3 CLI Flag for Contract Creation
-
-The daemon CLI should support red team enablement during contract creation:
-
-```bash
-# Enable red team with default criteria
-makima supervisor create --red-team "Contract Name" "Description"
-
-# Enable red team with custom review criteria
-makima supervisor create --red-team --red-team-prompt "Focus on performance and memory usage" "Contract Name" "Description"
-```
-
----
-
-## 3. Red Team Task Lifecycle
-
-### 3.1 Spawning
-
-The red team task is spawned automatically when both of the following hold:
-1. The contract has `red_team_enabled: true`
-2. The first work task is spawned (spawning the supervisor itself does not count)
-
-**Spawn Logic** (in `spawn_task` handler or supervisor spawn logic):
-
-```rust
-// In spawn_task after creating a work task:
-if contract.red_team_enabled && !is_supervisor_task {
-    // Check if red team task already exists
-    let existing_red_team = repository::get_red_team_task_for_contract(pool, contract_id).await?;
-
-    if existing_red_team.is_none() {
-        // Spawn red team task
-        let red_team_task = spawn_red_team_task(
-            pool,
-            state,
-            contract_id,
-            owner_id,
-            contract.red_team_prompt.as_deref(),
-        ).await?;
-
-        tracing::info!(
-            contract_id = %contract_id,
-            red_team_task_id = %red_team_task.id,
-            "Spawned red team task for contract"
-        );
-    }
-}
-```
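The spawn logic above is a check-then-act sequence, so two work tasks spawned at nearly the same time could in principle both see no existing red team task. A minimal sketch of a single-process guard follows. `RedTeamSpawnGuard` and `should_spawn_red_team` are hypothetical names, and `u128` stands in for the contract UUID. A multi-process deployment would need a database-level constraint instead.

```rust
use std::collections::HashSet;
use std::sync::Mutex;

/// Hypothetical in-memory guard; not part of the makima codebase.
/// `u128` stands in for the contract UUID.
pub struct RedTeamSpawnGuard {
    claimed: Mutex<HashSet<u128>>,
}

impl RedTeamSpawnGuard {
    pub fn new() -> Self {
        Self { claimed: Mutex::new(HashSet::new()) }
    }

    /// Returns true only for the first caller per contract.
    pub fn try_claim(&self, contract_id: u128) -> bool {
        // `HashSet::insert` returns false when the value was already present.
        self.claimed.lock().unwrap().insert(contract_id)
    }
}

/// Mirrors the spawn condition from the spec: red team enabled,
/// the spawned task is a work task, and no red team exists yet.
pub fn should_spawn_red_team(
    red_team_enabled: bool,
    is_supervisor_task: bool,
    guard: &RedTeamSpawnGuard,
    contract_id: u128,
) -> bool {
    // Short-circuiting keeps the claim untouched unless the first two checks pass.
    red_team_enabled && !is_supervisor_task && guard.try_claim(contract_id)
}
```

Because the claim happens under the mutex, only one caller per contract gets `true`, even if several work tasks are spawned concurrently.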
-
-### 3.2 Task Properties
-
-When creating the red team task:
-
-```rust
-CreateTaskRequest {
-    name: "Red Team Monitor".to_string(),
-    description: Some("Adversarial review task monitoring work task outputs".to_string()),
-    plan: generate_red_team_plan(contract, custom_prompt),
-    contract_id: Some(contract_id),
-    parent_task_id: None, // Not a child of supervisor
-    is_supervisor: false,
-    is_red_team: true, // NEW FIELD
-    // ... other fields ...
-}
-```
-
-### 3.3 Lifespan
-
-The red team task:
-- Lives for the duration of the **execute phase**
-- Is automatically terminated when:
-  - The contract advances past the execute phase
-  - The contract is completed
-  - The contract is archived
-- Can be paused/resumed along with other contract tasks
-- Does NOT restart automatically after daemon failure (not critical path)
-
-### 3.4 Read-Only Enforcement
-
-The red team task:
-- Has NO worktree of its own (or a read-only clone)
-- Cannot use git operations (commit, branch, etc.)
-- Can only READ files, not write them
-- Has API access limited to read operations
-
----
-
-## 4. Red Team Notification CLI Command
-
-### 4.1 Command Specification
-
-New CLI command available only to red team tasks:
-
-```bash
-makima red-team notify "<message>"
-```
-
-**Arguments:**
-- `<message>`: A detailed description of the issue detected
-
-**Options:**
-- `--severity <level>`: Issue severity: `low`, `medium`, `high`, `critical` (default: `medium`)
-- `--task <task_id>`: The specific task this relates to (optional)
-- `--file <path>`: The file path where the issue was detected (optional)
-- `--context <text>`: Additional context about the issue (optional)
-
-**Example:**
-
-```bash
-makima red-team notify "Task is adding console.log statements which violates the no-debug-logging rule in CONTRIBUTING.md" \
-  --severity medium \
-  --task 550e8400-e29b-41d4-a716-446655440000 \
-  --file "src/api/handler.rs"
-```
-
-### 4.2 CLI Arguments Structure
-
-```rust
-// In makima/src/daemon/cli/mod.rs
-
-/// Red Team subcommand - red team task commands.
-#[derive(Subcommand, Debug)]
-pub enum RedTeamCommand {
-    /// Send a notification to the supervisor about a detected issue.
-    /// Only available to red team tasks.
-    Notify(NotifyArgs),
-}
-
-/// Arguments for red-team notify command.
-#[derive(Args, Debug)]
-pub struct NotifyArgs {
-    /// API URL
-    #[arg(long, env = "MAKIMA_API_URL", default_value = "https://api.makima.jp")]
-    pub api_url: String,
-
-    /// API key for authentication
-    #[arg(long, env = "MAKIMA_API_KEY")]
-    pub api_key: String,
-
-    /// Current task ID (must be a red team task)
-    #[arg(long, env = "MAKIMA_TASK_ID")]
-    pub task_id: Uuid,
-
-    /// Contract ID
-    #[arg(long, env = "MAKIMA_CONTRACT_ID")]
-    pub contract_id: Uuid,
-
-    /// The notification message
-    #[arg(index = 1)]
-    pub message: String,
-
-    /// Severity level: low, medium, high, critical
-    #[arg(long, default_value = "medium")]
-    pub severity: String,
-
-    /// Related task ID (optional)
-    #[arg(long)]
-    pub task: Option<Uuid>,
-
-    /// Related file path (optional)
-    #[arg(long)]
-    pub file: Option<String>,
-
-    /// Additional context (optional)
-    #[arg(long)]
-    pub context: Option<String>,
-}
-```
-
-### 4.3 API Endpoint
-
-**POST** `/api/v1/mesh/red-team/notify`
-
-**Request Body:**
-```json
-{
-  "message": "Issue description",
-  "severity": "medium",
-  "relatedTaskId": "uuid-optional",
-  "filePath": "src/path/optional.rs",
-  "context": "Additional context optional"
-}
-```
-
-**Response:**
-```json
-{
-  "notificationId": "uuid",
-  "delivered": true,
-  "supervisorTaskId": "uuid"
-}
-```
-
-### 4.4 Notification Delivery
-
-When a red team notification is received:
-
-1. **Validate Caller**: Ensure the request comes from a valid red team task
-2. **Find Supervisor**: Get the supervisor task for the contract
-3. **Format Message**: Create a `[RED TEAM ALERT]` formatted message (see the message format below)
-4. **Send to Supervisor**: Inject the message into the supervisor's stdin via the `SendMessage` daemon command
-
-**Message Format:**
-
-```
-════════════════════════════════════════════════════════════════
-[RED TEAM ALERT] Severity: MEDIUM
-════════════════════════════════════════════════════════════════
-
-Issue: Task is adding console.log statements which violates the
-no-debug-logging rule in CONTRIBUTING.md
-
-Related Task: 550e8400-e29b-41d4-a716-446655440000
-File: src/api/handler.rs
-
-Context: The CONTRIBUTING.md file explicitly states that debug
-logging should use the tracing crate, not console.log or println!
-
-════════════════════════════════════════════════════════════════
-You can:
-- Pause the related task to investigate
-- Send feedback to the task to correct the issue
-- Acknowledge this alert and continue monitoring
-════════════════════════════════════════════════════════════════
-```
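The alert layout above could be rendered by a small formatting function. The sketch below is illustrative only: `RedTeamAlert` and `format_red_team_alert` are hypothetical names, the field names follow the request body in section 4.3, and the 64-character rule width is an assumption.

```rust
/// Hypothetical input type; field names follow the request body in section 4.3.
pub struct RedTeamAlert<'a> {
    pub message: &'a str,
    pub severity: &'a str,
    pub related_task_id: Option<&'a str>,
    pub file_path: Option<&'a str>,
    pub context: Option<&'a str>,
}

/// Renders an alert in the `[RED TEAM ALERT]` layout, omitting the
/// optional lines (task, file, context) when they are absent.
pub fn format_red_team_alert(alert: &RedTeamAlert<'_>) -> String {
    // Rule width is an assumption; the spec only shows the rendered line.
    let rule = "═".repeat(64);
    let mut out = String::new();

    out.push_str(&format!(
        "{}\n[RED TEAM ALERT] Severity: {}\n{}\n\n",
        rule,
        alert.severity.to_uppercase(),
        rule
    ));
    out.push_str(&format!("Issue: {}\n\n", alert.message));

    if let Some(task) = alert.related_task_id {
        out.push_str(&format!("Related Task: {}\n", task));
    }
    if let Some(file) = alert.file_path {
        out.push_str(&format!("File: {}\n", file));
    }
    if let Some(context) = alert.context {
        out.push_str(&format!("\nContext: {}\n", context));
    }

    out.push_str(&format!("\n{}\nYou can:\n", rule));
    out.push_str("- Pause the related task to investigate\n");
    out.push_str("- Send feedback to the task to correct the issue\n");
    out.push_str("- Acknowledge this alert and continue monitoring\n");
    out.push_str(&format!("{}\n", rule));
    out
}
```

Keeping the formatter a pure function of its input makes the "Notification message formatting" unit test listed in section 11.1 straightforward to write.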
-
-### 4.5 Supervisor Response Handling
-
-The supervisor can respond to red team notifications by:
-1. **Pausing the task**: `makima supervisor pause <task_id>`
-2. **Sending feedback**: `makima supervisor message <task_id> "Please use tracing instead of console.log"`
-3. **Acknowledging**: Simply continue (the red team will keep monitoring)
-4. **Dismissing**: Mark the alert as false positive (future consideration)
-
----
-
-## 5. Red Team Access Patterns
-
-### 5.1 Task Output Subscription
-
-The red team task subscribes to the `task_outputs` broadcast channel:
-
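-```rust
-// In red team task initialization.
-// Assumes `state.task_outputs` is a `tokio::sync::broadcast::Sender`.
-use tokio::sync::broadcast::error::RecvError;
-
-let mut task_output_rx = state.task_outputs.subscribe();
-
-loop {
-    match task_output_rx.recv().await {
-        Ok(notification) => {
-            // Only process outputs from work tasks in our contract.
-            // Note: `contract_id`, `is_supervisor`, and `is_red_team` are not
-            // part of the `TaskOutputNotification` struct shown in Appendix A.
-            if notification.contract_id == Some(self.contract_id)
-                && !notification.is_supervisor
-                && !notification.is_red_team
-            {
-                self.analyze_output(notification).await;
-            }
-        }
-        // The receiver fell behind and some outputs were dropped; keep going.
-        Err(RecvError::Lagged(skipped)) => {
-            tracing::warn!("Red team subscription lagged, skipped {} outputs", skipped);
-        }
-        // All senders are gone; stop instead of spinning on the error.
-        Err(RecvError::Closed) => break,
-    }
-}
-```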
-
-### 5.2 Task Diff Access
-
-The red team can request diffs via the supervisor API:
-
-**GET** `/api/v1/mesh/supervisor/tasks/{task_id}/diff`
-
-This endpoint already exists and can be used by the red team (with tool key auth).
-
-### 5.3 Contract Information Access
-
-The red team can read:
-- Contract plan and specifications (via contract files)
-- Repository standards (CONTRIBUTING.md, .editorconfig, etc.)
-- Task descriptions and plans
-
-**Existing endpoints used:**
-- `GET /api/v1/contracts/{id}` - Contract details
-- `GET /api/v1/contracts/{id}/files` - Contract files
-- `GET /api/v1/files/{id}` - File content
-
-### 5.4 Repository File Access
-
-For repository standards, the red team uses the existing daemon file read capability:
-
-```bash
-# Via makima CLI (from within the red team task)
-makima supervisor read-file <task_id> "CONTRIBUTING.md"
-makima supervisor read-file <task_id> ".editorconfig"
-makima supervisor read-file <task_id> "rustfmt.toml"
-```
-
-Or direct filesystem access if the red team has a read-only worktree clone.
-
----
-
-## 6. System Prompt for Red Team Task
-
-The red team task receives a specialized system prompt that guides its behavior:
-
-```markdown
-# Red Team Monitor
-
-You are an adversarial quality reviewer for a software development contract. Your role is to monitor work task outputs in real-time and flag potential issues BEFORE they compound into larger problems.
-
-## Your Mission
-
-Monitor all task outputs and verify:
-1. **Plan Adherence**: Are tasks following the implementation plan?
-2. **Code Quality**: Does the code meet repository standards?
-3. **Contract Requirements**: Does the implementation match the specification?
-4. **Best Practices**: Are there obvious anti-patterns or issues?
-
-## Access Available
-
-You have read-only access to:
-- Task outputs (streamed in real-time)
-- Task diffs (code changes)
-- Contract specifications and plan documents
-- Repository configuration files (CONTRIBUTING.md, linting configs, etc.)
-
-## How to Monitor
-
-1. **Subscribe to task outputs**: You'll receive outputs from all work tasks
-2. **Analyze code changes**: Request diffs for completed tasks
-3. **Cross-reference**: Compare outputs against the plan and specifications
-4. **Report issues**: Use `makima red-team notify` when you detect problems
-
-## When to Notify
-
-NOTIFY the supervisor when you observe:
-- **Critical**: Security vulnerabilities, data loss risks, breaking changes
-- **High**: Significant deviations from the plan, major code quality issues
-- **Medium**: Missing tests, suboptimal implementations, minor standard violations
-- **Low**: Style inconsistencies, documentation gaps (use sparingly)
-
-## What NOT to Do
-
-- Do NOT nitpick minor style issues (that's what linters are for)
-- Do NOT block progress for trivial concerns
-- Do NOT write code or make changes yourself
-- Do NOT notify for things that are already in progress and being addressed
-- Do NOT create duplicate notifications for the same issue
-
-## Notification Format
-
-When notifying, always include:
-1. A clear, concise description of the issue
-2. The severity level (critical/high/medium/low)
-3. The related task ID if applicable
-4. The specific file or code location if known
-5. Why this matters (reference to plan, spec, or standards)
-
-## Example Notification
-
-~~~
-makima red-team notify "Task is implementing authentication with plaintext password storage, which contradicts the security requirements in the specification document" \
-  --severity critical \
-  --task <task_id> \
-  --file "src/auth/user.rs" \
-  --context "Specification section 3.2 requires bcrypt hashing for all passwords"
-~~~
-
-## Custom Review Criteria
-
-{{#if red_team_prompt}}
-Additional review criteria for this contract:
-{{red_team_prompt}}
-{{/if}}
-
-## Contract Context
-
-Contract: {{contract_name}}
-Phase: {{contract_phase}}
-Repository: {{repository_url}}
-
-Focus your monitoring on outputs that relate to the active work tasks. Prioritize issues that could affect the success of the contract or introduce technical debt.
-```
-
----
-
-## 7. API Changes Summary
-
-### 7.1 New Endpoints
-
-| Method | Path | Description |
-|--------|------|-------------|
-| POST | `/api/v1/mesh/red-team/notify` | Send notification from red team to supervisor |
-| GET | `/api/v1/mesh/red-team/status` | Get red team task status for a contract |
-
-### 7.2 Modified Endpoints
-
-| Method | Path | Change |
-|--------|------|--------|
-| POST | `/api/v1/contracts` | Add `redTeamEnabled` and `redTeamPrompt` request fields (`red_team_enabled` and `red_team_prompt` in Rust) |
-| GET | `/api/v1/contracts/{id}` | Include red team task info in response |
-
-### 7.3 New Request/Response Types
-
-**RedTeamNotifyRequest:**
-```rust
-#[derive(Debug, Deserialize, ToSchema)]
-#[serde(rename_all = "camelCase")]
-pub struct RedTeamNotifyRequest {
-    pub message: String,
-    #[serde(default = "default_severity")]
-    pub severity: String,
-    pub related_task_id: Option<Uuid>,
-    pub file_path: Option<String>,
-    pub context: Option<String>,
-}
-```
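`RedTeamNotifyRequest` refers to a `default_severity` function that the spec does not define, and it carries the severity as a free-form `String`. A minimal sketch of both pieces follows, assuming the four levels listed in section 4.1. The `Severity` enum is a hypothetical addition for validating the string after deserialization, not something the spec prescribes.

```rust
use std::fmt;
use std::str::FromStr;

/// Hypothetical typed form of the severity levels listed in section 4.1.
/// Variant order gives `Low < Medium < High < Critical` via derived `Ord`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Severity {
    Low,
    Medium,
    High,
    Critical,
}

impl FromStr for Severity {
    type Err = String;

    /// Parses case-insensitively and rejects anything outside the four levels.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.trim().to_ascii_lowercase().as_str() {
            "low" => Ok(Severity::Low),
            "medium" => Ok(Severity::Medium),
            "high" => Ok(Severity::High),
            "critical" => Ok(Severity::Critical),
            other => Err(format!("unknown severity: {}", other)),
        }
    }
}

impl fmt::Display for Severity {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let text = match self {
            Severity::Low => "low",
            Severity::Medium => "medium",
            Severity::High => "high",
            Severity::Critical => "critical",
        };
        f.write_str(text)
    }
}

/// Matches the shape serde expects for `#[serde(default = "default_severity")]`
/// on a `String` field: a zero-argument function returning the field type.
pub fn default_severity() -> String {
    "medium".to_string()
}
```

A handler could call `request.severity.parse::<Severity>()` and return a 4xx response on error, so that invalid levels never reach the `red_team_notifications` table.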
-
-**RedTeamNotifyResponse:**
-```rust
-#[derive(Debug, Serialize, ToSchema)]
-#[serde(rename_all = "camelCase")]
-pub struct RedTeamNotifyResponse {
-    pub notification_id: Uuid,
-    pub delivered: bool,
-    pub supervisor_task_id: Uuid,
-}
-```
-
-**RedTeamStatusResponse:**
-```rust
-#[derive(Debug, Serialize, ToSchema)]
-#[serde(rename_all = "camelCase")]
-pub struct RedTeamStatusResponse {
-    pub contract_id: Uuid,
-    pub red_team_task_id: Option<Uuid>,
-    pub status: Option<String>,
-    pub notifications_sent: i32,
-    pub last_activity: Option<DateTime<Utc>>,
-}
-```
-
----
-
-## 8. Database Schema Changes
-
-### 8.1 Contracts Table
-
-```sql
-ALTER TABLE contracts
-ADD COLUMN red_team_enabled BOOLEAN NOT NULL DEFAULT FALSE,
-ADD COLUMN red_team_prompt TEXT;
-```
-
-### 8.2 Tasks Table
-
-```sql
-ALTER TABLE tasks
-ADD COLUMN is_red_team BOOLEAN NOT NULL DEFAULT FALSE;
-```
-
-### 8.3 Red Team Notifications Table (New)
-
-```sql
-CREATE TABLE red_team_notifications (
-    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
-    contract_id UUID NOT NULL REFERENCES contracts(id) ON DELETE CASCADE,
-    red_team_task_id UUID NOT NULL REFERENCES tasks(id) ON DELETE CASCADE,
-    related_task_id UUID REFERENCES tasks(id) ON DELETE SET NULL,
-
-    message TEXT NOT NULL,
-    severity VARCHAR(20) NOT NULL DEFAULT 'medium',
-    file_path TEXT,
-    context TEXT,
-
-    -- Delivery status
-    delivered BOOLEAN NOT NULL DEFAULT FALSE,
-    delivered_at TIMESTAMP WITH TIME ZONE,
-    acknowledged BOOLEAN NOT NULL DEFAULT FALSE,
-    acknowledged_at TIMESTAMP WITH TIME ZONE,
-
-    created_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW()
-);
-
--- Indexes
-CREATE INDEX idx_red_team_notifications_contract_id ON red_team_notifications(contract_id);
-CREATE INDEX idx_red_team_notifications_red_team_task_id ON red_team_notifications(red_team_task_id);
-CREATE INDEX idx_red_team_notifications_created_at ON red_team_notifications(created_at DESC);
-```
-
-### 8.4 Index for Red Team Task Lookup
-
-```sql
-CREATE INDEX idx_tasks_contract_red_team ON tasks(contract_id, is_red_team)
-WHERE is_red_team = TRUE;
-```
-
----
-
-## 9. Implementation Phases
-
-### Phase 1: Foundation (MVP)
-- [ ] Add `red_team_enabled` and `red_team_prompt` to Contract model
-- [ ] Add `is_red_team` to Task model
-- [ ] Database migrations
-- [ ] Basic red team task spawning logic
-- [ ] `makima red-team notify` CLI command
-- [ ] Red team notification API endpoint
-
-### Phase 2: Monitoring Infrastructure
-- [ ] Task output subscription for red team
-- [ ] Diff access for red team tasks
-- [ ] Red team system prompt generation
-- [ ] Notification delivery to supervisor
-
-### Phase 3: Polish & UX
-- [ ] Red team status in contract view
-- [ ] Notification history and acknowledgment
-- [ ] TUI integration for red team alerts
-- [ ] Frontend display of red team notifications
-
-### Phase 4: Future Enhancements
-- [ ] Configurable notification thresholds
-- [ ] Automatic pause on critical issues
-- [ ] Red team notification digest/summary
-- [ ] Integration with external code review tools
-
----
-
-## 10. Security Considerations
-
-### 10.1 Access Control
-
-- Red team tasks MUST only have read access
-- Verify `is_red_team` flag before allowing notification API calls
-- Red team cannot spawn tasks or modify contract state
-- Tool key scope should be limited for red team tasks
-
-### 10.2 Abuse Prevention
-
-- Rate limit red team notifications (max 10 per minute per task)
-- Prevent notification spam with deduplication
-- Log all red team activities for audit
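The rate limit and deduplication rules above could be combined into one gate that runs before a notification is stored. This is a sketch under stated assumptions: `NotificationGate` and `GateDecision` are hypothetical names, the limiter is a sliding window held in memory, and two notifications count as duplicates when their normalized message and file path match.

```rust
use std::collections::{HashSet, VecDeque};
use std::time::{Duration, Instant};

/// Outcome of checking one notification against the gate.
#[derive(Debug, PartialEq, Eq)]
pub enum GateDecision {
    Accept,
    Duplicate,
    RateLimited,
}

/// Hypothetical per-task gate: sliding-window rate limit plus deduplication.
pub struct NotificationGate {
    max_per_window: usize,
    window: Duration,
    sent_at: VecDeque<Instant>,
    // Grows without bound in this sketch; a real version would expire entries.
    seen: HashSet<(String, Option<String>)>,
}

impl NotificationGate {
    pub fn new(max_per_window: usize, window: Duration) -> Self {
        Self {
            max_per_window,
            window,
            sent_at: VecDeque::new(),
            seen: HashSet::new(),
        }
    }

    /// `now` is passed in so the behavior can be tested without sleeping.
    pub fn check(&mut self, now: Instant, message: &str, file_path: Option<&str>) -> GateDecision {
        // Drop timestamps that have left the window.
        while let Some(&oldest) = self.sent_at.front() {
            if now.duration_since(oldest) >= self.window {
                self.sent_at.pop_front();
            } else {
                break;
            }
        }

        // Dedup key: trimmed, lowercased message plus the file path.
        let key = (message.trim().to_lowercase(), file_path.map(str::to_owned));
        if self.seen.contains(&key) {
            return GateDecision::Duplicate;
        }
        if self.sent_at.len() >= self.max_per_window {
            return GateDecision::RateLimited;
        }

        self.seen.insert(key);
        self.sent_at.push_back(now);
        GateDecision::Accept
    }
}
```

With the limits from this section, the gate would be built as `NotificationGate::new(10, Duration::from_secs(60))`, one instance per red team task. A rate-limited notification is not recorded as seen, so the red team can retry it once the window has moved on.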
-
-### 10.3 Isolation
-
-- Red team task runs in separate worktree (or no worktree)
-- Cannot affect work task execution directly
-- Supervisor controls whether to act on notifications
-
----
-
-## 11. Testing Strategy
-
-### 11.1 Unit Tests
-
-- Contract model serialization with red team fields
-- Red team task spawning conditions
-- Notification message formatting
-
-### 11.2 Integration Tests
-
-- Full contract lifecycle with red team enabled
-- Notification delivery to supervisor
-- Red team output subscription
-
-### 11.3 E2E Tests
-
-- Create contract with `--red-team` flag
-- Red team detects intentional violation
-- Supervisor receives and responds to notification
-
----
-
-## 12. Success Metrics
-
-1. **Detection Rate**: Percentage of issues caught by red team before task completion
-2. **False Positive Rate**: Percentage of notifications that are dismissed as not actionable
-3. **Response Time**: Time between red team detection and supervisor acknowledgment
-4. **Contract Success Rate**: Compare success rates for contracts with/without red team
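The first two metrics above are simple ratios. The sketch below shows one way to compute them, assuming the counts are already available. `RedTeamMetrics` and its field names are hypothetical, and a rate is reported as `None` when its denominator is zero rather than as 0 or NaN.

```rust
/// Hypothetical counters for one contract (or an aggregate over many).
pub struct RedTeamMetrics {
    /// All issues eventually found in the work tasks.
    pub issues_total: u32,
    /// Issues the red team flagged before the task completed.
    pub issues_caught_before_completion: u32,
    /// All notifications the red team sent.
    pub notifications_total: u32,
    /// Notifications dismissed as not actionable.
    pub notifications_dismissed: u32,
}

/// Ratio helper that avoids dividing by zero.
fn rate(numerator: u32, denominator: u32) -> Option<f64> {
    if denominator == 0 {
        None
    } else {
        Some(f64::from(numerator) / f64::from(denominator))
    }
}

impl RedTeamMetrics {
    /// Fraction of issues caught before task completion.
    pub fn detection_rate(&self) -> Option<f64> {
        rate(self.issues_caught_before_completion, self.issues_total)
    }

    /// Fraction of notifications dismissed as not actionable.
    pub fn false_positive_rate(&self) -> Option<f64> {
        rate(self.notifications_dismissed, self.notifications_total)
    }
}
```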
-
----
-
-## Appendix A: Message Protocol
-
-### Task Output Notification Structure
-
-The red team subscribes to `TaskOutputNotification`:
-
-```rust
-pub struct TaskOutputNotification {
-    pub task_id: Uuid,
-    pub owner_id: Option<Uuid>,
-    pub message_type: String, // "assistant", "tool_use", "tool_result", etc.
-    pub content: String,
-    pub tool_name: Option<String>,
-    pub tool_input: Option<serde_json::Value>,
-    pub is_error: Option<bool>,
-    pub cost_usd: Option<f64>,
-    pub duration_ms: Option<u64>,
-    pub is_partial: bool,
-}
-```
-
-### Daemon Command for Supervisor Message
-
-```rust
-DaemonCommand::SendMessage {
-    task_id: supervisor_id,
-    message: formatted_red_team_alert,
-}
-```
-
----
-
-## Appendix B: Configuration Examples
-
-### Contract Creation with Red Team (API)
-
-```json
-POST /api/v1/contracts
-{
-  "name": "Implement User Authentication",
-  "description": "Add OAuth2 authentication flow",
-  "contractType": "specification",
-  "redTeamEnabled": true,
-  "redTeamPrompt": "Pay special attention to security best practices and OWASP guidelines. Flag any hardcoded secrets or insecure token handling."
-}
-```
-
-### Contract Creation with Red Team (CLI)
-
-```bash
-makima contract create \
-  --type specification \
-  --red-team \
-  --red-team-prompt "Focus on API backwards compatibility and deprecation handling" \
-  "API v2 Migration" \
-  "Migrate public API from v1 to v2"
-```