| author | soryu <soryu@soryu.co> | 2026-02-01 00:20:55 +0000 |
|---|---|---|
| committer | soryu <soryu@soryu.co> | 2026-02-01 00:25:43 +0000 |
| commit | bb14010db99b40792372bfcb4348cf4984f30b3f | |
| tree | d5c12af5ce8e87430daad3f80a979157233e8644 /makima/migrations/20260201000000_supervisor_heartbeats.sql | |
| parent | 7567153e6281b94e39e52be5d060b381ed69597d | |
feat: Implement Phase 3 Tasks 3.1 and 3.2 - SupervisorState enum and Heartbeat Infrastructure
Task 3.1: Enhanced Supervisor State Enum
- Add SupervisorStateEnum with states: Initializing, Idle, Working, WaitingForUser,
  WaitingForTasks, Blocked, Completed, Failed, Interrupted
- Implement Display, FromStr, Default, and serde serialization
- Add SupervisorHeartbeatRecord and SupervisorHeartbeatRequest structs
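
The enum described in Task 3.1 might look roughly like the following. This is a minimal sketch, not the code from this commit: the variant names come from the commit message and the snake_case strings from the migration's column comment, but the `as_str` helper, the choice of `Initializing` as the default, and the `String` error type are assumptions. The serde derive is left out to keep the example dependency-free.

```rust
use std::fmt;
use std::str::FromStr;

/// Supervisor lifecycle states listed in the commit message.
/// Assumption: `Initializing` is the default; the commit does not say which variant is.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum SupervisorStateEnum {
    #[default]
    Initializing,
    Idle,
    Working,
    WaitingForUser,
    WaitingForTasks,
    Blocked,
    Completed,
    Failed,
    Interrupted,
}

impl SupervisorStateEnum {
    /// Hypothetical helper: the snake_case name, matching the values
    /// listed in the migration's comment on the `state` column.
    pub fn as_str(&self) -> &'static str {
        match self {
            Self::Initializing => "initializing",
            Self::Idle => "idle",
            Self::Working => "working",
            Self::WaitingForUser => "waiting_for_user",
            Self::WaitingForTasks => "waiting_for_tasks",
            Self::Blocked => "blocked",
            Self::Completed => "completed",
            Self::Failed => "failed",
            Self::Interrupted => "interrupted",
        }
    }
}

impl fmt::Display for SupervisorStateEnum {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(self.as_str())
    }
}

impl FromStr for SupervisorStateEnum {
    // Assumption: a plain String error; the real code may use a dedicated error type.
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "initializing" => Ok(Self::Initializing),
            "idle" => Ok(Self::Idle),
            "working" => Ok(Self::Working),
            "waiting_for_user" => Ok(Self::WaitingForUser),
            "waiting_for_tasks" => Ok(Self::WaitingForTasks),
            "blocked" => Ok(Self::Blocked),
            "completed" => Ok(Self::Completed),
            "failed" => Ok(Self::Failed),
            "interrupted" => Ok(Self::Interrupted),
            other => Err(format!("unknown supervisor state: {other}")),
        }
    }
}
```

Keeping `Display` and `FromStr` as exact inverses means a value written to the `state` column can be parsed back without a separate mapping table.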
Task 3.2: Heartbeat Infrastructure
- Create supervisor_heartbeats migration with proper indexes and constraints
- Add heartbeat storage functions to repository.rs:
  - create_supervisor_heartbeat
  - get_latest_supervisor_heartbeat
  - get_supervisor_heartbeats
  - get_contract_supervisor_heartbeats
  - cleanup_old_heartbeats (24 hour TTL support)
  - find_stale_supervisors (for dead supervisor detection)
- Add SupervisorHeartbeat message to protocol.rs with enhanced fields
- Update mesh_daemon.rs to process and store supervisor heartbeats
- Add unit tests for SupervisorStateEnum and heartbeat serialization
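
The idea behind dead supervisor detection can be illustrated in memory: take the latest heartbeat per supervisor and flag any whose age exceeds a threshold. The sketch below is an assumption-laden illustration, not the repository.rs implementation (which presumably runs a database query). The `Heartbeat` struct, the `String` IDs, the Unix-second timestamps, and the signature of `find_stale_supervisors` are all hypothetical.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory heartbeat. The real table uses UUIDs and TIMESTAMPTZ;
/// plain strings and Unix seconds keep this sketch dependency-free.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Heartbeat {
    pub supervisor_task_id: String,
    pub timestamp_secs: u64,
}

/// Returns the most recent heartbeat timestamp seen for each supervisor.
pub fn latest_per_supervisor(heartbeats: &[Heartbeat]) -> HashMap<&str, u64> {
    let mut latest: HashMap<&str, u64> = HashMap::new();
    for hb in heartbeats {
        let entry = latest
            .entry(hb.supervisor_task_id.as_str())
            .or_insert(hb.timestamp_secs);
        if hb.timestamp_secs > *entry {
            *entry = hb.timestamp_secs;
        }
    }
    latest
}

/// Returns the IDs (sorted) of supervisors whose latest heartbeat is older
/// than `threshold_secs` relative to `now_secs`.
pub fn find_stale_supervisors(
    heartbeats: &[Heartbeat],
    now_secs: u64,
    threshold_secs: u64,
) -> Vec<String> {
    let mut stale: Vec<String> = latest_per_supervisor(heartbeats)
        .into_iter()
        .filter(|(_, last_seen)| now_secs.saturating_sub(*last_seen) > threshold_secs)
        .map(|(id, _)| id.to_string())
        .collect();
    stale.sort();
    stale
}
```

Passing `now_secs` in as a parameter rather than reading the clock inside the function keeps the check deterministic and easy to unit test.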
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Diffstat (limited to 'makima/migrations/20260201000000_supervisor_heartbeats.sql')
| -rw-r--r-- | makima/migrations/20260201000000_supervisor_heartbeats.sql | 36 |
1 files changed, 36 insertions, 0 deletions
diff --git a/makima/migrations/20260201000000_supervisor_heartbeats.sql b/makima/migrations/20260201000000_supervisor_heartbeats.sql
new file mode 100644
index 0000000..8595f71
--- /dev/null
+++ b/makima/migrations/20260201000000_supervisor_heartbeats.sql
@@ -0,0 +1,36 @@
+-- Create supervisor_heartbeats table for tracking supervisor state over time.
+-- This enables detection of dead/stale supervisors and provides audit trail.
+
+CREATE TABLE IF NOT EXISTS supervisor_heartbeats (
+    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
+    supervisor_task_id UUID NOT NULL REFERENCES tasks(id) ON DELETE CASCADE,
+    contract_id UUID NOT NULL REFERENCES contracts(id) ON DELETE CASCADE,
+    state VARCHAR(50) NOT NULL,
+    phase VARCHAR(50) NOT NULL,
+    current_activity TEXT,
+    progress INTEGER DEFAULT 0 CHECK (progress >= 0 AND progress <= 100),
+    pending_task_ids UUID[] DEFAULT ARRAY[]::UUID[],
+    timestamp TIMESTAMPTZ NOT NULL DEFAULT NOW()
+);
+
+-- Index for finding heartbeats by supervisor task
+CREATE INDEX idx_heartbeats_supervisor ON supervisor_heartbeats(supervisor_task_id);
+
+-- Index for finding heartbeats by timestamp (for cleanup and monitoring)
+CREATE INDEX idx_heartbeats_timestamp ON supervisor_heartbeats(timestamp);
+
+-- Index for finding heartbeats by contract
+CREATE INDEX idx_heartbeats_contract ON supervisor_heartbeats(contract_id);
+
+-- Composite index for finding latest heartbeat per supervisor
+CREATE INDEX idx_heartbeats_supervisor_timestamp ON supervisor_heartbeats(supervisor_task_id, timestamp DESC);
+
+COMMENT ON TABLE supervisor_heartbeats IS 'Historical record of supervisor heartbeats for monitoring and dead supervisor detection';
+COMMENT ON COLUMN supervisor_heartbeats.state IS 'Supervisor state: initializing, idle, working, waiting_for_user, waiting_for_tasks, blocked, completed, failed, interrupted';
+COMMENT ON COLUMN supervisor_heartbeats.phase IS 'Current contract phase when heartbeat was sent';
+COMMENT ON COLUMN supervisor_heartbeats.current_activity IS 'Human-readable description of what the supervisor is doing';
+COMMENT ON COLUMN supervisor_heartbeats.progress IS 'Progress percentage (0-100)';
+COMMENT ON COLUMN supervisor_heartbeats.pending_task_ids IS 'Array of task IDs the supervisor is waiting on';
+
+-- Note: Cleanup of old heartbeats (24 hour TTL) should be done by a scheduled job
+-- or application-level cleanup, not a CHECK constraint (which can't reference NOW())
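
The closing note in the migration leaves the 24 hour TTL to application-level cleanup. As a rough illustration of that rule, the sketch below applies the same cutoff to an in-memory list. It is not the `cleanup_old_heartbeats` function from repository.rs: the `StoredHeartbeat` struct, the constant name, the Unix-second timestamps, and the signature are assumptions, and a real implementation would presumably issue a `DELETE` against the `timestamp` column instead.

```rust
/// 24 hours in seconds, the TTL mentioned in the migration's closing note.
/// The constant name is hypothetical.
pub const HEARTBEAT_TTL_SECS: u64 = 24 * 60 * 60;

/// Hypothetical stand-in for a stored heartbeat row.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct StoredHeartbeat {
    pub id: u32,
    pub timestamp_secs: u64,
}

/// Drops every row older than the TTL relative to `now_secs` and
/// returns how many rows were removed.
pub fn cleanup_old_heartbeats(rows: &mut Vec<StoredHeartbeat>, now_secs: u64) -> usize {
    let before = rows.len();
    // saturating_sub avoids underflow when `now_secs` is smaller than the TTL.
    let cutoff = now_secs.saturating_sub(HEARTBEAT_TTL_SECS);
    rows.retain(|row| row.timestamp_secs >= cutoff);
    before - rows.len()
}
```

Returning the number of removed rows mirrors what a SQL `DELETE` reports, which makes it straightforward for a scheduled job to log how much it cleaned up.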
