soryu - soryu-co/soryu mirror

	Commit message (Collapse)	Author	Age	Files	Lines
*	Release in makima repo	soryu	2026-02-02	1	-6/+4
\| \| \| \|	Also remove all other TTS models
*	Use chatterbox TTS	soryu	2026-02-01	1	-8/+8
\|
*	Add auto_merge_local option for local-only contracts (#50)	soryu	2026-01-31	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When local_only=true on a contract, all completion actions are skipped. This adds a new option auto_merge_local that, when enabled along with local_only, will automatically merge completed task changes to the master/main branch locally (without pushing or creating PRs). Changes: - Add auto_merge_local column to contracts table (migration) - Add auto_merge_local field to Contract model and summary - Update CreateContractRequest and UpdateContractRequest structs - Update contract repository create/update functions - Add auto_merge_local to WebSocket protocol StartTask command - Pass auto_merge_local through spawn_task and run_task functions - Modify task manager completion logic: if local_only=true AND auto_merge_local=true, execute 'merge' completion action locally - Update all server handlers to retrieve and pass auto_merge_local - Add TypeScript types to frontend components Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
*	Add autodetection of master for PR creation	soryu	2026-01-29	1	-2/+2
\|
*	Fix model loading for TTS / speak	soryu	2026-01-29	1	-2/+10
\|
*	Fix makima supervisor pr CLI command	soryu	2026-01-29	1	-1/+3
\|
*	Add Qwen3-TTS streaming endpoint for voice synthesis (#40)	soryu	2026-01-28	1	-0/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Task completion checkpoint * Task completion checkpoint * Task completion checkpoint * Add Qwen3-TTS research document for live TTS replacement Research findings for replacing Chatterbox TTS with Qwen3-TTS-12Hz-0.6B-Base: - Current TTS: Chatterbox-Turbo-ONNX with batch-only generation, no streaming - Qwen3-TTS: 97ms end-to-end latency, streaming support, 3-second voice cloning - Voice cloning: Requires 3s reference audio + transcript (Makima voice planned) - Integration: Python service with WebSocket bridge (no ONNX export available) - Languages: 10 supported including English and Japanese Document includes: - Current architecture analysis (makima/src/tts.rs) - Qwen3-TTS capabilities and requirements - Feasibility assessment for live/streaming TTS - Audio clip requirements for voice cloning - Preliminary technical approach with architecture diagrams Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * [WIP] Heartbeat checkpoint - 2026-01-27 03:11:15 UTC * Add Qwen3-TTS research documentation Comprehensive research on replacing Chatterbox TTS with Qwen3-TTS-12Hz-0.6B-Base: - Current TTS implementation analysis (Chatterbox-Turbo-ONNX in makima/src/tts.rs) - Qwen3-TTS capabilities: 97ms streaming latency, voice cloning with 3s reference - Cross-lingual support: Japanese voice (Makima/Tomori Kusunoki) speaking English - Python microservice architecture recommendation (FastAPI + WebSocket) - Implementation phases and technical approach - Hardware requirements and dependencies Key findings: - Live/streaming TTS is highly feasible with 97ms latency - Voice cloning fully supported with 0.95 speaker similarity - Recommended: Python microservice with WebSocket streaming Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * Add comprehensive Qwen3-TTS integration specification This specification document defines the complete integration of Qwen3-TTS-12Hz-0.6B-Base as a replacement for the existing Chatterbox-Turbo TTS implementation. The document covers: ## Functional Requirements - WebSocket endpoint /api/v1/speak for streaming TTS - Voice cloning with default Makima voice (Japanese VA speaking English) - Support for custom voice references - Detailed client-to-server and server-to-client message protocols - Integration with Listen page for bidirectional speech ## Non-Functional Requirements - Latency targets: < 200ms first audio byte - Audio quality: 24kHz, mono, PCM16/PCM32f - Hardware requirements: CUDA GPU with 4-8GB VRAM - Scalability: 10 concurrent sessions per GPU ## Architecture Specification - Python TTS microservice with FastAPI/WebSocket - Rust proxy endpoint in makima server - Voice prompt caching mechanism (LRU cache) - Error handling and recovery strategies ## API Contract - Complete WebSocket message format definitions (TypeScript) - Error codes and responses (TTS_UNAVAILABLE, SYNTHESIS_ERROR, etc.) - Session state machine and lifecycle management ## Voice Asset Requirements - Makima voice clip specifications (5-10s WAV, transcript required) - Storage location: models/voices/makima/ - Metadata format for voice management ## Testing Strategy - Unit tests for Python TTS service and Rust proxy - Integration tests for WebSocket flow - Latency benchmarks with performance targets - Test data fixtures for various text lengths Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * Add Qwen3-TTS implementation plan Comprehensive implementation plan for replacing Chatterbox-TTS with Qwen3-TTS streaming TTS service, including: - Task breakdown with estimated hours for each phase - Phase 1: Python TTS microservice (FastAPI, WebSocket) - Phase 2: Rust proxy integration (speak.rs, tts_client.rs) - Detailed file changes and new module structure - Testing plan with unit, integration, and latency benchmarks - Risk assessment with mitigation strategies - Success criteria for each phase Based on specification in docs/specs/qwen3-tts-spec.md Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * Add author and research references to TTS implementation plan Add links to research documentation and author attribution. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * [WIP] Heartbeat checkpoint - 2026-01-27 03:25:06 UTC * Add Python TTS service project structure (Phase 1.1-1.3) Create the initial makima-tts Python service directory structure with: - pyproject.toml with FastAPI, Qwen-TTS, and torch dependencies - config.py with pydantic-settings TTSConfig class - models.py with Pydantic message models (Start, Speak, Stop, Ready, etc.) This implements tasks P1.1, P1.2, and P1.3 from the Qwen3-TTS implementation plan. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * Add TTS engine and voice manager for Qwen3-TTS (Phase 1.4-1.5) Implement core TTS functionality: - tts_engine.py: Qwen3-TTS wrapper with streaming audio chunk generation - voice_manager.py: Voice prompt caching with LRU eviction and TTL support Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * [WIP] Heartbeat checkpoint - 2026-01-27 03:30:06 UTC * Add TTS proxy client and message types (Phase 2.1, 2.2, 2.4) - Add tts_client.rs with TtsConfig, TtsCircuitBreaker, TtsError, TtsProxyClient, and TtsConnection structs for WebSocket proxying - Add TTS message types to messages.rs (TtsAudioEncoding, TtsPriority, TtsStartMessage, TtsSpeakMessage, TtsStopMessage, TtsClientMessage, TtsReadyMessage, TtsAudioChunkMessage, TtsCompleteMessage, TtsErrorMessage, TtsStoppedMessage, TtsServerMessage) - Export tts_client module from server mod.rs - tokio-tungstenite already present in Cargo.toml Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * Add TTS WebSocket handler and route (Phase 2.3, 2.5, 2.6) - Create speak.rs WebSocket handler that proxies to Python TTS service - Add TtsState fields (tts_client, tts_config) to AppState - Add with_tts() builder and is_tts_healthy() methods to AppState - Register /api/v1/speak route in the router - Add speak module export in handlers/mod.rs The handler forwards WebSocket messages bidirectionally between the client and the Python TTS microservice with proper error handling. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * Add Makima voice profile assets for TTS voice cloning Creates the voice assets directory structure with: - manifest.json containing voice configuration (voice_id, speaker, language, reference audio path, and Japanese transcript placeholder) - README.md with instructions for obtaining voice reference audio Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * Add Rust-native Qwen3-TTS integration research document Research findings for integrating Qwen3-TTS-12Hz-0.6B-Base directly into the makima Rust codebase without Python. Key conclusions: - ONNX export is not viable (unsupported architecture) - Candle (HF Rust ML framework) is the recommended approach - Model weights available in safetensors format (2.52GB total) - Three components needed: LM backbone, code predictor, speech tokenizer - Crane project has Qwen3-TTS as highest priority (potential upstream) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * [WIP] Heartbeat checkpoint - 2026-01-27 11:21:43 UTC * [WIP] Heartbeat checkpoint - 2026-01-27 11:24:19 UTC * [WIP] Heartbeat checkpoint - 2026-01-27 11:26:43 UTC * feat: implement Rust-native Qwen3-TTS using candle framework Replace monolithic tts.rs with modular tts/ directory structure: - tts/mod.rs: TtsEngine trait, TtsEngineFactory, shared types (AudioChunk, TtsError), and utility functions (save_wav, resample, argmax) - tts/chatterbox.rs: existing ONNX-based ChatterboxTTS adapted to implement TtsEngine trait with Mutex-wrapped sessions for Send+Sync - tts/qwen3/mod.rs: Qwen3Tts entry point with HuggingFace model loading - tts/qwen3/config.rs: Qwen3TtsConfig parsing from HF config.json - tts/qwen3/model.rs: 28-layer Qwen3 transformer with RoPE, GQA (16 heads, 8 KV heads), SiLU MLP, RMS norm, and KV cache - tts/qwen3/code_predictor.rs: 5-layer MTP module predicting 16 codebooks - tts/qwen3/speech_tokenizer.rs: ConvNet encoder/decoder with 16-layer RVQ - tts/qwen3/generate.rs: autoregressive generation loop with streaming support Add candle-core, candle-nn, candle-transformers, safetensors to Cargo.toml. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * feat: integrate TTS engine into speak WebSocket handler - Update speak.rs handler to use TTS engine directly from SharedState instead of returning a stub "not implemented" error - Add TtsEngine (OnceCell lazy-loaded) to AppState in state.rs with get_tts_engine() method for lazy initialization on first connection - Implement full WebSocket protocol: client sends JSON speak/cancel/stop messages, server streams binary PCM audio chunks and audio_end signals - Create voices/makima/manifest.json for Makima voice profile configuration - All files compile successfully with zero errors Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * feat: add /speak TTS page with WebSocket audio playback Add a new /speak frontend page for text-to-speech via WebSocket. The page accepts text input and streams synthesized PCM audio through the Web Audio API. Includes model loading indicator, cancel support, and connection status. Also adds a loading bar to the listen page ControlPanel during WebSocket connection. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
*	Add patch merging and fix task healthcheck failing due to worktrees	soryu	2026-01-27	1	-0/+17
\|
*	Fix worktree info and patches endpoint	soryu	2026-01-27	1	-1/+21
\|
*	Default to shared worktree and add worktree endpoint	soryu	2026-01-27	1	-0/+9
\|
*	Fix supervisor merge and PR commands (#36)	soryu	2026-01-27	1	-0/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Task completion checkpoint * Task completion checkpoint * Task completion checkpoint * Fix supervisor merge for completed tasks and make PR command synchronous ## Issue 1: makima supervisor merge doesn't work for completed tasks When a task completes, the daemon removes it from in-memory task tracking. This caused merge operations to fail with "Task not found". Fixed by updating handle_merge_task_to_target() to use get_task_worktree_path() which scans the worktrees directory as a fallback when the task is not in memory. Also updated handle_create_pr() with the same pattern for consistency. ## Issue 2: makima supervisor pr returns immediately without result The create_pr handler was asynchronous - it sent the CreatePR command to the daemon and immediately returned without waiting for the result. Fixed by: 1. Adding PrResultNotification struct and pr_results broadcast channel to AppState 2. Updating mesh_daemon.rs to broadcast PRCreated results to the channel 3. Updating create_pr() handler to subscribe to pr_results and wait for the result with a 60-second timeout (matching the merge command pattern) Now the PR command returns the actual pr_url and pr_number from the daemon. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
*	Add local-only mode for contracts with patch export support (#34)	soryu	2026-01-26	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Add local_only flag to contracts database and models Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * Task completion checkpoint * Skip automatic completion actions in local_only mode Add `local_only` flag to contracts that prevents automatic completion actions (branch, merge, pr) from executing when tasks complete. This allows users to manually handle code changes via patch files or other means when operating in local-only mode. Changes: - Add `local_only` field to Contract model and request types - Add database migration for the new column - Add `local_only` parameter to SpawnTask command in both state.rs and daemon protocol.rs - Modify task manager to skip completion action execution when `local_only` is true, with appropriate logging - Pass `local_only` flag through all task spawning paths: - mesh_supervisor.rs (task spawn, retry, resume) - mesh.rs (task start, reassign, continue) - mesh_chat.rs (run task) - contract_chat.rs (run task) - Update repository create/update functions to handle `local_only` Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * Task completion checkpoint * Implement core patch export system Add functionality to create uncompressed, human-readable git patches for export. This enables users to generate patches that can be manually applied or shared, without the compression used for internal checkpoints. Changes: - Add ExportPatchResult struct with patch content, file count, and line stats - Add create_export_patch() function that generates diffs against a base SHA - Add get_head_sha() utility function - Add parse_diff_stat() helper to extract line counts from git output - Add CreateExportPatch command to daemon protocol - Add ExportPatchCreated response message to protocol - Add handler in task manager to process export patch requests - Add server-side handling to broadcast patch results to UI The export patch system automatically finds the merge-base when no base SHA is provided, trying upstream tracking branch first, then common default branches (origin/main, origin/master, main, master). Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * Task completion checkpoint * Add GitActionsPanel frontend component * Add WorktreeFilesPanel and PatchesListPanel components * Add local-only mode toggle to contract creation --------- Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
*	Make merges synchronous	soryu	2026-01-26	1	-0/+26
\|
*	Add automatic phase transitions and fix PR creation	soryu	2026-01-25	1	-0/+10
\|
*	Clean up questions after removing tasks or contracts	soryu	2026-01-24	1	-0/+64
\|
*	Add patch checkpointing	soryu	2026-01-23	1	-0/+9
\|
*	Add daemon restart feature from settings (#18)	soryu	2026-01-22	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Add daemon restart feature from settings This adds the ability to restart a connected daemon from the settings page. The feature includes: - Backend: RestartDaemon command added to DaemonCommand enum - Backend: New POST /api/v1/mesh/daemons/{id}/restart endpoint - Backend: Daemon gracefully shuts down tasks and exits with code 42 (can be used by process managers like systemd to detect restart requests) - Frontend: restartDaemon() API function - Frontend: Restart button in Connected Daemons section of settings - Frontend: Confirmation dialog before restart to prevent accidental restarts When a daemon receives the restart command, it: 1. Gracefully shuts down all running Claude processes (5s timeout) 2. Exits with code 42 to signal restart requested 3. The daemon can be restarted by a process manager or manually Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
*	Add non-blocking persistent contract completion questions (#14)	soryu	2026-01-20	1	-0/+5
\| \| \| \| \|	* [WIP] Heartbeat checkpoint - 2026-01-20 22:40:37 UTC * Task completion checkpoint
*	Make sure tasks can continue	soryu	2026-01-19	1	-0/+6
\|
*	Add pushed heartbeats and multi-question select	soryu	2026-01-18	1	-0/+9
\|
*	Fix: Find alternate daemon	soryu	2026-01-18	1	-0/+23
\|
*	Implement git config inherit system	soryu	2026-01-15	1	-0/+7
\|
*	Implement simple git checkpoint command for supervisor	soryu	2026-01-15	1	-0/+8
\|
*	Fixup: Add cleanup and isolation features to makima	soryu	2026-01-15	1	-0/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add comprehensive CLI documentation - Create makima/docs/CLI.md with complete command reference for: - makima server: HTTP/WebSocket server options - makima daemon: Worker daemon configuration - makima supervisor: Contract orchestration commands - makima contract: Task-contract interaction commands - Include configuration file examples and environment variables - Add usage workflows for common scenarios - Update makima/README.md with CLI overview and link to docs Add GitHub Actions release workflow for v0.1.0 Creates automated release workflow that: - Triggers on v* tag pushes - Builds binaries for Linux x86_64, macOS x86_64, and macOS ARM64 - Uses Rust nightly toolchain (required for edition 2024) - Packages binaries as .tar.gz archives - Creates GitHub release with installation instructions fix(ci): update macOS runner for x86_64 builds Replace deprecated macos-13 runner with macos-15-intel for x86_64-apple-darwin target. The macos-13 runner has been retired by GitHub Actions. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> Add dismissing notifications and fix CLI task ID arg Add worktree cleanup when contracts complete or are deleted (#21) - Add CleanupWorktree daemon command variant - Handle CleanupWorktree in daemon task manager - Add cleanup_contract_worktrees helper function - Trigger cleanup when contract status becomes 'completed' - Trigger cleanup before contract deletion Add Autonomous Loop Mode for persistent task completion (#20) Implements the "Autonomous Loop Mode" feature inspired by Ralph for Claude Code. This enables tasks to automatically restart and continue working until they explicitly signal completion via a COMPLETION_GATE block. Key features: - Exit confirmation via COMPLETION_GATE: Tasks must output a <COMPLETION_GATE> block with `ready: true` to signal completion. Without this, the task auto-restarts using `claude --continue` to resume the conversation. - Circuit breaker: Prevents infinite loops by detecting: * Maximum iteration limit (default: 10) * No progress for N consecutive iterations (default: 3) * Same error repeated N times (default: 5) - spawn_continue: New ProcessManager method to spawn Claude with the `--continue` flag, resuming from the previous session state. Toggle: Enable via `autonomous_loop` flag on contracts. When set, all tasks spawned for that contract will run in autonomous loop mode. Files changed: - completion_gate.rs: COMPLETION_GATE parser and CircuitBreaker logic - claude.rs: spawn_continue() for --continue mode spawning - manager.rs: Autonomous loop iteration logic in run_task() - protocol.rs: autonomousLoop field in DaemonCommand::SpawnTask - models.rs/repository.rs: autonomous_loop column on contracts/tasks - Migration: Adds autonomous_loop columns to contracts and tasks tables Add get-task and output commands to supervisor CLI (#24) Add two new supervisor subcommands: - `makima supervisor task <task_id>` - Get individual task details - `makima supervisor output <task_id>` - Get task output/claude log This allows supervisors to fetch task details and claude output directly from the CLI instead of using curl to call the task API. Add optional bubblewrap sandboxing for Claude processes (#23) Add --bubblewrap flag and process.bubblewrap config section to enable running Claude Code in a bubblewrap sandbox for process isolation. When enabled, claude processes run with filesystem restrictions: - Root filesystem mounted read-only - Working directory (worktree) mounted read-write - Fresh /dev, /proc, /tmp - Network access preserved for API calls Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
*	Automatically derive repo URL and add notifications for input	soryu	2026-01-15	1	-0/+185
\|
*	Contract system	soryu	2026-01-15	1	-23/+341
\|
*	Initial Control system	soryu	2026-01-11	1	-2/+465
\|
*	Add conflict notification and file update WS endpoint	soryu	2025-12-23	1	-1/+29
\|
*	Add Postgres for persistence and File cabinet	soryu	2025-12-23	1	-2/+12
\| \| \| \|	Migrations are local only currently, and must be run manually by setting POSTGRES_CONNECTION_URI
*	Add EOU detection and streaming diarization	soryu	2025-12-23	1	-3/+9
\|
*	Remove TTS endpoint using chatterbox	soryu	2025-12-23	1	-7/+0
\| \| \| \|	The library still remains, but this complicates deployment due to the large size of the model, so it is removed for now
*	Implement makima listen websockets server	soryu	2025-12-23	1	-0/+50