Add Qwen3-TTS research document for live TTS replacement - soryu


author	soryu <soryu@soryu.co>	2026-01-27 03:11:08 +0000
committer	soryu <soryu@soryu.co>	2026-01-27 03:11:08 +0000
commit	ebe029483184d51e702adb9ed79ea70d681a35f8 (patch)
tree	e69c54d7016fef0b2e6c5c3635f4a8501f66ae64 /apps/mobile/components/TaskStatusBadge.tsx
parent	f6b4d06a0158fb7803a2d7a861cf891cb3b202b4 (diff)

Add Qwen3-TTS research document for live TTS replacement

Research findings for replacing Chatterbox TTS with Qwen3-TTS-12Hz-0.6B-Base:
- Current TTS: Chatterbox-Turbo-ONNX with batch-only generation, no streaming
- Qwen3-TTS: 97ms end-to-end latency, streaming support, 3-second voice cloning
- Voice cloning: Requires 3s reference audio + transcript (Makima voice planned)
- Integration: Python service with WebSocket bridge (no ONNX export available)
- Languages: 10 supported including English and Japanese

Document includes:
- Current architecture analysis (makima/src/tts.rs)
- Qwen3-TTS capabilities and requirements
- Feasibility assessment for live/streaming TTS
- Audio clip requirements for voice cloning
- Preliminary technical approach with architecture diagrams

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Diffstat (limited to 'apps/mobile/components/TaskStatusBadge.tsx')

0 files changed, 0 insertions, 0 deletions

