<feed xmlns='http://www.w3.org/2005/Atom'>
<title>soryu/makima/sh, branch master</title>
<subtitle>soryu-co/soryu mirror</subtitle>
<id>http://src.eirin.xyz/soryu/atom?h=master</id>
<link rel='self' href='http://src.eirin.xyz/soryu/atom?h=master'/>
<link rel='alternate' type='text/html' href='http://src.eirin.xyz/soryu/'/>
<updated>2026-02-02T23:16:00+00:00</updated>
<entry>
<title>Fix downloading too many models</title>
<updated>2026-02-02T23:16:00+00:00</updated>
<author>
<name>soryu</name>
<email>soryu@soryu.co</email>
</author>
<published>2026-02-02T23:16:00+00:00</published>
<link rel='alternate' type='text/html' href='http://src.eirin.xyz/soryu/commit/?id=8361916ce67f3d2ba191ebf27cb50e79cb42e39c'/>
<id>urn:sha1:8361916ce67f3d2ba191ebf27cb50e79cb42e39c</id>
<content type='text'>
</content>
</entry>
<entry>
<title>Use chatterbox TTS</title>
<updated>2026-02-01T03:04:36+00:00</updated>
<author>
<name>soryu</name>
<email>soryu@soryu.co</email>
</author>
<published>2026-02-01T03:04:36+00:00</published>
<link rel='alternate' type='text/html' href='http://src.eirin.xyz/soryu/commit/?id=a2c147ddd59f55a07b5be0c8970169726b55c876'/>
<id>urn:sha1:a2c147ddd59f55a07b5be0c8970169726b55c876</id>
<content type='text'>
</content>
</entry>
<entry>
<title>Download vocab.json and merges.txt in container image</title>
<updated>2026-01-30T02:59:45+00:00</updated>
<author>
<name>soryu</name>
<email>soryu@soryu.co</email>
</author>
<published>2026-01-30T02:59:45+00:00</published>
<link rel='alternate' type='text/html' href='http://src.eirin.xyz/soryu/commit/?id=a9655dccdad116db2b92c13794ddd559f160148d'/>
<id>urn:sha1:a9655dccdad116db2b92c13794ddd559f160148d</id>
<content type='text'>
</content>
</entry>
<entry>
<title>Ensure tokenizer exists for TTS model</title>
<updated>2026-01-29T13:06:33+00:00</updated>
<author>
<name>soryu</name>
<email>soryu@soryu.co</email>
</author>
<published>2026-01-29T13:06:33+00:00</published>
<link rel='alternate' type='text/html' href='http://src.eirin.xyz/soryu/commit/?id=74e77be0ce3adae5889894e9a791fce96d7c82df'/>
<id>urn:sha1:74e77be0ce3adae5889894e9a791fce96d7c82df</id>
<content type='text'>
</content>
</entry>
<entry>
<title>fix: Use correct hf command for Qwen3-TTS download (#46)</title>
<updated>2026-01-29T02:24:04+00:00</updated>
<author>
<name>soryu</name>
<email>soryu@soryu.co</email>
</author>
<published>2026-01-29T02:24:04+00:00</published>
<link rel='alternate' type='text/html' href='http://src.eirin.xyz/soryu/commit/?id=764ace78046e78cce36b64cb3682cc5489bcf9d7'/>
<id>urn:sha1:764ace78046e78cce36b64cb3682cc5489bcf9d7</id>
<content type='text'>
* chore: fix unused import warnings in qwen3-tts module

- Remove unused import 'IndexOp' in model.rs
- Remove unused import 'DType' in speech_tokenizer.rs
- Add #[allow(dead_code)] to codebook_dim field in RvqCodebook

Co-Authored-By: Claude Opus 4.5 &lt;noreply@anthropic.com&gt;

* feat: add voice loading and selection for TTS cloning

Add voice reference audio loading so the TTS speak handler can perform
voice cloning using reference WAV files from the voices/ directory.

- Add voice.rs module: loads manifest.json and reference.wav for a given
  voice_id, decodes via symphonia, resamples to 24kHz for the TTS engine
- Update speak.rs: resolve voice_id from the speak request (default
  "makima"), load reference audio, pass it to engine.generate()
- Add voices/makima/README.md with instructions for obtaining reference
  audio (extraction from YouTube, recording, ffmpeg conversion)
- Graceful fallback: if reference audio is missing, TTS proceeds without
  voice cloning using the model's default voice

Co-Authored-By: Claude Opus 4.5 &lt;noreply@anthropic.com&gt;

* feat: add inference cancellation support for TTS generation

Add cooperative cancellation via Arc&lt;AtomicBool&gt; cancel flag that
threads through TtsEngine::generate -&gt; Qwen3Tts -&gt; GenerationContext.
The autoregressive loop and streaming decoder check the flag each
iteration and break early when set. The speak WebSocket handler
creates a per-session flag, passes it to generate, and sets it on
Cancel/Stop/Close messages.

Co-Authored-By: Claude Opus 4.5 &lt;noreply@anthropic.com&gt;

* Add Qwen3-TTS model download to build process

Fix TTS engine failure due to missing tokenizer by downloading
Qwen3-TTS models during Docker build:
- Download model.safetensors, config.json, tokenizer.json, and
  tokenizer_config.json from Qwen/Qwen3-TTS-12Hz-0.6B-Base
- Download speech tokenizer from Qwen/Qwen3-TTS-Tokenizer-12Hz
- Add QWEN3_TTS_DIR environment variable to Dockerfile
- Script supports both env var override and default path

Co-Authored-By: Claude Opus 4.5 &lt;noreply@anthropic.com&gt;

* fix: use correct hf command for Qwen3-TTS download

Co-Authored-By: Claude Opus 4.5 &lt;noreply@anthropic.com&gt;

---------

Co-authored-by: Claude Opus 4.5 &lt;noreply@anthropic.com&gt;</content>
</entry>
<entry>
<title>fix: Add Qwen3-TTS model download to Docker build (#44)</title>
<updated>2026-01-29T01:04:42+00:00</updated>
<author>
<name>soryu</name>
<email>soryu@soryu.co</email>
</author>
<published>2026-01-29T01:04:42+00:00</published>
<link rel='alternate' type='text/html' href='http://src.eirin.xyz/soryu/commit/?id=d7b0b576fb43902535f0ae8d4f257b50387ec01a'/>
<id>urn:sha1:d7b0b576fb43902535f0ae8d4f257b50387ec01a</id>
<content type='text'>
* chore: fix unused import warnings in qwen3-tts module

- Remove unused import 'IndexOp' in model.rs
- Remove unused import 'DType' in speech_tokenizer.rs
- Add #[allow(dead_code)] to codebook_dim field in RvqCodebook

Co-Authored-By: Claude Opus 4.5 &lt;noreply@anthropic.com&gt;

* feat: add voice loading and selection for TTS cloning

Add voice reference audio loading so the TTS speak handler can perform
voice cloning using reference WAV files from the voices/ directory.

- Add voice.rs module: loads manifest.json and reference.wav for a given
  voice_id, decodes via symphonia, resamples to 24kHz for the TTS engine
- Update speak.rs: resolve voice_id from the speak request (default
  "makima"), load reference audio, pass it to engine.generate()
- Add voices/makima/README.md with instructions for obtaining reference
  audio (extraction from YouTube, recording, ffmpeg conversion)
- Graceful fallback: if reference audio is missing, TTS proceeds without
  voice cloning using the model's default voice

Co-Authored-By: Claude Opus 4.5 &lt;noreply@anthropic.com&gt;

* feat: add inference cancellation support for TTS generation

Add cooperative cancellation via Arc&lt;AtomicBool&gt; cancel flag that
threads through TtsEngine::generate -&gt; Qwen3Tts -&gt; GenerationContext.
The autoregressive loop and streaming decoder check the flag each
iteration and break early when set. The speak WebSocket handler
creates a per-session flag, passes it to generate, and sets it on
Cancel/Stop/Close messages.

Co-Authored-By: Claude Opus 4.5 &lt;noreply@anthropic.com&gt;

* Add Qwen3-TTS model download to build process

Fix TTS engine failure due to missing tokenizer by downloading
Qwen3-TTS models during Docker build:
- Download model.safetensors, config.json, tokenizer.json, and
  tokenizer_config.json from Qwen/Qwen3-TTS-12Hz-0.6B-Base
- Download speech tokenizer from Qwen/Qwen3-TTS-Tokenizer-12Hz
- Add QWEN3_TTS_DIR environment variable to Dockerfile
- Script supports both env var override and default path

Co-Authored-By: Claude Opus 4.5 &lt;noreply@anthropic.com&gt;

---------

Co-authored-by: Claude Opus 4.5 &lt;noreply@anthropic.com&gt;</content>
</entry>
<entry>
<title>Add Postgres for persistence and File cabinet</title>
<updated>2025-12-23T14:47:18+00:00</updated>
<author>
<name>soryu</name>
<email>soryu@soryu.co</email>
</author>
<published>2025-12-23T02:14:58+00:00</published>
<link rel='alternate' type='text/html' href='http://src.eirin.xyz/soryu/commit/?id=a32dc56d2e5447ef8988cb98b8686476cc94e70c'/>
<id>urn:sha1:a32dc56d2e5447ef8988cb98b8686476cc94e70c</id>
<content type='text'>
Migrations are currently local-only and must be run manually with POSTGRES_CONNECTION_URI set.
</content>
</entry>
<entry>
<title>Bump diarization version to 2.1 and fix downloading the tokenizer</title>
<updated>2025-12-23T14:47:18+00:00</updated>
<author>
<name>soryu</name>
<email>soryu@soryu.co</email>
</author>
<published>2025-12-21T19:14:29+00:00</published>
<link rel='alternate' type='text/html' href='http://src.eirin.xyz/soryu/commit/?id=75f2a72a06af6f722fce1bba1d1fc2f4c5e844df'/>
<id>urn:sha1:75f2a72a06af6f722fce1bba1d1fc2f4c5e844df</id>
<content type='text'>
</content>
</entry>
<entry>
<title>Use hf cli to download models</title>
<updated>2025-12-23T14:47:18+00:00</updated>
<author>
<name>soryu</name>
<email>soryu@soryu.co</email>
</author>
<published>2025-12-21T18:12:56+00:00</published>
<link rel='alternate' type='text/html' href='http://src.eirin.xyz/soryu/commit/?id=87e6c9c49fca144e3de3ea4a3618a84b1c418536'/>
<id>urn:sha1:87e6c9c49fca144e3de3ea4a3618a84b1c418536</id>
<content type='text'>
</content>
</entry>
<entry>
<title>Use HF to download models</title>
<updated>2025-12-23T14:47:18+00:00</updated>
<author>
<name>soryu</name>
<email>soryu@soryu.co</email>
</author>
<published>2025-12-21T04:09:18+00:00</published>
<link rel='alternate' type='text/html' href='http://src.eirin.xyz/soryu/commit/?id=dbec21683cad0c61736ef5d376c44a30451b46c8'/>
<id>urn:sha1:dbec21683cad0c61736ef5d376c44a30451b46c8</id>
<content type='text'>
</content>
</entry>
</feed>
