Add voice reference audio loading so the TTS speak handler can perform
voice cloning using reference WAV files from the voices/ directory.
- Add voice.rs module: loads manifest.json and reference.wav for a given
  voice_id, decodes via symphonia, resamples to 24 kHz for the TTS engine
- Update speak.rs: resolve voice_id from the speak request (default
  "makima"), load reference audio, pass it to engine.generate()
- Add voices/makima/README.md with instructions for obtaining reference
  audio (extraction from YouTube, recording, ffmpeg conversion)
- Graceful fallback: if reference audio is missing, TTS proceeds without
  voice cloning using the model's default voice
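
The voice_id default and the missing-file fallback described above can be
sketched roughly as follows. This is a minimal illustration using only the
standard library, not the actual contents of voice.rs or speak.rs: the
function names resolve_voice_id and find_reference_audio are hypothetical,
and the layout voices/<voice_id>/reference.wav is assumed from the file
names mentioned in this message. Decoding and resampling are left out.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: pick the requested voice, or fall back to the
/// default "makima" when the request carries no (or an empty) voice_id.
pub fn resolve_voice_id(requested: Option<&str>) -> &str {
    match requested {
        Some(id) if !id.trim().is_empty() => id,
        _ => "makima",
    }
}

/// Hypothetical helper: return the path to the reference WAV for a voice,
/// or None when the file is absent. Assumes the layout
/// <voices_dir>/<voice_id>/reference.wav.
pub fn find_reference_audio(voices_dir: &Path, voice_id: &str) -> Option<PathBuf> {
    let reference = voices_dir.join(voice_id).join("reference.wav");
    if reference.is_file() {
        Some(reference)
    } else {
        // Missing reference audio is not an error: the caller proceeds
        // without voice cloning and uses the model's default voice.
        None
    }
}
```

A caller would treat None as "generate without reference audio" rather
than failing the speak request.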

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>