SOPRO
// A lightweight text-to-speech model with zero-shot voice cloning
What it does
https://github.com/user-attachments/assets/8b70f36e-2623-452d-b65a-be473ec36f26

2026.02.04 - SoproTTS v1.5 is out: more stable, faster, and smaller (135M parameters). Trained for just $100 on a single GPU, it reaches 250 ms time to first audio (TTFA) when streaming and a real-time factor (RTF) of 0.05 (~20× realtime) on CPU.

Sopro (from the Portuguese word for “breath/blow”) is a lightweight English text-to-speech model I trained as a side
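To make the "0.05 RTF (~20× realtime)" figure concrete, here is a minimal sketch of the arithmetic, assuming the usual definition of real-time factor (synthesis time divided by the duration of the audio produced). The function names and the timings are hypothetical and chosen for illustration; they are not part of Sopro and are not measurements of it.

```python
def real_time_factor(synthesis_seconds: float, audio_seconds: float) -> float:
    """RTF = time spent synthesizing / duration of the audio produced."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return synthesis_seconds / audio_seconds


def realtime_speedup(rtf: float) -> float:
    """How many times faster than realtime a given RTF is (1 / RTF)."""
    if rtf <= 0:
        raise ValueError("rtf must be positive")
    return 1.0 / rtf


# Hypothetical timings: 0.5 s of compute to produce 10 s of audio.
rtf = real_time_factor(synthesis_seconds=0.5, audio_seconds=10.0)
speedup = realtime_speedup(rtf)
print(f"RTF = {rtf:.2f}, about {speedup:.0f}x realtime")
```

An RTF below 1 means audio is generated faster than it plays back, which is what makes streaming on CPU practical.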
Getting Started
git
git clone https://github.com/samuel-vitorino/sopro