Open Source
Explore the latest AI open-source projects from GitHub and HuggingFace.
Index-TTS is an industrial-level, controllable zero-shot text-to-speech system that introduces precise duration control into an autoregressive TTS architecture while preserving natural prosody from voice prompts. It disentangles speaker identity from emotional expression, so timbre and emotion can be controlled independently through an audio reference, an emotion vector, or a free-text description. It supports Chinese and English, with optional FP16 and DeepSpeed acceleration.
RVC-Boss
Open-source WebUI for few-shot and zero-shot voice cloning and text-to-speech, producing a usable voice from as little as a 5-second sample.
Microsoft
Microsoft's MIT-licensed open frontier voice AI: 1.5B long-form TTS up to 90 minutes with 4 speakers, 0.5B streaming TTS at 300 ms latency, and 7B ASR for 60-minute single-pass transcription. 47k+ stars.
OpenBMB
OpenBMB's tokenizer-free TTS system, now a 2B model trained on 2M+ hours across 30 languages with Voice Design, controllable cloning, and 48 kHz audio.
resemble-ai
Resemble AI's MIT-licensed family of state-of-the-art open TTS models, spanning a 350M low-latency Turbo variant and a 23+ language Multilingual V3. Offers zero-shot voice cloning, exaggeration control, and built-in Perth neural watermarking on every clip.