Open Source
Explore the latest AI open-source projects from GitHub and HuggingFace.
Index-TTS is an industrial-level, controllable zero-shot text-to-speech system that adds precise duration control to an autoregressive TTS architecture while preserving the natural prosody of the voice prompt. It disentangles speaker identity from emotional expression, so timbre and emotion can be controlled independently through a reference audio clip, an emotion vector, or a free-text description. It supports Chinese and English, with optional FP16 and DeepSpeed acceleration.
Microsoft
Microsoft's MIT-licensed open frontier voice AI: 1.5B long-form TTS up to 90 minutes with 4 speakers, 0.5B streaming TTS at 300 ms latency, and 7B ASR for 60-minute single-pass transcription. 47k+ stars.
microsoft
Open-source frontier voice AI for TTS and ASR
resemble-ai
Family of SoTA open-source TTS models by Resemble AI with zero-shot voice cloning, 23+ language support, and paralinguistic controls across 350M-500M parameter variants.
OpenBMB
OpenBMB's tokenizer-free 2B-parameter TTS model emitting native 48kHz audio across 30 languages with voice design, controllable cloning, and an OpenAI-compatible endpoint.