Open Source
Explore the latest AI open-source projects from GitHub and HuggingFace.
Index-TTS is an industrial-level, controllable zero-shot text-to-speech system that adds precise duration control to an autoregressive TTS architecture while preserving the natural prosody of the voice prompt. It disentangles speaker identity from emotional expression, so timbre and emotion can be controlled independently through an audio reference, an emotion vector, or a free-text description. It supports Chinese and English, with optional FP16 and DeepSpeed acceleration.
microsoft
Open-source frontier voice AI for TTS and ASR
resemble-ai
A family of state-of-the-art open-source TTS models by Resemble AI with zero-shot voice cloning, support for 23+ languages, and paralinguistic controls, in variants ranging from 350M to 500M parameters.
nari-labs
A 1.6B parameter TTS model by Nari Labs that generates ultra-realistic multi-speaker dialogue in a single pass, supporting voice cloning and non-verbal expressions.
Sesame AI Labs
Sesame AI Labs' open-source 1B-parameter conversational speech model built on the Llama architecture, with natural human-like intonation, multi-speaker support, and native HuggingFace Transformers integration.