Open Source
Explore the latest AI open-source projects from GitHub and HuggingFace.
Microsoft's open-source frontier voice AI combining text-to-speech and automatic speech recognition. Features 60-minute long-form ASR, 90-minute continuous TTS, and real-time streaming with 300 ms latency, using continuous speech tokenizers at a 7.5 Hz frame rate.
resemble-ai
Family of SoTA open-source TTS models by Resemble AI with zero-shot voice cloning, 23+ language support, and paralinguistic controls across 350M-500M parameter variants.
index-tts
Industrial-grade zero-shot TTS with precise duration control and emotion disentanglement.
nari-labs
A 1.6B parameter TTS model by Nari Labs that generates ultra-realistic multi-speaker dialogue in a single pass, supporting voice cloning and non-verbal expressions.
Sesame AI Labs
Sesame AI Labs' open-source 1B-parameter conversational speech model using Llama architecture, with natural human-like intonation, multi-speaker support, and native HuggingFace Transformers integration.