Open Source
Explore the latest AI open-source projects from GitHub and HuggingFace.
NeuTTS, developed by Neuphonic, is billed as the world's first super-realistic on-device speech language model for text-to-speech with instant voice cloning. Built on a compact 0.5B LLM backbone (Qwen 0.5B), it brings natural-sounding speech, real-time performance, and speaker cloning to local devices, unlocking a new category of embedded voice agents, assistants, and compliance-safe applications. The model uses the NeuCodec audio codec, which achieves exceptional audio quality at low bitrates using a single codebook.
RVC-Boss
Open-source WebUI for few-shot and zero-shot voice cloning and text-to-speech, producing a usable voice from as little as a 5-second sample.
Microsoft
Microsoft's MIT-licensed open frontier voice AI: 1.5B long-form TTS up to 90 minutes with 4 speakers, 0.5B streaming TTS at 300 ms latency, and 7B ASR for 60-minute single-pass transcription. 47k+ stars.
2noise
A dialogue-optimized open TTS model trained on 100,000+ hours of audio that adds fine-grained prosody (laughter, pauses, interjections), with multi-speaker and English/Chinese support.
suno-ai
Suno's fully generative text-to-audio model — speech, music, and sound effects from one transformer, with nonverbal cues like [laughs] and 100+ voice presets (MIT).