Open Source
Explore the latest AI open-source projects from GitHub and HuggingFace.
Amphion is an open-source toolkit by OpenMMLab for audio, music, and speech generation research and production. It covers a broad range of generation tasks including text-to-speech, voice conversion, singing voice synthesis, text-to-audio, and vocoder training, with support for architectures like VALL-E, NaturalSpeech2, MaskGCT, and Vevo for zero-shot capabilities. Pre-trained models are available on HuggingFace and ModelScope, making it accessible for both researchers and engineers.
FunAudioLLM
Multilingual LLM-based TTS with zero-shot voice cloning, support for 9 languages, and 150 ms streaming latency.
pipecat-ai
Open-source Python framework for real-time voice and multimodal conversational AI with 20+ STT/TTS/LLM providers and 10.6K+ GitHub stars.
Blaizzy
Comprehensive TTS, STT, and STS library optimized for Apple Silicon, with 9+ TTS models, 8+ STT models, voice cloning, quantization, and an OpenAI-compatible API.
MoonshotAI
An open-source 7B audio foundation model by MoonshotAI that excels in audio understanding, generation, and conversation, pre-trained on over 13 million hours of diverse audio data.