Open Source
Explore the latest AI open-source projects from GitHub and HuggingFace.
Amphion is an open-source toolkit by OpenMMLab for audio, music, and speech generation research and production. It covers a broad range of generation tasks, including text-to-speech, voice conversion, singing voice synthesis, text-to-audio, and vocoder training, and supports architectures such as VALL-E, NaturalSpeech2, MaskGCT, and Vevo, which provide zero-shot capabilities. Pre-trained models are available on HuggingFace and ModelScope, making the toolkit accessible to both researchers and engineers.
myshell-ai
Instant voice cloning framework by MIT and MyShell with 36k+ GitHub stars, enabling zero-shot cross-lingual voice replication from just seconds of reference audio.
FunAudioLLM
Multilingual LLM-based TTS with zero-shot voice cloning, support for 9 languages, and 150 ms streaming latency.
speechbrain
Comprehensive PyTorch speech toolkit supporting 16+ tasks, from ASR to TTS, with 200+ training recipes.
Vaibhavs10
Blazing-fast Whisper transcription CLI that processes 150 minutes of audio in 98 seconds using Flash Attention 2.