Open Source
Explore the latest AI open-source projects from GitHub and HuggingFace.
SenseVoice is a speech foundation model from Alibaba's FunAudioLLM team that covers several speech understanding tasks: automatic speech recognition (ASR), spoken language identification (LID), speech emotion recognition (SER), and audio event detection (AED). It was trained on over 400,000 hours of data and supports more than 50 languages. It surpasses Whisper in recognition performance while running 7x faster than Whisper-small and 17x faster than Whisper-large. The model family includes SenseVoice-Small for low-latency ASR in 5 languages and SenseVoice-Large for high-precision ASR in 50+ languages.
ggml-org
Pure C/C++ port of OpenAI Whisper for edge deployment
SYSTRAN
High-performance Whisper reimplementation on CTranslate2, up to 4x faster than the reference openai/whisper implementation, with INT8 quantization support
m-bain
Fast Whisper-based ASR with word-level timestamps and multi-speaker diarization, 70x real-time speed
modelscope
Industrial-grade end-to-end speech recognition toolkit with 20+ pretrained models and 31-language support