Open Source
Explore the latest AI open-source projects from GitHub and HuggingFace.
SenseVoice is a speech foundation model from Alibaba's FunAudioLLM team that covers several speech understanding tasks: automatic speech recognition (ASR), spoken language identification (LID), speech emotion recognition (SER), and audio event detection (AED). It was trained on more than 400,000 hours of data and supports more than 50 languages. Its reported recognition performance surpasses Whisper, with inference 7x faster than Whisper-small and 17x faster than Whisper-large. The model family includes SenseVoice-Small for low-latency 5-language ASR and SenseVoice-Large for high-precision ASR across 50+ languages.
ggml-org
Pure C/C++ port of OpenAI Whisper for edge deployment
CJ Pais
A free, open-source, cross-platform speech-to-text app that transcribes your voice entirely offline — press a shortcut, speak, and have the text pasted into any app.
SYSTRAN
A CTranslate2-based reimplementation of OpenAI's Whisper that runs up to 4x faster at the same accuracy while using less memory. It adds 8-bit quantization, batched inference, and word-level timestamps, is MIT-licensed, and needs no separate FFmpeg installation.
m-bain
A fast, open ASR system that wraps Whisper to add accurate word-level timestamps, batched inference at up to 70x real time, and speaker diarization for subtitles and meetings.