Omnilingual ASR

facebookresearch | Apache-2.0

STT | 2.7K Stars | 243 Forks | 130 views

Omnilingual ASR is an open-source multilingual automatic speech recognition (ASR) system from Meta's FAIR (Fundamental AI Research) lab. It supports over 1,600 languages, including hundreds not previously covered by any ASR technology. New languages can be added zero-shot from a small number of paired audio-text examples, which opens voice interfaces to communities that digital speech tools have so far left out.

The system provides multiple model families to suit different accuracy and latency requirements: W2V (self-supervised learning) models for data-scarce languages, CTC-based architectures for fast batch inference, and LLM-based models for the highest-accuracy transcription. Model sizes range from 300M to 7B parameters. The v2 release introduced unlimited audio length support, removing the earlier constraint on input duration.

As of the December 2025 v2 update, the LLM-based variant achieves a character error rate below 10% for 78% of its 1,600+ supported languages. The real-time factor approaches 1x for LLM variants, which makes real-time or near-real-time transcription practical across the full language set.

The project integrates with HuggingFace datasets (CC-BY-4.0 licensed data) and provides an inference pipeline that supports batch processing, language conditioning, and zero-shot recognition for unseen languages. Released under Apache 2.0, Omnilingual ASR is freely available for both research and commercial use.

With about 2,700 GitHub stars and 243 forks, the project has attracted attention from linguists, AI researchers, accessibility advocates, and developers building voice applications for underserved language communities. It is particularly valuable for NGOs, governments, and localization teams working with indigenous, endangered, or low-resource languages that commercial speech APIs typically do not support.
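To make the "character error rate below 10% for 78% of languages" figure concrete, here is a minimal sketch of how such a statistic is commonly computed. It assumes the standard definition of character error rate (character-level edit distance divided by reference length); the function names and the sample values are hypothetical and are not taken from the Omnilingual ASR codebase or its evaluation results.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Character-level Levenshtein distance (insertions, deletions, substitutions)."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution_cost = 0 if ref_char == hyp_char else 1
            current.append(
                min(
                    previous[j] + 1,                      # delete a reference character
                    current[j - 1] + 1,                   # insert a hypothesis character
                    previous[j - 1] + substitution_cost,  # match or substitute
                )
            )
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER = edit distance / number of characters in the reference transcript."""
    if not reference:
        raise ValueError("reference transcript must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


def share_below_threshold(cer_by_language: dict[str, float], threshold: float = 0.10) -> float:
    """Fraction of languages whose CER is strictly below the threshold."""
    if not cer_by_language:
        raise ValueError("no languages given")
    below = sum(1 for value in cer_by_language.values() if value < threshold)
    return below / len(cer_by_language)


# Hypothetical per-language CER values, for illustration only.
example_scores = {"lang_a": 0.05, "lang_b": 0.20, "lang_c": 0.09, "lang_d": 0.10}
coverage = share_below_threshold(example_scores)
```

In this sketch a language counts toward the coverage figure only if its CER is strictly under the threshold, so a language sitting at exactly 10% is excluded. Whether the published figure uses a strict or inclusive comparison is not stated in the description above.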

Key Features

Speech recognition for 1,600+ languages, including hundreds not previously covered by ASR systems
Zero-shot language addition with minimal paired examples
Multiple model families: W2V (SSL), CTC, and LLM-based (300M–7B parameters)
Unlimited audio length support in v2 models
Character error rate below 10% for 78% of supported languages (LLM-based variant, v2)
Real-time factor approaching 1x for LLM variants
HuggingFace dataset integration and batch processing pipeline
Apache 2.0 license for research and commercial use
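The real-time factor listed above can be measured with a few lines of code. This is a minimal sketch that assumes the common convention of processing time divided by audio duration, so a value near 1 means transcription takes about as long as the audio itself. The `transcribe` callable is a hypothetical placeholder for whatever inference call is used; it is not the Omnilingual ASR API.

```python
import time
from typing import Callable


def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """RTF = time spent transcribing / duration of the audio being transcribed."""
    if audio_seconds <= 0:
        raise ValueError("audio duration must be positive")
    if processing_seconds < 0:
        raise ValueError("processing time must not be negative")
    return processing_seconds / audio_seconds


def measure_rtf(transcribe: Callable[[], object], audio_seconds: float) -> float:
    """Time one call to a (hypothetical) transcription function and return its RTF."""
    start = time.perf_counter()
    transcribe()
    elapsed = time.perf_counter() - start
    return real_time_factor(elapsed, audio_seconds)


# Example with made-up numbers: 30 s of compute for 60 s of audio.
example_rtf = real_time_factor(30.0, 60.0)
```

A single timed call is noisy; in practice the measurement would be averaged over several files and reported together with the hardware and batch size used.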