Open Source

Explore the latest AI open-source projects from GitHub and HuggingFace.

Evermx

Latest AI/LLM news and in-depth reviews.
We analyze usability, potential, and trade-offs.

info@evermx.com

LLM

Claude
Gemini
GPT
Llama
Other LLM

Official Sites

Anthropic (Claude)
Google AI (Gemini)
OpenAI (GPT)
Meta AI (Llama)
Hugging Face

About Editorial Policy Contact Privacy Policy Terms of Service

Reviews Tools Open Source Live Official Profile

Qwen3-Omni - Open Source | Evermx | Evermx

Back to Open Source

Trending

Qwen3-Omni

QwenLMApache-2.0

View on GitHub

Multimodal3.6K Stars240 Forks127 views

Qwen3-Omni is a natively end-to-end omni-modal LLM from Alibaba Cloud's Qwen team that understands text, audio, images, and video while generating speech in real time. Supporting 119 text languages, 19 speech input languages, and 10 speech output languages, it delivers state-of-the-art performance across multimodal benchmarks. The model processes mixed multimodal inputs simultaneously and produces streaming responses, making it suitable for conversational AI, captioning, translation, and real-time applications.

Key Features

Natively end-to-end omni-modal architecture without adapter modules
Real-time speech generation with 10 language output support
Mixed multimodal inputs: text, audio, images, and video simultaneously
119 text languages and 19 speech input languages for global coverage

Related Projects

TrendingMultimodal

GitHub

80.3K11.7K

Deep-Live-Cam

hacksider

Real-time AI face swap and one-click video deepfake with only a single image

MoneyPrinterTurbo

harry0703

AI-powered short video generator that automates scripting, footage sourcing, subtitles, and composition — supporting 10+ LLM providers and batch production.

BitNet

microsoft

Microsoft's official 1-bit LLM inference framework achieving 1.37x-6.17x speedup and up to 82% energy reduction, enabling 100B parameter models to run on consumer CPUs.

MIT245