Open Source
Explore the latest AI open-source projects from GitHub and HuggingFace.
Qwen3-Omni is a natively end-to-end omni-modal LLM from Alibaba Cloud's Qwen team that understands text, audio, images, and video while generating speech in real time. Supporting 119 text languages, 19 speech input languages, and 10 speech output languages, it delivers state-of-the-art performance across multimodal benchmarks. The model processes mixed multimodal inputs simultaneously and produces streaming responses, making it suitable for conversational AI, captioning, translation, and real-time applications.
hacksider
Real-time AI face swap and one-click video deepfake with only a single image.
harry0703
AI-powered short video generator that automates scripting, footage sourcing, subtitles, and composition — supporting 10+ LLM providers and batch production.
microsoft
Microsoft's official 1-bit LLM inference framework achieving 1.37x-6.17x speedup and up to 82% energy reduction, enabling 100B parameter models to run on consumer CPUs.
bytedance
ByteDance's open-source multimodal AI agent stack for GUI automation with vision-language model integration.