Open Source
Explore the latest AI open-source projects from GitHub and HuggingFace. Discover projects across categories like LLM, Vision, Audio, and more.
489 projects
HKUDS
HKUDS's MIT-licensed multi-agent video generation framework with director, screenwriter, producer, and generator roles that assembles minute-scale narrative video from ideas, novels, or scripts.
p-e-w
Automated directional ablation tool that removes refusal behavior from open LLMs while preserving capability through Optuna-tuned per-layer ablation kernels.
OpenBMB
OpenBMB's tokenizer-free 2B-parameter TTS model emitting native 48 kHz audio across 30 languages with voice design, controllable cloning, and an OpenAI-compatible endpoint.
galilai-group
Open-source platform from galilai-group, with authors including Yann LeCun, that unifies data collection, training, and model-predictive control evaluation for world model research across 25+ environments.
LlamaIndex
Fast, local-first open-source document parser from run-llama with a Rust core, spatial text plus bounding boxes, bundled OCR, and bindings for Python, Node.js, Rust, and WASM.
NVlabs
NVIDIA's open vision-language model family using data-centric strategies, spanning Eagle, Eagle 2, Eagle 2.5 with 128K context, and the new LocateAnything generalist grounding model.
OpenMOSS
Open-source speech and sound generation model family from OpenMOSS with multilingual TTS, dialogue, real-time voice agents, voice design, and sound effects under a unified audio tokenizer.
GPUStack
Open-source GPU cluster manager that turns heterogeneous accelerators into a self-hosted, OpenAI-compatible model-as-a-service platform powered by vLLM, SGLang, and llama.cpp.
kvcache-ai (Moonshot AI)
KV-cache-centric LLM serving platform open-sourced by Moonshot AI, featuring disaggregated prefill and decode, RDMA-based KV transfer, and adapters for vLLM and SGLang.
LMCache
Open-source KV cache layer that lets LLM serving systems reuse the KV cache of previously processed tokens across replicas, reducing time-to-first-token for RAG, long-context, and multi-turn workloads.
ModelTC
Pure-Python LLM inference and serving framework from ModelTC with a tri-process asynchronous architecture, token-level KV cache management, and Nopad attention for high GPU utilization.
vLLM Project
Kubernetes-native control plane from the vLLM team for GenAI inference, providing autoscaling, cache-aware routing, distributed KV cache, and high-density LoRA adapter serving.