Open Source
Explore the latest AI open-source projects from GitHub and HuggingFace.
Quantized attention achieving a 2-5x speedup over FlashAttention without degrading end-to-end metrics across language, image, and video models. Accepted at ICLR 2025, ICML 2025, and NeurIPS 2025 Spotlight.
ollama
The simplest way to run LLMs locally, with 165K+ GitHub stars. One-command deployment, 100+ models, a REST API, and multi-platform support.
ggml-org
Pure C/C++ LLM inference engine supporting CPUs, Apple Silicon, CUDA, and Vulkan.
vLLM Project
A high-throughput, memory-efficient LLM inference and serving engine built around PagedAttention, with an OpenAI-compatible API and 200+ model support.
unslothai
2x faster LLM fine-tuning with 70% less VRAM via custom Triton kernels. Supports Llama, Qwen, DeepSeek, Gemma, and 500+ models.