Open Source
Explore the latest AI open-source projects from GitHub and HuggingFace.
Lightweight block diffusion model for speculative decoding that enables parallel token drafting for faster LLM generation. Compatible with vLLM, SGLang, Transformers, and MLX backends.
ollama
The simplest way to run LLMs locally, with 165K+ GitHub stars. Offers one-command deployment, 100+ models, a REST API, and multi-platform support.
ggml-org
Pure C/C++ LLM inference engine supporting CPUs, Apple Silicon, CUDA, and Vulkan.
sgl-project
SGLang is a high-performance open-source serving framework for LLMs with RadixAttention prefix caching, zero-overhead CPU scheduling, and prefill-decode disaggregation. It is deployed across 400,000+ GPUs at xAI, LinkedIn, and Cursor.
sgl-project
High-performance LLM and multimodal model serving framework with RadixAttention and structured generation.