Open Source
Explore the latest AI open-source projects from GitHub and HuggingFace.
NVIDIA Model Optimizer (formerly TensorRT Model Optimizer) is a unified library of state-of-the-art model optimization techniques, including quantization, pruning, distillation, speculative decoding, and sparsity. It compresses deep learning models for downstream deployment frameworks such as TensorRT-LLM, TensorRT, and vLLM, dramatically accelerating inference. The library supports highly performant quantization formats including NVFP4, FP8, INT8, and INT4, with advanced algorithms such as SmoothQuant, AWQ, and SVDQuant.
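The quantization the blurb mentions means mapping floating-point weights to low-bit integers plus a scale factor. As a rough illustration of the idea only, here is a minimal per-tensor symmetric INT8 quantization sketch in plain Python; it is not the Model Optimizer API, and real libraries use far more sophisticated calibration (e.g. the SmoothQuant and AWQ algorithms named above):

```python
def quantize_int8(weights):
    # Symmetric per-tensor quantization: one scale maps the
    # largest-magnitude weight to the INT8 extreme (127).
    scale = max(abs(w) for w in weights) / 127.0
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize_int8(q, scale):
    # Recover approximate floating-point values.
    return [x * scale for x in q]

weights = [0.5, -1.27, 0.02]
q, scale = quantize_int8(weights)
approx = dequantize_int8(q, scale)
```

Storing `q` (one byte per weight) plus a single `scale` is what shrinks the model; the deployment framework then runs matmuls directly on the integer values.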
ollama
The simplest way to run LLMs locally, with 165K+ GitHub stars: one-command deployment, 100+ models, a REST API, and multi-platform support.
ggml-org
Pure C/C++ LLM inference engine supporting CPUs, Apple Silicon, CUDA, and Vulkan.
sgl-project
High-performance LLM and multimodal model serving framework with RadixAttention and structured generation.
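RadixAttention, mentioned in the SGLang blurb, reuses cached KV state for prompts that share a prefix (e.g. a common system prompt across requests). As a toy illustration of the prefix-sharing idea only, and not SGLang's actual data structure, this sketch uses a trie of tokens and counts how many tokens of a new request hit already-cached state:

```python
class PrefixCache:
    """Toy prefix cache: a trie over token sequences. insert() returns
    how many leading tokens were already cached (i.e. reusable),
    then stores the new suffix."""

    def __init__(self):
        self.root = {}

    def insert(self, tokens):
        node, reused = self.root, 0
        still_matching = True
        for t in tokens:
            if still_matching and t in node:
                reused += 1        # this token's state is already cached
            else:
                still_matching = False
                node.setdefault(t, {})
            node = node[t]
        return reused

cache = PrefixCache()
cache.insert(["sys", "you", "are", "helpful", "Q1"])
reused = cache.insert(["sys", "you", "are", "helpful", "Q2"])  # 4-token shared prefix
```

In a real serving engine the payoff is that the reused prefix's attention KV cache is not recomputed, which is what makes shared-system-prompt workloads cheap.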
mlc-ai
Universal LLM deployment engine using ML compilation for cloud, mobile, and web.