Open Source
Explore the latest AI open-source projects from GitHub and HuggingFace. Discover projects across categories like LLM, Vision, Audio, and more.
489 projects
turboderp
MIT-licensed quantization and inference library for running large language models on single consumer-class NVIDIA GPUs, with the new EXL3 format and Marlin-inspired memory-bound GEMM kernels.
fastgs
CVPR 2026 Highlight: official implementation of 'FastGS' — a general framework that trains 3D Gaussian Splatting scenes in ~100 seconds at PSNR parity with the Inria reference. MIT licensed, 1,100+ stars.
zai-org
Z.ai's open-source GLM-4.1V/4.5V/4.6V vision-language family with explicit 'thinking' reasoning mode and RLCS training, released under Apache 2.0 with 2,300+ GitHub stars.
speaches-ai
MIT-licensed self-hosted server that speaks the OpenAI Audio API, with faster-whisper for streaming transcription and translation plus piper and Kokoro for TTS. 3,300+ stars and 398 forks.
Rohit Ghumare
Apache 2.0 persistent memory layer for AI coding agents. Runs as a single local server on port 3111, talks to Claude Code, Cursor, Codex, Gemini CLI, OpenClaw, Hermes, pi, and OpenCode via 53 MCP tools and 12 auto-hooks. Benchmarked at 95.2 percent recall and 92 percent fewer tokens.
Google Chrome DevTools
Google's official Apache 2.0 MCP server that gives coding agents the full Chrome DevTools surface — performance traces with CrUX field data, source-mapped console and network inspection, and Puppeteer-driven automation. Works with Claude Code, Antigravity, Cursor, Codex, Copilot, and any MCP client.
Jesse Vincent / Prime Radiant
MIT-licensed agentic skills framework that gives Claude Code, Codex, Cursor, Gemini CLI, OpenCode, and Copilot CLI a TDD-enforced, spec-driven, subagent-orchestrated software development methodology. Distributed via the official Anthropic plugin marketplace.
LightSeek Foundation
MIT-licensed speed-of-light LLM inference engine from the LightSeek Foundation, targeting TensorRT-LLM performance with vLLM usability. Reported as 9 to 11 percent faster than TensorRT-LLM when serving Kimi K2.5 on NVIDIA B200. Its MLA kernel has already been adopted by vLLM.
Shanghai AI Laboratory (InternLM)
Shanghai AI Lab's Apache 2.0 LLM toolkit centered on the C++ TurboMind engine. v0.13.0 adds TurboQuant KV cache, Qwen3.5 MoE on Blackwell, Anthropic-compatible endpoints, and prefill-starvation fixes. Supports NVIDIA, Ascend, ROCm, Cambricon, and MetaX MACA.
JD.com
JD.com's Apache 2.0 inference engine for LLMs, VLMs, DiT, and recommendation models, optimized for Huawei Ascend, Cambricon MLU, MThreads MUSA, and NVIDIA CUDA. Production-validated across JD Retail's customer service and recommendation workloads.
antirez
Salvatore Sanfilippo's MIT-licensed local inference engine for DeepSeek V4 Flash, in pure C with Metal, CUDA, and ROCm backends. Asymmetric 2-bit quantization fits the model in 96 GB. 11.7k stars in under three weeks.
NVIDIA Labs
NVIDIA Labs' open-source infrastructure for long autoregressive video generation, released under Apache 2.0. Features NVFP4 W4A4 quantization, balanced sequence parallelism, and multi-shot training, with 1.3B to 5B models reaching 45.7 FPS when quantized.