Open Source
Explore the latest AI open-source projects from GitHub and HuggingFace.

LEANN is a lightweight personal vector database that achieves 97% storage savings compared to traditional solutions while maintaining search accuracy. Published at MLSys 2026, it uses graph-based selective recomputation with high-degree preserving pruning to compute embeddings on demand instead of storing them all. This allows indexing 60 million text chunks in just 6GB instead of 201GB. LEANN supports semantic search across local files, emails, browser history, chat logs (WeChat, iMessage), AI conversations (ChatGPT, Claude), live data (Slack, Twitter), and codebases, with intelligent language-aware chunking. Native MCP integration makes it a drop-in semantic search service for Claude Code. All data stays local with zero cloud costs, so nothing leaves your machine.
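The storage figures above can be sanity-checked with a few lines of arithmetic. The sketch below is not part of LEANN; the function names are hypothetical, and it assumes decimal gigabytes (1 GB = 10^9 bytes), which the description does not specify.

```python
def storage_savings(full_gb: float, compact_gb: float) -> float:
    """Fraction of storage saved by the compact index relative to the full one."""
    return 1.0 - compact_gb / full_gb


def bytes_per_chunk(index_gb: float, num_chunks: int) -> float:
    """Average index bytes per chunk, assuming 1 GB = 10**9 bytes."""
    return index_gb * 10**9 / num_chunks


NUM_CHUNKS = 60_000_000  # 60 million text chunks, from the description

savings = storage_savings(full_gb=201, compact_gb=6)
full_per_chunk = bytes_per_chunk(201, NUM_CHUNKS)
compact_per_chunk = bytes_per_chunk(6, NUM_CHUNKS)

print(f"savings: {savings:.1%}")                          # savings: 97.0%
print(f"full index: {full_per_chunk:.0f} bytes/chunk")    # 3350 bytes/chunk
print(f"compact index: {compact_per_chunk:.0f} bytes/chunk")  # 100 bytes/chunk
```

Under that assumption, the quoted sizes work out to roughly 3,350 bytes per chunk for the conventional index versus about 100 bytes per chunk for LEANN, which matches the stated 97% reduction.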
langflow-ai
Open-source visual platform for building AI agents and workflows, with 145k+ stars, a drag-and-drop interface, and full Python customization.
microsoft
Lightweight Python utility from Microsoft that converts virtually any file format—PDFs, Word docs, PowerPoints, images, audio, and web pages—into clean, token-efficient Markdown for LLM integration.
firecrawl
Open-source Web Data API for AI that converts websites into LLM-ready markdown, structured JSON, and screenshots with 96% web coverage.
microsoft
Microsoft's lightweight Python utility that converts PDFs, Office docs, images, audio, and more into clean Markdown optimized for LLM pipelines, with MCP server integration.