Open Source
Explore the latest AI open-source projects from GitHub and HuggingFace.
Spark NLP is a state-of-the-art, production-grade Natural Language Processing library built on Apache Spark that scales seamlessly across distributed environments. It ships with over 100,000 pretrained pipelines and models covering 200+ languages, supporting tasks such as named entity recognition, sentiment analysis, machine translation, question answering, and LLM-based text generation. Built by John Snow Labs, it bridges enterprise-grade NLP reliability with modern transformer and LLM capabilities.
langflow-ai
Open-source visual platform for building AI agents and workflows, with a drag-and-drop interface, full Python customization, and 145k+ GitHub stars.
microsoft
Lightweight Python utility from Microsoft that converts virtually any file format—PDFs, Word docs, PowerPoints, images, audio, and web pages—into clean, token-efficient Markdown for LLM integration.
firecrawl
Open-source Web Data API for AI that converts websites into LLM-ready markdown, structured JSON, and screenshots with 96% web coverage.
microsoft
Microsoft's lightweight Python utility that converts PDFs, Office docs, images, audio, and more into clean Markdown optimized for LLM pipelines, with MCP server integration.