RAG-Anything

MIT

Other | 16.4K Stars | 2.0K Forks | 200 views

## Overview

RAG-Anything is a retrieval-augmented generation (RAG) framework developed by HKUDS Lab that addresses the fragmentation of modern RAG pipelines. Where most RAG systems handle only plain text, RAG-Anything natively processes diverse content modalities (PDFs, Office documents, images, tables, mathematical equations, and charts) within a unified architecture. With over 16,000 GitHub stars, it has quickly become one of the most-watched RAG projects in the AI open-source community.

## Key Features

- **Multimodal Processing Pipeline**: Dedicated processors for text, visual content, structured tables, mathematical expressions (LaTeX), and charts operate in a coordinated pipeline without requiring separate tooling.
- **Universal Document Support**: Handles PDFs, Word/Excel/PowerPoint files, images (JPG, PNG, WebP, TIFF), plain text, and Markdown files out of the box.
- **Knowledge Graph Integration**: Automatically extracts entities and relationships across modalities, enabling graph-based reasoning on top of vector retrieval.
- **Flexible Parser Backend**: Supports MinerU, Docling, and PaddleOCR parsers, so users can swap components based on performance or licensing needs.
- **Hybrid Retrieval**: Combines dense vector similarity search with graph traversal to return results that are both more complete and more contextually accurate.

## Use Cases

RAG-Anything is best suited to enterprise and research scenarios where documents are inherently heterogeneous: financial reports mixing tables and narrative text, academic papers with equations and figures, technical documentation with diagrams, and legal filings with mixed layouts. It is particularly valuable for organizations building internal knowledge bases where document uniformity cannot be assumed.

## Technical Details

The framework is built in Python and integrates with popular vector stores and LLM backends. The knowledge graph adds a semantic layer above raw embeddings, enabling multi-hop reasoning. Processing pipelines are modular, allowing organizations to customize extraction for specific document types. The repository has 1,970+ forks, which suggests broad adoption and active derivative work.

## Getting Started

```bash
# Install with all optional dependencies
# (quoted so shells such as zsh do not expand the square brackets)
pip install 'raganything[all]'

# LibreOffice is required for Office document processing
# sudo apt install libreoffice
```

```python
# Basic usage (simplified sketch; check the project README for the exact
# constructor arguments and method names, which may differ from those shown here)
from raganything import RAGAnything

rag = RAGAnything()
rag.ingest("document.pdf")
result = rag.query("What are the key findings?")
```
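To make the hybrid retrieval idea from Key Features more concrete, here is a minimal, self-contained sketch of the general technique: rank chunks by dense vector similarity, then expand the top hits by walking a small knowledge graph. This is not RAG-Anything's implementation. The chunk IDs, embeddings, graph edges, and the `hybrid_retrieve` function are all hypothetical and exist only for illustration.

```python
import math

# Toy corpus: chunk id -> (embedding, text). All values are made up.
CHUNKS = {
    "c1": ([1.0, 0.0, 0.0], "Revenue grew in Q3."),
    "c2": ([0.9, 0.1, 0.0], "Table 2: quarterly revenue by region."),
    "c3": ([0.0, 1.0, 0.0], "Figure 4 shows the model architecture."),
    "c4": ([0.0, 0.0, 1.0], "Equation 1 defines the loss."),
}

# Toy knowledge graph: undirected links between chunks that share an entity.
EDGES = {
    "c1": {"c2"},
    "c2": {"c1", "c4"},
    "c3": set(),
    "c4": {"c2"},
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def hybrid_retrieve(query_vec, top_k=1, hops=1):
    """Return chunk ids: dense top-k seeds first, then graph neighbors."""
    # Step 1: dense retrieval, ranking every chunk by similarity to the query.
    ranked = sorted(
        CHUNKS,
        key=lambda cid: cosine(query_vec, CHUNKS[cid][0]),
        reverse=True,
    )
    seeds = ranked[:top_k]

    # Step 2: breadth-first expansion over the graph, one hop at a time.
    selected = list(seeds)
    frontier = list(seeds)
    for _ in range(hops):
        next_frontier = []
        for cid in frontier:
            for neighbor in sorted(EDGES[cid]):
                if neighbor not in selected:
                    selected.append(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return selected


if __name__ == "__main__":
    for cid in hybrid_retrieve([1.0, 0.0, 0.0], top_k=1, hops=2):
        print(cid, CHUNKS[cid][1])
```

In this toy setup, a revenue-like query seeds on the narrative chunk, and the graph walk then pulls in the linked table and equation chunks even though their embeddings are further from the query. That is the kind of multi-hop context that vector search alone would miss.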

## Related Projects

- Superpowers
- Langflow
- Open WebUI
- MarkItDown