PaddleOCR - Open Source | Evermx

PaddleOCR

PaddlePaddle | Apache-2.0

Vision | 83.0K Stars | 10.8K Forks | 137 views

PaddleOCR is a global-leading open-source OCR toolkit and Document AI engine. It converts PDF documents and images into structured, LLM-ready data in JSON and Markdown with industry-leading accuracy, serving as the bedrock for intelligent RAG and agentic applications.

## Why PaddleOCR Matters

Real-world documents are messy: scanned PDFs, tables, formulas, seals, charts, and dozens of languages mixed together. PaddleOCR turns that visual chaos into clean, structured data an LLM can actually use. With 80,000+ GitHub stars and adoption by top-tier projects such as Dify, RAGFlow, and Cherry Studio, it has become a default building block for document-centric AI pipelines.

## SOTA Document Vision-Language Model

At the core is PaddleOCR-VL-1.6, a lightweight 0.9B vision-language model purpose-built for document parsing. It reaches 96.3% accuracy on OmniDocBench v1.6 and leads in text, formula, and table recognition, with markedly stronger handling of ancient documents, rare characters, seals, and charts, all emitted as structured Markdown and JSON.

## Structure-Aware Document Conversion

PP-StructureV3 provides structure-aware conversion that turns complex PDFs and images into Markdown or JSON while preserving layout. It offers finer-grained control over reading order, tables, and nested elements, making the output reliable for retrieval-augmented generation rather than a flat dump of text.

## Broad Language and Hardware Support

PaddleOCR supports 100+ languages and runs across CPU, GPU, XPU, and NPU hardware on Linux, Windows, and macOS. The toolkit spans the full pipeline from text detection and recognition to key information extraction and document translation, so teams can deploy on-premise without locking into a proprietary cloud API.

## A Mature, Trusted Ecosystem

Beyond raw models, PaddleOCR ships training tools, pretrained pipelines, and integrations used by thousands of downstream repositories. Its Apache-2.0 license and active maintenance make it a dependable foundation for production document intelligence.
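
To illustrate why structure-preserving Markdown is more useful for retrieval-augmented generation than a flat text dump, here is a minimal sketch of heading-aware chunking. It uses only the Python standard library and does not call PaddleOCR; the `Chunk` type, the `chunk_markdown` function, and the sample text are hypothetical names invented for this illustration. It also assumes the Markdown uses `#`-style headings and contains no fenced code blocks with `#` lines inside them.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass
class Chunk:
    heading_path: tuple[str, ...]  # headings from outermost to innermost
    text: str                      # body text found under that heading


HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def chunk_markdown(markdown: str) -> list[Chunk]:
    """Split Markdown into chunks, each tagged with its heading trail."""
    chunks: list[Chunk] = []
    path: list[tuple[int, str]] = []  # stack of (level, title)
    body: list[str] = []

    def flush() -> None:
        # Emit the accumulated body (if any) under the current heading trail.
        text = "\n".join(body).strip()
        if text:
            chunks.append(Chunk(tuple(title for _, title in path), text))
        body.clear()

    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            flush()
            level = len(match.group(1))
            # Drop headings at the same or a deeper level before descending.
            while path and path[-1][0] >= level:
                path.pop()
            path.append((level, match.group(2).strip()))
        else:
            body.append(line)
    flush()
    return chunks


if __name__ == "__main__":
    sample = "# Invoice\nIssued to ACME.\n\n## Notes\nPay within 30 days.\n"
    for chunk in chunk_markdown(sample):
        print(" > ".join(chunk.heading_path), "|", chunk.text)
```

Keeping the heading trail with each chunk lets a retriever show where a passage came from in the document, which is the kind of context a flat dump of text loses.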

Key Features

Converts PDFs and images into structured, LLM-ready JSON and Markdown
PaddleOCR-VL-1.6 (0.9B) document VLM at 96.3% on OmniDocBench v1.6
PP-StructureV3 structure-aware conversion preserving layout and tables
Strong recognition of text, formulas, tables, seals, charts, and rare characters
100+ language support for global document workflows
Runs on CPU, GPU, XPU, and NPU across Linux, Windows, and macOS
Used by Dify, RAGFlow, Cherry Studio, and 6k+ downstream repositories
Apache-2.0 licensed for on-premise, API-free deployment
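
As a rough sketch of the PDF-to-Markdown/JSON conversion listed above, the snippet below assumes the PaddleOCR 3.x Python package and the `PPStructureV3` pipeline class described in its documentation, including the `predict`, `save_to_json`, and `save_to_markdown` calls. Class names and arguments may differ between releases, so check the installed version's documentation. The wrapper `convert_document`, the `output` directory, and the `sample.pdf` path are hypothetical.

```python
def convert_document(input_path: str, out_dir: str = "output") -> None:
    """Convert one PDF or image into JSON and Markdown files under out_dir."""
    # Imported lazily so this module loads even where paddleocr is not installed.
    # Assumption: PaddleOCR 3.x exposes PPStructureV3 at the package top level.
    from paddleocr import PPStructureV3

    pipeline = PPStructureV3()
    # predict() yields one result object per page or image.
    for res in pipeline.predict(input=input_path):
        res.save_to_json(save_path=out_dir)
        res.save_to_markdown(save_path=out_dir)


if __name__ == "__main__":
    convert_document("sample.pdf")  # hypothetical input file
```

Running the script requires the `paddleocr` package and its model downloads.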

Related Projects

ComfyUI | Comfy-Org | GPL-3.0 | Vision | 108.4K Stars | 12.6K Forks


RuView

Ultralytics YOLO

Roboflow Supervision