Open Source
Explore the latest AI open-source projects from GitHub and HuggingFace.
The Production Agentic RAG Course is an open-source, hands-on learning resource that teaches developers how to build production-grade Retrieval-Augmented Generation (RAG) systems from scratch. With 3,600 GitHub stars and nearly 1,000 forks, it has gained significant traction among developers who want to move beyond tutorial-level RAG implementations to systems that can handle real-world production demands.

## The Production-First Approach

What sets this course apart from typical RAG tutorials is its emphasis on the professional development path. Rather than starting with vector databases and embeddings (the approach most RAG tutorials take), this course begins with solid search foundations using keyword-based retrieval (BM25) before layering on semantic capabilities. This mirrors how production search systems are actually built in industry, where keyword search provides a reliable baseline that semantic search enhances rather than replaces.

The course uses an arXiv Paper Curator as its running project: a practical application that fetches, indexes, and enables intelligent querying of academic papers. This is not a contrived example but a genuinely useful tool that participants build incrementally over seven weeks.
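To make the "keyword baseline first" idea concrete, here is a minimal, self-contained sketch of Okapi BM25 scoring over tokenized documents. This is an illustration of the algorithm the course uses via OpenSearch, not the course's own code; the `k1` and `b` defaults shown are the commonly cited values, and the toy corpus is invented for the example.

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Score each document against the query with Okapi BM25.

    `docs` is a list of token lists; returns one score per document.
    """
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    # Document frequency: how many docs contain each query term.
    df = {t: sum(1 for d in docs if t in d) for t in query_terms}
    scores = []
    for doc in docs:
        tf = Counter(doc)
        score = 0.0
        for t in query_terms:
            if df[t] == 0:
                continue  # term appears nowhere; contributes nothing
            idf = math.log((n - df[t] + 0.5) / (df[t] + 0.5) + 1)
            num = tf[t] * (k1 + 1)
            den = tf[t] + k1 * (1 - b + b * len(doc) / avgdl)
            score += idf * num / den
        scores.append(score)
    return scores

docs = [
    "sparse retrieval with bm25 ranks documents by term overlap".split(),
    "dense embeddings capture semantic similarity between texts".split(),
    "bm25 is a strong keyword baseline for search".split(),
]
print(bm25_scores(["bm25", "baseline"], docs))
```

Because BM25 rewards exact term matches with saturating term frequency and length normalization, it gives predictable, explainable rankings, which is exactly why it makes a dependable baseline before semantic layers are added.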
## Seven-Week Curriculum

| Week | Focus Area |
|------|------------|
| Week 1 | Infrastructure setup with Docker, FastAPI, PostgreSQL, and OpenSearch |
| Week 2 | Data ingestion pipeline for automated arXiv paper fetching and parsing |
| Week 3 | Keyword search implementation with OpenSearch and BM25 scoring |
| Week 4 | Document chunking strategies and hybrid search combining keyword and semantic retrieval |
| Week 5 | Complete RAG pipeline with local LLM integration via Ollama and streaming responses |
| Week 6 | Production monitoring with Langfuse tracing and Redis caching for performance optimization |
| Week 7 | Agentic RAG using LangGraph with guardrails, query rewriting, and Telegram bot deployment |

## Production-Grade Infrastructure

The course infrastructure stack reflects real production architectures. Docker Compose orchestrates all services, Apache Airflow 3.0 manages data pipeline scheduling, OpenSearch 2.19 handles both keyword and vector search, and FastAPI provides the API layer. Every component is containerized and reproducible, meaning participants can tear down and rebuild the entire system reliably.

## Local-First LLM Integration

The RAG system integrates with Ollama for local LLM inference, eliminating API costs during development and learning. This deliberate choice makes the course accessible to developers who cannot afford commercial API usage during extended learning periods. The architecture is modular enough that swapping in a commercial API for production use requires minimal code changes.

## Agentic RAG in Week 7

The final week introduces agentic patterns using LangGraph for state-based workflow orchestration. The agent can grade document relevance, rewrite queries when initial retrieval fails, implement guardrails to prevent hallucination, and adaptively choose retrieval strategies based on query characteristics.
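The Week 4 hybrid search step has to merge a keyword result list and a vector result list whose scores live on different scales. One widely used way to do that is reciprocal rank fusion (RRF), sketched below; this is an illustrative technique, not necessarily the exact fusion the course or OpenSearch applies, and the document IDs are invented.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of doc IDs into one list.

    Each ranking is a list of IDs, best first. RRF combines keyword
    (BM25) and vector result lists by rank position alone, so the two
    score scales never need to be calibrated against each other.
    """
    fused = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    # Highest fused score first.
    return sorted(fused, key=fused.get, reverse=True)

keyword_hits = ["paper_17", "paper_03", "paper_42"]   # BM25 order
semantic_hits = ["paper_03", "paper_99", "paper_17"]  # vector order
print(reciprocal_rank_fusion([keyword_hits, semantic_hits]))
```

A document that appears near the top of both lists accumulates two sizable reciprocal-rank contributions, so agreement between the keyword and semantic retrievers is rewarded without any score normalization.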
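The Week 7 agent behavior described above (grade relevance, rewrite the query on failure, bound the retries) reduces to a small control loop. The sketch below shows that loop in plain Python with hypothetical stand-in callables; in the course, a LangGraph state graph wires LLM-backed nodes together instead.

```python
def agentic_answer(query, retrieve, grade, rewrite, generate, max_rewrites=2):
    """Illustrative grade-and-rewrite loop behind an agentic RAG step.

    `retrieve`, `grade`, `rewrite`, and `generate` are stand-ins for
    the LLM-backed nodes a LangGraph workflow would orchestrate.
    """
    for _ in range(max_rewrites + 1):
        docs = retrieve(query)
        if grade(query, docs):          # are the retrieved docs relevant?
            return generate(query, docs)
        query = rewrite(query)          # reformulate and try again
    # Guardrail: refuse rather than hallucinate an unsupported answer.
    return "I could not find relevant papers for this question."

# Toy stand-ins that demonstrate the control flow:
corpus = {"transformer attention": ["Attention Is All You Need"]}
result = agentic_answer(
    "what are attention mechanisms",
    retrieve=lambda q: corpus.get(q, []),
    grade=lambda q, docs: bool(docs),
    rewrite=lambda q: "transformer attention",
    generate=lambda q, docs: f"Based on {docs[0]}: ...",
)
print(result)
```

Capping the number of rewrites and falling back to an explicit refusal is the guardrail piece: the agent degrades to "no answer" instead of generating from an empty or irrelevant context.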
A Telegram bot integration provides a practical deployment target that demonstrates the system working in a real messaging context.

## Monitoring and Observability

Week 6 covers production monitoring with Langfuse, providing distributed tracing for every RAG pipeline step. This visibility is essential for debugging production RAG systems, where failures can occur at any stage: retrieval, reranking, context assembly, or generation. Redis caching is added to optimize repeated queries and reduce latency.

## Technical Requirements

The course requires Python 3.12+, Docker Desktop with Compose, 8GB+ RAM, and 20GB+ free disk space. The UV package manager is used for dependency management, reflecting modern Python development practices.
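The Week 6 caching idea (answer repeated queries from a cache with a time-to-live instead of re-running the pipeline) can be sketched without a Redis server. The hypothetical `QueryCache` class below uses an in-memory dict as a stand-in; production code would issue the equivalent Redis `GET`/`SETEX` calls, but the normalization and TTL logic is the same idea.

```python
import time

class QueryCache:
    """In-memory stand-in for a Redis query cache with a TTL."""

    def __init__(self, ttl_seconds=300):
        self.ttl = ttl_seconds
        self.store = {}

    def _key(self, query):
        # Normalize case and whitespace so trivially different
        # phrasings of the same query hit the same cache entry.
        return " ".join(query.lower().split())

    def get(self, query):
        entry = self.store.get(self._key(query))
        if entry is None:
            return None
        answer, expires_at = entry
        if time.monotonic() > expires_at:
            del self.store[self._key(query)]  # expired: evict and miss
            return None
        return answer

    def set(self, query, answer):
        self.store[self._key(query)] = (answer, time.monotonic() + self.ttl)

cache = QueryCache(ttl_seconds=300)
cache.set("What is RAG?", "Retrieval-Augmented Generation ...")
print(cache.get("  what is   RAG? "))  # normalized lookup hits the entry
```

Caching the final answer keyed on the normalized query skips the entire retrieve-rerank-generate pipeline on repeats, which is where the latency win comes from.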