Open Source
Explore the latest AI open-source projects from GitHub and HuggingFace.
AReaL is an open-source, fully asynchronous reinforcement learning training system for large reasoning and agentic models, developed by researchers from Tsinghua IIIS and the AReaL Team at Ant Group. With 4,100 GitHub stars and growing, it has quickly become one of the most notable RL training frameworks designed specifically for improving LLM reasoning capabilities.

## What AReaL Does

AReaL addresses a critical gap in the LLM training pipeline: how to efficiently apply reinforcement learning to improve reasoning and agentic capabilities in large language models. Traditional RL training systems for LLMs suffer from synchronization bottlenecks, where generation workers sit idle while training workers update the model, and vice versa. AReaL eliminates this through a fully asynchronous architecture in which LLM generation runs in a streaming manner: rollout workers continuously produce new outputs while trainer workers run parallel model updates whenever a training batch becomes available.

## Key Architecture

The system uses an algorithm-system co-design approach that separates the generation (rollout) and training phases into independent, asynchronous pipelines. As a result, GPU utilization stays consistently high throughout training rather than oscillating between computation and idle states as in synchronous systems.

The v0.3 release (codenamed boba-squared) achieves a 2.77x speedup over synchronous RL training systems while delivering comparable or superior model performance. This speedup is not just a micro-benchmark result: it translates directly into reduced training time and cloud computing costs for organizations training reasoning models.
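The decoupling described above is essentially a producer/consumer pattern: generation streams samples into a buffer, and training consumes batches whenever one is ready, so neither side blocks the other. A minimal sketch of this pattern (illustrative only, not AReaL's actual API; the worker functions and shared `model_version` stand-in are hypothetical):

```python
import queue
import threading

def rollout_worker(model_version, out_q, n_rollouts):
    # Generation streams continuously; it never waits for the trainer.
    for i in range(n_rollouts):
        out_q.put({"version": model_version[0], "tokens": f"rollout-{i}"})

def trainer(model_version, in_q, batch_size, n_updates, log):
    # Train whenever a full batch is available, in parallel with generation.
    for step in range(n_updates):
        batch = [in_q.get() for _ in range(batch_size)]
        model_version[0] += 1  # simulate a weight update
        log.append((step, len(batch)))

model_version = [0]  # shared stand-in for the current weights
q = queue.Queue()
log = []
gen = threading.Thread(target=rollout_worker, args=(model_version, q, 8))
tr = threading.Thread(target=trainer, args=(model_version, q, 4, 2, log))
gen.start(); tr.start()
gen.join(); tr.join()
print(log)  # two updates of 4 samples each
```

Because rollouts keep arriving while weights change, some samples are generated by slightly stale policy versions; handling that staleness correctly is the algorithmic side of AReaL's co-design.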
## Supported Training Workflows

AReaL supports multiple RL algorithms and training paradigms:

| Algorithm | Use Case |
|-----------|----------|
| PPO | Standard policy optimization for reasoning |
| GRPO | Group-relative policy optimization |
| GSPO | Group sequence policy optimization |
| DPO | Direct preference optimization |
| Multi-turn Agentic RL | Training agents that use tools and interact with environments |

## State-of-the-Art Results

The AReaL team has demonstrated strong results across multiple domains. Their trained 7B and 32B parameter models achieve state-of-the-art performance on mathematical reasoning benchmarks. The system has also been validated on coding tasks, search-based reasoning, and customer service agent scenarios, demonstrating versatility beyond pure mathematical reasoning.

## AReaL-lite for Accessibility

Recognizing that the full AReaL system can be complex for newcomers, the team released AReaL-lite, a simplified version with an algorithm-first API design. AReaL-lite uses 80% fewer lines of code while retaining 90% of the full system's performance and core functionality. It natively supports fully asynchronous agentic RL, making it accessible to individual researchers and smaller teams who want to experiment with RL-based LLM training.

## Practical Implications

For organizations and researchers working on reasoning-capable LLMs, AReaL provides production-ready training infrastructure that can significantly reduce the time and cost of RL training. The asynchronous architecture is particularly valuable for large-scale training runs, where GPU utilization efficiency directly impacts cloud computing bills. The multi-turn agentic RL support also positions AReaL as a framework for training the next generation of AI agents that can reason, plan, and use tools effectively.
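To make the "group-relative" idea behind GRPO in the table above concrete: instead of a learned value baseline, each sampled response is scored against the other responses in its group. A minimal sketch of that advantage computation (standard GRPO normalization, not taken from AReaL's codebase; the function name is ours):

```python
from statistics import mean, pstdev

def group_relative_advantages(rewards, eps=1e-8):
    # GRPO replaces a learned critic with group statistics: each rollout's
    # advantage is its reward standardized against its sibling rollouts.
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]

# Four rollouts of the same prompt, two correct (reward 1) and two wrong (0):
adv = group_relative_advantages([1.0, 0.0, 1.0, 0.0])
print([round(a, 2) for a in adv])  # [1.0, -1.0, 1.0, -1.0]
```

Because all rollouts in a group share one prompt, this baseline needs no extra model, which fits naturally with the streaming rollout workers described earlier.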

Shubhamsaboo
Collection of 100+ production-ready LLM apps with AI agents, RAG, voice agents, and MCP using OpenAI, Anthropic, Gemini, and open-source models.
infiniflow
Leading open-source RAG engine with deep document understanding, grounded citations, and agent capabilities, with 73K+ GitHub stars.