Open Source
Explore the latest AI open-source projects from GitHub and HuggingFace.
AReaL is an open-source, fully asynchronous reinforcement learning training system for large reasoning and agentic models, developed by researchers from Tsinghua IIIS and the AReaL Team at Ant Group. With 4,100 GitHub stars and growing, it has quickly become one of the most notable RL training frameworks designed specifically for improving LLM reasoning capabilities.

## What AReaL Does

AReaL addresses a critical gap in the LLM training pipeline: how to efficiently apply reinforcement learning to improve reasoning and agentic capabilities in large language models. Traditional RL training systems for LLMs suffer from synchronization bottlenecks, where generation workers sit idle while training workers update the model, and vice versa. AReaL eliminates this through a fully asynchronous architecture in which LLM generation runs in a streaming manner: rollout workers continuously produce new outputs while trainer workers run parallel model updates whenever a training batch becomes available.

## Key Architecture

The system uses an algorithm-system co-design approach that separates the generation (rollout) and training phases into independent, asynchronous pipelines. As a result, GPU utilization stays consistently high throughout training rather than oscillating between computation and idle states as in synchronous systems.

The v0.3 release (codenamed boba-squared) achieves a 2.77x speedup over synchronous RL training systems while delivering comparable or superior model performance. This speedup is not just a micro-benchmark result: it translates directly into reduced training time and cloud computing costs for organizations training reasoning models.
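The decoupling described above is essentially a producer/consumer pattern: generation streams samples into a buffer, and training consumes batches whenever one is ready, so neither side blocks the other. A minimal sketch of this pattern (illustrative only, not AReaL's actual API; the worker functions and shared `model_version` stand-in are hypothetical):

```python
import queue
import threading

def rollout_worker(model_version, out_q, n_rollouts):
    # Generation streams continuously; it never waits for the trainer.
    for i in range(n_rollouts):
        out_q.put({"version": model_version[0], "tokens": f"rollout-{i}"})

def trainer(model_version, in_q, batch_size, n_updates, log):
    # Train whenever a full batch is available, in parallel with generation.
    for step in range(n_updates):
        batch = [in_q.get() for _ in range(batch_size)]
        model_version[0] += 1  # simulate a weight update
        log.append((step, len(batch)))

model_version = [0]  # shared stand-in for the current weights
q = queue.Queue()
log = []
gen = threading.Thread(target=rollout_worker, args=(model_version, q, 8))
tr = threading.Thread(target=trainer, args=(model_version, q, 4, 2, log))
gen.start(); tr.start()
gen.join(); tr.join()
print(log)  # two updates of 4 samples each
```

Because rollouts keep arriving while weights change, some samples are generated by slightly stale policy versions; handling that staleness correctly is the algorithmic side of AReaL's co-design.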
## Supported Training Workflows

AReaL supports multiple RL algorithms and training paradigms:

| Algorithm | Use Case |
|-----------|----------|
| PPO | Standard policy optimization for reasoning |
| GRPO | Group-relative policy optimization |
| GSPO | Group sequence policy optimization |
| DPO | Direct preference optimization |
| Multi-turn Agentic RL | Training agents that use tools and interact with environments |

## State-of-the-Art Results

The AReaL team has demonstrated strong results across multiple domains. Their trained 7B and 32B parameter models achieve state-of-the-art performance on mathematical reasoning benchmarks. The system has also been validated on coding tasks, search-based reasoning, and customer service agent scenarios, demonstrating versatility beyond pure mathematical reasoning.

## AReaL-lite for Accessibility

Recognizing that the full AReaL system can be complex for newcomers, the team released AReaL-lite, a simplified version with an algorithm-first API design. AReaL-lite uses 80% fewer lines of code while retaining 90% of the full system's performance and core functionality. It natively supports fully asynchronous agentic RL, making it accessible to individual researchers and smaller teams who want to experiment with RL-based LLM training.

## Practical Implications

For organizations and researchers working on reasoning-capable LLMs, AReaL provides production-ready training infrastructure that can significantly reduce the time and cost of RL training. The asynchronous architecture is particularly valuable for large-scale training runs, where GPU utilization efficiency directly impacts cloud computing bills. The multi-turn agentic RL support also positions AReaL as a framework for training the next generation of AI agents that can reason, plan, and use tools effectively.
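To make the "group-relative" idea behind GRPO in the table above concrete: instead of a learned value baseline, each sampled response is scored against the other responses in its group. A minimal sketch of that advantage computation (standard GRPO normalization, not taken from AReaL's codebase; the function name is ours):

```python
from statistics import mean, pstdev

def group_relative_advantages(rewards, eps=1e-8):
    # GRPO replaces a learned critic with group statistics: each rollout's
    # advantage is its reward standardized against its sibling rollouts.
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]

# Four rollouts of the same prompt, two correct (reward 1) and two wrong (0):
adv = group_relative_advantages([1.0, 0.0, 1.0, 0.0])
print([round(a, 2) for a in adv])  # [1.0, -1.0, 1.0, -1.0]
```

Because all rollouts in a group share one prompt, this baseline needs no extra model, which fits naturally with the streaming rollout workers described earlier.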

Shubhamsaboo
Collection of 100+ production-ready LLM apps with AI agents, RAG, voice agents, and MCP using OpenAI, Anthropic, Gemini, and open-source models.
infiniflow
Leading open-source RAG engine with deep document understanding, grounded citations, and agent capabilities, with 73K+ GitHub stars.