slime

THUDM | Apache-2.0

LLM | 7.1K Stars | 1.0K Forks

slime is an open-source LLM post-training framework built for reinforcement learning at scale. Developed by THUDM (the team behind the GLM models), it connects Megatron for high-performance training with SGLang for fast rollout, giving researchers a single, coherent path for the RL loop instead of a loose stack of disconnected trainers, serving engines, and agent frameworks. With roughly 7,000 GitHub stars, slime has quickly become one of the most battle-tested open RL post-training stacks available.

## Megatron + SGLang in One Loop

slime's core idea is that training and data generation should reinforce each other. Megatron handles the main training process while SGLang serves rollouts, and both flow through the same training / rollout / Data Buffer path. By committing to a single rollout backend, slime can use SGLang-specific capabilities (routing, caching, disaggregation, and weight synchronization) directly, rather than flattening multiple inference engines into a lowest-common-denominator abstraction.

## Native Argument Pass-Through

A defining design choice is native pass-through. slime reads Megatron arguments directly and exposes every installed SGLang argument with a `--sglang-` prefix, so upstream improvements to either engine remain available without wrapper code. This keeps the framework lightweight and close to the engines it builds on, letting teams adopt new training and serving optimizations as the underlying projects evolve.

## Flexible Data Generation

slime treats math, code, search, tools, sandboxes, verifiers, environments, and multi-agent or long-horizon agentic workflows as data-generation or reward workflows that plug in without forking the training kernel. Custom data-generation interfaces and server-based engines make it possible to build arbitrary RL data pipelines, supporting everything from simple verifiable rewards to complex agentic rollouts.
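To make the "simple verifiable rewards" end of that spectrum concrete, here is a minimal sketch of a rule-based math reward. It is an illustration only: `Sample`, `extract_answer`, `math_reward`, and `score_batch` are hypothetical names, not slime's actual interfaces, and the boxed-answer convention is an assumption about how the model formats its final answer.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Sample:
    """Hypothetical rollout record; not slime's actual data type."""
    prompt: str
    response: str
    label: str
    reward: float = 0.0


# Matches a LaTeX-style boxed answer with no nested braces (an assumption).
_BOXED = re.compile(r"\\boxed\{([^{}]*)\}")


def extract_answer(response: str) -> Optional[str]:
    """Return the content of the last boxed answer in the response, if any."""
    matches = _BOXED.findall(response)
    return matches[-1].strip() if matches else None


def math_reward(sample: Sample) -> float:
    """Verifiable reward: 1.0 for an exact string match with the label, else 0.0."""
    answer = extract_answer(sample.response)
    if answer is not None and answer == sample.label.strip():
        return 1.0
    return 0.0


def score_batch(samples: List[Sample]) -> List[Sample]:
    """Attach a reward to every sample before it would be handed to training."""
    for sample in samples:
        sample.reward = math_reward(sample)
    return samples
```

A more involved workflow (a sandboxed code runner, a search tool, a multi-turn agent) would swap out `math_reward` for something heavier while leaving the rest of the loop untouched, which is the kind of separation the section describes.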
## Battle-Tested at Frontier Scale

slime is the RL framework behind the GLM family, from GLM-4.5 through GLM-5.2, and also supports the Qwen3 series, DeepSeek V3/V3.1/R1, and Llama 3. Because RL bugs are often silent, the project treats correctness as a first-class concern, maintaining CPU unit tests, contract tests for customization hooks, and GPU end-to-end tests covering dense and MoE models, checkpointing, numerical precision, async rollout, and PPO-style workflows.

## Considerations

slime is deliberately opinionated: it optimizes deeply for the Megatron + SGLang path rather than supporting many backends, so teams committed to a different stack may find it less flexible. Large-scale RL post-training also remains resource-intensive and operationally complex. For groups building release-grade models who want explicit dataflow and reproducible RL infrastructure, though, slime is among the most credible open frameworks in the space.

Key Features

RL post-training framework connecting Megatron training with SGLang rollout
Single training / rollout / Data Buffer path instead of disconnected trainers and services
Native pass-through of Megatron and SGLang arguments (--sglang- prefix)
Flexible data generation for math, code, tools, sandboxes, verifiers, and agentic workflows
Battle-tested behind GLM-4.5 to GLM-5.2; supports Qwen3, DeepSeek V3/R1, and Llama 3
Correctness-first engineering with CPU unit tests and GPU end-to-end CI
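
The prefixed pass-through idea from the feature list can be sketched with the standard-library `argparse` module. This is not slime's implementation: `ENGINE_OPTIONS` and its option names are hypothetical placeholders rather than real SGLang arguments, and the sketch only shows how a prefix lets one engine's options be re-exposed and then handed back with the prefix stripped.

```python
import argparse
from typing import Any, Dict

# Hypothetical stand-in for the options an inference engine exposes.
# These names are illustrative placeholders, not SGLang's real arguments.
ENGINE_OPTIONS = {
    "cache-size": (int, 1024),
    "sample-ratio": (float, 0.8),
}


def build_parser(prefix: str) -> argparse.ArgumentParser:
    """Re-expose every engine option under --<prefix>-<name>."""
    parser = argparse.ArgumentParser()
    for name, (typ, default) in ENGINE_OPTIONS.items():
        parser.add_argument(f"--{prefix}-{name}", type=typ, default=default)
    return parser


def engine_kwargs(args: argparse.Namespace, prefix: str) -> Dict[str, Any]:
    """Strip the prefix so values can be forwarded to the engine unchanged."""
    marker = prefix.replace("-", "_") + "_"
    return {
        key[len(marker):]: value
        for key, value in vars(args).items()
        if key.startswith(marker)
    }


if __name__ == "__main__":
    parsed = build_parser("sglang").parse_args(["--sglang-cache-size", "2048"])
    print(engine_kwargs(parsed, "sglang"))
```

Because the parser is generated from the engine's own option table, a new upstream option would show up under the prefix without any wrapper code being written for it, which is the benefit the feature list points to.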

Related Projects

Hugging Face Transformers | huggingface | Apache-2.0 | LLM | 159.1K Stars | 32.8K Forks

Hermes Agent

LangChain

Open WebUI