Agent-R1
Training Powerful LLM Agents with End-to-End Reinforcement Learning
Agent-R1 is an open-source framework for training powerful language agents with end-to-end reinforcement learning. With Agent-R1, you can build custom agent workflows, define interactive environments and tools, and train multi-step agents in a unified RL pipeline.
- Step-level MDP: A principled MDP formulation that enables flexible context management and per-step reward signals.
- Layered Abstractions: Choose the level of abstraction that fits your use case, from maximum flexibility to out-of-the-box defaults.
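To make the step-level idea concrete, here is a minimal sketch of a rollout in which every agent step is its own transition with its own reward. The names `Step`, `Trajectory`, `rollout`, `toy_policy`, and `toy_env_step` are hypothetical and are not Agent-R1's API; the sketch only assumes that the environment returns the next observation, so it is free to rebuild or trim the context at each step.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Step:
    """One transition of a step-level MDP (hypothetical, not Agent-R1's API)."""
    observation: str  # context shown to the policy at this step
    action: str       # the policy's response for this step
    reward: float     # reward assigned to this step alone


@dataclass
class Trajectory:
    steps: List[Step] = field(default_factory=list)

    def returns(self, gamma: float = 1.0) -> List[float]:
        """Discounted return-to-go per step, computed back to front."""
        out: List[float] = []
        running = 0.0
        for step in reversed(self.steps):
            running = step.reward + gamma * running
            out.append(running)
        return out[::-1]


def rollout(
    policy: Callable[[str], str],
    env_step: Callable[[str, str], Tuple[str, float, bool]],
    first_observation: str,
    max_steps: int = 8,
) -> Trajectory:
    """Collect one trajectory; env_step decides what the next context is."""
    traj = Trajectory()
    obs = first_observation
    for _ in range(max_steps):
        action = policy(obs)
        next_obs, reward, done = env_step(obs, action)
        traj.steps.append(Step(obs, action, reward))
        if done:
            break
        obs = next_obs
    return traj


# Toy stand-ins: the "agent" always increments, the episode ends at 3.
def toy_policy(observation: str) -> str:
    return "inc"


def toy_env_step(observation: str, action: str) -> Tuple[str, float, bool]:
    value = int(observation) + 1
    done = value == 3
    return str(value), (1.0 if done else 0.0), done


trajectory = rollout(toy_policy, toy_env_step, "0")
print(len(trajectory.steps))          # 3
print(trajectory.returns(gamma=0.5))  # [0.25, 0.5, 1.0]
```

Because each `Step` stores its own observation and reward, a trainer could assign credit per step instead of only at the end of the episode.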
Reading Guide
- Start with Getting Started if you want the minimal path: use the same environment as verl, download the processed data from ModelScope, run a sanity check, and confirm the repository is ready.
- Read Step-level MDP and Layered Abstractions if you want to understand the framework design before touching code.
- Follow Agent Task Tutorial if you want to see the minimal GSM8K + Tool example through ToolEnv + BaseTool.
- Read Recipes and Algorithms for the current GSM8K, HotpotQA, ALFWorld, WebShop, paper-search, and algorithm script layout.
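As a rough picture of what a tool-based math task involves, the sketch below wires a calculator tool into a tiny environment. It does not use Agent-R1's `ToolEnv` or `BaseTool`, whose real interfaces are documented in the Agent Task Tutorial; `CalculatorTool`, `MiniToolEnv`, the JSON action format, and the reward rule are all assumptions made for illustration.

```python
import ast
import json
import operator
from typing import Dict, List, Tuple

# Arithmetic operators the hypothetical calculator accepts.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def _eval(node: ast.AST):
    """Evaluate a parsed arithmetic expression without calling eval()."""
    if isinstance(node, ast.Constant) and type(node.value) in (int, float):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_eval(node.operand)
    raise ValueError("unsupported expression")


class CalculatorTool:
    """Hypothetical tool: evaluates basic arithmetic given as a string."""
    name = "calculator"

    def execute(self, args: Dict[str, str]) -> str:
        tree = ast.parse(args["expression"], mode="eval")
        return str(_eval(tree.body))


class MiniToolEnv:
    """Hypothetical environment: routes tool calls, scores the final answer."""

    def __init__(self, tools: List[CalculatorTool], answer: str) -> None:
        self.tools = {tool.name: tool for tool in tools}
        self.answer = answer

    def step(self, action: str) -> Tuple[str, float, bool]:
        """Return (observation, reward, done) for one JSON-encoded action."""
        message = json.loads(action)
        if "tool" in message:
            tool = self.tools.get(message["tool"])
            if tool is None:
                return "unknown tool", 0.0, False
            return tool.execute(message["args"]), 0.0, False
        # Any message without a tool call is treated as the final answer.
        correct = message.get("answer") == self.answer
        return "", (1.0 if correct else 0.0), True


env = MiniToolEnv([CalculatorTool()], answer="18")
tool_result = env.step(
    '{"tool": "calculator", "args": {"expression": "(16 - 3 - 4) * 2"}}'
)
final_result = env.step('{"answer": "18"}')
print(tool_result)   # ('18', 0.0, False)
print(final_result)  # ('', 1.0, True)
```

Keeping the tool separate from the environment means the same tool can be reused across tasks, while the environment alone decides how observations and rewards are produced.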
Scope of This Documentation
This version of the documentation is intentionally compact. It focuses on the parts that are already central to Agent-R1 today: the core agent abstractions, runnable examples, and recipe-level integrations.
Supported by the State Key Laboratory of Cognitive Intelligence, University of Science and Technology of China (USTC).