
Agent-R1

Training Powerful LLM Agents with End-to-End Reinforcement Learning

Agent-R1 is an open-source framework for training powerful language agents with end-to-end reinforcement learning. With Agent-R1, you can build custom agent workflows, define interactive environments and tools, and train multi-step agents in a unified RL pipeline.

  • Step-level MDP
    A principled MDP formulation that enables flexible context management and per-step reward signals.


  • Layered Abstractions
    From maximum flexibility to out-of-the-box defaults, choose the right level of abstraction for your use case.

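To make the step-level formulation concrete, here is a minimal, self-contained sketch of a trajectory that records a reward at every step and computes a discounted return. The `StepRecord` and `Trajectory` names are illustrative only, not Agent-R1's actual data structures.

```python
from dataclasses import dataclass, field

@dataclass
class StepRecord:
    """One step of a step-level MDP trajectory: observation, action, reward."""
    observation: str
    action: str
    reward: float

@dataclass
class Trajectory:
    steps: list = field(default_factory=list)

    def add(self, observation: str, action: str, reward: float) -> None:
        self.steps.append(StepRecord(observation, action, reward))

    def discounted_return(self, gamma: float = 1.0) -> float:
        """Return from the first step: sum of gamma**t * r_t over steps t."""
        return sum(gamma ** t * s.reward for t, s in enumerate(self.steps))

traj = Trajectory()
traj.add("question", "search(...)", 0.1)       # per-step signal for a useful tool call
traj.add("search results", "final answer", 1.0)  # terminal reward for correctness
print(traj.discounted_return(gamma=0.9))  # prints 1.0
```

The point of the per-step reward field is that intermediate actions (e.g. a well-formed tool call) can be credited directly, rather than relying only on a single outcome reward at the end of the episode.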


Reading Guide

  • Start with Getting Started if you want the minimal path: reuse verl's environment setup, run a sanity check, and confirm the repository is ready.
  • Read Step-level MDP and Layered Abstractions if you want to understand the framework design before touching code.
  • Follow the Agent Task Tutorial if you want to see the main Agent-R1 workflow: multi-step interaction through AgentEnvLoop and ToolEnv.
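The multi-step interaction pattern named above can be sketched as a simple rollout loop. This is a toy illustration under assumed interfaces: the real AgentEnvLoop and ToolEnv classes in Agent-R1 may expose different methods; here a scripted function stands in for the LLM policy.

```python
class ToolEnv:
    """Toy environment: exposes a calculator tool and judges the final answer.
    (Illustrative only; not Agent-R1's actual ToolEnv interface.)"""
    def __init__(self, question: str, answer: str):
        self.question, self.answer = question, answer

    def reset(self) -> str:
        return self.question

    def step(self, action: str):
        # A tool call of the form "calc: <expr>" returns the tool output
        # as the next observation, with no reward and the episode continuing.
        if action.startswith("calc:"):
            result = str(eval(action[len("calc:"):], {"__builtins__": {}}))
            return f"tool result: {result}", 0.0, False
        # Anything else is treated as the final answer; the episode ends.
        reward = 1.0 if action.strip() == self.answer else 0.0
        return "done", reward, True

def agent_env_loop(policy, env, max_steps: int = 8):
    """Roll out one episode: alternate policy actions and environment feedback."""
    obs, rewards = env.reset(), []
    for _ in range(max_steps):
        action = policy(obs)
        obs, reward, done = env.step(action)
        rewards.append(reward)
        if done:
            break
    return rewards

# A scripted "policy" standing in for the LLM: one tool call, then an answer.
script = iter(["calc: 6 * 7", "42"])
rewards = agent_env_loop(lambda obs: next(script), ToolEnv("What is 6 * 7?", "42"))
print(rewards)  # prints [0.0, 1.0]
```

The loop is what the RL pipeline trains against: each action the policy emits becomes one step of the trajectory, so tool calls and the final answer can each receive their own reward signal.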

Scope of This Documentation

This version of the documentation is intentionally compact. It focuses on the parts that are already central to Agent-R1 today and leaves room for future tutorials as more environments and tools are added.


Supported by the State Key Laboratory of Cognitive Intelligence, University of Science and Technology of China (USTC).