Core Concepts¶
This section introduces the ideas that shape Agent-R1 as a framework for agent tasks.
In This Section¶
- Step-level MDP: why Agent-R1 models agent training as multi-step interaction instead of a single growing token stream.
- Layered Abstractions: how `AgentFlowBase`, `AgentEnvLoop`, `AgentEnv`, `ToolEnv`, and `BaseTool` fit together.
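To give a feel for the layering, here is a minimal sketch of how three of these abstractions could compose: a tool at the bottom, a tool environment routing calls to it, and an interaction loop on top. The class names come from this page, but every field and method signature below is an illustrative assumption, not Agent-R1's actual API.

```python
# Illustrative layering sketch. Class names are taken from the docs above;
# all signatures here are simplifying assumptions, not Agent-R1's real API.
from dataclasses import dataclass, field


@dataclass
class BaseTool:
    """Innermost layer: one callable capability exposed to the agent."""
    name: str

    def execute(self, arg: str) -> str:
        return f"{self.name} received: {arg}"


@dataclass
class ToolEnv:
    """Middle layer: holds a set of tools and routes tool calls to them."""
    tools: dict[str, BaseTool]

    def step(self, tool_name: str, arg: str) -> str:
        return self.tools[tool_name].execute(arg)


@dataclass
class AgentEnvLoop:
    """Outer layer: drives multi-step interaction, collecting a trajectory."""
    env: ToolEnv
    trajectory: list[str] = field(default_factory=list)

    def run(self, actions: list[tuple[str, str]]) -> list[str]:
        for tool_name, arg in actions:
            observation = self.env.step(tool_name, arg)
            self.trajectory.append(observation)
        return self.trajectory
```

The point of the layering is separation of concerns: tools know nothing about episodes, the tool environment knows nothing about training, and only the loop sees the trajectory.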
Why These Concepts Matter¶
Agent-R1 is designed for agent tasks in which an LLM interacts with an environment, receives new observations, and improves through reinforcement learning over trajectories. The two pages in this section explain the core formulation (the step-level MDP) and the programming model (the layered abstractions) that together support this workflow.
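The interact-observe-improve cycle described above can be sketched as a plain episode-collection loop. This is a generic illustration of the step-level view, not Agent-R1's implementation: the policy and environment here are hypothetical stand-ins, and the only point is that each step yields its own (action, reward, observation) record rather than one ever-growing token stream.

```python
# Generic sketch of step-level trajectory collection; `policy` and
# `env_step` are hypothetical stand-ins, not Agent-R1 components.
def run_episode(policy, env_step, max_steps=8):
    """Collect one trajectory as a list of (action, reward, observation) steps."""
    trajectory = []
    observation = "initial task description"
    for _ in range(max_steps):
        action = policy(observation)                    # LLM decides what to do
        observation, reward, done = env_step(action)    # environment responds
        trajectory.append((action, reward, observation))
        if done:
            break
    return trajectory  # per-step records, ready for RL over trajectories


def toy_env():
    """A stateless stand-in environment that ends after two steps."""
    state = {"t": 0}

    def step(action):
        state["t"] += 1
        return f"observation {state['t']}", 1.0, state["t"] >= 2

    return step
```

An RL trainer would then compute returns over these per-step records and update the policy, which is what distinguishes the step-level MDP view from scoring a single completed text sequence.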