Quick Start
This quick start is a sanity check, not the main Agent-R1 workflow. Its purpose is to verify that your environment, dataset path, model path, and training stack are wired correctly.
1. Prepare a Minimal Dataset
Use the GSM8K preprocessing script:
This produces:
- ~/data/gsm8k/train.parquet
- ~/data/gsm8k/test.parquet
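Before moving on, it can help to confirm that both files were actually written. The following is a minimal sketch using only the Python standard library; the helper name `missing_dataset_files` is hypothetical and not part of Agent-R1, and the check only looks at file presence and size, not at the Parquet contents.

```python
from pathlib import Path

# File names taken from the expected output of the preprocessing step.
EXPECTED_FILES = ("train.parquet", "test.parquet")


def missing_dataset_files(data_dir="~/data/gsm8k"):
    """Return the expected Parquet files that are absent or empty in data_dir.

    Hypothetical helper for illustration; not part of Agent-R1.
    """
    root = Path(data_dir).expanduser()
    missing = []
    for name in EXPECTED_FILES:
        path = root / name
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(path)
    return missing


if __name__ == "__main__":
    problems = missing_dataset_files()
    if problems:
        for path in problems:
            print(f"missing or empty: {path}")
    else:
        print("both GSM8K Parquet files are present")
```

If the dataset was written somewhere else, pass that directory as `data_dir` and use the same location when adjusting the dataset paths in the next step.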
2. Run the Sanity Check Script
Use the provided single-step script:
If needed, adjust the following values before running:
- CUDA_VISIBLE_DEVICES
- actor_rollout_ref.model.path
- dataset paths under ~/data/gsm8k
The script entrypoint is examples/run_qwen2.5-3b.sh, which launches python3 -m agent_r1.main_agent_ppo.
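To find where those values appear in the script before editing it, a small search can help. This is a sketch under assumptions: the helper `find_settings` is hypothetical, the patterns are only the names listed above, and the actual layout of examples/run_qwen2.5-3b.sh may differ from what the patterns expect.

```python
import re
from pathlib import Path

# Names listed in this quick start; the last pattern matches the dataset directory.
SETTING_PATTERNS = (
    r"CUDA_VISIBLE_DEVICES",
    r"actor_rollout_ref\.model\.path",
    r"data/gsm8k",
)


def find_settings(script_text, patterns=SETTING_PATTERNS):
    """Return (line_number, stripped_line) pairs for lines mentioning any pattern.

    Hypothetical helper for illustration; not part of Agent-R1.
    """
    combined = re.compile("|".join(patterns))
    return [
        (number, line.strip())
        for number, line in enumerate(script_text.splitlines(), start=1)
        if combined.search(line)
    ]


if __name__ == "__main__":
    # Assumes the current directory is the repository root.
    script = Path("examples/run_qwen2.5-3b.sh")
    if script.is_file():
        for number, line in find_settings(script.read_text()):
            print(f"{number}: {line}")
    else:
        print(f"script not found: {script}")
```

Each reported line is a candidate to edit. Lines the search does not report may still matter, so treat the result as a starting point rather than a complete list.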
3. What to Do Next
- Read Step-level MDP to understand the main training abstraction.
- Read Layered Abstractions to see how AgentFlowBase, AgentEnvLoop, and ToolEnv fit together.
- Continue to the Agent Task Tutorial for the main Agent-R1 workflow based on multi-step interaction.