Skip to content

Getting Started

Welcome to the Claw-R1 documentation. This section guides you from installation to your first training run.

  • Installation


    Set up your conda environment, install veRL, and get Claw-R1 running in minutes.

    Installation

  • Quick Start


    Run your first white-box or black-box agent training with a minimal working example.

    Quick Start

Prerequisites

Before you begin, make sure you have:

  • A machine with one or more NVIDIA GPUs (CUDA required for training)
  • Conda or Mamba for environment management
  • Python 3.10 or higher
  • Git

Architecture at a Glance

Claw-R1 separates concerns into three independent processes that communicate over the network:

Agent (any HTTP client)
    │  POST /v1/chat/completions  (OpenAI-compatible)
Gateway Server  ──── Ray RPC ────►  DataPool (Ray Actor)
                                    fetch_batch()
                                   Async Trainer  ──► vLLM (weight sync)

This design lets you run the agent, gateway, and trainer on completely separate machines, with no coupling between service latency and training throughput.