ServerlessBackend
Run training and inference on autoscaling GPUs.
SkyPilotBackend
Run training and inference on a separate ephemeral machine.
LocalBackend
Run training and inference on your local machine.
Initializing the client
The client that you’ll use to generate tokens and train your model is initialized through the art.TrainableModel class.
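A minimal initialization sketch, assuming ART's documented pattern (the model and project names here are hypothetical, and any of the backends above can be swapped in):

```python
import asyncio

async def init_model():
    import art  # ART client library; imported lazily so this sketch can be read without it installed

    # The TrainableModel is the client you'll use for both inference and
    # training: name/project identify this run, base_model is an HF model ID.
    model = art.TrainableModel(
        name="agent-001",                       # hypothetical run name
        project="my-project",                   # hypothetical project name
        base_model="Qwen/Qwen2.5-7B-Instruct",  # assumed base model
    )
    # Register the model with a backend (LocalBackend, SkyPilotBackend,
    # or ServerlessBackend, as described above).
    backend = art.LocalBackend()
    await model.register(backend)
    return model

# To run it: model = asyncio.run(init_model())
```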
Initializing from an existing SFT LoRA
If you’ve already fine-tuned a model with SFT using a LoRA adapter (e.g., Unsloth/PEFT) and have a standard Hugging Face–style adapter directory, you can start RL training from those weights by passing the adapter directory path as base_model when creating your TrainableModel.
Why this?
- Warm-start from task-aligned weights to reduce steps/GPU cost.
- Stabilize early training, especially for small models (1B–8B) that may get near-zero rewards at RL start.
Running inference
Your model will generate inference tokens by making requests to a vLLM server running on whichever backend you previously registered. To route inference requests to this backend, follow the code sample below.
Training the model
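Putting the inference routing described above to work inside a training loop, here is a hedged sketch. The structure follows ART's documented rollout-and-train pattern, but the names (`openai_client`, `Trajectory`, `TrajectoryGroup`, `gather_trajectory_groups`, `train`) should be checked against the current API, and the scenario format and reward function are hypothetical:

```python
def score(answer: str, expected: str) -> float:
    """Toy reward: 1.0 for an exact match, 0.0 otherwise (replace with your own)."""
    return 1.0 if answer.strip() == expected.strip() else 0.0

async def rollout(model, scenario: dict):
    import art  # imported lazily so the pure helper above stays testable

    # Requests are routed to the vLLM server on the registered backend
    # through the model's OpenAI-compatible client.
    client = model.openai_client()
    messages = [{"role": "user", "content": scenario["prompt"]}]
    chat = await client.chat.completions.create(
        model=model.name, messages=messages, max_tokens=256
    )
    answer = chat.choices[0].message.content or ""
    return art.Trajectory(
        messages_and_choices=[*messages, chat.choices[0]],
        reward=score(answer, scenario["expected"]),
    )

async def train(model, scenarios: list[dict], steps: int = 10):
    import art

    for _ in range(steps):
        # One trajectory group per scenario, several rollouts per group.
        groups = await art.gather_trajectory_groups(
            art.TrajectoryGroup(rollout(model, s) for _ in range(4))
            for s in scenarios
        )
        await model.train(groups)
```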
Before training your model, you need to provide a few scenarios for your agent to learn from. While completing these scenarios, its weights will update to avoid past mistakes and reproduce successes. It’s best to provide at least 10 scenarios that adequately represent the real scenarios your agent will handle after it’s deployed.
Summarizer Tutorial
Teach a summarizer agent to outperform Sonnet 4.
Notebooks
Put the ART client and server in action in one of our notebooks!