Policy Inference

Run a fine-tuned VLA policy on the robot in closed loop at a fixed control frequency.

1. Install policy dependencies

Policy inference needs openpi and lerobot. They're opt-in (the install is large and conflicts with the docs extra):

uv sync --group pi

2. Pick a station and policy

| Env var | Purpose | Where defined |
| --- | --- | --- |
| `STATION` | Robot hardware config (arm, gripper, cameras) | `examples/cfg/` |
| `POLICY` | Policy config (checkpoint, transforms, action space) | `examples/policy_cfgs/` |

Available policies:

| `POLICY` | Model | Notes |
| --- | --- | --- |
| `Pi05Cfg` | openpi pi0.5 (DROID-style) | 3 cameras, 32-dim actions, `joint_vel` |
| `SmolVLACfg` | lerobot SmolVLA | 2 cameras, `joint_pos` |

List available stations and policies:

uv run -m examples.cfg
uv run -m examples.policy_cfgs

3. Point the policy at your checkpoint

The three fields you'll typically set live at the top of each policy config:

| Field | Pi05Cfg | SmolVLACfg |
| --- | --- | --- |
| `policy_path` | Dir with `params/` + `assets/` | lerobot checkpoint dir |
| `asset_id` | Subfolder under `<policy_path>/assets/` containing `norm_stats.json` | n/a |
| `instruction` | Language prompt for the task | Language prompt (optional) |

Either edit the defaults in `examples/policy_cfgs/pi05.py` or `examples/policy_cfgs/smolvla.py`, or override them on the CLI as shown in step 4.

The `action_space` field in the policy config selects the arm controller automatically, so there is no need to edit station files.
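Before launching, it can help to confirm that a pi0.5 checkpoint directory matches the layout described above. The following is a minimal sketch: `check_pi05_checkpoint` is a hypothetical helper, not part of the repository, and it checks only the `params/` directory and the `assets/<asset_id>/norm_stats.json` file named in the table.

```python
from pathlib import Path


def check_pi05_checkpoint(policy_path: str, asset_id: str) -> list[str]:
    """Return a list of layout problems found in a pi0.5-style checkpoint dir.

    Hypothetical helper: it only checks the layout described in the table
    (params/ and assets/<asset_id>/norm_stats.json), nothing model-specific.
    """
    root = Path(policy_path)
    problems = []
    if not (root / "params").is_dir():
        problems.append(f"missing directory: {root / 'params'}")
    # asset_id may contain slashes (e.g. "robotio/multirobot_fold"),
    # which Path treats as nested subfolders.
    norm_stats = root / "assets" / asset_id / "norm_stats.json"
    if not norm_stats.is_file():
        problems.append(f"missing file: {norm_stats}")
    return problems
```

An empty list means the expected files are in place; it does not verify that the checkpoint itself loads.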

4. Run

# pi0.5
STATION=Xarm7EEFStation POLICY=Pi05Cfg uv run -m examples.policy_inference \
  --policy-path /data/ckpt/pi05_multi_fold \
  --asset-id robotio/multirobot_fold \
  --instruction "Fold the shirt in half."

# SmolVLA
STATION=Xarm7EEFStation POLICY=SmolVLACfg uv run -m examples.policy_inference \
  --policy-path ckpts/smolvla \
  --instruction "Place the cup on the shelf."

When prompted, press Enter to start the loop.
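If runs are launched from a script rather than typed into the shell, the same invocation can be assembled in Python. This sketch only builds the environment and argument list from the variables and flags shown above; `build_inference_command` is a hypothetical helper, not something the repository provides.

```python
import os


def build_inference_command(station, policy, policy_path, instruction, asset_id=None):
    """Build (env, argv) mirroring the shell commands above.

    Hypothetical helper: it uses only the env vars and flags documented here.
    """
    env = {**os.environ, "STATION": station, "POLICY": policy}
    argv = [
        "uv", "run", "-m", "examples.policy_inference",
        "--policy-path", policy_path,
        "--instruction", instruction,
    ]
    if asset_id is not None:  # only the pi0.5 config uses an asset id
        argv += ["--asset-id", asset_id]
    return env, argv


env, argv = build_inference_command(
    "Xarm7EEFStation", "SmolVLACfg", "ckpts/smolvla", "Place the cup on the shelf."
)
# To launch: subprocess.run(argv, env=env, check=True)
```

Because the program waits for Enter before starting the loop, leave stdin attached to the terminal when launching it this way.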

Common flags

| Flag | Default | Description |
| --- | --- | --- |
| `--policy-path` | from config | Checkpoint directory |
| `--asset-id` (pi0.5) | from config | Norm-stats asset id |
| `--instruction` | from config | Language instruction |
| `--freq` | `50` | Control frequency (Hz) |
| `--visualizer` | `None` | Set to `Rerun` for live visualization |
| `--policy-node-cfg.chunk-size` | `16` | Actions per chunk |
| `--policy-node-cfg.chunk-request-threshold` | `0.1` | Fraction consumed before next chunk request |
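To see how `--freq`, `chunk-size`, and `chunk-request-threshold` interact, the arithmetic can be worked through with the defaults. This sketch assumes the threshold is the fraction of the current chunk that must be executed before the next request is sent, and that the count rounds up to a whole action; the actual rounding in the implementation may differ. `chunk_timing` is a hypothetical helper.

```python
import math


def chunk_timing(freq_hz: float, chunk_size: int, request_threshold: float):
    """Rough timing budget for chunked execution (assumed semantics)."""
    step_s = 1.0 / freq_hz          # duration of one control step
    chunk_s = chunk_size * step_s   # time covered by one full chunk
    # Assumption: request once ceil(threshold * chunk_size) actions are consumed.
    consumed = math.ceil(request_threshold * chunk_size)
    # Time left to receive the next chunk before the current one runs out.
    budget_s = (chunk_size - consumed) * step_s
    return chunk_s, consumed, budget_s


chunk_s, consumed, budget_s = chunk_timing(freq_hz=50, chunk_size=16, request_threshold=0.1)
print(f"chunk covers {chunk_s:.2f} s; request after {consumed} actions; "
      f"inference budget ~{budget_s:.2f} s")
```

With the defaults, one chunk covers 0.32 s of motion. Under the assumed rounding, the next request goes out after 2 actions, leaving roughly 0.28 s for the policy to respond before the robot runs out of actions.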

Inference loop

1. Collect an observation (camera images + proprioception)
2. Send it to the policy server
3. Receive an action chunk and step through it
4. Request a new chunk before the current one is exhausted
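The four steps above can be sketched as a fixed-rate loop. This is a simplified illustration, not the repository's implementation: `fake_policy` and `run_loop` are hypothetical names, the policy server is replaced by a local stub, and the sketch assumes the next chunk is requested once the `chunk-request-threshold` fraction of the current chunk has been executed.

```python
import time
from collections import deque
from concurrent.futures import ThreadPoolExecutor


def fake_policy(observation, chunk_size=16):
    """Stand-in for the policy server: returns a chunk of dummy actions."""
    return [(observation["step"], i) for i in range(chunk_size)]


def run_loop(num_steps=40, freq_hz=50, chunk_size=16, request_threshold=0.1):
    period = 1.0 / freq_hz
    queue = deque()   # actions left in the current chunk
    consumed = 0      # actions executed from the current chunk
    pending = None    # in-flight request for the next chunk
    executed = []
    with ThreadPoolExecutor(max_workers=1) as pool:
        for step in range(num_steps):
            t0 = time.monotonic()
            obs = {"step": step}  # 1. collect an observation (stubbed)
            if not queue:
                # Current chunk exhausted (or first step): block for a chunk.
                if pending is None:
                    pending = pool.submit(fake_policy, obs, chunk_size)  # 2. send
                queue.extend(pending.result())  # 3. receive the chunk
                pending, consumed = None, 0
            elif pending is None and consumed >= request_threshold * chunk_size:
                # 4. request the next chunk before this one runs out
                pending = pool.submit(fake_policy, obs, chunk_size)
            executed.append(queue.popleft())  # execute one action from the chunk
            consumed += 1
            # Sleep out the remainder of the control period.
            time.sleep(max(0.0, period - (time.monotonic() - t0)))
    return executed
```

For simplicity the sketch finishes the current chunk before switching to the prefetched one, so the new chunk is based on a slightly older observation. A real implementation may switch as soon as the new chunk arrives.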