Get started
Install AVP
AVP is an open standard for running agents and recording what they do. The avp CLI installs, runs, and scores agents for you. The only prerequisite on your machine is Docker; the CLI manages the rest. It runs on macOS and Linux.
Install and run Docker
Every agent run executes in a sandbox backed by a Docker daemon. Any of Docker Desktop, OrbStack, or colima works. Skip this if Docker is already running.
brew install --cask docker # or: brew install colima docker && colima start
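To confirm a daemon is actually reachable before going further, a small check along these lines may help. This is a sketch, not part of AVP: `docker_ready` is a hypothetical helper name, and it relies only on the standard `docker info` command, which exits non-zero when the client cannot reach a daemon.

```shell
# Hypothetical helper (not part of avp): report whether a Docker daemon
# is reachable from this shell.
docker_ready() {
  # `docker info` fails if the docker client is missing or no daemon answers.
  if docker info >/dev/null 2>&1; then
    echo "docker: daemon reachable"
  else
    echo "docker: daemon not reachable (start Docker Desktop, OrbStack, or colima)"
    return 1
  fi
}

docker_ready || true
```

If the check reports that the daemon is not reachable, start whichever runtime you installed and run it again.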
Install the avp CLI
curl -LsSf https://astral.sh/uv/install.sh | sh   # install uv, if you don't have it
uv tool install avp-cli                            # installs the `avp` command
That's it — avp is on your PATH. Run avp with no arguments any time to see the full command map.
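If the shell cannot find avp right after installation, a quick lookup can show whether the command resolves. The sketch below uses only the POSIX `command -v` builtin; `on_path` is a hypothetical helper name introduced here for illustration.

```shell
# Hypothetical helper (not part of avp): print where a command resolves
# on PATH, or a hint if it does not resolve at all.
on_path() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1 -> $(command -v "$1")"
  else
    echo "$1 not found on PATH; open a new shell and try again"
    return 1
  fi
}

on_path avp || true
```

When the command is still missing in a fresh shell, uv's own `uv tool update-shell` command is meant to add its tool directory to PATH; whether you need it depends on how uv was installed.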
Install an agent
Agents are prebuilt; the CLI fetches and installs them.
avp agent install goose
avp agent install claude-code   # optional; needs the claude CLI on PATH
avp agent list
Run your first eval
An eval is a JSON file: a dataset, a scorer, and the agent configs (“commissions”) to compare. avp init scaffolds one; avp eval run runs it and prints a ranked board.
export ANTHROPIC_API_KEY=sk-ant-...
avp init capitals --agent goose
avp eval run capitals.eval.json
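Since the eval is a plain JSON file, you can look at what avp init scaffolded with ordinary tools. The sketch below assumes python3 is installed and uses its standard `json.tool` module to pretty-print the file; `show_json` is a hypothetical helper name, and nothing here assumes any particular field names inside the file.

```shell
# Hypothetical helper (not part of avp): pretty-print a JSON file if it
# parses, and say so if it does not. Assumes python3 is available.
show_json() {
  if [ ! -f "$1" ]; then
    echo "$1: no such file"
    return 1
  fi
  # json.tool exits non-zero when the file is not valid JSON.
  python3 -m json.tool "$1" || { echo "$1: not valid JSON"; return 1; }
}

show_json capitals.eval.json || true
```

This is also a cheap way to catch a stray comma after hand-editing the file, before starting a run.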
Every avp eval and avp run command executes the agent inside an isolated sandbox with a default-deny network allowlist, and the agent's writes stay in its workspace. The first run sets up the sandbox stack; later runs start in a couple of seconds.
Full reference, the four specs, and the conformance suite live in the repository.