BAP (Browser Agent Protocol) is an open standard for AI agents to interact with web browsers. It defines 55 methods across 13 categories, all transported as JSON-RPC 2.0 messages over WebSocket.

Design Principles

JSON-RPC 2.0

Every request and response follows the JSON-RPC 2.0 specification. Clients send { jsonrpc: "2.0", method, params, id } and receive { jsonrpc: "2.0", result, id } or { jsonrpc: "2.0", error, id }.
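The envelope described above can be sketched in a few lines of Python. This is a minimal illustration, not a BAP client library: the helper names `make_request` and `unwrap_response` are hypothetical, and the `url` parameter shown in the usage note is an assumption about `page/navigate`.

```python
import itertools
import json

_ids = itertools.count(1)  # monotonically increasing request ids


class RpcError(Exception):
    """Raised when a response carries an `error` member."""


def make_request(method, params=None):
    """Build a JSON-RPC 2.0 request object as a dict."""
    msg = {"jsonrpc": "2.0", "method": method, "id": next(_ids)}
    if params is not None:
        msg["params"] = params
    return msg


def unwrap_response(raw, expected_id):
    """Parse a response string and return `result`, or raise RpcError."""
    msg = json.loads(raw)
    if msg.get("jsonrpc") != "2.0" or msg.get("id") != expected_id:
        raise ValueError("not a matching JSON-RPC 2.0 response")
    if "error" in msg:
        err = msg["error"]
        raise RpcError(f"{err.get('code')}: {err.get('message')}")
    return msg["result"]
```

A caller would serialize `make_request("page/navigate", {"url": "https://example.com"})` with `json.dumps`, send it, and pass the reply text to `unwrap_response` together with the request's `id`.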

WebSocket Transport

BAP runs over a persistent, bidirectional WebSocket connection, so the server can push notifications (console errors, network failures, dialog events) without the client polling.
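Because responses and pushed notifications share one connection, a client has to tell them apart. In JSON-RPC 2.0 a notification carries a `method` but no `id`, which is enough to route each frame. The sketch below assumes that rule; the `route` helper and the `events/console` method name in the usage note are hypothetical and not taken from the BAP method reference.

```python
import json


def route(raw, pending, handlers):
    """Deliver one incoming frame and report what kind of message it was.

    pending:  dict mapping request id -> response message (filled in here)
    handlers: dict mapping notification method -> callback(params)
    """
    msg = json.loads(raw)
    # A notification has a method but no id, so nothing is waiting on it.
    if "method" in msg and "id" not in msg:
        handler = handlers.get(msg["method"])
        if handler is not None:
            handler(msg.get("params", {}))
        return "notification"
    # A response echoes the id of the request it answers.
    if "id" in msg and ("result" in msg or "error" in msg):
        pending[msg["id"]] = msg
        return "response"
    return "unknown"
```

A receive loop would call `route` on every text frame, for example with `handlers = {"events/console": log_console_error}`, and wake whichever caller is waiting on the `id` that lands in `pending`.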

Semantic Selectors

10 selector types (role, text, label, testId, ref, css, xpath, placeholder, semantic, coordinates) let agents target elements by meaning, not DOM structure.
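As a rough illustration, a client might validate the selector type before sending it. The ten type names come from the list above; the object shape (`type` plus free-form fields such as `role` and `name`) is an assumption made for this sketch, so consult the selector reference for the actual wire format.

```python
# The ten selector types named in the protocol overview.
SELECTOR_TYPES = frozenset({
    "role", "text", "label", "testId", "ref",
    "css", "xpath", "placeholder", "semantic", "coordinates",
})


def make_selector(kind, **fields):
    """Build a selector dict, rejecting types the protocol does not list.

    The {"type": ..., **fields} layout is hypothetical.
    """
    if kind not in SELECTOR_TYPES:
        raise ValueError(f"unknown selector type: {kind!r}")
    return {**fields, "type": kind}
```

For example, `make_selector("role", role="button", name="Submit")` describes a button by its accessible role and name rather than by its position in the DOM.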

Fusion Operations

Six optimizations batch multiple operations into single requests, cutting round trips by up to 55% in complex workflows.

Architecture

BAP uses a two-process architecture that separates the protocol layer from the browser automation engine:
MCP Client / Shell Agent
    |
    | JSON-RPC 2.0 over WebSocket
    v
BAP Server (Playwright engine)
    |
    | CDP / Playwright
    v
Browser (Chromium / Firefox / WebKit)
This separation enables:
  • Session persistence — the browser survives client restarts
  • Multi-client access — CLI and MCP can control the same browser simultaneously
  • Shared state — observations, element refs, and cookies persist across interfaces

Connection Lifecycle

1. Open WebSocket
   Client connects to ws://localhost:9222 (default endpoint).

2. Initialize
   Client sends initialize with client capabilities and optional sessionId. Server responds with server capabilities and protocol version.

3. Confirm Ready
   Client sends notifications/initialized notification (no response expected).

4. Operate
   Client sends method requests (browser/launch, page/navigate, agent/act, etc.). Server responds to each request and may push notifications for events.

5. Shutdown
   Client sends shutdown or simply disconnects. If the client provided a sessionId, the server parks the browser state for later reconnection.
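The lifecycle above can be written out as the sequence of messages a client would put on the wire. This is a sketch under stated assumptions: the method names come from the steps above, but the empty `params` for `browser/launch`, the `url` parameter for `page/navigate`, and the treatment of `shutdown` as a request with an `id` are guesses, and `lifecycle_messages` is a hypothetical helper.

```python
def lifecycle_messages(session_id=None):
    """Return the client-side messages for one minimal session, in order."""
    init_params = {"clientInfo": {"name": "my-agent", "version": "1.0.0"}}
    if session_id is not None:
        init_params["sessionId"] = session_id  # opt in to session parking
    return [
        # Step 2: initialize (a request, so it carries an id)
        {"jsonrpc": "2.0", "method": "initialize", "params": init_params, "id": 1},
        # Step 3: confirm ready (a notification, so no id and no response)
        {"jsonrpc": "2.0", "method": "notifications/initialized"},
        # Step 4: operate (params here are assumptions)
        {"jsonrpc": "2.0", "method": "browser/launch", "params": {}, "id": 2},
        {"jsonrpc": "2.0", "method": "page/navigate",
         "params": {"url": "https://example.com"}, "id": 3},
        # Step 5: shutdown (assumed to be a request)
        {"jsonrpc": "2.0", "method": "shutdown", "id": 4},
    ]
```

Step 1, opening the WebSocket to ws://localhost:9222, happens before any of these messages are sent and is left to whichever WebSocket client the agent uses.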

Session Persistence

Clients can include a sessionId in the initialize request. When a client with a session ID disconnects, the server parks the browser state (pages, contexts, cookies) in a dormant store instead of destroying it. Reconnecting with the same sessionId restores the full session. Dormant sessions expire after dormantSessionTtl (default: 300 seconds).
{
  "jsonrpc": "2.0",
  "method": "initialize",
  "params": {
    "sessionId": "my-agent-session",
    "clientInfo": { "name": "my-agent", "version": "1.0.0" }
  },
  "id": 1
}
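To make the parking behavior concrete, here is a small model of a dormant store with a time-to-live. It is not the BAP server's implementation: `DormantStore`, `park`, and `restore` are hypothetical names, and the state is an opaque value standing in for pages, contexts, and cookies. Only the 300-second default mirrors the dormantSessionTtl described above.

```python
import time


class DormantStore:
    """Holds parked session state until it is restored or its TTL elapses."""

    def __init__(self, ttl_seconds=300.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested without waiting
        self._parked = {}   # session_id -> (expires_at, state)

    def park(self, session_id, state):
        """Called on disconnect: keep the state instead of destroying it."""
        self._parked[session_id] = (self.clock() + self.ttl, state)

    def restore(self, session_id):
        """Called on reconnect: return the state, or None if absent or expired."""
        entry = self._parked.pop(session_id, None)
        if entry is None:
            return None
        expires_at, state = entry
        return state if self.clock() < expires_at else None
```

A client that reconnects within the TTL gets its state back; one that reconnects later finds nothing and starts a fresh session.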

Protocol Versioning

BAP follows semantic versioning:
| Level | What changes |
| --- | --- |
| Patch (0.8.x) | Bug fixes, performance improvements |
| Minor (0.x.0) | New optional fields, new methods, new selector types |
| Major (x.0.0) | Breaking changes to existing method signatures or behavior |
All fields added in minor versions are optional. Old clients and servers ignore unknown fields, so additive updates introduce no breaking changes.
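The two rules above (breaking changes only at major versions, unknown fields ignored) suggest a simple compatibility check and a tolerant reader. The following sketch assumes version strings of the form MAJOR.MINOR.PATCH; the helper names and the `futureField` key in the usage note are hypothetical.

```python
def parse_version(text):
    """Split "MAJOR.MINOR.PATCH" into a tuple of three ints."""
    major, minor, patch = (int(part) for part in text.split("."))
    return major, minor, patch


def is_compatible(client_version, server_version):
    """Per the table above, only a major version bump is breaking."""
    return parse_version(client_version)[0] == parse_version(server_version)[0]


# Fields this (older) peer understands; anything else is skipped silently.
KNOWN_FIELDS = ("sessionId", "clientInfo")


def read_known(params, known=KNOWN_FIELDS):
    """Keep only recognized fields so newer peers do not break older ones."""
    return {key: params[key] for key in known if key in params}
```

For instance, `read_known({"sessionId": "s", "futureField": 1})` drops `futureField` instead of rejecting the message, which is what lets a newer client talk to an older server.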

Method Categories

BAP defines 55 methods organized into 13 categories:
| Category | Methods | Purpose |
| --- | --- | --- |
| Lifecycle | 3 | Initialize, confirm ready, shutdown |
| Browser | 2 | Launch and close browser |
| Page | 8 | Create, navigate, reload, back/forward, close, list, activate |
| Actions | 13 | Click, fill, type, press, hover, scroll, select, check, upload, drag |
| Observations | 7 | Screenshot, accessibility tree, DOM, element, PDF, content, ARIA snapshot |
| Storage | 5 | Get/set cookies and localStorage |
| Emulation | 4 | Viewport, user agent, geolocation, offline mode |
| Dialog/Trace/Events | 4 | Handle dialogs, start/stop traces, subscribe to events |
| Context | 3 | Create, list, destroy browser contexts |
| Frame | 3 | List, switch, return to main frame |
| Stream/Approval | 2 | Cancel streams, respond to approval requests |
| Discovery | 1 | Discover WebMCP tools on page |
| Agent | 3 | Composite act, observe, extract |
See Methods for the full reference.