Design Principles
JSON-RPC 2.0
Every request and response follows the JSON-RPC 2.0 specification. Clients send { jsonrpc: "2.0", method, params, id } and receive { jsonrpc: "2.0", result, id } or { jsonrpc: "2.0", error, id }.
WebSocket Transport
Persistent bidirectional connection. The server can push notifications (console errors, network failures, dialog events) without polling.
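The envelopes and server-pushed notifications described above can be sketched as follows. This is a minimal illustration, not the BAP client implementation: the helper names `make_request` and `classify` are hypothetical, and the `params` shape passed to `page/navigate` is an assumption. The only protocol facts relied on are the JSON-RPC 2.0 rules that a response carries either `result` or `error` with the request's `id`, and that a notification carries a `method` but no `id`.

```python
import json
from itertools import count

_ids = count(1)  # hypothetical id source: one increasing integer per request


def make_request(method, params=None):
    """Build a JSON-RPC 2.0 request envelope."""
    message = {"jsonrpc": "2.0", "method": method, "id": next(_ids)}
    if params is not None:
        message["params"] = params
    return message


def classify(message):
    """Return 'result', 'error', or 'notification' for a server message."""
    if message.get("jsonrpc") != "2.0":
        raise ValueError("not a JSON-RPC 2.0 message")
    if "id" not in message:
        if "method" not in message:
            raise ValueError("notification without a method")
        return "notification"  # pushed by the server, no reply expected
    if "error" in message:
        return "error"
    if "result" in message:
        return "result"
    raise ValueError("response has neither result nor error")


# The params shape here is assumed for illustration only.
request = make_request("page/navigate", {"url": "https://example.com"})
wire_text = json.dumps(request)  # the text frame a client would send
```

A client read loop would call `classify` on each decoded frame, match `result` and `error` messages to pending requests by `id`, and route notifications to event handlers.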
Semantic Selectors
10 selector types (role, text, label, testId, ref, css, xpath, placeholder, semantic, coordinates) let agents target elements by meaning, not DOM structure.
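To illustrate the difference between targeting by meaning and targeting by DOM structure, here is a small sketch of a selector value. The `Selector` class and its `type` and `value` field names are hypothetical and are not taken from the BAP wire format. Only the ten type names come from the list above.

```python
from dataclasses import dataclass

# The ten selector types listed above.
SELECTOR_TYPES = frozenset({
    "role", "text", "label", "testId", "ref",
    "css", "xpath", "placeholder", "semantic", "coordinates",
})


@dataclass(frozen=True)
class Selector:
    """Hypothetical selector shape: a type name plus a type-specific value."""
    type: str
    value: object

    def __post_init__(self):
        if self.type not in SELECTOR_TYPES:
            raise ValueError(f"unknown selector type: {self.type!r}")


# Targets the element by what it is; survives most layout changes.
by_meaning = Selector("role", "button")

# Targets the element by where it sits in the DOM; breaks when markup moves.
by_structure = Selector("css", "form > div:nth-child(3) > button")
```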
Fusion Operations
6 optimizations that batch multiple operations into single requests, cutting roundtrips by up to 55% in complex workflows.
Architecture
BAP uses a two-process architecture that separates the protocol layer from the browser automation engine. This separation provides:
- Session persistence: the browser survives client restarts
- Multi-client access: CLI and MCP can control the same browser simultaneously
- Shared state: observations, element refs, and cookies persist across interfaces
Connection Lifecycle
Initialize
Client sends initialize with client capabilities and an optional sessionId. Server responds with server capabilities and protocol version.
Operate
Client sends method requests (browser/launch, page/navigate, agent/act, etc.). Server responds to each request and may push notifications for events.
Session Persistence
Clients can include a sessionId in the initialize request. When a client with a session ID disconnects, the server parks the browser state (pages, contexts, cookies) in a dormant store instead of destroying it.
Reconnecting with the same sessionId restores the full session. Dormant sessions expire after dormantSessionTtl (default: 300 seconds).
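The park-and-restore behavior can be sketched as a small in-memory store with a time-to-live. This is an illustration under stated assumptions, not the server's actual code: the `DormantStore` class, its method names, and the injectable `clock` parameter are hypothetical, and treating a session as expired once strictly more than the TTL has elapsed is a guess at the boundary behavior.

```python
import time


class DormantStore:
    """Hypothetical dormant store: parks session state until a TTL elapses."""

    def __init__(self, ttl_seconds=300.0, clock=time.monotonic):
        self._ttl = ttl_seconds   # mirrors dormantSessionTtl (default 300 s)
        self._clock = clock       # injectable so expiry is testable
        self._parked = {}         # session_id -> (parked_at, state)

    def park(self, session_id, state):
        """Called on disconnect: keep the state instead of destroying it."""
        self._parked[session_id] = (self._clock(), state)

    def restore(self, session_id):
        """Called on initialize: return the parked state, or None if the
        session is unknown or has expired. The entry is removed either way."""
        entry = self._parked.pop(session_id, None)
        if entry is None:
            return None
        parked_at, state = entry
        if self._clock() - parked_at > self._ttl:
            return None
        return state


# Usage with a fake clock so no real waiting is needed.
now = [0.0]
store = DormantStore(ttl_seconds=300, clock=lambda: now[0])
store.park("session-a", {"pages": ["https://example.com"], "cookies": []})
now[0] = 120.0
restored = store.restore("session-a")   # within the TTL: state comes back
store.park("session-b", {"pages": []})
now[0] = 500.0
expired = store.restore("session-b")    # 380 s later: past the TTL
```

A real server would also need a background sweep to close browsers whose sessions expire without a reconnect attempt. This sketch only checks expiry lazily at restore time.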
Protocol Versioning
BAP follows semantic versioning:
| Level | What changes |
|---|---|
| Patch (0.8.x) | Bug fixes, performance improvements |
| Minor (0.x.0) | New optional fields, new methods, new selector types |
| Major (x.0.0) | Breaking changes to existing method signatures or behavior |
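A client could use the table above to decide how to react to the protocol version returned by initialize. The sketch below classifies the difference between two version strings. The function names are hypothetical, and it assumes plain `major.minor.patch` strings with no pre-release suffix.

```python
def parse_version(version):
    """Split 'major.minor.patch' into a tuple of three integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def change_level(old, new):
    """Return 'major', 'minor', 'patch', or 'none' for the first component
    that differs between the two versions."""
    old_parts, new_parts = parse_version(old), parse_version(new)
    for name, a, b in zip(("major", "minor", "patch"), old_parts, new_parts):
        if a != b:
            return name
    return "none"


def is_breaking(old, new):
    """Per the table, only a major-level difference changes existing behavior."""
    return change_level(old, new) == "major"
```

For example, a client built against 0.8.0 that connects to a 0.9.0 server sees a minor-level difference, which per the table adds only optional fields, methods, or selector types.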
Method Categories
BAP defines 55 methods organized into 13 categories:
| Category | Methods | Purpose |
|---|---|---|
| Lifecycle | 3 | Initialize, confirm ready, shutdown |
| Browser | 2 | Launch and close browser |
| Page | 8 | Create, navigate, reload, back/forward, close, list, activate |
| Actions | 13 | Click, fill, type, press, hover, scroll, select, check, upload, drag |
| Observations | 7 | Screenshot, accessibility tree, DOM, element, PDF, content, ARIA snapshot |
| Storage | 5 | Get/set cookies and localStorage |
| Emulation | 4 | Viewport, user agent, geolocation, offline mode |
| Dialog/Trace/Events | 4 | Handle dialogs, start/stop traces, subscribe to events |
| Context | 3 | Create, list, destroy browser contexts |
| Frame | 3 | List, switch, return to main frame |
| Stream/Approval | 2 | Cancel streams, respond to approval requests |
| Discovery | 1 | Discover WebMCP tools on page |
| Agent | 3 | Composite act, observe, extract |