This page documents the core types used across the BAP protocol, TypeScript SDK, and Python SDK. All types are defined as Zod schemas in @browseragentprotocol/protocol and as Pydantic models in the Python SDK.

BAPSelector

A union type representing one of the 10 selector types. Used in all action and observation methods.
type BAPSelector =
  | { type: "css"; value: string }
  | { type: "xpath"; value: string }
  | { type: "role"; role: string; name?: string }
  | { type: "text"; value: string }
  | { type: "label"; value: string }
  | { type: "placeholder"; value: string }
  | { type: "testId"; value: string }
  | { type: "semantic"; description: string }
  | { type: "coordinates"; x: number; y: number }
  | { type: "ref"; value: string };
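
Because every variant carries a distinct type tag, the union can be narrowed with an ordinary switch. The following is a minimal sketch, not SDK code: the union is copied locally so the block stands alone, and describeSelector is a hypothetical helper introduced only for illustration.

```typescript
// Local copy of the BAPSelector union so the sketch is self-contained.
type BAPSelector =
  | { type: "css"; value: string }
  | { type: "xpath"; value: string }
  | { type: "role"; role: string; name?: string }
  | { type: "text"; value: string }
  | { type: "label"; value: string }
  | { type: "placeholder"; value: string }
  | { type: "testId"; value: string }
  | { type: "semantic"; description: string }
  | { type: "coordinates"; x: number; y: number }
  | { type: "ref"; value: string };

// Hypothetical helper (not part of the SDK): renders a selector as a short log string.
function describeSelector(selector: BAPSelector): string {
  switch (selector.type) {
    case "role":
      return selector.name === undefined
        ? `role=${selector.role}`
        : `role=${selector.role}[name="${selector.name}"]`;
    case "semantic":
      return `semantic="${selector.description}"`;
    case "coordinates":
      return `coordinates=(${selector.x}, ${selector.y})`;
    case "css":
    case "xpath":
    case "text":
    case "label":
    case "placeholder":
    case "testId":
    case "ref":
      // All remaining variants share the same { type, value } shape.
      return `${selector.type}=${selector.value}`;
    default: {
      // Compile-time exhaustiveness check: fails to type-check if a variant is unhandled.
      const unreachable: never = selector;
      return unreachable;
    }
  }
}
```

The never assignment in the default branch means that adding an eleventh variant to the local union without handling it becomes a compile error rather than a silent fallthrough.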

InteractiveElement

Returned by agent/observe. Represents a single interactive element on the page with a stable reference, selector, and metadata.
ref
string
required
Stable element reference ID (e.g., "@e1" or "@submitBtn"). Persists across observations for the same element.
selector
BAPSelector
required
Pre-computed selector that uniquely targets this element. Use this directly in action methods.
role
string
required
ARIA role (e.g., "button", "textbox", "link", "checkbox").
name
string
Accessible name of the element.
value
string
Current value (for input elements).
actionHints
string[]
required
Actions that can be performed on the element: "clickable", "editable", "selectable", "checkable", "expandable", "draggable", "scrollable", "submittable".
tagName
string
required
HTML tag name (e.g., "button", "input", "a").
bounds
ElementBounds
Bounding box { x, y, width, height } if includeBounds was requested.
focused
boolean
Whether the element currently has focus.
disabled
boolean
Whether the element is disabled.
stability
RefStability
Ref stability indicator: "stable" (same element, same ref), "new" (newly discovered), "moved" (element found but ref changed).
alternativeSelectors
BAPSelector[]
Alternative selectors ordered by reliability: testId, role+name, css #id, text, css path.
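
As an illustration of how these fields might be consumed, the sketch below filters an observed element list by action hint and skips disabled elements. InteractiveElementLike is a trimmed local stand-in for the real type, and withHint and the sample data are hypothetical names, not SDK exports.

```typescript
type RefStability = "stable" | "new" | "moved";

// Trimmed-down local shape with only the fields this sketch reads.
interface InteractiveElementLike {
  ref: string;
  role: string;
  name?: string;
  actionHints: string[];
  disabled?: boolean;
  stability?: RefStability;
}

// Hypothetical helper: enabled elements that advertise the given action hint.
function withHint(
  elements: InteractiveElementLike[],
  hint: string
): InteractiveElementLike[] {
  return elements.filter(
    (el) => el.actionHints.indexOf(hint) !== -1 && el.disabled !== true
  );
}

// Made-up sample data for illustration only.
const sampleElements: InteractiveElementLike[] = [
  { ref: "@e1", role: "button", name: "Submit", actionHints: ["clickable", "submittable"], stability: "stable" },
  { ref: "@e2", role: "textbox", name: "Email", actionHints: ["editable"], stability: "new" },
  { ref: "@e3", role: "button", name: "Delete", actionHints: ["clickable"], disabled: true },
];

const clickableRefs = withHint(sampleElements, "clickable").map((el) => el.ref);
```

With the sample data above, clickableRefs contains only "@e1", because "@e3" is clickable but disabled.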

ExecutionStep

Defines a single step in an agent/act sequence.
action
string
required
The BAP method to execute (e.g., "action/click", "action/fill", "page/navigate").
params
Record<string, unknown>
required
Parameters for the action.
label
string
Human-readable label for logging and debugging.
condition
StepCondition
Pre-condition that must be met before execution. Object with selector, state ("visible" | "enabled" | "exists" | "hidden" | "disabled"), and optional timeout.
onError
string
Error handling strategy: "stop" (default), "skip", or "retry".
maxRetries
number
Max retries if onError is "retry" (1-5).
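
A step that uses the optional fields could be written as a plain object. The sketch below declares local types mirroring the fields above and adds a hypothetical checkStep helper that enforces the documented 1-5 range for maxRetries. Narrowing onError to a three-value union and the loose SelectorLike type are assumptions made for the sketch, since the field is documented as a string.

```typescript
// Loose stand-in for BAPSelector so the sketch stays short.
type SelectorLike = { type: string; [key: string]: unknown };

interface StepCondition {
  selector: SelectorLike;
  state: "visible" | "enabled" | "exists" | "hidden" | "disabled";
  timeout?: number;
}

interface ExecutionStepLike {
  action: string;
  params: Record<string, unknown>;
  label?: string;
  condition?: StepCondition;
  onError?: "stop" | "skip" | "retry"; // assumption: narrowed from string
  maxRetries?: number;
}

// Hypothetical helper: returns a list of problems, empty when the step looks valid.
function checkStep(step: ExecutionStepLike): string[] {
  const problems: string[] = [];
  if (step.onError === "retry" && step.maxRetries !== undefined) {
    const n = step.maxRetries;
    if (!Number.isInteger(n) || n < 1 || n > 5) {
      problems.push(`maxRetries must be an integer from 1 to 5, got ${n}`);
    }
  }
  return problems;
}

// Click "Submit" only once it is enabled, retrying up to three times on failure.
const submitStep: ExecutionStepLike = {
  action: "action/click",
  params: { selector: { type: "role", role: "button", name: "Submit" } },
  label: "Submit the form",
  condition: {
    selector: { type: "role", role: "button", name: "Submit" },
    state: "enabled",
  },
  onError: "retry",
  maxRetries: 3,
};

const submitStepProblems = checkStep(submitStep);
```

The condition deliberately omits timeout, since this page does not state its unit.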

Building Steps

import { BAPClient } from "@browseragentprotocol/client";

const steps = [
  BAPClient.step("action/fill", {
    selector: { type: "label", value: "Email" },
    value: "user@example.com"
  }),
  BAPClient.step("action/click", {
    selector: { type: "role", role: "button", name: "Submit" }
  }),
];

AgentActResult

Returned by agent/act. Contains results for each step and optional fused observations.
completed
number
required
Number of steps completed successfully.
total
number
required
Total number of steps in the sequence.
success
boolean
required
Whether all steps succeeded.
results
StepResult[]
required
Results for each step in order. Each contains step (index), success, duration, optional result data, and optional error.
duration
number
required
Total execution time in milliseconds.
failedAt
number
Index of the first failed step (if any).
preObservation
AgentObserveResult
Pre-execution observation result (if preObserve was set).
postObservation
AgentObserveResult
Post-execution observation result (if postObserve was set).
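
One way to consume this result is to branch on success and report failedAt otherwise. The sketch below uses local types that mirror the fields listed above. summarizeAct is a hypothetical helper, and typing the per-step error as unknown is an assumption because this page does not give its type.

```typescript
interface StepResultLike {
  step: number;
  success: boolean;
  duration: number;
  result?: unknown;
  error?: unknown; // assumption: the error type is not specified on this page
}

interface AgentActResultLike {
  completed: number;
  total: number;
  success: boolean;
  results: StepResultLike[];
  duration: number;
  failedAt?: number;
}

// Hypothetical helper: one-line summary suitable for logs.
function summarizeAct(result: AgentActResultLike): string {
  if (result.success) {
    return `all ${result.total} steps succeeded in ${result.duration} ms`;
  }
  const failedAt =
    result.failedAt === undefined ? "unknown" : String(result.failedAt);
  return `${result.completed}/${result.total} steps completed; first failure at step index ${failedAt}`;
}

// Made-up result for illustration only.
const failedRun: AgentActResultLike = {
  completed: 1,
  total: 2,
  success: false,
  results: [
    { step: 0, success: true, duration: 120 },
    { step: 1, success: false, duration: 720 },
  ],
  duration: 840,
  failedAt: 1,
};

const failedRunSummary = summarizeAct(failedRun);
```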

AgentObserveResult

Returned by agent/observe. Contains the page snapshot.
metadata
ObserveMetadata
Page metadata: url, title, viewport { width, height }.
interactiveElements
InteractiveElement[]
List of interactive elements with stable refs and selectors.
totalInteractiveElements
number
Total number of interactive elements on the page (may exceed the length of interactiveElements if maxElements was set).
accessibility
{ tree: AccessibilityNode[] }
Full accessibility tree (if includeAccessibility was set).
screenshot
ObserveScreenshot
Screenshot object with the fields data (base64-encoded image), format, width, height, and annotated.
annotationMap
AnnotationMapping[]
Mapping from annotation labels to element refs and positions (if screenshot annotation was used).
changes
ObserveChanges
Incremental changes: added, updated, removed (if incremental: true).
webmcpTools
WebMCPTool[]
WebMCP tools discovered on the page (if includeWebMCPTools: true).
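
Since none of these fields is marked required, consumers should guard against missing values. The sketch below shows one way to detect a truncated element list by comparing totalInteractiveElements with the returned array. The local types cover only the fields used, and omittedElementCount is a hypothetical helper.

```typescript
interface ObservedElementLike {
  ref: string;
  role: string;
  name?: string;
}

// Only the two fields this sketch reads; both are optional, as in the listing above.
interface AgentObserveResultLike {
  interactiveElements?: ObservedElementLike[];
  totalInteractiveElements?: number;
}

// Hypothetical helper: how many interactive elements were left out of the list.
function omittedElementCount(observation: AgentObserveResultLike): number {
  const returned = observation.interactiveElements?.length ?? 0;
  // Fall back to the returned count when the total is not reported.
  const total = observation.totalInteractiveElements ?? returned;
  return Math.max(0, total - returned);
}

// Made-up observation for illustration only.
const truncatedObservation: AgentObserveResultLike = {
  interactiveElements: [
    { ref: "@e1", role: "button", name: "Submit" },
    { ref: "@e2", role: "textbox", name: "Email" },
  ],
  totalInteractiveElements: 7,
};

const omitted = omittedElementCount(truncatedObservation);
```

A non-zero count suggests observing again with a larger maxElements or a narrower scope.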

AgentExtractResult

Returned by agent/extract.
success
boolean
required
Whether extraction succeeded.
data
unknown
required
Extracted data matching the provided schema.
sources
ExtractionSourceRef[]
Source element references for extracted data (if includeSourceRefs was set).
confidence
number
Confidence score (0-1) for the extraction.
error
string
Error message if extraction failed.
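
Because data is typed as unknown, callers typically check success, and optionally confidence, before using it. The following is a minimal sketch with a local result type. unwrapExtraction and the threshold parameter are hypothetical, and rejecting low-confidence results is a design choice of this example rather than documented protocol behavior.

```typescript
interface AgentExtractResultLike {
  success: boolean;
  data: unknown;
  confidence?: number;
  error?: string;
}

// Hypothetical helper: returns data, or throws when extraction failed
// or the reported confidence is below the caller's threshold.
function unwrapExtraction(
  result: AgentExtractResultLike,
  minConfidence: number
): unknown {
  if (!result.success) {
    throw new Error(result.error ?? "extraction failed");
  }
  if (result.confidence !== undefined && result.confidence < minConfidence) {
    throw new Error(
      `confidence ${result.confidence} is below threshold ${minConfidence}`
    );
  }
  return result.data;
}

// Made-up result for illustration only.
const extracted = unwrapExtraction(
  { success: true, data: { title: "Example" }, confidence: 0.92 },
  0.8
);
```

Results that omit confidence are passed through unchanged here, since the field is optional.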

ExtractionSchema

Simplified JSON Schema subset (max 2 levels deep) used by agent/extract:
{
  type: "object" | "array" | "string" | "number" | "boolean",
  properties?: Record<string, {
    type: string;
    description?: string;
    properties?: Record<string, { type: string; description?: string }>;
    items?: { type: string; description?: string; properties?: Record<...> };
  }>,
  required?: string[],
  items?: { type: string; description?: string; properties?: Record<...> },
  description?: string,
}
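
As a concrete illustration, a schema for a product listing might look like the sketch below. The local types cover only the parts of the shape that are spelled out above. The nested properties on items are elided in the listing, so they are left out here. productSchema and undeclaredRequired are hypothetical names.

```typescript
type SchemaLeaf = { type: string; description?: string };

type SchemaProperty = SchemaLeaf & {
  properties?: Record<string, SchemaLeaf>;
  items?: SchemaLeaf; // nested properties on items are elided in the source and omitted here
};

interface ExtractionSchemaLike {
  type: "object" | "array" | "string" | "number" | "boolean";
  properties?: Record<string, SchemaProperty>;
  required?: string[];
  items?: SchemaLeaf;
  description?: string;
}

// Example schema: a top-level object whose deepest nesting is seller.name (two levels).
const productSchema: ExtractionSchemaLike = {
  type: "object",
  description: "A product listing",
  properties: {
    title: { type: "string", description: "Product name" },
    price: { type: "number", description: "Listed price" },
    tags: { type: "array", items: { type: "string" } },
    seller: { type: "object", properties: { name: { type: "string" } } },
  },
  required: ["title", "price"],
};

// Hypothetical sanity check: names listed in required but missing from properties.
function undeclaredRequired(schema: ExtractionSchemaLike): string[] {
  const declared = schema.properties ?? {};
  return (schema.required ?? []).filter(
    (key) => !Object.prototype.hasOwnProperty.call(declared, key)
  );
}

const productSchemaGaps = undeclaredRequired(productSchema);
```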

Allowed Act Actions

These method names are permitted in ExecutionStep.action:
Action              Description
action/click        Click element
action/dblclick     Double-click
action/fill         Fill input
action/type         Type character by character
action/press        Press key
action/hover        Hover
action/scroll       Scroll
action/select       Select option
action/check        Check checkbox
action/uncheck      Uncheck checkbox
action/clear        Clear input
action/upload       Upload file
action/drag         Drag element
page/navigate       Navigate to URL
page/reload         Reload page
page/goBack         Go back
page/goForward      Go forward
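
To catch a disallowed method name before sending a sequence, the list above can be mirrored in a client-side guard. This is a sketch only: ALLOWED_ACT_ACTIONS and isAllowedActAction are hypothetical names, not SDK exports, and the server remains the authority on what it accepts.

```typescript
// The method names listed above, copied into a readonly tuple.
const ALLOWED_ACT_ACTIONS = [
  "action/click",
  "action/dblclick",
  "action/fill",
  "action/type",
  "action/press",
  "action/hover",
  "action/scroll",
  "action/select",
  "action/check",
  "action/uncheck",
  "action/clear",
  "action/upload",
  "action/drag",
  "page/navigate",
  "page/reload",
  "page/goBack",
  "page/goForward",
] as const;

// Union of the literal method names, derived from the tuple.
type AllowedActAction = (typeof ALLOWED_ACT_ACTIONS)[number];

// Hypothetical type guard: narrows a plain string to an allowed action name.
function isAllowedActAction(action: string): action is AllowedActAction {
  return (ALLOWED_ACT_ACTIONS as readonly string[]).indexOf(action) !== -1;
}
```

Deriving the union from the tuple keeps the runtime list and the compile-time type in sync when an action is added.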