Fusion operations are optimizations that eliminate redundant DOM walks, serializations, and WebSocket roundtrips. They combine what would be multiple server calls into a single call, reducing latency and token usage. All fusion fields are optional protocol extensions, so they introduce no breaking changes: old clients and servers simply ignore the new fields.

The 7 Fusion Operations

1. Navigate + Observe

Fuse navigation with observation in one server call instead of two.
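# Without fusion: two server calls
bap goto https://example.com
bap observe
# With fusion: one server call
bap goto https://example.com --observe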
The server calls page.goto() then runs handleAgentObserve in-process, returning both the page state and interactive elements in a single response.

2. Act + Post-Observe

Get an observation after executing actions, without an extra roundtrip.
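# Without fusion: two server calls
bap act click:text:"Next"
bap observe
# With fusion: one server call
bap act click:text:"Next" --observe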
In MCP, pass postObserve: true in the act tool parameters.

3. Act + Pre-Observe

Capture page state before the action executes. Useful for diffing before/after states. In MCP, pass preObserve: true in the act tool parameters. The server calls handleAgentObserve before the step loop executes.
Pre-observe runs BEFORE steps execute (fixed in v0.6.0). Earlier versions had a bug where pre-observe ran after all steps completed.
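The ordering of pre-observe, the step loop, and post-observe can be sketched as follows. This is a minimal synchronous model written for illustration, not the server's code: `runAct`, `ActResult`, and the `Observation` shape are hypothetical, and the `observe` and `runStep` callbacks stand in for handleAgentObserve and one iteration of the step loop (both of which would be asynchronous in a real browser session).

```typescript
// Hypothetical shapes for illustration; the real BAP types may differ.
interface Observation {
  url: string;
  elementCount: number;
}

interface ActOptions {
  preObserve?: boolean;
  postObserve?: boolean;
}

interface ActResult {
  completedSteps: number;
  preObservation?: Observation;
  postObservation?: Observation;
}

// `observe` stands in for handleAgentObserve; `runStep` for one step of the loop.
function runAct(
  steps: string[],
  options: ActOptions,
  observe: () => Observation,
  runStep: (step: string) => void,
): ActResult {
  const result: ActResult = { completedSteps: 0 };

  // Pre-observe captures state before any step can mutate the page.
  if (options.preObserve) {
    result.preObservation = observe();
  }

  for (const step of steps) {
    runStep(step);
    result.completedSteps += 1;
  }

  // Post-observe runs inside the same call, so no extra roundtrip is needed.
  if (options.postObserve) {
    result.postObservation = observe();
  }

  return result;
}
```

With both flags set, the caller receives the before and after state in a single response and can diff them locally.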

4. Incremental Observe

Only return changes since the last observation instead of the full element list.
bap observe
# Returns all 50 interactive elements
The server snapshots previous element refs before updating the registry, then diffs to produce a changes object. In MCP, pass incremental: true to the observe tool.
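As a rough illustration of the snapshot-then-diff step, the sketch below compares a snapshot of the previous elements against the current list. The `ElementInfo` and `Changes` shapes and the `diffElements` name are assumptions made for this example; the actual layout of the changes object is not specified here.

```typescript
// Hypothetical element shape; only `ref` and `name` are compared in this sketch.
interface ElementInfo {
  ref: string;
  name: string;
}

// Assumed layout of the changes object.
interface Changes {
  added: ElementInfo[];
  updated: ElementInfo[];
  removed: string[]; // refs that are no longer present
}

function hasKey(obj: object, key: string): boolean {
  return Object.prototype.hasOwnProperty.call(obj, key);
}

// `previous` is the snapshot taken before the registry was updated.
function diffElements(
  previous: Record<string, ElementInfo>,
  current: ElementInfo[],
): Changes {
  const changes: Changes = { added: [], updated: [], removed: [] };
  const seen: Record<string, boolean> = {};

  for (const el of current) {
    seen[el.ref] = true;
    if (!hasKey(previous, el.ref)) {
      changes.added.push(el);
    } else if (previous[el.ref].name !== el.name) {
      changes.updated.push(el);
    }
  }

  // Anything in the snapshot that was not seen again has been removed.
  for (const ref of Object.keys(previous)) {
    if (!hasKey(seen, ref)) {
      changes.removed.push(ref);
    }
  }

  return changes;
}
```

When nothing on the page has changed, all three lists come back empty, which is where the token savings over a full element list come from.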

5. Response Tiers

Control how much data the observe response includes:
| Tier | What is included | Use case |
| --- | --- | --- |
| full | Interactive elements + accessibility tree + screenshot | First observation of a page |
| interactive | Interactive elements only (skips tree and screenshot) | Mid-workflow checks |
| minimal | Elements stripped to {ref, role, name, selector, tagName} | Tight token budgets |
bap observe --tier=minimal
bap act click:e3 --observe --tier=interactive
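To make the three tiers concrete, here is a small sketch of how a full observation might be reduced per tier. The `applyTier` function and the `FullObservation` and `TieredResponse` shapes are hypothetical; only the tier names and the five minimal fields come from the tier descriptions above.

```typescript
type ResponseTier = "full" | "interactive" | "minimal";

// The five fields kept by the minimal tier.
interface MinimalElement {
  ref: string;
  role: string;
  name: string;
  selector: string;
  tagName: string;
}

// Hypothetical richer element; `attributes` is an illustrative extra field.
interface ObservedElement extends MinimalElement {
  attributes?: Record<string, string>;
}

interface FullObservation {
  elements: ObservedElement[];
  accessibilityTree: string;
  screenshot: string;
}

interface TieredResponse {
  elements: MinimalElement[];
  accessibilityTree?: string;
  screenshot?: string;
}

function applyTier(full: FullObservation, tier: ResponseTier): TieredResponse {
  if (tier === "full") {
    return {
      elements: full.elements,
      accessibilityTree: full.accessibilityTree,
      screenshot: full.screenshot,
    };
  }
  if (tier === "interactive") {
    // Elements are returned as-is; the tree and screenshot are skipped.
    return { elements: full.elements };
  }
  // minimal: strip every element down to the five listed fields.
  return {
    elements: full.elements.map((el) => ({
      ref: el.ref,
      role: el.role,
      name: el.name,
      selector: el.selector,
      tagName: el.tagName,
    })),
  };
}
```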

6. Selector Caching

Each ElementRegistryEntry stores a cachedCssSelector. When resolving a selector, the server tries the cached CSS path for a direct lookup before falling back to semantic resolution. This eliminates DOM traversal for repeat interactions.
Selector caching is automatic. No flags are needed: it takes effect after the first successful resolution of an element.
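The cache-first lookup can be sketched roughly as below. Only `ElementRegistryEntry` and `cachedCssSelector` are names from the description above; `resolveElement`, the `Resolvers` interface, and the decision to drop a stale cached selector are assumptions made for this illustration.

```typescript
// Only cachedCssSelector is named in the text; `ref` is an assumed field.
interface ElementRegistryEntry {
  ref: string;
  cachedCssSelector?: string;
}

// Stand-ins for DOM access: `queryCss` is the direct lookup,
// `resolveSemantic` is the slower fallback that also yields a CSS path.
interface Resolvers<T> {
  queryCss(selector: string): T | null;
  resolveSemantic(ref: string): { element: T; cssSelector: string } | null;
}

function resolveElement<T>(
  entry: ElementRegistryEntry,
  resolvers: Resolvers<T>,
): T | null {
  if (entry.cachedCssSelector !== undefined) {
    const hit = resolvers.queryCss(entry.cachedCssSelector);
    if (hit !== null) {
      return hit; // fast path: no DOM traversal
    }
    // Assumption: a cached selector that no longer matches is discarded.
    delete entry.cachedCssSelector;
  }

  const resolved = resolvers.resolveSemantic(entry.ref);
  if (resolved === null) {
    return null;
  }
  entry.cachedCssSelector = resolved.cssSelector; // cache for the next lookup
  return resolved.element;
}
```

The first resolution pays for the semantic lookup; later interactions with the same element go through the cached path until it stops matching.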

7. Speculative Prefetch

After handleAgentAct, if the last step was a click or navigation, the server fires off an observe call 200ms later. The result is cached in ClientState.speculativeObservation (URL-matched, 5-second TTL). If the agent requests an observation within that window, the cached result is returned instantly.
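The URL-matched, time-limited cache can be modeled as shown below. This is a sketch under stated assumptions rather than the server's implementation: the `SpeculativeCache` class and its `store` and `take` methods are hypothetical, the clock is injectable so the TTL can be exercised without timers, the 200ms delay before the prefetch is not modeled, and treating a cached observation as single-use is an assumption.

```typescript
// Hypothetical observation shape.
interface Observation {
  url: string;
  elements: string[];
}

const SPECULATIVE_TTL_MS = 5000; // 5-second TTL, as described above

class SpeculativeCache {
  private entry: { observation: Observation; storedAt: number } | null = null;
  private readonly now: () => number;

  // `now` is injectable so expiry can be tested deterministically.
  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  // Called when the prefetched observe completes.
  store(observation: Observation): void {
    this.entry = { observation: observation, storedAt: this.now() };
  }

  // Returns the prefetched observation only if the URL matches and the
  // entry is still fresh. Assumption: an entry is consumed on first read.
  take(currentUrl: string): Observation | null {
    const entry = this.entry;
    this.entry = null;
    if (entry === null) return null;
    if (entry.observation.url !== currentUrl) return null;
    if (this.now() - entry.storedAt > SPECULATIVE_TTL_MS) return null;
    return entry.observation;
  }
}
```

A miss simply falls back to a normal observe, so the prefetch affects latency only, not correctness.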

Fusion in the TypeScript SDK

// Navigate + observe
await client.navigate("https://example.com", { observe: true });

// Act with pre/post observe
await client.act(steps, {
  preObserve: true,
  postObserve: true,
});

// Incremental observe
await client.observe({ incremental: true });

// Response tiers
await client.observe({ responseTier: "minimal" });

Fusion in MCP Tools

// navigate with fused observe
{ "tool": "navigate", "arguments": { "url": "https://example.com", "observe": true } }

// act with post-observe
{ "tool": "act", "arguments": { "steps": [...], "postObserve": true } }

// incremental observe with response tier
{ "tool": "observe", "arguments": { "incremental": true, "responseTier": "interactive" } }

Impact

Fusion operations cut the number of server calls in multi-step workflows by 50% to 75% in the scenarios below:
| Scenario | Without fusion | With fusion | Reduction |
| --- | --- | --- | --- |
| Navigate + observe | 2 calls | 1 call | 50% |
| Login form (fill + fill + click + observe) | 4 calls | 1 call | 75% |
| Browse 5 pages with observation | 10 calls | 5 calls | 50% |
In CLI mode, the --observe flag is the single most impactful fusion. Add it to goto and act commands whenever the agent needs to see the page after the action.
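The reduction figures in the Impact table are plain ratios of calls saved to calls originally made. The helper names below (`callReduction`, `browseCalls`) are made up for this example; the sketch just reproduces the arithmetic so the same estimate can be applied to other workflows.

```typescript
// Percentage of server calls saved when `fused` calls replace `unfused` calls.
function callReduction(unfused: number, fused: number): number {
  if (unfused <= 0) {
    throw new RangeError("unfused must be positive");
  }
  return Math.round(((unfused - fused) / unfused) * 100);
}

// Browsing N pages: goto + observe per page without fusion,
// one fused goto per page with it.
function browseCalls(pages: number): { unfused: number; fused: number } {
  return { unfused: pages * 2, fused: pages };
}

const loginReduction = callReduction(4, 1); // fill + fill + click + observe in one act call
const browse = browseCalls(5);
const browseReduction = callReduction(browse.unfused, browse.fused);
```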