Changes from 5 commits
Commits
60 commits
0048c84
feat: add browser-swarm extension bridge POC
shrey150 May 7, 2026
9b923c9
docs: clarify real browser setup for browser-swarm
shrey150 May 8, 2026
879231c
docs: add browser-swarm real browser setup helper
shrey150 May 8, 2026
e4e71d3
docs: default browser-swarm setup to OS opener
shrey150 May 8, 2026
86a3aa3
feat: harden browser-swarm worker isolation
shrey150 May 14, 2026
6d46b7d
fix: disable browser-swarm grouping for Arc
shrey150 May 14, 2026
f0de6bc
feat: isolate disposable Chrome relay ports
shrey150 May 14, 2026
1c34fea
fix: stabilize browser-swarm parallel input
shrey150 May 14, 2026
54a3517
chore: expose browser-swarm extension version
shrey150 May 14, 2026
ba3cf0a
docs: note browser-swarm extension version checks
shrey150 May 14, 2026
4bca710
docs: record live browser-swarm worker stress test
shrey150 May 14, 2026
b5ad544
docs: tighten browser-swarm worker command contract
shrey150 May 14, 2026
f9e9981
docs: record Arc mixed-worker swarm evidence
shrey150 May 14, 2026
7f5c93a
docs: record Arc service worker refresh blocker
shrey150 May 14, 2026
7dc89ce
fix: detect stale browser-swarm extension workers
shrey150 May 14, 2026
a3fe8ad
fix: harden browser-swarm relay edge cases
shrey150 May 14, 2026
de6e323
docs: record latest mixed-worker stress evidence
shrey150 May 14, 2026
5297512
fix: harden browser-swarm target scoping
shrey150 May 14, 2026
fbfdf0d
docs: tighten browser-swarm selector guidance
shrey150 May 14, 2026
4ca2cfc
docs: record disposable Arc launch blocker
shrey150 May 14, 2026
7f46e5e
fix: avoid duplicate browser-swarm detach events
shrey150 May 14, 2026
25ddd91
docs: record latest browser-swarm e2e evidence
shrey150 May 14, 2026
87267b6
fix: route root lifecycle commands through relay state
shrey150 May 14, 2026
9db5fda
docs: tighten browser-swarm worker session contract
shrey150 May 14, 2026
d455b35
docs: record latest browser-swarm chrome e2e
shrey150 May 14, 2026
a4d303e
docs: record latest mixed-worker swarm evidence
shrey150 May 14, 2026
cf50bc1
docs: record browser-swarm skill sync
shrey150 May 14, 2026
d23eef0
docs: record arc stale-worker probe
shrey150 May 14, 2026
e5a29bd
docs: clarify browser-swarm runtime evidence
shrey150 May 14, 2026
fbc59e3
docs: clarify arc extension refresh
shrey150 May 14, 2026
d998972
docs: stabilize browser-swarm evidence wording
shrey150 May 14, 2026
ef89ec2
docs: record current browser-swarm e2e evidence
shrey150 May 14, 2026
ae6b1ab
docs: record arc refresh blockers
shrey150 May 14, 2026
326bf4a
docs: record arc serialized click evidence
shrey150 May 14, 2026
886874f
test: add arc serialized click e2e
shrey150 May 14, 2026
c4f8945
docs: clarify arc parallel click check
shrey150 May 14, 2026
7e87cf2
docs: record latest chrome e2e run
shrey150 May 14, 2026
d8d6603
docs: record browser-swarm completion audit
shrey150 May 14, 2026
2c3e44e
docs: avoid volatile bugbot head in notes
shrey150 May 14, 2026
aed3c89
test: add arc parallel click gate
shrey150 May 14, 2026
9828748
docs: record arc self reload probe
shrey150 May 14, 2026
a9f3453
docs: record latest chrome swarm e2e
shrey150 May 14, 2026
6a611aa
test: share arc click e2e harness
shrey150 May 14, 2026
a726f6c
fix: version browser-swarm extension worker
shrey150 May 14, 2026
58a75fa
docs: clarify arc parallel click gate
shrey150 May 14, 2026
aba8036
docs: record current arc serialized smoke
shrey150 May 14, 2026
6ea6cf8
docs: record latest mixed agent smoke
shrey150 May 14, 2026
74a7236
docs: record arc update check probe
shrey150 May 14, 2026
060734a
docs: add browser-swarm completion checklist
shrey150 May 14, 2026
a681116
docs: record fresh arc stale-worker gate
shrey150 May 14, 2026
962f51f
docs: record disposable arc retry
shrey150 May 14, 2026
7d68f72
docs: record arc service worker registration evidence
shrey150 May 14, 2026
83bf2de
test: add arc worker stale registration diagnostic
shrey150 May 14, 2026
1ab52de
test: count exact arc worker registration urls
shrey150 May 14, 2026
7ac47b9
docs: record arc target serviceworker probe
shrey150 May 14, 2026
eebb503
test: point stale arc gate to diagnostic
shrey150 May 14, 2026
112ecf8
test: assert browser-swarm extension metadata
shrey150 May 14, 2026
b172bf8
docs: record current codex worker stress
shrey150 May 14, 2026
4a649dc
test: report arc worker preference registration
shrey150 May 14, 2026
edad4bb
docs: record disposable arc retry evidence
shrey150 May 14, 2026
15 changes: 15 additions & 0 deletions .claude-plugin/marketplace.json
@@ -24,6 +24,21 @@
"./skills/browser"
]
},
{
"name": "browser-swarm",
"source": "./",
"description": "Coordinate multiple browser agents in one real Chrome profile through a Chrome extension bridge, colored tab group, and target-bound browse CLI endpoints.",
"version": "0.0.1",
"author": {
"name": "Browserbase"
},
"category": "automation",
"keywords": ["browser", "swarm", "chrome-extension", "tab-groups", "stagehand", "understudy", "browse-cli"],
"strict": false,
"skills": [
"./skills/browser-swarm"
]
},
{
"name": "functions",
"source": "./",
1 change: 1 addition & 0 deletions README.md
@@ -9,6 +9,7 @@ This plugin includes the following skills (see `skills/` for details):
| Skill | Description |
|-------|-------------|
| [browser](skills/browser/SKILL.md) | Automate web browser interactions via CLI commands — supports remote Browserbase sessions with anti-bot stealth, CAPTCHA solving, and residential proxies |
| [browser-swarm](skills/browser-swarm/SKILL.md) | Coordinate multiple browser agents in one real Chrome profile through a Chrome extension bridge, colored tab group, and target-bound browse CLI endpoints |
| [browserbase-cli](skills/browserbase-cli/SKILL.md) | Use the official `bb` CLI for Browserbase Functions and platform API workflows including sessions, projects, contexts, extensions, fetch, and dashboard |
| [functions](skills/functions/SKILL.md) | Deploy serverless browser automation to Browserbase cloud using the `bb` CLI |
| [site-debugger](skills/site-debugger/SKILL.md) | Diagnose and fix failing browser automations — analyzes bot detection, selectors, timing, auth, and captchas, then generates a tested site playbook |
204 changes: 204 additions & 0 deletions skills/browser-swarm/SKILL.md
@@ -0,0 +1,204 @@
---
name: browser-swarm
description: Coordinate multiple browser agents in one real Chromium-family profile through a Chrome extension bridge, a colored tab group, and target-bound browse CLI endpoints.
compatibility: "Requires Node.js 20+, a Chromium-family browser with extension support, the browse CLI with --cdp and --session support, a locally loaded browser-swarm Chrome extension, and the /browser skill for CLI command reference."
license: MIT
allowed-tools: Bash
---

# Browser Swarm

Use this skill when one task benefits from several independent browser workstreams that should share the user's real browser profile, cookies, and extensions.

The top-level agent owns the user intent, decomposition, business logic, synthesis, and approval gates. Worker agents own one bounded browser workstream each. The browser-swarm adapter owns the worker-to-tab mapping and gives each worker a scoped browser capability.

This is tab capability isolation, not full browser identity isolation. All workers share the same real browser profile, cookies, local storage, extensions, and logged-in user state. The isolation promise is that each worker's browse endpoint exposes only its assigned tab.

## Architecture

The swarm has three parts:

1. A local adapter script in `scripts/swarm-relay.mjs`.
2. A Manifest V3 Chrome extension in `extension/`.
3. One unique `browse --session <worker-session> --cdp <target-bound-url>` context per worker.

Flow:

```text
top-level agent
  -> starts/checks browser-swarm adapter
  -> ensures N labeled tabs
  -> spawns worker agents with one command contract each
  <- receives evidence and synthesizes final answer

worker agent
  -> browse ... --session <worker-session> --cdp <target-bound-url>
  -> sees only its assigned tab

The extension is browser transport and tab control. The `browse` CLI remains the worker-facing browser API. The adapter exposes one target-bound CDP URL per tab so each worker keeps a stable active-page model without cross-agent tab races.

## Setup

From the skills repo:

```bash
cd skills/browser-swarm
npm install
```

Start the real-browser setup helper:

```bash
node scripts/setup-real-browser.mjs
```

### Real Browser Mode

Use this mode when the user wants the swarm in their own browser profile, for example Arc, Chrome, Chrome Canary, Chromium, or Chrome for Testing.

By default the helper opens `chrome://extensions` through the OS URL opener using the cross-platform `open` package. On a user's machine this should land in their default Chromium-family browser; for example Arc may route it to `arc://extensions`. This is not profile detection. If the opened browser/profile is not the one the user wants controlled, stop and rerun with an explicit browser.

The setup helper starts the relay if needed, opens the extension management page, prints the unpacked extension path, and waits until the extension connects:

```bash
node scripts/setup-real-browser.mjs
node scripts/setup-real-browser.mjs --browser arc
node scripts/setup-real-browser.mjs --browser chrome
node scripts/setup-real-browser.mjs --browser canary
node scripts/setup-real-browser.mjs --browser chromium
node scripts/setup-real-browser.mjs --browser chrome-for-testing
node scripts/setup-real-browser.mjs --extensions-url arc://extensions
```

The user must still approve/install the extension in the browser they want controlled:

1. Enable developer mode if needed.
2. Click "Load unpacked".
3. Select the printed `skills/browser-swarm/extension` path.
4. Wait for the helper or confirm manually:

```bash
curl -s http://127.0.0.1:19989/health
```

Proceed only when `extensionConnected` is `true`. If it is false, ask the user to confirm the extension is installed and enabled in the chosen browser/profile.
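
The manual check above can also be run as a polling loop. This is a minimal sketch, assuming the health endpoint returns JSON with a boolean `extensionConnected` field (the only field named here). The names `isExtensionConnected` and `waitForExtension`, the timeout, and the poll interval are hypothetical, and this is not necessarily how the setup helper waits.

```javascript
// Only the URL and the `extensionConnected` field come from this skill;
// the helper names and timing defaults are assumptions for illustration.
const HEALTH_URL = "http://127.0.0.1:19989/health";

// Returns true only when the parsed health payload reports a connected extension.
function isExtensionConnected(health) {
  return Boolean(health) && health.extensionConnected === true;
}

// Polls the health endpoint until the extension connects or the timeout elapses.
// Relies on the global fetch that ships with Node.js 18 and later.
async function waitForExtension({ timeoutMs = 60_000, intervalMs = 1_000 } = {}) {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    try {
      const response = await fetch(HEALTH_URL);
      if (response.ok && isExtensionConnected(await response.json())) return true;
    } catch {
      // Relay not reachable yet; keep polling until the deadline.
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  return false;
}
```

A caller would typically stop and ask the user to check the extension when `waitForExtension()` resolves to `false`, rather than proceeding to create tabs.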

Do not try to install an unpacked extension into an already-running personal browser profile without the user's approval. The only automated install path in this POC is launching a separate browser process with `--load-extension`, which creates a separate test browser rather than using the user's active browser.

The default supported relay port is `19989`. The current extension connects to that port by default; non-default ports are only supported after the extension has been explicitly configured to use that port.

### Disposable Test Browser Mode

Use this mode only for e2e tests, demos, and throwaway profiles. It launches a separate browser profile:

```bash
node scripts/launch-chrome.mjs
```

The relay listens only on `127.0.0.1`.

## Prerequisites

The `/browser` skill contains the canonical `browse` CLI command reference. Ensure it is installed, then read it:

```bash
# Install if not already present
npx skills add browserbase/skills --skill browser -a '*' -g -y

# Load the command reference into context
cat ~/.agents/skills/browser/SKILL.md
```

Use only commands from that reference and the worker contract below. Do not invent flags or subcommands.

## Create A Swarm

Allocate one tab per workstream:

```bash
node scripts/swarm-relay.mjs ensure \
  --count 3 \
  --label flights \
  --label rentals \
  --label dinner \
  --url "https://www.google.com/travel/flights" \
  --url "https://www.google.com/search?q=san+diego+surfing+rentals+downtown" \
  --url "https://www.kayak.com/San-Diego.10760.guide" \
  --json
```

The response contains a `wsUrl` per target. The top-level agent must create one unique browse session name per worker and hand exactly one `wsUrl` plus one session name to each worker.

Session naming pattern:

```text
browser-swarm-<label>-<short-id>
```
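
One way to generate names in this pattern is sketched below. `makeSessionName` is a hypothetical helper, and the 8-character hex id mirrors the example worker commands in this skill rather than a documented requirement. The default id uses the global Web Crypto API, which is available without flags on the Node.js 20+ runtime this skill requires.

```javascript
// Builds a session name following the browser-swarm-<label>-<short-id> pattern.
// The helper name and the label normalization rules are assumptions for illustration.
function makeSessionName(label, shortId = globalThis.crypto.randomUUID().slice(0, 8)) {
  // Normalize the label so the session name stays shell- and filename-friendly.
  const safeLabel = String(label)
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
  if (!safeLabel) throw new Error("label must contain at least one letter or digit");
  return `browser-swarm-${safeLabel}-${shortId}`;
}

console.log(makeSessionName("flights", "a1b2c3d4")); // browser-swarm-flights-a1b2c3d4
```

Generating the id once per worker and reusing the resulting name for every command keeps each worker on a single browse session.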

## Worker Contract

Every worker must:

- Use only its assigned `wsUrl` by passing `--cdp "<wsUrl>"` on every `browse` command.
- Use only its assigned session by passing `--session "<session>"` on every `browse` command.
- Never use `browse tab new`, `browse tab close`, or `browse tab switch`.
- Only use commands documented in the `/browser` skill.
- Return concrete evidence: final URL, title, useful extracted facts, and screenshot path when relevant.
- Avoid irreversible actions such as purchases, reservations, or form submission without explicit user confirmation.
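
The mechanical parts of this contract can be checked before a command runs. The sketch below is a hypothetical pre-flight check, not part of the adapter: `checkWorkerCommand` and its return shape are invented for illustration, and it assumes the subcommand comes first in the argument list, as in the example commands in this skill. It does not verify that a command exists in the `/browser` reference.

```javascript
// Hypothetical pre-flight check for a worker's browse argv (the arguments after `browse`).
// Covers only the flag and tab-lifecycle rules from the worker contract.
const FORBIDDEN_TAB_ACTIONS = new Set(["new", "close", "switch"]);

function checkWorkerCommand(argv, { session, wsUrl }) {
  const problems = [];
  // Returns the value following a flag, or undefined when the flag is absent.
  const flagValue = (flag) => {
    const index = argv.indexOf(flag);
    return index === -1 ? undefined : argv[index + 1];
  };
  if (flagValue("--session") !== session) problems.push("missing or wrong --session");
  if (flagValue("--cdp") !== wsUrl) problems.push("missing or wrong --cdp");
  if (argv[0] === "tab" && FORBIDDEN_TAB_ACTIONS.has(argv[1])) {
    problems.push(`forbidden tab command: tab ${argv[1]}`);
  }
  return problems; // An empty array means the command passes these checks.
}
```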

Worker prompt shape:

```text
You own the "<label>" browser-swarm tab.

For every browse command, include both flags exactly:
--session "<session>"
--cdp "<wsUrl>"

<Include the Commands section from ~/.agents/skills/browser/SKILL.md here>

Do not create, close, or switch tabs. Do not use any other browser target.
Do not invent browse flags or commands. Only use commands from the reference above.
Find options, collect evidence, and report concise results.
```
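
The prompt shape above can be filled in programmatically. This is a minimal sketch; `renderWorkerPrompt` is a hypothetical helper, and `commandsReference` is assumed to already hold the Commands section copied from the `/browser` skill (the sketch does not read that file).

```javascript
// Renders the worker prompt shape for one worker assignment.
// The helper name and parameter object are assumptions for illustration.
function renderWorkerPrompt({ label, session, wsUrl, commandsReference }) {
  return [
    `You own the "${label}" browser-swarm tab.`,
    "",
    "For every browse command, include both flags exactly:",
    `--session "${session}"`,
    `--cdp "${wsUrl}"`,
    "",
    commandsReference,
    "",
    "Do not create, close, or switch tabs. Do not use any other browser target.",
    "Do not invent browse flags or commands. Only use commands from the reference above.",
    "Find options, collect evidence, and report concise results.",
  ].join("\n");
}
```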

Example worker commands:

```bash
browse get title --session "browser-swarm-flights-a1b2c3d4" --cdp "ws://127.0.0.1:19989/devtools/browser/<targetId>"
browse snapshot --compact --session "browser-swarm-flights-a1b2c3d4" --cdp "ws://127.0.0.1:19989/devtools/browser/<targetId>"
browse screenshot --path /tmp/browser-swarm/flights.png --session "browser-swarm-flights-a1b2c3d4" --cdp "ws://127.0.0.1:19989/devtools/browser/<targetId>"
```

## Offsite Pattern

For a task like "plan an offsite to San Diego next week - we need flights booked, surfing rentals and dinner near downtown":

1. Create three tabs: `flights`, `rentals`, `dinner`.
2. Spawn one worker per tab.
3. Assign `flights` to Google Flights or Kayak flights.
4. Assign `rentals` to San Diego surf rental search/results.
5. Assign `dinner` to restaurants near downtown San Diego.
6. Aggregate worker evidence into one plan and list any actions requiring approval before booking.
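
Step 1 can be derived from a list of workstreams. The sketch below builds the argument list for the `ensure` command using only the flags shown in "Create A Swarm"; `buildEnsureArgs` and the workstream object shape are hypothetical, and the URLs are the ones from that earlier example.

```javascript
// Builds the argv for `node scripts/swarm-relay.mjs ensure ...` from a workstream list.
// Only flags documented in this skill are emitted; the helper itself is an assumption.
function buildEnsureArgs(workstreams) {
  const args = ["scripts/swarm-relay.mjs", "ensure", "--count", String(workstreams.length)];
  for (const { label } of workstreams) args.push("--label", label);
  for (const { url } of workstreams) args.push("--url", url);
  args.push("--json");
  return args;
}

const offsite = [
  { label: "flights", url: "https://www.google.com/travel/flights" },
  { label: "rentals", url: "https://www.google.com/search?q=san+diego+surfing+rentals+downtown" },
  { label: "dinner", url: "https://www.kayak.com/San-Diego.10760.guide" },
];

// Print the equivalent command line for inspection.
console.log(["node", ...buildEnsureArgs(offsite)].join(" "));
```

Passing the array to a process-spawning API instead of joining it into a string avoids shell quoting issues with URLs that contain `?` or `&`.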

## Adapter Isolation Contract

The adapter must preserve the worker-owned-tab model:

- `Target.getTargets` on a target-bound endpoint returns only the assigned tab.
- `Target.getTargetInfo` and `Target.attachToTarget` reject sibling target IDs.
- Worker endpoints reject tab creation and closing. Tab lifecycle belongs to the top-level browser-swarm harness.
- Events are forwarded only to clients that own the event target.

This contract is what keeps subagents from racing over Chrome's active tab state. If a worker is handed a raw, unscoped ModCDP or browser CDP endpoint, this isolation contract no longer holds.
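
The rules above can be pictured as a small filter in front of each target-bound endpoint. This is an illustrative sketch of the contract, not the code in `scripts/swarm-relay.mjs`: the function names and return shapes are hypothetical, and the CDP method names (`Target.createTarget`, `Target.closeTarget`, `Target.getTargetInfo`, `Target.attachToTarget`) are assumed to be the ones the adapter intercepts.

```javascript
// Sketch of per-worker scoping for one target-bound endpoint.
// Helper names and return shapes are assumptions for illustration.
const LIFECYCLE_METHODS = new Set(["Target.createTarget", "Target.closeTarget"]);
const TARGET_ID_METHODS = new Set(["Target.getTargetInfo", "Target.attachToTarget"]);

// Decides whether a CDP command from a worker may be forwarded.
function scopeCommand(message, assignedTargetId) {
  const { method, params = {} } = message;
  if (LIFECYCLE_METHODS.has(method)) {
    return { allow: false, reason: "tab lifecycle belongs to the top-level harness" };
  }
  // Conservative choice: a missing targetId is also rejected for these methods.
  if (TARGET_ID_METHODS.has(method) && params.targetId !== assignedTargetId) {
    return { allow: false, reason: "target is not owned by this worker" };
  }
  return { allow: true };
}

// Filters a Target.getTargets result down to the assigned tab.
function scopeTargets(targetInfos, assignedTargetId) {
  return targetInfos.filter((info) => info.targetId === assignedTargetId);
}

// Decides whether an event should be forwarded to the client bound to `assignedTargetId`.
function ownsEvent(eventTargetId, assignedTargetId) {
  return eventTargetId === assignedTargetId;
}
```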

## Future ModCDP Substrate

ModCDP is a promising replacement for the hand-rolled extension transport and CDP forwarding code. Do not make workers connect directly to a generic ModCDP proxy as the final browser-swarm interface. The final shape should remain:

```text
browse CLI -> browser-swarm adapter -> ModCDP or extension transport -> real browser tab
```

ModCDP should become the lower-level substrate only after an adapter spike proves it can preserve the target-bound worker isolation behavior above.
27 changes: 27 additions & 0 deletions skills/browser-swarm/extension/manifest.json
@@ -0,0 +1,27 @@
{
  "manifest_version": 3,
  "name": "Browser Swarm Bridge",
  "version": "0.1.0",
  "description": "Bridge a scoped Chrome tab group to browser-swarm agents through localhost.",
  "permissions": [
    "alarms",
    "debugger",
    "storage",
    "tabGroups",
    "tabs"
  ],
  "host_permissions": [
    "<all_urls>"
  ],
  "background": {
    "service_worker": "service-worker.js",
    "type": "module"
  },
  "action": {
    "default_title": "Browser Swarm Bridge"
  },
  "content_security_policy": {
    "extension_pages": "script-src 'self'; connect-src 'self' ws://127.0.0.1:* http://127.0.0.1:*; object-src 'none';"
  }
}
