browserbase · shubh24 · May 30, 2026 · Jun 1, 2026
diff --git a/README.md b/README.md
@@ -19,6 +19,7 @@ This plugin includes the following skills (see `skills/` for details):
 | [fetch](skills/fetch/SKILL.md) | Fetch HTML or JSON from static pages without a browser session — inspect status codes, headers, follow redirects |
 | [search](skills/search/SKILL.md) | Search the web and return structured results (titles, URLs, metadata) without a browser session |
 | [ui-test](skills/ui-test/SKILL.md) | AI-powered adversarial UI testing — analyzes git diffs to test changes, or explores the full app to find bugs |
+| [browsability](skills/browsability/SKILL.md) | Assess how usable a website is by an AI browser agent — how much stealth/proxy/captcha help it needs to get in, whether controls are labeled/reachable, iframe/shadow-DOM traps, and extra steps vs a human; reports what helps and what hurts, with concrete fixes (no numeric score) |
 
 ## Installation
 

diff --git a/skills/browsability/LICENSE.txt b/skills/browsability/LICENSE.txt
@@ -0,0 +1,21 @@
+MIT License
+
+Copyright (c) 2026 Browserbase, Inc.
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
diff --git a/skills/browsability/SKILL.md b/skills/browsability/SKILL.md
@@ -0,0 +1,86 @@
+---
+name: browsability
+description: "Assess how usable a website is BY AN AI BROWSER AGENT — its browsability. Look at how little infrastructure help the agent needs to get in (stealth/proxy/captcha), whether it can perceive and drive the live DOM (are controls labeled and reachable, are there iframe/shadow-DOM/deep-DOM traps), and how many extra steps it takes versus a human. Report what helps and what hurts, with concrete fixes — no numeric score. Use when the user asks how browsable / agent-friendly / agent-ready a website or a specific web flow (signup, checkout, search) is for a BROWSER agent, to compare sites on browser-agent usability, or to get a browsability report with fixes. Triggers: 'how browsable is <site>', 'is this site agent-friendly for a browser agent', 'check this checkout/signup flow for agents', 'browser-agent friendliness', 'DOM friction', 'browsability of <url>'. NOT for SEO/AEO or content discoverability (a different layer), and NOT for docs/SDK onboarding DX (use the agent-experience skill for that)."
+license: MIT
+metadata:
+  author: browserbase
+  version: "0.2.0"
+allowed-tools: Read Bash Glob Grep Agent
+compatibility: "Uses the browse CLI (`npm install -g @browserbasehq/browse-cli`) via the `browse` skill to look at and drive the site. Remote mode needs BROWSERBASE_API_KEY."
+---
+
+# Browsability — how usable is a site for a browser agent?
+
+Judge how well an AI **browser** agent can *operate* a website. The idea is simple:
+
+> **Browsability is how little help an agent needs to succeed — and how much harder the site is for
+> an agent than for a person.** A 10-click checkout that takes a human 10 clicks too is fine; a
+> 3-click task that takes the agent 10 because the buttons are unlabeled is not — those extra clicks
+> are the agent's problem, not the workflow's.
+
+This is the *operability* layer — driving the live UI. It is **not** discoverability, so ignore
+`llms.txt`, sitemaps, and SEO/AEO. It is also distinct from docs/SDK onboarding (that's the
+`agent-experience` skill).
+
+> **Agents run on remote/cloud browsers.** So the target environment is a *remote* browser, not your
+> local one. A site that works in a local/residential browser but **blocks or errors on a remote
+> browser is, by definition, not browsable** — an agent literally can't use it. Treat that gap as a
+> top finding, not a footnote.
+
+There is **no scoring formula here.** Look at the site with your own eyes (and the agent's), use the
+checklist in `references/rubric.md` as a guide for what tends to matter, and decide what actually
+matters for *this* site. Then report what helps and what hurts.
+
+## How to assess
+
+1. **Actually try to use the site** with the `browse` skill. Open it, take a `browse snapshot`
+   (the accessibility tree — this is what an agent "sees"), and attempt a real task the site is for:
+   find the pricing, create an account, add to cart, submit the contact form. Notice where it's easy
+   and where you get stuck.
+
+2. **Test on a remote browser, and notice how much help it took to get in.** That's the environment
+   agents actually run in. If a vanilla remote session sails through, great — that's maximally
+   browsable. If you needed stealth, a proxy, or captcha-solving just to load or act, that counts
+   against the site. (`references/rubric.md` describes this assistance ladder.) Remember
+   `solveCaptchas` is **on by default** — to see if a site is hostile at the front door, try it with
+   captcha-solving off first.
+
+   **If a task fails on the remote browser, confirm with a local one.** If it works locally but is
+   blocked or errors out remotely, the site is gating cloud/automated browsers — **flag that as a
+   major browsability failure** (it's the whole point: an agent can't use the site). When something
+   comes back empty, always check the final URL — a `chrome-error://…` (or a title that's just the
+   bare domain) means the navigation *failed/was blocked*, not that the page rendered empty.
+
+3. **Watch for the things that trip up browser agents** as you go — read `references/rubric.md` for
+   the full checklist, but in short: unlabeled / `<div>`-as-button controls, custom dropdowns,
+   iframes (especially cross-origin), shadow DOM, very deep or huge DOMs, blocking cookie/consent
+   walls, and flows that take the agent more steps than a person.
+
+Use judgment over completeness — surface the few things that genuinely make or break this site for an
+agent, not an exhaustive audit.
+
+## How to report
+
+Use **two separate tables** — one for what helps, one for what hurts. Do **not** put them in one
+two-column table: a Helps row and a Hurts row are unrelated, so placing them side by side falsely
+implies they're connected.
+
+First, what helps:
+
+| ✅ Helps browsability |
+|---|
+| Native `<button>` / `<select>` with clear labels |
+| Loads & acts fine in a vanilla session |
+| Main flow is same-origin |
+
+Then, what hurts — each with its concrete fix:
+
+| ⚠️ Hurts browsability | Fix |
+|---|---|
+| Signup CTA is an unlabeled `<div>` → agent can't see it | make it a `<button>` or add `aria-label` |
+| Needs proxy + captcha-solving just to load | ease bot-walls on agent-relevant flows |
+| Checkout is a cross-origin iframe → fragile | same-origin embed, or a direct route |
+
+Cite what you observed. Optionally close with one plain-language line ("easy / moderate / hard for a
+browser agent, because…"). Do not invent a number. In Slack contexts use mrkdwn (`*bold*`, `•`
+bullets), not tables.
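
The diagnostic in step 2 of `SKILL.md` above (when a page comes back empty, check the final URL and title) can be sketched as a small check. This is a minimal sketch under stated assumptions, not part of the skill or the `browse` CLI: `navigation_looks_blocked` and its parameters are hypothetical names, and comparing the title against the hostname (with or without `www.`) is one assumed reading of "a title that's just the bare domain".

```python
from urllib.parse import urlparse


def navigation_looks_blocked(requested_url: str, final_url: str, title: str) -> bool:
    """Return True when an 'empty' page is more likely a failed or blocked navigation."""
    # A chrome-error:// final URL means the browser landed on its own error page.
    if final_url.startswith("chrome-error://"):
        return True
    # Assumption: "bare domain" means the title equals the requested hostname,
    # optionally without a leading "www.".
    host = (urlparse(requested_url).hostname or "").lower()
    bare_title = title.strip().lower()
    return bool(host) and bare_title in {host, host.removeprefix("www.")}
```

If the check returns `True` for a remote session, loading the same URL locally is still the confirming step described in the skill; the function only flags candidates.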
diff --git a/skills/browsability/references/rubric.md b/skills/browsability/references/rubric.md
@@ -0,0 +1,92 @@
+# What makes a site browsable for a browser agent
+
+A checklist of what tends to help or hurt an AI **browser** agent trying to operate a website.
+Grounded in what the open-source [Stagehand](https://github.com/browserbase/stagehand) framework
+treats as hard, plus the public Browserbase session settings.
+
+**Use this as a guide, not a rule book.** There is no scoring formula. Look at the site, try the
+task, and decide what actually matters for *this* site — then report what helps and what hurts.
+
+## The idea
+
+**Browsability is how little help an agent needs to succeed, and how much harder the site is for an
+agent than for a person.** Only the *agent-specific* friction counts: a long workflow that's long for
+humans too isn't a browsability problem; a simple task made hard by unlabeled controls is. This is
+*operability* (driving the UI), not *discoverability* (being found/cited — that's SEO/AEO, out of
+scope).
+
+When you see extra steps, ask: *would a human also need this step?* If yes, it's the workflow (don't
+count it). If no — e.g. the agent had to click open a custom dropdown that a person reads at a glance
+— that's the agent tax, and it hurts browsability.
+
+---
+
+## 1. Getting in — how much help did the agent need?
+
+Reframe "how protected is this site" as a ladder of assistance. The less help needed, the more
+browsable. Browserbase exposes these public session settings; each one you have to switch on to make
+the task work is a mark against the site:
+
+- `solveCaptchas` — CAPTCHA challenges (**on by default**, so test with it off to see front-door hostility)
+- `proxies` — IP blocks, rate limits, geo-gating
+- `fingerprint` — headless-browser fingerprint detection
+- `advancedStealth` — advanced anti-bot detection
+- `context` (persist) — re-auth / re-consent walls
+
+**Helps:** a plain vanilla headless session can load and act. **Hurts:** the task only works once you
+add stealth, a proxy, or captcha-solving — and the more of those it needs, the worse.
+
+**The remote-vs-local test (the strongest signal here).** Agents run on remote/cloud browsers, so
+that's the environment that counts. If a task **works on a local/residential browser but is blocked
+or errors on a remote one**, the site is gating cloud/automated browsers — that is a *major*
+browsability failure, because a real agent simply cannot use it. Flag it loudly; do not excuse it as
+"we just need a proxy." (Diagnostic tip: when a remote page comes back empty, check the final URL —
+`chrome-error://…` or a title that's only the bare domain means the navigation was *blocked/failed*,
+not that the page rendered empty. Confirm by loading the same URL locally.)
+
+## 2. Seeing the controls — can the agent perceive what to click?
+
+Browser agents work off an **accessibility tree**, and a control is only visible to the agent if it
+has an accessible name, named children, or a real semantic role. An unlabeled `<div role="generic">`
+button is dropped before the model ever sees it — effectively invisible.
+
+- **Helps:** native `<button>`, `<a href>`, `<input>`, `<select>` with real text or labels; inputs tied to a `<label>`.
+- **Hurts:** icon-only buttons with no `aria-label`; `<div onclick>` "buttons"; inputs with no label; controls hidden inside closed shadow DOM.
+
+## 3. Structural traps
+
+Hard walls that browser agents struggle with regardless of labeling:
+
+- **Cross-origin iframes** — separately managed frames that can drop out mid-action; fragile.
+- **Shadow DOM** — closed roots are opaque to the agent.
+- **Very deep DOM (hundreds of levels)** — forces slower, shallower page reads.
+- **Very large DOM** — the accessibility snapshot can get truncated; elements past the cap vanish.
+- **Never-settling pages** — constant streaming/polling means the page never looks "done loading," so the agent waits out a timeout on every step.
+- **Virtualized / infinite lists** — no "scroll until found"; the agent has to scroll-and-look in a loop.
+
+## 4. Extra steps the agent pays (but a human doesn't)
+
+- **Custom dropdowns vs native `<select>`** — a native select is one action; a custom dropdown makes the agent click to open, re-read the page, then pick, which is at least two actions. Multiply across a form and it adds up.
+- **Needless modals / multi-step wizards** that a human clicks through without thinking but the agent must navigate explicitly.
+- Count only the steps *beyond* what a person would need.
+
+## 5. When things break — can the agent recover?
+
+- **Blocking overlays** — cookie/consent walls, login walls, paywalls that aren't dismissed automatically and sit on top of the flow.
+- **Unstable DOM** — elements that move or re-render between looking and clicking, forcing the agent to re-find them (a sign of a hostile, racy page).
+- **Slow / hanging navigation** — pages that exceed load timeouts.
+
+---
+
+## Turning findings into fixes
+
+| Finding | Fix |
+|---|---|
+| Unlabeled / `<div>`-as-button controls | use semantic `<button>` / `<a>`, or add `aria-label` |
+| Many custom dropdowns | use native `<select>` where possible |
+| Cross-origin iframe in the flow | same-origin embed, or a direct route |
+| Closed shadow DOM | open shadow roots, or expose semantic fallbacks |
+| Deep / very large DOM | flatten nesting, paginate, reduce node count |
+| Needs heavy stealth/proxy/captcha to work | reduce hostile bot-walls on agent-relevant flows |
+| More steps than a human needs | collapse the funnel; remove needless modal steps |
+| UI-only with no agent path | (ceiling) offer an API / deep-link for agents so they needn't drive the UI at all |
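
The visibility rule in section 2 of `rubric.md` above (a control is seen only if it has an accessible name, named children, or a real semantic role) can be sketched as a predicate over a simplified node. This is an illustration only, not Stagehand's actual pruning logic: `AXNode`, `visible_to_agent`, and the `SEMANTIC_ROLES` subset are all hypothetical, and "named children" is read here as direct children with a non-empty name.

```python
from dataclasses import dataclass, field

# Assumed, illustrative subset of "real" semantic roles; not taken from any framework.
SEMANTIC_ROLES = {"button", "link", "textbox", "combobox", "checkbox"}


@dataclass
class AXNode:
    """Simplified stand-in for one accessibility-tree node."""
    role: str = "generic"
    name: str = ""
    children: list["AXNode"] = field(default_factory=list)


def visible_to_agent(node: AXNode) -> bool:
    """Apply the rubric's three conditions in order."""
    if node.name.strip():
        return True  # has an accessible name
    if node.role in SEMANTIC_ROLES:
        return True  # has a real semantic role
    # Otherwise it survives only if a direct child carries a name.
    return any(child.name.strip() for child in node.children)
```

Under this sketch an unlabeled `AXNode(role="generic")` is dropped, while the same node with `name="Sign up"` or `role="button"` is kept, which mirrors the first row of the fixes table.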