Skip to content

feat(cli): detect Windsurf, Cline, Gemini CLI, and Crush agents#1328

Merged
jrusso1020 merged 1 commit into
mainfrom
feat/cli-detect-more-agents
Jun 11, 2026
Merged

feat(cli): detect Windsurf, Cline, Gemini CLI, and Crush agents#1328
jrusso1020 merged 1 commit into
mainfrom
feat/cli-detect-more-agents

Conversation

@jrusso1020

@jrusso1020 jrusso1020 commented Jun 10, 2026

Copy link
Copy Markdown
Collaborator

What

Extends detectAgentRuntime() (packages/cli/src/telemetry/agent_runtime.ts) with four more coding-agent vendors: windsurf, cline, gemini_cli, crush. Adds them to the AgentRuntime union and VENDOR_RULES, with vitest coverage.

(OpenHands was evaluated and dropped after runtime testing — see below.)

Why

The existing telemetry classifies Claude Code, Codex, Cursor, Copilot Coding Agent, Replit, Hermes, openclaw, and Pi. This closes obvious gaps with markers verified to the same bar as the existing rules: present in the environment of the shell/subprocess the agent spawns, keyed on existence only.

Verification (two tiers)

Runtime-confirmed — I installed/executed the agent's real code and observed the marker land in a spawned child whose parent did not have it:

Vendor Marker Runtime proof
gemini_cli GEMINI_CLI=1 Ran published @google/gemini-cli-core@0.46.0 ShellExecutionService.execute() on a child: control (child_process) = UNSET, via the service = 1.
crush CRUSH=1 Test inside charmbracelet/crush internal/shell driving the real Shell.Exec(): parent UNSET, child = 1.

Source + independent third-party detectors (VS Code / editor surfaces; not headless-runnable here, but corroborated by other projects' shipping detectors):

Vendor Marker Source + corroboration
cline CLINE_ACTIVE=true cline/cline VscodeTerminalRegistry.ts:29 (TerminalOptions.env); also detected by firebase/firebase-tools and aliyun/aliyun-cli. Default vscodeTerminal path only (backgroundExec omits it).
windsurf TERM_PROGRAM=windsurf (case-insensitive) nrwl/nx detection.rs:47; also adonisjs, ag-grid. Editor-terminal scope (same as the existing Cursor rule); can fall back to vscode under WSL.

Deliberately NOT added (re-verified against current source / runtime)

  • OpenHandsdropped after runtime testing. OPENHANDS_BUILD_GIT_SHA/_REF exists in the agent-server Dockerfile (base-image-minimal stage, added 2025-11-09 in PR fix(studio): add FFmpeg pre-flight check before render #1100), but inspecting the config Env of 12+ published ghcr.io/openhands/agent-server images (2025-10-21 → 2026-01-13, including the merge commit of the introducing PR) shows the marker absent from every one. The declared ENV never reaches the published runtime image, so a rule would never fire. Source ≠ runtime. Re-add only if a real published image is confirmed to carry it.
  • Aider — sets no self-identifying env var; both spawn sites (run_cmd.py:62/116) pass no env=, children inherit os.environ verbatim.
  • GooseAGENT=goose/GOOSE_TERMINAL=1 are set on the recipe-retry path (retry.rs:239) and the computercontroller MCP extension (mod.rs:855), but not on the default developer shell tool (developer/shell.rs:600-660, only PATH+cwd). Primary path undetected → would undercount.
  • opencodeOPENCODE_TERMINAL=1 is set only on the interactive PTY panel (pty-preparation.ts:22), not the model's bash/shell tools.
  • Roo CodeROO_ACTIVE only on the vscode provider; terminalShellIntegrationDisabled defaults true, so the default execa provider sets no marker.
  • Amp / Devin / Jules / Factory Droid — no verifiable unconditional runtime marker. Amp is closed/minified; Devin/Jules are closed sandboxes; Factory's FACTORY_PROJECT_DIR/DROID_PLUGIN_ROOT are hook-scoped only (docs.factory.ai).

Test plan

  • 6 new vitest cases (windsurf incl. case-insensitive, cline, gemini_cli, crush, existence-only) — 26 passed under vitest run
  • bunx oxlint + bunx oxfmt --check clean; pre-commit hook (lint/format/typecheck) green
  • Commits signed (GitHub verified)
  • gemini_cli + crush confirmed by executing their real shell layers; OpenHands removed after its marker was found absent from all real published images

Note: after this and #1294 both merge, the gemini_managed_agent filesystem check runs ahead of VENDOR_RULES, so a managed agent that also exports GEMINI_CLI is classified as the managed agent, not gemini_cli.

@jrusso1020 jrusso1020 changed the title feat(cli): detect Windsurf, Cline, Gemini CLI, Crush, and OpenHands agents feat(cli): detect Windsurf, Cline, Gemini CLI, and Crush agents Jun 10, 2026

@miguel-heygen miguel-heygen left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clean implementation with the right verification bar — runtime-confirmed markers for gemini_cli and crush are meaningfully stronger evidence than source-only citations. The OpenHands exclusion (declared ENV absent from every inspected runtime image) is exactly the right call and the reasoning is well-documented for future re-evaluation.

P2 — Windsurf WSL blind spot (same scope as existing Cursor rule)

Under WSL/remote, Windsurf can fall back to TERM_PROGRAM=vscode, which would miss classification. This is called out in the comment and is the same known gap as the existing Cursor rule. Acceptable given the integrated-terminal scope, and consistent with the rest of the rules. Worth noting that a future cross-editor cleanup could add a common fallback note to all three TERM_PROGRAM rules.

P3 — CRUSH name collision risk

CRUSH=1 is a short variable. The PR correctly notes AGENT=crush/AI_AGENT=crush were excluded as too generic. CRUSH itself is specific enough (charmbracelet docs use it as the canonical self-identifier), and the runtime proof from internal/shell is solid. Fine.

Note on test style (not a concern)

New tests don't call vi.resetModules() in beforeEach, unlike the older Cursor/Claude Code suites. This is actually more correct for pure env-var rules — VENDOR_RULES checks call (env) => ... against the live process.env at call time, not at module load time. The savedEnv/afterEach restore pattern handles isolation correctly.

Post-merge order note: once both #1294 and #1328 land, a Gemini managed-agent session that also exports GEMINI_CLI=1 will correctly classify as gemini_managed_agent (filesystem check wins). The comment in this diff calls that out explicitly — good. ✅ Approving.

@james-russo-rames-d-jusso james-russo-rames-d-jusso left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed at the head of this PR. Four new vendor rules added with the same discipline as the existing VENDOR_RULES array: existence-only checks, public-source citations inline, and an explicit "deliberately NOT added" rejection list with empirical justification. The runtime-confirmed vs source+third-party tiering is the right call — runtime is the gold standard, but a documented source-based marker with multiple corroborating detectors is also acceptable for editor-terminal-scoped rules.

LGTM. Two small observations + one nit.

Strengths

  • Two-tier verification with honest labeling. gemini_cli and crush get the gold standard (you actually ran their published shell-execution code and observed the marker land in spawned children whose parents didn't have it). cline and windsurf get source + multiple-third-party-detector corroboration, with both caveats called out (cline: vscodeTerminal path only, backgroundExec drops it; windsurf: integrated-terminal scope, can fall back to vscode under WSL). Honest about what each rule covers and what it doesn't.
  • The "deliberately NOT added" comment is the most valuable doc surface here. OpenHands rejected because the declared ENV doesn't reach the published image (12+ tags inspected). Aider rejected because both spawn sites pass no env=. Goose rejected because the primary developer/shell.rs tool doesn't set the marker. opencode rejected because OPENCODE_TERMINAL is only on the PTY panel. Roo Code rejected because the default execa provider sets no marker. Each rejection is empirically grounded — the next PR that wants to add these will know exactly what to look for.
  • CRUSH over AGENT=crush/AI_AGENT=crush. Picking the discriminating marker over the generic ones is the right call — AGENT=crush would collide with anyone else who sets AGENT=... (and there are such users in the wild).
  • Test for GEMINI_CLI = "anything" (agent_runtime.test.ts:208-212). Explicit lockdown of the existence-only contract — value shape ignored. Same pattern would be worth retrofitting on the other existence-keyed rules in a follow-up cleanup if you ever want to add it (not asking for it here).
  • PR body's cross-PR note about gemini_managed_agent filesystem check winning over gemini_cli env check is correct — verified against #1294's structure (the FS detector runs ahead of VENDOR_RULES). Merge-order independent.

Observations (non-gating)

  • Windsurf is case-insensitive (?.toLowerCase() === "windsurf"); Cursor above is case-sensitive (=== "cursor"). Justified by the comment ("sources disagree on casing"). Worth considering whether the existing Cursor rule should also be .toLowerCase() === "cursor" for consistency — Cursor itself has historically set lowercase, but other detectors out there may set TERM_PROGRAM=Cursor and the case-sensitive check would miss them. Not asking you to fix it in this PR (it's been working as-is); just flagging the inconsistency as a potential follow-up.
  • Cline CLINE_ACTIVE=true-but-keyed-on-existence. The comment correctly notes the value is "true" (per VS Code TerminalOptions.env), and the check is typeof env["CLINE_ACTIVE"] === "string". Existence-only is consistent with the array's convention — and the new test at line 208-212 explicitly locks down that the value shape is ignored for GEMINI_CLI, which carries the same contract over.

Nit

  • The 9-line "OpenHands deliberately NOT added" comment block lives inside the VENDOR_RULES: VendorRule[] = [...] array (agent_runtime.ts after the crush entry). It's not an actual rule — it's a sibling explanation. Two reasonable places that read cleaner:

    • A // NOT_ADDED: section comment block below the array close brace, grouping all of the rejections from your "Deliberately NOT added" PR-body list (OpenHands, Aider, Goose, opencode, Roo Code, Amp/Devin/Jules/Factory). The PR description has them; the file doesn't — and the next PR author will look in the file, not the PR history.
    • Or keep just the OpenHands one inline (since it's the closest-to-actionable), and add a one-line // See PR #1328 for the full evaluated-and-rejected list. reference for the others.

    Not gating. Just an organizational nit — the empirical work behind the rejections is genuinely valuable and would survive better as a code-resident artifact than a PR-body section.

What I verified

  • Diff against main — 2 files, +109/-0.
  • All 4 new rules use typeof env[...] === "string" for existence checks (or ?.toLowerCase() === ... for Windsurf's value comparison), matching the existing array's convention.
  • The new tests use process.env mutation with stripVendorEnv() + savedEnv snapshot/restore — clean isolation. No vi.doMock needed since these are env-only rules.
  • Test ordering doesn't interact with #1294's vi.doMock("node:fs", ...) work — those are reset in afterEach, and these tests don't depend on the fs mock state. Merging both PRs together produces no test pollution.
  • The cross-PR precedence note (gemini_managed_agent wins over gemini_cli when both signals match) is structurally correct per #1294's pre-loop FS check.

What I didn't verify

  • Empirical confirmation of the cline/windsurf markers in a live editor terminal — relied on your source citations + corroborating detectors.
  • Whether the WSL/remote TERM_PROGRAM=vscode fallback path matters in HF's actual user base.

Same quality bar as #1294 — empirical methodology + honest tiering + explicit rejections. The OpenHands non-rule has been working overtime in this file but it'd survive better restructured into a section comment. Otherwise clean.

Review by Rames D Jusso

@jrusso1020

Copy link
Copy Markdown
Collaborator Author

Addressed the non-gating feedback in 180e757:

  • Hoisted the NOT-ADDED rationale (@james-russo-rames-d-jusso's nit). The OpenHands "deliberately NOT added" block no longer lives inside VENDOR_RULES[] — it's a proper // Agents evaluated and deliberately NOT added section below the array, and now carries the full rejection list (OpenHands, Aider, Goose, opencode, Roo Code, Amp/Devin/Jules/Factory) that previously only lived in the PR body. The empirical rejection rationale is now code-resident.
  • Cursor rule made case-insensitive (TERM_PROGRAM?.toLowerCase() === "cursor") for parity with the Windsurf rule, resolving the inconsistency this PR introduced. Loosening-only (Cursor sets lowercase today, so no match is ever dropped); added a TERM_PROGRAM=Cursor test. 27 tests green.

Left as-is per the reviews (explicitly non-gating): the Windsurf WSL vscode fallback (same known gap as Cursor) and the CRUSH short-name note (specific enough, runtime-proven).

Heads-up on merge order: this PR and #1294 both edit the AgentRuntime union, so once #1294 lands this branch needs a rebase (trivial — keep all members) and will come back through review anyway.

Rebased onto main after #1294 merged. Adds four coding-agent vendors to
detectAgentRuntime() (existence-only checks, source/runtime-verified):
- windsurf — TERM_PROGRAM=windsurf (case-insensitive)
- cline — CLINE_ACTIVE (default vscode-terminal path)
- gemini_cli — GEMINI_CLI (runtime-confirmed; distinct from the managed-agent
  /.agents/ detector, which runs ahead of VENDOR_RULES and wins when both match)
- crush — CRUSH (runtime-confirmed)

Also makes the cursor rule case-insensitive for parity with windsurf, and adds
a code-resident "deliberately NOT added" section (OpenHands/Aider/Goose/
opencode/Roo/Amp/Devin/Jules/Factory) carrying the empirical rejection
rationale.

Test isolation: the Gemini managed-agent suite now clears its node:os/node:fs
doMock registrations in afterEach so they don't leak into the env-var-only
suites that follow it in the same file.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@jrusso1020 jrusso1020 force-pushed the feat/cli-detect-more-agents branch from 180e757 to 40ce10d Compare June 11, 2026 00:13
@jrusso1020

Copy link
Copy Markdown
Collaborator Author

Rebased onto main now that #1294 has landed (force-pushed 40ce10d3, signed/verified). Collapsed the prior 5-commit history (which included the OpenHands add-then-remove churn) into one clean commit on top of the new main — the net diff is unchanged.

Conflict resolution (for re-reviewers):

One real bug surfaced by the merge (worth a look): with both suites now in the same file, the Gemini suite's vi.doMock("node:os"/"node:fs") registrations leaked into the env-var suites that follow it — vi.restoreAllMocks() doesn't undo doMock, and the env-var suites don't resetModules(). That made the windsurf/cline/gemini_cli/crush tests see release()→4.19.0-gvisor + statSync("/.agents")→dir and resolve to gemini_managed_agent. Fixed by vi.doUnmock("node:os"/"node:fs") in the Gemini suite's afterEach. All 32 tests pass.

Ready for re-review.

@miguel-heygen miguel-heygen left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR was updated since my first pass — re-reviewing the delta.

doUnmock fix in Gemini suite's afterEach — correct. vi.doMock registrations survive vi.resetModules at the registry level; without the explicit doUnmock calls, the mocked node:os/node:fs factories would be re-applied on the first fresh import in the Windsurf suite, making isGVisor() return true there. Order (doUnmockresetModules) is right.

P2 — Cursor case-insensitive change has no source (agent_runtime.ts L81, test L122)

The Cursor rule was deliberately exact-match (=== "cursor") when written. This PR changes it to ?.toLowerCase() === "cursor" with the stated justification "for parity with Windsurf." But the parity argument doesn't hold: Windsurf sources disagree on casing (hence case-insensitive is warranted); every Cursor source I can find agrees it's lowercase. The comment even acknowledges this: "Cursor sets lowercase today." That's not a reason to add case-insensitivity — it's a reason to keep the exact match.

The PR's stated verification bar is "runtime-confirmed or source-corroborated." The Cursor capitalization case meets neither. This is scope creep applied to an existing stable rule without evidence that a capitalized variant ever occurs in practice.

Should be reverted (?.toLowerCase()===, drop the capitalized test) or sourced. If you have a source showing TERM_PROGRAM=Cursor in the wild, link it and keep it. Otherwise the safe call is exact match.

The four new vendors (windsurf, cline, gemini_cli, crush) still look correct — no change to that assessment.

@jrusso1020 jrusso1020 merged commit e0ecd4d into main Jun 11, 2026
35 checks passed
@jrusso1020 jrusso1020 deleted the feat/cli-detect-more-agents branch June 11, 2026 01:10
@jrusso1020

Copy link
Copy Markdown
Collaborator Author

Good catch on the Cursor rule — agreed it was an unsourced loosening (Windsurf's case-insensitivity is sourced; Cursor's isn't). Since this already merged, the revert is up as a follow-up: #1334 (restores exact === "cursor", drops the TERM_PROGRAM=Cursor test, documents the intentional asymmetry; Windsurf stays case-insensitive).

jrusso1020 added a commit that referenced this pull request Jun 11, 2026
…ening) (#1334)

Follow-up to #1328. That PR loosened the Cursor TERM_PROGRAM check from exact
`=== "cursor"` to `?.toLowerCase() === "cursor"` "for parity with Windsurf" —
but the parity is false. Windsurf is matched case-insensitively because its
sources genuinely disagree on casing ("windsurf" vs "Windsurf"); Cursor
consistently emits lowercase "cursor", so nothing justified loosening an
existing, working, exact-match rule. Per review feedback on #1328
(Magi/Hermes), revert Cursor to exact match and drop the TERM_PROGRAM=Cursor
test. Windsurf stays case-insensitive (sourced); its comment now documents the
asymmetry as intentional.

No functional change — Cursor always emitted lowercase, so detection is
unchanged; this just removes an unsourced false-positive surface.

Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants