feat(cli): detect Windsurf, Cline, Gemini CLI, and Crush agents#1328
Conversation
miguel-heygen
left a comment
There was a problem hiding this comment.
Clean implementation with the right verification bar — runtime-confirmed markers for gemini_cli and crush are meaningfully stronger evidence than source-only citations. The OpenHands exclusion (declared ENV absent from every inspected runtime image) is exactly the right call and the reasoning is well-documented for future re-evaluation.
P2 — Windsurf WSL blind spot (same scope as existing Cursor rule)
Under WSL/remote, Windsurf can fall back to TERM_PROGRAM=vscode, which would miss classification. This is called out in the comment and is the same known gap as the existing Cursor rule. Acceptable given the integrated-terminal scope, and consistent with the rest of the rules. Worth noting that a future cross-editor cleanup could add a common fallback note to all three TERM_PROGRAM rules.
P3 — CRUSH name collision risk
CRUSH=1 is a short variable. The PR correctly notes AGENT=crush/AI_AGENT=crush were excluded as too generic. CRUSH itself is specific enough (charmbracelet docs use it as the canonical self-identifier), and the runtime proof from internal/shell is solid. Fine.
Note on test style (not a concern)
New tests don't call vi.resetModules() in beforeEach, unlike the older Cursor/Claude Code suites. This is actually more correct for pure env-var rules — VENDOR_RULES checks call (env) => ... against the live process.env at call time, not at module load time. The savedEnv/afterEach restore pattern handles isolation correctly.
Post-merge order note: once both #1294 and #1328 land, a Gemini managed-agent session that also exports GEMINI_CLI=1 will correctly classify as gemini_managed_agent (filesystem check wins). The comment in this diff calls that out explicitly — good. ✅ Approving.
james-russo-rames-d-jusso
left a comment
There was a problem hiding this comment.
Reviewed at the head of this PR. Four new vendor rules added with the same discipline as the existing VENDOR_RULES array: existence-only checks, public-source citations inline, and an explicit "deliberately NOT added" rejection list with empirical justification. The runtime-confirmed vs source+third-party tiering is the right call — runtime is the gold standard, but a documented source-based marker with multiple corroborating detectors is also acceptable for editor-terminal-scoped rules.
LGTM. Two small observations + one nit.
Strengths
- Two-tier verification with honest labeling.
gemini_cliandcrushget the gold standard (you actually ran their published shell-execution code and observed the marker land in spawned children whose parents didn't have it).clineandwindsurfget source + multiple-third-party-detector corroboration, with both caveats called out (cline: vscodeTerminal path only, backgroundExec drops it;windsurf: integrated-terminal scope, can fall back tovscodeunder WSL). Honest about what each rule covers and what it doesn't. - The "deliberately NOT added" comment is the most valuable doc surface here. OpenHands rejected because the declared
ENVdoesn't reach the published image (12+ tags inspected). Aider rejected because both spawn sites pass noenv=. Goose rejected because the primarydeveloper/shell.rstool doesn't set the marker. opencode rejected becauseOPENCODE_TERMINALis only on the PTY panel. Roo Code rejected because the defaultexecaprovider sets no marker. Each rejection is empirically grounded — the next PR that wants to add these will know exactly what to look for. CRUSHoverAGENT=crush/AI_AGENT=crush. Picking the discriminating marker over the generic ones is the right call —AGENT=crushwould collide with anyone else who setsAGENT=...(and there are such users in the wild).- Test for
GEMINI_CLI = "anything"(agent_runtime.test.ts:208-212). Explicit lockdown of the existence-only contract — value shape ignored. Same pattern would be worth retrofitting on the other existence-keyed rules in a follow-up cleanup if you ever want to add it (not asking for it here). - PR body's cross-PR note about
gemini_managed_agentfilesystem check winning overgemini_clienv check is correct — verified against #1294's structure (the FS detector runs ahead ofVENDOR_RULES). Merge-order independent.
Observations (non-gating)
- Windsurf is case-insensitive (
?.toLowerCase() === "windsurf"); Cursor above is case-sensitive (=== "cursor"). Justified by the comment ("sources disagree on casing"). Worth considering whether the existing Cursor rule should also be.toLowerCase() === "cursor"for consistency — Cursor itself has historically set lowercase, but other detectors out there may setTERM_PROGRAM=Cursorand the case-sensitive check would miss them. Not asking you to fix it in this PR (it's been working as-is); just flagging the inconsistency as a potential follow-up. - Cline
CLINE_ACTIVE=true-but-keyed-on-existence. The comment correctly notes the value is"true"(per VS Code TerminalOptions.env), and the check istypeof env["CLINE_ACTIVE"] === "string". Existence-only is consistent with the array's convention — and the new test at line 208-212 explicitly locks down that the value shape is ignored forGEMINI_CLI, which carries the same contract over.
Nit
-
The 9-line "OpenHands deliberately NOT added" comment block lives inside the
VENDOR_RULES: VendorRule[] = [...]array (agent_runtime.tsafter thecrushentry). It's not an actual rule — it's a sibling explanation. Two reasonable places that read cleaner:- A
// NOT_ADDED:section comment block below the array close brace, grouping all of the rejections from your "Deliberately NOT added" PR-body list (OpenHands, Aider, Goose, opencode, Roo Code, Amp/Devin/Jules/Factory). The PR description has them; the file doesn't — and the next PR author will look in the file, not the PR history. - Or keep just the OpenHands one inline (since it's the closest-to-actionable), and add a one-line
// See PR #1328 for the full evaluated-and-rejected list.reference for the others.
Not gating. Just an organizational nit — the empirical work behind the rejections is genuinely valuable and would survive better as a code-resident artifact than a PR-body section.
- A
What I verified
- Diff against
main— 2 files, +109/-0. - All 4 new rules use
typeof env[...] === "string"for existence checks (or?.toLowerCase() === ...for Windsurf's value comparison), matching the existing array's convention. - The new tests use
process.envmutation withstripVendorEnv()+savedEnvsnapshot/restore — clean isolation. Novi.doMockneeded since these are env-only rules. - Test ordering doesn't interact with #1294's
vi.doMock("node:fs", ...)work — those are reset in afterEach, and these tests don't depend on the fs mock state. Merging both PRs together produces no test pollution. - The cross-PR precedence note (
gemini_managed_agentwins overgemini_cliwhen both signals match) is structurally correct per #1294's pre-loop FS check.
What I didn't verify
- Empirical confirmation of the
cline/windsurfmarkers in a live editor terminal — relied on your source citations + corroborating detectors. - Whether the WSL/remote
TERM_PROGRAM=vscodefallback path matters in HF's actual user base.
Same quality bar as #1294 — empirical methodology + honest tiering + explicit rejections. The OpenHands non-rule has been working overtime in this file but it'd survive better restructured into a section comment. Otherwise clean.
— Review by Rames D Jusso
|
Addressed the non-gating feedback in 180e757:
Left as-is per the reviews (explicitly non-gating): the Windsurf WSL Heads-up on merge order: this PR and #1294 both edit the |
Rebased onto main after #1294 merged. Adds four coding-agent vendors to detectAgentRuntime() (existence-only checks, source/runtime-verified): - windsurf — TERM_PROGRAM=windsurf (case-insensitive) - cline — CLINE_ACTIVE (default vscode-terminal path) - gemini_cli — GEMINI_CLI (runtime-confirmed; distinct from the managed-agent /.agents/ detector, which runs ahead of VENDOR_RULES and wins when both match) - crush — CRUSH (runtime-confirmed) Also makes the cursor rule case-insensitive for parity with windsurf, and adds a code-resident "deliberately NOT added" section (OpenHands/Aider/Goose/ opencode/Roo/Amp/Devin/Jules/Factory) carrying the empirical rejection rationale. Test isolation: the Gemini managed-agent suite now clears its node:os/node:fs doMock registrations in afterEach so they don't leak into the env-var-only suites that follow it in the same file. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
180e757 to
40ce10d
Compare
|
Rebased onto Conflict resolution (for re-reviewers):
One real bug surfaced by the merge (worth a look): with both suites now in the same file, the Gemini suite's Ready for re-review. |
miguel-heygen
left a comment
There was a problem hiding this comment.
The PR was updated since my first pass — re-reviewing the delta.
✅ doUnmock fix in Gemini suite's afterEach — correct. vi.doMock registrations survive vi.resetModules at the registry level; without the explicit doUnmock calls, the mocked node:os/node:fs factories would be re-applied on the first fresh import in the Windsurf suite, making isGVisor() return true there. Order (doUnmock → resetModules) is right.
P2 — Cursor case-insensitive change has no source (agent_runtime.ts L81, test L122)
The Cursor rule was deliberately exact-match (=== "cursor") when written. This PR changes it to ?.toLowerCase() === "cursor" with the stated justification "for parity with Windsurf." But the parity argument doesn't hold: Windsurf sources disagree on casing (hence case-insensitive is warranted); every Cursor source I can find agrees it's lowercase. The comment even acknowledges this: "Cursor sets lowercase today." That's not a reason to add case-insensitivity — it's a reason to keep the exact match.
The PR's stated verification bar is "runtime-confirmed or source-corroborated." The Cursor capitalization case meets neither. This is scope creep applied to an existing stable rule without evidence that a capitalized variant ever occurs in practice.
Should be reverted (?.toLowerCase() → ===, drop the capitalized test) or sourced. If you have a source showing TERM_PROGRAM=Cursor in the wild, link it and keep it. Otherwise the safe call is exact match.
The four new vendors (windsurf, cline, gemini_cli, crush) still look correct — no change to that assessment.
|
Good catch on the Cursor rule — agreed it was an unsourced loosening (Windsurf's case-insensitivity is sourced; Cursor's isn't). Since this already merged, the revert is up as a follow-up: #1334 (restores exact |
…ening) (#1334) Follow-up to #1328. That PR loosened the Cursor TERM_PROGRAM check from exact `=== "cursor"` to `?.toLowerCase() === "cursor"` "for parity with Windsurf" — but the parity is false. Windsurf is matched case-insensitively because its sources genuinely disagree on casing ("windsurf" vs "Windsurf"); Cursor consistently emits lowercase "cursor", so nothing justified loosening an existing, working, exact-match rule. Per review feedback on #1328 (Magi/Hermes), revert Cursor to exact match and drop the TERM_PROGRAM=Cursor test. Windsurf stays case-insensitive (sourced); its comment now documents the asymmetry as intentional. No functional change — Cursor always emitted lowercase, so detection is unchanged; this just removes an unsourced false-positive surface. Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
What
Extends
detectAgentRuntime()(packages/cli/src/telemetry/agent_runtime.ts) with four more coding-agent vendors: windsurf, cline, gemini_cli, crush. Adds them to theAgentRuntimeunion andVENDOR_RULES, with vitest coverage.(OpenHands was evaluated and dropped after runtime testing — see below.)
Why
The existing telemetry classifies Claude Code, Codex, Cursor, Copilot Coding Agent, Replit, Hermes, openclaw, and Pi. This closes obvious gaps with markers verified to the same bar as the existing rules: present in the environment of the shell/subprocess the agent spawns, keyed on existence only.
Verification (two tiers)
Runtime-confirmed — I installed/executed the agent's real code and observed the marker land in a spawned child whose parent did not have it:
gemini_cliGEMINI_CLI=1@google/gemini-cli-core@0.46.0ShellExecutionService.execute()on a child: control (child_process) =UNSET, via the service =1.crushCRUSH=1charmbracelet/crush internal/shelldriving the realShell.Exec(): parentUNSET, child =1.Source + independent third-party detectors (VS Code / editor surfaces; not headless-runnable here, but corroborated by other projects' shipping detectors):
clineCLINE_ACTIVE=trueVscodeTerminalRegistry.ts:29(TerminalOptions.env); also detected byfirebase/firebase-toolsandaliyun/aliyun-cli. DefaultvscodeTerminalpath only (backgroundExec omits it).windsurfTERM_PROGRAM=windsurf(case-insensitive)detection.rs:47; also adonisjs, ag-grid. Editor-terminal scope (same as the existing Cursor rule); can fall back tovscodeunder WSL.Deliberately NOT added (re-verified against current source / runtime)
OPENHANDS_BUILD_GIT_SHA/_REFexists in the agent-server Dockerfile (base-image-minimal stage, added 2025-11-09 in PR fix(studio): add FFmpeg pre-flight check before render #1100), but inspecting the configEnvof 12+ publishedghcr.io/openhands/agent-serverimages (2025-10-21 → 2026-01-13, including the merge commit of the introducing PR) shows the marker absent from every one. The declaredENVnever reaches the published runtime image, so a rule would never fire. Source ≠ runtime. Re-add only if a real published image is confirmed to carry it.run_cmd.py:62/116) pass noenv=, children inheritos.environverbatim.AGENT=goose/GOOSE_TERMINAL=1are set on the recipe-retry path (retry.rs:239) and thecomputercontrollerMCP extension (mod.rs:855), but not on the defaultdevelopershell tool (developer/shell.rs:600-660, onlyPATH+cwd). Primary path undetected → would undercount.OPENCODE_TERMINAL=1is set only on the interactive PTY panel (pty-preparation.ts:22), not the model'sbash/shelltools.ROO_ACTIVEonly on thevscodeprovider;terminalShellIntegrationDisableddefaultstrue, so the defaultexecaprovider sets no marker.FACTORY_PROJECT_DIR/DROID_PLUGIN_ROOTare hook-scoped only (docs.factory.ai).Test plan
vitest runbunx oxlint+bunx oxfmt --checkclean; pre-commit hook (lint/format/typecheck) greenNote: after this and #1294 both merge, the
gemini_managed_agentfilesystem check runs ahead ofVENDOR_RULES, so a managed agent that also exportsGEMINI_CLIis classified as the managed agent, notgemini_cli.