Skip to content

fix: prevent glob searches from surfacing ignored files#1245

Closed
morluto wants to merge 14 commits into
MoonshotAI:mainfrom
morluto:fix/glob-ignore-files
Closed

fix: prevent glob searches from surfacing ignored files#1245
morluto wants to merge 14 commits into
MoonshotAI:mainfrom
morluto:fix/glob-ignore-files

Conversation

@morluto

@morluto morluto commented Jun 30, 2026

Copy link
Copy Markdown

Related Issue

Resolve #1244

Problem

Glob documents that it respects .gitignore, .ignore, and .rgignore by default, but two ripgrep behaviors made that false in common cases.

First, broad patterns like *, **, and **/* were passed as positive --glob filters. In ripgrep, a positive --glob can re-include files that ignore rules would otherwise exclude. That means a search like Glob({ pattern: "*" }) could return ignored.log even when .gitignore contains *.log.

Second, ripgrep only applies .gitignore files outside git repositories when --no-require-git is set. Since Glob did not pass that flag, plain project directories without .git did not honor .gitignore.

What changed

  • Skip positive --glob for broad all-file patterns (*, **, **/*) and let rg --files enumerate normally.
  • Keep positive --glob for specific patterns like src/**/*.ts.
  • Detect whether the search root is inside a git repo.
  • Pass --no-require-git for non-git search roots.
  • Add real-ripgrep regression tests for .gitignore behavior in both git and non-git directories.
  • Add a patch changeset.

Reference Patterns

This follows two useful reference patterns:

  • opencode's find path avoids a positive glob for broad discovery, which keeps ignore rules in control instead of re-including ignored files.
  • pi's find path walks upward for .git and uses --no-require-git outside repos, which makes .gitignore work in normal non-git folders.

Kimi Code's implementation combines those patterns while preserving existing sensitive-file exclusions and specific glob matching.

Review Notes

Start with packages/agent-core/src/tools/builtin/file/glob.ts. The main behavior change is in argument construction for ripgrep.

Then review packages/agent-core/test/tools/glob.test.ts, especially the real-ripgrep tests at the bottom. Those tests reproduce the two failures this PR fixes.

Validation

  • PATH=/root/node24/bin:$PATH pnpm vitest run packages/agent-core/test/tools/glob.test.ts
  • PATH=/root/node24/bin:$PATH pnpm --filter @moonshot-ai/agent-core run typecheck
  • PATH=/root/node24/bin:$PATH pnpm exec oxlint packages/agent-core/src/tools/builtin/file/glob.ts packages/agent-core/test/tools/glob.test.ts

Checklist

  • I have read the CONTRIBUTING document.
  • I have linked a related issue, or explained the problem above.
  • I have added tests that prove my feature works.
  • Ran gen-changesets skill, or this PR needs no changeset.
  • Ran gen-docs skill, or this PR needs no doc update. Existing docs already describe the intended ignore-file behavior; this PR restores that behavior.

@changeset-bot

changeset-bot Bot commented Jun 30, 2026

Copy link
Copy Markdown

🦋 Changeset detected

Latest commit: 3b60ca9

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
Name Type
@moonshot-ai/kimi-code Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@morluto morluto marked this pull request as ready for review July 1, 2026 00:21

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3fc6b3a8ae

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread packages/agent-core/src/tools/builtin/file/glob.ts Outdated
A positive ripgrep --glob "always overrides any other ignore logic,"
so passing any non-broad user pattern (e.g. `*.ts`, `dist/**/*.js`) as
--glob re-included files excluded by .gitignore/.ignore/.rgignore. In a
repo with .gitignore containing `*.ts`, Glob({ pattern: '*.ts' }) would
surface ignored.ts despite the documented default ignore behavior.

Stop passing the user's pattern as a positive --glob. Instead, let
rg --files enumerate non-ignored files and filter the results
in-process with picomatch, preserving both the pattern match and the
ignore-file respect. Broad patterns (`*`, `**`, `**/*`) still skip the
filter since they match everything.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f01f7b42d0

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread packages/agent-core/src/tools/builtin/file/glob.ts Outdated
Comment thread packages/agent-core/src/tools/builtin/file/glob.ts Outdated
Comment thread packages/agent-core/src/tools/builtin/file/glob.ts Outdated
Comment thread packages/agent-core/src/tools/builtin/file/glob.ts Outdated
…d patterns

runRipgrepOnce: add optional lineFilter that filters stdout lines as they
stream in, before they count toward the MAX_OUTPUT_BYTES cap. GlobTool uses
this so a selective pattern (e.g. rare/**/*.ts) is not starved by a large
prefix of non-matching paths when rg --files returns >10MB of output.

GlobTool:
- Apply the user's glob pattern as a streaming lineFilter on rg --files
  output so matching paths survive the stdout cap. The post-cap
  filterByPattern re-check remains as a safety net.
- Remove nocase:true from picomatch options. ripgrep's default glob
  matching is case-sensitive (--glob-case-insensitive is a separate
  flag), so the previous setting caused *.TS to match foo.ts on POSIX,
  diverging from rg --glob behaviour.
- Normalize the search root and absolute paths case-insensitively on
  Windows before computing the relative path in filterByPattern, so a
  user-supplied root with different casing than rg returns doesn't
  produce escaped paths that fail to match.
- Validate glob patterns for unclosed [ and { before running rg.
  picomatch silently treats malformed globs as literals and reports "No
  matches found", whereas rg --glob used to exit 2 with a hard error.
  The new validateGlobPattern restores that error behaviour.
@chatgpt-codex-connector

Copy link
Copy Markdown

💡 Codex Review

return picomatch.isMatch(relPath, pattern, opts);

P2 Badge Handle rooted ripgrep globs

For patterns that start with /, this compares the rootless relative path (for example src/a.ts) directly against the rooted pattern, so a valid call such as Glob({ pattern: '/src/*.ts' }) now reports no matches. I checked ripgrep with rg --files --glob '/src/*.ts' ., and it matches ./src/a.ts because --glob uses gitignore-style rooted globs; the in-process matcher needs to strip or otherwise account for the leading root anchor.


lineBuffer += buf.toString('utf8');

P2 Badge Decode filtered stdout across chunk boundaries

When lineFilter is enabled for any non-broad Glob pattern, decoding each chunk with buf.toString('utf8') can split a multibyte filename across stream chunks and insert replacement characters before the next chunk arrives. In that case a path like src/é.ts can either fail a Unicode-specific pattern or be returned as a corrupted path that follow-up Read/Edit cannot open; use a streaming decoder or carry undecoded bytes between chunks.


else if (ch === '{') braceDepth++;

P2 Badge Skip braces inside character classes

When a caller uses ripgrep's valid bracket escape for a literal brace, such as [{]foo.ts to match {foo.ts, this scanner still increments braceDepth for the { inside the character class and returns an invalid-pattern error before ripgrep runs. I verified rg --files --glob '[{]foo.ts' matches that filename, so the new validation rejects valid glob patterns unless the scanner tracks when it is inside [...].


const comparableAbs = pathClass === 'win32' ? normAbs.toLowerCase() : normAbs;

P2 Badge Preserve Windows path case before matching

Fresh evidence after the earlier POSIX nocase fix is that this Windows path still lowercases the whole candidate before deriving rel, so matchUserPattern checks src/a.ts instead of the actual SRC/A.TS. On a Windows backend, a case-sensitive pattern such as SRC/**/*.TS can pass rg's relative output but be dropped here; checked rg --help, case-insensitive matching requires --glob-case-insensitive, so this changes Glob results for uppercase paths.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

…s case

matchUserPattern():
- Strip a leading `/` from the pattern before matching. ripgrep's --glob
  treats `/src/*.ts` as rooted at the search root (equivalent to
  `src/*.ts`), but the in-process matcher compared the rootless relative
  path directly against the rooted pattern and reported no matches.

readStreamWithCap() in run-rg.ts:
- Use a streaming TextDecoder with `{ stream: true }` instead of
  buf.toString('utf8') per chunk. A multibyte filename split across
  chunk boundaries would otherwise produce replacement characters,
  corrupting the path or failing the pattern match.

validateGlobPattern():
- Track whether the scanner is inside a `[...]` character class and skip
  brace depth counting while inside one. ripgrep accepts `[{]foo.ts` as
  a valid pattern (the `{` is literal inside the bracket), but the
  previous scanner incremented braceDepth for it and rejected the
  pattern before ripgrep ran.

filterByPattern() / makeLineFilter():
- Preserve the original case of the relative path on Windows. The
  previous code lowercased the entire path before deriving the relative
  path, so case-sensitive patterns like `SRC/**/*.TS` would fail against
  rg's original-case output. The new `relativePath()` helper compares
  case-insensitively to find the boundary but returns the original-case
  suffix for pattern matching.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6f176103bb

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread packages/agent-core/src/tools/builtin/file/glob.ts Outdated
Replace matchUserPattern (which called picomatch.isMatch on every line,
reparsing/compiling the pattern each time) with compileGlobMatcher,
which compiles the picomatch matcher once and returns a reusable test
function. filterByPattern and makeLineFilter now build the matcher once
per search and call the compiled function for each path line. In large
trees this avoids hundreds of thousands of redundant pattern
compilations that could burn the Glob timeout in JS rather than in rg.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a600edbf35

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread packages/agent-core/src/tools/builtin/file/glob.ts Outdated
A rooted basename pattern like /foo.ts was stripped to foo.ts, which
has no / and therefore fell into the basename-only branch — matching
sub/foo.ts and deep/nested/foo.ts as well as the root file. rg treats
/foo.ts as a gitignore-style rooted glob that only matches at the
search root. Track whether the original pattern was rooted and use
full-relative-path matching in that case so the in-process matcher
matches rg's behavior.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d810f4e976

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread packages/agent-core/src/tools/builtin/file/glob.ts Outdated
Comment thread packages/agent-core/src/tools/builtin/file/glob.ts Outdated
Comment thread packages/agent-core/src/tools/builtin/file/glob.ts
…narrow search paths

validateGlobPattern():
- Inside a character class, [ is a literal character, not a second
  opening bracket. Previously [[]foo.ts was rejected as unclosed because
  the inner [ incremented bracketDepth a second time and the single ]
  only decremented it once. Now only ] closes a character class, so
  [[]foo.ts is accepted as a valid gitignore-style escape that matches a
  literal [.

compileGlobMatcher() / escapePicomatchExtensions():
- picomatch's extglob: false and rangeBraces: false options have no
  effect — the patterns are still parsed as extglob/range. Preprocess
  the pattern to escape these constructs so picomatch treats them as
  literal, matching rg --glob behavior:
    @(a|b) -> [@]\(a\|b\)  (literal, not extglob)
    {1..2} -> \{1..2\}     (literal, not range expansion)
  Brace alternatives like {ts,tsx} are left intact (supported by both).

buildRgArgs() / deriveSearchPath():
- Derive a safe literal search path from the pattern prefix and pass it
  to rg instead of always searching from '.'. For example,
  src/**/*.ts now searches 'src' instead of '.', so rg only walks the
  relevant subtree instead of collecting and sorting unrelated files
  before the streaming filter can see anything. Falls back to '.' when
  no literal prefix can be derived (e.g. *.ts, **/*.ts).

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 248855a554

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread packages/agent-core/src/tools/builtin/file/glob.ts Outdated
Comment thread packages/agent-core/src/tools/builtin/file/glob.ts Outdated
Comment thread packages/agent-core/src/tools/builtin/file/glob.ts Outdated
Comment thread packages/agent-core/src/tools/builtin/file/glob.ts Outdated
Comment thread packages/agent-core/src/tools/builtin/file/glob.ts Outdated
Comment thread packages/agent-core/src/tools/builtin/file/glob.ts Outdated
Comment thread packages/agent-core/src/tools/builtin/file/glob.ts
…idation

Address PR MoonshotAI#1245 review comments:

- Remove deriveSearchPath and always search from `.`. A derived prefix
  passed as the rg PATH overrides ignore rules (command-line paths are
  authoritative in rg), allows `../outside/**` to escape the authorized
  tree, and errors out on non-existent prefixes. The streaming line
  filter already prevents the output cap from being exhausted.

- Fix relativePath to only reject actual escapes (`..` or `../`), not
  files whose names start with two dots (e.g. `..config/a.ts`).

- Preserve `*` and `?` wildcards before literal parentheses. rg treats
  them as regular wildcards and `(` as a literal character, not as
  extglob prefixes. Only `@`, `!`, `+` are escaped as literal prefixes.

- Match rg's range-brace behavior: `{N..M}` and `{foo}` are single
  brace alternatives with the braces removed (matching `N..M` / `foo`),
  not literal braces or picomatch ranges.

- Reject malformed character classes (`[]`, `[!]`, `[^]`, `[z-a]`) and
  dangling trailing backslashes before running ripgrep, matching rg's
  hard parse errors instead of falling through to picomatch.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 71c9af3113

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread packages/agent-core/src/tools/builtin/file/glob.ts Outdated
Comment thread packages/agent-core/src/tools/builtin/file/glob.ts Outdated
Comment thread packages/agent-core/src/tools/builtin/file/glob.ts
Comment thread packages/agent-core/src/tools/builtin/file/glob.ts Outdated
…empty glob as broad

- rewriteBraces now tracks character-class state so braces inside `[]`
  (e.g. `[{a}].ts`) are left intact — rg treats them as literal class
  members, not brace-group delimiters
- empty brace alternatives (`ab{,c}`, `ab{c,,d}`, `ab{,}`) are filtered
  out to match rg, which ignores them; picomatch would expand the empty
  arm and surface extra files
- empty pattern `''` is now treated as a broad all-files glob, matching
  rg's `-g ''` behavior; previously picomatch threw on the empty string

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: daf49705dd

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread packages/agent-core/src/tools/builtin/file/glob.ts Outdated
Comment thread packages/agent-core/src/tools/builtin/file/glob.ts Outdated
Comment thread packages/agent-core/src/tools/builtin/file/glob.ts Outdated
Comment thread packages/agent-core/src/tools/builtin/file/glob.ts Outdated
Comment thread packages/agent-core/src/tools/builtin/file/glob.ts
Comment thread packages/agent-core/src/tools/builtin/file/glob.ts Outdated
Comment thread packages/agent-core/src/tools/builtin/file/glob.ts Outdated
morluto and others added 2 commits July 1, 2026 19:45
…ted classes, range arms, ./ prefix, unmatched braces

Unified the picomatch escape into a single character-class-aware scanner:
- POSIX bracket classes `[:digit:]` inside `[]` are escaped (`[:` → `[\:`)
  so picomatch treats them as literal, matching rg
- Extglob rewrites and bare parenthesis alternation `(a|b)` are now
  skipped inside character classes — `[@(a)].ts` preserves all chars as
  literal class members
- Bare `(a|b)` outside char classes has parens/pipe escaped so picomatch
  treats them as literal, matching rg
- `[!` is converted to `[^` for picomatch's negation syntax
- Range-like arms in brace alternatives (`{1..2,3}`) have dots replaced
  with `[.]` to prevent picomatch range expansion
- `./` prefix is stripped from patterns so the in-process filter matches
  rg's un-prefixed glob subject
- Unmatched `}` is now rejected by the validator, matching rg's error

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f7d0f50c55

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread packages/agent-core/src/tools/builtin/file/glob.ts
Comment thread packages/agent-core/src/tools/builtin/file/glob.ts Outdated
- compileGlobMatcher now detects a leading `!` as rg's glob exclusion
  marker (not a picomatch extglob prefix), strips it, and negates the
  matcher so `!(a).ts` excludes files matching `(a).ts` and includes
  everything else, matching rg --glob behavior
- isBroadPattern handles `!` prefix: `!` alone is broad (exclude
  nothing), but `!*` / `!**` are not broad (exclude everything)
- Brace arm splitting now uses splitBraceArms which splits on unescaped
  commas only, so `{\,,a}.ts` correctly produces two arms (`\,` and `a`)
  instead of corrupting the escaped comma into an empty segment

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ea6426ceb9

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +402 to +403
const rooted = pattern.startsWith('/');
let normalizedPattern = rooted ? pattern.slice(1) : pattern;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve rooted exclusions after stripping !

When the pattern starts with !/, rooted is computed before the leading exclusion marker is removed, so !/foo.ts is compiled as a negation of /foo.ts; relative rg output never starts with /, so the matcher keeps every file. Checked rg --help (--glob uses gitignore globs and ! excludes a glob) and rg --files -g '!/foo.ts' excludes only the root foo.ts; with this code Glob({ pattern: '!/foo.ts' }) returns foo.ts too. Fresh evidence beyond the earlier leading-! case: the failure requires the rooted !/ form.

Useful? React with 👍 / 👎.

Comment on lines +376 to +379
if (pattern.startsWith('!')) {
const rest = pattern.slice(1);
return rest === '';
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Treat a bare ! as an empty exclusion

For pattern: '!', this branch marks the pattern as broad and skips filtering, so the tool returns the first files from the tree. Ripgrep treats ! as an exclusion glob marker applied to the empty glob: since -g '' matches all files and rg --files -g '!' returns no matches, this regresses a valid pattern from no output to a broad file listing.

Useful? React with 👍 / 👎.

Comment on lines +493 to +496
if (ch === '\\' && i + 1 < pattern.length) {
result += ch + (pattern[i + 1] ?? '');
i += 2;
continue;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Drop escapes before ordinary glob characters

When a user escapes a non-special character, gitignore-style glob parsing removes the backslash before matching; checked rg --help for --glob (Globbing rules match .gitignore globs) and empirically rg --files --hidden -g 'foo\bar' matches foobar. Passing the backslash through to picomatch makes the in-process filter look for a literal backslash filename instead, so patterns containing these valid escapes can miss the file rg would have returned and surface a different literal-backslash file.

Useful? React with 👍 / 👎.

Comment on lines +537 to +538
const inner = pattern.slice(i + 1, j);
const escapedInner = inner.replaceAll('|', '\\|');

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Escape nested literal parentheses too

This only escapes | inside the parenthesized span, leaving any nested ( or ) for picomatch to interpret. For a literal pattern such as (a(b|c)).ts, rg's .gitignore glob semantics match the filename containing both pairs of parentheses, but this rewrite produces a matcher for (ab|c).ts; fresh evidence beyond the earlier bare-parenthesis thread is that nested literal parentheses are still left active inside the escaped outer pair.

Useful? React with 👍 / 👎.

}
// Escape picomatch-only extensions that rg --glob does not support,
// so the in-process matcher matches the same files rg would.
const escapedPattern = escapeForPicomatch(normalizedPattern);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Apply gitignore preprocessing before picomatch

Patterns are now compiled directly here, but rg parses --glob values as gitignore-style lines first: checked rg --help (Globbing rules match .gitignore globs), and empirically rg --files --hidden -g '#foo' treats the pattern as a comment/broad search while -g 'foo ' trims the unescaped trailing space and matches foo. Without that preprocessing, Glob({ pattern: '#foo' }) only returns a literal #foo file and Glob({ pattern: 'foo ' }) looks for a trailing-space filename, so valid rg-compatible patterns return the wrong files.

Useful? React with 👍 / 👎.

@morluto

morluto commented Jul 1, 2026

Copy link
Copy Markdown
Author

Superseded by #1280, which keeps the same fix but rewrites the branch into three clean commits with a simpler review surface.

@morluto morluto closed this Jul 1, 2026

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3b60ca9f8f

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

): Promise<ExecutableToolResult> {
const searchRoot = searchRoots[0] ?? this.workspace.workspaceDir;

const patternError = validateGlobPattern(args.pattern);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preprocess comment globs before validating

When the pattern starts with #, ripgrep applies .gitignore parsing first (rg --help says --glob uses .gitignore globs), so a commented glob like #[ is ignored and behaves like a broad search; I checked rg --files -g '#[' returns the normal file list. This validation runs on the raw pattern before preprocessGitignoreGlobPattern, so the same input is rejected as an unclosed character class instead of searching, even though the later matcher already treats #... as broad. Skip validation for comment globs or validate the preprocessed form.

Useful? React with 👍 / 👎.

Comment on lines +414 to +416
if (normalizedPattern.startsWith('!')) {
negated = true;
normalizedPattern = normalizedPattern.slice(1);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve literal bangs after the exclusion marker

For patterns beginning with !!, ripgrep consumes only the first ! as the exclusion marker (rg --help: Precede a glob with a ! to exclude it); empirically rg --files -g '!!foo' excludes the literal !foo and still returns unrelated files such as bar. Here the first ! is stripped, but the remaining !foo is passed to picomatch without disabling its own leading-bang negation, so the outer negation flips an already-negated matcher and keeps only foo, dropping unrelated files. Escape the remaining leading bang or compile with picomatch negation disabled after consuming rg's marker.

Useful? React with 👍 / 👎.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Glob searches can surface ignored files

1 participant