120 changes: 120 additions & 0 deletions src/discover/registry.rs
@@ -700,6 +700,13 @@ fn rewrite_segment_inner(seg: &str, excluded: &[String], depth: usize) -> Option
}
}

    // rtk grep expects `pattern path [extra_args...]`, while rg/grep commonly receive
    // leading flags before the pattern. Reorder simple flag prefixes so the rewritten
    // command still parses correctly. If we can't do that safely, prefer no rewrite.
    if rule.rtk_cmd == "rtk grep" {
        return rewrite_grep_like(env_prefix, cmd_clean, redirect_suffix);
    }

    // Try each rewrite prefix (longest first) with word-boundary check
    for &prefix in rule.rewrite_prefixes {
        if let Some(rest) = strip_word_prefix(cmd_clean, prefix) {
@@ -715,6 +722,95 @@ fn rewrite_segment_inner(seg: &str, excluded: &[String], depth: usize) -> Option
    None
}

fn rewrite_grep_like(env_prefix: &str, cmd_clean: &str, redirect_suffix: &str) -> Option<String> {
    let args: Vec<String> = tokenize(cmd_clean)
        .into_iter()
        .filter(|t| t.kind == TokenKind::Arg)
        .map(|t| t.value)
        .collect();

Comment on lines +726 to +731
Copilot AI Apr 19, 2026

`rewrite_grep_like` builds `args` by tokenizing and then filtering to `TokenKind::Arg`. In `lexer::tokenize`, characters like `*`, `{}`, `!`, etc. are emitted as `TokenKind::Shellism`, so this rewrite path will silently drop them and corrupt arguments (e.g., `rg foo src/*.rs` would lose the `*` and change the effective paths/pattern). Consider either (a) returning `None` (skip rewrite) if any non-`Arg` tokens occur within what should be a single shell word, or (b) reconstructing shell words by grouping contiguous `Arg`+`Shellism` tokens using offsets so globs/brace expansions are preserved exactly.

Suggested change
-    let args: Vec<String> = tokenize(cmd_clean)
-        .into_iter()
-        .filter(|t| t.kind == TokenKind::Arg)
-        .map(|t| t.value)
-        .collect();
+    let tokens = tokenize(cmd_clean);
+    if tokens.iter().any(|t| t.kind != TokenKind::Arg) {
+        return None;
+    }
+    let args: Vec<String> = tokens.into_iter().map(|t| t.value).collect();
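
Option (b) from this comment could look roughly like the sketch below. It is an illustration only: the `Tok` struct, its `offset` field, the `Operator` kind, and the `tok` helper are assumptions made for the example and may not match the real `lexer` types.

```rust
// Hypothetical token model; the real lexer's types and fields may differ.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Kind {
    Arg,
    Shellism,
    Operator, // e.g. `|` or `&&`; assumed to mean "not a simple command"
}

#[derive(Debug, Clone)]
struct Tok {
    kind: Kind,
    value: String,
    offset: usize, // assumed: byte offset of the token in the source string
}

// Small constructor used only to keep the example short.
fn tok(kind: Kind, value: &str, offset: usize) -> Tok {
    Tok { kind, value: value.to_string(), offset }
}

/// Rebuild shell words by merging tokens that touch in the source text
/// (no whitespace between them). Returns `None` if an operator is present.
fn group_shell_words(tokens: &[Tok]) -> Option<Vec<String>> {
    let mut words: Vec<String> = Vec::new();
    let mut prev_end: Option<usize> = None;
    for t in tokens {
        if t.kind == Kind::Operator {
            return None;
        }
        if prev_end == Some(t.offset) {
            // Starts exactly where the previous token ended: same shell word.
            if let Some(last) = words.last_mut() {
                last.push_str(&t.value);
            }
        } else {
            words.push(t.value.clone());
        }
        prev_end = Some(t.offset + t.value.len());
    }
    Some(words)
}
```

With tokens for `rg foo src/*.rs` (the `*` arriving as a separate `Shellism` token at offset 11), the glob would come back as the single word `src/*.rs` instead of being dropped.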

    let base = args.first()?;
    if base != "rg" && base != "grep" {
        return None;
    }

    let mut leading_flags = Vec::new();
    let mut idx = 1;
    while idx < args.len() {
        let arg = &args[idx];
        if arg == "-" || !arg.starts_with('-') {
            break;
        }

        if arg == "--" {
            return None;
        }

        if grep_flag_requires_separate_value(arg) {
            return None;
        }
Comment on lines +739 to +751
Copilot AI Apr 19, 2026

The safety check for leading flags assumes a hard-coded denylist of options that take a separate value (`grep_flag_requires_separate_value`). Any missing value-taking flag will be treated as a “simple” flag and can shift `idx` so that the next value is mis-identified as the pattern/path (e.g. ripgrep flags like `--max-depth 2` would make `2` become the pattern). To ensure the rewrite is actually safe, consider flipping this to an allowlist of valueless flags that you are confident don’t consume the next arg, or conservatively skip rewrite for unknown `-X`/`--long` leading flags unless they’re in a known-safe set or use `--flag=value` form.
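
A minimal sketch of the allowlist approach, under stated assumptions: `is_known_valueless_flag` and `split_leading_flags` are hypothetical helpers that are not part of this PR, and the flag list is illustrative rather than exhaustive. The `--flag=value` form is deliberately not accepted here, because a flag such as `--regexp=PATTERN` changes which positional argument is the pattern.

```rust
/// True only for flags known not to consume the next argument.
/// Illustrative list; extend it only with flags verified to be valueless.
fn is_known_valueless_flag(arg: &str) -> bool {
    matches!(
        arg,
        "-n" | "-i"
            | "-w"
            | "-l"
            | "-c"
            | "-v"
            | "-U"
            | "--line-number"
            | "--ignore-case"
            | "--word-regexp"
            | "--files-with-matches"
            | "--count"
            | "--invert-match"
            | "--multiline"
    )
}

/// Split the arguments that follow `rg`/`grep` into (leading flags, rest).
/// Returns `None` as soon as an unrecognized leading flag is seen, so the
/// caller can skip the rewrite instead of guessing.
fn split_leading_flags<'a>(args: &'a [&'a str]) -> Option<(&'a [&'a str], &'a [&'a str])> {
    let mut idx = 0;
    while idx < args.len() {
        let arg = args[idx];
        // A bare `-` (stdin) or any non-flag word ends the flag prefix.
        if arg == "-" || !arg.starts_with('-') {
            break;
        }
        if !is_known_valueless_flag(arg) {
            return None;
        }
        idx += 1;
    }
    Some(args.split_at(idx))
}
```

Under this scheme `rg --max-depth 2 foo` is left alone because `--max-depth` is not on the list, whereas the denylist would have treated it as a simple flag.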

        leading_flags.push(arg.clone());
        idx += 1;
    }

    let pattern = args.get(idx)?.clone();
    idx += 1;

    let (path, has_explicit_path) = match args.get(idx) {
        Some(arg) if arg != "-" && !arg.starts_with('-') => {
            idx += 1;
            (arg.clone(), true)
        }
        _ => (".".to_string(), false),
    };
Comment on lines +760 to +766
Copilot AI Apr 19, 2026

Path parsing treats `"-"` as not being an explicit path (`Some(arg) if arg != "-" && !arg.starts_with('-')`). For grep/rg, `-` is commonly used to mean stdin; with the current logic `rg PATTERN -` rewrites to `rtk grep PATTERN . -`, changing behavior (searching `.` and stdin instead of only stdin). Consider accepting `-` as a valid explicit path (set `has_explicit_path = true`), or if stdin paths aren’t supported, return `None` when `args[idx] == "-"` so the original command is preserved.
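
The second option (skip the rewrite when the path is `-`) might be sketched as follows. `resolve_path` is a hypothetical helper introduced only for illustration; it takes the argument that follows the pattern and returns the path plus a `has_explicit_path` flag, or `None` to signal that the command should not be rewritten.

```rust
/// Decide which search path to use, given the argument after the pattern.
/// `None` means "do not rewrite"; `Some((path, explicit))` mirrors the
/// `(path, has_explicit_path)` tuple used in the diff.
fn resolve_path(next: Option<&str>) -> Option<(String, bool)> {
    match next {
        // `-` means stdin for grep/rg: leave the original command untouched.
        Some("-") => None,
        // Any other non-flag word is treated as an explicit path.
        Some(arg) if !arg.starts_with('-') => Some((arg.to_string(), true)),
        // No argument, or a trailing flag: fall back to the current directory.
        _ => Some((".".to_string(), false)),
    }
}
```

Returning `None` keeps the conservative contract of the rewrite path: when the meaning of an argument is unclear, the original command runs unchanged.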

    let mut parts = vec![format!("{}rtk grep", env_prefix), pattern];
    if has_explicit_path || !leading_flags.is_empty() || idx < args.len() {
        parts.push(path);
    }
    parts.extend(args[idx..].iter().cloned());
    parts.extend(leading_flags);
    Some(format!("{}{}", parts.join(" "), redirect_suffix))
}

Copilot AI Apr 19, 2026

The new `rewrite_grep_like` behavior isn’t covered for a few important edge cases that the implementation explicitly tries to be “safe” about: (1) shell-glob/brace arguments (e.g. `src/*.rs`), (2) stdin path `-`, and (3) unknown leading flags that take a separate value. Adding tests for these would help prevent regressions, especially since this rewrite path is intentionally conservative (returning `None` when unsafe).

Suggested change
#[cfg(test)]
mod tests {
    use super::rewrite_grep_like;

    #[test]
    fn rewrite_grep_like_preserves_shell_glob_path_argument() {
        assert_eq!(
            rewrite_grep_like("", r#"rg foo "src/*.rs""#, ""),
            Some(r#"rtk grep foo src/*.rs"#.to_string())
        );
    }

    #[test]
    fn rewrite_grep_like_preserves_brace_style_path_argument() {
        assert_eq!(
            rewrite_grep_like("", r#"rg foo "src/{lib,bin}""#, ""),
            Some(r#"rtk grep foo src/{lib,bin}"#.to_string())
        );
    }

    #[test]
    fn rewrite_grep_like_keeps_stdin_dash_as_remaining_argument() {
        assert_eq!(
            rewrite_grep_like("", "rg foo -", ""),
            Some("rtk grep foo . -".to_string())
        );
    }

    #[test]
    fn rewrite_grep_like_returns_none_for_long_flag_with_separate_value() {
        assert_eq!(rewrite_grep_like("", "rg --glob *.rs foo src", ""), None);
    }

    #[test]
    fn rewrite_grep_like_returns_none_for_short_flag_with_separate_value() {
        assert_eq!(rewrite_grep_like("", "grep -e foo src", ""), None);
    }
}

fn grep_flag_requires_separate_value(arg: &str) -> bool {
    matches!(
        arg,
        "-A" | "-B"
            | "-C"
            | "-e"
            | "-f"
            | "-g"
            | "-m"
            | "-M"
            | "-t"
            | "-T"
            | "--after-context"
            | "--before-context"
            | "--context"
            | "--encoding"
            | "--engine"
            | "--file"
            | "--glob"
            | "--iglob"
            | "--ignore-file"
            | "--max-columns"
            | "--max-count"
            | "--path-separator"
            | "--pre"
            | "--pre-glob"
            | "--regexp"
            | "--replace"
            | "--sort"
            | "--sortr"
            | "--threads"
            | "--type"
            | "--type-add"
            | "--type-not"
    )
}

/// Strip a command prefix with word-boundary check.
/// Returns the remainder of the command after the prefix, or `None` if no match.
fn strip_word_prefix<'a>(cmd: &'a str, prefix: &str) -> Option<&'a str> {
@@ -1277,6 +1373,30 @@ mod tests {
        );
    }

    #[test]
    fn test_rewrite_rg_with_leading_flags_reorders_for_rtk_grep() {
        assert_eq!(
            rewrite_command(
                r#"rg -n -U "document:\s*\{\s*create:" api/src scripts"#,
                &[]
            ),
            Some(r#"rtk grep "document:\s*\{\s*create:" api/src scripts -n -U"#.into())
        );
    }

    #[test]
    fn test_rewrite_rg_with_leading_flags_and_default_path() {
        assert_eq!(
            rewrite_command(r#"rg -n "fn main""#, &[]),
            Some(r#"rtk grep "fn main" . -n"#.into())
        );
    }

    #[test]
    fn test_rewrite_rg_with_value_flags_skips_rewrite() {
        assert_eq!(rewrite_command(r#"rg -g "*.rs" "fn main" src"#, &[]), None);
    }

    #[test]
    fn test_rewrite_playwright() {
        let commands = vec![