Fix IGNORE_IP_REGEX_START never matching localhost IPs #280

Open
Chessing234 wants to merge 1 commit into allenai:main from Chessing234:fix/ignore-ip-regex-fstring


Summary

BaseUrlTagger.IGNORE_IP_REGEX_START uses an r"..." (raw string) instead of f"..." (f-string), so the regex is compiled as the literal text ^{IGNORE_IP_REGEX.pattern} rather than expanding to ^(127\.0\.0\.1|0\.0\.0\.0|::1).

Bug: Line 62 — re.compile(r"^{IGNORE_IP_REGEX.pattern}") never matches any real IP.
Impact: The localhost check at line 97 (if not self.IGNORE_IP_REGEX_START.match(...)) always passes, so 127.0.0.1, 0.0.0.0, and ::1 are incorrectly yielded as blocklist URLs instead of being filtered out.
Evidence: The adjacent ONLY_URL_REGEX (line 64) and ADP_FORMAT_REGEX (line 65) both correctly use f"..." for the same interpolation pattern.

Fix: Change r"..." to f"..." on line 62.
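A minimal reproduction of the difference, for reviewers who want to check it locally. The definition of IGNORE_IP_REGEX below is an assumption reconstructed from the expansion quoted in the summary, not copied from the repository; `buggy` and `fixed` are illustrative names.

```python
import re

# Assumed definition, inferred from the expanded pattern quoted in the summary.
IGNORE_IP_REGEX = re.compile(r"(127\.0\.0\.1|0\.0\.0\.0|::1)")

# Current behaviour: a raw string, so the braces are never interpolated.
buggy = re.compile(r"^{IGNORE_IP_REGEX.pattern}")

# Proposed behaviour: an f-string, so the inner pattern is substituted in.
fixed = re.compile(f"^{IGNORE_IP_REGEX.pattern}")

print(buggy.pattern)
print(fixed.pattern)

# Compare both compiled patterns against the localhost forms and a normal host.
for host in ("127.0.0.1", "0.0.0.0", "::1", "example.com"):
    print(host, bool(buggy.match(host)), bool(fixed.match(host)))
```

With the raw string, none of the three localhost forms match; with the f-string, all three match and `example.com` does not.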

Test plan

  • Confirmed with Python REPL that the r-string compiles to literal ^{IGNORE_IP_REGEX.pattern}
  • Confirmed the f-string correctly expands to ^(127\.0\.0\.1|0\.0\.0\.0|::1)
  • Adjacent class attributes use f-strings for the same pattern — consistent fix
  • One-character change, no side effects
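The REPL checks above could also be captured as a small regression test. This is only a sketch: `filter_blocklist_hosts` is a hypothetical stand-in for the check on line 97, the IGNORE_IP_REGEX definition is assumed from the expansion quoted in the summary, and the test names are illustrative.

```python
import re

# Assumed definition, inferred from the expanded pattern quoted in the summary.
IGNORE_IP_REGEX = re.compile(r"(127\.0\.0\.1|0\.0\.0\.0|::1)")
IGNORE_IP_REGEX_START = re.compile(f"^{IGNORE_IP_REGEX.pattern}")


def filter_blocklist_hosts(hosts):
    """Hypothetical stand-in for the line-97 check: drop localhost entries."""
    return [h for h in hosts if not IGNORE_IP_REGEX_START.match(h)]


def test_localhost_entries_are_filtered():
    hosts = ["127.0.0.1", "0.0.0.0", "::1", "ads.example.com"]
    assert filter_blocklist_hosts(hosts) == ["ads.example.com"]


def test_pattern_is_interpolated():
    # An un-interpolated placeholder would leave a literal brace in the pattern.
    assert "{" not in IGNORE_IP_REGEX_START.pattern
```

The second test guards specifically against the raw-string regression, since it fails whenever the placeholder text survives into the compiled pattern.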

IGNORE_IP_REGEX_START is compiled with r"^{IGNORE_IP_REGEX.pattern}",
which is a raw string that literally matches the text
"{IGNORE_IP_REGEX.pattern}" instead of expanding the regex pattern.
This means the localhost IP check on line 97 never matches, so
127.0.0.1, 0.0.0.0, and ::1 are incorrectly yielded as blocklist
URLs. The adjacent ONLY_URL_REGEX and ADP_FORMAT_REGEX both correctly
use f-strings for the same interpolation pattern.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>