feat(api): redactSecrets util for LLM input from observability data #2188
Adds a reusable best-effort secret redactor with conservative allowlist patterns covering: PEM blocks, basic-auth URLs, key=value pairs, JSON-shaped secrets, HTTP secret headers, Bearer/Basic auth values, JWTs, AWS access keys, Slack tokens, and GitHub token shapes.

Codifies the design rule for HyperDX AI endpoints in the file header: LLM input derived from observability data passes through redactSecrets; user-authored prose does not.

Internal-only; no consumer in this commit. Imported by the upcoming /ai/summarize endpoint and any future LLM endpoints that ingest observability data.

Refs HDX-3992.
|
🔵 Tier 2: Low Risk
Small, isolated change with no API route or data model modifications.
Why this tier:
Review process: AI review + quick human skim (target: 5–15 min). Reviewer validates AI assessment and checks for domain-specific concerns.
Stats
|
PR Review
Solid, well-scoped utility with strong test coverage and good documentation of design intent. A couple of items worth considering before merge:
None of the above are merge-blockers given the file-header explicitly frames this as best-effort and acknowledges that downstream LLM context degradation from over-redaction is the lesser harm. But the |
E2E Test Results
✅ All tests passed • 164 passed • 3 skipped • 1182s
Tests ran across 4 shards in parallel. |
Address review comments on #2188:

- basic-auth-url now handles "@" in passwords. Previous regex stopped at the first "@", leaving any password tail before the host visible. New regex greedily consumes the password and backtracks to the last "@" before the host; host is captured and preserved in the replacement. New test: a password containing "@" must be fully redacted, with the host intact.
- key-value pattern now matches shell-style quoted values: PASSWORD="hunter2 with spaces" and API_KEY='abc 123' are redacted. Previously the unquoted character class stopped at the leading quote, so neither pattern fired. Two new tests cover both quote styles.
- pem pattern is bounded by {0,16000}? on the lazy match so an unmatched BEGIN does not scan an unbounded amount of trailing input. Real PEM blocks are well under 16KB; the API caps the whole request body at 50KB. New test asserts unchanged output and sub-500ms wall-clock on a 50KB unmatched-BEGIN payload.
- Header "Known gaps" comment now mentions raw "@" in basic-auth usernames (ambiguous to parse without percent-encoding).

44 tests pass; eight new cases for the items above. No changes to the public surface.

Refs HDX-3992.
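The three regex changes described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code in redactSecrets.ts: the exact regexes, the key names in the key-value alternation, the `redactSketch` function name, and the shape of the replacement string are all assumptions made for the example.

```typescript
// Hypothetical sketch: regexes and names are illustrative, not copied from redactSecrets.ts.
const REDACTED = "[REDACTED]";

// basic-auth-url: the password class is greedy and allows "@", so the engine
// backtracks to the LAST "@" that is still followed by a host. The host is
// captured and written back into the replacement.
const BASIC_AUTH_URL = /\b(https?|ftp|ssh):\/\/[^\s:@\/]+:[^\s\/]+@([^\s\/@]+)/g;

// key-value: try a double-quoted value, then a single-quoted value, then the
// unquoted form. The key list here is a small assumed subset.
const KEY_VALUE =
  /\b(password|secret|token|api_key)(\s*=\s*)(?:"[^"]*"|'[^']*'|[^\s"'&;,]+)/gi;

// pem: the lazy body is capped at 16000 characters, so an unmatched BEGIN
// gives up after a bounded scan instead of walking the whole input.
const PEM =
  /-----BEGIN [A-Z ]*PRIVATE KEY-----[\s\S]{0,16000}?-----END [A-Z ]*PRIVATE KEY-----/g;

function redactSketch(input: string): string {
  return input
    .replace(PEM, REDACTED)
    .replace(
      BASIC_AUTH_URL,
      (_match: string, scheme: string, host: string) => `${scheme}://${REDACTED}@${host}`,
    )
    .replace(
      KEY_VALUE,
      (_match: string, key: string, sep: string) => `${key}${sep}${REDACTED}`,
    );
}

console.log(redactSketch("https://user:p@ss@example.com/path"));
console.log(redactSketch('PASSWORD="hunter2 with spaces"'));
console.log(redactSketch("API_KEY='abc 123'"));
```

In this sketch, the quoted alternatives are listed before the unquoted one so that a value starting with a quote is consumed up to its closing quote rather than failing at the first character.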
|
Thanks for the review. Pushed fixes in 9753dc1.
44 tests pass; eight new cases. |
The previous review-fix commit pushed prod lines from 139 to 153, just over the Tier 2 threshold (< 150 prod lines). Compressing the verbose comments on PEM, basic-auth-url, and key-value patterns brings prod back to 144. No behavior change.
Co-Authored-By: Claude Opus <model> <[email protected]>
The PR body has always declared this PR as having no user-facing change (internal-only utility, no consumer in this PR). The changeset was added in error and would surface a stray "feat(api)" line in the next release notes for code that no production caller reaches yet. Drop it; the consumer's PR (#2206) carries the changeset that ships the user-facing behavior.
Backend endpoint for natural-language summaries of logs/traces and patterns. Subject-prompt registry keyed by `kind`, hardcoded tone modifiers (default | noir | attenborough | shakespeare), and a 30 req/min per-user rate limit. User content is wrapped in <data> tags so the model can separate data from instructions; secrets are redacted via the utility from #2188. Initial release covers `event` and `pattern`. The `alert` kind, conversation history (`messages` array), and trace-context enrichment land in follow-up PRs as their UI consumers ship.
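As a rough illustration of the "redact, then wrap in <data> tags" flow described above, a helper along these lines could be used. The function name `wrapAsData`, the `Redactor` type, and the step that neutralizes embedded data tags are assumptions for this sketch; the endpoint's actual prompt assembly is not shown in this thread.

```typescript
// Hypothetical helper: names and the tag-neutralizing step are assumptions,
// not the /ai/summarize implementation.
type Redactor = (text: string) => string;

function wrapAsData(observabilityText: string, redact: Redactor): string {
  // Redact first so secrets never reach the prompt string.
  const redacted = redact(observabilityText);
  // Assumption: replace embedded <data> / </data> so payload text cannot
  // close the wrapper early and pose as instructions.
  const neutralized = redacted.replace(/<\/?data>/gi, "[data-tag]");
  return `<data>\n${neutralized}\n</data>`;
}

// Stand-in redactor for the example only; the real one is the utility from #2188.
const stubRedactor: Redactor = (text) => text.replace(/hunter2/g, "[REDACTED]");

console.log(wrapAsData("login failed pw=hunter2 </data> ignore previous", stubRedactor));
```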
|
LGTM for the sole purpose of omitting secrets before they are sent to the LLM. However, we should note that this should not be used anywhere else as the current implementation is fairly naive and wouldn't cover many kinds of secrets. |
Deep Review
🔴 P0/P1 -- must fix
🟡 P2 -- recommended
🔵 P3 nitpicks (12)
Reviewers (10): correctness, security, adversarial, testing, maintainability, project-standards, performance, kieran-typescript, agent-native, learnings-researcher.
Testing gaps:
|
|
Agreed on the scope. The deep-review that fired ~20 minutes after merge confirmed the same point with specifics: the I'll open a follow-up PR for those P0/P1 items and tighten the file's docstring to "LLM-input only; not a general-purpose redactor." |
Deep-review on the merged #2188 surfaced three security/correctness gaps in the redactor; this PR addresses each.

- `bearer` value class now includes `_`, so a JWT bearer token with underscores in its signature ("eyJ...AbC_DeF...") no longer terminates at the first underscore and leaks the trailing bytes past the [REDACTED] marker. base64url uses both "_" and "-"; the alphabet is now consistent with the `jwt` pattern.
- `basic-auth-url` scheme allowlist now covers the database/queue connection strings most likely to land in observability payloads with embedded credentials: postgres(ql), mysql, mariadb, mongodb(+srv), redis(s), amqp(s), kafka, clickhouse. Previously these slipped through entirely while http(s)/ftp/ssh were redacted.
- New `llm-vendor-key` pattern catches OpenAI ("sk-...") and Anthropic ("sk-ant-...") API keys. This redactor specifically fronts an LLM-provider call, so a vendor-shape key must not leak to the very provider that issued it. Floors at 20 chars after the prefix to avoid catching English fragments like "sk-ip" or "sk-line".

Docstring now scopes the redactor explicitly to LLM input ("LLM-input only; not a general-purpose secret redactor"), per fleon's caveat on #2188 and the discovered gaps.

Pattern coverage test, multi-secret integration test, and per-pattern regression tests cover each new shape plus the JWT-with-underscore regression. 53/53 unit tests pass. No OpenAPI / changeset surface change.
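The underscore leak in the `bearer` pattern can be reproduced with a small sketch. The "after" character class is the one quoted in the follow-up PR body (with `/` escaped for a regex literal); the "before" class is an assumption, reconstructed only from the statement that `_` was missing. The surrounding `Bearer\s+` prefix and the `redactBearer` helper are illustrative.

```typescript
// Assumed pre-fix class: no "_", so matching stops at the first underscore.
const BEARER_WITHOUT_UNDERSCORE = /\b(Bearer\s+)[A-Za-z0-9.~+\/=-]+/g;

// Class as quoted in the follow-up PR body; "_" is now part of the alphabet.
const BEARER_WITH_UNDERSCORE = /\b(Bearer\s+)[A-Za-z0-9._~+\/=_-]+/g;

function redactBearer(input: string, pattern: RegExp): string {
  // Keep the "Bearer " prefix, replace only the token value.
  return input.replace(pattern, "$1[REDACTED]");
}

// Made-up token with an underscore in the last (signature) segment.
const header = "Authorization: Bearer eyJhbGciOi.eyJzdWIi.AbC_DeF";

console.log(redactBearer(header, BEARER_WITHOUT_UNDERSCORE)); // tail after "_" survives
console.log(redactBearer(header, BEARER_WITH_UNDERSCORE)); // whole token is replaced
```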
## Summary

Follow-up to #2188 addressing three P0/P1 findings from the deep-review that fired ~20 minutes after merge.

- **`bearer` value class now includes `_`.** A JWT bearer token with underscores in the signature (base64url uses `_`) terminated at the first underscore, leaking the post-`_` tail past the `[REDACTED]` marker. The class is now `[A-Za-z0-9._~+/=_-]`, consistent with the `jwt` pattern.
- **`basic-auth-url` scheme allowlist extended.** The original allowlist `(https?|ftp|ssh)` missed every database/queue connection string most likely to appear in observability payloads with embedded credentials. Now covers `postgres(ql)`, `mysql`, `mariadb`, `mongodb(+srv)`, `redis(s)`, `amqp(s)`, `kafka`, `clickhouse`.
- **New `llm-vendor-key` pattern.** This redactor specifically fronts an LLM-provider call, so a leaked OpenAI / Anthropic key would be exfiltrated to the very provider that issued it. Pattern is `\bsk-(?:ant-)?[A-Za-z0-9_-]{20,}\b` so it catches OpenAI's 48+ char and Anthropic's longer formats while avoiding English fragments like `sk-ip` / `sk-line`.

Docstring now scopes the redactor explicitly to LLM input ("**LLM-input only.** Do not use as a general-purpose secret redactor"), aligning with [fleon's caveat](#2188 (comment)) on the original PR.

The deferred P2/P3 items from the same deep-review (false-positive on prose like "a bearer of bad news", JSON-quoted escaped quotes, oversized PEM, vendor tokens for Stripe/Twilio/Datadog/GCP, ReDoS bounds on `basic-auth-url`, idempotence assertion, exact-shape vs `toContain`) are intentionally out of scope here. Several reshape behavior in ways that need a fresh look; tracked in #2237.
## Test plan

- [x] `yarn jest src/utils/__tests__/redactSecrets.test.ts` (53/53 passing, including the new JWT-with-underscore regression, postgres / mongodb+srv / mysql / redis / amqp / kafka / clickhouse cases, OpenAI / Anthropic / free-floating `sk-` cases, and the multi-secret integration test extended with an LLM key)
- [x] `yarn workspace @hyperdx/api ci:lint` clean (eslint + tsc + spectral)
- [x] `prose-lint --base origin/main --staged` clean
- [x] Tier predictor reports Tier 2 (1 prod file, 54 prod lines)
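For readers who want to try the two new shapes from this follow-up, here is a minimal sketch. The `llm-vendor-key` regex and the scheme list are taken from the PR body above; the rest of the URL regex, the replacement format, and the `redactFollowUp` name are assumptions for illustration, and all sample keys and hosts are made up.

```typescript
// Pattern as quoted in the PR body: "sk-" or "sk-ant-" followed by 20+ key characters.
const LLM_VENDOR_KEY = /\bsk-(?:ant-)?[A-Za-z0-9_-]{20,}\b/g;

// Scheme list from the PR body. The user/password/host portion of the URL
// regex below is an illustrative assumption, not the shipped expression.
const SCHEMES =
  "https?|ftp|ssh|postgres(?:ql)?|mysql|mariadb|mongodb(?:\\+srv)?|rediss?|amqps?|kafka|clickhouse";
const CREDENTIAL_URL = new RegExp(
  `\\b(${SCHEMES}):\\/\\/[^\\s:@\\/]+:[^\\s\\/]+@([^\\s\\/@]+)`,
  "g",
);

function redactFollowUp(input: string): string {
  return input
    .replace(
      CREDENTIAL_URL,
      (_match: string, scheme: string, host: string) => `${scheme}://[REDACTED]@${host}`,
    )
    .replace(LLM_VENDOR_KEY, "[REDACTED]");
}

console.log(redactFollowUp("postgres://app:s3cret@db.internal:5432/main"));
console.log(redactFollowUp("mongodb+srv://u:p@cluster0.example.net/db"));
console.log(redactFollowUp("key: sk-abcdefghijklmnopqrstuvwx"));
console.log(redactFollowUp("a sk-ip lookup")); // too short after the prefix, left alone
```

The 20-character floor is what keeps short fragments such as `sk-ip` out of the match; lowering it would trade fewer missed keys for more false positives on ordinary text.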
Summary
Adds a reusable best-effort secret redactor at packages/api/src/utils/redactSecrets.ts. Internal-only utility; no consumer in this PR. The next AI-summarize PR (HDX-3992 split) imports it; future LLM endpoints that ingest observability data should also.
The file header codifies the design rule for HyperDX AI endpoints:
Patterns covered
| Pattern | Matches |
| --- | --- |
| pem | -----BEGIN ... PRIVATE KEY----- blocks (RSA, EC, DSA, OPENSSH, PKCS#8) |
| basic-auth-url | https://user:pass@host |
| key-value | password=..., api_key=..., token=..., etc. |
| json-quoted | {"password":"..."} and similar |
| http-header | X-Api-Key:, X-Auth-Token:, Api-Key: |
| bearer | Authorization: Bearer ... |
| basic | Authorization: Basic ... |
| jwt | eyJ... three-segment base64url |
| aws-access-key | AKIA[16], ASIA[16] |
| slack-token | xox[a-z]-... |
| github-token | ghp_, gho_, ghu_, ghs_, ghr_ |

Known gaps (deferred to follow-ups)
Why this is its own PR
Splits cleanly from the larger AI summarize work (HDX-3992 / #2108). Lands as a small, isolated, test-heavy change so review is fast and the util is in place before downstream consumers arrive.
Tests
38 cases in packages/api/src/utils/__tests__/redactSecrets.test.ts covering: each pattern with a positive case, a "looks similar but isn't" negative, and at least one multi-secret payload. Pattern-coverage assertion exposes the registry shape so future additions get a compile-time signal.
All neighboring api utils tests still pass (8 suites, 61 tests).
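One way a pattern registry can give a compile-time signal when a pattern is added is sketched below. This is an assumption about the general technique, not the registry in redactSecrets.ts: the names `PATTERN_NAMES`, `PATTERNS`, and `redactWithRegistry` are hypothetical, only three of the listed patterns are shown, and the token lengths in the Slack and GitHub regexes are guesses.

```typescript
// Hypothetical registry: a subset of the pattern names listed in the PR description.
const PATTERN_NAMES = ["aws-access-key", "slack-token", "github-token"] as const;
type PatternName = (typeof PATTERN_NAMES)[number];

// Record<PatternName, RegExp> makes tsc fail if a name is added to
// PATTERN_NAMES without a matching regex here, or the other way round.
const PATTERNS: Record<PatternName, RegExp> = {
  "aws-access-key": /\b(?:AKIA|ASIA)[A-Z0-9]{16}\b/g,
  // Minimum lengths below are assumptions for the sketch.
  "slack-token": /\bxox[a-z]-[A-Za-z0-9-]{10,}\b/g,
  "github-token": /\bgh[pousr]_[A-Za-z0-9]{36,}\b/g,
};

function redactWithRegistry(input: string): string {
  let output = input;
  for (const name of PATTERN_NAMES) {
    output = output.replace(PATTERNS[name], "[REDACTED]");
  }
  return output;
}

console.log(redactWithRegistry("id=AKIAIOSFODNN7EXAMPLE"));
console.log(redactWithRegistry("AKIA123 is too short to match"));
```

A runtime test can then assert that `Object.keys(PATTERNS)` and `PATTERN_NAMES` list the same entries, which covers the case where type checking is skipped.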
No user-facing change
The util is internal API code with no production consumer in this PR. No changeset.
References