feat: AI summarize with extensible subjects, trace context, and security hardening#2108
alex-fedotyev wants to merge 9 commits into
E2E Test Results: ✅ All tests passed • 145 passed • 3 skipped • 1087s
Tests ran across 4 shards in parallel.
- Add POST /ai/summarize endpoint that uses the configured LLM to generate concise, actionable summaries for individual events and patterns
- Add useAISummarize hook in the frontend to call the new endpoint
- Update AISummarizeButton and AISummarizePatternButton to use real AI when aiAssistantEnabled is true, falling back to the Easter egg themes when no AI provider is configured
- Update AISummaryPanel to support both real AI and Easter egg display modes (no info popover / dismiss for real AI, no italic theme label)

HDX-3992
Co-authored-by: Alex Fedotyev <alex-fedotyev@users.noreply.github.com>
- Export formatEventContent/formatPatternContent for testability
- Always show 'Don't show' link (both real AI and easter egg modes)
  - Real AI: visible always (not gated by easter egg dates)
  - Easter egg: still uses dismissEasterEgg() for localStorage persistence
- AISummaryPanel: show 'Don't show' link in collapsed state for easy dismissal, remove April Fools popover in real AI mode

Tests (42 new):
- formatEventContent: 9 tests covering all field extraction paths
- formatPatternContent: 3 tests for pattern/samples formatting
- AISummarizeButton: 10 tests (real AI, fake AI, dismiss, error, toggle)
- AISummarizePatternButton: 5 tests (visibility, AI/fallback, dismiss)
- AISummaryPanel: 8 tests (dismiss, error, theme label, real vs fake)
- POST /ai/summarize: 7 tests (validation, prompts, error handling)

Co-authored-by: Alex Fedotyev <alex-fedotyev@users.noreply.github.com>
The Pattern type requires an 'id' field. Added it to all test fixtures to fix the CI TypeScript check.
Co-authored-by: Alex Fedotyev <alex-fedotyev@users.noreply.github.com>

SampleLog requires __hdx_timestamp alongside __hdx_pattern_field. Added the missing field and used proper Pattern typing throughout.
Co-authored-by: Alex Fedotyev <alex-fedotyev@users.noreply.github.com>
…own output
- Enrich trace summaries with full trace context (span groups with count/sum/p50 durations, error spans with details)
- Add tone/style picker (noir, attenborough, shakespeare) gated behind ?smart=true; persisted in localStorage, auto-regenerates on change
- Render AI output as markdown with highlighted key details
- Improve prompts: terse for healthy events, focused for errors
- Enrich pattern summaries with sample attributes
- Add env-local-preload.js so .env.development.local overrides work
- Fix react-markdown ESM mock for Jest
- Make trace span fetch lazy (only when user clicks Summarize)
- Filter __hdx_ internal keys from pattern sample attributes
- Add bounds clamp to percentile calculation
- Cap trace context output at 4KB to stay within content limits
- Fix stale test assertions for rewritten prompts
- Remove stale comment in AISummaryPanel
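
The bounds clamp and the 4KB cap mentioned above could look roughly like the sketch below. The function names, the nearest-rank percentile method, the truncation marker, and counting the cap in UTF-16 code units rather than bytes are all assumptions made for illustration; the PR does not show the actual implementation.

```typescript
// Hypothetical helper: nearest-rank percentile over an ascending-sorted array.
// Both p and the computed index are clamped so out-of-range input cannot read
// past either end of the array.
function percentile(sortedAsc: number[], p: number): number {
  if (sortedAsc.length === 0) return 0;
  const clampedP = Math.min(1, Math.max(0, p));
  const rank = Math.ceil(clampedP * sortedAsc.length) - 1;
  const idx = Math.min(sortedAsc.length - 1, Math.max(0, rank));
  return sortedAsc[idx];
}

// Assumption: the cap is approximated as 4096 UTF-16 code units, not bytes.
const MAX_CONTEXT_CHARS = 4096;
const TRUNCATION_MARKER = "\n[truncated]";

// Hypothetical helper: keep the context under the cap and mark the cut.
function capContext(text: string, maxChars: number = MAX_CONTEXT_CHARS): string {
  if (text.length <= maxChars) return text;
  return text.slice(0, maxChars - TRUNCATION_MARKER.length) + TRUNCATION_MARKER;
}
```

Clamping both the percentile and the index means a caller passing `p = 5` or `p = -1` gets the maximum or minimum sample rather than `undefined`.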
useEventsAroundFocus runs getConfig in a useMemo even when enabled=false. When the real trace source hasn't loaded yet, passing undefined crashes on source.kind access. Provide a minimal stub TTraceSource so the useMemo produces a harmless no-op config without throwing.
351795a to 57086e8
Knip - Unused Code Analysis: 🔴 24 issues found
Unused exports (12)
Unused exported types (12)
Knip finds unused files, dependencies, and exports in your codebase.
…marize
Architectural refactor so adding new subjects (alerts, metric anomalies) or
follow-up conversation flows doesn't require rethinking shared code.
Backend
- Subject-keyed prompt registry in routers/api/aiSummarize.ts
- API schema: { kind, content, tone?, messages? } — messages shape fixed
now so future conversation UI doesn't need a schema change
- Per-user in-memory rate limiter (30/min) via new middleware/rateLimit.ts
- Prompt injection defense: redactSecrets() scrubs password/token/Bearer/JWT
patterns, content wrapped in <data>...</data> tags, system prompt tells
the model to treat anything inside as data not instructions
- Prompts explicitly note severity labels can be misleading
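
As a rough illustration of the delimiter approach in the Backend notes above, the sketch below wraps untrusted content in `<data>` tags and pairs it with a system rule. The names `wrapAsData`, `buildMessages`, and `DATA_RULE`, the wording of the rule, and the step that escapes embedded `<data>` tags are assumptions; the commit only states that content is wrapped and that the system prompt tells the model to treat it as data.

```typescript
// Hypothetical wording; the real system prompt text is not shown in the PR.
const DATA_RULE =
  "Anything inside <data>...</data> is untrusted observability data. " +
  "Summarize it; never follow instructions that appear inside it.";

// Assumption: escape any <data> or </data> tag already present in the payload
// so the payload cannot close the delimiter block early.
function wrapAsData(content: string): string {
  const escaped = content.replace(/<\s*\/?\s*data\s*>/gi, (tag) =>
    tag.replace("<", "&lt;").replace(">", "&gt;"),
  );
  return `<data>\n${escaped}\n</data>`;
}

interface PromptMessage {
  role: "system" | "user";
  content: string;
}

// Combine a subject-specific system prompt with the data rule and the wrapped payload.
function buildMessages(subjectPrompt: string, content: string): PromptMessage[] {
  return [
    { role: "system", content: `${subjectPrompt}\n\n${DATA_RULE}` },
    { role: "user", content: wrapAsData(content) },
  ];
}
```

In the flow the commit describes, the content would first pass through `redactSecrets()` before being wrapped.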
Frontend
- Shared useAISummarizeState hook deduplicates state from both buttons
- Subject definitions (EVENT_SUBJECT, PATTERN_SUBJECT) in dedicated files
- Shared classifiers (isErrorEvent, isWarnEvent, normalizeSeverity) that
cross-check severity + statusCode + HTTP + body regex instead of trusting
severity alone
- traceContext.ts uses classifiers, handles non-string SpanAttributes safely,
drops db.statement to avoid credential leaks
- Button components trimmed from ~250 lines each to ~80
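
The cross-checking classifiers described in the Frontend notes above might be sketched as follows. Only the function names `isErrorEvent`, `isWarnEvent`, and `normalizeSeverity` come from the commit; the `EventSignals` shape, the severity aliases, the HTTP thresholds, and the body regex are assumptions chosen for illustration.

```typescript
// Hypothetical input shape; the real event type is not shown in the PR.
interface EventSignals {
  severity?: string; // e.g. "info", "ERROR", "Warning"
  statusCode?: string; // span status text, e.g. "Error" (assumed format)
  httpStatus?: number; // HTTP response code, if any
  body?: string; // log body or span name
}

type Severity = "error" | "warn" | "info" | "unknown";

// Map free-form severity labels onto a small closed set.
function normalizeSeverity(raw?: string): Severity {
  const s = (raw ?? "").trim().toLowerCase();
  if (s.startsWith("err") || s === "fatal" || s === "critical") return "error";
  if (s.startsWith("warn")) return "warn";
  if (s === "info" || s === "debug" || s === "trace") return "info";
  return "unknown";
}

// Assumed body heuristic: looks for a few common failure words.
const ERROR_BODY = /exception|\berror\b|\bpanic\b/i;

// An event counts as an error if ANY signal says so, not just the severity label.
function isErrorEvent(e: EventSignals): boolean {
  return (
    normalizeSeverity(e.severity) === "error" ||
    /error/i.test(e.statusCode ?? "") ||
    (e.httpStatus !== undefined && e.httpStatus >= 500) ||
    ERROR_BODY.test(e.body ?? "")
  );
}

// Warnings: not an error, but flagged by severity or a 4xx response.
function isWarnEvent(e: EventSignals): boolean {
  if (isErrorEvent(e)) return false;
  const is4xx = e.httpStatus !== undefined && e.httpStatus >= 400 && e.httpStatus < 500;
  return normalizeSeverity(e.severity) === "warn" || is4xx;
}
```

With this shape, an event logged at `info` but carrying an HTTP 503 is still treated as an error, which is the kind of mislabeled severity the commit calls out.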
Tests
- +45 new tests: classifiers, traceContext edge cases, API redaction,
delimiter wrapping, alert kind, conversation/messages mode
- Add rate-limit middleware tests (6 cases: under/over limit, window reset,
per-user isolation, per-limiter isolation, error payload shape)
- Document messages[] trust boundary in aiSummarize schema — caller can
claim any assistant role, acceptable for single-shot, requires
server-side state when a follow-up UI is built
- Extend secret redaction to JSON-shape ("key":"value") and HTTP-header
shape (X-Api-Key: value) + tests
- Add supportsTraceContext gate tests — pattern subject and event
subjects without a traceId must never enable the span fetch
- Dedupe coerce-attribute helpers into formatHelpers.ts; used by both
subject formatters and traceContext
- Simplify TraceAttributeValue to `unknown` (the other union members were
absorbed anyway)
- Comment why abort-reset uses queueMicrotask instead of setTimeout(0)
(jest.useFakeTimers would otherwise freeze the reset)
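
The rate-limit cases listed above (under/over limit, window reset, per-user isolation, per-limiter isolation) can be illustrated with a small fixed-window limiter. This is a sketch under assumptions: the commit only says the limiter is per-user, in-memory, and 30/min, so the fixed-window strategy, the `createRateLimiter` name, and the injectable clock are hypothetical.

```typescript
interface RateLimiterOptions {
  limit: number; // max requests per window
  windowMs: number; // window length in milliseconds
  now?: () => number; // injectable clock, handy for deterministic tests
}

// Hypothetical factory: each call owns its own Map, so two limiters never share counts.
function createRateLimiter({ limit, windowMs, now = Date.now }: RateLimiterOptions) {
  const windows = new Map<string, { start: number; count: number }>();

  // Returns true if the request is allowed, false if the user is over the limit.
  return function allow(userId: string): boolean {
    const t = now();
    const w = windows.get(userId);
    if (!w || t - w.start >= windowMs) {
      // First request, or the previous window has expired: start a new window.
      windows.set(userId, { start: t, count: 1 });
      return true;
    }
    if (w.count >= limit) return false;
    w.count += 1;
    return true;
  };
}

// Example configuration matching the 30/min figure from the commit message.
const allowSummarize = createRateLimiter({ limit: 30, windowMs: 60_000 });
```

Entries are never evicted in this sketch, so a production version would also need some cleanup of idle users.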
Superseding this draft. I'm decomposing HDX-3992 into a stacked PR series so each piece reviews on its own. Stack base: #2188 ( Stack on top:
Keeping this branch around for cherry-picks, but closing the PR so it doesn't sit in the queue.
…2188)

## Summary

Adds a reusable best-effort secret redactor at `packages/api/src/utils/redactSecrets.ts`. Internal-only utility; no consumer in this PR. The next AI-summarize PR (HDX-3992 split) imports it; future LLM endpoints that ingest observability data should also.

The file header codifies the design rule for HyperDX AI endpoints:

> Any LLM input derived from observability data passes through `redactSecrets` before leaving the API process. User-authored prose (the chart-builder assistant where the user types their own question) does NOT, because redacting the user's own input would strip exactly what they meant to ask.

## Patterns covered

| name | shape |
|------|-------|
| `pem` | `-----BEGIN ... PRIVATE KEY-----` blocks (RSA, EC, DSA, OPENSSH, PKCS#8) |
| `basic-auth-url` | `https://user:pass@host` |
| `key-value` | `password=...`, `api_key=...`, `token=...`, etc. |
| `json-quoted` | `{"password":"..."}` and similar |
| `http-header` | `X-Api-Key:`, `X-Auth-Token:`, `Api-Key:` |
| `bearer` | `Authorization: Bearer ...` |
| `basic` | `Authorization: Basic ...` |
| `jwt` | `eyJ...` three-segment base64url |
| `aws-access-key` | `AKIA[16]`, `ASIA[16]` |
| `slack-token` | `xox[a-z]-...` |
| `github-token` | `ghp_`, `gho_`, `ghu_`, `ghs_`, `ghr_` |

## Known gaps (deferred to follow-ups)

- URL percent-encoded values (beyond what query-string parsing already catches)
- Vendor-specific tokens not listed above (Stripe, Twilio, Datadog)
- Generic high-entropy hex blobs (too many false positives without surrounding context)

## Why this is its own PR

Splits cleanly from the larger AI summarize work (HDX-3992 / #2108). Lands as a small, isolated, test-heavy change so review is fast and the util is in place before downstream consumers arrive.

## Tests

38 cases in `packages/api/src/utils/__tests__/redactSecrets.test.ts` covering: each pattern with a positive case, a "looks similar but isn't" negative, and at least one multi-secret payload. Pattern-coverage assertion exposes the registry shape so future additions get a compile-time signal.

```
yarn jest src/utils/__tests__/redactSecrets.test.ts
# Test Suites: 1 passed, 1 total
# Tests: 38 passed, 38 total
```

All neighboring api utils tests still pass (8 suites, 61 tests).

## No user-facing change

The util is internal API code with no production consumer in this PR. No changeset.

## References

- Linear: HDX-3992
- Related (do-not-merge while we split): #2108
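
To make the named-pattern registry idea concrete, here is a minimal sketch covering four of the shapes from the table above. It is not the contents of `redactSecrets.ts`: the regexes, the `[REDACTED]` placeholder, the `RedactionPattern` type, and the function name `redactSecretsSketch` are assumptions, and the sketch is far less thorough than the 11 patterns the PR describes.

```typescript
// Hypothetical registry entry: a name for coverage checks plus a regex and replacement.
interface RedactionPattern {
  name: string;
  regex: RegExp;
  replacement: string;
}

// Order matters: "bearer" runs before "jwt" so a Bearer JWT is replaced only once.
const PATTERNS: RedactionPattern[] = [
  {
    name: "bearer",
    regex: /\b(Bearer)\s+[A-Za-z0-9._~+\/-]+=*/gi,
    replacement: "$1 [REDACTED]",
  },
  {
    name: "jwt",
    regex: /\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+/g,
    replacement: "[REDACTED]",
  },
  {
    name: "key-value",
    regex: /\b(password|passwd|api_key|token|secret)=[^\s&"']+/gi,
    replacement: "$1=[REDACTED]",
  },
  {
    name: "aws-access-key",
    regex: /\b(?:AKIA|ASIA)[A-Z0-9]{16}\b/g,
    replacement: "[REDACTED]",
  },
];

// Apply every pattern in sequence; best-effort, like the utility the PR describes.
function redactSecretsSketch(input: string): string {
  return PATTERNS.reduce(
    (text, p) => text.replace(p.regex, p.replacement),
    input,
  );
}

// Exposing the names allows a test to assert which patterns are registered.
const PATTERN_NAMES: string[] = PATTERNS.map((p) => p.name);
```

Keeping each pattern as a named entry is what would let a coverage test fail when a new pattern is added without a matching case.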
Summary
Replaces the April Fools Easter egg AI Summarize feature with a real LLM-powered summarization system designed to scale to new subjects (alerts, incidents, metrics) without schema churn.
What's in this PR
Backend (`packages/api`)
- `POST /ai/summarize`: subject-registry endpoint accepting `{ kind, content, tone?, messages? }`.
- `kind` is an enum: `event | pattern | alert`. New kinds are added by registering a system prompt in `aiSummarize.ts`.
- `messages` is an optional conversation history (capped length + size). Not wired to UI yet; it ships with the shape fixed so future follow-up-question flows don't need an API change.
- Prompt injection defense: content is scrubbed by `redactSecrets()` (scrubs `password=`, `token=`, `Bearer ...`, JWTs) and wrapped in `<data>...</data>` tags. The system prompt explicitly tells the model to treat anything inside `<data>` as data, not instructions.
- `.env.development.local` override: new `scripts/env-local-preload.js` so API dev scripts respect a gitignored override file the way Next.js does.

Frontend (`packages/app`)
- Subject definitions: `aiSummarize/subjects.ts` + per-subject formatters (`eventSubject.ts`, `patternSubject.ts`). Adding a new surface (e.g. alerts) = define a subject + render `<AISummaryPanel>` bound to `useAISummarizeState({ subject, input })`.
- `useAISummarizeState` hook: consolidates the state machine previously duplicated across both button components. Handles real-AI vs easter-egg branch, tone picker, dismiss, regenerate, input-change abort, and lazy trace-context fetch.
- Shared classifiers (`classifiers.ts`): `isErrorEvent`, `isWarnEvent`, `normalizeSeverity` that cross-check multiple signals (severity, status code, HTTP code, exception, body regex). Used to decide what to prioritize in trace context. Raw severity/body always go to the model unchanged so it can draw its own conclusion.
- Trace context: enriches trace summaries with full trace context (span groups with `count`/`errors`/`sum`/`p50` durations, up to 10 error spans with exception details) and appends it to the prompt. Capped at 4KB. Fetched lazily: only when the user clicks Summarize, not on panel open.
- Markdown rendering: AI output is rendered with `react-markdown`, with `**bold**` highlights for key details and `` `code` `` for values. Easter egg mode remains plain italic text.
- Tone picker: the `?smart=true` URL flag enables a compact tone selector (Detective Noir / Nature Documentary / Shakespearean Drama) that persists in localStorage and auto-regenerates on change. Tone values are enum-validated server-side; no freeform prompt injection.

Extensibility notes
The architecture is designed so the following changes don't require rethinking shared code:
- New subjects: define `ALERT_SUBJECT` with a `formatAlertContent`, register a prompt in the backend's `SUBJECT_PROMPTS` map, render `<AISummaryPanel>` bound to the hook. No changes to the hook or panel.
- Follow-up conversations: the API already accepts `messages: ConversationMessage[]`. Future UI sends the user's follow-up plus prior turns; no server change needed.

Security
- Prompt injection: `<data>` tag delimiters + explicit instruction to ignore embedded instructions.
- Secret redaction (`password`, `token`, Bearer, JWT).
- `db.statement` is NOT included in trace context: it is dropped even after regex redaction to avoid credential leaks, and the span body usually has enough signal.

Test coverage
- … `<data>` wrapping, conversation/messages mode, and error handling.
- … (`make dev-int FILE=aiSummarize`).

How to test locally
- Set `ANTHROPIC_API_KEY` in `packages/api/.env.development.local` (or `AI_API_KEY` + `AI_PROVIDER=anthropic`).
- `yarn dev`
- Add `?smart=true` to the URL to see the tone picker; changing it regenerates.
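
When exercising the endpoint locally, it may help to see the request shape `{ kind, content, tone?, messages? }` spelled out with enum validation. The sketch below is an assumption-heavy illustration: the wire values for `tone` (`noir`, `attenborough`, `shakespeare`) are taken from an earlier commit message, the `ConversationMessage` roles are guessed, and `parseSummarizeRequest` is a hypothetical name, not the endpoint's actual validator.

```typescript
const KINDS = ["event", "pattern", "alert"] as const;
// Assumption: tone wire values match the names used in the commit message.
const TONES = ["noir", "attenborough", "shakespeare"] as const;

type Kind = (typeof KINDS)[number];
type Tone = (typeof TONES)[number];

// Assumed message shape; the PR only names the type ConversationMessage.
interface ConversationMessage {
  role: "user" | "assistant";
  content: string;
}

interface SummarizeRequest {
  kind: Kind;
  content: string;
  tone?: Tone;
  messages?: ConversationMessage[];
}

// Hypothetical validator: rejects unknown kinds/tones instead of passing free text through.
function parseSummarizeRequest(body: unknown): SummarizeRequest {
  if (typeof body !== "object" || body === null) throw new Error("body must be an object");
  const { kind, content, tone, messages } = body as Record<string, unknown>;

  if (typeof kind !== "string" || (KINDS as readonly string[]).indexOf(kind) === -1) {
    throw new Error("kind must be one of: " + KINDS.join(", "));
  }
  if (typeof content !== "string" || content.length === 0) {
    throw new Error("content must be a non-empty string");
  }
  if (tone !== undefined && (typeof tone !== "string" || (TONES as readonly string[]).indexOf(tone) === -1)) {
    throw new Error("tone must be one of: " + TONES.join(", "));
  }

  let parsedMessages: ConversationMessage[] | undefined;
  if (messages !== undefined) {
    if (!Array.isArray(messages)) throw new Error("messages must be an array");
    parsedMessages = messages.map((m: unknown) => {
      const { role, content: text } = (m ?? {}) as Record<string, unknown>;
      if ((role !== "user" && role !== "assistant") || typeof text !== "string") {
        throw new Error("invalid message");
      }
      return { role: role as ConversationMessage["role"], content: text as string };
    });
  }

  return { kind: kind as Kind, content, tone: tone as Tone | undefined, messages: parsedMessages };
}
```

A validated object of this shape could then be sent as the JSON body of `POST /ai/summarize`; because `tone` is restricted to a closed set, a caller cannot smuggle freeform prompt text through it.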