feat: add rate of change as alert trigger condition #1943
Add a new "Rate of Change" condition type for alerts alongside the existing threshold rules. This compares the current evaluation window's value to the immediately preceding window and fires when the absolute or percentage change exceeds the configured threshold, similar to Datadog's Change Alert, Grafana's diff/percent_diff reducers, and Splunk's Sudden Change detector.

- New enums: AlertConditionType (threshold | rate_of_change), AlertChangeType (absolute | percentage)
- Zod validation requiring changeType when conditionType is rate_of_change
- Evaluation engine: extended date range for 2-window lookback, computeRateOfChange function, baseline bucket tracking in processAlert
- Frontend: condition type / change type selectors in saved search and dashboard tile alert forms, improved alert card summary on /alerts page
- Notification templates updated for rate-of-change context
- Unit tests (schema validation, computeRateOfChange, external API)
- Integration tests (API CRUD, ClickHouse evaluation with 4 scenarios)
- E2E Playwright tests for saved search and dashboard tile flows

Made-with: Cursor
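The comparison described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual computeRateOfChange: the function name, result shape, and parameters are hypothetical. It also assumes that the threshold is compared against the magnitude of the change, and that a percentage change from a zero baseline is treated as undefined (as a later commit message in this PR describes).

```typescript
// Hypothetical sketch; the PR's real computeRateOfChange signature is not shown on this page.
type ChangeType = "absolute" | "percentage";

interface RateOfChangeResult {
  // Signed change from the baseline window to the current window,
  // or null when it cannot be computed (percentage change from a zero baseline).
  change: number | null;
  // True when the magnitude of the change exceeds the threshold (an assumption).
  fires: boolean;
}

function computeRateOfChangeSketch(
  current: number,
  baseline: number,
  changeType: ChangeType,
  threshold: number,
): RateOfChangeResult {
  if (changeType === "percentage") {
    // A zero baseline makes the percentage change undefined, so never fire.
    if (baseline === 0) {
      return { change: null, fires: false };
    }
    const pct = ((current - baseline) / Math.abs(baseline)) * 100;
    return { change: pct, fires: Math.abs(pct) > threshold };
  }
  // Absolute mode: plain difference between the two windows.
  const diff = current - baseline;
  return { change: diff, fires: Math.abs(diff) > threshold };
}
```

For example, going from 100 to 150 events with a 40% threshold would fire in percentage mode, while going from 100 to 120 with an absolute threshold of 25 would not.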
🦋 Changeset detected
Latest commit: df23a99
The changes in this PR will be included in the next version bump. This PR includes changesets to release 4 packages.
Made-with: Cursor
E2E Test Results
✅ All tests passed • 131 passed • 3 skipped • 1038s
Tests ran across 4 shards in parallel.
- Remove unused imports (SavedSearch, Source, AlertConditionType)
- Fix backward-compat test: Mongoose default makes conditionType 'threshold' on new alerts, not undefined
- Add conditionType to external API snapshot test expectation
- Fix E2E locator: use 'Rate of Change' text to avoid strict mode violation from ambiguous 'change' match

Made-with: Cursor
…e-alerts

Made-with: Cursor

# Conflicts:
# packages/api/src/routers/api/alerts.ts
# packages/api/src/tasks/checkAlerts/index.ts
# packages/common-utils/src/types.ts
…hyperdx into feature/rate-of-change-alerts
The empty-bucket handler hardcoded '' as the previousBucketValues key, which silently missed alerts for all groups when a grouped rate-of-change alert's evaluation window returned no data from ClickHouse. Now iterates over all known group keys so each group's change is evaluated independently. Also deduplicates AlertConditionType and AlertChangeType enums by importing from @hyperdx/common-utils instead of redefining locally. Made-with: Cursor
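A rough sketch of the fixed behavior, under stated assumptions: the map mirrors the previousBucketValues name from the commit message, but its exact type, the function name, and the choice to treat an empty current window as a value of 0 are all hypothetical here.

```typescript
// Hypothetical sketch of evaluating every known group when the current window is empty.
// Assumption: an empty evaluation window is treated as a current value of 0 for each group.
function evaluateEmptyWindowSketch(
  previousBucketValues: Map<string, number>, // group key -> baseline value
  threshold: number,
): string[] {
  const firingGroups: string[] = [];
  // Iterate over all known group keys rather than looking up a single hardcoded '' key,
  // so each group's change is evaluated independently.
  previousBucketValues.forEach((baseline, groupKey) => {
    const change = 0 - baseline;
    if (Math.abs(change) > threshold) {
      firingGroups.push(groupKey);
    }
  });
  return firingGroups;
}
```

With grouped baselines such as svc-a = 100 and svc-b = 5, a lookup under the '' key finds nothing, which is how the earlier handler could miss every group.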
- Fix misleading log message: "insufficient baseline data" replaced with "zero baseline makes percentage change undefined" since the baseline IS present but is zero
- Error and skip evaluation when a rate-of-change alert is missing changeType instead of silently falling back to absolute mode
- Document intentional behavior when grouped RoC baseline bucket is empty (previousBucketValues stays unpopulated; has-data path handles missing baselines per-group)
- Revert unrelated formatting changes to 4 dashboard JSON templates

Made-with: Cursor
Extract 6 rate-of-change integration tests from singleInvocationAlert.test.ts (1491 lines) into a new singleInvocationRocAlert.test.ts to comply with the 300-line file guideline. Also remove now-unused AlertChangeType and AlertConditionType imports from the original file. Made-with: Cursor
The early-return guard ensures changeType is always defined before these expressions execute. Replace the misleading ?? fallback with a non-null assertion to reflect the actual invariant. Made-with: Cursor
previousCreatedAt may be more recent than 2 windows ago, so the minimum ensures the baseline window is always included in the query. Made-with: Cursor
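As a minimal sketch of that minimum (the function and parameter names are hypothetical, and the real evaluation code may align timestamps to bucket boundaries, which is omitted here):

```typescript
// Hypothetical sketch: choose the query start so that the baseline window is always covered.
function lookbackStartSketch(
  now: Date,
  windowMs: number,
  previousCreatedAt?: Date,
): Date {
  // Start of the baseline window: two evaluation windows before now.
  const twoWindowsAgo = now.getTime() - 2 * windowMs;
  // previousCreatedAt may be more recent than that, so take the earlier of the two.
  const startMs =
    previousCreatedAt === undefined
      ? twoWindowsAgo
      : Math.min(previousCreatedAt.getTime(), twoWindowsAgo);
  return new Date(startMs);
}
```

If the last evaluation was only 2 minutes ago and the window is 5 minutes, the start still falls 10 minutes back; if the last evaluation is older than that, the older timestamp is used.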
🔴 Tier 4: Critical
Touches auth, data models, config, tasks, OTel pipeline, ClickHouse, or CI/CD.
Review process: Deep review from a domain expert. Synchronous walkthrough may be required.
<Group gap="xs" mb={4}>
  <Text size="sm" opacity={0.7}>
    Condition
  </Text>
  <NativeSelect
    data={optionsToSelectData(ALERT_CONDITION_TYPE_OPTIONS)}
    size="xs"
    name={`conditionType`}
    control={control}
    data-testid="condition-type-select"
  />
  {isRateOfChange && (
    <NativeSelect
      data={optionsToSelectData(ALERT_CHANGE_TYPE_OPTIONS)}
      size="xs"
      name={`changeType`}
      control={control}
      data-testid="change-type-select"
    />
  )}
</Group>
I feel like we need to change the generated chart when assigning a rate of change alert. The chart itself should compute a rate and display that rather than borrowing from the threshold chart. This could be accomplished by computing the value for each granule, then taking the difference with a window function. Ex: value - lag(value) OVER (ORDER BY __hdx_time_bucket)
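To make the suggestion concrete, here is the same per-bucket difference written in TypeScript rather than SQL. It is only an illustration of what the chart series would contain: the type and function names are hypothetical, and the first bucket is given a null delta here because it has no predecessor.

```typescript
// Hypothetical sketch of the series a rate-of-change chart could plot:
// each bucket's value minus the previous bucket's value.
interface BucketPoint {
  bucket: number; // bucket start time, e.g. epoch seconds
  value: number;
}

interface BucketDelta {
  bucket: number;
  delta: number | null; // null for the first bucket, which has nothing to compare against
}

function bucketDeltasSketch(points: BucketPoint[]): BucketDelta[] {
  // Order by bucket time first, as the window function's ORDER BY would.
  const sorted = [...points].sort((a, b) => a.bucket - b.bucket);
  return sorted.map((point, i) => {
    const prev = i > 0 ? sorted[i - 1] : undefined;
    return {
      bucket: point.bucket,
      delta: prev === undefined ? null : point.value - prev.value,
    };
  });
}
```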
conditionType: z.nativeEnum(AlertConditionType).optional(),
changeType: z.nativeEnum(AlertChangeType).optional(),
If you want to go the extra mile, we could make AlertBaseObjectSchema a discriminated union using conditionType as the discriminator (see SourceSchema for an example). If conditionType equals 'threshold', the type system could infer that changeType is undefined, but if it equals 'rate_of_change', the type system resolves that changeType must be a value. The later superRefine may not even be needed; it would just work.
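The narrowing this comment describes can be illustrated with plain TypeScript types. This sketch deliberately swaps Zod for a hand-written parser so it has no dependencies; all type and function names are hypothetical, and defaulting a missing conditionType to 'threshold' is an assumption based on the Mongoose default mentioned in an earlier commit.

```typescript
// Hypothetical sketch of a union discriminated on conditionType (plain TypeScript, not Zod).
type ThresholdCondition = { conditionType: "threshold" };
type RateOfChangeCondition = {
  conditionType: "rate_of_change";
  changeType: "absolute" | "percentage";
};
type AlertConditionSketch = ThresholdCondition | RateOfChangeCondition;

interface RawCondition {
  conditionType?: string;
  changeType?: string;
}

function parseAlertConditionSketch(input: RawCondition): AlertConditionSketch {
  // Assumption: a missing conditionType means a plain threshold alert.
  const conditionType = input.conditionType ?? "threshold";
  if (conditionType === "threshold") {
    return { conditionType: "threshold" };
  }
  if (conditionType === "rate_of_change") {
    if (input.changeType === "absolute" || input.changeType === "percentage") {
      return { conditionType: "rate_of_change", changeType: input.changeType };
    }
    throw new Error("changeType is required when conditionType is rate_of_change");
  }
  throw new Error(`unknown conditionType: ${conditionType}`);
}

function describeConditionSketch(condition: AlertConditionSketch): string {
  switch (condition.conditionType) {
    case "threshold":
      return "threshold";
    case "rate_of_change":
      // changeType is known to be defined in this branch; no fallback or assertion needed.
      return `rate of change (${condition.changeType})`;
  }
}
```

Because the requirement lives in the type itself, a rate-of-change value without changeType cannot be constructed, which is the role the separate refinement step plays today.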
Summary
Adds Rate of Change as a new alert condition type alongside the existing threshold rules, modeled after Datadog's Change Alerts, Grafana's diff/percent_diff reducers, and Splunk's Sudden Change detectors.
- The /alerts page summary now shows the condition type (Threshold vs Rate of Change) and whether percentage change is in use.

Key areas touched:
- common-utils, api/models, api/utils/zod: new AlertConditionType and AlertChangeType enums, Zod validation, Mongoose schema updates.
- api/tasks/checkAlerts: extended date range to fetch 2 windows for comparison, new computeRateOfChange() function with absolute and percentage modes.
- app/: condition type and change type selectors in alert forms, updated alert card summary on /alerts, new AlertPreviewChart props.
- api/routers, api/controllers, api/tasks/checkAlerts/template: response includes new fields, notification titles reflect rate-of-change context.

Screenshots or video
How to test locally or on Vercel
- Run yarn dev and open the app.
- /alerts shows the correct condition type label and % suffix when percentage mode is used.
- yarn ci:unit covers schema validation, computeRateOfChange, and external API translation.
- make dev-int FILE=alerts and make dev-int FILE=singleInvocationAlert cover CRUD and ClickHouse evaluation.
- make dev-e2e FILE=alerts covers saved search and dashboard tile rate-of-change alert creation.

References