
fix: filter logs with ID field#2071

Open
karl-power wants to merge 2 commits into main from karl/id-field-in-log-filter

Conversation

@karl-power
Contributor

@karl-power karl-power commented Apr 8, 2026

Summary

Fixes an issue where clicking a log row to expand it could show the wrong row's details when multiple rows share identical visible column values (Timestamp, ServiceName, Body, etc.).

The root cause: the detail panel query reconstructs a WHERE clause from only the visible columns, without the original search filters. When rows collide on those columns, LIMIT 1 returns whichever ClickHouse finds first.

Fix: Add a __hdx_id materialized column (UInt16, random) to otel_logs that acts as a tiebreaker. The app detects this column during source inference, stores it as uniqueRowIdExpression, and injects it as a hidden column in search queries. Since the row WHERE clause builder already iterates all columns in the row data, __hdx_id is automatically included in detail queries — disambiguating rows without surfacing the column in the UI.
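To make the mechanism concrete, here is a minimal sketch (with hypothetical names; the real builder lives in DBRowTable.tsx and differs in detail) of a WHERE clause built from every column in the clicked row's data, so that a hidden __hdx_id value disambiguates otherwise-identical rows:

```typescript
// Sketch only: builds a row-scoped WHERE clause from every column in the
// clicked row's data. Because the hidden tiebreaker column ("__hdx_id")
// is part of the row data, it lands in the clause automatically. Names
// are illustrative, not the actual HyperDX implementation.
type RowData = Record<string, string | number>;

function buildRowWhereClause(row: RowData): string {
  return Object.entries(row)
    .map(([col, val]) =>
      typeof val === 'number'
        ? `${col} = ${val}`
        : `${col} = '${String(val).replace(/'/g, "\\'")}'`,
    )
    .join(' AND ');
}

// Two rows identical on all visible columns still differ on __hdx_id:
const where = buildRowWhereClause({
  Timestamp: '2026-04-08 12:00:00',
  ServiceName: 'api',
  Body: 'health check ok',
  __hdx_id: 41231, // hidden tiebreaker, never shown in the UI
});
```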

Key changes:

  • Schema: Add __hdx_id to otel_logs seed + CH migration
  • Source inference (source.ts): Auto-detect __hdx_id and set uniqueRowIdExpression
  • Query builder (DBRowTable.tsx): Generalize appendSelectWithPrimaryAndPartitionKey into appendSelectWithAdditionalKeys with an extraKeys param; use it to inject uniqueRowIdExpression into the SELECT (hidden from the UI, but included in the row data used to build the WHERE clause)
  • Source form (SourceForm.tsx): Expose uniqueRowIdExpression field for manual configuration
  • Tests: Cover extraKeys behavior and inferTableSourceConfig __hdx_id detection
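The extraKeys generalization can be sketched roughly as follows (a simplified stand-in for the actual appendSelectWithAdditionalKeys in DBRowTable.tsx; the point is the dedup against columns already selected):

```typescript
// Illustrative sketch: append extra keys (e.g. uniqueRowIdExpression) to
// a SELECT column list, skipping any key that is already present so no
// column is selected twice. Not the real DBRowTable.tsx implementation.
function appendSelectWithAdditionalKeys(
  select: string[],
  extraKeys: string[],
): string[] {
  const seen = new Set(select);
  const appended = extraKeys.filter(key => {
    if (seen.has(key)) return false;
    seen.add(key); // also dedupes repeats within extraKeys itself
    return true;
  });
  return [...select, ...appended];
}

const cols = appendSelectWithAdditionalKeys(
  ['Timestamp', 'ServiceName', 'Body'],
  ['__hdx_id', 'Timestamp'], // Timestamp already selected, so it is skipped
);
```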

How to test locally

First, reproduce the problem on main:

  1. Pick any log row from otel_logs. Go to the ClickHouse UI and query for that row - the easiest way is to copy the query issued when clicking a row to open the sidebar.
  2. Insert the same row into otel_logs but change one nested value - I changed ResourceAttributes.host.arch from arm64 to arm65.
  3. Go back to the search page and query for a log with a ResourceAttributes.host.arch of arm65. One result should appear. Click it to open the sidebar.
  4. The log detail view in the sidebar shows arm64 - it selected the wrong row.

Then, on this branch:

  1. Build the otel collector (make dev-build) and restart it.
  2. Repeat the steps above.
  3. The log detail view in the sidebar now shows arm65 - it selected the correct row.
  4. Test editing the Unique Row Identifier Expression value in the log source configuration to simulate another source schema, e.g. __hdx_id as test_id.


@changeset-bot

changeset-bot Bot commented Apr 8, 2026

🦋 Changeset detected

Latest commit: d790f62

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 3 packages
Name                     Type
@hyperdx/api             Patch
@hyperdx/app             Patch
@hyperdx/otel-collector  Patch



@vercel

vercel Bot commented Apr 8, 2026

The latest updates on your projects.

Project      Deployment  Actions           Updated (UTC)
hyperdx-oss  Ready       Preview, Comment  Apr 28, 2026 2:06am


@github-actions
Contributor

github-actions Bot commented Apr 8, 2026

E2E Test Results

All tests passed • 158 passed • 3 skipped • 1240s

Status Count
✅ Passed 158
❌ Failed 0
⚠️ Flaky 5
⏭️ Skipped 3

Tests ran across 4 shards in parallel.


@karl-power karl-power force-pushed the karl/id-field-in-log-filter branch from a7ee44f to 3151b4d Compare April 9, 2026 09:00
@karl-power karl-power marked this pull request as ready for review April 9, 2026 13:53
@github-actions github-actions Bot added the review/tier-4 Critical — deep review + domain expert sign-off label Apr 9, 2026
@github-actions
Contributor

github-actions Bot commented Apr 9, 2026

🔴 Tier 4 — Critical

Touches auth, data models, config, tasks, OTel pipeline, ClickHouse, or CI/CD.

Why this tier:

  • Critical-path files (1):
    • docker/otel-collector/schema/seed/00002_otel_logs.sql
  • Cross-layer change: touches frontend (packages/app) + backend (packages/api)

Review process: Deep review from a domain expert. Synchronous walkthrough may be required.
SLA: Schedule synchronous review within 2 business days.

Stats
  • Production files changed: 7
  • Production lines changed: 72 (+ 159 in test files, excluded from tier calculation)
  • Branch: karl/id-field-in-log-filter
  • Author: karl-power

To override this classification, remove the review/tier-4 label and apply a different review/tier-* label. Manual overrides are preserved on subsequent pushes.

@karl-power karl-power requested a review from wrn14897 April 9, 2026 13:54
@github-actions
Contributor

github-actions Bot commented Apr 9, 2026

PR Review

  • UInt16 (65,536 values) is too narrow for row disambiguation → High-volume systems with repeated identical log lines (health checks, polling) will trigger collisions via the birthday paradox (~50% chance with just ~300 same-timestamp/service/body rows). rand() already returns UInt32; drop the toUInt16() cast in both the seed SQL and migration to get 4 billion distinct values with zero extra storage cost.

  • ⚠️ HDX_ID_COLUMN constant defined inside inferTableSourceConfig → Minor but it's a fixed magic string — move it to module scope so other parts of the codebase can reference it without re-declaring.

  • useSource is correctly guarded with enabled: id != null, so no spurious API calls when sourceId is undefined.

  • ✅ Migration uses IF NOT EXISTS / IF EXISTS guards, test coverage is solid, deduplication logic in appendSelectWithAdditionalKeys looks correct.
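The collision estimate in the first review point can be checked with the standard birthday approximation. The sketch below (editorial illustration, not code from the PR) computes the probability that at least two of n rows draw the same value from N equally likely possibilities:

```typescript
// Birthday-paradox approximation: probability that at least two of n
// independent uniform draws from N values collide,
//   p ≈ 1 - exp(-n(n-1) / (2N)).
function collisionProbability(n: number, N: number): number {
  return 1 - Math.exp((-n * (n - 1)) / (2 * N));
}

// ~300 rows sharing identical visible columns, UInt16 (2^16) tiebreaker:
const pUInt16 = collisionProbability(300, 2 ** 16); // roughly 0.5
// Same scenario with a UInt32 (2^32) tiebreaker:
const pUInt32 = collisionProbability(300, 2 ** 32); // roughly 1e-5
```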

@karl-power
Contributor Author

@wrn14897 Do you know if the migrations are needed here? I tried to run them locally which invokes migrate, but I don't see that installed anywhere.

Member

@wrn14897 wrn14897 left a comment


I think we should use the UUID type for the id column (https://clickhouse.com/docs/sql-reference/data-types/uuid). Using toUInt16(rand()) only provides 65,536 possible values, which could lead to collisions.

I’d also suggest raising this with the team, since it involves a schema change. It would be good to make sure we all align on the approach for the ID field before moving forward. cc @knudtty

@karl-power
Contributor Author

karl-power commented Apr 13, 2026

I think we should use the UUID type for the id column

@MikeShi42 do you have thoughts on this as you originally suggested UInt16?

I'm guessing the reasoning was that UInt16 combined with the other fields from the search query would be enough to reduce the chance of a collision, with the benefit of less storage than UUID

@wrn14897
Member

I think we should use the UUID type for the id column

@MikeShi42 do you have thoughts on this as you originally suggested UInt16?

I'm guessing the reasoning was that UInt16 combined with the other fields from the search query would be enough to reduce the chance of a collision, with the benefit of less storage than UUID

I understand the reason for reducing storage (to achieve a better compression ratio), and that this ID is only used internally. Also, as you mentioned, the chance of collisions would be lower if the search is always combined with other fields. I just want to confirm that this is intentional.
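For context on the storage tradeoff being weighed here, rough uncompressed per-column arithmetic (editorial illustration; real on-disk size depends on ClickHouse compression, and random values compress poorly):

```typescript
// Raw column width of the candidate ID types, and the uncompressed cost
// per billion rows in decimal gigabytes. Compression will shrink these,
// but random data compresses worst of all.
const bytesPerValue = { UInt16: 2, UInt32: 4, UUID: 16 } as const;
const ROWS = 1e9;

const rawGBPerBillionRows = Object.fromEntries(
  Object.entries(bytesPerValue).map(([type, bytes]) => [
    type,
    (bytes * ROWS) / 1e9, // bytes -> GB
  ]),
);
// UInt16: 2 GB, UInt32: 4 GB, UUID: 16 GB per billion rows (uncompressed)
```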

@karl-power
Contributor Author

@wrn14897 I have shared this with the team but haven't received any feedback yet. Are you happy to proceed like this, or would you prefer to use UUID?

@MikeShi42
Contributor

@karl-power instead of creating a new column, any thoughts on if we should instead just use _block_number and _block_offset and enabling enable_block_number_column and enable_block_offset_column by default in new tables? This will technically allow other features as well within ClickHouse - so more useful than a random id.

We can tell existing users to run an update to add this new random id column (as suggested in this PR) if they haven't set up their table to use block number/offsets.

@karl-power
Contributor Author

karl-power commented Apr 28, 2026

@karl-power instead of creating a new column, any thoughts on if we should instead just use _block_number and _block_offset and enabling enable_block_number_column and enable_block_offset_column by default in new tables? This will technically allow other features as well within ClickHouse - so more useful than a random id.

We can tell existing users to run an update to add this new random id column (as suggested in this PR) if they haven't set up their table to use block number/offsets.

@MikeShi42 Yeah I think the block_number/offset combo would work. So you're suggesting to remove the UI control for the unique identifier and try to fallback to __hdx_id if the block settings aren't enabled?
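As a purely hypothetical sketch of the fallback being discussed (the function name, inputs, and detection flags are assumed for illustration and are not code from this PR):

```typescript
// Hypothetical fallback: prefer ClickHouse's native _block_number /
// _block_offset columns when the table was created with
// enable_block_number_column / enable_block_offset_column, otherwise
// fall back to a __hdx_id column if one exists.
function pickUniqueRowIdExpression(opts: {
  hasBlockColumns: boolean;
  hasHdxId: boolean;
}): string | undefined {
  if (opts.hasBlockColumns) return '_block_number, _block_offset';
  if (opts.hasHdxId) return '__hdx_id';
  return undefined; // no tiebreaker; detail query may stay ambiguous
}
```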


Labels

review/tier-4 Critical — deep review + domain expert sign-off

Projects

None yet

Development

Successfully merging this pull request may close these issues.

‘expand log line’ not use all filter.

4 participants