fix: filter logs with ID field #2071
Conversation
🦋 Changeset detected. Latest commit: d790f62. The changes in this PR will be included in the next version bump. This PR includes changesets to release 3 packages.
E2E Test Results: ✅ All tests passed • 158 passed • 3 skipped • 1240s
Tests ran across 4 shards in parallel.
a7ee44f to 3151b4d
🔴 Tier 4: Critical. Touches auth, data models, config, tasks, OTel pipeline, ClickHouse, or CI/CD.
Review process: Deep review from a domain expert. Synchronous walkthrough may be required.
@wrn14897 Do you know if the migrations are needed here? I tried to run them locally which invokes |
3151b4d to 8772bd3
wrn14897 left a comment
I think we should use the UUID type for the id column (https://clickhouse.com/docs/sql-reference/data-types/uuid). Using toUInt16(rand()) only provides 65,536 possible values, which could lead to collisions.
I’d also suggest raising this with the team, since it involves a schema change. It would be good to make sure we all align on the approach for the ID field before moving forward. cc @knudtty
@MikeShi42 do you have thoughts on this, as you originally suggested UInt16? I'm guessing the reasoning was that UInt16 combined with the other fields from the search query would be enough to reduce the chance of a collision, with the benefit of less storage than UUID.
I understand the reason for reducing storage (to achieve a better compression ratio), and that this ID is only used internally. Also, as you mentioned, the chance of collisions would be lower if the search is always combined with other fields. I just want to confirm that this is intentional.
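For a rough sense of the trade-off discussed above, the chance of a collision among rows that already match on every other field can be estimated with the birthday problem. This is a minimal sketch, assuming the IDs are uniformly distributed over the 65,536 UInt16 values; `collisionProbability` is a hypothetical helper for illustration and not part of this PR.

```typescript
// Probability that at least two of `n` rows share the same random ID,
// when each ID is drawn uniformly from `space` possible values.
function collisionProbability(n: number, space: number): number {
  if (n <= 1) return 0; // a single row cannot collide with itself
  if (n > space) return 1; // pigeonhole: more rows than distinct IDs
  let pNoCollision = 1;
  for (let i = 1; i < n; i++) {
    // The (i+1)-th row must avoid the i IDs already taken.
    pNoCollision *= (space - i) / space;
  }
  return 1 - pNoCollision;
}

const UINT16_SPACE = 65_536; // 2^16 values produced by toUInt16(rand())

// `n` is the number of rows that are otherwise identical on the other columns.
for (const n of [2, 10, 100, 302]) {
  console.log(n, collisionProbability(n, UINT16_SPACE));
}
```

With only a handful of otherwise-identical rows the probability stays small, but it approaches one half at around 300 such rows, which is why the ID only works as a tiebreaker alongside the other fields.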
8772bd3 to 5d75767
@wrn14897 I have shared with the team but didn't receive any feedback yet. Are you happy to proceed like this or prefer to use |
@karl-power instead of creating a new column, any thoughts on if we should instead just use We can tell existing users to run an update to add this new random id column (as suggested in this PR) if they haven't set up their table to use block number/offsets. |
@MikeShi42 Yeah I think the |
Summary
Fixes an issue where clicking a log row to expand it could show the wrong row's details when multiple rows share identical visible column values (Timestamp, ServiceName, Body, etc.).
The root cause: the detail panel query reconstructs a WHERE clause from only the visible columns, without the original search filters. When rows collide on those columns, `LIMIT 1` returns whichever ClickHouse finds first.

Fix: Add a `__hdx_id` materialized column (UInt16, random) to `otel_logs` that acts as a tiebreaker. The app detects this column during source inference, stores it as `uniqueRowIdExpression`, and injects it as a hidden column in search queries. Since the row WHERE clause builder already iterates all columns in the row data, `__hdx_id` is automatically included in detail queries, disambiguating rows without surfacing the column in the UI.

Key changes:
- `__hdx_id` to `otel_logs` seed + CH migration
- `source.ts`: Auto-detect `__hdx_id` and set `uniqueRowIdExpression`
- `DBRowTable.tsx`: Generalize `appendSelectWithPrimaryAndPartitionKey` → `appendSelectWithAdditionalKeys` with an `extraKeys` param; use it to inject `uniqueRowIdExpression` into the SELECT (hidden from UI, but included in the row data used to build the WHERE clause)
- `SourceForm.tsx`: Expose `uniqueRowIdExpression` field for manual configuration
- `extraKeys` behavior and `inferTableSourceConfig` `__hdx_id` detection

How to test locally
First reproduce problem on `main`.
- […] `otel_logs`. Go to CH UI and query the row; easiest is to copy the query performed when clicking a row to open the sidebar.
- […] `otel_logs` but change one nested value. I changed `ResourceAttributes.host.arch` from `arm64` to `arm65`.
- […] `ResourceAttributes.host.arch` of `arm65`. One result should appear. Click to open sidebar.
- […] `arm64`: it selected the wrong row.

On this branch:
- […] (`make dev-build`) and restart.
- […] `arm65`: it selected the correct row.
- […] `Unique Row Identifier Expression` value in log source configuration to simulate another source schema, e.g. `__hdx_id as test_id`
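The reproduction above comes down to two rows that yield the same WHERE clause when only the visible columns are used. The following is a minimal sketch of that behavior, not the actual code in `DBRowTable.tsx`: `buildRowWhere` is a hypothetical helper, and the row values and ID numbers are made up for illustration.

```typescript
type Row = Record<string, string | number>;

// Hypothetical helper: builds an equality WHERE clause from every column
// present in the row data (numbers unquoted, strings single-quoted).
function buildRowWhere(row: Row): string {
  return Object.entries(row)
    .map(([col, val]) =>
      typeof val === "number"
        ? `${col} = ${val}`
        : `${col} = '${val.replace(/'/g, "\\'")}'`,
    )
    .join(" AND ");
}

// Two distinct rows that differ only in a nested value that is not selected
// (e.g. ResourceAttributes.host.arch), so their visible columns are identical.
const visibleA: Row = {
  Timestamp: "2025-01-01 00:00:00",
  ServiceName: "api",
  Body: "request done",
};
const visibleB: Row = { ...visibleA };

// Without a tiebreaker both clauses are the same string, so LIMIT 1 may
// return either row.
const ambiguous = buildRowWhere(visibleA) === buildRowWhere(visibleB);

// With the hidden __hdx_id column in the row data, the clauses differ.
const withIdA: Row = { ...visibleA, __hdx_id: 1234 };
const withIdB: Row = { ...visibleB, __hdx_id: 40321 };
const disambiguated = buildRowWhere(withIdA) !== buildRowWhere(withIdB);

console.log({ ambiguous, disambiguated });
console.log(buildRowWhere(withIdA));
```

Because the builder iterates over whatever columns are present in the row data, selecting the ID as a hidden column is enough to get it into the detail query without changing the builder itself.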