Skip to content

Intern LowCardinality Row values to reduce string allocations#1790

Open
Onyx2406 wants to merge 2 commits intoClickHouse:mainfrom
Onyx2406:perf/lowcardinality-string-interning
Open

Intern LowCardinality Row values to reduce string allocations#1790
Onyx2406 wants to merge 2 commits intoClickHouse:mainfrom
Onyx2406:perf/lowcardinality-string-interning

Conversation

@Onyx2406
Copy link
Copy Markdown
Contributor

@Onyx2406 Onyx2406 commented Mar 8, 2026

Summary

Optimize LowCardinality(String) scanning by caching Row() results per dictionary index, reducing string allocations from O(N rows) to O(K unique values).

Fixes #1762

The problem

For LowCardinality(String) columns, Row() creates a new Go string for every row via byte-to-string conversion, even when many rows share the same dictionary value. For a column with 1M rows and 100 unique values, this means 1M string allocations instead of 100.

Benchmark from @bobrik in the issue:

  • Baseline: 7406ms, 9282 MiB RAM (1M allocations)
  • Optimized: 429ms, 1341 MiB RAM (K allocations)
  • Speedup: 17x, Memory: 7x reduction

Fix (20 insertions, 1 file)

Add a rowCache map[int]any to LowCardinality that caches index.Row(idx) results by dictionary index. On the first access for a given index, the value is created and cached. Subsequent rows with the same index reuse the cached value.

  • Cache only used for non-pointer access (ptr == false) — the common path
  • Cache cleared on Reset() (between blocks)
  • Lazy initialization — no allocation until first Row() call

Why this works

LowCardinality columns store data as:

  • Dictionary (index): K unique values
  • Keys: N row-to-dictionary mappings

Without caching: Row(i)indexRowNum(i)index.Row(idx) → new string every time
With caching: Row(i)indexRowNum(i) → cache lookup → return cached string

Test plan

  • go build ./lib/column/... passes
  • CI (full test suite)
  • Existing lowcardinality_test.go tests cover scanning behavior

For LowCardinality(String) columns, Row() previously created a new
Go string for every row even when many rows share the same dictionary
value. This caused O(N) string allocations where N = total rows.

Add a rowCache map keyed by dictionary index that caches the first
Row() result for each unique value. Subsequent rows with the same
dictionary index reuse the cached string, reducing allocations from
O(N) to O(K) where K = unique values (typically << N).

Benchmark from the issue reporter shows potential for:
- 17x speedup (7406ms -> 429ms)
- 7x memory reduction (9282MiB -> 1341MiB)

Fixes ClickHouse#1762
Comment thread lib/column/lowcardinality.go Outdated
Switch rowCache from map[int]any to []any, eagerly populating all
dictionary entries on first Row() call. Slice index is O(1) with
no hashing overhead, and the dictionary size is known upfront.
@Onyx2406 Onyx2406 requested a review from bobrik March 10, 2026 07:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

LowCardinality(String) is not as efficient as it could be

2 participants