perf: partition-aware bloom filtering for point-read pipeline#102
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (5)
✅ Files skipped from review due to trivial changes (2)
🚧 Files skipped from review as they are similar to previous changes (2)
📝 WalkthroughWalkthroughAdds an optional Changes
Sequence Diagram(s)sequenceDiagram
participant Caller as Iterator/Caller
participant Table as Table
participant Index as PartitionIndex
participant Disk as Disk/IO
participant Filter as FilterBlock
Caller->>Table: bloom_may_contain_key(key, key_hash)
Table->>Table: debug_assert hash == Builder::get_hash(key)
alt pinned_filter_block present
Table->>Filter: bloom_may_contain_hash(key_hash)
Filter-->>Table: bool / Err
Table-->>Caller: Ok(bool) / Err
else pinned_filter_index present
Table->>Index: seek(key, seqno::MAX_SEQNO)
Index-->>Table: ceiling partition entry / None
alt ceiling partition found
Table->>Disk: load partition's filter block bytes
Disk-->>Filter: filter bytes (uncompressed)
Table->>Filter: maybe_contains_hash(key_hash)
Filter-->>Table: bool / Err
Table-->>Caller: Ok(bool) / Err
else no ceiling partition
Table-->>Caller: Ok(false)
end
else fallback
Table->>Table: bloom_may_contain_hash(key_hash)
Table-->>Caller: Ok(bool) / Err
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
⚠️ Performance Alert ⚠️
Possible performance regression was detected for benchmark 'lsm-tree db_bench'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.15.
| Benchmark suite | Current: 0e9a2a3 | Previous: befb450 | Ratio |
|---|---|---|---|
mergerandom |
564209.4550305106 ops/sec |
718056.7379855699 ops/sec |
1.27 |
This comment was automatically generated by workflow using github-action-benchmark.
CC: @polaz
There was a problem hiding this comment.
Pull request overview
Improves point-read merge-resolution performance for partitioned bloom filters by adding a key-aware bloom check that can seek the filter partition index, and plumbing the user key through the point-read iterator pipeline so only the relevant bloom partition is queried.
Changes:
- Add
Table::bloom_may_contain_key(key, key_hash)to support partition-aware bloom checks via partition-index seeking. - Extend
IterStatewithbloom_keyand dispatchbloom_passes()to the key-aware bloom path when available. - Add integration tests intended to validate partitioned bloom skipping behavior.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
src/table/mod.rs |
Adds key-aware, partition-seeking bloom check for partitioned filters. |
src/range.rs |
Wires optional bloom_key into the bloom pre-filter decision logic. |
src/tree/mod.rs |
Populates IterState.bloom_key for single-key merge-resolution pipeline. |
tests/partitioned_bloom_skip.rs |
Adds tests for partitioned bloom skip behavior. |
40affb7 to
17914cd
Compare
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
ca1bb85 to
bf28b64
Compare
0316119 to
8fdf161
Compare
7de32e8 to
07c814b
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/partitioned_bloom_skip.rs`:
- Around line 137-148: The test currently uses a distractor SST key
("zzz_other") that is pruned by table.check_key_range_overlap before
bloom_may_contain_key runs; change the distractor so it passes the range-check
but fails the bloom check (for example insert a key that shares the same
key-range as "counter" but is not the bloom_key—e.g. "counter_other" or another
lexicographically adjacent key that falls within the table's min/max), or
alternatively update the test/comment to state it only verifies merge
correctness (rename test). Update the insert call in the failing test (the
tree.insert that creates the distractor SST) and its comment, and ensure the new
key triggers resolve_merge_via_pipeline while allowing TreeIter::create_range ->
bloom_may_contain_key to execute; references: TreeIter::create_range,
table.check_key_range_overlap, bloom_may_contain_key, and the test's
merge/insert sequence (merge("counter", ...), insert of distractor).
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 207b4aa9-7db2-49ed-b0ce-0aa43cf1a184
📒 Files selected for processing (5)
src/range.rssrc/table/mod.rssrc/table/tests.rssrc/tree/mod.rstests/partitioned_bloom_skip.rs
|
@coderabbitai re-review |
|
🧠 Learnings used✅ Actions performedFull review triggered. |
a76b0cf to
0e9a2a3
Compare
Add bloom_may_contain_key() which seeks the partitioned filter TLI by user key to check only the matching partition's bloom — replacing the conservative Ok(true) fallback that bloom_may_contain_key_hash() uses for partitioned/TLI filter configurations. - Add bloom_key field to IterState, set by resolve_merge_via_pipeline - bloom_passes() dispatches to key-aware method when bloom_key present - Keys beyond all partition boundaries return Ok(false) (definite miss) - Full (non-partitioned) filters delegate to existing hash-only path Closes #83
SeqNo::MAX (u64::MAX) violates the reserved MSB range invariant. Use crate::seqno::MAX_SEQNO (0x7FFF_FFFF_FFFF_FFFF) which is the safe upper bound for "latest" seeks. Also add merge pipeline test that exercises the bloom_may_contain_key path through resolve_merge_via_pipeline → TreeIter → bloom_passes, and clarify existing test docs to distinguish Table::get vs pipeline bloom paths.
- Eliminate redundant key_slice clone by deriving range from bloom_key - Replace unwrap_or_default() with .expect() in test merge operator for explicit failure on malformed data - Relax exact filter_queries assertion to >= 1 (counter may vary with internal filter probes) - Add clarifying comment for seek+next None → Ok(false) correctness - Add full_filter_bloom_skip_merge_pipeline test covering the bloom_may_contain_key delegation path for non-partitioned filters
- Correct test docstrings: Table::get path tests don't claim bloom_may_contain_key; merge pipeline tests explain the code path - Document why pipeline tests assert correctness not metrics (io_skipped_by_filter is only incremented by Table::get, not by bloom_passes in the pipeline path)
Cover full-filter delegation and partitioned-filter seek+reject paths directly at the Table level, including Ok(false) for keys beyond all partition boundaries.
Key beyond table key range is rejected by key-range overlap check before bloom. The bloom_may_contain_key Ok(false) path for keys beyond partitions is covered by unit test in table::tests.
- Assert bloom_may_contain_key_hash returns Ok(true) conservatively for partitioned filters while bloom_may_contain_key returns Ok(false) — demonstrates the core behavioral improvement - Assert both methods agree for full (non-partitioned) filters - Extract SumMerge to module level to avoid duplication
- Add comment explaining UserKey (Slice) Deref<Target=[u8]> coercion in bloom_passes to prevent false "won't compile" review flags - Document key_hash pre-computation contract in bloom_may_contain_key doc comment (same pattern as Table::get)
Replace distractor key "zzz_other" with bracketing keys "aaa"+"zzz" that make key_range_overlap pass, ensuring bloom_may_contain_key is the actual filter step in the merge pipeline tests.
Slice::from(&[u8]) copies the key (not zero-copy). Document that this runs once per merge resolution and is negligible compared to I/O savings from partition-aware bloom filtering.
…ssertion - Add debug_assert that bloom_key requires key_hash to be set - Document that pinned_filter_block and pinned_filter_index are mutually exclusive (set at construction time) - Relax exact io_skipped_by_filter assertion to filter_queries >= 1 since bloom filters are probabilistic (FPR ~0.8% at 10 bpk)
…ition assertion - Add debug_assert_eq verifying key_hash matches BloomBuilder::get_hash(key) in bloom_may_contain_key to catch misuse in debug builds - Gate beyond-partitions Ok(false) assertion on pinned_filter_index.is_some() since test_with_table runs multiple table configurations (pinned/unpinned)
filter_queries counts lookups (not just rejects), so docstring should say "filter lookup occurred" not "filter rejects the table".
pinned_filter_index is loaded unconditionally in Table::recover when filter_tli exists, so the else branch (hash-only fallback) was dead code in the partitioned filter test.
0e9a2a3 to
8634930
Compare
Summary
Table::bloom_may_contain_key(key, key_hash)— seeks the partitioned filter TLI by user key and checks only the matching partition's bloom filter, replacing the conservativeOk(true)fallbackbloom_keyfield toIterState, populated byresolve_merge_via_pipelinefor single-key point-read pipelinesbloom_passes()dispatches to the key-aware method whenbloom_keyis available, falls back to hash-only path otherwisedebug_assertensuresbloom_keyis never set withoutkey_hashTechnical Details
Previously,
bloom_may_contain_key_hashreturnedOk(true)for partitioned/TLI filter configurations because the partition index is keyed by user key boundaries, not by raw hash — checking by hash alone would require scanning all partitions. The newbloom_may_contain_keymethod accepts the actual user key, seeks the TLI to the correct partition in O(log P), and queries only that partition's bloom filter. Keys beyond all partition boundaries returnOk(false)(definite miss).The existing
bloom_may_contain_key_hash(hash-only) path is preserved unchanged for callers that don't have the key available (e.g. prefix scans).pinned_filter_blockandpinned_filter_indexare mutually exclusive (set at construction time), so the branch order inbloom_may_contain_keyis safe.Slice::from(key)in the merge pipeline copies the key once per resolution (not zero-copy), but the cost is negligible compared to I/O savings.Known Limitations
resolve_merge_via_pipelinesetsbloom_key— general range scans still use hash-only bloom pre-filtering (which is correct but less effective for partitioned filters)unimplemented!for unpinned TLI inTable::get)Test Plan
partitioned_bloom_skip_for_point_reads— verifies bloom filter is queried for non-matching key with partitioned filters (metrics:filter_queries >= 1)partitioned_bloom_skip_beyond_partitions— verifies key beyond all partition boundaries is correctly rejectedpartitioned_bloom_skip_merge_pipeline— exercisesbloom_may_contain_keythrough the merge pipeline with bracketing distractor keysfull_filter_bloom_skip_merge_pipeline— covers the full-filter delegation path through the merge pipelinebloom_may_contain_key_full_filter— unit test: both methods agree for full filtersbloom_may_contain_key_partitioned_filter— unit test: contrast assertion proving key-based rejects while hash-only returns conservativeOk(true)Closes #83
Summary by CodeRabbit
Performance Improvements
New Features
Tests