refactor: unify merge resolution via bloom-filtered iterator pipeline#69

Merged: polaz merged 10 commits into main from feat/#46-refactor-unify-merge-resolution-via-bloom-filtered on Mar 22, 2026

polaz (Member) commented Mar 22, 2026

Summary

  • Replace hand-rolled resolve_merge_get() with resolve_merge_via_pipeline() that reuses Merger → MvccStream on a key..=key range
  • Add standard bloom pre-filtering (Table::bloom_may_contain_key_hash()) to skip many disk tables for point reads
  • Eliminate duplicated operand collection / RT suppression / Indirection logic between point reads and range scans

Net -143 lines — merge resolution now lives in one place (MvccStream).
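
To illustrate the idea of treating a point read as a scan over a `key..=key` range, here is a minimal sketch. It is not the crate's code: `Entry`, `InternalKey`, `resolve_point_read`, and `add_merge` are hypothetical names, a `BTreeMap` stands in for the merged iterator, and visibility is assumed to mean "seqno less than or equal to the snapshot".

```rust
use std::cmp::Reverse;
use std::collections::BTreeMap;

/// Hypothetical entry kinds, loosely modeled on the value types named in this PR.
#[derive(Debug)]
enum Entry {
    Value(i64),
    MergeOperand(i64),
    Tombstone,
}

/// User key plus sequence number; `Reverse` sorts newer versions first.
type InternalKey = (String, Reverse<u64>);

/// Example merge operator (an assumption): adds all operands onto the base.
fn add_merge(base: Option<i64>, operands: &[i64]) -> i64 {
    base.unwrap_or(0) + operands.iter().sum::<i64>()
}

/// Resolves `key` as of `snapshot` by scanning the single-key inclusive range,
/// collecting merge operands until a base value or tombstone ends the scan.
fn resolve_point_read(
    store: &BTreeMap<InternalKey, Entry>,
    key: &str,
    snapshot: u64,
    merge: impl Fn(Option<i64>, &[i64]) -> i64,
) -> Option<i64> {
    // Every version of `key` with seqno <= snapshot, newest first.
    let lo = (key.to_string(), Reverse(snapshot));
    let hi = (key.to_string(), Reverse(0));

    let mut operands = Vec::new();
    let mut base = None;
    for (_, entry) in store.range(lo..=hi) {
        match entry {
            Entry::MergeOperand(v) => operands.push(*v),
            Entry::Value(v) => {
                base = Some(*v);
                break;
            }
            Entry::Tombstone => break,
        }
    }
    if operands.is_empty() {
        return base; // plain value, deleted key, or missing key
    }
    operands.reverse(); // operands were collected newest-first; apply oldest-first
    Some(merge(base, &operands))
}
```

With a base value of 10 at seqno 1 and operands 5 and 7 at seqnos 2 and 3, this sketch yields `Some(22)` at snapshot 3 and `Some(10)` at snapshot 1. Range tombstones and indirection are left out on purpose.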

Changes

| File | What |
|---|---|
| `table/mod.rs` | Extract `bloom_may_contain_hash()` base, add `bloom_may_contain_key_hash()` |
| `range.rs` | Add `key_hash` to `IterState`, `bloom_passes()` helper for unified prefix+key bloom |
| `tree/mod.rs` | `resolve_merge_via_pipeline()` replaces ~150-line `resolve_merge_get()` |
| `memtable/mod.rs` | Remove unused `get_all_for_key()` and its tests |
| `tests/merge_operator.rs` | Update comments referencing old function name |
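
The `table/mod.rs` row describes one hash-based base check with thin wrappers on top. A rough sketch of that shape follows, under loud assumptions: the `Bloom` struct, its bit layout, the FNV-1a `hash64` helper, and the double-hashing probe scheme are all illustrative and not the crate's filter format, and the real methods are described as returning `Result<bool>` rather than `bool`.

```rust
/// Toy Bloom filter; layout and hashing are illustrative only.
struct Bloom {
    bits: Vec<bool>,
    probes: u64,
}

/// FNV-1a, used here purely as a stand-in 64-bit hash.
fn hash64(bytes: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf2_9ce4_8422_2325;
    for &b in bytes {
        h ^= u64::from(b);
        h = h.wrapping_mul(0x0000_0100_0000_01b3);
    }
    h
}

/// i-th probe position derived from a single hash via double hashing.
fn probe(hash: u64, i: u64, num_bits: usize) -> usize {
    let step = hash.rotate_left(32) | 1;
    (hash.wrapping_add(i.wrapping_mul(step)) % (num_bits as u64)) as usize
}

impl Bloom {
    fn new(num_bits: usize, probes: u64) -> Self {
        assert!(num_bits > 0 && probes > 0);
        Self { bits: vec![false; num_bits], probes }
    }

    fn insert_hash(&mut self, hash: u64) {
        for i in 0..self.probes {
            let p = probe(hash, i, self.bits.len());
            self.bits[p] = true;
        }
    }

    /// Shared base check: both wrappers below funnel into this.
    fn bloom_may_contain_hash(&self, hash: u64) -> bool {
        (0..self.probes).all(|i| self.bits[probe(hash, i, self.bits.len())])
    }

    /// Wrapper for callers holding a prefix (hashes it, then delegates).
    fn maybe_contains_prefix(&self, prefix: &[u8]) -> bool {
        self.bloom_may_contain_hash(hash64(prefix))
    }

    /// Wrapper for callers that already computed the key hash once.
    fn bloom_may_contain_key_hash(&self, key_hash: u64) -> bool {
        self.bloom_may_contain_hash(key_hash)
    }
}
```

Taking a precomputed hash means a point read can hash the key once and reuse it for every table it probes. A `false` answer is definitive, while `true` only means "possibly present".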

Test plan

  • cargo check — 0 warnings, 0 errors
  • cargo test — 757 passed, 0 failed
  • All 44 merge operator tests pass unchanged
  • Benchmark point-read latency on 100-table tree within 5% of baseline

Closes #46

Summary by CodeRabbit

  • Improvements

    • Enhanced bloom-filter pre-filtering for single- and multi-table scans with optional key-hash checks and consolidated pass/fail logic; prefix-based skip metrics adjusted.
  • Refactor

    • Merge resolution unified into a pipeline-based point-read path; obsolete per-key retrieval pathway removed.
  • Tests

    • Added and updated tests validating prefix/bloom behavior and merge resolution with overlapping/non-matching tables.
  • Chores

    • Added a benchmark for merge point-read performance.

Copilot AI review requested due to automatic review settings March 22, 2026 08:20

coderabbitai Bot commented Mar 22, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

Refactors point-read merge resolution to use a bloom-filtered single-key iterator pipeline with optional key-hash prefilter, adds IterState.key_hash, introduces table bloom APIs for raw key hashes, removes Memtable::get_all_for_key, and updates tests and benchmarks to exercise the new pipeline and bloom behavior.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Merge resolution pipeline: `src/tree/mod.rs` | Removed `Tree::resolve_merge_get`; added `Tree::resolve_merge_via_pipeline` that builds an inclusive single-key range, sets `IterState.key_hash`, and resolves merges via the iterator/`MvccStream` pipeline; updated callsites to pass `Option<&Arc<dyn MergeOperator>>`. |
| Iterator & bloom gating: `src/range.rs` | Added `IterState.key_hash: Option<u64>`; centralized per-table bloom logic into `bloom_passes(state, table)` and applied bloom pre-filtering for both prefix and key hashes across single- and multi-table scan paths; moved prefix-skip metric into `bloom_passes`. |
| Table bloom APIs: `src/table/mod.rs` | Introduced private `bloom_may_contain_hash(...)`, reintroduced `maybe_contains_prefix(...)` as a thin wrapper, and added `pub(crate) bloom_may_contain_key_hash(key_hash: u64) -> Result<bool>` to support precomputed key-hash checks. |
| Memtable removal: `src/memtable/mod.rs` | Removed `Memtable::get_all_for_key(...)` and its unit tests; memtable point-collection is now handled via the unified iterator pipeline. |
| Tests & benches: `tests/merge_operator.rs`, `tests/tree_prefix_bloom.rs`, `benches/merge_point_read.rs`, `Cargo.toml` | Updated/added tests to validate merge resolution via `resolve_merge_via_pipeline` and prefix-bloom behavior; added Criterion benchmark `merge_point_read` exercising merge-point reads with/without cache. |
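
The gating described for `bloom_passes(state, table)` can be sketched roughly as below. This is a guess at the shape, not the code from `src/range.rs`: the `prefix_hash` field name, the `Metrics` struct, and `surviving_tables` are hypothetical, and the stand-in `Table` uses exact hash sets where the real table would consult its bloom filter (and return a `Result`).

```rust
use std::collections::HashSet;

/// Hypothetical slice of iterator state: only the two optional hashes matter here.
#[derive(Default)]
struct IterState {
    prefix_hash: Option<u64>, // assumed field name
    key_hash: Option<u64>,    // field added by this PR
}

/// Stand-in table whose "filters" are exact hash sets (no false positives).
#[derive(Default)]
struct Table {
    prefix_hashes: HashSet<u64>,
    key_hashes: HashSet<u64>,
}

/// Hypothetical counter for the prefix-skip metric.
#[derive(Default)]
struct Metrics {
    prefix_bloom_skips: u64,
}

/// Returns false only when a configured filter proves the table cannot match.
fn bloom_passes(state: &IterState, table: &Table, metrics: &mut Metrics) -> bool {
    if let Some(h) = state.prefix_hash {
        if !table.prefix_hashes.contains(&h) {
            metrics.prefix_bloom_skips += 1;
            return false;
        }
    }
    if let Some(h) = state.key_hash {
        if !table.key_hashes.contains(&h) {
            return false;
        }
    }
    true // no hash configured, or every configured check passed
}

/// Indices of the tables an iterator would still have to open.
fn surviving_tables(state: &IterState, tables: &[Table], metrics: &mut Metrics) -> Vec<usize> {
    tables
        .iter()
        .enumerate()
        .filter(|(_, t)| bloom_passes(state, t, metrics))
        .map(|(i, _)| i)
        .collect()
}
```

Keeping both checks in one helper lets the single-table and multi-table paths share the same skip decision. A state with neither hash set passes every table, which matches an unfiltered range scan.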

Sequence Diagram(s)

sequenceDiagram
    participant Client as Client
    participant Tree as Tree
    participant Pipeline as Iterator\n(IterState)
    participant Bloom as BloomFilter
    participant Table as Table/Memtable
    participant Stream as MvccStream\n(Merger)

    Client->>Tree: get(key, seqno)
    Tree->>Pipeline: build single-key range\nset IterState.key_hash = hash(key)
    Pipeline->>Bloom: bloom_may_contain_key_hash(hash)
    Bloom-->>Pipeline: pass / skip
    Pipeline->>Table: open iterators for active/sealed + bloom-passing tables
    Table-->>Stream: emit entries for key
    Stream->>Stream: resolve merges / apply RT & indirection rules
    Stream-->>Tree: resolved value / None / error
    Tree-->>Client: return result

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 I hop through filters, one-key at a time,
I set my hash and let the blooms chime,
Memtables step aside, SSTs take their cue,
Pipelines hum softly and stitch old with new,
A tiny rabbit cheers — merge resolved true!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and specifically describes the main change: unifying merge resolution through a bloom-filtered iterator pipeline, which is the core refactor across multiple files. |
| Linked Issues check | ✅ Passed | All acceptance criteria from issue #46 are met: `resolve_merge_get` removed, point reads use `MvccStream` pipeline with bloom pre-filtering, all 44 merge tests pass, and benchmarks for latency on deep trees are included. |
| Out of Scope Changes check | ✅ Passed | All changes directly support the merge resolution refactor. Removed `get_all_for_key` (no longer needed), added bloom APIs, added benchmarks, and updated tests; all align with the issue objectives. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |


codecov Bot commented Mar 22, 2026

Codecov Report

❌ Patch coverage is 92.75362% with 5 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| `src/range.rs` | 84.61% | 4 Missing ⚠️ |
| `src/tree/mod.rs` | 97.05% | 1 Missing ⚠️ |


Copilot AI left a comment

Pull request overview

This PR refactors point-read merge resolution to reuse the existing Merger → MvccStream iterator pipeline (via a single-key key..=key range), while adding bloom-based pre-filtering so point reads can skip irrelevant SSTs without duplicating merge logic.

Changes:

  • Replace the hand-rolled point-read merge collector with resolve_merge_via_pipeline() backed by TreeIter/MvccStream.
  • Add a standard bloom key-check API on Table and thread a key_hash through iterator state to enable bloom skipping for single-key pipelines.
  • Remove the now-unused memtable helper (get_all_for_key) and its tests; update test comments to reference the new function.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.

| File | Description |
|---|---|
| `src/table/mod.rs` | Factors bloom checking into a hash-based helper and exposes `bloom_may_contain_key()` for point-read pre-filtering. |
| `src/range.rs` | Extends `IterState` with `key_hash` and centralizes bloom filtering via `bloom_passes()` for both prefix and key-based skips. |
| `src/tree/mod.rs` | Switches point-read merge resolution to the iterator pipeline (`resolve_merge_via_pipeline`) and updates merge-operator plumbing accordingly. |
| `src/memtable/mod.rs` | Removes `get_all_for_key()` and its unit tests, which were only used by the old point-read merge path. |
| `tests/merge_operator.rs` | Updates comments referring to the old merge-resolution function name. |


Copilot AI left a comment

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.

Comment thread src/table/mod.rs Outdated
Comment thread src/tree/mod.rs Outdated
Comment thread tests/merge_operator.rs Outdated
Comment thread tests/merge_operator.rs
@polaz polaz force-pushed the feat/#46-refactor-unify-merge-resolution-via-bloom-filtered branch from cec697b to feb7585 Compare March 22, 2026 09:09
@polaz polaz requested a review from Copilot March 22, 2026 09:53

polaz commented Mar 22, 2026

@coderabbitai re-review

coderabbitai Bot commented Mar 22, 2026

@polaz Sure, I'll re-review the PR now!


🧠 Learnings used
Learnt from: polaz
Repo: structured-world/lsm-tree PR: 28
File: src/blob_tree/mod.rs:648-655
Timestamp: 2026-03-21T09:32:08.642Z
Learning: In `structured-world/lsm-tree`, `BlobTree::merge` (in `src/blob_tree/mod.rs`) delegates directly to `self.index.merge(...)`. BlobTree's `get()`/`resolve_key()` does not yet resolve merge operands — it handles Indirection but not merge operand collection. Full BlobTree merge read-path support is a documented known limitation deferred to follow-up work (introduced in PR `#28`). The compaction path already handles merge operands correctly, including Indirection safety via the pending buffer. Do not flag the absence of BlobTree merge read-path resolution as a blocking concern in PR `#28` or future PRs until the follow-up work is addressed.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-22T09:21:07.598Z
Learning: In `structured-world/lsm-tree` (`src/range.rs`), prefix bloom filter skipping for multi-table runs (L0) has been implemented in PR `#64` (closes issue `#57`). The `_ =>` branch in `TreeIter::create_range` now applies a per-table key-range guard followed by a bloom check; 0 survivors skips the run, 1 survivor is demoted to `single_tables` for the range-tombstone table-skip optimization, and 2+ survivors are collected into a new `Run`. Range tombstones are still collected from ALL tables in the run regardless of bloom filtering. Do not flag the absence of multi-table run prefix bloom skipping in this repository — it has been implemented.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 28
File: src/tree/mod.rs:913-956
Timestamp: 2026-03-22T03:09:12.274Z
Learning: In `structured-world/lsm-tree` (`src/tree/mod.rs`), the `resolve_merge_get` function's disk-scan loop in step 3 is a flat scan across all levels/runs and does not apply the L0-specific newest-first ordering used by `get_internal_entry_from_tables`. This means it can stop on a base/tombstone boundary from an older L0 run while a newer overlapping L0 run still has visible `MergeOperand` entries. This is a known limitation documented with `// TODO(`#46`)` in the code and is intentionally deferred to issue `#46`. Do not flag the flat L0 scan in `resolve_merge_get` as a merge-semantics bug in PR `#28` or future PRs until issue `#46` is addressed.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 45
File: tools/db_bench/src/workloads/mergerandom.rs:27-35
Timestamp: 2026-03-21T17:08:52.777Z
Learning: In `structured-world/lsm-tree`, `lsm_tree`'s merge operator API is internal (not public). The `mergerandom` workload in `tools/db_bench/src/workloads/mergerandom.rs` is intentionally an overwrite/compaction stress benchmark rather than a true merge-operand benchmark. True merge-operand benchmarks are planned for a future PR using the CoordiNode posting list API. Do not flag `mergerandom` for lacking merge-operand calls in PR `#45` or future PRs until that API is exposed.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 28
File: src/abstract_tree.rs:128-129
Timestamp: 2026-03-22T03:19:31.834Z
Learning: In `structured-world/lsm-tree` (`src/abstract_tree.rs`), the flush path (`AbstractTree::flush`) intentionally does NOT pass range tombstones into `CompactionStream` for RT-boundary-aware merge resolution. Flush is a co-location pass: both KV entries (including merged operands) and RTs are written together into the output SST. RT suppression is enforced exclusively at read time via `is_suppressed_by_range_tombstones` (point reads) and `RangeTombstoneFilter` (range scans). Do not flag the absence of RT propagation into `CompactionStream` during flush as a merge-semantics bug in PR `#28` or future PRs.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 43
File: tests/tree_prefix_bloom.rs:91-105
Timestamp: 2026-03-21T22:11:09.195Z
Learning: In `structured-world/lsm-tree`, prefix-bloom integration tests in `tests/tree_prefix_bloom.rs` intentionally validate correctness (result counts and key ordering) rather than asserting that the bloom-skip pruning path fired. Bloom-skip metrics/counters are `pub(crate)` and cannot be observed from integration tests. Do not flag these tests for failing to assert on internal pruning behavior in PR `#43` or future PRs.

Learnt from: CR
Repo: structured-world/lsm-tree PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-03-20T22:03:44.692Z
Learning: Applies to src/memtable/interval_tree.rs : `src/memtable/interval_tree.rs` — Interval tree for memtable range tombstone queries

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-20T08:17:06.289Z
Learning: In `structured-world/lsm-tree`, the switch from `Mutex<IntervalTree>` to `RwLock<IntervalTree>` (or a freeze-on-seal immutable structure) for memtable range-tombstone storage in `src/memtable/mod.rs` is intentionally deferred from PR `#21` to a separate issue (`#30`). Do not flag the existing `Mutex<IntervalTree>` usage in `src/memtable/mod.rs` as a concurrency concern in the context of PR `#21`; it is tracked in issue `#30`.

Learnt from: CR
Repo: structured-world/lsm-tree PR: 0
File: .github/instructions/rust.instructions.md:0-0
Timestamp: 2026-03-20T22:04:21.226Z
Learning: Applies to **/*.rs : Flag incorrect merge semantics: tombstones not propagated to lower levels, point deletes applied out of order (Tier 1 — MUST flag)

Learnt from: CR
Repo: structured-world/lsm-tree PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-03-20T22:03:44.692Z
Learning: Applies to src/range_tombstone_filter.rs : `src/range_tombstone_filter.rs` — MVCC-aware range tombstone filtering for iterators
✅ Actions performed

Full review triggered.


Copilot AI left a comment

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.

Comment thread src/range.rs
Comment thread tests/merge_operator.rs Outdated
Comment thread tests/tree_prefix_bloom.rs Outdated
@polaz polaz force-pushed the feat/#46-refactor-unify-merge-resolution-via-bloom-filtered branch 2 times, most recently from 58fcb1a to b53720a Compare March 22, 2026 10:33
@polaz polaz requested a review from Copilot March 22, 2026 10:46

polaz commented Mar 22, 2026

@coderabbitai re-review

coderabbitai Bot commented Mar 22, 2026

@polaz Sure, I'll re-review the PR now!


🧠 Learnings used
Learnt from: polaz
Repo: structured-world/lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-22T09:55:39.724Z
Learning: In `structured-world/lsm-tree` (`src/compaction/leveled/mod.rs`), the multi-level compaction path (lines ~749–759) uses a single aggregate key range across all L0+L1 inputs to query L2 overlapping tables. Using per-range L2 overlap queries (one per contiguous L1 range segment) instead would avoid pulling in L2 tables that sit in gaps between disjoint L1 ranges on sparse keyspaces. This improvement is intentionally deferred from PR `#66` to a separate follow-up issue. Do not flag the coarse aggregate L2 range query as a blocking concern in PR `#66` or future PRs until that issue is addressed.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-22T09:55:39.724Z
Learning: In `structured-world/lsm-tree` (`src/compaction/leveled/mod.rs`), the multi-level compaction path uses a single aggregate key range across all L0+L1 inputs to query L2 overlapping tables. Using per-range L2 overlap queries (one per contiguous L1 range segment) instead would avoid pulling in L2 tables that sit in gaps between disjoint L1 ranges on sparse keyspaces. This improvement is intentionally deferred from PR `#66` to issue `#72`. Do not flag the coarse aggregate L2 range query as a blocking concern in PR `#66` or future PRs until issue `#72` is addressed.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-20T08:17:39.835Z
Learning: In `structured-world/lsm-tree`, the `Memtable` range-tombstone storage uses `Mutex<IntervalTree>` (introduced in PR `#21`). Switching to `RwLock<IntervalTree>` or a freeze-on-seal immutable structure is a deliberate future improvement tracked in issue `#31` — do not flag the `Mutex` usage as a blocking concern in PR `#21` or future PRs that reference this design.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-20T08:17:39.835Z
Learning: In `structured-world/lsm-tree`, the `Memtable` range-tombstone storage uses `Mutex<IntervalTree>` (introduced in PR `#21`). Switching to `RwLock<IntervalTree>` or a freeze-on-seal immutable structure is a deliberate future improvement tracked in a separate issue — do not flag the `Mutex` usage as a blocking concern in PR `#21` or future PRs that reference this design.

✅ Actions performed

Full review triggered.


Copilot AI left a comment

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

Comment thread src/tree/mod.rs
Comment thread tests/merge_operator.rs
@polaz polaz requested a review from Copilot March 22, 2026 11:15

polaz commented Mar 22, 2026

@coderabbitai re-review

coderabbitai Bot commented Mar 22, 2026

@polaz Sure, I'll re-review the PR now!


🧠 Learnings used
Learnt from: polaz
Repo: structured-world/lsm-tree PR: 21
File: src/table/multi_writer.rs:129-206
Timestamp: 2026-03-20T11:14:02.461Z
Learning: In `structured-world/lsm-tree`, the `write_rts_to_writer` function in `src/table/multi_writer.rs` has a known limitation when `clip_range_tombstones=true` (compaction mode): RT fragments that don't intersect any output writer's KV span are silently dropped, which can lose delete-only gaps (e.g. an RT covering `[m,p)` while all output tables cover `[a,l]` and `[q,z]`). This limitation is intentionally deferred to issue `#32` and is documented as a Known Limitation in PR `#21`. Do not flag this as a blocking concern in PR `#21` or any PR until `#32` is addressed.
✅ Actions performed

Full review triggered.

Copilot AI left a comment

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.

Comment thread src/tree/mod.rs
@polaz polaz force-pushed the feat/#46-refactor-unify-merge-resolution-via-bloom-filtered branch from 661ea79 to 17131f7 Compare March 22, 2026 11:37
@polaz polaz requested a review from Copilot March 22, 2026 11:37
@polaz
Member Author

polaz commented Mar 22, 2026

@coderabbitai re-review

@coderabbitai

coderabbitai Bot commented Mar 22, 2026

@polaz Sure, I'll re-review the PR now!


🧠 Learnings used
Learnt from: polaz
Repo: structured-world/lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-20T08:17:06.289Z
Learning: In `structured-world/lsm-tree`, the switch from `Mutex<IntervalTree>` to `RwLock<IntervalTree>` (or a freeze-on-seal immutable structure) for memtable range-tombstone storage in `src/memtable/mod.rs` is intentionally deferred from PR `#21` to a separate issue (`#30`). Do not flag the existing `Mutex<IntervalTree>` usage in `src/memtable/mod.rs` as a concurrency concern in the context of PR `#21`; it is tracked in issue `#30`.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 28
File: src/tree/mod.rs:913-956
Timestamp: 2026-03-22T03:09:12.274Z
Learning: In `structured-world/lsm-tree` (`src/tree/mod.rs`), the `resolve_merge_get` function's disk-scan loop in step 3 is a flat scan across all levels/runs and does not apply the L0-specific newest-first ordering used by `get_internal_entry_from_tables`. This means it can stop on a base/tombstone boundary from an older L0 run while a newer overlapping L0 run still has visible `MergeOperand` entries. This is a known limitation documented with `// TODO(`#46`)` in the code and is intentionally deferred to issue `#46`. Do not flag the flat L0 scan in `resolve_merge_get` as a merge-semantics bug in PR `#28` or future PRs until issue `#46` is addressed.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-22T09:55:39.724Z
Learning: In `structured-world/lsm-tree` (`src/compaction/leveled/mod.rs`), the multi-level compaction path (lines ~749–759) uses a single aggregate key range across all L0+L1 inputs to query L2 overlapping tables. Using per-range L2 overlap queries (one per contiguous L1 range segment) instead would avoid pulling in L2 tables that sit in gaps between disjoint L1 ranges on sparse keyspaces. This improvement is intentionally deferred from PR `#66` to a separate follow-up issue. Do not flag the coarse aggregate L2 range query as a blocking concern in PR `#66` or future PRs until that issue is addressed.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-20T08:17:39.835Z
Learning: In `structured-world/lsm-tree`, the `Memtable` range-tombstone storage uses `Mutex<IntervalTree>` (introduced in PR `#21`). Switching to `RwLock<IntervalTree>` or a freeze-on-seal immutable structure is a deliberate future improvement tracked in issue `#31` — do not flag the `Mutex` usage as a blocking concern in PR `#21` or future PRs that reference this design.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-22T09:55:39.724Z
Learning: In `structured-world/lsm-tree` (`src/compaction/leveled/mod.rs`), the multi-level compaction path uses a single aggregate key range across all L0+L1 inputs to query L2 overlapping tables. Using per-range L2 overlap queries (one per contiguous L1 range segment) instead would avoid pulling in L2 tables that sit in gaps between disjoint L1 ranges on sparse keyspaces. This improvement is intentionally deferred from PR `#66` to issue `#72`. Do not flag the coarse aggregate L2 range query as a blocking concern in PR `#66` or future PRs until issue `#72` is addressed.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 28
File: src/abstract_tree.rs:128-129
Timestamp: 2026-03-22T03:19:31.834Z
Learning: In `structured-world/lsm-tree` (`src/abstract_tree.rs`), the flush path (`AbstractTree::flush`) intentionally does NOT pass range tombstones into `CompactionStream` for RT-boundary-aware merge resolution. Flush is a co-location pass: both KV entries (including merged operands) and RTs are written together into the output SST. RT suppression is enforced exclusively at read time via `is_suppressed_by_range_tombstones` (point reads) and `RangeTombstoneFilter` (range scans). Do not flag the absence of RT propagation into `CompactionStream` during flush as a merge-semantics bug in PR `#28` or future PRs.

Learnt from: CR
Repo: structured-world/lsm-tree PR: 0
File: .github/instructions/rust.instructions.md:0-0
Timestamp: 2026-03-20T22:04:21.226Z
Learning: Applies to **/*.rs : Flag incorrect merge semantics: tombstones not propagated to lower levels, point deletes applied out of order (Tier 1 — MUST flag)

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 28
File: src/blob_tree/mod.rs:648-655
Timestamp: 2026-03-21T09:32:08.642Z
Learning: In `structured-world/lsm-tree`, `BlobTree::merge` (in `src/blob_tree/mod.rs`) delegates directly to `self.index.merge(...)`. BlobTree's `get()`/`resolve_key()` does not yet resolve merge operands — it handles Indirection but not merge operand collection. Full BlobTree merge read-path support is a documented known limitation deferred to follow-up work (introduced in PR `#28`). The compaction path already handles merge operands correctly, including Indirection safety via the pending buffer. Do not flag the absence of BlobTree merge read-path resolution as a blocking concern in PR `#28` or future PRs until the follow-up work is addressed.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 43
File: tests/tree_prefix_bloom.rs:91-105
Timestamp: 2026-03-21T22:11:09.195Z
Learning: In `structured-world/lsm-tree`, prefix-bloom integration tests in `tests/tree_prefix_bloom.rs` intentionally validate correctness (result counts and key ordering) rather than asserting that the bloom-skip pruning path fired. Bloom-skip metrics/counters are `pub(crate)` and cannot be observed from integration tests. Do not flag these tests for failing to assert on internal pruning behavior in PR `#43` or future PRs.

✅ Actions performed

Full review triggered.

polaz added 5 commits March 22, 2026 15:07
- Remove stale line number reference from merge_rt test comment
- Reword prefix bloom test docstring to describe correctness
  validation independent of bloom filter behavior
- Add comment explaining why multi-table runs skip bloom filtering
- Remove brittle line number reference from merge_tombstone test
- Add code comment explaining intentional TreeIter reuse in
  resolve_merge_via_pipeline (bloom pre-filter + shared logic)
- Broaden multi-table run bloom guard to trigger on key_hash too
  (was prefix_hash-only, so point-read merges skipped bloom pruning)
- Replace inline prefix bloom check with bloom_passes() for
  consistent prefix+key bloom handling and metrics
- Soften test comments to not assert definite bloom behavior
- Rename prefix bloom test to describe correctness scenario
@polaz polaz force-pushed the feat/#46-refactor-unify-merge-resolution-via-bloom-filtered branch from 2f5304f to 56aafdb Compare March 22, 2026 13:35

@sw-release-bot sw-release-bot Bot left a comment

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'lsm-tree db_bench'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.15.

| Benchmark suite | Current: 3d97a40 | Previous: b9c6897 | Ratio |
|---|---|---|---|
| fillrandom | 875131.7860956681 ops/sec | 1078840.3879110864 ops/sec | 1.23 |
| readrandom | 550753.7447771127 ops/sec | 655440.3305655628 ops/sec | 1.19 |
| overwrite | 740668.2265781432 ops/sec | 929551.8754039077 ops/sec | 1.26 |

This comment was automatically generated by workflow using github-action-benchmark.
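
As a worked check of the alert above: the reported ratios are consistent with previous throughput divided by current throughput, compared against the 1.15 threshold. The sketch below only reproduces that arithmetic from the posted numbers; the function names are hypothetical and it is not the benchmark action's own code.

```rust
/// Ratio consistent with the alert table: previous throughput divided by
/// current throughput (ops/sec, so a value above 1.0 means "got slower").
fn regression_ratio(previous_ops: f64, current_ops: f64) -> f64 {
    previous_ops / current_ops
}

/// True when the slowdown ratio exceeds the alert threshold.
fn exceeds_threshold(previous_ops: f64, current_ops: f64, threshold: f64) -> bool {
    regression_ratio(previous_ops, current_ops) > threshold
}
```

With the posted figures, all three workloads (`fillrandom`, `readrandom`, `overwrite`) come out above 1.15, which is why each appears in the alert.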

CC: @polaz

@polaz polaz requested a review from Copilot March 22, 2026 13:49
@polaz
Member Author

polaz commented Mar 22, 2026

@coderabbitai re-review

@coderabbitai

coderabbitai Bot commented Mar 22, 2026

@polaz Sure, I'll re-review the PR now!


🧠 Learnings used
Learnt from: polaz
Repo: structured-world/lsm-tree PR: 67
File: src/memtable/mod.rs:72-77
Timestamp: 2026-03-22T12:26:47.795Z
Learning: In `structured-world/lsm-tree` (`src/memtable/interval_tree.rs` and `src/memtable/mod.rs`), the memtable `IntervalTree` used for range tombstone containment queries (`is_key_suppressed_by_range_tombstone`, `query_suppression`, `insert_range_tombstone`) still uses lexicographic `Ord` on `UserKey` rather than the pluggable `SharedComparator` introduced in PR `#67`. Threading the comparator into the AVL interval tree is a non-trivial refactor intentionally deferred from PR `#67` and tracked in issue `#81`. Do not flag the absence of comparator-aware RT suppression in the memtable interval tree as a blocking concern in PR `#67` or future PRs until issue `#81` is addressed.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-22T13:03:58.016Z
Learning: In `structured-world/lsm-tree` (`src/table/mod.rs`), `Table::bloom_may_contain_key_hash` returns `Ok(true)` (conservative fallback) for partitioned bloom filters (`pinned_filter_index.is_some()`) and filter TLI (`regions.filter_tli.is_some()`). Extending it to accept a key+seqno for partition-aware seeking is tracked in issue `#83`. Do not flag the conservative fallback in `bloom_may_contain_key_hash` for partitioned/TLI filters as a blocking concern in PR `#69` or future PRs until issue `#83` is addressed.
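
The conservative fallback described in the last learning can be illustrated with a small sketch: when a full filter is available, the hash is probed; for layouts that cannot be probed by hash alone, the answer is "maybe" so the table is never wrongly skipped. Everything here is an assumption made for illustration (`ToyBloom`, `TableFilter`, `may_contain_key_hash_sketch`, the double-hashing scheme); the real `Table::bloom_may_contain_key_hash` returns a `Result` and uses the crate's own filter format.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hash a key once; the same hash can be reused for every table probed.
fn hash_key(key: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    key.hash(&mut hasher);
    hasher.finish()
}

/// Toy bloom filter probed by a precomputed hash (double hashing).
struct ToyBloom {
    bits: Vec<bool>,
    probes: u32,
}

impl ToyBloom {
    fn new(bit_count: usize, probes: u32) -> Self {
        assert!(bit_count > 0 && probes > 0);
        Self { bits: vec![false; bit_count], probes }
    }

    /// Bit positions derived from one 64-bit hash.
    fn positions(&self, hash: u64) -> impl Iterator<Item = usize> {
        let m = self.bits.len() as u64;
        let step = hash.rotate_left(32) | 1;
        (0..u64::from(self.probes))
            .map(move |i| (hash.wrapping_add(i.wrapping_mul(step)) % m) as usize)
    }

    fn insert_hash(&mut self, hash: u64) {
        let positions: Vec<usize> = self.positions(hash).collect();
        for p in positions {
            self.bits[p] = true;
        }
    }

    fn may_contain_hash(&self, hash: u64) -> bool {
        self.positions(hash).all(|p| self.bits[p])
    }
}

/// Hypothetical filter layouts for one table.
enum TableFilter {
    /// A single full filter that can be probed with the key hash.
    Full(ToyBloom),
    /// Partitioned or index-backed filter: not probed in this sketch.
    Partitioned,
}

/// `false` means "definitely absent, skip the table"; `true` means "maybe".
fn may_contain_key_hash_sketch(filter: &TableFilter, key_hash: u64) -> bool {
    match filter {
        TableFilter::Full(bloom) => bloom.may_contain_hash(key_hash),
        // Conservative fallback: never skip a table we cannot probe.
        TableFilter::Partitioned => true,
    }
}
```

The fallback trades pruning for safety: a `true` only costs an extra table read, whereas a wrong `false` would hide data.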
✅ Actions performed

Full review triggered.

Copilot AI left a comment

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated no new comments.

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
src/tree/mod.rs (1)

776-816: Please attach the deep-tree point-read benchmark before merge.

This rewires the hot merge point-read path, and the PR acceptance criteria still call for staying within roughly 5% latency on a 100-table tree. The semantic coverage looks solid; the missing release signal here is the benchmark data.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tree/mod.rs` around lines 776 - 816, Before merging, add and attach the
deep-tree point-read benchmark results that measure latency on a ~100-table tree
for the new hot-path in resolve_merge_via_pipeline (the TreeIter-based
point-read/merge path). Run the existing benchmark harness (the deep-tree
point-read benchmark used in previous PRs) against the rewritten path
(resolve_merge_via_pipeline) and include the before/after latency numbers and
configuration (table count, data size, seqno, merge_operator used) in the PR so
reviewers can verify you stayed within the ~5% latency budget.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@src/tree/mod.rs`:
- Around line 776-816: Before merging, add and attach the deep-tree point-read
benchmark results that measure latency on a ~100-table tree for the new hot-path
in resolve_merge_via_pipeline (the TreeIter-based point-read/merge path). Run
the existing benchmark harness (the deep-tree point-read benchmark used in
previous PRs) against the rewritten path (resolve_merge_via_pipeline) and
include the before/after latency numbers and configuration (table count, data
size, seqno, merge_operator used) in the PR so reviewers can verify you stayed
within the ~5% latency budget.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ef11b0e0-045a-49ef-871a-68b9173f069a

📥 Commits

Reviewing files that changed from the base of the PR and between b9c6897 and 56aafdb.

📒 Files selected for processing (6)
  • src/memtable/mod.rs
  • src/range.rs
  • src/table/mod.rs
  • src/tree/mod.rs
  • tests/merge_operator.rs
  • tests/tree_prefix_bloom.rs
💤 Files with no reviewable changes (1)
  • src/memtable/mod.rs

Criterion benchmark measuring point-read merge resolution latency
on trees with 10/50/100 L0 tables, both cached and uncached.
Exercises the bloom-filtered iterator pipeline.
@polaz
Member Author

polaz commented Mar 22, 2026

Benchmark: merge point-read on deep L0

`cargo bench --bench merge_point_read`: merge resolution via `resolve_merge_via_pipeline` on trees with bloom-filtered L0 tables.

Setup: 1 base value + N-1 unrelated tables (bloom rejects) + 1 merge operand in memtable. CounterMerge operator. macOS ARM64.
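
For readers unfamiliar with this kind of setup, a counter-style merge over one key can be sketched as below: versions are walked newest first, operands are collected until a base value or tombstone is reached, and the result is the base plus the collected deltas. This is a simplified illustration under stated assumptions, not the crate's `MvccStream` or its `CounterMerge`: `Version`, `resolve_counter`, the `seqno < read_seqno` visibility rule, and the "missing base counts as 0" rule are all hypothetical choices made for the example.

```rust
/// One stored version of a single key (hypothetical model).
#[derive(Debug, Clone, PartialEq)]
enum Version {
    /// A full value: resolution stops here.
    Base(i64),
    /// A merge operand: a delta to add to whatever lies beneath it.
    Operand(i64),
    /// A point delete: resolution stops, nothing beneath is visible.
    Tombstone,
}

/// Resolve a counter for one key from `(seqno, version)` pairs.
/// Assumes a version is visible when `seqno < read_seqno`.
fn resolve_counter(mut versions: Vec<(u64, Version)>, read_seqno: u64) -> Option<i64> {
    // Drop versions that are not visible to this read.
    versions.retain(|(seqno, _)| *seqno < read_seqno);
    // Walk newest first, like a merged iterator over all sources.
    versions.sort_by(|a, b| b.0.cmp(&a.0));

    let mut operands: Vec<i64> = Vec::new();
    let mut base: Option<i64> = None;

    for (_, version) in versions {
        match version {
            Version::Operand(delta) => operands.push(delta),
            Version::Base(value) => {
                base = Some(value);
                break;
            }
            Version::Tombstone => break,
        }
    }

    if base.is_none() && operands.is_empty() {
        // Nothing visible, or the newest visible entry was a tombstone.
        return None;
    }
    // Assumption: operands without a base start from zero.
    Some(base.unwrap_or(0) + operands.iter().sum::<i64>())
}
```

In the benchmark setup, the base value sits in one table and the single operand in the memtable, so a correct read has to combine both sources.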

| Tables | Uncached (µs) | Cached (µs) |
|---|---|---|
| 10 | 2.23 | 1.33 |
| 50 | 2.33 | 1.56 |
| 100 | 2.90 | 1.99 |

Key observations:

  • 10→100 tables: +30% uncached (2.23 → 2.90 µs), +50% cached (1.33 → 1.99 µs). The bloom pre-filter rejects 99/100 tables, so latency grows far slower than linearly in table count
  • Sub-3µs for 100-table uncached merge point-read
  • Cached path is roughly 1.5x to 1.7x faster than uncached (block cache eliminates disk I/O)
  • Bloom filter effectiveness confirmed: with 100 tables only 1-2 pass the filter
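
The percentages and speedups in these observations follow directly from the table; a small sketch of the arithmetic (helper names are illustrative only):

```rust
/// Relative growth from `from` to `to`, in percent.
fn percent_increase(from: f64, to: f64) -> f64 {
    (to - from) / from * 100.0
}

/// How many times faster the `fast` latency is than the `slow` one.
fn speedup(slow: f64, fast: f64) -> f64 {
    slow / fast
}

/// Rows from the results table: (tables, uncached µs, cached µs).
const RESULTS: [(u32, f64, f64); 3] = [
    (10, 2.23, 1.33),
    (50, 2.33, 1.56),
    (100, 2.90, 1.99),
];
```

Applying `percent_increase` to the 10-table and 100-table rows gives about 30% (uncached) and about 50% (cached) for a 10x increase in table count, and `speedup` per row gives values between roughly 1.46 and 1.68.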

Benchmark added in commit 4836e82 (benches/merge_point_read.rs).

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
benches/merge_point_read.rs (1)

35-112: Extract shared tree-population logic for cached/uncached cases.

Both branches duplicate insert/flush/merge setup. A small helper (parameterized by cache size) would reduce drift and make benchmark intent easier to maintain.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@benches/merge_point_read.rs` around lines 35 - 112, Duplicate setup for
populating trees (in merge_point_read.rs using variables tree and tree_cached
and their insert/flush/merge flows) should be extracted into a single helper
function (e.g., build_populated_tree or populate_tree) that accepts parameters
for cache size (or Cache::with_capacity_bytes arg), table_count and initial
seqno and returns the opened tree and the final sequence number; replace both
uncached and cached blocks with calls to this helper and use its returned seqno
for the get/merge checks to eliminate duplicated insert/flush/merge logic.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@benches/merge_point_read.rs`:
- Around line 37-47: Multiple calls in the benchmark setup (tempdir(),
Config::new(...) .use_cache(...).with_merge_operator(...).open(), and other
operations referenced in this diff) use unwrap() on fallible filesystem and tree
operations which can panic; change these to fallible handling by returning
Result from the benchmark helper and using the ? operator (or explicit match/if
let error handling) so errors propagate to the harness boundary, and update the
caller to handle the Result; specifically replace tempdir().unwrap(), the
Config::new(...).open().unwrap(), and the other listed unwrap() sites with ? (or
proper error branching) and ensure functions like the setup helper return
Result<..., E> so the harness can handle/report I/O/runtime errors instead of
panicking.
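
A rough sketch of the shape both suggestions point at: one setup helper, parameterized by cache size and table count, that returns a `Result` and propagates failures with `?` instead of calling `unwrap()`. It uses only the standard library and plain files as stand-ins for flushed tables; `BenchFixture` and `build_populated_fixture` are hypothetical names, and the real benchmark would open an `lsm_tree` tree here instead.

```rust
use std::error::Error;
use std::fs;
use std::path::PathBuf;

/// Hypothetical result of benchmark setup: where the data lives and how
/// it was configured. A real helper would return the opened tree instead.
struct BenchFixture {
    dir: PathBuf,
    table_count: usize,
    cache_bytes: u64,
}

/// Shared setup for the cached and uncached cases. Every fallible step
/// uses `?`, so I/O errors reach the caller instead of panicking here.
fn build_populated_fixture(
    name: &str,
    cache_bytes: u64,
    table_count: usize,
) -> Result<BenchFixture, Box<dyn Error>> {
    let dir = std::env::temp_dir().join(name);
    fs::create_dir_all(&dir)?;

    // Stand-in for "insert + flush" once per table.
    for i in 0..table_count {
        fs::write(dir.join(format!("table_{i}.dat")), i.to_le_bytes())?;
    }

    Ok(BenchFixture { dir, table_count, cache_bytes })
}
```

The cached and uncached benchmark cases would then differ only in the `cache_bytes` argument, and the harness boundary decides how to report a setup error.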

---

Nitpick comments:
In `@benches/merge_point_read.rs`:
- Around line 35-112: Duplicate setup for populating trees (in
merge_point_read.rs using variables tree and tree_cached and their
insert/flush/merge flows) should be extracted into a single helper function
(e.g., build_populated_tree or populate_tree) that accepts parameters for cache
size (or Cache::with_capacity_bytes arg), table_count and initial seqno and
returns the opened tree and the final sequence number; replace both uncached and
cached blocks with calls to this helper and use its returned seqno for the
get/merge checks to eliminate duplicated insert/flush/merge logic.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 9a481a27-0b5f-43e4-b36d-3c7914474811

📥 Commits

Reviewing files that changed from the base of the PR and between 56aafdb and 4836e82.

📒 Files selected for processing (2)
  • Cargo.toml
  • benches/merge_point_read.rs
✅ Files skipped from review due to trivial changes (1)
  • Cargo.toml

Comment thread benches/merge_point_read.rs

Copilot AI left a comment

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated no new comments.

Development

Successfully merging this pull request may close these issues.

refactor: unify merge resolution via bloom-filtered iterator pipeline

2 participants