feat(range-tombstone): per-source RT visibility in range/prefix iteration#39
Conversation
…tion Replace the single global read_seqno cutoff in RangeTombstoneFilter and ActiveTombstoneSet with a per-RT visibility cutoff. Each collected tombstone is now paired with the seqno of its originating source so that ephemeral memtable RTs use their own index_seqno instead of the outer scan seqno. This fixes two classes of bugs when index_seqno != outer_seqno: - Over-suppress: an ephemeral RT visible at outer_seqno but not at eph_seqno would incorrectly suppress base-tree keys. - Leak: an ephemeral RT visible at eph_seqno but not at outer_seqno would fail to suppress ephemeral KVs that entered the merged stream. - ActiveTombstoneSet::activate() now takes per-RT cutoff_seqno param - RangeTombstoneFilter::new() accepts Vec<(RangeTombstone, SeqNo)> - range.rs tags each collected RT with its source's seqno - Add 6 regression tests covering both failure modes Closes #33
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughRange-tombstone visibility was changed from a single global cutoff to per-tombstone cutoffs. Tombstones are now propagated as Changes
Sequence Diagram(s)sequenceDiagram
participant TreeIter
participant Sources as SSTs / Sealed / Active / Ephemeral
participant RTFilter as RangeTombstoneFilter
participant ActiveSet as ActiveTombstoneSet
participant Scanner as Iterator/Scanner
TreeIter->>Sources: collect RTs with source_seqno
Sources-->>TreeIter: (RangeTombstone, source_seqno)*
TreeIter->>RTFilter: new(inner, tombstones: Vec<(RT, SeqNo)>)
RTFilter->>ActiveSet: activate(rt, cutoff_seqno) for each (rt, cutoff)
ActiveSet->>ActiveSet: query rt.visible_at(cutoff_seqno)?
RTFilter-->>Scanner: filter/emit entries using per-tombstone visibility
Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes Possibly related issues
Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
There was a problem hiding this comment.
Pull request overview
This PR fixes incorrect range tombstone (RT) visibility when an ephemeral memtable overlay uses an index_seqno that differs from the outer scan seqno, by carrying a per-source visibility cutoff alongside each collected RT and applying suppression using that cutoff during iteration.
Changes:
- Tag collected RTs with a per-source cutoff seqno (ephemeral uses
index_seqno, other sources use outer scanseqno) and pass them through range/prefix iteration. - Refactor
RangeTombstoneFilterandActiveTombstoneSetto operate onVec<(RangeTombstone, SeqNo)>(per-RT cutoff) instead of a single global read cutoff. - Add regression tests covering both “over-suppress” and “leak” scenarios when
index_seqno != outer_seqno.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
src/range.rs |
Collects RTs with per-source cutoffs and wires updated filter into range/prefix iteration. |
src/range_tombstone_filter.rs |
Updates filter to activate tombstones with per-tombstone cutoff seqnos for fwd/rev scans. |
src/active_tombstone_set.rs |
Moves visibility cutoff from struct-global to per-activation parameter. |
tests/range_tombstone_ephemeral.rs |
Adds regression coverage for ephemeral overlay RT visibility divergence cases. |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/range.rs`:
- Around line 337-349: The dedup step using dedup_by(|a, b| a.0 == b.0) drops
differing cutoffs for identical RangeTombstone keys, which can discard a visible
copy; change the dedup to preserve the maximum cutoff for equal RTs (i.e., when
merging entries in all_range_tombstones after sort_by(|a,b| a.0.cmp(&b.0)), keep
the entry with the larger cutoff rather than arbitrarily keeping the first), so
the subsequent all(... visible_at(...)) check and RangeTombstoneFilter::new see
the correct cutoff; also add a regression test covering a duplicated RT from two
sources with different cutoffs (persisted vs ephemeral) to ensure the visible
copy is preserved.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 81e0e3e8-b736-4291-93fd-195e6023ca43
📒 Files selected for processing (4)
src/active_tombstone_set.rssrc/range.rssrc/range_tombstone_filter.rstests/range_tombstone_ephemeral.rs
…ources When the same RT exists in both a persisted SST (cutoff=outer_seqno) and the ephemeral memtable (cutoff=eph_seqno), dedup_by on RT-only discards the copy with the higher cutoff, making the RT invisible and leaking covered keys. This test currently FAILS — the fix follows in the next commit.
When the same RT exists in multiple sources with different per-source cutoffs (e.g., persisted SST with outer_seqno + ephemeral memtable with eph_seqno), dedup_by on RT-only arbitrarily discarded one copy. This could drop the copy with the higher cutoff, making the RT invisible and leaking covered keys. Now dedup merges duplicates by keeping the max cutoff, so the RT stays visible if ANY source's snapshot includes it.
dedup_by(|a, b|) passes a=later, b=earlier; b survives. The existing code correctly updates b.1. Expanded comment to prevent confusion with the counter-intuitive parameter naming.
|
@coderabbitai re-review |
|
🧠 Learnings used✅ Actions performedFull review triggered. |
… assertions Both forward_per_source_cutoff_mixed and reverse_per_source_cutoff_mixed only checked max_active_seqno() which was ambiguous (same value whether 1 or 2 RTs activated). Now expire past the visible RT's boundary and assert the set becomes empty, proving the invisible RT was never activated.
There was a problem hiding this comment.
🧹 Nitpick comments (1)
src/active_tombstone_set.rs (1)
349-350: Rename the new tests to include the expected behavior.Both names capture the setup, but not the assertion being protected. Encoding the expected outcome would make failures easier to scan.
As per coding guidelines,
Test naming convention: fn <what>_<condition>_<expected>().Also applies to: 430-431
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/active_tombstone_set.rs` around lines 349 - 350, Rename the newly added test functions to follow the project convention fn <what>_<condition>_<expected>() so the test name encodes the assertion; for example, change fn forward_per_source_cutoff_mixed() to a name that includes the expected behavior (e.g., forward_per_source_cutoff_mixed_keeps_expected_tombstones) and apply the same renaming pattern to the other new test(s) in this file (the other test near the new additions) so each name clearly states what is being tested, under what condition, and what the expected outcome is.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@src/active_tombstone_set.rs`:
- Around line 349-350: Rename the newly added test functions to follow the
project convention fn <what>_<condition>_<expected>() so the test name encodes
the assertion; for example, change fn forward_per_source_cutoff_mixed() to a
name that includes the expected behavior (e.g.,
forward_per_source_cutoff_mixed_keeps_expected_tombstones) and apply the same
renaming pattern to the other new test(s) in this file (the other test near the
new additions) so each name clearly states what is being tested, under what
condition, and what the expected outcome is.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: adafef29-a0bb-4055-a933-5ff62449b89c
📒 Files selected for processing (1)
src/active_tombstone_set.rs
…ntion - forward_per_source_cutoff_mixed → forward_mixed_cutoffs_activates_only_visible_rt - reverse_per_source_cutoff_mixed → reverse_mixed_cutoffs_activates_only_visible_rt - per_source_cutoff_different_visibility → mixed_cutoffs_suppress_only_visible_source
Guard (IterGuard trait) is required for .key() calls on iterator items. Add clarifying comment matching the pattern in range_tombstone.rs.
Vec<Vec<u8>> compares correctly with vec![b"a", b"b"] via Rust's blanket PartialEq impl. Add doc comment matching range_tombstone.rs.
|
@coderabbitai re-review |
|
🧠 Learnings used✅ Actions performedFull review triggered. |
build_ephemeral silently discarded insert_range_tombstone return value. Now asserts size > 0 to catch rejected (start >= end) RTs. Also document defensive re-sort in RangeTombstoneFilter::new.
- Replace magic 999 with EPHEMERAL_MT_ID const - Align key-range doc with actual inserts (a..d, not a..e)
- MvccStream RT suppression uses per-source cutoffs (u64) from the new (RangeTombstone, cutoff) tuple format after upstream PR #39 - resolve_merge_buffered initializes result_seqno from entries.last() instead of using 0 as sentinel (seqno can legitimately be 0) - key_entries_buf reuses allocation across next_back() calls - Document RT clone rationale in range pipeline
- MvccStream RT suppression uses per-source cutoffs (u64) from the new (RangeTombstone, cutoff) tuple format after upstream PR #39 - resolve_merge_buffered initializes result_seqno from entries.last() instead of using 0 as sentinel (seqno can legitimately be 0) - key_entries_buf reuses allocation across next_back() calls - Document RT clone rationale in range pipeline
- MvccStream RT suppression uses per-source cutoffs (u64) from the new (RangeTombstone, cutoff) tuple format after upstream PR #39 - resolve_merge_buffered initializes result_seqno from entries.last() instead of using 0 as sentinel (seqno can legitimately be 0) - key_entries_buf reuses allocation across next_back() calls - Document RT clone rationale in range pipeline
Summary
read_seqnocutoff inRangeTombstoneFilter/ActiveTombstoneSetwith per-RT visibility cutoffindex_seqno, all other sources use the outer scan seqnoindex_seqno != outer_seqno:outer_seqnobut not ateph_seqnoincorrectly suppressed base-tree keyseph_seqnobut not atouter_seqnofailed to suppress ephemeral KVs in the merged streamChanged files
src/active_tombstone_set.rsactivate()takes per-RTcutoff_seqnoparam; removed global cutoff from structsrc/range_tombstone_filter.rsVec<(RangeTombstone, SeqNo)>; constructor no longer takes globalread_seqnosrc/range.rsouter_seqno, ephemeral →eph_seqno; dedup merges by max cutofftests/range_tombstone_ephemeral.rsTest plan
Unit tests (3):
forward_mixed_cutoffs_activates_only_visible_rt— ActiveTombstoneSet: only RT with sufficient cutoff is activated; expiry proves invisible RT was skippedreverse_mixed_cutoffs_activates_only_visible_rt— ActiveTombstoneSetReverse: same with reverse expirymixed_cutoffs_suppress_only_visible_source— RangeTombstoneFilter end-to-end: visible-source RT suppresses, invisible-source RT does notIntegration tests (7):
eph_seqno < outer_seqno)eph_seqno > outer_seqno)eph_seqno == outer_seqno(no divergence)outer_seqno, unaffected by ephemeralSuite: full test suite passes (321+ unit + integration + doc-tests, 0 failures)
Closes #33
Summary by CodeRabbit
Bug Fixes
Tests