feat(range-tombstone): per-source RT visibility in range/prefix iteration by polaz · Pull Request #39 · structured-world/coordinode-lsm-tree

polaz · 2026-03-20T22:52:47Z

Summary

- Replace the single global read_seqno cutoff in RangeTombstoneFilter / ActiveTombstoneSet with a per-RT visibility cutoff
- Each collected tombstone is paired with its originating source's seqno: ephemeral memtable RTs use index_seqno, all other sources use the outer scan seqno
- Fixes two classes of bugs when the ephemeral index_seqno != outer_seqno:
  - Over-suppress: an ephemeral RT visible at outer_seqno but not at eph_seqno incorrectly suppressed base-tree keys
  - Leak: an ephemeral RT visible at eph_seqno but not at outer_seqno failed to suppress ephemeral KVs in the merged stream
- Cross-source dedup merges duplicate RTs by keeping the max cutoff, so an RT remains visible if ANY source's snapshot includes it
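
The two failure modes can be reproduced with a small model. The sketch below is illustrative only and is not the crate's code: the `RangeTombstone` fields, the `covers` helper, and the rule that a tombstone is visible when its seqno is strictly below the cutoff are all assumptions made for the example.

```rust
type SeqNo = u64;

/// Simplified stand-in for the crate's range tombstone (fields are assumptions).
#[derive(Clone, Debug, PartialEq, Eq)]
struct RangeTombstone {
    start: Vec<u8>, // inclusive
    end: Vec<u8>,   // exclusive
    seqno: SeqNo,
}

impl RangeTombstone {
    /// Assumed rule: visible when the RT's seqno is strictly below the cutoff.
    fn visible_at(&self, cutoff: SeqNo) -> bool {
        self.seqno < cutoff
    }

    /// Hypothetical helper: the RT covers a key written before the RT itself.
    fn covers(&self, key: &[u8], kv_seqno: SeqNo) -> bool {
        self.start.as_slice() <= key && key < self.end.as_slice() && kv_seqno < self.seqno
    }
}

/// Old behaviour: one global cutoff is applied to every tombstone.
fn suppressed_global(
    rts: &[RangeTombstone],
    read_seqno: SeqNo,
    key: &[u8],
    kv_seqno: SeqNo,
) -> bool {
    rts.iter()
        .any(|rt| rt.visible_at(read_seqno) && rt.covers(key, kv_seqno))
}

/// New behaviour: each tombstone carries the cutoff of the source it came from.
fn suppressed_per_source(
    rts: &[(RangeTombstone, SeqNo)],
    key: &[u8],
    kv_seqno: SeqNo,
) -> bool {
    rts.iter()
        .any(|(rt, cutoff)| rt.visible_at(*cutoff) && rt.covers(key, kv_seqno))
}
```

With an ephemeral RT at seqno 7, eph_seqno = 5 and outer_seqno = 10, the global check suppresses a base-tree key written at seqno 3 while the per-source check does not (over-suppress). Swapping the two cutoffs gives the leak case: the global check lets an ephemeral KV at seqno 6 through, while the per-source check suppresses it.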

Changed files

File	Change
`src/active_tombstone_set.rs`	`activate()` takes per-RT `cutoff_seqno` param; removed global cutoff from struct
`src/range_tombstone_filter.rs`	Internal vectors now `Vec<(RangeTombstone, SeqNo)>`; constructor no longer takes global `read_seqno`
`src/range.rs`	Tags each collected RT with source seqno: disk/memtable → `outer_seqno`, ephemeral → `eph_seqno`; dedup merges by max cutoff
`tests/range_tombstone_ephemeral.rs`	7 integration tests covering both failure modes + cross-source dedup

Test plan

Unit tests (3):

forward_mixed_cutoffs_activates_only_visible_rt — ActiveTombstoneSet: only RT with sufficient cutoff is activated; expiry proves invisible RT was skipped
reverse_mixed_cutoffs_activates_only_visible_rt — ActiveTombstoneSetReverse: same with reverse expiry
mixed_cutoffs_suppress_only_visible_source — RangeTombstoneFilter end-to-end: visible-source RT suppresses, invisible-source RT does not

Integration tests (7):

Over-suppress: forward + reverse iteration (eph_seqno < outer_seqno)
Over-suppress: bounded range query
Over-suppress: prefix query
Leak: forward + reverse iteration (eph_seqno > outer_seqno)
Normal case: eph_seqno == outer_seqno (no divergence)
Base-tree RT uses outer_seqno, unaffected by ephemeral
Duplicate RT from two sources (SST + ephemeral) — dedup keeps max cutoff

Suite: full test suite passes (321+ unit + integration + doc-tests, 0 failures)

Closes #33

Summary by CodeRabbit

Bug Fixes
- Fixed range tombstone visibility tracking to respect per-source cutoffs, ensuring correct key suppression across snapshots with different visibility boundaries.
- Improved handling of ephemeral memtable range tombstones with independent visibility cutoffs.
Tests
- Added comprehensive integration tests for range tombstone visibility under various snapshot cutoff configurations.

…tion

Replace the single global read_seqno cutoff in RangeTombstoneFilter and ActiveTombstoneSet with a per-RT visibility cutoff. Each collected tombstone is now paired with the seqno of its originating source so that ephemeral memtable RTs use their own index_seqno instead of the outer scan seqno.

This fixes two classes of bugs when index_seqno != outer_seqno:
- Over-suppress: an ephemeral RT visible at outer_seqno but not at eph_seqno would incorrectly suppress base-tree keys.
- Leak: an ephemeral RT visible at eph_seqno but not at outer_seqno would fail to suppress ephemeral KVs that entered the merged stream.

- ActiveTombstoneSet::activate() now takes per-RT cutoff_seqno param
- RangeTombstoneFilter::new() accepts Vec<(RangeTombstone, SeqNo)>
- range.rs tags each collected RT with its source's seqno
- Add 6 regression tests covering both failure modes

Closes #33

coderabbitai · 2026-03-20T22:53:01Z

Walkthrough

Range-tombstone visibility was changed from a single global cutoff to per-tombstone cutoffs. Tombstones are now propagated as (RangeTombstone, SeqNo) pairs (source-specific seqnos like ephemeral index_seqno) and activation/visibility checks use each tombstone's associated cutoff.

Changes

Cohort / File(s)	Summary
Active tombstone sets `src/active_tombstone_set.rs`	Removed stored `cutoff_seqno`; `new()` is parameterless. `activate(&mut self, rt, cutoff_seqno)` now accepts a per-call cutoff. `initialize_from` accepts `IntoIterator<Item = (RangeTombstone, SeqNo)>`.
Range tombstone filter `src/range_tombstone_filter.rs`	`RangeTombstoneFilter::new(inner, Vec<(RangeTombstone, SeqNo)>)` replaces prior signature; stores forward/reverse lists as `(RangeTombstone, SeqNo)`; activation passes per-tombstone cutoff; sorting/dedup adjusted to preserve tombstone identity while merging cutoffs.
Tree iterator / collection `src/range.rs`	`TreeIter::create_range` collects tombstones as `(RangeTombstone, SeqNo)` using source-specific seqnos (SST/sealed/active use scan seqno; ephemeral uses `eph_seqno`), deduplicates by tombstone identity keeping max cutoff, and applies per-tombstone visibility checks (fast-path and wrapped path updated).
Tests — ephemeral RT behavior `tests/range_tombstone_ephemeral.rs`	New integration tests exercising per-source RT visibility across forward/reverse/range/prefix iterators, mixed cutoffs, deduplication behavior, and base-tree cutoff semantics.

Sequence Diagram(s)

sequenceDiagram
    participant TreeIter
    participant Sources as SSTs / Sealed / Active / Ephemeral
    participant RTFilter as RangeTombstoneFilter
    participant ActiveSet as ActiveTombstoneSet
    participant Scanner as Iterator/Scanner

    TreeIter->>Sources: collect RTs with source_seqno
    Sources-->>TreeIter: (RangeTombstone, source_seqno)*
    TreeIter->>RTFilter: new(inner, tombstones: Vec<(RT, SeqNo)>)
    RTFilter->>ActiveSet: activate(rt, cutoff_seqno) for each (rt, cutoff)
    ActiveSet->>ActiveSet: query rt.visible_at(cutoff_seqno)?
    RTFilter-->>Scanner: filter/emit entries using per-tombstone visibility

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related issues

feat: per-source RT visibility in range/prefix iteration when ephemeral memtable index_seqno differs from outer scan seqno #33 — Implements per-source RT visibility to prevent leaks/over-suppression when ephemeral index_seqno != outer seqno.
perf: optimize range tombstone lookup in table-skip and point-read paths #27 — Related: touches TreeIter::create_range and RT collection/visibility logic modified here.

Possibly related PRs

feat: add range tombstones (delete_range / delete_prefix) #21 — Prior PR that introduced the range-tombstone/filter architecture; this change adjusts those types/signatures to carry per-tombstone cutoffs.

Poem

A rabbit pairs stones with seqnos neat and bright,
Each tombstone knows its moment, none steal the light.
Ephemeral whispers guard their own small day,
No leaked keys wander, no rightful keys stray. 🐇✨

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The PR title accurately summarizes the main change: introducing per-source range tombstone visibility in range/prefix iteration, replacing a single global cutoff with per-tombstone cutoffs tied to originating sources.
Linked Issues check	✅ Passed	The PR fully implements the objectives from issue `#33`: tags each RT with originating source seqno, updates RangeTombstoneFilter to use per-RT cutoffs instead of global read_seqno, and includes comprehensive regression tests for ephemeral memtable visibility.
Out of Scope Changes check	✅ Passed	All code changes are directly related to the stated objective of per-source RT visibility: modifying active tombstone sets, range tombstone filter, range collection logic, and adding targeted integration tests.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

codecov · 2026-03-20T22:55:21Z

Codecov Report

❌ Patch coverage is 99.24812% with 1 line in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
src/range.rs	95.83%	1 Missing ⚠️

Copilot

Pull request overview

This PR fixes incorrect range tombstone (RT) visibility when an ephemeral memtable overlay uses an index_seqno that differs from the outer scan seqno, by carrying a per-source visibility cutoff alongside each collected RT and applying suppression using that cutoff during iteration.

Changes:

Tag collected RTs with a per-source cutoff seqno (ephemeral uses index_seqno, other sources use outer scan seqno) and pass them through range/prefix iteration.
Refactor RangeTombstoneFilter and ActiveTombstoneSet to operate on Vec<(RangeTombstone, SeqNo)> (per-RT cutoff) instead of a single global read cutoff.
Add regression tests covering both “over-suppress” and “leak” scenarios when index_seqno != outer_seqno.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

File	Description
`src/range.rs`	Collects RTs with per-source cutoffs and wires updated filter into range/prefix iteration.
`src/range_tombstone_filter.rs`	Updates filter to activate tombstones with per-tombstone cutoff seqnos for fwd/rev scans.
`src/active_tombstone_set.rs`	Moves visibility cutoff from struct-global to per-activation parameter.
`tests/range_tombstone_ephemeral.rs`	Adds regression coverage for ephemeral overlay RT visibility divergence cases.

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/range.rs`:
- Around line 337-349: The dedup step using dedup_by(|a, b| a.0 == b.0) drops
differing cutoffs for identical RangeTombstone keys, which can discard a visible
copy; change the dedup to preserve the maximum cutoff for equal RTs (i.e., when
merging entries in all_range_tombstones after sort_by(|a,b| a.0.cmp(&b.0)), keep
the entry with the larger cutoff rather than arbitrarily keeping the first), so
the subsequent all(... visible_at(...)) check and RangeTombstoneFilter::new see
the correct cutoff; also add a regression test covering a duplicated RT from two
sources with different cutoffs (persisted vs ephemeral) to ensure the visible
copy is preserved.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 81e0e3e8-b736-4291-93fd-195e6023ca43

📥 Commits

Reviewing files that changed from the base of the PR and between defb0bc and 8ff500e.

📒 Files selected for processing (4)

src/active_tombstone_set.rs
src/range.rs
src/range_tombstone_filter.rs
tests/range_tombstone_ephemeral.rs

…ources When the same RT exists in both a persisted SST (cutoff=outer_seqno) and the ephemeral memtable (cutoff=eph_seqno), dedup_by on RT-only discards the copy with the higher cutoff, making the RT invisible and leaking covered keys. This test currently FAILS — the fix follows in the next commit.

When the same RT exists in multiple sources with different per-source cutoffs (e.g., persisted SST with outer_seqno + ephemeral memtable with eph_seqno), dedup_by on RT-only arbitrarily discarded one copy. This could drop the copy with the higher cutoff, making the RT invisible and leaking covered keys. Now dedup merges duplicates by keeping the max cutoff, so the RT stays visible if ANY source's snapshot includes it.
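
A minimal sketch of the merge described above, assuming a simplified `RangeTombstone` with a derived ordering (the real type and the code in `src/range.rs` may differ). The function name `dedup_keep_max_cutoff` is hypothetical.

```rust
type SeqNo = u64;

/// Simplified stand-in for the crate's range tombstone (fields are assumptions).
#[derive(Clone, Debug, PartialEq, Eq, PartialOrd, Ord)]
struct RangeTombstone {
    start: Vec<u8>,
    end: Vec<u8>,
    seqno: SeqNo,
}

/// Sorts by tombstone identity, then merges duplicates keeping the max cutoff.
fn dedup_keep_max_cutoff(mut rts: Vec<(RangeTombstone, SeqNo)>) -> Vec<(RangeTombstone, SeqNo)> {
    // Sort on the tombstone only, so copies from different sources become adjacent.
    rts.sort_by(|a, b| a.0.cmp(&b.0));
    // `dedup_by` hands the closure the later element first; the earlier one survives.
    rts.dedup_by(|later, earlier| {
        if later.0 == earlier.0 {
            // Fold the removed copy's cutoff into the surviving entry.
            earlier.1 = earlier.1.max(later.1);
            true
        } else {
            false
        }
    });
    rts
}
```

Whichever order the SST copy and the ephemeral copy arrive in, the surviving entry ends up with the larger of the two cutoffs.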

Copilot

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

…ngeprefix-itera

dedup_by(|a, b|) passes a=later, b=earlier; b survives. The existing code correctly updates b.1. Expanded comment to prevent confusion with the counter-intuitive parameter naming.
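
The parameter order can be checked in isolation with plain values. This is a standalone illustration of `Vec::dedup_by` from the standard library, not code from the PR; both helper names are hypothetical.

```rust
/// Records each (first_arg, second_arg) pair that `dedup_by` passes to its closure.
fn observed_args(mut v: Vec<u32>) -> Vec<(u32, u32)> {
    let mut seen = Vec::new();
    v.dedup_by(|a, b| {
        seen.push((*a, *b));
        false // never report a duplicate, so every adjacent pair is visited
    });
    seen
}

/// Dedup on the key only: the later duplicate is removed together with its cutoff.
fn dedup_on_key_only(mut v: Vec<(&'static str, u64)>) -> Vec<(&'static str, u64)> {
    v.dedup_by(|a, b| a.0 == b.0);
    v
}
```

For the input `[1, 2, 3]` the closure sees the later element as its first argument each time, and key-only dedup of `[("rt", 5), ("rt", 10)]` keeps the entry with cutoff 5, which is why the surviving (second) argument has to be updated when cutoffs are merged.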

polaz · 2026-03-21T08:19:40Z

@coderabbitai re-review

coderabbitai · 2026-03-21T08:19:53Z

@polaz Sure, I'll kick off a full re-review of PR #39 right away!

🧠 Learnings used

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-20T08:17:06.289Z
Learning: In `structured-world/lsm-tree`, the switch from `Mutex<IntervalTree>` to `RwLock<IntervalTree>` (or a freeze-on-seal immutable structure) for memtable range-tombstone storage in `src/memtable/mod.rs` is intentionally deferred from PR `#21` to a separate issue (`#30`). Do not flag the existing `Mutex<IntervalTree>` usage in `src/memtable/mod.rs` as a concurrency concern in the context of PR `#21`; it is tracked in issue `#30`.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 21
File: src/table/multi_writer.rs:129-206
Timestamp: 2026-03-20T11:14:02.461Z
Learning: In `structured-world/lsm-tree`, the `write_rts_to_writer` function in `src/table/multi_writer.rs` has a known limitation when `clip_range_tombstones=true` (compaction mode): RT fragments that don't intersect any output writer's KV span are silently dropped, which can lose delete-only gaps (e.g. an RT covering `[m,p)` while all output tables cover `[a,l]` and `[q,z]`). This limitation is intentionally deferred to issue `#32` and is documented as a Known Limitation in PR `#21`. Do not flag this as a blocking concern in PR `#21` or any PR until `#32` is addressed.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-20T08:17:39.835Z
Learning: In `structured-world/lsm-tree`, the `Memtable` range-tombstone storage uses `Mutex<IntervalTree>` (introduced in PR `#21`). Switching to `RwLock<IntervalTree>` or a freeze-on-seal immutable structure is a deliberate future improvement tracked in a separate issue — do not flag the `Mutex` usage as a blocking concern in PR `#21` or future PRs that reference this design.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-20T08:17:39.835Z
Learning: In `structured-world/lsm-tree`, the `Memtable` range-tombstone storage uses `Mutex<IntervalTree>` (introduced in PR `#21`). Switching to `RwLock<IntervalTree>` or a freeze-on-seal immutable structure is a deliberate future improvement tracked in issue `#31` — do not flag the `Mutex` usage as a blocking concern in PR `#21` or future PRs that reference this design.

Learnt from: CR
Repo: structured-world/lsm-tree PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-03-20T22:03:44.692Z
Learning: Applies to src/range_tombstone_filter.rs : `src/range_tombstone_filter.rs` — MVCC-aware range tombstone filtering for iterators

Learnt from: CR
Repo: structured-world/lsm-tree PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-03-20T22:03:44.692Z
Learning: Applies to src/active_tombstone_set.rs : `src/active_tombstone_set.rs` — Tracks active range tombstones during compaction

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 21
File: src/table/writer/mod.rs:437-457
Timestamp: 2026-03-20T12:29:43.719Z
Learning: In `structured-world/lsm-tree` (`src/table/writer/mod.rs`), the synthetic weak-tombstone sentinel written for RT-only SSTs (when `item_count == 0` but `range_tombstones` is non-empty) intentionally increments `meta.tombstone_count`, `meta.weak_tombstone_count`, `meta.item_count`, and `meta.key_count`. All on-disk metadata counters must match actual block contents for recovery validation consistency (established in commit 7452427a for item_count/key_count; same reasoning applies to tombstone counters). The resulting +1 inflation on GC heuristics is harmless — it overestimates reclaimable space and triggers slightly earlier checks. Do not flag this as a metadata accuracy bug in PR `#21` or future PRs.

Learnt from: CR
Repo: structured-world/lsm-tree PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-03-20T22:03:44.692Z
Learning: Applies to src/range_tombstone.rs : `src/range_tombstone.rs` — Range tombstone data model and serialization

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 21
File: src/tree/mod.rs:747-749
Timestamp: 2026-03-20T20:05:24.184Z
Learning: In `structured-world/lsm-tree` (`src/tree/mod.rs`), `get_internal_entry_from_version` searches newest-first (active memtable → sealed memtables → SST tables). Each source's `get(key, seqno)` already returns the most-recent entry visible at that seqno within that source. Therefore, a point tombstone returned from the active memtable IS the authoritative latest version for that key — returning `Ok(None)` immediately is correct and does NOT hide a higher-seqno value from a lower source. Do not flag this early-return pattern as hiding newer entries.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 21
File: src/table/writer/mod.rs:394-463
Timestamp: 2026-03-20T20:05:18.309Z
Learning: In `structured-world/lsm-tree` (`src/table/writer/mod.rs`), the `key#min`/`key#max` widening for RT bounds in `Writer::finish()` is intentionally limited to the `item_count == 0` (RT-only) branch. For mixed KV+RT tables, `MultiWriter::write_rts_to_writer` (lines 167–199 of `src/table/multi_writer.rs`) already widens `meta.first_key`/`meta.last_key` conservatively in flush mode (`clip=false`). In compaction mode (`clip=true`), RTs are intersected with the KV key range so no widening is needed. Do not flag the absence of unconditional widening in `Writer::finish()` as a bug in PR `#21` or future PRs.

✅ Actions performed

Full review triggered.

Copilot

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated no new comments.

…ngeprefix-itera

Copilot

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

… assertions Both forward_per_source_cutoff_mixed and reverse_per_source_cutoff_mixed only checked max_active_seqno() which was ambiguous (same value whether 1 or 2 RTs activated). Now expire past the visible RT's boundary and assert the set becomes empty, proving the invisible RT was never activated.
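
Why the expiry assertion is stronger can be seen in a toy model. The sketch below is a hypothetical forward-only active set, not the crate's `ActiveTombstoneSet`: the internal representation, the `expire_until` name, and the strict `<` visibility rule are assumptions for illustration.

```rust
type SeqNo = u64;

/// Simplified stand-in for the crate's range tombstone (fields are assumptions).
#[derive(Clone, Debug)]
struct RangeTombstone {
    start: Vec<u8>, // inclusive
    end: Vec<u8>,   // exclusive
    seqno: SeqNo,
}

impl RangeTombstone {
    /// Assumed rule: visible when the RT's seqno is strictly below the cutoff.
    fn visible_at(&self, cutoff: SeqNo) -> bool {
        self.seqno < cutoff
    }
}

/// Hypothetical forward active set holding (end key, seqno) of activated RTs.
#[derive(Default)]
struct ActiveTombstoneSet {
    active: Vec<(Vec<u8>, SeqNo)>,
}

impl ActiveTombstoneSet {
    /// Activates `rt` only if it is visible at its own per-source cutoff.
    fn activate(&mut self, rt: &RangeTombstone, cutoff_seqno: SeqNo) {
        if rt.visible_at(cutoff_seqno) {
            self.active.push((rt.end.clone(), rt.seqno));
        }
    }

    /// Drops every tombstone whose exclusive end is at or before `key`.
    fn expire_until(&mut self, key: &[u8]) {
        self.active.retain(|(end, _)| end.as_slice() > key);
    }

    fn max_active_seqno(&self) -> Option<SeqNo> {
        self.active.iter().map(|(_, seqno)| *seqno).max()
    }

    fn is_empty(&self) -> bool {
        self.active.is_empty()
    }
}
```

Take a visible RT over `[a, m)` at seqno 10 and an invisible RT over `[a, z)` at seqno 5. The maximum active seqno is 10 whether or not the second one was wrongly activated, but after expiring up to `m` the set is empty only if the invisible RT was never added.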

coderabbitai

🧹 Nitpick comments (1)

src/active_tombstone_set.rs (1)
349-350: Rename the new tests to include the expected behavior.

Both names capture the setup, but not the assertion being protected. Encoding the expected outcome would make failures easier to scan.

As per coding guidelines, Test naming convention: fn <what>_<condition>_<expected>().

Also applies to: 430-431
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/active_tombstone_set.rs` around lines 349 - 350, Rename the newly added
test functions to follow the project convention fn
<what>_<condition>_<expected>() so the test name encodes the assertion; for
example, change fn forward_per_source_cutoff_mixed() to a name that includes the
expected behavior (e.g.,
forward_per_source_cutoff_mixed_keeps_expected_tombstones) and apply the same
renaming pattern to the other new test(s) in this file (the other test near the
new additions) so each name clearly states what is being tested, under what
condition, and what the expected outcome is.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: adafef29-a0bb-4055-a933-5ff62449b89c

📥 Commits

Reviewing files that changed from the base of the PR and between 090e5ba and e4bb194.

📒 Files selected for processing (1)

src/active_tombstone_set.rs

…ntion

- forward_per_source_cutoff_mixed → forward_mixed_cutoffs_activates_only_visible_rt
- reverse_per_source_cutoff_mixed → reverse_mixed_cutoffs_activates_only_visible_rt
- per_source_cutoff_different_visibility → mixed_cutoffs_suppress_only_visible_source

Copilot

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

Guard (IterGuard trait) is required for .key() calls on iterator items. Add clarifying comment matching the pattern in range_tombstone.rs.

Copilot

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 6 comments.

Vec<Vec<u8>> compares correctly with vec![b"a", b"b"] via Rust's blanket PartialEq impl. Add doc comment matching range_tombstone.rs.
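
The comparison relies on the standard library's `PartialEq` impls between `Vec<T>` and `Vec<U>` and between `Vec<T>` and `&[U; N]`. A small sketch, assuming a `collect_keys` helper shaped roughly like the one in the tests (the signature here is a guess made for illustration):

```rust
/// Hypothetical helper: copies borrowed keys into owned byte vectors.
fn collect_keys<'a>(items: impl IntoIterator<Item = &'a [u8]>) -> Vec<Vec<u8>> {
    items.into_iter().map(|key| key.to_vec()).collect()
}

/// Compares owned keys against byte-string literals without any conversion.
fn keys_are_a_and_b(keys: &Vec<Vec<u8>>) -> bool {
    // `Vec<Vec<u8>> == Vec<&[u8; 1]>` compiles because `Vec<u8>: PartialEq<&[u8; 1]>`.
    *keys == vec![b"a", b"b"]
}
```

This is why an assertion such as `assert_eq!(keys, vec![b"a", b"b"])` needs no `.to_vec()` calls on the expected side.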

polaz · 2026-03-21T11:15:01Z

@coderabbitai re-review

coderabbitai · 2026-03-21T11:15:17Z

@polaz Sure, I'll kick off a full re-review of PR #39 right away!

🧠 Learnings used

Learnt from: CR
Repo: structured-world/lsm-tree PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-03-20T22:03:44.692Z
Learning: Applies to **/*.rs : Before flagging any potential issue, trace the call chain visible in the PR diff. Check the caller to see if it handles the case, verify type-level guarantees, and read adjacent comments about deliberate design decisions.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 23
File: .github/instructions/rust.instructions.md:0-0
Timestamp: 2026-03-17T00:34:54.145Z
Learning: In `structured-world/lsm-tree`, the lint suppression policy distinguishes between legacy and new code:
- Existing/upstream `#[allow(clippy::...)]` in **unchanged** code: treat as legacy — do **not** flag it in reviews.
- New code introduced in a PR **must** use `#[expect(clippy::...)]` with a justification reason.
This avoids churning upstream code while enforcing the `#[expect]` standard for all new contributions.

Learnt from: CR
Repo: structured-world/lsm-tree PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-03-20T22:03:44.692Z
Learning: Read the 'out of scope' section in the PR description before reviewing. Do not flag items listed as out of scope — they are tracked in another PR.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 12
File: src/table/block/mod.rs:109-119
Timestamp: 2026-03-15T16:10:30.764Z
Learning: In `structured-world/lsm-tree`, pre-allocation bounds checks on `uncompressed_length`, `data_length`, `handle.size()`, and `real_val_len` (256 MiB cap, returning `DecompressedSizeTooLarge`) are intentionally deferred to PR `#7` (branch `feat/#258-security-validate-uncompressedlength-before-decomp`, closes issues `#258` and `#257`). Do not flag missing size-cap guards in `Block::from_reader`, `Block::from_file`, or `blob_file::Reader::get` as unaddressed — they are handled in that separate security PR.

✅ Actions performed

Full review triggered.

Copilot

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

build_ephemeral silently discarded insert_range_tombstone return value. Now asserts size > 0 to catch rejected (start >= end) RTs. Also document defensive re-sort in RangeTombstoneFilter::new.
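
The pattern can be sketched as follows. Everything here is a hypothetical stand-in: the `Memtable` type, the `insert_range_tombstone` signature, and the convention that a return value of 0 means the tombstone was rejected are modelled on the commit message, not copied from the crate.

```rust
type SeqNo = u64;

/// Toy memtable that only stores range tombstones (illustrative, not the crate's type).
#[derive(Default)]
struct Memtable {
    range_tombstones: Vec<(Vec<u8>, Vec<u8>, SeqNo)>,
}

impl Memtable {
    /// Returns the number of bytes added, or 0 when the RT is rejected (start >= end).
    fn insert_range_tombstone(&mut self, start: &[u8], end: &[u8], seqno: SeqNo) -> u64 {
        if start >= end {
            return 0;
        }
        self.range_tombstones
            .push((start.to_vec(), end.to_vec(), seqno));
        (start.len() + end.len() + std::mem::size_of::<SeqNo>()) as u64
    }
}

/// Test helper: fails loudly instead of silently building a memtable with missing RTs.
fn build_ephemeral(rts: &[(&[u8], &[u8], SeqNo)]) -> Memtable {
    let mut memtable = Memtable::default();
    for &(start, end, seqno) in rts {
        let size = memtable.insert_range_tombstone(start, end, seqno);
        assert!(size > 0, "range tombstone rejected (start >= end)");
    }
    memtable
}
```

Without the assertion, a test that accidentally passes an empty or inverted range would still run, only against a memtable that contains no tombstone at all.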

Copilot

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

- Replace magic 999 with EPHEMERAL_MT_ID const
- Align key-range doc with actual inserts (a..d, not a..e)

Copilot

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated no new comments.

- MvccStream RT suppression uses per-source cutoffs (u64) from the new (RangeTombstone, cutoff) tuple format after upstream PR #39
- resolve_merge_buffered initializes result_seqno from entries.last() instead of using 0 as sentinel (seqno can legitimately be 0)
- key_entries_buf reuses allocation across next_back() calls
- Document RT clone rationale in range pipeline

Copilot AI review requested due to automatic review settings March 20, 2026 22:52

Copilot started reviewing on behalf of polaz March 20, 2026 22:56 View session

Copilot AI reviewed Mar 20, 2026

View reviewed changes

Comment thread src/range.rs Outdated

coderabbitai Bot reviewed Mar 20, 2026

View reviewed changes

Comment thread src/range.rs

polaz added 2 commits March 21, 2026 01:06

polaz requested a review from Copilot March 21, 2026 07:28

Copilot started reviewing on behalf of polaz March 21, 2026 07:29 View session

Copilot AI reviewed Mar 21, 2026

View reviewed changes

Comment thread src/range.rs Outdated

polaz added 2 commits March 21, 2026 09:42

Merge branch 'main' into feat/#33-feat-per-source-rt-visibility-in-ra…

8c82ac8

…ngeprefix-itera

style(range-tombstone): clarify dedup_by element survival in comment

090e5ba

dedup_by(|a, b|) passes a=later, b=earlier; b survives. The existing code correctly updates b.1. Expanded comment to prevent confusion with the counter-intuitive parameter naming.

polaz requested a review from Copilot March 21, 2026 08:19

structured-world deleted a comment from coderabbitai Bot Mar 21, 2026

Copilot started reviewing on behalf of polaz March 21, 2026 08:19 View session

Copilot AI reviewed Mar 21, 2026

View reviewed changes

Merge branch 'main' into feat/#33-feat-per-source-rt-visibility-in-ra…

886be6a

…ngeprefix-itera

polaz requested a review from Copilot March 21, 2026 08:54

Copilot started reviewing on behalf of polaz March 21, 2026 08:58 View session

Copilot AI reviewed Mar 21, 2026

View reviewed changes

Comment thread src/active_tombstone_set.rs Outdated

Comment thread src/active_tombstone_set.rs Outdated

polaz requested a review from Copilot March 21, 2026 09:30

Copilot started reviewing on behalf of polaz March 21, 2026 09:31 View session

coderabbitai Bot reviewed Mar 21, 2026

View reviewed changes

Copilot AI reviewed Mar 21, 2026

View reviewed changes

Comment thread tests/range_tombstone_ephemeral.rs

style(range-tombstone): document Guard trait import in ephemeral tests

bcc74ce

Guard (IterGuard trait) is required for .key() calls on iterator items. Add clarifying comment matching the pattern in range_tombstone.rs.

polaz requested a review from Copilot March 21, 2026 09:46

Copilot started reviewing on behalf of polaz March 21, 2026 09:51

Copilot AI reviewed Mar 21, 2026

docs(range-tombstone): document PartialEq blanket impl in collect_keys

90be0bb

Vec<Vec<u8>> compares correctly with vec![b"a", b"b"] via Rust's blanket PartialEq impl. Add doc comment matching range_tombstone.rs.
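
A small sketch of the comparison this commit message refers to. The `collect_keys_demo` and `keys_match_byte_literals` functions are hypothetical stand-ins for the test helper, not the repository's code. The comparison compiles because the standard library provides `PartialEq<Vec<U>>` for `Vec<T>` whenever `T: PartialEq<U>`, and `Vec<u8>` implements `PartialEq<&[u8; N]>`.

```rust
/// Hypothetical stand-in for a test helper that gathers owned keys
/// from an iterator (here the keys are simply hard-coded).
fn collect_keys_demo() -> Vec<Vec<u8>> {
    vec![b"a".to_vec(), b"b".to_vec()]
}

/// Compares owned keys against byte-string literals without converting them.
fn keys_match_byte_literals() -> bool {
    // `vec![b"a", b"b"]` has type `Vec<&[u8; 1]>`. The element-wise
    // comparison `Vec<u8> == &[u8; 1]` is provided by the standard library,
    // so no `.to_vec()` calls are needed on the expected side.
    collect_keys_demo() == vec![b"a", b"b"]
}
```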

polaz requested a review from Copilot March 21, 2026 11:14

Copilot AI reviewed Mar 21, 2026

Comment thread tests/range_tombstone_ephemeral.rs Outdated

Comment thread src/range_tombstone_filter.rs

Comment thread src/active_tombstone_set.rs

Comment thread src/range.rs

Copilot started reviewing on behalf of polaz March 21, 2026 11:26

fix(range-tombstone): assert RT insertion succeeds in test helper

d355514

build_ephemeral silently discarded insert_range_tombstone return value. Now asserts size > 0 to catch rejected (start >= end) RTs. Also document defensive re-sort in RangeTombstoneFilter::new.
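
A hedged sketch of the pattern this fix describes, using a toy memtable. `ToyMemtable` and `insert_rt_checked` are hypothetical, and the convention that `insert_range_tombstone` returns a size and yields 0 for a rejected tombstone is inferred from the commit message rather than taken from the crate's documented API.

```rust
/// Toy memtable that stores range tombstones as (start, end, seqno) tuples.
/// Hypothetical type, for illustration only.
#[derive(Default)]
struct ToyMemtable {
    range_tombstones: Vec<(Vec<u8>, Vec<u8>, u64)>,
}

impl ToyMemtable {
    /// Returns the approximate number of bytes added, or 0 when the
    /// tombstone is rejected because `start >= end` (assumed convention).
    fn insert_range_tombstone(&mut self, start: &[u8], end: &[u8], seqno: u64) -> u64 {
        if start >= end {
            return 0;
        }
        self.range_tombstones
            .push((start.to_vec(), end.to_vec(), seqno));
        (start.len() + end.len() + std::mem::size_of::<u64>()) as u64
    }
}

/// Test-helper style wrapper: fails loudly instead of silently dropping
/// a rejected tombstone.
fn insert_rt_checked(mt: &mut ToyMemtable, start: &[u8], end: &[u8], seqno: u64) {
    let size = mt.insert_range_tombstone(start, end, seqno);
    assert!(size > 0, "range tombstone rejected: start must be < end");
}
```

Without the assertion, a test that accidentally builds an inverted range would run against a memtable with no tombstone and could pass for the wrong reason.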

polaz requested a review from Copilot March 21, 2026 11:43

Copilot AI reviewed Mar 21, 2026

Comment thread tests/range_tombstone_ephemeral.rs Outdated

Comment thread tests/range_tombstone_ephemeral.rs Outdated

Copilot started reviewing on behalf of polaz March 21, 2026 11:57

style(range-tombstone): use named const for ephemeral memtable ID

b006d49

- Replace magic 999 with EPHEMERAL_MT_ID const
- Align key-range doc with actual inserts (a..d, not a..e)

polaz requested a review from Copilot March 21, 2026 17:01

Copilot started reviewing on behalf of polaz March 21, 2026 17:02

Copilot AI reviewed Mar 21, 2026

polaz merged commit c03a443 into main Mar 21, 2026
31 checks passed

coderabbitai Bot mentioned this pull request Mar 21, 2026

feat: prefix bloom filters for graph key encoding #43

Merged

4 tasks

This was referenced Mar 21, 2026

feat: merge operators for commutative LSM operations #28

Merged

perf: optimize range tombstone lookup in table-skip and point-read #55

Merged

coderabbitai Bot mentioned this pull request Mar 22, 2026

feat(testing): db_bench suite + property-based model tests #45

Merged

5 tasks

This was referenced Mar 22, 2026

feat: extend prefix bloom skipping to multi-table runs #64

Merged

test(encryption): extract tempfile helper + mixed-load stress test #128

Open

perf(bench): add multi-threaded support to all db_bench workloads #136

Closed

polaz mentioned this pull request Apr 8, 2026

feat(compression): enable dictionary compression in pure Rust backend #229

Merged

9 tasks
