fix: thread UserComparator through Run, KeyRange, and Version methods by polaz · Pull Request #117 · structured-world/coordinode-lsm-tree

polaz · 2026-03-22T22:33:34Z

Summary

Extends comparator-aware coverage (#98 core fix landed in #100) to remaining code paths, plus fixes #122.

Leveled compaction choose() — all overlap detection, key range aggregation, trivial move decisions now use comparator
pick_minimal_compaction multi-run aware (fix: multi-level compaction — relax disjoint assert + merge input ranges optimization #122) — accepts &Level instead of &Run, scans all runs for overlap/containment. Eliminates missed tables in transient multi-run levels from multi-level compaction (feat(compaction): compute L2 overlaps per-range in multi-level path #108)
RunReader::new_cmp — comparator-aware table selection for range scans (create_range + create_range_point)
OwnedBounds::contains — comparator-aware containment for drop_range strategy
get_contained_cmp — comparator-aware table containment in runs
Level::aggregate_key_range_cmp + KeyRange::aggregate_cmp + KeyRange::contains_range_cmp — cross-run aggregation with comparator
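
To make the containment and aggregation items above concrete, here is a minimal sketch of what comparator-aware key-range helpers could look like. It is not the crate's code: the `UserComparator` trait shape, the `KeyRange` fields, and both comparator structs are simplified stand-ins assumed for illustration; only the method names `contains_range_cmp` and `aggregate_cmp` come from this PR.

```rust
use std::cmp::Ordering;

/// Hypothetical stand-in for the crate's `UserComparator` trait.
pub trait UserComparator {
    fn compare(&self, a: &[u8], b: &[u8]) -> Ordering;
}

/// Plain bytewise (lexicographic) order.
pub struct BytewiseComparator;
impl UserComparator for BytewiseComparator {
    fn compare(&self, a: &[u8], b: &[u8]) -> Ordering {
        a.cmp(b)
    }
}

/// Reversed bytewise order, in the spirit of the tests' `ReverseComparator`.
pub struct ReverseComparator;
impl UserComparator for ReverseComparator {
    fn compare(&self, a: &[u8], b: &[u8]) -> Ordering {
        b.cmp(a)
    }
}

/// Simplified key range; `min` and `max` are inclusive and ordered per the comparator.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct KeyRange {
    pub min: Vec<u8>,
    pub max: Vec<u8>,
}

impl KeyRange {
    pub fn new(min: &[u8], max: &[u8]) -> Self {
        Self { min: min.to_vec(), max: max.to_vec() }
    }

    /// True if `other` lies fully inside `self` in comparator order.
    pub fn contains_range_cmp(&self, other: &KeyRange, cmp: &dyn UserComparator) -> bool {
        cmp.compare(&self.min, &other.min) != Ordering::Greater
            && cmp.compare(&self.max, &other.max) != Ordering::Less
    }

    /// Smallest range covering all inputs in comparator order; `None` for no inputs.
    pub fn aggregate_cmp<'a>(
        ranges: impl IntoIterator<Item = &'a KeyRange>,
        cmp: &dyn UserComparator,
    ) -> Option<KeyRange> {
        let mut iter = ranges.into_iter();
        let mut agg = iter.next()?.clone();
        for r in iter {
            if cmp.compare(&r.min, &agg.min) == Ordering::Less {
                agg.min = r.min.clone();
            }
            if cmp.compare(&r.max, &agg.max) == Ordering::Greater {
                agg.max = r.max.clone();
            }
        }
        Some(agg)
    }
}
```

Under the reversed order, a range such as `z..a` contains `m..k`, while a bytewise check on the same bytes reports no containment. That mismatch is the class of bug this PR targets.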

What #100 covered vs what this PR adds

Area	#100	This PR
`Run::push_cmp`, `get_overlapping_cmp`, `range_overlap_indexes_cmp`	Done	—
`optimize_runs` + `Version::with_*` comparator threading	Done	—
Leveled `choose()` comparator threading	—	Done
`pick_minimal_compaction` multi-run aware (#122)	—	Done
`RunReader::new_cmp` for range scans	—	Done
`OwnedBounds::contains` with comparator	—	Done
`get_contained_cmp`, `contains_range_cmp`, `aggregate_cmp`	—	Done
`Level::aggregate_key_range_cmp`	—	Done
`RunReader::new` public API preservation	—	Done
`trim_slice` deduplication	—	Done

Test Plan

4 regression tests with ReverseComparator (compaction, leveled, merge operator, tombstone)
Unit test for get_contained_cmp with reverse comparator
All 17 custom_comparator tests pass, plus 17 custom_comparator_compaction tests (2 ignored; see #116, "fix: thread UserComparator through ingestion guards + RunReader")
cargo check + cargo clippy --lib clean

Closes #122

Walkthrough

Threads a UserComparator through key-range, run, reader, iteration, and compaction codepaths; adds comparator-aware APIs (_cmp) across KeyRange, Run, RunReader, Level/Version, and compaction Strategy::choose; refactors leveled picker to consider all runs in a level; adds regression tests for a ReverseComparator.

Changes

Cohort / File(s)	Summary
KeyRange APIs `src/key_range.rs`	Add comparator-aware methods: `contains_range_cmp` and `aggregate_cmp` for containment and extrema using `UserComparator`.
Run / Containment `src/version/run.rs`	Extract `trim_slice` helper; add `Run::get_contained_cmp(...)` using comparator-aware overlap/containment; extend tests (new reverse-comparator cases).
Level / Version `src/version/mod.rs`	Add `Level::aggregate_key_range_cmp(...)` to compute per-level key-range extrema with a comparator.
RunReader & Iteration `src/run_reader.rs`, `src/range.rs`	Introduce `RunReader::new_cmp(...)`; make `RunReader::new` forward to `new_cmp` (default comparator); update `TreeIter::create_range[_point]` to construct readers with the active comparator.
Compaction: drop_range `src/compaction/drop_range.rs`	`OwnedBounds::contains` now takes `cmp`; `Strategy::choose` uses `config` to obtain `cmp` and switches to comparator-aware overlap/index APIs.
Compaction: leveled picker `src/compaction/leveled/mod.rs`	Refactor `pick_minimal_compaction` to operate on entire `Level` references, thread `cmp` throughout, use comparator-aware overlap/containment APIs, and remove strict disjoint debug assertions.
Tests `tests/custom_comparator.rs`, `tests/custom_comparator_compaction.rs`	Add comprehensive regression tests for ReverseComparator behavior across reopen, compaction, merge, and tombstone propagation; adjust two test ignore messages.
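
The `RunReader::new` to `RunReader::new_cmp` forwarding described in the table can be sketched roughly as follows. This is an illustration under assumptions, not the crate's implementation: the `RunReader` here is a toy that only records which table indexes a scan would visit, tables are modeled as `(min, max)` byte pairs, and the trait shape is assumed.

```rust
use std::cmp::Ordering;

/// Hypothetical stand-in for the crate's `UserComparator` trait.
pub trait UserComparator {
    fn compare(&self, a: &[u8], b: &[u8]) -> Ordering;
}

/// Bytewise order, standing in for the default comparator.
pub struct DefaultUserComparator;
impl UserComparator for DefaultUserComparator {
    fn compare(&self, a: &[u8], b: &[u8]) -> Ordering {
        a.cmp(b)
    }
}

/// Reversed bytewise order.
pub struct ReverseComparator;
impl UserComparator for ReverseComparator {
    fn compare(&self, a: &[u8], b: &[u8]) -> Ordering {
        b.cmp(a)
    }
}

/// Toy reader: remembers which tables overlap the inclusive scan range [lo, hi].
pub struct RunReader {
    pub table_indexes: Vec<usize>,
}

impl RunReader {
    /// Existing entry point keeps its signature and forwards with the default order.
    pub fn new(tables: &[(Vec<u8>, Vec<u8>)], lo: &[u8], hi: &[u8]) -> Self {
        Self::new_cmp(tables, lo, hi, &DefaultUserComparator)
    }

    /// Comparator-aware variant: all overlap decisions go through `cmp`.
    pub fn new_cmp(
        tables: &[(Vec<u8>, Vec<u8>)],
        lo: &[u8],
        hi: &[u8],
        cmp: &dyn UserComparator,
    ) -> Self {
        let table_indexes = tables
            .iter()
            .enumerate()
            .filter(|(_, (min, max))| {
                cmp.compare(min, hi) != Ordering::Greater
                    && cmp.compare(max, lo) != Ordering::Less
            })
            .map(|(i, _)| i)
            .collect();
        Self { table_indexes }
    }
}
```

Keeping `new` as a thin wrapper means existing callers compile unchanged, while callers that hold a configured comparator switch to `new_cmp`.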

Sequence Diagram(s)

sequenceDiagram
  participant Client as "Strategy::choose"
  participant Version as "Version / Level / Run"
  participant RunReader as "RunReader::new_cmp"
  participant KeyRange as "KeyRange::*_cmp"

  Client->>Version: Level::aggregate_key_range_cmp(cmp)
  Client->>Version: run.range_overlap_indexes_cmp(bounds, cmp)
  Client->>RunReader: RunReader::new_cmp(run, range, cmp)
  RunReader->>Version: run.range_overlap_indexes_cmp(range, cmp)
  RunReader->>KeyRange: key_range.contains_range_cmp(..., cmp)
  KeyRange-->>RunReader: containment result
  RunReader-->>Client: candidate readers / tables
  Client-->>Client: build table_ids, apply hidden-table checks, return Choice

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related issues

perf(compaction): pick_minimal_compaction skips second run in multi-run levels #132 — Refactors picker to be multi-run aware; this PR updates pick_minimal_compaction to scan whole Level(s).
fix: thread UserComparator through ingestion guards + RunReader #116 — Threads UserComparator into RunReader and overlap lookups; this PR adds RunReader::new_cmp and comparator-aware overlap/containment APIs.
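
The multi-run point can be illustrated with a small sketch: a picker that inspects only one run of a level misses overlapping tables in the other runs. The `Table`, `Run`, and `Level` types and the `overlapping_table_ids` helper below are hypothetical simplifications, not the crate's types, and the linear scan is chosen for clarity rather than speed.

```rust
use std::cmp::Ordering;

/// Hypothetical stand-in for the crate's `UserComparator` trait.
pub trait UserComparator {
    fn compare(&self, a: &[u8], b: &[u8]) -> Ordering;
}

pub struct BytewiseComparator;
impl UserComparator for BytewiseComparator {
    fn compare(&self, a: &[u8], b: &[u8]) -> Ordering {
        a.cmp(b)
    }
}

/// Simplified table metadata with an inclusive key range.
pub struct Table {
    pub id: u64,
    pub min: Vec<u8>,
    pub max: Vec<u8>,
}

/// A run is a set of tables; a level may transiently hold several runs.
pub struct Run(pub Vec<Table>);
pub struct Level(pub Vec<Run>);

/// Collects ids of tables overlapping [lo, hi] across every run of the level.
pub fn overlapping_table_ids(
    level: &Level,
    lo: &[u8],
    hi: &[u8],
    cmp: &dyn UserComparator,
) -> Vec<u64> {
    level
        .0
        .iter()
        .flat_map(|run| run.0.iter())
        .filter(|t| {
            cmp.compare(&t.min, hi) != Ordering::Greater
                && cmp.compare(&t.max, lo) != Ordering::Less
        })
        .map(|t| t.id)
        .collect()
}
```

With two runs whose tables both cover part of `b..c`, the helper returns a table from each run. A version that looked only at the first run would return just one of them.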

Possibly related PRs

feat: custom key comparison / comparator #67 — Wires UserComparator into compaction and key-range APIs; closely related to the _cmp plumbing added here.
test+fix: integration tests for compaction/merge with custom comparator #100 — Adds comparator-aware plumbing across compaction/run/version APIs; strong code-level overlap.
fix(test): use shared seqno counter in proptest oracle #97 — Modifies RunReader construction and its use in src/range.rs; related to the RunReader::new_cmp changes in this PR.

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately summarizes the primary change: threading the UserComparator through Run, KeyRange, and Version methods across multiple modules.
Description check	✅ Passed	The description comprehensively explains the changes made across modules, relates them to previous PRs (`#100`), and provides clear context for the work done.
Linked Issues check	✅ Passed	The PR successfully addresses all coding requirements from `#122`: Part 1 (relaxed debug assertions), Part 3 (multi-run aware pick_minimal_compaction accepting &Level), and comprehensive comparator threading throughout the codebase.
Out of Scope Changes check	✅ Passed	All changes are directly scoped to threading UserComparator through the specified modules and improving multi-level compaction; no unrelated modifications were introduced.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

coderabbitai

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

src/compaction/drop_range.rs (1)
82-86: ⚠️ Potential issue | 🟠 Major

Use comparator-aware containment check to match overlap logic.

Line 82 correctly uses range_overlap_indexes_cmp_bounds(&self.bounds, cmp), but line 86 filters with self.bounds.contains(x.key_range()), which relies on lexicographic byte comparison only. After getting tables that overlap in comparator order, filtering them by lexicographic containment breaks correctness for custom comparators like ReverseComparator.

The codebase has established the pattern (see src/version/run.rs:233): after range_overlap_indexes_cmp_bounds(), follow with contains_range_cmp() on the KeyRange. Add a contains_cmp method to OwnedBounds that accepts the comparator:
pub fn contains_cmp(&self, range: &KeyRange, cmp: &dyn crate::comparator::UserComparator) -> bool {
    // Implement using cmp.compare() similar to KeyRange::contains_range_cmp
}
Then update line 86 to: .filter(|x| self.bounds.contains_cmp(x.key_range(), cmp))
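
One way the suggested stub could be filled in is sketched below. It assumes `OwnedBounds` holds a pair of `std::ops::Bound<Vec<u8>>` values and, to stay self-contained, takes the table's inclusive `min` and `max` keys instead of a `&KeyRange`. Both choices, along with the trait shape, are assumptions for illustration rather than the crate's actual definitions.

```rust
use std::cmp::Ordering;
use std::ops::Bound;

/// Hypothetical stand-in for the crate's `UserComparator` trait.
pub trait UserComparator {
    fn compare(&self, a: &[u8], b: &[u8]) -> Ordering;
}

pub struct ReverseComparator;
impl UserComparator for ReverseComparator {
    fn compare(&self, a: &[u8], b: &[u8]) -> Ordering {
        b.cmp(a)
    }
}

/// Assumed layout: owned start and end bounds of the range to drop.
pub struct OwnedBounds {
    pub start: Bound<Vec<u8>>,
    pub end: Bound<Vec<u8>>,
}

impl OwnedBounds {
    /// True if the inclusive range [min, max] lies fully inside the bounds,
    /// with every comparison routed through `cmp`.
    pub fn contains_cmp(&self, min: &[u8], max: &[u8], cmp: &dyn UserComparator) -> bool {
        let lower_ok = match &self.start {
            Bound::Unbounded => true,
            Bound::Included(s) => cmp.compare(s, min) != Ordering::Greater,
            Bound::Excluded(s) => cmp.compare(s, min) == Ordering::Less,
        };
        let upper_ok = match &self.end {
            Bound::Unbounded => true,
            Bound::Included(e) => cmp.compare(max, e) != Ordering::Greater,
            Bound::Excluded(e) => cmp.compare(max, e) == Ordering::Less,
        };
        lower_ok && upper_ok
    }
}
```

Because the overlap query and the containment filter now share one comparator, a table selected in comparator order cannot be rejected, or wrongly accepted, by a bytewise check.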
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/compaction/drop_range.rs` around lines 82 - 86, The filter uses
lexicographic contains but the overlap was computed with a comparator; add a
comparator-aware containment to OwnedBounds and use it: implement
OwnedBounds::contains_cmp(&self, range: &KeyRange, cmp: &dyn
crate::comparator::UserComparator) (mirroring KeyRange::contains_range_cmp and
using cmp.compare()), then replace the current .filter(|x|
self.bounds.contains(x.key_range())) with .filter(|x|
self.bounds.contains_cmp(x.key_range(), cmp)) so containment uses the same
comparator as range_overlap_indexes_cmp_bounds.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/blob_tree/ingest.rs`:
- Around line 254-259: The BlobIngestion write guards currently check ordering
with native byte Ord (e.g., `key > *prev`) which breaks custom comparators;
update all ordering checks inside BlobIngestion (the guards at the spots
referenced around Line 76, 116, 137) to use the index comparator instead (access
`index.config.comparator.as_ref()` or accept the comparator where BlobIngestion
is constructed) and replace the `>`/`<`/`==` checks with comparator-based
comparisons (e.g., call the comparator's compare/compare_bytes method and test
for Ordering::Greater/Equal/etc.) so validation uses the configured comparator
semantics.

In `@src/tree/ingest.rs`:
- Around line 325-330: The guarded write-path comparisons currently use plain
lexicographic operators like `key > *prev`, which is incorrect for custom
comparators; update each guard (the checks at the sites referenced near the
`with_new_l0_run` usage where `prev` and `key` are compared) to use the
configured comparator from `self.tree.config.comparator` instead of `>`: call
the comparator (via its `as_ref()` or its compare method) to compare `prev` and
`key` and interpret the returned Ordering to enforce monotonicity consistent
with the comparator (replace `key > *prev` semantics with a comparator-based
ordering test). Ensure this change is applied at all four guard sites so
ingestion ordering matches `with_new_l0_run`’s comparator.
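
A comparator-based monotonicity guard of the kind both comments ask for might look like the sketch below. `OrderGuard` is a hypothetical helper, not a type from the crate, and the trait shape and error type are assumptions. The idea is only that `key > *prev` is replaced by a check on the `Ordering` returned by the configured comparator.

```rust
use std::cmp::Ordering;

/// Hypothetical stand-in for the crate's `UserComparator` trait.
pub trait UserComparator {
    fn compare(&self, a: &[u8], b: &[u8]) -> Ordering;
}

pub struct ReverseComparator;
impl UserComparator for ReverseComparator {
    fn compare(&self, a: &[u8], b: &[u8]) -> Ordering {
        b.cmp(a)
    }
}

/// Hypothetical guard that enforces strictly ascending keys in comparator order.
pub struct OrderGuard<'a> {
    cmp: &'a dyn UserComparator,
    prev: Option<Vec<u8>>,
}

impl<'a> OrderGuard<'a> {
    pub fn new(cmp: &'a dyn UserComparator) -> Self {
        Self { cmp, prev: None }
    }

    /// Accepts `key` only if it sorts strictly after the previously accepted key.
    /// A rejected key leaves the guard's state unchanged.
    pub fn check(&mut self, key: &[u8]) -> Result<(), String> {
        if let Some(prev) = &self.prev {
            if self.cmp.compare(key, prev) != Ordering::Greater {
                return Err("keys must be strictly ascending in comparator order".to_string());
            }
        }
        self.prev = Some(key.to_vec());
        Ok(())
    }
}
```

With a reversed comparator, writing `c` then `b` passes, whereas a bytewise `key > *prev` guard would reject that sequence even though it is correctly ordered for the tree.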

ℹ️ Review info

📥 Commits

Reviewing files that changed from the base of the PR and between 7c3fa37 and 54f896c.

📒 Files selected for processing (14)

src/blob_tree/ingest.rs
src/compaction/drop_range.rs
src/compaction/flavour.rs
src/compaction/leveled/mod.rs
src/compaction/worker.rs
src/key_range.rs
src/range.rs
src/run_reader.rs
src/tree/ingest.rs
src/tree/mod.rs
src/version/mod.rs
src/version/optimize.rs
src/version/run.rs
tests/custom_comparator.rs

Copilot

Pull request overview

This PR fixes incorrect table ordering and range selection when a non-lexicographic UserComparator is configured by introducing comparator-aware variants across Run/KeyRange/Level/RunReader and threading the comparator through Version mutation and compaction/range-scan call sites.

Changes:

Add _cmp variants for sorting/searching/overlap checks that use UserComparator instead of bytewise lexicographic order.
Thread SharedComparator through optimize_runs, Version::with_* methods, compaction strategies, and range scan construction (RunReader::new_cmp).
Add regression tests using ReverseComparator to cover compaction, leveled compaction, merges, and tombstones.
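
As a rough illustration of a comparator-aware overlap search, the sketch below finds the inclusive index span of tables overlapping a query range with two binary searches. It assumes the run's tables are disjoint and already sorted in comparator order, models them as `(min, max)` byte pairs, and reuses the name `range_overlap_indexes_cmp` only for readability. The real method's signature and internals may differ.

```rust
use std::cmp::Ordering;

/// Hypothetical stand-in for the crate's `UserComparator` trait.
pub trait UserComparator {
    fn compare(&self, a: &[u8], b: &[u8]) -> Ordering;
}

pub struct ReverseComparator;
impl UserComparator for ReverseComparator {
    fn compare(&self, a: &[u8], b: &[u8]) -> Ordering {
        b.cmp(a)
    }
}

/// Returns the inclusive (first, last) indexes of tables overlapping [lo, hi].
/// Assumes `tables` is disjoint and sorted by `cmp`; returns `None` if nothing overlaps.
pub fn range_overlap_indexes_cmp(
    tables: &[(Vec<u8>, Vec<u8>)],
    lo: &[u8],
    hi: &[u8],
    cmp: &dyn UserComparator,
) -> Option<(usize, usize)> {
    // Index of the first table whose max key is not before `lo`.
    let start = tables.partition_point(|(_, max)| cmp.compare(max, lo) == Ordering::Less);
    // Index of the first table whose min key is after `hi`.
    let end = tables.partition_point(|(min, _)| cmp.compare(min, hi) != Ordering::Greater);
    if start < end {
        Some((start, end - 1))
    } else {
        None
    }
}
```

Both predicates are monotone only when the search uses the same order the tables were sorted in. Running a bytewise search over a run sorted by a reversed comparator breaks that precondition, which is why the sort and the search need to share a comparator.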

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 2 comments.

Show a summary per file

File	Description
tests/custom_comparator.rs	Adds regression tests ensuring iteration/compaction behavior respects custom comparator ordering.
src/version/run.rs	Introduces comparator-aware run insertion and overlap/containment index selection helpers.
src/version/optimize.rs	Makes run optimization comparator-aware and updates unit tests accordingly.
src/version/mod.rs	Threads comparator into Version mutations and adds `Level::aggregate_key_range_cmp`.
src/tree/mod.rs	Passes configured comparator into Version updates during runtime operations.
src/tree/ingest.rs	Passes comparator into ingestion path when creating new L0 runs.
src/run_reader.rs	Adds `RunReader::new_cmp` to cull tables for range scans using comparator-aware overlap logic.
src/range.rs	Updates range iteration to use comparator-aware `RunReader` construction.
src/key_range.rs	Adds comparator-aware key-range operations (contains/overlaps/aggregate).
src/compaction/worker.rs	Threads comparator into `with_moved` and `with_dropped` version updates.
src/compaction/leveled/mod.rs	Uses comparator-aware range aggregation and overlap detection in leveled strategy decisions.
src/compaction/flavour.rs	Threads comparator into version-building during compaction application.
src/compaction/drop_range.rs	Uses comparator-aware overlap selection for dropping ranges.
src/blob_tree/ingest.rs	Passes comparator into blob ingestion path when creating new L0 runs.

polaz · 2026-03-22T22:56:05Z

@coderabbitai re-review

coderabbitai · 2026-03-22T22:56:24Z

@polaz: Sure! I'll re-review the changes in PR #117 now.

🧠 Learnings used

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 67
File: src/memtable/mod.rs:72-77
Timestamp: 2026-03-22T12:26:47.795Z
Learning: In `structured-world/lsm-tree` (`src/memtable/interval_tree.rs` and `src/memtable/mod.rs`), the memtable `IntervalTree` used for range tombstone containment queries (`is_key_suppressed_by_range_tombstone`, `query_suppression`, `insert_range_tombstone`) still uses lexicographic `Ord` on `UserKey` rather than the pluggable `SharedComparator` introduced in PR `#67`. Threading the comparator into the AVL interval tree is a non-trivial refactor intentionally deferred from PR `#67` and tracked in issue `#81`. Do not flag the absence of comparator-aware RT suppression in the memtable interval tree as a blocking concern in PR `#67` or future PRs until issue `#81` is addressed.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 100
File: tests/custom_comparator_compaction.rs:421-447
Timestamp: 2026-03-22T20:07:10.672Z
Learning: In `structured-world/lsm-tree` (`tests/custom_comparator_compaction.rs`), the two range-scan integration tests (`reverse_comparator_range_scan_after_compaction` and the second range-scan test) are intentionally `#[ignore]`d in PR `#100` because `RunReader::new()` still uses lexicographic `range_overlap_indexes` rather than the comparator-aware variant for on-disk range reads. Threading the `SharedComparator` through `RunReader` is a known deferred limitation documented in the PR body and tracked as a follow-up. Do not flag the ignored range-scan tests or the absence of comparator-aware `RunReader` overlap logic as a blocking concern in PR `#100` or future PRs until that follow-up issue is addressed.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 117
File: src/tree/ingest.rs:325-330
Timestamp: 2026-03-22T22:44:14.193Z
Learning: In `structured-world/lsm-tree` (`src/tree/ingest.rs`), the write-path monotonicity guards in `Ingestion::write`, `write_tombstone`, `write_weak_tombstone`, and `write_indirection` use plain lexicographic `key > *prev` comparisons instead of the configured `UserComparator`. This is a pre-existing issue acknowledged as valid but intentionally out of scope for PR `#98` (issue `#98` covers `Run::push()` and related methods). It is tracked in a separate follow-up issue. Do not flag the lexicographic ordering guards in `src/tree/ingest.rs` ingestion write paths as a concern in PR `#98` or future PRs until that separate issue is addressed.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 67
File: src/comparator.rs:7-33
Timestamp: 2026-03-22T11:14:39.924Z
Learning: In `structured-world/lsm-tree` (`src/comparator.rs`), the `UserComparator` trait introduced in PR `#67` intentionally has no stable persisted identity (no `name()` method). Comparator identity persistence and mismatch detection on tree reopen is deferred to issue `#74`. The current design follows RocksDB's approach of placing responsibility on the caller (documented in `src/comparator.rs` lines 31–33 and in `Config::comparator` doc). Do not flag the absence of `UserComparator::name()` or comparator identity persistence as a blocking concern in PR `#67` or future PRs until issue `#74` is addressed.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 101
File: src/tree/mod.rs:1370-1381
Timestamp: 2026-03-22T20:30:38.753Z
Learning: In `structured-world/lsm-tree`, the `UserComparator` trait (and all custom-comparator support) was introduced in PR `#67` within the same release cycle as the comparator-name persistence added in PR `#101`. All trees created before PR `#67` used only `DefaultUserComparator` (whose `name()` returns `"default"`). Manifests without a `comparator_name` section default to `"default"` in `Manifest::decode_from` (src/manifest.rs). Therefore, there are no legacy trees with a custom comparator and no upgrade path is needed — do not flag the missing-section fallback to `"default"` as a legacy-compatibility risk in PR `#101` or future PRs.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 101
File: src/tree/mod.rs:1370-1381
Timestamp: 2026-03-22T20:38:37.516Z
Learning: In `structured-world/lsm-tree`, the `UserComparator` trait was introduced in PR `#67` in the same release cycle as comparator-name persistence (PR `#101`, closes issue `#74`). All trees created before PR `#101` used `DefaultUserComparator` (no custom-comparator tree could have existed without the trait). Therefore, the manifest fallback of `"default"` for a missing `comparator_name` section is safe: it matches every pre-existing tree exactly, and no legacy custom-comparator trees exist that would be incorrectly rejected on reopen. Do not flag the missing-section → `"default"` fallback as an unsafe upgrade path.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-22T09:55:39.749Z
Learning: In `structured-world/lsm-tree` (`src/compaction/leveled/mod.rs`), the multi-level compaction path (lines ~749–759) uses a single aggregate key range across all L0+L1 inputs to query L2 overlapping tables. Using per-range L2 overlap queries (one per contiguous L1 range segment) instead would avoid pulling in L2 tables that sit in gaps between disjoint L1 ranges on sparse keyspaces. This improvement is intentionally deferred from PR `#66` to a separate follow-up issue. Do not flag the coarse aggregate L2 range query as a blocking concern in PR `#66` or future PRs until that issue is addressed.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-22T09:55:39.749Z
Learning: In `structured-world/lsm-tree` (`src/compaction/leveled/mod.rs`), the multi-level compaction path uses a single aggregate key range across all L0+L1 inputs to query L2 overlapping tables. Using per-range L2 overlap queries (one per contiguous L1 range segment) instead would avoid pulling in L2 tables that sit in gaps between disjoint L1 ranges on sparse keyspaces. This improvement is intentionally deferred from PR `#66` to issue `#72`. Do not flag the coarse aggregate L2 range query as a blocking concern in PR `#66` or future PRs until issue `#72` is addressed.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-22T09:21:07.612Z
Learning: In `structured-world/lsm-tree` (`src/range.rs`), prefix bloom filter skipping for multi-table runs (L0) has been implemented in PR `#64` (closes issue `#57`). The `_ =>` branch in `TreeIter::create_range` now applies a per-table key-range guard followed by a bloom check; 0 survivors skips the run, 1 survivor is demoted to `single_tables` for the range-tombstone table-skip optimization, and 2+ survivors are collected into a new `Run`. Range tombstones are still collected from ALL tables in the run regardless of bloom filtering. Do not flag the absence of multi-table run prefix bloom skipping in this repository — it has been implemented.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-20T08:17:06.298Z
Learning: In `structured-world/lsm-tree`, the switch from `Mutex<IntervalTree>` to `RwLock<IntervalTree>` (or a freeze-on-seal immutable structure) for memtable range-tombstone storage in `src/memtable/mod.rs` is intentionally deferred from PR `#21` to a separate issue (`#30`). Do not flag the existing `Mutex<IntervalTree>` usage in `src/memtable/mod.rs` as a concurrency concern in the context of PR `#21`; it is tracked in issue `#30`.

✅ Actions performed

Full review triggered.

Copilot

Pull request overview

Copilot reviewed 14 out of 14 changed files in this pull request and generated 3 comments.

codecov · 2026-03-22T23:04:55Z

Codecov Report

❌ Patch coverage is 90.06211% with 16 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
src/compaction/leveled/mod.rs	88.73%	8 Missing ⚠️
src/range.rs	28.57%	5 Missing ⚠️
src/version/run.rs	88.00%	3 Missing ⚠️

Copilot

Pull request overview

Copilot reviewed 14 out of 14 changed files in this pull request and generated no new comments.

polaz · 2026-03-23T00:45:33Z

@coderabbitai re-review

coderabbitai · 2026-03-23T00:45:46Z

@polaz: Sure! I'll re-review the changes in PR #117 now.

🧠 Learnings used

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-20T08:17:39.855Z
Learning: In `structured-world/lsm-tree`, the `Memtable` range-tombstone storage uses `Mutex<IntervalTree>` (introduced in PR `#21`). Switching to `RwLock<IntervalTree>` or a freeze-on-seal immutable structure is a deliberate future improvement tracked in a separate issue — do not flag the `Mutex` usage as a blocking concern in PR `#21` or future PRs that reference this design.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-20T08:17:39.855Z
Learning: In `structured-world/lsm-tree`, the `Memtable` range-tombstone storage uses `Mutex<IntervalTree>` (introduced in PR `#21`). Switching to `RwLock<IntervalTree>` or a freeze-on-seal immutable structure is a deliberate future improvement tracked in issue `#31` — do not flag the `Mutex` usage as a blocking concern in PR `#21` or future PRs that reference this design.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 108
File: src/compaction/leveled/test.rs:681-727
Timestamp: 2026-03-23T00:40:16.100Z
Learning: In `structured-world/lsm-tree` (`src/compaction/leveled/test.rs`), the regression test `multi_level_sparse_keyspace_data_integrity` (added in PR `#108` for issue `#72`) uses full-span SSTs (writes both "a" and "z" in each flush) rather than genuinely disjoint key ranges. This is a known, accepted coverage gap: at unit-test data volumes the force-trivial-move scoring (99.99) cascades all intermediate levels to Lmax, making it structurally impossible to simultaneously populate L1 and L2 in a unit test. The test verifies data integrity (read-back correctness) only, not the per-range L2 sparse-overlap optimization. Do not flag the full-span flush pattern in this test as a deficiency in PR `#108` or future PRs.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-23T00:44:56.025Z
Learning: In `structured-world/lsm-tree`, issue `#83` has been resolved in PR `#102` (`feat/#83-perf-extend-bloommaycontainkeyhash-to-support-part`). `Table::bloom_may_contain_key(key, key_hash)` was added to `src/table/mod.rs`, which seeks the partitioned filter TLI by user key and queries only the matching partition's bloom filter, returning `Ok(false)` for keys beyond all partition boundaries. `IterState` in `src/range.rs` gained `bloom_key: Option<UserKey>`; `bloom_passes()` dispatches to `bloom_may_contain_key` when `bloom_key` is set (single-key pipeline path) and falls back to `bloom_may_contain_key_hash` otherwise (prefix/range scans). The prior conservative `Ok(true)` fallback in `bloom_may_contain_key_hash` for partitioned/TLI filters is preserved for callers without a key. Do not flag the conservative fallback in `bloom_may_contain_key_hash` for partitioned/TLI filters as a blocking concern in any future PR — it is intentional for callers that lack a user key.
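
The manifest fallback described in the PR `#101` learnings above can be sketched as follows. This is a minimal illustration, not the crate's code: the `Manifest` struct, `effective_comparator_name`, and `check_comparator` are hypothetical names, and only the fallback to `"default"` for a missing `comparator_name` section comes from the thread.

```rust
/// Hypothetical stand-in for the decoded manifest; only the relevant field is modelled.
struct Manifest {
    /// `None` models a manifest written before the `comparator_name` section existed.
    comparator_name: Option<String>,
}

/// Name assumed for trees that predate the `comparator_name` section.
const DEFAULT_COMPARATOR_NAME: &str = "default";

impl Manifest {
    /// Returns the persisted comparator name, falling back to "default".
    fn effective_comparator_name(&self) -> &str {
        self.comparator_name
            .as_deref()
            .unwrap_or(DEFAULT_COMPARATOR_NAME)
    }
}

/// Rejects a reopen when the configured comparator differs from the persisted one.
fn check_comparator(manifest: &Manifest, configured: &str) -> Result<(), String> {
    let persisted = manifest.effective_comparator_name();
    if persisted == configured {
        Ok(())
    } else {
        Err(format!(
            "comparator mismatch: tree uses {persisted:?}, config supplies {configured:?}"
        ))
    }
}
```

A manifest without the section therefore reopens only with the default comparator, which is the behaviour the learnings argue is safe.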

✅ Actions performed

Full review triggered.

sw-release-bot

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'lsm-tree db_bench'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.15.

Benchmark suite	Current: `9dfceab`	Previous: `3b54ecb`	Ratio
`fillrandom`	`820414.8419444144` ops/sec	`1205121.0074657367` ops/sec	`1.47`
`readrandom`	`466762.9383897613` ops/sec	`614429.2119898967` ops/sec	`1.32`
`seekrandom`	`332287.8196148191` ops/sec	`402284.5910887458` ops/sec	`1.21`
`overwrite`	`913550.8727567153` ops/sec	`1146265.159478525` ops/sec	`1.25`
`readwhilewriting`	`388242.9573256522` ops/sec	`525608.465712753` ops/sec	`1.35`
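
The Ratio column above is consistent with previous throughput divided by current throughput, so values above 1.0 mean the new commit is slower. A small sketch of that arithmetic, assuming this is how the bot computes it (the function names are illustrative):

```rust
/// Regression ratio for a throughput benchmark where higher ops/sec is better.
/// Assumed definition: previous / current.
fn regression_ratio(previous_ops: f64, current_ops: f64) -> f64 {
    previous_ops / current_ops
}

/// Threshold quoted in the alert.
const ALERT_THRESHOLD: f64 = 1.15;

/// True when the slowdown exceeds the alert threshold.
fn exceeds_threshold(previous_ops: f64, current_ops: f64) -> bool {
    regression_ratio(previous_ops, current_ops) > ALERT_THRESHOLD
}
```

For `fillrandom`, 1205121.01 / 820414.84 is roughly 1.47, which matches the table and is above the 1.15 threshold.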

This comment was automatically generated by workflow using github-action-benchmark.

CC: @polaz

Copilot

Pull request overview

Copilot reviewed 14 out of 14 changed files in this pull request and generated no new comments.

polaz · 2026-03-23T08:56:47Z

@coderabbitai re-review

coderabbitai · 2026-03-23T08:57:05Z

@polaz: Sure! I'll re-review the changes in PR #117 now.

🧠 Learnings used

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 67
File: src/memtable/mod.rs:72-77
Timestamp: 2026-03-22T12:26:47.795Z
Learning: In `structured-world/lsm-tree` (`src/memtable/interval_tree.rs` and `src/memtable/mod.rs`), the memtable `IntervalTree` used for range tombstone containment queries (`is_key_suppressed_by_range_tombstone`, `query_suppression`, `insert_range_tombstone`) still uses lexicographic `Ord` on `UserKey` rather than the pluggable `SharedComparator` introduced in PR `#67`. Threading the comparator into the AVL interval tree is a non-trivial refactor intentionally deferred from PR `#67` and tracked in issue `#81`. Do not flag the absence of comparator-aware RT suppression in the memtable interval tree as a blocking concern in PR `#67` or future PRs until issue `#81` is addressed.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 100
File: tests/custom_comparator_compaction.rs:421-447
Timestamp: 2026-03-22T20:07:10.672Z
Learning: In `structured-world/lsm-tree` (`tests/custom_comparator_compaction.rs`), the two range-scan integration tests (`reverse_comparator_range_scan_after_compaction` and the second range-scan test) are intentionally `#[ignore]`d in PR `#100` because `RunReader::new()` still uses lexicographic `range_overlap_indexes` rather than the comparator-aware variant for on-disk range reads. Threading the `SharedComparator` through `RunReader` is a known deferred limitation documented in the PR body and tracked as a follow-up. Do not flag the ignored range-scan tests or the absence of comparator-aware `RunReader` overlap logic as a blocking concern in PR `#100` or future PRs until that follow-up issue is addressed.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 117
File: src/tree/ingest.rs:325-330
Timestamp: 2026-03-22T22:44:14.193Z
Learning: In `structured-world/lsm-tree` (`src/tree/ingest.rs`), the write-path monotonicity guards in `Ingestion::write`, `write_tombstone`, `write_weak_tombstone`, and `write_indirection` use plain lexicographic `key > *prev` comparisons instead of the configured `UserComparator`. This is a pre-existing issue acknowledged as valid but intentionally out of scope for PR `#98` (issue `#98` covers `Run::push()` and related methods). It is tracked in a separate follow-up issue. Do not flag the lexicographic ordering guards in `src/tree/ingest.rs` ingestion write paths as a concern in PR `#98` or future PRs until that separate issue is addressed.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-22T09:21:07.612Z
Learning: In `structured-world/lsm-tree` (`src/range.rs`), prefix bloom filter skipping for multi-table runs (L0) has been implemented in PR `#64` (closes issue `#57`). The `_ =>` branch in `TreeIter::create_range` now applies a per-table key-range guard followed by a bloom check; 0 survivors skips the run, 1 survivor is demoted to `single_tables` for the range-tombstone table-skip optimization, and 2+ survivors are collected into a new `Run`. Range tombstones are still collected from ALL tables in the run regardless of bloom filtering. Do not flag the absence of multi-table run prefix bloom skipping in this repository — it has been implemented.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-20T08:17:06.298Z
Learning: In `structured-world/lsm-tree`, the switch from `Mutex<IntervalTree>` to `RwLock<IntervalTree>` (or a freeze-on-seal immutable structure) for memtable range-tombstone storage in `src/memtable/mod.rs` is intentionally deferred from PR `#21` to a separate issue (`#30`). Do not flag the existing `Mutex<IntervalTree>` usage in `src/memtable/mod.rs` as a concurrency concern in the context of PR `#21`; it is tracked in issue `#30`.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-22T13:03:58.016Z
Learning: In `structured-world/lsm-tree` (`src/table/mod.rs`), `Table::bloom_may_contain_key_hash` returns `Ok(true)` (conservative fallback) for partitioned bloom filters (`pinned_filter_index.is_some()`) and filter TLI (`regions.filter_tli.is_some()`). Extending it to accept a key+seqno for partition-aware seeking is tracked in issue `#83`. Do not flag the conservative fallback in `bloom_may_contain_key_hash` for partitioned/TLI filters as a blocking concern in PR `#69` or future PRs until issue `#83` is addressed.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 108
File: src/compaction/leveled/test.rs:681-727
Timestamp: 2026-03-23T00:40:16.100Z
Learning: In `structured-world/lsm-tree` (`src/compaction/leveled/test.rs`), the regression test `multi_level_sparse_keyspace_data_integrity` (added in PR `#108` for issue `#72`) uses full-span SSTs (writes both "a" and "z" in each flush) rather than genuinely disjoint key ranges. This is a known, accepted coverage gap: at unit-test data volumes the force-trivial-move scoring (99.99) cascades all intermediate levels to Lmax, making it structurally impossible to simultaneously populate L1 and L2 in a unit test. The test verifies data integrity (read-back correctness) only, not the per-range L2 sparse-overlap optimization. Do not flag the full-span flush pattern in this test as a deficiency in PR `#108` or future PRs.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-20T08:17:39.855Z
Learning: In `structured-world/lsm-tree`, the `Memtable` range-tombstone storage uses `Mutex<IntervalTree>` (introduced in PR `#21`). Switching to `RwLock<IntervalTree>` or a freeze-on-seal immutable structure is a deliberate future improvement tracked in a separate issue — do not flag the `Mutex` usage as a blocking concern in PR `#21` or future PRs that reference this design.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-20T08:17:39.855Z
Learning: In `structured-world/lsm-tree`, the `Memtable` range-tombstone storage uses `Mutex<IntervalTree>` (introduced in PR `#21`). Switching to `RwLock<IntervalTree>` or a freeze-on-seal immutable structure is a deliberate future improvement tracked in issue `#31` — do not flag the `Mutex` usage as a blocking concern in PR `#21` or future PRs that reference this design.

✅ Actions performed

Full review triggered.

polaz · 2026-03-23T10:24:19Z

@coderabbitai re-review

pick_minimal_compaction operates on first_run() only. If a level has two runs (transient state from multi-level compaction #108), the second run would be missed. Return DoNothing and let the next compaction pass heal the multi-run state first. Replaces the relaxed debug_assert (run_count <= 2) with a runtime guard that avoids the problematic code path entirely.

Runtime DoNothing guard can stall compaction if multi-run state persists. Revert to relaxed debug_assert (run_count <= 2) — pick_minimal_compaction with first_run() is suboptimal but not incorrect for transient multi-run levels, and compaction still makes forward progress.

Accept &Level instead of &Run<Table> so the picker scans ALL runs in both levels. Trivial move checks overlap across all next-level runs; merge pull-in collects contained tables from all curr-level runs. Eliminates missed tables in transient multi-run levels from multi-level compaction (#108). Closes #122
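
A rough sketch of what "scan all runs" means for overlap detection. The `Table`, `Run`, and `Level` structs below are simplified, hypothetical stand-ins (the crate's real types differ), and the comparator is passed as a plain closure rather than the crate's `UserComparator`:

```rust
use std::cmp::Ordering;

/// Hypothetical, simplified stand-ins for the crate's types.
struct Table {
    min: Vec<u8>,
    max: Vec<u8>,
}
struct Run {
    tables: Vec<Table>,
}
struct Level {
    runs: Vec<Run>,
}

/// Two closed ranges overlap iff each starts at or before the other's end under `cmp`.
fn overlaps(
    cmp: &dyn Fn(&[u8], &[u8]) -> Ordering,
    a: (&[u8], &[u8]),
    b: (&[u8], &[u8]),
) -> bool {
    cmp(a.0, b.1) != Ordering::Greater && cmp(b.0, a.1) != Ordering::Greater
}

/// Collects overlapping tables from every run in the level, not just the first run.
fn overlapping_tables<'a>(
    level: &'a Level,
    cmp: &dyn Fn(&[u8], &[u8]) -> Ordering,
    range: (&[u8], &[u8]),
) -> Vec<&'a Table> {
    level
        .runs
        .iter()
        .flat_map(|run| run.tables.iter())
        .filter(|t| overlaps(cmp, (t.min.as_slice(), t.max.as_slice()), range))
        .collect()
}
```

With a picker that only looked at the first run, a table sitting in a second, transient run would not show up in the result.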

Copilot

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (1)

tests/custom_comparator.rs:1

flush_active_memtable is used elsewhere in this file with a monotonically increasing seqno that is >= the writes being flushed (e.g., flush_active_memtable(2) after seqnos 0/1). Here it is repeatedly called with 0 while inserts use seqnos 10..=90, which risks producing incorrect SST metadata or MVCC visibility assumptions and can make the test flaky/non-representative. Fix: pass a monotonically increasing flush seqno (e.g., key + 1, or a local counter) that is >= the max seqno written to the memtable being flushed.

use lsm_tree::{AbstractTree, Config, Guard as _, SharedComparator, UserComparator};

Copilot

Pull request overview

Copilot reviewed 10 out of 10 changed files in this pull request and generated no new comments.

## Summary

- Add `KeyRange::merge_sorted_cmp()` to coalesce sorted key ranges into disjoint intervals using a custom comparator
- Replace per-table L2 overlap queries in multi-level compaction with merged-interval queries, reducing redundant binary searches when L0 tables overlap
- Parts 1 and 3 of #122 were already completed in #117; this PR implements Part 2 (merge input ranges optimization)

## Technical Details

Previously, multi-level compaction queried L2 once per input table: O(L2_runs × input_tables × log L2_run_size). With overlapping L0 tables, many queries hit the same L2 regions redundantly. Now, input key ranges from L0+L1 are sorted and merged into disjoint intervals first, then L2 is queried with the (typically much smaller) set of merged intervals.

## Test Plan

- 8 unit tests for `merge_sorted_cmp` (empty, single, disjoint, overlapping, adjacent, contained, mixed, reverse comparator)
- All 21 existing leveled compaction tests pass (including multi-level data integrity tests)
- Full suite: 490 lib + 33 doc tests pass, zero clippy warnings

Closes #122
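
The interval merge described above can be sketched as a single pass over ranges already sorted by their minimum key. This is an illustration only: the free function below reuses the `merge_sorted_cmp` name but is not the crate's `KeyRange` method, the `(min, max)` tuple representation is an assumption, and treating ranges that share an endpoint as mergeable is a choice made here.

```rust
use std::cmp::Ordering;

type Key = Vec<u8>;

/// Coalesces closed ranges `(min, max)`, already sorted by `min` under `cmp`,
/// into disjoint ranges. Ranges that overlap or share an endpoint are merged.
fn merge_sorted_cmp(
    ranges: &[(Key, Key)],
    cmp: impl Fn(&[u8], &[u8]) -> Ordering,
) -> Vec<(Key, Key)> {
    let mut merged: Vec<(Key, Key)> = Vec::new();
    for (min, max) in ranges {
        match merged.last_mut() {
            // Next range starts at or before the current interval's end: extend it.
            Some(last) if cmp(min.as_slice(), last.1.as_slice()) != Ordering::Greater => {
                if cmp(max.as_slice(), last.1.as_slice()) == Ordering::Greater {
                    last.1 = max.clone();
                }
            }
            // Gap before this range (or first range): start a new interval.
            _ => merged.push((min.clone(), max.clone())),
        }
    }
    merged
}
```

Because every comparison goes through `cmp`, the same pass works for a reverse comparator as long as the input is sorted in that comparator's order.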

Copilot AI review requested due to automatic review settings March 22, 2026 22:33

polaz force-pushed the feat/#98-bug-runpush-sorts-tables-lexicographically-ignorin branch from 54f896c to 66cfb53 Compare March 22, 2026 22:36

coderabbitai Bot reviewed Mar 22, 2026

View reviewed changes

Comment thread src/blob_tree/ingest.rs

Comment thread src/tree/ingest.rs

Copilot AI reviewed Mar 22, 2026

View reviewed changes

Comment thread src/version/mod.rs

Comment thread tests/custom_comparator.rs Outdated

coderabbitai Bot mentioned this pull request Mar 22, 2026

fix: ingestion write guards use lexicographic ordering instead of configured UserComparator #118

Closed

Copilot started reviewing on behalf of polaz March 22, 2026 22:54 View session

polaz requested a review from Copilot March 22, 2026 22:55

Copilot AI reviewed Mar 22, 2026

View reviewed changes

Comment thread src/version/run.rs Outdated

Comment thread src/version/mod.rs

Comment thread src/version/optimize.rs Outdated

Copilot started reviewing on behalf of polaz March 22, 2026 23:09 View session

polaz requested a review from Copilot March 23, 2026 00:32

Copilot started reviewing on behalf of polaz March 23, 2026 00:33 View session

polaz force-pushed the feat/#98-bug-runpush-sorts-tables-lexicographically-ignorin branch from a11eb34 to efd0bc2 Compare March 23, 2026 00:38

Copilot AI reviewed Mar 23, 2026

View reviewed changes

polaz force-pushed the feat/#98-bug-runpush-sorts-tables-lexicographically-ignorin branch from efd0bc2 to 47aac04 Compare March 23, 2026 01:40

sw-release-bot Bot reviewed Mar 23, 2026

View reviewed changes

polaz requested a review from Copilot March 23, 2026 08:38

Copilot started reviewing on behalf of polaz March 23, 2026 08:40 View session

polaz force-pushed the feat/#98-bug-runpush-sorts-tables-lexicographically-ignorin branch from 47aac04 to 2c86f4d Compare March 23, 2026 08:44

Copilot AI reviewed Mar 23, 2026

View reviewed changes

polaz force-pushed the feat/#98-bug-runpush-sorts-tables-lexicographically-ignorin branch from 2c86f4d to ce54e3e Compare March 23, 2026 10:21

polaz requested a review from Copilot March 23, 2026 10:24

polaz added 16 commits March 23, 2026 20:34

fix(run_reader): gate dead_code expect behind cfg(not(test))

7d55651

Tests exercise RunReader::new so the lint doesn't fire in test builds. Unconditional #[expect(dead_code)] would trigger unused-expect warning with deny(unused).
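
The gating pattern this commit describes looks roughly like the following. `Reader` is a hypothetical stand-in, not the crate's `RunReader`, and `#[expect]` requires Rust 1.81 or later:

```rust
/// Hypothetical stand-in for the crate's reader type.
struct Reader {
    reversed: bool,
}

impl Reader {
    // Assumed situation: only tests call this constructor. In a non-test build it
    // is dead code, so the expectation is fulfilled. In a test build the attribute
    // is compiled out, so tests that do call `new` cannot trigger an
    // unfulfilled-expectation warning.
    #[cfg_attr(not(test), expect(dead_code, reason = "crate-internal API"))]
    fn new() -> Self {
        Self { reversed: false }
    }

    /// Comparator-aware constructor used by production code paths in this sketch.
    fn new_cmp(reversed: bool) -> Self {
        Self { reversed }
    }

    fn is_reversed(&self) -> bool {
        self.reversed
    }
}
```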

fix(run_reader): qualify doc link and correct dead_code reason

fa8110e

- Use crate::comparator::UserComparator in doc link to avoid broken intra-doc reference
- Change reason to "crate-internal API" since run_reader is a private module

refactor(run): extract shared trim_slice helper from get_contained

27694a1

Deduplicate identical trim_slice inner functions in get_contained and get_contained_cmp into a single module-level helper.
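
A possible shape for such a helper, going by the later commit note that `trim_slice` returns the span between the first and last match. The signature is an assumption; the crate's actual helper may take different arguments:

```rust
/// Returns the sub-slice spanning the first through the last element that
/// satisfies `pred`, or an empty slice when nothing matches. Elements between
/// the two ends are included whether or not they match.
fn trim_slice<T>(items: &[T], pred: impl Fn(&T) -> bool) -> &[T] {
    let Some(start) = items.iter().position(|x| pred(x)) else {
        return &[];
    };
    // A match exists, so `rposition` finds one; fall back to `start` defensively.
    let end = items.iter().rposition(|x| pred(x)).unwrap_or(start);
    &items[start..=end]
}
```

Sharing one module-level function like this lets both `get_contained` and `get_contained_cmp` differ only in the predicate they pass.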

docs(key_range,leveled): correct contains_range_cmp reference and expect message

c9d440b

- key_range contains_key doc: list contains_range_cmp as existing
- leveled first_run expect: "at least one run" not "exactly one"

refactor(compaction): replace debug_assert with log::debug for multi-run levels

e9d4c74

debug_assert with a hard run_count limit can panic in debug builds for valid transient states. Replace with log::debug since multi-run L1+ is a performance concern, not a correctness issue.

docs(version): clarify recovery preserves comparator-sorted run order

05e1efb

The manifest round-trips table order within each run, so recovered runs are already in comparator-sorted order. No re-sort needed.

docs(test): clarify ignore reason for reverse comparator range scans

7901dad

RunReader comparator plumbing is done (new_cmp), but range bounds interpretation for reverse comparator remains unresolved (#116). Update ignore annotations to reflect the actual blocker.

test(run): add unit test for get_contained_cmp with reverse comparator

b01a22c

Also update ignore annotations on range scan tests to reflect the actual blocker (#116 range bounds interpretation).

docs(compaction): link multi-run picker concern to #122 Part 3

d3c9868

fix(test): use comparator-ordered key ranges in reverse cmp fixture

8cef990

Production SSTs store (comparator_min, comparator_max). With reverse comparator "z" < "p", so key range is (z,p) not (p,z).
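
To make the ordering concrete, here is a small sketch that computes `(comparator_min, comparator_max)` for a set of keys. `reverse_cmp` and `key_range` are illustrative helpers, not the crate's `ReverseComparator` or `KeyRange`:

```rust
use std::cmp::Ordering;

/// Orders byte strings in reverse lexicographic order.
fn reverse_cmp(a: &[u8], b: &[u8]) -> Ordering {
    b.cmp(a)
}

/// Computes `(comparator_min, comparator_max)` over a set of keys,
/// or `None` when the set is empty.
fn key_range<'a>(
    keys: &[&'a [u8]],
    cmp: impl Fn(&[u8], &[u8]) -> Ordering,
) -> Option<(&'a [u8], &'a [u8])> {
    let min = keys.iter().copied().min_by(|a, b| cmp(*a, *b))?;
    let max = keys.iter().copied().max_by(|a, b| cmp(*a, *b))?;
    Some((min, max))
}
```

For keys "p", "t", and "z", the reverse comparator yields the range ("z", "p"), while plain lexicographic order yields ("p", "z"). A fixture that hard-codes the lexicographic pair would not match what an SST written under the reverse comparator stores.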

fix(compaction): scope take_while per-run in multi-run picker

1766f1d

take_while after flat_map kills the entire iterator when one run's window exceeds the size cap. Move inside flat_map so each run's windows are capped independently.
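
The iterator behaviour behind this fix can be reproduced in isolation. In the sketch below each run is reduced to a list of window sizes, which is a simplification and not the picker's real data:

```rust
/// `take_while` applied after `flat_map`: the first oversized window in any run
/// ends the whole iterator, so later runs are never examined.
fn capped_globally(runs: &[Vec<u64>], cap: u64) -> Vec<u64> {
    runs.iter()
        .flat_map(|run| run.iter().copied())
        .take_while(|&size| size <= cap)
        .collect()
}

/// `take_while` moved inside `flat_map`: each run's windows are cut off
/// independently, and subsequent runs are still scanned.
fn capped_per_run(runs: &[Vec<u64>], cap: u64) -> Vec<u64> {
    runs.iter()
        .flat_map(|run| run.iter().copied().take_while(move |&size| size <= cap))
        .collect()
}
```

With runs `[10, 20, 99]` and `[5, 15]` and a cap of 50, the first variant stops at 99 and never reaches the second run, while the second variant also yields 5 and 15.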

test(comparator): add multi-table run range scan with u64 comparator

2c1feb9

Exercises RunReader::new_cmp path in create_range — needs multiple SSTs in a single L1 run, which only happens after leveled compaction. Covers both full and bounded range scans.

docs(run): clarify trim_slice returns span between first and last match

9dfceab

polaz force-pushed the feat/#98-bug-runpush-sorts-tables-lexicographically-ignorin branch from 1f8fe4e to 9dfceab Compare March 23, 2026 18:34

polaz requested review from Copilot and removed request for Copilot March 23, 2026 18:35

Copilot started reviewing on behalf of polaz March 23, 2026 18:42 View session

Copilot AI reviewed Mar 23, 2026

View reviewed changes

Comment thread src/compaction/leveled/mod.rs

Comment thread src/compaction/leveled/mod.rs

docs(table): clarify aggregate_run_key_range is comparator-agnostic

9eb1234

Takes first/last from already-sorted slice — no key comparison, works correctly for any comparator.

polaz requested a review from Copilot March 23, 2026 18:56

Copilot started reviewing on behalf of polaz March 23, 2026 18:57 View session

Copilot AI reviewed Mar 23, 2026

View reviewed changes

polaz merged commit 7586739 into main Mar 23, 2026
19 of 20 checks passed

polaz deleted the feat/#98-bug-runpush-sorts-tables-lexicographically-ignorin branch March 23, 2026 19:04

sw-release-bot Bot mentioned this pull request Mar 23, 2026

chore: release v5.0.0 #60

Closed

polaz mentioned this pull request Mar 23, 2026

perf(compaction): merge input ranges before L2 overlap query #146

Merged
