Skip to content

feat: custom key comparison / comparator#67

Merged
polaz merged 22 commits into
mainfrom
feat/#17-feat-custom-key-comparison--comparator
Mar 22, 2026
Merged

feat: custom key comparison / comparator#67
polaz merged 22 commits into
mainfrom
feat/#17-feat-custom-key-comparison--comparator

Conversation

@polaz
Copy link
Copy Markdown
Member

@polaz polaz commented Mar 22, 2026

Summary

  • Add pluggable UserComparator trait for custom key ordering instead of hardcoded lexicographic byte comparison
  • Thread comparator through memtable, block index search, merge iterator, point read, and RT suppression paths
  • Enable CoordiNode to define natural ordering for composite keys without manual byte encoding tricks

Technical Details

New public API:

  • UserComparator trait — compare(&self, a: &[u8], b: &[u8]) -> Ordering + is_lexicographic() for fast-path detection
  • DefaultUserComparator — lexicographic bytes (backward compatible default)
  • Config::comparator(Arc<dyn UserComparator>) — builder method (field is pub(crate))
  • Bytewise equality invariant: compare(a, b) == Equal must imply a == b (bloom/hash rely on this)
  • Comparator identity is not persisted — caller ensures same comparator across open/close

Threading strategy:

  • Memtable: MemtableKey wrapper carries SharedComparator for SkipMap ordering
  • Block search: ParsedItem::compare_key accepts &dyn UserComparator; compare_prefixed_slice has zero-alloc fast path for lexicographic comparators
  • Merge iterator: HeapItem uses InternalKey::compare_with; Merger::new requires explicit comparator
  • Point reads: Run::get_for_key_cmp for correct table selection
  • RT suppression: is_suppressed_by_range_tombstones uses comparator for key-range filter and containment
  • Data/index block iterators: store SharedComparator, use in seek predicates
  • Static default_comparator() via LazyLock avoids repeated Arc allocations

Known limitations:

  • Memtable interval tree for range tombstones still uses lexicographic Ord — RT suppression in memtable may be incorrect with non-lexicographic comparators (tracked as follow-up issue)
  • KeyRange comparisons in some compaction paths still use lexicographic ordering
  • Comparator identity is not persisted to disk (same approach as RocksDB)

Test Plan

  • All existing lib + integration tests pass
  • 6 new integration tests: reverse comparator, u64 big-endian comparator
  • Tests cover in-memory and after-flush point reads + range scans
  • cargo clippy clean

Closes #17

Summary by CodeRabbit

  • New Features

    • Add support for pluggable/custom key comparators to control iteration and lookup ordering (e.g., reverse or numeric ordering).
    • Iteration, point-reads, and range behavior now respect configured comparator semantics.
  • API Changes

    • Configuration builder accepts a comparator; components that perform key ordering now require or accept a comparator to ensure consistent behavior.
  • Tests

    • New and updated tests verify custom comparator behaviors and ordering across operations.

Copilot AI review requested due to automatic review settings March 22, 2026 08:18
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Mar 22, 2026

Warning

Rate limit exceeded

@polaz has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 2 minutes and 39 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e7862a31-651e-4ffd-9b79-ee00bd779fb1

📥 Commits

Reviewing files that changed from the base of the PR and between 9f3ee1e and 0efc392.

📒 Files selected for processing (37)
  • src/abstract_tree.rs
  • src/blob_tree/ingest.rs
  • src/blob_tree/mod.rs
  • src/compaction/flavour.rs
  • src/compaction/worker.rs
  • src/comparator.rs
  • src/config/mod.rs
  • src/key.rs
  • src/lib.rs
  • src/memtable/mod.rs
  • src/merge.rs
  • src/range.rs
  • src/range_tombstone.rs
  • src/table/block/decoder.rs
  • src/table/block_index/full.rs
  • src/table/block_index/two_level.rs
  • src/table/block_index/volatile.rs
  • src/table/data_block/iter.rs
  • src/table/data_block/iter_test.rs
  • src/table/data_block/mod.rs
  • src/table/index_block/iter.rs
  • src/table/index_block/mod.rs
  • src/table/inner.rs
  • src/table/iter.rs
  • src/table/meta.rs
  • src/table/mod.rs
  • src/table/scanner.rs
  • src/table/tests.rs
  • src/table/util.rs
  • src/tree/ingest.rs
  • src/tree/inner.rs
  • src/tree/mod.rs
  • src/version/run.rs
  • src/version/super_version.rs
  • src/vlog/blob_file/meta.rs
  • tests/custom_comparator.rs
  • tests/range_tombstone_ephemeral.rs

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Introduce a pluggable UserComparator and thread a SharedComparator through config, memtables, mergers, iterators, table recovery, tree/version/run logic, compaction, and metadata reads so all internal key comparisons use the configured comparator.

Changes

Cohort / File(s) Summary
Comparator Core
src/comparator.rs, src/lib.rs
Add UserComparator trait, DefaultUserComparator, SharedComparator alias, default_comparator() and public re-exports.
Configuration
src/config/mod.rs
Add internal comparator: SharedComparator, default init and Config::comparator(...) builder.
Memtable & Merge
src/memtable/mod.rs, src/merge.rs, src/version/super_version.rs, src/abstract_tree.rs
Memtable keys now wrap an InternalKey+comparator; memtable ctor requires comparator; HeapItem/Merger store comparator; comparator threaded into flush/merge initialization.
Tree / Version / Run
src/tree/mod.rs, src/tree/inner.rs, src/tree/ingest.rs, src/version/run.rs
Make many Tree APIs comparator-aware, propagate comparator into SuperVersions, memtable creation, table recovery; add Run::get_for_key_cmp.
Table Core & Recovery
src/table/mod.rs, src/table/inner.rs, src/table/meta.rs, src/table/tests.rs
Table::recover gains a comparator param stored in Inner; metadata point-reads and decode/recovery use comparator; tests updated.
Iterators & Block Readers
src/table/iter.rs, src/table/scanner.rs, src/table/data_block/*, src/table/index_block/*, src/table/block_index/*, src/table/block/decoder.rs, src/table/data_block/iter_test.rs
Thread SharedComparator into data/index/block iterators and decoders; replace direct byte compares with comparator calls; iterator constructors and tests updated.
Block Index Internals
src/table/block_index/full.rs, src/table/block_index/two_level.rs, src/table/block_index/volatile.rs
Block-index structs store comparator and create per-block iterators via closures that pass the comparator.
Table Utilities
src/table/util.rs
compare_prefixed_slice now accepts comparator; adds lexicographic fast-path and comparator fallback (allocating path).
Compaction & Worker
src/compaction/flavour.rs, src/compaction/worker.rs, src/compaction/*
Propagate comparator into compaction stream and table recovery during compaction flows; update Merger/CompactionStream wiring.
Blob, Vlog & BlobTree
src/blob_tree/*, src/vlog/blob_file/meta.rs
Thread comparator into blob-tree/table recoveries and vlog metadata point-reads.
Range-tombstone & RT filtering
src/tree/mod.rs, src/table/mod.rs, src/table/meta.rs
Range-tombstone decoding, selection, sorting and suppression use comparator-based ordering and visibility checks.
Tests & Examples
tests/custom_comparator.rs, tests/range_tombstone_ephemeral.rs
Add integration tests for custom comparators; update test helpers and ephemeral memtable construction to pass comparator.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Config
    participant Tree
    participant SuperVersions
    participant Memtable
    participant Merger
    participant Table
    participant DataBlock
    participant Comparator

    Client->>Config: open(config with comparator)
    Config->>Tree: create Tree (holds comparator)
    Client->>Tree: get(key)
    Tree->>SuperVersions: get_internal_entry_from_version(key, seq, comparator)
    SuperVersions->>Memtable: point_read(key, seqno, comparator)
    Memtable->>Comparator: compare user keys for bounds/ordering
    alt Memtable miss
        Tree->>Table: point_read/scan(key, comparator)
        Table->>DataBlock: iter(comparator)
        DataBlock->>Comparator: compare keys during seek/iter
        DataBlock-->>Table: InternalValue
    end
    Table-->>Tree: InternalValue
    Tree-->>Client: Value
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related issues

Possibly related PRs

Poem

🐇 I hopped through bytes and compared each key,

A tiny comparator kept order true for me.
Memtables bowed and mergers marched in line,
Tables recovered, every seek refined.
Hooray — now keys behave exactly as I please!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'feat: custom key comparison / comparator' accurately and concisely describes the main change: introducing pluggable custom key comparison through a UserComparator trait instead of hardcoded lexicographic ordering.
Linked Issues check ✅ Passed All coding requirements from issue #17 are met: pluggable UserComparator trait, DefaultUserComparator implementation, Config builder method, comparator integration through key paths (memtable, block index, merge, point reads, range-tombstone suppression), and comprehensive tests.
Out of Scope Changes check ✅ Passed All changes are directly related to implementing custom key comparison as specified in issue #17. The PR documents known deferred follow-ups (memtable interval-tree threading, certain compaction KeyRange comparisons, comparator identity persistence) as separate issues rather than including incomplete implementations.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/#17-feat-custom-key-comparison--comparator

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces a pluggable UserComparator so the LSM can order user keys using a caller-defined comparison (instead of hardcoded lexicographic bytes), and threads that comparator through memtables, SST block/index iteration, merge logic, and point-read paths.

Changes:

  • Adds the UserComparator API (with DefaultUserComparator) and a Config::comparator(...) builder to configure ordering.
  • Reworks memtable and SST seeking/iteration to use the configured comparator (including block index/data block seek predicates and point reads).
  • Adds integration tests for custom comparators (reverse lexicographic + big-endian u64 ordering).

Reviewed changes

Copilot reviewed 34 out of 34 changed files in this pull request and generated 9 comments.

Show a summary per file
File Description
tests/range_tombstone_ephemeral.rs Updates ephemeral memtable construction to pass a comparator.
tests/custom_comparator.rs New integration tests covering custom comparator behavior across reads/iteration and after flush.
src/vlog/blob_file/meta.rs Uses the default comparator for metadata key point-reads after point-read API changes.
src/version/super_version.rs Threads comparator into initial memtable creation in SuperVersions.
src/version/run.rs Adds get_for_key_cmp to select candidate tables using a custom comparator.
src/tree/mod.rs Threads comparator through point-read paths and memtable creation/rotation.
src/tree/inner.rs Initializes SuperVersions with the configured comparator.
src/tree/ingest.rs Passes comparator when opening tables during ingestion.
src/table/util.rs Updates prefixed-slice comparison to delegate to the user comparator (currently via allocation).
src/table/tests.rs Updates table test harness to pass the default comparator to Table open/iter paths.
src/table/scanner.rs Threads comparator into table scanning so block iteration uses the correct ordering.
src/table/mod.rs Stores comparator on Table and uses it for filter/index seeks and data block point reads.
src/table/meta.rs Uses the default comparator for reading metadata keys via point-read.
src/table/iter.rs Threads comparator into table iterators and data-block reader construction.
src/table/inner.rs Adds comparator to the table Inner state.
src/table/index_block/mod.rs Comparator-aware key comparisons for index block parsed items + comparator-carrying iterator.
src/table/index_block/iter.rs Stores comparator in index-block iter and uses it in seek predicates.
src/table/data_block/mod.rs Comparator-aware key comparisons, point reads, and iter construction for data blocks.
src/table/data_block/iter_test.rs Updates data-block iterator tests to pass default comparator.
src/table/data_block/iter.rs Comparator-aware seek predicates and comparisons during data-block iteration.
src/table/block_index/volatile.rs Threads comparator into volatile index iteration/seeking.
src/table/block_index/two_level.rs Threads comparator into two-level index iteration/seeking.
src/table/block_index/full.rs Stores comparator in full block index and uses it when iterating.
src/table/block/decoder.rs Extends ParsedItem::compare_key to accept a comparator.
src/range.rs Switches memtable range iteration to comparator-compatible bounds type (range_internal).
src/merge.rs Comparator-aware heap ordering for merging iterators (adds comparator plumbing).
src/memtable/mod.rs Implements comparator-aware ordering via MemtableKey and adds comparator to memtable state.
src/lib.rs Adds and re-exports the comparator module/types.
src/key.rs Adds InternalKey::compare_with(...) for comparator-based ordering.
src/config/mod.rs Adds comparator field + default + Config::comparator(...) builder.
src/comparator.rs New comparator trait + default implementation + shared comparator type alias.
src/compaction/flavour.rs Passes comparator when opening tables during compaction.
src/blob_tree/mod.rs Threads comparator into point-read path for blob tree index lookups.
src/blob_tree/ingest.rs Passes comparator when opening tables during blob ingestion.
Comments suppressed due to low confidence (1)

src/range.rs:362

  • Tree range iteration merges sources with Merger::new(iters), which hardcodes the default (lexicographic) comparator. With a custom Config::comparator, this can yield incorrectly ordered merged output and can break range-scan correctness. Thread the tree’s configured comparator into TreeIter and build the merger via Merger::with_comparator(iters, comparator.clone()) (or equivalent).
                let iter = lock.version.active_memtable.range_internal(range.clone());

                iters.push(Box::new(
                    iter.filter(move |item| seqno_filter(item.key.seqno, seqno))
                        .map(Ok),

Comment thread src/comparator.rs Outdated
Comment thread src/table/util.rs
Comment thread src/comparator.rs
Comment thread src/merge.rs Outdated
Comment thread src/table/index_block/iter.rs
Comment thread src/memtable/mod.rs
Comment thread tests/custom_comparator.rs Outdated
Comment thread src/table/index_block/mod.rs
Comment thread src/table/data_block/mod.rs
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
src/table/mod.rs (1)

452-462: ⚠️ Potential issue | 🔴 Critical

Use the user comparator when ordering recovered range tombstones.

recover now has the comparator in hand, but the later rts.sort_unstable_by(|a, b| a.start.cmp(&b.start)...) in this function still hardcodes lexicographic ordering. The point-read/range-tombstone suppression paths rely on that vector being sorted for binary search, so a non-lexicographic comparator can miss a covering tombstone or evaluate them in the wrong order.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/table/mod.rs` around lines 452 - 462, The range-tombstone vector is being
sorted with lexicographic key ordering even though recover has the user
comparator; change the rts.sort_unstable_by call in recover to use the provided
comparator (SharedComparator) to compare tombstone start keys instead of
a.start.cmp(&b.start) so binary searches and suppression use the user-defined
ordering (e.g., replace the closure to call comparator.compare(&a.start,
&b.start) and use that Ordering for the sort, adding any necessary tie-breaker
using the comparator as well).
src/table/data_block/mod.rs (1)

416-460: ⚠️ Potential issue | 🟠 Major

Hash-based fast paths bypass the comparator's equality relation, risking false negatives with custom comparators.

Table::get uses filter_block.maybe_contains_hash(key_hash) and DataBlock::point_read uses hash_index_reader.get(needle) — both operating on raw-byte hashes — without checking the comparator type. The UserComparator trait does not document that Equal must imply bytewise equality; the U64Comparator example shows comparators extracting and comparing u64 values, not comparing bytes directly. If a custom comparator treats byte-different keys as equal, these hash-based shortcuts can produce false negatives: the hash-index lookup returns MARKER_FREE or the bloom filter rejects the hash before the comparator is ever consulted.

Either document that UserComparator::Equal requires bytewise equality, or gate the raw-byte hash/bloom shortcuts to the default comparator.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/table/data_block/mod.rs` around lines 416 - 460, The hash-based
fast-paths must be disabled when a non-default comparator is used to avoid
byte-hash vs comparator-equality mismatches: update Table::get (where
filter_block.maybe_contains_hash is used) and DataBlock::point_read (where
hash_index_reader.get(needle) is used and MARKER_FREE/MARKER_CONFLICT are
interpreted) to check the comparator type (e.g., add or call a predicate on the
SharedComparator/UserComparator such as is_default() or equivalent) and only use
bloom/hash-index shortcuts when the comparator is the default bytewise
comparator; otherwise fall back to the existing comparator-aware code paths
(seqno-aware binary search / full scan) so the comparator.compare/Equal logic is
always consulted for non-default comparators.
src/tree/mod.rs (1)

785-823: ⚠️ Potential issue | 🔴 Critical

Range tombstone suppression is still byte-ordered.

get_internal_entry_from_version now resolves the candidate value with comparator, but the suppression it calls later in this file still narrows persisted RTs with lexicographic checks (metadata.key_range.contains_key(key) / rt.start.as_ref() <= key). On a non-default comparator tree, remove_range can therefore miss keys or suppress unrelated ones on point reads. The RT/key-range checks need the same comparator before this path is correct.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tree/mod.rs` around lines 785 - 823, get_internal_entry_from_version uses
the user comparator to resolve table candidates but still calls suppression
helpers that use byte-ordered checks (e.g., metadata.key_range.contains_key,
rt.start.as_ref() <= key), causing incorrect range-tombstone suppression on
custom comparators. Fix by making the suppression path comparator-aware: change
is_suppressed_by_range_tombstones (and any helpers that check
metadata.key_range.contains_key or compare rt.start/rt.end to keys) to accept
the same comparator argument and perform comparisons using the comparator's
compare/compare_key methods instead of byte-wise ordering, then update calls
from get_internal_entry_from_version (and calls from
get_internal_entry_from_sealed_memtables/get_internal_entry_from_tables if
present) to pass the comparator through. Ensure signatures and call sites are
updated consistently.
🧹 Nitpick comments (1)
src/table/util.rs (1)

150-166: Consider avoiding one allocation per prefix-compressed comparison.

This helper sits under DataBlockParsedItem::compare_key, so prefix-compressed seeks/scans now allocate and copy the full user key for every comparison. Reusing a scratch buffer or adding a concat-aware comparator hook would keep the comparator support without turning scans into an allocation-heavy hot path.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/table/util.rs` around lines 150 - 166, The compare_prefixed_slice helper
currently allocates a new Vec for every call causing hot-path allocations;
modify its API and callers (e.g., DataBlockParsedItem::compare_key) to accept a
reusable scratch buffer (e.g., &mut Vec<u8>) or add a new comparator method
(e.g., UserComparator::compare_concat(prefix: &[u8], suffix: &[u8], needle:
&[u8]) -> Ordering) and call that instead—update compare_prefixed_slice to reuse
the provided scratch buffer (or call the new concat-aware comparator) and adjust
all call sites to pass the scratch Vec or call the new method so scans no longer
allocate per comparison.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/table/data_block/mod.rs`:
- Around line 281-293: compare_key uses three unchecked slice reads
(bytes.get_unchecked(...)) without documented invariants, which is unsound for
malformed on-disk blocks; either validate the recorded ranges when decoding in
parse_full / parse_truncated so compare_key can rely on in-bounds offsets, or
change compare_key to use safe checked slicing (bytes.get(...) or indexing) and
return a sensible error on out-of-bounds; if you retain unsafe gets, add a //
SAFETY: comment directly above each unsafe block explaining the precise
invariant guaranteed by parse_full/parse_truncated (e.g., that
prefix.0..prefix.1 and key.0..key.1 are within bytes) and ensure those
invariants are asserted at decode time.

---

Outside diff comments:
In `@src/table/data_block/mod.rs`:
- Around line 416-460: The hash-based fast-paths must be disabled when a
non-default comparator is used to avoid byte-hash vs comparator-equality
mismatches: update Table::get (where filter_block.maybe_contains_hash is used)
and DataBlock::point_read (where hash_index_reader.get(needle) is used and
MARKER_FREE/MARKER_CONFLICT are interpreted) to check the comparator type (e.g.,
add or call a predicate on the SharedComparator/UserComparator such as
is_default() or equivalent) and only use bloom/hash-index shortcuts when the
comparator is the default bytewise comparator; otherwise fall back to the
existing comparator-aware code paths (seqno-aware binary search / full scan) so
the comparator.compare/Equal logic is always consulted for non-default
comparators.

In `@src/table/mod.rs`:
- Around line 452-462: The range-tombstone vector is being sorted with
lexicographic key ordering even though recover has the user comparator; change
the rts.sort_unstable_by call in recover to use the provided comparator
(SharedComparator) to compare tombstone start keys instead of
a.start.cmp(&b.start) so binary searches and suppression use the user-defined
ordering (e.g., replace the closure to call comparator.compare(&a.start,
&b.start) and use that Ordering for the sort, adding any necessary tie-breaker
using the comparator as well).

In `@src/tree/mod.rs`:
- Around line 785-823: get_internal_entry_from_version uses the user comparator
to resolve table candidates but still calls suppression helpers that use
byte-ordered checks (e.g., metadata.key_range.contains_key, rt.start.as_ref() <=
key), causing incorrect range-tombstone suppression on custom comparators. Fix
by making the suppression path comparator-aware: change
is_suppressed_by_range_tombstones (and any helpers that check
metadata.key_range.contains_key or compare rt.start/rt.end to keys) to accept
the same comparator argument and perform comparisons using the comparator's
compare/compare_key methods instead of byte-wise ordering, then update calls
from get_internal_entry_from_version (and calls from
get_internal_entry_from_sealed_memtables/get_internal_entry_from_tables if
present) to pass the comparator through. Ensure signatures and call sites are
updated consistently.

---

Nitpick comments:
In `@src/table/util.rs`:
- Around line 150-166: The compare_prefixed_slice helper currently allocates a
new Vec for every call causing hot-path allocations; modify its API and callers
(e.g., DataBlockParsedItem::compare_key) to accept a reusable scratch buffer
(e.g., &mut Vec<u8>) or add a new comparator method (e.g.,
UserComparator::compare_concat(prefix: &[u8], suffix: &[u8], needle: &[u8]) ->
Ordering) and call that instead—update compare_prefixed_slice to reuse the
provided scratch buffer (or call the new concat-aware comparator) and adjust all
call sites to pass the scratch Vec or call the new method so scans no longer
allocate per comparison.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d43bb8ca-ad24-4d19-acc0-be8b3cb9331a

📥 Commits

Reviewing files that changed from the base of the PR and between 2470c43 and 48a912c.

📒 Files selected for processing (34)
  • src/blob_tree/ingest.rs
  • src/blob_tree/mod.rs
  • src/compaction/flavour.rs
  • src/comparator.rs
  • src/config/mod.rs
  • src/key.rs
  • src/lib.rs
  • src/memtable/mod.rs
  • src/merge.rs
  • src/range.rs
  • src/table/block/decoder.rs
  • src/table/block_index/full.rs
  • src/table/block_index/two_level.rs
  • src/table/block_index/volatile.rs
  • src/table/data_block/iter.rs
  • src/table/data_block/iter_test.rs
  • src/table/data_block/mod.rs
  • src/table/index_block/iter.rs
  • src/table/index_block/mod.rs
  • src/table/inner.rs
  • src/table/iter.rs
  • src/table/meta.rs
  • src/table/mod.rs
  • src/table/scanner.rs
  • src/table/tests.rs
  • src/table/util.rs
  • src/tree/ingest.rs
  • src/tree/inner.rs
  • src/tree/mod.rs
  • src/version/run.rs
  • src/version/super_version.rs
  • src/vlog/blob_file/meta.rs
  • tests/custom_comparator.rs
  • tests/range_tombstone_ephemeral.rs

Comment thread src/table/data_block/mod.rs
@polaz polaz force-pushed the feat/#17-feat-custom-key-comparison--comparator branch from 48a912c to 3db1e06 Compare March 22, 2026 08:58
@polaz polaz requested a review from Copilot March 22, 2026 09:48
@polaz
Copy link
Copy Markdown
Member Author

polaz commented Mar 22, 2026

@coderabbitai re-review

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Mar 22, 2026

@polaz Sure, I'll re-review the changes in PR #67 now!


🧠 Learnings used
Learnt from: polaz
Repo: structured-world/lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-22T09:29:48.199Z
Learning: In `structured-world/lsm-tree` (`src/prefix.rs`), `is_valid_scan_boundary(&self, prefix: &[u8]) -> bool` has been added to the `PrefixExtractor` trait in PR `#68` (closes issue `#47`), replacing the previously internal `is_valid_prefix_boundary` name. The method has a default implementation that checks `prefixes(prefix).any(|p| p == prefix)`, correct for sub-slice extractors, and can be overridden for O(1) checks (e.g., `FixedLengthPrefix`) or for extractors whose `prefixes()` does not return sub-slices. `compute_prefix_hash` in `src/prefix.rs` calls `is_valid_scan_boundary` as the sole call site. Do not flag the absence of `is_valid_scan_boundary` or suggest adding it in future PRs — it is already part of the public API.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 45
File: tools/db_bench/src/workloads/mergerandom.rs:27-35
Timestamp: 2026-03-21T17:08:52.777Z
Learning: In `structured-world/lsm-tree`, `lsm_tree`'s merge operator API is internal (not public). The `mergerandom` workload in `tools/db_bench/src/workloads/mergerandom.rs` is intentionally an overwrite/compaction stress benchmark rather than a true merge-operand benchmark. True merge-operand benchmarks are planned for a future PR using the CoordiNode posting list API. Do not flag `mergerandom` for lacking merge-operand calls in PR `#45` or future PRs until that API is exposed.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-22T09:21:07.598Z
Learning: In `structured-world/lsm-tree` (`src/range.rs`), prefix bloom filter skipping for multi-table runs (L0) has been implemented in PR `#64` (closes issue `#57`). The `_ =>` branch in `TreeIter::create_range` now applies a per-table key-range guard followed by a bloom check; 0 survivors skips the run, 1 survivor is demoted to `single_tables` for the range-tombstone table-skip optimization, and 2+ survivors are collected into a new `Run`. Range tombstones are still collected from ALL tables in the run regardless of bloom filtering. Do not flag the absence of multi-table run prefix bloom skipping in this repository — it has been implemented.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-20T08:17:06.289Z
Learning: In `structured-world/lsm-tree`, the switch from `Mutex<IntervalTree>` to `RwLock<IntervalTree>` (or a freeze-on-seal immutable structure) for memtable range-tombstone storage in `src/memtable/mod.rs` is intentionally deferred from PR `#21` to a separate issue (`#30`). Do not flag the existing `Mutex<IntervalTree>` usage in `src/memtable/mod.rs` as a concurrency concern in the context of PR `#21`; it is tracked in issue `#30`.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-20T08:17:39.835Z
Learning: In `structured-world/lsm-tree`, the `Memtable` range-tombstone storage uses `Mutex<IntervalTree>` (introduced in PR `#21`). Switching to `RwLock<IntervalTree>` or a freeze-on-seal immutable structure is a deliberate future improvement tracked in issue `#31` — do not flag the `Mutex` usage as a blocking concern in PR `#21` or future PRs that reference this design.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-20T08:17:39.835Z
Learning: In `structured-world/lsm-tree`, the `Memtable` range-tombstone storage uses `Mutex<IntervalTree>` (introduced in PR `#21`). Switching to `RwLock<IntervalTree>` or a freeze-on-seal immutable structure is a deliberate future improvement tracked in a separate issue — do not flag the `Mutex` usage as a blocking concern in PR `#21` or future PRs that reference this design.
✅ Actions performed

Full review triggered.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 36 out of 36 changed files in this pull request and generated 4 comments.

Comment thread src/comparator.rs Outdated
Comment thread tests/custom_comparator.rs
Comment thread src/config/mod.rs
Comment thread src/table/util.rs Outdated
@codecov
Copy link
Copy Markdown

codecov Bot commented Mar 22, 2026

Codecov Report

❌ Patch coverage is 93.67397% with 26 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
src/table/index_block/mod.rs 33.33% 8 Missing ⚠️
src/comparator.rs 68.18% 7 Missing ⚠️
src/table/util.rs 86.11% 5 Missing ⚠️
src/tree/mod.rs 96.10% 3 Missing ⚠️
src/blob_tree/mod.rs 95.45% 1 Missing ⚠️
src/merge.rs 90.00% 1 Missing ⚠️
src/table/meta.rs 96.15% 1 Missing ⚠️

📢 Thoughts on this report? Let us know!

@polaz polaz force-pushed the feat/#17-feat-custom-key-comparison--comparator branch 2 times, most recently from 272b524 to f17a79f Compare March 22, 2026 10:19
@polaz polaz requested a review from Copilot March 22, 2026 10:19
@polaz
Copy link
Copy Markdown
Member Author

polaz commented Mar 22, 2026

@coderabbitai re-review

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Mar 22, 2026

@polaz Sure, I'll re-review PR #67 now!


🧠 Learnings used
Learnt from: polaz
Repo: structured-world/lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-22T09:29:48.199Z
Learning: In `structured-world/lsm-tree` (`src/prefix.rs`), `is_valid_scan_boundary(&self, prefix: &[u8]) -> bool` has been added to the `PrefixExtractor` trait in PR `#68` (closes issue `#47`), replacing the previously internal `is_valid_prefix_boundary` name. The method has a default implementation that checks `prefixes(prefix).any(|p| p == prefix)`, correct for sub-slice extractors, and can be overridden for O(1) checks (e.g., `FixedLengthPrefix`) or for extractors whose `prefixes()` does not return sub-slices. `compute_prefix_hash` in `src/prefix.rs` calls `is_valid_scan_boundary` as the sole call site. Do not flag the absence of `is_valid_scan_boundary` or suggest adding it in future PRs — it is already part of the public API.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-20T08:17:06.289Z
Learning: In `structured-world/lsm-tree`, the switch from `Mutex<IntervalTree>` to `RwLock<IntervalTree>` (or a freeze-on-seal immutable structure) for memtable range-tombstone storage in `src/memtable/mod.rs` is intentionally deferred from PR `#21` to a separate issue (`#30`). Do not flag the existing `Mutex<IntervalTree>` usage in `src/memtable/mod.rs` as a concurrency concern in the context of PR `#21`; it is tracked in issue `#30`.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-22T09:55:39.724Z
Learning: In `structured-world/lsm-tree` (`src/compaction/leveled/mod.rs`), the multi-level compaction path (lines ~749–759) uses a single aggregate key range across all L0+L1 inputs to query L2 overlapping tables. Using per-range L2 overlap queries (one per contiguous L1 range segment) instead would avoid pulling in L2 tables that sit in gaps between disjoint L1 ranges on sparse keyspaces. This improvement is intentionally deferred from PR `#66` to a separate follow-up issue. Do not flag the coarse aggregate L2 range query as a blocking concern in PR `#66` or future PRs until that issue is addressed.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-22T09:55:39.724Z
Learning: In `structured-world/lsm-tree` (`src/compaction/leveled/mod.rs`), the multi-level compaction path uses a single aggregate key range across all L0+L1 inputs to query L2 overlapping tables. Using per-range L2 overlap queries (one per contiguous L1 range segment) instead would avoid pulling in L2 tables that sit in gaps between disjoint L1 ranges on sparse keyspaces. This improvement is intentionally deferred from PR `#66` to issue `#72`. Do not flag the coarse aggregate L2 range query as a blocking concern in PR `#66` or future PRs until issue `#72` is addressed.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-20T08:17:39.835Z
Learning: In `structured-world/lsm-tree`, the `Memtable` range-tombstone storage uses `Mutex<IntervalTree>` (introduced in PR `#21`). Switching to `RwLock<IntervalTree>` or a freeze-on-seal immutable structure is a deliberate future improvement tracked in issue `#31` — do not flag the `Mutex` usage as a blocking concern in PR `#21` or future PRs that reference this design.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-20T08:17:39.835Z
Learning: In `structured-world/lsm-tree`, the `Memtable` range-tombstone storage uses `Mutex<IntervalTree>` (introduced in PR `#21`). Switching to `RwLock<IntervalTree>` or a freeze-on-seal immutable structure is a deliberate future improvement tracked in a separate issue — do not flag the `Mutex` usage as a blocking concern in PR `#21` or future PRs that reference this design.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-22T09:21:07.598Z
Learning: In `structured-world/lsm-tree` (`src/range.rs`), prefix bloom filter skipping for multi-table runs (L0) has been implemented in PR `#64` (closes issue `#57`). The `_ =>` branch in `TreeIter::create_range` now applies a per-table key-range guard followed by a bloom check; 0 survivors skips the run, 1 survivor is demoted to `single_tables` for the range-tombstone table-skip optimization, and 2+ survivors are collected into a new `Run`. Range tombstones are still collected from ALL tables in the run regardless of bloom filtering. Do not flag the absence of multi-table run prefix bloom skipping in this repository — it has been implemented.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 43
File: tests/tree_prefix_bloom.rs:91-105
Timestamp: 2026-03-21T22:11:09.195Z
Learning: In `structured-world/lsm-tree`, prefix-bloom integration tests in `tests/tree_prefix_bloom.rs` intentionally validate correctness (result counts and key ordering) rather than asserting that the bloom-skip pruning path fired. Bloom-skip metrics/counters are `pub(crate)` and cannot be observed from integration tests. Do not flag these tests for failing to assert on internal pruning behavior in PR `#43` or future PRs.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 28
File: src/tree/mod.rs:913-956
Timestamp: 2026-03-22T03:09:12.274Z
Learning: In `structured-world/lsm-tree` (`src/tree/mod.rs`), the `resolve_merge_get` function's disk-scan loop in step 3 is a flat scan across all levels/runs and does not apply the L0-specific newest-first ordering used by `get_internal_entry_from_tables`. This means it can stop on a base/tombstone boundary from an older L0 run while a newer overlapping L0 run still has visible `MergeOperand` entries. This is a known limitation documented with `// TODO(`#46`)` in the code and is intentionally deferred to issue `#46`. Do not flag the flat L0 scan in `resolve_merge_get` as a merge-semantics bug in PR `#28` or future PRs until issue `#46` is addressed.

Learnt from: CR
Repo: structured-world/lsm-tree PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2026-03-20T22:03:44.692Z
Learning: Applies to **/*.rs : Before flagging any potential issue, trace the call chain visible in the PR diff. Check the caller to see if it handles the case, verify type-level guarantees, and read adjacent comments about deliberate design decisions.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 45
File: tools/db_bench/src/workloads/mergerandom.rs:27-35
Timestamp: 2026-03-21T17:08:52.777Z
Learning: In `structured-world/lsm-tree`, `lsm_tree`'s merge operator API is internal (not public). The `mergerandom` workload in `tools/db_bench/src/workloads/mergerandom.rs` is intentionally an overwrite/compaction stress benchmark rather than a true merge-operand benchmark. True merge-operand benchmarks are planned for a future PR using the CoordiNode posting list API. Do not flag `mergerandom` for lacking merge-operand calls in PR `#45` or future PRs until that API is exposed.
✅ Actions performed

Full review triggered.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 36 out of 36 changed files in this pull request and generated 4 comments.

Comment thread src/tree/mod.rs
Comment thread src/table/util.rs
Comment thread src/table/util.rs
Comment thread src/table/util.rs
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/tree/mod.rs (1)

952-958: ⚠️ Potential issue | 🔴 Critical

Use comparator-aware run lookup in merge point reads.

Line 957 still uses run.get_for_key(key). In this comparator-enabled path, that can miss the correct table under non-lexicographic ordering and drop visible merge/base entries during get()/multi_get() resolution. Use run.get_for_key_cmp(key, comparator) here.

Suggested fix
-                if let Some(table) = run.get_for_key(key) {
+                if let Some(table) = run.get_for_key_cmp(key, comparator) {

As per coding guidelines: **/*.rs: “Flag data corruption bugs: wrong compaction merge logic, incorrect key ordering, dropped or duplicated entries during merge (Tier 1 — MUST flag)”.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tree/mod.rs` around lines 952 - 958, The loop over
super_version.version.iter_levels().flat_map(...) is using run.get_for_key(key)
which is not comparator-aware and can miss entries; replace that call with
run.get_for_key_cmp(key, comparator) in this comparator-enabled path so the run
lookup respects non-lexicographic ordering (keep the rest of the logic,
including the key_slice range handling, unchanged), ensuring you pass the same
comparator in scope and remove the old get_for_key usage.
♻️ Duplicate comments (1)
src/table/data_block/mod.rs (1)

281-295: ⚠️ Potential issue | 🔴 Critical

These on-disk slice offsets are still trusted without validation.

The new comparator-aware path still dereferences prefix/key with get_unchecked, but parse_full and parse_truncated only record those ranges—they do not prove that they stay within bytes. A malformed block can still turn a seek into UB instead of a corruption error.

As per coding guidelines, "Flag missing validation: unchecked block offset, unvalidated segment metadata from disk (Tier 1 — MUST flag)".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/table/data_block/mod.rs` around lines 281 - 295, compare_key currently
uses get_unchecked on offsets recorded by parse_full/parse_truncated (prefix and
key), which can lead to UB for malformed on-disk data; change compare_key to
validate those ranges before dereferencing (use bytes.get(range) instead of
get_unchecked) and propagate a corruption error instead of panicking/UB when a
range is out of bounds. Specifically, replace the unsafe slices in compare_key
(prefix / self.key usage and calls to compare_prefixed_slice / cmp.compare) with
safe lookups, return a Result<Ordering, CorruptionError> (or your crate's
existing corruption error type) from compare_key, and update its callers to
handle/propagate that error; keep parse_full and parse_truncated as-is (they
only record ranges), but rely on the new runtime bounds checks in compare_key to
prevent UB.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/comparator.rs`:
- Around line 7-33: The Comparator trait lacks a stable persisted identity: add
a stable identifier method (e.g., Comparator::name() or fingerprint()) and
persist that string into the tree metadata when creating a new tree, then update
reopen/recover code paths (config module and Table::recover) to read the stored
comparator identity and compare it to the caller-supplied comparator::name(); if
they differ, return an error and refuse to open/recover the tree. Ensure the
persistence/read happens alongside other tree metadata (creation path in table
creation) and that Table::recover, config loading, and any open/path that
accepts a Comparator check and fail fast on mismatches.

In `@src/memtable/mod.rs`:
- Around line 26-29: The approximate_size calculation still assumes stored
entries are InternalValue-only, but MemtableKey now includes a SharedComparator;
update Memtable::item_size / approximate_size logic to account for the new
MemtableKey layout by adding the size of SharedComparator (or its pointer/arc
overhead) plus MemtableKey::inner to the per-item size calculation; locate
usages around the MemtableKey struct and functions named item_size and
approximate_size and adjust their size math to include comparator overhead so
memtable rotation triggers correctly.

---

Outside diff comments:
In `@src/tree/mod.rs`:
- Around line 952-958: The loop over
super_version.version.iter_levels().flat_map(...) is using run.get_for_key(key)
which is not comparator-aware and can miss entries; replace that call with
run.get_for_key_cmp(key, comparator) in this comparator-enabled path so the run
lookup respects non-lexicographic ordering (keep the rest of the logic,
including the key_slice range handling, unchanged), ensuring you pass the same
comparator in scope and remove the old get_for_key usage.

---

Duplicate comments:
In `@src/table/data_block/mod.rs`:
- Around line 281-295: compare_key currently uses get_unchecked on offsets
recorded by parse_full/parse_truncated (prefix and key), which can lead to UB
for malformed on-disk data; change compare_key to validate those ranges before
dereferencing (use bytes.get(range) instead of get_unchecked) and propagate a
corruption error instead of panicking/UB when a range is out of bounds.
Specifically, replace the unsafe slices in compare_key (prefix / self.key usage
and calls to compare_prefixed_slice / cmp.compare) with safe lookups, return a
Result<Ordering, CorruptionError> (or your crate's existing corruption error
type) from compare_key, and update its callers to handle/propagate that error;
keep parse_full and parse_truncated as-is (they only record ranges), but rely on
the new runtime bounds checks in compare_key to prevent UB.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e24eb803-8ec2-497a-be55-f1244c98c10d

📥 Commits

Reviewing files that changed from the base of the PR and between 48a912c and c9cf924.

📒 Files selected for processing (36)
  • src/abstract_tree.rs
  • src/blob_tree/ingest.rs
  • src/blob_tree/mod.rs
  • src/compaction/flavour.rs
  • src/compaction/worker.rs
  • src/comparator.rs
  • src/config/mod.rs
  • src/key.rs
  • src/lib.rs
  • src/memtable/mod.rs
  • src/merge.rs
  • src/range.rs
  • src/table/block/decoder.rs
  • src/table/block_index/full.rs
  • src/table/block_index/two_level.rs
  • src/table/block_index/volatile.rs
  • src/table/data_block/iter.rs
  • src/table/data_block/iter_test.rs
  • src/table/data_block/mod.rs
  • src/table/index_block/iter.rs
  • src/table/index_block/mod.rs
  • src/table/inner.rs
  • src/table/iter.rs
  • src/table/meta.rs
  • src/table/mod.rs
  • src/table/scanner.rs
  • src/table/tests.rs
  • src/table/util.rs
  • src/tree/ingest.rs
  • src/tree/inner.rs
  • src/tree/mod.rs
  • src/version/run.rs
  • src/version/super_version.rs
  • src/vlog/blob_file/meta.rs
  • tests/custom_comparator.rs
  • tests/range_tombstone_ephemeral.rs
✅ Files skipped from review due to trivial changes (3)
  • tests/range_tombstone_ephemeral.rs
  • src/table/meta.rs
  • src/table/index_block/iter.rs
🚧 Files skipped from review as they are similar to previous changes (16)
  • src/compaction/flavour.rs
  • src/blob_tree/ingest.rs
  • src/tree/ingest.rs
  • src/tree/inner.rs
  • src/blob_tree/mod.rs
  • src/range.rs
  • src/vlog/blob_file/meta.rs
  • src/table/block/decoder.rs
  • src/table/scanner.rs
  • tests/custom_comparator.rs
  • src/table/tests.rs
  • src/table/block_index/full.rs
  • src/table/data_block/iter.rs
  • src/merge.rs
  • src/key.rs
  • src/config/mod.rs

Comment thread src/comparator.rs
Comment thread src/memtable/mod.rs
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 36 out of 36 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (1)

src/tree/mod.rs:1105

  • is_suppressed_by_range_tombstones uses lexicographic ordering for memtable range-tombstone suppression (via Memtable::is_key_suppressed_by_range_tombstone / interval_tree), but uses the configured UserComparator for SST range-tombstone suppression below. With a non-lexicographic comparator, this makes suppression semantics differ before vs after flush/compaction, which can lead to inconsistent point-read results. Consider threading the comparator through memtable range-tombstone queries (and using it for interval-tree ordering/containment), or keep the SST suppression path lexicographic until the whole range-tombstone stack is comparator-aware.

Comment thread src/merge.rs
@polaz polaz force-pushed the feat/#17-feat-custom-key-comparison--comparator branch from 4c7c8bd to daad3e0 Compare March 22, 2026 11:35
@polaz polaz requested a review from Copilot March 22, 2026 11:36
@polaz
Copy link
Copy Markdown
Member Author

polaz commented Mar 22, 2026

@coderabbitai re-review

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Mar 22, 2026

@polaz Sure, I'll re-review PR #67 now!


🧠 Learnings used
Learnt from: polaz
Repo: structured-world/lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-22T09:55:39.724Z
Learning: In `structured-world/lsm-tree` (`src/compaction/leveled/mod.rs`), the multi-level compaction path (lines ~749–759) uses a single aggregate key range across all L0+L1 inputs to query L2 overlapping tables. Using per-range L2 overlap queries (one per contiguous L1 range segment) instead would avoid pulling in L2 tables that sit in gaps between disjoint L1 ranges on sparse keyspaces. This improvement is intentionally deferred from PR `#66` to a separate follow-up issue. Do not flag the coarse aggregate L2 range query as a blocking concern in PR `#66` or future PRs until that issue is addressed.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-22T09:55:39.724Z
Learning: In `structured-world/lsm-tree` (`src/compaction/leveled/mod.rs`), the multi-level compaction path uses a single aggregate key range across all L0+L1 inputs to query L2 overlapping tables. Using per-range L2 overlap queries (one per contiguous L1 range segment) instead would avoid pulling in L2 tables that sit in gaps between disjoint L1 ranges on sparse keyspaces. This improvement is intentionally deferred from PR `#66` to issue `#72`. Do not flag the coarse aggregate L2 range query as a blocking concern in PR `#66` or future PRs until issue `#72` is addressed.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 28
File: src/tree/mod.rs:913-956
Timestamp: 2026-03-22T03:09:12.274Z
Learning: In `structured-world/lsm-tree` (`src/tree/mod.rs`), the `resolve_merge_get` function's disk-scan loop in step 3 is a flat scan across all levels/runs and does not apply the L0-specific newest-first ordering used by `get_internal_entry_from_tables`. This means it can stop on a base/tombstone boundary from an older L0 run while a newer overlapping L0 run still has visible `MergeOperand` entries. This is a known limitation documented with `// TODO(`#46`)` in the code and is intentionally deferred to issue `#46`. Do not flag the flat L0 scan in `resolve_merge_get` as a merge-semantics bug in PR `#28` or future PRs until issue `#46` is addressed.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 45
File: tools/db_bench/src/workloads/mergerandom.rs:27-35
Timestamp: 2026-03-21T17:08:52.777Z
Learning: In `structured-world/lsm-tree`, `lsm_tree`'s merge operator API is internal (not public). The `mergerandom` workload in `tools/db_bench/src/workloads/mergerandom.rs` is intentionally an overwrite/compaction stress benchmark rather than a true merge-operand benchmark. True merge-operand benchmarks are planned for a future PR using the CoordiNode posting list API. Do not flag `mergerandom` for lacking merge-operand calls in PR `#45` or future PRs until that API is exposed.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-20T08:17:06.289Z
Learning: In `structured-world/lsm-tree`, the switch from `Mutex<IntervalTree>` to `RwLock<IntervalTree>` (or a freeze-on-seal immutable structure) for memtable range-tombstone storage in `src/memtable/mod.rs` is intentionally deferred from PR `#21` to a separate issue (`#30`). Do not flag the existing `Mutex<IntervalTree>` usage in `src/memtable/mod.rs` as a concurrency concern in the context of PR `#21`; it is tracked in issue `#30`.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 43
File: tests/tree_prefix_bloom.rs:91-105
Timestamp: 2026-03-21T22:11:09.195Z
Learning: In `structured-world/lsm-tree`, prefix-bloom integration tests in `tests/tree_prefix_bloom.rs` intentionally validate correctness (result counts and key ordering) rather than asserting that the bloom-skip pruning path fired. Bloom-skip metrics/counters are `pub(crate)` and cannot be observed from integration tests. Do not flag these tests for failing to assert on internal pruning behavior in PR `#43` or future PRs.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-22T09:21:07.598Z
Learning: In `structured-world/lsm-tree` (`src/range.rs`), prefix bloom filter skipping for multi-table runs (L0) has been implemented in PR `#64` (closes issue `#57`). The `_ =>` branch in `TreeIter::create_range` now applies a per-table key-range guard followed by a bloom check; 0 survivors skips the run, 1 survivor is demoted to `single_tables` for the range-tombstone table-skip optimization, and 2+ survivors are collected into a new `Run`. Range tombstones are still collected from ALL tables in the run regardless of bloom filtering. Do not flag the absence of multi-table run prefix bloom skipping in this repository — it has been implemented.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-22T09:29:48.199Z
Learning: In `structured-world/lsm-tree` (`src/prefix.rs`), `is_valid_scan_boundary(&self, prefix: &[u8]) -> bool` has been added to the `PrefixExtractor` trait in PR `#68` (closes issue `#47`), replacing the previously internal `is_valid_prefix_boundary` name. The method has a default implementation that checks `prefixes(prefix).any(|p| p == prefix)`, correct for sub-slice extractors, and can be overridden for O(1) checks (e.g., `FixedLengthPrefix`) or for extractors whose `prefixes()` does not return sub-slices. `compute_prefix_hash` in `src/prefix.rs` calls `is_valid_scan_boundary` as the sole call site. Do not flag the absence of `is_valid_scan_boundary` or suggest adding it in future PRs — it is already part of the public API.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-20T08:17:39.835Z
Learning: In `structured-world/lsm-tree`, the `Memtable` range-tombstone storage uses `Mutex<IntervalTree>` (introduced in PR `#21`). Switching to `RwLock<IntervalTree>` or a freeze-on-seal immutable structure is a deliberate future improvement tracked in issue `#31` — do not flag the `Mutex` usage as a blocking concern in PR `#21` or future PRs that reference this design.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-20T08:17:39.835Z
Learning: In `structured-world/lsm-tree`, the `Memtable` range-tombstone storage uses `Mutex<IntervalTree>` (introduced in PR `#21`). Switching to `RwLock<IntervalTree>` or a freeze-on-seal immutable structure is a deliberate future improvement tracked in a separate issue — do not flag the `Mutex` usage as a blocking concern in PR `#21` or future PRs that reference this design.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 67
File: src/comparator.rs:7-33
Timestamp: 2026-03-22T11:14:38.339Z
Learning: In `structured-world/lsm-tree` (`src/comparator.rs`), the `UserComparator` trait introduced in PR `#67` intentionally has no stable persisted identity (no `name()` method). Comparator identity persistence and mismatch detection on tree reopen is deferred to issue `#74`. The current design follows RocksDB's approach of placing responsibility on the caller (documented in `src/comparator.rs` lines 31–33 and in `Config::comparator` doc). Do not flag the absence of `UserComparator::name()` or comparator identity persistence as a blocking concern in PR `#67` or future PRs until issue `#74` is addressed.
✅ Actions performed

Full review triggered.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 36 out of 36 changed files in this pull request and generated 2 comments.

Comment thread src/version/run.rs
Comment thread src/config/mod.rs Outdated
polaz added 15 commits March 22, 2026 18:58
…FETY docs

- Restore key_range filter in is_suppressed_by_range_tombstones using
  comparator-aware comparison instead of lexicographic KeyRange::contains_key
- Add SAFETY comments on all unsafe get_unchecked calls in
  compare_prefixed_slice_lexicographic explaining the min() bound
- Document slow-path allocation rationale for custom comparators
- Config::comparator: pub -> pub(crate), builder method is the public API
- Document get_for_key_cmp: explains why comparator-based binary search
  is correct for runs sorted in comparator order
- decode_range_tombstones: validate start < end using comparator
  instead of lexicographic Ord
- Add SAFETY comment on final get_unchecked in
  compare_prefixed_slice_lexicographic explaining rest_len guard
- Add unit test for custom-comparator slow path (ReverseComparator)
@polaz polaz force-pushed the feat/#17-feat-custom-key-comparison--comparator branch from fd4cd54 to 0efc392 Compare March 22, 2026 16:59
@polaz polaz requested a review from Copilot March 22, 2026 16:59
@polaz
Copy link
Copy Markdown
Member Author

polaz commented Mar 22, 2026

@coderabbitai re-review

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Mar 22, 2026

@polaz Sure, I'll re-review PR #67 now!


🧠 Learnings used
Learnt from: polaz
Repo: structured-world/lsm-tree PR: 67
File: src/memtable/mod.rs:72-77
Timestamp: 2026-03-22T12:26:47.795Z
Learning: In `structured-world/lsm-tree` (`src/memtable/interval_tree.rs` and `src/memtable/mod.rs`), the memtable `IntervalTree` used for range tombstone containment queries (`is_key_suppressed_by_range_tombstone`, `query_suppression`, `insert_range_tombstone`) still uses lexicographic `Ord` on `UserKey` rather than the pluggable `SharedComparator` introduced in PR `#67`. Threading the comparator into the AVL interval tree is a non-trivial refactor intentionally deferred from PR `#67` and tracked in issue `#81`. Do not flag the absence of comparator-aware RT suppression in the memtable interval tree as a blocking concern in PR `#67` or future PRs until issue `#81` is addressed.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-20T08:17:06.298Z
Learning: In `structured-world/lsm-tree`, the switch from `Mutex<IntervalTree>` to `RwLock<IntervalTree>` (or a freeze-on-seal immutable structure) for memtable range-tombstone storage in `src/memtable/mod.rs` is intentionally deferred from PR `#21` to a separate issue (`#30`). Do not flag the existing `Mutex<IntervalTree>` usage in `src/memtable/mod.rs` as a concurrency concern in the context of PR `#21`; it is tracked in issue `#30`.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 67
File: src/comparator.rs:7-33
Timestamp: 2026-03-22T11:14:39.924Z
Learning: In `structured-world/lsm-tree` (`src/comparator.rs`), the `UserComparator` trait introduced in PR `#67` intentionally has no stable persisted identity (no `name()` method). Comparator identity persistence and mismatch detection on tree reopen is deferred to issue `#74`. The current design follows RocksDB's approach of placing responsibility on the caller (documented in `src/comparator.rs` lines 31–33 and in `Config::comparator` doc). Do not flag the absence of `UserComparator::name()` or comparator identity persistence as a blocking concern in PR `#67` or future PRs until issue `#74` is addressed.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-20T08:17:39.855Z
Learning: In `structured-world/lsm-tree`, the `Memtable` range-tombstone storage uses `Mutex<IntervalTree>` (introduced in PR `#21`). Switching to `RwLock<IntervalTree>` or a freeze-on-seal immutable structure is a deliberate future improvement tracked in issue `#31` — do not flag the `Mutex` usage as a blocking concern in PR `#21` or future PRs that reference this design.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-20T08:17:39.855Z
Learning: In `structured-world/lsm-tree`, the `Memtable` range-tombstone storage uses `Mutex<IntervalTree>` (introduced in PR `#21`). Switching to `RwLock<IntervalTree>` or a freeze-on-seal immutable structure is a deliberate future improvement tracked in a separate issue — do not flag the `Mutex` usage as a blocking concern in PR `#21` or future PRs that reference this design.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-22T09:55:39.749Z
Learning: In `structured-world/lsm-tree` (`src/compaction/leveled/mod.rs`), the multi-level compaction path (lines ~749–759) uses a single aggregate key range across all L0+L1 inputs to query L2 overlapping tables. Using per-range L2 overlap queries (one per contiguous L1 range segment) instead would avoid pulling in L2 tables that sit in gaps between disjoint L1 ranges on sparse keyspaces. This improvement is intentionally deferred from PR `#66` to a separate follow-up issue. Do not flag the coarse aggregate L2 range query as a blocking concern in PR `#66` or future PRs until that issue is addressed.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 28
File: src/tree/mod.rs:913-956
Timestamp: 2026-03-22T03:09:14.362Z
Learning: In `structured-world/lsm-tree` (`src/tree/mod.rs`), the `resolve_merge_get` function's disk-scan loop in step 3 is a flat scan across all levels/runs and does not apply the L0-specific newest-first ordering used by `get_internal_entry_from_tables`. This means it can stop on a base/tombstone boundary from an older L0 run while a newer overlapping L0 run still has visible `MergeOperand` entries. This is a known limitation documented with `// TODO(`#46`)` in the code and is intentionally deferred to issue `#46`. Do not flag the flat L0 scan in `resolve_merge_get` as a merge-semantics bug in PR `#28` or future PRs until issue `#46` is addressed.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-22T09:55:39.749Z
Learning: In `structured-world/lsm-tree` (`src/compaction/leveled/mod.rs`), the multi-level compaction path uses a single aggregate key range across all L0+L1 inputs to query L2 overlapping tables. Using per-range L2 overlap queries (one per contiguous L1 range segment) instead would avoid pulling in L2 tables that sit in gaps between disjoint L1 ranges on sparse keyspaces. This improvement is intentionally deferred from PR `#66` to issue `#72`. Do not flag the coarse aggregate L2 range query as a blocking concern in PR `#66` or future PRs until issue `#72` is addressed.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 45
File: tools/db_bench/src/workloads/mergerandom.rs:27-35
Timestamp: 2026-03-21T17:08:56.084Z
Learning: In `structured-world/lsm-tree`, `lsm_tree`'s merge operator API is internal (not public). The `mergerandom` workload in `tools/db_bench/src/workloads/mergerandom.rs` is intentionally an overwrite/compaction stress benchmark rather than a true merge-operand benchmark. True merge-operand benchmarks are planned for a future PR using the CoordiNode posting list API. Do not flag `mergerandom` for lacking merge-operand calls in PR `#45` or future PRs until that API is exposed.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 21
File: src/table/multi_writer.rs:129-206
Timestamp: 2026-03-20T11:14:05.620Z
Learning: In `structured-world/lsm-tree`, the `write_rts_to_writer` function in `src/table/multi_writer.rs` has a known limitation when `clip_range_tombstones=true` (compaction mode): RT fragments that don't intersect any output writer's KV span are silently dropped, which can lose delete-only gaps (e.g. an RT covering `[m,p)` while all output tables cover `[a,l]` and `[q,z]`). This limitation is intentionally deferred to issue `#32` and is documented as a Known Limitation in PR `#21`. Do not flag this as a blocking concern in PR `#21` or any PR until `#32` is addressed.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 21
File: src/table/writer/mod.rs:394-463
Timestamp: 2026-03-20T20:05:21.453Z
Learning: In `structured-world/lsm-tree` (`src/table/writer/mod.rs`), the `key#min`/`key#max` widening for RT bounds in `Writer::finish()` is intentionally limited to the `item_count == 0` (RT-only) branch. For mixed KV+RT tables, `MultiWriter::write_rts_to_writer` (lines 167–199 of `src/table/multi_writer.rs`) already widens `meta.first_key`/`meta.last_key` conservatively in flush mode (`clip=false`). In compaction mode (`clip=true`), RTs are intersected with the KV key range so no widening is needed. Do not flag the absence of unconditional widening in `Writer::finish()` as a bug in PR `#21` or future PRs.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 28
File: src/abstract_tree.rs:128-129
Timestamp: 2026-03-22T03:19:33.840Z
Learning: In `structured-world/lsm-tree` (`src/abstract_tree.rs`), the flush path (`AbstractTree::flush`) intentionally does NOT pass range tombstones into `CompactionStream` for RT-boundary-aware merge resolution. Flush is a co-location pass: both KV entries (including merged operands) and RTs are written together into the output SST. RT suppression is enforced exclusively at read time via `is_suppressed_by_range_tombstones` (point reads) and `RangeTombstoneFilter` (range scans). Do not flag the absence of RT propagation into `CompactionStream` during flush as a merge-semantics bug in PR `#28` or future PRs.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-22T09:21:07.612Z
Learning: In `structured-world/lsm-tree` (`src/range.rs`), prefix bloom filter skipping for multi-table runs (L0) has been implemented in PR `#64` (closes issue `#57`). The `_ =>` branch in `TreeIter::create_range` now applies a per-table key-range guard followed by a bloom check; 0 survivors skips the run, 1 survivor is demoted to `single_tables` for the range-tombstone table-skip optimization, and 2+ survivors are collected into a new `Run`. Range tombstones are still collected from ALL tables in the run regardless of bloom filtering. Do not flag the absence of multi-table run prefix bloom skipping in this repository — it has been implemented.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-21T22:21:36.968Z
Learning: In `structured-world/lsm-tree` (`src/table/mod.rs`), `Table::maybe_contains_prefix` intentionally returns `Ok(true)` (conservative fallback) when `pinned_filter_index.is_some() || regions.filter_tli.is_some()` (partitioned filter or filter TLI present). This guard was added in PR `#43` at line 793. Do not flag the absence of per-partition prefix-hash checking as a bug in PR `#43` or future PRs.

Learnt from: polaz
Repo: structured-world/lsm-tree PR: 0
File: :0-0
Timestamp: 2026-03-22T09:29:48.222Z
Learning: In `structured-world/lsm-tree` (`src/prefix.rs`), `is_valid_scan_boundary(&self, prefix: &[u8]) -> bool` has been added to the `PrefixExtractor` trait in PR `#68` (closes issue `#47`), replacing the previously internal `is_valid_prefix_boundary` name. The method has a default implementation that checks `prefixes(prefix).any(|p| p == prefix)`, correct for sub-slice extractors, and can be overridden for O(1) checks (e.g., `FixedLengthPrefix`) or for extractors whose `prefixes()` does not return sub-slices. `compute_prefix_hash` in `src/prefix.rs` calls `is_valid_scan_boundary` as the sole call site. Do not flag the absence of `is_valid_scan_boundary` or suggest adding it in future PRs — it is already part of the public API.
✅ Actions performed

Full review triggered.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 37 out of 37 changed files in this pull request and generated no new comments.

@polaz polaz merged commit 610e113 into main Mar 22, 2026
20 checks passed
@polaz polaz deleted the feat/#17-feat-custom-key-comparison--comparator branch March 22, 2026 17:13
@sw-release-bot sw-release-bot Bot mentioned this pull request Mar 22, 2026
polaz added a commit that referenced this pull request Mar 23, 2026
## Summary

- Add `name() -> &'static str` method to `UserComparator` trait for
stable comparator identity
- Persist comparator name in tree manifest; check on reopen — mismatch
returns `Error::ComparatorMismatch`
- Backward compatible: trees created before this change default to
`"default"` (matching `DefaultUserComparator`)

## Technical Details

- Comparator name written as `comparator_name` section in sfa archive
during `persist_version`
- `SuperVersions` stores `comparator_name: Arc<str>` so flush/compaction
version upgrades include it without extra plumbing
- Check runs in `Tree::recover` after manifest decode, before any data
access
- Follows RocksDB `Comparator::Name()` pattern (requested in #67 review)

## Test Plan

- [x] Reopen with same comparator succeeds
- [x] Reopen with different custom comparator fails with
`ComparatorMismatch`
- [x] Reopen custom-comparator tree with default comparator fails
- [x] Reopen default-comparator tree with default comparator succeeds
- [x] All existing tests pass (429 unit + integration)

Closes #74

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **New Features**
- Tree comparators are now persisted and automatically validated when
reopening a tree.

* **Bug Fixes**
- Attempting to reopen a tree with an incompatible comparator now fails
with a clear error message.

* **Tests**
- Added comprehensive tests for comparator persistence and validation
behavior.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat: custom key comparison / comparator

2 participants