Skip to content

feat(error): RouteMismatch error, blocked_bloom cleanup, bench/clippy fixes#166

Merged
polaz merged 13 commits into
mainfrom
feat/#164-featerror-dedicated-error-for-route-compatibility
Mar 24, 2026
Merged

feat(error): RouteMismatch error, blocked_bloom cleanup, bench/clippy fixes#166
polaz merged 13 commits into
mainfrom
feat/#164-featerror-dedicated-error-for-route-compatibility

Conversation

@polaz
Copy link
Copy Markdown
Member

@polaz polaz commented Mar 24, 2026

Summary

  • Add Error::RouteMismatch { expected, found } with level-based detection — only returned when ALL missing tables are on levels not covered by any current route (prevents masking genuine SST corruption)
  • Remove unfinished blocked_bloom module entirely (upstream Implement Blocked Bloom Filters fjall-rs/lsm-tree#78 still open, never integrated into Segment loader); preserve FilterType::BlockedBloom enum variant for on-disk format compatibility
  • Fix never-looping for loops in prop_mvcc and prop_range_tombstone oracle get() methods
  • Update/remove benchmarks for current public API (Config 3-arg constructor, Cache, use_cache, SeqNo params, IterGuardImpl/Guard pattern); remove 4 dead bench targets; fix TempDir lifetime
  • Convert #[allow]#[expect] with reason strings in 14 test modules
  • Fix map_oris_none_or and needless borrow warnings in test code
  • Update level_routes reopen contract doc to mention RouteMismatch

Test plan

  • cargo test --test level_routing — 24 passed (4 new: route mismatch, unrecoverable without routes, unrecoverable with routes, mixed covered+uncovered)
  • cargo test --test prop_mvcc — 1 passed
  • cargo test --test prop_range_tombstone — 1 passed
  • cargo clippy --all-targets --all-features — 0 errors
  • codecov patch coverage — 100%

Closes #164

Copilot AI review requested due to automatic review settings March 24, 2026 19:12
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Mar 24, 2026

📝 Walkthrough

Walkthrough

Adds a RouteMismatch error and route-aware recovery; removes several Criterion benches and blocked-bloom filter implementation; converts many test-module #[allow] to #[expect(..., reason = "test code")]; fixes iterator oracles and adds routing tests.

Changes

Cohort / File(s) Summary
Benchmarks Removed
benches/block.rs, benches/level_manifest.rs, benches/partition_point.rs, benches/tli.rs, benches/bloom.rs, Cargo.toml
Deleted multiple Criterion benchmark modules and removed corresponding [[bench]] entries from Cargo.toml.
Benchmarks Updated
benches/tree.rs
Updated tree benchmarks to new APIs (SeqNo handling, cache injection, guard-based iterator usage); removed snapshots/blob-tree sub-benchmarks.
Blocked-bloom removed
src/table/filter/blocked_bloom/...
src/table/filter/blocked_bloom/builder._rs, src/table/filter/blocked_bloom/mod._rs, src/table/filter/mod.rs
Removed blocked-bloom builder and reader implementations and tests; re-enabled pub mod blocked_bloom; exposure. Review public API deletions and callers.
New Error Variant
src/error.rs
Added pub enum Error::RouteMismatch { expected: usize, found: usize }.
Recovery Logic
src/tree/mod.rs
Tree::recover_levels now inspects config.level_routes and returns Error::RouteMismatch{expected,found} when ALL missing tables belong to levels not covered by configured routes; otherwise returns Unrecoverable.
Routing Tests
tests/level_routing.rs
Added four tests exercising reopen failure modes with routed trees and asserting RouteMismatch vs Unrecoverable.
Config docs
src/config/mod.rs
Updated documentation describing Config::level_routes reopen behavior and RouteMismatch vs Unrecoverable cases.
Clippy expectations on tests
src/...
src/compaction/leveled/mod.rs, src/compaction/stream.rs, src/compression.rs, src/file.rs, src/fs/std_fs.rs, src/memtable/arena.rs, src/mvcc_stream.rs, src/prefix.rs, src/table/block/mod.rs, src/table/block_index/iter.rs, src/table/meta.rs, src/table/mod.rs, src/table/multi_writer.rs, src/version/run.rs, src/vlog/blob_file/reader.rs, src/vlog/blob_file/scanner.rs
Added module-level #[expect(..., reason = "test code")] attributes on many #[cfg(test)] mod tests to explicitly expect common Clippy lints; no runtime changes.
Property-test and small test fixes
tests/prop_mvcc.rs, tests/prop_range_tombstone.rs, src/table/tests.rs
Rewrote oracle get implementations to use .range(...).next() (fixing never-looping for loops); minor key representation tweak in a table test.
Leveled compaction test tweaks
src/compaction/leveled/test.rs
Replaced map_or(true, ...) checks with `is_none_or(
Vlog & file test lint attrs
src/vlog/blob_file/reader.rs, src/vlog/blob_file/scanner.rs, src/file.rs
Expanded test-module #[expect] lint lists to include indexing_slicing and useless_vec with reason strings. Review test-only attributes for CI lint expectations.

Sequence Diagram(s)

sequenceDiagram
  participant Caller as Config::open()
  participant Tree as Tree::recover_levels
  participant Manifest as Manifest (manifest)
  participant FS as Filesystem (table files)

  Caller->>Tree: open(config with optional level_routes)
  Tree->>Manifest: read manifest (expected table count & levels)
  Manifest-->>Tree: manifest tables & levels
  Tree->>FS: enumerate found table files across routes/dirs
  FS-->>Tree: found table mapping (count & ids)
  alt found_count < expected_count
    Tree->>Tree: compute missing table levels
    Tree->>Tree: check if any configured route covers those levels
    alt all missing levels are uncovered
      Tree-->>Caller: Error::RouteMismatch{expected, found}
    else
      Tree-->>Caller: Error::Unrecoverable
    end
  else
    Tree-->>Caller: recovery success
  end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Poem

🐰 I hopped through routes and counted files,

Benches vanished into tidy piles,
Tests now "expect" the tiny sins,
Oracles peek once—no looping spins,
I twitch my nose and guard the aisles. 🥕

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Out of Scope Changes check ⚠️ Warning The PR includes benchmark and blocked_bloom cleanup changes that were explicitly marked as out-of-scope in #164, assigned to separate issues (#174 and maintainer manual work). Move benchmark updates and blocked_bloom removal to their respective tracked issues (#174 and related tasks) to keep this PR focused on the three in-scope objectives from #164.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately summarizes the three main change categories: RouteMismatch error addition, blocked_bloom cleanup, and benchmark/clippy fixes.
Linked Issues check ✅ Passed All primary objectives from #164 are met: RouteMismatch error added with level-based detection, never-looping for loops fixed with .next() iterator pattern, and #[allow] converted to #[expect] across test modules.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/#164-featerror-dedicated-error-for-route-compatibility

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@sw-release-bot sw-release-bot Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'lsm-tree db_bench'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.10.

Benchmark suite Current: 3791c93 Previous: 1cdc580 Ratio
fillseq 881124.7477353843 ops/sec (normalized) 1323716.6167250746 ops/sec (normalized) 1.50
fillrandom 472156.4948182528 ops/sec (normalized) 774829.5346473391 ops/sec (normalized) 1.64
readrandom 255794.3440049183 ops/sec (normalized) 371818.14238035976 ops/sec (normalized) 1.45
readseq 1463047.7775367857 ops/sec (normalized) 1670923.4123340282 ops/sec (normalized) 1.14
seekrandom 173527.55698330552 ops/sec (normalized) 269509.2857719811 ops/sec (normalized) 1.55
prefixscan 103288.2920352982 ops/sec (normalized) 134814.68759039504 ops/sec (normalized) 1.31
overwrite 490552.60781232204 ops/sec (normalized) 797083.1828964297 ops/sec (normalized) 1.62
mergerandom 362901.4011479248 ops/sec (normalized) 483433.67522262805 ops/sec (normalized) 1.33
readwhilewriting 216217.44662232275 ops/sec (normalized) 344268.30829649814 ops/sec (normalized) 1.59

This comment was automatically generated by workflow using github-action-benchmark.

CC: @polaz

@codecov
Copy link
Copy Markdown

codecov Bot commented Mar 24, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 22

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@benches/block.rs`:
- Around line 8-10: The current placeholder benchmark function placeholder and
criterion_group!(benches, placeholder) leave the block benchmark as a
zero-workload stub; either remove the bench target or replace it with a minimal
public-API benchmark that actually exercises the block path. Update
benches/block.rs by deleting the placeholder bench or implementing a small
benchmark that constructs and manipulates a DataBlock via its public API (e.g.,
build a tiny DataBlock, call the public methods you expect to keep) and register
that function in criterion_group! instead of placeholder so the benchmark suite
retains meaningful coverage.

In `@benches/level_manifest.rs`:
- Around line 8-10: The current placeholder benchmark function (placeholder) and
its inclusion via criterion_group!(benches, placeholder) leave an inert stub;
either remove the placeholder target or replace it with an actual benchmark that
measures manifest iteration/lookup through the public API. To fix, delete the
placeholder() function and the criterion_group invocation referencing it if you
want no benchmark, or implement a Criterion benchmark that calls the project’s
public manifest methods (e.g., iterate_manifest or lookup_level equivalents) and
then register that benchmark in criterion_group!(benches, <your_benchmark_fn>)
so the test suite measures real work instead of an empty stub.

In `@benches/partition_point.rs`:
- Around line 7-10: The benchmark file registers a no-op benchmark function
named placeholder via criterion_group!(benches, placeholder) and
criterion_main!(benches); either implement a real benchmark for the
partition_point behavior or remove the registration and file to avoid shipping
an empty benchmark; specifically, delete or replace the placeholder function and
update/remove the criterion_group!(benches, placeholder) /
criterion_main!(benches) entries (or register a new real benchmark function that
exercises partition_point) so the bench target is not a no-op.

In `@benches/tli.rs`:
- Around line 8-10: Remove the inert TLI benchmark target: delete the
placeholder function and its criterion_group registration (the placeholder
function and criterion_group!(benches, placeholder)) from benches/tli.rs, or
replace them with a minimal Criterion benchmark that actually exercises the
IndexBlock lookup path (e.g., invoke IndexBlock::lookup or the public API that
routes to IndexBlock) so the file once again measures the intended TLI
performance.

In `@benches/tree.rs`:
- Around line 324-326: Open two tracking issues to re-enable the disabled
benchmarks for the removed APIs: one titled "Re-enable snapshot() benchmark" and
another titled "Re-enable open_as_blob_tree() benchmark". In each issue
reference the benches/tree.rs entries that currently contain the TODOs, describe
the exact benchmark to restore (the snapshot() benchmark and the
open_as_blob_tree() benchmark), list any test/bench names or expectations
required to re-enable them, note they should be re-opened only once the public
API is re-exposed or reimplemented, and add labels (category:bench,
priority:backlog) and an owner to track follow-up.
- Around line 16-23: The Cache wrapping style is inconsistent; choose one and
make all instances uniform (e.g., replace uses of
Cache::with_capacity_bytes(...).into() with
Arc::new(Cache::with_capacity_bytes(...)) or vice versa). Update the call sites
that pass the cache into Config::new(...).use_cache(...) — specifically the
expressions around Config::new, use_cache, and Cache::with_capacity_bytes — so
every benchmark uses the same wrapper (Arc::new(Cache::with_capacity_bytes(...))
or .into()) including the blocks at the shown snippet and the other locations
noted (around the use_cache calls at lines referenced in the comment).

In `@src/compaction/leveled/mod.rs`:
- Around line 6-11: Replace the #[allow(...)] block at the top of the leveled
module with #[expect(..., reason = "...")] attributes for each clippy lint so
stale suppressions will surface under MSRV 1.90; specifically, remove the single
#[allow(...)] containing clippy::unwrap_used, clippy::indexing_slicing,
clippy::useless_vec, clippy::unnecessary_map_or and add per-lint attributes like
#[expect(clippy::unwrap_used, reason = "temporary suppression — revisit when
API/logic changes")], #[expect(clippy::indexing_slicing, reason = "...")],
#[expect(clippy::useless_vec, reason = "...")], and
#[expect(clippy::unnecessary_map_or, reason = "...")] at the same location in
mod.rs so each suppression has an explicit reason and will be flagged if
outdated.

In `@src/compaction/stream.rs`:
- Line 409: Replace the module-level #[allow(clippy::unwrap_used,
clippy::indexing_slicing, clippy::useless_vec)] with individual #[expect(...,
reason = "...")] attributes: add #[expect(clippy::unwrap_used, reason =
"test-only code in this module")] , #[expect(clippy::indexing_slicing, reason =
"test-only code in this module")] , and #[expect(clippy::useless_vec, reason =
"test-only code in this module")], keeping them at the same module scope so the
lints are marked expected rather than allowed.

In `@src/compression.rs`:
- Line 287: Replace the module attribute #[allow(clippy::unwrap_used,
clippy::indexing_slicing, clippy::useless_vec)] with explicit #[expect(...,
reason = "...")] attributes on the test module; e.g. change the attribute
decorating the test module (the one currently suppressing clippy::unwrap_used,
clippy::indexing_slicing and clippy::useless_vec) to use
#[expect(clippy::unwrap_used, reason = "tests use .expect() for clearer
failures")], #[expect(clippy::indexing_slicing, reason = "tests use direct slice
indexing for concise assertions")], and #[expect(clippy::useless_vec, reason =
"temporary Vecs are used to construct test fixtures")], so each lint has its own
expect with a brief reason.

In `@src/file.rs`:
- Line 97: Replace the module-level attribute #[allow(clippy::unwrap_used,
clippy::indexing_slicing, clippy::useless_vec)] with the newer #[expect(...,
reason = "...")] form: add three separate expectations such as
#[expect(clippy::unwrap_used, reason = "test uses unwrap to intentionally assert
success of fallible operations")] #[expect(clippy::indexing_slicing, reason =
"tests use direct indexing for concise fixed-index assertions")] and
#[expect(clippy::useless_vec, reason = "tests construct Vecs for
clarity/explicit ownership in assertions")] applied to the same test module (the
attribute decorating the test module in src/file.rs) so each suppression is
converted to an explicit expectation with a short justification.

In `@src/fs/std_fs.rs`:
- Line 314: Replace the blanket #[allow(clippy::unwrap_used,
clippy::indexing_slicing, clippy::useless_vec)] with explicit #[expect(...,
reason = "...")] attributes: add three separate attributes
#[expect(clippy::unwrap_used, reason = "...")],
#[expect(clippy::indexing_slicing, reason = "...")], and
#[expect(clippy::useless_vec, reason = "...")] placed where the original
attribute was, supplying concise reasons describing why each suppression is
necessary (e.g., safe indexing with prior bounds check, deliberate unwrap
because error cannot occur, or intentional vec usage for API compatibility)
while keeping the same scope and behavior as the original attribute.

In `@src/memtable/arena.rs`:
- Around line 305-307: The tests module currently uses a single #[allow(...)]
attribute to silence multiple clippy lints; replace each allowance with an
#[expect(..., reason = "...")] attribute to match the project's lint suppression
strategy and provide reasons. Specifically, on the mod tests block that has
#[expect(clippy::expect_used, reason = "tests use expect for brevity")] and
#[allow(clippy::unwrap_used, clippy::indexing_slicing, clippy::useless_vec)],
convert those three allows into three separate #[expect] attributes (e.g.,
#[expect(clippy::unwrap_used, reason = "...")],
#[expect(clippy::indexing_slicing, reason = "...")],
#[expect(clippy::useless_vec, reason = "...")]) so clippy will report if any
suppression becomes unnecessary while keeping the existing
#[expect(clippy::expect_used, ...)] intact.

In `@src/mvcc_stream.rs`:
- Line 327: Replace the module-level #[allow(clippy::unwrap_used,
clippy::indexing_slicing, clippy::useless_vec)] with #[expect(...)] variants and
provide short reasons for each lint so the suppression will warn if it becomes
unnecessary; specifically, change the attribute to three expects referencing the
same clippy lints used (clippy::unwrap_used, clippy::indexing_slicing,
clippy::useless_vec) and add concise reason strings like "intentional unwrap on
well-validated input", "indexing is safe due to invariants", and "vec used for
FFI/heap-local optimization" (use the same attribute site as the existing
#[expect] on the previous line to keep consistency).

In `@src/prefix.rs`:
- Line 124: Replace the new clippy suppression attribute at the top of the file
(the attribute currently written as #[allow(clippy::unwrap_used,
clippy::indexing_slicing, clippy::useless_vec)]) with an equivalent
#[expect(..., reason = "...")] attribute: change each clippy lint listed to use
#[expect(clippy::unwrap_used, reason = "...")],
#[expect(clippy::indexing_slicing, reason = "...")], and
#[expect(clippy::useless_vec, reason = "...")] (or a single combined expect if
preferred), and provide a concise reason string explaining why the lint is
expected; MSRV 1.90 supports #[expect] so update the attribute on the same item
(the attribute currently suppressing clippy lints) accordingly.

In `@src/table/block_index/iter.rs`:
- Line 91: Replace the blanket test-module lint suppression attribute
`#[allow(clippy::unwrap_used, clippy::indexing_slicing, clippy::useless_vec)]`
with `#[expect(...)]` attributes and add concise justifications for each
expectation; specifically, change each lint to `#[expect(clippy::unwrap_used,
reason = "...")]`, `#[expect(clippy::indexing_slicing, reason = "...")]`, and
`#[expect(clippy::useless_vec, reason = "...")]` on the same test module (the
attribute you see at the top of the test module in iter.rs) and fill each reason
with a short explanation (e.g., tests intentionally use unwrap for brevity,
indexing is safe due to fixed test data, vecs improve readability) so the
suppressions are explicit and compliant with the MSRV Rust 1.90 policy.

In `@src/table/block/mod.rs`:
- Line 720: Replace the blanket clippy suppression attribute
#[allow(clippy::unwrap_used, clippy::indexing_slicing, clippy::useless_vec)]
with individual #[expect(..., reason = "...")] attributes for each lint (e.g.
#[expect(clippy::unwrap_used, reason = "explain why unwrap is safe here")],
#[expect(clippy::indexing_slicing, reason = "explain why indexing is safe
here")], #[expect(clippy::useless_vec, reason = "explain why vec here is
required")]) so each suppression documents the rationale and will be flagged
automatically when the lint is removed; apply these replaces to the same
attribute location (the current module-level attribute in mod.rs).

In `@src/table/meta.rs`:
- Line 281: Replace the module-level attribute #[allow(clippy::unwrap_used,
clippy::indexing_slicing, clippy::useless_vec)] used on the test module in
src/table/meta.rs with an #[expect(..., reason = "...")] form; specifically
change the attribute on the test module to use #[expect] listing
clippy::unwrap_used, clippy::indexing_slicing, and clippy::useless_vec and add
reason = "test code" so unused suppressions will be flagged by the linter.
Ensure you update the attribute that decorates the test module (the one
currently using #[allow(...)] immediately above the mod test declaration).

In `@src/table/mod.rs`:
- Around line 21-26: Replace the blanket #[allow(...)] on the test module with
individual #[expect(..., reason = "...")] attributes for each lint so
suppression debt is explicit; specifically, for the module attribute that
currently lists clippy::unwrap_used, clippy::indexing_slicing,
clippy::useless_vec, and clippy::needless_borrows_for_generic_args, add one
#[expect(..., reason = "...")] per lint (e.g., reference the cfg(test) module in
src/table/mod.rs and replace the single #[allow(...)] with four separate
#[expect] attributes, each supplying a concise reason such as "convenience in
test code").

In `@src/table/multi_writer.rs`:
- Line 500: The attribute #[allow(clippy::unwrap_used, clippy::indexing_slicing,
clippy::useless_vec)] should be converted to the newer #[expect(..., reason =
"...")] form (or annotated with a brief reason comment if the crate MSRV <
1.79); update the attribute on the same item in multi_writer.rs to use
#[expect(clippy::unwrap_used, reason = "explain why unwrap is safe here"),
expect(clippy::indexing_slicing, reason = "explain why indexing is safe"),
expect(clippy::useless_vec, reason = "explain why Vec construction is
necessary")] when compiling on MSRV >= 1.79, otherwise retain #[allow(...)] but
add a short inline comment // reason: ... describing why each allow is required;
locate and change the attribute shown in the diff to include these reasons.

In `@src/version/run.rs`:
- Line 365: Replace the broad test-module attribute #[allow(clippy::unwrap_used,
clippy::indexing_slicing, clippy::useless_vec)] with per-lint #[expect(...,
reason = "...")] attributes on the same test module in run.rs: remove the
#[allow(...)] and add one #[expect(clippy::unwrap_used, reason = "explain why
unwrap is acceptable in these tests")] plus one
#[expect(clippy::indexing_slicing, reason = "explain safe indexing in tests")]
and one #[expect(clippy::useless_vec, reason = "explain why vec! is intentional
in tests")] so each suppression is explicit and carries a reason; keep the
attribute(s) at the module-level where the original attribute was applied (the
test module surrounding the functions in run.rs).

In `@src/vlog/blob_file/reader.rs`:
- Line 246: Replace the #[allow(clippy::unwrap_used, clippy::indexing_slicing,
clippy::useless_vec)] attribute with #[expect(..., reason = "...")] variants so
each lint suppression becomes an expect with a short rationale; specifically
change the attribute on the blob reader block from allow to expect (e.g.,
#[expect(clippy::unwrap_used, reason = "explain why unwrap is safe here")],
#[expect(clippy::indexing_slicing, reason = "...")],
#[expect(clippy::useless_vec, reason = "...")]) so the code continues to
suppress these lints but will warn when the suppression is no longer needed.

In `@src/vlog/blob_file/scanner.rs`:
- Line 201: The attribute line currently uses #[allow(clippy::unwrap_used,
clippy::indexing_slicing, clippy::useless_vec)] which regresses from the
preferred #[expect] usage; replace this #[allow(...)] with
#[expect(clippy::unwrap_used, clippy::indexing_slicing, clippy::useless_vec,
reason = "brief justification describing why these lints are intentionally
suppressed here (e.g. safe unwrap after validation, performance-driven slicing,
vector needed for FFI)")] so each suppressed lint is tracked and will warn if
the suppression becomes unnecessary—keep the same lint names and provide a
concise reason string relevant to the scanner.rs code block where this attribute
sits.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: e740ae2c-d242-45aa-bd42-50bbe8949804

📥 Commits

Reviewing files that changed from the base of the PR and between 1d04814 and 7673762.

📒 Files selected for processing (31)
  • benches/block.rs
  • benches/level_manifest.rs
  • benches/partition_point.rs
  • benches/tli.rs
  • benches/tree.rs
  • src/compaction/leveled/mod.rs
  • src/compaction/leveled/test.rs
  • src/compaction/stream.rs
  • src/compression.rs
  • src/error.rs
  • src/file.rs
  • src/fs/std_fs.rs
  • src/memtable/arena.rs
  • src/mvcc_stream.rs
  • src/prefix.rs
  • src/table/block/mod.rs
  • src/table/block_index/iter.rs
  • src/table/filter/blocked_bloom/builder.rs
  • src/table/filter/blocked_bloom/mod.rs
  • src/table/filter/mod.rs
  • src/table/meta.rs
  • src/table/mod.rs
  • src/table/multi_writer.rs
  • src/table/tests.rs
  • src/tree/mod.rs
  • src/version/run.rs
  • src/vlog/blob_file/reader.rs
  • src/vlog/blob_file/scanner.rs
  • tests/level_routing.rs
  • tests/prop_mvcc.rs
  • tests/prop_range_tombstone.rs

Comment thread benches/block.rs Outdated
Comment thread benches/level_manifest.rs Outdated
Comment thread benches/partition_point.rs Outdated
Comment thread benches/tli.rs Outdated
Comment thread benches/tree.rs
Comment thread src/table/mod.rs Outdated
Comment thread src/table/multi_writer.rs Outdated
Comment thread src/version/run.rs Outdated
Comment thread src/vlog/blob_file/reader.rs Outdated
Comment thread src/vlog/blob_file/scanner.rs Outdated
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR improves reopen/recovery diagnostics for level-routing configurations, restores the blocked_bloom filter module, and updates tests/benches to align with the current public API and clippy expectations.

Changes:

  • Add Error::RouteMismatch { expected, found } and return it during recovery when level_routes is configured and fewer tables are found than expected.
  • Re-enable table::filter::blocked_bloom and adjust bloom/test/bench code to current APIs (iter guards, cache config, seqno params).
  • Fix clippy issues in integration tests (replace never-looping for with .next(), adopt is_none_or, and add lint suppressions in test modules).

Reviewed changes

Copilot reviewed 31 out of 31 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
tests/prop_range_tombstone.rs Fix oracle get() to use .range(...).next() (removes never-looping loop).
tests/prop_mvcc.rs Fix oracle get() to use .range(...).next() (removes never-looping loop).
tests/level_routing.rs Add regression test asserting RouteMismatch on reopen with missing route coverage.
src/vlog/blob_file/scanner.rs Switch test-module lint suppression to #[allow(...)].
src/vlog/blob_file/reader.rs Switch test-module lint suppression to #[allow(...)].
src/version/run.rs Switch test-module lint suppression to #[allow(...)].
src/tree/mod.rs Emit RouteMismatch when recovery finds fewer tables with level_routes configured.
src/table/tests.rs Adjust InternalValue::from_components call site key argument.
src/table/multi_writer.rs Add test-module lint suppression via #[allow(...)].
src/table/mod.rs Add test-module lint suppression via #[allow(...)].
src/table/meta.rs Add test-module lint suppression via #[allow(...)].
src/table/filter/mod.rs Re-enable blocked_bloom module export.
src/table/filter/blocked_bloom/mod.rs Add clippy #[expect] annotations; remove unused helpers.
src/table/filter/blocked_bloom/builder.rs Add clippy #[expect] annotations around float/int casts for sizing logic.
src/table/block_index/iter.rs Add test-module lint suppression via #[allow(...)].
src/table/block/mod.rs Add test-module lint suppression via #[allow(...)].
src/prefix.rs Add test-module lint suppression via #[allow(...)].
src/mvcc_stream.rs Add test-module lint suppression via #[allow(...)].
src/memtable/arena.rs Add test-module lint suppression via #[allow(...)].
src/fs/std_fs.rs Add test-module lint suppression via #[allow(...)].
src/file.rs Add test-module lint suppression via #[allow(...)].
src/error.rs Add Error::RouteMismatch { expected, found } variant and docs.
src/compression.rs Add test-module lint suppression via #[allow(...)].
src/compaction/stream.rs Add test-module lint suppression via #[allow(...)].
src/compaction/leveled/test.rs Use is_none_or instead of map_or.
src/compaction/leveled/mod.rs Add test-module lint suppression via #[allow(...)].
benches/tree.rs Update benchmarks to new public API (Config ctor, cache, seqno, guard iteration), and stub removed snapshot/blob benches.
benches/tli.rs Stub benchmark that depended on removed internal index types.
benches/partition_point.rs Stub benchmark that depended on removed binary_search public API.
benches/level_manifest.rs Stub benchmarks that accessed internal tree.levels.
benches/block.rs Stub benchmarks that depended on removed/restructured internal block types.
Comments suppressed due to low confidence (1)

src/table/filter/blocked_bloom/mod.rs:48

  • BlockedBloomFilterReader::new() parses on-disk bytes but still uses assert_eq! for the filter type and hash type (and relies on an expect() for slicing). If a table’s filter block is corrupted or from a different format, these asserts will panic and can crash the process; this should return a structured Error (e.g., InvalidHeader/InvalidTag/Unrecoverable) instead of panicking. Also consider validating the u64→usize conversions (use try_into / bounds checks) so a corrupted header can’t truncate/overflow and lead to incorrect indexing.

Comment thread src/tree/mod.rs Outdated
Comment thread src/vlog/blob_file/scanner.rs Outdated
Comment thread src/error.rs Outdated
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/tree/mod.rs`:
- Around line 1628-1659: The current logic in the branch that returns
crate::Error::RouteMismatch uses only config.level_routes to decide if missing
tables are due to a routing change, which misclassifies cases where the
route-to-path mapping changed; fix by persisting route provenance (either add
previous route→path mapping to the manifest or per-table metadata) and update
the check that computes all_missing_uncovered to compare missing tables'
persisted route/path identity against current config.level_routes (use
table_map's stored provenance if present) before returning Error::RouteMismatch;
if you cannot persist provenance in this change, revert to returning
Unrecoverable instead of RouteMismatch to avoid misclassification (update
references: config.level_routes, table_map, and crate::Error::RouteMismatch).

In `@tests/level_routing.rs`:
- Around line 714-720: The test removes one file from the directory but never
asserts that a deletion actually happened; modify the loop that iterates over
std::fs::read_dir(&tables_dir) so it tracks whether a file was removed (e.g., a
boolean `deleted` flag set to true inside the if branch where
std::fs::remove_file(entry.path()) is called) and after breaking from the loop
add an assertion (e.g., assert!(deleted, "expected to delete at least one table
file")) to fail the test if no file was removed; apply the same change to the
other identical block referenced around lines 756-762.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 0b6428ba-1391-4ca0-bfff-9dbeedca4e32

📥 Commits

Reviewing files that changed from the base of the PR and between 7673762 and 3dd6e43.

📒 Files selected for processing (26)
  • Cargo.toml
  • benches/block.rs
  • benches/level_manifest.rs
  • benches/partition_point.rs
  • benches/tli.rs
  • benches/tree.rs
  • src/compaction/leveled/mod.rs
  • src/compaction/stream.rs
  • src/compression.rs
  • src/config/mod.rs
  • src/error.rs
  • src/file.rs
  • src/fs/std_fs.rs
  • src/memtable/arena.rs
  • src/mvcc_stream.rs
  • src/prefix.rs
  • src/table/block/mod.rs
  • src/table/block_index/iter.rs
  • src/table/meta.rs
  • src/table/mod.rs
  • src/table/multi_writer.rs
  • src/tree/mod.rs
  • src/version/run.rs
  • src/vlog/blob_file/reader.rs
  • src/vlog/blob_file/scanner.rs
  • tests/level_routing.rs
💤 Files with no reviewable changes (5)
  • benches/block.rs
  • benches/tli.rs
  • benches/partition_point.rs
  • benches/level_manifest.rs
  • Cargo.toml

Comment thread src/tree/mod.rs
Comment thread tests/level_routing.rs
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 33 out of 33 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (1)

src/table/filter/blocked_bloom/mod.rs:48

  • BlockedBloomFilterReader::new parses potentially untrusted on-disk bytes but currently relies on assert_eq! / expect() and unchecked as usize casts. Corrupted or malicious filter blocks could trigger panics (e.g., invalid filter/hash type, out-of-bounds slice, num_blocks == 0 leading to division-by-zero later) or truncation. Instead of suppressing clippy here, validate fields (num_blocks > 0, k > 0, bounds) and return a structured Error on invalid input.

Comment thread src/tree/mod.rs
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/table/filter/blocked_bloom/mod.rs`:
- Around line 237-245: Add a unit test that constructs a complete
BlockedBloomFilter header (including MAGIC_BYTES, filter type byte, hash_type,
num_blocks and k as little-endian u64s) but omits the data section so that
BlockedBloomFilterReader::new advances past reader.read_u8() and then hits the
slice.get(offset..) → ok_or(...) validation; call BlockedBloomFilterReader::new
with that byte vector and assert the expected error result (or assert is_err()
if you want to require failure) to ensure the truncated data path is exercised.
- Around line 77-84: The constructor builds a BitArrayReader from raw data but
never validates that the claimed num_blocks fits in the provided buffer,
allowing later contains_hash to compute block/bit offsets that panic inside
BitArrayReader::get; fix by checking after slicing that data.len() >=
(num_blocks as usize).saturating_mul(CACHE_LINE_BYTES) and return
Err(crate::Error::InvalidHeader("BloomFilter: truncated data")) if not, before
calling BitArrayReader::new (use the existing symbols k, num_blocks,
BitArrayReader::new, contains_hash, and CACHE_LINE_BYTES to locate the code).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: da5ae764-f7d4-4928-8588-7fca6f2e483f

📥 Commits

Reviewing files that changed from the base of the PR and between 3dd6e43 and 475d39e.

📒 Files selected for processing (1)
  • src/table/filter/blocked_bloom/mod.rs

Comment thread src/table/filter/blocked_bloom/mod.rs Outdated
Comment thread src/table/filter/blocked_bloom/mod.rs Outdated
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/level_routing.rs`:
- Around line 796-851: The test currently only creates hot-tier tables, so
removing the warm route doesn't produce a mixed covered+uncovered missing-table
scenario; update mixed_missing_covered_and_uncovered_returns_unrecoverable to
explicitly create at least one warm-level table before deleting a hot table:
after the initial three_tier_config open/flush steps (where tree.insert and
tree.flush_active_memtable are used for hot), open the same three_tier_config
and perform an insert + flush that targets the warm tier (use tree.insert(...)
then tree.flush_active_memtable(...) for the warm-level flush or otherwise use
the API the test uses to force a flush into the warm level) so there is an
actual warm table on disk, then proceed to delete a file from hot_tables_dir and
remove the warm LevelRoute when reopening so the test exercises the mixed
covered+uncovered case and expects lsm_tree::Error::Unrecoverable.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 31c15da3-e139-4b3e-83ce-38fb1870d908

📥 Commits

Reviewing files that changed from the base of the PR and between d7d13f2 and 3337e44.

📒 Files selected for processing (1)
  • tests/level_routing.rs

Comment thread tests/level_routing.rs
@polaz polaz review requested due to automatic review settings March 24, 2026 20:15
polaz added 10 commits March 24, 2026 22:16
- Add Error::RouteMismatch variant returned when recovery finds fewer
  tables than expected AND level_routes is configured, distinguishing
  misconfiguration from actual data corruption (Unrecoverable)
- Re-enable blocked_bloom filter module (was disabled via ._rs rename)
  with proper clippy expect annotations
- Fix never-looping for loops in prop_mvcc and prop_range_tombstone
  oracles (replace with iterator .next())
- Update benchmarks for current API: Config::new(3 args),
  Cache (was BlockCache), use_cache (was block_cache), SeqNo params,
  IterGuardImpl iteration pattern
- Stub out benchmarks for removed internal types (ValueBlock,
  TopLevelIndex, binary_search)
- Add #[allow] for clippy lints in #[cfg(test)] modules
- Fix map_or → is_none_or and &[key] → [key] in test code

Closes #164
Adds missing_table_without_routes_returns_unrecoverable test to cover
the Unrecoverable branch (no level_routes configured but table file
deleted), bringing patch coverage to 100%.
…tion

- RouteMismatch is now returned ONLY when all missing tables are on
  levels not covered by any current route (strongly suggests removed
  routes).  If any missing table is on a level covered by a current
  route, return Unrecoverable (the SST was genuinely deleted/corrupted)
- Soften RouteMismatch doc: "often indicates" instead of "this is"
- Update level_routes reopen contract doc to mention RouteMismatch
- Add regression test: deleted_sst_with_routes_configured_returns_unrecoverable
- Convert #[allow] → #[expect] for clippy lint suppressions in 14 test modules
- Remove 4 dead placeholder bench targets (block, level_manifest,
  partition_point, tli) and their Cargo.toml entries
- Unify Cache wrapping to Arc::new() in bench/tree.rs
- Doc now accurately describes level-based detection logic
- Convert remaining #[allow] → #[expect] in leveled/mod.rs and table/mod.rs
assert_eq! on filter_type and hash_type would panic on corrupted
on-disk data instead of returning a structured error. Replace with
conditional checks returning Error::InvalidHeader. Also replace
expect() on slice bounds with ok_or().

Adds regression tests: corrupted_filter_type_returns_error,
corrupted_hash_type_returns_error, truncated_filter_returns_error.
blocked_bloom was never integrated into the Segment loader pipeline
(upstream fjall-rs#78 is still open). Remove the module
source, re-disable the module declaration, and clean up bench
references. FilterType::BlockedBloom enum variant is preserved for
on-disk format compatibility (tag byte 1).
Add deleted flag + assertion to both missing_table_without_routes and
deleted_sst_with_routes_configured tests so they fail explicitly if
the test setup didn't actually remove a file.
Add code comment noting that level-based RouteMismatch detection is
best-effort and cannot detect same-level path changes. Persisting
route provenance per-table would enable exact detection but requires
a manifest format change.
Exercises the path where some missing tables are on covered levels
(corruption) and some on uncovered levels (route mismatch) — must
return Unrecoverable since at least one SST was genuinely lost.
Compact one table to L3 (warm tier) and leave another at L0 (hot tier)
so the test genuinely exercises mixed covered+uncovered missing tables
when reopening with a partial route config.
@polaz polaz force-pushed the feat/#164-featerror-dedicated-error-for-route-compatibility branch from 7a40ad1 to 3791c93 Compare March 24, 2026 20:16
@polaz polaz requested a review from Copilot March 24, 2026 20:16
@polaz polaz changed the title feat(error): dedicated error for route-compatibility mismatch on reopen feat(error): RouteMismatch error, blocked_bloom cleanup, bench/clippy fixes Mar 24, 2026
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 34 out of 34 changed files in this pull request and generated 4 comments.

Comment thread src/error.rs Outdated
Comment thread benches/tree.rs
Comment thread benches/tree.rs
Comment thread benches/bloom.rs
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/config/mod.rs`:
- Around line 592-594: Update the doc comment in config::mod (the triple-slash
comment above the relevant item) to soften the guarantee: change the
unconditional phrasing that “omitting an old folder will fail with
RouteMismatch” to indicate it “may fail with RouteMismatch when all missing
tables are on levels not covered by config.level_routes,” and mention that
recover_levels in src/tree/mod.rs uses a best-effort level-coverage heuristic
(it does not track per-table route provenance) so mixed cases can still surface
as Unrecoverable; reference recover_levels and config.level_routes in the text
to make the behavior explicit.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 32ce5d55-24d8-484c-9f65-4a48a5703e08

📥 Commits

Reviewing files that changed from the base of the PR and between d7d13f2 and 3791c93.

📒 Files selected for processing (34)
  • Cargo.toml
  • benches/block.rs
  • benches/bloom.rs
  • benches/level_manifest.rs
  • benches/partition_point.rs
  • benches/tli.rs
  • benches/tree.rs
  • src/compaction/leveled/mod.rs
  • src/compaction/leveled/test.rs
  • src/compaction/stream.rs
  • src/compression.rs
  • src/config/mod.rs
  • src/error.rs
  • src/file.rs
  • src/fs/std_fs.rs
  • src/memtable/arena.rs
  • src/mvcc_stream.rs
  • src/prefix.rs
  • src/table/block/mod.rs
  • src/table/block_index/iter.rs
  • src/table/filter/blocked_bloom/builder._rs
  • src/table/filter/blocked_bloom/mod._rs
  • src/table/filter/mod.rs
  • src/table/meta.rs
  • src/table/mod.rs
  • src/table/multi_writer.rs
  • src/table/tests.rs
  • src/tree/mod.rs
  • src/version/run.rs
  • src/vlog/blob_file/reader.rs
  • src/vlog/blob_file/scanner.rs
  • tests/level_routing.rs
  • tests/prop_mvcc.rs
  • tests/prop_range_tombstone.rs
💤 Files with no reviewable changes (9)
  • benches/bloom.rs
  • src/table/filter/mod.rs
  • benches/tli.rs
  • benches/level_manifest.rs
  • benches/partition_point.rs
  • Cargo.toml
  • src/table/filter/blocked_bloom/mod._rs
  • src/table/filter/blocked_bloom/builder._rs
  • benches/block.rs

Comment thread src/config/mod.rs Outdated
- Fix level_route → level_routes typo in RouteMismatch rustdoc
- Soften level_routes reopen contract: "may fail" with RouteMismatch
  or Unrecoverable depending on which levels are affected
- Add comment explaining Guard trait import for into_inner()
- Hoist log::error! args into let bindings for coverage visibility
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 34 out of 34 changed files in this pull request and generated 8 comments.

Comment thread benches/tree.rs
Comment thread benches/tree.rs
Comment thread benches/tree.rs
Comment thread benches/tree.rs
Comment thread benches/tree.rs
Comment thread benches/tree.rs
Comment thread benches/tree.rs
Comment thread benches/tree.rs
TempDir was moved into Config::new, causing immediate drop and
directory deletion. Pass by reference so TempDir lives for the
duration of the benchmark closure.
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@benches/tree.rs`:
- Around line 100-108: The iterator currently swallows iteration errors by using
guard.into_inner().ok(), which masks read failures; change both scan pipelines
that call tree.iter(MAX, None) (and the other similar pipeline at the later
block) to unwrap the result from guard.into_inner() (e.g., replace .ok() with
.unwrap() or otherwise propagate/panic) so the benchmark fails fast on iterator
read errors; locate uses of guard.into_inner().ok() in the scan closures and
replace them with a panic-on-error approach (unwrap) to ensure broken entries
are not silently skipped.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: d3a0136b-a8d0-484e-9340-a2b46fac1b4f

📥 Commits

Reviewing files that changed from the base of the PR and between dad413c and 7bc435e.

📒 Files selected for processing (1)
  • benches/tree.rs

Comment thread benches/tree.rs
filter_map(ok()) swallowed scan errors, making benchmarks look correct
even with broken entries. Use map(unwrap()) to fail fast.
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 34 out of 34 changed files in this pull request and generated 1 comment.

Comment thread src/tree/mod.rs
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 34 out of 34 changed files in this pull request and generated no new comments.

@polaz polaz merged commit 8a07c58 into main Mar 24, 2026
25 checks passed
@polaz polaz deleted the feat/#164-featerror-dedicated-error-for-route-compatibility branch March 24, 2026 22:00
@sw-release-bot sw-release-bot Bot mentioned this pull request Mar 24, 2026
polaz pushed a commit that referenced this pull request Mar 27, 2026
## 🤖 New release

* `coordinode-lsm-tree`: 4.1.0 -> 4.2.0

<details><summary><i><b>Changelog</b></i></summary><p>

<blockquote>

##
[4.2.0](v4.1.0...v4.2.0)
- 2026-03-26

### Added

- comparator-aware range tombstones
([#180](#180))
- *(compression)* CompressionProvider trait + pure Rust zstd backend
([#176](#176))
- *(error)* RouteMismatch error, blocked_bloom cleanup, bench/clippy
fixes
([#166](#166))
- *(config)* per-level Fs routing for tiered storage
([#163](#163))

### Performance

- *(bench)* consolidate benchmarks + nextest + flamegraph pipeline
([#175](#175))

### Testing

- *(table)* add zstd dict helper coverage
([#181](#181))
</blockquote>


</p></details>

---
This PR was generated with
[release-plz](https://github.com/release-plz/release-plz/).

Co-authored-by: sw-release-bot[bot] <255865126+sw-release-bot[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat(error): RouteMismatch error + test/clippy fixes

2 participants