feat(error): RouteMismatch error, blocked_bloom cleanup, bench/clippy fixes (#166)
Conversation
📝 Walkthrough
Adds a RouteMismatch error and route-aware recovery; removes several Criterion benches and the blocked-bloom filter implementation; converts many test-module `#[allow]` lint suppressions to `#[expect]`.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Caller as Config::open()
    participant Tree as Tree::recover_levels
    participant Manifest as Manifest (manifest)
    participant FS as Filesystem (table files)
    Caller->>Tree: open(config with optional level_routes)
    Tree->>Manifest: read manifest (expected table count & levels)
    Manifest-->>Tree: manifest tables & levels
    Tree->>FS: enumerate found table files across routes/dirs
    FS-->>Tree: found table mapping (count & ids)
    alt found_count < expected_count
        Tree->>Tree: compute missing table levels
        Tree->>Tree: check if any configured route covers those levels
        alt all missing levels are uncovered
            Tree-->>Caller: Error::RouteMismatch{expected, found}
        else
            Tree-->>Caller: Error::Unrecoverable
        end
    else
        Tree-->>Caller: recovery success
    end
```
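The branch at the bottom of the diagram can be sketched as a small, self-contained decision function. Everything here is a stand-in (`RecoveryError`, `LevelRoute`, and `classify_missing` are hypothetical names, not the crate's real types); the actual logic lives in `Tree::recover_levels`.

```rust
// Hypothetical sketch of the recovery classification described above.
#[derive(Debug, PartialEq)]
enum RecoveryError {
    RouteMismatch { expected: usize, found: usize },
    Unrecoverable,
}

/// Assumed shape: a route covers a contiguous range of levels.
struct LevelRoute {
    min_level: u8,
    max_level: u8,
}

fn classify_missing(
    expected: usize,
    found: usize,
    missing_levels: &[u8],
    routes: &[LevelRoute],
) -> Result<(), RecoveryError> {
    if found >= expected {
        return Ok(());
    }
    // A missing table is "covered" if some configured route still maps its level.
    let any_covered = missing_levels.iter().any(|lvl| {
        routes
            .iter()
            .any(|r| (r.min_level..=r.max_level).contains(lvl))
    });
    if any_covered {
        // A table on a still-routed level is genuinely gone: data loss.
        Err(RecoveryError::Unrecoverable)
    } else {
        // All missing tables sit on unrouted levels: likely a removed route.
        Err(RecoveryError::RouteMismatch { expected, found })
    }
}
```

Note the asymmetry this encodes: `RouteMismatch` is only returned when *every* missing table is on an uncovered level, which is why the mixed covered+uncovered case later in the review must yield `Unrecoverable`.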
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
⚠️ Performance Alert ⚠️
Possible performance regression was detected for benchmark 'lsm-tree db_bench'.
The benchmark result of this commit is worse than the previous result, exceeding the regression threshold of 1.10.
| Benchmark suite | Current: 3791c93 (ops/sec, normalized) | Previous: 1cdc580 (ops/sec, normalized) | Ratio |
|---|---|---|---|
| fillseq | 881124.7477353843 | 1323716.6167250746 | 1.50 |
| fillrandom | 472156.4948182528 | 774829.5346473391 | 1.64 |
| readrandom | 255794.3440049183 | 371818.14238035976 | 1.45 |
| readseq | 1463047.7775367857 | 1670923.4123340282 | 1.14 |
| seekrandom | 173527.55698330552 | 269509.2857719811 | 1.55 |
| prefixscan | 103288.2920352982 | 134814.68759039504 | 1.31 |
| overwrite | 490552.60781232204 | 797083.1828964297 | 1.62 |
| mergerandom | 362901.4011479248 | 483433.67522262805 | 1.33 |
| readwhilewriting | 216217.44662232275 | 344268.30829649814 | 1.59 |
This comment was automatically generated by workflow using github-action-benchmark.
CC: @polaz
Codecov Report: ✅ All modified and coverable lines are covered by tests.
Actionable comments posted: 22
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@benches/block.rs`:
- Around line 8-10: The current placeholder benchmark function placeholder and
criterion_group!(benches, placeholder) leave the block benchmark as a
zero-workload stub; either remove the bench target or replace it with a minimal
public-API benchmark that actually exercises the block path. Update
benches/block.rs by deleting the placeholder bench or implementing a small
benchmark that constructs and manipulates a DataBlock via its public API (e.g.,
build a tiny DataBlock, call the public methods you expect to keep) and register
that function in criterion_group! instead of placeholder so the benchmark suite
retains meaningful coverage.
In `@benches/level_manifest.rs`:
- Around line 8-10: The current placeholder benchmark function (placeholder) and
its inclusion via criterion_group!(benches, placeholder) leave an inert stub;
either remove the placeholder target or replace it with an actual benchmark that
measures manifest iteration/lookup through the public API. To fix, delete the
placeholder() function and the criterion_group invocation referencing it if you
want no benchmark, or implement a Criterion benchmark that calls the project’s
public manifest methods (e.g., iterate_manifest or lookup_level equivalents) and
then register that benchmark in criterion_group!(benches, <your_benchmark_fn>)
so the test suite measures real work instead of an empty stub.
In `@benches/partition_point.rs`:
- Around line 7-10: The benchmark file registers a no-op benchmark function
named placeholder via criterion_group!(benches, placeholder) and
criterion_main!(benches); either implement a real benchmark for the
partition_point behavior or remove the registration and file to avoid shipping
an empty benchmark; specifically, delete or replace the placeholder function and
update/remove the criterion_group!(benches, placeholder) /
criterion_main!(benches) entries (or register a new real benchmark function that
exercises partition_point) so the bench target is not a no-op.
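The Criterion wiring aside, the core of any replacement benchmark is that it performs real work through a public API rather than an empty stub. A plain-stdlib sketch of the kind of workload a `partition_point` bench would measure (function names here are illustrative, not the crate's API):

```rust
use std::time::{Duration, Instant};

// Stand-in for the real work a partition_point benchmark would exercise.
fn partition_point_workload(data: &[u64], key: u64) -> usize {
    data.partition_point(|&x| x < key)
}

// Run the workload many times and return (checksum, elapsed) so the work
// cannot be optimized away and the caller can report a timing.
fn timed_run(data: &[u64], iters: u64) -> (usize, Duration) {
    let start = Instant::now();
    let mut acc = 0usize;
    let len = data.len() as u64;
    for i in 0..iters {
        let key = i.wrapping_mul(997) % len;
        acc = acc.wrapping_add(partition_point_workload(data, key));
    }
    (acc, start.elapsed())
}
```

In a real Criterion target, `timed_run`'s loop body would go inside `b.iter(...)` and the checksum through `black_box`; the point is only that the registered function does measurable work.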
In `@benches/tli.rs`:
- Around line 8-10: Remove the inert TLI benchmark target: delete the
placeholder function and its criterion_group registration (the placeholder
function and criterion_group!(benches, placeholder)) from benches/tli.rs, or
replace them with a minimal Criterion benchmark that actually exercises the
IndexBlock lookup path (e.g., invoke IndexBlock::lookup or the public API that
routes to IndexBlock) so the file once again measures the intended TLI
performance.
In `@benches/tree.rs`:
- Around line 324-326: Open two tracking issues to re-enable the disabled
benchmarks for the removed APIs: one titled "Re-enable snapshot() benchmark" and
another titled "Re-enable open_as_blob_tree() benchmark". In each issue
reference the benches/tree.rs entries that currently contain the TODOs, describe
the exact benchmark to restore (the snapshot() benchmark and the
open_as_blob_tree() benchmark), list any test/bench names or expectations
required to re-enable them, note they should be re-opened only once the public
API is re-exposed or reimplemented, and add labels (category:bench,
priority:backlog) and an owner to track follow-up.
- Around line 16-23: The Cache wrapping style is inconsistent; choose one and
make all instances uniform (e.g., replace uses of
Cache::with_capacity_bytes(...).into() with
Arc::new(Cache::with_capacity_bytes(...)) or vice versa). Update the call sites
that pass the cache into Config::new(...).use_cache(...) — specifically the
expressions around Config::new, use_cache, and Cache::with_capacity_bytes — so
every benchmark uses the same wrapper (Arc::new(Cache::with_capacity_bytes(...))
or .into()) including the blocks at the shown snippet and the other locations
noted (around the use_cache calls at lines referenced in the comment).
In `@src/compaction/leveled/mod.rs`:
- Around line 6-11: Replace the #[allow(...)] block at the top of the leveled
module with #[expect(..., reason = "...")] attributes for each clippy lint so
stale suppressions will surface under MSRV 1.90; specifically, remove the single
#[allow(...)] containing clippy::unwrap_used, clippy::indexing_slicing,
clippy::useless_vec, clippy::unnecessary_map_or and add per-lint attributes like
#[expect(clippy::unwrap_used, reason = "temporary suppression — revisit when
API/logic changes")], #[expect(clippy::indexing_slicing, reason = "...")],
#[expect(clippy::useless_vec, reason = "...")], and
#[expect(clippy::unnecessary_map_or, reason = "...")] at the same location in
mod.rs so each suppression has an explicit reason and will be flagged if
outdated.
In `@src/compaction/stream.rs`:
- Line 409: Replace the module-level #[allow(clippy::unwrap_used,
clippy::indexing_slicing, clippy::useless_vec)] with individual #[expect(...,
reason = "...")] attributes: add #[expect(clippy::unwrap_used, reason =
"test-only code in this module")] , #[expect(clippy::indexing_slicing, reason =
"test-only code in this module")] , and #[expect(clippy::useless_vec, reason =
"test-only code in this module")], keeping them at the same module scope so the
lints are marked expected rather than allowed.
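The `#[allow]` → `#[expect]` conversion the review keeps asking for looks like this in miniature. The PR's lints are clippy ones (`unwrap_used`, `indexing_slicing`, `useless_vec`); a built-in rustc lint stands in below so the snippet checks on its own. `#[expect(..., reason = "...")]` is stable since Rust 1.81, well under the crate's stated MSRV of 1.90.

```rust
// Before: a blanket suppression that silently goes stale.
// #[allow(clippy::unwrap_used, clippy::indexing_slicing, clippy::useless_vec)]

// After: one expectation per lint, each with a reason. If the lint ever
// stops firing, the compiler emits an "unfulfilled expectation" warning,
// flagging the suppression as dead.
#[expect(unused_variables, reason = "test-only code in this module")]
fn demo() -> u32 {
    let scratch = 0u32; // triggers the lint, fulfilling the expectation
    42
}
```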
In `@src/compression.rs`:
- Line 287: Replace the module attribute #[allow(clippy::unwrap_used,
clippy::indexing_slicing, clippy::useless_vec)] with explicit #[expect(...,
reason = "...")] attributes on the test module; e.g. change the attribute
decorating the test module (the one currently suppressing clippy::unwrap_used,
clippy::indexing_slicing and clippy::useless_vec) to use
#[expect(clippy::unwrap_used, reason = "tests use .expect() for clearer
failures")], #[expect(clippy::indexing_slicing, reason = "tests use direct slice
indexing for concise assertions")], and #[expect(clippy::useless_vec, reason =
"temporary Vecs are used to construct test fixtures")], so each lint has its own
expect with a brief reason.
In `@src/file.rs`:
- Line 97: Replace the module-level attribute #[allow(clippy::unwrap_used,
clippy::indexing_slicing, clippy::useless_vec)] with the newer #[expect(...,
reason = "...")] form: add three separate expectations such as
#[expect(clippy::unwrap_used, reason = "test uses unwrap to intentionally assert
success of fallible operations")] #[expect(clippy::indexing_slicing, reason =
"tests use direct indexing for concise fixed-index assertions")] and
#[expect(clippy::useless_vec, reason = "tests construct Vecs for
clarity/explicit ownership in assertions")] applied to the same test module (the
attribute decorating the test module in src/file.rs) so each suppression is
converted to an explicit expectation with a short justification.
In `@src/fs/std_fs.rs`:
- Line 314: Replace the blanket #[allow(clippy::unwrap_used,
clippy::indexing_slicing, clippy::useless_vec)] with explicit #[expect(...,
reason = "...")] attributes: add three separate attributes
#[expect(clippy::unwrap_used, reason = "...")],
#[expect(clippy::indexing_slicing, reason = "...")], and
#[expect(clippy::useless_vec, reason = "...")] placed where the original
attribute was, supplying concise reasons describing why each suppression is
necessary (e.g., safe indexing with prior bounds check, deliberate unwrap
because error cannot occur, or intentional vec usage for API compatibility)
while keeping the same scope and behavior as the original attribute.
In `@src/memtable/arena.rs`:
- Around line 305-307: The tests module currently uses a single #[allow(...)]
attribute to silence multiple clippy lints; replace each allowance with an
#[expect(..., reason = "...")] attribute to match the project's lint suppression
strategy and provide reasons. Specifically, on the mod tests block that has
#[expect(clippy::expect_used, reason = "tests use expect for brevity")] and
#[allow(clippy::unwrap_used, clippy::indexing_slicing, clippy::useless_vec)],
convert those three allows into three separate #[expect] attributes (e.g.,
#[expect(clippy::unwrap_used, reason = "...")],
#[expect(clippy::indexing_slicing, reason = "...")],
#[expect(clippy::useless_vec, reason = "...")]) so clippy will report if any
suppression becomes unnecessary while keeping the existing
#[expect(clippy::expect_used, ...)] intact.
In `@src/mvcc_stream.rs`:
- Line 327: Replace the module-level #[allow(clippy::unwrap_used,
clippy::indexing_slicing, clippy::useless_vec)] with #[expect(...)] variants and
provide short reasons for each lint so the suppression will warn if it becomes
unnecessary; specifically, change the attribute to three expects referencing the
same clippy lints used (clippy::unwrap_used, clippy::indexing_slicing,
clippy::useless_vec) and add concise reason strings like "intentional unwrap on
well-validated input", "indexing is safe due to invariants", and "vec used for
FFI/heap-local optimization" (use the same attribute site as the existing
#[expect] on the previous line to keep consistency).
In `@src/prefix.rs`:
- Line 124: Replace the new clippy suppression attribute at the top of the file
(the attribute currently written as #[allow(clippy::unwrap_used,
clippy::indexing_slicing, clippy::useless_vec)]) with an equivalent
#[expect(..., reason = "...")] attribute: change each clippy lint listed to use
#[expect(clippy::unwrap_used, reason = "...")],
#[expect(clippy::indexing_slicing, reason = "...")], and
#[expect(clippy::useless_vec, reason = "...")] (or a single combined expect if
preferred), and provide a concise reason string explaining why the lint is
expected; MSRV 1.90 supports #[expect] so update the attribute on the same item
(the attribute currently suppressing clippy lints) accordingly.
In `@src/table/block_index/iter.rs`:
- Line 91: Replace the blanket test-module lint suppression attribute
`#[allow(clippy::unwrap_used, clippy::indexing_slicing, clippy::useless_vec)]`
with `#[expect(...)]` attributes and add concise justifications for each
expectation; specifically, change each lint to `#[expect(clippy::unwrap_used,
reason = "...")]`, `#[expect(clippy::indexing_slicing, reason = "...")]`, and
`#[expect(clippy::useless_vec, reason = "...")]` on the same test module (the
attribute you see at the top of the test module in iter.rs) and fill each reason
with a short explanation (e.g., tests intentionally use unwrap for brevity,
indexing is safe due to fixed test data, vecs improve readability) so the
suppressions are explicit and compliant with the MSRV Rust 1.90 policy.
In `@src/table/block/mod.rs`:
- Line 720: Replace the blanket clippy suppression attribute
#[allow(clippy::unwrap_used, clippy::indexing_slicing, clippy::useless_vec)]
with individual #[expect(..., reason = "...")] attributes for each lint (e.g.
#[expect(clippy::unwrap_used, reason = "explain why unwrap is safe here")],
#[expect(clippy::indexing_slicing, reason = "explain why indexing is safe
here")], #[expect(clippy::useless_vec, reason = "explain why vec here is
required")]) so each suppression documents the rationale and will be flagged
automatically when the lint is removed; apply these replaces to the same
attribute location (the current module-level attribute in mod.rs).
In `@src/table/meta.rs`:
- Line 281: Replace the module-level attribute #[allow(clippy::unwrap_used,
clippy::indexing_slicing, clippy::useless_vec)] used on the test module in
src/table/meta.rs with an #[expect(..., reason = "...")] form; specifically
change the attribute on the test module to use #[expect] listing
clippy::unwrap_used, clippy::indexing_slicing, and clippy::useless_vec and add
reason = "test code" so unused suppressions will be flagged by the linter.
Ensure you update the attribute that decorates the test module (the one
currently using #[allow(...)] immediately above the mod test declaration).
In `@src/table/mod.rs`:
- Around line 21-26: Replace the blanket #[allow(...)] on the test module with
individual #[expect(..., reason = "...")] attributes for each lint so
suppression debt is explicit; specifically, for the module attribute that
currently lists clippy::unwrap_used, clippy::indexing_slicing,
clippy::useless_vec, and clippy::needless_borrows_for_generic_args, add one
#[expect(..., reason = "...")] per lint (e.g., reference the cfg(test) module in
src/table/mod.rs and replace the single #[allow(...)] with four separate
#[expect] attributes, each supplying a concise reason such as "convenience in
test code").
In `@src/table/multi_writer.rs`:
- Line 500: The attribute #[allow(clippy::unwrap_used, clippy::indexing_slicing,
clippy::useless_vec)] should be converted to the newer #[expect(..., reason =
"...")] form (or annotated with a brief reason comment if the crate MSRV <
1.79); update the attribute on the same item in multi_writer.rs to use
#[expect(clippy::unwrap_used, reason = "explain why unwrap is safe here"),
expect(clippy::indexing_slicing, reason = "explain why indexing is safe"),
expect(clippy::useless_vec, reason = "explain why Vec construction is
necessary")] when compiling on MSRV >= 1.79, otherwise retain #[allow(...)] but
add a short inline comment // reason: ... describing why each allow is required;
locate and change the attribute shown in the diff to include these reasons.
In `@src/version/run.rs`:
- Line 365: Replace the broad test-module attribute #[allow(clippy::unwrap_used,
clippy::indexing_slicing, clippy::useless_vec)] with per-lint #[expect(...,
reason = "...")] attributes on the same test module in run.rs: remove the
#[allow(...)] and add one #[expect(clippy::unwrap_used, reason = "explain why
unwrap is acceptable in these tests")] plus one
#[expect(clippy::indexing_slicing, reason = "explain safe indexing in tests")]
and one #[expect(clippy::useless_vec, reason = "explain why vec! is intentional
in tests")] so each suppression is explicit and carries a reason; keep the
attribute(s) at the module-level where the original attribute was applied (the
test module surrounding the functions in run.rs).
In `@src/vlog/blob_file/reader.rs`:
- Line 246: Replace the #[allow(clippy::unwrap_used, clippy::indexing_slicing,
clippy::useless_vec)] attribute with #[expect(..., reason = "...")] variants so
each lint suppression becomes an expect with a short rationale; specifically
change the attribute on the blob reader block from allow to expect (e.g.,
#[expect(clippy::unwrap_used, reason = "explain why unwrap is safe here")],
#[expect(clippy::indexing_slicing, reason = "...")],
#[expect(clippy::useless_vec, reason = "...")]) so the code continues to
suppress these lints but will warn when the suppression is no longer needed.
In `@src/vlog/blob_file/scanner.rs`:
- Line 201: The attribute line currently uses #[allow(clippy::unwrap_used,
clippy::indexing_slicing, clippy::useless_vec)] which regresses from the
preferred #[expect] usage; replace this #[allow(...)] with
#[expect(clippy::unwrap_used, clippy::indexing_slicing, clippy::useless_vec,
reason = "brief justification describing why these lints are intentionally
suppressed here (e.g. safe unwrap after validation, performance-driven slicing,
vector needed for FFI)")] so each suppressed lint is tracked and will warn if
the suppression becomes unnecessary—keep the same lint names and provide a
concise reason string relevant to the scanner.rs code block where this attribute
sits.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: e740ae2c-d242-45aa-bd42-50bbe8949804
📒 Files selected for processing (31)
benches/block.rs, benches/level_manifest.rs, benches/partition_point.rs, benches/tli.rs, benches/tree.rs, src/compaction/leveled/mod.rs, src/compaction/leveled/test.rs, src/compaction/stream.rs, src/compression.rs, src/error.rs, src/file.rs, src/fs/std_fs.rs, src/memtable/arena.rs, src/mvcc_stream.rs, src/prefix.rs, src/table/block/mod.rs, src/table/block_index/iter.rs, src/table/filter/blocked_bloom/builder.rs, src/table/filter/blocked_bloom/mod.rs, src/table/filter/mod.rs, src/table/meta.rs, src/table/mod.rs, src/table/multi_writer.rs, src/table/tests.rs, src/tree/mod.rs, src/version/run.rs, src/vlog/blob_file/reader.rs, src/vlog/blob_file/scanner.rs, tests/level_routing.rs, tests/prop_mvcc.rs, tests/prop_range_tombstone.rs
Pull request overview
This PR improves reopen/recovery diagnostics for level-routing configurations, restores the blocked_bloom filter module, and updates tests/benches to align with the current public API and clippy expectations.
Changes:
- Add `Error::RouteMismatch { expected, found }` and return it during recovery when `level_routes` is configured and fewer tables are found than expected.
- Re-enable `table::filter::blocked_bloom` and adjust bloom/test/bench code to current APIs (iter guards, cache config, seqno params).
- Fix clippy issues in integration tests (replace never-looping `for` with `.next()`, adopt `is_none_or`, and add lint suppressions in test modules).
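The two clippy fixes mentioned in the integration tests can be sketched as follows. The oracle's map shape is an assumption for illustration; `is_none_or` is stable since Rust 1.82.

```rust
use std::collections::BTreeMap;

// Before: `for kv in map.range(..) { return ...; }` runs at most one
// iteration, which clippy flags as a never-looping `for`. Using `.next()`
// states the "first match" intent directly.
fn oracle_get<'a>(
    map: &'a BTreeMap<(Vec<u8>, u64), Vec<u8>>, // (key, seqno) -> value, assumed layout
    key: &[u8],
) -> Option<&'a Vec<u8>> {
    map.range((key.to_vec(), 0u64)..)
        .next()
        .filter(|((k, _), _)| k.as_slice() == key)
        .map(|(_, v)| v)
}

// `Option::is_none_or` replaces the `map_or(true, ...)` pattern.
fn visible(tombstone_seqno: Option<u64>, read_seqno: u64) -> bool {
    tombstone_seqno.is_none_or(|ts| ts > read_seqno)
}
```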
Reviewed changes
Copilot reviewed 31 out of 31 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/prop_range_tombstone.rs | Fix oracle get() to use .range(...).next() (removes never-looping loop). |
| tests/prop_mvcc.rs | Fix oracle get() to use .range(...).next() (removes never-looping loop). |
| tests/level_routing.rs | Add regression test asserting RouteMismatch on reopen with missing route coverage. |
| src/vlog/blob_file/scanner.rs | Switch test-module lint suppression to #[allow(...)]. |
| src/vlog/blob_file/reader.rs | Switch test-module lint suppression to #[allow(...)]. |
| src/version/run.rs | Switch test-module lint suppression to #[allow(...)]. |
| src/tree/mod.rs | Emit RouteMismatch when recovery finds fewer tables with level_routes configured. |
| src/table/tests.rs | Adjust InternalValue::from_components call site key argument. |
| src/table/multi_writer.rs | Add test-module lint suppression via #[allow(...)]. |
| src/table/mod.rs | Add test-module lint suppression via #[allow(...)]. |
| src/table/meta.rs | Add test-module lint suppression via #[allow(...)]. |
| src/table/filter/mod.rs | Re-enable blocked_bloom module export. |
| src/table/filter/blocked_bloom/mod.rs | Add clippy #[expect] annotations; remove unused helpers. |
| src/table/filter/blocked_bloom/builder.rs | Add clippy #[expect] annotations around float/int casts for sizing logic. |
| src/table/block_index/iter.rs | Add test-module lint suppression via #[allow(...)]. |
| src/table/block/mod.rs | Add test-module lint suppression via #[allow(...)]. |
| src/prefix.rs | Add test-module lint suppression via #[allow(...)]. |
| src/mvcc_stream.rs | Add test-module lint suppression via #[allow(...)]. |
| src/memtable/arena.rs | Add test-module lint suppression via #[allow(...)]. |
| src/fs/std_fs.rs | Add test-module lint suppression via #[allow(...)]. |
| src/file.rs | Add test-module lint suppression via #[allow(...)]. |
| src/error.rs | Add Error::RouteMismatch { expected, found } variant and docs. |
| src/compression.rs | Add test-module lint suppression via #[allow(...)]. |
| src/compaction/stream.rs | Add test-module lint suppression via #[allow(...)]. |
| src/compaction/leveled/test.rs | Use is_none_or instead of map_or. |
| src/compaction/leveled/mod.rs | Add test-module lint suppression via #[allow(...)]. |
| benches/tree.rs | Update benchmarks to new public API (Config ctor, cache, seqno, guard iteration), and stub removed snapshot/blob benches. |
| benches/tli.rs | Stub benchmark that depended on removed internal index types. |
| benches/partition_point.rs | Stub benchmark that depended on removed binary_search public API. |
| benches/level_manifest.rs | Stub benchmarks that accessed internal tree.levels. |
| benches/block.rs | Stub benchmarks that depended on removed/restructured internal block types. |
Comments suppressed due to low confidence (1)
src/table/filter/blocked_bloom/mod.rs:48
- BlockedBloomFilterReader::new() parses on-disk bytes but still uses assert_eq! for the filter type and hash type (and relies on an expect() for slicing). If a table’s filter block is corrupted or from a different format, these asserts will panic and can crash the process; this should return a structured Error (e.g., InvalidHeader/InvalidTag/Unrecoverable) instead of panicking. Also consider validating the u64→usize conversions (use try_into / bounds checks) so a corrupted header can’t truncate/overflow and lead to incorrect indexing.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/tree/mod.rs`:
- Around line 1628-1659: The current logic in the branch that returns
crate::Error::RouteMismatch uses only config.level_routes to decide if missing
tables are due to a routing change, which misclassifies cases where the
route-to-path mapping changed; fix by persisting route provenance (either add
previous route→path mapping to the manifest or per-table metadata) and update
the check that computes all_missing_uncovered to compare missing tables'
persisted route/path identity against current config.level_routes (use
table_map's stored provenance if present) before returning Error::RouteMismatch;
if you cannot persist provenance in this change, revert to returning
Unrecoverable instead of RouteMismatch to avoid misclassification (update
references: config.level_routes, table_map, and crate::Error::RouteMismatch).
In `@tests/level_routing.rs`:
- Around line 714-720: The test removes one file from the directory but never
asserts that a deletion actually happened; modify the loop that iterates over
std::fs::read_dir(&tables_dir) so it tracks whether a file was removed (e.g., a
boolean `deleted` flag set to true inside the if branch where
std::fs::remove_file(entry.path()) is called) and after breaking from the loop
add an assertion (e.g., assert!(deleted, "expected to delete at least one table
file")) to fail the test if no file was removed; apply the same change to the
other identical block referenced around lines 756-762.
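The requested fix boils down to making the deletion observable. A self-contained sketch (the helper name and directory layout are illustrative, not the test's actual code):

```rust
use std::fs;
use std::path::Path;

// Remove exactly one regular file from the tables directory and report
// whether anything was actually deleted, so the caller can assert on it.
fn delete_one_table_file(tables_dir: &Path) -> std::io::Result<bool> {
    let mut deleted = false;
    for entry in fs::read_dir(tables_dir)? {
        let entry = entry?;
        if entry.path().is_file() {
            fs::remove_file(entry.path())?;
            deleted = true;
            break; // remove exactly one file
        }
    }
    Ok(deleted)
}
```

In the test this would be followed by `assert!(deleted, "expected to delete at least one table file")`, so an empty directory fails the setup loudly instead of silently passing.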
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 0b6428ba-1391-4ca0-bfff-9dbeedca4e32
📒 Files selected for processing (26)
Cargo.toml, benches/block.rs, benches/level_manifest.rs, benches/partition_point.rs, benches/tli.rs, benches/tree.rs, src/compaction/leveled/mod.rs, src/compaction/stream.rs, src/compression.rs, src/config/mod.rs, src/error.rs, src/file.rs, src/fs/std_fs.rs, src/memtable/arena.rs, src/mvcc_stream.rs, src/prefix.rs, src/table/block/mod.rs, src/table/block_index/iter.rs, src/table/meta.rs, src/table/mod.rs, src/table/multi_writer.rs, src/tree/mod.rs, src/version/run.rs, src/vlog/blob_file/reader.rs, src/vlog/blob_file/scanner.rs, tests/level_routing.rs
💤 Files with no reviewable changes (5)
- benches/block.rs
- benches/tli.rs
- benches/partition_point.rs
- benches/level_manifest.rs
- Cargo.toml
Pull request overview
Copilot reviewed 33 out of 33 changed files in this pull request and generated 1 comment.
Comments suppressed due to low confidence (1)
src/table/filter/blocked_bloom/mod.rs:48
`BlockedBloomFilterReader::new` parses potentially untrusted on-disk bytes but currently relies on `assert_eq!`/`expect()` and unchecked `as usize` casts. Corrupted or malicious filter blocks could trigger panics (e.g., invalid filter/hash type, out-of-bounds slice, `num_blocks == 0` leading to division-by-zero later) or truncation. Instead of suppressing clippy here, validate fields (`num_blocks > 0`, `k > 0`, bounds) and return a structured `Error` on invalid input.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/table/filter/blocked_bloom/mod.rs`:
- Around line 237-245: Add a unit test that constructs a complete
BlockedBloomFilter header (including MAGIC_BYTES, filter type byte, hash_type,
num_blocks and k as little-endian u64s) but omits the data section so that
BlockedBloomFilterReader::new advances past reader.read_u8() and then hits the
slice.get(offset..) → ok_or(...) validation; call BlockedBloomFilterReader::new
with that byte vector and assert the expected error result (or assert is_err()
if you want to require failure) to ensure the truncated data path is exercised.
- Around line 77-84: The constructor builds a BitArrayReader from raw data but
never validates that the claimed num_blocks fits in the provided buffer,
allowing later contains_hash to compute block/bit offsets that panic inside
BitArrayReader::get; fix by checking after slicing that data.len() >=
(num_blocks as usize).saturating_mul(CACHE_LINE_BYTES) and return
Err(crate::Error::InvalidHeader("BloomFilter: truncated data")) if not, before
calling BitArrayReader::new (use the existing symbols k, num_blocks,
BitArrayReader::new, contains_hash, and CACHE_LINE_BYTES to locate the code).
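The validation pattern both reviewers ask for (errors instead of asserts, plus a buffer-size check against the claimed block count) can be sketched standalone. The field layout, constants, and error type below are assumptions for illustration; the crate's real header format and `Error` variants differ.

```rust
// Hypothetical filter-header parser: reject bad input instead of panicking.
#[derive(Debug, PartialEq)]
enum FilterError {
    InvalidHeader(&'static str),
}

const FILTER_TYPE_BLOCKED_BLOOM: u8 = 1; // assumed tag byte
const CACHE_LINE_BYTES: usize = 64;

// Assumed layout: type byte | hash byte | num_blocks (u64 LE) | k (u64 LE) | data
fn parse_header(bytes: &[u8]) -> Result<(u64, u64, &[u8]), FilterError> {
    let header_len = 1 + 1 + 8 + 8;
    let header = bytes
        .get(..header_len)
        .ok_or(FilterError::InvalidHeader("truncated header"))?;
    if header[0] != FILTER_TYPE_BLOCKED_BLOOM {
        return Err(FilterError::InvalidHeader("unexpected filter type"));
    }
    // These unwraps are safe: the sub-slices are exactly 8 bytes.
    let num_blocks = u64::from_le_bytes(header[2..10].try_into().unwrap());
    let k = u64::from_le_bytes(header[10..18].try_into().unwrap());
    if num_blocks == 0 || k == 0 {
        return Err(FilterError::InvalidHeader("zero-sized filter"));
    }
    let data = bytes
        .get(header_len..)
        .ok_or(FilterError::InvalidHeader("truncated data"))?;
    // The claimed block count must fit in the remaining buffer.
    let needed = (num_blocks as usize)
        .checked_mul(CACHE_LINE_BYTES)
        .ok_or(FilterError::InvalidHeader("size overflow"))?;
    if data.len() < needed {
        return Err(FilterError::InvalidHeader("truncated data"));
    }
    Ok((num_blocks, k, data))
}
```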
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: da5ae764-f7d4-4928-8588-7fca6f2e483f
📒 Files selected for processing (1)
src/table/filter/blocked_bloom/mod.rs
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/level_routing.rs`:
- Around line 796-851: The test currently only creates hot-tier tables, so
removing the warm route doesn't produce a mixed covered+uncovered missing-table
scenario; update mixed_missing_covered_and_uncovered_returns_unrecoverable to
explicitly create at least one warm-level table before deleting a hot table:
after the initial three_tier_config open/flush steps (where tree.insert and
tree.flush_active_memtable are used for hot), open the same three_tier_config
and perform an insert + flush that targets the warm tier (use tree.insert(...)
then tree.flush_active_memtable(...) for the warm-level flush or otherwise use
the API the test uses to force a flush into the warm level) so there is an
actual warm table on disk, then proceed to delete a file from hot_tables_dir and
remove the warm LevelRoute when reopening so the test exercises the mixed
covered+uncovered case and expects lsm_tree::Error::Unrecoverable.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 31c15da3-e139-4b3e-83ce-38fb1870d908
📒 Files selected for processing (1)
tests/level_routing.rs
- Add Error::RouteMismatch variant returned when recovery finds fewer tables than expected AND level_routes is configured, distinguishing misconfiguration from actual data corruption (Unrecoverable)
- Re-enable blocked_bloom filter module (was disabled via ._rs rename) with proper clippy expect annotations
- Fix never-looping for loops in prop_mvcc and prop_range_tombstone oracles (replace with iterator .next())
- Update benchmarks for current API: Config::new(3 args), Cache (was BlockCache), use_cache (was block_cache), SeqNo params, IterGuardImpl iteration pattern
- Stub out benchmarks for removed internal types (ValueBlock, TopLevelIndex, binary_search)
- Add #[allow] for clippy lints in #[cfg(test)] modules
- Fix map_or → is_none_or and &[key] → [key] in test code

Closes #164
Adds missing_table_without_routes_returns_unrecoverable test to cover the Unrecoverable branch (no level_routes configured but table file deleted), bringing patch coverage to 100%.
…tion
- RouteMismatch is now returned ONLY when all missing tables are on levels not covered by any current route (strongly suggests removed routes). If any missing table is on a level covered by a current route, return Unrecoverable (the SST was genuinely deleted/corrupted)
- Soften RouteMismatch doc: "often indicates" instead of "this is"
- Update level_routes reopen contract doc to mention RouteMismatch
- Add regression test: deleted_sst_with_routes_configured_returns_unrecoverable
- Convert #[allow] → #[expect] for clippy lint suppressions in 14 test modules
- Remove 4 dead placeholder bench targets (block, level_manifest, partition_point, tli) and their Cargo.toml entries
- Unify Cache wrapping to Arc::new() in bench/tree.rs
- Doc now accurately describes level-based detection logic
- Convert remaining #[allow] → #[expect] in leveled/mod.rs and table/mod.rs
assert_eq! on filter_type and hash_type would panic on corrupted on-disk data instead of returning a structured error. Replace with conditional checks returning Error::InvalidHeader. Also replace expect() on slice bounds with ok_or(). Adds regression tests: corrupted_filter_type_returns_error, corrupted_hash_type_returns_error, truncated_filter_returns_error.
blocked_bloom was never integrated into the Segment loader pipeline (upstream fjall-rs#78 is still open). Remove the module source, re-disable the module declaration, and clean up bench references. FilterType::BlockedBloom enum variant is preserved for on-disk format compatibility (tag byte 1).
Add a `deleted` flag + assertion to both the `missing_table_without_routes` and `deleted_sst_with_routes_configured` tests so they fail explicitly if the test setup didn't actually remove a file.
Add a code comment noting that level-based `RouteMismatch` detection is best-effort and cannot detect same-level path changes. Persisting route provenance per-table would enable exact detection but requires a manifest format change.
Exercises the path where some missing tables are on covered levels (corruption) and some on uncovered levels (route mismatch) — must return `Unrecoverable` since at least one SST was genuinely lost.
Compact one table to L3 (warm tier) and leave another at L0 (hot tier) so the test genuinely exercises mixed covered+uncovered missing tables when reopening with a partial route config.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/config/mod.rs`:
- Around line 592-594: Update the doc comment in config::mod (the triple-slash
comment above the relevant item) to soften the guarantee: change the
unconditional phrasing that “omitting an old folder will fail with
RouteMismatch” to indicate it “may fail with RouteMismatch when all missing
tables are on levels not covered by config.level_routes,” and mention that
recover_levels in src/tree/mod.rs uses a best-effort level-coverage heuristic
(it does not track per-table route provenance) so mixed cases can still surface
as Unrecoverable; reference recover_levels and config.level_routes in the text
to make the behavior explicit.
📒 Files selected for processing (34)
`Cargo.toml`, `benches/block.rs`, `benches/bloom.rs`, `benches/level_manifest.rs`, `benches/partition_point.rs`, `benches/tli.rs`, `benches/tree.rs`, `src/compaction/leveled/mod.rs`, `src/compaction/leveled/test.rs`, `src/compaction/stream.rs`, `src/compression.rs`, `src/config/mod.rs`, `src/error.rs`, `src/file.rs`, `src/fs/std_fs.rs`, `src/memtable/arena.rs`, `src/mvcc_stream.rs`, `src/prefix.rs`, `src/table/block/mod.rs`, `src/table/block_index/iter.rs`, `src/table/filter/blocked_bloom/builder._rs`, `src/table/filter/blocked_bloom/mod._rs`, `src/table/filter/mod.rs`, `src/table/meta.rs`, `src/table/mod.rs`, `src/table/multi_writer.rs`, `src/table/tests.rs`, `src/tree/mod.rs`, `src/version/run.rs`, `src/vlog/blob_file/reader.rs`, `src/vlog/blob_file/scanner.rs`, `tests/level_routing.rs`, `tests/prop_mvcc.rs`, `tests/prop_range_tombstone.rs`
💤 Files with no reviewable changes (9)
- benches/bloom.rs
- src/table/filter/mod.rs
- benches/tli.rs
- benches/level_manifest.rs
- benches/partition_point.rs
- Cargo.toml
- src/table/filter/blocked_bloom/mod._rs
- src/table/filter/blocked_bloom/builder._rs
- benches/block.rs
- Fix `level_route` → `level_routes` typo in the `RouteMismatch` rustdoc
- Soften the `level_routes` reopen contract: "may fail" with `RouteMismatch` or `Unrecoverable` depending on which levels are affected
- Add a comment explaining the `Guard` trait import for `into_inner()`
- Hoist `log::error!` args into `let` bindings for coverage visibility
`TempDir` was moved into `Config::new`, causing an immediate drop and directory deletion. Pass it by reference so the `TempDir` lives for the duration of the benchmark closure.
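The lifetime bug is easy to reproduce without `tempfile` or Criterion; `DirGuard` below is a hypothetical stand-in whose `Drop` impl plays the role of the temp-directory deletion:

```rust
use std::cell::Cell;

// The TempDir lifetime bug in miniature. Moving the guard into a
// constructor drops it as soon as the call returns; borrowing keeps it
// alive for the caller's whole scope.
struct DirGuard<'a> {
    deleted: &'a Cell<bool>,
}

impl Drop for DirGuard<'_> {
    fn drop(&mut self) {
        // Stands in for "temp directory removed from disk".
        self.deleted.set(true);
    }
}

// Takes ownership, like the buggy constructor call: guard drops here.
fn config_by_value(_dir: DirGuard<'_>) {}

// Borrows, like the fix: the guard outlives the benchmark closure.
fn config_by_ref(_dir: &DirGuard<'_>) {}

fn main() {
    let deleted = Cell::new(false);
    let guard = DirGuard { deleted: &deleted };

    config_by_ref(&guard);
    assert!(!deleted.get()); // directory still present mid-benchmark

    config_by_value(guard);
    assert!(deleted.get()); // dropped as soon as ownership moved
}
```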
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@benches/tree.rs`:
- Around line 100-108: The iterator currently swallows iteration errors by using
guard.into_inner().ok(), which masks read failures; change both scan pipelines
that call tree.iter(MAX, None) (and the other similar pipeline at the later
block) to unwrap the result from guard.into_inner() (e.g., replace .ok() with
.unwrap() or otherwise propagate/panic) so the benchmark fails fast on iterator
read errors; locate uses of guard.into_inner().ok() in the scan closures and
replace them with a panic-on-error approach (unwrap) to ensure broken entries
are not silently skipped.
📒 Files selected for processing (1)
benches/tree.rs
`filter_map(ok())` swallowed scan errors, making benchmarks look correct even with broken entries. Use `map(unwrap())` to fail fast.
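In miniature, with plain `Result` values standing in for the bench's iterator guards:

```rust
use std::panic;

// Before: errors vanish, the "scan" just quietly gets shorter.
fn count_silently(entries: Vec<Result<u64, &'static str>>) -> usize {
    entries.into_iter().filter_map(Result::ok).count()
}

// After: the first broken entry panics and fails the bench immediately.
fn count_fail_fast(entries: Vec<Result<u64, &'static str>>) -> usize {
    entries.into_iter().map(Result::unwrap).count()
}

fn main() {
    let broken = vec![Ok(1), Err("io error"), Ok(3)];

    // One entry silently dropped: the bench would report a plausible count.
    assert_eq!(count_silently(broken.clone()), 2);

    // unwrap surfaces the broken entry instead of masking it.
    panic::set_hook(Box::new(|_| {})); // keep the expected panic quiet
    assert!(panic::catch_unwind(|| count_fail_fast(broken)).is_err());
}
```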
## 🤖 New release

* `coordinode-lsm-tree`: 4.1.0 -> 4.2.0

<details><summary><i><b>Changelog</b></i></summary><p>

<blockquote>

## [4.2.0](v4.1.0...v4.2.0) - 2026-03-26

### Added

- comparator-aware range tombstones ([#180](#180))
- *(compression)* CompressionProvider trait + pure Rust zstd backend ([#176](#176))
- *(error)* RouteMismatch error, blocked_bloom cleanup, bench/clippy fixes ([#166](#166))
- *(config)* per-level Fs routing for tiered storage ([#163](#163))

### Performance

- *(bench)* consolidate benchmarks + nextest + flamegraph pipeline ([#175](#175))

### Testing

- *(table)* add zstd dict helper coverage ([#181](#181))

</blockquote>

</p></details>

---

This PR was generated with [release-plz](https://github.com/release-plz/release-plz/).

Co-authored-by: sw-release-bot[bot] <255865126+sw-release-bot[bot]@users.noreply.github.com>
Summary
- Add `Error::RouteMismatch { expected, found }` with level-based detection — only returned when ALL missing tables are on levels not covered by any current route (prevents masking genuine SST corruption)
- Remove the `blocked_bloom` module entirely (upstream Implement Blocked Bloom Filters fjall-rs/lsm-tree#78 still open, never integrated into the Segment loader); preserve the `FilterType::BlockedBloom` enum variant for on-disk format compatibility
- Fix never-looping `for` loops in the `prop_mvcc` and `prop_range_tombstone` oracle `get()` methods
- Update benchmarks for the current API (`Config` 3-arg constructor, `Cache`, `use_cache`, `SeqNo` params, `IterGuardImpl`/`Guard` pattern); remove 4 dead bench targets; fix TempDir lifetime
- Convert `#[allow]` → `#[expect]` with reason strings in 14 test modules
- Fix `map_or` → `is_none_or` and needless borrow warnings in test code
- Update the `level_routes` reopen contract doc to mention `RouteMismatch`

Test plan
- `cargo test --test level_routing` — 24 passed (4 new: route mismatch, unrecoverable without routes, unrecoverable with routes, mixed covered+uncovered)
- `cargo test --test prop_mvcc` — 1 passed
- `cargo test --test prop_range_tombstone` — 1 passed
- `cargo clippy --all-targets --all-features` — 0 errors

Closes #164