fix(test): use shared seqno counter in proptest oracle#97
Conversation
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review.
📝 Walkthrough: Refactors test internals: replaces manual iteration in the oracle's `get()` with `take_while` + `next`, and draws test seqnos from the shared `SequenceNumberCounter`.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks: ✅ 5 passed.
Pull request overview
This PR moves prefix/key bloom filter checks for multi-table runs from eager upfront filtering in TreeIter::create_range to lazy, per-table evaluation inside RunReader, improving first-result latency for partial scans over large runs.
Changes:
- Add `BloomHints` and a `with_bloom_hints()` builder on `RunReader` to pass prefix/key bloom hashes (and optional metrics) into the run iterator.
- Implement lazy intermediate-table bloom skipping in `RunReader::next()`/`next_back()` via a new `bloom_skip()` helper.
- Update `TreeIter::create_range` to stop pre-filtering multi-table runs and instead attach bloom hints to each `RunReader`.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/run_reader.rs | Introduces BloomHints, attaches hints to RunReader, and performs lazy per-table bloom skipping when advancing across intermediate tables. |
| src/range.rs | Removes eager multi-table run bloom pre-filtering and passes bloom hints into RunReader instances. |
| src/metrics.rs | Updates documentation for prefix_bloom_skips to reflect the new counting location for multi-table runs. |
Codecov Report: ✅ All modified and coverable lines are covered by tests.
Force-pushed: ba66c88 → d07d367
@coderabbitai re-review

✅ Actions performed: full review triggered.
Force-pushed: d07d367 → c8272e8
⚠️ Performance Alert ⚠️
Possible performance regression was detected for benchmark 'lsm-tree db_bench'.
Benchmark result of this commit is worse than the previous benchmark result, exceeding the 1.15 threshold.

| Benchmark suite | Current: 42e6ebc | Previous: ff82717 | Ratio |
|---|---|---|---|
| fillrandom | 1066814.4048151223 ops/sec | 1277896.8965820211 ops/sec | 1.20 |
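For a bigger-is-better metric like ops/sec, the ratio above is previous throughput divided by current throughput, and the alert fires when it exceeds the configured threshold. A minimal sketch of that check, using the table's values (the helper names are ours, not github-action-benchmark's):

```rust
// Sketch of the regression check behind the alert: for a throughput metric
// (ops/sec, bigger is better), ratio = previous / current, and the alert
// fires when the ratio exceeds the configured threshold (1.15 here).
fn regression_ratio(previous_ops: f64, current_ops: f64) -> f64 {
    previous_ops / current_ops
}

fn is_regression(previous_ops: f64, current_ops: f64, threshold: f64) -> bool {
    regression_ratio(previous_ops, current_ops) > threshold
}

fn main() {
    let previous = 1_277_896.896_582_021_1; // ff82717, fillrandom
    let current = 1_066_814.404_815_122_3; // 42e6ebc, fillrandom
    let ratio = regression_ratio(previous, current);
    println!("ratio = {ratio:.2}"); // rounds to the 1.20 shown in the table
    assert!(is_regression(previous, current, 1.15));
}
```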
This comment was automatically generated by workflow using github-action-benchmark.
CC: @polaz
🧹 Nitpick comments (2)
src/run_reader.rs (2)
126-141: Consider pruning the culled boundary tables too.
`bloom_skip()` only runs in the intermediate-table loops. When `RunReader::new()` collapses a multi-table run down to one or two overlapping tables, neither boundary is checked, so narrow scans still miss the bloom-pruning fast path entirely. A small precheck when attaching hints, or before constructing `RunReader`, would recover that optimization for `hi - lo <= 1`.

Also applies to: 158-173, 198-213
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/run_reader.rs` around lines 126 - 141, The bloom pruning misses single- or two-table runs because bloom_skip is only used for intermediate tables; update the path where hints are attached or where RunReader::new collapses runs (referencing with_bloom_hints, RunReader::new, bloom, bloom_skip, and BloomHints::should_skip) to perform a quick precheck of the boundary tables when hi - lo <= 1 and prune any boundary table that BloomHints::should_skip reports as absent; i.e., before finalizing the run collapse, call bloom.should_skip(&table) for the lo and hi boundary Table(s) and adjust/remove them so narrow scans benefit from the same bloom-pruning fast path.
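The precheck this comment suggests can be sketched outside the crate with stand-in types (`Table`, `BloomHints`, and `prune_narrow_run` below are simplified illustrations, not the crate's actual API):

```rust
// Illustrative sketch of the suggested fix: when a run has collapsed to one
// or two tables, apply the same bloom check the intermediate loop uses, so
// bloom-rejected boundary tables are pruned before a reader is built.
#[derive(Debug, PartialEq)]
struct Table {
    id: u32,
    bloom_may_contain: bool, // pretend result of the table's bloom filter probe
}

struct BloomHints;

impl BloomHints {
    // Stand-in for BloomHints::should_skip: true means the bloom filter
    // definitively rules the probed key/prefix out of this table.
    fn should_skip(&self, table: &Table) -> bool {
        !table.bloom_may_contain
    }
}

// Prune bloom-rejected tables from a narrow (<= 2 table) run; wider runs are
// left alone because next()/next_back() already skip intermediates lazily.
fn prune_narrow_run<'a>(hints: &BloomHints, run: &'a [Table]) -> Vec<&'a Table> {
    if run.len() <= 2 {
        run.iter().filter(|t| !hints.should_skip(t)).collect()
    } else {
        run.iter().collect()
    }
}

fn main() {
    let run = [
        Table { id: 0, bloom_may_contain: false },
        Table { id: 1, bloom_may_contain: true },
    ];
    let kept = prune_narrow_run(&BloomHints, &run);
    assert_eq!(kept.len(), 1);
    assert_eq!(kept[0].id, 1); // only the non-rejected boundary survives
}
```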
451-477: Make the bloom-miss test input deterministic.

These assertions depend on the current bloom configuration rejecting one hard-coded absent hash. Since the filter can legitimately false-positive, the test is coupled to today's hash/bits-per-key choices. Consider probing candidate hashes during setup until the intermediate tables return `Ok(false)`, then using that value for the skip assertions.

Also applies to: 486-504
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/run_reader.rs` around lines 451 - 477, The test currently uses a single hard-coded hash from standard_bloom::Builder::get_hash(b"zzz_not_in_any_table") which can false-positive; instead probe candidate hashes during setup by generating hashes (e.g., with Builder::get_hash on different byte inputs) and checking against the intermediate tables' bloom filters until you find one that definitively returns Ok(false) for those tables, then use that hash in BloomHints passed to RunReader::with_bloom_hints so the intermediate tables are reliably bloom-skipped and the assertions remain deterministic.
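The suggested setup probe can be sketched with a toy bloom filter (`TinyBloom` and its FNV-style hash below are stand-ins for the crate's `standard_bloom`; only the probe-until-definite-miss pattern is the point):

```rust
// Toy 64-bit bloom filter, standing in for the crate's standard_bloom.
struct TinyBloom {
    bits: u64,
}

impl TinyBloom {
    fn new(keys: &[&str]) -> Self {
        let mut bits = 0u64;
        for key in keys {
            for pos in Self::positions(Self::hash(key)) {
                bits |= 1u64 << pos;
            }
        }
        Self { bits }
    }

    // FNV-1a, just to get deterministic hashes for the sketch.
    fn hash(key: &str) -> u64 {
        key.bytes().fold(0xcbf2_9ce4_8422_2325_u64, |h, b| {
            (h ^ u64::from(b)).wrapping_mul(0x0000_0100_0000_01b3)
        })
    }

    fn positions(hash: u64) -> [u64; 2] {
        [hash % 64, (hash >> 6) % 64]
    }

    fn may_contain_hash(&self, hash: u64) -> bool {
        Self::positions(hash).iter().all(|&p| self.bits & (1u64 << p) != 0)
    }
}

// The suggested probe: iterate candidate inputs until every intermediate
// table's filter reports a definite miss, then use that hash in the test.
fn find_definite_miss(filters: &[TinyBloom]) -> u64 {
    (0u32..)
        .map(|i| TinyBloom::hash(&format!("probe_{i}")))
        .find(|&h| filters.iter().all(|f| !f.may_contain_hash(h)))
        .expect("some candidate must miss a sparsely populated filter")
}

fn main() {
    let filters = [TinyBloom::new(&["a", "b"]), TinyBloom::new(&["c"])];
    let miss = find_definite_miss(&filters);
    assert!(filters.iter().all(|f| !f.may_contain_hash(miss)));
    // Present keys still pass their own filter (no false negatives).
    assert!(filters[0].may_contain_hash(TinyBloom::hash("a")));
}
```

With the probe in the test setup, the skip assertions no longer depend on today's hash or bits-per-key choices.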
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 1aa8bb34-e76b-44c3-9467-5cacfb17c5de
📒 Files selected for processing (3)
`src/metrics.rs`, `src/range.rs`, `src/run_reader.rs`
Force-pushed: 60ce143 → fcfb7de
@coderabbitai re-review
Force-pushed: 53616b0 → 0468c35
Pass `prefix_hash` and `key_hash` into `RunReader` via `BloomHints` so it can skip tables lazily during iteration instead of upfront pre-filtering.

- Add a `BloomHints` struct with a `should_skip()` method shared by `RunReader` and the single-table path in range.rs
- Boundary tables are bloom-checked during lazy init (`ensure_lo/hi`); rejected boundaries skip reader creation entirely
- Intermediate tables are bloom-checked lazily in `next()`/`next_back()` when the iterator advances to them
- Remove the upfront multi-table filtering loop from range.rs `create_range`; bloom hints are passed to `RunReader` instead
- 9 unit tests covering forward/reverse/partial/key_hash/3-table paths

Closes #58
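A hedged sketch of the `BloomHints` shape this commit describes (field and method names follow the commit text; the table and its bloom filter are simplified stand-ins for the crate's types):

```rust
// Simplified model: a table's bloom filter is faked as a set of hashes it
// may contain; the real crate probes an actual filter.
struct Table {
    bloom_hashes: Vec<u64>,
}

impl Table {
    fn bloom_may_contain(&self, hash: u64) -> bool {
        self.bloom_hashes.contains(&hash)
    }
}

// BloomHints as described above: optional prefix and key hashes that travel
// with the run iterator instead of being consumed by upfront pre-filtering.
#[derive(Default)]
struct BloomHints {
    prefix_hash: Option<u64>,
    key_hash: Option<u64>,
}

impl BloomHints {
    // Skip a table only when a provided hash is definitively absent; with no
    // hints at all, never skip (bloom filters have no false negatives).
    fn should_skip(&self, table: &Table) -> bool {
        let misses = |hash: Option<u64>| hash.is_some_and(|h| !table.bloom_may_contain(h));
        misses(self.prefix_hash) || misses(self.key_hash)
    }
}

fn main() {
    let table = Table { bloom_hashes: vec![10, 20] };
    assert!(!BloomHints::default().should_skip(&table)); // no hints: keep
    assert!(!BloomHints { prefix_hash: Some(10), key_hash: None }.should_skip(&table));
    assert!(BloomHints { prefix_hash: Some(99), key_hash: None }.should_skip(&table));
}
```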
- Restructure `next()`/`next_back()` into 3-phase loops: drain current reader → advance through bloom-skipped intermediates → fall through to the opposite boundary. Fixes skipped intermediates when a boundary table is bloom-rejected.
- Advance the lo/hi index in `ensure_lo/hi_initialized` when a boundary is rejected, to avoid a redundant re-check and double metric counts
- Remove the dead `hi == 0` guard (`while lo < hi` guarantees `hi >= 1`)
- Add a metrics test verifying the `prefix_bloom_skips` counter
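The forward half of the 3-phase step can be modeled on plain vectors (a deliberately forward-only simplification; the real reader is double-ended and lazily initialized):

```rust
// Forward-only model of the restructured next(): (1) drain the current
// table's iterator, (2) advance past bloom-skipped tables without opening
// them, (3) otherwise fall through to the next table until the run is done.
fn next_forward(
    run: &mut [std::vec::IntoIter<i32>],
    bloom_skipped: &[bool],
    cursor: &mut usize,
) -> Option<i32> {
    loop {
        let i = *cursor;
        if i >= run.len() {
            return None; // run exhausted
        }
        if bloom_skipped[i] {
            *cursor += 1; // phase 2: skip bloom-rejected table
            continue;
        }
        match run[i].next() {
            Some(item) => return Some(item), // phase 1: drain current reader
            None => *cursor += 1,            // phase 3: move on to the next table
        }
    }
}

fn main() {
    let mut run: Vec<_> = vec![vec![1, 2], vec![3], vec![4]]
        .into_iter()
        .map(Vec::into_iter)
        .collect();
    let skipped = [false, true, false]; // middle table rejected by bloom
    let mut cursor = 0;
    let mut out = Vec::new();
    while let Some(v) = next_forward(&mut run, &skipped, &mut cursor) {
        out.push(v);
    }
    assert_eq!(out, [1, 2, 4]); // 3 is never read: its table was skipped
}
```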
Pre-existing bug: proptest found a sequence of insert-flush-compact operations where point read returns an older value instead of the latest insert. Reproduces on main without bloom changes. Seed: cc 90710f9669603bd6cb741490355db41b936dd1088e87c45b621c6ec6eeef1ad2
Narrowed down: the bug requires an unflushed memtable entry present during major_compact. The exact_replay test reproduces the minimal proptest failing input verbatim.
Bug requires: initial no-op compact on empty tree + insert with unflushed memtable during triple compact + second compact cycle + post-compact flush creating L0 table + final memtable inserts.
The proptest and regression test used an independent seqno counter (`let mut seqno = 1`) that did not advance on flush/compact, violating the API contract requiring data seqnos from the shared `SequenceNumberCounter` passed to `Config::new`. With independent counters, internal SuperVersion seqnos advance faster than data seqnos, causing `get_version_for_snapshot` to return a stale SuperVersion whose memtable misses recent inserts.

Root cause: `get_version_for_snapshot(S)` finds the latest SV with `seqno < S`. When the internal counter (advanced by flush/compact) outpaces user data seqnos, the returned SV references an old memtable that was rotated away, missing data inserted into the current memtable.

Fix: use `seqno_counter.next()` from the shared counter for all data operations, keeping SV seqnos and data seqnos properly interleaved.
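The contrast can be sketched with a minimal stand-in for `SequenceNumberCounter` (an `AtomicU64` exposing the `next()`/`get()` pair the commit refers to; the flush/compact bumps are simulated):

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Minimal stand-in for the crate's SequenceNumberCounter.
struct SequenceNumberCounter(AtomicU64);

impl SequenceNumberCounter {
    fn new() -> Self {
        Self(AtomicU64::new(0))
    }
    fn next(&self) -> u64 {
        self.0.fetch_add(1, Ordering::SeqCst)
    }
    fn get(&self) -> u64 {
        self.0.load(Ordering::SeqCst)
    }
}

fn main() {
    let shared = SequenceNumberCounter::new();

    // Buggy pattern (before the fix): the test's own counter does not
    // advance when flush/compact bump the shared one.
    let independent_seqno = 1u64; // stood for `let mut seqno = 1` in the old test
    shared.next(); // simulated flush advancing internal SuperVersion seqnos
    shared.next(); // simulated compact advancing them again
    assert!(independent_seqno < shared.get()); // data seqnos lag: stale-SV lookups

    // Fixed pattern: every data operation draws from the shared counter,
    // keeping SuperVersion seqnos and data seqnos interleaved.
    let insert_seqno = shared.next();
    assert_eq!(insert_seqno + 1, shared.get());
}
```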
Clippy flagged the for-loop-that-always-breaks pattern as `never_loop`. Rewrite to `take_while` + `next` for the same "first matching element" semantics.
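A sketch of that rewrite (the data shape is illustrative, not the oracle's actual types): a loop whose body always breaks only ever inspects the first element, which `take_while(..).next()` expresses without tripping `never_loop`.

```rust
// Equivalent "first matching element" forms; clippy::never_loop flags the
// first because the loop body can never reach a second iteration.
fn first_below_loop(values: &[i32], limit: i32) -> Option<i32> {
    #[allow(clippy::never_loop)]
    for v in values {
        if *v < limit {
            return Some(*v);
        }
        break;
    }
    None
}

fn first_below(values: &[i32], limit: i32) -> Option<i32> {
    values.iter().take_while(|&&v| v < limit).next().copied()
}

fn main() {
    assert_eq!(first_below_loop(&[1, 5, 9], 4), first_below(&[1, 5, 9], 4));
    assert_eq!(first_below_loop(&[5, 1], 4), first_below(&[5, 1], 4));
    assert_eq!(first_below(&[1, 5, 9], 4), Some(1));
    assert_eq!(first_below(&[5, 1], 4), None); // only the first element is considered
}
```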
Use `visible_seqno.get()` instead of `seqno_counter.get()` for the read seqno. The `visible_seqno` tracks the visibility watermark and won't drift ahead of what the tree considers readable.
Force-pushed: 9bd04ba → 2beca07
Rename `with_post_compact_flush` → `point_read_after_compact_flush_returns_latest_value` to follow the `<what>_<condition>_<expected>` pattern.
No-op flush (empty memtable) and DoNothing compaction skip upgrade_version entirely, so the comment should say "may" not "also".
Summary
- Use the shared `SequenceNumberCounter` per API contract (was using an independent counter)
- Fix the `never_loop` lint in the oracle's `get()` method

Technical Details
The proptest used an independent seqno counter (`let mut seqno = 1`) that did not advance on flush/compact, violating the API contract requiring data seqnos from the shared `SequenceNumberCounter` passed to `Config::new`. With independent counters, internal SuperVersion seqnos advance faster than data seqnos, causing `get_version_for_snapshot` to return a stale SuperVersion whose memtable misses recent inserts.

Root cause: `get_version_for_snapshot(S)` finds the latest SV with `seqno < S`. When the internal counter (advanced by flush/compact) outpaces user data seqnos, the returned SV references an old memtable that was rotated away.

Fix: use `seqno_counter.next()` from the shared counter for all data operations in the proptest, keeping SV seqnos and data seqnos properly interleaved.

Note: The bloom skipping feature (src/ changes) was merged via PR #64. This PR now contains only test improvements.
Test Plan
- `point_read_after_compact_flush_returns_latest_value` passes
- `prop_btreemap_oracle_correctness` passes (256 cases)
- `cargo clippy --tests` clean

Closes #58