Skip to content

perf: seqno-aware seek in data block point reads#263

Closed
polaz wants to merge 47 commits into
fjall-rs:mainfrom
structured-world:feat/#237-data-block-seqno-aware-seek
Closed

perf: seqno-aware seek in data block point reads#263
polaz wants to merge 47 commits into
fjall-rs:mainfrom
structured-world:feat/#237-data-block-seqno-aware-seek

Conversation

@polaz
Copy link
Copy Markdown

@polaz polaz commented Mar 14, 2026

Summary

  • Exploit internal key ordering (user_key ASC, seqno DESC) to include seqno in the binary search predicate during point_read
  • Adds seek_to_key_seqno() to Iter which uses a composite predicate: head_key < needle || (head_key == needle && head_seqno >= target_seqno)
  • This skips entire restart intervals containing only versions newer than the target snapshot seqno
  • Reduces linear scan from O(versions) to O(restart_interval) for keys with many MVCC versions
  • Hash index direct-hit path unchanged (already optimal for single-interval keys)

How it works

Before: binary search finds the restart interval by user_key only, then linear scans past all newer versions one by one.

After: binary search considers both user_key and seqno, landing directly at the restart interval containing the target version. The remaining linear scan within that interval is bounded by restart_interval items (typically 16).

Test plan

  • New test data_block_point_read_seqno_aware_seek — single key with 5 versions, various target seqnos, restart_interval 1..4
  • New test data_block_point_read_seqno_aware_seek_mixed_keys — multiple keys with multiple versions
  • All 15 existing data_block_point_read tests pass
  • Full cargo test --all-features passes (232 tests + 23 doc-tests)

Closes #237

Summary by CodeRabbit

  • New Features

    • Seek and point-read operations are now sequence-number aware, returning the correct versioned record for a given snapshot.
  • Bug Fixes

    • Prevents returns of entries masked by newer versions, improving conflict resolution and lookup correctness.
  • Tests

    • Added unit tests validating seqno-aware seeking across restart intervals and mixed-key/version scenarios.

polaz added 4 commits March 14, 2026 13:53
Add contains_prefix() to AbstractTree trait that checks if any key
with the given prefix exists, stopping at the first match instead
of materializing a full iterator.

- Default implementation on AbstractTree uses prefix().next()
- BlobTree overrides to delegate to index tree, avoiding value log reads
- MVCC-correct: respects seqno visibility and tombstones

Closes fjall-rs#138
- Doc describes convenience and error propagation, not false optimization claim
- Test asserts "ab" matches "abc:*" keys at visible seqno
- Add BlobTree test covering delegated index-only path
Delegate to index tree avoids BlobGuard construction overhead,
not value-log reads (key() never resolves blob indirections).
Exploit internal key ordering (user_key ASC, seqno DESC) to include
seqno in the binary search predicate. This skips entire restart
intervals containing only versions newer than the target snapshot,
reducing O(versions) linear scan to O(restart_interval) for keys
with many MVCC versions.

Closes fjall-rs#237
Copilot AI review requested due to automatic review settings March 14, 2026 12:42
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Mar 14, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds seqno-aware seeking to data-block iteration (new Iter::seek_to_key_seqno and seqno-aware variants of seek/seek_upper/seek_exclusive/seek_upper_exclusive), wires seqno-aware seeks into point-read paths, propagates SeqNo through higher-level iter helpers, and updates/extends tests to exercise seqno-aware behavior.

Changes

Cohort / File(s) Summary
Seqno-aware iteration core
src/table/data_block/iter.rs
Added pub fn seek_to_key_seqno(&mut self, needle: &[u8], seqno: SeqNo) -> bool; changed seek, seek_upper, seek_exclusive, seek_upper_exclusive to accept a SeqNo and perform seqno-aware binary + linear search; imported SeqNo.
Point-read integration & tests
src/table/data_block/mod.rs, src/table/data_block/iter_test.rs
Replaced plain binary-search fallback seeks with iter.seek_to_key_seqno(needle, seqno) in point-read paths; updated many tests to pass SeqNo::MAX and added two new seqno-aware unit tests (data_block_point_read_seqno_aware_seek, data_block_point_read_seqno_aware_seek_mixed_keys).
Iterator plumbing
src/table/iter.rs
Propagated SeqNo through OwnedDataBlockIter seek helper methods by changing signatures to accept seqno and forwarding it to underlying iter methods.

Sequence Diagram(s)

(Skipped — changes are localized to iterator logic and point-read wiring; no multi-component sequential flow diagram generated.)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related issues

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main change: introducing seqno-aware seeking in data block point reads for performance optimization.
Linked Issues check ✅ Passed The PR fully addresses issue #237 by implementing seqno-aware binary search in data blocks that skips restart intervals with only newer versions, with comprehensive tests validating the optimization.
Out of Scope Changes check ✅ Passed All changes directly support seqno-aware seeking: new seek_to_key_seqno method, updated seek signatures to accept seqno, propagated seqno through iterator layers, and comprehensive tests validating the feature.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
📝 Coding Plan
  • Generate coding plan for human review comments

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR improves DataBlock::point_read performance for MVCC-heavy keys by making the restart-interval binary search seqno-aware, leveraging internal key ordering (user_key ASC, seqno DESC) to skip restart intervals that can’t contain the snapshot-visible version.

Changes:

  • Add Iter::seek_to_key_seqno() to seek using a composite (user_key, seqno) predicate.
  • Update DataBlock::point_read to use the seqno-aware seek in the binary-search fallback paths.
  • Add new tests covering seqno-aware seeking across multiple restart_interval settings and mixed-key blocks.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
src/table/data_block/mod.rs Switch point_read fallback seek to seqno-aware seek; add new targeted tests.
src/table/data_block/iter.rs Introduce seek_to_key_seqno() to perform seqno-aware restart-interval seeking.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

Comment thread src/table/data_block/mod.rs Outdated
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Improves data-block point_read performance for MVCC-heavy keys by using internal key ordering (user_key ASC, seqno DESC) to make the restart-interval seek sequence-number aware, reducing the amount of post-seek linear scanning.

Changes:

  • Add Iter::seek_to_key_seqno() to perform a composite-predicate binary search over restart heads.
  • Update DataBlock::point_read() to use seqno-aware seeking in the binary-search fallback path.
  • Add new tests covering seqno-aware seeking behavior across varying restart_interval values and mixed-key blocks.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
src/table/data_block/mod.rs Switch point_read fallback to seqno-aware seek and add tests validating MVCC snapshot visibility behavior.
src/table/data_block/iter.rs Introduce seek_to_key_seqno() implementing the composite predicate over restart heads.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

Comment thread src/table/data_block/iter.rs Outdated
Comment thread src/table/data_block/mod.rs Outdated
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Improves DataBlock::point_read performance for MVCC-heavy keys by making the restart-interval binary search snapshot-seqno aware, reducing how many newer versions must be linearly scanned.

Changes:

  • Update DataBlock::point_read to use a seqno-aware seek predicate when falling back to binary search.
  • Add Iter::seek_to_key_seqno() to support composite (user_key, seqno) seeking over restart heads.
  • Add unit tests covering seqno-aware seeks across restart intervals and with mixed keys.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
src/table/data_block/mod.rs Switch point reads to seqno-aware seek and add new tests validating behavior across restart intervals.
src/table/data_block/iter.rs Introduce seek_to_key_seqno() that uses a composite predicate for restart-head binary search.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

Comment thread src/table/data_block/iter.rs Outdated
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Improves MVCC point-read performance in data blocks by making the restart-interval binary search sequence-number aware, leveraging internal key ordering (user_key ASC, seqno DESC) to skip restart intervals containing only versions newer than the target snapshot.

Changes:

  • Update DataBlock::point_read to use a seqno-aware seek predicate for binary-search fallback paths.
  • Add Iter::seek_to_key_seqno() to perform composite (key, seqno) restart-head searches.
  • Add unit tests covering seqno-aware seeking across varying restart intervals and mixed-key scenarios.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
src/table/data_block/mod.rs Switch point-read fallback to seqno-aware seek and add targeted tests for MVCC version chains.
src/table/data_block/iter.rs Introduce seek_to_key_seqno() using a composite predicate over restart-head key + seqno.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

Comment thread src/table/data_block/mod.rs Outdated
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR improves DataBlock::point_read performance for MVCC-heavy keys by making the restart-interval binary search sequence-number aware, leveraging the internal key ordering (user_key ASC, seqno DESC) to skip restart intervals that only contain versions newer than the snapshot.

Changes:

  • Switch DataBlock::point_read fallback seek from key-only to seqno-aware restart search.
  • Add Iter::seek_to_key_seqno() implementing a composite predicate over (user_key, seqno).
  • Add targeted unit tests covering seqno-aware seeking across restart intervals and mixed-key layouts.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File Description
src/table/data_block/mod.rs Updates point_read to use seqno-aware seek and adds new unit tests for MVCC visibility across restart intervals.
src/table/data_block/iter.rs Adds seek_to_key_seqno() to perform seqno-aware restart-interval selection during point reads.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

@marvin-j97
Copy link
Copy Markdown
Contributor

This doesn't really fully solve #237 - the todos for iterators are still todos

fn seek_lower_inclusive(&mut self, needle: &[u8], _seqno: SeqNo) -> bool {
self.with_dependent_mut(|_, m| m.seek(needle /* TODO: , seqno */))
}
fn seek_upper_inclusive(&mut self, needle: &[u8], _seqno: SeqNo) -> bool {
self.with_dependent_mut(|_, m| m.seek_upper(needle /* TODO: , seqno */))
}
fn seek_lower_exclusive(&mut self, needle: &[u8], _seqno: SeqNo) -> bool {
self.with_dependent_mut(|_, m| m.seek_exclusive(needle /* TODO: , seqno */))
}
fn seek_upper_exclusive(&mut self, needle: &[u8], _seqno: SeqNo) -> bool {
self.with_dependent_mut(|_, m| m.seek_upper_exclusive(needle /* TODO: , seqno */))
}

@codecov
Copy link
Copy Markdown

codecov Bot commented Mar 14, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

- Forward seeks (seek, seek_exclusive) use seqno in restart-interval
  binary search predicate, matching index_block pattern
- Backward seeks (seek_upper, seek_upper_exclusive) accept seqno for
  API uniformity but cannot narrow the binary search — backward
  iteration visits lower indices only, so a tighter predicate would
  miss intervals containing the visible version
- Wire seqno through OwnedDataBlockIter wrappers, removing all TODOs
- Add tests for seqno-aware forward/backward seeks with mixed keys

Ref fjall-rs#237
@polaz
Copy link
Copy Markdown
Author

polaz commented Mar 15, 2026

Resolved — all four iterator seek methods now accept seqno:

  • Forward seeks (seek, seek_exclusive): seqno narrows the restart-interval binary search, matching the existing index_block::Iter pattern
  • Backward seeks (seek_upper, seek_upper_exclusive): accept seqno for API uniformity but cannot use it in the binary search — backward iteration visits lower indices only, so a tighter predicate would skip intervals containing the visible version

Created #268 to explore seqno-aware linear scan as a follow-up optimization for backward seeks.

polaz added 5 commits March 15, 2026 17:01
…tion

Tamper real_val_len in blob header (not covered by checksum) and
verify that Reader::get returns Error::Decompress when decompressed
byte count mismatches the declared length.
- Add early return for n==0 in estimated_filter_size to prevent
  division by zero panic in FalsePositiveRate path
- Replace truncating `as usize` cast with `usize::try_from()` for
  metadata section length (safe on 32-bit platforms)
compress_prepend_size adds a 4-byte size prefix that decompress_into
does not expect. Use compress() to match production decompression format.
@marvin-j97
Copy link
Copy Markdown
Contributor

I think the changes don't look half bad, but note there are a lot of other committed files and changes I can't accept.

polaz added 20 commits March 15, 2026 17:59
- dead_code: Indexed struct, Manifest::tree_type → #[expect] with reason
- unreachable_patterns: BlockType wildcard arms → #[expect] referencing issue #13
…est test

- Document flush vs ingestion seqno semantics in get_highest_seqno
- Add test covering regular inserts followed by bulk ingestion,
  verifying persisted seqno reflects global offset and data is visible
- blob/reader.rs: compute add_size in usize directly (BLOB_HEADER_LEN + key.len()),
  eliminating two #[expect(cast_possible_truncation)] annotations
- table/util.rs: remove _ => {} wildcard arms — BlockType match is already exhaustive
  with Filter/Index/Data|Meta covering all 4 variants
- blob reader: checked_add for on_disk_size + add_size (32-bit overflow)
- block/blob: add NOTE comments clarifying size cap is in separate branch
fix: resolve all clippy warnings for strict -D warnings CI
…docstring-and-test

docs(table): expand get_highest_seqno docstring and add mixed ingest test
- Simplify iterator chain to match/guard pattern for clarity
- Add empty-prefix edge case test on non-empty tree (MVCC + deletes)
Add doc-comments to `seek_upper` and `seek_upper_exclusive` explaining
that seqno is accepted for API uniformity with forward seek methods but
is intentionally unused because backward binary search cannot leverage
it without risking skipping the visible version.
…insprefix

feat: add optimized contains_prefix() method
@polaz
Copy link
Copy Markdown
Author

polaz commented Mar 16, 2026

Sorry about the mess here — the branch accidentally included unrelated changes from our fork's main (CI workflows, clippy fixes, contains_prefix, etc.).

I've opened a clean PR rebased directly on upstream main with only the seqno-aware seek changes (1 commit, 4 files): #270

Closing this one in favour of that.

@polaz polaz closed this Mar 16, 2026
@polaz polaz deleted the feat/#237-data-block-seqno-aware-seek branch March 16, 2026 17:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Data block: seqno aware seek

3 participants