[PERF] Filter blocks by key range in get_range #7120

Merged
Sicheng-Pan merged 2 commits into main from 05-22-_perf_filter_blocks_by_key_range_in_get_range on May 26, 2026

@Sicheng-Pan Sicheng-Pan commented May 22, 2026

Summary

Pass the key range through to get_block_ids_range so that get_range / get_range_stream can skip blocks that cannot contain any matching keys. Previously, block resolution filtered only by prefix: a narrow key range query such as get_range(""..="", 42..=42) would load every block for prefix "" even when only one block contains key 42.

Approach

When a block's start and end delimiters share the same prefix, the block covers a single prefix and its key range is known ([start.key, end.key)). We check if this range overlaps the query key range using the same MAX(start) <= MIN(end) overlap pattern used for prefix filtering.
The first and last blocks (those with a Start delimiter or with no end delimiter) are never eliminated, because their key range is unbounded on one side. This costs at most 2 extra block loads per query, which is acceptable.
Key range bounds are converted to KeyWrapper once outside the filter loop to avoid per-block allocations.
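
The overlap check described above can be sketched as follows. This is a simplified illustration, not the code in sparse_index.rs: `Delimiter`, `BlockEntry`, `may_contain_key`, and `block_ids_in_key_range` are hypothetical names, keys are plain `u32` values rather than KeyWrapper, and the sketch assumes prefix filtering has already happened. It treats each single-prefix block as covering [start.key, end.key) and keeps every block whose key range is not fully known.

```rust
use std::ops::{Bound, RangeBounds};

/// Hypothetical, simplified stand-in for a sparse index delimiter.
#[derive(Clone, Debug, PartialEq)]
enum Delimiter {
    /// Sentinel that marks the very first block.
    Start,
    /// A concrete (prefix, key) boundary.
    Key { prefix: String, key: u32 },
}

/// Hypothetical block entry: `end` is the next block's start delimiter,
/// or `None` for the last block.
struct BlockEntry {
    id: u32,
    start: Delimiter,
    end: Option<Delimiter>,
}

/// Returns `false` only when the block provably holds no key in `query`.
fn may_contain_key<R: RangeBounds<u32>>(block: &BlockEntry, query: &R) -> bool {
    // The key range is only known when both delimiters share one prefix.
    let (start_key, end_key) = match (&block.start, &block.end) {
        (
            Delimiter::Key { prefix: sp, key: sk },
            Some(Delimiter::Key { prefix: ep, key: ek }),
        ) if sp == ep => (*sk, *ek),
        // First block, last block, or a block spanning prefixes: keep it.
        _ => return true,
    };

    // The query ends before the block's first possible key.
    let query_below = match query.end_bound() {
        Bound::Included(e) => *e < start_key,
        Bound::Excluded(e) => *e <= start_key,
        Bound::Unbounded => false,
    };
    // The query starts at or after the block's exclusive end.
    let query_above = match query.start_bound() {
        Bound::Included(s) | Bound::Excluded(s) => *s >= end_key,
        Bound::Unbounded => false,
    };
    !(query_below || query_above)
}

/// Collects the ids of blocks that may hold keys in `query`.
fn block_ids_in_key_range<R: RangeBounds<u32>>(blocks: &[BlockEntry], query: R) -> Vec<u32> {
    blocks
        .iter()
        .filter(|b| may_contain_key(b, &query))
        .map(|b| b.id)
        .collect()
}
```

For five blocks under prefix "" split at keys 10, 40, 50, and 90, a query of 42..=42 keeps only the first block, the [40, 50) block, and the last block in this sketch, while a query of `..` keeps all five.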

Changes

  • sparse_index.rs: Add a key_range parameter to get_block_ids_range. After the existing prefix overlap check, add a key range overlap check for single-prefix blocks. New test test_get_block_ids_range_with_key_filter.
  • blockfile.rs: Update 5 call sites: get_range and get_range_stream pass key_range through; the others pass .. (unbounded, preserving existing behavior).
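
Passing `..` preserves the old behavior because an unbounded range yields `Bound::Unbounded` on both sides, so a key filter has nothing to exclude. The sketch below illustrates that, together with converting the bounds once before any per-block loop. The `KeyWrapper` enum here is a hypothetical two-variant stand-in for the real type, and `wrap_bound` / `wrap_key_range` are illustrative helper names, not functions from the PR.

```rust
use std::ops::{Bound, RangeBounds};

/// Hypothetical stand-in for the KeyWrapper type named in the PR.
#[derive(Clone, Debug, PartialEq)]
enum KeyWrapper {
    Uint32(u32),
    Str(String),
}

impl From<u32> for KeyWrapper {
    fn from(v: u32) -> Self {
        KeyWrapper::Uint32(v)
    }
}

impl From<String> for KeyWrapper {
    fn from(v: String) -> Self {
        KeyWrapper::Str(v)
    }
}

/// Converts one borrowed bound into an owned KeyWrapper bound.
fn wrap_bound<K: Clone + Into<KeyWrapper>>(bound: Bound<&K>) -> Bound<KeyWrapper> {
    match bound {
        Bound::Included(k) => Bound::Included(k.clone().into()),
        Bound::Excluded(k) => Bound::Excluded(k.clone().into()),
        Bound::Unbounded => Bound::Unbounded,
    }
}

/// Converts both bounds once, so a later per-block loop only compares
/// already-wrapped values instead of allocating on every iteration.
fn wrap_key_range<K, R>(range: &R) -> (Bound<KeyWrapper>, Bound<KeyWrapper>)
where
    K: Clone + Into<KeyWrapper>,
    R: RangeBounds<K>,
{
    (wrap_bound(range.start_bound()), wrap_bound(range.end_bound()))
}
```

In this sketch, `wrap_key_range::<u32, _>(&(..))` returns a pair of `Bound::Unbounded` values, whereas `wrap_key_range::<u32, _>(&(42..=42))` returns two `Bound::Included(KeyWrapper::Uint32(42))` values. The explicit key type is needed for `..` because a bare `RangeFull` carries no key type to infer from.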

Sicheng-Pan commented May 22, 2026

This stack of pull requests is managed by Graphite. Learn more about stacking.

@github-actions

Reviewer Checklist

Please leverage this checklist to ensure your code review is thorough before approving

Testing, Bugs, Errors, Logs, Documentation

  • Can you think of any use case in which the code does not behave as intended? Have they been tested?
  • Can you think of any inputs or external events that could break the code? Is user input validated and safe? Have they been tested?
  • If appropriate, are there adequate property based tests?
  • If appropriate, are there adequate unit tests?
  • Should any logging, debugging, tracing information be added or removed?
  • Are error messages user-friendly?
  • Have all documentation changes needed been made?
  • Have all non-obvious changes been commented?

System Compatibility

  • Are there any potential impacts on other parts of the system or backward compatibility?
  • Does this change intersect with any items on our roadmap, and if so, is there a plan for fitting them together?

Quality

  • Is this code of an unexpectedly high quality (Readability, Modularity, Intuitiveness)?

@Sicheng-Pan Sicheng-Pan force-pushed the 05-22-_perf_filter_blocks_by_key_range_in_get_range branch from ea1e979 to bcda98e Compare May 22, 2026 20:48
@Sicheng-Pan Sicheng-Pan marked this pull request as ready for review May 22, 2026 22:44
@Sicheng-Pan Sicheng-Pan merged commit 48234f6 into main May 26, 2026
62 checks passed
@Sicheng-Pan Sicheng-Pan deleted the 05-22-_perf_filter_blocks_by_key_range_in_get_range branch May 26, 2026 23:59
2 participants