feat: look up cache with range calculation by waynexia · Pull Request #8123 · GreptimeTeam/greptimedb

waynexia · 2026-05-16T03:18:34Z

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

I hereby agree to the terms of the GreptimeDB CLA.

Refer to a related PR or issue link (optional)

What's changed and what's your intention?

Look up cached content by range calculation instead of exact key matching to improve cache efficiency. Test scenario:

SELECT pod, container, avg(greptime_value) AS avg_v
FROM container_cpu_usage_seconds_total
WHERE greptime_timestamp >= '2025-10-20 00:00:00'
  AND greptime_timestamp <  '2025-10-21 00:00:00'
  AND namespace = 'kube-system'
GROUP BY pod, container
ORDER BY avg_v DESC
LIMIT 20;

Changes:

Version	page hits	page misses	cache bytes added
before	0	+63	+132,136,272
after	+32	+63	+24,335,439

After this change, the query gets 32 page-cache hits where the old implementation gets 0, meaning fragment reuse is happening inside the same SQL workload.
The cache footprint is much smaller: about 24MB added vs 132MB added for the same query.

PR Checklist

Please convert it to a draft if some of the following conditions are not met.

I have written the necessary rustdoc comments.
I have added the necessary unit tests and integration tests.
This PR requires documentation updates.
API changes are backward compatible.
Schema or data changes are backward compatible.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 226a02e6de

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

gemini-code-assist

Code Review

This pull request replaces the previous page-level cache with a more granular byte-fragment cache (PageRangeCache) to support partial cache hits and overlapping range lookups in Parquet row groups. Key changes include the introduction of PageFragmentKey and PageRangeLookup, as well as logic in the SstParquetRangeFetcher to assemble requested ranges from a mix of cached fragments and newly fetched data. A critical issue was identified in the PageRangeCache eviction listener, where holding a write lock while calling remove_if on the index map could lead to deadlocks or race conditions.
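The assembly of a requested range from cached fragments plus newly fetched data can be sketched as below. This is a minimal, std-only illustration and not the PR's code: `FragmentIndex`, `plan_range`, and `Piece` are hypothetical names, a `BTreeMap` keyed by fragment start stands in for the real index, and cached fragments are assumed not to overlap each other.

```rust
use std::collections::BTreeMap;
use std::ops::Range;

/// One piece of a planned read (hypothetical type for this sketch).
#[derive(Debug, PartialEq, Eq)]
enum Piece {
    /// `range` is covered by the cached fragment that starts at `fragment_start`.
    Cached { fragment_start: u64, range: Range<u64> },
    /// `range` is not cached and has to be fetched.
    Missing(Range<u64>),
}

/// Hypothetical index of cached fragments: start offset -> end offset (exclusive).
/// Assumption: fragments do not overlap each other.
#[derive(Default)]
struct FragmentIndex {
    fragments: BTreeMap<u64, u64>,
}

impl FragmentIndex {
    fn insert(&mut self, range: Range<u64>) {
        self.fragments.insert(range.start, range.end);
    }

    /// Splits `request` into cached and missing pieces, in offset order.
    fn plan_range(&self, request: Range<u64>) -> Vec<Piece> {
        let mut pieces = Vec::new();
        if request.start >= request.end {
            return pieces;
        }
        let mut cursor = request.start;

        // A fragment that starts before the request may still overlap its head.
        let first = self
            .fragments
            .range(..request.start)
            .next_back()
            .filter(|entry| *entry.1 > request.start);
        // Fragments that start inside the request.
        let rest = self.fragments.range(request.start..request.end);

        for (&start, &end) in first.into_iter().chain(rest) {
            if end <= cursor {
                continue; // nothing new covered by this fragment
            }
            if start > cursor {
                pieces.push(Piece::Missing(cursor..start));
            }
            let covered = start.max(cursor)..end.min(request.end);
            cursor = covered.end;
            pieces.push(Piece::Cached { fragment_start: start, range: covered });
        }
        if cursor < request.end {
            pieces.push(Piece::Missing(cursor..request.end));
        }
        pieces
    }
}
```

With fragments `0..100` and `150..200` cached, `plan_range(50..180)` yields a cached piece `50..100`, a missing piece `100..150`, and a cached piece `150..180`, so only the middle gap would need to be fetched.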

waynexia · 2026-05-16T04:49:40Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8029580d64

waynexia · 2026-05-18T10:50:57Z

@codex review

chatgpt-codex-connector · 2026-05-18T10:57:42Z

Codex Review: Didn't find any major issues. Hooray!

evenyag · 2026-05-18T14:37:55Z

+    fn remove_stale_index_entries(&self, keys: Vec<PageFragmentKey>) {
+        for key in keys {
+            if self.cache.contains_key(&key) {
+                continue;
+            }
+
+            if self.remove_index_entry(key) && self.cache.contains_key(&key) {
+                self.insert_index_entry(key);
+            }
+        }
+    }


remove_stale_index_entries() is invoked from the eviction listener and calls self.cache.contains_key(&key). Is there a potential deadlock issue?

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 54c19174eb

chatgpt-codex-connector · 2026-05-20T07:05:26Z

+        let mut index = self.index.write().unwrap();
+        if self.cache.contains_key(&key) {
+            return;


Avoid cache lookup while holding index write lock

remove_uncached_index_entry acquires self.index.write() and then calls self.cache.contains_key(&key), but the cache is configured with an eviction listener that calls remove_index_entry (which also takes self.index.write()). In Moka 0.12, maintenance/removal notifications are delivered during foreground cache operations, so a contains_key that drains pending notifications can re-enter the listener and block on the same lock, causing lookup threads to deadlock under eviction pressure.
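One way to avoid the re-entrancy described here is to finish the cache lookup before taking the index lock. The sketch below is std-only and purely illustrative: `ToyCache` is a hypothetical stand-in that delivers pending eviction notifications from inside `contains_key` (the behavior the comment attributes to Moka), keys are plain `u64`, and the method names only mirror the ones quoted above. It is not the fix that was applied in the PR.

```rust
use std::collections::{BTreeSet, HashSet};
use std::sync::{Mutex, RwLock};

/// Toy stand-in for a cache that runs pending eviction notifications
/// from inside foreground calls (hypothetical, for illustration only).
#[derive(Default)]
struct ToyCache {
    live: Mutex<HashSet<u64>>,
    pending: Mutex<Vec<u64>>,
}

impl ToyCache {
    fn contains_key(&self, key: u64, listener: &dyn Fn(u64)) -> bool {
        // Take the pending notifications out so no ToyCache lock is held
        // while the listener runs.
        let evicted: Vec<u64> = std::mem::take(&mut *self.pending.lock().unwrap());
        for k in evicted {
            listener(k);
        }
        self.live.lock().unwrap().contains(&key)
    }
}

#[derive(Default)]
struct RangeCache {
    cache: ToyCache,
    index: RwLock<BTreeSet<u64>>,
}

impl RangeCache {
    fn insert(&self, key: u64) {
        self.index.write().unwrap().insert(key);
        self.cache.live.lock().unwrap().insert(key);
    }

    /// Simulates the cache evicting `key`; the listener runs on a later call.
    fn evict(&self, key: u64) {
        self.cache.live.lock().unwrap().remove(&key);
        self.cache.pending.lock().unwrap().push(key);
    }

    /// Eviction listener: needs the index write lock.
    fn on_evict(&self, key: u64) {
        self.index.write().unwrap().remove(&key);
    }

    fn remove_uncached_index_entry(&self, key: u64) {
        // Cache lookup happens BEFORE the index lock is taken, so a listener
        // triggered by the lookup can acquire the index lock without blocking.
        if self.cache.contains_key(key, &|k: u64| self.on_evict(k)) {
            return;
        }
        self.index.write().unwrap().remove(&key);
    }

    fn indexed(&self) -> Vec<u64> {
        self.index.read().unwrap().iter().copied().collect()
    }
}
```

The check and the removal are no longer atomic in this ordering, so a concurrent insert between them could still race. A real implementation would have to decide how to handle that case.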

evenyag · 2026-05-20T12:38:04Z

            self.cache.insert(key, bytes);
-            self.insert_index_entry(key);
+            let mut index = self.index.write().unwrap();
+            index
+                .entry(key.group_key())
+                .or_default()
+                .insert((key.start, key.end), key);


Should we insert the index before the cache? So we can ensure the key is in the index when the cache evicts items.
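The concern behind this question can be illustrated with a toy worst case. In the sketch below, `ZeroCapacityCache` is a hypothetical cache that evicts every entry immediately and notifies the listener synchronously. `Index`, `insert_cache_first`, `insert_index_first`, and `stale_entries` are likewise made-up names, and keys are plain `u64`. It only shows how the two orderings differ under that assumption, not how the real cache behaves.

```rust
use std::collections::BTreeSet;
use std::sync::RwLock;

/// Toy cache with zero capacity: every insert is evicted at once and the
/// listener is called synchronously (hypothetical worst case).
struct ZeroCapacityCache;

impl ZeroCapacityCache {
    fn insert(&self, key: u64, on_evict: &dyn Fn(u64)) {
        on_evict(key);
    }
}

#[derive(Default)]
struct Index(RwLock<BTreeSet<u64>>);

impl Index {
    fn insert(&self, key: u64) {
        self.0.write().unwrap().insert(key);
    }
    fn remove(&self, key: u64) {
        self.0.write().unwrap().remove(&key);
    }
    fn entries(&self) -> usize {
        self.0.read().unwrap().len()
    }
}

/// Cache first, index second: the eviction can run before the index entry
/// exists, so the listener removes nothing and the later insert goes stale.
fn insert_cache_first(cache: &ZeroCapacityCache, index: &Index, key: u64) {
    cache.insert(key, &|k: u64| index.remove(k));
    index.insert(key);
}

/// Index first, cache second: the index lock is released before the cache
/// insert, and the listener always finds the entry it has to remove.
fn insert_index_first(cache: &ZeroCapacityCache, index: &Index, key: u64) {
    index.insert(key);
    cache.insert(key, &|k: u64| index.remove(k));
}

/// Number of index entries left behind after one insert that was evicted.
fn stale_entries(index_first: bool) -> usize {
    let (cache, index) = (ZeroCapacityCache, Index::default());
    if index_first {
        insert_index_first(&cache, &index, 7);
    } else {
        insert_cache_first(&cache, &index, 7);
    }
    index.entries()
}
```

Here `stale_entries(false)` returns 1 (an index entry with no cached bytes behind it) and `stale_entries(true)` returns 0.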

waynexia added 2 commits May 16, 2026 11:14

feat: loop up cache with range calculation

8eebf15

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

copy fragements

226a02e

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

waynexia requested review from evenyag and v0y4g3r as code owners May 16, 2026 03:18

github-actions Bot added the size/M and docs-not-required (This change does not impact docs.) labels May 16, 2026

chatgpt-codex-connector Bot reviewed May 16, 2026

Comment thread src/mito2/src/cache.rs Outdated

gemini-code-assist Bot reviewed May 16, 2026

Comment thread src/mito2/src/cache.rs Outdated

release dashmap reference

8029580

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

chatgpt-codex-connector Bot reviewed May 16, 2026

Comment thread src/mito2/src/cache.rs Outdated

waynexia changed the title ~~feat: loop up cache with range calculation~~ feat: look up cache with range calculation May 18, 2026

waynexia added 3 commits May 18, 2026 17:39

fix bugs

6f54242

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

remove dashmap

8accd23

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

remove keys

bebcfc6

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

evenyag reviewed May 18, 2026

clean up

54c1917

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

chatgpt-codex-connector Bot reviewed May 20, 2026

evenyag approved these changes May 20, 2026

waynexia wants to merge 7 commits into main from range-cache

2 participants