feat: add cross-node consistency check for pooled per-object blob info#3490
feat: add cross-node consistency check for pooled per-object blob info#3490halfprice wants to merge 1 commit into
Conversation
Mirror the certified per-object blob digest consistency check for the per-object pooled blob info table: - Implement `CertifiedBlobInfoApi` for `PerObjectPooledBlobInfo` so the `BlobInfoIter` "certified before epoch" filter works over the pooled table. A pooled blob entry is certified iff it is present (deletion removes the entry) and its certified epoch is at or before the epoch. - Add `PerObjectPooledBlobInfoIterator` plus the `certified_per_object_pooled_blob_info_iter_before_epoch` builders on `BlobInfoTable` and `Storage`. - Add three pooled consistency-check metrics. - Make `compose_blob_object_list_digest` generic over any ObjectID-keyed table, and add `compose_certified_pooled_object_blob_list_digest` with its own `storage_node_certified_pooled_blob_object_digest` fail point. Schedule it alongside the existing per-object check at epoch change. - Wire the new fail point into the simtest `BlobInfoConsistencyCheck` so every storage-pool simtest verifies pooled per-object digests match across all nodes.
|
@codex review |
There was a problem hiding this comment.
Pull request overview
This PR extends the storage-node epoch-change background consistency checks to also compute and compare a cross-node digest for the per-object pooled blob info table, mirroring the existing per-object blob info digest check. This enables simtests to detect pooled per-object table divergence across nodes.
Changes:
- Add
CertifiedBlobInfoApisupport forPerObjectPooledBlobInfoso it can be filtered/scanned consistently with other certified-blob iterators. - Add storage/table iterators and background digest computation + metrics for certified pooled per-object entries.
- Extend the simtest harness to collect and assert cross-node equality for the pooled per-object digest via a new failpoint.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| crates/walrus-simtest/src/test_utils.rs | Registers a new failpoint + digest map and adds a pooled per-object digest cross-node consistency assertion. |
| crates/walrus-service/src/node/storage/blob_info/per_object_pooled_blob_info.rs | Implements CertifiedBlobInfoApi for pooled per-object blob info entries. |
| crates/walrus-service/src/node/storage/blob_info.rs | Adds pooled per-object iterator types and a “certified before epoch” iterator for the pooled per-object table. |
| crates/walrus-service/src/node/storage.rs | Exposes a storage-level iterator for certified pooled per-object entries (before epoch). |
| crates/walrus-service/src/node/metrics.rs | Adds Prometheus metrics for pooled per-object digest, errors, and scanned count. |
| crates/walrus-service/src/node/consistency_check.rs | Computes and publishes pooled per-object digest during epoch-change background consistency checks (plus simtest failpoint hook). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| ) | ||
| .set(blob_object_list_digest as i64); | ||
|
|
||
| // No-op out side of simtest. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 0f37f32e02
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| let blob_object_list_digest = match compose_blob_object_list_digest( | ||
| per_object_pooled_blob_info_iter, |
There was a problem hiding this comment.
Hash pooled row values in the new digest
This new pooled check uses compose_blob_object_list_digest, but that helper only feeds blob_info.0 (the ObjectID) into the hasher, so two nodes with the same pooled object IDs but different PerObjectPooledBlobInfo values still report identical digests. In scenarios where blob_id, storage_pool_id, or certified_epoch diverges, later GC/aggregate updates can behave differently while the simtest consistency check passes; include a stable serialization of the pooled value, or at least the material fields, in this digest.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
That's the pattern for other consistency checks as well and only the ObjectID set is relevant.
shuowang12
left a comment
There was a problem hiding this comment.
LGTM! Thanks for extending it to cover pooled blobs.
Summary
Extends the storage-node background consistency check to cover the per-object pooled blob info table, mirroring the existing certified per-object blob digest check. At each epoch change, every node computes an xxhash digest over its certified pooled per-object entries; the simtest harness then asserts all nodes agree, surfacing any cross-node divergence.
Test plan
cargo check -p walrus-service— clean.walrus-simtest— clean.MSIM_TEST_SEED=1 cargo simtest test_storage_pool— all 11 storage-pool simtests pass.Release notes
Check each box that your changes affect. If none of the boxes relate to your changes, release notes aren't required.
For each box you select, include information after the relevant heading that describes the impact of your changes that
a user might notice and any actions they must take to implement updates. (Add release notes after the colon for each item)