refactor: remove build() and async shuffling calculation by guha-rahul · Pull Request #8688 · ChainSafe/lodestar

guha-rahul · 2025-12-10T22:50:13Z

Motivation

When lodestar-z happens, BeaconStateAllForks will be a blocker, and the build() method in ShufflingCache depends on it.
Post-Fulu proposer lookahead is stored in BeaconState, requiring shufflings synchronously during epoch transitions—making the async build() pattern no longer viable.

Description

Remove build() method from IShufflingCache interface and ShufflingCache class
Add set() to IShufflingCache interface to add shufflings
Remove asyncShufflingCalculation
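
A minimal sketch of what the cache surface might look like after this change, assuming simplified stand-in types. `EpochShufflingLike`, `ShufflingCacheLike`, and `InMemoryShufflingCache` are hypothetical names for illustration, and the `getSync` signature is an assumption; none of this is the actual Lodestar definition. The point is only that callers compute the shuffling synchronously and hand the finished result to `set()`, instead of asking the cache to build it.

```typescript
// Hypothetical stand-ins; the real EpochShuffling and IShufflingCache carry more fields.
type Epoch = number;
type RootHex = string;

interface EpochShufflingLike {
  epoch: Epoch;
  activeIndices: number[];
}

interface ShufflingCacheLike {
  // Synchronous lookup; returns null on a cache miss (assumed signature).
  getSync(epoch: Epoch, decisionRoot: RootHex): EpochShufflingLike | null;
  // Add an already-computed shuffling to the cache.
  set(shuffling: EpochShufflingLike, decisionRoot: RootHex): void;
}

class InMemoryShufflingCache implements ShufflingCacheLike {
  // epoch -> decision root -> shuffling
  private readonly byEpoch = new Map<Epoch, Map<RootHex, EpochShufflingLike>>();

  getSync(epoch: Epoch, decisionRoot: RootHex): EpochShufflingLike | null {
    return this.byEpoch.get(epoch)?.get(decisionRoot) ?? null;
  }

  set(shuffling: EpochShufflingLike, decisionRoot: RootHex): void {
    let byRoot = this.byEpoch.get(shuffling.epoch);
    if (byRoot === undefined) {
      byRoot = new Map<RootHex, EpochShufflingLike>();
      this.byEpoch.set(shuffling.epoch, byRoot);
    }
    byRoot.set(decisionRoot, shuffling);
  }
}
```

With this shape there is no pending-promise state to manage: a lookup either hits or misses, and whoever computed the shuffling is responsible for calling `set()`.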

Closes #8653

AI Assistance Disclosure

External Contributors: I have read the contributor guidelines and disclosed my usage of AI below.
Used Claude to understand how ShufflingCache avoids recomputation.

gemini-code-assist · 2025-12-10T22:50:34Z

Summary of Changes

Hello @guha-rahul, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request refactors the shuffling calculation mechanism by removing the asynchronous build() method and the asyncShufflingCalculation flag. This change is motivated by upcoming lodestar-z compatibility issues with BeaconStateAllForks and the need for synchronous shuffling calculations for proposer lookahead in the post-Fulu era. The IShufflingCache interface has been updated to reflect these changes, now providing a set() method for directly adding shufflings.

Highlights

Removal of Asynchronous Shuffling: The build() method and the asyncShufflingCalculation flag have been removed, shifting shuffling computations to a synchronous model to align with new requirements.
IShufflingCache Interface Update: The build() method in the IShufflingCache interface has been replaced with a set() method, allowing direct addition of EpochShuffling objects to the cache.
BeaconStateAllForks Compatibility: The changes address a future blocker related to BeaconStateAllForks and lodestar-z by removing the dependency of the build() method on it.
Post-Fulu Requirements: This refactor ensures that proposer lookahead, which requires synchronous shuffling post-Fulu, is correctly handled within the system.

gemini-code-assist

Code Review

This pull request effectively refactors the shuffling calculation by removing the asynchronous build() method and the asyncShufflingCalculation option. This change simplifies the codebase by enforcing synchronous shuffling calculation, which is a necessary step for upcoming features like post-Fulu proposer lookahead. The changes are consistent across all modified files, including interfaces, implementations, and tests. The IShufflingCache interface is updated to use a set() method, and the implementation in ShufflingCache correctly makes this method public. The epoch processing logic in EpochCache and EpochTransitionCache is also cleaned up by removing the now-obsolete async path. Overall, this is a solid refactoring. I have a couple of minor suggestions to improve code clarity and documentation.

gemini-code-assist · 2025-12-10T22:51:47Z

packages/state-transition/src/cache/epochTransitionCache.ts

@@ -380,7 +370,6 @@ export function beforeProcessEpoch(
  });

  // Trigger async build of shuffling for epoch after next (nextShuffling post epoch transition)


This comment is misleading as the asynchronous shuffling build logic has been removed in this pull request. It should be removed to avoid confusion for future developers.

gemini-code-assist · 2025-12-10T22:51:47Z

packages/state-transition/src/util/epochShuffling.ts

  /**
-   * Queue asynchronous build for an EpochShuffling
+   * Add an EpochShuffling to the ShufflingCache.
   */


The comment for the set method in the IShufflingCache interface could be more descriptive. The implementation in ShufflingCache also handles resolving promises if they exist. Reflecting this behavior in the interface documentation would improve clarity for developers using this interface.

Suggested change

-  /**
-   * Queue asynchronous build for an EpochShuffling
-   * Add an EpochShuffling to the ShufflingCache.
-   */
+  /**
+   * Add an EpochShuffling to the ShufflingCache. If a promise for the shuffling is present it will
+   * resolve the promise with the built shuffling.
+   */

wemeetagain

generally lgtm

twoeths

this PR misses the context of #6938:

state-transition should not have ShufflingCache dependency, we will not have it in lodestar-z also
beacon-node should populate the ShufflingCache instead, see https://github.com/ChainSafe/lodestar/pull/6938/changes#diff-c21a87c055e2b34c02de2fa28ecc52c53837159f567eef5fb22e80ed55827044L341 . That's also the direction for integrating lodestar-z state-transition, need to think how to pull shuffling there
a lot of metrics in ShufflingCache introduced by #6938 become useless
if we regen state for attestation verification, we should also populate ShufflingCache https://github.com/ChainSafe/lodestar/pull/6938/changes#diff-29e6daed96db2b885b1c4f516eb55bf7e6916e298d50f78bd2f8c57811481d2fL902
add more comments on why we have to compute shuffling for epoch n + 2 post-Fulu (with proposer lookahead spec change)

in the end it'd be more about reverting #6938, and we should deploy on a feature group to make sure there is no regression
also would like @matthewkeil to have a close look to make sure we don't miss anything

matthewkeil · 2025-12-15T07:10:09Z

@twoeths It looks like you covered the changes from #6938 that will get reverted. The gist, Rahul, is that there should be no shuffling cache on the epochCtx anymore. I think if you remove it from the EpochCache and follow all the code paths that break from the change, you will find most of the other hanging references not mentioned above.

packages/beacon-node/src/chain/blocks/importBlock.ts

packages/beacon-node/src/chain/chain.ts

packages/beacon-node/src/chain/shufflingCache.ts

packages/state-transition/src/cache/epochCache.ts

matthewkeil · 2026-01-09T07:34:35Z

packages/beacon-node/src/chain/blocks/importBlock.ts

+  const parentEpoch = computeEpochAtSlot(parentBlockSlot);
+  if (parentEpoch < blockEpoch && postState.epochCtx.nextShuffling !== null) {
+    // current epoch and previous epoch are likely cached in previous states
+    this.shufflingCache.set(postState.epochCtx.nextShuffling, postState.epochCtx.nextDecisionRoot);
+    this.logger.verbose("Processed shuffling for next epoch", {parentEpoch, blockEpoch, slot: blockSlot});
+  }


Originally this happened right before the epoch work. I think it should be OK to do it early here, but for consistency's sake can we move this down to just before the line `if (blockSlot % SLOTS_PER_EPOCH === 0) {`, in case there are error cases that cause us to exit the import early? We only cache a limited number of shufflings, and it is better not to cache ones from imports that fail.

agree, should stay where it was originally
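
The ordering concern above can be sketched as follows, assuming a heavily simplified import flow. `importBlockSketch`, `ImportInput`, `fallibleSteps`, and the local `computeEpochAtSlot` are hypothetical stand-ins written for this illustration, not the code in importBlock.ts. The sketch only shows that placing `set()` after every step that can throw keeps failed imports out of the cache.

```typescript
// Hypothetical, simplified import flow; names are illustrative only.
const SLOTS_PER_EPOCH = 32;

type Shuffling = {epoch: number};
type Cache = {set(shuffling: Shuffling, decisionRoot: string): void};

interface ImportInput {
  parentBlockSlot: number;
  blockSlot: number;
  nextShuffling: Shuffling | null;
  nextDecisionRoot: string;
  // Stands in for any earlier import step that may throw.
  fallibleSteps: () => void;
}

// Local helper for the sketch: epoch number containing the given slot.
function computeEpochAtSlot(slot: number): number {
  return Math.floor(slot / SLOTS_PER_EPOCH);
}

function importBlockSketch(cache: Cache, input: ImportInput): void {
  // Anything that can abort the import runs first.
  input.fallibleSteps();

  // Only reached when the steps above succeeded.
  const parentEpoch = computeEpochAtSlot(input.parentBlockSlot);
  const blockEpoch = computeEpochAtSlot(input.blockSlot);
  if (parentEpoch < blockEpoch && input.nextShuffling !== null) {
    cache.set(input.nextShuffling, input.nextDecisionRoot);
  }

  if (input.blockSlot % SLOTS_PER_EPOCH === 0) {
    // Epoch-boundary work would follow here.
  }
}
```

Since the cache holds only a limited number of shufflings, the trade-off is between populating it as early as possible and never spending a slot on an import that later fails; the sketch takes the second option.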

matthewkeil · 2026-01-09T08:00:11Z

packages/state-transition/src/epoch/processProposerLookahead.ts

-    }) ??
-    // Only for testing. shufflingCache should always be available in prod
-    computeEpochShuffling(state, cache.nextShufflingActiveIndices, epoch);
+  const shuffling = computeEpochShuffling(state, cache.nextShufflingActiveIndices, epoch);


Do we cache this result when the state comes back out of the state transition in all cases? I think this will be the case but please check for places where processSlot or stateTransition is in the call stack and verify we are doing that.

matthewkeil

Perhaps it's better to do all calculations via the cache, so that there is no accidental duplication. It seems like an easier pattern to always use the build method instead of having free-floating calls to computeEpochShuffling. That will keep everything centralized and coordinated. What do you think, @twoeths?

packages/beacon-node/src/metrics/metrics/lodestar.ts

packages/beacon-node/src/chain/shufflingCache.ts

packages/state-transition/src/cache/epochCache.ts

twoeths · 2026-01-17T08:06:14Z

@guha-rahul this e2e test consistently failed

[E2E tests: packages/beacon-node/test/e2e/api/impl/lightclient/endpoint.test.ts#L97](https://github.com/ChainSafe/lodestar/pull/8688/files#annotation_44242307082)
AssertionError: expected 4 to be 5 // Object.is equality
- Expected
+ Received
- 5
+ 4
❯ packages/beacon-node/test/e2e/api/impl/lightclient/endpoint.test.ts:97:47

also lint CI is failing

packages/beacon-node/src/chain/chain.ts

twoeths · 2026-01-20T03:22:35Z

the branch was deployed for >9h

size, hit, miss rate are the same
set_rate_per_epoch is new in this PR; it ranges from ~7 to ~9 now, which makes sense (it's a very cheap check, so setting the same shuffling again is not an issue)
- when importing a block, if the parent epoch is less than the block epoch, we set 3 times
- when import_block, if the slot is a multiple of 32, we set 3 times. But there could be skipped slots in this case
- when regen process_slot, we set 3 times
at start of each epoch, we query ShufflingCache for attester duties. It works as before

we also get shuffling from ShufflingCache to validate gossip attestations, it works as before
- decision root was queried from fork-choice, and we still can get shuffling which proves the decision root was computed correctly inside the cache
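
One way to read the set-rate observation above, as a sketch only: if `set()` checks for an existing entry before storing, repeated calls with the same shuffling cost a map lookup and leave the cache size unchanged. `CountingShufflingCache`, `CachedShuffling`, and `setCalls` are hypothetical names; `setCalls` loosely mirrors the idea of a set-rate counter and is not the metric implementation in this PR.

```typescript
// Hypothetical sketch of a set() that tolerates repeated calls cheaply.
type CachedShuffling = {epoch: number};

class CountingShufflingCache {
  // Counts every call, including ones that find the entry already present.
  setCalls = 0;
  private readonly items = new Map<string, CachedShuffling>();

  set(shuffling: CachedShuffling, decisionRoot: string): void {
    this.setCalls++;
    const key = `${shuffling.epoch}:${decisionRoot}`;
    // Cheap existence check: a repeated set for the same key is a no-op.
    if (this.items.has(key)) return;
    this.items.set(key, shuffling);
  }

  get size(): number {
    return this.items.size;
  }
}
```

Under this assumption a call rate higher than the number of distinct shufflings is expected, and the size and hit/miss rates stay the same as before.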

twoeths · 2026-01-20T03:42:14Z

packages/beacon-node/src/metrics/metrics/lodestar.ts

        buckets: [0.5, 1, 1.5, 2],
      }),
-      shufflingCalculationTime: register.histogram<{source: "build" | "getSync"}>({
+      shufflingCalculationTime: register.histogram({


I notice that we don't have this metric anymore so need to remove it

pre-fulu this metric was useful to know how much time we save

post-fulu we have to compute the shuffling for epoch n+2, so we no longer get that time saving. I suppose the cost now shows up in the processProposerLookahead time

…hul/lodestar into refactor_shiffling_cache

packages/beacon-node/src/metrics/metrics/lodestar.ts

pnpm-lock.yaml

This reverts commit f75cd15.

matthewkeil

LGTM!! 🚀

Metrics are all looking good

…hul/lodestar into refactor_shiffling_cache

codecov · 2026-01-22T21:48:56Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 52.06%. Comparing base (b79f41e) to head (9aad853).
⚠️ Report is 21 commits behind head on unstable.

Additional details and impacted files

@@             Coverage Diff              @@
##           unstable    #8688      +/-   ##
============================================
+ Coverage     52.02%   52.06%   +0.04%     
============================================
  Files           848      848              
  Lines         64650    64480     -170     
  Branches       4767     4757      -10     
============================================
- Hits          33632    33572      -60     
+ Misses        30949    30839     -110     
  Partials         69       69

wemeetagain · 2026-02-11T15:19:41Z

🎉 This PR is included in v1.40.0 🎉

guha-rahul added 3 commits December 11, 2025 03:56

fix : remove build() method

fb26c17

fix : add shuffling to cache

4353bb0

fix : reverting async shuffling refactor

27b506f

guha-rahul requested a review from a team as a code owner December 10, 2025 22:50

gemini-code-assist bot reviewed Dec 10, 2025

chore: update comments

bbdf692

wemeetagain previously approved these changes Dec 11, 2025

twoeths requested changes Dec 12, 2025

twoeths mentioned this pull request Dec 25, 2025

[fusaka-holesky] Proposer lookahead processing times increasing epoch transition #8488

Closed

guha-rahul added 3 commits January 6, 2026 20:17

Remove ShufflingCache dependency from state-transition package

97a1372

proposer lookahead comments

ec2983f

remove metrics

03cb629

guha-rahul dismissed wemeetagain’s stale review via 03cb629 January 6, 2026 15:03

guha-rahul requested a review from twoeths January 10, 2026 11:30