fix: improve atomic ordering in ThreadPool and NAPI init by GrapeBaBa · Pull Request #310 · ChainSafe/lodestar-z

GrapeBaBa · 2026-04-10T09:47:03Z

Motivation

ThreadPool worker hot loops use .acquire ordering on err_flag.load() where .monotonic suffices — the flag is a pure early-exit signal with no data dependency on the setter's other writes. BLS verification is CPU-intensive, so this matters in the hot path.
NAPI register() uses a bare fetchAdd(.monotonic) to guard shared state initialization. If two Node.js Worker threads call register concurrently, Thread B can skip init and use partially-initialized state because .monotonic provides no ordering guarantees for the init stores. While unlikely in practice (Worker thread creation is main-thread-driven), the code is incorrect.

Description

src/bls/ThreadPool.zig (L154, L260):

err_flag.load(.acquire) → .monotonic in execVerifyMulti and execAggVerify worker loops
Aligns with the identical pattern in pubkey_cache.zig:50 which already correctly uses .monotonic
The final err_flag.load(.acquire) after dispatch() (L233, L332) is unchanged — though technically redundant (the work_done event provides happens-before), it serves as defense-in-depth
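
The early-exit pattern described above can be sketched roughly as follows. This is a minimal illustration, not the code in ThreadPool.zig: `worker`, `runDemo`, the item values, and the `.release` store are all assumptions made for the example, and it assumes a Zig version whose standard library provides `std.atomic.Value` and lowercase ordering names such as `.monotonic`.

```zig
const std = @import("std");

// Hypothetical stand-in for the pool's shared error flag.
var err_flag = std.atomic.Value(bool).init(false);

// Hypothetical worker: walks its slice and stops early once any worker has failed.
fn worker(items: []const u32, processed: *usize) void {
    for (items) |item| {
        // Early-exit hint only. Nothing written by the setter is read on this
        // path, so a monotonic load is enough.
        if (err_flag.load(.monotonic)) return;
        if (item == 0) {
            // In this sketch, 0 marks an invalid item. The store ordering is an
            // assumption; the PR only discusses the loads.
            err_flag.store(true, .release);
            return;
        }
        processed.* += 1;
    }
}

// Runs two workers and reports whether either one failed.
fn runDemo() !bool {
    err_flag.store(false, .monotonic);
    const good: []const u32 = &.{ 1, 2, 3 };
    const bad: []const u32 = &.{ 1, 0, 3 };
    var n_good: usize = 0;
    var n_bad: usize = 0;

    const t1 = try std.Thread.spawn(.{}, worker, .{ good, &n_good });
    const t2 = try std.Thread.spawn(.{}, worker, .{ bad, &n_bad });
    t1.join();
    t2.join();

    // join() already orders the workers' writes before this point, so the
    // acquire here is defense-in-depth, like the post-dispatch() load in the PR.
    return err_flag.load(.acquire);
}

pub fn main() !void {
    const failed = try runDemo();
    std.debug.print("failed={}\n", .{failed});
}
```

In this sketch a worker may process an extra item before it observes the flag. That is acceptable for a pure early-exit signal, because the authoritative check happens after the workers have been joined.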

bindings/napi/root.zig:

Add init_mutex to serialize register() calls, ensuring Thread B blocks until Thread A's init completes
Add errdefer to roll back env_refcount on init failure, so the next caller retries initialization
fetchAdd(.monotonic) is correct inside the mutex (the mutex lock/unlock provides acquire/release barriers)
fetchSub(.acq_rel) in EnvCleanup.hook unchanged — it runs outside the mutex and needs acq_rel to ensure the last thread sees all writes before deinit
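
The init side of this design can be sketched as below. It is a self-contained approximation under stated assumptions: `initShared`, `shared_ready`, and `init_calls` are hypothetical stand-ins for the real shared state, the first init call is made to fail on purpose so the retry path is visible, and `std.Thread.Mutex` is assumed to be available in the Zig version in use.

```zig
const std = @import("std");

var init_mutex: std.Thread.Mutex = .{};
var env_refcount = std.atomic.Value(u32).init(0);

// Hypothetical shared state; stands in for pool.state, pubkeys.state, and so on.
var shared_ready: bool = false;
var init_calls: u32 = 0;

// Hypothetical initializer that fails on its first call so the retry path runs.
fn initShared() !void {
    init_calls += 1;
    if (init_calls == 1) return error.InitFailed;
    shared_ready = true;
}

fn register() !void {
    init_mutex.lock();
    defer init_mutex.unlock();
    // Monotonic is enough here: the mutex lock/unlock orders these accesses.
    if (env_refcount.fetchAdd(1, .monotonic) == 0) {
        // First caller: on failure, undo our increment so the next caller retries.
        errdefer {
            const old = env_refcount.fetchSub(1, .monotonic);
            std.debug.assert(old == 1);
        }
        try initShared();
    }
}

pub fn main() !void {
    register() catch {}; // first attempt fails and rolls the refcount back
    std.debug.assert(env_refcount.load(.monotonic) == 0);
    try register(); // retry performs the initialization
    try register(); // second environment skips initialization
    std.debug.assert(shared_ready and init_calls == 2);
    std.debug.assert(env_refcount.load(.monotonic) == 2);
}
```

Because the increment and the initialization happen under the same lock, a second caller cannot see a non-zero refcount until the first caller has either finished initializing or rolled back.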

- ThreadPool: relax err_flag.load from .acquire to .monotonic in worker hot loops (L154, L260). The flag is a pure early-exit signal with no data dependency on the setter's other writes. Matches the pattern already used in pubkey_cache.zig.
- NAPI root: guard shared state initialization with init_mutex to prevent a race where concurrent register calls (e.g. from Node.js Worker threads) could observe partially-initialized state. Add errdefer to roll back env_refcount on init failure so the next caller retries.

gemini-code-assist · 2026-04-10T09:47:16Z

Summary of Changes

This pull request addresses potential concurrency issues and performance bottlenecks related to atomic operations. By refining memory ordering in hot loops and introducing proper synchronization for shared state initialization, the changes ensure both improved execution efficiency and robust thread safety across the codebase.

Highlights

Atomic Ordering Optimization: Updated ThreadPool worker loops to use .monotonic atomic ordering instead of .acquire for error flag checks, improving performance on ARM architectures.
NAPI Initialization Safety: Introduced a mutex in the NAPI registration process to prevent race conditions during shared state initialization, ensuring thread safety for concurrent Node.js Worker threads.
Error Handling: Added an errdefer block to the NAPI registration to properly roll back reference counts if initialization fails, allowing for successful retries.

Copilot

Pull request overview

This PR refines atomic usage in two concurrency-sensitive areas: the BLS verification thread pool hot loops and the N-API module’s shared-state initialization, aiming to reduce unnecessary barriers in hot paths while preventing partially-initialized shared state under concurrent register() calls.

Changes:

Downgrade err_flag.load(.acquire) to .monotonic in BLS ThreadPool worker hot loops.
Add an init_mutex to serialize N-API register() initialization and add rollback on init failure.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File	Description
src/bls/ThreadPool.zig	Reduces ordering strength in worker-loop early-exit checks to avoid unnecessary acquire loads in hot paths.
bindings/napi/root.zig	Serializes shared state initialization across concurrent `register()` calls and adds partial rollback on init failure.

Comments suppressed due to low confidence (1)

bindings/napi/root.zig:55

If env.addEnvCleanupHook(...) fails, env_refcount has already been incremented and there will be no cleanup hook to decrement it later, leaking the refcount (and potentially preventing future init/deinit from behaving correctly). Add rollback logic for failures after the increment until the cleanup hook is successfully registered (and if rolling back to 0, ensure shared state is torn down consistently).


    try env.addEnvCleanupHook(EnvCleanup, &env_cleanup, EnvCleanup.hook);

    try pool.register(env, exports);

gemini-code-assist

Code Review

This pull request introduces a mutex to synchronize shared state initialization in the N-API bindings, ensuring thread safety for concurrent registrations, and adds a rollback mechanism for the environment reference count upon initialization failure. It also optimizes the thread pool by relaxing memory ordering for error flag checks. Feedback recommends strengthening the atomic ordering of the reference count increment for better long-term robustness and suggests adding an assertion in the error recovery path to validate the reference count state, consistent with the repository's emphasis on safety and invariant checking.

…riant
- Take init_mutex in EnvCleanup.hook to prevent init/deinit race when a Worker thread tears down while another is registering.
- Assert env_refcount == 1 in errdefer to enforce the invariant that only our increment is present when init fails.

GrapeBaBa · 2026-04-10T10:45:28Z

@codex review

GrapeBaBa · 2026-04-10T10:45:40Z

gemini review

chatgpt-codex-connector · 2026-04-10T10:45:47Z

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

Copilot

Pull request overview

This PR adjusts concurrency semantics in two areas: (1) lowers atomic load ordering in ThreadPool hot loops where the flag is used only as an early-exit signal, and (2) serializes N-API module shared-state initialization to prevent concurrent register() calls from observing partial initialization.

Changes:

Switch err_flag.load(.acquire) to .monotonic in ThreadPool worker loops to reduce hot-path overhead.
Add a global init_mutex to serialize N-API register() initialization and add errdefer to roll back env_refcount on init failure.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File	Description
`src/bls/ThreadPool.zig`	Relaxes atomic load ordering on an early-exit flag inside worker loops.
`bindings/napi/root.zig`	Adds a mutex to serialize shared state init/teardown across concurrent N-API environments and refcount rollback on init failure.

Comments suppressed due to low confidence (2)

bindings/napi/root.zig:59

register() increments env_refcount before calling env.addEnvCleanupHook(...). If addEnvCleanupHook fails, the refcount is never rolled back, leaving the shared state permanently referenced (and potentially never deinitialized) and future register() calls treating the module as already initialized. Consider adding an errdefer to decrement the refcount on failures until the cleanup hook is successfully registered (and/or factor the decrement+deinit path into a shared helper used by both the hook and error handling).

fn register(env: napi.Env, exports: napi.Value) !void {
    {
        init_mutex.lock();
        defer init_mutex.unlock();
        if (env_refcount.fetchAdd(1, .monotonic) == 0) {
            // First environment — initialize shared state.
            // On failure, roll back the refcount so the next caller retries.
            errdefer {
                const old = env_refcount.fetchSub(1, .monotonic);
                std.debug.assert(old == 1);
            }
            try pool.state.init();
            try pubkeys.state.init();
            config.state.init();
        }
    }

    try env.addEnvCleanupHook(EnvCleanup, &env_cleanup, EnvCleanup.hook);

bindings/napi/root.zig:35

The PR description says EnvCleanup.hook runs outside the new init_mutex, but the implementation now locks init_mutex in the cleanup hook as well. Please update the PR description (or adjust the code) so the documented concurrency behavior matches what was shipped.

const EnvCleanup = struct {
    fn hook(_: *EnvCleanup) void {
        init_mutex.lock();
        defer init_mutex.unlock();
        if (env_refcount.fetchSub(1, .acq_rel) == 1) {
            // Last environment — tear down shared state.
            config.state.deinit();
            pubkeys.state.deinit();
            pool.state.deinit();
            metrics.deinit();
        }

GrapeBaBa · 2026-04-10T13:11:23Z

gemini review

GrapeBaBa requested a review from a team as a code owner April 10, 2026 09:47

Copilot AI review requested due to automatic review settings April 10, 2026 09:47

Copilot started reviewing on behalf of GrapeBaBa April 10, 2026 09:47

Copilot AI reviewed Apr 10, 2026

gemini-code-assist Bot reviewed Apr 10, 2026

GrapeBaBa requested a review from Copilot April 10, 2026 10:45

Copilot started reviewing on behalf of GrapeBaBa April 10, 2026 10:45

Copilot AI reviewed Apr 10, 2026

matthewkeil mentioned this pull request Apr 15, 2026

Add grapebaba from Lodestar Team protocolguild/documentation#494

Merged

GrapeBaBa wants to merge 2 commits into main from worktree-fix+atomic-ordering
