
feat(fork_choice): add Prometheus metrics module #309

Draft
GrapeBaBa wants to merge 2 commits into main from gr/feature/forkchoice-metrics


@GrapeBaBa GrapeBaBa commented Apr 9, 2026

Motivation

Enable observability for the fork choice module by adding Prometheus metrics, aligned with the TypeScript Lodestar implementation for Grafana dashboard compatibility.

Description

  • Add src/fork_choice/metrics.zig with 21 Prometheus metrics matching packages/fork-choice/src/metrics.ts on the TS unstable branch
  • Wire up the metrics dependency in build.zig and zbuild.zon
  • Export the metrics submodule from src/fork_choice/root.zig
  • Instrument computeDeltas in fork_choice.zig with timing and counter observations
  • Follow the same module pattern as src/state_transition/metrics.zig

@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the monitoring capabilities of the fork choice mechanism by integrating a comprehensive Prometheus metrics module. It also introduces the foundational changes required for the upcoming GLOAS fork, which includes updates to core consensus types and state transition logic to support ePBS. Furthermore, new benchmarks have been added to rigorously test the performance of key fork choice operations, ensuring the system remains performant and observable.

Highlights

  • Prometheus Metrics for Fork Choice: A new module, src/fork_choice/metrics.zig, was added to expose 21 Prometheus metrics for monitoring the fork choice mechanism. These metrics are aligned with the Lodestar TypeScript implementation and cover aspects like head finding, reorg tracking, state gauges, and compute deltas.
  • GLOAS Fork Integration: The GLOAS fork has been introduced across the codebase, including updates to ForkSeq enum, network configurations, and consensus types. This prepares the system for ePBS (enshrined Proposer-Builder Separation) changes, modifying BeaconBlockBody and BeaconState structures accordingly.
  • New Fork Choice Benchmarks: Three new benchmarks (compute_deltas.zig, on_attestation.zig, update_head.zig) have been added under bench/fork_choice to measure the performance of critical fork choice operations, ensuring efficiency and stability.
  • State Transition Logic Updates: Conditional logic in state_transition modules (process_block.zig, process_operations.zig, slash_validator.zig, execution.zig) was updated to correctly handle the GLOAS fork's specific behaviors, particularly regarding execution payload processing and merge transition status.
  • Refactored Fork Choice Internals: New modules src/fork_choice/compute_deltas.zig, src/fork_choice/store.zig, and src/fork_choice/vote_tracker.zig were added to manage per-node weight deltas, checkpoint storage, and validator votes more efficiently and robustly.

Ignored Files
  • Ignored by pattern: .github/workflows/** (1)
    • .github/workflows/CI.yml

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.


@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request implements the Gloas fork (EIP-7732: ePBS) and introduces a new fork-choice module featuring SoA-based vote tracking and optimized weight-propagation logic. The implementation includes new consensus types, updated state transition rules, and comprehensive benchmarks. Review feedback primarily focuses on adherence to the project's strict style guide, noting violations regarding function length limits, the use of architecture-specific types, and the requirement for exhaustive assertions. Additionally, improvements were suggested to optimize memory allocations in performance-critical paths and to correct a logic error in a struct assertion that could lead to unexpected crashes.

Comment on lines +44 to +122
pub fn computeDeltas(
    allocator: Allocator,
    deltas_cache: *DeltasCache,
    num_proto_nodes: u32,
    vote_current_indices: []VoteIndex,
    vote_next_indices: []const VoteIndex,
    old_balances: []const u16,
    new_balances: []const u16,
    equivocating_indices: *const EquivocatingIndices,
) !ComputeDeltasResult {
    assert(vote_current_indices.len == vote_next_indices.len);
    assert(num_proto_nodes < NULL_VOTE_INDEX);

    // deltas.length = numProtoNodes; deltas.fill(0)
    try deltas_cache.resize(allocator, @intCast(num_proto_nodes));
    const deltas = deltas_cache.items;
    @memset(deltas, 0);

    const num_validators = vote_next_indices.len;

    // Sort equivocating indices for pointer advancement in the loop.
    const sorted_eq = try sortEquivocatingKeys(allocator, equivocating_indices);
    defer allocator.free(sorted_eq);

    var result: ComputeDeltasResult = .{ .deltas = deltas, .equivocating_validators = @intCast(sorted_eq.len) };
    // Pre-fetch the first equivocating validator index for pointer advancement comparison.
    // Use maxInt as sentinel when empty so the equivocating check is always false.
    var equivocating_validator_index: ValidatorIndex = if (sorted_eq.len > 0) sorted_eq[0] else std.math.maxInt(ValidatorIndex);
    var equivocating_index: usize = 0;

    for (0..num_validators) |v_index| {
        const current_index = vote_current_indices[v_index];
        const next_index = vote_next_indices[v_index];

        // Validator has never voted and has no pending vote.
        if (current_index == NULL_VOTE_INDEX) {
            if (next_index == NULL_VOTE_INDEX) {
                result.old_inactive_validators += 1;
                continue;
            }
        }

        const bal = resolveBalances(v_index, old_balances, new_balances);

        // Check if this validator is equivocating (sorted pointer advancement).
        if (@as(ValidatorIndex, @intCast(v_index)) == equivocating_validator_index) {
            // Remove weight from current vote. Only process once: after zeroing
            // current_index, subsequent calls skip this validator.
            subtractOldBalance(deltas, current_index, bal.old);
            vote_current_indices[v_index] = NULL_VOTE_INDEX;
            equivocating_index += 1;
            // Advance to next equivocating validator, or set sentinel when exhausted.
            equivocating_validator_index = if (equivocating_index < sorted_eq.len)
                sorted_eq[equivocating_index]
            else
                std.math.maxInt(ValidatorIndex);
            continue;
        }

        if (bal.old == 0) {
            if (bal.new == 0) {
                result.new_inactive_validators += 1;
                continue;
            }
        }

        // Vote or balance changed: apply delta.
        if (current_index != next_index or bal.old != bal.new) {
            subtractOldBalance(deltas, current_index, bal.old);
            addNewBalance(deltas, next_index, bal.new);
            vote_current_indices[v_index] = next_index;
            result.new_vote_validators += 1;
        } else {
            result.unchanged_vote_validators += 1;
        }
    }

    return result;
}

medium

The computeDeltas function body is 78 lines long, which exceeds the hard limit of 70 lines per function specified in the repository style guide.

References
  1. We enforce a hard limit of 70 lines per function body to reduce the probability of poorly structured code. (link)
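
One way to bring the body under the limit, sketched under assumptions rather than taken from the PR, is to move the sentinel and pointer-advancement bookkeeping into a small cursor type. `EquivocatingCursor` is a hypothetical name, and `ValidatorIndex = u64` is a stand-in for the project's own type.

```zig
const std = @import("std");
const assert = std.debug.assert;

// Assumption: the PR's ValidatorIndex is an unsigned integer; u64 is a stand-in.
const ValidatorIndex = u64;

/// Hypothetical cursor over equivocating indices sorted in ascending order.
const EquivocatingCursor = struct {
    sorted: []const ValidatorIndex,
    position: u32 = 0,

    /// Next equivocating index, or maxInt as a sentinel once exhausted.
    fn current(self: *const EquivocatingCursor) ValidatorIndex {
        if (self.position < self.sorted.len) return self.sorted[self.position];
        return std.math.maxInt(ValidatorIndex);
    }

    /// Moves past the index that was just handled.
    fn advance(self: *EquivocatingCursor) void {
        assert(self.position < self.sorted.len);
        self.position += 1;
    }
};
```

With such a cursor, the equivocating branch of the loop would reduce to a comparison against `cursor.current()` followed by `cursor.advance()`, which removes the inline `if`/`else` sentinel assignment from `computeDeltas`.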

    const num_validators = vote_next_indices.len;

    // Sort equivocating indices for pointer advancement in the loop.
    const sorted_eq = try sortEquivocatingKeys(allocator, equivocating_indices);

medium

Allocating and sorting the equivocating validator indices on every call to computeDeltas (a hot path in fork choice) is inefficient. According to the style guide, memory should be statically allocated at startup where possible, and mechanical sympathy should be considered to avoid latency spikes. Consider maintaining a sorted list of equivocating indices in the store or caching the sorted result.

References
  1. Most memory should be statically allocated at startup, where possible and optimal. We don't allow potential memcpy latency spikes to slip through. (link)
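
A minimal sketch of the "maintain a sorted list" option, assuming the store can reserve a fixed capacity once at startup. `SortedEquivocating` and its methods are hypothetical and not part of the PR; `ValidatorIndex = u64` is a stand-in.

```zig
const std = @import("std");
const assert = std.debug.assert;

// Assumption: u64 stands in for the project's ValidatorIndex.
const ValidatorIndex = u64;

/// Hypothetical sorted set whose buffer is allocated once, at init.
const SortedEquivocating = struct {
    buf: []ValidatorIndex,
    len: u32 = 0,

    fn init(allocator: std.mem.Allocator, capacity: u32) !SortedEquivocating {
        return .{ .buf = try allocator.alloc(ValidatorIndex, capacity) };
    }

    fn deinit(self: *SortedEquivocating, allocator: std.mem.Allocator) void {
        allocator.free(self.buf);
        self.* = undefined;
    }

    /// Inserts `index` in ascending position; duplicates are ignored.
    fn insert(self: *SortedEquivocating, index: ValidatorIndex) void {
        var pos: u32 = 0;
        while (pos < self.len and self.buf[pos] < index) : (pos += 1) {}
        if (pos < self.len and self.buf[pos] == index) return;
        assert(self.len < self.buf.len);
        // Shift the tail right by one slot to open a gap at `pos`.
        var i: u32 = self.len;
        while (i > pos) : (i -= 1) self.buf[i] = self.buf[i - 1];
        self.buf[pos] = index;
        self.len += 1;
    }

    /// Sorted view; no allocation or sort is needed at read time.
    fn items(self: *const SortedEquivocating) []const ValidatorIndex {
        return self.buf[0..self.len];
    }
};
```

The trade-off in this sketch is a linear-time insert when a new equivocation is recorded, in exchange for no allocation and no sort inside `computeDeltas`.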

    var equivocating_validator_index: ValidatorIndex = if (sorted_eq.len > 0) sorted_eq[0] else std.math.maxInt(ValidatorIndex);
    var equivocating_index: usize = 0;

    for (0..num_validators) |v_index| {

medium

The loop index v_index uses the architecture-specific usize type. The style guide requires using explicitly-sized types like u32 for everything to ensure safety and predictability.

References
  1. Use explicitly-sized types like u32 for everything, avoid architecture-specific usize. (link)
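
For context, a range `for (0..n) |i|` in Zig always yields a `usize` capture, so an explicitly sized counter needs a `while` loop. The sketch below shows that shape on a hypothetical helper, `countActive`, which is not part of the PR.

```zig
const std = @import("std");
const assert = std.debug.assert;

/// Hypothetical helper: counts non-zero balances with a u32 loop counter.
fn countActive(balances: []const u16) u32 {
    // Check the length fits before narrowing it to u32.
    assert(balances.len <= std.math.maxInt(u32));
    const num_validators: u32 = @intCast(balances.len);

    var active: u32 = 0;
    var v_index: u32 = 0;
    while (v_index < num_validators) : (v_index += 1) {
        if (balances[v_index] != 0) active += 1;
    }
    return active;
}
```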

Comment on lines +160 to +166
fn sortEquivocatingKeys(allocator: Allocator, indices: *const EquivocatingIndices) ![]const ValidatorIndex {
    const keys = indices.keys();
    const buf = try allocator.alloc(ValidatorIndex, keys.len);
    @memcpy(buf, keys);
    std.mem.sortUnstable(ValidatorIndex, buf, {}, std.sort.asc(ValidatorIndex));
    return buf;
}

medium

The sortEquivocatingKeys function lacks assertions for its arguments. The style guide requires asserting all function arguments to increase the probability that a program is correct.

References
  1. Assert all function arguments and return values, pre/postconditions and invariants. A function must not operate blindly on data it has not checked. (link)
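
A sketch of what such assertions might look like, assuming the keys are available as a plain slice. `sortKeys` is a hypothetical variant that takes `[]const ValidatorIndex` instead of the PR's `EquivocatingIndices`, and `ValidatorIndex = u64` is a stand-in.

```zig
const std = @import("std");
const assert = std.debug.assert;
const Allocator = std.mem.Allocator;

// Assumption: u64 stands in for the project's ValidatorIndex.
const ValidatorIndex = u64;

/// Hypothetical variant of sortEquivocatingKeys with pre/postconditions.
/// The caller owns the returned slice and must free it.
fn sortKeys(allocator: Allocator, keys: []const ValidatorIndex) ![]const ValidatorIndex {
    // Preconditions: the count fits in u32, and no key collides with the
    // maxInt sentinel that computeDeltas uses to mean "no more equivocators".
    assert(keys.len <= std.math.maxInt(u32));
    for (keys) |key| assert(key != std.math.maxInt(ValidatorIndex));

    const buf = try allocator.alloc(ValidatorIndex, keys.len);
    @memcpy(buf, keys);
    std.mem.sortUnstable(ValidatorIndex, buf, {}, std.sort.asc(ValidatorIndex));

    // Postconditions: same length as the input, and non-decreasing order.
    assert(buf.len == keys.len);
    var i: u32 = 1;
    while (i < buf.len) : (i += 1) assert(buf[i - 1] <= buf[i]);
    return buf;
}
```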

/// Copies equivocating keys into a heap buffer and sorts ascending for pointer advancement.
fn sortEquivocatingKeys(allocator: Allocator, indices: *const EquivocatingIndices) ![]const ValidatorIndex {
    const keys = indices.keys();
    const buf = try allocator.alloc(ValidatorIndex, keys.len);

medium

The function performs an allocation even when keys.len is zero. To adhere to the 'mechanical sympathy' and efficiency goals of the style guide, consider returning an empty slice immediately if there are no keys to sort.

    if (keys.len == 0) return &.{};
    const buf = try allocator.alloc(ValidatorIndex, keys.len);
References
  1. Think about performance from the outset. Mechanical sympathy is required to work with the grain of the system. (link)

Comment on lines +56 to +176
pub fn init(allocator: Allocator, comptime opts: m.RegistryOpts) !void {
    var find_head = try Metrics.FindHead.init(
        allocator,
        "beacon_fork_choice_find_head_seconds",
        .{ .help = "Time to find head in seconds" },
        opts,
    );
    errdefer find_head.deinit();

    var errors = try Metrics.ErrorsGauge.init(
        allocator,
        "beacon_fork_choice_errors_total",
        .{ .help = "Count of fork choice errors" },
        opts,
    );
    errdefer errors.deinit();

    var not_reorged_reason = try Metrics.NotReorgedReasonCounter.init(
        allocator,
        "beacon_fork_choice_not_reorged_reason_total",
        .{ .help = "Count of not reorged reasons" },
        opts,
    );
    errdefer not_reorged_reason.deinit();

    fork_choice_metrics = .{
        .find_head = find_head,
        .requests = Metrics.CountGauge.init(
            "beacon_fork_choice_requests_total",
            .{ .help = "Count of fork choice head-finding attempts" },
            opts,
        ),
        .errors = errors,
        .changed_head = Metrics.CountGauge.init(
            "beacon_fork_choice_changed_head_total",
            .{ .help = "Count of head changes" },
            opts,
        ),
        .reorg = Metrics.CountGauge.init(
            "beacon_fork_choice_reorg_total",
            .{ .help = "Count of chain reorgs" },
            opts,
        ),
        .reorg_distance = Metrics.ReorgDistance.init(
            "beacon_fork_choice_reorg_distance",
            .{ .help = "Histogram of reorg distances" },
            opts,
        ),
        .votes = Metrics.CountGauge.init(
            "beacon_fork_choice_votes_count",
            .{ .help = "Current count of votes in fork choice" },
            opts,
        ),
        .queued_attestations = Metrics.CountGauge.init(
            "beacon_fork_choice_queued_attestations_count",
            .{ .help = "Current count of queued attestations per slot" },
            opts,
        ),
        .validated_attestation_datas = Metrics.CountGauge.init(
            "beacon_fork_choice_validated_attestation_datas_count",
            .{ .help = "Current count of validated attestation data" },
            opts,
        ),
        .balances_length = Metrics.CountGauge.init(
            "beacon_fork_choice_balances_length",
            .{ .help = "Current balances array length" },
            opts,
        ),
        .nodes = Metrics.CountGauge.init(
            "beacon_fork_choice_nodes_count",
            .{ .help = "Current number of nodes in fork choice" },
            opts,
        ),
        .indices = Metrics.CountGauge.init(
            "beacon_fork_choice_indices_count",
            .{ .help = "Current number of indices in fork choice" },
            opts,
        ),
        .not_reorged_reason = not_reorged_reason,
        .compute_deltas_duration = Metrics.ComputeDeltasDuration.init(
            "beacon_fork_choice_compute_deltas_seconds",
            .{ .help = "Time to compute deltas in seconds" },
            opts,
        ),
        .compute_deltas_deltas_count = Metrics.CountGauge.init(
            "beacon_fork_choice_compute_deltas_deltas_count",
            .{ .help = "Number of deltas computed" },
            opts,
        ),
        .compute_deltas_zero_deltas_count = Metrics.CountGauge.init(
            "beacon_fork_choice_compute_deltas_zero_deltas_count",
            .{ .help = "Number of zero deltas" },
            opts,
        ),
        .compute_deltas_equivocating_validators = Metrics.CountGauge.init(
            "beacon_fork_choice_compute_deltas_equivocating_validators_count",
            .{ .help = "Number of equivocating validators" },
            opts,
        ),
        .compute_deltas_old_inactive_validators = Metrics.CountGauge.init(
            "beacon_fork_choice_compute_deltas_old_inactive_validators_count",
            .{ .help = "Number of old inactive validators" },
            opts,
        ),
        .compute_deltas_new_inactive_validators = Metrics.CountGauge.init(
            "beacon_fork_choice_compute_deltas_new_inactive_validators_count",
            .{ .help = "Number of new inactive validators" },
            opts,
        ),
        .compute_deltas_unchanged_vote_validators = Metrics.CountGauge.init(
            "beacon_fork_choice_compute_deltas_unchanged_vote_validators_count",
            .{ .help = "Number of unchanged vote validators" },
            opts,
        ),
        .compute_deltas_new_vote_validators = Metrics.CountGauge.init(
            "beacon_fork_choice_compute_deltas_new_vote_validators_count",
            .{ .help = "Number of new vote validators" },
            opts,
        ),
    };
}

medium

The init function body is approximately 120 lines long, which significantly exceeds the hard limit of 70 lines per function specified in the repository style guide.

References
  1. We enforce a hard limit of 70 lines per function body to reduce the probability of poorly structured code. (link)

        next_indices: []u32,
        next_slots: []Slot,
    } {
        assert(self.multi_list.len > 0 or self.multi_list.capacity == 0);

medium

This assertion will fail if the list is empty (len == 0) but has been pre-allocated with capacity (capacity > 0). This is a valid state for an initialized Votes struct. The assertion should be relaxed or removed if empty slices are acceptable.

        assert(self.multi_list.len >= 0);
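
To make the failing case concrete, the sketch below evaluates the asserted predicate as a plain function. It assumes `multi_list` exposes `len` and `capacity` as unsigned `usize` fields, as `std.MultiArrayList` does. Note that for an unsigned `len`, `len >= 0` holds for every value, so that form checks nothing; `relaxedInvariant` is one possible alternative and is an assumption, not something proposed in the PR.

```zig
const std = @import("std");

/// The predicate asserted in the PR: non-empty, or never allocated.
fn originalInvariant(len: usize, capacity: usize) bool {
    return len > 0 or capacity == 0;
}

/// Hypothetical replacement: the length never exceeds the reserved capacity.
/// It accepts an empty list that has been pre-allocated.
fn relaxedInvariant(len: usize, capacity: usize) bool {
    return len <= capacity;
}
```

An empty list with reserved capacity, for example `len = 0` and `capacity = 8`, fails `originalInvariant` and passes `relaxedInvariant`.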

@GrapeBaBa GrapeBaBa changed the base branch from main to gr/feature/forkchoice-z April 9, 2026 15:02
@GrapeBaBa

@codex review

@GrapeBaBa

gemini review

@chatgpt-codex-connector

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

GrapeBaBa added 2 commits May 10, 2026 20:21
Add fork choice metrics aligned with Lodestar TypeScript implementation
(packages/fork-choice/src/metrics.ts). Uses the same beacon_ prefix and
metric names to ensure Grafana dashboard compatibility.

Metrics include find_head histogram, reorg tracking, compute_deltas
breakdown, and not_reorged_reason counters.

Wire up the metrics dependency (build.zig, zbuild.zon), export the
metrics submodule from root.zig, and instrument computeDeltas in
fork_choice.zig with timing and counter observations.
@GrapeBaBa GrapeBaBa force-pushed the gr/feature/forkchoice-metrics branch from 8dfd345 to 0de85f9 Compare May 10, 2026 12:33
@GrapeBaBa GrapeBaBa changed the base branch from gr/feature/forkchoice-z to main May 10, 2026 12:34