feat(fork_choice): add Prometheus metrics module#309
Conversation
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the monitoring capabilities of the fork choice mechanism by integrating a comprehensive Prometheus metrics module. It also introduces the foundational changes required for the upcoming Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Ignored Files
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request implements the Gloas fork (EIP-7732: ePBS) and introduces a new fork-choice module featuring SoA-based vote tracking and optimized weight-propagation logic. The implementation includes new consensus types, updated state transition rules, and comprehensive benchmarks. Review feedback primarily focuses on adherence to the project's strict style guide, noting violations regarding function length limits, the use of architecture-specific types, and the requirement for exhaustive assertions. Additionally, improvements were suggested to optimize memory allocations in performance-critical paths and to correct a logic error in a struct assertion that could lead to unexpected crashes.
| pub fn computeDeltas( | ||
| allocator: Allocator, | ||
| deltas_cache: *DeltasCache, | ||
| num_proto_nodes: u32, | ||
| vote_current_indices: []VoteIndex, | ||
| vote_next_indices: []const VoteIndex, | ||
| old_balances: []const u16, | ||
| new_balances: []const u16, | ||
| equivocating_indices: *const EquivocatingIndices, | ||
| ) !ComputeDeltasResult { | ||
| assert(vote_current_indices.len == vote_next_indices.len); | ||
| assert(num_proto_nodes < NULL_VOTE_INDEX); | ||
|
|
||
| // deltas.length = numProtoNodes; deltas.fill(0) | ||
| try deltas_cache.resize(allocator, @intCast(num_proto_nodes)); | ||
| const deltas = deltas_cache.items; | ||
| @memset(deltas, 0); | ||
|
|
||
| const num_validators = vote_next_indices.len; | ||
|
|
||
| // Sort equivocating indices for pointer advancement in the loop. | ||
| const sorted_eq = try sortEquivocatingKeys(allocator, equivocating_indices); | ||
| defer allocator.free(sorted_eq); | ||
|
|
||
| var result: ComputeDeltasResult = .{ .deltas = deltas, .equivocating_validators = @intCast(sorted_eq.len) }; | ||
| // Pre-fetch the first equivocating validator index for pointer advancement comparison. | ||
| // Use maxInt as sentinel when empty so the equivocating check is always false. | ||
| var equivocating_validator_index: ValidatorIndex = if (sorted_eq.len > 0) sorted_eq[0] else std.math.maxInt(ValidatorIndex); | ||
| var equivocating_index: usize = 0; | ||
|
|
||
| for (0..num_validators) |v_index| { | ||
| const current_index = vote_current_indices[v_index]; | ||
| const next_index = vote_next_indices[v_index]; | ||
|
|
||
| // Validator has never voted and has no pending vote. | ||
| if (current_index == NULL_VOTE_INDEX) { | ||
| if (next_index == NULL_VOTE_INDEX) { | ||
| result.old_inactive_validators += 1; | ||
| continue; | ||
| } | ||
| } | ||
|
|
||
| const bal = resolveBalances(v_index, old_balances, new_balances); | ||
|
|
||
| // Check if this validator is equivocating (sorted pointer advancement). | ||
| if (@as(ValidatorIndex, @intCast(v_index)) == equivocating_validator_index) { | ||
| // Remove weight from current vote. Only process once: after zeroing | ||
| // current_index, subsequent calls skip this validator. | ||
| subtractOldBalance(deltas, current_index, bal.old); | ||
| vote_current_indices[v_index] = NULL_VOTE_INDEX; | ||
| equivocating_index += 1; | ||
| // Advance to next equivocating validator, or set sentinel when exhausted. | ||
| equivocating_validator_index = if (equivocating_index < sorted_eq.len) | ||
| sorted_eq[equivocating_index] | ||
| else | ||
| std.math.maxInt(ValidatorIndex); | ||
| continue; | ||
| } | ||
|
|
||
| if (bal.old == 0) { | ||
| if (bal.new == 0) { | ||
| result.new_inactive_validators += 1; | ||
| continue; | ||
| } | ||
| } | ||
|
|
||
| // Vote or balance changed: apply delta. | ||
| if (current_index != next_index or bal.old != bal.new) { | ||
| subtractOldBalance(deltas, current_index, bal.old); | ||
| addNewBalance(deltas, next_index, bal.new); | ||
| vote_current_indices[v_index] = next_index; | ||
| result.new_vote_validators += 1; | ||
| } else { | ||
| result.unchanged_vote_validators += 1; | ||
| } | ||
| } | ||
|
|
||
| return result; | ||
| } |
There was a problem hiding this comment.
The computeDeltas function body is 78 lines long, which exceeds the hard limit of 70 lines per function specified in the repository style guide.
References
- We enforce a hard limit of 70 lines per function body to reduce the probability of poorly structured code. (link)
| const num_validators = vote_next_indices.len; | ||
|
|
||
| // Sort equivocating indices for pointer advancement in the loop. | ||
| const sorted_eq = try sortEquivocatingKeys(allocator, equivocating_indices); |
There was a problem hiding this comment.
Allocating and sorting the equivocating validator indices on every call to computeDeltas (a hot path in fork choice) is inefficient. According to the style guide, memory should be statically allocated at startup where possible, and mechanical sympathy should be considered to avoid latency spikes. Consider maintaining a sorted list of equivocating indices in the store or caching the sorted result.
References
- Most memory should be statically allocated at startup, where possible and optimal. We don't allow potential memcpy latency spikes to slip through. (link)
| var equivocating_validator_index: ValidatorIndex = if (sorted_eq.len > 0) sorted_eq[0] else std.math.maxInt(ValidatorIndex); | ||
| var equivocating_index: usize = 0; | ||
|
|
||
| for (0..num_validators) |v_index| { |
There was a problem hiding this comment.
The loop index v_index uses the architecture-specific usize type. The style guide requires using explicitly-sized types like u32 for everything to ensure safety and predictability.
References
- Use explicitly-sized types like u32 for everything, avoid architecture-specific usize. (link)
| fn sortEquivocatingKeys(allocator: Allocator, indices: *const EquivocatingIndices) ![]const ValidatorIndex { | ||
| const keys = indices.keys(); | ||
| const buf = try allocator.alloc(ValidatorIndex, keys.len); | ||
| @memcpy(buf, keys); | ||
| std.mem.sortUnstable(ValidatorIndex, buf, {}, std.sort.asc(ValidatorIndex)); | ||
| return buf; | ||
| } |
There was a problem hiding this comment.
The sortEquivocatingKeys function lacks assertions for its arguments. The style guide requires asserting all function arguments to increase the probability that a program is correct.
References
- Assert all function arguments and return values, pre/postconditions and invariants. A function must not operate blindly on data it has not checked. (link)
| /// Copies equivocating keys into a heap buffer and sorts ascending for pointer advancement. | ||
| fn sortEquivocatingKeys(allocator: Allocator, indices: *const EquivocatingIndices) ![]const ValidatorIndex { | ||
| const keys = indices.keys(); | ||
| const buf = try allocator.alloc(ValidatorIndex, keys.len); |
There was a problem hiding this comment.
The function performs an allocation even when keys.len is zero. To adhere to the 'mechanical sympathy' and efficiency goals of the style guide, consider returning an empty slice immediately if there are no keys to sort.
if (keys.len == 0) return &.{};
const buf = try allocator.alloc(ValidatorIndex, keys.len);
References
- Think about performance from the outset. Mechanical sympathy is required to work with the grain of the system. (link)
| pub fn init(allocator: Allocator, comptime opts: m.RegistryOpts) !void { | ||
| var find_head = try Metrics.FindHead.init( | ||
| allocator, | ||
| "beacon_fork_choice_find_head_seconds", | ||
| .{ .help = "Time to find head in seconds" }, | ||
| opts, | ||
| ); | ||
| errdefer find_head.deinit(); | ||
|
|
||
| var errors = try Metrics.ErrorsGauge.init( | ||
| allocator, | ||
| "beacon_fork_choice_errors_total", | ||
| .{ .help = "Count of fork choice errors" }, | ||
| opts, | ||
| ); | ||
| errdefer errors.deinit(); | ||
|
|
||
| var not_reorged_reason = try Metrics.NotReorgedReasonCounter.init( | ||
| allocator, | ||
| "beacon_fork_choice_not_reorged_reason_total", | ||
| .{ .help = "Count of not reorged reasons" }, | ||
| opts, | ||
| ); | ||
| errdefer not_reorged_reason.deinit(); | ||
|
|
||
| fork_choice_metrics = .{ | ||
| .find_head = find_head, | ||
| .requests = Metrics.CountGauge.init( | ||
| "beacon_fork_choice_requests_total", | ||
| .{ .help = "Count of fork choice head-finding attempts" }, | ||
| opts, | ||
| ), | ||
| .errors = errors, | ||
| .changed_head = Metrics.CountGauge.init( | ||
| "beacon_fork_choice_changed_head_total", | ||
| .{ .help = "Count of head changes" }, | ||
| opts, | ||
| ), | ||
| .reorg = Metrics.CountGauge.init( | ||
| "beacon_fork_choice_reorg_total", | ||
| .{ .help = "Count of chain reorgs" }, | ||
| opts, | ||
| ), | ||
| .reorg_distance = Metrics.ReorgDistance.init( | ||
| "beacon_fork_choice_reorg_distance", | ||
| .{ .help = "Histogram of reorg distances" }, | ||
| opts, | ||
| ), | ||
| .votes = Metrics.CountGauge.init( | ||
| "beacon_fork_choice_votes_count", | ||
| .{ .help = "Current count of votes in fork choice" }, | ||
| opts, | ||
| ), | ||
| .queued_attestations = Metrics.CountGauge.init( | ||
| "beacon_fork_choice_queued_attestations_count", | ||
| .{ .help = "Current count of queued attestations per slot" }, | ||
| opts, | ||
| ), | ||
| .validated_attestation_datas = Metrics.CountGauge.init( | ||
| "beacon_fork_choice_validated_attestation_datas_count", | ||
| .{ .help = "Current count of validated attestation data" }, | ||
| opts, | ||
| ), | ||
| .balances_length = Metrics.CountGauge.init( | ||
| "beacon_fork_choice_balances_length", | ||
| .{ .help = "Current balances array length" }, | ||
| opts, | ||
| ), | ||
| .nodes = Metrics.CountGauge.init( | ||
| "beacon_fork_choice_nodes_count", | ||
| .{ .help = "Current number of nodes in fork choice" }, | ||
| opts, | ||
| ), | ||
| .indices = Metrics.CountGauge.init( | ||
| "beacon_fork_choice_indices_count", | ||
| .{ .help = "Current number of indices in fork choice" }, | ||
| opts, | ||
| ), | ||
| .not_reorged_reason = not_reorged_reason, | ||
| .compute_deltas_duration = Metrics.ComputeDeltasDuration.init( | ||
| "beacon_fork_choice_compute_deltas_seconds", | ||
| .{ .help = "Time to compute deltas in seconds" }, | ||
| opts, | ||
| ), | ||
| .compute_deltas_deltas_count = Metrics.CountGauge.init( | ||
| "beacon_fork_choice_compute_deltas_deltas_count", | ||
| .{ .help = "Number of deltas computed" }, | ||
| opts, | ||
| ), | ||
| .compute_deltas_zero_deltas_count = Metrics.CountGauge.init( | ||
| "beacon_fork_choice_compute_deltas_zero_deltas_count", | ||
| .{ .help = "Number of zero deltas" }, | ||
| opts, | ||
| ), | ||
| .compute_deltas_equivocating_validators = Metrics.CountGauge.init( | ||
| "beacon_fork_choice_compute_deltas_equivocating_validators_count", | ||
| .{ .help = "Number of equivocating validators" }, | ||
| opts, | ||
| ), | ||
| .compute_deltas_old_inactive_validators = Metrics.CountGauge.init( | ||
| "beacon_fork_choice_compute_deltas_old_inactive_validators_count", | ||
| .{ .help = "Number of old inactive validators" }, | ||
| opts, | ||
| ), | ||
| .compute_deltas_new_inactive_validators = Metrics.CountGauge.init( | ||
| "beacon_fork_choice_compute_deltas_new_inactive_validators_count", | ||
| .{ .help = "Number of new inactive validators" }, | ||
| opts, | ||
| ), | ||
| .compute_deltas_unchanged_vote_validators = Metrics.CountGauge.init( | ||
| "beacon_fork_choice_compute_deltas_unchanged_vote_validators_count", | ||
| .{ .help = "Number of unchanged vote validators" }, | ||
| opts, | ||
| ), | ||
| .compute_deltas_new_vote_validators = Metrics.CountGauge.init( | ||
| "beacon_fork_choice_compute_deltas_new_vote_validators_count", | ||
| .{ .help = "Number of new vote validators" }, | ||
| opts, | ||
| ), | ||
| }; | ||
| } |
There was a problem hiding this comment.
The init function body is approximately 120 lines long, which significantly exceeds the hard limit of 70 lines per function specified in the repository style guide.
References
- We enforce a hard limit of 70 lines per function body to reduce the probability of poorly structured code. (link)
| next_indices: []u32, | ||
| next_slots: []Slot, | ||
| } { | ||
| assert(self.multi_list.len > 0 or self.multi_list.capacity == 0); |
There was a problem hiding this comment.
|
@codex review |
|
|
|
You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard. |
Add fork choice metrics aligned with Lodestar TypeScript implementation (packages/fork-choice/src/metrics.ts). Uses the same beacon_ prefix and metric names to ensure Grafana dashboard compatibility. Metrics include find_head histogram, reorg tracking, compute_deltas breakdown, and not_reorged_reason counters.
Wire up the metrics dependency (build.zig, zbuild.zon), export the metrics submodule from root.zig, and instrument computeDeltas in fork_choice.zig with timing and counter observations.
8dfd345 to
0de85f9
Compare
Motivation
Enable observability for the fork choice module by adding Prometheus metrics, aligned with the TypeScript Lodestar implementation for Grafana dashboard compatibility.
Description
src/fork_choice/metrics.zigwith 21 Prometheus metrics matchingpackages/fork-choice/src/metrics.tson the TSunstablebranchmetricsdependency inbuild.zigandzbuild.zonsrc/fork_choice/root.zigcomputeDeltasinfork_choice.zigwith timing and counter observationssrc/state_transition/metrics.zig