Skip to content

[wip]feat(fork-choice): implement Fast Confirmation Rule (FCR)#336

Draft
GrapeBaBa wants to merge 12 commits into
gr/feature/forkchoice-zfrom
gr/feat-fcr-clean
Draft

[wip]feat(fork-choice): implement Fast Confirmation Rule (FCR)#336
GrapeBaBa wants to merge 12 commits into
gr/feature/forkchoice-zfrom
gr/feat-fcr-clean

Conversation

@GrapeBaBa
Copy link
Copy Markdown
Contributor

@GrapeBaBa GrapeBaBa commented May 1, 2026

Summary

Implements the Fast Confirmation Rule (FCR) algorithm + module as defined in consensus-specs PR 4747.

This is a draft PR targeting gr/feature/forkchoice-z. No spec test code in this PR — the FCR EF spec test runner lives separately on gr/feat-fcr-on-spec-tests.

References:

@gemini-code-assist
Copy link
Copy Markdown
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces the Fast Confirmation Rule (FCR) to the fork choice mechanism, enabling faster finality assessments as specified in the Ethereum consensus specs. The implementation is currently in a draft state, focusing on the core algorithm and the necessary infrastructure to validate it against official spec tests. Changes include updates to the fork choice logic to support specialized test modes and the addition of comprehensive test runners to ensure parity with the consensus specifications.

Highlights

  • Fast Confirmation Rule (FCR) Implementation: Implemented the Fast Confirmation Rule (FCR) algorithm as defined in consensus-specs PR 4747, including core logic, helpers, and inline unit tests.
  • Fork Choice Enhancements: Added a new 'compute_unrealized_without_pull_up' flag to ForkChoiceOpts to support FCR spec test requirements without altering production head selection logic.
  • Spec Test Infrastructure: Added a dedicated spec test runner for FCR and integrated it into the existing test generation and execution pipeline.
  • Configuration Updates: Introduced 'CONFIRMATION_BYZANTINE_THRESHOLD' to ChainConfig and updated network configurations (mainnet, minimal, gnosis) to include this parameter.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces the Fast Confirmation (FCR) logic and its associated spec test runner, updating the consensus spec version to v1.7.0-alpha.5. It adds the CONFIRMATION_BYZANTINE_THRESHOLD configuration and extends ForkChoice with a compute_unrealized_without_pull_up mode to support FCR testing. Feedback identifies several critical issues: the use of test-only dependencies (std.testing.io) in production code, incorrect usage of std.ArrayList instead of std.ArrayListUnmanaged in the test runner, and violations of the "limit everything" safety rule by using .unlimited for file allocations. Additionally, it is recommended to implement a proper balances getter in the test runner to avoid workarounds in the production ForkChoice implementation.

Comment thread src/fork_choice/fork_choice.zig Outdated
const unrealized = try state_transition.computeUnrealizedCheckpoints(state, allocator);
// Note: io is sourced from std.testing in this code path; non-test callers
// should propagate io through onBlock once Phase F+ wires production callers.
const unrealized = try state_transition.computeUnrealizedCheckpoints(allocator, std.testing.io, state);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The use of std.testing.io in a production source file violates the "Safety" and "Technical Debt" goals of the style guide (Repository Style Guide lines 19 and 32). Production code should not depend on std.testing for core logic. The onBlock function should be updated to accept an io parameter (e.g., std.Io) to be propagated to computeUnrealizedCheckpoints, rather than hardcoding a test-only dependency.

References
  1. Use only very simple, explicit control flow for clarity. Do not use recursion to ensure that all executions that should be bounded are bounded. (link)
  2. TigerBeetle has a “zero technical debt” policy. We do it right the first time.

Comment thread test/spec/runner/fast_confirmation.zig Outdated
/// Parse the steps.yaml content into an owned slice of Step values.
/// All name slices are owned by `allocator` and must be freed via `freeSteps`.
fn parseSteps(allocator: Allocator, content: []const u8) !std.ArrayList(Step) {
var steps: std.ArrayList(Step) = .empty;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The variable steps is declared as std.ArrayList(Step), but it is initialized with .empty and its methods are called with an explicit allocator (e.g., at line 191). This is the API for std.ArrayListUnmanaged. std.ArrayList is a managed type that requires init(allocator) and does not take an allocator in its append methods. This will cause a compilation error. The same issue applies to current at line 150 and the function signature at line 175.

    var steps: std.ArrayListUnmanaged(Step) = .empty;

Comment thread test/spec/runner/fast_confirmation.zig Outdated
// the current logical step buffer and flush whenever we see a `- ` at
// column 0 (indicating the next step) or end-of-file.
var lines = std.mem.splitScalar(u8, content, '\n');
var current: std.ArrayList(u8) = .empty;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The variable current is declared as std.ArrayList(u8), but it is used as an ArrayListUnmanaged (initialized with .empty and passing allocator to appendSlice at line 162). This will cause a compilation error.

    var current: std.ArrayListUnmanaged(u8) = .empty;

Comment thread test/spec/runner/fast_confirmation.zig Outdated
fcr.setSpecTestMode(true);

// ---------- Read steps.yaml ----------
const steps_content = try dir.readFileAlloc(std.testing.io, "steps.yaml", allocator, .unlimited);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Using .unlimited in readFileAlloc violates the "Put a limit on everything" rule (Repository Style Guide line 38). Spec test files can be large, and a fixed upper bound should be provided to prevent excessive memory allocation in case of malformed or unexpectedly large test files.

References
  1. Put a limit on everything because, in reality, this is what we expect—everything has a limit. (link)

Comment thread test/spec/runner/fast_confirmation.zig Outdated
Comment on lines +72 to +74
fn nullBalancesGetter(_: ?*anyopaque, _: Checkpoint, _: *CachedBeaconState) JustifiedBalances {
return .empty;
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

nullBalancesGetter returning .empty is a significant limitation that has already leaked into the production ForkChoice implementation (see the workaround in src/fork_choice/fork_choice.zig line 1466). To maintain a clean separation between test logic and production code, the runner should provide a proper getter that extracts effective balances from the CachedBeaconState using state.epoch_cache.getEffectiveBalanceIncrements().

fn balancesGetter(_: ?*anyopaque, _: Checkpoint, state: *CachedBeaconState) JustifiedBalances {
    return state.epoch_cache.getEffectiveBalanceIncrements();
}
References
  1. TigerBeetle has a “zero technical debt” policy. We do it right the first time.

GrapeBaBa added 11 commits May 1, 2026 14:57
Add empty FastConfirmation module skeleton with struct fields, init/deinit,
and CONFIRMATION_BYZANTINE_THRESHOLD config constant. Spec helpers and
algorithm follow in subsequent commits.

Spec: ethereum/consensus-specs#4747
Reference: Lighthouse PR sigp/lighthouse#8951
- Drop unused Error.InvalidByzantineThreshold (YAGNI; init silently clamps
  per Lighthouse precedent, no caller fails on out-of-range yet).
- Widen byzantine_threshold and proposer_score_boost from u8 to u64 to match
  ChainConfig field types and Lighthouse reference, eliminating @intcast at
  every future call site.
- Drop unused Epoch and ValidatorIndex imports (re-introduce in phases that
  use them).
Implement spec misc helpers (isStartSlotAtEpoch, getBlockSlot, getBlockEpoch,
getCheckpointForBlock, isAncestor, getAncestorRoots, getCurrentTarget) and
state helpers (BalanceSourceData.rebuild, SlotAssignments.rebuild,
getSlotCommittee).

Spec: specs/phase0/fast-confirmation.md (Misc + State helpers sections)
- Widen FCR Error to include ForkChoiceError variants. getCheckpointForBlock
  and isAncestor now match only "block not found" variants
  (MissingProtoArrayBlock, UnknownAncestor), propagating corruption signals
  (BeaconStateErr, InvalidParentIndex, etc.) instead of collapsing them.
- Change isAncestor signature to Error!bool so non-not-found errors surface.
- Tighten BalanceSourceData.rebuild and SlotAssignments.rebuild to explicit
  Error!void return type. State subsystem errors are mapped to StateMissing;
  OutOfMemory propagates separately.
- Fix misleading TigerStyle comment on BalanceSourceData.rebuild's
  validators.len > 0 assert (it asserts, doesn't silently return).
- Add upper-bound assert on SlotAssignments.rebuild slot range.
- Drop stale "(rebuild added in Phase B)" parentheticals from section banners.
Implement 12 LMD-GHOST safety threshold helpers per spec:
isFullValidatorSetCovered, adjustCommitteeWeightEstimateToEnsureSafety,
estimateCommitteeWeightBetweenSlots, getEquivocationScore,
computeAdversarialWeight, getAdversarialWeight, getBlockSupportBetweenSlots,
computeEmptySlotSupportDiscount, getSupportDiscount, computeSafetyThreshold,
isOneConfirmed, isConfirmedChainSafe.

Spec: specs/phase0/fast-confirmation.md (LMD-GHOST helpers section)
- Document BalanceSourceData.rebuild's caller invariant (state's epoch must
  match cp.epoch) since active/slashed predicates evaluate at cp.epoch.
- Rename two tests that didn't actually exercise the branch their names
  promised:
  - "computeAdversarialWeight basic saturation behavior" ->
    "empty equivocating set returns positive max" (saturation-to-zero
    requires committee-aware setup, out of scope for unit test).
  - "computeSafetyThreshold underflow guard saturates to 0" ->
    "degenerate zero inputs return zero breakdown" (underflow branch needs
    empty-slot parent-support setup, out of scope for unit test).
Implement 4 FFG helpers per spec:
getCurrentTargetScore, computeHonestFfgSupportForCurrentTarget,
willNoConflictingCheckpointBeJustified, willCurrentTargetBeJustified.

Spec: specs/phase0/fast-confirmation.md (FFG helpers section)
- Rename two tests whose names overstated coverage:
  - "computeHonestFfgSupportForCurrentTarget bounded by total active
    balance" -> "zero-vote upper bound" (only the trivial branch is
    exercised; positive-vote case requires more involved fixture).
  - "willNoConflictingCheckpointBeJustified insufficient honest yields
    false" -> "zero-balance degenerate case yields false" (named branch
    requires committee + ffg_weight setup out of scope for unit test).
- Add Phase E TODO note on the redundant getTotalActiveBalance scan in
  willNoConflictingCheckpointBeJustified (called once at top, again
  transitively via computeHonestFfgSupportForCurrentTarget).
Implement updateFastConfirmationVariables, findLatestConfirmedDescendant,
getLatestConfirmed (private algorithm methods) plus onFastConfirmation and
runConfirmation public entry points. Algorithm is a direct port of spec
pseudocode; spec test mode lets the runner trigger confirmation explicitly.

Spec: specs/phase0/fast-confirmation.md (algorithm + handlers sections)
Phase E confused two different unrealized-justified concepts:
- store.unrealized_justified_checkpoint (global, used in
  update_fast_confirmation_variables per spec line 814-816)
- store.unrealized_justifications[head] (per-head, used by FFG helpers
  per spec line 770, 1003)

The runner had been passing the head's per-block unrealized_justified
to update_fast_confirmation_variables, which at anchor blocks carries
ZERO_HASH. This stomped fcr.previous_epoch_greatest_unrealized_checkpoint
to zero at every epoch boundary, cascading into all FCR field mismatches.

Fix: updateFastConfirmationVariables takes *const ForkChoice and reads
the global value directly from fc.fc_store.unrealized_justified.checkpoint.
The per-head head_unrealized_justified parameter on entry points
(onFastConfirmation/runConfirmation) keeps its meaning and continues
feeding the FFG helpers downstream — those usages were correct.

Spec test pass rate jumps from 0/1014 to 4006/4798 (~83.5%) on minimal
preset; remaining 792 FcrConfirmedRootMismatch failures will be diagnosed
in follow-up commits.

Surfaced by Phase F EF spec test runner.
Spec line 999 (`get_latest_confirmed`) gates the "restart from observed
justified" branch on the BLOCK's slot's epoch:

    is_observed_justified_block_epoch_ok = (
        compute_epoch_at_slot(observed_justified_block_slot) + 1 == current_epoch
    )

Phase E had an outer guard `observed.epoch + 1 == current_epoch` using the
checkpoint's `epoch` field — these can differ when the checkpoint's root
is an empty-slot block from an earlier epoch (e.g., checkpoint at epoch 2
but its root block is at slot 15, epoch 1). The inner correct check
(observed_block_epoch + 1 == current_epoch) is preserved.

Also add diagnostic logging to test/spec/runner/fast_confirmation.zig so
future failure-debugging can trace expected/actual confirmed_root and
the FCR variables alongside.

Note: this fix is spec-aligned but does not change the spec test pass
count (4006/4798 unchanged). The cases that fail don't hit this gate.
The remaining 792 failures need per-case algorithmic trace, suspected
in `findLatestConfirmedDescendant`'s isOneConfirmed branch (balance
source semantics or safety-threshold computation).
@GrapeBaBa GrapeBaBa force-pushed the gr/feat-fcr-clean branch from eea5c7d to b085072 Compare May 1, 2026 07:09
@GrapeBaBa GrapeBaBa force-pushed the gr/feat-fcr-clean branch 2 times, most recently from 236b604 to 59ea6f5 Compare May 1, 2026 08:14
Remove "Phase A/B/C/D/E/F/H" labels from section headers, doc comments, and
TODO notes. These referred to the original implementation roadmap, not to
anything in the code itself. Replace with code-relevant phrasing where the
original intent was a concrete future improvement, drop where it was just
historical context.

No functional change; 241/241 fork_choice tests still pass.
@GrapeBaBa GrapeBaBa force-pushed the gr/feat-fcr-clean branch from 59ea6f5 to e80d456 Compare May 1, 2026 12:28
@GrapeBaBa GrapeBaBa changed the title feat(fork-choice): implement Fast Confirmation Rule (FCR) [wip]feat(fork-choice): implement Fast Confirmation Rule (FCR) May 8, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant