
refactor(vara.eth/malachite): move EB-prerequisite gating out of Malachite #5602

Draft
grishasobol wants to merge 1 commit into gsobol/ethexe/al-bitswap from gsobol/ethexe/bitswap-malachite


@grishasobol

Member

What & why

Redesign-only diff on top of gsobol/ethexe/al-bitswap (which is al/ethexe/bitswap with tune-malachite-errors merged). Base chosen so this PR shows only the redesign, not the large tune merge.

Models the eventual flow: tune → master, master → al/ethexe/bitswap (== the base branch here), then this PR → al/ethexe/bitswap.

Problem

After tune lands in bitswap, bitswap's in-Malachite fast-sync replay filter is gone (dropped with bitswap's Malachite changes). A pure app-level "wait for BlockFinalized(target)" gate is impossible against tune's externalities: try_emit_or_queue head-of-line-blocks the event queue on an unprepared prerequisite EB, so the target event would never reach the app (fast sync deadlocks). On the base branch the fast_sync test stalls for exactly this reason.
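To make the head-of-line blocking concrete, here is a minimal sketch of a FIFO queue that only ever inspects its head. It is illustrative only: `Event`, `HeadBlockingQueue`, and the `u64` EB identifiers are hypothetical stand-ins, not the real Malachite externalities or the actual `try_emit_or_queue` code.

```rust
use std::collections::{HashSet, VecDeque};

/// Hypothetical stand-in for a consensus event. `needs_eb` is the EB that
/// must be prepared before the event may be emitted.
#[derive(Debug, Clone, PartialEq)]
struct Event {
    name: &'static str,
    needs_eb: u64,
}

/// FIFO queue that only looks at its head (assumed model of the old gate).
struct HeadBlockingQueue {
    pending: VecDeque<Event>,
    prepared: HashSet<u64>,
}

impl HeadBlockingQueue {
    fn new() -> Self {
        Self {
            pending: VecDeque::new(),
            prepared: HashSet::new(),
        }
    }

    fn push(&mut self, event: Event) {
        self.pending.push_back(event);
    }

    fn mark_prepared(&mut self, eb: u64) {
        self.prepared.insert(eb);
    }

    /// Emit events from the front and stop at the first event whose
    /// prerequisite EB is not prepared, even if later events are ready.
    fn drain(&mut self) -> Vec<Event> {
        let mut emitted = Vec::new();
        while let Some(head) = self.pending.front() {
            if !self.prepared.contains(&head.needs_eb) {
                break;
            }
            emitted.push(self.pending.pop_front().expect("front exists"));
        }
        emitted
    }
}
```

In this model, if a proposal waiting on an unprepared EB sits in front of the finalized-target event, `drain` returns nothing no matter how ready the target event is, so an app that waits for the target never sees it.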

Solution — move the EB-prerequisite gate out of Malachite

  • Malachite externalities no longer queue events behind a prepared-EB prerequisite (try_emit_or_queue / pending_events / drain_pending_events / prerequisite_satisfied removed; receive_eb_prepared gone). process_mb_proposal / process_mb_finalized emit immediately and strictly in order.
  • ethexe-compute now owns the "defer an MB until the EB it advances to is prepared" gate (ComputeSubService::deferred + receive_prepared_block). It scans every deferred request on each BlockPrepared (the prepare sub-service only reports the chain head of a prepared ancestor run); defer-at-launch avoids head-of-line blocking ready requests.
  • The service applies an app-level ReplayGate after fast sync: suppresses replayed BlockProposals until the committed target MB is reached — by hash for a fresh DB, or via a one-shot ancestry walk when the target was already finalized locally before start — then resumes live consensus. A missing CompactMb mid-walk fails loudly instead of hanging.
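The compute-owned gate from the second bullet can be sketched roughly as follows. This is a simplified model under stated assumptions, not the actual `ComputeSubService` code: `DeferredGate`, `MbRequest`, `submit`, `on_block_prepared`, and the `is_prepared` callback (standing in for a "is this EB prepared?" lookup) are all hypothetical names, and hashes are reduced to `u64`.

```rust
type EbHash = u64; // hypothetical stand-in for an Ethereum block hash
type MbId = u64; // hypothetical stand-in for an MB identifier

/// A compute request for an MB together with the EB it advances to.
#[derive(Debug, Clone, PartialEq)]
struct MbRequest {
    mb: MbId,
    advances_to: EbHash,
}

/// Holds requests whose prerequisite EB is not prepared yet.
#[derive(Default)]
struct DeferredGate {
    deferred: Vec<MbRequest>,
}

impl DeferredGate {
    /// Defer-at-launch: a ready request is handed back to run right away;
    /// an unready one is parked and does not block requests submitted later.
    fn submit(
        &mut self,
        req: MbRequest,
        is_prepared: impl Fn(EbHash) -> bool,
    ) -> Option<MbRequest> {
        if is_prepared(req.advances_to) {
            Some(req)
        } else {
            self.deferred.push(req);
            None
        }
    }

    /// Called on every prepared-block notification. Because the notification
    /// names only the head of a prepared run, every deferred request is
    /// re-checked instead of matching on the reported hash.
    fn on_block_prepared(&mut self, is_prepared: impl Fn(EbHash) -> bool) -> Vec<MbRequest> {
        let pending = std::mem::take(&mut self.deferred);
        let (ready, waiting): (Vec<MbRequest>, Vec<MbRequest>) = pending
            .into_iter()
            .partition(|r| is_prepared(r.advances_to));
        self.deferred = waiting;
        ready
    }
}
```

The full scan costs one lookup per deferred request on each notification. It buys correctness when a request waits on an ancestor of the reported head rather than on the head itself.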

Testing

  • cargo fmt --all --check
  • cargo clippy ✓ (clean)
  • cargo nextest run -p "ethexe-*": 543 passed, 1 skipped, incl. the fast_sync integration test (~77s, well under the 120s cap; default value-sync timing suffices).

Notes

  • The transient rewind of globals.latest_finalized_mb_hash during genesis replay is low-severity (only the consensus batch manager reads it at runtime, and it isn't producing batches during catch-up) and self-heals as replay climbs back to the tip.
  • Draft for review of the layering choice (compute-owned prerequisite gate + app-level replay gate).
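For reviewers weighing the layering, the app-level replay gate described under Solution could look roughly like the sketch below. It rests on assumptions: the type and method names are hypothetical, hashes are reduced to `u64`, and `parent_of` stands in for a CompactMb lookup. It also assumes one plausible reading of the ancestry walk, namely that the gate starts open when the target is already on the locally finalized chain.

```rust
type Hash = u64; // hypothetical stand-in for an MB hash

#[derive(Debug, PartialEq)]
enum ReplayGate {
    /// Replay in progress: proposals are suppressed until `target` finalizes.
    Suppressing { target: Hash },
    /// Target reached: live consensus, proposals pass through.
    Open,
}

#[derive(Debug, PartialEq)]
enum WalkError {
    /// The walk hit an MB whose record is missing.
    MissingCompactMb(Hash),
}

impl ReplayGate {
    /// `parent_of(h)` models a CompactMb lookup (assumed shape):
    /// `Some(Some(p))` = parent is `p`, `Some(None)` = genesis,
    /// `None` = record missing.
    fn new(
        target: Hash,
        latest_finalized: Option<Hash>,
        parent_of: impl Fn(Hash) -> Option<Option<Hash>>,
    ) -> Result<Self, WalkError> {
        // One-shot ancestry walk from the locally finalized head.
        let mut cursor = latest_finalized;
        while let Some(h) = cursor {
            if h == target {
                return Ok(ReplayGate::Open);
            }
            // A missing record is an error, never a silent stall.
            cursor = parent_of(h).ok_or(WalkError::MissingCompactMb(h))?;
        }
        // Fresh DB, or target not finalized locally: wait for it by hash.
        Ok(ReplayGate::Suppressing { target })
    }

    /// Feed every finalized MB hash; the gate opens once the target arrives.
    fn on_finalized(&mut self, mb: Hash) {
        match *self {
            ReplayGate::Suppressing { target } if target == mb => *self = ReplayGate::Open,
            _ => {}
        }
    }

    fn allows_proposals(&self) -> bool {
        matches!(self, ReplayGate::Open)
    }
}
```

Returning an error from the walk mirrors the "fails loudly instead of hanging" behaviour: the caller has to handle a missing record explicitly rather than wait for an event that will never arrive.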

🤖 Generated with Claude Code

https://claude.ai/code/session_01EgjaE68s3nARWKBCa4Jid1

…chite

Replaces the in-Malachite fast-sync replay filter with a layered design, on
top of `gsobol/ethexe/al-bitswap` (al/ethexe/bitswap + tune-malachite-errors).

A pure app-level "wait for BlockFinalized(target)" gate is impossible against
tune's externalities: `try_emit_or_queue` head-of-line-blocks the event queue
on an unprepared prerequisite EB, so the target event would never reach the
app. So the prerequisite gate is moved out of Malachite:

- Malachite externalities no longer queue events behind a prepared-EB
  prerequisite (try_emit_or_queue / pending_events / drain_pending_events /
  prerequisite_satisfied removed; receive_eb_prepared gone). process_mb_*
  emit immediately and strictly in order.
- ethexe-compute now owns the "defer an MB until the EB it advances to is
  prepared" gate (ComputeSubService::deferred + receive_prepared_block),
  scanning all deferred requests on each BlockPrepared since the prepare
  sub-service only reports the chain head; defer-at-launch avoids head-of-line
  blocking ready requests.
- The service applies an app-level ReplayGate after fast sync: it suppresses
  replayed BlockProposals until the committed target MB is reached (by hash for
  a fresh DB, or via a one-shot ancestry walk when the target was already
  finalized locally before start), then resumes live consensus. A missing
  CompactMb mid-walk fails loudly instead of hanging.

cargo fmt, cargo clippy and cargo nextest run -p "ethexe-*" all pass
(543 tests, incl. the fast_sync integration test).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01EgjaE68s3nARWKBCa4Jid1
grishasobol added the ai-generated label (Created entirely by an AI agent without direct human authorship) on Jun 23, 2026
@grishasobol

Member Author

CI dispatched manually on the branch (PR.yml only auto-runs for master-targeted PRs): https://github.com/gear-tech/gear/actions/runs/28025097528

