refactor: cut TS world state and NAPI AVM over to WSDB IPC; delete NAPI WSDB #23036

Open
charlielye wants to merge 9 commits into cl/ipc-wsdb-concurrent-server from cl/ipc-3-avm-wsdb-cutover

@charlielye charlielye commented May 7, 2026

Summary

Cuts TypeScript world-state usage and the in-process C++ AVM NAPI path over to the standalone generated @aztec/wsdb package and aztec-wsdb service from the lower stack.

This PR sits directly on top of #24198 (the concurrent wsdb server), so the IPC world-state path it introduces serves parallel reads concurrently rather than serializing them through a single server loop.

Stack

next already contains the merged foundation (#23610) and wsdb migration (#23611).

  1. feat(ipc): concurrent service machinery — async reactor + codegen #24198 cl/ipc-wsdb-concurrent-server — concurrent wsdb server (reactor + async handlers + per-fork ordering); base next
  2. refactor: cut TS world state and NAPI AVM over to WSDB IPC; delete NAPI WSDB #23036 cl/ipc-3-avm-wsdb-cutover (this PR)
  3. feat: add generated aztec-vm-sim package setup #23084 cl/ipc-4-avm-binary
  4. feat: cut simulator over to generated aztec-vm-sim IPC service #23697 cl/ipc-5-avm-cutover

What changes

C++ / NAPI AVM path

  • AvmSimAPI::simulate takes a LowLevelMerkleDBInterface& instead of an in-process WorldState&.
  • NAPI AVM accepts a WSDB socket path and constructs a generated wsdb IPC client plus VM2 wsdb adapter per simulation.
  • NAPI world-state module is deleted; the NAPI module no longer exposes in-process WorldState.
  • NAPI module links the current ipc_runtime path from the lower stack.

TypeScript world-state path

  • World-state startup is encapsulated behind the world-state service/facade instead of leaking binary discovery into callers.
  • NativeWorldStateService spawns aztec-wsdb through the generated @aztec/wsdb package.
  • The generated wsdb wrapper resolves its service binary from the installed/local arch package optional dependency.
  • The old WorldStateRevisionWithHandle shape is removed; callers use ordinary revisions and, where needed, the wsdb socket path.
  • Cleanup paths close spawned wsdb processes explicitly in tests and service lifecycles.
  • The client-side WorldStateOpsQueue is removed: ordering is now enforced server-side per fork (in feat(ipc): concurrent service machinery — async reactor + codegen #24198), so the client just sends requests directly.
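
As a rough illustration of the binary-resolution bullet above, the sketch below maps a platform/arch pair to a per-arch optional dependency and resolves the service binary inside it. The package naming scheme (`@aztec/wsdb-<platform>-<arch>`), the `bin/aztec-wsdb` path, and both function names are assumptions made for this example; the generated wrapper may do this differently.

```typescript
// Hypothetical naming scheme: one optional dependency per platform/arch pair.
// Neither the package names nor the bin path are taken from the PR.
function archPackageName(platform: string, arch: string): string {
  const supportedPlatforms = ["linux", "darwin"];
  const supportedArchs = ["x64", "arm64"];
  if (supportedPlatforms.indexOf(platform) < 0 || supportedArchs.indexOf(arch) < 0) {
    throw new Error(`No wsdb binary package for ${platform}-${arch}`);
  }
  return `@aztec/wsdb-${platform}-${arch}`;
}

// `resolve` is injected so the lookup can be tested without touching node_modules.
function resolveServiceBinary(
  platform: string,
  arch: string,
  resolve: (request: string) => string,
): string {
  const pkg = archPackageName(platform, arch);
  try {
    return resolve(`${pkg}/bin/aztec-wsdb`);
  } catch (err) {
    // Optional dependencies are skipped silently at install time, so report it here.
    throw new Error(`Optional dependency ${pkg} is not installed`);
  }
}
```

In Node this could be called as `resolveServiceBinary(process.platform, process.arch, require.resolve)`, which keeps binary discovery inside the package rather than in its callers.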

End state after this PR

The TS world-state path and the C++ AVM path both talk to world state over IPC. The in-process NAPI WSDB module is gone.

Performance

Parallel-read benchmarks: yarn-project/world-state/src/native/parallel_read.bench.test.ts (single connection, pipelined reads) and multi_connection_read.bench.test.ts (N separate connections, the real AVM-pool topology). All runs are on the same machine and report medians; absolute values carry run-to-run variance, so read the scaling shape rather than the exact figures.

Before — in-process world-state (next)

Reads/s by concurrency, single connection:

| concurrency | reads/s |
| --- | --- |
| 1 | ~8,700 |
| 4 | ~29,600 |
| 8 | ~38,300 |
| 16 | ~36,900 |

The in-process world state scales roughly 4× with concurrency (about 4.4× from concurrency 1 to 8), then plateaus at 16.

The regression this stack fixes

A naive IPC cutover with a synchronous server loop (run() blocking on each handler) serves parallel reads flat at ~13k reads/s regardless of concurrency — every request in the system is serialized through one loop thread. At the AVM's 4-way concurrency that is ~2.2× worse than in-process, and the gap widens with more cores.
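
The difference between the two server shapes can be modelled in a few lines. This is a toy TypeScript model, not the C++ server: `serveSerial` stands in for a loop that blocks on each handler, `serveConcurrent` for reactor-style dispatch, and the instrumented handler only counts how many invocations overlap. All names here are illustrative.

```typescript
type Handler = (request: number) => Promise<number>;

// Builds a handler that records the peak number of overlapping invocations.
function instrumented(): { handler: Handler; maxInFlight: () => number } {
  let inFlight = 0;
  let peak = 0;
  const handler: Handler = async (request) => {
    inFlight++;
    peak = Math.max(peak, inFlight);
    await Promise.resolve(); // stand-in for the read work; yields once
    inFlight--;
    return request * 2;
  };
  return { handler, maxInFlight: () => peak };
}

// Synchronous-style loop: the next request is not picked up until the
// previous handler has finished, so at most one handler is ever in flight.
async function serveSerial(requests: number[], handler: Handler): Promise<number[]> {
  const out: number[] = [];
  for (const r of requests) {
    out.push(await handler(r));
  }
  return out;
}

// Reactor-style dispatch: every handler is started before any is awaited.
function serveConcurrent(requests: number[], handler: Handler): Promise<number[]> {
  return Promise.all(requests.map((r) => handler(r)));
}
```

With four requests, the serial loop peaks at one in-flight handler and the concurrent dispatch at four, which is the shape behind the flat versus scaling throughput described above.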

After — IPC wsdb with the concurrent reactor (this stack)

Single connection, pipelined, c16:

| transport | reads/s |
| --- | --- |
| reactor over UDS | ~28,700 |
| reactor over SHM | ~30,400 |

SHM scales ~3.7× (8.1k → 30.4k), restoring concurrent read throughput to ~80–86% of the in-process baseline. The residual gap is per-op event-loop overhead in the TS read facade, which only becomes visible over fast IPC; the in-process path pays it too, but there it is dwarfed by in-process read latency.
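
For readers who want to check the ratios, the arithmetic can be reproduced from the approximate figures quoted in this description. The text does not say which in-process figure the 80–86% range is measured against; comparing against the c16 row is an assumption of this sketch.

```typescript
// Approximate reads/s figures copied from the description above.
const inProcess: Record<number, number> = { 1: 8700, 4: 29600, 8: 38300, 16: 36900 };
const shmC1 = 8100; // SHM, single connection, concurrency 1
const shmC16 = 30400; // SHM, single connection, concurrency 16
const naiveIpc = 13000; // synchronous server loop, flat at any concurrency

// Ratio rounded to two decimal places.
const ratio = (a: number, b: number): number => Math.round((a / b) * 100) / 100;

const shmScaling = ratio(shmC16, shmC1); // 3.75
const shmShareOfInProcessC16 = ratio(shmC16, inProcess[16]); // 0.82 (assumed baseline: c16)
const naiveGapAtC4 = ratio(inProcess[4], naiveIpc); // 2.28
```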

Multi-connection (AVM topology — N separate wsdb connections, which bypass the TS facade entirely, as real C++ AVM sims do): SHM aggregate reaches ~40–47k reads/s at 7 connections, exceeding the single-connection facade ceiling.

Validation

  • wsdb/bootstrap.sh
  • yarn-project/yarn install --immutable on this branch after generated package lock updates.
  • Root ./bootstrap.sh passed on this branch during stack validation.
  • native_world_state.test.ts (50 tests incl. concurrent mutating/non-mutating ordering) green.

@charlielye charlielye force-pushed the cl/ipc-3-avm-wsdb-cutover branch from 144f6c2 to d2d4ae7 Compare May 7, 2026 13:51
@charlielye charlielye marked this pull request as ready for review May 7, 2026 16:48
@charlielye charlielye force-pushed the cl/ipc-3-avm-wsdb-cutover branch 2 times, most recently from b7a6517 to 4941f2b Compare May 7, 2026 18:02
@charlielye charlielye force-pushed the cl/ipc-3-avm-wsdb-cutover branch 3 times, most recently from 67d3c6c to d904a74 Compare May 11, 2026 13:59
charlielye added a commit that referenced this pull request May 11, 2026
…C server

Replaces the in-process NAPI AVM with the standalone aztec-avm binary
(spawned via AvmBackend from PR 3a) and a TS-hosted CdbIpcServer for the
contract data callbacks.

Stacked on top of PR 2b (cl/ipc-3-avm-wsdb-cutover, #23036) and PR 3a
(cl/ipc-4-avm-binary, #23084).
Base automatically changed from cl/ipc-2-wsdb to cl/ipc-1-mpsc-shm May 12, 2026 08:52
@AztecBot AztecBot force-pushed the cl/ipc-1-mpsc-shm branch 2 times, most recently from 17b93f4 to e668757 Compare May 12, 2026 09:02
Base automatically changed from cl/ipc-1-mpsc-shm to next May 12, 2026 10:55
@charlielye charlielye added the ci-full Run all master checks. label May 12, 2026
@charlielye charlielye changed the base branch from next to cl/ipc-bb-js-migrate May 29, 2026 13:54
@charlielye charlielye force-pushed the cl/ipc-3-avm-wsdb-cutover branch from 0d58c3e to 679c622 Compare May 29, 2026 13:54
@charlielye charlielye force-pushed the cl/ipc-3-avm-wsdb-cutover branch from 3a9289a to 6042327 Compare June 9, 2026 16:15
@charlielye charlielye force-pushed the cl/ipc-bb-js-migrate branch from ad67d9c to 1e2528e Compare June 9, 2026 17:00
@charlielye charlielye force-pushed the cl/ipc-3-avm-wsdb-cutover branch from 6042327 to 14392b8 Compare June 9, 2026 17:00
@charlielye charlielye force-pushed the cl/ipc-3-avm-wsdb-cutover branch from 14392b8 to 011106a Compare June 10, 2026 13:37
@charlielye charlielye force-pushed the cl/ipc-bb-js-migrate branch from 1e2528e to 3cddf51 Compare June 10, 2026 13:37
@charlielye charlielye force-pushed the cl/ipc-3-avm-wsdb-cutover branch from 011106a to f46984b Compare June 10, 2026 13:44
@charlielye charlielye force-pushed the cl/ipc-bb-js-migrate branch from 3cddf51 to c8b3cfa Compare June 10, 2026 14:11
@charlielye charlielye force-pushed the cl/ipc-3-avm-wsdb-cutover branch from f46984b to 0d5cc4a Compare June 10, 2026 14:11
@charlielye charlielye force-pushed the cl/ipc-bb-js-migrate branch from c8b3cfa to fbe1523 Compare June 11, 2026 17:50
A sequential-insert response can run to ~1.5 MB, but the default SHM rings
were 1 MB, so the response could never be sent and the client hung forever
(no error surfaced: the server crashed to a discarded stderr and the SHM
client has no peer-disconnect signal). UDS streamed it fine, which is why this
only bit the SHM path, and bb.js never hit it (tiny responses, SPSC only).

- Size the spawned aztec-wsdb SHM rings to 32 MiB request+response (an SHM
  frame is capped at half the ring, so this gives ~16 MiB headroom). Keep the
  WSDB_TRANSPORT env gate (default uds) for opting into SHM.
- SpscShm::create: gate the pre-fault memset to small rings so large rings
  stay demand-paged instead of forcing the whole mapping resident.
- IpcServer::run(): catch handler/send failures and shut down cleanly with a
  logged reason instead of letting an uncaught exception reach std::terminate.
- Generated spawned-service backend: capture the child's stdout/stderr to a
  temp logfile (an fd, not a pipe, to preserve clean process exit) and, on
  unexpected child exit, reject in-flight calls surfacing the log path — so a
  server crash is a clear error rather than a silent hang over SHM.
- Add single-client MPSC pipelined-flood and burst tests; the SPSC grind never
  exercised the MPSC path.
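
The ring sizing above follows from the half-ring frame cap. A small worked check, treating the quoted "1 MB" and "~1.5 MB" as MiB for simplicity (an assumption) and using helper names invented for this sketch:

```typescript
const MiB = 1024 * 1024;

// Per the commit message: an SHM frame is capped at half the ring.
function maxFrameBytes(ringBytes: number): number {
  return Math.floor(ringBytes / 2);
}

function fitsInRing(frameBytes: number, ringBytes: number): boolean {
  return frameBytes <= maxFrameBytes(ringBytes);
}

const sequentialInsertResponse = 1.5 * MiB; // approximate size quoted above

const fitsOldRing = fitsInRing(sequentialInsertResponse, 1 * MiB); // false: cap is 0.5 MiB
const fitsNewRing = fitsInRing(sequentialInsertResponse, 32 * MiB); // true: cap is 16 MiB
```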

Verified: world-state native suite 50/50 over both SHM and UDS; avm_bulk
(TS world state client 0 + C++ AVM client 1 on one aztec-wsdb) passes over
SHM; no perf regression vs UDS.
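
The "reject in-flight calls on unexpected child exit" behaviour described in this commit can be sketched as a pending-call table. This is a minimal model, not the generated backend: the class name, method names, and error text are hypothetical, and wiring `failAll` to the child's exit event is left out.

```typescript
type Pending = { resolve: (v: Uint8Array) => void; reject: (e: Error) => void };

class InFlightCalls {
  private pending = new Map<number, Pending>();
  private nextId = 0;
  private closedReason: Error | undefined;

  // Registers a call and returns its id plus a promise for the response.
  track(): { id: number; result: Promise<Uint8Array> } {
    const id = this.nextId++;
    const result = new Promise<Uint8Array>((resolve, reject) => {
      if (this.closedReason) {
        reject(this.closedReason); // service already gone: fail fast, do not hang
        return;
      }
      this.pending.set(id, { resolve, reject });
    });
    return { id, result };
  }

  // Called when a response frame for `id` arrives.
  complete(id: number, response: Uint8Array): void {
    const p = this.pending.get(id);
    if (p) {
      this.pending.delete(id);
      p.resolve(response);
    }
  }

  // Intended to be called from the child's exit handler.
  failAll(exitCode: number | null, logPath: string): void {
    const reason = new Error(`service exited unexpectedly (code ${exitCode}); see log at ${logPath}`);
    this.closedReason = reason;
    this.pending.forEach((p) => p.reject(reason));
    this.pending.clear();
  }
}
```

Because SHM has no peer-disconnect signal, something outside the transport has to fail the outstanding promises; surfacing the log path in the error gives the caller a place to look.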
parallel_read.bench pipelines N reads down one connection (the TS
world-state client's pattern). The AVM proving path is different: one
aztec-wsdb with N independent sim connections, each issuing sequential
reads. Add a bench that models that — N AsyncApi clients, one in-flight
read each — to measure what the simulator pool actually sees.

The N-connection topology scales better than one pipelined connection,
since load spreads across N request/response rings instead of contending
on one: SHM reaches ~40-47k reads/s at 7 connections (at/above the
in-process baseline), UDS ~30k. IPC-only; skips on the in-process build.
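
The topology this bench models can be sketched as follows: N clients, each issuing its reads strictly one after another, all running in parallel. `ReadClient` is a hypothetical stand-in for whatever the real bench uses per connection; the injectable clock exists only to make the sketch testable.

```typescript
// Hypothetical per-connection client; the real bench uses its own client type.
interface ReadClient {
  read(key: number): Promise<unknown>;
}

// One connection: sequential reads, so at most one read is in flight on it.
async function runConnection(client: ReadClient, reads: number): Promise<number> {
  let done = 0;
  for (let i = 0; i < reads; i++) {
    await client.read(i);
    done++;
  }
  return done;
}

// N connections in parallel; returns the aggregate read count and rate.
async function multiConnectionReads(
  clients: ReadClient[],
  readsPerConnection: number,
  now: () => number = () => Date.now(),
): Promise<{ total: number; readsPerSecond: number }> {
  const start = now();
  const counts = await Promise.all(clients.map((c) => runConnection(c, readsPerConnection)));
  const elapsedMs = Math.max(now() - start, 1); // avoid dividing by zero
  const total = counts.reduce((a, b) => a + b, 0);
  return { total, readsPerSecond: (total * 1000) / elapsedMs };
}
```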
…-process comparison

The raw AsyncApi path skips the WorldStateOpsQueue + facade (~40% overhead)
that the in-process numbers go through, so these figures measure server
capability, not an in-process win. Document the 7-connection cap as the real
SHM ceiling (8 slots minus the world-state client's).
multi_connection_read.bench.test.ts imports @aztec/ipc-runtime to open extra
connections; declare it so eslint import-x/no-extraneous-dependencies passes.
Read/write ordering is now enforced server-side by the wsdb per-fork
scheduler, so the IPC client no longer needs its own per-fork ordering
queue: every op is sent immediately and the server serializes writes per
fork while running reads concurrently. IpcWorldState.execute() now sends
directly; the per-fork queue map, its lifecycle (stop on close/deleteFork),
and the WorldStateOpsQueue class are deleted. The WorldStateOperationName
label type moves to world_state_operation.ts (still used for
instrumentation). The A-1055 delayed-close-fork regression test drops its
now-obsolete per-fork-queue-cleanup assertion (the silent-dispose check
remains).

native_world_state.test.ts (50 tests) passes against the async server,
including "Concurrent requests" — which, with the client queue gone, now
genuinely exercises server-side per-fork read/write ordering.
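
The ordering rule described here (writes serialized per fork, reads concurrent) can be modelled in TypeScript. This is a conceptual sketch only: the real scheduler lives in the C++ wsdb server, and the class and method names below are invented for illustration. It also assumes that a read waits for earlier writes on its fork and a write waits for everything issued before it.

```typescript
// Resolves once `p` settles, whether it fulfilled or rejected.
const settled = (p: Promise<unknown>): Promise<void> =>
  p.then(() => undefined, () => undefined);

class ForkScheduler {
  private lastWrite: Promise<void> = Promise.resolve();
  private readsSinceWrite: Promise<void>[] = [];

  // Reads wait only for the most recent write, so they overlap each other.
  read<T>(op: () => Promise<T>): Promise<T> {
    const run = this.lastWrite.then(op);
    this.readsSinceWrite.push(settled(run));
    return run;
  }

  // Writes wait for the previous write and every read issued since it.
  write<T>(op: () => Promise<T>): Promise<T> {
    const barrier = Promise.all([this.lastWrite, ...this.readsSinceWrite]);
    const run = barrier.then(op);
    this.lastWrite = settled(run);
    this.readsSinceWrite = [];
    return run;
  }
}

// One scheduler per fork id, so ordering on one fork never blocks another.
class PerForkScheduler {
  private forks = new Map<number, ForkScheduler>();

  forFork(forkId: number): ForkScheduler {
    let fork = this.forks.get(forkId);
    if (!fork) {
      fork = new ForkScheduler();
      this.forks.set(forkId, fork);
    }
    return fork;
  }

  deleteFork(forkId: number): void {
    this.forks.delete(forkId);
  }
}
```

With ordering kept in one place on the server, every client (the TS facade and each AVM connection) gets the same guarantees without carrying its own queue.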
…uild

The webapp-tutorial validation links the local @aztec dependency closure and
resolves the rest from npm. World-state now depends on the generated, unpublished
@aztec/wsdb (and the simulator on @aztec/aztec-vm-sim), which the monorepo
declares as local portals in yarn-project's root resolutions but which the walker
previously treated as npm packages, failing with a 404. Resolve any @aztec
package that the root resolutions map to a local portal/file path locally instead.
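
The resolution rule in this commit can be sketched as a lookup against the root `resolutions` map. The function name and the example paths are hypothetical, and the sketch assumes the map values use Yarn's `portal:` or `file:` protocol prefixes; the actual walker may handle more cases.

```typescript
type Resolutions = Record<string, string>;

// Returns the local path for an @aztec package that the root resolutions map
// to a portal/file target, or undefined if it should be fetched from npm.
function localPathFor(pkg: string, resolutions: Resolutions): string | undefined {
  if (pkg.indexOf("@aztec/") !== 0) return undefined;
  const target = resolutions[pkg];
  if (target === undefined) return undefined;
  for (const protocol of ["portal:", "file:"]) {
    if (target.indexOf(protocol) === 0) return target.slice(protocol.length);
  }
  return undefined; // e.g. a plain version range: still an npm package
}

// Example input; these paths are made up for illustration.
const exampleResolutions: Resolutions = {
  "@aztec/wsdb": "portal:../wsdb/ts",
  "@aztec/aztec-vm-sim": "file:../avm/ts",
  "@aztec/foundation": "1.2.3",
};
```

Here `localPathFor("@aztec/wsdb", exampleResolutions)` yields `"../wsdb/ts"`, while `@aztec/foundation` and any non-@aztec package fall through to npm.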
@AztecBot

Flakey Tests

🤖 says: This CI run detected 2 tests that failed, but were tolerated due to a .test_patterns.yml entry.

FLAKED (http://ci.aztec-labs.com/7d01eb8595f1768e): yarn-project/end-to-end/scripts/run_test.sh simple src/e2e_epochs/epochs_invalidate_block.parallel.test.ts "proposer invalidates multiple checkpoints" (433s) (code: 0) group:e2e-p2p-epoch-flakes
FLAKED (http://ci.aztec-labs.com/b7ced716b479f512): yarn-project/end-to-end/scripts/run_test.sh ha src/composed/ha/e2e_ha_full.test.ts (283s) (code: 0)
