refactor: cut TS world state and NAPI AVM over to WSDB IPC; delete NAPI WSDB #23036
Open
charlielye wants to merge 9 commits into
A sequential-insert response can run to ~1.5 MB, but the default SHM rings were 1 MB, so the response could never be sent and the client hung forever (no error surfaced: the server crashed to a discarded stderr and the SHM client has no peer-disconnect signal). UDS streamed it fine, which is why this only bit the SHM path, and bb.js never hit it (tiny responses, SPSC only).
- Size the spawned aztec-wsdb SHM rings to 32 MiB request+response (an SHM frame is capped at half the ring, so this gives ~16 MiB headroom). Keep the WSDB_TRANSPORT env gate (default uds) for opting into SHM.
- SpscShm::create: gate the pre-fault memset to small rings so large rings stay demand-paged instead of forcing the whole mapping resident.
- IpcServer::run(): catch handler/send failures and shut down cleanly with a logged reason instead of letting an uncaught exception reach std::terminate.
- Generated spawned-service backend: capture the child's stdout/stderr to a temp logfile (an fd, not a pipe, to preserve clean process exit) and, on unexpected child exit, reject in-flight calls surfacing the log path, so a server crash is a clear error rather than a silent hang over SHM.
- Add single-client MPSC pipelined-flood and burst tests; the SPSC grind never exercised the MPSC path.
Verified: world-state native suite 50/50 over both SHM and UDS; avm_bulk (TS world state client 0 + C++ AVM client 1 on one aztec-wsdb) passes over SHM; no perf regression vs UDS.
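The sizing argument in this commit reduces to simple arithmetic. A minimal sketch, assuming only what the message states (an SHM frame is capped at half the ring); the function names are hypothetical and not part of the IPC runtime, and ~1.5 MB is approximated here as 1.5 MiB:

```typescript
const MiB = 1024 * 1024;

// Largest single frame a ring can carry, per the commit message:
// an SHM frame is capped at half the ring size. (Hypothetical helper.)
function maxFrameBytes(ringBytes: number): number {
  return Math.floor(ringBytes / 2);
}

// True if a payload of this size can be sent as one frame on the ring.
function fitsInRing(payloadBytes: number, ringBytes: number): boolean {
  return payloadBytes <= maxFrameBytes(ringBytes);
}

const sequentialInsertResponse = 1.5 * MiB; // approximation of "~1.5 MB"

console.log(fitsInRing(sequentialInsertResponse, 1 * MiB)); // false
console.log(fitsInRing(sequentialInsertResponse, 32 * MiB)); // true
console.log(maxFrameBytes(32 * MiB) / MiB); // 16
```

With the old 1 MiB ring the frame cap is 512 KiB, so the response cannot be framed at all; with 32 MiB the cap is 16 MiB, which matches the headroom quoted above.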
parallel_read.bench pipelines N reads down one connection (the TS world-state client's pattern). The AVM proving path is different: one aztec-wsdb with N independent sim connections, each issuing sequential reads. Add a bench that models that — N AsyncApi clients, one in-flight read each — to measure what the simulator pool actually sees. The N-connection topology scales better than one pipelined connection, since load spreads across N request/response rings instead of contending on one: SHM reaches ~40-47k reads/s at 7 connections (at/above the in-process baseline), UDS ~30k. IPC-only; skips on the in-process build.
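The topology this bench models can be sketched as follows. This is an illustration only: ReadFn, runSequentialClient, benchMultiConnection and makeFakeBackend are hypothetical names, and the fake backend stands in for real AsyncApi connections so the sketch runs standalone.

```typescript
// Hypothetical read signature; stands in for one wsdb connection's read call.
type ReadFn = (key: number) => Promise<number>;

// One simulated connection: issues reads strictly one after another,
// so it never has more than one read in flight.
async function runSequentialClient(read: ReadFn, reads: number): Promise<number> {
  let done = 0;
  for (let i = 0; i < reads; i++) {
    await read(i);
    done++;
  }
  return done;
}

// N independent connections running in parallel; returns aggregate throughput.
async function benchMultiConnection(clients: ReadFn[], readsPerClient: number) {
  const start = Date.now();
  const counts = await Promise.all(clients.map(c => runSequentialClient(c, readsPerClient)));
  const elapsedMs = Math.max(Date.now() - start, 1); // avoid division by zero
  const total = counts.reduce((a, b) => a + b, 0);
  return { total, readsPerSec: (total * 1000) / elapsedMs };
}

// Fake backend (assumption): resolves on the next timer tick and records
// the peak number of reads that were in flight at the same time.
function makeFakeBackend() {
  let inFlight = 0;
  let peak = 0;
  const read: ReadFn = async key => {
    inFlight++;
    peak = Math.max(peak, inFlight);
    await new Promise<void>(resolve => setTimeout(resolve, 0));
    inFlight--;
    return key;
  };
  return { read, peak: () => peak };
}
```

Calling benchMultiConnection with seven copies of the fake read keeps at most seven reads in flight, one per connection, which is the load shape the simulator pool produces; a pipelined single-connection bench would instead put all in-flight reads on one ring.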
…-process comparison The raw AsyncApi path skips the WorldStateOpsQueue + facade (~40% overhead) that the in-process numbers go through, so these figures measure server capability, not an in-process win. Document the 7-connection cap as the real SHM ceiling (8 slots minus the world-state client's).
multi_connection_read.bench.test.ts imports @aztec/ipc-runtime to open extra connections; declare it so eslint import-x/no-extraneous-dependencies passes.
Read/write ordering is now enforced server-side by the wsdb per-fork scheduler, so the IPC client no longer needs its own per-fork ordering queue: every op is sent immediately and the server serializes writes per fork while running reads concurrently. IpcWorldState.execute() now sends directly; the per-fork queue map, its lifecycle (stop on close/deleteFork), and the WorldStateOpsQueue class are deleted. The WorldStateOperationName label type moves to world_state_operation.ts (still used for instrumentation). The A-1055 delayed-close-fork regression test drops its now-obsolete per-fork-queue-cleanup assertion (the silent-dispose check remains). native_world_state.test.ts (50 tests) passes against the async server, including "Concurrent requests" — which, with the client queue gone, now genuinely exercises server-side per-fork read/write ordering.
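The ordering contract described here (writes serialized per fork, reads concurrent between writes) can be sketched in a few lines. This is not the wsdb scheduler, which lives server-side; ForkOrdering and settled are hypothetical names used only to illustrate the contract.

```typescript
type Op<T> = () => Promise<T>;

// Maps any outcome (success or failure) to a promise that never rejects,
// so one failed op does not block the ops queued behind it.
function settled(p: Promise<unknown>): Promise<void> {
  return p.then(() => undefined, () => undefined);
}

// Ordering for a single fork: a write waits for every earlier op;
// a read waits only for the most recent write.
class ForkOrdering {
  private lastWrite: Promise<void> = Promise.resolve();
  // Grows until the next write; a real scheduler would bound this.
  private readsSinceWrite: Promise<void>[] = [];

  read<T>(op: Op<T>): Promise<T> {
    const result = this.lastWrite.then(op);
    this.readsSinceWrite.push(settled(result));
    return result;
  }

  write<T>(op: Op<T>): Promise<T> {
    // Barrier: the previous write plus all reads issued since it.
    const barrier = Promise.all([this.lastWrite, ...this.readsSinceWrite]);
    const result = barrier.then(op);
    this.lastWrite = settled(result);
    this.readsSinceWrite = [];
    return result;
  }
}
```

Issuing read A, read B, write W, read C in that order lets A and B overlap, starts W only after both have finished, and starts C only after W completes. A client that relies on this contract can send every op immediately, as IpcWorldState.execute() now does.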
…uild The webapp-tutorial validation links the local @aztec dependency closure and resolves the rest from npm. World-state now depends on the generated, unpublished @aztec/wsdb (and the simulator on @aztec/aztec-vm-sim), which the monorepo declares as local portals in yarn-project's root resolutions but which the walker previously treated as npm packages, failing with a 404. Resolve any @aztec package the root resolutions map to a local portal/file path locally instead.
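The resolution rule can be sketched as a small lookup. This is an assumption-laden illustration, not the walker's actual code: localResolutionPath is a hypothetical name, and the paths in the example map are made up.

```typescript
// Shape of the "resolutions" field in a root package.json.
type Resolutions = Record<string, string>;

// Returns the local path when `name` is an @aztec package that the root
// resolutions map to a portal: or file: target; otherwise undefined,
// meaning the package should be resolved from npm as before.
function localResolutionPath(name: string, resolutions: Resolutions): string | undefined {
  if (!name.startsWith("@aztec/")) return undefined;
  const target = resolutions[name];
  if (target === undefined) return undefined;
  const match = /^(portal|file):(.+)$/.exec(target);
  return match ? match[2] : undefined;
}

// Example root resolutions; the paths are hypothetical.
const exampleResolutions: Resolutions = {
  "@aztec/wsdb": "portal:../example/wsdb",
  "@aztec/aztec-vm-sim": "portal:../example/vm-sim",
  "some-lib": "1.2.3",
};

console.log(localResolutionPath("@aztec/wsdb", exampleResolutions)); // ../example/wsdb
console.log(localResolutionPath("@aztec/foundation", exampleResolutions)); // undefined
```

Packages that return undefined keep going through the npm path, so only the unpublished generated packages change behaviour.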
Flakey Tests 🤖 says: This CI run detected 2 tests that failed, but were tolerated due to a .test_patterns.yml entry.
Summary
Cuts TypeScript world-state usage and the in-process C++ AVM NAPI path over to the standalone generated @aztec/wsdb package and aztec-wsdb service from the lower stack.
This PR sits directly on top of #24198 (the concurrent wsdb server), so the IPC world-state path it introduces serves parallel reads concurrently rather than serializing them through a single server loop.
Stack
next already contains the merged foundation (#23610) and wsdb migration (#23611).
- cl/ipc-wsdb-concurrent-server: concurrent wsdb server (reactor + async handlers + per-fork ordering); base next
- cl/ipc-3-avm-wsdb-cutover: this PR
- cl/ipc-4-avm-binary
- cl/ipc-5-avm-cutover
What changes
C++ / NAPI AVM path
- AvmSimAPI::simulate takes a LowLevelMerkleDBInterface& instead of an in-process WorldState&.
- … WorldState.
- … ipc_runtime path from the lower stack.
TypeScript world-state path
- NativeWorldStateService spawns aztec-wsdb through the generated @aztec/wsdb package.
- WorldStateRevisionWithHandle shape is removed; callers use ordinary revisions and, where needed, the wsdb socket path.
- WorldStateOpsQueue is removed: ordering is now enforced server-side per fork (in feat(ipc): concurrent service machinery - async reactor + codegen #24198), so the client just sends requests directly.
End state after this PR
The TS world-state path and the C++ AVM path both talk to world state over IPC. The in-process NAPI WSDB module is gone.
Performance
Parallel reads, yarn-project/world-state/src/native/parallel_read.bench.test.ts (single connection, pipelined reads) and multi_connection_read.bench.test.ts (N separate connections, the real AVM-pool topology). Same machine, medians; absolute values carry run-to-run variance, so read the scaling shape rather than the exact figures.
Before: in-process world-state (next)
Reads/s by concurrency, single connection:
The in-process world state scales roughly 4× with concurrency.
The regression this stack fixes
A naive IPC cutover with a synchronous server loop (run() blocking on each handler) serves parallel reads flat at ~13k reads/s regardless of concurrency: every request in the system is serialized through one loop thread. At the AVM's 4-way concurrency that is ~2.2× worse than in-process, and the gap widens with more cores.
After: IPC wsdb with the concurrent reactor (this stack)
Single connection, pipelined, c16:
SHM scales ~3.7× (8.1k → 30.4k), restoring concurrency to ~80–86% of the in-process baseline. The residual gap is per-op event-loop overhead in the TS read facade, which only becomes visible over fast IPC — the in-process path pays it too, but it is dwarfed by in-process read latency.
Multi-connection (AVM topology — N separate wsdb connections, which bypass the TS facade entirely, as real C++ AVM sims do): SHM aggregate reaches ~40–47k reads/s at 7 connections, exceeding the single-connection facade ceiling.
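Why a blocking server loop stays flat regardless of client concurrency can be shown with a small model. This is an event-loop analogy under stated assumptions, not the C++ server: serveBlocking, serveConcurrent and makeTrackedHandler are hypothetical names, and the handler is a timer-based stand-in for a world-state read.

```typescript
type Handler = (req: number) => Promise<number>;

// Synchronous-style loop: the next request is not picked up until the
// current handler finishes, so at most one request is in progress.
async function serveBlocking(requests: number[], handler: Handler): Promise<number[]> {
  const out: number[] = [];
  for (const req of requests) {
    out.push(await handler(req));
  }
  return out;
}

// Reactor-style loop: every request is dispatched immediately and
// the handlers overlap.
function serveConcurrent(requests: number[], handler: Handler): Promise<number[]> {
  return Promise.all(requests.map(req => handler(req)));
}

// Stand-in handler that records the peak number of overlapping calls.
function makeTrackedHandler() {
  let inFlight = 0;
  let peak = 0;
  const handler: Handler = async req => {
    inFlight++;
    peak = Math.max(peak, inFlight);
    await new Promise<void>(resolve => setTimeout(resolve, 1));
    inFlight--;
    return req * 2;
  };
  return { handler, peak: () => peak };
}
```

Feeding sixteen requests through serveBlocking never has more than one handler running, however many clients sent them, while serveConcurrent overlaps all sixteen. The first shape is the serialized loop this section describes as the regression; the second is what the reactor restores.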
Validation
- wsdb/bootstrap.sh
- yarn-project/yarn install --immutable on this branch after generated package lock updates.
- ./bootstrap.sh passed on this branch during stack validation.
- native_world_state.test.ts (50 tests incl. concurrent mutating/non-mutating ordering) green.