feat(ipc): concurrent service machinery — async reactor + codegen by charlielye · Pull Request #24198 · AztecProtocol/aztec-packages

charlielye · 2026-06-19T15:08:06Z

What

General-purpose machinery in ipc-runtime and ipc-codegen for building IPC services that handle requests concurrently, plus its first consumer (the wsdb world-state service).

Until now, an ipc-runtime server processed requests through a single-threaded run() loop (accept → wait_for_data → receive → handler → send → release), with one request in flight at a time. A service built on it could dispatch only as fast as one thread, regardless of how concurrent its clients were. This PR adds an asynchronous, non-blocking reactor that any generated service can opt into, so a service can handle its connections concurrently while ipc-runtime itself stays thread-pool-agnostic.

Nothing here is wsdb-specific except the per-fork scheduler called out under Consumers. The reactor, the async codegen, the wake mechanism, and the response reordering are service-agnostic: aztec-wsdb, aztec-vm-sim, and cdb all generate against the same async server interface. wsdb is simply the first service to exercise it under parallel load, so the benchmarks below are wsdb's.

The machinery (ipc-runtime + ipc-codegen)

run_reactor(AsyncHandler): the reactor owns all ring/socket I/O and never blocks on a handler. It copies each request out, release()s the slot immediately, assigns a per-connection sequence, and invokes an AsyncHandler — void(int client_id, std::span<const uint8_t> request, Respond respond), where Respond = std::function<void(std::vector<uint8_t>)>. The handler may respond inline (on the reactor thread) or hand the work to a thread pool and respond later from a worker. ipc-runtime owns no pool and spawns no thread — concurrency is purely the handler's choice. Serial callers (bb) keep the untouched run().
Async server codegen, all four languages: the generated server dispatch hands each handler a responder rather than expecting a return value, so a handler can defer completion. C++ Responder<Resp> (ok()/error()); Rust Responder<R> over a Box<dyn FnOnce(Vec<u8>) + Send>; Zig Responder(RespType) (ok/err); TS handlers are plain async functions returning a Promise. The wire format is unchanged — the cross-language echo matrix and frozen golden fixtures still pass — so this is purely a server-handler shape change for any service that regenerates.
Sole-sender + per-connection reorder: only the reactor calls send(), so each response ring stays single-producer/lock-free. A small per-connection reorder stash (keyed by sequence) preserves FIFO on the wire even when handlers complete out of order — no request-id envelope, no wire/client change. The only lock is over the in-process stash, never a ring.
notify() wake (no timeout poll): lets a worker that finished off the reactor thread wake the reactor immediately. Sockets use a self-pipe registered in the epoll/kqueue set; MPSC-SHM bumps the doorbell seq then futex_wakes (mirroring publish, so the wake isn't lost against the consumer's armed futex), and the consumer evaluates the completion predicate inside the same seq-latched window. The SHM spin also breaks on a doorbell-seq change so low-concurrency completions aren't spun through.
Inline fast path: when nothing is in flight and no further request is already pending, the handler runs on the reactor thread instead of dispatching — the thread handoff is pure latency when there's no concurrency to exploit. inflight == 0 short-circuits the pending check, so a burst never pays for it and stays on the dispatch path. The check is has_pending_request(): sockets poll via wait_for_data(0); MPSC-SHM uses a side-effect-free MpscConsumer::has_data() so the peek doesn't disturb its round-robin / spin state.
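
To make the sole-sender + reorder idea concrete, here is a minimal sketch of a per-connection stash. It is not the ipc-runtime implementation: the class name `ReorderStash` and its methods are hypothetical, and it assumes the reactor assigns sequences starting at 0 and is the only caller of `drain()`. Workers only stash completed responses; the reactor sends whatever is next in FIFO order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical per-connection reorder stash (illustrative names, not the ipc-runtime API).
class ReorderStash {
  public:
    using Bytes = std::vector<uint8_t>;

    // Reactor side: tag each request with the next per-connection sequence.
    uint64_t assign_sequence() {
        std::lock_guard<std::mutex> lock(mutex_);
        return next_assigned_++;
    }

    // Any thread: record a finished response. This only stashes; it never sends.
    void complete(uint64_t seq, Bytes response) {
        std::lock_guard<std::mutex> lock(mutex_);
        assert(seq >= next_to_send_ && seq < next_assigned_);
        ready_.emplace(seq, std::move(response));
    }

    // Reactor side only: send every response that is next in FIFO order.
    // Returns how many responses were sent.
    std::size_t drain(const std::function<void(const Bytes&)>& send) {
        std::vector<Bytes> batch;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            for (auto it = ready_.find(next_to_send_); it != ready_.end();
                 it = ready_.find(next_to_send_)) {
                batch.push_back(std::move(it->second));
                ready_.erase(it);
                ++next_to_send_;
            }
        }
        // The lock covers only the in-process stash; sending happens outside it.
        for (const auto& response : batch) {
            send(response);
        }
        return batch.size();
    }

  private:
    std::mutex mutex_;
    std::map<uint64_t, Bytes> ready_;  // completed responses keyed by sequence
    uint64_t next_assigned_ = 0;
    uint64_t next_to_send_ = 0;
};
```

In this sketch a response that completes early simply waits in the map until every earlier sequence has been sent, which is why no request-id envelope is needed on the wire.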

Consumers

wsdb (aztec-wsdb) — the first consumer, and the only part of this PR that is service-specific. A per-fork WsdbScheduler sits on top of the reactor: each handler routes its work through schedule_read / schedule_write, and per fork, committed reads run concurrently (they read the committed snapshot, independent of in-flight writes), uncommitted reads wait behind an in-flight write on that fork, and writes are exclusive per fork. This moves read/write ordering out of the client (the old WorldStateOpsQueue, removed in refactor: cut TS world state and NAPI AVM over to WSDB IPC; delete NAPI WSDB #23036) and into the database, so multiple clients stay consistent. Wiring: a dispatch bb::ThreadPool distinct from WorldState's intra-op pool (mutating handlers wait() on the latter; sharing could deadlock), sized from the caller's threads budget (not hardware_concurrency(), which ignores cgroup limits and would exhaust the per-UID thread limit in containers), and max_shm_clients raised to 8 to cover the AVM pool.
avm / cdb — aztec-vm-sim and cdb generate against the same async server interface. The avm server runs its handler inline on the reactor thread (one in-flight simulation per connection, so no dispatch pool), which is exactly the inline fast path above.
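
The per-fork ordering rules can be sketched as a small admission check. This is an illustration, not WsdbScheduler itself: `OpKind`, `ForkState`, `can_start`, `on_start` and `on_finish` are hypothetical names. It also assumes that a write waits for in-flight uncommitted reads on its fork (one reading of "writes are exclusive per fork"), which the description does not spell out.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical operation kinds, mirroring the three cases described above.
enum class OpKind { CommittedRead, UncommittedRead, Write };

// Hypothetical per-fork bookkeeping.
struct ForkState {
    uint32_t uncommitted_reads_in_flight = 0;
    bool write_in_flight = false;
};

// Returns true if an operation of this kind may start on the fork right now.
inline bool can_start(OpKind kind, const ForkState& fork) {
    switch (kind) {
    case OpKind::CommittedRead:
        return true;  // reads the committed snapshot, independent of writes
    case OpKind::UncommittedRead:
        return !fork.write_in_flight;  // waits behind an in-flight write
    case OpKind::Write:
        // Assumption: a write also waits for in-flight uncommitted reads.
        return !fork.write_in_flight && fork.uncommitted_reads_in_flight == 0;
    }
    return false;
}

// Record that an admitted operation has started.
inline void on_start(OpKind kind, ForkState& fork) {
    assert(can_start(kind, fork));
    if (kind == OpKind::Write) {
        fork.write_in_flight = true;
    } else if (kind == OpKind::UncommittedRead) {
        ++fork.uncommitted_reads_in_flight;
    }
}

// Record that a previously started operation has finished.
inline void on_finish(OpKind kind, ForkState& fork) {
    if (kind == OpKind::Write) {
        fork.write_in_flight = false;
    } else if (kind == OpKind::UncommittedRead) {
        assert(fork.uncommitted_reads_in_flight > 0);
        --fork.uncommitted_reads_in_flight;
    }
}
```

A real scheduler would also queue the operations that cannot start yet and re-check them when something finishes; the sketch covers only the admission decision.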

Results

These numbers are for the wsdb service under parallel reads, the consumer that drives this work. All figures are reads/s on the same machine with single-worker jest; run-to-run variance on the shared host is meaningful, so medians are shown.

Single connection, pipelined (the TS world-state client pattern)

parallel_read.bench.test.ts — N in-flight reads down one connection, vs the in-process next baseline:

concurrency	in-process (`next`)	reactor — UDS	reactor — SHM
1	8719	7277	7395
2	17011	13567	15687
4	29567	18944	23203
8	38309	25106	29234
16	36879	28676	30390

With the old serial run() the IPC path was flat at ~15k reads/s regardless of concurrency. The reactor now scales; the inline fast path holds c=1–2 at ~in-process parity (no low-concurrency regression). One pipelined connection plateaus around ~30k because all N in-flight reads contend on a single request/response ring and the single reactor funnel.

N independent connections, sequential (the AVM simulator-pool topology)

multi_connection_read.bench.test.ts: one wsdb, N client connections, each a synchronous read loop (one in-flight). This is what the aztec-vm-sim pool actually does. Caveat: this is not a like-for-like in-process comparison, because the bench calls AsyncApi directly and skips the facade that the in-process numbers (and parallel_read.bench) go through. That facade costs a factor of ~1.4x in throughput (the 1-connection number here, ~10.6k SHM, vs parallel_read SHM c=1, ~7.4k), so these figures measure server capability with a lean client, not an in-process win:

connections	reactor — UDS	reactor — SHM
1	~8.1k	~10.6k
2	~13.8k	~18.3k
4	~20.9k	~24.5k
7	~30.4k	~40-47k

Two takeaways: (1) spreading load across N rings scales better than pipelining one connection (N=7 SHM raw ~46k vs one pipelined connection ~30k), so the server itself is not the bottleneck; (2) deflated by the ~1.4x facade factor, SHM lands at ~32k facade-equivalent — ~80-86% of in-process, never above — consistent with the like-for-like single-connection table. The IPC path does not beat in-process; the residual gap is the facade + cross-process transport. (7 connections is the production SHM ceiling: max_shm_clients=8 minus the world-state client's slot. UDS is uncapped.)
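
For reference, the deflation in takeaway (2) works out roughly as follows. This takes ~46k as the N=7 SHM figure and, as an assumption, uses the c=8 in-process peak from the single-connection table (~38.3k) as the denominator:

```latex
\begin{aligned}
\text{facade factor} &\approx \frac{10.6\text{k}}{7.4\text{k}} \approx 1.43 \\
\text{facade-equivalent SHM at } N=7 &\approx \frac{46\text{k}}{1.43} \approx 32\text{k} \\
\text{fraction of in-process peak} &\approx \frac{32\text{k}}{38.3\text{k}} \approx 0.84
\end{aligned}
```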

Testing

New gtests exercise the reactor over both transports: SocketTest.ReactorPipelinedConcurrencyAndOrder and ShmTest.MpscReactorPipelinedConcurrencyAndOrder — pipelined requests with reversed completion latencies must come back FIFO and complete concurrently (a lost wake shows as a stall). Full ipc-runtime suite green.
ipc-codegen cross-language echo matrix (UDS + SHM) and frozen golden fixtures green — the async handler change is wire-compatible.
native_world_state.test.ts (50 tests) green over the reactor, including Concurrent requests — mutating and non-mutating requests are correctly queued (validates wsdb's per-fork read/write ordering and that mutating handlers run on the dispatch pool without deadlocking on WorldState's intra-op pool).
Benches: parallel_read.bench.test.ts runs on both next (in-process) and this branch (identical file); multi_connection_read.bench.test.ts is IPC-only (skips in-process).

Asynchronous server-handler codegen across C++/Rust/Zig (TS already async): the generated dispatch hands each handler a respond callback, so a handler may run inline or defer to a thread pool and respond when ready. The wire protocol is unchanged. ipc-runtime gains run_reactor (a non-blocking reactor that owns all ring I/O, is the sole sender, and reorders responses per connection) plus notify()/wait_for_data_or_ready for completion wakeups, and the run() serial loop is unchanged. The wsdb C++ server adopts this: each of the 40 handlers declares its own ordering via schedule_read / schedule_write (reads concurrent, committed reads unordered, writes exclusive per fork), and WsdbScheduler implements the read-batch / write-barrier model with an inline fast path when idle and a dispatch pool distinct from WorldState's intra-op pool. This is the IPC-machinery layer that the wsdb cutover builds on; it is independent of the world-state IPC consumer. Includes the single-connection parallel-read benchmark (transport-agnostic).
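
As an illustration of the socket-side completion wakeup, here is a minimal self-pipe sketch. It is not the ipc-runtime code: `SelfPipeWaker` is a hypothetical name, and it uses `poll()` on the pipe alone for portability, whereas the description above registers the pipe in the epoll/kqueue set next to the client sockets. What it demonstrates is that a notify issued before the reactor starts waiting is not lost, because the byte stays in the pipe.

```cpp
#include <cerrno>
#include <fcntl.h>
#include <poll.h>
#include <stdexcept>
#include <unistd.h>

// Hypothetical self-pipe waker (illustrative, POSIX only).
class SelfPipeWaker {
  public:
    SelfPipeWaker() {
        if (::pipe(fds_) != 0) {
            throw std::runtime_error("pipe() failed");
        }
        // Non-blocking on both ends: notify() never stalls a worker and
        // draining never stalls the reactor.
        for (int fd : fds_) {
            const int flags = ::fcntl(fd, F_GETFL, 0);
            ::fcntl(fd, F_SETFL, flags | O_NONBLOCK);
        }
    }
    ~SelfPipeWaker() {
        ::close(fds_[0]);
        ::close(fds_[1]);
    }
    SelfPipeWaker(const SelfPipeWaker&) = delete;
    SelfPipeWaker& operator=(const SelfPipeWaker&) = delete;

    // Callable from any thread. If the pipe is full (EAGAIN) a wake is
    // already pending, so dropping this byte is harmless.
    void notify() {
        const char byte = 1;
        ssize_t n;
        do {
            n = ::write(fds_[1], &byte, 1);
        } while (n < 0 && errno == EINTR);
    }

    // Reactor side: wait up to timeout_ms for a wake. Returns true if woken,
    // and drains pending bytes so repeated notifies coalesce into one wake.
    bool wait(int timeout_ms) {
        pollfd pfd{fds_[0], POLLIN, 0};
        int rc;
        do {
            rc = ::poll(&pfd, 1, timeout_ms);
        } while (rc < 0 && errno == EINTR);
        if (rc <= 0) {
            return false;
        }
        char buf[64];
        while (::read(fds_[0], buf, sizeof buf) > 0) {
        }
        return true;
    }

  private:
    int fds_[2] = {-1, -1};  // [0] = read end, [1] = write end
};
```

After a wake, the reactor in this sketch would check which stashed responses are ready and send them; the pipe carries no payload, only the signal.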

charlielye mentioned this pull request Jun 20, 2026

feat(ipc): async server handlers + server-side per-fork ordering #24210

Closed

charlielye changed the base branch from cl/ipc-3-avm-wsdb-cutover to next June 20, 2026 21:01

charlielye requested review from IlyasRidhuan, MirandaWood and jeanmon as code owners June 20, 2026 21:01

charlielye force-pushed the cl/ipc-wsdb-concurrent-server branch from f407002 to 317a72f Compare June 20, 2026 21:02

charlielye mentioned this pull request Jun 20, 2026

refactor: cut TS world state and NAPI AVM over to WSDB IPC; delete NAPI WSDB #23036

Open

charlielye changed the title ~~feat(ipc): concurrent wsdb server via reactor + executor~~ feat(ipc): concurrent wsdb server — async handlers + per-fork ordering Jun 20, 2026

charlielye force-pushed the cl/ipc-wsdb-concurrent-server branch from 317a72f to 317eaf3 Compare June 21, 2026 09:20

charlielye changed the title ~~feat(ipc): concurrent wsdb server — async handlers + per-fork ordering~~ feat(ipc): concurrent service machinery — async reactor + codegen Jun 21, 2026

charlielye wants to merge 1 commit into next from cl/ipc-wsdb-concurrent-server
