Add chunked offline render to bound WASM memory on long outputs #1

Open
olilarkin wants to merge 2 commits into main from chunked-offline-render

@olilarkin

Owner

Problem

Stretching a short sound by a large factor (e.g. ~7 s × 482× → an hour-plus of audio) aborts the WebAssembly module with the generic "build with -sASSERTIONS for more info" message. This is an out-of-memory abort(), not a logic bug — reported downstream in paulstretch-for-live#2. It happens on the OfflineRenderer path (clicking Apply).

Root cause. renderMono/renderStereo build the entire output as one contiguous std::vector<float> in WASM linear memory, then to_js_float32_array allocates a second full-size copy — so peak memory is ≈ 2× the output, doubled again for stereo. The module links with -sALLOW_MEMORY_GROWTH=1 but no -sMAXIMUM_MEMORY, so the cap is Emscripten's 2 GiB default. A 7 s × 482× stereo render peaks well past that.
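The overflow can be checked with back-of-envelope arithmetic. The sketch below assumes a 44.1 kHz sample rate (the PR does not state one), takes the input as exactly 7 s, and counts only the two full-size float buffers described above. `buffered_peak_bytes` and the constants are hypothetical names for illustration, not library API.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not library API): peak bytes when the full output is
// held as one float buffer in WASM memory and then copied once more.
inline std::uint64_t buffered_peak_bytes(std::uint64_t output_frames,
                                         std::uint64_t channels) {
    assert(channels > 0);
    const std::uint64_t copies = 2;  // render vector + returned-array copy
    return output_frames * channels * sizeof(float) * copies;
}

constexpr std::uint64_t kDefaultCapBytes = 2ull << 30;  // 2 GiB
constexpr std::uint64_t kWasm32MaxBytes  = 4ull << 30;  // 4 GiB

// Assumed: 44.1 kHz, exactly 7 s of input, stretch factor 482.
constexpr std::uint64_t kSampleRate   = 44100;
constexpr std::uint64_t kOutputFrames = 7 * 482 * kSampleRate;  // 148,793,400 frames
```

Under these assumptions the stereo case needs 148,793,400 × 2 × 4 × 2 = 2,380,694,400 bytes, above the 2,147,483,648-byte default cap, while the mono case needs half that and still fits. The estimate ignores the input buffer and any other heap use, so the real peak would be somewhat higher.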

Fix

  • Chunked offline render (the real fix). New OfflineRenderer::render_mono_chunked / render_stereo_chunked (C++) and renderMonoChunked / renderStereoChunked (WASM). Same DSP — the existing render loops are refactored into shared stream_channel / stream_stereo cores, so render_mono/render_stereo output is unchanged — but the result is delivered one bufsize() chunk at a time. Each chunk is copied out to a fresh JS-heap Float32Array, so peak WASM memory stays ≈ input + one chunk regardless of output length. The consumer accumulates chunks on the JS heap (or streams them to disk / an encoder).
  • Memory cap bump. -sMAXIMUM_MEMORY=4294967296 (4 GiB wasm32 max) for immediate headroom. The chunked API is what removes the dependency on output length.
  • TypeScript declarations + npm README section.
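The delivery pattern described above can be sketched in a few lines of C++. This is an illustration under stated assumptions, not the PR's code: the real signatures of `render_mono_chunked` are not shown here, so `render_chunked_sketch`, `ChunkSink`, and `collect` are hypothetical names, and a simple ramp stands in for the DSP. What it does show is the memory shape: the producer owns one chunk-sized buffer and hands each filled chunk to a consumer callback.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical callback type: receives one chunk of `frames` samples.
using ChunkSink = std::function<void(const float* data, std::size_t frames)>;

// Hypothetical stand-in for a chunked renderer. It reuses a single
// chunk-sized buffer, so its working set does not grow with total_frames.
// Every delivered chunk is exactly chunk_frames long, so the last chunk may
// run past total_frames. Returns the number of chunks delivered.
std::size_t render_chunked_sketch(std::size_t total_frames,
                                  std::size_t chunk_frames,
                                  const ChunkSink& sink) {
    assert(chunk_frames > 0);
    std::vector<float> chunk(chunk_frames);
    std::size_t chunks = 0;
    for (std::size_t done = 0; done < total_frames; done += chunk_frames) {
        // Placeholder for the real DSP: write a ramp of sample indices.
        for (std::size_t i = 0; i < chunk_frames; ++i) {
            chunk[i] = static_cast<float>(done + i);
        }
        sink(chunk.data(), chunk_frames);
        ++chunks;
    }
    return chunks;
}

// Example consumer: accumulate every chunk outside the producer.
std::vector<float> collect(std::size_t total_frames, std::size_t chunk_frames) {
    std::vector<float> out;
    render_chunked_sketch(total_frames, chunk_frames,
                          [&out](const float* data, std::size_t frames) {
                              out.insert(out.end(), data, data + frames);
                          });
    return out;
}
```

In the WASM build, the consumer role is played by JavaScript, which copies each chunk into its own `Float32Array`. A consumer that writes chunks to disk or to an encoder, rather than accumulating them as `collect` does, keeps memory bounded on both sides.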

Tests

  • tests/chunked_test.cpp (new):
    • Parity — chunked vs whole-buffer render match structurally (length, finiteness, peak, RMS; not bit-exact because the algorithm's per-instance PRNG seed differs per render, matching the streaming_test convention), every chunk is exactly bufsize, empty input is a no-op.
    • Memory limits — allocation-free check (via estimate_output_frames against named wasm32 caps) that a representative extreme render's buffered footprint overflows the 2 GiB default cap, and a bigger one overflows even the 4 GiB max, while the chunked working set stays ~16 KB.
  • Extended the headless-Chrome browser smoke test (tests/browser/index.html) to exercise renderMonoChunked.
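A structural parity check of the kind described above might look like the following sketch. It is not taken from tests/chunked_test.cpp. `RenderStats`, `summarize`, `structurally_similar`, and the relative tolerance are hypothetical choices made for illustration. The idea is to compare summary statistics instead of individual samples, since two renders with different PRNG seeds will not match sample for sample.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Summary statistics used in place of a bit-exact comparison.
struct RenderStats {
    std::size_t length = 0;
    bool all_finite = true;
    double peak = 0.0;  // largest absolute sample value
    double rms = 0.0;   // root mean square over all samples
};

RenderStats summarize(const std::vector<float>& samples) {
    RenderStats s;
    s.length = samples.size();
    double sum_sq = 0.0;
    for (float v : samples) {
        if (!std::isfinite(v)) {
            s.all_finite = false;  // NaN or infinity: record it and skip
            continue;
        }
        const double a = std::fabs(static_cast<double>(v));
        s.peak = std::max(s.peak, a);
        sum_sq += a * a;
    }
    if (s.length > 0) {
        s.rms = std::sqrt(sum_sq / static_cast<double>(s.length));
    }
    return s;
}

// True when both renders have the same length, contain only finite samples,
// and agree on peak and RMS within a relative tolerance (an assumed knob).
bool structurally_similar(const RenderStats& a, const RenderStats& b,
                          double rel_tol) {
    assert(rel_tol >= 0.0);
    auto close = [rel_tol](double u, double v) {
        return std::fabs(u - v) <= rel_tol * std::max(std::fabs(u), std::fabs(v));
    };
    return a.length == b.length && a.all_finite && b.all_finite &&
           close(a.peak, b.peak) && close(a.rms, b.rms);
}
```

A test would call `summarize` on the concatenated chunked output and on the whole-buffer output, then assert `structurally_similar` with a tolerance loose enough to absorb seed-dependent variation.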

Verified locally

  • ctest passes on PFFFT and ACCELERATE backends.
  • WASM target builds; both new functions exported; built .wasm reports max 65536 pages (4 GiB).
  • Browser smoke test passes under headless Chrome (data-status="ok").

Opening for CI before merge.

olilarkin added 2 commits May 21, 2026 23:45
The npm/README.md already covers the full JS/WASM API. Replace the
parallel section in the root README with a pointer to it.

renderMono/renderStereo materialise the entire output as one contiguous
vector in WASM linear memory and then copy it again into the returned
Float32Array, so peak memory is ~2x the output (and both channels at once
for stereo). A large stretch (e.g. a few seconds stretched several hundred
times into an hour-plus of audio) exceeds the wasm32 heap and aborts. The
module was also linked with ALLOW_MEMORY_GROWTH but no MAXIMUM_MEMORY, so
the cap was Emscripten's 2 GiB default.

- Add OfflineRenderer::render_mono_chunked / render_stereo_chunked, which
  run the same DSP loop (refactored into shared stream_channel /
  stream_stereo cores) but deliver the result one bufsize() chunk at a
  time. render_mono/render_stereo output is unchanged.
- Expose renderMonoChunked / renderStereoChunked in the WASM bindings; each
  chunk is copied out to a fresh JS-heap Float32Array, so peak WASM memory
  stays ~ input + one chunk regardless of output length.
- Raise MAXIMUM_MEMORY to the 4 GiB wasm32 maximum for immediate headroom.
- Add TypeScript declarations and an npm README section.
- Add tests/chunked_test.cpp: structural parity between the chunked and
  whole-buffer paths, plus an allocation-free check that a representative
  extreme render overflows the known wasm32 caps while the chunked working
  set stays tiny. Extend the browser smoke test to exercise renderMonoChunked.
olilarkin force-pushed the chunked-offline-render branch from e0c520d to 06f61c2 on June 7, 2026 16:11