Add chunked offline render to bound WASM memory on long outputs #1
Open
olilarkin wants to merge 2 commits into
The npm/README.md already covers the full JS/WASM API. Replace the parallel section in the root README with a pointer to it.
renderMono/renderStereo materialise the entire output as one contiguous vector in WASM linear memory and then copy it again into the returned Float32Array, so peak memory is ~2x the output (and both channels at once for stereo). A large stretch (e.g. a few seconds stretched several hundred times into an hour-plus of audio) exceeds the wasm32 heap and aborts. The module was also linked with ALLOW_MEMORY_GROWTH but no MAXIMUM_MEMORY, so the cap was Emscripten's 2 GiB default.

- Add OfflineRenderer::render_mono_chunked / render_stereo_chunked, which run the same DSP loop (refactored into shared stream_channel / stream_stereo cores) but deliver the result one bufsize() chunk at a time. render_mono/render_stereo output is unchanged.
- Expose renderMonoChunked / renderStereoChunked in the WASM bindings; each chunk is copied out to a fresh JS-heap Float32Array, so peak WASM memory stays ~ input + one chunk regardless of output length.
- Raise MAXIMUM_MEMORY to the 4 GiB wasm32 maximum for immediate headroom.
- Add TypeScript declarations and an npm README section.
- Add tests/chunked_test.cpp: structural parity between the chunked and whole-buffer paths, plus an allocation-free check that a representative extreme render overflows the known wasm32 caps while the chunked working set stays tiny.
- Extend the browser smoke test to exercise renderMonoChunked.
e0c520d to
06f61c2
Problem
Stretching a short sound by a large factor (e.g. ~7 s × 482× → an hour-plus of audio) aborts the WebAssembly module with the generic "build with -sASSERTIONS for more info" message. This is an out-of-memory abort(), not a logic bug; it was reported downstream in paulstretch-for-live#2. It happens on the OfflineRenderer path (clicking Apply).

Root cause.
renderMono / renderStereo build the entire output as one contiguous std::vector<float> in WASM linear memory, then to_js_float32_array allocates a second full-size copy, so peak memory is ≈ 2× the output, doubled again for stereo. The module links with -sALLOW_MEMORY_GROWTH=1 but no -sMAXIMUM_MEMORY, so the cap is Emscripten's 2 GiB default. A 7 s × 482× stereo render peaks well past that.

Fix
- Add OfflineRenderer::render_mono_chunked / render_stereo_chunked (C++) and renderMonoChunked / renderStereoChunked (WASM). Same DSP: the existing render loops are refactored into shared stream_channel / stream_stereo cores, so render_mono / render_stereo output is unchanged, but the result is delivered one bufsize() chunk at a time. Each chunk is copied out to a fresh JS-heap Float32Array, so peak WASM memory stays ≈ input + one chunk regardless of output length. The consumer accumulates chunks on the JS heap (or streams them to disk / an encoder).
- Raise the cap with -sMAXIMUM_MEMORY=4294967296 (4 GiB wasm32 max) for immediate headroom. The chunked API is what removes the dependency on output length.

Tests
- tests/chunked_test.cpp (new):
  - Structural parity between the chunked and whole-buffer paths (… streaming_test convention), every chunk is exactly bufsize, empty input is a no-op.
  - An allocation-free check (… estimate_output_frames against named wasm32 caps) that a representative extreme render's buffered footprint overflows the 2 GiB default cap, and a bigger one overflows even the 4 GiB max, while the chunked working set stays ~16 KB.
- Extend the browser smoke test (tests/browser/index.html) to exercise renderMonoChunked.

Verified locally
- ctest passes on PFFFT and ACCELERATE backends.
- … .wasm reports max 65536 pages (4 GiB).
- … data-status="ok").

Opening for CI before merge.
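
For reviewers, a minimal sketch of the consumer side described under Fix: accumulating chunks on the JS heap and joining them once the render finishes. The callback-style ChunkSource type, collectChunks, and fakeChunkSource are hypothetical names used only for illustration; the actual renderMonoChunked signature is whatever the TypeScript declarations in this PR define and may differ (for example, it may not be callback-based).

```typescript
// Hypothetical callback-style source. The real renderMonoChunked signature
// is an assumption here and may look different.
type ChunkSource = (onChunk: (chunk: Float32Array) => void) => void;

// Accumulate chunks on the JS heap, then join them into one Float32Array.
function collectChunks(source: ChunkSource): Float32Array {
  const chunks: Float32Array[] = [];
  let totalFrames = 0;

  source((chunk) => {
    // Each chunk is assumed to be a fresh JS-heap array (as the PR describes),
    // so it is safe to keep a reference without copying.
    chunks.push(chunk);
    totalFrames += chunk.length;
  });

  const out = new Float32Array(totalFrames);
  let offset = 0;
  for (const chunk of chunks) {
    out.set(chunk, offset);
    offset += chunk.length;
  }
  return out;
}

// Stand-in for the WASM renderer (illustration only): emits `chunkCount`
// chunks of exactly `bufsize` samples, each filled with its chunk index.
function fakeChunkSource(chunkCount: number, bufsize: number): ChunkSource {
  return (onChunk) => {
    for (let i = 0; i < chunkCount; i++) {
      onChunk(new Float32Array(bufsize).fill(i));
    }
  };
}
```

Joining at the end briefly holds both the chunk list and the joined array on the JS heap; a consumer that streams each chunk to disk or an encoder instead would avoid that second copy.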