
2026 05 27 update #6785

Merged
danlapid merged 93 commits into main from maizatskyi/2026-05-27-upstream on Jun 1, 2026

Conversation

@mikea
Contributor

@mikea mikea commented May 27, 2026

git merge 2809a74

mikea and others added 30 commits May 8, 2026 15:43
NOUPSTREAM gitlab CI

See merge request cloudflare/ew/workerd!1
We have a Use-after-free bug regarding SqliteDatabase::Regulator
lifetimes. Specifically, SqlStorage inherits from
SqliteDatabase::Regulator, and then passes references to itself into
SqliteDatabase calls that construct things, like Statements and
Queries.

Because SqliteDatabase::Regulator is basically a small logic options
class, it might make sense that downstream things only hold a
reference to it. Indeed, many uses of SqliteDatabase::Regulator are
constexpr.

However, in the case of SqlStorage, SqliteDatabase::Regulator is
dynamic (SqlStorage).

Because the .storage field in JS land is a LAZY_INSTANCE_PROPERTY
field, it can be overwritten, and GC can be triggered such that
SqlStorage is garbage collected and released, even if there are live
SqliteStatement types still using SqlStorage as a Regulator.

So, that's a Use-after-Free mistake, and the ASan report agrees with
that assessment.

So how do we fix it?

This approach is to recognize that the Regulator is a bundle of
completely static things, and we never have a case where a Regulator
has some dynamic policy that can't last the lifetime of the process.

So, this change simply requires that all Regulators used by
SqliteDatabase are statically allocated, thus eliminating this class
of use-after-free. As a consequence, SqlStorage is no longer a
Regulator.

Also a use-after-free test is added.
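
A minimal sketch of the ownership shape described above, not the actual workerd code. `Regulator`, `Statement`, `Storage`, and `kUserRegulator` are simplified, hypothetical stand-ins; the point is only that a statement holding a reference to a statically allocated regulator cannot dangle when the storage object goes away.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for SqliteDatabase::Regulator: a bundle of static policy.
struct Regulator {
  bool allowTransactions;
};

// Statically allocated, so it outlives every statement that refers to it.
static constexpr Regulator kUserRegulator{false};

// Hypothetical stand-in for a prepared statement that keeps only a reference.
class Statement {
 public:
  explicit Statement(const Regulator& regulator) : regulator_(regulator) {}
  bool mayUseTransactions() const { return regulator_.allowTransactions; }

 private:
  const Regulator& regulator_;
};

// Hypothetical stand-in for SqlStorage. It hands out the static regulator
// instead of passing a reference to itself.
class Storage {
 public:
  Statement prepare() const { return Statement(kUserRegulator); }
};
```

In this sketch a `Statement` obtained from `Storage::prepare()` stays valid after the `Storage` is destroyed, because the only thing it references has static storage duration.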
https://jira.cfdata.org/browse/VULN-127735

When reader.read() triggers the pull() callback (through ConsumerImpl::read()
-> handleRead -> onConsumerWantsData -> pull), and the pull() callback
synchronously calls reader.cancel(), the consumer is destroyed mid-read:

- ByteReadable::cancel() at standard.c++:2163 sets state = kj::none,
  immediately freeing the ConsumerImpl<ByteQueue>
- Control returns to ConsumerImpl::read() at queue.h:471 which calls
  maybeDrainAndSetState() on the freed this

ValueReadable already had a reading flag to prevent this
(standard.c++:1849-1858, 1905), but ByteReadable was missing the equivalent
guard.

Fix (two layers of defense)

1. queue.h - ConsumerImpl::read(): Use the existing selfRef weak ref to guard
   maybeDrainAndSetState(). After handleRead() returns, runIfAlive() checks
   whether the consumer was destroyed before accessing it. This is
   defense-in-depth that protects against any path that could destroy the
   consumer during handleRead.

2. standard.c++ - ByteReadable: Add a reading flag (matching ValueReadable's
   existing pattern) that prevents cancel() from immediately setting state =
   kj::none. Instead, cancel() sets pendingCancel = true, and the destruction
   is deferred until after read() completes. This is the same pattern
   ValueReadable already uses.
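
The deferred-cancel guard from item 2 can be sketched roughly as follows. This is an illustration under assumptions, not the code in standard.c++: `Consumer` and `Readable` are hypothetical simplified types, and `std::unique_ptr` stands in for the real state holder.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>

// Hypothetical consumer: it touches its own members after the pull callback returns,
// which is exactly what makes destroying it mid-read unsafe.
struct Consumer {
  int reads = 0;
  void read(const std::function<void()>& pull) {
    pull();   // user code may call Readable::cancel() synchronously here
    ++reads;  // would be a use-after-free if the consumer had been freed above
  }
};

class Readable {
 public:
  explicit Readable(std::function<void(Readable&)> onPull)
      : onPull_(std::move(onPull)), consumer_(std::make_unique<Consumer>()) {}

  void read() {
    if (!consumer_) return;
    reading_ = true;
    consumer_->read([this] { if (onPull_) onPull_(*this); });
    reading_ = false;
    if (pendingCancel_) {  // apply the cancel that arrived during the read
      pendingCancel_ = false;
      consumer_.reset();
    }
  }

  void cancel() {
    if (reading_) {  // defer destruction until read() has finished
      pendingCancel_ = true;
      return;
    }
    consumer_.reset();
  }

  bool isClosed() const { return consumer_ == nullptr; }

 private:
  std::function<void(Readable&)> onPull_;
  std::unique_ptr<Consumer> consumer_;
  bool reading_ = false;
  bool pendingCancel_ = false;
};
```

With this shape, a pull callback that calls `cancel()` leaves the consumer alive until `read()` unwinds. The sketch does not reset `reading_` if the callback throws; a real implementation would need a scope guard for that.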
Take a strong reference to prevent GC from freeing the target port during
serialization. Serialization can run arbitrary user code via custom getters.
Apply edgeworker patches

See merge request cloudflare/ew/workerd!68
capnproto/capnproto#2501 introduced a source-breaking change:
schema::Value::Reader::getStruct() now returns capnp::AnyStruct::Reader
(with as<T>()) instead of capnp::AnyPointer::Reader (with getAs<T>()).

Bump capnp-cpp past it and update the two getStruct().getAs<T>() callers
in compatibility-date.{c++,-test.c++} to use as<T>().

Assisted-by: OpenCode:claude-opus-4.7
Bump capnp-cpp past AnyStruct schema change and fix compatibility-date

See merge request cloudflare/ew/workerd!70
Additionally, make SequentialSpanSubmitter use entropy-based span IDs
outside predictable mode.
This is especially important for correct trace hierarchy in local dev
now that USER_SPAN_CONTEXT_PROPAGATION makes multiple workers emit a
combined trace.
Make wd_tests run in predictable mode by default

See merge request cloudflare/ew/workerd!74
Use Vector::add() in X509Certificate::getKeyUsage() to avoid use of uninitialized memory.

See merge request cloudflare/ew/workerd!72
NOUPSTREAM asan build

See merge request cloudflare/ew/workerd!67
…t() inner .then() continuation

The inner jsg::Promise::then() continuation in WorkerLoader::get() at
worker-loader.c++:71 captured IoContext by raw C++ reference (&ioctx) into a
V8-heap-rooted promise reaction whose lifetime is decoupled from the IoContext.
When the originating IoContext was destroyed before the user's getCode() promise
resolved, and the promise was later resolved from a different IoContext on the
same isolate (possible when handle_cross_request_promise_resolution is disabled),
the lambda would dereference freed memory through toDynamicWorkerSource() →
getIoChannelFactory() → getCurrentIncomingRequest(), leading to a heap
use-after-free with a virtual call through pointers derived from the freed
712-byte IoContext allocation.

The fix replaces the raw [&ioctx] capture with a kj::Own<IoContext::WeakRef>
obtained via ioctx.getWeakRef(). The inner lambda now calls
weakIoctx->tryGet() and throws a clean JS error ("The request which initiated
this dynamic worker load has already completed.") if the IoContext has been
destroyed, converting the UAF into a safe, catchable exception regardless of
the handle_cross_request_promise_resolution setting. The outer
makeReentryCallback wrapper already uses getWeakRef() for its own guard, but
the inner .then() lambda bypassed that safety by capturing &ioctx directly.
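
The capture change can be illustrated with standard-library types. This is a sketch only: `std::weak_ptr` stands in for `kj::Own<IoContext::WeakRef>`, and `Context` and `makeContinuation` are hypothetical names, not workerd APIs.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for IoContext.
struct Context {
  std::string name;
};

// Builds a continuation that may run after `ctx` has been destroyed.
// Capturing a weak reference (rather than a raw reference) lets the
// continuation detect that case and fail cleanly.
std::function<std::string()> makeContinuation(const std::shared_ptr<Context>& ctx) {
  std::weak_ptr<Context> weak = ctx;
  return [weak]() -> std::string {
    std::shared_ptr<Context> alive = weak.lock();
    if (!alive) {
      throw std::runtime_error(
          "The request which initiated this dynamic worker load has already completed.");
    }
    return alive->name;
  };
}
```

Calling the continuation while the context is alive returns normally; calling it after the last owner is gone throws instead of dereferencing freed memory.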

The regression test (regressionDeadIoContextGetCode) exercises the patched code
path by making a sub-request that calls env.loader.get() with a pending getCode
promise, returning to drain the sub-request's IoContext, then resolving the
promise from the test's IoContext. Post-patch, the WeakRef check fires and the
clean error message is logged; pre-patch, the UAF would silently dereference
freed memory (observable as a crash under ASAN).

Test validation: VALIDATED LOCALLY
Pre-patch run: PASS (bazel test //src/workerd/api/tests:worker-loader-test@)
Post-patch run: PASS (bazel test //src/workerd/api/tests:worker-loader-test@)

Refs: AUTOVULN-CLOUDFLARE-WORKERD-256
VULN-136585: fix(worker-loader): replace raw IoContext& capture with WeakRef in get() inner .then() continuation

See merge request cloudflare/ew/workerd!23
Adds four new fields to type the RFC 9440 mTLS certificate properties
now exposed on `request.cf.tlsClientAuth`: `certRFC9440`,
`certRFC9440TooLarge`, `certChainRFC9440`, and
`certChainRFC9440TooLarge`. Matching placeholder values are added to
`IncomingRequestCfPropertiesTLSClientAuthPlaceholder`.

See the [RFC 9440 mTLS fields changelog post][changelog].

[changelog]: https://developers.cloudflare.com/changelog/post/2026-03-27-rfc9440-mtls-fields/
Add RFC 9440 mTLS fields to `IncomingRequestCfPropertiesTLSClientAuth`

See merge request cloudflare/ew/workerd!76
Trigger internal CI on workerd MRs

See merge request cloudflare/ew/workerd!71
The slow path of the sync zlib convenience methods (`{ info: true }`)
constructs a JSG-bound CompressionStream wrapper per call. The wrapper
holds a jsg::Function writeCallback that captures the JS handle (see
internal_zlib_base.ts), forming a JS<->C++ reference cycle. Without a
visitForGc, V8 cannot trace through the C++->JS edge, so the cycle is
uncollectable and every CompressionStream becomes immortal.

Reproducer: 20k iterations of inflateSync(input, { info: true }) leaks
~128 MB.

Adds visitForGc() to CompressionStream covering writeCallback,
writeResult, and errorHandler. Also clears these refs eagerly in
close() so callers that explicitly destroy don't have to wait on the
cycle collector.

The fast path (zlibUtil.zlibSync) is unaffected: it does the whole
compression in C++ without exposing a CompressionStream wrapper to JS.

Adds zlib-leak-nodejs-test asserting that engines returned via
{ info: true } are reclaimed after GC, using WeakRef and --expose-gc.
Add visitForGc to CompressionStream to fix zlib slow-path leak

See merge request cloudflare/ew/workerd!69
…ent stack overflow

JsObject::getPrototype() in src/workerd/jsg/jsvalue.c++ recursed directly
into the Proxy target when no getPrototypeOf trap was present (line 154).
An attacker-supplied chain of ~1M nested `new Proxy(prev, {})` wrappers
drove unbounded native C++ recursion, overrunning the stack guard page and
crashing the workerd process with SIGSEGV. This affected all callers:
processEntrypointClass, collectMethodsFromPrototypeChain, and RPC paths.
The fix replaces the self-recursion with an iterative loop and a hard depth
limit of 100,000 (matching V8's internal JSProxy::kMaxIterationLimit),
throwing a RangeError when exceeded.
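
The recursion-to-loop change can be sketched without V8. `Node`, `makeChain`, and `getPrototypeId` below are hypothetical stand-ins, and `std::range_error` plays the role of the JS RangeError; only the depth limit of 100,000 comes from the description above.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical object model: a node is a "proxy" when proxyTarget is non-null.
struct Node {
  const Node* proxyTarget = nullptr;
  int prototypeId = 0;
};

constexpr std::size_t kMaxProxyDepth = 100000;

// Walks the proxy chain iteratively, so native stack usage stays constant
// no matter how deep the chain is.
int getPrototypeId(const Node* obj) {
  for (std::size_t depth = 0; obj->proxyTarget != nullptr; ++depth) {
    if (depth >= kMaxProxyDepth) {
      throw std::range_error("Proxy chain exceeds maximum depth");
    }
    obj = obj->proxyTarget;
  }
  return obj->prototypeId;
}

// Builds one base object (index 0) wrapped by `proxies` nested proxies.
std::vector<Node> makeChain(std::size_t proxies, int prototypeId) {
  std::vector<Node> chain(proxies + 1);
  chain[0].prototypeId = prototypeId;
  for (std::size_t i = 1; i < chain.size(); ++i) {
    chain[i].proxyTarget = &chain[i - 1];
  }
  return chain;
}
```

A shallow chain resolves to the base object's prototype, while a 200,000-deep chain throws rather than exhausting the stack.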

The regression test in jsvalue-test.c++ creates a 200,000-deep Proxy chain
and calls checkProxyPrototype(), asserting that a RangeError is thrown
instead of crashing. AUTOVULN-CLOUDFLARE-WORKERD-143.

Test validation: VALIDATED LOCALLY
Pre-patch run: FAIL (bazel test //src/workerd/jsg:jsvalue-test@)
Post-patch run: PASS (bazel test //src/workerd/jsg:jsvalue-test@)

Refs: AUTOVULN-CLOUDFLARE-WORKERD-143
Use Gitlab job ID as run_id for workerd-robot

See merge request cloudflare/ew/workerd!79
With the goal of preventing tens of thousands of these from being
accumulated by individual isolates without GC kicking in, holding open
outbound network connections unnecessarily.
ketanhwr and others added 18 commits May 14, 2026 14:04
VULN-136618: fix(worker-loader): copy data/wasm module bytes before async compilation

See merge request cloudflare/ew/workerd!55
…in handlePush

ByteQueue::handlePush() in queue.c++ called bufferData(0) when a partially
consumed entry could not satisfy the next pending BYOB readAtLeast() request.
This re-buffered the entire entry from offset 0 instead of from the current
entryOffset, duplicating already-consumed bytes and inflating queueTotalSize.
On the next enqueue, the KJ_REQUIRE at line 1110 (state.queueTotalSize <
pending.pullInto.atLeast) would fail because the duplicated bytes made
queueTotalSize exceed atLeast. The fix changes bufferData(0) to
bufferData(entryOffset) so only the unconsumed tail is buffered.

The regression test creates two concurrent readAtLeast(5) BYOB reads with
5-byte views, enqueues 7 bytes (partially consumed by read #1, leaving 2
bytes for read #2's buffer), then enqueues 4 more bytes. Pre-patch this
triggers the assertion failure; post-patch both reads complete correctly.
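
The accounting error can be shown with a toy queue. `Queue` and its `bufferData(entry, offset)` signature are hypothetical simplifications of the real ByteQueue state; the numbers follow the scenario above (a 7-byte entry with 5 bytes already consumed, and a pending read with atLeast = 5).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical simplified queue state.
struct Queue {
  std::vector<unsigned char> buffered;
  std::size_t queueTotalSize = 0;

  // Buffers entry[offset..end), i.e. only the bytes not yet consumed.
  void bufferData(const std::vector<unsigned char>& entry, std::size_t offset) {
    buffered.insert(buffered.end(),
                    entry.begin() + static_cast<std::ptrdiff_t>(offset),
                    entry.end());
    queueTotalSize += entry.size() - offset;
  }
};
```

Buffering from `entryOffset` (5) records 2 bytes, which keeps `queueTotalSize < atLeast` true. Buffering from 0 records all 7 bytes, including the 5 that were already delivered, and the invariant no longer holds.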

Test validation: VALIDATED LOCALLY
Pre-patch run: FAIL (bazel test //src/workerd/api/tests:streams-byob-concurrent-readatleast-test@)
Post-patch run: PASS (bazel test //src/workerd/api/tests:streams-byob-concurrent-readatleast-test@)

Refs: AUTOVULN-CLOUDFLARE-WORKERD-18
…ayPtr<T>()

The const overload of BackingStore::asArrayPtr<T>() in buffersource.h computed
the returned pointer as static_cast<T*>(backingStore->Data()) + byteOffset,
which treats byteOffset (a byte count) as an element count. For multi-byte T
(e.g. uint32_t), this advances by byteOffset * sizeof(T) bytes instead of
byteOffset bytes, producing an out-of-bounds pointer. The non-const overload
was already correct: it casts to kj::byte* first, adds byteOffset, then
reinterprets to T*. The fix makes the const overload mirror the non-const
overload and adds a byteOffset alignment assertion.
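
A small sketch of the arithmetic, using hypothetical helpers rather than the real `BackingStore` API. `viewAt` applies the offset in bytes before reinterpreting, and `elementOffsetAdvanceInBytes` only computes how far the old cast-then-add form would have moved, without forming an out-of-bounds pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Correct form: add the byte offset while the pointer is still a byte pointer,
// then reinterpret. Assumes `data` itself is suitably aligned for T.
template <typename T>
const T* viewAt(const unsigned char* data, std::size_t byteOffset) {
  assert(byteOffset % alignof(T) == 0);  // hypothetical form of the alignment check
  return reinterpret_cast<const T*>(data + byteOffset);
}

// Distance, in bytes, that `static_cast<T*>(data) + byteOffset` would advance:
// pointer arithmetic on T* scales by sizeof(T).
template <typename T>
constexpr std::size_t elementOffsetAdvanceInBytes(std::size_t byteOffset) {
  return byteOffset * sizeof(T);
}
```

For `uint32_t` and `byteOffset = 4`, the correct form lands on the second 32-bit word, while the cast-then-add form would have advanced 16 bytes, matching the figures in the regression test description.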

The regression test creates a Uint8Array view at byteOffset=4 over a 12-byte
ArrayBuffer, writes known byte patterns, then calls the const overload of
asArrayPtr<uint32_t>() and asserts the returned pointer reads the correct
uint32_t values. Before the fix, the const overload advanced by 16 bytes
(4 * sizeof(uint32_t)) instead of 4 bytes, reading zeroed memory.

Test validation: VALIDATED LOCALLY
Pre-patch run: FAIL (bazel test //src/workerd/jsg:buffersource-test@)
Post-patch run: PASS (bazel test //src/workerd/jsg:buffersource-test@)

Refs: AUTOVULN-CLOUDFLARE-WORKERD-17
Guard IoContext::current() in memory-cache eviction path.

See merge request cloudflare/ew/workerd!109
fix(jsg): correct pointer arithmetic in const BackingStore::asArrayPtr<T>()

See merge request cloudflare/ew/workerd!93
[ci] Use 16 CPU runner

See merge request cloudflare/ew/workerd!113
several fields were missing

Refs: AUTOVULN-CLOUDFLARE-WORKERD-44
VULN-136583: fix(streams): preserve entry offset when buffering partial BYOB data in handlePush

See merge request cloudflare/ew/workerd!20
fix EventSource memory tracking

See merge request cloudflare/ew/workerd!86
[build] silence protobuf warning

See merge request cloudflare/ew/workerd!114
This mostly reverts commit 0d86b66.

This removes the new `debugContext` string that was being passed around to distinguish params from results. Now that we've debugged the issue, this is more noise than it is worth.

We do keep the `cap.debugInfo()` debug log on failures, since that's not so invasive and is more useful anyway.
DO NOT MERGE until the autogate has been rolled to all of production!
This allows ExternalPusher methods to continue to be invoked after the top-level RPC call().

(DO NOT MERGE until jsrpc-session-handle autogate is rolled out in prod.)
There are cases where it is difficult to acquire the channel token for a SubrequestChannel or ActorClassChannel synchronously, but until now we have needed to do so in order to serialize `Fetcher`s and `DurableObjectClass`es.

We can't make serialization itself be async, because this would mess up e-order: A call that needs to wait for something while serializing params might end up being delayed until after some subsequent call which didn't wait, and so would be delivered out-of-order.

To avoid this, we make it possible for a call to be sent with an IOU for the channel tokens. This uses `ExternalPusher`. The call embeds an external which is a promise capability. Later, the caller invokes the callee's `ExternalPusher` to push the channel token to it, and resolves the IOU promise to the resulting object. The callee can then unwrap the promise to get their token.

(Opus 4.7 wrote the new test cases in channel-token-test but the rest of the code was by hand.)
This makes it so `getSubrequestChannel()` and similar methods of `IoChannelFactory` make sure that the contents of a `props` cap table are fully resolved before forwarding on to the `IoChannelFactory` implementation.

This means that the underlying implementation of `getSubrequestChannelResolved()` et al. doesn't need to change to start calling `getResolved()` before trying to downcast channel objects to implementation-specific subclasses. That change would otherwise have been really annoying to make in the internal codebase.

Relatedly, this adds an `ensureAllResolved()` method to `DynamicWorkerSource`, for resolving channels there.
@ask-bonk
Contributor

ask-bonk Bot commented May 27, 2026

This is a large security-focused upstream merge. After thorough review, the changes are predominantly fixes for multiple CVEs and the fixes themselves are well-implemented. I found no high-severity issues in the new code. The few observations I have are all low severity, and I don't believe any of them warrant blocking the PR.

Let me verify there are no actionable issues worth posting as GitHub suggestions:

  1. StaticRegulator consteval pattern - Safe and correct.
  2. NeuterableIoStream kj::Own -> kj::Rc - Correct, needed for the connect handler UAF fix.
  3. Stream queue weak-ref guards - Thorough and correct use of weak refs to detect re-entrant JS destroying consumers.
  4. Memory cache FallbackDoneCallback - Correctly captures kj::atomicAddRef of the cache instead of `this`, avoiding a use-after-free.
  5. Headers::setCommon validation - Security fix, getCommonHeaderName has its own bounds assertion.
  6. visitForGc additions - Comprehensive across ~30+ types, with a new clang-tidy lint for enforcement.
  7. Worker-loader IoContext capture - Fixed from raw reference to weak ref.
  8. HTTP path SSRF protection - Correct security hardening, intentional deviation from Node.js behavior.
  9. StreamSink/ExternalPusher removal - Old autogated code paths removed, obsolete capnp fields handled gracefully.

All findings are either correct fixes, or very low-severity observations that don't warrant review comments.

LGTM


@codecov-commenter

Codecov Report

❌ Patch coverage is 51.32827% with 513 lines in your changes missing coverage. Please review.
✅ Project coverage is 66.46%. Comparing base (e6353cd) to head (331bf27).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/workerd/io/io-channels.c++ | 20.57% | 173 Missing and 20 partials ⚠️ |
| src/workerd/server/server.c++ | 32.14% | 38 Missing ⚠️ |
| src/workerd/api/actor.c++ | 38.46% | 24 Missing and 8 partials ⚠️ |
| src/workerd/api/trace.h | 22.50% | 19 Missing and 12 partials ⚠️ |
| src/workerd/api/container.c++ | 0.00% | 30 Missing ⚠️ |
| src/workerd/api/http.c++ | 36.17% | 24 Missing and 6 partials ⚠️ |
| src/workerd/server/channel-token-test.c++ | 79.71% | 7 Missing and 21 partials ⚠️ |
| src/workerd/util/sqlite.c++ | 33.33% | 9 Missing and 9 partials ⚠️ |
| src/workerd/io/external-pusher.c++ | 25.00% | 12 Missing ⚠️ |
| src/workerd/server/channel-token.c++ | 77.77% | 4 Missing and 4 partials ⚠️ |

... and 30 more
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6785      +/-   ##
==========================================
- Coverage   66.61%   66.46%   -0.15%     
==========================================
  Files         402      404       +2     
  Lines      115914   116468     +554     
  Branches    19425    19512      +87     
==========================================
+ Hits        77212    77414     +202     
- Misses      27112    27416     +304     
- Partials    11590    11638      +48     

@mikea mikea marked this pull request as ready for review June 1, 2026 15:30
@mikea mikea requested review from a team as code owners June 1, 2026 15:30
@mikea mikea requested a review from petebacondarwin June 1, 2026 15:30
@petebacondarwin
Contributor

Hey @mikea - I have been requested to review this PR but I have no context of what this is doing and the motivation. Can you help me out here?

@danlapid danlapid merged commit 0aebafa into main Jun 1, 2026
21 of 22 checks passed
@danlapid danlapid deleted the maizatskyi/2026-05-27-upstream branch June 1, 2026 15:39