Auth SW: ask page for token on miss before falling through#4828

Draft
richardhjtan wants to merge 2 commits into main from CS-11144-auth-sw-token-race

@richardhjtan
Contributor

Summary

When the auth service worker intercepts an image / asset GET to a known realm host but has no token in its in-memory map, ask a page it controls for one via MessageChannel and retry with an Authorization header if a token comes back. Falls through to the existing unauthed fetch when no token is available, so current behavior is preserved for non-realm hosts and for users who genuinely have no session for the realm.

Part of a broader investigation into broken-image symptoms on staging and production where users see broken <img> icons and a manual refresh fixes them. This PR addresses one of the failure modes — image requests that 401 because the SW's token map is stale relative to localStorage. Full investigation report shared separately.

Linear: CS-11144

What this fixes

The SW caches realm tokens in memory and is updated by the host page via postMessage. If the cache is missing a token at the moment an <img> GET is intercepted, the SW used to forward the request unauthenticated and the realm-server returned 401. After this PR, the SW asks the page for the token; the page reads it from the authoritative SessionLocalStorageKey entry in localStorage and replies, and the SW retries the request with auth.
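
For readers unfamiliar with the retry step, here is a minimal sketch of re-issuing an intercepted request with a token attached. The helper name `withAuth`, the `Bearer` scheme, and forcing `mode: "cors"` are assumptions for illustration; the PR text does not show how the SW builds the authed request.

```typescript
// Hypothetical helper, not taken from auth-service-worker.js.
// Copies the intercepted request and adds an Authorization header.
function withAuth(original: Request, token: string): Request {
  const headers = new Headers(original.headers);
  // Assumption: the realm server expects a Bearer token.
  headers.set("Authorization", `Bearer ${token}`);
  // Assumption: <img> requests arrive as no-cors, where the Fetch spec
  // silently drops non-safelisted headers such as Authorization, so the
  // sketch switches the copy to cors mode.
  return new Request(original, { headers, mode: "cors" });
}
```

In a fetch handler this would be used roughly as `event.respondWith(fetch(withAuth(event.request, token)))` once the token lookup has resolved.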

This is a freshness / resilience fix on the existing auth path — it does not change which realms a user has access to.

Implementation notes

  • Origin gating — fallback only fires when the request origin is one we've ever held a realm token for, so unrelated cross-origin asset requests don't pay any round-trip cost.
  • Single-flight — concurrent misses for the same URL share one MessageChannel round-trip.
  • Short timeout (200 ms) — if the page doesn't reply, fall through. Worst-case impact on page load is bounded.
  • No new auth surface — only reads tokens the page already has. No new IAM, no new endpoints.
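
The single-flight and timeout notes above can be sketched independently of the service-worker APIs. This is an illustration under stated assumptions, not the PR's code: `TokenLookup`, `withTimeout`, and `singleFlight` are hypothetical names, and the injected `lookup` stands in for the MessageChannel round-trip.

```typescript
// Hypothetical names throughout; only the behaviour mirrors the notes above.
type TokenLookup = (requestURL: string) => Promise<string | undefined>;

// Resolve with the lookup's value, or with undefined once `ms` elapses
// or the lookup rejects, so the caller can fall through.
function withTimeout<T>(pending: Promise<T>, ms: number): Promise<T | undefined> {
  return new Promise<T | undefined>((resolve) => {
    const timer = setTimeout(() => resolve(undefined), ms);
    pending.then(
      (value) => { clearTimeout(timer); resolve(value); },
      () => { clearTimeout(timer); resolve(undefined); },
    );
  });
}

// Concurrent misses for the same URL share one pending lookup.
function singleFlight(lookup: TokenLookup, timeoutMs = 200): TokenLookup {
  const inflight = new Map<string, Promise<string | undefined>>();
  return (requestURL) => {
    const existing = inflight.get(requestURL);
    if (existing) return existing;
    const pending = withTimeout(lookup(requestURL), timeoutMs);
    inflight.set(requestURL, pending);
    // Drop the entry once settled so a later miss can ask again.
    const clear = () => { inflight.delete(requestURL); };
    pending.then(clear, clear);
    return pending;
  };
}
```

Keying the map by request URL matches the description above; a burst of `<img>` tags pointing at different URLs would still produce one round-trip each.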

Files

  • packages/host/public/auth-service-worker.js — fetch handler now does on-miss MessageChannel lookup.
  • packages/host/app/utils/auth-service-worker-registration.ts — listens for the SW's token request and replies from localStorage.
  • packages/host/tests/unit/auth-service-worker-test.ts — extended to cover the new path: positive case, single-flight, fall-through.

Test plan

  • pnpm lint / pnpm lint:types green.
  • Manual repro on staging: open a card that embeds an image from a realm where the token is in localStorage but the SW's map hasn't received it yet. Before → broken-image icon. After → image loads on first paint.
  • Re-run the staging log scan ~7 days after deploy; image-401 count from this failure mode should drop to ~0.

Out of scope

Other failure modes from the same investigation (URL-construction bugs on the indexer side, prerender base-URL handling, content-negotiation edge cases) are tracked separately and will land on their own branches.

🤖 Generated with Claude Code

When the auth service worker intercepts a GET to a known realm host but
has no token in its in-memory map, send a MessageChannel request to the
controlling page asking for one and retry with auth if a token comes
back. Falls through to the existing unauthed-fetch behavior when no
token is available.

Fixes the broken-image-icon symptom on first paint for: SW activation
races (per-realm sync was dropped because navigator.serviceWorker.controller
was null), and any other window where the SW's token map is stale
relative to localStorage. The page reads from localStorage via the
existing SessionLocalStorageKey, so this is purely a freshness fix —
it does not change which realms a user has access to.

The host-side listener replies via the MessagePort; single-flight per
request URL keeps a burst of <img> tags from triggering a burst of
postMessages. Origin-gated to known realm hosts so unrelated
cross-origin asset requests don't pay the round-trip cost.

CS-11144

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions
Contributor

github-actions Bot commented May 14, 2026

Preview deployments

Host Test Results

    1 files  ±0      1 suites  ±0   1h 39m 49s ⏱️ - 1m 10s
2 667 tests +2  2 652 ✅ +2  15 💤 ±0  0 ❌ ±0 
2 686 runs  +2  2 671 ✅ +2  15 💤 ±0  0 ❌ ±0 

Results for commit 2b9f986. ± Comparison against earlier commit df97848.

Realm Server Test Results

    1 files  ±0      1 suites  ±0   11m 18s ⏱️ +22s
1 365 tests ±0  1 365 ✅ ±0  0 💤 ±0  0 ❌ ±0 
1 444 runs  ±0  1 444 ✅ ±0  0 💤 ±0  0 ❌ ±0 

Results for commit 2b9f986. ± Comparison against earlier commit df97848.

Contributor

Copilot AI left a comment

Pull request overview

Adds a service-worker fallback path so intercepted realm asset requests can recover when the SW token cache misses by asking the controlled page for a token from localStorage before falling back to an unauthenticated fetch.

Changes:

  • Tracks known realm origins and in-flight token requests in the auth service worker.
  • Adds a page-side request-realm-token message handler that resolves the best matching token from localStorage.
  • Extends unit coverage for on-miss token lookup, caching, single-flight behavior, and fall-through.
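
One plausible reading of "best matching token" is longest-prefix matching of the request URL against the stored realm URLs. The sketch below assumes that interpretation and assumes the stored sessions have already been parsed into a plain object keyed by realm URL; `pickTokenForURL` and `TokenMatch` are hypothetical names, and the real handler's storage format is not shown in this PR page.

```typescript
// Hypothetical shape of the reply sent back over the MessagePort.
type TokenMatch = { realmURL?: string; token?: string };

// Assumption: `sessions` maps realm URL -> token. The longest realm URL
// that prefixes the request URL wins; no match yields an empty object.
function pickTokenForURL(requestURL: string, sessions: Record<string, string>): TokenMatch {
  let best: TokenMatch = {};
  for (const realmURL of Object.keys(sessions)) {
    const isPrefix = requestURL.startsWith(realmURL);
    const isLonger = best.realmURL === undefined || realmURL.length > best.realmURL.length;
    if (isPrefix && isLonger) {
      best = { realmURL, token: sessions[realmURL] };
    }
  }
  return best;
}
```

Returning an empty match rather than throwing lets the page always reply, so the SW does not have to wait out its timeout when there is simply no session.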

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File Description
packages/host/public/auth-service-worker.js Adds known-host tracking, token lookup via MessageChannel, and retry/fall-through logic.
packages/host/app/utils/auth-service-worker-registration.ts Adds the page-side service worker message handler and localStorage token resolution.
packages/host/tests/unit/auth-service-worker-test.ts Updates the simulated SW test harness and adds coverage for the fallback path.

Comment on lines +185 to +194
// No token in the map. Only attempt the on-miss client fallback when the
// request is to a host we've ever held a realm token for — that keeps the
// round-trip cost off every unrelated cross-origin asset request.
let requestOrigin;
try {
  requestOrigin = new URL(url).origin;
} catch {
  return;
}
if (!realmHosts.has(requestOrigin)) {
Comment on lines +128 to +131
// Ask the first window client. If multiple are open, any one of them
// can answer from the shared localStorage / session state.
clientList[0].postMessage({ type: 'request-realm-token', requestURL }, [
  channel.port2,
Comment on lines +110 to +132
let timer = setTimeout(() => {
  if (settled) return;
  settled = true;
  resolve(undefined);
}, TOKEN_REQUEST_TIMEOUT_MS);
channel.port1.onmessage = (event) => {
  if (settled) return;
  settled = true;
  clearTimeout(timer);
  let reply = event.data;
  if (reply && reply.realmURL && reply.token) {
    realmTokens.set(reply.realmURL, reply.token);
    recordRealmHost(reply.realmURL);
    resolve(reply.token);
  } else {
    resolve(undefined);
  }
};
// Ask the first window client. If multiple are open, any one of them
// can answer from the shared localStorage / session state.
clientList[0].postMessage({ type: 'request-realm-token', requestURL }, [
  channel.port2,
]);
…lot review)

Three follow-ups from the PR review:

1. Cold start. The on-miss client fallback used to require realmHosts
   to already contain the request's origin. At cold start (SW just
   activated, page hasn't synced yet) realmHosts is empty even though
   localStorage has tokens — which is the exact stale-cache case the
   fallback is meant to recover. Now: when realmHosts is empty, allow
   the fallback through; once populated, gate as before to keep the
   round-trip off unrelated cross-origin asset requests.

2. Ask the initiating client first. With skipWaiting() + clients.claim()
   multiple tabs can be controlled by this SW where some still run an
   older bundle without the request-realm-token listener. Always asking
   "first window" could hang on such a tab. Prefer event.clientId, fall
   back to first window only if the initiating client isn't a window
   (or no clientId was provided).

3. Timeout-path test. The test harness now mirrors the SW's race-
   against-timer behavior so a stuck clientTokenLookup results in
   fallthrough-fetch rather than hanging the suite.
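
The first two follow-ups reduce to two small decisions, sketched here as pure functions. `shouldAskPage`, `pickClient`, and `WindowLike` are hypothetical names for illustration; the SW itself works with `realmHosts`, `event.clientId`, and the window client list as described above.

```typescript
// Follow-up 1: allow the fallback at cold start (no hosts learned yet),
// otherwise only for origins the SW has held a token for.
function shouldAskPage(requestURL: string, realmHosts: Set<string>): boolean {
  let origin: string;
  try {
    origin = new URL(requestURL).origin;
  } catch {
    return false; // unparseable URL: leave the request alone
  }
  return realmHosts.size === 0 || realmHosts.has(origin);
}

// Minimal stand-in for a window client; only the id matters here.
type WindowLike = { id: string };

// Follow-up 2: prefer the client that initiated the request, and fall
// back to the first window when it is missing or not a window.
function pickClient<T extends WindowLike>(windows: T[], initiatingClientId?: string): T | undefined {
  const initiating = windows.find((w) => w.id === initiatingClientId);
  return initiating ?? windows[0];
}
```

Note that the cold-start branch means every cross-origin GET pays the round-trip until the first token is learned; the 200 ms timeout from the description bounds that cost.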

CS-11144

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

Comments suppressed due to low confidence (1)

packages/host/public/auth-service-worker.js:143

  • A delayed MessageChannel reply is accepted and cached unconditionally. If a clear-tokens, remove-realm-token, or newer sync-tokens message arrives while this lookup is in flight, this handler can re-add a token after logout or overwrite a freshly synced token with a stale one; the fallback should be invalidated or version-checked against later token mutations before caching/using the reply.
        if (reply && reply.realmURL && reply.token) {
          realmTokens.set(reply.realmURL, reply.token);
          recordRealmHost(reply.realmURL);
          resolve(reply.token);
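
One way to do the version check suggested in this comment is a generation counter that every page-driven mutation bumps. This is a sketch of that idea only, with a hypothetical `TokenCache` class; it is not how the PR currently behaves.

```typescript
// Hypothetical wrapper around the SW's token map.
class TokenCache {
  private tokens = new Map<string, string>();
  private generation = 0;

  // Page-driven mutations (sync, remove, clear) bump the generation.
  set(realmURL: string, token: string): void {
    this.tokens.set(realmURL, token);
    this.generation++;
  }
  remove(realmURL: string): void {
    this.tokens.delete(realmURL);
    this.generation++;
  }
  clear(): void {
    this.tokens.clear();
    this.generation++;
  }
  get(realmURL: string): string | undefined {
    return this.tokens.get(realmURL);
  }

  // Capture this before sending the MessageChannel request.
  snapshot(): number {
    return this.generation;
  }

  // Cache a fallback reply only if nothing mutated the cache since the
  // snapshot was taken; otherwise the reply may be stale and is dropped.
  acceptReply(snapshot: number, realmURL: string, token: string): boolean {
    if (snapshot !== this.generation) return false;
    this.tokens.set(realmURL, token);
    return true;
  }
}
```

Accepted replies deliberately do not bump the generation, so two concurrent fallback lookups for different realms do not invalidate each other.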

Comment on lines +203 to +216
// No token in the map. Attempt the on-miss client fallback when either
// (a) the SW has not yet learned any realm hosts (cold-start: SW just
// activated and the page hasn't synced yet — exactly when we want the
// fallback to recover from a stale empty cache), or (b) the request
// origin matches a host we have ever held a token for. Skip the
// fallback for clearly-unrelated cross-origin assets once realmHosts
// is populated.
let requestOrigin;
try {
  requestOrigin = new URL(url).origin;
} catch {
  return;
}
if (realmHosts.size > 0 && !realmHosts.has(requestOrigin)) {
Comment on lines +116 to +120
async function requestTokenFromClient(requestURL, initiatingClientId) {
  // Single-flight per request URL
  let existing = inflightTokenRequests.get(requestURL);
  if (existing) {
    return existing;
Comment on lines +40 to +49
navigator.serviceWorker.addEventListener('message', (event) => {
  if (!event.data || event.data.type !== 'request-realm-token') {
    return;
  }
  let port = event.ports?.[0];
  if (!port) {
    return;
  }
  let { realmURL, token } = resolveTokenForRequestURL(event.data.requestURL);
  port.postMessage({ realmURL, token });