Merged
5 changes: 5 additions & 0 deletions .changeset/fix-expired-shapes-self-healing.md
@@ -0,0 +1,5 @@
---
'@electric-sql/client': patch
---

Fix permanently stuck expired shape handles in localStorage by adding self-healing retry. When stale cache retries are exhausted (3 attempts with cache busters), the client now clears the expired entry from localStorage and retries once without the `expired_handle` parameter. Since the server never reuses handles (documented as SPEC.md S0), the fresh response will have a new handle and bypass stale detection. This prevents shapes from being permanently unloadable when a proxy strips cache-buster query parameters.
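A minimal sketch of why cache-buster retries cannot help when a proxy strips the parameter. The `proxyCacheKey` function, the example URL, and the stripping behaviour are hypothetical stand-ins for a misconfigured proxy, not Electric code; only the `cache_buster` and `expired_handle` parameter names come from this PR.

```typescript
// Hypothetical model of a proxy that builds its cache key from the URL
// after dropping some query parameters. Not part of @electric-sql/client.
function proxyCacheKey(rawUrl: string, strippedParams: string[]): string {
  const parts = rawUrl.split('?')
  const path = parts[0]
  const query = parts.length > 1 ? parts[1] : ''
  const kept = query
    .split('&')
    .filter((pair) => pair !== '')
    .filter((pair) => strippedParams.indexOf(pair.split('=')[0]) === -1)
    .sort() // make the key independent of parameter order
  return kept.length > 0 ? path + '?' + kept.join('&') : path
}

// Made-up request URLs: two retries that differ only in their cache buster.
const base = 'https://example.com/v1/shape?table=items&offset=-1'
const retry1 = base + '&expired_handle=h1&cache_buster=aaa'
const retry2 = base + '&expired_handle=h1&cache_buster=bbb'

// A proxy that keeps every parameter sees two different keys (cache miss).
const keysDiffer = proxyCacheKey(retry1, []) !== proxyCacheKey(retry2, [])

// A proxy that strips cache_buster maps both retries to the same entry,
// so the same stale response is served on every attempt.
const keysCollide =
  proxyCacheKey(retry1, ['cache_buster']) ===
  proxyCacheKey(retry2, ['cache_buster'])
```

Under this model every retry hits the same cache entry, which is the situation the self-healing retry is meant to escape.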
2 changes: 1 addition & 1 deletion examples/burn/assets/package.json
@@ -50,7 +50,7 @@
"eslint-plugin-prettier": "^5.4.0",
"eslint-plugin-react-hooks": "^4.6.0",
"eslint-plugin-react-refresh": "^0.4.6",
"prettier": "^3.2.4",
"prettier": "^3.6.2",
"typescript": "^5.2.2",
"vite": "^6.2.3"
}
2 changes: 1 addition & 1 deletion examples/redis/package.json
@@ -40,7 +40,7 @@
"eslint-config-prettier": "^9.1.0",
"eslint-plugin-prettier": "^5.1.3",
"glob": "^10.3.10",
"prettier": "^3.3.2",
"prettier": "^3.6.2",
"shx": "^0.3.4",
"tsup": "^8.0.1",
"tsx": "^4.19.1",
4 changes: 2 additions & 2 deletions package.json
@@ -33,9 +33,9 @@
"lint-staged": {
"*.{js,jsx,ts,tsx}": [
"eslint --fix",
"prettier --write"
"node_modules/.bin/prettier --write"
],
"*.{json,css,md,yml,yaml}": "prettier --write"
"*.{json,css,md,yml,yaml}": "node_modules/.bin/prettier --write"
},
"pnpm": {
"patchedDependencies": {
2 changes: 1 addition & 1 deletion packages/experimental/package.json
@@ -19,7 +19,7 @@
"eslint-plugin-prettier": "^5.1.3",
"glob": "^10.3.10",
"pg": "^8.12.0",
"prettier": "^3.3.2",
"prettier": "^3.6.2",
"shx": "^0.3.4",
"tsup": "^8.0.1",
"typescript": "^5.5.2",
2 changes: 1 addition & 1 deletion packages/react-hooks/package.json
@@ -26,7 +26,7 @@
"glob": "^10.3.10",
"jsdom": "^25.0.0",
"pg": "^8.12.0",
"prettier": "^3.3.2",
"prettier": "^3.6.2",
"react": "^18.3.1",
"react-dom": "^18.3.1",
"shx": "^0.3.4",
2 changes: 1 addition & 1 deletion packages/start/package.json
@@ -35,7 +35,7 @@
"eslint": "^8.57.0",
"eslint-config-prettier": "^9.1.0",
"eslint-plugin-prettier": "^5.1.3",
"prettier": "^3.3.2",
"prettier": "^3.6.2",
"shx": "^0.3.4",
"tsup": "^8.0.1",
"typescript": "^5.5.2",
53 changes: 37 additions & 16 deletions packages/typescript-client/SPEC.md
@@ -66,6 +66,26 @@ Any ──markMustRefetch─► Initial (offset = -1)
- `response` on Paused delegates to `previousState`, preserving the Paused wrapper for `accepted` and `stale-retry` transitions; `ignored` returns `this`
- `response`/`messages`/`sseClose` on Error return `this` (ignored)

## Server Assumptions

Properties of the sync service that the client state machine depends on.

### S0: Shape handles are unique and never reused

The server generates handles as `{phash2_hash}-{microsecond_timestamp}`. Uniqueness
is enforced by monotonic timestamps, a SQLite `UNIQUE INDEX` on the handle column,
and ETS `insert_new` checks. Even after server restarts, old handles persist in
SQLite and new ones receive fresh timestamps, so collisions cannot occur.

**Implication for expired shapes cache**: Once a handle is marked expired (after a
409 response), the server will never issue that handle again. If a response contains
an expired handle, it must be coming from a caching layer (browser HTTP cache,
CDN, or proxy) — not from the server itself.

**Source**: `packages/sync-service/lib/electric/shapes/shape.ex` (`generate_id/1`),
`packages/sync-service/lib/electric/shape_cache/shape_status/shape_db/connection.ex`
(`shapes_handle_idx`).
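The implication of S0 can be sketched as a small classification function. This is an illustrative helper, not the client's actual code: `classifyResponse` and `ResponseOrigin` are hypothetical names, and the handle strings are made-up values that only follow the `{phash2_hash}-{microsecond_timestamp}` shape described above.

```typescript
// Hypothetical helper illustrating the S0 implication. The real check lives
// in the client state machine and may differ in detail.
type ResponseOrigin = 'fresh' | 'stale-cache'

function classifyResponse(
  responseHandle: string,
  expiredHandle: string | null
): ResponseOrigin {
  // S0: the server never reissues a handle, so a response carrying a handle
  // that was already marked expired cannot have come from the server itself.
  if (expiredHandle !== null && responseHandle === expiredHandle) {
    return 'stale-cache'
  }
  return 'fresh'
}

// Made-up handles in the documented format.
const expired = '12345678-1700000000000000'
const rotated = '12345678-1700000000500000'

const fromCache = classifyResponse(expired, expired)
const fromServer = classifyResponse(rotated, expired)
const nothingExpired = classifyResponse(expired, null)
```

Because the comparison is plain string equality, the check only stays sound as long as S0 holds on the server side.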

## Invariants

Properties that must hold after every state transition. Checked automatically by
@@ -346,25 +366,26 @@ This is enforced by the path-specific guards listed below. Live requests

Six sites in `client.ts` recurse or loop to issue a new fetch:

| # | Site | Line | Trigger | URL changes because | Guard |
| --- | --------------------------------------- | ---- | ---------------------------------------------------------- | ----------------------------------------------------------------------------------- | ------------------------------------------------------- |
| L1 | `#requestShape` → `#requestShape` | 940 | Normal completion after `#fetchShape()` | Offset advances from response headers | `#checkFastLoop` (non-live) |
| L2 | `#requestShape` catch → `#requestShape` | 874 | Abort with `FORCE_DISCONNECT_AND_REFRESH` or `SYSTEM_WAKE` | `isRefreshing` flag changes `canLongPoll`, affecting `live` param | Abort signals are discrete events |
| L3 | `#requestShape` catch → `#requestShape` | 886 | `StaleCacheError` thrown by `#onInitialResponse` | `StaleRetryState` adds `cache_buster` param | `maxStaleCacheRetries` counter in state machine |
| L4 | `#requestShape` catch → `#requestShape` | 924 | HTTP 409 (shape rotation) | `#reset()` sets offset=-1 + new handle; or request-scoped cache buster if no handle | New handle from 409 response or unique retry URL |
| L5 | `#start` catch → `#start` | 782 | Exception + `onError` returns retry opts | Params/headers merged from `retryOpts` | `#maxConsecutiveErrorRetries` (50) |
| L6 | `fetchSnapshot` catch → `fetchSnapshot` | 1975 | HTTP 409 on snapshot fetch | New handle via `withHandle()`; or local retry cache buster if same/no handle | `#maxSnapshotRetries` (5) + cache buster on same handle |
| # | Site | Line | Trigger | URL changes because | Guard |
| --- | --------------------------------------- | ---- | ---------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------- |
| L1 | `#requestShape` → `#requestShape` | 940 | Normal completion after `#fetchShape()` | Offset advances from response headers | `#checkFastLoop` (non-live) |
| L2 | `#requestShape` catch → `#requestShape` | 874 | Abort with `FORCE_DISCONNECT_AND_REFRESH` or `SYSTEM_WAKE` | `isRefreshing` flag changes `canLongPoll`, affecting `live` param | Abort signals are discrete events |
| L3 | `#requestShape` catch → `#requestShape` | 886 | `StaleCacheError` thrown by `#onInitialResponse` | `StaleRetryState` adds `cache_buster` param; after max retries, self-healing clears expired entry + resets stream | `maxStaleCacheRetries` counter + `#expiredShapeRecoveryKey` (once per shape) |
| L4 | `#requestShape` catch → `#requestShape` | 924 | HTTP 409 (shape rotation) | `#reset()` sets offset=-1 + new handle; or request-scoped cache buster if no handle | New handle from 409 response or unique retry URL |
| L5 | `#start` catch → `#start` | 782 | Exception + `onError` returns retry opts | Params/headers merged from `retryOpts` | `#maxConsecutiveErrorRetries` (50) |
| L6 | `fetchSnapshot` catch → `fetchSnapshot` | 1975 | HTTP 409 on snapshot fetch | New handle via `withHandle()`; or local retry cache buster if same/no handle | `#maxSnapshotRetries` (5) + cache buster on same handle |

### Guard mechanisms

| Guard | Scope | How it works |
| ----------------------------- | ----------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------ |
| `#checkFastLoop` | Non-live `#requestShape` only | Detects N requests at same offset within a time window. First: clears caches + resets. Persistent: exponential backoff → throws FetchError(502). |
| `maxStaleCacheRetries` | Stale response path (L3) | State machine counts stale retries. Throws FetchError(502) after 3 consecutive stale responses. |
| `#maxSnapshotRetries` | Snapshot 409 path (L6) | Counts consecutive snapshot 409s. Adds cache buster when handle unchanged. Throws FetchError(502) after 5. |
| `#maxConsecutiveErrorRetries` | `#start` onError retry (L5) | Counts consecutive error retries. Sends error to subscribers and tears down after 50. Reset on successful message batch. |
| Pause lock | `#requestShape` entry | Returns immediately if paused. Prevents fetches during snapshots. |
| Up-to-date exit | `#requestShape` entry | Returns if `!subscribe` and `isUpToDate`. Breaks loop for one-shot syncs. |
| Guard | Scope | How it works |
| ----------------------------- | ----------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `#checkFastLoop` | Non-live `#requestShape` only | Detects N requests at same offset within a time window. First: clears caches + resets. Persistent: exponential backoff → throws FetchError(502). |
| `maxStaleCacheRetries` | Stale response path (L3) | State machine counts stale retries. After 3 consecutive stale responses, clears expired entry and attempts one self-healing retry. Throws FetchError(502) if self-healing also fails. |
| `#expiredShapeRecoveryKey` | Self-healing (L3 extension) | Records shape key after first self-healing attempt. Second exhaustion on same key skips self-healing → FetchError(502). Cleared on up-to-date. |
| `#maxSnapshotRetries` | Snapshot 409 path (L6) | Counts consecutive snapshot 409s. Adds cache buster when handle unchanged. Throws FetchError(502) after 5. |
| `#maxConsecutiveErrorRetries` | `#start` onError retry (L5) | Counts consecutive error retries. Sends error to subscribers and tears down after 50. Reset on successful message batch. |
| Pause lock | `#requestShape` entry | Returns immediately if paused. Prevents fetches during snapshots. |
| Up-to-date exit | `#requestShape` entry | Returns if `!subscribe` and `isUpToDate`. Breaks loop for one-shot syncs. |
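How `maxStaleCacheRetries` and `#expiredShapeRecoveryKey` interact on the L3 path can be sketched with a simplified model. `StaleRetryGuard`, `Outcome`, and the method names are hypothetical, and the exact point at which the counter trips is an assumption: the sketch allows three cache-buster retries, treats the next stale response as exhaustion, and assumes the stream reset also resets the stale counter.

```typescript
// Hypothetical, simplified model of the L3 guards. Field names echo the spec,
// but this is not the client's implementation.
type Outcome = 'retry-with-cache-buster' | 'self-heal' | 'fail-502'

class StaleRetryGuard {
  private maxStaleCacheRetries = 3
  private staleRetries = 0
  private recoveryKey: string | null = null

  onStaleResponse(shapeKey: string): Outcome {
    this.staleRetries += 1
    if (this.staleRetries <= this.maxStaleCacheRetries) {
      return 'retry-with-cache-buster'
    }
    // Retries exhausted: allow one self-healing attempt per shape key.
    if (this.recoveryKey !== shapeKey) {
      this.recoveryKey = shapeKey
      this.staleRetries = 0 // assumption: the stream reset clears the counter
      return 'self-heal'
    }
    return 'fail-502'
  }

  onUpToDate(): void {
    // Reaching up-to-date re-arms self-healing for later incidents.
    this.recoveryKey = null
    this.staleRetries = 0
  }
}

const guard = new StaleRetryGuard()
const outcomes: Outcome[] = []
for (let i = 0; i < 8; i++) {
  outcomes.push(guard.onStaleResponse('shape-a'))
}

// After an up-to-date, a fresh exhaustion is allowed to self-heal again.
guard.onUpToDate()
const afterRecovery: Outcome[] = []
for (let i = 0; i < 4; i++) {
  afterRecovery.push(guard.onStaleResponse('shape-a'))
}
```

In this model the second exhaustion on the same shape key ends in the 502 failure, which is what keeps the self-healing path from looping indefinitely.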

### Coverage gaps

2 changes: 1 addition & 1 deletion packages/typescript-client/package.json
@@ -25,7 +25,7 @@
"glob": "^10.3.10",
"jsdom": "^26.1.0",
"pg": "^8.12.0",
"prettier": "^3.3.2",
"prettier": "^3.6.2",
"shx": "^0.3.4",
"tsup": "^8.0.1",
"typescript": "^5.5.2",
76 changes: 69 additions & 7 deletions packages/typescript-client/src/client.ts
@@ -623,6 +623,8 @@ export class ShapeStream<T extends Row<unknown> = Row>
#fastLoopMaxCount = 5
#pendingRequestShapeCacheBuster?: string
#maxSnapshotRetries = 5
#expiredShapeRecoveryKey: string | null = null
#pendingSelfHealCheck: { shapeKey: string; staleHandle: string } | null = null
#consecutiveErrorRetries = 0
#maxConsecutiveErrorRetries = 50

@@ -914,10 +916,11 @@ export class ShapeStream<T extends Row<unknown> = Row>
}

if (e instanceof StaleCacheError) {
// Received a stale cached response from CDN with an expired handle.
// The #staleCacheBuster has been set in #onInitialResponse, so retry
// the request which will include a random cache buster to bypass the
// misconfigured CDN cache.
// Two paths throw StaleCacheError:
// 1. Normal stale-retry: response handle matched expired handle,
// #staleCacheBuster set to bypass CDN cache on next request.
// 2. Self-healing: stale retries exhausted, expired entry cleared,
// stream reset — retry without expired_handle param.
return this.#requestShape()
}

@@ -1248,6 +1251,25 @@ export class ShapeStream<T extends Row<unknown> = Row>
? expiredShapesCache.getExpiredHandle(shapeKey)
: null

// If this response is the first one after a self-healing retry, check
// whether the proxy/CDN returned the exact handle we just marked expired.
// If so, the client is about to accept stale data silently — loudly warn
// so operators can detect and fix the proxy misconfiguration.
if (this.#pendingSelfHealCheck) {
const { shapeKey: healedKey, staleHandle } = this.#pendingSelfHealCheck
this.#pendingSelfHealCheck = null
if (shapeKey === healedKey && shapeHandle === staleHandle) {
console.warn(
`[Electric] Self-healing retry received the same handle "${staleHandle}" that was just marked expired. ` +
`This means your proxy/CDN is serving a stale cached response and ignoring cache-buster query params. ` +
`The client will proceed with this stale data to avoid a permanent failure, but it may be out of date until the cache refreshes. ` +
`Fix: configure your proxy/CDN to include all query parameters (especially 'handle' and 'offset') in its cache key. ` +
`For more information visit the troubleshooting guide: ${TROUBLESHOOTING_URL}`,
new Error(`stack trace`)
)
}
}

const transition = this.#syncState.handleResponseMetadata({
status,
responseHandle: shapeHandle,
@@ -1262,6 +1284,12 @@ export class ShapeStream<T extends Row<unknown> = Row>

this.#syncState = transition.state

// Clear recovery guard on 204 (no-content), since the empty body means
// #onMessages won't run to clear it via the up-to-date path.
if (status === 204) {
this.#expiredShapeRecoveryKey = null
}

if (transition.action === `accepted` && status === 204) {
this.#consecutiveErrorRetries = 0
}
@@ -1270,6 +1298,38 @@ export class ShapeStream<T extends Row<unknown> = Row>
// Cancel the response body to release the connection before retrying.
await response.body?.cancel()
if (transition.exceededMaxRetries) {
if (shapeKey) {
// Clear the expired entry — keeping it only poisons future sessions.
expiredShapesCache.delete(shapeKey)

// Try one self-healing retry per shape: reset the stream and
// retry without the expired_handle param. Since handles are never
// reused (see SPEC.md S0), the fresh response will have a new
// handle and won't trigger stale detection.
if (this.#expiredShapeRecoveryKey !== shapeKey) {
console.warn(
`[Electric] Stale cache retries exhausted (${this.#maxStaleCacheRetries} attempts). ` +
`Clearing expired handle entry and attempting self-healing retry without the expired_handle parameter. ` +
`For more information visit the troubleshooting guide: ${TROUBLESHOOTING_URL}`,
new Error(`stack trace`)
)
this.#expiredShapeRecoveryKey = shapeKey
// Arm a post-self-heal check: if the next response comes back
// with the same handle we just marked expired, the proxy/CDN is
// still serving stale data and we'll warn loudly instead of
// accepting it silently.
if (shapeHandle) {
this.#pendingSelfHealCheck = {
shapeKey,
staleHandle: shapeHandle,
}
}
this.#reset()
throw new StaleCacheError(
`Expired handle entry evicted for self-healing retry`
)
}
}
throw new FetchError(
502,
undefined,
@@ -1351,6 +1411,7 @@ export class ShapeStream<T extends Row<unknown> = Row>
shapeKey,
this.#syncState.liveCacheBuster
)
this.#expiredShapeRecoveryKey = null
}
}

@@ -1770,9 +1831,10 @@ export class ShapeStream<T extends Row<unknown> = Row>
#reset(handle?: string) {
this.#syncState = this.#syncState.markMustRefetch(handle)
this.#connected = false
// releaseAllMatching intentionally doesn't fire onReleased — it's called
// from within the running stream loop (#requestShape's 409 handler), so
// the stream is already active and doesn't need a resume signal.
// releaseAllMatching intentionally doesn't fire onReleased — every caller
// (#requestShape's 409 handler, #checkFastLoop, and stale-retry
// self-healing in #onInitialResponse) runs inside the active stream loop,
// so the stream is already active and doesn't need a resume signal.
this.#pauseLock.releaseAllMatching(`snapshot`)
}
