test(client): add fast-check model-based property tests and retry bound analysis #4089

Open
KyleAMathews wants to merge 13 commits into main from fast-check-model-tests

@KyleAMathews KyleAMathews commented Apr 4, 2026

Summary

Fixes four bugs in ShapeStream's retry and error-handling paths, found via model-based property testing and static analysis. The bugs could cause infinite retry loops, stale data delivery to subscribers, stack frame leaks, and cache key divergence.

Root Cause

All four bugs share a pattern: they're edge cases in error-handling paths that only manifest under specific sequences of server responses. Hand-written tests don't naturally explore adversarial sequences like "409 with same handle, then 409 with no handle, then 200 with empty body." These bugs survived because they required rare combinations — a 409 where the proxy strips the handle header, or a deprecated query param that nobody adds to the protocol list.

Approach

Two complementary verification strategies that catch bugs mechanically:

Model-based property testing (fast-check)

fc.asyncModelRun with 8 command types generates adversarial server response sequences. A simple model tracks { consecutiveErrors, terminated }, and each command predicts the model's state change before asserting the real ShapeStream matches.

FetchGate pattern: A controllable mock fetch that blocks each request until the test provides a response — turn-based coordination where the test controls what the server returns while ShapeStream drives when it fetches.
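
The turn-based coordination described above can be sketched roughly as follows. This is not the FetchGate class from test/model-based.test.ts; it is a minimal illustration that assumes a simplified response type (`GateResponse`) instead of the Fetch API's `Response`, and the method names `nextRequest` and `respond` are hypothetical.

```typescript
// Simplified stand-in for the Fetch API Response (an assumption for this sketch).
interface GateResponse {
  status: number
  headers: Record<string, string>
  body: string
}

// One in-flight request that the test has not answered yet.
interface PendingRequest {
  url: string
  respond: (response: GateResponse) => void
}

class FetchGate {
  private readonly pending: PendingRequest[] = []
  private readonly waiters: Array<(request: PendingRequest) => void> = []
  // Every URL the client asked for, in order, for later invariant checks.
  readonly requestedUrls: string[] = []

  // Passed to the client in place of fetch. The returned promise stays
  // unresolved until the test calls respond() on the matching request.
  fetch = (url: string): Promise<GateResponse> => {
    this.requestedUrls.push(url)
    return new Promise<GateResponse>((resolve) => {
      const request: PendingRequest = { url, respond: resolve }
      const waiter = this.waiters.shift()
      if (waiter) waiter(request)
      else this.pending.push(request)
    })
  }

  // Called by the test: resolves as soon as the client has issued a request.
  nextRequest(): Promise<PendingRequest> {
    const queued = this.pending.shift()
    if (queued) return Promise.resolve(queued)
    return new Promise<PendingRequest>((resolve) => this.waiters.push(resolve))
  }
}
```

The client decides when a request happens by calling `fetch`; the test decides what comes back by awaiting `nextRequest()` and calling `respond(...)` on the result.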

Global invariants checked after every command:

  • URL length stays bounded (catches suffix growth)
  • isUpToDate / lastSyncedAt consistency
  • After 409, the retry URL differs from the pre-409 URL (catches identity loops)
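
The three invariants listed above could be expressed as one check that runs after every command. The sketch below is illustrative only: the `RequestRecord` and `StreamSnapshot` types and the function name are hypothetical, the 2000-character bound comes from the commit message later in this PR, and the consistency rule is assumed to mean "an up-to-date stream must have a lastSyncedAt value".

```typescript
// One request as recorded by the mock fetch (hypothetical shape).
interface RequestRecord {
  url: string
  status: number
}

// The observable stream state that the invariants look at (hypothetical shape).
interface StreamSnapshot {
  isUpToDate: boolean
  lastSyncedAt: number | undefined
}

// Bound taken from the commit message: "URL length bounded at 2000 chars".
const MAX_URL_LENGTH = 2000

// Returns a description of every violated invariant; an empty array means all hold.
function findInvariantViolations(
  requests: RequestRecord[],
  snapshot: StreamSnapshot
): string[] {
  const violations: string[] = []

  requests.forEach((request, index) => {
    // Catches suffix growth, where each retry appends to the URL.
    if (request.url.length > MAX_URL_LENGTH) {
      violations.push(`url too long at request ${index}`)
    }
    // Catches identity loops: the request after a 409 must use a different URL.
    const next = requests[index + 1]
    if (request.status === 409 && next !== undefined && next.url === request.url) {
      violations.push(`identity loop after 409 at request ${index}`)
    }
  })

  // Assumed meaning of "isUpToDate / lastSyncedAt consistency".
  if (snapshot.isUpToDate && snapshot.lastSyncedAt === undefined) {
    violations.push('isUpToDate is true but lastSyncedAt is undefined')
  }

  return violations
}
```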

Static analysis via AST walking

Seven rule types that mechanically detect structural bug patterns:

| Rule | What it catches |
| --- | --- |
| unbounded-retry-loop | Recursive calls in catch blocks without detectable bounds |
| conditional-409-cache-buster | 409 handlers where createCacheBuster() is conditional or missing |
| parked-tail-await | await this.#method(); return patterns that park stack frames |
| error-path-publish | #publish/#onMessages calls inside catch blocks or error status handlers |
| shared-instance-field | Mutable fields written before async boundaries and read by other methods |
| ignored-response-transition | Non-delegate states returning { action: 'ignored' } |
| protocol-literal-drift | Near-miss string literals for Electric protocol params |

Each rule was RED/GREEN verified: temporarily reintroduce the bug → rule fires; revert → rule passes clean.

Bugs Fixed

Bug 1: Conditional cache buster on 409

When the server returned a 409 with the same handle (or no handle), createCacheBuster() was only called in the no-handle branch. Same-handle 409s produced identical retry URLs, causing infinite CDN-cached retry loops.

Fix: createCacheBuster() is now unconditional on every 409 — both in #requestShape and #fetchSnapshotWithRetry.
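
To see why the unconditional call matters, consider a rough sketch of building the retry URL after a 409. Nothing here is taken from src/client.ts: the `cache-buster` param name, the body of `createCacheBuster`, and the `buildRetryUrlAfter409` helper are all assumptions made for illustration.

```typescript
// Hypothetical query param name used only in this sketch.
const CACHE_BUSTER_PARAM = 'cache-buster'

let cacheBusterCounter = 0

// Assumed implementation: any value that is unique per call works.
function createCacheBuster(): string {
  cacheBusterCounter += 1
  return `${Date.now()}-${cacheBusterCounter}`
}

// Builds the URL for the request that follows a 409 (hypothetical helper).
function buildRetryUrlAfter409(
  previousUrl: string,
  handleFrom409: string | undefined
): string {
  const url = new URL(previousUrl)
  if (handleFrom409 !== undefined) {
    url.searchParams.set('handle', handleFrom409)
  } else {
    url.searchParams.delete('handle')
  }
  url.searchParams.set('offset', '-1')
  // Unconditional: even when the 409 repeats the same handle (or carries none),
  // the retry URL changes, so a CDN cannot keep serving the cached 409.
  url.searchParams.set(CACHE_BUSTER_PARAM, createCacheBuster())
  return url.toString()
}
```

With the conditional version, two consecutive 409s carrying the same handle would produce the same URL twice; here each call yields a distinct URL regardless of the handle.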

Bug 2: Parked stack frame in #start retry

await this.#start(); return kept the caller's frame alive for the entire recursive chain. Under repeated error-recovery cycles, this accumulated O(n) suspended frames.

Fix: return this.#start() releases the frame via promise chaining.
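
The difference between the two patterns can be made visible with a small experiment. This sketch does not use ShapeStream; it assumes two stand-in recursive functions and counts how many invocations are still inside their `try` block when the innermost call runs. The names `retryWithAwait`, `retryWithReturn`, and `liveFramesAtBase` are hypothetical.

```typescript
interface FrameStats {
  live: number // invocations that have started but not yet left their try block
  liveAtBase: number // value of `live` observed by the innermost call
}

// Old pattern: the caller stays suspended until the whole chain finishes.
async function retryWithAwait(depth: number, stats: FrameStats): Promise<void> {
  stats.live++
  try {
    await Promise.resolve() // stands in for real async work
    if (depth === 0) {
      stats.liveAtBase = stats.live
      return
    }
    await retryWithAwait(depth - 1, stats)
    return
  } finally {
    stats.live--
  }
}

// New pattern: the caller hands back the inner promise and finishes immediately.
async function retryWithReturn(depth: number, stats: FrameStats): Promise<void> {
  stats.live++
  try {
    await Promise.resolve()
    if (depth === 0) {
      stats.liveAtBase = stats.live
      return
    }
    return retryWithReturn(depth - 1, stats)
  } finally {
    stats.live--
  }
}

async function liveFramesAtBase(
  retry: (depth: number, stats: FrameStats) => Promise<void>,
  depth: number
): Promise<number> {
  const stats: FrameStats = { live: 0, liveAtBase: 0 }
  await retry(depth, stats)
  return stats.liveAtBase
}
```

For a chain of n retries, the await version has n + 1 invocations alive at the bottom of the chain, while the return version has one, which matches the O(n) accumulation described above.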

Bug 3: Missing EXPERIMENTAL_LIVE_SSE_QUERY_PARAM in protocol params

The deprecated experimental_live_sse param wasn't in ELECTRIC_PROTOCOL_QUERY_PARAMS, so canonicalShapeKey wouldn't strip it. Clients using the deprecated param would get different cache keys from clients using the current live_sse param.

Fix: Added to the array. Static analysis test now verifies all internal *_QUERY_PARAM exports are in the list.
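
A simplified sketch of why the missing entry split the cache keys, and of the completeness check, follows. The constant values and the short param list are assumptions for illustration (the real list in src/constants.ts is longer), and this `canonicalShapeKey` is a stand-in, not the client's implementation.

```typescript
// Assumed values; only a small illustrative subset of the protocol params.
const OFFSET_QUERY_PARAM = 'offset'
const SHAPE_HANDLE_QUERY_PARAM = 'handle'
const LIVE_SSE_QUERY_PARAM = 'live_sse'
const EXPERIMENTAL_LIVE_SSE_QUERY_PARAM = 'experimental_live_sse'

const ELECTRIC_PROTOCOL_QUERY_PARAMS: readonly string[] = [
  OFFSET_QUERY_PARAM,
  SHAPE_HANDLE_QUERY_PARAM,
  LIVE_SSE_QUERY_PARAM,
  EXPERIMENTAL_LIVE_SSE_QUERY_PARAM, // the entry this PR adds
]

// Stand-in for canonicalShapeKey: strip protocol params, keep user params.
function canonicalShapeKey(rawUrl: string, protocolParams: readonly string[]): string {
  const url = new URL(rawUrl)
  for (const param of protocolParams) {
    url.searchParams.delete(param)
  }
  url.searchParams.sort()
  return url.toString()
}

// Completeness check: every exported *_QUERY_PARAM value must be in the list.
function missingProtocolParams(
  exportedParams: Record<string, string>,
  protocolParams: readonly string[]
): string[] {
  return Object.entries(exportedParams)
    .filter(([name]) => name.endsWith('_QUERY_PARAM'))
    .filter(([, value]) => !protocolParams.includes(value))
    .map(([name]) => name)
}
```

With the full list, a URL carrying `experimental_live_sse=true` and one carrying `live_sse=true` reduce to the same key; with the deprecated param left out of the list, the first URL keeps its extra param and the keys diverge.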

Bug 4: Publishing 409 body to subscribers

The 409 handler parsed e.json and called await this.#publish(messages409) before retrying. This delivered stale/partial rotation messages to subscribers, which could corrupt client-side state.

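Fix: Removed the publish of the raw 409 body. 409 is a control-plane signal, and the retry will deliver fresh data from offset -1. A follow-up commit in this PR publishes a synthetic control-only must-refetch message in its place, because Shape relies on that message to clear its accumulated data and trigger snapshot re-execution.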
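
A later commit in this PR notes that Shape relies on the must-refetch control message to clear its accumulated data, so the 409 handler publishes a synthetic control-only message rather than the raw body. The sketch below illustrates that idea under simplified, assumed types; `messagesToPublishOn409` and `MiniShape` are hypothetical names, not code from the client.

```typescript
interface ControlMessage {
  headers: { control: 'must-refetch' }
}

interface ChangeMessage {
  headers: { operation: 'insert' }
  key: string
  value: Record<string, unknown>
}

type Message = ControlMessage | ChangeMessage

function isControlMessage(message: Message): message is ControlMessage {
  return 'control' in message.headers
}

// The raw 409 body is deliberately ignored: whatever rows it contains may be
// stale, so subscribers only ever see a static control message.
function messagesToPublishOn409(_raw409Body: Message[]): Message[] {
  return [{ headers: { control: 'must-refetch' } }]
}

// Tiny stand-in for a subscriber that accumulates rows.
class MiniShape {
  readonly rows = new Map<string, Record<string, unknown>>()

  apply(messages: Message[]): void {
    for (const message of messages) {
      if (isControlMessage(message)) {
        this.rows.clear() // must-refetch: drop everything and wait for fresh data
      } else {
        this.rows.set(message.key, message.value)
      }
    }
  }
}
```

Because the published array is a static literal, it also fits the refined error-path-publish rule described in the commit messages, which allows static array literal arguments while still flagging publishes derived from error data.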

Key Invariants

  1. Every 409 handler unconditionally calls createCacheBuster() before retrying
  2. Every recursive call in a catch block has a detectable retry bound (counter, type-guard, abort-signal, or callback-gate)
  3. No await this.#method(); return in recursive methods (use return this.#method() instead)
  4. No #publish or #onMessages calls in error handling paths
  5. All internal *_QUERY_PARAM constants appear in ELECTRIC_PROTOCOL_QUERY_PARAMS
  6. Consecutive error counter resets only on proven success (200 with data, 200 up-to-date, 204)
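
Invariant 6 and the { consecutiveErrors, terminated } model can be sketched as a pure transition function. This is a hypothetical reduction, not the model from test/model-based.test.ts: the response kind names are invented for the sketch, and the retry limit of 5 is an assumed value.

```typescript
// Invented labels for the kinds of server response the model distinguishes.
type ResponseKind =
  | 'success-with-data'
  | 'success-up-to-date'
  | 'no-content-204'
  | 'empty-200'
  | 'conflict-409'
  | 'retryable-error'
  | 'non-retryable-error'

interface RetryModel {
  consecutiveErrors: number
  terminated: boolean
}

// Assumed limit, chosen only for illustration.
const MAX_CONSECUTIVE_ERRORS = 5

function nextModel(model: RetryModel, kind: ResponseKind): RetryModel {
  if (model.terminated) return model
  switch (kind) {
    // Proven success: the only responses allowed to reset the counter.
    case 'success-with-data':
    case 'success-up-to-date':
    case 'no-content-204':
      return { consecutiveErrors: 0, terminated: false }
    // An empty batch or a shape rotation neither resets nor increments.
    case 'empty-200':
    case 'conflict-409':
      return model
    // Non-retryable errors (for example missing headers) stop the stream at once.
    case 'non-retryable-error':
      return { ...model, terminated: true }
    // Retryable errors count toward the bound.
    case 'retryable-error': {
      const consecutiveErrors = model.consecutiveErrors + 1
      return {
        consecutiveErrors,
        terminated: consecutiveErrors >= MAX_CONSECUTIVE_ERRORS,
      }
    }
  }
}
```

In a model-based test, each generated command would apply `nextModel` to the model and then assert that the real stream reports the same termination state.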

Non-goals

  • Refactoring #start/#requestShape into iterative loops (structural change, separate PR)
  • Fixing the pgArrayParser double-push bug (pre-existing, unreachable for valid PostgreSQL arrays)
  • Testing network-level failures or backoff timing (model tests focus on response sequences)
  • The wake detection queueMicrotask timing issue (pre-existing, requires deeper investigation)

Verification

cd packages/typescript-client
pnpm vitest run --config vitest.unit.config.ts

Files changed

| File | Change |
| --- | --- |
| src/client.ts | Unconditional cache buster on 409 (both handlers), return this.#start() instead of await, removed #publish from 409 handler, updated 409 comment |
| src/constants.ts | Added EXPERIMENTAL_LIVE_SSE_QUERY_PARAM to ELECTRIC_PROTOCOL_QUERY_PARAMS |
| test/model-based.test.ts | New: 8 response factories, FetchGate class, 8 command classes, model definition, global invariant assertions |
| test/static-analysis.test.ts | Tests for all 7 static analysis rules including protocol param completeness |
| bin/lib/shape-stream-static-analysis.mjs | AST-based analysis: unbounded retry, 409 cache buster, tail-position await, error-path publish, protocol literal drift |
| SPEC.md | Updated loop-back site line numbers (L1-L6) after code changes |
| vitest.unit.config.ts | Added model-based test to include list |
| package.json | Added fast-check devDependency |

🤖 Generated with Claude Code

KyleAMathews and others added 4 commits April 4, 2026 09:37
Adds model-based testing using fast-check to generate adversarial server
response sequences and verify the retry counter behaves correctly under
all orderings. The model tracks expected consecutive error count and
termination state; fast-check generates 100 random 80-command sequences
mixing 200s, 204s, 400s, and malformed 200s, verifying the model matches
the real ShapeStream at every step.

This catches the class of bugs where counter resets interact with error
sequences in unexpected ways — the kind of ordering-dependent issues that
hand-written tests miss because humans only think to test particular
sequences.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…eaders

Adds 3 new command types to the fast-check model-based property test:
- Respond409: verifies 409 (shape rotation) doesn't affect retry counter
- Respond200Empty: verifies empty message batch doesn't reset counter
- RespondMissingHeaders: verifies non-retryable errors terminate immediately

Also fixes cross-iteration cache pollution (expiredShapesCache, localStorage)
and tracks handle rotation so 409 → subsequent response sequences are
protocol-correct. Bumped to 200 runs with 8 command types.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…tests

Adds global invariant assertions checked after every command:
- URL length bounded at 2000 chars (catches -next suffix accumulation)
- isUpToDate + lastSyncedAt consistency (catches silent stuck states)
- Post-409 URL differs from pre-409 URL (catches identity loops)

Also tracks request URLs in FetchGate and exposes stream instance for
observable state assertions. These invariants are drawn from 25
historical bugs identified in the client's git history.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

pkg-pr-new bot commented Apr 4, 2026

npm i https://pkg.pr.new/@electric-sql/react@4089
npm i https://pkg.pr.new/@electric-sql/client@4089
npm i https://pkg.pr.new/@electric-sql/y-electric@4089

commit: fafcd59

codecov bot commented Apr 4, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 88.84%. Comparing base (c740930) to head (fafcd59).
⚠️ Report is 2 commits behind head on main.
✅ All tests successful. No failed tests found.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4089      +/-   ##
==========================================
+ Coverage   88.67%   88.84%   +0.16%     
==========================================
  Files          25       25              
  Lines        2438     2430       -8     
  Branches      615      604      -11     
==========================================
- Hits         2162     2159       -3     
+ Misses        274      269       -5     
  Partials        2        2              
| Flag | Coverage Δ |
| --- | --- |
| packages/experimental | 87.73% <ø> (ø) |
| packages/react-hooks | 86.48% <ø> (ø) |
| packages/start | 82.83% <ø> (ø) |
| packages/typescript-client | 94.07% <100.00%> (+0.25%) ⬆️ |
| packages/y-electric | 56.05% <ø> (ø) |
| typescript | 88.84% <100.00%> (+0.16%) ⬆️ |
| unit-tests | 88.84% <100.00%> (+0.16%) ⬆️ |

Enhances the static analysis script to exhaustively find recursive calls
in catch blocks and classify their retry bounds. For each recursive call,
walks the AST to detect counter guards, type guards (instanceof/status
checks), abort signal checks, and callback gates. Generates findings for
any completely unbounded recursive retries.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@KyleAMathews KyleAMathews changed the title from "test(client): add fast-check model-based property tests for ShapeStream" to "test(client): add fast-check model-based property tests and retry bound analysis" Apr 4, 2026
KyleAMathews and others added 8 commits April 4, 2026 09:55
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…eak in #start

Two fixes:

1. Both 409 handlers (#requestShape and #fetchSnapshotWithRetry) now
   unconditionally create a cache buster instead of only doing so
   conditionally when the handle is missing or recycled. This eliminates
   the same-handle 409 infinite loop (where identical retry URLs would
   hit CDN cache forever) and removes two conditional branches, making
   the behavior safer and easier to verify exhaustively.

2. Changed `await this.#start(); return` to `return this.#start()` in
   the onError retry path. The old pattern parked the outer #start frame
   on the call stack for the entire lifetime of the replacement stream,
   accumulating one frame per error recovery. The new pattern resolves
   the outer frame immediately.

Also adds model-based test commands for 409-no-handle and 409-same-handle
scenarios, plus a targeted regression test verifying consecutive same-handle
409s produce unique retry URLs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…sters

Adds a new analysis rule that finds all 409 status handlers in the
ShapeStream class and verifies each unconditionally calls
createCacheBuster(). This would have caught the same-handle 409 bug
where a conditional cache buster allowed identical retry URLs.

The rule:
- Finds if-statements checking .status == 409 or .status === 409
- Handles compound conditions (e.g. `e instanceof FetchError && e.status === 409`)
- Verifies createCacheBuster() is called outside any nested if-block
- Reports with the exact method, line numbers, and retry callee

RED/GREEN verified: temporarily reverting to conditional cache buster
correctly triggers the finding.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… list

The deprecated `experimental_live_sse` query param was missing from
ELECTRIC_PROTOCOL_QUERY_PARAMS, causing canonicalShapeKey to produce
different keys for the same shape depending on whether the SSE code
path added the param to the URL. This caused:

- expiredShapesCache entries written during SSE to be invisible when
  the stream fell back to long polling
- upToDateTracker entries from SSE sessions to be lost on page refresh
- fast-loop cache clearing to target the wrong key during SSE

Also adds a static analysis test that verifies all internal protocol
query param constants are included in the protocol params list, and
updates SPEC.md with the unconditional 409 cache buster invariant.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
409 responses carry only a must-refetch control message — no user data.
The #reset call already handles the state transition structurally. The
#publish call was delivering empty (or near-empty) batches to subscribers
because #publish lacks the empty-batch guard that #onMessages has.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Detects `await this.#method(); return` patterns in recursive methods
where `return this.#method()` would avoid parking the caller's stack
frame. RED/GREEN verified: reintroducing the old `await this.#start()`
pattern triggers the finding; the current `return this.#start()` is
clean.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ky SSE test

Add new static analysis rule detecting #publish/#onMessages calls inside
catch blocks or HTTP error status handlers — catches the Bug #4 pattern
(publishing stale 409 data to subscribers). RED/GREEN verified.

Also: update SPEC.md loop-back site line numbers, fix 409 handler comment,
DRY improvements to model-based tests, fix flaky SSE fallback test
(guard controller.close() against already-closed stream, widen SSE
request count tolerance).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The previous commit removed the raw 409 body publish entirely, but
Shape relies on the must-refetch control message to clear its
accumulated data and trigger snapshot re-execution. Instead of
publishing the raw response body (which could contain stale data rows),
publish a synthetic control-only message.

Also: fix flaky SSE fallback test (remove brittle upper bound on SSE
request count), refine error-path-publish rule to allow static array
literal arguments (synthetic control messages) while still catching
dynamic publishes from error data.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>