feat(webapp): mollifier API mutations on buffered runs #3756
🦋 Changeset detected. Latest commit: 6fd8529. The changes in this PR will be included in the next version bump. This PR includes changesets to release 32 packages.
Cancel, replay, reschedule, metadata, tags, and idempotency-key-reset now succeed against a run that's still in the mollifier buffer. Mutations are applied to the buffered snapshot via Lua CAS; the drainer carries the mutation forward when it replays. Stacked on the reads PR. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- metadata route: drop the `as unknown as Parameters<...>` cast on the parent/root operations path. Widen `routeOperationsToRun`'s env parameter to `AuthenticatedEnvironment` so the service's typed signature carries through; the caller always has the full env in scope.
- replay route: validate the buffered fallback against a Zod `BufferedReplayInputSchema` covering the fields `ReplayTaskRunService.call` actually reads (id, friendlyId, runtimeEnvironmentId, taskIdentifier, payload, payloadType, queue, isTest, traceId, spanId, engine, runTags, plus nullable concurrencyKey/workerQueue/machinePreset/realtimeStreamsVersion). On schema failure, log the issue list and return 404 rather than passing a half-shaped object into the service.
- resetIdempotencyKey: distinguish "PG-empty + buffer-cleared-nothing" (genuine 404) from "PG-empty + buffer-unreachable" (partial outage: 503 with retry hint). The previous behaviour silently returned 404 on outage, hiding the partial failure and leaving a buffered key effectively un-reset. A new regression test covers the branches PG-hit + buffer-throws, PG-empty + buffer-hit, PG-empty + buffer-clean-miss, PG-empty + buffer-outage, and mollifier-disabled.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
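The resetIdempotencyKey branches described above can be pictured as a small decision function. This is only a sketch: the type `BufferReset` and the function `resolveResetStatus` are hypothetical names, and the rule that a PG hit returns 200 even when the buffer throws is an assumption. Only the 404-versus-503 split for the PG-empty cases is taken from the commit message.

```typescript
// Hypothetical names; only the 404-vs-503 split comes from the commit message.
type BufferReset =
  | { kind: "cleared"; count: number } // buffer reachable; count may be 0
  | { kind: "unreachable" } // the buffer call threw or timed out
  | { kind: "disabled" }; // mollifier is off, buffer never consulted

function resolveResetStatus(pgRowsReset: number, buffer: BufferReset): 200 | 404 | 503 {
  const bufferCleared = buffer.kind === "cleared" ? buffer.count : 0;
  // Assumption: a reset on either side counts as success, even if the buffer threw.
  if (pgRowsReset > 0 || bufferCleared > 0) return 200;
  // Nothing in PG and the buffer could not be asked: partial outage, caller should retry.
  if (buffer.kind === "unreachable") return 503;
  // Nothing in PG, and the buffer had nothing or is not in play: genuine miss.
  return 404;
}
```

Modelling the buffer result as a tagged union keeps "reachable but empty" and "unreachable" from collapsing into the same falsy value, which is the failure mode the commit describes.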
- metadata route was routing rootOperations to bufferedEntry.parentTaskRunId with a comment claiming PG's nil-coalesce defaults to parent. PG actually defaults to taskRun.id (self), so a buffered grandchild metadata.root.set() was silently mutating the child's metadata instead of the root's. SyntheticRun already carries rootTaskRunFriendlyId from the snapshot, so use it, falling back to the run itself (matching PG) when absent.
- reschedule route's PG path delegates to RescheduleTaskRunService, which requires the run's status to be "DELAYED" and returns 422 otherwise. The buffer path had no equivalent guard, so a customer could inject delayUntil into the snapshot of an undelayed buffered run and the drainer would materialise it with an unintended delay. Added a pre-fetch through findRunByIdWithMollifierFallback that returns 422 when the buffered run has no delayUntil. SyntheticRun doesn't carry a "DELAYED" status enum (only QUEUED|FAILED|CANCELED), so the gate reads the snapshot's delayUntil field directly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
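Both fixes above come down to reading the right field off the buffered snapshot. A minimal sketch, assuming a trimmed-down hypothetical `SyntheticRunLike` shape (the real SyntheticRun has more fields) and hypothetical helper names `rootOperationTarget` and `rescheduleStatus`:

```typescript
// Hypothetical, trimmed-down shape for illustration only.
interface SyntheticRunLike {
  friendlyId: string;
  parentTaskRunId?: string;
  rootTaskRunFriendlyId?: string;
  delayUntil?: Date;
}

// Root operations target the root run, or the run itself when no root is recorded.
// The parent is deliberately never used as the fallback.
function rootOperationTarget(run: SyntheticRunLike): string {
  return run.rootTaskRunFriendlyId ?? run.friendlyId;
}

// A buffered run has no "DELAYED" status, so the presence of delayUntil is the gate.
function rescheduleStatus(run: SyntheticRunLike): 200 | 422 {
  return run.delayUntil ? 200 : 422;
}
```

For a buffered grandchild whose snapshot records both a parent and a root, `rootOperationTarget` returns the root's friendly ID, and a run with no recorded root resolves to itself.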
The wait-and-bounce loop for mutations racing a mid-drain run polled the PG primary on a fixed 20ms cadence with no jitter: up to ~100 reads per request, synchronized across concurrent waiters, piling load onto the writer exactly when mollifier is engaged to shed it.

The drainer writes the canonical PG row BEFORE it acks (sets `materialised`) or fails (deletes the entry), so the buffer entry's own state is an authoritative, already-in-Redis signal for "is the row in PG yet?". Watch that (cheap Redis getEntry) instead, and touch the primary exactly once, for the actual mutation, only after it resolves. Poll gaps now use jittered exponential backoff (20ms → 250ms cap).

Drops the per-poll PG timeout race (DEFAULT_PG_TIMEOUT_MS / pgTimeoutMs / findRunInPgWithTimeout), unneeded now that PG is read once rather than in a tight loop.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
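The polling change described above could look roughly like the following. This is a sketch under assumptions, not the PR's code: the `EntryState` values other than `materialised`, the `GetEntry` signature, the `deadlineMs` parameter, and the choice of "full jitter" (a uniformly random delay below the exponential ceiling) are all illustrative. The commit only states a 20ms base, a 250ms cap, and that the buffer entry rather than PG is polled.

```typescript
// Hypothetical entry states; the commit only says the drainer sets `materialised`
// on success or deletes the entry on failure, after writing the PG row.
type EntryState = "buffered" | "draining" | "materialised";
type GetEntry = (runId: string) => Promise<{ state: EntryState } | null>;

const BASE_MS = 20;
const CAP_MS = 250;

// Full-jitter exponential backoff: a random delay in [0, min(cap, base * 2^attempt)).
function backoffDelayMs(attempt: number, random: () => number = Math.random): number {
  const ceiling = Math.min(CAP_MS, BASE_MS * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Polls the buffer entry until the drainer resolves it. It never reads PG itself;
// the caller performs the single PG mutation once this returns "resolved".
async function waitForDrain(
  runId: string,
  getEntry: GetEntry,
  deadlineMs: number,
): Promise<"resolved" | "timed_out"> {
  const start = Date.now();
  for (let attempt = 0; Date.now() - start < deadlineMs; attempt++) {
    const entry = await getEntry(runId);
    // Entry deleted (drain failed) or materialised (drain acked): in both cases
    // the PG row was written first, so the caller can go to PG now.
    if (entry === null || entry.state === "materialised") return "resolved";
    await sleep(backoffDelayMs(attempt));
  }
  return "timed_out";
}
```

Randomising each gap spreads concurrent waiters apart, so they stop hitting the store in lockstep, and the cap keeps the worst-case wait between polls bounded.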
Remove plan-tracking shorthand (Q2/Q3 design, _plans/) from mutations-layer mollifier comments; reword to plain English. Comment-only; no behaviour change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… metadata fallback errors

The tags API skipped MAX_TAGS_PER_RUN enforcement on the buffered path, letting a buffered run exceed the cap the trigger validator applies at creation. Enforce it atomically in the mutateSnapshot Lua: append_tags now accepts an optional maxTags and returns "limit_exceeded" (writing nothing) when the deduped count would overflow. mutateWithFallback gains a symmetric rejectedResponse builder and a "rejected" outcome; the tags route returns 422, matching the PG path.

Also stop silently swallowing PG failures in the metadata route's parent/root op fan-out: warn (with targetRunId + error) before the best-effort buffer fallback so a genuine PG outage is observable.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
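The append_tags semantics described above (dedupe, then check the cap, and write nothing on overflow) can be modelled in a few lines. This is a TypeScript model for illustration, not the Lua script from the PR; `appendTags` and `AppendTagsResult` are hypothetical names, and in the real system the check runs inside Redis so it is atomic with the write.

```typescript
type AppendTagsResult =
  | { outcome: "applied"; tags: string[] }
  | { outcome: "limit_exceeded" };

// Dedupe first, then compare against the optional cap. On overflow nothing is
// returned for writing, so the stored tags stay exactly as they were.
function appendTags(
  existing: readonly string[],
  incoming: readonly string[],
  maxTags?: number,
): AppendTagsResult {
  const merged = Array.from(new Set(existing.concat(incoming)));
  if (maxTags !== undefined && merged.length > maxTags) {
    return { outcome: "limit_exceeded" };
  }
  return { outcome: "applied", tags: merged };
}
```

Because the count is taken after deduplication, re-sending tags a run already has never trips the cap. A route handler would map `limit_exceeded` to the 422 the commit mentions.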
…oute comments

Two Phase labels survived the earlier sweep: "Phase A6" in api.v1.runs.$runId.metadata.ts (the GET-loader-added comment, mirroring the same pattern in api.v1.runs.$runParam.attempts.ts on reads), and "Phase B4" in api.v1.runs.$runParam.replay.ts (referring to where SyntheticRun was extended). Rewritten to plain prose; comment-only, no behaviour change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Cancel, replay, reschedule, metadata, tags, and idempotency-key-reset now succeed against a run that's still in the mollifier buffer. Mutations are applied to the buffered snapshot via Lua CAS; the drainer carries the mutation forward when it replays.
Primitives added:
- `mutateWithFallback`: PG-first / buffer-fallback resolver with bounded-wait safety net for entries that transition mid-mutation.
- `applyMetadataMutation`: buffered metadata PUT mirroring the PG-side retry loop with CAS atomicity.
- `resolveRunForMutation`: discriminated-union resolver used by route `findResource` so the route builder's pre-action 404 check sees buffered runs.

Routes wired (whole files, no GET/POST splits):
- `api.v2.runs.$runParam.cancel.ts`
- `api.v1.runs.$runParam.replay.ts`
- `api.v1.runs.$runParam.reschedule.ts`
- `api.v1.runs.$runId.metadata.ts`
- `api.v1.runs.$runId.tags.ts`
- `resetIdempotencyKey.server.ts`

Stacked on the reads PR.
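To make the PG-first / buffer-fallback flow of `mutateWithFallback` concrete, here is a rough sketch of the control flow as the summary describes it. Everything in it is an assumption for illustration: the `FallbackDeps` interface, the `SnapshotResult` values, the outcome shape, and the function name `mutateWithFallbackSketch` are not taken from the PR, whose real signature may differ.

```typescript
// All names below are hypothetical.
type SnapshotResult = "applied" | "limit_exceeded" | "busy" | "not_found";

type MutationOutcome<T> =
  | { kind: "pg"; value: T } // mutation landed on the PG row
  | { kind: "buffer" } // mutation landed on the buffered snapshot
  | { kind: "rejected"; reason: string } // buffer refused it (e.g. a cap was hit)
  | { kind: "not_found" };

interface FallbackDeps<T> {
  mutateInPg: () => Promise<T | null>; // null: no PG row for this run
  mutateSnapshot: () => Promise<SnapshotResult>; // atomic buffer-side mutation
  waitForDrain: () => Promise<boolean>; // true once the PG row exists
}

async function mutateWithFallbackSketch<T>(deps: FallbackDeps<T>): Promise<MutationOutcome<T>> {
  // 1. PG first: most runs are already materialised.
  const first = await deps.mutateInPg();
  if (first !== null) return { kind: "pg", value: first };

  // 2. Otherwise try the buffered snapshot.
  const buffered = await deps.mutateSnapshot();
  if (buffered === "applied") return { kind: "buffer" };
  if (buffered === "limit_exceeded") return { kind: "rejected", reason: buffered };
  if (buffered === "not_found") return { kind: "not_found" };

  // 3. "busy": the entry is mid-drain. Wait (bounded), then mutate PG once.
  if (await deps.waitForDrain()) {
    const second = await deps.mutateInPg();
    if (second !== null) return { kind: "pg", value: second };
  }
  return { kind: "not_found" };
}
```

Returning a tagged outcome rather than an HTTP response lets each route decide how to render `rejected` or `not_found`, which is one way to read the "symmetric rejectedResponse builder" mentioned in the commits.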
Test plan