feat(serve+sdk): F4 prereq — daemon protocol completion (serverTimestamp / provenance / errorKind / state_resync_required)#4360
Conversation
…aemon events (#4175 F4 prereq) Adopts chiga0's three P0 SDK-side blockers from #4175 comment #19 — the SDK side already consumes these fields (PR #4353), but daemon hadn't stamped them yet, leaving the corresponding UI affordances inert. All three stampings are purely additive on the wire and don't require any SDK type changes (SDK already has forward-compat field slots). **#19.1 — `_meta.serverTimestamp` on every SSE frame** (`server.ts` `formatSseFrame()`) Stamped at the SSE write boundary (NOT EventBus.publish) so the in-memory `BridgeEvent` type stays unchanged and internal consumers don't see `_meta`. Pre-existing `_meta` keys (e.g. tool_call's `_meta.toolName`) are preserved via spread merge. SDK reads via the 3-location probe in `extractServerTimestamp` (chiga0's PR #4353); we pick `_meta.serverTimestamp` (Anthropic convention) so top-level event type stays unpolluted. Why this matters: pre-fix, multi-client UIs showing "X minutes ago" or sorting transcript blocks by emit time used each client's local clock — drifts of tens of seconds to minutes across browsers/tabs/ mobile produced visibly inconsistent timestamps. **#19.2 — `tool_call` `provenance` + `serverId` on every emitter event** (`ToolCallEmitter.ts`) New static `ToolCallEmitter.resolveToolProvenance(toolName, subagentMeta)` returns `{ provenance: 'builtin' | 'mcp' | 'subagent'; serverId? }`. Resolution rules (per user-confirmed design decision from issue comment): subagent takes precedence (set when subagentMeta is present); `mcp__<server>__<tool>` naming heuristic classifies MCP tools with serverId; everything else is builtin. Stamped on `emitStart` AND `emitResult` AND `emitError` (all three emit paths) so a reconnecting client receiving a `tool_call_update` frame from the replay ring (without the original `tool_call` start event) can still derive the provenance. Provenance is stable per tool, so stamping on every event is redundant — but the marginal serialization cost is tiny and reconnect correctness wins. Chose the naming heuristic (not ToolRegistry lookup) per user confirmation: matches the SDK's own fallback (chiga0 PR #4353), no new ctx-dep on emit hot path, no signature changes. **#19.3 — `errorKind` on `stream_error`** (`server.ts` line ~1955) Stamped via `mapDomainErrorToErrorKind(err)` — the 7-value classifier already lives in `@qwen-code/acp-bridge/status.ts` since #4319. When the classifier returns `undefined` (generic Error etc.) the field is omitted — strictly additive. SDK consumers handle "errorKind absent" as before (fall back to rendering `error` text). NOT stamped on `session_died` because the 3 emit sites in `acp-bridge/ bridge.ts` don't have a classifiable `err` in scope: - `channel_closed` carries only exitCode/signalCode (no error) - `killed` is user-initiated (no domain error) - `daemon_shutdown` is operator-initiated (no domain error) A follow-up could thread channel-spawn errors through to the session_died emit site to enable `errorKind: 'init_timeout'` / `missing_binary` classification — left for a separate PR to avoid mixing protocol stamping with lifecycle plumbing. Verification - `npx vitest run packages/cli/src/serve/server.test.ts -t "serverTimestamp|stream_error|errorKind"` — 5 pass - `npx vitest run packages/cli/src/acp-integration/session/emitters/ToolCallEmitter.test.ts` — 46 pass (+ 11 new tests for resolveToolProvenance + provenance stamping on all 3 emit paths) - `npx vitest run packages/cli/src/acp-integration/session/HistoryReplayer.test.ts` — 17 pass - TypeScript clean on touched regions; pre-existing F3 (#4335 merge) errors elsewhere are unrelated. Existing test updates - 15 `_meta: { toolName: 'X' }` assertions in ToolCallEmitter.test.ts updated to include `provenance: 'builtin'` (defensive — catches accidental drift if a future refactor stops stamping). 2 strict-equality assertions in HistoryReplayer.test.ts similarly updated. The first SSE-frame test in server.test.ts switched from `toEqual` to `toMatchObject` since `_meta.serverTimestamp` makes exact equality brittle; a dedicated test pins the new field's shape.
…ync_required (#4175 F4 prereq) Closes the multi-client SSE reducer divergence bug Ilya0527 raised in #4175 comment #15. Pre-fix scenario: 1. Consumer's SSE stream drops; client buffers `Last-Event-ID: N`. 2. Network reconnects long enough later that events `[N+1, ringHead-1]` were evicted from the daemon's per-session ring. 3. Daemon's `subscribe({lastEventId: N})` silently replays only the surviving suffix. 4. Consumer's SDK reducer keeps applying deltas as if the stream was contiguous. Its state has now drifted from the daemon's truth — no terminal signal, no warning. The `SessionState` reducer's "same event stream in → same state out" purity guarantee is broken. The bug's blast radius is exactly when multi-client matters: F4 brings up the TUI / IDE / web client adapters that share session state, so divergence becomes visibly inconsistent across clients. **Daemon side** (`packages/acp-bridge/src/eventBus.ts`) In `subscribe()`'s replay path, detect ring eviction by comparing the ring's earliest id against `lastEventId + 1`. When a gap exists, force-push a synthetic terminal `state_resync_required` frame BEFORE the surviving replay events: ``` { v: 1, type: 'state_resync_required', data: { reason: 'ring_evicted', lastDeliveredId: N, earliestAvailableId: M } } ``` Per user-confirmed design (issue comment discussion): the frame has NO `id` (mirrors the `client_evicted` synthetic terminal pattern so it doesn't burn a slot in the per-session monotonic sequence). Replay continues after the resync frame — the SDK reducer auto-skips subsequent deltas (see below) but the frames stay on the wire so adapters have the option to compute a "what you missed" diff later. **SDK side** (`packages/sdk-typescript/src/daemon/events.ts`) Adds: - `'state_resync_required'` to `DAEMON_EVENT_TYPES` union - `DaemonStateResyncRequiredData` + `DaemonStateResyncRequiredEvent` - `isStateResyncRequiredData` predicate - `DaemonStreamLifecycleEvent` union widened - Reducer state fields: `awaitingResync: boolean`, `resyncRequiredCount: number`, `lastResyncRequired?` - Reducer case for `state_resync_required` — sets the flag, increments count, records data - **Top-of-reducer gate**: when `awaitingResync === true`, all non- terminal events are auto-skipped (still advance `lastEventId`). Terminal lifecycle events (`session_died` / `session_closed` / `client_evicted` / `stream_error`) STILL apply — critical end-of- stream signals don't depend on prior state being current. - Re-exported `DaemonStateResyncRequiredData` / Event from `daemon/index.ts` and `src/index.ts` (matches surface posture of sibling lifecycle types). Consumer recovery contract: when `state.awaitingResync === true`, call `loadSession` (out of band) to fetch the daemon's canonical session snapshot, then reconstruct view state via `createDaemonSessionViewState({...seed from loaded state})`. The fresh state defaults `awaitingResync: false` so the seed implicitly clears the flag. **Side fix** (`stream_error` errorKind) `DaemonStreamErrorData.errorKind?: string` typed for the optional classification field that Commit 1 (`14637cd79`) added daemon-side. Strictly additive — old daemons omit the field, SDK falls back to rendering `error` text. Verification - `packages/acp-bridge`: 6 files, 108/108 pass (+5 new resync-detection tests; 1 existing "default ring size 8000" test updated to acknowledge the synthetic resync frame at the head of its replay batch). - `packages/sdk-typescript`: 13 files, 451/451 pass (+8 new reducer resync tests covering set/skip/terminal-passthrough/recovery/ repeated-resync/malformed-payload). - TypeScript clean across both packages on touched regions.
📋 Review SummaryThis PR bundles two F4 prerequisites: (1) protocol-level stamping additions for SSE events (server timestamp, error classification, tool provenance), and (2) gap detection for SSE reconnects past the ring eviction point. Both are well-implemented with comprehensive test coverage. The changes are strictly additive and backward-compatible. Overall assessment: strong implementation with excellent documentation, though a few edge cases and naming considerations warrant attention before merge. 🔍 General Feedback
🎯 Specific Feedback🟡 High Priority
🟢 Medium Priority
🔵 Low Priority
✅ Highlights
|
There was a problem hiding this comment.
Pull request overview
This PR completes two daemon protocol prerequisites needed for the F4 client-adapter wave: (1) daemon-side “stamping” of additional metadata (_meta.serverTimestamp, tool-call provenance, and errorKind on stream_error) and (2) SSE replay gap detection that emits a new state_resync_required event when a client resumes past the EventBus ring eviction boundary.
Changes:
- Daemon now stamps
_meta.serverTimestampon SSE frames, stampserrorKindonstream_errorwhen classifiable, and stamps tool-call provenance (builtin/mcp/subagent+ optionalserverId) in tool call events. - EventBus
subscribe()detects ring-eviction gaps on resume and force-pushes a syntheticstate_resync_requiredframe before replay frames. - SDK adds the new
state_resync_requiredevent type and reducer behavior (awaitingResync) to skip non-terminal deltas until the consumer reseeds vialoadSession.
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| packages/acp-bridge/src/eventBus.ts | Emit state_resync_required when Last-Event-ID is behind the ring head due to eviction. |
| packages/acp-bridge/src/eventBus.test.ts | Adds/updates unit tests for resync gap detection and ordering vs replay frames. |
| packages/cli/src/serve/server.ts | Stamps _meta.serverTimestamp at SSE write boundary; stamps errorKind on stream_error. |
| packages/cli/src/serve/server.test.ts | Updates SSE tests to account for _meta.serverTimestamp and validates errorKind stamping. |
| packages/cli/src/acp-integration/session/emitters/ToolCallEmitter.ts | Adds provenance stamping and resolveToolProvenance() helper for tool calls. |
| packages/cli/src/acp-integration/session/emitters/ToolCallEmitter.test.ts | Adds tests for provenance resolution and stamping across emit paths. |
| packages/cli/src/acp-integration/session/HistoryReplayer.test.ts | Updates replay expectations to include provenance in _meta. |
| packages/sdk-typescript/src/daemon/events.ts | Adds event typing + reducer state for state_resync_required and resync gating behavior. |
| packages/sdk-typescript/src/daemon/index.ts | Re-exports the new resync event/data types. |
| packages/sdk-typescript/src/index.ts | Re-exports the new resync event/data types from the SDK root. |
| packages/sdk-typescript/test/unit/daemonEvents.test.ts | Adds reducer tests covering resync state, skip gating, and terminal passthrough. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
…x round 2 fold-in) Adopts Codex review round 2 P2 finding on PR #4360 — fold-in to the F4 prereq scope per user's "a" decision. **Problem**: When the `BridgeFileSystem` adapter (introduced in #4334 fs adapter wiring) throws a structured `FsError` (e.g. `kind: 'untrusted_workspace'` / `kind: 'symlink_escape'` / `kind: 'file_too_large'`), the `@agentclientprotocol/sdk` default RPC error serialization only sends `error.message` as JSON-RPC -32603 "Internal error". The structured `kind` / `status` / `hint` fields on FsError are stripped on the way to the agent. Downstream impact: SDK consumers receiving the ACP error payload lose the typed discriminator and have to regex-match the human- readable message to dispatch UI (auth retry vs file picker vs proxy hint). This silently regresses what the FsError-typed contract was supposed to provide. **Fix**: At the bridge boundary (`BridgeClient.writeTextFile` and `BridgeClient.readTextFile`), catch errors from `this.fileSystem. writeText/readText` calls. Duck-type check for FsError shape (`err.name === 'FsError'` + `typeof err.kind === 'string'`); when matched, rethrow as ACP `RequestError(-32603, message, {errorKind, hint, status})`. The agent's RPC client now receives `data. errorKind` and can branch on the closed-enum kind. Cross-package note: FsError lives in `cli/src/serve/fs/errors.ts` and acp-bridge can't `import { FsError }` from cli (dependency inversion). Same duck-typing pattern that `mapDomainErrorToErrorKind` (status.ts) already applies to `TrustGateError` / `SkillError` for the same cross-package bundling reason — `instanceof` would fail across package boundaries when bundlers don't dedupe. **Code shape** ```typescript function isFsErrorShape(err: unknown): err is FsErrorShape { return ( err instanceof Error && err.name === 'FsError' && typeof (err as { kind?: unknown }).kind === 'string' ); } function preserveFsErrorOverAcp(err: unknown): never { if (isFsErrorShape(err)) { throw new RequestError(-32603, err.message, { errorKind: err.kind, ...(err.hint !== undefined ? { hint: err.hint } : {}), ...(err.status !== undefined ? { status: err.status } : {}), }); } throw err; } ``` Applied at both `if (this.fileSystem) { ... }` blocks (writeTextFile + readTextFile) — wrapped the adapter call in try/catch + `preserveFsErrorOverAcp(err)`. Non-FsError errors are rethrown unchanged (default ACP serialization is fine for unstructured errors; only the structured shape needs preservation). JSON-RPC code stays at -32603 (internal error) rather than mapping FsError.kind → JSON-RPC code. Rationale: the JSON-RPC standard defines only a handful of code values (-32700/-32600/-32601/-32602/ -32603 + a reserved range for application errors), and mapping ~10 FsError kinds to that narrow space is lossy. Instead the structured `data.errorKind` carries the semantic information SDK consumers need; JSON-RPC code remains the generic "an error happened" signal. **Tests** (+5 in `bridgeClient.test.ts`) - writeTextFile FsError → ACP RequestError with errorKind in data - readTextFile FsError preserving symlink_escape kind (no hint field present → not stamped, spread guard works) - non-FsError pass-through (plain Error stays plain Error, no RequestError wrap) - hint field preservation when present - defensive: error with `kind` field but wrong `name` does NOT get wrapped (e.g. PermissionForbiddenError happens to have a kind field internally — must NOT be confused for FsError) Verification: 113/113 acp-bridge tests pass (+5 new FsError- preservation tests). Full serve suite shows pre-existing F3-related failures unrelated to this change (verified in isolation).
…stion) Adopts all 7 review threads from the first wenshao + Copilot review round on PR #4360. All technical fixes (no judgment calls). **[Critical] BridgeTimeoutError constructor blocks tsc** (wenshao PRRT_kwDOPB-92c6DfcRI) `server.test.ts:4670` called `new BridgeTimeoutError('initialize timed out')` but the constructor signature is `(label: string, timeoutMs: number)` — TS2554 blocked `tsc --noEmit` and `npm run build`. Fixed to `new BridgeTimeoutError('initialize', 5000)` per suggested fix; resulting message `"HttpAcpBridge initialize timed out after 5000ms"` still satisfies the existing `.toContain('timed out')` assertion. **[Suggestion] Copilot JSDoc package name** (Copilot PRRT_kwDOPB-92c6De-Sm, ToolCallEmitter.ts:210) JSDoc referenced `@qwen-code/core/mcp-tool` but the actual package is `@qwen-code/qwen-code-core` with the file at `packages/core/src/tools/mcp-tool.ts`. Updated the reference. **[Suggestion] Copilot errorKind type widening** (Copilot PRRT_kwDOPB-92c6De-Ro, events.ts:244) `DaemonStreamErrorData.errorKind` was typed as `string` and the JSDoc said "7-value" closed enum — but `DAEMON_ERROR_KINDS` actually has 8 values, and `SERVE_ERROR_KINDS` (daemon-side) has 9 (adds `stat_failed`). Typed as `DaemonErrorKind | (string & {})` for forward-compat: SDK consumers get IDE autocomplete on the known 8 kinds while still accepting future daemon-side additions (like `stat_failed`) without a type error. Updated JSDoc to accurately list 8 current values + call out the forward-compat widening. Side observation (NOT in scope of this PR): `DAEMON_ERROR_KINDS` (SDK) lacks `stat_failed` that exists in `SERVE_ERROR_KINDS` (daemon). That's a separate drift fix. **[Suggestion] TERMINAL wording misleading** (wenshao PRRT_kwDOPB-92c6Dj-JL, eventBus.ts:369) Comment called `state_resync_required` a "TERMINAL synthetic frame" but it's emitted FIRST (before replay) and the stream stays OPEN. Genuine terminals like `client_evicted` close the stream after the frame. Rewrote the comment per suggestion: "id-less synthetic frame... Unlike `client_evicted`, the stream stays OPEN" — so an oncall reading the source at 3am gets the right mental model. **[Suggestion] `_meta` merge dead code + stale reference** (wenshao PRRT_kwDOPB-92c6Dj-JF, server.ts:2569) The `existingMeta` merge reads `event._meta` at BridgeEvent top level, but ToolCallEmitter's `_meta` lives nested inside `event.data._meta` (publish path goes through `events.publish({type: 'session_update', data: params})`). In production `existingMeta` is always undefined — the merge is a forward-compat escape hatch, not an active merge. Also the comment referenced `extractServerTimestamp` (sdk-typescript) which grep confirms doesn't exist yet (it's planned in chiga0 PR #4353). Rewrote the comment block to (1) acknowledge no current producer sets `_meta` at the top level — it's a forward-compat hook for future envelope-level metadata; (2) drop the stale `extractServerTimestamp` reference and instead note that chiga0 PR #4353 plans the 3-location probe. Code shape unchanged (forward-compat spread stays). **[Suggestion] session_closed + client_evicted passthrough tests** (wenshao PRRT_kwDOPB-92c6Dj-JW, daemonEvents.test.ts:2284) `RESYNC_PASSTHROUGH_TYPES` has 5 members but only `session_died` and `stream_error` had passthrough tests. Added two missing tests: `session_closed` and `client_evicted` while awaitingResync. Critical because if a future refactor accidentally drops either from the set, a consumer in resync limbo would silently swallow the terminal signal and the UI would hang on "loading resync state…". **[Suggestion] readTextFile non-FsError passthrough test** (wenshao PRRT_kwDOPB-92c6Dj-JX, bridgeClient.test.ts:251) The non-FsError pass-through test only covered `writeTextFile`. Added a symmetric `readTextFile` test — the two `try/catch` blocks in `bridgeClient.ts` are independent, so test parity guards against divergent refactors (e.g. someone adding wrapping on one side but not the other). Verification - `packages/acp-bridge`: 6 files, 114/114 pass (+1 new readTextFile non-FsError test). - `packages/sdk-typescript`: 75/75 pass on daemonEvents.test.ts (+2 new session_closed / client_evicted passthrough tests). - `packages/cli/src/serve/server.test.ts`: 248 tests pass on touched cases (5 SSE / serverTimestamp / stream_error tests). Pre-existing F3 (#4335 merge) test failures unrelated to this PR's changes — verified by stash-test-restore on clean tree. - TypeScript clean on touched regions; `BridgeTimeoutError` 2-arg fix unblocks `tsc --noEmit` for the test file.
Adopts all 3 threads from wenshao's second review round on PR #4360. All Suggestion-level — daemon-side observability + 1 missing SDK reducer test. **[Suggestion] SSE ring eviction silently emits state_resync_required** (PRRT_kwDOPB-92c6Dp_Uk, eventBus.ts:394) Pre-fix: when a consumer reconnects past the ring boundary, the daemon emits `state_resync_required` with zero stderr breadcrumb. A 3am oncall chasing "my UI is frozen with stale state" couldn't grep daemon logs to distinguish (a) ring undersized, (b) client reconnecting too slowly, (c) network partition causing repeated reconnects. Fix: detect `next.value.type === 'state_resync_required'` in the SSE handler's iter loop in `server.ts` and emit a `writeStderrLine` with the gap details (`lastEventId`, `earliestInRing`, computed `gap` count, `reason`). Logged at the route boundary rather than inside `EventBus.subscribe` to keep the bus implementation pure + concentrate daemon-side observability in the route handler that already logs socket errors + heartbeats. **[Suggestion] Bridge iterator throw forwarded to client but not logged daemon-side** (PRRT_kwDOPB-92c6Dp_Uo, server.ts:1956) Pre-fix inconsistency: the adjacent `res.on('error', ...)` handler at line ~1925 logs SSE socket errors with `writeStderrLine`, but the bridge-iterator-catch block at line ~1940-1965 sends a `stream_error` SSE frame to the client AND swallows the error daemon-side. When the bridge iterator throws (subprocess crash, channel protocol error, unhandled rejection), distinguishing "subprocess OOM-killed" from "protocol bug" required attaching a debugger. Fix: mirror the adjacent handler's pattern — add `writeStderrLine` before the `stream_error` SSE frame send, including the classified `errorKind` (when available) in brackets so operators can grep for `[init_timeout]` / `[missing_binary]` etc. **[Suggestion] No SDK reducer test verifying stream_error.errorKind flowthrough** (PRRT_kwDOPB-92c6Dp_Uq, daemonEvents.test.ts:2331) The daemon-side wire format is tested in `server.test.ts` (`parsed.data.errorKind === 'init_timeout'`) and `DaemonStreamErrorData` now declares `errorKind?`, but the SDK reducer test suite never fed a `stream_error` event with `errorKind` and asserted `state.streamError?.errorKind`. A future refactor stripping `errorKind` from the reducer's data assignment (e.g. spreading only `{error}`) would silently regress without test signal. Fix: added `captures errorKind on stream_error in view state` test exercising the full pipeline — reducer receives stream_error with errorKind, view state's `streamError.errorKind` matches. Verification - `packages/sdk-typescript`: 76/76 daemonEvents tests pass (+1 new flowthrough test). - `packages/cli/src/serve/server.test.ts`: 6 targeted serverTimestamp / stream_error / errorKind tests pass — server.ts changes are observability-only (no behavior change to wire format). - Pre-existing F3 (#4335 merge) test failures elsewhere are unrelated to this PR's changes.
…verage) Adopts both threads from wenshao's third review round on PR #4360. Both Suggestion-level — pin the daemon-side stderr log artifacts that commit `dce2fed0f` introduced. Pre-fix: the EventBus-level state_resync_required emission was tested in eventBus.test.ts, and the SSE wire shape was tested in server.test.ts, but the actual operator-facing artifacts (the stderr log lines themselves) had no test coverage. A regression swapping operands in the `gap` arithmetic, dropping the sessionId from the log, or breaking the `[errorKind]` suffix would ship silently and only surface when an operator went grepping during an incident. **[Suggestion] SSE ring eviction stderr log untested** (PRRT_kwDOPB-92c6Dqtlb, server.ts:1948) Added 2 tests: - `writes a daemon-side stderr log on SSE ring eviction` — yields a `state_resync_required` frame from a fake bridge, spies on `process.stderr.write`, asserts the captured log contains `session sess-A` + `lastEventId=5` + `earliestInRing=12` + `gap=6 events` (pins the arithmetic) + `reason=ring_evicted` + `loadSession` (the recovery hint). - `falls back to "?" placeholders when state_resync_required data is partial` — yields a frame with empty `data: {}`, asserts every `?? '?'` branch fires (lastEventId=? / earliestInRing=? / gap=? events / reason=?). Defensive against future daemon schema changes that drop one of these fields. **[Suggestion] Bridge iterator error stderr log untested** (PRRT_kwDOPB-92c6Dqtlh, server.ts:1993) Added 2 tests: - `writes a daemon-side stderr log on bridge iterator error` — fake bridge throws plain `Error('agent died')` mid-stream, captures stderr, asserts the log contains `session sess-A` + `agent died`, and **no** `[…]` suffix (plain Error → `mapDomainErrorToErrorKind` returns undefined → no suffix). - `includes [errorKind] suffix in bridge iterator error log when classified` — fake bridge throws `BridgeTimeoutError('initialize', 5000)`, asserts the log contains `[init_timeout]`. Pins the classified-vs-unclassified branch of the conditional suffix template. All 4 tests use `vi.spyOn(process.stderr, 'write').mockReturnValue( true)` + filter `mock.calls` for the relevant log prefix — same pattern as the existing `mcp-client-manager.test.ts` stderr-spy tests in core, plus `startupProfiler.test.ts` in cli. Verification: 7/7 targeted observability tests pass. Pre-existing F3 (#4335 merge) test failures elsewhere are unrelated to this PR's changes.
wenshao
left a comment
There was a problem hiding this comment.
No issues found. LGTM! ✅ — gpt-5.5 via Qwen Code /review
yiliang114
left a comment
There was a problem hiding this comment.
LGTM — clean protocol-level additions with solid backward compat.
Commit 1 (#19 stamping): _meta.serverTimestamp at the SSE write boundary is the right call — keeps BridgeEvent unpolluted. resolveToolProvenance as a static pure function with duck-typed mcp__ parsing and subagent precedence is well-designed; edge cases (malformed prefix, empty server segment) fall through to builtin cleanly.
Commit 2 (#15 gap detection): The earliestInRing > lastEventId + 1 check with a synthetic id-less state_resync_required frame before replay is straightforward and well-tested — positive cases, boundary off-by-one, empty ring, and fresh subscriber are all covered. SDK reducer gating via awaitingResync with RESYNC_PASSTHROUGH_TYPES for terminal lifecycle frames is a solid safety default.
preserveFsErrorOverAcp duck typing approach is consistent with the existing mapDomainErrorToErrorKind pattern for cross-package boundaries. Test matrix covers write + read paths, non-FsError passthrough, and wrong-name guard.
Summary
Two F4 prerequisites bundled into one PR against
daemon_mode_b_main. Both are protocol-level daemon completions that the F4 client-adapter wave needs in place before it can render correctly:Error after selecting authentication method: "Cannot read properties of undefined (reading 'value')" #19 stamping (
14637cd79) — Three daemon-side stamping additions chiga0 called out in proposal(serve): Mode B feature-priority roadmap toward v0.16 production-ready #4175 comment Error after selecting authentication method: "Cannot read properties of undefined (reading 'value')" #19. SDK side (PR feat(sdk/daemon-ui): unified completeness follow-up to #4328 (PR-A through PR-E) #4353) is forward-compat ready; we just need daemon to start emitting these fields. UI affordances are inert without them.运行不了 #15 SSE reducer gap detection (
c1a2f0a78) — Closes the multi-client state divergence bug Ilya0527 raised in proposal(serve): Mode B feature-priority roadmap toward v0.16 production-ready #4175 comment 运行不了 #15. When an SSE consumer reconnects withLast-Event-IDpast the ring eviction point, daemon now emitsstate_resync_requiredso the SDK reducer can refuse to apply stale deltas until consumer recovery.Why bundle: both are small (~80 + ~150 LOC), both touch
eventBus/ SSE envelope area, both are F4 prerequisites. Single review unit is more efficient than two parallel PRs. Each ships as its own commit so review can step through them independently.Commit 1 —
14637cd79(#19 stamping)Files
packages/cli/src/serve/server.ts—formatSseFrame()stamps_meta.serverTimestampat the SSE write boundary. Preserves pre-existing_metakeys (e.g. tool_call's_meta.toolName) via spread merge.packages/cli/src/serve/server.ts—stream_erroremit stampserrorKindviamapDomainErrorToErrorKind(err)when classifiable.undefinedfor unclassified errors → field omitted (strictly additive).packages/cli/src/acp-integration/session/emitters/ToolCallEmitter.ts— new staticresolveToolProvenance(toolName, subagentMeta)→{ provenance: 'builtin' | 'mcp' | 'subagent'; serverId? }. Stamped onemitStartANDemitResultANDemitError(reconnecting clients can re-derive provenance from any update event).Design decisions (confirmed via AskUserQuestion before coding)
serverTimestampstamped atformatSseFrameboundaryBridgeEventtype. SDK reads via_meta.serverTimestamp(Anthropic convention).mcp__<server>__<tool>naming heuristicNOT stamped:
session_diederrorKindThe 3
session_diedemit sites inacp-bridge/bridge.tsdon't have a classifiableerrin scope (channel_closedcarries exitCode/signalCode only;killed+daemon_shutdownare user/operator-initiated, not domain errors). Threading channel-spawn errors through to enableerrorKind: 'init_timeout'/missing_binaryclassification is left for a separate PR to avoid mixing protocol stamping with lifecycle plumbing.Commit 2 —
c1a2f0a78(#15 gap detection)Daemon side (
packages/acp-bridge/src/eventBus.ts)In
subscribe()'s replay path, detect ring eviction by comparingthis.ring[0]?.idagainstlastEventId + 1. Whenearliest > last + 1, force-push a synthetic terminal frame:The frame has NO
id(mirrorsclient_evictedsynthetic terminal pattern). Replay continues after it — adapters get the option to compute a "what you missed" diff later; the SDK reducer auto-skips stale deltas regardless.SDK side (
packages/sdk-typescript/src/daemon/events.ts)Adds the event type to
DAEMON_EVENT_TYPES, the data interface + event type + predicate, widensDaemonStreamLifecycleEvent, adds reducer state fields (awaitingResync/resyncRequiredCount/lastResyncRequired?), and gates non-terminal events at the top of the reducer whenawaitingResync === true. TheRESYNC_PASSTHROUGH_TYPESset documents which event types still apply (terminal lifecycle frames must, so a session that dies during resync limbo is observable).Design decisions (confirmed via AskUserQuestion before coding)
state_resync_requiredawaitingResyncloadSession+createDaemonSessionViewState.Consumer recovery contract
When
state.awaitingResync === true, consumer code callsloadSession(out of band — the SDK doesn't see that as an event), then reconstructs view state viacreateDaemonSessionViewState({...seedFromLoadSession}). The fresh state defaultsawaitingResync: falseso seeding implicitly clears the flag.Test plan
npx vitest run packages/cli/src/serve/server.test.ts— +3 new tests (serverTimestamp + stream_error errorKind)npx vitest run packages/cli/src/acp-integration/session/emitters/ToolCallEmitter.test.ts— 46 tests pass (+11 new: resolveToolProvenance static + provenance stamping on all 3 emit paths)npx vitest run packages/cli/src/acp-integration/session/HistoryReplayer.test.ts— 17 tests pass (provenance flows through replay)cd packages/acp-bridge && npx vitest run— 6 files, 108/108 pass (+5 new resync detection tests)cd packages/sdk-typescript && npx vitest run— 13 files, 451/451 pass (+8 new reducer resync tests)prettier+eslint --max-warnings 0) clean on all 11 modified filesqwen serve+ curl SSE → verify_meta.serverTimestampon every frame; trigger an error → verifyerrorKind; force ring eviction → verifystate_resync_requiredframeBackward compat
_meta.serverTimestamp/provenance/serverId/errorKindfields. New SDKs read them.state_resync_requiredis a NEW event type; daemons predating this PR never emit it. SDK reducer's switch fall-through ignores unknown types (per existingunrecognizedKnownEventCountsemantics).DaemonStreamErrorData.errorKind?is optional — old daemons that don't stamp it remain valid.DaemonSessionViewStatewidened withawaitingResync/resyncRequiredCount/lastResyncRequired?— consumer reseeding viacreateDaemonSessionViewState()(no args) defaultsawaitingResync: false, so adapters that never saw a resync event behave identically.Refs
🤖 Generated with Qwen Code