
docs(spec): Threadline A2A coherence — root cause + layered fix [SPEC FOR REVIEW] #670

Open
JKHeadley wants to merge 5 commits into main from echo/coherence-a2a-spec



@JKHeadley

Owner

Spec for review — approved: false. Nothing builds until you approve.

This specs the fix for the identity-coherence failure we hit on the Echo↔Dawn thread: every inbound Threadline message cold-spawns a memory-less session, so the agent talks to a peer as a crowd of disjoint fragments — it loops, deadlocks on stateful handshakes, and runs invisibly to you.

Root cause (grounded in code)

The real Claude session UUID is never captured back into ThreadResumeMap:

  • spawnNewThread stamps a placeholder UUID (spawnResult.sessionId || crypto.randomUUID()) — never the real one.
  • onSessionEnd (the function whose job is to write the real UUID back) has zero callers.

So get() always fails its jsonlExists(uuid) check → returns null → the router cold-spawns on every message. Verified at runtime: live-injects=0, and the log only ever shows "Spawned session", never "Resumed".
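The failure mode described above can be illustrated with a minimal sketch. This is a simplified model, not the project's code: `ThreadResumeMapSketch`, the in-memory `transcriptsOnDisk` set, and the string IDs are all hypothetical stand-ins, and only the names `get()` and `jsonlExists` come from the description.

```typescript
// Hypothetical, simplified model of the resume lookup; not the project's actual code.
type ResumeEntry = { threadId: string; sessionUuid: string };

class ThreadResumeMapSketch {
  private entries = new Map<string, ResumeEntry>();
  private jsonlExists: (uuid: string) => boolean;

  // `jsonlExists` stands in for the transcript-file check named in the PR.
  constructor(jsonlExists: (uuid: string) => boolean) {
    this.jsonlExists = jsonlExists;
  }

  set(threadId: string, sessionUuid: string): void {
    this.entries.set(threadId, { threadId, sessionUuid });
  }

  // Returns the entry only when a transcript exists for the stored UUID.
  get(threadId: string): ResumeEntry | null {
    const entry = this.entries.get(threadId);
    if (!entry || !this.jsonlExists(entry.sessionUuid)) return null;
    return entry;
  }
}

// Assumption: transcripts on disk are keyed by the real session UUID.
const transcriptsOnDisk = new Set<string>(["real-uuid-1"]);
const resumeMap = new ThreadResumeMapSketch((uuid) => transcriptsOnDisk.has(uuid));

// The spawn path stores a placeholder, so the lookup cannot match a transcript.
resumeMap.set("thread-a", "placeholder-uuid");
const routeBefore = resumeMap.get("thread-a") ? "resume" : "cold-spawn";

// Once the real UUID is written back, the same lookup succeeds.
resumeMap.set("thread-a", "real-uuid-1");
const routeAfter = resumeMap.get("thread-a") ? "resume" : "cold-spawn";
```

In this model the router's decision depends entirely on which UUID was stored, which is why a missing write-back is enough to force a cold spawn on every message.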

The fix (layered, phased)

  • Phase 1 (resolves the break): capture the real UUID → resume continuity (Layer 1) + surface A2A threads to you (Layer 4).
  • Phase 2: warm live-sessions (Layer 2) + let the agent-to-agent "me" share memory with the user-facing "me" (Layer 3).
  • Phase 3: keep the safety brakes — sensitive completions still escalate to you (Layer 5). The context-blind refusal was correct; coherence and the security gates compose, they don't trade off.
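One way to picture the Phase 1 capture step is a record that starts as a placeholder and is promoted when either feed reports the real UUID. The sketch below is an assumption-laden illustration: `recordSpawn`, `captureRealUuid`, the `authoritative` flag, and the "first authoritative value wins" conflict rule are hypothetical and not taken from the spec.

```typescript
// Hypothetical sketch; names, shapes, and the conflict rule are assumptions.
type CaptureSource = "post-spawn-discovery" | "session-end";

interface ResumeRecord {
  sessionUuid: string;
  authoritative: boolean; // false while only a placeholder is known
  source?: CaptureSource;
}

const records = new Map<string, ResumeRecord>();

// Spawn path: remember the thread, but mark the stored ID as non-authoritative.
function recordSpawn(threadId: string, placeholderUuid: string): void {
  records.set(threadId, { sessionUuid: placeholderUuid, authoritative: false });
}

// Either feed may report the real UUID. Returns true when the report is
// accepted or agrees with what is stored, false when it conflicts.
function captureRealUuid(
  threadId: string,
  realUuid: string,
  source: CaptureSource,
): boolean {
  const current = records.get(threadId);
  if (current?.authoritative) return current.sessionUuid === realUuid;
  records.set(threadId, { sessionUuid: realUuid, authoritative: true, source });
  return true;
}

recordSpawn("thread-a", "placeholder-uuid");
const accepted = captureRealUuid("thread-a", "real-uuid-1", "post-spawn-discovery");
const agrees = captureRealUuid("thread-a", "real-uuid-1", "session-end");
const conflicts = captureRealUuid("thread-a", "other-uuid", "session-end");
```

Having both feeds write through one function keeps the second report a harmless no-op when it agrees, and makes a disagreement visible to the caller instead of silently overwriting the stored ID.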

To review

  • Plain-language: docs/specs/THREADLINE-A2A-COHERENCE-ELI16.md
  • Full spec: docs/specs/THREADLINE-A2A-COHERENCE-SPEC.md

4 decisions I need from you (§7)

  1. UUID capture: post-spawn discovery, onSessionEnd wiring, or both (rec: both).
  2. Visibility default: surface all A2A threads to the silent hub topic, or only some?
  3. Warm-session model: per-thread vs per-peer, and TTL.
  4. Phase 1 scope: Layer 1+4 only, or pull the "one me" memory-sharing (Layer 3) forward?
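To make decision 3 concrete, the difference between per-thread and per-peer warm sessions comes down to the lookup key, with the TTL deciding when a warm session is discarded. The following is a minimal sketch under assumed names (`WarmSessionPool`, `warmKey`, `InboundMessage`); none of them are from the spec, and time is passed in explicitly so the behavior is easy to follow.

```typescript
// Hypothetical sketch of a warm-session pool; all names are assumptions.
type KeyMode = "per-thread" | "per-peer";

interface InboundMessage {
  peerId: string;
  threadId: string;
}

interface WarmSession {
  sessionId: string;
  lastUsedMs: number;
}

// Per-thread keeps one session per conversation; per-peer shares one per agent.
function warmKey(mode: KeyMode, msg: InboundMessage): string {
  return mode === "per-thread" ? `${msg.peerId}/${msg.threadId}` : msg.peerId;
}

class WarmSessionPool {
  private sessions = new Map<string, WarmSession>();
  private mode: KeyMode;
  private ttlMs: number;

  constructor(mode: KeyMode, ttlMs: number) {
    this.mode = mode;
    this.ttlMs = ttlMs;
  }

  register(msg: InboundMessage, sessionId: string, nowMs: number): void {
    this.sessions.set(warmKey(this.mode, msg), { sessionId, lastUsedMs: nowMs });
  }

  // Returns a live session, or null if none exists or it has been idle past the TTL.
  lookup(msg: InboundMessage, nowMs: number): WarmSession | null {
    const key = warmKey(this.mode, msg);
    const session = this.sessions.get(key);
    if (!session) return null;
    if (nowMs - session.lastUsedMs > this.ttlMs) {
      this.sessions.delete(key);
      return null;
    }
    session.lastUsedMs = nowMs; // each use extends the idle window
    return session;
  }
}

const perPeer = new WarmSessionPool("per-peer", 1000);
perPeer.register({ peerId: "dawn", threadId: "t1" }, "s1", 0);
const sharedHit = perPeer.lookup({ peerId: "dawn", threadId: "t2" }, 500);

const perThread = new WarmSessionPool("per-thread", 1000);
perThread.register({ peerId: "dawn", threadId: "t1" }, "s1", 0);
const isolatedMiss = perThread.lookup({ peerId: "dawn", threadId: "t2" }, 500);

const expired = perPeer.lookup({ peerId: "dawn", threadId: "t1" }, 1600);
```

Per-peer keying gives a peer one continuous counterpart across threads at the cost of mixing thread contexts; per-thread keying isolates contexts but holds more sessions warm.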

🤖 Generated with Claude Code

Diagnoses the identity-coherence failure where every inbound Threadline
message cold-spawns a memory-less session (context-blind fragments that
loop, deadlock on stateful handshakes, and run invisibly to the operator).

Root cause grounded in code: the real Claude session UUID is never captured
back into ThreadResumeMap (spawnNewThread stamps a placeholder; onSessionEnd
has zero callers), so get() always fails jsonlExists and the router
cold-spawns every message. Layered fix (Phase 1 = resume continuity +
visibility; Phase 2 = warm sessions + identity/memory coherence; Phase 3 =
decision guardrails). Awaiting Justin's approval (approved: false).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@vercel

vercel Bot commented Jun 2, 2026


The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| instar | Ready | Preview, Comment | Jun 2, 2026 4:40am |


Follow the eli16-overview-check.mjs convention so the commit-time gate and
the new publish-spec-review.mjs resolve the ELI16 overview.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…or review

- Add §2 'what EXISTS vs what is MISSING' from a code audit: topic-linkage
  (TopicLinkageHandler, approved spec) and message-mirroring (TelegramBridge,
  default-off) are already built — do not rebuild.
- Reframe Layer 1 as wiring the dead UUID feed (onSessionEnd has zero callers),
  not a greenfield continuity build.
- Redesign Layer 4 visibility as a STANDBY-STYLE conversational check-in
  (built on the PresenceProxy pattern), not raw message mirroring.
- Add Layer 5 (cold-inbound topic linkage — audit gap) and Layer 6
  (dual-conversation awareness + user-interruption/steering — operator
  requirement, genuinely absent today).
- Update phasing, decisions, non-goals + the ELI16 companion.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ound

Ran the convergence panel (security, adversarial, scalability, integration,
lessons-aware). Round 1 returned SERIOUS ISSUES (4 critical + multiple high,
convergent); v3 incorporates every material finding; the convergence check
verified all resolved with no new material issue → review-convergence: converged
(approved: false — convergence != approval).

Key hardening: authoritative-claudeSessionId continuity (mtime race fixed);
Layer 7 sensitive-completion floor reordered to a Phase-1 prerequisite (continuity
removes the accidental safeguard); Layer 6 provenance labeling + loop-gate;
Layer 3 read-only one-directional memory; Layer 4 redaction + LlmQueue +
Near-Silent default + default-off; multi-machine lease-gated writes; migration/
awareness/backup/dark-ship; lessons-engaged frontmatter. + convergence report.

cross-model: unavailable (no built dist in spec worktree — re-run before approval).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…aker heartbeat

Operator (Justin) approved 'yes to all' 5 decisions + a refinement: a
silence-breaker heartbeat (every 5-10 min while an a2a conversation is active
AND nothing has surfaced to the user) — folded into Layer 4 as the a2a analog
of the PresenceProxy standby heartbeat. approved: true; decisions §10 marked
resolved. Phase 1 = Layer 1 (authoritative-UUID continuity) + Layer 7
sensitive-completion floor (prerequisite) + Layer 4 (default-off).
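The silence-breaker condition described in this commit can be read as a small predicate. The sketch below is one interpretation, not the spec's implementation: `A2aActivity`, `shouldEmitHeartbeat`, and the field names are hypothetical, the interval uses the lower bound of the stated 5-10 minute range, and "nothing has surfaced to the user" is taken to mean nothing surfaced within the last interval.

```typescript
// Hypothetical sketch of the silence-breaker check; names are assumptions.
interface A2aActivity {
  conversationActive: boolean;
  conversationStartedMs: number;
  lastSurfacedToUserMs: number | null; // null = nothing surfaced yet
  lastHeartbeatMs: number | null; // null = no heartbeat sent yet
}

// Assumption: use the lower bound of the 5-10 minute range from the commit.
const HEARTBEAT_INTERVAL_MS = 5 * 60 * 1000;

function shouldEmitHeartbeat(
  activity: A2aActivity,
  nowMs: number,
  intervalMs: number = HEARTBEAT_INTERVAL_MS,
): boolean {
  // No heartbeat once the a2a conversation is idle.
  if (!activity.conversationActive) return false;
  // The most recent moment the user could have seen anything from this thread.
  const lastVisibleMs = Math.max(
    activity.conversationStartedMs,
    activity.lastSurfacedToUserMs ?? 0,
    activity.lastHeartbeatMs ?? 0,
  );
  return nowMs - lastVisibleMs >= intervalMs;
}

const MINUTE_MS = 60 * 1000;
const quiet: A2aActivity = {
  conversationActive: true,
  conversationStartedMs: 0,
  lastSurfacedToUserMs: null,
  lastHeartbeatMs: null,
};

const tooEarly = shouldEmitHeartbeat(quiet, 4 * MINUTE_MS);
const due = shouldEmitHeartbeat(quiet, 5 * MINUTE_MS);
const recentlySurfaced = shouldEmitHeartbeat(
  { ...quiet, lastSurfacedToUserMs: 4 * MINUTE_MS },
  6 * MINUTE_MS,
);
const inactive = shouldEmitHeartbeat({ ...quiet, conversationActive: false }, 30 * MINUTE_MS);
```

Treating a previous heartbeat as a visible event keeps the check from firing on every tick once the interval has elapsed.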

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>