JKHeadley · JKHeadley · May 29, 2026
diff --git a/docs/specs/cross-session-coordination.eli16.md b/docs/specs/cross-session-coordination.eli16.md
@@ -0,0 +1,75 @@
+# Cross-Session Coordination — plain-English overview
+
+## What this is, in one line
+
+When more than one copy of me is running at the same time on the same machine, this
+gives them a way to *see each other* before they each do something big — so they
+stop stepping on each other's toes.
+
+## Why we need it (the real story)
+
+I can have several sessions running at once against the same set of files (my
+`.instar/` folder). Think of it like several cooks in one kitchen, all following the
+same recipe, none of them aware the others are there. Two things actually went wrong
+on 2026-05-28:
+
+1. **The "still working" ghost.** One session finished a job but left a sticky note
+   saying "still cooking: yes." A second session read that stale note and kept telling
+   the user "I'm still working on it" — even though nothing was happening.
+2. **The double-reaction (the bad one).** Both sessions noticed the same bug. One
+   calmly built the proper fix. The *other* panicked and hit the emergency brake —
+   flipped a feature off and cancelled 19 pending tasks. Neither knew the other was on
+   it. Result: the bug got fixed, but the engine was left off and the test setup was
+   wiped. Two correct instincts that, uncoordinated, undid each other.
+
+The root cause is the same both times: **shared files, no traffic light.** Nothing
+tells one session "hey, another you is already acting — wait a second."
+
+## What we are building (and just as importantly, what we're NOT)
+
+Justin chose the **light** option from a menu of light / medium / heavy. So this is
+the gentle version, on purpose:
+
+- **What it does:** Before a session does something high-impact (flip a feature flag,
+  withdraw commitments), it can post a quick note — "I'm about to do X." Whenever any
+  session takes such an action, the system hands back a little advisory warning if
+  *another* session was just active: "⚠ another session withdrew 19 commitments 4
+  minutes ago — confirm before proceeding."
+- **What it does NOT do:** It never blocks anything. There are no hard locks. It never
+  changes the thing you were trying to change. It's a heads-up, not a gate. If a
+  session ignores the warning, the action still goes through.
+
+That matches exactly what Justin asked for: "start small and learn to collaborate
+slowly and smoothly," with cheap course-correction.
+
+## How you'd actually see it
+
+- A session announces intent: `POST /coordination/intent`.
+- A session inspects what's been happening: `GET /coordination/recent`.
+- The two risky actions from the incident — config-flag flips and commitment
+  withdrawals — automatically record themselves and carry the warning back in their
+  own response, so it works even if a session forgets to announce.
+- It's quiet: no Telegram pings. Everything is logged to
+  `logs/cross-session-events.jsonl` for later reading.
+
+## What already exists vs. what's new
+
+- **Already existed:** Single-commitment write safety (`CommitmentTracker.mutate()`
+  uses a compare-and-swap so two sessions can't tear one record). That protects a
+  single write — it does *not* stop two sessions adopting *opposite plans*.
+- **New here:** the cross-session visibility layer — the shared scratchpad + the
+  advisory warning that surfaces an opposing action before you commit to yours.
+
+## What's deliberately left for later
+
+- The stale "still working" ghost (incident #1) is a *separate* liveness bug, not a
+  coordination problem. It's noted as a candidate next step, not bundled in here.
+- Hard locks / leader election / one-session-in-charge — that's the heavy redesign
+  Justin chose *not* to do for now.
+
+## What the reader needs to decide
+
+Really just one thing: is "light and advisory" the right first step, or do you want
+something stronger (real locks) sooner? Justin already said light-first. If watching
+it in practice shows the advisory isn't enough, the medium/heavy options are still on
+the table — nothing here paints us into a corner.
diff --git a/docs/specs/cross-session-coordination.md b/docs/specs/cross-session-coordination.md
@@ -0,0 +1,129 @@
+---
+title: Cross-Session Coordination Signal (light, advisory)
+status: approved
+approved: true
+approver: justin
+approved-at: "2026-05-28T19:00:00Z"
+review-convergence: "2026-05-28T19:00:00Z"
+review-iterations: 3
+review-completed-at: "2026-05-28T19:00:00Z"
+review-report: "docs/specs/reports/cross-session-coordination-convergence.md"
+created: 2026-05-28
+owner: echo
+eli16-overview: cross-session-coordination.eli16.md
+---
+
+# Cross-Session Coordination Signal (light, advisory)
+
+> Approved direction: Justin, 2026-05-28 topic 15579 — "go with a light fix for now
+> and see if it helps … learn to collaborate slowly and smoothly." This is the
+> written form of the build Justin chose from the light/medium/heavy menu and saw
+> described before approving. Convergence review:
+> docs/specs/reports/cross-session-coordination-convergence.md.
+
+## The problem (two observed incidents, 2026-05-28)
+
+A single agent home can have **multiple concurrent Claude Code sessions** acting on
+the same `.instar/` state. They are blind to each other. Two real incidents:
+
+1. **Stale `active:true` ghost** — an autonomous-state file said `active:true` after
+   the work had completed; a *different* session read it as "still in flight" and
+   narrated "working." (Recovered easily. Treated as a SEPARATE staleness/liveness
+   bug — out of scope for this signal; see "Out of scope".)
+2. **Opposing durable actions (the damaging one)** — one session built the proper fix
+   (PR 495); in parallel another session hit a "safety brake": flipped a config flag
+   to `false` AND mass-withdrew ~19 active commitments. Both reached a correct local
+   diagnosis; neither knew the other was acting. Net: bugs fixed, engine off, test
+   bed gone.
+
+Root cause: shared mutable state (`.instar/config.json`, commitments, autonomous
+state) with **no cross-session visibility before a high-impact action**.
+`CommitmentTracker.mutate()` has single-writer CAS *per commitment* — that prevents
+torn writes, not two sessions adopting opposing policies and each acting durably.
+
+## The light fix (this build)
+
+A **CrossSessionCoordinator**: a shared, append-only scratchpad of recent
+high-impact "structural actions" + voluntary "I'm about to do X" intents. Any
+structural action surfaces *other recent* entries to the actor as an advisory
+`coordinationWarning`. **It never blocks** (light = advisory). It never mutates the
+target state. It is the "visible signal sessions check before acting" Justin asked for.
+
+### What gets recorded
+- **Voluntary intent** (`POST /coordination/intent`) — a session announces what it is
+  about to do / is doing ("building PR 495 fix for the redrive flood"). This is the
+  primary "I'm about to do X" surface.
+- **config-flag flip** — auto-recorded at the `PATCH /config` route for sensitive
+  keys (feature on/off toggles). Backstop; no agent reliance.
+- **commitment-withdraw** — auto-recorded in the `POST /commitments/:id/withdraw`
+  route handler after a successful withdrawal. All agent-initiated withdrawals are
+  route-driven (the route is the single agent-facing path), so recording there is
+  the single source — no double-count — and it lets the warning ride back on the
+  same HTTP response. (Code-internal lifecycle transitions like expiry are not
+  withdrawals and are deliberately not recorded.)
+
+### The signal
+On every recorded action, the coordinator computes `concurrent`: other non-expired
+records within `windowMs` (default 10 min) authored by a *different or unknown* actor.
+If non-empty, the acting route attaches a `coordinationWarning` to its HTTP response
+("⚠ N recent structural action(s) by another/unknown session in the last Xm: … —
+confirm before proceeding"). The session (Claude) reads that warning inline and can
+reconsider, re-check, or tell the user. `GET /coordination/recent` exposes the ledger
+for explicit pre-action checks and inspection.
+
+### Storage + audit
+- Ledger: `<stateDir>/state/cross-session-actions.json` — `{ version, actions: [...] }`,
+  capped (200), TTL-pruned on write (`retentionMs`, default 60 min). Atomic temp+rename,
+  reload-per-op (cross-process safe, mirrors ConversationStore).
+- Audit: append each record to `<stateDir>/logs/cross-session-events.jsonl`.
+
+### Config (default-ON housekeeping; matches the silently-stopped sentinels)
+```
+monitoring.crossSessionCoordination: {
+  enabled: true,        // records + warns. false => passive (GET still 200, no records/warnings)
+  windowMs: 600000,     // concurrency window (10 min)
+  retentionMs: 3600000, // ledger TTL (60 min)
+  sensitiveConfigKeys: ["monitoring", "tunnel", "autonomousSessions", "lifeline", "updates"]
+}
+```
+No Telegram escalation in v1 (deliberate — near-silent; in-response warning + GET +
+JSONL is full observability without topic-spam risk). A buzz-on-conflict toggle is a
+small future addition if it proves wanted <!-- tracked: topic-15579 -->.
+
+## Wiring
+- `CrossSessionCoordinator` constructed inside `AgentServer` (like FrameworkIssueLedger /
+  TokenLedger — always alive, so `GET /coordination/recent` returns 200 not 503),
+  in its own try/catch so a failure never cascades into the other monitors.
+- Added to `RouteContext` as `crossSessionCoordinator`. (NOTE: distinct from the
+  multi-MACHINE `coordinator` already on ctx.)
+- Actor resolution (`coordinationActor`) is best-effort and SESSION-level only
+  (`X-Instar-Session` / `X-Instar-Actor` header, or body `actor`) — never the agent
+  id, which all sessions of one agent share and would wrongly suppress the warning.
+- `PATCH /config`: record sensitive-key flips, attach `coordinationWarning`.
+- `POST /commitments/:id/withdraw`: after success, attach `coordinationWarning`.
+- Routes: `POST /coordination/intent`, `GET /coordination/recent`.
+
+## Migration parity
+- Config defaults: add `monitoring.crossSessionCoordination` to `ConfigDefaults.ts`
+  `SHARED_DEFAULTS` → auto-applied to existing agents via `applyDefaults` in
+  `PostUpdateMigrator.migrateConfig`. No bespoke migration needed.
+- CLAUDE.md template: add a Cross-Session Coordination awareness section to
+  `generateClaudeMd()` + `migrateClaudeMd()` content-sniff guard.
+
+## Tests (all three tiers — NON-NEGOTIABLE)
+- **Unit** (`tests/unit/cross-session-coordinator.test.ts`): record/prune/window,
+  concurrent detection (different vs same vs unknown actor), cap, atomic persistence,
+  reload-per-op, disabled mode.
+- **Integration** (`tests/integration/cross-session-coordination-routes.test.ts`):
+  POST /coordination/intent → GET /coordination/recent; PATCH /config sensitive key
+  returns `coordinationWarning` when a concurrent intent exists; withdraw warning;
+  auth required.
+- **E2E** (`tests/e2e/cross-session-coordination-lifecycle.test.ts`): boot real
+  AgentServer (production path), GET /coordination/recent returns 200 (alive), a
+  recorded action surfaces end-to-end, capability discoverable.
+
+## Out of scope (separable, flagged to Justin)
+- Incident #1 stale-`active:true` cleanup/liveness guard — a distinct staleness bug,
+  not a coordination signal. Candidate next step.
+- Hard locks / leader election / session registry — explicitly the *heavy* path
+  Justin declined for now.
diff --git a/docs/specs/reports/cross-session-coordination-convergence.md b/docs/specs/reports/cross-session-coordination-convergence.md
@@ -0,0 +1,77 @@
+# Convergence Report — Cross-Session Coordination Signal (light, advisory)
+
+- Spec: `docs/specs/cross-session-coordination.md`
+- Owner: echo
+- Approved direction: Justin, topic 15579, 2026-05-28 ("go with a light fix for now …
+  learn to collaborate slowly and smoothly"), chosen from an explicit
+  light / medium / heavy menu, with the build described before approval.
+- Iterations: 3 substantive design-review passes (below).
+
+## Scope under review
+
+A LIGHT, advisory cross-session signal: a shared append-only scratchpad of recent
+high-impact structural actions + voluntary "I'm about to do X" intents, surfaced to
+an acting session as an in-response `coordinationWarning`. Never blocks, never mutates
+target state. Explicitly NOT hard locks / leader election (the heavy path Justin
+declined for now).
+
+## Iteration 1 — initial design
+
+Settled the storage model (atomic temp+rename JSON ledger, reload-per-op for
+cross-process safety, TTL prune + hard cap), the action taxonomy
+(`intent` / `config-flag` / `commitment-withdraw` / `other`), and the advisory-only
+contract (record + warn, never block, never touch the target). Decided default-ON
+housekeeping with no Telegram escalation in v1 — the in-response warning + `GET
+/coordination/recent` + JSONL audit give full observability without re-introducing
+the topic-spam risk that the attention-flood work just closed.
+
+## Iteration 2 — implementation review (caught real issues)
+
+1. **Double-count reconciliation (commitment-withdraw recording site).** The first
+   design proposed subscribing to `CommitmentTracker`'s `withdrawn` event AND recording
+   in the withdraw route — which double-counts. Resolved to a single source: record in
+   the `POST /commitments/:id/withdraw` route handler. Rationale: all *agent-initiated*
+   withdrawals are route-driven (the route is the single agent-facing path), so this is
+   the single source, gives no double-count, and lets the warning ride back on the same
+   HTTP response. Code-internal lifecycle transitions (expiry) are not withdrawals and
+   are deliberately not recorded. Spec updated to match.
+
+2. **Actor-resolution correctness.** Decided `coordinationActor()` must resolve a
+   SESSION-level discriminator only (`X-Instar-Session` / `X-Instar-Actor` header, or
+   body `actor`) and must NEVER fall back to the agent id — every session of one agent
+   shares the agent id, so using it would wrongly suppress the very cross-session
+   warning we want. Unknown actor is treated as "potentially a different session" so the
+   signal errs toward surfacing.
+
+3. **Intent-dedup bug (found via the integration test, fixed).** The action-identity
+   used for dedupe was `kind + target + value`. Two *different* intents both have no
+   target/value, so they collapsed into "the same action" and the warning was wrongly
+   suppressed. Fixed: intents are events, not states — they are never deduped. Dedupe
+   now applies only to state-flip kinds (config-flag / commitment-withdraw / other),
+   where two sessions writing the identical kind+target+value genuinely are one write. A
+   regression unit test locks the boundary.
+
+## Iteration 3 — testing-integrity + standards pass
+
+- All three tiers built and green: unit (16), integration (8, exercising both real
+  incident vectors — config-flip + withdraw over live HTTP), e2e lifecycle (6, booting
+  the real AgentServer → feature alive, 200 not 503), migration-parity (5).
+- Migration parity: config default in `ConfigDefaults.SHARED_DEFAULTS`
+  (applyDefaults propagates to existing agents), `migrateClaudeMd` awareness section +
+  `generateClaudeMd` template, CapabilityIndex entry. Unit-tested.
+- No-silent-fallbacks: the coordinator's advisory catch blocks carry
+  `@silent-fallback-ok` markers (persistence failure must never break a calling route);
+  verified the file contributes zero to the lint count.
+
+## Open / deferred
+
+- Incident #1 (stale `active:true` liveness ghost) — a distinct staleness bug, not a
+  coordination signal; candidate next step <!-- tracked: topic-15579 -->.
+- Buzz-on-conflict Telegram toggle — small future addition if it proves wanted
+  <!-- tracked: topic-15579 -->.
+
+## Convergence
+
+No open design contradictions. The advisory-only contract, single-source recording,
+session-level actor resolution, and intent-distinctness are settled and test-locked.
+Converged.
diff --git a/site/src/content/docs/architecture/under-the-hood.md b/site/src/content/docs/architecture/under-the-hood.md
@@ -58,6 +58,9 @@ The proactive eye. Polls every 60 seconds to classify each session as healthy, i
 
 **How they connect:** SessionMonitor detects the problem → SessionRecovery tries a fast fix → if that doesn't work, TriageOrchestrator runs heuristics → if those don't match, it spawns an LLM diagnosis. Meanwhile, SessionWatchdog independently catches stuck commands at the process level.
 
+### CrossSessionCoordinator
+A light, advisory signal for when more than one session runs against the same agent home at once — they otherwise share `.instar/` state while blind to each other. CrossSessionCoordinator keeps a shared, append-only scratchpad of recent high-impact actions (feature-flag flips, commitment withdrawals) plus voluntary "I'm about to do X" intents. When a session takes a structural action, it surfaces any recent action by a *different* session as an advisory `coordinationWarning` ("⚠ another session withdrew 19 commitments 4m ago — confirm first"). It never blocks and never mutates the target state — it's a heads-up, not a lock. Default-on, near-silent (no Telegram); audit trail at `logs/cross-session-events.jsonl`.
+
 </details>
 
 ---

diff --git a/site/src/content/docs/reference/api.md b/site/src/content/docs/reference/api.md
@@ -123,6 +123,17 @@ The Instar server exposes a REST API on `localhost:4040` (configurable). All end
 | GET | `/messages/outbox` | Inter-agent outbox |
 | GET | `/messages/dead-letter` | Dead letter queue |
 
+## Cross-Session Coordination
+
+Backed by the `CrossSessionCoordinator` — a light, advisory signal so concurrent sessions on one agent home see each other's recent high-impact actions before acting. Never blocks; surfaces a `coordinationWarning`. Pass `X-Instar-Session: <topic-or-label>` so the signal can tell sessions apart.
+
+| Method | Path | Description |
+|--------|------|-------------|
+| POST | `/coordination/intent` | Announce "I'm about to do X" so other sessions see it before acting |
+| GET | `/coordination/recent` | Recent structural actions + intents (newest first) |
+
+Sensitive `PATCH /config` flips and `POST /commitments/:id/withdraw` calls auto-record and attach a `coordinationWarning` to their own response when another session was recently active. Config: `monitoring.crossSessionCoordination`. Audit: `logs/cross-session-events.jsonl`.
+
 ## Threadline (MCP Tools)
 
 These tools are registered as an MCP server and called by Claude Code (or any MCP client) via stdio transport. They are registered automatically on server boot.

diff --git a/src/config/ConfigDefaults.ts b/src/config/ConfigDefaults.ts
@@ -56,6 +56,21 @@ const SHARED_DEFAULTS: Record<string, unknown> = {
     contextWedgeSentinel: {
       enabled: true,
     },
+    // CrossSessionCoordinator — light, advisory cross-session coordination signal.
+    // Default-ON housekeeping (like the silently-stopped sentinels): it records
+    // high-impact structural actions + voluntary "I'm about to do X" intents and
+    // surfaces concurrent activity by another session as an in-response advisory
+    // warning. It NEVER blocks and never mutates target state. enabled:false =>
+    // passive (GET stays 200; nothing recorded, no warnings). No Telegram
+    // escalation in v1 (near-silent by design). sensitiveConfigKeys are the
+    // top-level prefixes whose PATCH /config flips get auto-recorded.
+    // See docs/specs/cross-session-coordination.md.
+    crossSessionCoordination: {
+      enabled: true,
+      windowMs: 600000,
+      retentionMs: 3600000,
+      sensitiveConfigKeys: ['monitoring', 'tunnel', 'autonomousSessions', 'lifeline', 'updates'],
+    },
     // SessionReaper — pressure-aware reaper of idle-but-alive sessions.
     // UNLIKE the sentinels above, default OFF + dry-run: it is the only monitor
     // that *kills* sessions on a heuristic, so it ships dark and must be flipped