diff --git a/docs/STANDARDS-REGISTRY.md b/docs/STANDARDS-REGISTRY.md index 26d734b92..5033549ca 100644 --- a/docs/STANDARDS-REGISTRY.md +++ b/docs/STANDARDS-REGISTRY.md @@ -433,6 +433,20 @@ The Root says *enforce behavior in structure, not willpower.* This family is **w ## Interaction — the agent's surface to the user and the world +### The User Experience Is the Product — Reachability, Responsiveness, and Coherence Are Sacred +**Rule.** The user's ability to **reach** a live agent, **be heard**, and get a **timely, coherent response** is a first-class invariant that **outranks internal caution when the two conflict**. No internal guard, safety net, resource limit, or self-continuity discipline may *silently* degrade the user's channel. When a guard cannot do its job, it must **fail toward the user being served** — preserving safety by a different, cheaper, deterministic mechanism — and surface the degradation **loudly**. A guard may never be the single thing standing between the user and *any* response at all. The whole-experience invariant must have an **owner** (a UX-liveness watchdog), a **metric** (reach-rate and time-to-first-response), and a **hard rule**: when internal safety and the user's reach conflict, *the user wins*. +**In practice.** This is the outward-facing complement to *Structure beats Willpower*: it says the structure must guard not only "the agent never does the wrong thing" but "the user reliably gets through." It is the umbrella over seven sub-standards, each the structural teeth for one failure mode, several of which already exist as their own articles above: + 1. **State Convergence** — every declarative desired-state the system records (a placement pin, a lease) has an owning reconciler that drives actual→desired within a bounded time or escalates loudly. Declarative intent with no controller is a wish. *(Failure 1 — the cross-machine move that pinned but never actuated.)* + 2. **Enforced Termination** — a time/iteration/resource budget is a **structural hard stop enforced by a watchdog OUTSIDE the run**, never the run's own willpower. This is *Structure beats Willpower* applied to the *end* of work, which we had only ever applied to its *middle* — closing the dangerous asymmetry where every pressure says "don't stop early" and none says "stop on time." *(Failure 2 — a 24h run that reached 46h.)* + 3. **Inbound Delivery Is Sacred** — every inbound user message reaches a live session within a bounded time OR raises a loud failure through a channel proven deliverable; never a silent expiry. Corollary: **a partially-built safety mechanism must fail OPEN (pass through to the working path), never CLOSED (capture-and-drop)** — a half-finished net that intercepts then drops is more dangerous than no net. Sibling of *The Operator Channel Is Sacred* (which governs the inbound consume/pause gate); this governs the inbound holding-queue. *(Failure 3 — the durable queue that ate "why aren't you responding?")* + 4. **Guards Degrade, Not Outage** — a safety check on the user-facing path must (i) have a fallback engine/path and (ii) distinguish *content-unsafe* (fail closed, hold) from *check-unavailable* (fail toward the user, with safety preserved by a cheaper deterministic check). The guard may never convert its own infra failure into the user's silence. The direct extension of *The Operator Channel Is Sacred* and *No Silent Degradation to Brittle Fallback* to the OUTBOUND tone gate. *(Failure 4 — the tone gate that held every reply when its single engine stalled.)* + 5. **User-Facing Priority Lane** — when a scarce resource (subprocess slots, LLM quota, CPU) is contended, work on the live user path **preempts** background/housekeeping work; background watchers are load-shed first, never the user's reply. The complement to *Bounded Blast Radius* (which caps the spawn floor): the floor must not treat a user's reply and a coherence-sweep as equal citizens. *(Failure 5 — the fork-bomb floor that starved the user's channel.)* + 6. **Degradation Is an Event, Not a Footnote** — a coordination layer running in a degraded/fallback mode (a stalled lease tick, a framework gone unavailable) must surface that state where it can reach the user, not re-arm quietly in a log. *(Failure 6 — the silently re-armed lease that made the ownership confusion easy to trigger.)* + 7. **Blast-Radius Before, Verify After** — any mutating action on shared infrastructure (sessions, routing, ownership) must (i) prefer the reversible option and (ii) verify the user-facing invariants still hold afterward. *Capability ≠ authority* exists; this adds *capability ≠ wisdom — check what you're about to break, and confirm the user can still reach you after.* The behavioral kin of *Live-User-Channel Proof Before Done*, applied to an in-flight infra mutation rather than a feature ship. *(Failure 7 — the force-kill that black-holed inbound messages.)* +**Earned from.** 2026-06-25 (incident window ~17:48–19:25 PDT): a runaway autonomous run overloaded the mesh, and then **the safety mechanisms themselves became the outage** — the inbound queue ate the user's messages, the tone gate blocked every reply, the fork-bomb floor starved the user's channel, and the "don't stop early" discipline (having no opposite) sheltered the runaway. Laid side by side, the seven failures reveal one blindspot: **every guard Instar had built pointed *inward* (what the agent emits, its sessions staying alive, it not quitting, it working on the right project, the machine not melting down); not one guarded the thing the user actually experiences — "can I reach a live agent, will it hear me, will it answer in reasonable time?"** Each component optimized its *local* safety, fail-closing toward "the agent does nothing wrong"; summed, they produced an agent that was internally pristine and **externally unreachable**. No standard said that when internal caution and the user's reach conflict, the user wins — so in every conflict, the user lost, quietly. The operator: "identify the root Blindspot and Meta issue that allowed this … what constitutional standard are we missing that would have [prevented it] in the first place." Full analysis: `docs/incidents/2026-06-25-user-reachability-postmortem.md`. +**Traces to the goal.** A coherent, self-evolving agent that the user cannot reliably reach, be heard by, or get a coherent answer from is not coherent *to the only observer that matters* — it is a fortress that is perfectly safe and perfectly silent. Robustness defined only as "the agent never does the wrong thing" is half a definition; the other half is "the user reliably gets through." The genesis taught that for an agent, structure is the only carrier of self across discontinuity — and the user's channel is the one structure whose failure makes every other coherence invisible. +**Applied through.** Sub-standard #4 (*Guards Degrade, Not Outage*) shipped its first structural teeth on 2026-06-26 — the outbound `MessagingToneGate` now degrades to an in-process deterministic leak floor (clean SENDS, leak HOLDS) on an LLM-backend outage instead of holding every reply, covering both the fast provider-throw and the slow route-budget timeout (PRs #1276 fast-engine-restored, #1277 per-framework breaker isolation, #1279 graceful-degrade; spec `docs/specs/tone-gate-graceful-degradation.md`). The remaining sub-standards (#1, #2, #3, #5, #6, #7) are tracked as the F-series fixes from the postmortem; per *How a new standard joins*, this article is the operator-ratifiable proposal and the honest test is that each tooth is *real*, not that listing it here makes it so. + ### No Manual Work (user *or* agent) **Rule.** Capturing context and taking available actions must be automatic. Don't make the user remember Instar's features, and don't rely on the agent remembering to use its own tools. **In practice.** No "remember to log it" or "remember to run X" step survives into a design — for anyone. If a behavior depends on someone remembering, it isn't built yet. All user interaction goes through channels; the agent never asks the user to go edit a file.