14 changes: 14 additions & 0 deletions docs/STANDARDS-REGISTRY.md
@@ -433,6 +433,20 @@ The Root says *enforce behavior in structure, not willpower.* This family is **w

## Interaction — the agent's surface to the user and the world

### The User Experience Is the Product — Reachability, Responsiveness, and Coherence Are Sacred
**Rule.** The user's ability to **reach** a live agent, **be heard**, and get a **timely, coherent response** is a first-class invariant that **outranks internal caution when the two conflict**. No internal guard, safety net, resource limit, or self-continuity discipline may *silently* degrade the user's channel. When a guard cannot do its job, it must **fail toward the user being served** — preserving safety by a different, cheaper, deterministic mechanism — and surface the degradation **loudly**. A guard may never be the single thing standing between the user and *any* response at all. The whole-experience invariant must have an **owner** (a UX-liveness watchdog), a **metric** (reach-rate and time-to-first-response), and a **hard rule**: when internal safety and the user's reach conflict, *the user wins*.
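
As an illustration only, the two metrics the Rule names (reach-rate and time-to-first-response) could be computed from per-message timestamps roughly as below. The record shape `InboundRecord`, the function names, and the choice to count an unanswered message as still waiting up to `now` are assumptions made for this sketch, not Instar's actual schema or watchdog.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class InboundRecord:
    """One inbound user message (hypothetical record shape)."""
    received_at: float                  # seconds, when the user's message arrived
    first_response_at: Optional[float]  # None if no reply has gone out yet


def reach_rate(records: List[InboundRecord], deadline_s: float) -> float:
    """Fraction of messages that got a first response within deadline_s."""
    if not records:
        return 1.0  # assumption: no traffic means nothing was missed
    reached = sum(
        1
        for r in records
        if r.first_response_at is not None
        and r.first_response_at - r.received_at <= deadline_s
    )
    return reached / len(records)


def worst_time_to_first_response(records: List[InboundRecord], now: float) -> float:
    """Longest wait observed; an unanswered message is still waiting at `now`."""
    return max(
        (
            (r.first_response_at if r.first_response_at is not None else now)
            - r.received_at
            for r in records
        ),
        default=0.0,
    )
```

A watchdog that owns the invariant might alarm when `reach_rate` drops below a floor or when `worst_time_to_first_response` exceeds the deadline; both thresholds would be operator choices.
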
**In practice.** This is the outward-facing complement to *Structure beats Willpower*: it says the structure must guard not only "the agent never does the wrong thing" but "the user reliably gets through." It is the umbrella over seven sub-standards, each the structural teeth for one failure mode, several of which already exist as their own articles above:
1. **State Convergence** — every declarative desired-state the system records (a placement pin, a lease) has an owning reconciler that drives actual→desired within a bounded time or escalates loudly. Declarative intent with no controller is a wish. *(Failure 1 — the cross-machine move that pinned but never actuated.)*
2. **Enforced Termination** — a time/iteration/resource budget is a **structural hard stop enforced by a watchdog OUTSIDE the run**, never the run's own willpower. This is *Structure beats Willpower* applied to the *end* of work, which we had only ever applied to its *middle* — closing the dangerous asymmetry where every pressure says "don't stop early" and none says "stop on time." *(Failure 2 — a 24h run that reached 46h.)*
3. **Inbound Delivery Is Sacred** — every inbound user message reaches a live session within a bounded time OR raises a loud failure through a channel proven deliverable; never a silent expiry. Corollary: **a partially-built safety mechanism must fail OPEN (pass through to the working path), never CLOSED (capture-and-drop)** — a half-finished net that intercepts then drops is more dangerous than no net. Sibling of *The Operator Channel Is Sacred* (which governs the inbound consume/pause gate); this governs the inbound holding-queue. *(Failure 3 — the durable queue that ate "why aren't you responding?")*
4. **Guards Degrade, Not Outage** — a safety check on the user-facing path must (i) have a fallback engine/path and (ii) distinguish *content-unsafe* (fail closed, hold) from *check-unavailable* (fail toward the user, with safety preserved by a cheaper deterministic check). The guard may never convert its own infra failure into the user's silence. The direct extension of *The Operator Channel Is Sacred* and *No Silent Degradation to Brittle Fallback* to the OUTBOUND tone gate. *(Failure 4 — the tone gate that held every reply when its single engine stalled.)*
5. **User-Facing Priority Lane** — when a scarce resource (subprocess slots, LLM quota, CPU) is contended, work on the live user path **preempts** background/housekeeping work; background watchers are load-shed first, never the user's reply. The complement to *Bounded Blast Radius* (which caps the spawn floor): the floor must not treat a user's reply and a coherence-sweep as equal citizens. *(Failure 5 — the fork-bomb floor that starved the user's channel.)*
6. **Degradation Is an Event, Not a Footnote** — a coordination layer running in a degraded/fallback mode (a stalled lease tick, a framework gone unavailable) must surface that state where it can reach the user, not re-arm quietly in a log. *(Failure 6 — the silently re-armed lease that made the ownership confusion easy to trigger.)*
7. **Blast-Radius Before, Verify After** — any mutating action on shared infrastructure (sessions, routing, ownership) must (i) prefer the reversible option and (ii) verify the user-facing invariants still hold afterward. *Capability ≠ authority* exists; this adds *capability ≠ wisdom — check what you're about to break, and confirm the user can still reach you after.* The behavioral kin of *Live-User-Channel Proof Before Done*, applied to an in-flight infra mutation rather than a feature ship. *(Failure 7 — the force-kill that black-holed inbound messages.)*
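
Sub-standard 3 and its fail-open corollary can be sketched as a small holding-queue. This is a minimal sketch under stated assumptions: `HoldingQueue`, its `deliver` and `alert` callbacks, and the `ready` flag are hypothetical names, and time is passed in explicitly so the behavior is deterministic. It is not the queue from the incident.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class _Held:
    enqueued_at: float
    message: str
    alerted: bool = False


@dataclass
class HoldingQueue:
    """Hypothetical inbound holding-queue with a bounded delivery deadline."""
    deliver: Callable[[str], bool]  # True once a live session accepted the message
    alert: Callable[[str], None]    # loud failure channel, assumed proven deliverable
    deadline_s: float
    ready: bool = True              # False while the drain path is only partially built
    _held: List[_Held] = field(default_factory=list, init=False)

    def accept(self, message: str, now: float) -> None:
        if not self.ready:
            # Fail OPEN: pass through to the working path instead of capture-and-drop.
            if not self.deliver(message):
                self.alert(f"pass-through delivery failed: {message!r}")
            return
        self._held.append(_Held(enqueued_at=now, message=message))

    def tick(self, now: float) -> None:
        """Retry delivery; raise one loud failure per message past its deadline."""
        still_held: List[_Held] = []
        for item in self._held:
            if self.deliver(item.message):
                continue  # reached a live session, so stop holding it
            if not item.alerted and now - item.enqueued_at >= self.deadline_s:
                self.alert(f"undelivered after {self.deadline_s}s: {item.message!r}")
                item.alerted = True
            still_held.append(item)  # never silently expire; keep retrying
        self._held = still_held

    def pending(self) -> int:
        return len(self._held)
```

The design choice worth noting is that a message past its deadline is alerted on once and then kept, so a late delivery is still possible and nothing is dropped without a trace.
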
**Earned from.** 2026-06-25 (incident window ~17:48–19:25 PDT): a runaway autonomous run overloaded the mesh, and then **the safety mechanisms themselves became the outage** — the inbound queue ate the user's messages, the tone gate blocked every reply, the fork-bomb floor starved the user's channel, and the "don't stop early" discipline (having no opposite) sheltered the runaway. Laid side by side, the seven failures reveal one blindspot: **every guard Instar had built pointed *inward* (what the agent emits, its sessions staying alive, it not quitting, it working on the right project, the machine not melting down); not one guarded the thing the user actually experiences — "can I reach a live agent, will it hear me, will it answer in reasonable time?"** Each component optimized its *local* safety, fail-closing toward "the agent does nothing wrong"; summed, they produced an agent that was internally pristine and **externally unreachable**. No standard said that when internal caution and the user's reach conflict, the user wins — so in every conflict, the user lost, quietly. The operator: "identify the root Blindspot and Meta issue that allowed this … what constitutional standard are we missing that would have [prevented it] in the first place." Full analysis: `docs/incidents/2026-06-25-user-reachability-postmortem.md`.
**Traces to the goal.** A coherent, self-evolving agent that the user cannot reliably reach, be heard by, or get a coherent answer from is not coherent *to the only observer that matters* — it is a fortress that is perfectly safe and perfectly silent. Robustness defined only as "the agent never does the wrong thing" is half a definition; the other half is "the user reliably gets through." The genesis taught that for an agent, structure is the only carrier of self across discontinuity — and the user's channel is the one structure whose failure makes every other coherence invisible.
**Applied through.** Sub-standard #4 (*Guards Degrade, Not Outage*) shipped its first structural teeth on 2026-06-26 — the outbound `MessagingToneGate` now degrades to an in-process deterministic leak floor (clean SENDS, leak HOLDS) on an LLM-backend outage instead of holding every reply, covering both the fast provider-throw and the slow route-budget timeout (PRs #1276 fast-engine-restored, #1277 per-framework breaker isolation, #1279 graceful-degrade; spec `docs/specs/tone-gate-graceful-degradation.md`). The remaining sub-standards (#1, #2, #3, #5, #6, #7) are tracked as the F-series fixes from the postmortem; per *How a new standard joins*, this article is the operator-ratifiable proposal and the honest test is that each tooth is *real*, not that listing it here makes it so.
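
The degrade behavior described above (clean SENDS, leak HOLDS when the checking engine is unavailable) can be illustrated with a short sketch. Everything here is an assumption made for illustration: the `gate` function, the `CheckUnavailable` exception, and the two leak patterns are hypothetical and do not reproduce the real `MessagingToneGate` or its leak floor.

```python
import re
from enum import Enum
from typing import Callable


class Verdict(Enum):
    SEND = "send"
    HOLD = "hold"


class CheckUnavailable(Exception):
    """The checking engine failed (outage, stall); says nothing about the content."""


# Hypothetical leak patterns, for illustration only.
LEAK_PATTERNS = [
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    re.compile(r"\bsk-[A-Za-z0-9]{16,}\b"),
]


def leak_floor(reply: str) -> Verdict:
    """Cheap in-process deterministic check: clean sends, leak holds."""
    if any(p.search(reply) for p in LEAK_PATTERNS):
        return Verdict.HOLD
    return Verdict.SEND


def gate(
    reply: str,
    engine: Callable[[str], Verdict],
    on_degraded: Callable[[str], None],
) -> Verdict:
    """Run the full check; on engine failure, degrade loudly to the leak floor."""
    try:
        # A HOLD returned by a working engine is content-unsafe: it still fails closed.
        return engine(reply)
    except (CheckUnavailable, TimeoutError) as exc:
        # Check-unavailable: surface the degradation, then fail toward the user.
        on_degraded(f"tone gate degraded: {exc!r}")
        return leak_floor(reply)
```

The key distinction is between an engine that answers HOLD, which is respected, and an engine that cannot answer at all, which triggers the fallback and an explicit degradation event rather than silence.
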

### No Manual Work (user *or* agent)
**Rule.** Capturing context and taking available actions must be automatic. Don't make the user remember Instar's features, and don't rely on the agent remembering to use its own tools.
**In practice.** No "remember to log it" or "remember to run X" step survives into a design — for anyone. If a behavior depends on someone remembering, it isn't built yet. All user interaction goes through channels; the agent never asks the user to go edit a file.