feat(messaging): update-relevance gate for the Agent Updates feed #755
Open
JKHeadley wants to merge 3 commits into
After PR #698 made user-facing update announcements opt-in + maturity-tagged, the owner still saw updates referencing internal machinery they have no clue about ("Sibling Agent Server Control", "apprenticeship cycle recording"). #698 fixed the framing; it never enforced relevance.

This adds an LLM-backed UpdateRelevanceGate at the single chokepoint BOTH update paths share: POST /telegram/post-update (self-narration) and the upgrade-notify session's reply to the Agent Updates topic. For each candidate:
- internal plumbing → withheld (200 {ok,suppressed:true}, audited to logs/update-relevance.jsonl; suppression is a success, never a retryable error)
- relevant-but-jargon → plain-language rewrite delivered instead of the original
- genuine user news → delivered as-is

It does NOT trust the author's tag; it enforces relevance in code (Structure > Willpower; parent principle Near-Silent Notifications). Strict no-op off the Updates topic (normal replies byte-identical), fail-open on any LLM error (a hiccup never swallows a real update). Mirrors MessagingToneGate: shared IntelligenceProvider, model 'fast', temp 0, prompt-injection boundary, /metrics/features attribution.

Ships dark on the fleet + live on Echo via the developmentAgent gate (enabled ?? !!developmentAgent; no config migration). Off-switch: monitoring.updateRelevanceGate.enabled.

Tests: unit 11 / integration 6 / e2e 3 (real AgentServer boot proves the options→routeCtx wiring) + migration parity (CLAUDE.md template note + migrateClaudeMd + its idempotency test). A fail-safe ctx.state?.get guard keeps the reply path a no-op when state is unresolvable.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ult-on)

Rollout flip directed by Justin (2026-06-04): dark-on-fleet → LIVE fleet-wide, `enabled ?? true` (off-switch retained: monitoring.updateRelevanceGate.enabled).

Why: this is a UX BUG FIX to a user-facing surface, not a new capability. Shipping it dark hides the fix from exactly the users who reported the noise: Justin watched the fleet agents whose updates were noisy, and a dark gate means he'd see no improvement there. The dark/developmentAgent convention exists for changes whose failure could break something; this gate cannot (fail-open, strict no-op off the Updates topic, every decision audited; worst case is one borderline note withheld, visibly logged).

Grounding note: #698's silent-by-default notifier was verified to be live (unconditional). The residual noise came from the self-narration path, which is what this gate covers. Companion constitution amendment ("User-Facing Fixes Ship Live") lands as a separate clean docs PR.

Tests: the integration disable case now uses the explicit off-switch (developmentAgent:false no longer disables); the E2E exercises the default-on path with NO flag set, the exact production path every fleet agent runs. All 25 feature tests green post-flip.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
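The flip described in this commit comes down to which fallback applies when `monitoring.updateRelevanceGate.enabled` is unset. The following is a minimal sketch under stated assumptions: the `AgentConfig` shape, the placement of `developmentAgent` at the top level, and both function names are hypothetical and not taken from the PR's code. Only the key path and the two fallback expressions come from the commit messages.

```typescript
// Hypothetical config shape (assumption). Only the key path
// monitoring.updateRelevanceGate.enabled and the developmentAgent flag
// are named in the PR; where developmentAgent lives is a guess.
interface AgentConfig {
  developmentAgent?: boolean;
  monitoring?: { updateRelevanceGate?: { enabled?: boolean } };
}

// Before the flip: dark on the fleet, live only on development agents.
function gateEnabledBefore(cfg: AgentConfig): boolean {
  return cfg.monitoring?.updateRelevanceGate?.enabled ?? !!cfg.developmentAgent;
}

// After the flip: on by default; an explicit `false` still turns it off.
function gateEnabledAfter(cfg: AgentConfig): boolean {
  return cfg.monitoring?.updateRelevanceGate?.enabled ?? true;
}
```

Because `??` falls back only on `null` or `undefined`, an explicit `enabled: false` wins in both versions, which is consistent with the commit's claim that no config migration is needed.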
…relevance

# Conflicts:
#	.instar/instar-dev-decisions.jsonl
#	src/commands/server.ts
What & why
After #698 made user-facing update announcements opt-in + maturity-tagged, Justin pointed out (topic 18250, with a screenshot) that the Updates feed still doesn't feel relevant — messages reference internal features the owner has no clue about ("Sibling Agent Server Control", "I can now record apprenticeship cycles…"). #698 fixed the framing; it never enforced relevance.
This is option A from the conversation: a single relevance + plain-language gate at the one chokepoint BOTH update paths share, right before anything reaches the owner.
The two leaks #698 left
- …`audience: user`, but nothing checks whether the content is relevant or readable. Jargon marked user-facing sails through.
- Self-narration (`POST /telegram/post-update`, e.g. "I can now restart other agents' servers…") had no relevance gate at all, only a tone/junk check. #698 (feat(updates): maturity-aware, silent-by-default user update announcements) never touched it.
The fix
A new LLM-backed `UpdateRelevanceGate` wired at the shared chokepoint (the Agent Updates topic). For each candidate it returns one of:
- internal → withheld entirely. `200 {ok:true, suppressed:true, reason}` (suppression is a success, never a retryable error). Recorded to `logs/update-relevance.jsonl`.
- jargon → a plain-language "here's what you can now do" rewrite is delivered instead of the original.
- user-relevant → delivered as-is.

It does not trust the author's tag; it enforces relevance in code (Structure > Willpower; parent principle Near-Silent Notifications).
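The three outcomes listed above can be sketched as a small mapping from verdict to delivery decision. This is an illustrative sketch only: the `Verdict` and `Outcome` types, `applyVerdict`, `auditLine`, and the audit record's field names are hypothetical and are not the PR's actual code. Only the three categories, the `200 {ok:true, suppressed:true, reason}` response shape, and the audit file name come from the description.

```typescript
// Hypothetical shapes for illustration (assumption); the real
// UpdateRelevanceGate types are not shown in this PR description.
type Verdict =
  | { kind: "internal"; reason: string }
  | { kind: "jargon"; rewrite: string }
  | { kind: "user-relevant" };

type Outcome =
  | { deliver: false; status: 200; body: { ok: true; suppressed: true; reason: string } }
  | { deliver: true; text: string };

function applyVerdict(original: string, verdict: Verdict): Outcome {
  switch (verdict.kind) {
    case "internal":
      // Withheld, but reported as a success so callers do not retry.
      return {
        deliver: false,
        status: 200,
        body: { ok: true, suppressed: true, reason: verdict.reason },
      };
    case "jargon":
      // The plain-language rewrite replaces the original text.
      return { deliver: true, text: verdict.rewrite };
    case "user-relevant":
      // Delivered unchanged.
      return { deliver: true, text: original };
  }
}

// Builds one JSON object on a single line, the shape a .jsonl audit file
// such as logs/update-relevance.jsonl would hold. Field names are assumed.
function auditLine(original: string, verdict: Verdict, at: Date): string {
  return JSON.stringify({ at: at.toISOString(), verdict: verdict.kind, original });
}
```

Modelling suppression as a normal `200` outcome rather than an error keeps a withheld note from being treated as a failed post by whatever called the endpoint.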
Guarantees
- Mirrors `MessagingToneGate`: shared `IntelligenceProvider`, model `fast`, temp 0, prompt-injection boundary, `/metrics/features` attribution.
Rollout
Live fleet-wide, default-on (`enabled ?? true`), flipped at Justin's direction (2026-06-04). This is a UX bug fix to a user-facing surface, not a new capability: shipping it dark would hide the fix from exactly the agents whose noise was reported. The gate cannot break anything (fail-open, strict no-op off the Updates topic, fully audited), so the dark-gate risk rationale doesn't apply. Per-agent off-switch retained: `.instar/config.json` → `monitoring.updateRelevanceGate.enabled: false`. No config migration (runtime fallback). Companion constitution amendment ("User-Facing Fixes Ship Live") lands as a separate clean docs PR.
Tests (all three tiers + migration parity)
- …`/telegram/post-update`, disabled passthrough, strict no-op off the Updates topic, suppress on the reply path; audit-trail assertion.
- …`AgentServer`, proving the `options → routeCtx` wiring; suppress + deliver over real HTTP + Bearer-auth.
- …`ctx.state?.get` guard.

Process: Tier-2 `/instar-dev` ceremony (converged + approved spec, ELI16 overview, side-effects artifact, trace). Full suite runs in CI.

🤖 Generated with Claude Code
ELI16 — a relevance check before any "what's new" message reaches you
Before any update note lands in your Agent Updates chat, a gate now asks one question: "would a normal person actually notice, use, or care about this?" Messages about internal machinery ("Sibling Agent Server Control", "apprenticeship cycle recording") get silently withheld — you never see them, though every decision is logged so nothing vanishes without a trace. Updates that ARE useful but written in tech-speak get rewritten into plain "here's what you can now do" language before they're sent. Genuinely useful plain news passes through untouched. The gate doesn't trust whoever wrote the message to judge their own relevance — the check is enforced in code, at the single doorway both update paths share. It can't break anything: if the judge ever errors, the original message just goes through, and it does nothing at all outside the Updates chat. It ships on for every agent (a fix to what you see only counts if you can see it), with a per-agent off-switch in config.
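The two safety properties described above (the original goes through if the judge errors, and nothing happens outside the Updates chat) can be sketched as a thin wrapper around the judge call. This is a hedged illustration, not the PR's implementation: `Decision`, `Judge`, `gateUpdate`, and the use of numeric topic IDs are all assumptions introduced here.

```typescript
// Hypothetical names throughout (assumption); this is not the PR's
// UpdateRelevanceGate API, only a sketch of the behaviour it describes.
type Decision =
  | { action: "deliver"; text: string }
  | { action: "suppress"; reason: string };

// Stand-in for the LLM-backed relevance check.
type Judge = (text: string) => Promise<Decision>;

async function gateUpdate(
  text: string,
  topicId: number,
  updatesTopicId: number,
  judge: Judge,
): Promise<Decision> {
  // Strict no-op anywhere except the Updates topic: the judge is never
  // called and the text is returned unchanged.
  if (topicId !== updatesTopicId) return { action: "deliver", text };
  try {
    return await judge(text);
  } catch {
    // Fail-open: a judge error must never swallow a real update.
    return { action: "deliver", text };
  }
}
```

With this shape, the only way a message is withheld is an explicit `suppress` decision from a judge call that completed successfully.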