fix(skills): description-level exceptions are authoritative in the routing rule #1732
Open
arittr wants to merge 2 commits into
This change is part of the following stack: Change managed by git-spice.
…uting rule (SUP-333 #4)
Adversarial review findings D1/D2: the 1%-chance invocation rule and the "Add X doesn't mean skip workflows" line contradicted the new brainstorming description exception in both directions. A compliant agent re-imposes the cost failure (invocation itself is the measured cost event), while a cost-optimizing agent could treat any skip as sanctioned.
The routing skill now states: a documented exception in a skill's own description defines that skill's scope (compliance, not rationalization); any doubt about the exception's conditions means invoke; and only the description can define one (agents cannot infer exceptions).
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…ception rule; writing-skills carve-out
Staff-review findings (4-reviewer panel):
- The skill_flow digraph still routed "yes, even 1%" straight to
invoke with no exception branch — and this stack's own evidence says
agents follow flowcharts literally. The flow now passes through
"Skill's own description exempts this request?" with no/any-doubt →
invoke.
- The <EXTREMELY-IMPORTANT> block ("you cannot rationalize your way
out of this") read unconditional; one parenthetical defers to The
Rule's single carve-out without weakening the block.
- Trimmed the redundant "the description defines the skill's scope"
clause from The Rule paragraph.
- writing-skills' "descriptions must not carry process" doctrine would
have had a future editor strip the brainstorming exception and
silently regress the cost evals; it now distinguishes negative
triggering conditions (scope — allowed and, per the routing rule,
required at the description) from workflow summaries (still
forbidden).
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
aff9195 to 87ddfac
9d0ac38 to 36e289e
This was referenced Jun 11, 2026
Who is submitting this PR? (required)
claude-fable-5[1m]) dev checkout); quorum eval lab (superpowers-evals) as the testing apparatus; unrelated local ops plugins (decision-log, episodic-memory, superpowers-chrome, primeradiant-ops)
What problem are you trying to solve?
An adversarial review fleet (three parallel reviewers: red-team, cross-corpus consistency, evidence verification) audited the SUP-333 stack and found that using-superpowers' routing rule contradicts #1718's description-level exception in both directions:
What does this PR change?
Adds one paragraph to The Rule: a documented exception in a skill's own description is authoritative — not invoking is compliance, not rationalization; any doubt about the exception's conditions means invoke; only the skill's description can define such an exception (agents cannot infer one). The "Add X doesn't mean skip workflows" line gains the matching qualifier.
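The decision order in that paragraph can be read as a small function. The following is a minimal sketch, not the skill's actual text or any real API: `Skill`, `Exempt`, and `should_invoke` are hypothetical names introduced only for illustration, and the three-valued `Exempt` answer is an assumption about how "any doubt means invoke" might be modeled.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Exempt(Enum):
    """Hypothetical answer to: does the skill's own description exempt this request?"""
    YES = "yes"
    NO = "no"
    DOUBT = "doubt"


@dataclass
class Skill:
    name: str
    # Exception text documented in the skill's own description, if any (assumed field).
    documented_exception: Optional[str] = None


def should_invoke(skill: Skill, might_apply: bool, exempt: Exempt) -> bool:
    """Return True when the agent should invoke the skill."""
    if not might_apply:
        # Not even a 1% chance the skill applies.
        return False
    if skill.documented_exception is None:
        # Only the description can define an exception; agents cannot infer one.
        return True
    # A documented exception counts only when its conditions clearly hold;
    # NO and DOUBT both fall through to invoke.
    return exempt is not Exempt.YES


if __name__ == "__main__":
    brainstorming = Skill("brainstorming", documented_exception="(exception text)")
    writing_plans = Skill("writing-plans")
    print(should_invoke(brainstorming, True, Exempt.YES))
    print(should_invoke(brainstorming, True, Exempt.DOUBT))
    print(should_invoke(writing_plans, True, Exempt.YES))
```

Note that in this sketch an agent claiming `Exempt.YES` for a skill with no documented exception still invokes, which mirrors the "agents cannot infer one" clause.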
Is this change appropriate for the core library?
Yes — it defines how the routing layer treats description-level scoping for ALL skills, which #1718 introduces and future skills may use. Without it the bootstrap text and skill descriptions give contradictory instructions.
What alternatives did you consider?
Does this PR contain multiple unrelated changes?
No — one rule, two coordinated touches in one file (the rule paragraph and the User Instructions line that contradicted it).
Existing PRs
Environment tested
New harness support (required if this PR adds a new harness)
N/A — no harness changes.
Evaluation
- cost-checkbox-over-trigger/claude: pass (the exception fires through the routing layer; no brainstorming invocation)
- triggering-writing-plans/claude: pass (the 1% rule still triggers skills that DO apply)
- cost-spec-plan-duplication/claude: pass (brainstorming still gates real features)
- cost-trivial-task-review-fanout/opencode: pass
- sdd-rejects-extra-features/claude: pass
Run IDs are in the stack PRs.
Rigor
superpowers:writing-skills and completed adversarial pressure testing (paste results below)
The Red Flags table is untouched; the new paragraph closes the "description exempts me" gap the table could not cover, with the doubt-means-invoke backstop preserved. Adversarial findings (D1/D2) and both exploit directions are documented above; the 5-run battery is the post-change evidence.
Human review
Round 3: staff-review refinements + evidence
Refinements in this PR's follow-up commit:
- The skill_flow digraph now routes through a "Skill's own description exempts this request?" diamond (no/any-doubt → invoke). This stack's own evidence says agents follow flowcharts literally, and the chart previously contradicted the rule.
- The <EXTREMELY-IMPORTANT> block gains a one-line deferral to The Rule (previously it read unconditional, contradicting the rule in the same always-loaded file).
- writing-skills now distinguishes negative triggering conditions (scope: allowed, and required at the description per this rule) from workflow summaries (still forbidden), so a future editor applying its checklist does not strip the exception and silently regress the cost evals.
Final-text evidence: the exception routes correctly where supported.
- cost-checkbox-over-trigger skip: claude 3/3, codex ✓, antigravity ✓ (kimi does not pick up description exceptions; unchanged from baseline).
- The 1% rule still triggers skills that apply: triggering-writing-plans/claude 3/3 pass; × codex fail, byte-for-byte its pre-existing documented signature (loads sibling skills, skips the mandated one; predates this stack, tracked separately).
Merge guidance: merge together with #1718 (see its note).
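To illustrate why the flowchart change matters when a reader follows the chart literally, here is a rough sketch that walks a before and an after version of the flow as edge tables. The node names (`might_apply`, `description_exempts`, `invoke`, `skip`) and most edge labels are hypothetical stand-ins; the real skill_flow digraph is not reproduced here, and only the "yes, even 1%" edge and the no/any-doubt → invoke routing are taken from the description above.

```python
# Hypothetical edge tables: (node, answer) -> next node.
OLD_FLOW = {
    ("might_apply", "yes, even 1%"): "invoke",  # straight to invoke, no exception branch
    ("might_apply", "no"): "skip",
}

NEW_FLOW = {
    ("might_apply", "yes, even 1%"): "description_exempts",
    ("might_apply", "no"): "skip",
    ("description_exempts", "yes"): "skip",
    ("description_exempts", "no"): "invoke",
    ("description_exempts", "doubt"): "invoke",  # any doubt means invoke
}

TERMINALS = {"invoke", "skip"}


def walk(flow, answers, start="might_apply"):
    """Follow the chart literally: look up each answer until a terminal node is reached."""
    node = start
    while node not in TERMINALS:
        node = flow[(node, answers[node])]
    return node


if __name__ == "__main__":
    answers = {"might_apply": "yes, even 1%", "description_exempts": "yes"}
    print("old chart:", walk(OLD_FLOW, answers))
    print("new chart:", walk(NEW_FLOW, answers))
```

With the same answers, the old table never consults the exemption question, so a literal reader lands on invoke even when the description exempts the request; the new table reaches the diamond first.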