temporalio · ardiewen · May 29, 2026
@@ -0,0 +1,147 @@
+---
+name: a11y-manifest-discipline
+description: When shipping or modifying an accessibility audit fix doc, update `scripts/a11y/manifest.yml` in the same PR so the A11y PR Triage workflow can label PRs correctly. Use when the user says "shipping a new a11y fix doc", "adding a manifest entry", "updating an audit bucket", "the audit team is filing a new defect", or whenever you draft a new entry in the audit team's private categorization output.
+user-invocable: true
+allowed-tools:
+  - Read
+  - Write
+  - Edit
+  - Glob
+  - Grep
+  - Bash
+---
+
+# A11y manifest discipline
+
+The accessibility PR triage system reads `scripts/a11y/manifest.yml` to decide what labels (and Slack pings) to apply to incoming PRs. This skill is the durable mechanism that keeps that manifest in sync with the audit team's source fix docs.
+
+**The rule, in one sentence:** when you author a new fix doc (or change a fix doc's bucket, severity, or scope), commit the corresponding manifest entry in the same PR that introduces or references it.
+
+If the manifest entry doesn't land, PRs closing that fix doc get labeled `a11y:broken-ref` and post a bot comment asking the audit team to add the entry. The triage system gracefully degrades, but every minute the manifest is stale is a minute reviewers are guessing.
+
+## When to invoke
+
+- Authoring a new fix doc in the audit team's private workspace (typically driven by the `issues-by-complexity-prompt.md` workflow).
+- Re-bucketing a fix doc (e.g., a Bucket-2 defect turns out to require logic changes → Bucket 3).
+- Renaming a slug (rare; coordinate with any open PRs referencing the old slug).
+- Removing a fix doc (mark the manifest entry's bucket as `null` rather than deleting if PRs may still reference it; otherwise delete).
+
+## The bucket framework (recap)
+
+Four levels, chosen by how the fix is going to be implemented — not by the severity of the defect:
+
+| Bucket | What it is                                                           | Reviewer signal                                           |
+| ------ | -------------------------------------------------------------------- | --------------------------------------------------------- |
+| 1      | CSS / token-only change. No logic, no markup change.                 | Design-mergeable. One reviewer.                           |
+| 2      | HTML or ARIA attribute addition to existing elements. Static values. | Design-mergeable. One reviewer.                           |
+| 3      | Logic, component composition, event handling, or state changes.      | Engineer reviewer required. Pings designer Slack channel. |
+| 4      | Engineering plus product judgment (new UX shape, behavior change).   | Engineer + product. Pings designer Slack channel.         |
+
+Boundary calls:
+
+- "Add `aria-label`" = Bucket 2. "Add `aria-label` whose value depends on component state" = Bucket 3.
+- "Add focus-ring CSS" = Bucket 1. "Add focus-ring CSS plus state-driven visibility" = Bucket 3.
+- "Add `role` attribute" = Bucket 2. "Add `role` plus manage roving tabindex" = Bucket 3.
+- When genuinely ambiguous between two buckets, prefer the higher one. Design can always escalate up; engineering can land design-bucket PRs trivially.
+
+Full categorization framework lives in the audit team's private `issues-by-complexity-prompt.md`. This skill restates the bucket definitions for in-repo accessibility — it does not replace the prompt.
+
+## The manifest schema
+
+`scripts/a11y/manifest.yml` is a YAML sequence. Each entry:
+
+```yaml
+- slug: 1.4.13-tooltip
+  sc: '1.4.13'
+  bucket: 3
+  severity: critical
+  scope: ui-main
+  files-touched: # optional
+    - src/lib/holocene/tooltip.svelte
+```
+
+Field constraints (validated by `scripts/a11y/manifest.test.mjs`):
+
+- `slug`: kebab-case, unique within the manifest. Convention: `{SC}-{description}`.
+- `sc`: WCAG criterion in `X.Y.Z` form, quoted as a string.
+- `bucket`: `1 | 2 | 3 | 4 | null`. Use `null` when the audit team has authored the doc but not yet categorized it; PRs referencing it will get `a11y:needs-categorization`.
+- `severity`: `critical | serious | moderate | minor` (lowercase). Original audit vocabulary (High / Medium / Low / range expressions) gets normalized to this four-level rubric on entry.
+- `scope`: `ui-main` or `universal`. (The validator's full enum accepts additional scope values for cross-repo cases, but this manifest holds `ui-main` entries by convention.)
+- `files-touched`: optional list of repo-relative paths the fix touches. Useful but not yet consumed by tooling.
+
+## The commit rule
+
+When you ship a fix doc, the manifest entry lands in the same PR as the first code change that references it.
+
+Cases:
+
+1. **Audit team commits the entry alongside the designer's PR.** Common case. The audit team authors the fix doc privately, decides bucket, opens a PR (or piggybacks the designer's) that adds the manifest entry. If the designer's PR is already open, the audit team can push a commit to that branch adding the entry.
+
+2. **PR opens before the manifest entry exists.** The PR gets `a11y:broken-ref` and a bot comment. Audit team adds the entry; on the PR's next `synchronize` event, labels update. Acceptable transient state — not a sustainable steady state.
+
+3. **Entry needs to land before any PR references it** (e.g., a fix doc is published for community contribution). Open a manifest-only PR (no code changes) and merge it independently.
+
+## Workflow
+
+1. Decide the bucket using the framework above and the boundary calls. Verbose bucket-1-vs-2 or 2-vs-3 reasoning lives in the audit team's `issues-by-complexity` working draft, not in the manifest.
+
+2. Construct the entry. Sort within the manifest by SC then by slug.
+
+3. **Validate locally before committing:**
+
+```sh
+pnpm a11y:manifest:test
+```
+
+Expected output: `A11y manifest OK (N entries).` Any failure prints the offending entry and the rule it violated. Don't push without a green validator — CI will block the PR.
+
+4. Commit with a message that names the slug, e.g.:
+
+```
+a11y(manifest): add 1.4.13-tooltip — bucket 3, ui-main, critical
+```
+
+Or, when piggy-backing on a code PR, fold the manifest change into the same commit.
+
+5. If the PR you're updating is open and labeled with `a11y:broken-ref` or `a11y:needs-categorization`, the triage workflow will re-label on the next PR event (synchronize, edit, or `workflow_dispatch`). You can force it with:
+
+```sh
+gh workflow run a11y-pr-triage.yml --field pr_number=<n>
+```
+
+## Backfill and batch re-runs
+
+The triage workflow's `workflow_dispatch` trigger lets you (re-)label any open PR — useful for one-time backfill after a manifest update or after fixing a workflow bug.
+
+**Single PR**, via the Actions tab or CLI:
+
+```sh
+gh workflow run a11y-pr-triage.yml --field pr_number=<n>
+```
+
+**All open a11y PRs:**
+
+```sh
+gh pr list --search '"a11y" -is:closed' \
+  --json number --jq '.[].number' \
+  | xargs -I{} gh workflow run a11y-pr-triage.yml --field pr_number={}
+```
+
+Both invocations route through the workflow's `getPullRequest()` helper, which fetches the PR via API and runs the same triage logic the `pull_request` event would.
+
+## Common pitfalls
+
+- **Silent typo in slug.** PR labels report `a11y:broken-ref` but the bot comment shows the typo'd slug. Fix in the manifest OR the PR body, then re-trigger.
+- **Bucket changed but PR not re-triggered.** Manifest change alone doesn't re-label open PRs. After landing a bucket change, dispatch the workflow against affected open PRs.
+- **Quoting `sc`.** YAML interprets `1.4.13` as a string by default, but `1.4.10` may be parsed as a number losing the trailing zero. Always quote: `sc: '1.4.10'`.
+
+## What this skill does not cover
+
+- How to write the source fix doc. That's the audit team's `wcag-audit` skill (in their private workspace; an abbreviated in-repo port lives at `.claude/skills/wcag-audit/SKILL.md`).
+- How a reviewer should react to the labels the triage workflow applies, or the title/trailer conventions a PR author must follow. See `.claude/skills/a11y-pr-review/SKILL.md`.
+
+## References
+
+- `scripts/a11y/manifest.yml` — the manifest itself.
+- `scripts/a11y/manifest.test.mjs` — the validator.
+- `.github/workflows/a11y-pr-triage.yml` — the workflow that reads the manifest.
@@ -0,0 +1,150 @@
+---
+name: a11y-pr-review
+description: Review a pull request that the A11y PR Triage workflow has labeled with `a11y:*` labels. Use the bucket label to decide review depth, the SC label to know what WCAG criterion the PR closes, and the failure-mode labels to know what's blocked. Use when the user says "review this a11y PR", "is this PR safe to merge", "what does this `a11y:broken-ref` label mean", or asks for the right reviewer for an accessibility PR.
+user-invocable: true
+allowed-tools:
+  - Read
+  - Glob
+  - Grep
+  - Bash
+  - AskUserQuestion
+---
+
+# A11y PR review
+
+Reviewing an accessibility PR labeled by the A11y PR Triage workflow. The bucket label tells you how much engineering review the change needs; the failure-mode labels tell you what's still pending before a clean review can happen.
+
+## PR conventions the action looks for
+
+Two conventions drive the workflow. Authors of a11y PRs must follow both; reviewers should know they exist so labels make sense in context.
+
+**Title gate.** The workflow matches `/^(fix\()?a11y[():]/i`. The target convention going forward is `a11y(<WCAG-code>): <description>` (e.g., `a11y(1.4.13): make Tooltip keyboard-accessible`). Legacy `fix(a11y):` and `a11y:` forms are still accepted during the transition. Titles that don't match the gate are not triaged — no labels are applied.
+
+**Trailer.** One `A11y-Audit-Ref: <slug>` line per fix doc closed, anywhere in the PR body. Case-insensitive; bare slug only (no path, no `.md`):
+
+```
+A11y-Audit-Ref: 2.5.8-chip-remove-button
+A11y-Audit-Ref: 1.4.13-tooltip
+```
+
+Multi-criterion PRs include multiple trailers — the action takes `max(bucket)` across all referenced entries when applying the bucket label, and emits one `a11y:sc-*` label per criterion. The slugs are looked up in `scripts/a11y/manifest.yml`; see the `a11y-manifest-discipline` skill for how those manifest entries are authored and maintained.
+
+## Quick reference
+
+| Label                       | What it means                                                                | What to do                                                                                                                                                          |
+| --------------------------- | ---------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `a11y`                      | Umbrella. PR title matched the a11y gate.                                    | Always present on triaged PRs.                                                                                                                                      |
+| `a11y:bucket-1`             | CSS / token-only. No logic.                                                  | One design reviewer suffices. Verify visual change matches the fix doc; merge.                                                                                      |
+| `a11y:bucket-2`             | HTML / ARIA attr addition.                                                   | One design reviewer suffices. Spot-check the attribute value; merge.                                                                                                |
+| `a11y:bucket-3`             | Logic / component change.                                                    | Engineer reviewer required. Pull down the branch, exercise the path. Slack already pinged.                                                                          |
+| `a11y:bucket-4`             | Engineering + product judgment.                                              | Engineer + product. Discuss design implications before merge. Slack already pinged.                                                                                 |
+| `a11y:sc-X.Y.Z`             | One per WCAG criterion the PR closes.                                        | Informational. Multiple labels = multi-criterion PR.                                                                                                                |
+| `a11y:needs-categorization` | Referenced manifest entry exists but has no bucket.                          | **Don't merge until audit team assigns a bucket.** Ping audit team.                                                                                                 |
+| `a11y:broken-ref`           | `A11y-Audit-Ref:` line points to a slug not in the manifest.                 | **Don't merge.** Either the slug is misspelled (ask author) or the entry hasn't been authored (ping audit team).                                                    |
+| `a11y:no-fix-doc`           | PR title matches the a11y gate but no `A11y-Audit-Ref:` line is in the body. | Acceptable for ad-hoc fixes if the audit team has explicitly opted not to track. Otherwise, ask author to add the trailer; ping audit team to add a manifest entry. |
+
+## Per-bucket review depth
+
+### Bucket 1 — CSS / token-only
+
+Examples: rem-based font-size, focus-ring color contrast token swap, scroll-margin addition.
+
+Review checklist:
+
+- Read the diff. Confirm it's CSS only — no JS or template logic changes.
+- If a color or token changed, check that the new value meets the WCAG threshold the fix claims (e.g., 4.5:1 for text contrast). The fix doc (private) has the calculation; if not obvious from the manifest, ask the audit team.
+- Optional: spin up the dev server and eyeball the change at the affected route. Often unnecessary for token swaps that cascade.
+- One design reviewer's approval is sufficient.
+
+Time budget: under 5 minutes.
+
+### Bucket 2 — HTML / ARIA attributes
+
+Examples: `aria-label` on icon-only button, `role="radiogroup"`, `autocomplete` prop forwarding, `scope="col"` on table headers.
+
+Review checklist:
+
+- Read the diff. Confirm only attribute additions or static-value changes.
+- For `aria-label` / `aria-labelledby`: verify the value is meaningful (not the literal element name). For i18n strings, confirm the translation key exists.
+- For `role` changes: confirm the role matches the element's actual behavior. `role="button"` on a non-`<button>` element still needs keyboard handlers — flag if missing (would push the change to Bucket 3).
+- For `autocomplete`: WHATWG token list lookup. `email`, `current-password`, `new-password`, `username`, etc.
+- One design reviewer's approval is sufficient.
+
+Time budget: 5–10 minutes.
+
+### Bucket 3 — Logic / component changes
+
+Examples: Tooltip rewrite (keyboard-accessible + dismissible), focus restoration after modal close, severity-icon pairing with state, name-role-value sweep across primitives.
+
+Review checklist:
+
+- The Slack channel was already pinged. If you're reviewing, you saw it.
+- Pull the branch down or open it in a preview environment.
+- Exercise the keyboard path: Tab, Shift+Tab, Enter, Space, Escape. Each interactive element should be reachable and the visual focus indicator should be visible.
+- If screen-reader behavior is claimed in the PR description: smoke-test with VoiceOver (Cmd+F5 on macOS) or NVDA. Confirm the new announcement matches the PR claim.
+- Check that no consumer of the affected primitive breaks. Primitive changes (e.g., `Button`, `Tooltip`, `Input`) cascade to many call sites; the audit team's fix doc usually enumerates the consumer surface, but you should grep for usages in both repos if the primitive is widely used.
+- Engineer reviewer's approval required.
+
+Time budget: 15–45 minutes depending on the primitive's reach.
+
+### Bucket 4 — Engineering + product
+
+Examples: session-timeout warning UX, prefers-reduced-motion fallback, account-deletion pre-expiration warning, "consistent help" mechanism.
+
+Review checklist:
+
+- The change implicates a product decision (what should the new behavior be? what should the copy say?). Engineering review alone is insufficient.
+- Loop in product before approving the implementation. Confirm the chosen UX shape aligns with broader product direction.
+- All Bucket-3 review steps apply too (keyboard, AT, primitive cascade).
+- Engineer + product approval required.
+
+Time budget: variable. Often blocked on product input before any keyboard testing.
+
+## Failure-mode triage
+
+### `a11y:needs-categorization`
+
+The referenced manifest entry exists in `scripts/a11y/manifest.yml` but has `bucket: null`. The audit team filed the defect but hasn't yet decided which bucket it falls into.
+
+What to do:
+
+1. Don't merge. The categorization affects who needs to review.
+2. Ping the audit team in the designer Slack channel; ask them to assign a bucket.
+3. Once the manifest is updated, dispatch the triage workflow against this PR:
+   ```sh
+   gh workflow run a11y-pr-triage.yml --repo temporalio/<repo> --field pr_number=<n>
+   ```
+   The label updates and you can proceed with the appropriate bucket's review steps.
+
+### `a11y:broken-ref`
+
+The PR body has an `A11y-Audit-Ref: <slug>` line, but `<slug>` isn't in the manifest. Either the slug is misspelled or the manifest entry hasn't been authored yet.
+
+What to do:
+
+1. Read the bot comment the triage workflow posted; it names the offending slug.
+2. Compare the slug to entries in `scripts/a11y/manifest.yml`. If close to an existing slug, ask the author to fix the trailer.
+3. If genuinely unknown, ping the audit team; they may need to add the entry. See `a11y-manifest-discipline` skill for what they should do.
+4. Re-dispatch the triage workflow after the fix.
+
+### `a11y:no-fix-doc`
+
+No `A11y-Audit-Ref:` line in the PR body. Either the author forgot, or this is a net-new defect not covered by the audit.
+
+What to do:
+
+1. Read the PR. Is the change clearly addressing a defect the audit team would have caught (e.g., adding `aria-label` to an icon-only button on a route they audited)?
+   - If yes: the author probably forgot the trailer. Ask them to add `A11y-Audit-Ref:` referencing the right slug (or coordinate with the audit team if no slug exists yet).
+   - If the change is genuinely new (e.g., a brand-new component the audit hasn't seen): coordinate with the audit team in Slack. They may add a manifest entry retroactively, or accept the PR as an ad-hoc fix and let the `a11y:no-fix-doc` label persist as a historical signal.
+2. For internal PRs, defer the merge until the audit team has weighed in. For external contributors, weigh the cost of holding the PR open against the cost of merging without classification.
+
+## When to escalate
+
+- **`a11y:bucket-3` or `a11y:bucket-4` PR sat for > 2 days unreviewed.** The Slack channel was pinged at label-application time; if no one picked it up, re-ping with the PR link. The frontend team's batch-review cadence is the bottleneck.
+- **Manifest entry disputes.** If you think the audit team's bucket assignment is wrong (e.g., they marked something Bucket 2 but the diff has real logic), comment on the PR and tag the audit team. They have the source fix doc to refer back to.
+- **Cross-cutting primitive changes.** A `Button` or `Tooltip` change can cascade to 100+ consumers. Engineer review must include "did the cascade plan get exercised?" — confirm via grep across the repo. If the audit team's fix doc doesn't enumerate the consumer surface, request that detail before merging.
+
+## References
+
+- `scripts/a11y/manifest.yml` — the manifest the action reads.
+- `.claude/skills/a11y-manifest-discipline/SKILL.md` — companion skill for the audit team side (manifest authoring + backfill).