Accessibility Audit Ops Playbook
By Flora May dela Cruz
An operational workflow for recurring accessibility audits that combines automated checks, manual verification, and an evidence log teams can actually maintain.
Purpose
Most teams run accessibility checks reactively, right before release or after escalation. This playbook turns audits into an operating rhythm: fast automated scans for coverage, manual checks for real interaction quality, and a living evidence log that shows what passed, what failed, and who owns each fix.
When to use it
- Your team ships frequently and needs repeatable audit checkpoints
- Accessibility findings keep reappearing because fixes are local, not systemic
- You need a practical way to combine design review and engineering verification
- Compliance conversations require concrete evidence beyond “we ran a scanner”
Skip it when no one can own remediation, or when the team has not yet wired basic accessible primitives into the app shell.
Core framework
Use a three-lane audit loop:
- Automated lane: run contrast and structural checks on every release candidate
- Manual lane: test keyboard path, focus behavior, announcements, and error recovery
- Evidence lane: log findings, ownership, severity, and close criteria
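As one illustration of the automated lane, here is a minimal sketch of a contrast check based on the WCAG 2.x relative-luminance and contrast-ratio formulas. The function names and the hex-string input format are assumptions made for this example; they are not taken from any particular scanner or from this playbook's own tooling.

```python
# Sketch of a WCAG 2.x contrast check. Function names are illustrative only.

def _linear_channel(value_8bit: int) -> float:
    """Convert one 8-bit sRGB channel to its linear value (WCAG 2.x formula)."""
    c = value_8bit / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """Relative luminance of a 6-digit color such as '#1a2b3c'."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return (0.2126 * _linear_channel(r)
            + 0.7152 * _linear_channel(g)
            + 0.0722 * _linear_channel(b))


def contrast_ratio(foreground: str, background: str) -> float:
    """Contrast ratio between two colors, from 1.0 up to 21.0."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa(foreground: str, background: str, large_text: bool = False) -> bool:
    """AA requires 4.5:1 for normal text and 3:1 for large text."""
    threshold = 3.0 if large_text else 4.5
    return contrast_ratio(foreground, background) >= threshold


if __name__ == "__main__":
    # Hypothetical token pair; in practice these would come from the theme.
    print(round(contrast_ratio("#767676", "#ffffff"), 2))
    print(passes_aa("#777777", "#ffffff"))
```

A check like this covers only color pairs the script is told about, which is why the manual lane still has to confirm real rendered states such as focus, hover, and disabled.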
Use this lifecycle:
flowchart LR
A[Automated Scan] --> B[Manual Verification]
B --> C[Evidence Log]
C --> D[Remediation]
D --> E[Re-test]
E --> F[Close or Re-open]
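The lifecycle above can be sketched as a small state machine. This is an illustration under one stated assumption: the diagram does not say where a re-opened finding goes, so the sketch sends it back to remediation. The `Stage` enum and `next_stage` function are hypothetical names.

```python
from enum import Enum


class Stage(Enum):
    AUTOMATED_SCAN = "automated scan"
    MANUAL_VERIFICATION = "manual verification"
    EVIDENCE_LOG = "evidence log"
    REMEDIATION = "remediation"
    RETEST = "re-test"
    CLOSED = "closed"


# Straight-line steps taken from the flowchart.
_LINEAR = {
    Stage.AUTOMATED_SCAN: Stage.MANUAL_VERIFICATION,
    Stage.MANUAL_VERIFICATION: Stage.EVIDENCE_LOG,
    Stage.EVIDENCE_LOG: Stage.REMEDIATION,
    Stage.REMEDIATION: Stage.RETEST,
}


def next_stage(stage: Stage, retest_passed: bool = False) -> Stage:
    """Return the stage that follows `stage`.

    Assumption: a failed re-test re-opens the finding by returning it
    to remediation. `retest_passed` is only consulted at the re-test stage.
    """
    if stage is Stage.CLOSED:
        raise ValueError("a closed finding has no next stage")
    if stage is Stage.RETEST:
        return Stage.CLOSED if retest_passed else Stage.REMEDIATION
    return _LINEAR[stage]


if __name__ == "__main__":
    stage = Stage.AUTOMATED_SCAN
    while stage is not Stage.CLOSED:
        print(stage.value)
        stage = next_stage(stage, retest_passed=True)
```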
Reusable template
# Accessibility audit run: <surface>
## Run details
- Date: <yyyy-mm-dd>
- Auditor: <name>
- Scope: <routes/components>
- WCAG target: 2.1 AA (AAA contrast for small text, where applicable)
## Automated checks
- Contrast script result: <pass/fail + link to report>
- Structural scanner result: <pass/fail + summary>
## Manual checks
| Check | Result | Notes | Owner |
|---|---|---|---|
| Keyboard-only navigation | <pass/fail> | <notes> | <name> |
| Focus visibility + return | <pass/fail> | <notes> | <name> |
| Screen reader announcement flow | <pass/fail> | <notes> | <name> |
| Error messaging and recovery | <pass/fail> | <notes> | <name> |
## Findings log
| Finding ID | Severity | Criterion | Surface | Owner | Due date | Close criteria |
|---|---|---|---|---|---|---|
| AX-001 | <high/medium/low> | <wcag ref> | <surface> | <name> | <date> | <test that must pass> |
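If the findings log is kept in a script or pipeline rather than edited by hand, a record type along these lines could mirror the table columns and flag entries that lack an owner or close criteria. This is a sketch: the `Finding` class, its method names, and the high/medium/low spelling of severity are assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import date

ALLOWED_SEVERITIES = ("high", "medium", "low")  # assumed labels


@dataclass
class Finding:
    """One row of the findings log; fields mirror the template columns."""
    finding_id: str
    severity: str
    criterion: str
    surface: str
    owner: str
    due_date: date
    close_criteria: str

    def problems(self) -> list[str]:
        """List what keeps this entry from being actionable."""
        issues = []
        if self.severity not in ALLOWED_SEVERITIES:
            issues.append("unknown severity")
        if not self.owner.strip():
            issues.append("missing owner")
        if not self.close_criteria.strip():
            issues.append("missing close criteria")
        return issues

    def to_row(self) -> str:
        """Render the finding as a markdown table row for the log."""
        cells = [
            self.finding_id, self.severity, self.criterion, self.surface,
            self.owner, self.due_date.isoformat(), self.close_criteria,
        ]
        return "| " + " | ".join(cells) + " |"


if __name__ == "__main__":
    # Entirely fictional example entry.
    finding = Finding(
        finding_id="AX-001",
        severity="high",
        criterion="1.4.3 Contrast (Minimum)",
        surface="settings dialog",
        owner="A. Example",
        due_date=date(2030, 1, 15),
        close_criteria="Contrast script passes on the dialog body text",
    )
    print(finding.to_row())
    print(finding.problems())
```

Rejecting entries whose `problems()` list is non-empty is one way to keep findings from being logged without clear close criteria.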
AI-assisted workflow
Use AI to summarize and prioritize findings from audit artifacts.
You are helping me triage accessibility audit findings.
Given automated scan output and manual notes, produce:
1) grouped findings by WCAG criterion
2) severity (high/medium/low)
3) likely root cause category (token, component, interaction, content)
4) suggested first fix with the largest risk reduction
Constraints:
- Do not claim compliance without explicit evidence.
- If a criterion cannot be verified from input, mark as "needs manual confirmation".
- Keep output concise and implementation-neutral.
Collaboration considerations
- For PMs: treat unresolved high-severity findings as release criteria, not backlog suggestions
- For developers: tie each finding to a regression test or reusable primitive update
- For research: include at least one assistive-technology walkthrough for critical journeys
- For accessibility: keep criterion mapping explicit so audits remain defensible and repeatable
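Treating unresolved high-severity findings as release criteria can be enforced with a small gate in the release pipeline. The sketch below is one possible shape, not a prescribed implementation: the dictionary keys (`id`, `severity`, `status`) and the `"closed"` status value are assumptions, since the template above does not define a status column.

```python
def release_blockers(findings: list[dict]) -> list[str]:
    """Return IDs of high-severity findings that are not yet closed.

    Each finding is assumed to be a dict with 'id', 'severity', and
    'status' keys; these names are hypothetical.
    """
    return [
        f["id"]
        for f in findings
        if f["severity"] == "high" and f["status"] != "closed"
    ]


def can_release(findings: list[dict]) -> bool:
    """A release candidate passes the gate only with zero blockers."""
    return not release_blockers(findings)


if __name__ == "__main__":
    # Fictional log entries for illustration.
    log = [
        {"id": "AX-001", "severity": "high", "status": "open"},
        {"id": "AX-002", "severity": "low", "status": "open"},
        {"id": "AX-003", "severity": "high", "status": "closed"},
    ]
    print(release_blockers(log))
    print(can_release(log))
```

Lower-severity findings stay visible in the log but do not block the release in this sketch; a team could tighten the rule by changing the severity condition.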
Common failure patterns
- Equating scanner pass with full accessibility pass
- Running audits once per quarter instead of on the release cadence
- Fixing single screens while shared components remain broken
- Logging findings without clear close criteria
- Closing tickets before re-testing with keyboard and screen reader
Companion artifacts
- Accessibility Annotation Playbook for what must be specified before audits begin
- The Accessibility Compliance Baseline Playbook for shell primitives and acceptance checks
- Audit Command Center Template for weekly consolidation across audit domains
- Dark Mode Spec One-Pager for theme contrast requirements in this audit loop
Generalized example
A fictional content workflow product runs this audit on each release candidate. Automated checks flag contrast regressions in two routes; manual testing finds focus-return failures in one dialog flow. The team patches the shared tokens and dialog primitives, then re-runs both lanes before closing the findings. Over three cycles, repeat issues drop because the fixes happen at the system level.
Public-safe review (verified before publish)
- No employer or client product names, codenames, or org names
- No customer names, segment sizes, or identifiable details
- No internal metrics, thresholds, OKRs, or telemetry numbers
- No roadmap, ship dates, or future plans
- No architecture, service names, API shapes, or schema fields from real systems
- No screenshots showing real chrome, real data, or recognizable surfaces
- No internal-only workflows, tools, or terminology
- Every example is fictional or abstracted; numbers are illustrative
- A peer outside any employer could read this and learn nothing proprietary