Accessibility Audit Ops Playbook

By Flora May dela Cruz

An operational workflow for recurring accessibility audits that combines automated checks, manual verification, and an evidence log teams can actually maintain.

Tags: accessibility, audit, wcag, design-ops, quality

Purpose

Most teams run accessibility checks reactively, right before release or after escalation. This playbook turns audits into an operating rhythm: fast automated scans for coverage, manual checks for real interaction quality, and a living evidence log that shows what passed, what failed, and who owns each fix.

When to use it

Your team ships frequently and needs repeatable audit checkpoints
Accessibility findings keep reappearing because fixes are local, not systemic
You need a practical way to combine design review and engineering verification
Compliance conversations require concrete evidence beyond “we ran a scanner”

Skip it when: no one can own remediation, or the team has not yet wired basic accessible primitives into the shell.

Core framework

Use a three-lane audit loop:

Automated lane: run contrast and structural checks on every release candidate
Manual lane: test keyboard path, focus behavior, announcements, and error recovery
Evidence lane: log findings, ownership, severity, and close criteria
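
As a rough illustration of what the automated lane's contrast check computes, here is a minimal sketch of the WCAG 2.x contrast-ratio formula. The function names are hypothetical and are not taken from any particular scanner. The thresholds are the AA minimums: 4.5:1 for normal text and 3:1 for large text.

```python
def _linearize(channel: int) -> float:
    """Convert an 8-bit sRGB channel to linear light using the WCAG 2.x formula."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """Relative luminance of a '#rrggbb' color (0.0 is black, 1.0 is white)."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linearize(r) + 0.7152 * _linearize(g) + 0.0722 * _linearize(b)


def contrast_ratio(foreground: str, background: str) -> float:
    """Contrast ratio between two colors, from 1.0 up to 21.0."""
    lum_a = relative_luminance(foreground)
    lum_b = relative_luminance(background)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa(foreground: str, background: str, large_text: bool = False) -> bool:
    """True when the pair meets the WCAG AA contrast minimum for its text size."""
    threshold = 3.0 if large_text else 4.5
    return contrast_ratio(foreground, background) >= threshold


print(round(contrast_ratio("#000000", "#ffffff"), 2))  # 21.0
print(passes_aa("#767676", "#ffffff"))  # True
```

A check like this only covers color pairs you feed it. It says nothing about keyboard paths or announcements, which is why the manual lane exists.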

Use this lifecycle:

flowchart LR
  A[Automated Scan] --> B[Manual Verification]
  B --> C[Evidence Log]
  C --> D[Remediation]
  D --> E[Re-test]
  E --> F[Close or Re-open]
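
One way to keep findings from skipping steps in this lifecycle is to encode the allowed transitions. The sketch below is an assumption-laden illustration: the `Stage` names mirror the flowchart, and it assumes a re-opened finding returns to remediation, which the flowchart does not spell out.

```python
from enum import Enum


class Stage(Enum):
    AUTOMATED_SCAN = "automated_scan"
    MANUAL_VERIFICATION = "manual_verification"
    EVIDENCE_LOG = "evidence_log"
    REMEDIATION = "remediation"
    RETEST = "retest"
    CLOSED = "closed"


# Allowed next stages. Re-opening is modeled as RETEST -> REMEDIATION (assumption).
ALLOWED = {
    Stage.AUTOMATED_SCAN: {Stage.MANUAL_VERIFICATION},
    Stage.MANUAL_VERIFICATION: {Stage.EVIDENCE_LOG},
    Stage.EVIDENCE_LOG: {Stage.REMEDIATION},
    Stage.REMEDIATION: {Stage.RETEST},
    Stage.RETEST: {Stage.CLOSED, Stage.REMEDIATION},
    Stage.CLOSED: set(),
}


def advance(current: Stage, target: Stage) -> Stage:
    """Return the target stage, or raise if the lifecycle does not allow the move."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target
```

With this shape, closing a finding straight from remediation raises an error, so a re-test is always on record before a close.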

Reusable template

# Accessibility audit run: <surface>

## Run details

- Date: <yyyy-mm-dd>
- Auditor: <name>
- Scope: <routes/components>
- WCAG target: 2.1 AA (AAA contrast where small text applies)

## Automated checks

- Contrast script result: <pass/fail + link to report>
- Structural scanner result: <pass/fail + summary>

## Manual checks

| Check | Result | Notes | Owner |
|---|---|---|---|
| Keyboard-only navigation | <pass/fail> | <notes> | <name> |
| Focus visibility + return | <pass/fail> | <notes> | <name> |
| Screen reader announcement flow | <pass/fail> | <notes> | <name> |
| Error messaging and recovery | <pass/fail> | <notes> | <name> |

## Findings log

| Finding ID | Severity | Criterion | Surface | Owner | Due date | Close criteria |
|---|---|---|---|---|---|---|
| AX-001 | <high/medium/low> | <wcag ref> | <surface> | <name> | <date> | <test that must pass> |
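
If the findings log is generated rather than typed by hand, a small record type can enforce the columns above. This is a sketch only: the `Finding` class, its field names, and the accepted severity spellings are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date

# Accepted severity spellings (assumption: "med" is treated as an alias of "medium").
SEVERITIES = {"high", "medium", "med", "low"}


@dataclass(frozen=True)
class Finding:
    finding_id: str
    severity: str
    criterion: str
    surface: str
    owner: str
    due: date
    close_criteria: str

    def problems(self) -> list[str]:
        """List reasons this entry is not ready to be logged."""
        issues = []
        if self.severity not in SEVERITIES:
            issues.append("unknown severity")
        if not self.owner.strip():
            issues.append("missing owner")
        if not self.close_criteria.strip():
            issues.append("missing close criteria")
        return issues

    def to_row(self) -> str:
        """Render the finding as one row of the findings log table."""
        cells = [
            self.finding_id, self.severity, self.criterion, self.surface,
            self.owner, self.due.isoformat(), self.close_criteria,
        ]
        return "| " + " | ".join(cells) + " |"
```

Rejecting entries whose `problems()` list is non-empty keeps findings without an owner or close criteria out of the log.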

AI-assisted workflow

Use AI to summarize and prioritize findings from audit artifacts.

You are helping me triage accessibility audit findings.
Given automated scan output and manual notes, produce:
1) grouped findings by WCAG criterion
2) severity (high/medium/low)
3) likely root cause category (token, component, interaction, content)
4) suggested first fix with the largest risk reduction

Constraints:
- Do not claim compliance without explicit evidence.
- If a criterion cannot be verified from input, mark as "needs manual confirmation".
- Keep output concise and implementation-neutral.
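
A deterministic grouping pass can serve as a baseline to compare against the AI's summary. The sketch below assumes findings arrive as dictionaries with hypothetical `id`, `criterion`, and `severity` keys. The sample data is fictional.

```python
from collections import defaultdict

SEVERITY_RANK = {"high": 0, "medium": 1, "low": 2}


def group_by_criterion(findings: list[dict]) -> dict[str, list[dict]]:
    """Group findings by WCAG criterion, ordering groups by their worst severity."""
    groups: dict[str, list[dict]] = defaultdict(list)
    for finding in findings:
        groups[finding["criterion"]].append(finding)
    # Within each group, put the most severe finding first.
    for items in groups.values():
        items.sort(key=lambda f: SEVERITY_RANK[f["severity"]])
    # Order groups by their worst finding, then by criterion for a stable result.
    ordered = sorted(
        groups.items(),
        key=lambda kv: (SEVERITY_RANK[kv[1][0]["severity"]], kv[0]),
    )
    return dict(ordered)


sample = [
    {"id": "AX-001", "criterion": "1.4.3", "severity": "medium"},
    {"id": "AX-002", "criterion": "2.4.3", "severity": "high"},
    {"id": "AX-003", "criterion": "1.4.3", "severity": "low"},
]
print(list(group_by_criterion(sample)))  # ['2.4.3', '1.4.3']
```

If the AI's grouping disagrees with this baseline, that disagreement is a prompt for manual confirmation, not a verdict either way.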

Collaboration considerations

For PMs: treat unresolved high-severity findings as release criteria, not backlog suggestions
For developers: tie each finding to a regression test or reusable primitive update
For research: include at least one assistive-technology walkthrough for critical journeys
For accessibility: keep criterion mapping explicit so audits remain defensible and repeatable
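
Treating high-severity findings as release criteria can be expressed as a simple gate. This is a minimal sketch: the `status` field and its `"closed"` value are hypothetical, since the findings log template does not define a status column.

```python
def release_blockers(findings: list[dict]) -> list[str]:
    """Return IDs of high-severity findings that are not yet closed."""
    return [
        f["id"]
        for f in findings
        if f["severity"] == "high" and f["status"] != "closed"
    ]


def can_release(findings: list[dict]) -> bool:
    """A release candidate passes the gate only when no blockers remain."""
    return not release_blockers(findings)


candidate = [
    {"id": "AX-001", "severity": "high", "status": "closed"},
    {"id": "AX-002", "severity": "high", "status": "retest"},
    {"id": "AX-003", "severity": "low", "status": "open"},
]
print(release_blockers(candidate))  # ['AX-002']
```

Run as a step in the release pipeline, a gate like this makes the PM guidance enforceable instead of advisory.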

Common failure patterns

Equating scanner pass with full accessibility pass
Running audits once per quarter instead of per release cadence
Fixing single screens while shared components remain broken
Logging findings without clear close criteria
Closing tickets before re-testing with keyboard and screen reader

Companion artifacts

Accessibility Annotation Playbook for what must be specified before audits begin
Accessibility Compliance Baseline Playbook for shell primitives and acceptance checks
Audit Command Center Template for weekly consolidation across audit domains
Dark Mode Spec One-Pager for theme contrast requirements in this audit loop

Generalized example

A fictional content workflow product runs this audit on each release candidate. Automated checks flag contrast regressions in two routes; manual testing finds focus return failures in one dialog flow. The team patches shared tokens and dialog primitives, then re-runs both lanes before closing findings. Over three cycles, repeat issues drop because fixes happen at the system level, not screen by screen.

Public-safe review (verified before publish)

No employer or client product names, codenames, or org names
No customer names, segment sizes, or identifiable details
No internal metrics, thresholds, OKRs, or telemetry numbers
No roadmap, ship dates, or future plans
No architecture, service names, API shapes, or schema fields from real systems
No screenshots showing real chrome, real data, or recognizable surfaces
No internal-only workflows, tools, or terminology
Every example is fictional or abstracted; numbers are illustrative
A peer outside any employer could read this and learn nothing proprietary
