Review Test Code for Anti-Patterns with AI
Reads a test file and returns a categorized list of anti-patterns — hard sleeps, shared mutable state, weak assertions (`toBeTruthy` instead of `toEqual`), missing teardown, mixed setup/assertion concerns — each with line numbers, severity, and a suggested fix.
When to use it
- PR review of a new test file or suite.
- Auditing an inherited test suite that everyone complains about but no one fixes.
- Teaching engineers what 'bad test code' looks like by showing real examples.
- Standardizing test quality across multiple teams.
The prompt
XML-tagged — best for Claude 4.x
<role>
You are a test code reviewer. You can name anti-patterns by category and you cite line numbers so suggestions are actionable. You distinguish anti-patterns from legitimate exceptions.
</role>
<context>
Common test anti-patterns:
- **Hard sleeps** — `waitForTimeout`, `sleep`, `setTimeout` for synchronization
- **Shared mutable state** — variables outside test scope; test order dependence
- **Weak assertions** — `toBeTruthy` / `toBeDefined` when `toEqual` would be precise
- **Missing teardown** — fixtures that don't clean up; tests that pollute next runs
- **Mixed concerns** — setup, action, and assertion interleaved instead of arranged (Arrange-Act-Assert)
- **God assertions** — one `expect` that checks 10 things; failure unclear
- **Test interdependence** — test 2 requires test 1 having run
- **Hardcoded values** — magic numbers / strings instead of named constants
- **No descriptive names** — `test('test 1')` instead of `test('signs in with valid credentials')`
- **Try/catch hiding failures** — caught exception means test still passes
</context>
<task>
For the test file below:
1. Identify anti-patterns by category.
2. Cite LINE NUMBERS (or "near setup block" if unclear).
3. Rate severity: MAJOR (likely to cause a test failure or a wrong result) or MINOR (style / maintainability).
4. Suggest a concrete fix per anti-pattern.
5. Note any patterns that LOOK like anti-patterns but are legitimate (e.g., one intentional sleep covering a race that cannot be synchronized any other way).
</task>
<input>
Test file content (with line numbers): {test_code}
Framework (Playwright / Jest / Cypress): {framework}
</input>
<constraints>
- LINE NUMBERS required (or explicit "near X" reference).
- Categorize each finding by anti-pattern type.
- Severity labels: MAJOR / MINOR — no "high/medium/low" or other terms.
- Concrete fix per finding, not "improve this".
- Distinguish anti-pattern from legitimate exception (rare but possible).
</constraints>
<output_format>
Markdown table: Line | Anti-pattern | Severity | Issue | Suggested fix. Follow it with 2-3 bullets on the BIGGEST issue (if it is a pattern that spans multiple lines).
</output_format>
Before writing, read the entire file once to catch cross-line patterns (e.g., shared state).
Example
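The prompt expects `{test_code}` to carry line numbers so findings can cite them. A minimal sketch of preparing that input, assuming you fill the template yourself before sending it. The helper names `numberLines` and `fillPrompt` are hypothetical and not part of any library, and the template here is shortened to the `<input>` block.

```typescript
// Hypothetical helpers for illustration; not part of any library or of the prompt itself.
type Framework = "Playwright" | "Jest" | "Cypress";

/** Prefix each line with its 1-based number so findings can cite exact lines. */
function numberLines(source: string): string {
  return source
    .split(/\r?\n/)
    .map((line, i) => `${i + 1}: ${line}`)
    .join("\n");
}

/** Substitute the {test_code} and {framework} placeholders in the template. */
function fillPrompt(template: string, testCode: string, framework: Framework): string {
  // Replacer functions keep "$" sequences in the test code from being
  // interpreted as special replacement patterns.
  return template
    .replace("{test_code}", () => numberLines(testCode))
    .replace("{framework}", () => framework);
}

// Shortened template: only the <input> block of the prompt above.
const template = [
  "<input>",
  "Test file content (with line numbers): {test_code}",
  "Framework (Playwright / Jest / Cypress): {framework}",
  "</input>",
].join("\n");

const sample = "test('test 1', async () => {\n  await page.waitForTimeout(3000);\n});";
const prompt = fillPrompt(template, sample, "Playwright");
console.log(prompt);
```

Numbering the lines yourself, rather than asking the model to count them, should make the cited line numbers easier to check against the file.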
Common pitfalls
- Model rewrites the entire test file instead of pinpointing line-level issues. Force table-based findings.
- False positives on legitimate sleeps (rare, but they exist, e.g., waiting out a debounce); the reviewer must distinguish them.
- Cross-line patterns (shared state, missing teardown) get listed as 10 individual issues; cluster them.
- Severity defaults to all MAJOR. Force discrimination: style issues are MINOR.
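Several of these pitfalls can be caught mechanically once the reply arrives. The following is a rough sketch, assuming the model returned the five-column Markdown table requested in the prompt. `parseFindings`, `lintFindings`, and the threshold of 3 are illustrative choices, and the parser does not handle escaped pipes inside cells.

```typescript
// Hypothetical post-processing of the model's reply; all names are illustrative.
interface Finding {
  line: string; // "12" or an explicit "near X" reference
  antiPattern: string;
  severity: string;
  issue: string;
  fix: string;
}

/** Parse rows of a "Line | Anti-pattern | Severity | Issue | Suggested fix" table. */
function parseFindings(markdown: string): Finding[] {
  const findings: Finding[] = [];
  for (const raw of markdown.split("\n")) {
    const row = raw.trim();
    if (row.charAt(0) !== "|") continue; // not a table row
    const cells = row
      .replace(/^\|/, "")
      .replace(/\|$/, "")
      .split("|")
      .map((c) => c.trim());
    if (cells.length !== 5) continue;
    // Skip the header row and the |---|---| separator row.
    if (cells[0] === "Line" || /^:?-+:?$/.test(cells[0])) continue;
    findings.push({
      line: cells[0],
      antiPattern: cells[1],
      severity: cells[2],
      issue: cells[3],
      fix: cells[4],
    });
  }
  return findings;
}

/** Flag bad severity labels, all-MAJOR output, and patterns that should be clustered. */
function lintFindings(findings: Finding[], clusterThreshold = 3): string[] {
  const warnings: string[] = [];
  const counts: Record<string, number> = {};
  for (const f of findings) {
    if (f.severity !== "MAJOR" && f.severity !== "MINOR") {
      warnings.push(`Unexpected severity "${f.severity}" at line ${f.line}`);
    }
    counts[f.antiPattern] = (counts[f.antiPattern] || 0) + 1;
  }
  if (findings.length > 1 && findings.every((f) => f.severity === "MAJOR")) {
    warnings.push("Every finding is MAJOR; ask for style issues to be re-rated as MINOR");
  }
  for (const name of Object.keys(counts)) {
    if (counts[name] >= clusterThreshold) {
      warnings.push(`"${name}" appears ${counts[name]} times; consider clustering`);
    }
  }
  return warnings;
}

// Made-up reply used only to exercise the helpers.
const reply = [
  "| Line | Anti-pattern | Severity | Issue | Suggested fix |",
  "|------|--------------|----------|-------|---------------|",
  "| 4 | Shared mutable state | MAJOR | `user` declared at module scope | Create it in `beforeEach` |",
  "| 9 | Shared mutable state | MAJOR | Mutates `user` | Use a per-test copy |",
  "| 15 | Shared mutable state | High | Reads mutated `user` | Use a per-test copy |",
].join("\n");

const findings = parseFindings(reply);
const warnings = lintFindings(findings);
console.log(warnings);
```

If the reply contains no table rows at all, `parseFindings` returns an empty array, which is a reasonable signal that the model rewrote the file instead of listing findings.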
Tips
- Run this BEFORE merging, not after. Anti-patterns compound — a test using shared state spreads the pattern to neighbors.
- Pair with `refactor-flaky-test` when an anti-pattern (hard sleep) is the actual root cause of a flake.
- Use this as part of a PR template: link to the prompt as a self-review tool.
- Re-run quarterly on existing test suites; debt accumulates faster than you think.
FAQ
The anti-pattern that scales worst is test interdependence. A suite with shared state cannot be parallelized or sharded, and individual test runs become unreliable. Fixing it requires touching every test.
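A framework-free sketch of why interdependence breaks under reordering, which is roughly what sharding and parallel runs do. The cart, the test functions, and the tiny `run` helper are all hypothetical, and Node's built-in `assert` stands in for a real test runner.

```typescript
import { strict as assert } from "node:assert";

// Anti-pattern: module-level mutable state shared by both "tests".
const sharedCart: string[] = [];

function testAddsItemShared(): void {
  sharedCart.push("book");
  assert.equal(sharedCart.length, 1);
}

function testCheckoutShared(): void {
  // Silently relies on testAddsItemShared having run first.
  assert.equal(sharedCart.length, 1);
}

// Isolated version: each test builds its own state through a factory.
function makeCart(items: string[] = []): string[] {
  return items.slice();
}

function testAddsItemIsolated(): void {
  const cart = makeCart();
  cart.push("book");
  assert.equal(cart.length, 1);
}

function testCheckoutIsolated(): void {
  const cart = makeCart(["book"]);
  assert.equal(cart.length, 1);
}

/** Run tests in the given order and report pass (true) or fail (false) for each. */
function run(tests: Array<() => void>): boolean[] {
  return tests.map((t) => {
    try {
      t();
      return true;
    } catch (err) {
      return false;
    }
  });
}

sharedCart.length = 0; // reset before each demonstration run
const inOrder = run([testAddsItemShared, testCheckoutShared]);
sharedCart.length = 0;
const reversed = run([testCheckoutShared, testAddsItemShared]);
const isolatedReversed = run([testCheckoutIsolated, testAddsItemIsolated]);

console.log({ inOrder, reversed, isolatedReversed });
```

The shared version passes only in its original order, while the isolated version passes in either order. That order independence is what makes parallel and sharded runs safe.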
Related prompts
Refactor Flaky Test to Stable
Takes a flaky test and its failure history, identifies which of the canonical root causes (race, hard sleep, shared state, network dependency, ordering, animation) is responsible, and produces a rewritten test that fixes the specific cause — no blanket retries.
Convert Synchronous Waits to Auto-Waiting
Reads a test using hard waits and returns a rewritten version using Playwright auto-waiting (`expect(locator).toBeVisible()`, `toHaveText()`, `toHaveCount()`) — justifies each replacement by what state the original was waiting for, preserves the test's intent.
Test Code Quality Checklist
Returns a per-test-file quality checklist with 20-30 items grouped by category (naming / structure / assertions / isolation / performance / maintainability) — each marked PASS/FAIL with one-line evidence from the code.
Refactor Test Suite for DRY
Scans a set of test files and identifies duplicated setup, fixture state, and assertion patterns — proposes refactors using Playwright fixtures, factory functions, or shared helper modules with concrete code diffs. Warns against premature abstraction (single-use helpers).