Can I automate this in CI?

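Partially. ESLint covers some of the simple cases: `no-magic-numbers` is a core rule, while `no-shared-state` is not, so shared-state checks would have to come from a plugin or a custom rule. The pattern-based detections (god assertion, weak assertion) are harder to lint reliably. Use this prompt for the harder patterns and ESLint for the simple ones.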

How is this different from `test-code-quality-checklist`?

This prompt finds problems in a specific file ("what's wrong here?"). The checklist verifies general best practices ("is this well-tested?"). Use both: the checklist for breadth, this prompt for depth.

What about false positives on legitimate sleeps?

Rare but real — e.g., waiting on a debounced search input. Output should flag them as 'legitimate exception' with reasoning. Don't blindly remove them.

Review Test Code for Anti-Patterns with AI

Updated 2026-06-08 · intermediate · Test Code Review

Reads a test file and returns a categorized list of anti-patterns — hard sleeps, shared mutable state, weak assertions (`toBeTruthy` instead of `toEqual`), missing teardown, mixed setup/assertion concerns — each with line numbers, severity, and a suggested fix.

When to use it

PR review of a new test file or suite.
Auditing an inherited test suite that everyone complains about but no one fixes.
Teaching engineers what 'bad test code' looks like by showing real examples.
Standardizing test quality across multiple teams.

The prompt

XML-tagged — best for Claude 4.x

<role>
You are a test code reviewer. You can name anti-patterns by category and you cite line numbers so suggestions are actionable. You distinguish anti-patterns from legitimate exceptions.
</role>

<context>
Common test anti-patterns:
- **Hard sleeps** — `waitForTimeout`, `sleep`, `setTimeout` for synchronization
- **Shared mutable state** — variables outside test scope; test order dependence
- **Weak assertions** — `toBeTruthy` / `toBeDefined` when `toEqual` would be precise
- **Missing teardown** — fixtures that don't clean up; tests that pollute next runs
- **Mixed concerns** — setup, action, and assertion interleaved instead of arranged (Arrange-Act-Assert)
- **God assertions** — one `expect` that checks 10 things; failure unclear
- **Test interdependence** — test 2 requires test 1 having run
- **Hardcoded values** — magic numbers / strings instead of named constants
- **No descriptive names** — `test('test 1')` instead of `test('signs in with valid credentials')`
- **Try/catch hiding failures** — caught exception means test still passes
</context>

<task>
For the test file below:
1. Identify anti-patterns by category.
2. Cite LINE NUMBERS (or "near setup block" if unclear).
3. Rate severity: MAJOR (likely cause of a test failure or wrong result) or MINOR (style / maintainability).
4. Suggest a concrete fix per anti-pattern.
5. Note any patterns that LOOK like anti-patterns but are legitimate (e.g., one intentional sleep that waits out an unavoidable race).
</task>

<input>
Test file content (with line numbers): {test_code}
Framework (Playwright / Jest / Cypress): {framework}
</input>

<constraints>
- LINE NUMBERS required (or explicit "near X" reference).
- Categorize each finding by anti-pattern type.
- Severity labels: MAJOR / MINOR — no "high/medium/low" or other terms.
- Concrete fix per finding, not "improve this".
- Distinguish anti-pattern from legitimate exception (rare but possible).
</constraints>

<output_format>
Markdown table: Line | Anti-pattern | Severity | Issue | Suggested fix. Follow the table with 2-3 bullets on the BIGGEST issue if that issue is a pattern spanning multiple lines.
</output_format>

Before writing, read the entire file once to catch cross-line patterns (e.g., shared state).

Example
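
The prompt expects the test file with line numbers so that findings can cite them. One way to prepare that input is sketched below; the helper `withLineNumbers`, the sample test, and the shortened `promptTemplate` are all hypothetical stand-ins, not part of the prompt itself.

```typescript
// Hypothetical helper: prefixes each line with its 1-based number so the
// model can cite lines in its findings table.
function withLineNumbers(source: string): string {
  return source
    .split(/\r?\n/)
    .map((line, i) => `${i + 1}: ${line}`)
    .join("\n");
}

// Hypothetical test file containing a few of the anti-patterns the prompt lists
// (non-descriptive name, hard sleep, weak assertion).
const sampleTest = [
  "test('test 1', async ({ page }) => {",
  "  await page.goto('/search');",
  "  await page.waitForTimeout(3000);",
  "  expect(await page.title()).toBeTruthy();",
  "});",
].join("\n");

// Shortened stand-in for the <input> section of the prompt.
const promptTemplate =
  "Test file content (with line numbers): {test_code}\n" +
  "Framework (Playwright / Jest / Cypress): {framework}";

// Function replacers are used so that any `$` in the test code is inserted
// literally instead of being read as a replacement pattern.
const filledPrompt = promptTemplate
  .replace("{test_code}", () => "\n" + withLineNumbers(sampleTest))
  .replace("{framework}", () => "Playwright");

console.log(filledPrompt);
```

Numbering the lines before sending the file means the model does not have to count them itself, which makes the cited numbers easier to trust.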

Common pitfalls

The model rewrites the entire test file instead of pinpointing line-level issues. Force table-based findings.
False positives on legitimate sleeps are rare but do occur (e.g., waiting out a debounce); the reviewer must distinguish them from real hard sleeps.
Cross-line patterns (shared state, missing teardown) get listed as 10 individual issues. Cluster them into one finding.
Severity defaults to MAJOR for everything. Force discrimination: style issues are MINOR.
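
If the model still emits one row per affected line, the rows can be clustered after the fact. The following is a minimal sketch that assumes the findings table has already been parsed into objects; the `Finding` and `Cluster` shapes and the `clusterFindings` name are hypothetical, and the sample findings are invented for illustration.

```typescript
type Severity = "MAJOR" | "MINOR";

interface Finding {
  line: number;
  antiPattern: string;
  severity: Severity;
  issue: string;
}

interface Cluster {
  antiPattern: string;
  severity: Severity; // MAJOR if any member finding is MAJOR
  lines: number[];
}

// Groups findings by anti-pattern so a cross-line pattern becomes one entry.
function clusterFindings(findings: Finding[]): Cluster[] {
  const byPattern = new Map<string, Cluster>();
  for (const f of findings) {
    const existing = byPattern.get(f.antiPattern);
    if (existing === undefined) {
      byPattern.set(f.antiPattern, {
        antiPattern: f.antiPattern,
        severity: f.severity,
        lines: [f.line],
      });
    } else {
      existing.lines.push(f.line);
      if (f.severity === "MAJOR") existing.severity = "MAJOR";
    }
  }
  const clusters = Array.from(byPattern.values());
  for (const c of clusters) c.lines.sort((a, b) => a - b);
  return clusters;
}

// Invented sample: three rows for the same shared-state problem, one unrelated row.
const findings: Finding[] = [
  { line: 4, antiPattern: "Shared mutable state", severity: "MAJOR", issue: "`user` declared outside test scope" },
  { line: 12, antiPattern: "Shared mutable state", severity: "MAJOR", issue: "`user` mutated in first test" },
  { line: 9, antiPattern: "Hardcoded values", severity: "MINOR", issue: "magic number 3000" },
  { line: 20, antiPattern: "Shared mutable state", severity: "MINOR", issue: "`user` read in second test" },
];

const clusters = clusterFindings(findings);
console.log(clusters);
```

Clusters come back in the order each anti-pattern first appears, so the report keeps roughly the same reading order as the original table.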

Tips

Run this BEFORE merging, not after. Anti-patterns compound: once one test relies on shared state, neighboring tests tend to copy the pattern.
Pair with `refactor-flaky-test` when an anti-pattern (hard sleep) is the actual root cause of a flake.
Use this as part of your PR template: link to the prompt as a self-review tool.
Re-run quarterly on existing test suites; debt accumulates faster than you think.

FAQ

Test interdependence is the anti-pattern that scales worst. A suite with shared state cannot be parallelized or sharded, and running an individual test on its own becomes unreliable. Fixing it requires touching every test.
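
The ordering problem can be shown without a test framework. In this sketch, plain functions returning `true` or `false` stand in for test cases; all names (`sharedCart`, `makeCart`, and the test functions) are hypothetical.

```typescript
// Interdependent version: both "tests" read and write one shared cart.
const sharedCart: string[] = [];

function testAddsItem(): boolean {
  sharedCart.push("book");
  return sharedCart.length === 1;
}

function testCheckoutSeesOneItem(): boolean {
  // Passes only if testAddsItem has already run: a hidden ordering requirement.
  return sharedCart.length === 1;
}

// Isolated version: each test builds its own state through a factory.
function makeCart(items: string[] = []): string[] {
  return items.slice();
}

function isolatedAddsItem(): boolean {
  const cart = makeCart();
  cart.push("book");
  return cart.length === 1;
}

function isolatedCheckoutSeesOneItem(): boolean {
  const cart = makeCart(["book"]); // arranges its own precondition
  return cart.length === 1;
}

// Run the second test of each pair on its own, as a shard or a filtered run would.
const dependentAlone = testCheckoutSeesOneItem();
const isolatedAlone = isolatedCheckoutSeesOneItem();

console.log({ dependentAlone, isolatedAlone });
```

The dependent test fails when run alone because nothing has populated `sharedCart`, while the isolated test passes in any order. That is the property a suite needs before it can be sharded or run in parallel.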

Related prompts

Test Automation · advanced

Refactor Flaky Test to Stable

Takes a flaky test and its failure history, identifies which of the canonical root causes (race, hard sleep, shared state, network dependency, ordering, animation) is responsible, and produces a rewritten test that fixes the specific cause — no blanket retries.

Test Code Review · intermediate

Convert Synchronous Waits to Auto-Waiting

Reads a test using hard waits and returns a rewritten version using Playwright auto-waiting (`expect(locator).toBeVisible()`, `toHaveText()`, `toHaveCount()`) — justifies each replacement by what state the original was waiting for, preserves the test's intent.

Test Code Review · basic

Test Code Quality Checklist

Returns a per-test-file quality checklist with 20-30 items grouped by category (naming / structure / assertions / isolation / performance / maintainability) — each marked PASS/FAIL with one-line evidence from the code.

Test Code Review · intermediate

Refactor Test Suite for DRY

Scans a set of test files and identifies duplicated setup, fixture state, and assertion patterns — proposes refactors using Playwright fixtures, factory functions, or shared helper modules with concrete code diffs. Warns against premature abstraction (single-use helpers).
