Leaky bucket vs token bucket — what's the difference?

Token bucket allows bursts up to the bucket size, then admits requests only as fast as tokens refill at the configured rate. Leaky bucket smooths bursts into a constant output rate. Token bucket is the more permissive of the two, and many API gateways use it.
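
To make the contrast concrete, here is a minimal sketch of both algorithms. The class names, parameters, and the explicit `now` argument are illustrative assumptions, not taken from any particular gateway, and the leaky bucket shown is the queueing (shaper) variant.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    """Admits a request if a token is available; tokens refill continuously."""
    capacity: float      # maximum burst size
    refill_rate: float   # tokens added per second
    tokens: float = 0.0
    last: float = 0.0    # time of the previous call, in seconds

    def __post_init__(self):
        self.tokens = self.capacity  # start full so an initial burst is allowed

    def allow(self, now: float) -> bool:
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


@dataclass
class LeakyBucket:
    """Queues requests and releases them at a constant rate (shaper variant)."""
    queue_size: int          # how many requests may be ahead of a new arrival
    leak_rate: float         # requests released per second
    next_free: float = 0.0   # earliest time the next request may depart

    def submit(self, now: float):
        """Return this request's departure time, or None if the queue is full."""
        interval = 1.0 / self.leak_rate
        depart = max(now, self.next_free)
        ahead = (depart - now) / interval  # requests scheduled before this one
        if ahead > self.queue_size:
            return None
        self.next_free = depart + interval
        return depart


if __name__ == "__main__":
    tb = TokenBucket(capacity=5, refill_rate=2.0)
    print([tb.allow(0.0) for _ in range(10)])   # the burst is admitted up to capacity

    lb = LeakyBucket(queue_size=3, leak_rate=2.0)
    print([lb.submit(0.0) for _ in range(6)])   # the same burst is spread out over time
```

With identical input, the token bucket admits the first five requests immediately, while the leaky bucket accepts four and spaces their departures half a second apart.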

What about distributed rate-limiter race conditions?

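Distributed limiters backed by Redis or a similar shared store often show off-by-N errors at high concurrency, commonly when the read and the increment are not a single atomic step. Test for this specifically: run 100+ parallel workers against the limit boundary and report the delta between how many requests should have been rejected and how many actually were.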
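
The shape of such a test can be sketched in-process. The two limiter classes below are hypothetical stand-ins for a shared store (no Redis is involved), and `boundary_delta` is an illustrative helper; in a real test the workers would call the actual endpoint instead of `limiter.allow()`.

```python
import threading
import time


class RacyLimiter:
    """Check-then-increment in two steps, like a read followed by a separate write."""

    def __init__(self, limit: int):
        self.limit = limit
        self.count = 0

    def allow(self) -> bool:
        if self.count < self.limit:
            time.sleep(0.001)  # widen the race window, standing in for network latency
            self.count += 1
            return True
        return False


class AtomicLimiter(RacyLimiter):
    """Same logic, but the check and the increment happen under one lock."""

    def __init__(self, limit: int):
        super().__init__(limit)
        self._lock = threading.Lock()

    def allow(self) -> bool:
        with self._lock:
            if self.count < self.limit:
                self.count += 1
                return True
            return False


def boundary_delta(limiter, workers: int) -> int:
    """Fire `workers` simultaneous requests; return should-fail minus did-fail."""
    barrier = threading.Barrier(workers)  # release every worker at the same moment
    results = []
    results_lock = threading.Lock()

    def worker():
        barrier.wait()
        ok = limiter.allow()
        with results_lock:
            results.append(ok)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    should_fail = max(0, workers - limiter.limit)
    did_fail = results.count(False)
    return should_fail - did_fail


if __name__ == "__main__":
    print("atomic delta:", boundary_delta(AtomicLimiter(100), workers=150))
    print("racy delta:  ", boundary_delta(RacyLimiter(100), workers=150))
```

A delta of zero means the limiter rejected exactly as many requests as it should have. A positive delta is the off-by-N: requests that were admitted past the limit.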

Should I test rate limits in production?

Only in synthetic test paths designed for it. Real-traffic limits should be tested in staging with realistic traffic patterns. Production rate-limit testing risks impacting real customers.

Generate Rate Limiting Test Scenarios for an API

Updated 2026-06-08 · intermediate · Security Testing

Returns rate-limiting test cases verifying X-RateLimit-* headers, behavior at the limit boundary, burst handling, reset window semantics, per-authentication-level limits, and concurrent request behavior across multiple workers.

When to use it

Verifying a newly implemented rate limiter.
Auditing existing rate limits after a denial-of-service incident.
Documenting rate limit behavior for API consumers (rate limit page).
Validating that public, authenticated, and admin tiers have correctly differentiated limits.

The prompt

XML-tagged — best for Claude 4.x

<role>
You are an API testing specialist. You know that 'rate limiting works' is meaningless — you have to verify SPECIFIC behaviors: header semantics, boundary precision, concurrency handling, reset rules.
</role>

<context>
Common rate-limit algorithms: fixed window, sliding window, token bucket, leaky bucket. Each has different boundary behavior and burst characteristics. Headers: X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset (or Retry-After on 429). Limits often differ per auth tier (anonymous / authenticated / admin / partner).
</context>

<task>
For the endpoint and limit configuration below, generate:
1. Header validation tests — X-RateLimit-* headers present and accurate
2. Boundary behavior tests — exact request count where 429 starts
3. Burst handling tests — N requests in 1 second
4. Reset semantics tests — request after window expiry
5. Per-auth-level tests — anon / authenticated / admin limits differ
6. Concurrency tests — N parallel requests at boundary
</task>

<input>
Endpoint: {endpoint}
Limit configuration (limit + window + algorithm): {config}
Auth tiers and their limits: {tiers}
</input>

<constraints>
- Each test specifies: name, request count + timing, expected status, expected headers.
- Boundary tests target EXACT counts (e.g., request 100 of 100 = 200; request 101 = 429).
- Burst tests differ from boundary tests — burst is "many requests fast", boundary is "exactly at the limit".
- Concurrency tests must verify the limiter is correct under parallel load (some implementations are off-by-N).
- Include at least one negative test where bypassing is attempted (varying User-Agent, IP spoofing if possible).
</constraints>

<output_format>
Three sections:
1. **Test scenarios table** — Scenario | Setup | Request pattern | Expected status | Expected headers | Notes
2. **Implementation sketch** — pseudocode for the most complex scenario (concurrency test)
3. **Documentation snippet** — text suitable for an API docs "Rate Limits" section
</output_format>

Before writing, identify which algorithm the config implies (fixed window vs sliding vs token bucket).

Common pitfalls

The model conflates boundary and burst tests. They are different tests, so force separate scenarios.
The concurrency test gets omitted, yet that is where most limiters break. Always include it.
Per-tier tests get glossed over when the input is single-tier. If your API has only one auth level, ASK for the planned tiers and test them.
Bypass tests get treated as 'attempts to attack' instead of as validation of identity keying. Frame them as 'verify identification is correct'.
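
To show why boundary and burst belong in separate scenarios, the sketch below runs both, plus a reset check, against a hypothetical in-memory fixed-window limiter. The class, the scenario functions, and the 100-per-60-seconds configuration are assumptions for illustration. The injected `now` argument only keeps the example fast; it is not a substitute for testing a deployed limiter against the real clock.

```python
class FixedWindowLimiter:
    """Hypothetical in-memory fixed-window limiter, used only to illustrate the tests."""

    def __init__(self, limit: int, window_seconds: int):
        self.limit = limit
        self.window_seconds = window_seconds
        self.window_id = None
        self.count = 0

    def request(self, now: float):
        """Return (status, headers) for a request arriving at time `now` in seconds."""
        window_id = int(now // self.window_seconds)
        if window_id != self.window_id:  # a new window started: reset the counter
            self.window_id = window_id
            self.count = 0
        allowed = self.count < self.limit
        if allowed:
            self.count += 1
        headers = {
            "X-RateLimit-Limit": str(self.limit),
            "X-RateLimit-Remaining": str(self.limit - self.count),
        }
        return (200 if allowed else 429), headers


def boundary_scenario(limit=100, window=60):
    """Paced requests (0.1 s apart): request `limit` succeeds, request limit+1 is a 429."""
    limiter = FixedWindowLimiter(limit, window)
    return [limiter.request(now=i * 0.1)[0] for i in range(limit + 1)]


def burst_scenario(limit=100, window=60, burst=300):
    """Every request arrives at the same instant: only `limit` may be admitted."""
    limiter = FixedWindowLimiter(limit, window)
    return [limiter.request(now=0.0)[0] for _ in range(burst)]


def reset_scenario(limit=100, window=60):
    """Exhaust the window, then send one request just after it expires."""
    limiter = FixedWindowLimiter(limit, window)
    for _ in range(limit + 1):
        limiter.request(now=0.0)
    return limiter.request(now=float(window))


if __name__ == "__main__":
    statuses = boundary_scenario()
    print("request 100:", statuses[99], "| request 101:", statuses[100])
    print("burst admitted:", burst_scenario().count(200))
    print("after reset:", reset_scenario())
```

The boundary scenario asserts on one exact position (request 100 versus 101), while the burst scenario asserts on a total across many near-simultaneous requests. A limiter can pass one and fail the other, which is why they should not be merged.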

Tips

Run the concurrency test in CI weekly; race conditions in limiters often surface only at scale.
Use real-clock waits for reset semantics; mocking time often hides real bugs.
Validate that Retry-After is a small positive number of seconds, not 0 and not implausibly large. Some limiters incorrectly return 0, which tells clients to retry immediately.
Pair with `auth-bypass-test-cases`; rate limiting is often a defense against credential stuffing.
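
The Retry-After check can be sketched as a small helper. The function name and the choice of the window length as the upper bound are assumptions; pick a bound that matches your own limiter. The helper accepts both forms the header may take, a number of seconds or an HTTP-date.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_problems(value, window_seconds, now=None):
    """Return a list of problems with a Retry-After value (empty list means OK)."""
    if value is None:
        return ["Retry-After is missing"]
    value = value.strip()

    if value.isdigit():
        seconds = int(value)  # delay given directly in seconds
    else:
        try:
            when = parsedate_to_datetime(value)  # HTTP-date form
        except (TypeError, ValueError):
            return [f"Retry-After is unparseable: {value!r}"]
        if when.tzinfo is None:
            when = when.replace(tzinfo=timezone.utc)
        now = now or datetime.now(timezone.utc)
        seconds = (when - now).total_seconds()

    problems = []
    if seconds <= 0:
        problems.append("Retry-After is 0 or in the past (client would retry immediately)")
    if seconds > window_seconds:
        # Assumption: a wait longer than the whole window is suspicious.
        problems.append("Retry-After exceeds the rate-limit window")
    return problems


if __name__ == "__main__":
    for header in ["30", "0", "86400", "soon", None]:
        print(repr(header), "->", retry_after_problems(header, window_seconds=60))
```

In a test suite, the call would typically follow a request that is known to return 429, with the assertion that the returned list is empty.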

FAQ

A 429 means an identified consumer is over its own limit, so it is specific to that identity. A 503 means the server is overloaded for everyone. The two signal different things to consumers: use 429 for rate limits and 503 for capacity issues.

Related prompts

Security Testing · intermediate

Generate Security Test Checklist (OWASP ASVS)

Returns an OWASP ASVS-aligned security testing checklist covering authentication, session management, authorization, input validation, output encoding, cryptography, API security, file upload, and HTTP security headers — each item with a test method (manual / DAST / SAST) and ASVS chapter citation.

Security Testing · intermediate

Create OWASP Top 10 Test Scenarios

For each OWASP Top 10 (2025) category (A01-A10), returns 3-5 concrete test cases with payloads, recommended tools, expected secure behavior, and remediation guidance tailored to the target application.

Security Testing · advanced

Generate Authentication Bypass Test Cases

Returns a structured suite of authentication and authorization bypass test cases — IDOR, JWT algorithm confusion, session fixation, MFA bypass, brute-force resistance, broken object-level authz — with payloads, CWE numbers, and the detection signal that confirms vulnerability vs secure behavior.

Test Automation · intermediate

Create API Test Suite from OpenAPI Spec

Reads an OpenAPI 3.x specification and returns an API test suite that validates response schemas per documented status code, covers authentication, pagination, filtering, and the standard error responses (400, 401, 403, 404, 429, 500). Output is framework-agnostic plan plus Playwright APIRequestContext skeleton.
