How do I estimate think time?

Measure it from production data. If none is available, start with 5-10 seconds between user actions and tune until the test produces realistic per-session request counts. Load tests with no think time are usually wrong: each virtual user fires requests back to back, which real users do not do.
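
As a rough sketch of measuring think time from production data, the snippet below takes the median gap between consecutive requests in the same session. The `(session_id, timestamp)` event shape and the sample values are assumptions for illustration, not a real log format. The gap between request timestamps also includes response time, so the result slightly overestimates pure think time.

```python
from collections import defaultdict
from statistics import median


def estimate_think_time(events):
    """Median gap in seconds between consecutive requests of one session.

    `events` is an iterable of (session_id, timestamp_seconds) pairs
    (a hypothetical shape). Returns None when no session has two requests.
    """
    by_session = defaultdict(list)
    for session_id, timestamp in events:
        by_session[session_id].append(timestamp)

    gaps = []
    for timestamps in by_session.values():
        timestamps.sort()
        # Pair each request with the next one in the same session.
        gaps.extend(later - earlier
                    for earlier, later in zip(timestamps, timestamps[1:]))

    return median(gaps) if gaps else None


# Hypothetical sample: two sessions, timestamps in seconds.
sample = [("a", 0), ("a", 6), ("a", 14), ("b", 100), ("b", 107)]
print(estimate_think_time(sample))
```

The median is used rather than the mean so that a few very long idle gaps do not dominate the estimate.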

How do I deal with rate limits during load tests?

Either bypass them (test infrastructure mode) or stub the rate-limited dependency. Running load tests through a real rate-limited dependency tests the limiter, not your system.
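
One possible shape for the stub option is sketched below. `PartnerClientStub`, its `lookup` method, and the canned response are hypothetical names standing in for whatever client your system actually calls. The stub never rate-limits, but it can simulate latency so the system under test does not look faster than it would against the real dependency.

```python
import time


class PartnerClientStub:
    """Hypothetical stand-in for a rate-limited partner API client."""

    def __init__(self, latency_seconds=0.0, canned_response=None):
        self.latency_seconds = latency_seconds
        self.canned_response = canned_response or {"status": "ok"}
        self.calls = 0  # lets the test confirm how often the stub was hit

    def lookup(self, key):
        self.calls += 1
        if self.latency_seconds:
            # Approximate the real dependency's response time.
            time.sleep(self.latency_seconds)
        return {"key": key, **self.canned_response}


stub = PartnerClientStub(latency_seconds=0.01)
print(stub.lookup("order-123"))
```

Setting `latency_seconds` from the dependency's observed median latency is one reasonable choice; a fixed value is only an approximation of real variance.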

Should performance tests block deploy?

A passing load test should be a blocking gate. Stress, spike, and soak tests run less frequently, and gating CI on them is impractical. A common arrangement: the load test in CI gates merges, and a quarterly stress test gates the production deploy schedule.
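
A blocking gate can be as small as a threshold check over the load test summary. The sketch below assumes the summary has already been reduced to a flat dictionary; the metric names and limits are illustrative, not taken from any particular tool.

```python
def find_violations(results, limits):
    """Return human-readable violations; an empty list means the gate passes.

    Every metric here is "lower is better". Throughput would need a
    separate lower-bound check and is left out of this sketch.
    """
    violations = []
    for metric, limit in limits.items():
        value = results.get(metric)
        if value is None:
            violations.append(f"{metric}: missing from results")
        elif value > limit:
            violations.append(f"{metric}: {value} exceeds limit {limit}")
    return violations


# Hypothetical numbers for illustration only.
limits = {"p95_ms": 500, "p99_ms": 1000, "error_rate": 0.01}
results = {"p95_ms": 420, "p99_ms": 1150, "error_rate": 0.004}

violations = find_violations(results, limits)
for line in violations:
    print("FAIL", line)
exit_code = 1 if violations else 0  # pass this to sys.exit() in a CI job
```

A missing metric is treated as a failure so that a broken results export cannot silently pass the gate.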

Design a Comprehensive Load Test Strategy with AI

Updated 2026-06-08 · intermediate · Performance Testing

Returns a load test strategy covering 5 scenario types (baseline / load / stress / spike / soak) with thresholds for response time, throughput, and error rate, plus environment requirements, monitoring checkpoints, pass/fail criteria, and an explicit environment-parity statement.

When to use it

Designing performance testing for a new service or major release.
Standardizing performance testing across multiple services in your org.
Justifying performance investment to leadership with a defensible plan.
Auditing existing perf tests to see if you're actually testing what you think.

The prompt

XML-tagged — best for Claude 4.x

<role>
You are a performance engineer. You know the difference between load, stress, spike, and soak — and you require ENVIRONMENT PARITY because a load test against a 1-CPU staging proves nothing about a 16-CPU production.
</role>

<context>
Five canonical scenario types:
- **Baseline** — Single-user, single-request: establish a non-loaded performance floor.
- **Load** — Expected production traffic sustained for 15-30 min: verify the system handles a normal day.
- **Stress** — Gradually increasing load until breaking point: find the limit.
- **Spike** — Sudden traffic surge (e.g., 5x in 30 seconds): test elasticity.
- **Soak** — Steady load for hours: detect memory leaks, connection pool exhaustion.

Each scenario has different success criteria. Thresholds are per-scenario, not global.
</context>

<task>
For the system below, produce a load test strategy with:
1. **All 5 scenario types** — each with: traffic profile, duration, success criteria (response time p95/p99, throughput, error rate)
2. **Environment requirements** — explicit parity statement comparing test env to prod
3. **Test data volume** — fixtures, seeded data, third-party stubs
4. **Monitoring** — what's captured during runs, alert thresholds
5. **Success criteria** — overall pass/fail definition for each scenario
</task>

<input>
System description: {system}
Expected production traffic: {traffic}
Acceptable response time / error rate: {sla}
Known constraints (environment limits, partner rate limits): {constraints}
</input>

<constraints>
- All 5 scenarios MUST be addressed; do not conflate them, and omit one only with an explicit explanation of why it does not apply.
- Each scenario has p95 AND p99 thresholds (or justify omitting p99).
- Environment parity statement is mandatory — name the differences explicitly.
- Soak duration is hours (not minutes); spike is seconds-to-minutes.
- Each scenario's success criterion must include error rate, not just response time.
</constraints>

<output_format>
Six sections:
1. **Scenario table** — Scenario | Traffic profile | Duration | p95 | p99 | Throughput | Error rate
2. **Environment parity** — paragraph explicitly comparing test env to prod
3. **Test data volume** — bullets
4. **Monitoring** — what's captured + alert thresholds
5. **Run cadence** — when each scenario runs (per-PR, nightly, pre-release)
6. **Overall pass/fail** — paragraph defining what "performance is healthy" means
</output_format>

Before writing, identify any scenario type that doesn't apply (rare but possible) and explain why instead of including it pro forma.
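
If the prompt is sent programmatically, the four placeholders in the input block can be filled with ordinary string formatting. This is a minimal sketch that copies only the input block. The field values are hypothetical, and the check for empty fields is an added safeguard rather than part of the prompt.

```python
INPUT_TEMPLATE = """<input>
System description: {system}
Expected production traffic: {traffic}
Acceptable response time / error rate: {sla}
Known constraints (environment limits, partner rate limits): {constraints}
</input>"""

REQUIRED_FIELDS = ("system", "traffic", "sla", "constraints")


def fill_input(values):
    """Fill the input block, refusing to send a prompt with empty fields."""
    missing = [name for name in REQUIRED_FIELDS if not values.get(name)]
    if missing:
        raise ValueError(f"missing input fields: {', '.join(missing)}")
    return INPUT_TEMPLATE.format(**{name: values[name] for name in REQUIRED_FIELDS})


# Hypothetical values for illustration only.
filled = fill_input({
    "system": "Checkout API behind a managed load balancer",
    "traffic": "300 requests/second at peak",
    "sla": "p95 under 500 ms, error rate under 1%",
    "constraints": "staging has half the production node count",
})
print(filled)
```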

Example

Common pitfalls

Model conflates load and stress as 'increasing load'. Force the explicit distinction.
Soak gets a 30-minute duration — that's not soak. Force hours.
Spike test runs as a 5-minute ramp; the 'sudden' part gets lost. Force seconds-scale ramp.
Environment parity gets glossed over with 'use a representative environment'. Demand specific named differences.
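
Several of these pitfalls can be caught mechanically once the model's scenario table is parsed into a dictionary. The sketch below assumes such a dictionary exists; its field names (`duration_s`, `ramp_s`, `criteria`) and the one-hour and 60-second cutoffs are illustrative assumptions. It is also stricter than the prompt, which allows p99 to be omitted with a justification.

```python
REQUIRED_SCENARIOS = {"baseline", "load", "stress", "spike", "soak"}
REQUIRED_CRITERIA = {"p95_ms", "p99_ms", "error_rate"}


def validate_strategy(scenarios):
    """Return a list of problems found in a parsed scenario table."""
    problems = []

    for name in sorted(REQUIRED_SCENARIOS - set(scenarios)):
        problems.append(f"{name}: scenario missing")

    for name, spec in scenarios.items():
        lacking = REQUIRED_CRITERIA - set(spec.get("criteria", {}))
        if lacking:
            problems.append(f"{name}: missing criteria {sorted(lacking)}")

    soak = scenarios.get("soak")
    if soak and soak.get("duration_s", 0) < 3600:  # assumed cutoff: one hour
        problems.append("soak: duration under one hour is not a soak test")

    spike = scenarios.get("spike")
    if spike and spike.get("ramp_s", 0) > 60:  # assumed cutoff: 60 seconds
        problems.append("spike: ramp longer than 60 s loses the sudden surge")

    return problems


# Hypothetical model output that shows three of the pitfalls.
full = {"p95_ms": 500, "p99_ms": 1000, "error_rate": 0.01}
draft = {
    "baseline": {"duration_s": 60, "criteria": full},
    "load": {"duration_s": 1800, "criteria": full},
    "spike": {"duration_s": 600, "ramp_s": 300, "criteria": full},
    "soak": {"duration_s": 1800, "criteria": {"p95_ms": 500}},
}
for problem in validate_strategy(draft):
    print(problem)
```

Environment parity is harder to check this way, since it requires comparing named differences rather than numbers.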

Tips

Run baseline before EVERY load test — without a floor, you can't tell if performance regressed.
Save the response-time histograms; year-over-year comparison surfaces drift that day-to-day comparison misses.
Pair with `k6-script-generator` to produce executable test code for each scenario.
Pair with `performance-bottleneck-analysis` to interpret the results when something fails.
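
The first two tips amount to keeping raw latency samples and comparing a new run against a stored floor. A minimal sketch follows, assuming latencies are kept as plain lists of milliseconds and using a nearest-rank percentile. The 10% tolerance is an arbitrary illustration, not a recommended value.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def compare_to_baseline(baseline_ms, current_ms, pct=95, tolerance=0.10):
    """Return (regressed, baseline_value, current_value) for one percentile."""
    base = percentile(baseline_ms, pct)
    current = percentile(current_ms, pct)
    return current > base * (1 + tolerance), base, current


# Hypothetical samples: the current run is uniformly 20 ms slower.
baseline_ms = list(range(1, 101))
current_ms = [value + 20 for value in baseline_ms]
print(compare_to_baseline(baseline_ms, current_ms))
```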

FAQ

What's the difference between stress and spike tests?

A stress test finds the breaking point via gradual increase. A spike test checks recovery from a sudden surge at a known level. Stress answers 'where do we break?'. Spike answers 'do we survive Black Friday's first 30 seconds?'.
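
The contrast can be made concrete by writing both traffic profiles as lists of `(duration_seconds, target_rps)` stages. This is a tool-agnostic sketch: the function names and numbers are illustrative, and translating the stages into a specific load tool's configuration is left out.

```python
def stress_profile(start_rps, step_rps, steps, step_seconds):
    """Gradual staircase: each stage holds a slightly higher target rate."""
    return [(step_seconds, start_rps + i * step_rps) for i in range(steps)]


def spike_profile(normal_rps, multiplier, ramp_seconds, hold_seconds,
                  recovery_seconds):
    """Sudden surge to a known peak, a hold, then a return to normal."""
    peak = normal_rps * multiplier
    return [
        (ramp_seconds, peak),            # reach the peak within seconds
        (hold_seconds, peak),            # hold at the known surge level
        (recovery_seconds, normal_rps),  # drop back and observe recovery
    ]


# Hypothetical numbers: 100 rps normal traffic, 5x surge in 30 seconds.
print(stress_profile(100, 100, 3, 300))
print(spike_profile(100, 5, 30, 120, 300))
```

The stress profile has no fixed end: in practice, steps continue until error rate or latency crosses the failure threshold.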

Related prompts

Performance Testing · intermediate

Generate k6 Test Script from Endpoint

Reads an endpoint description and returns a ready-to-run k6 script with `options.scenarios` (ramping-arrival-rate), thresholds for p95/p99/error rate, realistic think times, and a `handleSummary()` for exporting to Grafana / InfluxDB or k6 Cloud.

Performance Testing · intermediate

Create a JMeter Test Plan

Returns a JMeter test plan as a valid .jmx-shaped XML skeleton with thread groups per scenario type, HTTP request samplers, response assertions, timers, and CSV-driven data — ready to import into JMeter 5.x for refinement and distributed runs.

Performance Testing · advanced

Analyze Performance Bottlenecks from Results

Reads a load test result summary (latency percentiles, throughput, error rate, system metrics) and returns a ranked list of suspected bottleneck layers — network, application, database, dependent service, or infrastructure — each with evidence cited from the metrics and a recommended next investigation step.

Performance Testing · intermediate

Generate Synthetic Monitoring Scenario

Reads a critical user journey and returns a Playwright-based synthetic monitoring script with business-step checkpoints, failure-screenshot capture, an alerting threshold tied to a stated SLO/SLI, and a recommended run frequency.
