Design a Comprehensive Load Test Strategy with AI
Returns a load test strategy covering five scenario types (baseline, load, stress, spike, soak), each with thresholds for response time, throughput, and error rate, plus environment requirements, monitoring checkpoints, pass/fail criteria, and an explicit environment-parity statement.
When to use it
- Designing performance testing for a new service or major release.
- Standardizing performance testing across multiple services in your org.
- Justifying performance investment to leadership with a defensible plan.
- Auditing existing performance tests to confirm they test what you think they test.
The prompt
XML-tagged — best for Claude 4.x
<role>
You are a performance engineer. You know the difference between load, stress, spike, and soak, and you require ENVIRONMENT PARITY because a load test against a 1-CPU staging environment proves nothing about a 16-CPU production environment.
</role>
<context>
Five canonical scenario types:
- **Baseline** — Single-user, single-request: establish a non-loaded performance floor.
- **Load** — Expected production traffic sustained for 15-30 min: verify the system handles a normal day.
- **Stress** — Gradually increasing load until breaking point: find the limit.
- **Spike** — Sudden traffic surge (e.g., 5x in 30 seconds): test elasticity.
- **Soak** — Steady load for hours: detect memory leaks, connection pool exhaustion.
Each scenario has different success criteria. Thresholds are per-scenario, not global.
</context>
<task>
For the system below, produce a load test strategy with:
1. **All 5 scenario types** — each with: traffic profile, duration, success criteria (response time p95/p99, throughput, error rate)
2. **Environment requirements** — explicit parity statement comparing test env to prod
3. **Test data volume** — fixtures, seeded data, third-party stubs
4. **Monitoring** — what's captured during runs, alert thresholds
5. **Success criteria** — overall pass/fail definition for each scenario
</task>
<input>
System description: {system}
Expected production traffic: {traffic}
Acceptable response time / error rate: {sla}
Known constraints (environment limits, partner rate limits): {constraints}
</input>
<constraints>
- All 5 scenarios MUST appear; do not conflate them, and omit one only with an explicit justification.
- Each scenario has p95 AND p99 thresholds (or justify omitting p99).
- Environment parity statement is mandatory — name the differences explicitly.
- Soak duration is hours (not minutes); spike is seconds-to-minutes.
- Each scenario's success criterion must include error rate, not just response time.
</constraints>
<output_format>
Six sections:
1. **Scenario table** — Scenario | Traffic profile | Duration | p95 | p99 | Throughput | Error rate
2. **Environment parity** — paragraph explicitly comparing test env to prod
3. **Test data volume** — bullets
4. **Monitoring** — what's captured + alert thresholds
5. **Run cadence** — when each scenario runs (per-PR, nightly, pre-release)
6. **Overall pass/fail** — paragraph defining what "performance is healthy" means
</output_format>
Before writing, identify any scenario type that doesn't apply (rare but possible) and explain why instead of including it pro forma.
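The four placeholders in the `<input>` block have to be filled before the prompt is sent. A minimal sketch of one way to do that, assuming Python's built-in `str.format`; the helper name `fill_prompt` and every example value are hypothetical, and only the `<input>` block is reproduced here for brevity:

```python
# Excerpt of the <input> block from the prompt above; in practice the full
# prompt text would be used as the template.
INPUT_BLOCK = """<input>
System description: {system}
Expected production traffic: {traffic}
Acceptable response time / error rate: {sla}
Known constraints (environment limits, partner rate limits): {constraints}
</input>"""

REQUIRED_FIELDS = ("system", "traffic", "sla", "constraints")


def fill_prompt(template, values):
    """Substitute the placeholders, refusing to build a prompt with blanks."""
    missing = [k for k in REQUIRED_FIELDS if not values.get(k, "").strip()]
    if missing:
        raise ValueError("missing prompt inputs: " + ", ".join(missing))
    return template.format(**{k: values[k] for k in REQUIRED_FIELDS})


# Hypothetical values, invented purely for illustration.
example = fill_prompt(INPUT_BLOCK, {
    "system": "Checkout API behind a load balancer, PostgreSQL backend",
    "traffic": "200 requests/s peak, 60 requests/s average",
    "sla": "p95 under 300 ms, error rate under 0.5%",
    "constraints": "staging has 4 vCPUs vs 16 in prod; partner allows 50 requests/s",
})
print(example)
```

Failing fast on an empty field matters here because the model will otherwise invent traffic numbers or SLAs to fill the gap.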
Common pitfalls
- Model conflates load and stress as 'increasing load'. Force the explicit distinction.
- Soak gets a 30-minute duration — that's not soak. Force hours.
- Spike test runs as a 5-minute ramp; the 'sudden' part gets lost. Force seconds-scale ramp.
- Environment parity gets glossed over with 'use a representative environment'. Demand specific named differences.
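Several of these pitfalls can be caught mechanically once the scenario table has been transcribed into data. The sketch below is one possible lint pass, not part of the prompt itself: the `Scenario` fields, the two-hour soak floor, and the 60-second spike ramp ceiling are all assumptions chosen for illustration. It does not check environment parity, which still needs a human read.

```python
from dataclasses import dataclass
from typing import List, Optional

REQUIRED = ("baseline", "load", "stress", "spike", "soak")
MIN_SOAK_S = 2 * 60 * 60   # assumption: "hours" means at least two
MAX_SPIKE_RAMP_S = 60      # assumption: a "sudden" surge ramps within a minute


@dataclass
class Scenario:
    name: str
    duration_s: int                   # total run time in seconds
    ramp_s: int                       # time taken to reach target load
    max_error_rate: Optional[float]   # e.g. 0.01 for 1%; None if omitted


def lint_strategy(scenarios: List[Scenario]) -> List[str]:
    """Return a list of human-readable problems; empty means none found."""
    problems = []
    by_name = {s.name.lower(): s for s in scenarios}
    for name in REQUIRED:
        if name not in by_name:
            problems.append(f"missing scenario: {name}")
    soak = by_name.get("soak")
    if soak is not None and soak.duration_s < MIN_SOAK_S:
        problems.append(f"soak runs {soak.duration_s // 60} min; expected hours")
    spike = by_name.get("spike")
    if spike is not None and spike.ramp_s > MAX_SPIKE_RAMP_S:
        problems.append(
            f"spike ramps over {spike.ramp_s} s; expected a seconds-scale surge"
        )
    for s in scenarios:
        if s.max_error_rate is None:
            problems.append(f"{s.name}: no error-rate threshold")
    return problems


# Hypothetical draft that exhibits the pitfalls listed above.
draft = [
    Scenario("baseline", 60, 0, 0.0),
    Scenario("load", 1800, 120, 0.01),
    Scenario("spike", 600, 300, 0.02),   # 5-minute ramp, not sudden
    Scenario("soak", 1800, 120, None),   # 30 minutes, no error rate
]
for problem in lint_strategy(draft):
    print(problem)
```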
Tips
- Run baseline before EVERY load test — without a floor, you can't tell if performance regressed.
- Save the response-time histograms; year-over-year comparison surfaces drift that day-to-day checks miss.
- Pair with `k6-script-generator` to produce executable test code for each scenario.
- Pair with `performance-bottleneck-analysis` to interpret the results when something fails.
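The baseline tip amounts to comparing percentiles between a saved run and the current one. A minimal sketch, assuming response times are stored as plain lists of milliseconds; the nearest-rank percentile method and the 10% tolerance are illustrative choices, not values from this page:

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of
    all samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def regressed(baseline_ms, current_ms, pct=95, tolerance=0.10):
    """True if the current percentile exceeds the baseline percentile by
    more than the tolerance (10% by default, an assumed value)."""
    base = percentile(baseline_ms, pct)
    cur = percentile(current_ms, pct)
    return cur > base * (1 + tolerance)


# Hypothetical samples: the current run is 20 ms slower across the board.
baseline = list(range(1, 101))            # 1..100 ms
current = [ms + 20 for ms in baseline]    # 21..120 ms

print(percentile(baseline, 95), percentile(current, 95))
print(regressed(baseline, current))
```

Keeping the raw samples (or the histogram buckets) rather than only the computed p95 allows a different percentile or tolerance to be applied to old runs later.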
FAQ
What's the difference between a stress test and a spike test?
Stress finds the breaking point via gradual increase. Spike tests recovery from a sudden surge at a known level. Stress answers 'where do we break?'; spike answers 'do we survive Black Friday's first 30 seconds?'.
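The difference is easiest to see in the shape of the traffic profile. The sketch below builds both as lists of `(time_s, target_rps)` waypoints; the function names and all numbers are hypothetical, apart from the 5x-in-30-seconds surge taken from the scenario definition in the prompt:

```python
def stress_profile(start_rps, step_rps, step_s, steps):
    """Staircase: each waypoint raises the target by step_rps, spaced
    step_s seconds apart, until the system breaks or steps run out."""
    return [(i * step_s, start_rps + i * step_rps) for i in range(steps)]


def spike_profile(normal_rps, multiplier, ramp_s, hold_s, recovery_s):
    """Sudden surge: jump to a known peak, hold it, drop back, then keep
    observing at normal load to confirm recovery."""
    peak = normal_rps * multiplier
    t_peak = ramp_s
    t_drop = t_peak + hold_s
    t_normal = t_drop + ramp_s
    return [
        (0, normal_rps),
        (t_peak, peak),
        (t_drop, peak),
        (t_normal, normal_rps),
        (t_normal + recovery_s, normal_rps),
    ]


# Stress: +50 requests/s every 5 minutes with no fixed ceiling in mind.
print(stress_profile(start_rps=100, step_rps=50, step_s=300, steps=5))

# Spike: 5x in 30 seconds, held for 2 minutes, then 5 minutes of recovery.
print(spike_profile(normal_rps=100, multiplier=5, ramp_s=30,
                    hold_s=120, recovery_s=300))
```

The stress profile has no predetermined peak, while the spike profile fixes the peak in advance and spends most of its time on what happens after it.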
Related prompts
Generate k6 Test Script from Endpoint
Reads an endpoint description and returns a ready-to-run k6 script with `options.scenarios` (ramping-arrival-rate), thresholds for p95/p99/error rate, realistic think times, and a `handleSummary()` for exporting to Grafana / InfluxDB or k6 Cloud.
Create a JMeter Test Plan
Returns a JMeter test plan as a valid .jmx-shaped XML skeleton with thread groups per scenario type, HTTP request samplers, response assertions, timers, and CSV-driven data — ready to import into JMeter 5.x for refinement and distributed runs.
Analyze Performance Bottlenecks from Results
Reads a load test result summary (latency percentiles, throughput, error rate, system metrics) and returns a ranked list of suspected bottleneck layers — network, application, database, dependent service, or infrastructure — each with evidence cited from the metrics and a recommended next investigation step.
Generate Synthetic Monitoring Scenario
Reads a critical user journey and returns a Playwright-based synthetic monitoring script with business-step checkpoints, failure-screenshot capture, an alerting threshold tied to a stated SLO/SLI, and a recommended run frequency.