Reconstruct Bug Reproduction Steps from Logs
Reads application logs, HAR files, or browser console excerpts and reconstructs a step-by-step reproduction recipe with timestamps, the failing request, suspected preconditions, and a confidence flag per inferred step.
When to use it
- A bug report has logs but no usable repro from the reporter.
- An incident postmortem requires a deterministic repro before fix-and-verify.
- You're investigating a flake; logs from one run can guide rebuilding the conditions.
- A customer escalated with a HAR file; you need to convert it into a scenario a teammate can run.
The prompt
XML-tagged — best for Claude 4.x
<role>
You are an SRE-leaning QA engineer. You read logs and HAR files like a detective — every request has a timestamp, every state mutation has a fingerprint. You never assert facts the logs don't support; you mark them as inferences with a confidence rating.
</role>
<context>
Logs come in three flavors:
- **HAR** — full HTTP request/response with headers, bodies, timing
- **Application logs** — server-side traces, structured (JSON) or unstructured
- **Browser console** — errors, warnings, custom debug output, network-level errors
Reconstruction requires: building a timeline (which events fired in what order), identifying the trigger (which user action initiated the failure path), naming preconditions (state that must exist before the trigger), and writing repro steps that another engineer can follow.
</context>
<task>
For the logs I provide:
1. **Timeline** — list events in chronological order with timestamps and source (HAR / app / console).
2. **Trigger** — identify the event that initiated the failure path. State your confidence (HIGH if clearly causal, MEDIUM if probable, LOW if guessed from limited evidence).
3. **Preconditions** — list state assumptions that must hold before the trigger (e.g., user logged in with specific role, feature flag enabled, specific data in DB). Mark each as VERIFIED (from logs) or ASSUMED.
4. **Repro recipe** — numbered steps another engineer can follow to recreate the failure. Mark inferred-only steps with INFERRED tag.
5. **Verification** — name what would CONFIRM the repro is correct (success criterion when running the recipe).
</task>
<input>
Logs (HAR / app logs / console — paste excerpt): {logs}
Known: {known_facts}
Failure observation: {failure}
</input>
<constraints>
- Never assert facts the logs don't show; use INFERRED tag for guesses.
- Timeline timestamps must come from the logs, not invented.
- Confidence ratings are mandatory on trigger and on inferred steps.
- If logs are insufficient to identify a trigger, say so and list what additional log would help.
</constraints>
<output_format>
Five sections:
1. **Timeline** — chronological list
2. **Trigger** — event + confidence
3. **Preconditions** — bullet list with VERIFIED / ASSUMED tags
4. **Repro recipe** — numbered steps with INFERRED tags where applicable
5. **Verification** — success criterion paragraph
</output_format>
Before writing, scan the logs for the LAST successful operation before failure; that is usually the most informative marker.
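Raw HAR exports are often too large to paste whole, so it can help to condense them into one line per request before filling `{logs}`. The sketch below is one possible pre-processing step, not part of the prompt itself. It assumes a standard HAR 1.2 layout (`log.entries[]` with `startedDateTime`, `request`, and `response`); the function name `har_timeline` is hypothetical.

```python
import json
from datetime import datetime


def har_timeline(har_text: str) -> list[str]:
    """Flatten a HAR document into one line per request, oldest first.

    Timestamps are copied verbatim from the HAR so nothing is invented.
    """
    entries = json.loads(har_text)["log"]["entries"]
    rows = []
    for entry in entries:
        raw_ts = entry["startedDateTime"]
        # Parse only to sort; "Z" is rewritten for older Python versions.
        parsed = datetime.fromisoformat(raw_ts.replace("Z", "+00:00"))
        rows.append((
            parsed,
            raw_ts,
            entry["request"]["method"],
            entry["request"]["url"],
            entry["response"]["status"],
        ))
    rows.sort(key=lambda row: row[0])
    return [f"{ts} HAR {method} {url} -> {status}"
            for _, ts, method, url, status in rows]
```

Each output line carries the source label `HAR`, which matches the Timeline section's requirement to name the source of every event.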
Common pitfalls
- Model guesses at non-logged state (e.g., 'user clicked X') and presents it as fact — force INFERRED tags.
- Timeline misorders near-simultaneous events from different sources, whose clocks may not agree. Have the model note the source of every event so conflicts can be resolved.
- Model assumes the FIRST error in the logs is the trigger when it is actually a downstream consequence. Instruct it to look for the LAST successful operation before failure.
- Confidence ratings get omitted — they're the most useful part of the output for the next engineer to know where to dig.
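The "LAST successful operation before failure" heuristic can also be checked by hand against the model's answer. The following is a minimal sketch under stated assumptions: events have already been parsed into a hypothetical `Event` record, and all timestamps are ISO 8601 strings in the same format and timezone, so that plain string sorting is chronological.

```python
from typing import NamedTuple, Optional


class Event(NamedTuple):
    timestamp: str  # ISO 8601 string copied from the log (assumed same timezone)
    source: str     # "HAR", "app", or "console"
    summary: str
    ok: bool        # True if the operation succeeded


def last_success_before_failure(events: list[Event]) -> Optional[Event]:
    """Return the last successful event that precedes the first failure.

    Returns None if the excerpt contains no failure, or if the first
    event is already a failure.
    """
    ordered = sorted(events, key=lambda e: e.timestamp)  # stable for ties
    last_ok = None
    for event in ordered:
        if not event.ok:
            return last_ok
        last_ok = event
    return None
```

If the event this returns differs from what the model reports as the step just before the trigger, that mismatch is a good place to start reviewing the timeline.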
Tips
- Feed multiple log sources together (HAR + console + app logs) — the model correlates across sources well.
- Strip PII before pasting — log excerpts often have user emails, names, tokens.
- When logs span several minutes and the failure is intermittent, focus on the timestamps surrounding state changes; that is where the reproduction conditions usually hide.
- After running the repro recipe, feed any discrepancies back into the prompt as a refinement; the second pass is usually much sharper.
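The PII tip above can be partly automated. This is a rough sketch that masks email addresses and bearer tokens with regular expressions; the patterns and the `redact` helper are illustrative assumptions, and they will not catch personal names or every token format, so a manual review is still needed.

```python
import re

# Illustrative patterns only; real logs may need more rules.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
BEARER = re.compile(r"(?i)\b(bearer)\s+[A-Za-z0-9._~+/=-]+")


def redact(text: str) -> str:
    """Replace email addresses and bearer tokens with placeholders."""
    text = EMAIL.sub("<EMAIL>", text)
    # Keep the word "Bearer" so the header is still recognizable.
    text = BEARER.sub(r"\1 <TOKEN>", text)
    return text
```

Using fixed placeholders rather than deleting the values keeps the log structure intact, so the model can still see that an authenticated request was made.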
FAQ
What if the logs have no timestamps?
The reconstruction degrades to an order of operations instead of a timeline. Use line numbers as a proxy and explicitly note that timestamps were unavailable. Push your team to add structured timestamps; it makes triage dramatically easier.
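The line-number proxy can be applied mechanically before pasting the excerpt. A minimal sketch, assuming the log is plain text with one event per line; the `number_lines` helper and the `[line N]` prefix are hypothetical conventions.

```python
def number_lines(log_text: str) -> str:
    """Prefix every log line with its 1-based line number."""
    lines = log_text.splitlines()
    width = len(str(len(lines)))  # right-align numbers for readability
    return "\n".join(
        f"[line {i:>{width}}] {line}"
        for i, line in enumerate(lines, start=1)
    )
```

The numbered text can then be supplied as `{logs}`, with a note in `{known_facts}` that the prefixes are positions in the file rather than timestamps.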
Related prompts
Write a Detailed Bug Report
Takes a free-form issue description (Slack message, email, support ticket) and returns a structured bug report following the AQA Pro Bug Report Template — clear `[Component] Verb-noun` title, environment, separate severity and priority, numbered atomic repro steps, expected vs actual, and suggested investigation areas.
Duplicate Bug Detector
Given a new bug description and N existing bug summaries, returns a ranked list of duplicate candidates with similarity scores (0-100) based on ROOT-CAUSE likelihood rather than surface text — with one-line evidence per candidate.
Analyze Performance Bottlenecks from Results
Reads a load test result summary (latency percentiles, throughput, error rate, system metrics) and returns a ranked list of suspected bottleneck layers — network, application, database, dependent service, or infrastructure — each with evidence cited from the metrics and a recommended next investigation step.
Refactor Flaky Test to Stable
Takes a flaky test and its failure history, identifies which of the canonical root causes (race, hard sleep, shared state, network dependency, ordering, animation) is responsible, and produces a rewritten test that fixes the specific cause — no blanket retries.