Skip to content

Debugging Playbook

Debugging is not trying fixes until one works. Debugging is reducing uncertainty.

symptom -> reproduction -> boundary -> evidence -> cause -> fix -> verification
  1. If you cannot reproduce it, you are guessing.
  2. If you cannot name the boundary, you do not know where the bug lives.
  3. If you cannot verify the fix, you did not finish.
  • Browser to frontend code.
  • Frontend to API.
  • API to service logic.
  • Service logic to database.
  • Service to external API.
  • App to cache or queue.
  • Container to environment variables.
  • Ingress/load balancer to service.
  • CI runner to build toolchain.
  • What exactly did the user do?
  • What exactly happened?
  • What should have happened?
  • Is it reproducible?
  • Does it happen locally, in staging, in production, or everywhere?
  • What changed recently?
  • What input triggered it?
  • What log line proves the request arrived?
  • What error proves the failing boundary?
  • What test would fail before the fix and pass after?

Weak evidence:

  • “I think it is fixed.”
  • “The code looks right.”
  • “The agent said so.”

Better evidence:

  • A failing test now passes.
  • A request returns the expected response.
  • A log line shows the corrected path.
  • A database row has the expected state.
  • A deployment health check is green.
  • A user path works in browser.
# Debugging Log
## Symptom
## Expected Behavior
## Reproduction Steps
## Observed Evidence
## Suspected Boundary
## Root Cause
## Fix
## Verification
## Prevention

Scenario: users report duplicate tickets.

Possible causes:

  • Double-clicked submit button.
  • Frontend retry.
  • Backend retry.
  • Missing idempotency key.
  • Missing database uniqueness.
  • Queue processed the same job twice.

Work:

  1. Reproduce duplicates.
  2. Add request IDs to logs.
  3. Add an idempotency key.
  4. Add a backend test.
  5. Add a database constraint if the business rule allows it.
  6. Verify duplicate submission creates one durable ticket.

You can point to the original trigger, the failing boundary, and the evidence that proves the fix.