Module 01: Systems And Architecture

Learn to break a product into components, boundaries, data flow, and failure modes.

Architecture is responsibility placement.

Every serious system has parts that should not all do the same job. The frontend should not own payment truth. The database should not decide UI state. The API should not trust the browser. The worker should not require a user to wait.
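
As a minimal sketch of "the frontend should not own payment truth", the example below recomputes a charge on the server and ignores the total the browser sent. The catalog, the `CheckoutRequest` fields, and the quantity limit are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass

# Hypothetical price catalog; in a real system this would come from the database.
CATALOG_CENTS = {"basic-plan": 900, "pro-plan": 2900}


@dataclass(frozen=True)
class CheckoutRequest:
    """What the browser sends. Every field is untrusted input."""
    product_id: str
    quantity: int
    client_total_cents: int  # useful for display only; never used for charging


def compute_charge_cents(request: CheckoutRequest) -> int:
    """Recompute the amount from server-side data instead of trusting the client."""
    if request.product_id not in CATALOG_CENTS:
        raise ValueError(f"unknown product: {request.product_id!r}")
    if not 1 <= request.quantity <= 100:  # assumed business limit
        raise ValueError("quantity must be between 1 and 100")
    return CATALOG_CENTS[request.product_id] * request.quantity


# A tampered request claims a total of 1 cent; the server ignores that claim.
tampered = CheckoutRequest("pro-plan", 2, client_total_cents=1)
print(compute_charge_cents(tampered))  # 5800
```

The browser can still show a total for convenience, but the amount that gets charged is decided behind the trusted boundary.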

A hospital works because responsibilities are separated:

  • Reception identifies the patient.
  • Nurses collect vitals.
  • Doctors diagnose.
  • Pharmacy dispenses medicine.
  • Records store history.
  • Labs process slow tests.
  • Billing handles payment.

If reception starts prescribing medicine, people get hurt. If the database starts acting like the UI, software gets hurt.

Component        | Job                                | Common Failure
-----------------|------------------------------------|----------------------------------------
Frontend         | User interaction and display state | Shows wrong state or sends bad request
Backend API      | Trusted boundary and orchestration | Bad validation or wrong status code
Business service | Domain rules                       | Wrong rule or missing edge case
Database         | Durable state                      | Bad schema, slow query, lost migration
Cache            | Fast temporary state               | Stale value
Queue            | Buffer background work             | Duplicate or stuck job
Worker           | Process async tasks                | Retry storm or partial failure
External API     | Third-party capability             | Timeout, quota, wrong response
Observability    | Logs, metrics, traces              | No evidence during incident
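
One common way to avoid the worker "retry storm" listed above is capped exponential backoff with random jitter, so failed jobs do not all retry at the same moment. The sketch below is one possible approach, not a prescribed one; `backoff_delays` and its default values are illustrative assumptions.

```python
import random
from typing import List, Optional


def backoff_delays(
    max_attempts: int,
    base_seconds: float = 1.0,
    cap_seconds: float = 60.0,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Return one wait time per retry attempt.

    The ceiling doubles each attempt up to cap_seconds, and the actual
    delay is drawn uniformly between 0 and that ceiling ("full jitter").
    """
    rng = rng or random.Random()
    delays = []
    for attempt in range(max_attempts):
        ceiling = min(cap_seconds, base_seconds * (2 ** attempt))
        delays.append(rng.uniform(0.0, ceiling))
    return delays


# Five retries, then give up and mark the job failed instead of retrying forever.
for attempt, delay in enumerate(backoff_delays(5), start=1):
    print(f"attempt {attempt}: wait up to {delay:.2f}s before retrying")
```

Bounding the number of attempts matters as much as the delay: a job that can retry forever is a stuck job with extra load.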

Component skeleton:

  • User: Needs an outcome, not code.
  • Frontend: Collects input and shows state.
  • Backend API: Trusted boundary for validation and orchestration.
  • Business Rules: Owns domain decisions.
  • Database: Durable facts.
  • Queue + Worker: Slow or retryable work.
  • External API: Outside dependency with failure risk.
  • Observability: Logs, metrics, traces, and alerts connect every component to evidence.
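
One way observability can connect components is a shared identifier in every log line. The sketch below emits JSON log lines that all carry the same `document_id`, so a single search finds one item's path through the system. The `log_event` helper, the component names, and the event names are assumptions made for illustration.

```python
import json
import logging


def log_event(logger: logging.Logger, component: str, event: str,
              document_id: str, **fields: object) -> str:
    """Emit one JSON log line tagged with the component and a shared document_id."""
    line = json.dumps(
        {"component": component, "event": event, "document_id": document_id, **fields},
        sort_keys=True,
    )
    logger.info(line)
    return line


logging.basicConfig(level=logging.INFO, format="%(message)s")
log = logging.getLogger("document_summary")

# Three components, one identifier: filtering on document_id shows the whole flow.
log_event(log, "api", "upload_received", document_id="doc-123", size_bytes=20480)
log_event(log, "queue", "job_enqueued", document_id="doc-123")
log_event(log, "worker", "summary_completed", document_id="doc-123", duration_ms=950)
```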

A boundary is where data crosses responsibility.

Important boundaries:

  • User input to frontend.
  • Frontend to backend.
  • Backend to database.
  • Backend to external API.
  • Queue producer to queue consumer.
  • App to infrastructure.

At every boundary, ask:

  • What data shape is expected?
  • Who is allowed to call this?
  • What happens if data is missing?
  • What happens if the receiver is down?
  • What gets logged?
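
The five questions above can be turned into code at the frontend-to-backend boundary. The following is a rough sketch for an upload request; the field names, size limit, allowed types, and the `storage_available` flag are all assumptions, and a real API would use its framework's own validation and authentication.

```python
import logging
from dataclasses import dataclass
from typing import Optional

logger = logging.getLogger("upload_boundary")

MAX_BYTES = 10 * 1024 * 1024                       # assumed size limit
ALLOWED_TYPES = {"application/pdf", "text/plain"}  # assumed allowed types
REQUIRED_FIELDS = ("filename", "content_type", "size_bytes")


@dataclass(frozen=True)
class BoundaryResult:
    status_code: int
    message: str


def _reject(status_code: int, message: str, user_id: Optional[str]) -> BoundaryResult:
    # What gets logged? Every rejection, with who asked and why it failed.
    logger.warning("upload rejected user_id=%s status=%d reason=%s", user_id, status_code, message)
    return BoundaryResult(status_code, message)


def check_upload(user_id: Optional[str], payload: dict, storage_available: bool) -> BoundaryResult:
    # Who is allowed to call this?
    if not user_id:
        return _reject(401, "authentication required", user_id)
    # What happens if data is missing?
    missing = [name for name in REQUIRED_FIELDS if name not in payload]
    if missing:
        return _reject(400, "missing fields: " + ", ".join(missing), user_id)
    # What data shape is expected?
    size = payload["size_bytes"]
    if type(size) is not int or size <= 0:
        return _reject(400, "size_bytes must be a positive integer", user_id)
    if size > MAX_BYTES:
        return _reject(413, "file too large", user_id)
    if payload["content_type"] not in ALLOWED_TYPES:
        return _reject(415, "unsupported file type", user_id)
    # What happens if the receiver is down?
    if not storage_available:
        return _reject(503, "storage unavailable, try again later", user_id)
    logger.info("upload accepted user_id=%s filename=%s", user_id, payload["filename"])
    return BoundaryResult(202, "accepted")


ok = {"filename": "report.pdf", "content_type": "application/pdf", "size_bytes": 2048}
print(check_upload("user-1", ok, storage_available=True).status_code)  # 202
```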

Design “upload a document and receive a summary later.”

Required components:

  • Upload screen.
  • API for upload.
  • Object storage for file.
  • Database record for document status.
  • Queue for summary job.
  • Worker that calls a model/API.
  • UI that shows pending, failed, or complete.

Failure modes to design for:

  • File too large.
  • Unsupported file type.
  • Upload succeeds but DB write fails.
  • Queue accepts job but worker crashes.
  • Model/API times out.
  • Summary is low quality.
  • User refreshes page during processing.
  • User lacks permission for the document.
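
Several of these cases become easier to reason about when the document status lives in one record that the worker updates. The sketch below assumes three statuses (pending, failed, complete, matching what the UI shows) and a hypothetical `summarize` callable standing in for the model/API call. It handles two cases from the list: a job delivered twice after a worker crash, and a model/API timeout.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class Status(Enum):
    PENDING = "pending"
    FAILED = "failed"
    COMPLETE = "complete"


@dataclass
class DocumentRecord:
    """Stand-in for the database row that tracks one document."""
    document_id: str
    status: Status = Status.PENDING
    summary: Optional[str] = None
    error: Optional[str] = None


def process_summary_job(record: DocumentRecord,
                        summarize: Callable[[str], str]) -> DocumentRecord:
    """Run one summary job. Safe to call more than once for the same document."""
    if record.status is Status.COMPLETE:
        return record  # duplicate delivery: the work is already done, skip it
    try:
        record.summary = summarize(record.document_id)
        record.status = Status.COMPLETE
        record.error = None
    except TimeoutError as exc:
        # Record the failure so the UI can show "failed" and a retry can be scheduled.
        record.status = Status.FAILED
        record.error = f"summarizer timed out: {exc}"
    return record


def slow_model(document_id: str) -> str:
    raise TimeoutError("no response after 30s")


record = process_summary_job(DocumentRecord("doc-123"), slow_model)
print(record.status.value)  # failed
record = process_summary_job(record, lambda doc_id: f"summary of {doc_id}")
print(record.status.value)  # complete
```

Because the status is stored durably rather than held in the browser, a user who refreshes the page during processing simply reads the current status again.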

Ask an agent for a failure-mode review:

Review this document-summary architecture. Find missing components, bad
boundaries, race conditions, security issues, and observability gaps. Do not
rewrite it. Return findings with severity.

Questions to answer:

  • Why should slow document processing use a queue?
  • What state should be stored in the database?
  • What should object storage own?
  • Which actions need authorization?
  • What metric tells you workers are falling behind?
  • What log line helps trace one document from upload to summary?

Exercise:

  1. Create portfolio/01-systems-and-architecture/document-summary-design.md.
  2. Draw the component diagram.
  3. Draw the request and job flow.
  4. Mark every boundary.
  5. List at least ten failure modes.
  6. Add one log and one metric per component.
  7. Ask an agent to critique the design.
  8. Revise the design.

You are done when:

  • You can explain why every component exists.
  • You can trace the happy path and three failure paths.
  • You can identify where data is stored and who owns it.
  • You can explain how you would know the system is broken.