Instructor Grading Guide

This guide helps mentors evaluate learners consistently.

Grade evidence, not confidence. A learner who says “I understand” but cannot run, explain, test, or debug the system is not ready.

Use 0-3 for each area.

  • 0: missing.
  • 1: present but shallow.
  • 2: works with minor gaps.
  • 3: strong, explainable, verified.

| Area | 0 | 1 | 2 | 3 |
|---|---|---|---|---|
| Repo fluency | Cannot run project | Runs exact commands only | Finds key files | Explains runtime and entry points |
| System design | No diagram | Boxes only | Flow and components | Boundaries and failure modes |
| Programming | Cannot modify | Edits by copying | Scoped change works | Clean separation and tests |
| Data | No schema | Tables listed | Constraints and queries | Migration, audit, privacy considered |
| API | No contract | Endpoint list | Request/response/errors | Auth, idempotency, edge cases |
| Debugging | Guesses | Finds symptom | Reproduces and fixes | Root cause, test, postmortem |
| Ops | No runbook | Start command only | Health and logs | Config, rollback, recovery |
| ADLC | Agent copied | Agent used vaguely | Bounded task | Reviewed, verified, rejected weak output |
| Communication | Rambling | Basic notes | Clear summary | Crisp handoff with evidence |
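
A mentor who wants to record scores in a structured way could use something like the sketch below. It is only an illustration: the `Scorecard` class, its method names, and the rule that a nonzero score needs an evidence note are assumptions made here to reflect "grade evidence, not confidence"; the guide itself does not prescribe a tool.

```python
from dataclasses import dataclass, field

# Area names taken from the rubric table.
AREAS = (
    "Repo fluency", "System design", "Programming", "Data", "API",
    "Debugging", "Ops", "ADLC", "Communication",
)


@dataclass
class Scorecard:
    """Hypothetical record of one learner's scores, each paired with evidence."""
    entries: dict = field(default_factory=dict)

    def record(self, area: str, score: int, evidence: str) -> None:
        if area not in AREAS:
            raise ValueError(f"unknown area: {area}")
        if score not in (0, 1, 2, 3):
            raise ValueError("score must be 0, 1, 2, or 3")
        # Assumed rule: anything above "missing" must cite what was observed.
        if score > 0 and not evidence.strip():
            raise ValueError("a nonzero score needs an evidence note")
        self.entries[area] = (score, evidence)

    def ungraded(self) -> list:
        """Areas that still have no score recorded."""
        return [area for area in AREAS if area not in self.entries]
```

Calling `record("API", 2, "Documented request, response, and errors")` stores the score with its justification, and `ungraded()` lists the areas still to be reviewed.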

Question: Why did duplicate tickets happen?

Bad:

There was a bug in the frontend. I fixed it.

Why bad:

  • No reproduction.
  • No boundary.
  • No evidence.
  • No prevention.

Better:

Duplicate tickets happened because the API accepted repeated POST /tickets
requests without checking an idempotency key. I reproduced it by sending the
same request twice. The database had two ticket rows. I fixed create_ticket to
return the existing row when Idempotency-Key matches. The regression test
test_idempotency_key_returns_existing_ticket now proves only one row is created.

Why better:

  • Names the boundary.
  • Includes reproduction.
  • Names root cause.
  • Shows prevention.
  • References a test.
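
To make the stronger answer concrete, here is a minimal sketch of what such a fix and its regression test might look like. The names `create_ticket` and `test_idempotency_key_returns_existing_ticket` come from the answer above; the SQLite schema, the `make_db` helper, the function signature, and the return shape are assumptions for illustration, not the learner's actual code.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Hypothetical in-memory schema; the UNIQUE constraint enforces one row per key."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tickets ("
        " id INTEGER PRIMARY KEY,"
        " idempotency_key TEXT UNIQUE,"
        " title TEXT NOT NULL)"
    )
    return conn


def create_ticket(conn: sqlite3.Connection, title: str, idempotency_key: str):
    """Return (ticket_id, created). A repeated key returns the existing row."""
    try:
        # "with conn" commits on success and rolls back if the insert fails.
        with conn:
            cur = conn.execute(
                "INSERT INTO tickets (idempotency_key, title) VALUES (?, ?)",
                (idempotency_key, title),
            )
        return cur.lastrowid, True
    except sqlite3.IntegrityError:
        # The key already exists: look up the original ticket instead.
        row = conn.execute(
            "SELECT id FROM tickets WHERE idempotency_key = ?",
            (idempotency_key,),
        ).fetchone()
        return row[0], False


def test_idempotency_key_returns_existing_ticket():
    conn = make_db()
    first_id, first_created = create_ticket(conn, "Printer offline", "key-123")
    second_id, second_created = create_ticket(conn, "Printer offline", "key-123")
    assert first_created and not second_created
    assert first_id == second_id
    count = conn.execute("SELECT COUNT(*) FROM tickets").fetchone()[0]
    assert count == 1
```

The sketch inserts first and falls back to a lookup, so the database constraint, not application code alone, is what prevents the second row. A learner scoring 3 on Debugging should be able to explain a choice like this.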

Ask the learner to explain:

POST /tickets

They should cover:

  • HTTP method and path.
  • Headers.
  • JSON body.
  • Validation.
  • Idempotency.
  • Database write.
  • Audit event.
  • Response.
  • Tests.
  • Failure modes.
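
As a reference point while listening, the checklist above maps onto a handler roughly like the framework-free sketch below. Everything in it is hypothetical: the function name `handle_post_tickets`, the `title` field, the status codes chosen, the `ticket.created` event name, and the dict and list standing in for the database and audit log. The learner's real contract may differ, and the point is whether they can walk through each step of their own.

```python
import json


def handle_post_tickets(headers: dict, raw_body: str, tickets: dict, audit_log: list):
    """Sketch of POST /tickets. Returns (status_code, response_body)."""
    # Headers: this assumed contract requires an idempotency key.
    # (Real HTTP header names are case-insensitive; a plain dict is not.)
    key = headers.get("Idempotency-Key")
    if not key:
        return 400, {"error": "Idempotency-Key header is required"}

    # JSON body: reject malformed input before touching storage.
    try:
        body = json.loads(raw_body)
    except json.JSONDecodeError:
        return 400, {"error": "body must be valid JSON"}

    # Validation: "title" is an assumed required field.
    title = body.get("title") if isinstance(body, dict) else None
    if not isinstance(title, str) or not title.strip():
        return 422, {"error": "title must be a non-empty string"}

    # Idempotency: a replayed key returns the original ticket, no new write.
    if key in tickets:
        return 200, tickets[key]

    # Database write: a dict keyed by idempotency key stands in for the table.
    ticket = {"id": len(tickets) + 1, "title": title.strip()}
    tickets[key] = ticket

    # Audit event: recorded only when a ticket is actually created.
    audit_log.append({"event": "ticket.created", "ticket_id": ticket["id"]})

    # Response: 201 for a new ticket.
    return 201, ticket
```

Each early return is a failure mode the learner should be able to name and test, alongside ones this sketch leaves out, such as authentication and storage errors.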

A capstone passes only if:

  • Main user path runs.
  • At least one failure path is demonstrated.
  • Tests pass.
  • Data model is explained.
  • API contract matches behavior.
  • Runbook works.
  • Agentic log shows human verification.
  • Learner can name remaining production risks.

A capstone fails if any of these apply:

  • Cannot run the app.
  • Cannot explain own code.
  • Cannot show tests.
  • Cannot reproduce a claimed bug.
  • No rollback or recovery story.
  • Claims agent output is correct without verification.
  • Exposes secrets in code or docs.