Instructor Grading Guide

This guide helps mentors evaluate learners consistently.

Principle

Grade evidence, not confidence. A learner who says “I understand” but cannot run, explain, test, or debug the system is not ready.

Scoring

Score each area from 0 to 3.

0: missing.
1: present but shallow.
2: works with minor gaps.
3: strong, explainable, verified.

Areas

Area	0	1	2	3
Repo fluency	Cannot run project	Runs exact commands only	Finds key files	Explains runtime and entry points
System design	No diagram	Boxes only	Flow and components	Boundaries and failure modes
Programming	Cannot modify	Edits by copying	Scoped change works	Clean separation and tests
Data	No schema	Tables listed	Constraints and queries	Migration, audit, privacy considered
API	No contract	Endpoint list	Request/response/errors	Auth, idempotency, edge cases
Debugging	Guesses	Finds symptom	Reproduces and fixes	Root cause, test, postmortem
Ops	No runbook	Start command only	Health and logs	Config, rollback, recovery
ADLC	Copies agent output unreviewed	Uses agent without a clear task	Gives agent a bounded task	Reviews, verifies, and rejects weak output
Communication	Rambling	Basic notes	Clear summary	Crisp handoff with evidence
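
A mentor who wants to record scores in a script could start from something like the sketch below. It only checks that every area has a whole-number score from 0 to 3 and lists the areas at or below a chosen level. The names validate_scores and areas_at_or_below are hypothetical, and the guide defines no total or pass threshold, so the sketch does not compute one.

```python
# Area names copied from the rubric table above.
AREAS = [
    "Repo fluency", "System design", "Programming", "Data", "API",
    "Debugging", "Ops", "ADLC", "Communication",
]


def validate_scores(scores):
    """Return scores unchanged if every area has an integer score from 0 to 3."""
    missing = [area for area in AREAS if area not in scores]
    if missing:
        raise ValueError(f"unscored areas: {missing}")
    unknown = [area for area in scores if area not in AREAS]
    if unknown:
        raise ValueError(f"unknown areas: {unknown}")
    for area, score in scores.items():
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(score, bool) or not isinstance(score, int):
            raise ValueError(f"{area}: score must be an integer")
        if not 0 <= score <= 3:
            raise ValueError(f"{area}: score {score} is outside 0-3")
    return scores


def areas_at_or_below(scores, level):
    """List areas scored at or below level, in rubric order."""
    return [area for area in AREAS if scores[area] <= level]
```

For example, calling areas_at_or_below(scores, 1) after validation gives the mentor a short list of areas to revisit with the learner.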

Sample Bad Answer

Question: Why did duplicate tickets happen?

Bad:

There was a bug in the frontend. I fixed it.

Why bad:

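No reproduction: does not say how the duplicate was triggered.
No boundary: "the frontend" does not name the component or request that let the duplicate through.
No evidence: no rows, logs, or test output.
No prevention: nothing stops the bug from returning.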

Sample Better Answer

Duplicate tickets happened because the API accepted repeated POST /tickets
requests without checking an idempotency key. I reproduced it by sending the
same request twice. The database had two ticket rows. I fixed create_ticket to
return the existing row when Idempotency-Key matches. The regression test
test_idempotency_key_returns_existing_ticket now proves only one row is created.

Why better:

Names the boundary.
Includes reproduction.
Names root cause.
Shows prevention.
References a test.
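
The fix described in the better answer can be sketched as follows. This is a minimal illustration, not the course project's code: the SQLite schema, the open_db helper, and the shape of the returned dict are assumptions. Only create_ticket and the test name come from the answer itself.

```python
import sqlite3


def open_db():
    """Create an in-memory database with a hypothetical tickets table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tickets ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " idempotency_key TEXT NOT NULL UNIQUE,"  # second line of defense
        " title TEXT NOT NULL)"
    )
    return conn


def create_ticket(conn, idempotency_key, title):
    """Insert a ticket, or return the existing row if the key was seen before."""
    row = conn.execute(
        "SELECT id, title FROM tickets WHERE idempotency_key = ?",
        (idempotency_key,),
    ).fetchone()
    if row is not None:
        return {"id": row[0], "title": row[1], "created": False}
    cur = conn.execute(
        "INSERT INTO tickets (idempotency_key, title) VALUES (?, ?)",
        (idempotency_key, title),
    )
    conn.commit()
    return {"id": cur.lastrowid, "title": title, "created": True}


def test_idempotency_key_returns_existing_ticket():
    conn = open_db()
    first = create_ticket(conn, "key-123", "Printer offline")
    second = create_ticket(conn, "key-123", "Printer offline")
    assert first["id"] == second["id"]
    assert second["created"] is False
    count = conn.execute("SELECT COUNT(*) FROM tickets").fetchone()[0]
    assert count == 1  # the repeated request did not add a row
```

The UNIQUE constraint is a design choice in this sketch: if two requests race past the SELECT, the database still refuses the second insert instead of silently storing a duplicate.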

Oral Exam

Ask the learner to explain this request end to end:

POST /tickets

They should cover:

HTTP method and path.
Headers.
JSON body.
Validation.
Idempotency.
Database write.
Audit event.
Response.
Tests.
Failure modes.
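
As a reference while listening, the points above can be traced through a toy handler. This is a sketch under stated assumptions, not the project's API: handle_post_tickets, the title field, the status codes, and the dict and list used in place of a database and an audit store are all hypothetical.

```python
import json


def handle_post_tickets(headers, raw_body, tickets, audit_log):
    """Walk one POST /tickets request; return (status_code, response_body)."""
    # Headers: this sketch requires an idempotency key.
    key = headers.get("Idempotency-Key")
    if not key:
        return 400, {"error": "missing Idempotency-Key header"}

    # JSON body: reject anything that does not parse.
    try:
        payload = json.loads(raw_body)
    except json.JSONDecodeError:
        return 400, {"error": "body is not valid JSON"}

    # Validation: a non-empty string title is the only field assumed here.
    title = payload.get("title") if isinstance(payload, dict) else None
    if not isinstance(title, str) or not title.strip():
        return 400, {"error": "title is required"}

    # Idempotency: a repeated key returns the stored ticket, no new write.
    if key in tickets:
        return 200, tickets[key]

    # Database write: a dict keyed by idempotency key stands in for a table.
    ticket = {"id": len(tickets) + 1, "title": title.strip()}
    tickets[key] = ticket

    # Audit event: recorded once per ticket actually created.
    audit_log.append({"event": "ticket_created", "ticket_id": ticket["id"]})

    # Response: 201 for a new ticket.
    return 201, ticket
```

Each early return is a failure mode the learner should be able to name, and each one is a candidate for a test.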

Capstone Review

A capstone passes only if all of the following are true:

Main user path runs.
At least one failure path is demonstrated.
Tests pass.
Data model is explained.
API contract matches behavior.
Runbook works.
Agentic log shows human verification.
Learner can name remaining production risks.
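
The pass rule is a plain conjunction: one unmet item fails the capstone. A reviewer tracking results in a script might sketch it like this. The name review_capstone and the dict-of-booleans input are hypothetical, and an item with no recorded result is treated as unmet, which is an assumption of this sketch.

```python
# Checklist items copied from the list above.
CAPSTONE_CHECKS = [
    "Main user path runs",
    "At least one failure path is demonstrated",
    "Tests pass",
    "Data model is explained",
    "API contract matches behavior",
    "Runbook works",
    "Agentic log shows human verification",
    "Learner can name remaining production risks",
]


def review_capstone(results):
    """Return (passed, unmet) where unmet lists every item not marked True."""
    unmet = [check for check in CAPSTONE_CHECKS if results.get(check) is not True]
    return (not unmet, unmet)
```

Returning the unmet items alongside the verdict gives the learner concrete evidence of what to fix, in line with the guide's principle.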

Automatic Failures

Any one of the following is an automatic failure:

Cannot run the app.
Cannot explain own code.
Cannot show tests.
Cannot reproduce a claimed bug.
No rollback or recovery story.
Claims agent output is correct without verification.
Exposes secrets in code or docs.