Fixtures are staged and repeatable. Scoring is computed by a reducer from a fixed rubric. The UI never defines hidden scoring rules, and this MVP does not independently re-execute runner evidence.
A multi-byte boundary bug drops the final grapheme of identifiers near buffer edges. Restore correct tokenization without widening scope beyond src/parser/** and tests/**. Evidence must include a passing boundary test.
A responsive grid overflows its track at 600px. Fix the layout and add a regression test. Do not touch unrelated routes.
A backfill job double-counts rows on retry. Make it idempotent and prove it with a replayable test.
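For the backfill task, one common route to idempotency is a keyed write, so that replaying a row overwrites it instead of adding to it. The sketch below is illustrative only: `Target`, `upsert`, `backfill`, and `fail_after` are hypothetical names, and it assumes each row carries a stable id, which the fixture may or may not provide.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Target:
    """In-memory stand-in for the destination table (hypothetical)."""
    rows: dict[str, int] = field(default_factory=dict)

    def upsert(self, row_id: str, amount: int) -> None:
        # Keyed write: replaying the same row overwrites instead of appending.
        self.rows[row_id] = amount

    def total(self) -> int:
        return sum(self.rows.values())


def backfill(
    source: list[tuple[str, int]],
    target: Target,
    fail_after: Optional[int] = None,
) -> None:
    """Copy rows into target; optionally raise partway to simulate a crash."""
    for i, (row_id, amount) in enumerate(source):
        if fail_after is not None and i == fail_after:
            raise RuntimeError("simulated mid-run failure")
        target.upsert(row_id, amount)


def test_retry_does_not_double_count() -> None:
    source = [("a", 10), ("b", 20), ("c", 30)]
    target = Target()
    try:
        backfill(source, target, fail_after=2)  # first attempt dies after two rows
    except RuntimeError:
        pass
    backfill(source, target)  # retry replays every row from the start
    assert len(target.rows) == 3
    assert target.total() == 60
```

The test is replayable because the failure point is a parameter, not a timing accident: the same partial run followed by a full retry produces the same final state every time.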