
What "Graded Against The Truth" Actually Means

The Mythic Intel Team · Feb 5, 2026 · 6 min read

Graded against the truth means your answer is scored against verified facts about the role, not against whatever a model felt like rewarding in the moment. Interview feedback is only useful if the rubric it scores you on is correct, so the rubric is locked to claims that survived research and a second verification pass. If you state something that contradicts a confirmed fact, the grading flags it. If you make a claim the system cannot verify, it is treated as unverified rather than quietly accepted. That is the difference between accurate interview feedback and a confident score with nothing behind it.

The core idea is simple. Grounded scoring beats vibes because a score is only as trustworthy as the facts it is measured against.

The problem with ungrounded scoring

An interview answer scoring system that runs purely on a language model's judgment has a hidden flaw: the model can hallucinate the very standard it grades you by. Language models generate the most probable-sounding text, not the most correct text, and surveys of the field describe this as a built-in property of how they work rather than an occasional glitch. Point that tendency at grading and the model can invent a "correct answer," reward you for matching its invention, or penalize you for contradicting something it made up.

You would never see the error. The score arrives fluent and certain, and you would adjust your real answer to satisfy a criterion that does not exist. Ungrounded feedback is not neutral when it is wrong. It actively teaches you the wrong thing.

What "bound to verified facts" changes

Grounding the rubric flips the relationship. Instead of the grader deciding what is true as it scores, the truth is established first and the grader is held to it. The facts about the role are researched, then run through a second adversarial pass that strikes anything it cannot confirm, and only the survivors become grading criteria. The model's job during scoring is to compare your answer to that fixed, verified rubric, not to improvise a new standard.
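
As a rough illustration of that order of operations, here is a minimal Python sketch. Everything in it is an assumption made for the example: the `Claim` type, the `build_rubric` function, the sample claims, and the stand-in verification rule (trusting one named source) are hypothetical and are not Mythic Intel's actual pipeline.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Claim:
    """A researched statement about the role (hypothetical shape)."""
    text: str
    source: str


def build_rubric(
    researched: Iterable[Claim],
    can_confirm: Callable[[Claim], bool],
) -> tuple[Claim, ...]:
    """Second pass: strike every claim that cannot be confirmed.

    Returns a tuple so the rubric cannot be appended to during scoring.
    """
    return tuple(claim for claim in researched if can_confirm(claim))


# Hypothetical research output; one claim has no confirmable source.
researched = [
    Claim("The role reports to the VP of Engineering", "job posting"),
    Claim("The team ships weekly", "unattributed forum post"),
]

# Stand-in for the verification pass: only a named source counts as confirmed.
trusted_sources = {"job posting"}
rubric = build_rubric(researched, lambda claim: claim.source in trusted_sources)

for claim in rubric:
    print(claim.text)
```

The point of the sketch is the ordering: the rubric exists, and is frozen, before any answer is compared against it.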

This matters because grounding is a well-established defense against hallucination. The research consensus is that anchoring a model's output to retrieved, verified information reduces fabrication, because the model is reasoning over confirmed evidence instead of its own parametric guesses. A grounded rubric applies that same principle to grading: confirmed facts in, score out.

What a grounded grader actually scores

With the facts locked, the grading can focus on the dimensions of a strong answer rather than on inventing what is true. Mythic Intel scores along four:

Accuracy. Does what you said match the verified facts about the role? Wrong claims are caught against the rubric, not waved through.
Completeness. Did you cover what the question genuinely required, or stop short?
Structure. Is the answer organized so a real panel could follow it? Methods like STAR (Situation, Task, Action, Result) exist because structured answers are easier to evaluate consistently, and selection-validity research ranks structured interviews among the strongest predictors of job performance.
Proof. Did you back the claim with a concrete example, a metric, a real decision, rather than asserting it?

Notice that accuracy is the dimension that depends entirely on grounding. The other three are about how you communicate. Accuracy is about whether what you said is true, and you cannot score that honestly without a verified rubric to check against.
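
One way to picture the four dimensions is as a scorecard that keeps each score separate instead of blending them into one number. The sketch below is illustrative only: the `Scorecard` class, the 0.0 to 1.0 scale, and the `weakest` helper are assumptions for the example, since the scale and any weighting are not specified here.

```python
from __future__ import annotations

from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Scorecard:
    """Four separate scores, each assumed to lie between 0.0 and 1.0."""
    accuracy: float      # checked against the verified rubric
    completeness: float  # did the answer cover what the question required
    structure: float     # could a panel follow it
    proof: float         # concrete example, metric, or decision

    def __post_init__(self) -> None:
        # Reject out-of-range values instead of silently clamping them.
        for field in fields(self):
            value = getattr(self, field.name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{field.name} must be in [0, 1], got {value}")

    def weakest(self) -> str:
        """Name of the lowest-scoring dimension (first one wins on ties)."""
        return min(fields(self), key=lambda f: getattr(self, f.name)).name


card = Scorecard(accuracy=0.5, completeness=0.9, structure=0.8, proof=0.7)
print(card.weakest())
```

Keeping the dimensions apart means a low accuracy score cannot hide behind a polished structure score.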

Flagging unverified and wrong claims

Two distinct failure modes get handled differently, and the distinction is the whole point.

A wrong claim contradicts a verified fact. The grader marks it as inaccurate, because the rubric knows the truth and your statement disagrees with it.
An unverified claim is something the system could not confirm in the first place. It is not silently accepted as correct, and it is not blamed on you as wrong either. It is surfaced as unverified, because the honest answer is "this could not be confirmed."

Keeping those separate is what makes the feedback trustworthy. A grader that collapses "wrong" and "unconfirmable" into one bucket is hiding its own uncertainty, and hidden uncertainty is how a confident wrong score gets made.
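
The distinction can be sketched as a three-way verdict rather than a pass/fail flag. This is a simplified illustration under stated assumptions: the `Verdict` enum, the `grade_claim` function, the topic-keyed rubric, and the case-insensitive string comparison are all hypothetical. A real grader would need a far more careful comparison than matching strings.

```python
from __future__ import annotations

from enum import Enum


class Verdict(Enum):
    CONFIRMED = "confirmed"    # matches a verified fact
    WRONG = "wrong"            # contradicts a verified fact
    UNVERIFIED = "unverified"  # no verified fact to check against


def grade_claim(topic: str, stated: str, rubric: dict[str, str]) -> Verdict:
    """Compare one stated claim to the rubric entry for its topic."""
    if topic not in rubric:
        # Nothing confirmed on this topic: surface the uncertainty
        # instead of accepting or penalizing the claim.
        return Verdict.UNVERIFIED
    verified = rubric[topic]
    # Assumption for the sketch: a case-insensitive exact match counts as agreement.
    if stated.strip().casefold() == verified.strip().casefold():
        return Verdict.CONFIRMED
    return Verdict.WRONG


# Hypothetical verified facts about a role.
rubric = {"reports_to": "VP of Engineering", "location": "Remote"}

print(grade_claim("reports_to", "vp of engineering", rubric).value)
print(grade_claim("reports_to", "CTO", rubric).value)
print(grade_claim("team_size", "12 engineers", rubric).value)
```

Because `UNVERIFIED` is its own outcome, the grader never has to pretend it knows something the rubric does not contain.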

Why grounded beats vibes

The practical payoff is that you can act on the feedback. When the score is bound to verified facts, a low accuracy mark means you actually said something untrue about the role, not that a model was in a stingy mood. A note on completeness points at a real gap. You can change your behavior because the signal is real.

Vibes-based scoring gives you a number you cannot interpret. Grounded scoring gives you a number you can debug. That is the entire reason to insist the rubric be locked to confirmed facts before a single answer is graded.

After the score, the model answer shows what a complete, accurate, well-structured response looks like for that exact question, so you are comparing your attempt to a verified target rather than a vague impression. The way to internalize the difference is to answer out loud, take the grounded feedback, and say it again until your spoken answer matches the truth instead of just sounding good.
