Methodology

Why We Verify Every Interview Fact Twice

The Mythic Intel Team · May 22, 2026 · 6 min read

We verify every interview fact twice because a single research pass produces confident, fluent claims that are sometimes wrong, and wrong feedback during prep is worse than no feedback. The first pass gathers what the role demands. The second pass is adversarial: it tries to disprove every claim from the first, and anything it cannot confirm against a real source gets struck before it ever reaches your room. That second check is the difference between interview prep accuracy you can trust and a plausible-sounding hallucination.

This matters because the cost of an unverified claim is asymmetric. If a tool tells you a framework ships with a feature it does not, or grades you against a "best practice" that was deprecated two versions ago, you walk into the real interview having rehearsed something false. The panel notices. So the standard for fact-checked interview questions has to be higher than "the model sounded sure."

One pass is not enough

Large language models generate text by predicting what is statistically likely to come next, which means they optimize for fluency and coherence, not for being correct. Researchers describe hallucination as output that is syntactically clean and convincing but factually unsupported, and surveys treat it as an innate limitation of how these models work rather than a bug you can fully patch out. When a model is unsure, it does not stop. It produces the most probable-looking continuation, which can be a fabrication delivered in the same confident tone as a fact.

A single research pass inherits that weakness directly. The model reads, summarizes, and writes questions in one motion, and any shaky claim it picked up gets baked into the rubric you study against. You would never know which lines were solid and which were guesses.

What the second pass actually does

The second pass is a verification step, not a rewrite. The research method behind it is well studied. Chain-of-Verification, published in 2023, showed that an LLM produces more accurate answers when it drafts a response, then generates independent verification questions, answers those separately, and only then writes a final version. The key finding is that short, isolated verification questions get checked more reliably than the same facts buried inside a long original answer. Independence is what makes it work.

So the second pass treats the first pass as a suspect, not a source:

It pulls each discrete claim out of the draft (a tool, a version, a metric, a responsibility, a "common question").
It re-checks each one against the live web independently, not against the draft's own reasoning.
It looks for a confirming source, and if it cannot find one, the claim is marked unconfirmed.
Unconfirmed claims are removed, not softened. A fact that survives only because it sounds right does not survive.

This is closer to red-teaming than to proofreading. Adversarial verification means actively trying to break a claim, and a claim that holds up under that pressure is one you can rehearse against.

Why self-checking has to be done carefully

There is a real trap here, and it is worth naming. Research on self-verification found that when a model is asked to critique its own reasoning loosely, quality can actually degrade, and using the model as its own verifier in a naive loop can collapse performance. Telling a model "are you sure?" and trusting whatever it says next is not verification. It is theater.

The fix is the structure Chain-of-Verification demonstrated: the check has to be independent and grounded in retrieved sources, not a vibes-based "double-check." That is why the second pass re-fetches evidence instead of re-reasoning over the first draft, and why it favors striking a claim over defending it.

What happens to claims that cannot be confirmed

The rule is simple and strict: if the second pass cannot confirm it, it does not ship. There is no "probably," no "likely," no hedged middle ground that quietly makes it into a question. A claim is either grounded in a source the verifier found, or it is gone.

That has a visible consequence. A verified room is sometimes shorter or more conservative than an unverified one would be, because the speculative material got cut. That is the point. A smaller set of true questions beats a larger set padded with things that merely sound true. Mythic Intel builds rooms this way on purpose: it researches the exact role, then runs the second pass that strikes anything it cannot stand behind, so the questions you hear and the rubric you are graded on are made of confirmed facts.

The honest version of accuracy is admitting what you do not know. A grader that quietly drops an unconfirmable claim is more trustworthy than one that keeps it to look complete.

The payoff for your prep

When every fact has survived two passes, the feedback you get means something. You can rehearse your answer to "how does this system handle X" knowing X is real, that the framework actually works the way the question assumes, and that the model answer is not built on a deprecated detail. Trust in the feedback is what lets you change your behavior based on it.

The best way to feel that difference is to stop reading and start speaking. Say your answer out loud, against questions that have been checked twice, and you will hear immediately where you are solid and where you are guessing.

your turn

Stop reading about interviews. Start training for yours.

Build My Room →