Engineering Roles

Database Engineer Interviews: Indexes, Locks, And Recovery

The Mythic Intel Team · Mar 22, 2025 · 8 min read

A database engineer interview goes deep on the parts of a database most application developers never see: how an index actually finds a row, why a query chose a slow plan, what two concurrent transactions can do to each other, and how you get the data back after a failure. Database interview questions reward engineers who can reason about mechanics, not just write SQL. The DBA interview and the SQL performance interview both come down to whether you understand what the engine is doing underneath your query.

This guide covers the rounds you should expect, indexing and query plans, isolation levels and the anomalies they prevent, locking and replication, and backup and recovery. Get the isolation-level anomalies and the index mechanics exact, because those are the questions where vague answers fail.

The Rounds You Should Expect

A database engineer loop in 2026 typically runs four to five conversations:

A SQL round, usually live: joins, aggregation, window functions, and writing a query against a schema you are shown.
A performance round: read a query plan, explain why a query is slow, and fix it.
A concurrency and internals round: isolation levels, locking, MVCC.
An operations round: replication, high availability, backup, and recovery.
A design or behavioral round on schema modeling and past incidents.

The performance and internals rounds are where depth separates candidates, because they cannot be answered from memorized syntax.

B-Tree Indexes And How They Work

Start with the data structure, because most relational indexes are B-trees (more precisely B+ trees) and interviewers expect you to know why. A B-tree keeps keys sorted and balanced, with all leaf nodes at the same depth, so a lookup walks from the root down a small number of levels regardless of table size. That gives roughly logarithmic lookup cost rather than a full scan.

What this buys you, and what it does not:

It makes equality and range lookups fast, and because the keys are sorted, it also serves ORDER BY and range scans on the indexed columns.
A composite index on (a, b) can be used for a filter on a alone or on a and b together, but generally not for a filter on b alone, because the leftmost columns order the tree. This leftmost-prefix rule is a frequent question.
Indexes cost writes. Every insert, update, or delete must maintain the index, so indexing every column is a mistake. Be ready to explain when an index helps and when it just slows writes for no read benefit.
A query that returns most of the table is often faster as a sequential scan than via the index, which is why the planner sometimes ignores an index you built.

Reading A Query Plan

The performance round almost always involves EXPLAIN. You should be comfortable reading a plan and naming the costly operations:

A sequential scan on a large table where an index lookup was expected usually means a missing index, a non-sargable predicate (wrapping the column in a function defeats the index), or stale statistics.
A nested-loop join over many rows can be far slower than a hash join; the planner picks based on row estimates, so wrong estimates from stale statistics produce bad plans.
Watch for a sort or hash that spills to disk, and for row-estimate numbers that are far off from reality.

The senior move is to read the plan, form a hypothesis ("this filter is not using the index because of the function call on the column"), and propose a specific fix, then confirm with the updated plan.

Isolation Levels And Their Anomalies

This is the section to memorize cold, because the definitions are exact and interviewers test them precisely. The SQL standard defines four isolation levels and three read anomalies. The anomalies:

Dirty read. A transaction reads a row another transaction has modified but not yet committed. If that other transaction rolls back, you read a value that never existed.
Non-repeatable read. A transaction reads a row, another transaction modifies or deletes that row and commits, and the first transaction reads the same row again and gets a different value.
Phantom read. A transaction runs a query that returns a set of rows, another transaction inserts or deletes rows matching that condition and commits, and the first transaction re-runs the query and the set of rows has changed.

The levels, from weakest to strongest, and what each permits per the standard:

Read Uncommitted. Allows dirty reads, non-repeatable reads, and phantoms.
Read Committed. Prevents dirty reads; still allows non-repeatable reads and phantoms.
Repeatable Read. Prevents dirty and non-repeatable reads; the standard still allows phantoms.
Serializable. Prevents all three; transactions behave as if run one after another.

A strong candidate adds the implementation nuance: the SQL standard describes the minimum guarantees, and real engines differ. PostgreSQL's Repeatable Read uses snapshot isolation and in practice prevents the phantom reads the standard would permit. Knowing that the standard and a given engine are not identical is exactly the depth the round is checking for.

Locking And Concurrency

Expect questions on how the engine enforces isolation. Be ready on:

Shared versus exclusive locks. Readers can share; a writer needs exclusivity. Lock granularity (row, page, table) trades concurrency against overhead.
MVCC. Many databases use multi-version concurrency control, keeping multiple versions of a row so readers see a consistent snapshot without blocking writers. Explain how this avoids read locks and the cost it brings, namely version cleanup or vacuuming.
Deadlocks. Two transactions each hold a lock the other needs. The engine detects the cycle and aborts one. Mitigate by acquiring locks in a consistent order and keeping transactions short.

Replication, Backup, And Recovery

The operations round covers how the data survives. Cover:

Replication. Synchronous replication acknowledges a write only after a replica has it, protecting against data loss at the cost of write latency. Asynchronous replication is faster but can lose the most recent writes if the primary fails. This is a direct trade-off you should be able to state.
Backup and recovery, with RPO and RTO. Recovery Point Objective (RPO) is the maximum acceptable data loss, measured in time; it sets how often you back up or how tight your replication must be. Recovery Time Objective (RTO) is the maximum acceptable downtime before service is restored; it sets your restore strategy. Point-in-time recovery, replaying the write-ahead log on top of a base backup, is how you hit a tight RPO. Be ready to say that a backup you have never restored is not a backup; you test restores.

How To Practice

Rehearse out loud: explain how a B-tree finds a row and the leftmost-prefix rule, read a slow query plan and name the fix, then recite the four isolation levels with the exact anomaly each one allows. Then walk a recovery: state the RPO and RTO and the backup-plus-log-replay path that meets them. Speaking it is what catches a fuzzy isolation-level answer before an interviewer does. A tool like Mythic Intel can build a verified database-engineer room and grade your spoken answer on accuracy, completeness, structure, and proof. Drill the anomaly table and the index rules until they are automatic, because those are the questions with one correct answer.

your turn

Stop reading about interviews. Start training for yours.

Build My Room →