
Reading Postfix Logs Like A First Language

The Mythic Intel Team · Mar 1, 2026 · 8 min read

Reading Postfix logs fluently means tracking one message across several process names by its queue ID and knowing what each status= and dsn= value tells you. Postfix does not write one line per message. It writes one line per stage, each tagged with the same hexadecimal queue ID, so a single delivery is scattered across smtpd, cleanup, qmgr, and smtp entries that you reassemble by grepping that ID.

Every line follows the same shape: timestamp, hostname, postfix/<process>[pid], then the queue ID, then key-value fields. Once you internalize that the queue ID is the join key and that the four process names map to four stages of the pipeline, the log stops being noise and starts reading like a transaction trace.
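The line shape described above can be captured in one regular expression. This is a minimal sketch, assuming the traditional "Mon DD HH:MM:SS" syslog timestamp and the default short hexadecimal queue IDs; the names LINE_RE and parse_line are illustrative, not part of Postfix.

```python
import re

# Prefix shape: timestamp, hostname, postfix/<process>[pid], queue ID, then the rest.
# Assumption: classic syslog timestamps. ISO 8601 timestamps would need a
# different first group, and long (non-hex) queue IDs a wider queue_id group.
LINE_RE = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"postfix/(?P<process>[^\[\s]+)\[(?P<pid>\d+)\]: "
    r"(?P<queue_id>[0-9A-F]+): "
    r"(?P<rest>.*)$"
)


def parse_line(line):
    """Return the prefix fields as a dict, or None if the line has no queue ID."""
    m = LINE_RE.match(line)
    return m.groupdict() if m else None


sample = (
    "Oct 10 01:23:45 mailhost postfix/smtpd[2534]: "
    "4F9D195432C: client=relay.other.com[203.0.113.7]"
)
fields = parse_line(sample)
print(fields["process"], fields["queue_id"], fields["rest"])
```

Lines that carry no queue ID, such as smtpd's "connect from" entries, return None here, which is usually what you want when the goal is to join on the ID.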

The Queue ID Is The Join Key

The queue ID is a short hex string like 4F9D195432C. Postfix assigns it when a message enters the queue, and it appears on every subsequent line for that message until the message is delivered or otherwise removed from the queue. To follow one message, grep the ID (the log path varies by distribution; /var/log/mail.log is the usual location on Debian and Ubuntu):

grep 4F9D195432C /var/log/maillog

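With the default short queue IDs, an ID can be reused once its message is gone, so it is unique only within a window, not forever. Setting enable_long_queue_ids = yes (Postfix 2.9 and later) produces longer, non-repeating IDs instead. For one delivery, though, the ID is the thread you pull.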

The Four Processes Are Four Stages

Oct 10 01:23:45 mailhost postfix/smtpd[2534]: 4F9D195432C: client=relay.other.com[203.0.113.7]
Oct 10 01:23:45 mailhost postfix/cleanup[2536]: 4F9D195432C: message-id=<[email protected]>
Oct 10 01:23:46 mailhost postfix/qmgr[2531]: 4F9D195432C: from=<[email protected]>, size=344, nrcpt=1 (queue active)
Oct 10 01:23:46 mailhost postfix/smtp[2538]: 4F9D195432C: to=<[email protected]>, relay=mail.example.com[216.150.150.131]:25, delay=1.1, delays=0.04/0/0.6/0.46, dsn=2.0.0, status=sent (250 2.0.0 Ok: queued as 8BDCA22DA71)
  • smtpd is the inbound SMTP server. It logs the client= that connected and handed Postfix the message. This is where you see the source IP and any reject.
  • cleanup rewrites headers, runs header/body checks, and logs the message-id=. This is the line that ties the queue ID to the application-level Message-ID.
  • qmgr is the queue manager. It logs from=, size=, nrcpt= (recipient count) and moves the message into the active queue for delivery. It logs again with removed when the message leaves the queue.
  • smtp is the outbound delivery agent. It logs the per-recipient to=, the relay= it connected to, the timing, the dsn=, the status=, and the remote server's verbatim response in parentheses.
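To make the four-stage reading concrete, here is a small sketch that groups log lines by queue ID and labels each entry with its stage. The sender, recipient, Message-ID, and the second queue ID in the sample are hypothetical, and STAGES simply paraphrases the list above.

```python
import re
from collections import defaultdict

# Matches "postfix/<process>[pid]: <QUEUEID>: <rest>" anywhere in a line.
# Assumption: default short hexadecimal queue IDs.
ENTRY_RE = re.compile(r"postfix/([^\[\s]+)\[\d+\]: ([0-9A-F]+): (.*)$")

# Stage labels paraphrased from the list above.
STAGES = {
    "smtpd": "received from client",
    "cleanup": "headers processed",
    "qmgr": "queue manager",
    "smtp": "outbound delivery attempt",
}


def group_by_queue_id(lines):
    """Map each queue ID to its (process, rest) entries in log order."""
    timelines = defaultdict(list)
    for line in lines:
        m = ENTRY_RE.search(line)
        if m:
            process, queue_id, rest = m.groups()
            timelines[queue_id].append((process, rest))
    return dict(timelines)


# Hypothetical sample: two interleaved messages, addresses made up.
log = [
    "Oct 10 01:23:45 mailhost postfix/smtpd[2534]: 4F9D195432C: client=relay.other.com[203.0.113.7]",
    "Oct 10 01:23:45 mailhost postfix/cleanup[2536]: 4F9D195432C: message-id=<abc123@relay.other.com>",
    "Oct 10 01:23:45 mailhost postfix/qmgr[2531]: 7A1B2C3D4E5: from=<other@example.org>, size=900, nrcpt=1 (queue active)",
    "Oct 10 01:23:46 mailhost postfix/qmgr[2531]: 4F9D195432C: from=<sender@example.org>, size=344, nrcpt=1 (queue active)",
    "Oct 10 01:23:46 mailhost postfix/smtp[2538]: 4F9D195432C: to=<user@example.com>, relay=mail.example.com[216.150.150.131]:25, delay=1.1, delays=0.04/0/0.6/0.46, dsn=2.0.0, status=sent (250 2.0.0 Ok: queued as 8BDCA22DA71)",
    "Oct 10 01:23:46 mailhost postfix/qmgr[2531]: 4F9D195432C: removed",
]

timelines = group_by_queue_id(log)
for process, rest in timelines["4F9D195432C"]:
    print(f"{process:8} {STAGES.get(process, 'other'):28} {rest[:40]}")
```

Grouping in one pass is the in-memory equivalent of grepping each ID separately, and it keeps interleaved messages from bleeding into each other's timelines.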

Reading The Delivery Line

The smtp line carries the verdict. Four fields decide what happened:

  • relay= is the host and IP Postfix actually delivered to, with the port. If this shows none, delivery never reached a server, usually a DNS or connection failure.
  • delay= is total seconds from queue entry to this delivery attempt.
  • delays=a/b/c/d breaks that total into four phases: a is time before the queue manager, including reception of the message; b is time inside the queue manager; c is connection setup, including DNS, HELO and TLS; d is message transmission. With the default delay_logging_resolution_limit of 2, values under 0.01s are logged as 0. A large c means slow DNS or a remote that is slow to answer; a large d means a slow transfer or a big message.
  • dsn= is the enhanced status code, and status= is Postfix's one-word verdict.
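The fields above can be pulled out of a delivery line mechanically. The sketch below parses the text that follows the queue ID and names the slowest of the four delay phases. The recipient address is hypothetical, and parse_delivery, slowest_phase, and the phase names are illustrative helpers rather than anything Postfix provides.

```python
import re


def parse_delivery(rest):
    """Extract key=value fields and the quoted response from an smtp line tail."""
    fields = {}
    for key in ("to", "relay", "delay", "delays", "dsn", "status"):
        # Each value runs until the next comma or whitespace.
        m = re.search(rf"\b{key}=([^,\s]+)", rest)
        if m:
            fields[key] = m.group(1)
    # The remote's response (or the local error) is the parenthesized tail.
    m = re.search(r"status=\S+\s+\((.*)\)$", rest)
    if m:
        fields["response"] = m.group(1)
    return fields


def slowest_phase(delays):
    """Return (name, seconds) for the largest of the a/b/c/d phases."""
    names = ("before_qmgr", "in_qmgr", "conn_setup", "transmission")
    values = [float(v) for v in delays.split("/")]
    phases = dict(zip(names, values))
    name = max(phases, key=phases.get)
    return name, phases[name]


# Hypothetical recipient; the other values mirror the sample line earlier.
rest = (
    "to=<user@example.com>, relay=mail.example.com[216.150.150.131]:25, "
    "delay=1.1, delays=0.04/0/0.6/0.46, dsn=2.0.0, "
    "status=sent (250 2.0.0 Ok: queued as 8BDCA22DA71)"
)
info = parse_delivery(rest)
print(info["relay"], info["status"], slowest_phase(info["delays"]))
```

For this line the largest phase is connection setup at 0.6 seconds, which matches the reading that DNS, HELO and TLS took most of the 1.1 second total.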

What status= Tells You

Three verdicts cover almost everything:

status=sent      (250 2.0.0 Ok: queued as 8BDCA22DA71)
status=deferred  (connect to mx.remote.com[198.51.100.4]:25: Connection timed out)
status=bounced   (host mx.remote.com[198.51.100.4] said: 550 5.1.1 User unknown)
  • sent means the remote accepted the message. The remote's own queue ID in the 250 response (queued as ...) is your proof of handoff and the ID you quote when a recipient says they never got it.
  • deferred is a temporary failure. Postfix keeps the message and retries on a backoff schedule. Most deferrals are connection timeouts, greylisting, or 4xx responses.
  • bounced is a permanent failure. Postfix gives up, generates a non-delivery report to the sender, and removes the message. The remote's 5xx response is the reason, quoted verbatim.
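As a rough illustration of the three verdicts, the following sketch reads the status and, for a sent message, pulls the upstream queue ID out of the 250 response. It assumes the remote phrases its reply as "queued as <id>" the way Postfix does; other mail servers word this differently, so the ID may be missing even on a successful delivery. MEANING and read_verdict are illustrative names.

```python
import re

# One-line readings of the three verdicts described above.
MEANING = {
    "sent": "remote accepted the message",
    "deferred": "temporary failure, Postfix will retry",
    "bounced": "permanent failure, sender gets a non-delivery report",
}


def read_verdict(rest):
    """Return (status, meaning, upstream_id) for the tail of a delivery line."""
    m = re.search(r"status=(\w+)\s+\((.*)\)$", rest)
    if m is None:
        return None
    status, response = m.groups()
    upstream_id = None
    if status == "sent":
        # Assumption: a Postfix-style "queued as <id>" in the remote's reply.
        q = re.search(r"queued as (\S+)", response)
        upstream_id = q.group(1) if q else None
    return status, MEANING.get(status, "unrecognized status"), upstream_id


examples = [
    "status=sent (250 2.0.0 Ok: queued as 8BDCA22DA71)",
    "status=deferred (connect to mx.remote.com[198.51.100.4]:25: Connection timed out)",
    "status=bounced (host mx.remote.com[198.51.100.4] said: 550 5.1.1 User unknown)",
]
for example in examples:
    print(read_verdict(example))
```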

The dsn= Code Decodes The Class

The dsn= field is an RFC 3463 enhanced status code in class.subject.detail form. The class digit alone tells you the disposition:

  • 2.x.x is success.
  • 4.x.x is a persistent transient failure. RFC 3463 defines it as a case where "the message as sent is valid, but persistence of some temporary condition has caused abandonment or delay of attempts to send the message." This pairs with status=deferred.
  • 5.x.x is permanent failure. This pairs with status=bounced.

So dsn=5.1.1 is a permanent failure where subject 1 (addressing) with detail 1 means a bad destination mailbox address, typically a mailbox that does not exist. dsn=4.7.1 is a transient failure in subject 7 (security or policy), often a temporary rate limit or greylist. Postfix itself inserts 5.7.1 by default on its own reject actions and 4.7.1 on defer actions, so a 5.7.1 with no quoted remote response is often your own policy, not the remote's.
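The class.subject.detail structure is simple enough to decode in a few lines. This sketch uses the class and subject names from RFC 3463; the EXPECTED_STATUS pairing reflects the rule of thumb in this article, not a guarantee from Postfix, and decode_dsn is an illustrative helper.

```python
# Class and subject names follow RFC 3463.
CLASSES = {
    "2": "success",
    "4": "persistent transient failure",
    "5": "permanent failure",
}
SUBJECTS = {
    "0": "other or undefined",
    "1": "addressing",
    "2": "mailbox",
    "3": "mail system",
    "4": "network and routing",
    "5": "mail delivery protocol",
    "6": "message content or media",
    "7": "security or policy",
}
# The status= value each class usually pairs with, per the text above.
EXPECTED_STATUS = {"2": "sent", "4": "deferred", "5": "bounced"}


def decode_dsn(dsn):
    """Split an enhanced status code and describe its class and subject."""
    cls, subject, detail = dsn.split(".")
    return {
        "class": CLASSES.get(cls, "unknown"),
        "subject": SUBJECTS.get(subject, "unknown"),
        "detail": detail,
        "expected_status": EXPECTED_STATUS.get(cls),
    }


for code in ("2.0.0", "4.7.1", "5.1.1"):
    print(code, decode_dsn(code))
```

A mismatch between the decoded class and the logged status= is worth a second look, since the two normally agree.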

Correlating One Message End To End

Put it together. To answer "what happened to the message to [email protected] at 01:23," grep the recipient or the Message-ID to find the queue ID, then grep the queue ID to assemble the full timeline:

grep '[email protected]' /var/log/maillog        # find the queue id on the smtp line
grep 4F9D195432C /var/log/maillog                # replay every stage for that id

You will read it as a story: smtpd accepted from this client IP, cleanup assigned this Message-ID, qmgr queued it with one recipient, smtp connected to this relay and got back a 250 with the remote's queue ID. When something breaks, the same trace points straight at the stage. A missing smtp line means no outbound delivery attempt was logged, so the message never left the queue. A status=deferred with relay=none means Postfix never connected. A status=bounced quotes the exact 5xx response the remote returned.
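The two-step grep and the rules of thumb above can be sketched as a short script. Everything in the sample log is hypothetical (hosts, addresses, queue ID), and queue_ids_for, trace, and diagnose are illustrative helpers, not Postfix tooling. In practice you would pass the lines of your own mail log instead of the list used here.

```python
import re

# Assumption: default short hexadecimal queue IDs.
QID_RE = re.compile(r"postfix/[^\[\s]+\[\d+\]: ([0-9A-F]+): ")


def queue_ids_for(lines, needle):
    """Queue IDs of lines containing needle, in first-seen order (first grep)."""
    seen = []
    for line in lines:
        if needle in line:
            m = QID_RE.search(line)
            if m and m.group(1) not in seen:
                seen.append(m.group(1))
    return seen


def trace(lines, queue_id):
    """Every line for one queue ID, in log order (second grep)."""
    marker = f": {queue_id}: "
    return [line for line in lines if marker in line]


def diagnose(entries):
    """Apply the rules of thumb from the paragraph above to one trace."""
    attempts = [e for e in entries if "postfix/smtp[" in e]
    if not attempts:
        return "no smtp line: no outbound attempt was logged"
    last = attempts[-1]
    if "status=deferred" in last and "relay=none" in last:
        return "deferred without a relay: never connected"
    if "status=bounced" in last:
        return "bounced: see the quoted 5xx response"
    if "status=sent" in last:
        return "sent: remote accepted the message"
    return "deferred: will be retried"


# Hypothetical log for one message that could not reach the remote server.
log = [
    "Oct 10 02:00:01 mailhost postfix/smtpd[3001]: AB12CD34EF5: client=app.internal[192.0.2.10]",
    "Oct 10 02:00:01 mailhost postfix/cleanup[3002]: AB12CD34EF5: message-id=<msg-0001@app.internal>",
    "Oct 10 02:00:01 mailhost postfix/qmgr[2531]: AB12CD34EF5: from=<noreply@app.internal>, size=512, nrcpt=1 (queue active)",
    "Oct 10 02:00:31 mailhost postfix/smtp[3003]: AB12CD34EF5: to=<user@example.com>, relay=none, delay=30, delays=0.05/0.01/30/0, dsn=4.4.1, status=deferred (connect to mx.remote.com[198.51.100.4]:25: Connection timed out)",
]

ids = queue_ids_for(log, "user@example.com")
for qid in ids:
    print(qid, "->", diagnose(trace(log, qid)))
```

Searching by Message-ID instead of recipient works the same way, because cleanup logs the message-id= on a line that carries the same queue ID.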

If you are rehearsing this for an email-infrastructure interview, practice reading a raw smtp line out loud and naming each field in order: recipient, relay, delay, the four-phase delays breakdown, the DSN class, the status, and the remote's response. The ability to look at one log line and immediately say "the remote accepted it, connection setup took most of the time, here is the upstream queue ID" is the difference between someone who has run a mail server and someone who has only read about one.
