D. J. Bernstein
Internet mail message header format

Headers

A header is a sequence of nonempty lines, consisting of a concatenation of fields.

A line is a string of zero or more bytes. A line is empty if it contains zero bytes. Every line in a header contains one or more bytes.

A field is a sequence of one or more nonempty lines. The first line does not begin with space or tab. Subsequent lines, if there are any, each begin with space or tab. Each field includes a name and a value.

For example, the following header contains five fields:

     Received: (qmail-queue invoked by uid 666);
       30 Jul 1996 11:54:54 -0000
     From: "D. J. Bernstein" <djb@silverton.berkeley.edu>
     To: fred@silverton.berkeley.edu
     Date: 30 Jul 1996 11:54:54 -0000
     Subject: Go, Bears!

The first field contains two lines:

     Received: (qmail-queue invoked by uid 666);
       30 Jul 1996 11:54:54 -0000
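
The grouping of lines into fields can be sketched in a few lines of Python. This is only an illustration of the rule stated above, not code from any particular mailer; the function name split_fields is hypothetical, and the sketch assumes that each line is a bytes object with its line terminator already removed.

```python
def split_fields(header_lines):
    """Group header lines into fields; each field is a list of lines."""
    fields = []
    for line in header_lines:
        if not line:
            # Every line in a header contains one or more bytes.
            raise ValueError("empty line inside a header")
        if line[:1] in (b" ", b"\t"):
            # A line beginning with space or tab continues the current field.
            if not fields:
                raise ValueError("header begins with a continuation line")
            fields[-1].append(line)
        else:
            # Any other line is the first line of a new field.
            fields.append([line])
    return fields


header = [
    b"Received: (qmail-queue invoked by uid 666);",
    b"  30 Jul 1996 11:54:54 -0000",
    b'From: "D. J. Bernstein" <djb@silverton.berkeley.edu>',
    b"To: fred@silverton.berkeley.edu",
    b"Date: 30 Jul 1996 11:54:54 -0000",
    b"Subject: Go, Bears!",
]
fields = split_fields(header)
print(len(fields), len(fields[0]))
```

Applied to the example header, the sketch produces five fields, the first of which holds two lines.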

Note on terminology: some people refer to fields as ``headers.''

Non-ASCII characters

Users often send bytes between 128 and 255, relying on out-of-band agreement to specify the character set. (One system administrator in France has reported that 20% of the messages received by his users contain such bytes.)

However, RFC 822 requires that each byte in a header line be between 0 and 127 inclusive. Furthermore, sendmail gets rather confused by bytes between 128 and 159; it uses them for internal macro handling.

Other byte restrictions

822 specifies a particular line encoding, with each line terminated by \015\012; so the byte sequence \015\012 cannot appear inside a line. This restriction is also enforced by the message encoding used in SMTP.

UNIX mail programs store headers as UNIX text files, so the byte \012 cannot appear anywhere inside a line. If a message with a bare \012 is transmitted through sendmail, the \012 will be treated as a line ending; if a message with a bare \012 is given to qmail, the message will be rejected. 822bis prohibits \012.

sendmail corrupts any \0 in a message header. Some mailers truncate lines at \0. 822bis prohibits \0.

822bis also prohibits \015. 822 discourages tabs and prohibits one use of backspaces.
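
As a rough illustration, the byte restrictions described so far can be collected into a single check. The function name line_problems and the wording of its messages are hypothetical; the sketch assumes that a line is a bytes object with its terminator already removed, and it covers only bytes above 127, \0, \012, and \015, not tabs or backspaces.

```python
def line_problems(line):
    """Return a list of restrictions that a header line violates."""
    problems = []
    if any(byte > 127 for byte in line):
        problems.append("byte above 127: not allowed by 822")
    if b"\0" in line:
        problems.append(r"\0: prohibited by 822bis")
    if b"\012" in line:
        problems.append(r"\012: prohibited by 822bis")
    if b"\015" in line:
        problems.append(r"\015: prohibited by 822bis")
    return problems


for line in (b"Subject: Go, Bears!", b"Subject: caf\xe9", b"a\0b\012c\015d"):
    print(line, line_problems(line))
```

A line that passes this check can still be troublesome for other reasons; the list is limited to the restrictions named above.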

Header termination

An Internet mail message is a sequence of lines, starting with a header. The header is terminated by an empty line (or by the end of the message). The header is not terminated by an invisible line, i.e., a line consisting entirely of spaces and tabs, unless the line is empty. Users occasionally create invisible lines, usually for aesthetic reasons:
        Received:
                from [censored] by pinerolo.piw.it
                        with smtp
                        
                (Linux Smail3.1.28.1 #5)
                id m0uqMBq-000BzwC; Tue, 13 Aug 96 17:19 GMT+0100
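
The termination rule as stated above (an empty line ends the header, an invisible but nonempty line does not) can be sketched as follows. The function name split_message is hypothetical, the message is assumed to be a list of bytes lines with terminators removed, and treating the empty line as a separator that belongs to neither header nor body is a choice made for this sketch.

```python
def split_message(lines):
    """Split a message into (header_lines, body_lines)."""
    for i, line in enumerate(lines):
        if line == b"":
            # Only a line with zero bytes terminates the header.
            return lines[:i], lines[i + 1:]
    # No empty line: the header runs to the end of the message.
    return lines, []


message = [
    b"Received:",
    b"\tfrom [censored] by pinerolo.piw.it",
    b"\t\twith smtp",
    b"\t\t",  # invisible line: tabs only, so the header continues
    b"\t(Linux Smail3.1.28.1 #5)",
    b"",      # empty line: the header ends here
    b"Hello.",
]
header, body = split_message(message)
print(len(header), body)
```

Here the invisible fourth line stays inside the header, which ends only at the empty line before the (made-up) body.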

There are three popular strategies for detecting the end of the header: