D. J. Bernstein
Internet mail
Internet mail message header format

Field names and field values

A field is a sequence of lines. A header is a concatenation of fields.

Field names

The first line of a field begins with a name and a colon. The name is a string of one or more graphical ASCII characters other than colons, i.e., bytes between 33 and 126 inclusive other than 58.
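
The byte-range rule above is easy to check mechanically. Here is a minimal Python sketch; the function name is_valid_field_name is hypothetical and not taken from any mail program.

```python
def is_valid_field_name(name: bytes) -> bool:
    """Return True if name is one or more bytes in 33..126, excluding 58 (colon)."""
    return len(name) > 0 and all(33 <= b <= 126 and b != 58 for b in name)
```

Under this check, b"Cc" is accepted, while b"" (empty), b"Reply To" (contains byte 32), and b"X:Y" (contains byte 58) are rejected.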

For example, the name of the following field is Cc:

     Cc: God@heaven.af.mil,
       angels@heaven.af.mil
A field with name Cc is called a "Cc field"; likewise for "To field", "Subject field", and so on.


I recommend against using field name characters other than letters, digits, and hyphens. One program reportedly has trouble with field names containing underscores.

Spaces before the colon

Spaces and tabs are not permitted between the field name and the colon:

     Subject : This is an invalid field
822bis explicitly prohibits spaces; several readers (reported: mpack, Cyrus imapd, AMS, and metamail) choke if they see spaces.

However, 822 allowed spaces and tabs before the colon, and there are (unconfirmed) rumors of old messages that include spaces before the colon, so I recommend that readers allow spaces and tabs.
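
A tolerant reader along these lines might look like the following Python sketch. The function name split_first_line is hypothetical. The sketch assumes the line terminator has already been removed.

```python
from typing import Tuple

def split_first_line(line: bytes) -> Tuple[bytes, bytes]:
    """Split the first line of a field into (name, bytes after the first colon).

    Spaces and tabs between the name and the colon are dropped, so that
    old-style "Subject : ..." lines are still readable.
    """
    name, colon, rest = line.partition(b":")
    if not colon:
        raise ValueError("no colon in first line of field")
    return name.rstrip(b" \t"), rest
```

This is reader-side leniency only. A writer should still never emit spaces or tabs before the colon.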

Field values

The value of a field is a string of bytes, consisting of all bytes in the field after the first colon. In other words, it is the concatenation of all the lines in the field, except for the starting name and colon.

For example, the value of the Cc field shown above is

     God@heaven.af.mil,  angels@heaven.af.mil
with two spaces after the comma.
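
The definition of a field value can be sketched in Python as follows. The function name field_value is hypothetical. The sketch assumes each line is passed in with its line terminator already removed.

```python
from typing import List

def field_value(lines: List[bytes]) -> bytes:
    """Concatenate the lines of one field and return all bytes after the first colon."""
    joined = b"".join(lines)
    _, colon, value = joined.partition(b":")
    if not colon:
        raise ValueError("field has no colon")
    return value

# The Cc field shown above, one element per line.
cc_value = field_value([b"Cc: God@heaven.af.mil,", b"  angels@heaven.af.mil"])
```

Applied to the Cc field, this yields the address list with two spaces after the comma. Taking "all bytes after the first colon" literally, the result also keeps the space that follows the colon.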

The position of line breaks within a field is irrelevant to the field value. In principle, writers can insert a line break before any space or tab. (822 suggests a more restrictive rule, but it then immediately gives examples disobeying the restrictions.) For example, the field

     From: "The boss" < God @ heaven. af (Air Force).mil>
has the same value as the field

     From:
      "The
      boss"
      < God
      @
      heaven. af (Air
      Force).mil>
In practice, many clients are confused by almost all line breaks within fields, except for line breaks between addresses in a recipient list. Note also that invisible lines are unsafe.
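
Since a line break can only be inserted before a space or tab, a reader can regroup lines into fields by looking at the first byte of each line. The following Python sketch rests on that assumption. The function name split_fields is hypothetical, and line terminators are assumed to be already removed.

```python
from typing import List

def split_fields(header_lines: List[bytes]) -> List[List[bytes]]:
    """Group header lines into fields.

    Assumption of this sketch: a line starting with a space or tab continues
    the current field; any other line starts a new field. A continuation line
    with no field before it is treated as starting a field of its own.
    """
    fields: List[List[bytes]] = []
    for line in header_lines:
        if fields and line[:1] in (b" ", b"\t"):
            fields[-1].append(line)
        else:
            fields.append([line])
    return fields

# The two From fields shown above: folded and unfolded.
folded = [b"From:", b' "The', b' boss"', b" < God", b" @",
          b" heaven. af (Air", b" Force).mil>"]
unfolded = [b'From: "The boss" < God @ heaven. af (Air Force).mil>']

# Each is recognized as one field, and their concatenated bytes are identical.
assert split_fields(folded + unfolded) == [folded, unfolded]
assert b"".join(folded) == b"".join(unfolded)
```

The sketch does not try to handle the end of the header or lines consisting only of spaces and tabs.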

822 states incorrectly that \012 and \015 cannot appear inside field values. 822bis imposes this as a requirement.

Note that there is not necessarily a space after the colon:

     Subject:This is valid

Field semantics

The meaning of a field depends on (1) its name and (2) its value.

Many names have globally recognized semantics. The list of well-known names is constantly expanding; furthermore, writers are allowed to create user-defined fields with new names. Header-reading programs have to be able to handle fields with unrecognized names.

822 specified that all field names are interpreted without regard to case; so Cc and cC and CC and cc have the same meaning. This flexibility is used in practice; for example, many messages have Message-ID, and many messages have Message-Id. On the other hand, the elm filter program looks for only Cc, not cc or CC or cC; and many versions of the vacation program look for only To, not to or TO or tO.
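
A reader that wants to avoid the mistake made by those programs can compare names after folding ASCII case. Here is a minimal Python sketch; the function name same_field_name is hypothetical.

```python
def same_field_name(a: bytes, b: bytes) -> bool:
    """Compare two field names without regard to ASCII case."""
    # bytes.lower() changes only the ASCII letters A-Z, which is what is wanted here.
    return a.lower() == b.lower()
```

With this comparison, Message-ID matches Message-Id, and Cc, cC, CC, and cc all match one another.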

Interaction between fields

A single header can contain several fields with the same name. Most messages received through SMTP contain two or more Received fields. However, the semantics of repeated fields are undefined for most field names. Some names are required to show up at most once.

822 suggested, but explicitly did not require, that writers use a particular order for some popular fields: Received, Date, From, Subject, Sender, To, Cc. In practice, those fields appear in several different orders. For example, readers cannot assume that From appears before To.
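
One way for a reader to cope with both repeated fields and arbitrary field order is to index fields by name rather than by position. Below is a minimal Python sketch. The function name index_fields and the sample values are hypothetical, and the input is assumed to be a list of (name, value) pairs in header order.

```python
from typing import Dict, List, Tuple

def index_fields(fields: List[Tuple[bytes, bytes]]) -> Dict[bytes, List[bytes]]:
    """Map each lowercased field name to the list of its values, in header order."""
    index: Dict[bytes, List[bytes]] = {}
    for name, value in fields:
        index.setdefault(name.lower(), []).append(value)
    return index

# Hypothetical header: To appears before From, and Received appears twice.
sample = [
    (b"Received", b" (first hop)"),
    (b"Received", b" (second hop)"),
    (b"To", b" angels@heaven.af.mil"),
    (b"From", b" God@heaven.af.mil"),
]
by_name = index_fields(sample)
```

Here by_name[b"received"] holds both Received values in their original order, and looking up b"from" works regardless of where From appears relative to To.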