Anatomy of an email


Receiving and processing inbound emails is a big part of Sirportly, our help desk service. Recently I was tasked with refactoring Sirportly's email processing code, and in doing so I found myself delving through many of the original email RFCs and learning all about the structure of emails.

Although I've been using email for well over 20 years, I'd never given much thought to what an email really is, so in this post I'm going to share what I've learned while working on Sirportly.

Minimum Viable Email

So what is an email? In its most basic form, this:

Date: Mon, 01 Nov 2021 12:00:00 -0000
From: Test Mailbox <test@example.com>

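Just two lines of plain text - according to RFC 5322, the current successor to the original RFC 822, this is the minimum viable email. (RFC 822 itself also required at least one destination header such as To.)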

Each of those lines represents a header - Date and From - the only two headers required by the spec.
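
To make this concrete, here is a minimal sketch that feeds the two-header message to Python's standard library email parser. The choice of Python and the variable names are mine, purely for illustration; this is not how Sirportly processes mail.

```python
from email import message_from_string
from email.utils import parseaddr, parsedate_to_datetime

# The two-header message from above, exactly as it would travel over the wire.
RAW = (
    "Date: Mon, 01 Nov 2021 12:00:00 -0000\n"
    "From: Test Mailbox <test@example.com>\n"
)

msg = message_from_string(RAW)

header_names = msg.keys()                       # header names, in order of appearance
sent_at = parsedate_to_datetime(msg["Date"])    # the Date header as a datetime
display_name, address = parseaddr(msg["From"])  # split display name and mailbox

print(header_names, sent_at.isoformat(), display_name, address)
```

The parser accepts the message without complaint, which matches the spec: nothing beyond Date and From is needed for the message to be well formed.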

As an experiment we tried sending an email containing only those two headers, and predictably it was flagged as spam. So while it is technically valid, it’s not possible to send an email like that in the real world, at least not any more. I’m sure it would have worked fine back when email was new, before spammers spoiled all the fun.

An interesting side note about the From header: it can include more than one mailbox, separated by commas. This may seem odd at first. How can multiple people send a single email? It turns out the From header identifies who wrote the email, not only who sent it, hence the ability to have more than one From address.

When specifying multiple mailboxes in the From header, the normally optional Sender header must also be included to specify who actually sent the email. The Sender header can only contain one mailbox. Eg:

From: Test Mailbox <test@example.com>, a.human@example.com, someone-else@example.com
Sender: a.human@example.com
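
As a rough sketch of that rule, the snippet below parses the headers above with Python's standard library and checks that a message with several authors names exactly one sender. The helper sender_is_valid is a hypothetical name of my own, not something from the RFCs or from any library.

```python
from email import message_from_string
from email.utils import getaddresses

RAW = (
    "From: Test Mailbox <test@example.com>, a.human@example.com, someone-else@example.com\n"
    "Sender: a.human@example.com\n"
)

msg = message_from_string(RAW)

# getaddresses() splits comma-separated mailboxes into (display name, address) pairs.
authors = [addr for _name, addr in getaddresses(msg.get_all("From", []))]
senders = [addr for _name, addr in getaddresses(msg.get_all("Sender", []))]


def sender_is_valid(authors, senders):
    """Hypothetical check: several authors require exactly one Sender mailbox."""
    if len(authors) > 1:
        return len(senders) == 1
    return len(senders) <= 1  # Sender stays optional for a single author


print(authors, senders, sender_is_valid(authors, senders))
```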

Where to?

Our minimum viable email specifies when it was sent, and by whom, but how does it know where to go? There’s no To address anywhere. That’s where the SMTP envelope comes in.

The SMTP envelope is the sender and recipient information passed to an SMTP server through the commands used to send an email, eg:

=> 220 smtp.example.com Simple Mail Transfer Service Ready

HELO 127.0.0.1
=> 250 Hello 127.0.0.1

MAIL FROM:<test@example.com>
=> 250 OK

RCPT TO:<someone@example.com>
=> 250 OK

RCPT TO:<other@example.com>
=> 250 OK

DATA
=> 354 Send message content; end with <CRLF>.<CRLF>

Date: Mon, 01 Nov 2021 12:00:00 -0000
From: Test Mailbox <test@example.com>
To: a.human@example.com
.
=> 250 OK, message accepted for delivery: queued as 12345

QUIT
=> 221 Bye

There are many other SMTP commands available, but these are the minimum set needed to send an email:

  • HELO - Specifies the sender’s IP or domain name
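  • MAIL FROM - Starts a new mail transaction and sets the address of the sender. This address is used by the SMTP system to send error notifications such as bounces. It is not part of the email content you submit, although the receiving server typically records it in a Return-Path header on delivery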
  • RCPT TO - Specifies to whom the email should be sent, can be called multiple times in a row for multiple recipients
  • DATA - Informs the SMTP server that the next lines will be the email content, terminated by a . on a line on its own

It’s the RCPT TO command that dictates where the email is sent. You can also add an optional To header to the email, but that is simply for the convenience of the recipient - it has no effect on where the email is sent.

"Envelope" is a great name for this, as it is analogous to a real world letter and envelope — the address on an envelope dictates where a letter goes. Writing the destination address on the letter itself is optional, and has no effect on where the letter goes.
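
The separation between envelope and letter can be sketched in a few lines of Python. The function smtp_commands and the client name client.example.com are hypothetical, made up for this illustration. It only builds the client's side of the conversation, does not talk to a real server, and does not handle message lines that begin with a dot.

```python
def smtp_commands(envelope_from, envelope_to, message):
    """Return the client-side lines of a minimal SMTP session (illustrative only)."""
    lines = ["HELO client.example.com"]  # hypothetical client name
    lines.append(f"MAIL FROM:<{envelope_from}>")
    # One RCPT TO per envelope recipient; the To header is never consulted.
    lines.extend(f"RCPT TO:<{rcpt}>" for rcpt in envelope_to)
    lines.append("DATA")
    lines.extend(message.splitlines())
    lines.append(".")  # a dot on its own line ends the message content
    lines.append("QUIT")
    return lines


MESSAGE = (
    "Date: Mon, 01 Nov 2021 12:00:00 -0000\n"
    "From: Test Mailbox <test@example.com>\n"
    "To: a.human@example.com\n"
)

commands = smtp_commands(
    "test@example.com",
    ["someone@example.com", "other@example.com"],
    MESSAGE,
)
print("\n".join(commands))
```

Note that a.human@example.com appears only inside the message content, never in a RCPT TO command, so the server would not deliver to that address. Python's own smtplib follows the same split: its sendmail method takes the envelope sender and recipients as arguments separate from the message.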

The body

So far we've seen an email that only contains headers, and doesn't really say much. Real world emails include content in the form of a body, which is simply plain text separated from the headers by a blank line:

Date: Mon, 01 Nov 2021 12:00:00 -0000
From: Test Mailbox <test@example.com>
To: a.human@example.com
Subject: Email

Hi,

This is a slightly more realistic example email

Regards
A Human
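
The "first blank line" rule is simple enough to check by hand. This sketch, which assumes Python and its standard library parser, splits the message at the first blank line and compares the result with what the parser reports.

```python
from email import message_from_string

RAW = """\
Date: Mon, 01 Nov 2021 12:00:00 -0000
From: Test Mailbox <test@example.com>
To: a.human@example.com
Subject: Email

Hi,

This is a slightly more realistic example email

Regards
A Human
"""

# Manual split: everything before the first blank line is headers, the rest is body.
head, _, body = RAW.partition("\n\n")

# The standard library parser applies the same rule.
msg = message_from_string(RAW)

print(len(head.splitlines()), "header lines")
print(msg["Subject"])
print(msg.get_payload() == body)
```

Later blank lines, such as the one after "Hi,", are just part of the body.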

There is one potential issue with this combination of a plaintext body and the SMTP command interface: what happens if the email content contains a single full stop on its own line? Eg:

Date: Mon, 01 Nov 2021 12:00:00 -0000
From: Test Mailbox <test@example.com>

This is a contrived example
.This line isn't a problem, as it's not a dot on its own. But the next line is
.

This is more email body

With this email content, the SMTP server would stop processing the email content at the third line of the body, and then try to interpret the rest of the body (This is more email body) as a command, which would trigger an error.

To solve this issue "dot stuffing" should be employed. Before submitting the email content to the SMTP server, the email client prepends a dot to any lines that start with a dot.

Email after dot stuffing:

Date: Mon, 01 Nov 2021 12:00:00 -0000
From: Test Mailbox <test@example.com>

This is a contrived example
..This line isn't a problem, as it's not a dot on its own. But the next line is
..

This is more email body

Now that no line consists of a single dot, the entire body can be processed successfully. After the SMTP server has accepted the body, it removes the first dot from every line that begins with one. This means the email is delivered with no extra or missing characters:

Date: Mon, 01 Nov 2021 12:00:00 -0000
From: Test Mailbox <test@example.com>

This is a contrived example
.This line isn't a problem, as it's not a dot on its own. But the next line is
.

This is more email body
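
Both halves of dot stuffing fit in a few lines. The following is a minimal sketch in Python; the function names dot_stuff and dot_unstuff are my own, and it assumes lines are separated by CRLF as they are on the wire.

```python
def dot_stuff(content: str) -> str:
    """Client side: add a dot to every line that already starts with one."""
    return "\r\n".join(
        "." + line if line.startswith(".") else line
        for line in content.split("\r\n")
    )


def dot_unstuff(content: str) -> str:
    """Server side: remove one leading dot from every line that starts with one."""
    return "\r\n".join(
        line[1:] if line.startswith(".") else line
        for line in content.split("\r\n")
    )


BODY = (
    "This is a contrived example\r\n"
    ".This line isn't a problem, as it's not a dot on its own. But the next line is\r\n"
    ".\r\n"
    "\r\n"
    "This is more email body"
)

stuffed = dot_stuff(BODY)
print(stuffed)
print(dot_unstuff(stuffed) == BODY)
```

Because the server removes exactly one dot per line, a line that really did start with two dots is sent as three and comes back as two.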

Making it fancy

Plain text content is all well and good, but what if we want fancy things like styled text or images? Email's got you covered with HTML content - enter the Content-Type header, specifically Content-Type: text/html:

Date: Mon, 01 Nov 2021 12:00:00 -0000
From: test@example.com
Content-Type: text/html; charset="ISO-8859-1"

<html>
  <body>
    <blink>HTML Email!</blink>
  </body>
</html>

As long as the receiving mail client supports it, this email will be rendered as HTML in all its blinking glory.

The Content-Type header is optional. If it is missing, the email is interpreted as text/plain with a US-ASCII character set.
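
Here is a small sketch of that default, using Python's standard library parser on two hand-written messages of my own. One has no Content-Type header and the other declares HTML.

```python
from email import message_from_string

without_header = message_from_string(
    "From: test@example.com\n"
    "\n"
    "Hello\n"
)

with_header = message_from_string(
    "From: test@example.com\n"
    'Content-Type: text/html; charset="ISO-8859-1"\n'
    "\n"
    "<blink>HTML Email!</blink>\n"
)

# A missing header falls back to text/plain; a declared type is reported as given.
print(without_header.get_content_type())
print(with_header.get_content_type(), with_header.get_content_charset())
```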

MIME and Multipart

While most modern email clients support HTML formatted emails, and most users expect to see them, there are still clients and systems that prefer their emails in plain text flavour. To allow for this, most emails are sent with both HTML and plain text content. This is made possible through MIME (Multipurpose Internet Mail Extensions) - specifically the multipart email extension. Strictly speaking, a MIME message should also carry a MIME-Version: 1.0 header among its top-level headers; the examples below omit it.

To send an email with both HTML and plain text you need to add a Content-Type header of multipart/alternative with a multipart boundary string parameter. Eg:

Content-Type: multipart/alternative; boundary="=_part_boundary_abc123"

As the name suggests, the multipart extension allows an email to have multiple parts, eg: one part HTML and one part plain text. The boundary string allows the receiving mail client to know where one part ends and the next starts:

Date: Mon, 01 Nov 2021 12:00:00 -0000
From: test@example.com
Content-Type: multipart/alternative; boundary="=_part_boundary_abc123"

--=_part_boundary_abc123
Content-Type: text/plain

Hello

--=_part_boundary_abc123
Content-Type: text/html

<marquee>Hello</marquee>

--=_part_boundary_abc123--

Each of the body parts is separated by a line containing only the specified boundary string, prepended with two hyphens. The final boundary line is also appended with two hyphens.

Each part has the same structure as a non-MIME message: headers, a blank line, and the body. There are no required headers for each part, as the required headers are still at the top of the email before the multipart body begins. Each part usually has a Content-Type header, although this is optional and without it the part will be interpreted as plain text.
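
To see the boundary doing its job, this sketch runs the multipart example above through Python's standard library parser and lists the parts it finds. It is an illustration under the assumption that Python is available, not a description of any particular mail client.

```python
from email import message_from_string

RAW = """\
Date: Mon, 01 Nov 2021 12:00:00 -0000
From: test@example.com
Content-Type: multipart/alternative; boundary="=_part_boundary_abc123"

--=_part_boundary_abc123
Content-Type: text/plain

Hello

--=_part_boundary_abc123
Content-Type: text/html

<marquee>Hello</marquee>

--=_part_boundary_abc123--
"""

msg = message_from_string(RAW)

# For a multipart message, get_payload() returns a list of sub-messages,
# each with its own headers and body.
parts = [
    (part.get_content_type(), part.get_payload().strip())
    for part in msg.get_payload()
]

print(msg.get_boundary())
for content_type, text in parts:
    print(content_type, "->", text)
```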

I've used a simple boundary string for the sake of example, but the boundary string must not occur anywhere else in the content. To minimise this risk, boundary strings in the wild tend to be much longer and more random, eg: --==_mimepart_61c34e87190a7_a0ec788292544. A boundary string can be up to 70 characters long, drawn from a limited set of 7-bit ASCII characters (letters, digits and a handful of punctuation marks).
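
Those constraints can be sketched as a small validity check. The function is_usable_boundary is a hypothetical helper of my own. The character set is my reading of the boundary rules in RFC 2046, and searching the whole content for the delimiter is a deliberately cautious simplification, since strictly it only matters at the start of a line.

```python
import string

# Characters permitted in a boundary; a space is allowed, but not as the last character.
BOUNDARY_CHARS = set(string.ascii_letters + string.digits + "'()+_,-./:=? ")


def is_usable_boundary(boundary: str, content: str) -> bool:
    """Hypothetical check that a boundary is well formed and safe for this content."""
    if not 1 <= len(boundary) <= 70:
        return False
    if boundary.endswith(" "):
        return False
    if not set(boundary) <= BOUNDARY_CHARS:
        return False
    # A delimiter line is the boundary prefixed with two hyphens.
    return ("--" + boundary) not in content


print(is_usable_boundary("=_part_boundary_abc123", "Hello"))
print(is_usable_boundary("abc", "some text\n--abc\nmore text"))
```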

The MIME specification allows for additional content before the first part, and after the last, known as the preamble and epilogue respectively.

In a MIME compatible email client the preamble and epilogue will not be displayed. They are designed to be used like comments that will only be displayed in non-MIME compatible clients, explaining that the email is in MIME format and is best viewed with MIME compliant software:

Date: Mon, 01 Nov 2021 12:00:00 -0000
From: test@example.com
Content-Type: multipart/alternative; boundary="=_part_boundary_abc123"

[preamble] This is a MIME formatted message, and is best viewed in a MIME compatible email client

--=_part_boundary_abc123
Content-Type: text/plain

Hello

--=_part_boundary_abc123
Content-Type: text/html

<marquee>Hello</marquee>

--=_part_boundary_abc123--

[epilogue] This was a MIME formatted message, and is best viewed in a MIME compatible email client
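
Python's standard library parser keeps the preamble and epilogue separate from the parts, which makes the behaviour easy to demonstrate. This is a sketch of one MIME-aware parser, not a claim about how every client handles them.

```python
from email import message_from_string

RAW = """\
Date: Mon, 01 Nov 2021 12:00:00 -0000
From: test@example.com
Content-Type: multipart/alternative; boundary="=_part_boundary_abc123"

[preamble] This is a MIME formatted message

--=_part_boundary_abc123
Content-Type: text/plain

Hello

--=_part_boundary_abc123
Content-Type: text/html

<marquee>Hello</marquee>

--=_part_boundary_abc123--

[epilogue] This was a MIME formatted message
"""

msg = message_from_string(RAW)

# The text outside the boundaries is exposed as attributes, not as parts.
preamble = msg.preamble.strip()
epilogue = msg.epilogue.strip()
part_types = [part.get_content_type() for part in msg.get_payload()]

print(preamble)
print(epilogue)
print(part_types)
```

Only the two real parts show up in the payload, so a client that renders parts will never show the preamble or epilogue.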

Attachments

The MIME specification also includes attaching files to emails. Each file attachment is another email "part". When attaching files the main Content-Type should be set to multipart/mixed instead of multipart/alternative (more details on this in the Multipart subtypes section of this post). The Content-Type header still needs the boundary string.

The attachment part also needs three new headers of its own:

  • Content-Type - This specifies the attached file's type, and should be a valid MIME type, eg: image/gif or application/pdf. An optional name parameter specifies the file's original name.
  • Content-Disposition - This can be set to inline if the file should be displayed immediately as part of the email, or attachment if it should be downloadable at the user's request. An optional filename parameter specifies what filename should be used when the file is downloaded.
  • Content-Transfer-Encoding - This specifies how the file's data has been encoded into the email content. Options are 7bit, 8bit, quoted-printable, base64 or binary. Base64 is the most commonly used encoding for attachments.

Date: Mon, 01 Nov 2021 12:00:00 -0000
From: test@example.com
Content-Type: multipart/mixed; boundary="=_part_boundary_abc123"

--=_part_boundary_abc123

Please see attached file

--=_part_boundary_abc123
Content-Type: image/gif; name="peanut-butter-jelly-time.gif"
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename="peanut-butter-jelly-time.gif"

V2UgbGlrZSBjdXJpb3VzIG1pbmRzLCBjaGVjayBvdXQgS3J5c3RhbCdzIGN1cnJlbnQgdmFjYW5jaWVzIC0gaHR0cHM6Ly9rcnlzdGFsLnVrL2NhcmVlcnM=

--=_part_boundary_abc123--
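
In practice a library usually writes these headers for you. As a sketch, here is how Python's standard library builds a similar message. The attachment bytes are a made-up placeholder rather than a real GIF.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["Date"] = "Mon, 01 Nov 2021 12:00:00 -0000"
msg["From"] = "test@example.com"
msg.set_content("Please see attached file")

# Placeholder bytes standing in for a real image file.
fake_gif = b"GIF89a placeholder bytes"
msg.add_attachment(
    fake_gif,
    maintype="image",
    subtype="gif",
    filename="peanut-butter-jelly-time.gif",
)

attachment = next(msg.iter_attachments())

print(msg.get_content_type())
print(attachment.get_content_type())
print(attachment["Content-Transfer-Encoding"])
print(attachment.get_content_disposition(), attachment.get_filename())
```

Adding the attachment turns the message into multipart/mixed, and the attachment part gets its Content-Type, Content-Disposition and Content-Transfer-Encoding headers without any manual base64 work.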

Multipart subtypes

The part after the / in the multipart Content-Type header is known as the subtype. There are many different subtypes in the specification, but you'll mostly only see two in the wild: mixed and alternative.

multipart/alternative

This subtype is used when the parts are interchangeable, ie: only one part needs to be viewed, and the receiving client can choose which of the parts is the best one to display.

The classic example of this is an email sent as both HTML and plain text. Both parts should contain the same content in a different format. Parts are listed in increasing order of preference, which is why the plain text part comes first in the examples above. The receiving client displays the last part it supports, subject to the user's preferences.

multipart/mixed

This subtype is used when the parts do not contain the same content, so more than one part should be seen.

The classic example is an email with an attachment. The attachment is a separate part which should be displayed to the user in addition to the content body.

This is the default subtype. Emails with an unrecognised subtype will be interpreted as multipart/mixed.

Nested parts

With that in mind, which subtype should you use for an email that has both HTML and plain text versions and also has attachments?

While mixed would technically be ok for this, doing that would remove the advantage of using the alternative subtype — clients would no longer be able to know the text and HTML parts are the same content in different formats, and would likely end up displaying both.

A better approach is to use nested parts: the body of a part can itself be a multipart body containing further parts.

For an email with text and HTML content, as well as attachments, the main email body can be split into mixed parts: the first of which would be further split into alternative parts for the content, and the attachments would follow:

  • mixed parts:
    • alternative parts:
      • Plain text content
      • HTML content
    • Attachment
    • Attachment

This way the client can still display the email in the user's preferred format (plain text or HTML), and also include the attachments.
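
Python's standard library happens to produce exactly this layout when you add an alternative and then attachments, so it makes a handy sketch of the nesting. The file names and placeholder bytes are invented for the example, and outline is a small helper of my own.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "test@example.com"
msg.set_content("Hello")
msg.add_alternative("<marquee>Hello</marquee>", subtype="html")
# Placeholder bytes and file names, invented for this example.
msg.add_attachment(b"%PDF placeholder", maintype="application",
                   subtype="pdf", filename="report.pdf")
msg.add_attachment(b"GIF89a placeholder", maintype="image",
                   subtype="gif", filename="picture.gif")


def outline(part, depth=0):
    """Return the content types of a message tree, indented by nesting depth."""
    lines = ["  " * depth + part.get_content_type()]
    if part.is_multipart():
        for child in part.iter_parts():
            lines.extend(outline(child, depth + 1))
    return lines


structure = outline(msg)
print("\n".join(structure))
```

The outline shows a multipart/mixed message whose first part is a multipart/alternative holding the plain text and HTML versions, followed by the two attachments.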


So that's a brief intro to the anatomy of an email. If you're interested in going deeper I recommend reading the original RFCs. As someone who knew practically nothing about the structure of emails, I found them surprisingly easy to read and understand.

If you'd like to know more about the other Multipart subtypes, there's a very detailed article on docs.microsoft.com

Did you know we have an open source mail platform? Check out Postal