Exploring DKIM Validation

To explore how DKIM validation works, I first took a known-good email in .eml format as an example, one where DKIM validation passes, meaning that both the signed headers and the body are unchanged from when the email was first DKIM signed.

There is a small Python package called “dkimpy”, which includes a dkimverify command-line tool and can be installed with:

pip install dkimpy

Once installed, you can either redirect a file into its standard input, or run it with no input, paste the .eml contents directly into the console, and press Ctrl+D to finish. For the remainder of this article, I’ll be using the file redirection approach, as follows:

dkimverify < email_file.eml

Assuming the DKIM signature matches the email, you’ll get output such as:

$ dkimverify < validemail.eml
signature ok
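
The same check can also be run from Python rather than the shell. The sketch below assumes dkimpy's `dkim.verify()` function, which takes the raw message bytes and returns a boolean. The helper names `verify_eml` and `verify_eml_file` are made up for illustration, and a real run needs network access so the signer's public key can be fetched from DNS.

```python
from pathlib import Path


def verify_eml(raw_message: bytes) -> bool:
    """Return True if the DKIM signature on the raw message verifies."""
    # Imported here so this module still loads where dkimpy is not installed.
    import dkim

    # dkim.verify() fetches the public key from DNS using the d= and s= tags.
    return dkim.verify(raw_message)


def verify_eml_file(path: str) -> bool:
    """Read an .eml file as bytes (no decoding) and verify it."""
    return verify_eml(Path(path).read_bytes())
```

Reading the file as bytes matters here: decoding and re-encoding the message could change line endings or whitespace, which is exactly the kind of change DKIM is designed to detect.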

Okay, so let’s dig into the DKIM-Signature header (defined in RFC 6376) and have a look at what is actually there:

DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=domain.com;
        s=selector1; t=1710613595;
        bh=H9BvNbCpsH41x7+YSD1YntT35RyCQyz6jQC2gg7ZXsg=;
        h=To:From:Subject:Date:Cc:From;
        b=EZ2ucrQ3V22gmo2fIKMjEjbpo0bjaieQtBXowR2A/vcWCtygSOm571Hetvs1qhj64
         Zz5Gx0RE8Za5n9TTLHm9GLxLj6DoHWtDwQLikjW8b9PBaDVq7l1cIrzr4rnF3yuicX
         p2Nk3YdWLY/zKK3PT63mTZvCMWIpTUjrlpnZ9wY8=
  • b = The digital signature, computed over the signed headers and the body hash (and therefore, indirectly, the body).
  • bh = A hash of the body of the message. The body is hashed first, and the signature is then based on that hash rather than on the body itself.
  • d = The domain used for the signing (signing domain), specified so that the recipient knows which domain to look under for the DKIM DNS record.
  • s = The DKIM selector that was used to sign the email message. There may be one or more selectors on a domain, so it is essential to specify which one was used, so the recipient knows where to look for the public key that corresponds to the private key used to sign the message.
  • v = The version of DKIM.
  • a = The signing algorithm used, rsa-sha256 in the example above, again so the recipient knows what to use when attempting validation.
  • c = The canonicalization algorithm(s) for header and body. There are two options, relaxed and simple, specified in the format header/body, so relaxed/simple means “relaxed” is applied to the headers and “simple” to the body. Simple is the stricter of the two and tolerates almost no modification. Relaxed allows some minor modifications to the email, for example when it has been forwarded, without invalidating the message’s signature. A hash comparison is always a binary “no, it has not been changed” or “yes, it has been changed”, so relaxed canonicalisation normalises the input before hashing: for headers it converts all header names to lower case, unfolds headers so each is a single line, collapses runs of whitespace into a single space, and removes trailing whitespace. This makes the signature more robust and therefore improves deliverability even if some trivial modifications happen during transit.
  • h = The list of signed header fields, with a name repeated for fields that occur multiple times. In the example above, To:From:Subject:Date:Cc:From means these fields are included in the signature. This is the order in which they were presented during DKIM signing, and therefore the order in which they must be presented during verification. They are the fields not expected to change in transit.
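
To make the “relaxed” header rules concrete, here is a minimal sketch of them applied to a single raw header field. The function name `relaxed_header` is made up for this example, and it is a simplified illustration of the steps listed above rather than the code dkimpy itself uses.

```python
import re


def relaxed_header(raw_header: str) -> str:
    """Apply 'relaxed' header canonicalization to one raw header field."""
    name, _, value = raw_header.partition(":")
    # Unfold: drop the CRLF of each folded continuation line.
    value = value.replace("\r\n", "")
    # Collapse every run of spaces/tabs into a single space.
    value = re.sub(r"[ \t]+", " ", value)
    # Lowercase the name and strip whitespace around the name and value.
    return f"{name.strip().lower()}:{value.strip()}\r\n"


folded = "Subject:  Hello \r\n\t World  \r\n"
print(repr(relaxed_header(folded)))
```

With “simple” header canonicalization the field would be used exactly as it arrived, so re-wrapping or re-spacing that same Subject line would break the signature.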

The remaining DKIM fields are:

  • q = The query method used to retrieve the public key; the default is dns/txt.
  • l = The length of the canonicalised part of the body that has been signed, i.e. how many bytes of the body were included in the body hash.
  • t = The signature timestamp as a Unix timestamp, so the example above was: Sat Mar 16 2024 18:26:35 GMT+0000
  • x = The expiry time of the signature, i.e. it’s expected that the message is delivered and verified before this time.
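
As a rough illustration of how these tags can be pulled apart, the sketch below splits the example header value into a dictionary and converts the t= value into a readable time. The names `parse_dkim_tags` and `signed_at` are invented for this example. Stripping all whitespace from every value is a simplification that suits the tags shown here, not a full RFC 6376 parser.

```python
from datetime import datetime, timezone


def parse_dkim_tags(header_value: str) -> dict[str, str]:
    """Split a DKIM-Signature value into a {tag: value} dictionary."""
    tags = {}
    for part in header_value.split(";"):
        if "=" not in part:
            continue  # skip empty pieces, e.g. after a trailing ';'
        name, _, value = part.partition("=")
        # Remove folding whitespace, such as the line wraps inside b=.
        tags[name.strip()] = "".join(value.split())
    return tags


def signed_at(tags: dict[str, str]) -> datetime:
    """Convert the t= tag (Unix seconds) into a UTC datetime."""
    return datetime.fromtimestamp(int(tags["t"]), tz=timezone.utc)


SAMPLE = (
    "v=1; a=rsa-sha256; c=relaxed/simple; d=domain.com;\r\n"
    "        s=selector1; t=1710613595;\r\n"
    "        bh=H9BvNbCpsH41x7+YSD1YntT35RyCQyz6jQC2gg7ZXsg=;\r\n"
    "        h=To:From:Subject:Date:Cc:From;\r\n"
    "        b=EZ2ucrQ3V22gmo2fIKMjEjbpo0bjaieQtBXowR2A/vcWCtygSOm571Hetvs1qhj64\r\n"
    "         Zz5Gx0RE8Za5n9TTLHm9GLxLj6DoHWtDwQLikjW8b9PBaDVq7l1cIrzr4rnF3yuicX\r\n"
    "         p2Nk3YdWLY/zKK3PT63mTZvCMWIpTUjrlpnZ9wY8=\r\n"
)

tags = parse_dkim_tags(SAMPLE)
print(tags["d"], tags["s"], tags["c"])
print(signed_at(tags).isoformat())  # 2024-03-16T18:26:35+00:00
```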

The DKIM Signature Investigated

So attribute “b” contains the DKIM signature:

b=EZ2ucrQ3V22gmo2fIKMjEjbpo0bjaieQtBXowR2A/vcWCtygSOm571Hetvs1qhj64
         Zz5Gx0RE8Za5n9TTLHm9GLxLj6DoHWtDwQLikjW8b9PBaDVq7l1cIrzr4rnF3yuicX
         p2Nk3YdWLY/zKK3PT63mTZvCMWIpTUjrlpnZ9wY8=

And “bh”, the body hash, contains the hash of the body of the message. Hashing the message body gives you a fixed-length digest to use when signing, rather than the arbitrary, variable-length body of the email being signed, which would be awkward to work with.

bh=H9BvNbCpsH41x7+YSD1YntT35RyCQyz6jQC2gg7ZXsg=;
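
For a feel of how a bh value is produced, here is a minimal sketch assuming the “simple” body canonicalization and SHA-256 from the example header (c=relaxed/simple, a=rsa-sha256). The function names are invented for illustration and the sample body is not the real message, so the output will not match the bh shown above.

```python
import base64
import hashlib


def simple_body(body: bytes) -> bytes:
    """'simple' body canonicalization: the body ends in exactly one CRLF."""
    # Drop trailing empty lines, then add back a single terminating CRLF.
    while body.endswith(b"\r\n"):
        body = body[:-2]
    return body + b"\r\n"


def body_hash(body: bytes) -> str:
    """Base64-encoded SHA-256 digest of the canonicalized body."""
    digest = hashlib.sha256(simple_body(body)).digest()
    return base64.b64encode(digest).decode("ascii")


print(body_hash(b"Hello world\r\n"))
print(body_hash(b"Hello world\r\n\r\n"))  # same: trailing empty lines ignored
print(body_hash(b"Hello world.\r\n"))     # different: the content changed
```

The digest is always 32 bytes, which is why every bh produced with SHA-256 is 44 base64 characters long, however large the body is.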

Breaking It

OK, so we validated the .eml file above. Now let’s make a change: we change the time in the “Date” header from “Sat, 16 Mar 2024 18:26:31 +0000” to “Sat, 16 Mar 2024 18:26:32 +0000”. As the “Date” field is included in the “h” attribute (h=To:From:Subject:Date:Cc:From;), any change to it will result in the message failing validation.

$ dkimverify < notvalidemail.eml
signature verification failed

As expected, it failed. If we now put the value back to what it was before and retry the validation, we’ll see it’s all fine.

Good, so now let’s try something else. What if we change the “Message-ID” slightly, say by changing a letter to a number in the alphanumeric string? The “Message-ID” header is not included in the “h” attribute, so:

$ dkimverify < notvalidemail.eml
signature ok

Interesting: the change goes unnoticed because that header was never signed. Make sure the Message-ID is put back to its original value before continuing.
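
A quick way to predict which header edits will break validation is to check the header name against the h= list. The helper below is a hypothetical sketch (the name `is_signed` is made up) and relies on header field names being case-insensitive.

```python
def is_signed(header_name: str, h_tag: str) -> bool:
    """True if header_name appears in the colon-separated h= list."""
    signed = [name.strip().lower() for name in h_tag.split(":")]
    return header_name.lower() in signed


H_TAG = "To:From:Subject:Date:Cc:From"

print(is_signed("Date", H_TAG))        # True: editing Date breaks the signature
print(is_signed("Message-ID", H_TAG))  # False: editing it goes unnoticed
```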

Now, as a final test, we’ll alter something in the body of the message. The body is hashed before DKIM signing, and hashing ensures that even a single-bit change results in a completely different digest.

I added a full stop character to one end of the message body. When running dkimverify we now see the following (note I’ve omitted the traceback for clarity):

$ dkimverify < notvalidemail.eml

dkim.ValidationError: body hash mismatch (got b'bHnODtY/ChDi+2pmAGXprtfQHSEgMKinB/v4O9NoZkU=', expected b'H9BvNbCpsH41x7+YSD1YntT35RyCQyz6jQC2gg7ZXsg=')

So let’s take a look at this. As you can see, the bh should be H9BvNbCpsH41x7+YSD1YntT35RyCQyz6jQC2gg7ZXsg=; however, because of that single full stop, it is now bHnODtY/ChDi+2pmAGXprtfQHSEgMKinB/v4O9NoZkU= instead, which shows that the message body has been altered.

Conclusion

As you can see, DKIM validation is very sensitive, and very clear at determining whether the message was authenticated successfully and was unchanged during transit. Using DKIM is a great way to verify that a message has come from who you think it came from, and to detect whether someone has altered it in transit. You must be careful with what happens in transit, though. Ideally the DKIM signature should be added to the message at the last possible moment, i.e. at the very edge of your email infrastructure, and verified at the very edge of the recipient’s email infrastructure. This minimises the risk that a benign change (due to forwarding, adding a disclaimer to the bottom of the email, etc.) breaks the DKIM signature and causes mail delivery issues.
