E-mail authentication

From Wikipedia, the free encyclopedia

Email authentication is the effort to equip messages in the email transport system with enough verifiable information that recipients can automatically recognize the nature of each incoming message. It differs both from content filtering, which relies on fuzzy methods, and from the verification of authors' digital signatures.

Rationale

Ensuring a valid identity on an e-mail has become a vital step in stopping spam (as e-mail can be filtered based on such an identity), forgery, fraud, and even more serious crimes. The Simple Mail Transfer Protocol (SMTP) is continuously evolving, but when it was designed in the early 1980s it was the purview of academia and government agencies, and security was not a design consideration. It provided no formal verification of the sender. Alternative mail systems, such as Internet Mail 2000, appear to have stalled, so SMTP still handles most email today.

Signing emails is a good first step towards identifying the origin of the message, but it does not establish whether that identity has a good reputation or whether it should be trusted.

This article explains how email identities are forged and the steps that are being taken now to prevent it.

Sender's IP verification

An e-mail involves four key players: the authors or originators of the e-mail, the sender or agent who first puts the e-mail on the public Internet, the receiver or agent who receives the e-mail from the Internet, and the recipients who are the persons intended to read the e-mail. For the sake of this discussion, the flow may be simplified as shown in the following figure.[1]

Image:Email_Authentication_01a.png

Thanks to the Transmission Control Protocol and to IP address registries, the sender's IP address is automatically verified by the receiver.[2] However, there is no provision for verifying the author and sender information that is eventually saved in the relevant headers. Thus, it is quite easy for a spammer to make an exact copy of an e-mail from example.com, including a long, complicated sequence of headers and a genuine logo in the body of the e-mail, and then change the content to send readers to a website that appears to be genuine but is actually a phishing scam designed to capture names, passwords, and credit card numbers.

So why can't the sender's IP address be used to identify the spammer? There are two problems. One is that spammers often work through forwarders to hide their IP addresses (see below). Another is that the sender is often a zombie that has been infected by a computer virus, and is programmed to send spam without the owner even knowing about it. There are millions of insecure home computers, and they have become a major source of spam.

Blacklisting

Main article: DNSBL

Attempts to stop spam by blacklisting senders' IP addresses still allow a small percentage through[3]. Many IP addresses are dynamic, i.e. they are frequently reassigned. An ISP, or any organization directly connected to the Internet, is allocated a block of public Internet addresses. Within that block, it assigns individual addresses to customers as needed. A dial-up customer may get a new IP address each time they connect. By the time that address appears on blacklists all over the world, the spammer will have new addresses for the next run. There are about 4.3 billion (2^32) possible IPv4 addresses on the Internet. The game of keeping up with these rapidly changing IP addresses has been facetiously called "whack-a-mole".

So-called policy lists are blacklists that contain IP addresses on a preventive basis. An IP address can be listed even if no spam has ever been sent from it, because it has been variously classified as a dial-up address, end-user address, or residential address, with no formal definition of such classification schemes. Because they do not require evidence of spam for each listed address, these lists can collect a greater number of addresses and thus block more spam. However, the policies devised are not authoritative, since they have not been issued by the legitimate user of an IP address, and the resulting lists are therefore not universally accepted.
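As a minimal sketch of how a receiver typically consults a DNS-based blacklist: the octets of the connecting IPv4 address are reversed and prepended to the list's DNS zone, and the resulting name is looked up. The zone name `dnsbl.example.org` and the function name are hypothetical, chosen only for illustration; the sketch builds the query name and does not perform the lookup.

```python
import ipaddress


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name a receiver would look up to test an IPv4 address.

    `zone` is the blacklist's DNS zone; the value used below is a
    hypothetical placeholder, not a real list.
    """
    # Validate the input; raises ValueError if it is not an IPv4 address.
    addr = ipaddress.IPv4Address(ip)
    # Reverse the four octets, as in reverse DNS names.
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


print(dnsbl_query_name("192.0.2.99", "dnsbl.example.org"))
```

If the resulting name resolves to an address record, the IP address is listed; a "no such name" answer means it is not.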

Controlling users

There are a number of things that ISPs have done to stop zombies and deliberate spamming by their customers:

  • Port 25 can be blocked by access providers in favor of the mail submission agent's port 587, which should always require authentication,
  • the number of existing Received headers in relayed mail can be limited[4],
  • infected computers can be cleared of viruses and patched to resist further infection,
  • outgoing e-mail can be monitored for any sudden increase in flow or in content that is typical of spam.

Some ISPs have been quite successful[4], but others do not make the effort. With spam now over 80% of all e-mail traffic[5], there will likely always be ISPs that are unwilling to take the necessary steps. The measures mentioned above do not directly help the entity that operates them to reduce incoming spam; by reducing outgoing spam, they help Internet users in general.

Authenticating senders

E-mail authentication greatly simplifies and automates the process of identifying senders. After identifying and verifying a claimed domain name, it is possible to treat suspected forgeries with suspicion, reject known forgeries, and block e-mail from known spamming domains. It is also possible to "whitelist" e-mail from known reputable domains and bypass content-based filtering, which always loses some valid e-mails in the flood of spam. The remaining category, e-mail from unknown domains, can be treated as all e-mail is treated now: increasingly rigorous filtering, return challenges to the sender, etc. The success of a domain-rating system should encourage reputable ISPs to stop their outgoing spam and earn a good rating.
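The handling categories described above can be sketched as a small decision function. This is only an illustration under stated assumptions: the result labels ("pass", "fail", "none"), the reputation labels ("good", "bad", "unknown"), and the names `Action` and `choose_action` are hypothetical and do not come from any particular mail system.

```python
from enum import Enum


class Action(Enum):
    REJECT = "reject"   # known forgery or known spamming domain
    ACCEPT = "accept"   # whitelisted: bypass content filtering
    FILTER = "filter"   # fall back to content-based filtering


def choose_action(auth_result: str, reputation: str) -> Action:
    """Map an authentication result and a domain reputation to an action.

    auth_result: "pass", "fail", or "none" (hypothetical labels).
    reputation:  "good", "bad", or "unknown" (hypothetical labels).
    """
    if auth_result == "fail":
        return Action.REJECT      # the claimed domain is a known forgery
    if auth_result != "pass":
        return Action.FILTER      # unverified: treat with suspicion
    if reputation == "bad":
        return Action.REJECT      # authenticated, but a known spamming domain
    if reputation == "good":
        return Action.ACCEPT      # authenticated and reputable
    return Action.FILTER          # authenticated but unknown domain


print(choose_action("pass", "good"))
```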

There are a number of ways to authenticate a sender's domain name (SPF, CSV, SenderID, DomainKeys, DKIM). All are very effective in stopping the kind of forgery now prevalent. None excludes the use of other methods, although the pairs SPF/DKIM, SPF/CSV, SPF/SenderID, and SenderID/DomainKeys appear to be competing for the same niches. The most widely used will likely be the ones that require the least effort on the part of ISPs and others currently operating public mail servers.

SPF, CSV, and SenderID authenticate just a domain name. DomainKeys and DKIM use a digital signature to authenticate a domain name and the entire content of a message. SPF and CSV can reject a forgery before any data transfer. SenderID and DomainKeys must see at least the headers, so the entire message must be transmitted. SPF has a problem with forwarders, for which the Sender Rewriting Scheme defines a fix; SenderID also has a problem with mailing lists (see below). CSV is only concerned with the HELO identity.

SPF, CSV, and SenderID work by tying a temporary IP address to a claimed domain name. Every incoming e-mail has an IP address that cannot be forged[2], several domain names in the e-mail headers, and a few more in the commands from the sender's SMTP server. The methods differ in which of these names they use as the sender's domain name. Any of these names can be faked, but what cannot be faked is a domain name held by a DNS server for that section of the Internet[6].

The simplest and by far the most widely deployed authentication scheme begins with a reverse DNS lookup of the connecting IP address. If there is no answer, the address is unlikely to belong to a legitimate sender. If there is an answer, a forward DNS lookup of that answer authenticates the sender if it returns the connecting IP address. In other words, the receiver looks up the name of the connecting IP address, then looks up the IP addresses of that name, and the two must match.
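The reverse-then-forward check can be sketched as follows. This is a minimal illustration, not a production implementation: the function names are hypothetical, the default resolvers use Python's standard `socket` module and cover IPv4 only, and the resolver functions are passed as parameters so that the logic can be exercised without network access.

```python
import socket
from typing import Callable, List, Optional


def default_reverse(ip: str) -> Optional[str]:
    """Reverse (PTR) lookup; returns None when there is no answer."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None


def default_forward(name: str) -> List[str]:
    """Forward (A) lookup; returns an empty list on failure (IPv4 only)."""
    try:
        return socket.gethostbyname_ex(name)[2]
    except OSError:
        return []


def forward_confirmed_rdns(
    ip: str,
    reverse: Callable[[str], Optional[str]] = default_reverse,
    forward: Callable[[str], List[str]] = default_forward,
) -> bool:
    """True if the name of `ip` resolves back to `ip`."""
    name = reverse(ip)
    if name is None:
        return False              # no reverse DNS answer at all
    return ip in forward(name)    # the name must map back to the same address


# Offline demonstration with made-up DNS data (documentation address ranges).
ptr = {"192.0.2.10": "mail.example.com", "198.51.100.7": "liar.example.net"}
a = {"mail.example.com": ["192.0.2.10"], "liar.example.net": ["198.51.100.1"]}
print(forward_confirmed_rdns("192.0.2.10", ptr.get, lambda n: a.get(n, [])))
```

In the demonstration data, `198.51.100.7` fails the check because its claimed name resolves to a different address.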

Image:Email_Authentication_02a.png

The authentication procedure is simple in outline. When a request to deliver an e-mail arrives, the claimed sender's domain name is sent in a query to a high-level DNS server. That DNS server, in turn, refers to lower-level servers until an answer is found that is authoritative for the domain in question. The answer returned to the receiver includes the information needed to authenticate the e-mail. For SPF and SenderID, the query returns the IP addresses that are authorized to send mail on behalf of that domain. Typically there will be very few authorized SMTP sending addresses, even for a domain with millions of dynamically assignable IP addresses. For DomainKeys, the query returns the public key for the domain, which is then used to validate the signature in an e-mail header. A successful validation proves that the e-mail originated from the people responsible for the DNS servers for that domain (as only the domain owners would have access to the private key that matches the public key published in DNS), and that neither the headers nor the body of the e-mail were altered on the way from the sender.
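For the SPF case, the comparison between the connecting IP address and the published list can be sketched as below. This is a deliberately reduced illustration: it assumes the SPF record text has already been fetched from DNS, it evaluates only the `ip4:`, `ip6:` and `all` mechanisms, and it silently skips mechanisms that would need further DNS queries (`a`, `mx`, `include` and so on). The function name `spf_check` is hypothetical.

```python
import ipaddress

# Qualifier prefixes and the results they produce when a mechanism matches.
RESULTS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}


def spf_check(record: str, client_ip: str) -> str:
    """Evaluate a (pre-fetched) SPF record against a connecting IP address.

    Only ip4:, ip6: and all are handled; this is a sketch, not a full
    SPF implementation.
    """
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return "none"                      # no usable SPF record
    ip = ipaddress.ip_address(client_ip)
    for term in terms[1:]:
        qualifier = "+"                    # default qualifier is "pass"
        if term[0] in RESULTS:
            qualifier, term = term[0], term[1:]
        mechanism = term.lower()
        if mechanism == "all":
            return RESULTS[qualifier]      # matches every address
        if mechanism.startswith(("ip4:", "ip6:")):
            network = ipaddress.ip_network(term[4:], strict=False)
            if ip.version == network.version and ip in network:
                return RESULTS[qualifier]
        # Mechanisms that require DNS lookups are skipped in this sketch.
    return "neutral"                       # nothing matched


record = "v=spf1 ip4:192.0.2.0/24 -all"    # made-up example record
print(spf_check(record, "192.0.2.55"), spf_check(record, "203.0.113.9"))
```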

A spammer has no access to any of the connections between these DNS servers. Even if he were to falsify records in the DNS server for his own domain, he would not be able to forge records for someone else's domain name. When a spammer tries to send an e-mail claiming to be from amazon.com, for example, the receiver queries the .com DNS server, then a server in a secure building at Amazon. The IP address on the message from the spammer will not match any of Amazon's authorized IP addresses, and the e-mail can be rejected. Alternatively, DomainKeys verification will show that the signature in the e-mail is invalid, because it is highly unlikely that the spammer has access to the amazon.com private key.

Use of the DNS database to register authentication information for a domain is relatively new. The new information is added to existing DNS records, and queries for this information are handled the same way as any other DNS query. Publishing authentication records in DNS is voluntary, and many domains probably won't bother. However, any legitimate domain, even those that don't intend to operate public mail servers, will most likely want to block others from using their name to forge e-mails. A simple code in their DNS record will tell the world, "Block all mail claiming to be from our domain. We have no public mail servers."
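In SPF, such a "no mail from this domain" declaration is commonly published as the record `v=spf1 -all`, which authorizes no addresses and fails everything else. The following minimal sketch recognizes that record; the function name is hypothetical, and only this exact two-term form is detected.

```python
def declares_no_mail(record: str) -> bool:
    """True if an SPF record authorizes no senders at all.

    Only the exact form "v=spf1 -all" is recognized; records that reach
    the same effect by other means are not detected by this sketch.
    """
    return record.lower().split() == ["v=spf1", "-all"]


print(declares_no_mail("v=spf1 -all"))
```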

Difficulties with e-mail forwarding

There are some additional details when an e-mail forwarder is involved. Forwarders perform a useful service in allowing a user to keep one simple permanent address, even when changing jobs or ISPs. List servers perform a similar function, forwarding e-mail to many receivers on behalf of one sender. Forwarders pose no problem for end-to-end authentication methods such as DKIM and DomainKeys, as long as the signed message is not modified in transit (some mailing lists do modify it).

CSV limits its focus to one-hop authentication. SPF and SenderID have in essence the same limitation: they do not work directly behind the "border" (MX) of the receiver. For SPF, forwarders to third parties could rewrite the Return-Path (MAIL FROM) in much the same way as mailing lists do. This approach emulates SMTP behaviour from before RFC 1123 deprecated source routes; for a technical explanation see SRS.
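The Return-Path rewriting idea can be sketched as follows. The address layout `SRS0=hash=timestamp=domain=local@forwarder` follows the general shape used by SRS, but the hash and timestamp encodings here are simplified assumptions and are not interoperable with real SRS implementations; the function names, the secret, and the addresses are hypothetical.

```python
import base64
import hashlib
import hmac
from typing import Optional


def _tag(secret: bytes, timestamp: str, domain: str, local: str) -> str:
    """Short keyed hash that stops third parties from minting addresses."""
    data = f"{timestamp}{domain}{local}".lower().encode()
    digest = hmac.new(secret, data, hashlib.sha1).digest()
    return base64.b64encode(digest).decode()[:4]


def srs_forward(return_path: str, forwarder_domain: str,
                secret: bytes, day: int) -> str:
    """Rewrite a Return-Path so that it belongs to the forwarder's domain."""
    local, _, domain = return_path.rpartition("@")
    timestamp = format(day % 1024, "03x")   # simplified day counter
    tag = _tag(secret, timestamp, domain, local)
    return f"SRS0={tag}={timestamp}={domain}={local}@{forwarder_domain}"


def srs_reverse(address: str, secret: bytes) -> Optional[str]:
    """Recover the original Return-Path, or None if the hash does not verify."""
    local_part = address.rpartition("@")[0]
    if not local_part.startswith("SRS0="):
        return None
    parts = local_part[5:].split("=", 3)
    if len(parts) != 4:
        return None
    tag, timestamp, domain, local = parts
    if not hmac.compare_digest(tag, _tag(secret, timestamp, domain, local)):
        return None
    return f"{local}@{domain}"


rewritten = srs_forward("alice@example.com", "forwarder.example.net",
                        b"hypothetical-secret", day=100)
print(rewritten)
print(srs_reverse(rewritten, b"hypothetical-secret"))
```

Because the rewritten address is in the forwarder's own domain, an SPF check at the next hop is made against the forwarder's policy, and bounces return to the forwarder, which can map them back to the original sender.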

For SenderID, forwarders to third parties and mailing lists are asked to add a Sender: or Resent-Sender: header. For many mailing lists, the former is already the case, but other forwarders avoid any modifications of the mail in addition to the mandatory Received-timestamp line.

Image:Email_Authentication_03d.png

Use of a forwarder prevents the receiver from directly seeing the sender's IP address. The incoming IP packets carry only the forwarder's IP address. Two solutions are possible if all forwarders can be trusted: either the receiver trusts the forwarder to authenticate the sender, or it trusts the forwarder to at least accurately record the incoming IP address and pass it on, so that the receiver can perform its own authentication.

The situation gets complicated when there is more than one forwarder. A sender can explicitly authorize a forwarder to send on its behalf, in effect extending its boundary to the public Internet. A receiver can trust a forwarder that it pays to handle e-mail, in effect designating a new receiver. There may be additional "MTA relays" in the middle, however. These are sometimes used for administrative control, traffic aggregation, and routing control. All it takes is one broken link in the chain-of-trust from sender to receiver, and it is no longer possible to authenticate the sender.

Forwarders have one other responsibility, and that is to route Bounce messages (a.k.a. DSNs) in case the forwarding fails (or if it is requested anyway). E-mail forwarding is different from remailing when it comes to which address should receive DSNs. Spam bounces should not be sent to any address that may be forged[7]. These bounces may go back by the same path they came, if that path has been authenticated.

Criticism

“Authentication cannot stop spam, unless the cop/Reputation Service/Certificate Authority in charge revokes certificates for spamming. If that could happen, then ISPs would also be willing and even enthusiastic about terminating accounts or otherwise controlling (e.g. port block) their spammers. If ISPs would do that, then there would be no spam to need authentication to stop spam and so need for a CA playing cop. As long as ISPs remain unwilling to police their own spamming customers, they would never deal with a CA willing to play cop.
Authentication involving TLS, SMTP-AUTH, or S/MIME cannot stop backscatter for the same reasons SPF, DKIM, and the rest were, are, and always will be powerless against it. Some of those reasons are why Yahoo still does not sign DKIM on all outgoing mail, Hotmail still publishes whishywashy SPF RRs and neither requires their snakeoil forgery solution on incoming mail.”
--Vernon Schryver (Distributed Checksum Clearinghouse operator)

References

MacQuigg, David. Email Authentication. Retrieved on 2007-12-05.

  1. ^ How mail flows through the Internet http://www.openspf.org/mailflows.pdf (PNG)
  2. ^ a b IP Address forgery is possible, but generally involves a lower level of criminal behavior (breaking and entering, wiretapping, etc.), which are too risky for a typical hacker or spammer, or insecure servers not implementing RFC 1948, see also Transmission Control Protocol#Connection hijacking.
  3. ^ Spamhaus - effective spam filtering http://www.spamhaus.org/effective_filtering.html
  4. ^ a b America Online claims to have eliminated outgoing spam. A small sample of reports from SpamCop seems to validate this.
  5. ^ Junk mail statistics http://www.junk-o-meter.com/stats/index.php and http://www.postini.com/stats/
  6. ^ There have been attacks on DNS servers, but doing this on a large scale over a long period of time may be orders of magnitude more difficult than spreading zombie infections among millions of insecure home computers. The much smaller number of DNS servers could be upgraded to use DNSSEC if such attacks were to become commonplace.
  7. ^ SpamCop FAQ about spam bounces http://www.spamcop.net/fom-serve/cache/329.html
