E-mail authentication

From Wikipedia, the free encyclopedia

Ensuring a valid identity on an e-mail has become a vital first step in stopping spam, forgery, fraud, and even more serious crimes. An essential second step is ensuring that the identified entity has a good reputation. Unfortunately, the Simple Mail Transfer Protocol (SMTP) that handles most e-mail today was designed in the early 1980s, when most Internet users were honest "techies" who expected others to be equally honest. This article explains how e-mail identities are forged and the steps now being taken to prevent it.[1]

Mail transfer

In a simple mail transfer, there are four key players: the author or originator of the e-mail, the sender or agent who first puts the e-mail on the public Internet, the receiver or agent who gets the e-mail from the Internet, and the recipient who is the person supposed to read the e-mail.[2] When we say Internet, with a capital I, we mean the world-wide network that shares a common set of IP addresses, not the internal networks before the sender or after the receiver. For example, the computer I am writing this article on shares a local network with other computers having addresses I can assign at will. My network connects via a router to the network of my ISP, and I can assign whatever addresses I want within my network, including the address of my router. It is only when I connect my router to the Internet that a real Internet IP address is needed.

Image:Email_Authentication_01a.png

Other than the sender's IP address, there is no verification of any information in an e-mail. It is quite easy for a spammer to make an exact copy of an e-mail from smithbarney.com, including a long complicated sequence of headers and a genuine logo in the body of an e-mail, then change the content to send readers to a website that appears to be genuine, but is actually a phishing scam designed to capture names, passwords, and credit card numbers.

So why can't the sender's IP address be used to identify the spammer? There are two problems. One is that spammers often work through forwarders to hide their IP addresses (see below). Another is that the sender is often a zombie that has been infected by a computer virus, and is programmed to send spam without the owner even knowing about it. There are millions of insecure home computers, and they have become a major source of spam.

Attempts to stop spam by blacklisting senders' IP addresses still allow a small percentage through[3]. Most IP addresses are dynamic, i.e. they change frequently. An ISP, or any organization directly connected to the Internet, is allocated a block of public Internet addresses. Within that block, it assigns individual addresses to customers as needed. A dial-up customer may get a new IP address each time they connect. By the time that address appears on blacklists all over the world, the spammer will have new addresses for the next run. There are about 4.3 billion (2^32) possible IPv4 addresses on the Internet. The game of keeping up with these rapidly changing IP addresses has been facetiously called "whack-a-mole".

There are a number of things that ISPs have done to stop zombies and deliberate spamming by their customers. Infected computers can be cleared of viruses and patched to resist further infection. Outgoing e-mail can be monitored for any sudden increase in flow or in content that is typical of spam. Some ISPs have been quite successful[4], but others don't care to make the effort. With spam now over 80% of all e-mail traffic[5], we can expect that there will always be ISPs who are willing to provide services for spammers.

Authenticating senders

E-mail authentication greatly simplifies and automates the process of identifying senders. After identifying and verifying a claimed domain name, it is possible to treat suspected forgeries with suspicion, reject known forgeries, and block e-mail from known spamming domains. It is also possible to "whitelist" e-mail from known reputable domains and let it bypass content-based filtering, which always loses some valid e-mails in the flood of spam. The remaining category, e-mail from unknown domains, can be treated as all e-mail is treated now: increasingly rigorous filtering, return challenges to the sender, etc. Success of a domain-rating system should encourage reputable ISPs to stop their outgoing spam and get a good rating.

There are a number of ways to authenticate a sender's domain name (SPF, CSV, SenderID, DomainKeys). All are very effective in stopping the kind of forgery now prevalent. None excludes the use of other methods, although the pairs SPF / CSV, SPF / SenderID, and SenderID / DomainKeys appear to be competing for the same niches. The most widely used will likely be the ones that require the least effort on the part of ISPs and others currently operating public mail servers.

SPF, CSV, and SenderID authenticate just a domain name. DomainKeys uses a digital signature to authenticate a domain name and the entire content of a message. SPF and CSV can reject a forgery before any data transfer. SenderID and DomainKeys must see at least the headers, so the entire message must be transmitted. SPF has a problem with forwarders, and SenderID has one with mailing lists as well (see below). CSV checks only the HELO identity.

SPF, CSV, and SenderID work by tying a temporary IP address to a claimed domain name. Every incoming e-mail has an IP address that cannot be forged[6], several domain names in the e-mail headers, and a few more in the commands from the sender's SMTP server. The methods differ in which of these names they use as the sender's domain name. All of these names can be faked; what cannot be faked is the information held for a domain name by the DNS server responsible for that section of the Internet[7].

The simplest and by far most widely deployed authentication scheme begins with a reverse DNS lookup of the connecting IP. If there is no answer, it's a safe bet that the IP is not a legitimate sender. If there is an answer, a forward DNS lookup of that answer authenticates the sender if it returns the connecting IP. In other words, we look up the name of the connecting IP, and look up the IP of that name, and they must match.
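
The two lookups described above can be sketched in a few lines of Python. This is a minimal illustration, not a production check: the function names are made up for this example, only IPv4 is handled, and the resolver functions are passed in as parameters (an assumption made here so the logic can be exercised without network access).

```python
import socket
from typing import Callable, List


def system_reverse_lookup(ip: str) -> str:
    """Return the primary host name the system resolver reports for an IP."""
    return socket.gethostbyaddr(ip)[0]


def system_forward_lookup(name: str) -> List[str]:
    """Return the IPv4 addresses the system resolver reports for a name."""
    return socket.gethostbyname_ex(name)[2]


def forward_confirmed_reverse_dns(
    ip: str,
    reverse_lookup: Callable[[str], str] = system_reverse_lookup,
    forward_lookup: Callable[[str], List[str]] = system_forward_lookup,
) -> bool:
    """Return True only if IP -> name -> IP leads back to the same address.

    Hypothetical helper for illustration; the lookup functions can be
    replaced, for example with table-driven stand-ins.
    """
    try:
        name = reverse_lookup(ip)          # step 1: name of the connecting IP
        addresses = forward_lookup(name)   # step 2: addresses of that name
    except OSError:
        # socket.herror and socket.gaierror are OSError subclasses:
        # no answer means the sender cannot be confirmed.
        return False
    return ip in addresses                 # step 3: they must match
```

A receiver would call forward_confirmed_reverse_dns with the address of the connecting host and treat a False result as a failed check.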

Image:Email_Authentication_02a.png

The procedure to authenticate is basically simple. When a request to deliver an e-mail arrives, the claimed sender's domain name is sent in a query to a high-level Domain Name Server. That DNS server, in turn, refers to lower-level servers until an answer is found that is authoritative for the domain in question. The answer returned to the receiver includes the information to authenticate the e-mail. For SPF and SenderID, the query returns the IP addresses which are authorized to send mail on behalf of that domain. Typically there will be very few authorized SMTP sending addresses, even from a domain with millions of dynamically assignable IPs. For DomainKeys, the query returns the public key for the domain, which then validates the signature in the e-mail. A successful validation proves the domain name is not faked, and that neither the headers nor the body of the e-mail were altered on the way from the sender.
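
For the IP-based methods, the final comparison might look like the following Python sketch. It assumes the receiver has already fetched the domain's record as a text string (the DNS query itself is not shown), and it understands only the ip4: and ip6: terms of an SPF-style record. Real SPF records have further mechanisms that this simplified example ignores, and the function names are illustrative.

```python
import ipaddress


def authorized_networks(spf_record):
    """Extract the ip4:/ip6: networks from a simplified SPF-style record.

    Only terms with no qualifier or a "+" qualifier are treated as
    authorizing; every other term (include:, a, mx, -all, ...) is ignored.
    """
    terms = spf_record.split()
    if not terms or terms[0].lower() != "v=spf1":
        raise ValueError("not an SPF version 1 record")
    networks = []
    for term in terms[1:]:
        mechanism = term.lstrip("+").lower()
        if mechanism.startswith(("ip4:", "ip6:")):
            # strict=False accepts a host address written with a prefix length
            networks.append(ipaddress.ip_network(mechanism[4:], strict=False))
    return networks


def sender_is_authorized(connecting_ip, spf_record):
    """Return True if the connecting IP falls inside an authorized network."""
    address = ipaddress.ip_address(connecting_ip)
    return any(address in net for net in authorized_networks(spf_record))


# Example record using documentation-only address ranges (assumed values).
record = "v=spf1 ip4:192.0.2.0/24 ip6:2001:db8::/32 -all"
print(sender_is_authorized("192.0.2.25", record))
print(sender_is_authorized("198.51.100.9", record))
```

The record lists only a couple of networks, which matches the point above that a domain typically authorizes very few sending addresses.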

A spammer has no access to any of the connections between these DNS servers. Even if he were to falsify records in the DNS server for his own domain, he would not be able to forge someone else's domain name. When a spammer tries to send an e-mail claiming to be from amazon.com, for example, the receiver queries the .com DNS server, then a server in a secure building at Amazon. The IP address on the message from the spammer won't match any of Amazon's authorized IP addresses, and the e-mail can be rejected. Alternatively, the DomainKey will show the signature in the e-mail is invalid.

Use of the DNS database to register authentication information for a domain is relatively new. The new information is added to existing DNS records, and queries for this information are handled the same way as any other DNS query. Publishing authentication records in DNS is voluntary, and many domains probably won't bother. However, any legitimate domain, even those that don't intend to operate public mail servers, will most likely want to block others from using their name to forge e-mails. A simple code in their DNS record will tell the world, "Block all mail claiming to be from our domain. We have no public mail servers."
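
Under SPF, one of the schemes discussed here, such a "no mail from this domain" declaration is conventionally the record v=spf1 -all. The Python sketch below shows how a receiver might recognize it. The function name is invented for this example, and the check is deliberately narrow: it matches only that exact two-term record.

```python
def declares_no_mail(spf_record: str) -> bool:
    """Return True if the record is exactly 'v=spf1 -all'.

    That record authorizes no senders at all, so any mail claiming to
    come from the domain fails the check. Hypothetical helper.
    """
    return spf_record.lower().split() == ["v=spf1", "-all"]


print(declares_no_mail("v=spf1 -all"))
print(declares_no_mail("v=spf1 ip4:192.0.2.0/24 -all"))
```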

Difficulties with e-mail forwarding

At this point, you probably know all you need to know about e-mail authentication, but there are some additional details when an e-mail forwarder is involved. Forwarders perform a useful service in allowing you to have one simple permanent address, even if you change jobs or ISPs. List servers perform a similar function, forwarding e-mail to many receivers on behalf of one sender. Forwarders pose no problem for an end-to-end authentication method like DomainKeys, as long as the signed message is not modified (some lists do this).

CSV limits its focus to one-hop authentication. SPF and SenderID have in essence the same limitation: they do not work behind the "border" (MX) of the receiver. For SPF, forwarders to third parties could rewrite the Return-Path (MAIL FROM), much as mailing lists do. This approach emulates SMTP behaviour before RFC 1123 deprecated source routes; for a technical explanation, see SRS.
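
The idea behind such Return-Path rewriting can be sketched as follows. This Python example is loosely modelled on SRS but is not compatible with it: the address layout is simplified, the timestamp field that real SRS includes is omitted, and the function names and the shared secret are assumptions made for illustration. The forwarder embeds the original address in a new address at its own domain, protected by a keyed hash so that outsiders cannot construct valid rewritten addresses.

```python
import hashlib
import hmac


def _tag(local: str, domain: str, secret: bytes) -> str:
    """Short keyed hash binding the original address to this forwarder."""
    message = f"{domain}={local}".lower().encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()[:8]


def rewrite_return_path(original: str, forwarder_domain: str, secret: bytes) -> str:
    """Rewrite user@origin into an address at the forwarder's own domain."""
    local, _, domain = original.rpartition("@")
    return f"SRS0={_tag(local, domain, secret)}={domain}={local}@{forwarder_domain}"


def recover_return_path(rewritten: str, secret: bytes) -> str:
    """Recover the original address from a bounce, verifying the hash first."""
    local_part = rewritten.rpartition("@")[0]
    # Raises ValueError if the local part does not have four "=" fields.
    prefix, tag, domain, local = local_part.split("=", 3)
    expected = _tag(local, domain, secret)
    if prefix != "SRS0" or not hmac.compare_digest(tag.encode(), expected.encode()):
        raise ValueError("not a valid rewritten address")
    return f"{local}@{domain}"


secret = b"forwarder-private-key"  # hypothetical secret known only to the forwarder
rewritten = rewrite_return_path("alice@example.org", "forwarder.example.net", secret)
print(rewritten)
print(recover_return_path(rewritten, secret))
```

Because the rewritten address belongs to the forwarder's domain, the next receiver checks the forwarder's authorization rather than the original sender's, and bounces travel back through the forwarder.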

For SenderID, forwarders to third parties and mailing lists are asked to add a Sender or Resent-Sender field to the mail header. For many mailing lists the former is already the case, but other forwarders avoid any modification of the mail beyond the mandatory Received timestamp line.

Image:Email_Authentication_03d.png

Use of a forwarder prevents the receiver from directly seeing the sender's IP address. The incoming IP packets carry only the forwarder's IP address. Two solutions are possible if you can trust all forwarders. Either you trust the forwarder to authenticate the sender, or you trust the forwarder to at least accurately record the incoming IP address and pass it on, so you can do your own authentication.

The situation gets complicated when there is more than one forwarder. A sender can explicitly authorize a forwarder to send on its behalf, in effect extending its boundary to the public Internet. A receiver can trust a forwarder that it pays to handle e-mail, in effect designating a new receiver. There may be additional "MTA Relays" in the middle, however. These are sometimes used for administrative control, traffic aggregation, and routing control. All it takes is one broken link in the chain-of-trust from sender to receiver, and it is no longer possible to authenticate the sender.

Forwarders have one other responsibility, and that is to properly route Delivery Status Notifications (DSNs) and spam bounces. Normal DSNs should be sent straight to an address chosen by the sender. Spam bounces should not be sent to any address that may be forged[8]. These bounces may go back by the same path they came, if that path has been authenticated.

References

  1. ^ This is a complex and controversial topic. See also the original simpler article by David MacQuigg - http://purl.net/macquigg/email/Email_Authentication.htm
  2. ^ How mail flows through the Internet http://www.openspf.org/mailflows.pdf (PNG)
  3. ^ Spamhaus - effective spam filtering http://www.spamhaus.org/effective_filtering.html
  4. ^ America Online claims to have eliminated outgoing spam. A small sample of reports from SpamCop seems to validate this.
  5. ^ Junk mail statistics http://www.junk-o-meter.com/stats/index.php and http://www.postini.com/stats/
  6. ^ IP address forgery is possible, but generally involves a lower level of criminal behavior (breaking and entering, wiretapping, etc.), and these crimes are too risky for a typical hacker or spammer.
  7. ^ There have been attacks on DNS servers, but doing this on a large scale over a long period of time may be orders of magnitude more difficult than spreading zombie infections among millions of insecure home computers. The much smaller number of DNS servers could be upgraded to use DNSSEC if such attacks were to become commonplace.
  8. ^ SpamCop FAQ about spam bounces http://www.spamcop.net/fom-serve/cache/329.html

See also