Greylisting
From Wikipedia, the free encyclopedia
Greylisting (or graylisting) is a method of defending e-mail users against spam. A mail transfer agent using greylisting will "temporarily reject" any email from a sender it does not recognize. If the mail is legitimate, the originating server will most likely try again to send it later (see disadvantages), at which time the destination will accept it. If the mail is from a spammer, it will probably not be retried, and spam sources which re-transmit later are more likely to be listed in DNSBLs and distributed signature systems such as Vipul's Razor.
How it works
Typically, a server employing greylisting will record three pieces of data, known as a "triplet", for each incoming mail message:
- The IP address of the connecting host
- The envelope sender address
- The envelope recipient address
This is checked against the mail server's internal database. If this triplet has not been seen before (within some configurable period), the email is greylisted for a short time (also configurable), and it is refused with a temporary rejection. The assumption is that since temporary failures are built into the RFC specifications for email delivery, a legitimate server will attempt to connect again later on to deliver the email.
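The check described above can be sketched in a few lines of Python. This is a minimal illustration, not the code of any particular greylisting package: the class name `Greylist`, the parameters `min_delay` and `max_age`, their default values, and the reply strings are all assumptions made for the example, and a single expiry period stands in for the "configurable period" mentioned in the text.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple

# (client IP, envelope sender, envelope recipient)
Triplet = Tuple[str, str, str]

DEFER = "450 greylisted, please try again later"
ACCEPT = "250 OK"


@dataclass
class Greylist:
    min_delay: float = 300.0            # seconds a new triplet must wait (assumed value)
    max_age: float = 30 * 24 * 3600.0   # seconds before a triplet is forgotten (assumed value)
    clock: Callable[[], float] = time.time
    first_seen: Dict[Triplet, float] = field(default_factory=dict)

    def check(self, ip: str, sender: str, recipient: str) -> str:
        """Return an SMTP-style reply for one delivery attempt."""
        key = (ip, sender, recipient)
        now = self.clock()
        seen = self.first_seen.get(key)
        if seen is None or now - seen > self.max_age:
            # Unknown or expired triplet: record it and defer.
            self.first_seen[key] = now
            return DEFER
        if now - seen < self.min_delay:
            # Retried before the greylisting period ended: defer again.
            return DEFER
        # Known triplet and the waiting period has passed.
        return ACCEPT
```

A sender that retries after `min_delay` seconds gets the accepting reply; a sender that never retries leaves only a database entry that eventually expires.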
In practice, most greylisting systems do not require an exact match on the IP address and the sender address. Because large senders often have a pool of machines that can send (and resend) email, IP addresses whose most-significant 24 bits (/24) are the same are treated as equivalent, or in some cases SPF records are used to determine the sending pool. Similarly, some e-mail systems use unique per-message return-paths, for example variable envelope return path (VERP) for mailing lists, Sender Rewriting Scheme for forwarded e-mail, and Bounce Address Tag Validation for backscatter protection. If an exact match on the sender address is required, every e-mail from such systems will be delayed. Instead, some greylisting systems try to eliminate the variable parts of such addresses by using only the sender domain and the beginning of the local-part of the sender address.
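The two relaxations can be illustrated with a pair of helper functions. Both names (`network_key`, `sender_key`) are hypothetical, the /64 prefix for IPv6 is an assumption that the text does not mention, and the rule for cutting the local-part (stop at the first digit, `+`, `-` or `=`) is only one plausible heuristic for "the beginning of the local-part".

```python
import ipaddress
import re


def network_key(ip: str, v4_prefix: int = 24, v6_prefix: int = 64) -> str:
    """Collapse an address to its enclosing network, e.g. a /24 for IPv4."""
    addr = ipaddress.ip_address(ip)
    prefix = v4_prefix if addr.version == 4 else v6_prefix
    # strict=False lets us pass a host address and get its network back.
    return str(ipaddress.ip_network(f"{addr}/{prefix}", strict=False))


def sender_key(sender: str) -> str:
    """Keep the domain and the stable leading part of the local-part."""
    local, _, domain = sender.rpartition("@")
    # Assumed heuristic: per-message tags usually start at the first
    # digit, '+', '-' or '=' in the local-part, so cut there.
    stable = re.split(r"[0-9+\-=]", local, maxsplit=1)[0]
    return f"{stable}@{domain.lower()}"


# Two machines in the same /24 map to the same key.
print(network_key("192.0.2.57"), network_key("192.0.2.200"))
# A VERP-style return-path loses its per-message part.
print(sender_key("bounces+12345=user=example.org@lists.example.com"))
```

Using these keys in place of the raw IP address and sender address makes retries from a neighbouring machine, or with a fresh per-message tag, match the original triplet.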
Greylisting is effective because many mass email tools used by spammers will not bother to retry a failed delivery, so the spam is never delivered. When a spammer does retry a delivery after the waiting period has expired, however, it will likely be after a number of automated honeypots have detected the spam source and listed both the source and the particular message in their databases. Thus, these subsequent attempts are more likely to be detected as spam by other mechanisms than they were at first.
Advantages
The main advantage from the users' point of view is that greylisting requires no additional configuration from their end. If the server utilizing greylisting is configured appropriately, the end user will only notice a delay on the first message from a given sender.
From a mail administrator's point of view the benefit is twofold. First, greylisting takes minimal configuration to get up and running, with only occasional modifications of any local whitelists. Second, rejecting email with a temporary 450 error (the actual error code is implementation dependent) is very cheap in system resources. Most spam filtering tools are very intensive users of CPU and memory. By stopping spam before it hits filtering processes, far fewer system resources are used. This allows more layers of spam filtering or higher throughput.
Disadvantages
Perhaps the most significant disadvantage of greylisting is that, like some other spam mitigation techniques, it destroys the near-instantaneous nature of email that people have come to expect. A customer of a greylisting ISP cannot always rely on getting every email within a predetermined amount of time. However, the original specification for email states that it is neither a guaranteed nor an instantaneous delivery mechanism. This means that greylisting is a perfectly legitimate process and does not break any protocols or rules. Traditionally greylisting is very good at flushing out poorly configured mail servers that cannot maintain state, queue email correctly, or retry delivery in a reasonably short space of time.[citation needed] Mail servers that are properly configured and fully conform to SMTP generally have no problems with greylisting techniques, and the delays are small enough not to be a problem.[citation needed]
On a technical level, some SMTP clients and SMTP servers acting as clients may interpret the temporary rejection as a permanent failure. Old clients conforming only to the obsolete specification (RFC 821) and ignoring its recommendations may give up on delivery after the first failed attempt -- RFC 821 states that clients "should" retry messages rather than using the word "must". RFC 2119 dictates that "should" means recommended and to ignore at your own risk, and it is a violation of the current SMTP standard for the client to fail to retry. The current SMTP specification (RFC 2821) clearly states that "the SMTP client retains responsibility for delivery of that message" (section 4.2.5) and "the SMTP client is encouraged to try again", and "mail that cannot be transmitted immediately MUST be queued and periodically retried by the sender." (section 4.5.4.1).
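The distinction a sending client has to make can be sketched as follows. The enum and function names are hypothetical; the only facts relied on are that SMTP replies in the 4xx range signal a transient failure and replies in the 5xx range a permanent one.

```python
from enum import Enum


class Outcome(Enum):
    DELIVERED = "delivered"
    RETRY_LATER = "keep in queue and retry later"
    BOUNCE = "give up and bounce"


def classify_reply(code: int) -> Outcome:
    """Map the final SMTP reply code of a delivery attempt to a client action."""
    if 200 <= code < 300:
        return Outcome.DELIVERED
    if 400 <= code < 500:
        # Transient failure, which is what a greylisting server sends.
        return Outcome.RETRY_LATER
    if 500 <= code < 600:
        return Outcome.BOUNCE
    raise ValueError(f"not a final SMTP reply code: {code}")


print(classify_reply(450).value)
```

A client that treats the 4xx case like the 5xx case is the kind of sender that loses mail when it meets a greylisting server.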
This problem can affect SMTP clients in unexpected ways. Most MTAs will queue and retry messages, but a small number do not.[1] A similar concern exists for applications which act as SMTP clients and fail to incorporate any form of queueing for deferred SMTP mail. This can be mitigated on the sending side by configuring the application to use a local SMTP server as an outbound queue, instead of attempting direct delivery. For the server operator who uses greylisting, clients which are known to fail on temporary errors can be supported by whitelisting or exception lists.
Some MTAs, upon encountering the temporary failure message from a greylisting server on the first attempt, will send a warning message back to the original sender of the message.[1] The warning message is not a bounce message, but it is often formatted similarly to and reads like one. This practice often causes the sender to believe that the message has not been delivered, when in fact the message will be delivered successfully at a later time.
When a message is greylisted, the time between the initial rejection and the re-transmission varies with the sending server. Some mail servers use a default of four hours, though most will retry sooner. Most open-source MTAs have retry rules set to attempt delivery after around fifteen minutes. Default schedules, in minutes after the first attempt (minutes:seconds for qmail), are: Sendmail 0, 15, ...; Exim 0, 15, ...; Postfix 0, 16.6, ...; qmail 0, 6:40, 26:40, ....
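As a worked example of how the sender's retry schedule and the receiver's greylisting period interact, the sketch below returns the time of the first attempt that would be accepted. The function name and the greylisting periods of 300 and 600 seconds are assumptions; the schedules contain only the values listed above, converted to seconds.

```python
from typing import Optional, Sequence


def first_accepted_attempt(attempt_times: Sequence[float],
                           greylist_delay: float) -> Optional[float]:
    """Return the time (seconds after the first attempt) of the first retry
    that arrives once the greylisting period has passed, or None if no
    listed retry does."""
    if not attempt_times:
        return None
    start = attempt_times[0]
    # The first attempt only registers the triplet, so start at the second.
    for t in attempt_times[1:]:
        if t - start >= greylist_delay:
            return t
    return None


# Only the schedule entries given in the text, in seconds.
QMAIL = [0, 400, 1600]      # 0, 6:40, 26:40
SENDMAIL = [0, 15 * 60]     # 0, 15 minutes

print(first_accepted_attempt(QMAIL, 300))     # second attempt, at 400 s
print(first_accepted_attempt(QMAIL, 600))     # third attempt, at 1600 s
print(first_accepted_attempt(SENDMAIL, 300))  # second attempt, at 900 s
```

The example shows that the delay a user sees is set by the sender's schedule, not by the greylisting period alone: raising the period from 300 to 600 seconds pushes a qmail delivery from under 7 minutes to almost 27.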
Greylisting delays much of the mail from non-whitelisted mail servers - not just spam - until typical patterns of communication are recorded by the greylisting system.
Also, legitimate mail might not get delivered if the retry does not come within the time window the greylisting software uses, or if the retry comes from a different IP address than the original attempt. When the source of an email is a server farm, or the email goes out through an anti-spam mail relay service, a server other than the original one is likely to make the next attempt. Since the IP addresses differ, the recipient's server fails to recognize that the two attempts are related and refuses the latest connection as well. If the number of servers is large enough, this can continue until the message ages out of the queue. Such server farming techniques can be construed as breaking the RFCs detailed above, since the original sending machine absolves itself of the responsibility for mail delivery by tossing the message back into the pool, which breaks the state of the mail delivery process. The problem can be partially bypassed by identifying and whitelisting such server farms in advance. However, on a distributed network the size of the Internet it is not possible to maintain a complete list of all such server farms.[2]
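The server-farm effect can be reproduced with a small simulation. Everything here is illustrative: the function name, the round-robin choice of sending host, the 900-second retry interval and the 300-second greylisting period are assumptions, and the sender and recipient are held constant so only the IP part of the triplet varies.

```python
import ipaddress
from itertools import cycle
from typing import Callable, Dict, Iterable, Optional


def attempts_until_accepted(source_ips: Iterable[str],
                            key: Callable[[str], str],
                            retry_interval: float = 900.0,
                            min_delay: float = 300.0,
                            max_attempts: int = 20) -> Optional[int]:
    """Count delivery attempts until the greylist accepts one, or return
    None if max_attempts is reached first."""
    first_seen: Dict[str, float] = {}
    now = 0.0
    for attempt, ip in enumerate(source_ips, start=1):
        if attempt > max_attempts:
            return None
        k = key(ip)
        if k in first_seen and now - first_seen[k] >= min_delay:
            return attempt
        first_seen.setdefault(k, now)   # defer and remember the first sighting
        now += retry_interval           # the sending pool retries later
    return None


def exact(ip: str) -> str:
    return ip


def slash24(ip: str) -> str:
    return str(ipaddress.ip_network(f"{ip}/24", strict=False))


pool = ["198.51.100.1", "198.51.100.2", "198.51.100.3", "198.51.100.4"]
print(attempts_until_accepted(cycle(pool), exact))    # 5
print(attempts_until_accepted(cycle(pool), slash24))  # 2
```

With exact IP matching, a four-host pool needs five attempts before any host repeats; matching on the /24 accepts the second attempt, but only because these hosts happen to share a network. A pool spread over many networks gains nothing from /24 matching, which is why whitelisting is suggested above.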
Greylisting can be a particular nuisance with websites that require users to create an account and confirm their email address before they can begin using the site. Even though the website may send the confirmation email immediately, greylisting delays that first message, and the wait is longer if the site's sending MTA is poorly configured. However, almost all stock-configured sendmail MTAs (sendmail being the most widely deployed MTA on the internet) will retry after a few minutes, leading to typical delays of under 10 minutes in most cases, though this still depends on the greylisting configuration. Greylisting is in many cases particularly effective at weeding out misconfigured MTAs and is gaining in popularity as an anti-spam tool. MTAs that do not correctly handle greylisting should therefore become fewer over time.
In order for greylisting to work for a particular domain, all backup mail servers (as specified by lower-priority MX records for the domain) must implement the greylisting policy as well.
References
1. Evan Harris (21 August 2003). The Next Step in the Spam Control War: Greylisting. PureMagic Software. Retrieved on 2008-01-09.
2. Shamrock Software GmbH (December 2007). Filtering Spam: Combined techniques give best results. Retrieved on 2008-01-09.