Hashcash
Hashcash is a proof-of-work system designed to limit email spam and denial-of-service attacks. Hashcash was proposed in March 1997 by Adam Back.[1]
How it works
Hashcash is a method of adding a textual stamp to the header of an email to prove the sender has expended a modest amount of CPU time calculating the stamp prior to sending the email. In other words, as the sender has taken a certain amount of time to generate the stamp and send the email, it is unlikely that they are a spammer. The receiver can, at negligible computational cost, verify that the stamp is valid. However, the only known way to find a header with the necessary properties is brute force, trying random values until the answer is found; though testing an individual string is easy, if satisfactory answers are rare enough it will require a substantial number of tries to find the answer.
The theory is that spammers, whose business model relies on their ability to send large numbers of emails with very little cost per message, cannot afford this investment into each individual piece of spam they send. Receivers can verify whether a sender made such an investment and use the results to help filter email.
Technical details
The header line looks something like this:[2]
X-Hashcash: 1:40:1303030600:adam@cypherspace.org::McMybZIhxKXu57jd:FOvXX
The header contains the recipient's email address, the date, and information proving that the required computation has been performed. Because the recipient's email address is included, a new header must be computed for each recipient. The date lets the recipient keep a record of recently received headers and confirm that each header is unique to its email.
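As a rough illustration, the example header above can be split into its colon-separated fields. This is only a sketch: the field names (version, bits, date, resource, ext, rand, counter) follow the commonly documented version 1 stamp layout and are not spelled out in the text above, and `Stamp` and `parse_stamp` are hypothetical names chosen for this example.

```python
from typing import NamedTuple


class Stamp(NamedTuple):
    # Field names are assumptions based on the version 1 stamp layout.
    version: int   # format version (1 in the example header)
    bits: int      # number of leading zero bits the stamp claims
    date: str      # date the stamp was generated, kept as a string
    resource: str  # the recipient's email address
    ext: str       # extension field, empty in the example header
    rand: str      # random characters chosen by the sender
    counter: str   # value the sender varies while searching


def parse_stamp(header_value: str) -> Stamp:
    """Split a stamp into its seven colon-separated fields."""
    parts = header_value.strip().split(":")
    if len(parts) != 7:
        raise ValueError(f"expected 7 fields, got {len(parts)}")
    version, bits, date, resource, ext, rand, counter = parts
    return Stamp(int(version), int(bits), date, resource, ext, rand, counter)


stamp = parse_stamp("1:40:1303030600:adam@cypherspace.org::McMybZIhxKXu57jd:FOvXX")
print(stamp.resource, stamp.date, stamp.bits)
```

The parser only splits the string; it does not check that the stamp's hash has the claimed number of zero bits.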
Sender's side
The sender prepares a header and adds an initial random number. It then computes the 160-bit SHA-1 hash of the header. If the first 20 bits of the hash are zeros, then this is an acceptable header. If not, the sender increments the random number and tries again. Since about 1 in 2^20 headers will have 20 zeros at the beginning of the hash, the sender will on average have to try 2^20 random numbers to find a valid header. Given reasonable estimates of the time needed to compute the hash, this would take about 1 second. At this time no more efficient method is known to find a valid header.
A normal user on a desktop PC would not be significantly impacted by the processing time required to generate the Hashcash string. However, spammers would suffer a significant impact due to the high number of spam messages required.
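The sender's search loop can be sketched roughly as follows. This is an illustrative simplification, not the reference implementation: `mint` and `leading_zero_bits` are hypothetical names, the random part and counter are written in hexadecimal for brevity, and the stamp layout is assumed to match the example header shown earlier.

```python
import hashlib
import secrets
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count how many bits at the start of the digest are zero."""
    as_int = int.from_bytes(digest, "big")
    return len(digest) * 8 - as_int.bit_length()


def mint(resource: str, date: str, bits: int = 20) -> str:
    """Try successive counter values until the SHA-1 hash of the
    stamp starts with at least `bits` zero bits."""
    rand = secrets.token_hex(8)  # initial random value (hex is an assumption)
    prefix = f"1:{bits}:{date}:{resource}::{rand}:"
    for counter in count():
        stamp = f"{prefix}{counter:x}"
        digest = hashlib.sha1(stamp.encode("utf-8")).digest()
        if leading_zero_bits(digest) >= bits:
            return stamp


# A small bit count keeps this demonstration fast; 20 bits needs
# about 2^20 attempts on average.
print(mint("adam@cypherspace.org", "060408", bits=12))
```

Because the random part differs on every run, the printed stamp differs too; only the number of leading zero bits in its hash is guaranteed.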
Recipient's side
Technically the system is implemented with the following steps:
- The recipient's computer calculates the 160-bit SHA-1 hash of the entire string (e.g., "1:20:060408:adam@cypherspace.org::1QTjaYd7niiQA/sc:ePa"). This takes about two microseconds on a 1 GHz machine, far less time than the time it takes for the rest of the e-mail to be received. If the first 20 bits are not all zero, the hash is invalid. (Later versions may require more bits to be zero as machine processing speeds increase.)
- The recipient's computer checks the date in the header (e.g., "060408", which represents the date 8 Apr 2006). If it is not within two days of the current date, it is invalid. (The two-day window compensates for clock skew and network routing time between different systems.)
- The recipient's computer checks if the e-mail address in the hash string matches any of the valid e-mail addresses registered by the recipient, or matches any of the mailing lists to which the recipient is subscribed. If a match is not found, the hash string is invalid.
- The recipient's computer inserts the hash string into a database. If the string is already in the database (indicating that an attempt is being made to re-use the hash string), it is invalid.
If the hash string passes all of these tests, it is considered a valid hash string. All of these tests take far less time and disk space than receiving the body content of the e-mail.
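The four checks above can be sketched as a single function. This is a minimal illustration under stated assumptions: `check_stamp` and its parameters are hypothetical, an in-memory set stands in for the database of spent stamps, and only the first six digits of the date field are read, as YYMMDD.

```python
import hashlib
from datetime import date, datetime, timedelta


def leading_zero_bits(digest: bytes) -> int:
    """Count how many bits at the start of the digest are zero."""
    as_int = int.from_bytes(digest, "big")
    return len(digest) * 8 - as_int.bit_length()


def check_stamp(stamp, my_addresses, seen, today,
                required_bits=20, window_days=2):
    """Return True only if the stamp passes all four checks."""
    fields = stamp.split(":")
    if len(fields) != 7:
        return False
    stamp_date, resource = fields[2], fields[3]

    # 1. The SHA-1 hash of the whole string must start with enough zero bits.
    digest = hashlib.sha1(stamp.encode("utf-8")).digest()
    if leading_zero_bits(digest) < required_bits:
        return False

    # 2. The date must fall within the allowed window around today.
    try:
        minted = datetime.strptime(stamp_date[:6], "%y%m%d").date()
    except ValueError:
        return False
    if abs(today - minted) > timedelta(days=window_days):
        return False

    # 3. The address must be one the recipient accepts mail for.
    if resource not in my_addresses:
        return False

    # 4. A stamp that has been seen before is rejected as re-used.
    if stamp in seen:
        return False
    seen.add(stamp)
    return True
```

A caller would keep `seen` in persistent storage and pass `date.today()` as `today`; the stamp is recorded as spent only after every other check has passed.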
Required effort
The time needed to compute such a partial hash collision is exponential in the number of zero bits required. One can therefore keep adding zero bits (each extra bit doubles the expected time needed to send) until it is too expensive for spammers to generate valid header lines. (Confirming that the header is valid always takes the same amount of time, no matter how many zero bits are required for a valid header.)
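The doubling can be made concrete with a little arithmetic. In the sketch below, the hash rate of one million SHA-1 evaluations per second is an assumption chosen so that 20 bits comes out near the one-second figure quoted earlier; real rates vary widely by machine. The function names are hypothetical.

```python
def expected_attempts(zero_bits: int) -> int:
    """Average number of hashes needed to find `zero_bits` leading zeros."""
    return 2 ** zero_bits


def expected_seconds(zero_bits: int, hashes_per_second: float) -> float:
    """Average minting time at a given (assumed) hash rate."""
    return expected_attempts(zero_bits) / hashes_per_second


ASSUMED_RATE = 1_000_000  # hashes per second; an illustrative assumption

for bits in (20, 21, 22, 26):
    attempts = expected_attempts(bits)
    seconds = expected_seconds(bits, ASSUMED_RATE)
    print(f"{bits} bits: {attempts} attempts, about {seconds:.2f} s")
```

Verification cost is absent from this calculation because the recipient always computes a single hash, whatever the bit count.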
Advantages and disadvantages
The Hashcash system has the advantage over micropayment proposals applying to legitimate email that no real money is involved. Neither the sender nor recipient need pay, thus the administrative issues involved with all micropayment systems are entirely avoided.
On the other hand, because Hashcash requires potentially significant computational resources to be expended on each e-mail sent, it is difficult to tune the average amount of time clients should spend computing a valid header. Setting it too high sacrifices accessibility from low-end embedded systems; setting it too low risks not challenging hostile hosts enough to filter spam effectively.
Hashcash is also fairly simple to implement in mail user agents and spam filters. No central server is needed. Hashcash can be incrementally deployed—the extra Hashcash header is ignored when it is received by mail clients that do not understand it.
One plausible estimate[3] concluded that one of two outcomes is unavoidable: either good e-mail will get stuck because the sender lacks processing power, or bad e-mail will still get through. The reasons are botnets and cluster farms, with which spammers can increase their processing power enormously, and centralized e-mail topologies such as mailing lists, in which a single server must send an enormous number of legitimate e-mails.
Most of these issues may be mitigated. For example, botnets may be shut down sooner because users notice the high CPU load and take countermeasures, and mailing list servers can be registered in whitelists on the subscribers' hosts and thus be exempted from hashcash challenges. Nevertheless, these issues remain serious obstacles to hashcash deployment.[citation needed]
Another projected problem is that computers continue to get faster in line with Moore's law, so the difficulty of the required calculations must be increased over time. However, developing countries can be expected to use older hardware, which means that they will find it increasingly difficult to participate in the email system. This also applies to lower-income individuals in developed countries who cannot afford the latest hardware.
Applications
Email clients
The Penny Post software project[4] on SourceForge implements Hashcash in the Mozilla Thunderbird email client.[5] The project is named for the historical availability of conventional mailing services that cost the sender just one penny; see Penny Post for information about such mailing services in history.
Spam filters
Hashcash has been recommended as a potential solution for false positives with automated spam filtering systems, as legitimate users will rarely be inconvenienced by the extra time it takes to mint a stamp.[6] SpamAssassin has checked for Hashcash stamps since version 2.70, assigning a negative score (i.e. less likely to be spam) for valid, unspent Hashcash stamps. In the 3.3.x series (the current version at time of writing), it gives a bonus for any stamp 20 bits or greater, capping at -5 points for a stamp 26 bits or greater; however, an already-spent stamp incurs a small penalty.[7]
Email Postmark
Microsoft also implemented and released an open specification for a format-incompatible version of Hashcash, which it calls an email postmark,[8] as part of its Coordinated Spam Reduction Initiative (CSRI).[9] The Microsoft email postmark variant of Hashcash is implemented in the Microsoft mail infrastructure components Exchange, Outlook and Hotmail. The format differences between Hashcash and Microsoft's email postmark are that the postmark hashes the body in addition to the recipient, uses a modified SHA-1 as the hash function, and uses multiple sub-puzzles to reduce proof-of-work variance.
Blogs
Like e-mail, blogs often fall victim to comment spam. Some blog owners have used hashcash scripts written in the JavaScript language to slow down comment spammers.[10] Some scripts (such as wp-hashcash) claim to implement hashcash but instead depend on JavaScript obfuscation to force the client to generate a matching key; while this does require some processing power, it does not use the hashcash algorithm or hashcash stamps.
Bitcoin
Bitcoin is a virtual currency that uses hashcash (with minor changes) for the generation and verification of currency.[11] Proof of work is used to protect the network and keep it decentralized; instead of a central authority, processing power serves as votes.
Intellectual Property
Hashcash is not patented, and the reference implementation[12] and most of the other implementations are open source. Hashcash is included or available for many Linux distributions. (To install it, Fedora/Red Hat users can type "sudo yum install hashcash"; Ubuntu/Debian users can type "sudo apt-get install hashcash".)
RSA has made IPR statements to the IETF about client-puzzles[13] in the context of an RFC[14] that described client-puzzles (not hashcash). The RFC included hashcash in the title and referenced hashcash, but the mechanism described in it is a known-solution interactive challenge, which is more akin to client-puzzles; hashcash is non-interactive and therefore does not have a known solution. In any case, RSA's IPR statement cannot apply to hashcash because hashcash predates[15] (Mar 1997) the client-puzzles publication[16] (Feb 1999) and the client-puzzles patent filing US7197639[17] (Feb 2000).
References
- Adam Back, "Hashcash - A Denial of Service Counter-Measure", technical report, August 2002 (PDF).
- Ben Laurie and Richard Clayton, "'Proof-of-Work' Proves Not to Work", WEIS 04. (PDF).
- Dwork, C. and Naor, M. (1992) "Pricing via Processing or Combating Junk Mail", Crypto '92, pp. 139–147. (PDF)
- [1] Announcement of Hashcash, from hashcash.org
- [2] link to hashcash.org
- [3] Hashcash proof-of-work paper
- [4] Penny Post software project on SourceForge
- [5] http://pennypost.sourceforge.net/PostageStamps
- [6] http://www.hashcash.org/faq/
- [7] http://spamassassin.apache.org/tests_3_3_x.html
- [8] http://download.microsoft.com/download/5/d/d/5dd33fdf-91f5-496d-9884-0a0b0ee698bb/%5BMS-OXPSVAL%5D.pdf
- [9] http://download.microsoft.com/download/7/6/b/76b1a9e6-e240-4678-bcc7-fa2d4c1142ea/csri.pdf
- [10] WP-Hashcash, a plugin for WordPress blog software that implements a Hashcash-like facility, written in JavaScript, by Elliott Back
- [11] Nakamoto, Satoshi (1 Nov 2008). "Bitcoin: A Peer-to-Peer Electronic Cash System". Retrieved 20 December 2012.
- [12] C reference implementation
- [13] RSA IETF client-puzzles
- [14] Draft Jennings SIP hashcash
- [15] hashcash announce
- [16] client puzzles
- [17] client-puzzle patent filing
External links
- http://hashcash.org — Hashcash homepage
- Beat spam using hashcash — David Mertz's article on hashcash, its applications and an implementation in Python.
- RSA IPR note to the IETF about hashcash (2004)