AOL search data scandal

From Wikipedia, the free encyclopedia

On August 4, 2006, AOL released, for research purposes, a compressed text file on one of its websites containing twenty million search keywords entered by over 650,000 users over a three-month period. AOL pulled the file from public access by August 7, but not before it had been mirrored and redistributed over peer-to-peer networks, including BitTorrent. News of the release spread through the blogosphere and popular tech sites such as Digg and Wired News.

While none of the records in the file are personally identifiable per se, some keywords contain personally identifiable information because users typed in their own names (ego-searching), addresses, or social security numbers, among other details. Each user is identified in the file by a unique sequential key, which makes it possible to compile that user's search history. [1] In a test of whether individuals could be identified this way, the New York Times located several people from the released, anonymized search records by cross-referencing them with phonebooks or other public records. [2] Consequently, the ethical implications of using this data for research are under debate. [3] [4]
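The role of the per-user key can be illustrated with a minimal sketch. The record layout, the key values, and the queries below are all hypothetical and invented for illustration; they are not taken from the released file, and the function name `histories_by_user` is an assumption. The sketch only shows that when every query carries the same pseudonymous key, grouping by that key is enough to rebuild one person's full search history.

```python
from collections import defaultdict

# Hypothetical records as (user_key, query) pairs. These rows are invented
# for illustration and do not come from the released file.
records = [
    (101, "landscapers in springfield"),
    (102, "cheap flights"),
    (101, "jane doe springfield"),
    (102, "weather tomorrow"),
    (101, "homes sold in oak street subdivision"),
]


def histories_by_user(rows):
    """Group queries under the pseudonymous key they share, in file order."""
    grouped = defaultdict(list)
    for user_key, query in rows:
        grouped[user_key].append(query)
    return dict(grouped)


histories = histories_by_user(records)

# Each key now maps to every query issued under it.
for user_key, queries in histories.items():
    print(user_key, queries)
```

In this invented example, the queries under key 101 combine a name, a town, and a neighborhood. That kind of combination is what could then be checked against a phonebook or other public records, even though the key itself reveals nothing.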

AOL acknowledged that the release was a mistake and removed the data, although the files can still be downloaded from mirror sites.[5] Several searchable databases of the data also exist on the internet.[6]

Although the searchers were identified only by a numeric ID, the New York Times discovered the identities of several searchers and, with her permission, identified searcher No. 4417749 as Thelma Arnold, a 62-year-old widow from Georgia. This privacy breach was widely reported and led to the resignation of AOL's CTO, Maureen Govern, on August 21, 2006. The media quoted an insider as saying that two employees had been fired: the researcher who released the data, and his immediate supervisor, who reported to Govern.[7] [8]

In September 2006, a class-action lawsuit was filed against AOL in the U.S. District Court for the Northern District of California. "The lawsuit accuses AOL of violating the Electronic Communications Privacy Act and of fraudulent and deceptive business practices, among other claims, and seeks at least $5,000 for every person whose search data was exposed." [9]

In January 2007, Business 2.0 Magazine on CNNMoney ranked the release of the search data #57 in a segment called "101 Dumbest Moments in Business."[10]

References

  1. ^ Michael Arrington. "AOL proudly releases massive amounts of user search data", TechCrunch, 2006-08-06. Retrieved on August 7, 2006.
  2. ^ "A Face Is Exposed for AOL Searcher No. 4417749", The New York Times, 2006-08-09. (Sign in or subscription required.)
  3. ^ Katie Hafner. "Tempting Data, Privacy Concerns; Researchers Yearn To Use AOL Logs, But They Hesitate", The New York Times, 2006-08-23. Retrieved on September 13, 2006.
  4. ^ Nate Anderson. "The ethics of using AOL search data", Ars Technica, 2006-08-23. Retrieved on September 13, 2006.
  5. ^ http://news.com.com/AOL+apologizes+for+release+of+user+search+data/2100-1030_3-6102793.html?tag=nefd.top
  6. ^ http://sergiorebelo.com/twodotfive/?page_id=25
  7. ^ Li, Kenneth. "AOL chief technology officer resigns: sources", Reuters, 2006-08-21.
  8. ^ http://www.iht.com/articles/2006/08/22/business/aol.php
  9. ^ http://news.com.com/2061-10803_3-6119218.html
  10. ^ http://money.cnn.com/magazines/business2/101dumbest/2007/full_list/index.html

External links