Internet privacy

From Wikipedia, the free encyclopedia

Internet privacy is privacy in the use of the Internet: the ability to control what information one reveals about oneself over the Internet, and to control who can access that information. Many people use the term to mean universal Internet privacy, in which every user of the Internet possesses such privacy.

Internet privacy forms a subset of computer privacy. Experts in the field generally agree that Internet privacy does not really exist; privacy advocates believe that it should.

Scope of this article

This article discusses Internet privacy. Readers should understand the general topics of privacy and personally-identifiable information.

This article does not directly address the related topics of anonymity or pseudonymity, nor the separate topics of security or information security.

Levels of privacy

People with only a casual interest in Internet privacy need not achieve total anonymity. Regular Internet users with an eye to privacy may succeed in achieving a desirable level of privacy through careful disclosure of personal information and by avoiding spyware. The revelation of IP addresses, non-personally-identifiable profiling, and so on might become acceptable trade-offs for the convenience that such users would otherwise lose in using the workarounds needed to suppress such details rigorously. On the other hand, some people desire much stronger privacy. In that case, they may use Internet anonymity to ensure privacy — use of the Internet without giving any third parties the ability to link the Internet activities to personally-identifiable information of the Internet user.

Risks to Internet privacy

Those concerned about Internet privacy often cite a number of privacy risks — events that can compromise privacy — which one may encounter through Internet use. Unfortunately, given the complexity of Internet privacy, many people do not understand the issues. Therefore this section covers not only "real" privacy risks, but also risks perceived as overemphasized.

Cookies

See main article, HTTP cookie

Cookies have become perhaps the most widely recognized privacy risk and have received a great deal of attention. Although web developers most commonly use cookies for legitimate, desirable purposes, cases of abuse can and do occur.

An HTTP cookie is a piece of information stored on a user's computer to add statefulness to web browsing. Systems do not generally make the user explicitly aware that a cookie is being stored. (Some users object to this; strictly speaking, it is not an Internet privacy matter, although it does have implications for computer privacy, and specifically for computer forensics.)
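As a minimal sketch of how a cookie adds state, the following Python example uses the standard library's http.cookies module to build the Set-Cookie header a server might send, and then to parse the Cookie header a browser would return on a later request. The cookie name session_id and its value are hypothetical, chosen only for illustration.

```python
from http.cookies import SimpleCookie

# Server side: create a cookie to send with the response.
# The name "session_id" and the value "abc123" are hypothetical.
outgoing = SimpleCookie()
outgoing["session_id"] = "abc123"
outgoing["session_id"]["path"] = "/"
set_cookie_header = outgoing.output(header="Set-Cookie:")

# Browser side: on later requests to the same site, the browser
# sends the stored name/value pair back in a Cookie header.
cookie_header = "session_id=abc123"

# Server side again: parse the returned header to recognize the visitor.
incoming = SimpleCookie()
incoming.load(cookie_header)
returned_value = incoming["session_id"].value

print(set_cookie_header)
print(returned_value)
```

The server learns nothing new from the returned value itself; the state comes from being able to link the later request to the earlier one.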

The original developers of cookies intended that only the website that originally sent them would retrieve them, therefore giving back only data already possessed by the website. However, in actual practice programmers can circumvent this intended restriction. Possible consequences include:

the possible placing of a personally-identifiable tag in a browser to facilitate web profiling (see below), or,
possible use in some circumstances of cross-site scripting or of other techniques to steal information from a user's cookies.

Many users choose to disable cookies in their web browsers. This eliminates the potential privacy risks, but may severely limit or prevent the functionality of many websites. All significant web browsers have this disabling ability built in, with no external program required. As an alternative, users may frequently delete any stored cookies. Some browsers (for example, Mozilla Firefox) have an option to clear cookies automatically whenever the user closes the browser. A third option involves allowing cookies in general while preventing their abuse. A number of wrapper applications (for example, PrivacyView) redirect cookies and cache data to some other location. The Private Internet Browsing feature found in the CryptoStick Software Suite redirects all Internet Explorer information to a USB flash memory device. This prevents the storing of browsing information on the actual computer: the information goes off-system when the user removes the USB flash memory device from the computer.

Browsing profiles

The process of profiling (also known as "tracking") assembles and analyzes several events, each attributable to a single originating entity, in order to gain information (especially patterns of activity) relating to the originating entity. On the Internet, certain organizations employ profiling of people's web browsing, collecting the URLs of sites visited. The resulting profiles may or may not link with information that personally identifies the people who did the browsing.
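The assembly step can be illustrated with a short Python sketch. It assumes a hypothetical list of observed events, each already attributed to some tracking identifier, and simply groups the visited URLs by that identifier. The identifiers, URLs and the function name build_profiles are invented for illustration and do not describe any particular organization's method.

```python
from collections import defaultdict

# Hypothetical observed events: (tracking identifier, URL visited).
events = [
    ("id-1", "https://news.example/politics"),
    ("id-2", "https://shop.example/shoes"),
    ("id-1", "https://health.example/allergies"),
    ("id-1", "https://news.example/sports"),
]


def build_profiles(observed):
    """Group visited URLs by the identifier they are attributable to."""
    grouped = defaultdict(list)
    for identifier, url in observed:
        grouped[identifier].append(url)
    return dict(grouped)


profiles = build_profiles(events)
for identifier, urls in profiles.items():
    print(identifier, urls)
```

Note that the identifiers here are not personally identifying on their own; the profile becomes so only if it is later linked to other data.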

Some web-oriented marketing-research organizations may use this practice legitimately, for example to construct profiles of 'typical Internet users'. Such profiles, which describe average trends of large groups of Internet users rather than of actual individuals, can then prove useful for market analysis. Although the aggregate data does not constitute a privacy violation, some people believe that the initial profiling does.

Profiling becomes a more contentious privacy issue, on the other hand, when data-matching associates the profile of an individual with personally-identifiable information of the individual.

Governments and organizations may set up honeypot websites featuring controversial topics, with the purpose of attracting and tracking unwary people. This constitutes a potential danger for individuals.

IP addresses

See main article, IP address

Every device on the Internet (including each online computer) has an IP address, an identifying numeric code used to route data. The Internet Service Provider (ISP) through which the device connects may assign this address semi-permanently (for example, for the duration of the lifetime of an account) or temporarily (many dial-up connections, for example, get assigned new IP addresses each time they connect).

Every packet (piece of data) moving through the Internet gets tagged with the IP addresses of its source and of its intended destination. The proper working of the Internet depends on such routing information. Consequently, any direct connection between two devices on the Internet (such as when a personal computer reads a website) reveals both IP addresses to both parties.
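To make the tagging concrete, the following Python sketch packs a minimal 20-byte IPv4 header and then reads the source and destination addresses back out of it, much as any device handling the packet could. The addresses come from ranges reserved for documentation, the checksum is left at zero for brevity, and the function names are hypothetical.

```python
import ipaddress
import struct

# Layout of the fixed 20-byte IPv4 header, in network byte order.
IPV4_HEADER = "!BBHHHBBH4s4s"


def build_ipv4_header(src, dst):
    """Pack a minimal IPv4 header (checksum left as zero for brevity)."""
    version_ihl = (4 << 4) | 5  # version 4, header length of five 32-bit words
    return struct.pack(
        IPV4_HEADER,
        version_ihl, 0,   # version/header length, type of service
        20, 0, 0,         # total length, identification, flags/fragment offset
        64, 6, 0,         # time to live, protocol (6 = TCP), checksum
        ipaddress.IPv4Address(src).packed,
        ipaddress.IPv4Address(dst).packed,
    )


def read_addresses(header):
    """Return the (source, destination) addresses carried in the header."""
    fields = struct.unpack(IPV4_HEADER, header[:20])
    source = str(ipaddress.IPv4Address(fields[8]))
    destination = str(ipaddress.IPv4Address(fields[9]))
    return source, destination


header = build_ipv4_header("192.0.2.10", "198.51.100.7")
source, destination = read_addresses(header)
print(source, "->", destination)
```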

An IP address sometimes becomes a personally-identifiable datum, and is therefore potentially subject to privacy concerns. An IP address identifies its user's ISP, and often identifies its user's (or the ISP's) nation, region/province/state, and sometimes even city. The amount of information deducible from an IP address depends on the ISP's policies. See also: DNS, whois.

Any web site can track the movements of users through its pages by their IP addresses. This can serve for profiling within a single site.
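A rough sketch of such single-site tracking, assuming the site keeps an access log in the widely used Common Log Format: the Python code below extracts the client address and requested path from each line and lists the pages requested by each address in order. The log lines, the regular expression and the function name are illustrative assumptions, not a description of any specific server.

```python
import re
from collections import defaultdict

# Hypothetical access-log lines in Common Log Format.
log_lines = [
    '203.0.113.5 - - [10/Oct/2006:13:55:36 +0000] "GET /index.html HTTP/1.0" 200 2326',
    '198.51.100.9 - - [10/Oct/2006:13:55:40 +0000] "GET /about.html HTTP/1.0" 200 512',
    '203.0.113.5 - - [10/Oct/2006:13:56:02 +0000] "GET /products.html HTTP/1.0" 200 1045',
]

# Captures the client address and the requested path.
LINE_PATTERN = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "\S+ (\S+) [^"]*"')


def pages_by_address(lines):
    """Map each client IP address to the paths it requested, in log order."""
    paths = defaultdict(list)
    for line in lines:
        match = LINE_PATTERN.match(line)
        if match:
            address, path = match.groups()
            paths[address].append(path)
    return dict(paths)


visits = pages_by_address(log_lines)
for address, paths in visits.items():
    print(address, paths)
```

Because several people may share one address, and one person may use several, a grouping like this is only an approximation of an individual's movements.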

An IP address provides the minimum amount of information needed to attack a computer over the Internet.

People seeking Internet anonymity usually have an interest in hiding their IP address from third parties. One can only do this (without loss of Internet use) by connecting through one or more anonymous proxies: special Internet servers that connect to remote hosts (a web site, for example) on behalf of the user. The remote host communicates with the proxy, and receives the proxy's IP address rather than the real user's. The proxy, however, knows the IP address of the user, and sees all data passing between the user and the website; the anonymous proxy therefore has the opportunity to abuse the user's privacy, whether intentionally or accidentally. Onion routing, as used in Tor, offers one method of addressing this problem; I2P and Freenet take related approaches.
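The division of knowledge between proxy and remote host can be modelled with a small in-memory Python sketch. No real networking takes place: the classes RemoteHost and Proxy, their methods and the addresses are all hypothetical stand-ins meant only to show who sees which address.

```python
class RemoteHost:
    """Stands in for a web site; records the address each request came from."""

    def __init__(self):
        self.seen_addresses = []

    def handle(self, source_address, path):
        self.seen_addresses.append(source_address)
        return f"content of {path}"


class Proxy:
    """Forwards requests on behalf of users, substituting its own address."""

    def __init__(self, address, remote_host):
        self.address = address
        self.remote_host = remote_host
        self.observed = []  # everything the proxy itself is in a position to log

    def forward(self, user_address, path):
        self.observed.append((user_address, path))
        # The remote host is given the proxy's address, not the user's.
        return self.remote_host.handle(self.address, path)


site = RemoteHost()
proxy = Proxy("198.51.100.20", site)
response = proxy.forward("192.0.2.10", "/page")

print("site saw:", site.seen_addresses)
print("proxy saw:", proxy.observed)
```

The site's record contains only the proxy's address, while the proxy's own record contains both the user's address and the page requested, which is the opportunity for abuse described above.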

ISPs

Consumers obtain Internet access through an Internet Service Provider (ISP). All Internet data to and from the consumer must pass through the consumer's ISP. Given this, any ISP has the capability of observing anything and everything about the consumer's (unencrypted) Internet activities; however, ISPs presumably do not do this (or at least not fully) due to legal, ethical, business, and technical considerations.

ISPs do, however, collect at least some information about the consumers using their services. From a privacy standpoint, the ideal ISP would collect only as much information as it requires in order to provide Internet connectivity (IP address, billing information if applicable, etc.). A common belief exists that most ISPs collect additional information, such as aggregate browsing habits or even personally-identifiable URL histories.

What information an ISP collects, what it does with that information, and whether it informs its consumers, can pose significant privacy issues. Beyond usages of collected information typical of third parties, ISPs sometimes state that they will make their information available to government authorities upon request. Often, such a request need not involve a warrant.

An ISP cannot know the contents of properly-encrypted data passing between its consumers and the Internet. For encrypting web traffic, HTTPS has become the most popular and best-supported standard. Note, however, that even if users encrypt the data, the ISP still knows the IP addresses of the sender and of the recipient. (However, see the IP addresses section for workarounds.)
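The difference can be sketched in Python by splitting a URL into the parts an on-path observer such as an ISP could read. This is a deliberate simplification. It assumes that the host remains identifiable even under HTTPS (through the destination IP address, and commonly through the DNS lookup or the connection handshake), while the path and query string travel inside the encrypted channel. The function name visible_to_observer and the example URLs are hypothetical.

```python
from urllib.parse import urlsplit


def visible_to_observer(url):
    """Return the URL parts an on-path observer could read (simplified model)."""
    parts = urlsplit(url)
    # Assumption: the destination host is identifiable in both cases.
    visible = {"host": parts.hostname}
    if parts.scheme == "http":
        # Unencrypted: the full request, including path and query, is readable.
        visible["path"] = parts.path
        visible["query"] = parts.query
    return visible


print(visible_to_observer("http://example.org/search?q=allergy"))
print(visible_to_observer("https://example.org/search?q=allergy"))
```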

Concerns regarding Internet user privacy have become serious enough for a UN agency to report on the dangers of identity fraud.^[1]

Data logging

Many programs and operating systems are set up to log usage data. The logs may record the times when the computer is in use, or which web sites are visited. If a third party gains sufficient access to the computer, legitimately or not, these logs can be used to compromise the user's privacy. Users can reduce this risk by disabling logging or by clearing logs regularly.

Other potential Internet privacy risks

Spyware
Web bug (HTML-enabled email)
Social engineering
Phishing
Malicious proxy server (or other "anonymity" services)

Anonymous Internet usage

See main article, Internet anonymity

For anonymous browsing of websites, see anonymous proxy. For anonymous email, see anonymous remailer.

Noted cases

AOL search data

Main article: AOL search data

On August 4, 2006, AOL released three months of search history for 650,000 users to the public. Although the searchers were identified only by a numeric ID, the New York Times discovered the identities of several searchers and, with her permission, identified user number 4417749 as Thelma Arnold, a 62-year-old widow from the U.S. state of Georgia. This privacy breach was widely reported, and led to the resignation of AOL's CTO, Maureen Govern, on August 21. The responsible staff member was fired.

Jason Fortuny and Craigslist

In early September 2006, Jason Fortuny, a Seattle-area graphic designer and network administrator, posed as a woman and posted an ad to Craigslist Seattle seeking a casual sexual encounter with area men. Fortuny described the exercise as "The Craigslist Experiment", in a posting that detailed his goals.^[2] On September 4, he posted to the Internet all 178 of the responses, complete with photographs and personal contact details, encouraging others to further identify participants.^{[citation needed]} Although some online exposures of personal information have been seen as justified because they exposed malfeasance, many commentators on the Fortuny case saw no such justification here. "The men who replied to Fortuny's posting did not appear to be doing anything illegal, so the outing has no social value other than to prove that someone could ruin lives online", said law professor Jonathan Zittrain, while Wired writer Ryan Singel described Fortuny as "sociopathic".^[3] Fortuny himself was unapologetic, writing to Internet raconteur Tucker Max, "Let's milk this. All the way. [...] There must be a way to combine this. Into money. Money is important. Money is good."^[4]

Fortuny's prank was later duplicated by others, including Michael Crook, who also initiated email and online messaging sessions with his targets before publishing all the information he'd gathered.^[5]

Second Life database compromise

Also in September 2006, the online game Second Life experienced a security breach that resulted in the exposure of its customer database, including the unencrypted names and addresses and the encrypted passwords and billing information of 650,000 users. Linden Lab, the company behind the game, notified the users of the breach and required all users (or 'residents') of the game to reset their passwords before logging in.^[6]

Yahoo! and MSN search data

Data from major Internet companies, including Yahoo! and MSN (Microsoft), have already been subpoenaed by the United States^[7] and China.^{[citation needed]} AOL even published a portion of its own search data online,^[8] allowing reporters to track the online behaviour of private individuals.^[9]