Usenet, a portmanteau of "user" and "network", is a world-wide distributed Internet discussion system. It evolved from the general purpose UUCP architecture of the same name.
It was conceived by Duke University graduate students Tom Truscott and Jim Ellis in 1979. Users read and post public messages (called articles or posts, and collectively termed news) to one or more categories, known as newsgroups. Usenet resembles bulletin board systems (BBS) in most respects and is the precursor to the web forums that are widely used today; it can be superficially regarded as a hybrid between email and web forums. As with web forums and BBSes, modern newsreader software displays discussions as threads, although posts are stored on the server sequentially.
One notable difference from a BBS or web forum is that there is no central server, nor central system owner. Usenet is distributed among a large, constantly changing conglomeration of servers which store and forward messages to one another. These servers are loosely connected in a variable mesh. Individual users usually read from and post messages to a local server operated by their ISP, university or employer. The servers then exchange the messages between one another, so that they are available to readers beyond the original server.
Usenet is one of the oldest computer network communications systems still in widespread use. It was established in 1980, following experiments from the previous year, over a decade before the World Wide Web was introduced and the general public got access to the Internet. It was originally conceived as a "poor man's ARPANET," employing UUCP to offer mail and file transfers, as well as announcements through the newly developed news software. This system, developed at University of North Carolina at Chapel Hill and Duke University, was called USENET to emphasize its creators' hope that the USENIX organization would take an active role in its operation (Daniel et al, 1980).
The articles that users post to Usenet are organized into topical categories called newsgroups, which are themselves logically organized into hierarchies of subjects. For instance, sci.math and sci.physics are within the sci hierarchy, for science. When a user subscribes to a newsgroup, the news client software keeps track of which articles that user has read.
In most newsgroups, the majority of the articles are responses to some other article. The set of articles which can be traced to one single non-reply article is called a thread. Most modern newsreaders display the articles arranged into threads and subthreads, making it easy to follow a single discussion in a high-volume newsgroup.
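The definition of a thread given above can be sketched in a few lines. This is only an illustration: the dictionary layout (message ID mapped to the ID of the article it replies to) and the sample IDs are assumptions, not how any particular newsreader stores its data.

```python
from collections import defaultdict

def group_into_threads(articles):
    """Map each root (non-reply) article ID to the IDs in its thread.

    `articles` maps a message ID to the ID of the article it replies to,
    or None for a non-reply article (hypothetical layout for illustration).
    """
    def root_of(msg_id):
        seen = set()
        # Follow parent links until a non-reply article (or an unknown parent) is reached.
        while articles.get(msg_id) is not None and msg_id not in seen:
            seen.add(msg_id)
            msg_id = articles[msg_id]
        return msg_id

    threads = defaultdict(list)
    for msg_id in articles:
        threads[root_of(msg_id)].append(msg_id)
    return dict(threads)

# Hypothetical message IDs: two threads, rooted at <a1> and <b1>.
articles = {
    "<a1>": None, "<a2>": "<a1>", "<a3>": "<a2>",
    "<b1>": None, "<b2>": "<b1>",
}
threads = group_into_threads(articles)
```

Here `threads` groups the three `<a…>` articles under `<a1>` and the two `<b…>` articles under `<b1>`.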
When a user posts an article, it is initially only available on that user's news server. Each news server, however, talks to one or more other servers (its "newsfeeds") and exchanges articles with them. In this fashion, the article is copied from server to server and (if all goes well) eventually reaches every server in the network. The later peer-to-peer networks operate on a similar principle; but for Usenet it is normally the sender, rather than the receiver, who initiates transfers. Some have noted that this seems an inefficient protocol in the era of abundant high-speed network access. Usenet was designed for a time when networks were much slower, and not always available. Many sites on the original Usenet network would connect only once or twice a day to batch-transfer messages in and out.
Usenet has significant cultural importance in the networked world, having given rise to, or popularized, many widely recognized concepts and terms such as "FAQ" and "spam." Internet culture was born on Usenet.
Today, almost all Usenet traffic is carried over the Internet. The current format and transmission of Usenet articles are very similar to those of Internet email messages. However, Usenet articles are posted for general consumption; any Usenet user has access to all newsgroups, unlike email, which requires a list of known recipients.
Today, Usenet has diminished in importance relative to Internet forums, blogs and mailing lists. It differs from them in that it requires no personal registration with the group concerned, information need not be stored on a remote server, archives are always available, and messages are read with a news client (included in many modern e-mail clients) rather than a mail or web client.
Many Internet service providers, and many other Internet sites, operate news servers for their users to access. ISPs that do not operate their own servers directly will often offer their users an account from another provider that specifically operates newsfeeds. Most commonly, these accounts are through UsenetServer.com, Supernews, Giganews and Usenet.com. Usually the ISP will get a kickback for referring the customer to the Usenet provider. In early news implementations, the server and newsreader were a single program suite, running on the same system. Today, one uses separate newsreader client software, a program that resembles an email client but accesses Usenet servers instead.
Not all ISPs run news servers. A news server is one of the most difficult Internet services to administer well because of the large amount of data involved, small customer base (compared to mainstream Internet services such as email and web access), and a disproportionately high volume of customer support incidents (frequently complaining of missing news articles that are not the ISP's fault). Some ISPs outsource news operation to specialist sites, which will usually appear to a user as though the ISP ran the server itself. Many sites carry a restricted newsfeed, with a limited number of newsgroups. Commonly omitted from such a newsfeed are foreign-language newsgroups and the alt.binaries hierarchy which largely carries software, music, videos and images, and accounts for over 99 percent of article data.
For those who have access to the Internet, but do not have access to a news server, Google Groups allows reading and posting of text newsgroups via the World Wide Web. Though this and other "news-to-Web gateways" are not always as easy to use as specialized newsreader software, especially when threads get long, they are often much easier to search. Users who lack access to an ISP news server can use Google Groups to access the alt.free.newsservers newsgroup, which has information about open news servers.
There are also Usenet providers that specialize in offering service to users whose ISPs do not carry news, or that carry a restricted feed.
See also news server operation for an overview of how news systems are implemented.
Newsreader clients are available for all major operating systems and come in all shapes and sizes. Mail clients or "communication suites" also now commonly have an integrated newsreader. Often, however, these integrated clients are of low quality, e.g. incorrectly implementing Usenet protocols, standards and conventions. Many of these integrated clients, for example the one in Microsoft's Outlook Express, are disliked by purists because of their misbehavior.[1]
Newsgroups are typically accessed with special client software that connects to a news server. With the rise of the World Wide Web, web front ends have become more common. They have made Usenet more accessible by lowering the technical barrier to entry: a single application (the web browser) suffices, and no Usenet server account is required. Google Groups[2] is one of the most popular web-based front ends, and browsers such as Firefox can access Google Groups directly via news: protocol links.[3] Numerous other websites now offer web-based gateways to Usenet groups, although some people have begun filtering out messages posted through some of the web interfaces.[4][5]
A minority of newsgroups are moderated. That means that messages submitted by readers are not distributed to Usenet, but instead are emailed to the moderators of the newsgroup, for approval. Moderated newsgroups have rules called charters. Moderators are persons whose job is to ensure that messages that the readers see in newsgroups conform to the charter of the newsgroup. Typically, moderators are appointed in the proposal for the newsgroup, and changes of moderators follow a succession plan.
The job of the moderator is to receive submitted articles, review them, and inject approved articles so that they can be properly propagated worldwide. Such articles must bear the Approved: header line.
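The moderation flow described above might be sketched as follows. This is a simplified illustration, not the logic of any real news server: the function names, return values and addresses are hypothetical, and Python's email parser is used only because article headers resemble mail headers.

```python
from email import message_from_string

def is_approved(raw_article):
    """Return True if the article carries a non-empty Approved: header."""
    article = message_from_string(raw_article)
    return bool((article.get("Approved") or "").strip())

def route_submission(raw_article, group_is_moderated):
    """Decide what happens to a submission (hypothetical return values)."""
    if not group_is_moderated or is_approved(raw_article):
        return "propagate"
    return "mail-to-moderator"

# Hypothetical sample articles.
UNAPPROVED = (
    "From: reader@example.invalid\n"
    "Newsgroups: example.moderated\n"
    "Subject: Question\n"
    "\n"
    "Body text.\n"
)
APPROVED = "Approved: moderator@example.invalid\n" + UNAPPROVED
```

Note that this sketch only checks for the presence of the header; it does not verify who added it.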
Unmoderated newsgroups form the majority of Usenet newsgroups, and messages submitted by readers for unmoderated newsgroups are immediately propagated for everyone to see.
Creation of moderated newsgroups often becomes a hot subject of controversy, raising issues regarding censorship and the desire of a subset of users to form an intentional community.
Usenet is a set of protocols for generating, storing and retrieving news "articles" (which resemble Internet mail messages) and for exchanging them among a readership which is potentially widely distributed. These protocols most commonly use a flooding algorithm which propagates copies throughout a network of participating servers. Whenever a message reaches a server, that server forwards the message to all its network neighbors that haven't yet seen the article. Only one copy of a message is stored per server, and each server makes it available on demand to the (typically local) readers able to access that server. Usenet was thus one of the first peer-to-peer applications, although in this case the "peers" are the servers that users access, rather than the users themselves. End users connect only to their service provider's server and do not interact with each other directly, which differs from the common meaning of a peer-to-peer network.
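The flooding behaviour can be illustrated with a small simulation. This is a sketch under simplifying assumptions (a static peer map, instantaneous links, hypothetical server names); real servers exchange articles over UUCP or NNTP and track which articles they have already seen.

```python
from collections import deque

def flood(peers, origin):
    """Return {server: hop count} for one article flooded from `origin`."""
    hops = {origin: 0}
    queue = deque([origin])
    while queue:
        server = queue.popleft()
        for neighbor in peers[server]:
            if neighbor not in hops:      # neighbor has not seen the article yet
                hops[neighbor] = hops[server] + 1
                queue.append(neighbor)
    return hops

# Hypothetical mesh of four servers.
peers = {
    "duke": ["unc"],
    "unc": ["duke", "berkeley", "toronto"],
    "berkeley": ["unc", "toronto"],
    "toronto": ["unc", "berkeley"],
}
reached = flood(peers, "duke")
```

Each server appears in `reached` exactly once, mirroring the rule that a server stores only one copy and never re-forwards an article to a neighbor that already has it.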
RFC 850 was the first formal specification of the messages exchanged by Usenet servers. It was superseded by RFC 1036.
In cases where unsuitable content has been posted, Usenet has support for automated removal of a posting from the whole network by creating a cancel message, although due to a lack of authentication and resultant abuse, this capability is frequently disabled. Copyright holders may still request the manual deletion of infringing material using the provisions of World Intellectual Property Organization treaty implementations, such as the U.S. Online Copyright Infringement Liability Limitation Act.
On the Internet, Usenet is typically served via NNTP on TCP port 119 for plain-text connections and on TCP port 563 for SSL-encrypted connections, which only a few sites offer.
The major set of worldwide newsgroups is contained within nine hierarchies, eight of which are operated under consensual guidelines that govern their administration and naming. The current "Big Eight" are comp.*, humanities.*, misc.*, news.*, rec.*, sci.*, soc.* and talk.*.
(Note: the asterisks are used as wildmat patterns.)
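To show how such patterns select newsgroups, the sketch below uses Python's `fnmatch` module as a stand-in. It is not a real wildmat implementation and covers only the simple `*` case; the pattern list assumes the Big Eight are the comp, humanities, misc, news, rec, sci, soc and talk hierarchies.

```python
from fnmatch import fnmatchcase

# Assumed Big Eight patterns; fnmatch only approximates wildmat matching.
BIG_EIGHT = ["comp.*", "humanities.*", "misc.*", "news.*",
             "rec.*", "sci.*", "soc.*", "talk.*"]

def in_big_eight(group):
    """Return True if the newsgroup name matches one of the Big Eight patterns."""
    return any(fnmatchcase(group, pattern) for pattern in BIG_EIGHT)

checks = {name: in_big_eight(name)
          for name in ["sci.math", "sci.physics", "alt.free.newsservers"]}
```

With these inputs, the two sci groups match and the alt group does not.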
See also the Great Renaming.
The alt.* hierarchy is not subject to the procedures controlling groups in the Big Eight, and it is as a result less organized. However, groups in the alt.* hierarchy tend to be more specialized or specific—for example, there might be a newsgroup under the Big Eight which contains discussions about children's books, but a group in the alt hierarchy may be dedicated to one specific author of children's books. Binaries are posted in alt.binaries.*, making it the largest of all the hierarchies.
Many other hierarchies of newsgroups are distributed alongside these. Regional and language-specific hierarchies such as japan.*, malta.* and ne.* serve specific regions such as Japan, Malta and New England. Companies such as Microsoft administer their own hierarchies to discuss their products and offer community technical support. Some users prefer to use the term "Usenet" to refer only to the Big Eight hierarchies; others include alt as well. The more general term "netnews" incorporates the entire medium, including private organizational news systems.
Usenet was originally created to distribute text content encoded in the 7-bit ASCII character set. With the help of programs that encode 8-bit values into ASCII, it became practical to distribute binary files as content. Binary posts, due to their size and often-dubious copyright status, were in time restricted to specific newsgroups, making it easier for administrators to allow or disallow the traffic.
The oldest widely used encoding method is uuencode, from the Unix uucp package. In the late 1980s Usenet articles were often limited to 60,000 characters, and larger hard limits exist today. Files are therefore commonly split into sections that require reassembly by the reader.
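A rough sketch of the split-and-reassemble idea follows. It uses the standard library's `binascii.b2a_uu` for the body lines and a greedy packer with the 60,000-character figure mentioned above; the `begin`/`end` framing lines of a real uuencoded file and any per-part headers are omitted, and the function names are illustrative.

```python
import binascii

def uuencode_lines(data):
    """Encode bytes as uuencode body lines (45 input bytes per line)."""
    return [binascii.b2a_uu(data[i:i + 45]).decode("ascii")
            for i in range(0, len(data), 45)]

def split_into_parts(lines, max_chars=60_000):
    """Greedily pack encoded lines into parts of at most max_chars characters."""
    parts, current, size = [], [], 0
    for line in lines:
        if current and size + len(line) > max_chars:
            parts.append("".join(current))
            current, size = [], 0
        current.append(line)
        size += len(line)
    if current:
        parts.append("".join(current))
    return parts

payload = bytes(range(256)) * 1000          # 256,000 bytes of sample data
parts = split_into_parts(uuencode_lines(payload))

# The reader's side: concatenate the parts in order and decode line by line.
reassembled = b"".join(binascii.a2b_uu(line)
                       for line in "".join(parts).splitlines())
```

Splitting on line boundaries keeps every part independently decodable, so the reader only needs to get the parts in the right order.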
With the header extensions and the Base64 and Quoted-Printable MIME encodings, there was a new generation of binary transport. In practice, MIME has seen increased adoption in text messages, but it is avoided for most binary attachments. Some operating systems with metadata attached to files use specialized encoding formats. For Mac OS, both Binhex and special MIME types are used.
Other lesser known encoding systems that may have been used at one time were BTOA, XX encoding, BOO, and USR encoding.
In an attempt to reduce file transfer times, an informal file encoding known as yEnc was introduced in 2001. It achieves about a 30% reduction in data transferred by assuming that most 8-bit characters can safely be transferred across the network without first encoding into the 7-bit ASCII space.
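The size difference can be checked with a minimal sketch of the core yEnc byte transformation: add 42 to each byte modulo 256, and escape the few results that cannot be sent raw. The header and trailer lines, line wrapping and checksum of real yEnc posts are left out, so the measured saving is only indicative, and Base64 is used as the 7-bit comparison.

```python
import base64
import os

ESCAPED = {0x00, 0x0A, 0x0D, 0x3D}   # NUL, LF, CR and '=' must be escaped

def yenc_encode(data):
    """Core yEnc transform only; no headers, line wrapping or CRC."""
    out = bytearray()
    for byte in data:
        value = (byte + 42) % 256
        if value in ESCAPED:
            out.append(0x3D)                 # escape marker '='
            value = (value + 64) % 256
        out.append(value)
    return bytes(out)

def yenc_decode(encoded):
    """Inverse of yenc_encode."""
    out = bytearray()
    stream = iter(encoded)
    for value in stream:
        if value == 0x3D:                    # next byte is an escaped value
            value = (next(stream) - 64) % 256
        out.append((value - 42) % 256)
    return bytes(out)

data = os.urandom(100_000)                   # random bytes as a stand-in for a binary file
yenc_size = len(yenc_encode(data))
base64_size = len(base64.encodebytes(data))  # includes a newline every 76 characters
saving = 1 - yenc_size / base64_size
```

On random input only about 4 in 256 bytes need escaping, so the yEnc output is barely larger than the input, while Base64 grows it by more than a third; `saving` comes out at roughly a quarter in this simplified comparison.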
The standard method of uploading binary content to Usenet is to first archive the files into RAR archives (for large files usually in 20 MB or 50 MB parts) and then create Parchive files. These parity files are used to recreate missing data, which is often necessary because not every part of a file reaches every server. All of these files are then encoded with yEnc and uploaded to the selected binary groups.
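The idea of rebuilding a missing part from parity data can be shown with a deliberately simplified sketch. Parchive itself uses Reed-Solomon coding, which can repair several missing blocks; the XOR parity below is a swapped-in simpler technique that repairs exactly one, and the block contents are made up.

```python
def xor_parity(blocks):
    """XOR equal-length blocks together into a single parity block."""
    parity = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            parity[i] ^= byte
    return bytes(parity)

def recover_missing(received, parity):
    """Rebuild the single missing block (marked None) from the rest plus parity."""
    present = [block for block in received if block is not None]
    return xor_parity(present + [parity])

# Three equal-length sample blocks standing in for archive parts.
blocks = [b"part-one", b"part-two", b"part-3!!"]
parity = xor_parity(blocks)

# Suppose the second part never reached the reader's server.
rebuilt = recover_missing([blocks[0], None, blocks[2]], parity)
```

Because XOR is its own inverse, combining the surviving blocks with the parity block yields the missing one.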
Each newsgroup is generally allocated a certain amount of storage space for post content. When this storage has been filled, each time a new post arrives, old posts are deleted to make room for the new content. If the network bandwidth available to a server is high but the storage allocation is small, it is possible for a huge flood of incoming content to overflow the allocation and push out everything that was in the group before it. If the flood is large enough, the beginning of the flood will begin to be deleted even before the last part of the flood has been posted.
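The expiry behaviour described above amounts to a first-in, first-out store with a byte budget. The class below is a toy model under that assumption; the names and sizes are hypothetical and real servers use more elaborate expiry policies.

```python
from collections import deque

class GroupSpool:
    """Fixed-size storage for one newsgroup; the oldest posts are expired first."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.posts = deque()              # (post_id, size) in arrival order
        self.used = 0

    def store(self, post_id, size):
        """Store a post and return the IDs of posts expired to make room."""
        self.posts.append((post_id, size))
        self.used += size
        expired = []
        while self.used > self.capacity and self.posts:
            old_id, old_size = self.posts.popleft()
            self.used -= old_size
            expired.append(old_id)
        return expired

spool = GroupSpool(capacity_bytes=100)
spool.store("discussion-1", 30)
spool.store("discussion-2", 30)

# A flood of four 40-byte posts into a 100-byte allocation.
flood_expired = [spool.store(f"flood-{n}", 40) for n in range(4)]
remaining = [post_id for post_id, _ in spool.posts]
```

In this run the flood first pushes out both earlier discussion posts and then starts expiring its own first posts before it has finished arriving.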
Binary newsgroups are only able to function reliably if there is sufficient storage allocated to a group to allow readers enough time to download all parts of a binary posting before it is flushed out of the group's storage allocation. This was at one time how posting of undesired content was countered; the newsgroup would be flooded with random garbage data posts, of sufficient quantity to push out all the content to be suppressed. Service providers have compensated for this by allocating enough storage to retain everything posted each day, including such spam floods, without deleting anything.
The average length of time that posts are able to stay in the group before being deleted is commonly called the retention time. Generally the larger Usenet servers have enough capacity to archive several weeks of binary content even when flooded with new data at the maximum daily speed available. A good binaries service provider must accommodate not only users of fast connections (3 Mbit/s) but also users of slow connections (256 kbit/s or less), who need several days or weeks to download content.
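The relationship between storage, feed size and download time is simple arithmetic, sketched below. The 100 TB spool and the 50 GB posting are hypothetical figures chosen for illustration, the 3.80 TB daily feed is the 2008 value from the traffic table in this article, and the link speeds are read as megabits per second.

```python
def retention_days(storage_tb, daily_feed_tb):
    """Days of content a spool can hold if the whole feed is kept."""
    return storage_tb / daily_feed_tb

def download_days(size_gb, link_mbit_per_s):
    """Days needed to download size_gb (decimal gigabytes) at a steady link rate."""
    bits = size_gb * 8e9
    seconds = bits / (link_mbit_per_s * 1e6)
    return seconds / 86_400

spool_tb = 100                                   # hypothetical storage allocation
retention = retention_days(spool_tb, 3.80)       # about 26 days
fast = download_days(50, 3)                      # about 1.5 days
slow = download_days(50, 0.256)                  # about 18 days
```

Under these assumptions the slow connection still finishes within the retention window, but a smaller spool or a larger feed would quickly change that.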
While binary newsgroups can be used to distribute completely legal user-created works, open-source software, and public domain material, some binary groups are used to illegally distribute vast quantities of commercial software, copyrighted media, and pornography, the last of which has its own legal implications in some countries.
For example, some binary groups such as alt.binaries.warez.* exist solely for the illegal distribution of commercial software.[6]
ISP-operated Usenet servers frequently block access to all alt.binaries.* groups, both to reduce their network traffic and to avoid the related legal issues. Commercial Usenet service providers claim to operate as a telecommunications service, and assert that they are not responsible for the user-posted binary content transferred via their equipment. In the United States, Usenet providers can qualify for protection under the DMCA Safe Harbor regulations, provided that they establish a mechanism to comply with and respond to takedown notices from copyright holders.[7]
Removal of copyrighted content from the entire Usenet network is a nearly impossible task, due to the rapid propagation between servers and the retention done by each server. Petitioning a Usenet provider for removal only removes the content from that one server's retention cache, not from any others. It is possible for a special post cancellation message to be distributed to remove it from all servers, but many providers ignore cancel messages by standard policy, because they can be easily falsified and submitted by anyone.[8][9] For a takedown petition to be most effective across the whole network, it would have to be issued to the origin server to which the content was posted, before the content has been propagated to other servers. Removal of the content at this early stage would prevent further propagation, but with modern high-speed links, content can be propagated as fast as it arrives, allowing no time for content review and takedown issuance by copyright holders.
Establishing the identity of the person posting illegal content is equally difficult due to the trust-based design of the network. Like SMTP email, servers generally assume the header and origin information in a post is true and accurate. However, as in SMTP email, Usenet post headers are easily falsified so as to obscure the true identity and location of the message source.[10] In this respect, Usenet is significantly different from modern P2P services; most P2P users distributing content are typically immediately identifiable to all other users by their network address, but the origin information for a Usenet posting can be completely obscured and unobtainable once it has propagated past the origin server.
Also unlike modern P2P services, the identity of the downloaders is hidden from view. On P2P services a downloader is identifiable to all others by their network address. On Usenet, the downloader connects directly to a server, and only the server knows the address of who is connecting to it. Usenet providers do keep usage logs, but this logging information is not casually available to outside parties like the RIAA.
Newsgroup experiments first occurred in 1979. Tom Truscott and Jim Ellis of Duke University came up with the idea as a replacement for a local announcement program, and established a link with nearby University of North Carolina using Bourne shell scripts written by Steve Bellovin. The public release of news was in the form of conventional compiled software, written by Steve Daniel and Truscott.
UUCP networks spread quickly due to the lower costs involved, and the ability to use existing leased lines, X.25 links or even ARPANET connections. By 1983 the number of UUCP hosts had grown to 550, nearly doubling to 940 in 1984.
As the mesh of UUCP hosts rapidly expanded, it became desirable to distinguish the Usenet subset from the overall network. A vote was taken at the 1982 USENIX conference to choose a new name. The name Usenet was retained, but it was established that it only applied to news.[11] The name UUCPNET became the common name for the overall network.
In addition to UUCP, early Usenet traffic was also exchanged with Fidonet and other dial-up BBS networks. Widespread use of Usenet by the BBS community was facilitated by the introduction of UUCP feeds made possible by MS-DOS implementations of UUCP such as UFGATE (UUCP to FidoNet Gateway), FSUUCP and UUPC. The Network News Transfer Protocol, or NNTP, was introduced in 1985 to distribute Usenet articles over TCP/IP as a more flexible alternative to informal Internet transfers of UUCP traffic. Since the Internet boom of the 1990s, almost all Usenet distribution is over NNTP.
Early versions of Usenet used Duke's A News software. At Berkeley an improved version called B News was produced by Matt Glickman and Mark Horton. With a message format that offered compatibility with Internet mail and improved performance, it became the dominant server software. C News, developed by Geoff Collyer and Henry Spencer at the University of Toronto, was comparable to B News in features but offered considerably faster processing. In the early 1990s, InterNetNews by Rich Salz was developed to take advantage of the continuous message flow made possible by NNTP versus the batched store-and-forward design of UUCP. Since that time INN development has continued, and other news server software has also been developed.
Usenet was the initial Internet community and the place for many of the most important public developments in the commercial Internet. It was the place where Tim Berners-Lee announced the launch of the World Wide Web,[12] where Linus Torvalds announced the Linux project,[13] and where Marc Andreessen announced the creation of the Mosaic browser and the introduction of the image tag,[14] which revolutionized the World Wide Web by turning it into a graphical medium.
Web-based archiving of Usenet posts began in 1995 at Deja News with a very large, searchable database. In 2001, this database was acquired by Google.
AOL announced that it would discontinue its integrated Usenet service in early 2005, citing the growing popularity of weblogs, chat forums and on-line conferencing. The AOL community had a tremendous role in popularizing Usenet some 11 years earlier, with all of its positive and negative aspects. This change marked the end of the legendary Eternal September. Others, however, feel that Google Groups, especially with its new user interface, has picked up the torch that AOL has dropped—and that the so-called Eternal September has yet to end.
Over time, the amount of Usenet traffic has steadily increased. It is important to note, however, that much of this traffic increase reflects not an increase in discrete users or newsgroup discussions, but instead the combination of massive automated spamming and an increase in the use of .binaries newsgroups in which large files (frequently pornography or pirated media) are often posted publicly. A small sampling of the change (measured in feed size per day) follows:
Daily Volume | Date | Source |
---|---|---|
4.5 GB | 1996-12 | Altopia.com |
9 GB | 1997-07 | Altopia.com |
12 GB | 1998-01 | Altopia.com |
26 GB | 1999-01 | Altopia.com |
82 GB | 2000-01 | Altopia.com |
181 GB | 2001-01 | Altopia.com |
257 GB | 2002-01 | Altopia.com |
492 GB | 2003-01 | Altopia.com |
969 GB | 2004-01 | Altopia.com |
1.30 TB | 2004-09-30 | Octanews.net |
1.27 TB | 2004-11-30 | Octanews.net |
1.38 TB | 2004-12-31 | Octanews.net |
1.34 TB | 2005-01-01 | Octanews.net |
1.30 TB | 2005-01-01 | Newsreader.com |
1.67 TB | 2005-01-31 | Octanews.net |
1.63 TB | 2005-02-01 | Newsreader.com |
1.81 TB | 2005-02-28 | Octanews.net |
1.87 TB | 2005-03-08 | Newsreader.com |
2.00 TB | 2005-03-11 | Various sources |
3.12 TB | 2007-04-21 | Usenetserver.com |
3.80 TB | 2008-04-16 | Newsdemon.com |
Many terms now in common use on the Internet—so-called "jargon"—originated or were popularized on Usenet. Likewise, many conflicts which later spread to the rest of the Internet, such as the ongoing difficulties over spamming, began on Usenet.
Google Groups hosts an archive of Usenet posts dating back to May 1981. The archive was originally started by the company DejaNews (later Deja), which was purchased by Google in February 2001. Already during the DejaNews era the archive had become a popular constant in Usenet culture, and remains so today.
The archiving of Usenet led to a fear of loss of privacy. An archive simplifies ways to profile people. This has partly been countered with the introduction of the X-No-Archive: Yes header, which is itself seen as controversial.
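An archive that chooses to honour this header might filter incoming articles along the following lines. This is a sketch only: the function name and sample articles are hypothetical, and honouring the header is a convention the archive opts into rather than something the network enforces.

```python
from email import message_from_string

def may_archive(raw_article):
    """Return False when the poster has set X-No-Archive: Yes."""
    headers = message_from_string(raw_article)
    return (headers.get("X-No-Archive") or "").strip().lower() != "yes"

# Hypothetical sample articles.
articles = [
    "From: a@example.invalid\nSubject: Keep me\n\nHello.\n",
    "From: b@example.invalid\nSubject: Skip me\nX-No-Archive: Yes\n\nHello.\n",
]
archivable = [article for article in articles if may_archive(article)]
```

Only the first sample article ends up in `archivable`.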
Google Groups also allows users to create groups that can only be accessed from Google's own interface, but which look like Usenet groups in search results.
There are no Usenet "administrators" per se; each server administrator is free to do whatever pleases him or her as long as the end users and peer servers tolerate and accept it. Nevertheless, there are a few famous administrators:
Public USENET servers are those NNTP hosts that deliberately accept incoming connections from any IP address, free of charge and without requiring any kind of authentication. These sites impose several access limits on their users in order to keep their spam ratio low, but they also strictly protect their clients' privacy.