8-bit clean

From Wikipedia, the free encyclopedia

Eight-bit clean describes a computer system that correctly handles 8-bit character encodings, such as the ISO 8859 series and the UTF-8 encoding of Unicode. Until the early 1990s, many programs and data communication systems assumed that every character would be represented as a number between 0 and 127 (7 bits), leaving the top bit of each byte free for use as a parity or flag bit. Such 7-bit systems cannot handle larger character sets, which are commonplace in non-English-speaking countries whose languages need more characters than 7-bit ASCII provides.
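The effect of a system that is not 8-bit clean can be illustrated with a short Python sketch. The function names strip_high_bit and is_seven_bit are hypothetical and chosen only for this illustration; the code simulates a channel that clears the top bit of every byte, which is an assumption about how such a system might behave rather than a description of any particular one.

```python
def strip_high_bit(data: bytes) -> bytes:
    """Simulate a 7-bit channel by clearing the top bit of every byte."""
    return bytes(b & 0x7F for b in data)


def is_seven_bit(data: bytes) -> bool:
    """Return True if every byte lies in the range 0..127."""
    return all(b < 0x80 for b in data)


plain = "cafe".encode("utf-8")      # pure ASCII, every byte below 128
accented = "café".encode("utf-8")   # 'é' becomes the two bytes 0xC3 0xA9

# ASCII text passes through the simulated channel unchanged.
assert strip_high_bit(plain) == plain

# The UTF-8 bytes of 'é' lose their top bits and turn into other characters.
mangled = strip_high_bit(accented)
print(is_seven_bit(accented), mangled)
```

Text made up only of ASCII characters survives intact, while any byte above 127 is silently changed into a different character, which is the kind of corruption that 8-bit clean systems avoid.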

Binary files cannot be transmitted directly through 7-bit systems. To work around this, encodings have been devised that use only 7-bit ASCII characters. The most popular of these encodings are uuencode and MIME base64. Uuencoded data can be corrupted when it passes through EBCDIC-based systems, which cannot represent all of the characters that uuencode uses; base64 avoids this problem by using a smaller set of characters.
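As a minimal sketch of this workaround, the following Python code uses the standard library's base64 module to turn arbitrary bytes into 7-bit ASCII and back. The helper names to_seven_bit and from_seven_bit are hypothetical and used here only for illustration.

```python
import base64
import string

# The 64 characters of the standard base64 alphabet, plus '=' for padding.
BASE64_CHARS = set(
    (string.ascii_letters + string.digits + "+/=").encode("ascii")
)


def to_seven_bit(data: bytes) -> bytes:
    """Encode arbitrary bytes as base64, which uses only ASCII characters."""
    return base64.b64encode(data)


def from_seven_bit(encoded: bytes) -> bytes:
    """Recover the original bytes from their base64 form."""
    return base64.b64decode(encoded)


binary = bytes(range(256))          # every possible byte value
encoded = to_seven_bit(binary)

# Every encoded byte is below 128 and belongs to the base64 alphabet.
assert all(b < 0x80 and b in BASE64_CHARS for b in encoded)

# Decoding restores the data exactly.
assert from_seven_bit(encoded) == binary
```

The cost of the encoding is size: every 3 bytes of input become 4 characters of output.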

By the mid-1990s practically all computer and communication systems had been updated to be 8-bit clean. Some communications protocols, such as SMTP (Internet e-mail), still require 7-bit data in their basic form.
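The following sketch shows one way that non-ASCII text can be prepared for a 7-bit mail transport, using the MIMEText class from Python's standard email package. The sample text is made up for illustration, and the choice of base64 as the transfer encoding is the default behaviour of this library for UTF-8 text, not a requirement of SMTP itself.

```python
from email.mime.text import MIMEText

body = "café"  # contains a character outside 7-bit ASCII

# Build a text/plain message whose body is declared as UTF-8.
msg = MIMEText(body, "plain", "utf-8")

# The library records how the body was made safe for 7-bit transport.
print(msg["Content-Transfer-Encoding"])

# The serialized message (headers and body) contains only 7-bit bytes.
wire = msg.as_bytes()
assert all(b < 0x80 for b in wire)

# The receiving side can reverse the transfer encoding to get the
# original UTF-8 bytes back.
assert msg.get_payload(decode=True) == body.encode("utf-8")
```

Only the transfer representation is restricted to 7 bits; the decoded payload is the same 8-bit data that the sender started with.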

References

This article was originally based on material from the Free On-line Dictionary of Computing, which is licensed under the GFDL.
