Bit

Quantities of bits

SI prefixes                                    Binary prefixes
Name (symbol)     SI value   Binary usage     Name (symbol)      Value
kilobit (kbit)    10^3       2^10             kibibit (Kibit)    2^10
megabit (Mbit)    10^6       2^20             mebibit (Mibit)    2^20
gigabit (Gbit)    10^9       2^30             gibibit (Gibit)    2^30
terabit (Tbit)    10^12      2^40             tebibit (Tibit)    2^40
petabit (Pbit)    10^15      2^50             pebibit (Pibit)    2^50
exabit (Ebit)     10^18      2^60             exbibit (Eibit)    2^60
zettabit (Zbit)   10^21      2^70             zebibit (Zibit)    2^70
yottabit (Ybit)   10^24      2^80             yobibit (Yibit)    2^80

A bit is a binary digit, taking a value of either 0 or 1. For example, the number 10010111 is 8 bits long, which on most modern computers is the size of one byte. Binary digits are a basic unit of information storage and communication in digital computing and digital information theory. Information theory also often uses the natural digit, called a nit or nat. Quantum computing uses the qubit, a quantum unit of information that can exist in a superposition of the states 0 and 1.
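
As a concrete illustration, here is a minimal Python sketch that interprets the example above as a base-2 numeral:

    bits = "10010111"           # the 8-bit example from above

    value = int(bits, 2)        # interpret the string in base 2
    print(len(bits))            # 8   -> eight binary digits
    print(value)                # 151 -> its decimal value
    print(value.bit_length())   # 8   -> fits in one 8-bit byte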

The bit is also a unit of measurement, the information capacity of one binary digit. It has the symbol bit, or b (see discussion below). The unit is also known as the shannon, with symbol Sh.

Binary digit

Claude E. Shannon first used the word bit in his 1948 paper "A Mathematical Theory of Communication". He attributed its origin to John W. Tukey, who had written a Bell Labs memo on 9 January 1947 in which he contracted "binary digit" to simply "bit". Earlier still, Vannevar Bush had written in 1936 of "bits of information" that could be stored on the punched cards used in the mechanical computers of that time.[1]

A bit of storage is like a light switch: it can be either on (1) or off (0). A single bit is a one or a zero, a true or a false, a "flag" which is "on" or "off", or, in general, the quantity of information required to distinguish two mutually exclusive, equally probable states from each other. Gregory Bateson defined a bit as "a difference that makes a difference".[1]

Representation

Transmission

Bits can be implemented in many forms depending on context. For example, in the digital circuitry of most computing devices, as well as in flash memory, a bit is represented by one of two distinct levels of an electrical signal, held in a storage element such as a register in the control unit and sampled in step with the device's internal clock. For devices using positive logic, a logical 1 (true value) is represented by a higher, positive voltage of up to 5 volts, while a logical 0 (false value) is represented by a voltage near 0 volts.
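
The positive-logic convention can be modeled as a simple threshold rule. The sketch below is illustrative only: the 2.5 V midpoint threshold is an assumption chosen for the example, not a value from the text (real logic families define separate low and high thresholds with a forbidden band between them):

    def to_logic_level(voltage, threshold=2.5):
        # Positive logic: a high voltage reads as 1, a low voltage as 0.
        # The 2.5 V threshold is a hypothetical midpoint for a 5 V supply.
        return 1 if voltage >= threshold else 0

    print(to_logic_level(5.0))   # 1 -> logical true
    print(to_logic_level(0.0))   # 0 -> logical false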

Storage

On storage devices such as CD-ROMs (1.2 mm thick), a bit is recorded as a pit about 168 nm deep and 670 nm wide, of variable length (depending on the data), along a spiral track whose adjacent turns are spaced 1,600 nm apart. The total length of the track on a 650 MB disc may thus span several kilometres. The light of the reading laser is reflected back from the disc surface and picked up by a sensor. A transition between a pit and the surrounding land encodes a 1, while an unchanged level over a clock period encodes a 0. No more than ten consecutive zeros may occur, because a run of zeros produces no transitions, forcing the reader to count clock periods with a timer of limited accuracy. CD-Rs work on the same principle, except that they use a dye layer whose reflectivity is altered by the writing laser, instead of pits and lands.[2]
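
The transition rule can be sketched in code. The following toy decoder is a simplification, not the actual EFM channel code used on CDs (which adds framing and merging bits); it samples the reflectivity once per clock period, emitting a 1 on each level change and a 0 otherwise:

    def decode_transitions(levels):
        # A change of level between clock periods yields a 1;
        # an unchanged level yields a 0.
        return [1 if cur != prev else 0
                for prev, cur in zip(levels, levels[1:])]

    # 'p' = pit, 'l' = land, one sample per clock period.
    print(decode_transitions(list("pplllpp")))   # [0, 1, 0, 0, 1, 0]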

Bits can also be represented magnetically, such as in magnetic tapes and cassettes.

Unit

It is important to differentiate between the use of "bit" to refer to a discrete storage unit and its use to refer to a statistical unit of information. The bit, as a discrete storage unit, can by definition store only a 0 or a 1. A statistical bit is the amount of information that, on average, can be stored in a discrete bit; it is the amount of information carried by a choice between two equally likely outcomes. One bit corresponds to about 0.693 nats (ln 2) or 0.301 hartleys (log10 2).
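
Those conversion factors follow from the change-of-base rule for logarithms, as a quick check with Python's standard math module shows:

    import math

    # The information in one bit, expressed in other units:
    print(math.log(2))      # 0.6931... nats     (natural logarithm)
    print(math.log10(2))    # 0.3010... hartleys (base-10 logarithm)

    # Conversely, one nat is 1/ln(2) bits:
    print(1 / math.log(2))  # 1.4426... bits per nat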

Consider, for example, a computer file of one thousand 0s and 1s that can be losslessly compressed to a file of five hundred 0s and 1s (on average, over all files of that kind). Although the original file occupies 1,000 bits of storage, it contains at most 500 bits of information entropy, since information is not destroyed by lossless compression. A file can never contain more information-theoretic bits than it has storage bits. When the two ideas must be distinguished, the name bit is sometimes reserved for the storage unit and shannon for the statistical bit; most of the time, however, the meaning is clear from context.
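
The distinction can be made concrete by estimating entropy from bit frequencies. The sketch below is a deliberately crude estimator: it assumes each stored bit is an independent sample, which real files rarely satisfy, so it only yields an upper bound of the kind discussed above:

    import math

    def per_bit_entropy(bits):
        # Shannon entropy per stored bit, assuming independent bits:
        # H = -p*log2(p) - (1-p)*log2(1-p), where p = fraction of 1s.
        p = bits.count("1") / len(bits)
        if p in (0.0, 1.0):
            return 0.0               # a constant file carries no information
        return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

    print(per_bit_entropy("01" * 500))    # 1.0   (1,000 storage bits)
    print(per_bit_entropy("0001" * 250))  # 0.811 (skew makes it compressible)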

Abbreviation/symbol

No uniform agreement has yet been reached about the official unit symbols for bit and byte. One commonly quoted standard, the International Electrotechnical Commission's IEC 60027, specifies that "bit" should be the unit symbol for the unit bit (e.g. "kbit" for kilobit). In the same standard, the symbols "o" and "B" are specified for the byte.

The other commonly quoted relevant standard, IEEE 1541, specifies "b" as the unit symbol for bit and "B" as that for byte. This convention is also widely used in computing, but has so far not been considered internationally acceptable, for several reasons:

  • both these symbols are already used for other units: "b" for barn and "B" for bel;
  • "bit" is already short for "binary digit", so there is little reason to abbreviate it any further;
  • it is customary to start a unit symbol with an uppercase letter only if the unit was named after a person (see also Claude Émile Jean-Baptiste Litre);
  • instead of byte, the term octet (unit symbol: "o") is used in some fields and in some French-speaking countries, which adds to the difficulty of agreeing on an international symbol;
  • "b" is occasionally also used for byte, along with "bit" for bit.

The unit bel is rarely used by itself (only as decibel, "dB", which is unlikely to be confused with a decibyte), so the chances of conflict with "B" for byte are quite small, even though both units are very commonly used in the same fields (e.g., telecommunication).

More than one bit

A byte is a collection of bits, originally differing in size depending on the context but now almost always eight bits. Eight-bit bytes, also known as octets, can represent 256 values (2^8 values, 0–255). A four-bit quantity is known as a nibble and can represent 16 values (2^4 values, 0–15). A rarely used term, crumb, can refer to a two-bit quantity, which can represent 4 values (2^2 values, 0–3).
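
These groupings are easy to demonstrate with bitwise masking and shifting; a minimal sketch:

    byte = 0b10010111                  # one 8-bit byte (151 in decimal)

    high_nibble = (byte >> 4) & 0xF    # upper 4 bits  -> 0b1001 = 9
    low_nibble  = byte & 0xF           # lower 4 bits  -> 0b0111 = 7
    low_crumb   = byte & 0b11          # lowest 2 bits -> 0b11   = 3

    print(high_nibble, low_nibble, low_crumb)   # 9 7 3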

"Word" is a term for a slightly larger group of bits, but it has no standard size. It represents the size of one register in a Computer-CPU. In the IA-32 architecture more commonly known as x86-32, 16 bits are called a "word" (with 32 bits being a double word or dword), but other architectures have word sizes of 8, 32, 64, 80 or others.

Terms for large quantities of bits can be formed using the standard range of SI prefixes, e.g., kilobit (kbit), megabit (Mbit) and gigabit (Gbit). Note that much confusion exists regarding these units and their abbreviations (see above).
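
The gap between the SI and binary readings of these prefixes, tabulated at the top of the article, can be quantified directly:

    # A "megabit" read two ways:
    si_megabit     = 10**6   # SI reading:     1,000,000 bits
    binary_megabit = 2**20   # binary reading: 1,048,576 bits (a mebibit)

    # The discrepancy grows with each prefix: about 4.9% at mega,
    # about 7.4% at giga.
    print(binary_megabit / si_megabit)   # 1.048576
    print(2**30 / 10**9)                 # 1.073741824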

When a bit within a group of bits, such as a byte or word, is referred to, it is usually specified by a number from 0 (not 1) upward corresponding to its position within the byte or word. However, 0 can refer to either the most significant bit or the least significant bit depending on the context, so the convention in use must be known.
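
A short sketch of the two numbering conventions for an 8-bit byte:

    def bit_lsb0(byte, i):
        # Bit i, counting from the least significant bit (LSB-0).
        return (byte >> i) & 1

    def bit_msb0(byte, i, width=8):
        # Bit i, counting from the most significant bit (MSB-0).
        return (byte >> (width - 1 - i)) & 1

    b = 0b10010111
    print(bit_lsb0(b, 0))   # 1 -> the rightmost bit
    print(bit_msb0(b, 0))   # 1 -> the leftmost bit
    print(bit_lsb0(b, 3), bit_msb0(b, 3))   # 0 1 -> the conventions differ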

Certain bitwise computer processor instructions (such as bit set) operate at the level of manipulating bits rather than manipulating data interpreted as an aggregate of bits.
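
Such instructions correspond to elementary bitwise operations; a minimal sketch of the same manipulations (numbering from the least significant bit):

    def set_bit(x, i):
        return x | (1 << i)     # force bit i to 1

    def clear_bit(x, i):
        return x & ~(1 << i)    # force bit i to 0

    def toggle_bit(x, i):
        return x ^ (1 << i)     # flip bit i

    x = 0b1000
    print(bin(set_bit(x, 1)))     # 0b1010
    print(bin(clear_bit(x, 3)))   # 0b0
    print(bin(toggle_bit(x, 0)))  # 0b1001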

Telecommunications or computer network transfer rates are usually described in terms of bits per second (bit/s or bps), not to be confused with baud, the number of symbols per second (one symbol may carry more than one bit).
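
Since rates are quoted in bits per second while file sizes are usually given in bytes, converting between the two is a common small calculation. A sketch, assuming 8 bits per byte and no protocol overhead (both simplifications):

    def transfer_seconds(size_bytes, rate_bps):
        # Time to move size_bytes at rate_bps; assumes 8 bits per byte
        # and ignores protocol overhead.
        return size_bytes * 8 / rate_bps

    # A 650 MB disc image (SI: 650 * 10**6 bytes) over a 1 Mbit/s link:
    print(transfer_seconds(650e6, 1e6))   # 5200.0 seconds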

Notes

  1. George Dyson, Darwin Among the Machines: The Evolution of Global Intelligence, 1997. ISBN 0-201-40649-7.