Byte

From Wikipedia, the free encyclopedia

Prefixes for bit and byte
Decimal (SI)
Value    Symbol  Prefix
1000^1   k       kilo-
1000^2   M       mega-
1000^3   G       giga-
1000^4   T       tera-
1000^5   P       peta-
1000^6   E       exa-
1000^7   Z       zetta-
1000^8   Y       yotta-

Binary
Value    IEC        JEDEC
1024^1   Ki kibi-   K kilo-
1024^2   Mi mebi-   M mega-
1024^3   Gi gibi-   G giga-
1024^4   Ti tebi-
1024^5   Pi pebi-
1024^6   Ei exbi-
1024^7   Zi zebi-
1024^8   Yi yobi-

In computer science a byte (pronounced "bite", IPA: /baɪt/) is a unit of measurement of information storage, most often consisting of eight bits. In many computer architectures it is a unit of memory addressing.

Originally, a byte was a small group of bits of a size convenient for data such as a single character from a Western character set. Its size was generally determined by the number of possible characters in the supported character set and was chosen to be a divisor of the computer's word size; historically, bytes have ranged from five to twelve bits. The popularity of IBM's System/360 architecture starting in the 1960s and the explosion of microcomputers based on 8-bit microprocessors in the 1980s has made eight bits by far the most common size for a byte. The term octet is widely used as a more precise synonym where ambiguity is undesirable (for example, in protocol definitions).

There has been considerable confusion about the meanings of SI prefixes used with the word "byte", such as kilo- (k or K) and mega- (M), as shown in the chart Quantities of bytes. Since computer memory comes in powers of 2 rather than 10, the industry used binary estimates of the SI-prefixed quantities. Because of the confusion, a contract specifying a quantity of bytes must define what the prefixes mean in terms of the contract (i.e., the alternative binary equivalents or the actual decimal values, or a binary estimate based on the actual values).

A byte is one of the basic integral data types in some programming languages, especially system programming languages.

To make the meaning of the table absolutely clear: a kibibyte is 1,024 bytes, and a mebibyte is 1,024 × 1,024 = 1,048,576 bytes. The figures in the binary column, 1,024 raised to the powers 1, 2, 3, 4 and so on, are counts of bytes.


[edit] Meanings

The word "byte" has two closely related meanings:

  1. A contiguous sequence of a fixed number of bits (binary digits). The use of a byte to mean 8 bits has become nearly ubiquitous.
  2. A contiguous sequence of bits within a binary computer that comprises the smallest addressable sub-field of the computer's natural word-size. That is, the smallest unit of binary data on which meaningful computation, or natural data boundaries, could be applied. For example, the CDC 6000 series scientific mainframes divided their 60-bit floating-point words into 10 six-bit bytes. These bytes conveniently held Hollerith data from punched cards, typically the upper-case alphabet and decimal digits. CDC also often referred to 12-bit quantities as bytes, each holding two 6-bit display code characters, due to the 12-bit I/O architecture of the machine. The PDP-10 used assembly instructions LDB and DPB to extract bytes — these operations survive today in Common Lisp. Bytes of six, seven, or nine bits were used on some computers, for example within the 36-bit word of the PDP-10. The UNIVAC 1100/2200 series computers (now Unisys) addressed in both 6-bit (Fieldata) and 9-bit (ASCII) modes within its 36-bit word.

[edit] History

The term byte was coined by Dr. Werner Buchholz in July 1956, during the early design phase for the IBM Stretch computer.[1][2][3] Originally it was defined in instructions by a 4-bit byte-size field, allowing from one to sixteen bits (the production design reduced this to a 3-bit byte-size field, allowing from one to eight bits to be represented by a byte); typical I/O equipment of the period used six-bit bytes. A fixed eight-bit byte size was later adopted and promulgated as a standard by the System/360. The term "byte" comes from "bite," as in the smallest amount of data a computer could "bite" at once. The spelling change not only reduced the chance of a "bite" being mistaken for a "bit," but also was consistent with the penchant of early computer scientists to make up words and change spellings. A byte was also often referred to as "an 8-bit byte", reinforcing the notion that it was a tuple of n bits, and that other sizes were possible.

The word "byte" also has two further technical meanings:

  3. A contiguous sequence of binary bits in a serial data stream, such as in modem or satellite communications, or from a disk-drive head, that is the smallest meaningful unit of data. These bytes might include start bits, stop bits, or parity bits, and thus could vary from 7 to 12 bits to contain a single 7-bit ASCII code.
  4. A datatype, or a synonym for a datatype, in certain programming languages. C and C++, for example, define a byte as an "addressable unit of data storage large enough to hold any member of the basic character set of the execution environment" (clause 3.6 of the C standard). Since the C char integral data type must contain at least 8 bits (clause 5.2.4.2.1), a byte in C is capable of holding at least 256 different values (whether char is signed or unsigned does not matter). Various implementations of C and C++ define a "byte" as 8, 9, 16, 32, or 36 bits[1][2]. The actual number of bits in a particular implementation is documented as the CHAR_BIT macro in the <limits.h> header. Java's primitive byte data type is always defined as consisting of 8 bits and being a signed data type, holding values from −128 to 127.

Early microprocessors, such as the Intel 8008 (the direct predecessor of the 8080, and then of the 8086), could perform a small number of operations on four bits, such as the DAA (decimal adjust) instruction and the "half carry" flag, which were used to implement decimal arithmetic routines. These four-bit quantities were called "nybbles", in homage to the then-common 8-bit "bytes".

[edit] Alternative words

Following "bit," "byte," and "nybble," there have been some analogical attempts to construct unambiguous terms for bit blocks of other sizes.[4] All of these are strictly jargon, and not very common.

  • 2 bits: crumb, quad, quarter, tayste, tydbit
  • 4 bits: nibble, nybble
  • 5 bits: nickel, nyckle
  • 10 bits: deckle
  • 16 bits: plate, playte, chomp, chawmp (on a 32-bit machine)
  • 18 bits: chomp, chawmp (on a 36-bit machine)
  • 32 bits: dinner, dynner, gawble (on a 32-bit machine)
  • 48 bits: gobble, gawble (under circumstances that remain obscure)

[edit] Abbreviation/Symbol

IEEE 1541 and the Metric-Interchange-Format specify "B" as the symbol for byte (e.g. MB means megabyte), while IEC 60027 appears to be silent on the subject. Furthermore, B also means bel (see decibel), another (logarithmic) unit used in the same field. The use of B for bel is consistent with the metric-system convention that capitalized symbols denote units named after a person (in this case Alexander Graham Bell); the use of a capital B for byte does not follow this convention. In practice there is little danger of confusing a byte with a bel, because the bel's sub-multiple, the decibel (dB), is almost always used instead, while use of the decibyte (dB) is extremely rare.

The unit symbol "kb", with a lowercase "b", is nonetheless a commonly used abbreviation for "kilobyte". Use of this abbreviation leads to confusion with the alternative use of "kb" to mean "kilobit". IEEE 1541 specifies "b" as the symbol for bit; however, IEC 60027 and the Metric-Interchange-Format specify "bit" (e.g. Mbit for megabit) as the symbol, achieving maximum disambiguation from byte.

French-speaking countries sometimes use an uppercase "o" for "octet". This is not consistent with SI because of the risk of confusion with the zero, and the convention that capitals are reserved for unit names derived from proper names, such as the ampere (whose symbol is A) and joule (symbol J), versus the second (symbol s) and metre (symbol m).

Lowercase "o" for "octet" is a commonly used symbol in several non-English-speaking countries, and is also used with metric prefixes (for example, "ko" and "Mo").

[edit] Names for different units

The prefixes used for byte measurements are usually the same as the SI prefixes used for other measurements, but have slightly different values. The former are based on powers of 1,024 (210), a convenient binary number, while the SI prefixes are based on powers of 1,000 (103), a convenient decimal number. The table below illustrates these differences. See binary prefix for further discussion.

Prefix  Name   SI meaning         Binary meaning   Size difference
k       kilo   10^3  = 1000^1     2^10 = 1024^1     2.40%
M       mega   10^6  = 1000^2     2^20 = 1024^2     4.86%
G       giga   10^9  = 1000^3     2^30 = 1024^3     7.37%
T       tera   10^12 = 1000^4     2^40 = 1024^4     9.95%
P       peta   10^15 = 1000^5     2^50 = 1024^5    12.59%
E       exa    10^18 = 1000^6     2^60 = 1024^6    15.29%
Z       zetta  10^21 = 1000^7     2^70 = 1024^7    18.67%

Sometimes "K" is used instead of "k". "K" is not an SI prefix and has no defined meaning in the SI.

In 1998, the IEC, then the IEEE, published a new standard describing binary prefixes:

Prefix  Name         Unit               Value        Equivalent
kibi    binary kilo  1 kibibyte (KiB)   2^10 bytes   1,024 B
mebi    binary mega  1 mebibyte (MiB)   2^20 bytes   1,024 KiB
gibi    binary giga  1 gibibyte (GiB)   2^30 bytes   1,024 MiB
tebi    binary tera  1 tebibyte (TiB)   2^40 bytes   1,024 GiB
pebi    binary peta  1 pebibyte (PiB)   2^50 bytes   1,024 TiB
exbi    binary exa   1 exbibyte (EiB)   2^60 bytes   1,024 PiB

Fractional information is usually measured in bits, nibbles, nats, or bans; the latter two are used especially in the context of information theory, and not usually in computing in general.

[edit] Notes

  1. ^ Origins of the Term "BYTE" Bob Bemer, accessed 2007-08-12
  2. ^ TIMELINE OF THE IBM STRETCH/HARVEST ERA (1956-1961) computerhistory.org, '1956 July ... Werner Buchholz ... Werner's term "Byte" first popularized'
  3. ^ byte catb.org, 'coined by Werner Buchholz in 1956'
  4. ^ nybble reference.com sourced from Jargon File 4.2.0, accessed 2007-08-12
