8B/10B encoding

From Wikipedia, the free encyclopedia

In telecommunications, 8B/10B is a line code that maps 8-bit symbols to 10-bit symbols to achieve DC-balance (see DC coefficient) and bounded disparity, and yet provide enough state changes to allow reasonable clock recovery. This means that there are just as many "1"s as "0"s in a string of two symbols, and that there are not too many "1"s or "0"s in a row. This is an important attribute in a signal that needs to be sent at high rates because it helps reduce intersymbol interference.

Contents

[edit] History

The code was described in 1983 by Al Widmer and Peter Franaszek in the IBM Journal of Research and Development. IBM was issued a patent for the scheme the following year. IBM's patent notwithstanding; the method, implementation and goals are very similar to Group Code Recording (GCR) used on floppy disks in some computers during late 1970s/early 80s.

[edit] How it works

As the scheme name suggests, 8 bits of data are transmitted as a 10-bit entity called a symbol, or character. The low 5 bits of data are encoded into a 6-bit group and the top 3 bits are encoded into a 4-bit group. These code groups are concatenated together to form the 10-bit symbol that is transmitted on the wire. The data symbols are often referred to as Dxx.y where xx ranges from 0-31 and y from 0-7. Standards using the 8B/10B encoding also define special symbols (or control characters) that can be sent in place of a data symbol. They are often used to indicate end-of-frame, link idle, skip and similar link-level conditions. They are referred to as Kxx.y and have different encodings from any of the Dxx.y symbols. Because 8B/10B encoding uses 10-bit symbols to encode 8-bit words, each of the 256 possible 8-bit words can be encoded in two different ways, one the bit-wise inverse of the other. Using these alternative encodings, the scheme is able to effect long-term DC-balance in the serial data stream. This permits the data stream to be transmitted through a channel with a high-pass characteristic, for example Ethernet's transformer-coupled unshielded twisted pair.

The encoding is normally done entirely in hardware, based on lookup tables. Upper layers of the software stack should be "unaware" that this encoding is being used.

[edit] Technologies that use it

Now that the IBM patent has expired, the scheme has become even more popular and is the default DC-free line code for new standards.

Among the areas in which 8B/10B encoding finds application are

[edit] Heavy use in digital audio applications

The Digital Audio Tape and Digital Compact Cassette (DCC) use this modulation scheme.

[edit] Exceptions

The encoding scheme used in 10 Gigabit Ethernet's 10GBASE-R Physical Media Dependent (PMD) interfaces, 64B/66B, while similarly created with consideration of DC balance, maximum run length, transition density, electromagnetic emissions, and the like, is considerably different in design.

Note that 8B/10B is the encoding scheme, not a specific code. While many applications do use the same code, there exist some incompatible implementations; for example, Transition Minimized Differential Signaling, which also expands 8 bits to 10 bits, has some subtle differences.

[edit] Encoding tables

Note that in the following tables, "A" and "a" are the least significant bit.

[edit] 5B/6B

5B/6B code
input RD = -1 RD = +1 input RD = -1 RD = +1
EDCBA abcdei EDCBA abcdei
D.00 00000 100111 011000 D.16 10000 011011 100100
D.01 00001 011101 100010 D.17 10001 100011
D.02 00010 101101 010010 D.18 10010 010011
D.03 00011 110001 D.19 10011 110010
D.04 00100 110101 001010 D.20 10100 001011
D.05 00101 101001 D.21 10101 101010
D.06 00110 011001 D.22 10110 011010
D.07 00111 111000 000111 D.23 10111 111010 000101
D.08 01000 111001 000110 D.24 11000 110011 001100
D.09 01001 100101 D.25 11001 100110
D.10 01010 010101 D.26 11010 010110
D.11 01011 110100 D.27 11011 110110 001001
D.12 01100 001101 D.28 11100 001110
D.13 01101 101100 D.29 11101 101110 010001
D.14 01110 011100 D.30 11110 011110 100001
D.15 01111 010111 101000 D.31 11111 101011 010100
K.28 001111 110000

[edit] 3B/4B

3B/4B code
input RD = -1 RD = +1
HGF fghj
D.x.0 000 1011 0100
D.x.1 001 1001
D.x.2 010 0101
D.x.3 011 1100 0011
D.x.4 100 1101 0010
D.x.5 101 1010
D.x.6 110 0110
D.x.P7 111 1110 0001
D.x.A7 111 0111 1000

[edit] Control symbols

Control symbols
input RD = -1 RD = +1
abcdei fghj abcdei fghj
K.28.0 001111 0100 110000 1011
K.28.1 † 001111 1001 110000 0110
K.28.2  001111 0101 110000 1010
K.28.3  001111 0011 110000 1100
K.28.4  001111 0010 110000 1101
K.28.5 † 001111 1010 110000 0101
K.28.6  001111 0110 110000 1001
K.28.7 † 001111 1000 110000 0111
K.23.7  111010 1000 000101 0111
K.27.7  110110 1000 001001 0111
K.29.7  101110 1000 010001 0111
K.30.7  011110 1000 100001 0111

† — K28.1, K28.5, and K28.7 are comma symbols, containing the comma sequence abcdeifg = 00111110 or 11000001. K28.7 must not appear after another K.28.7, or it would form a second false comma sequence.

[edit] External links

In other languages