8B/10B encoding
From Wikipedia, the free encyclopedia
In telecommunications, 8B/10B is a line code that maps 8-bit symbols to 10-bit symbols to achieve DC-balance (see DC coefficient) and bounded disparity, and yet provide enough state changes to allow reasonable clock recovery. This means that there are just as many "1"s as "0"s in a string of two symbols, and that there are not too many "1"s or "0"s in a row. This is an important attribute in a signal that needs to be sent at high rates because it helps reduce intersymbol interference.
Contents |
[edit] History
The code was described in 1983 by Al Widmer and Peter Franaszek in the IBM Journal of Research and Development. IBM was issued a patent for the scheme the following year. IBM's patent notwithstanding; the method, implementation and goals are very similar to Group Code Recording (GCR) used on floppy disks in some computers during late 1970s/early 80s.
[edit] How it works
As the scheme name suggests, 8 bits of data are transmitted as a 10-bit entity called a symbol, or character. The low 5 bits of data are encoded into a 6-bit group and the top 3 bits are encoded into a 4-bit group. These code groups are concatenated together to form the 10-bit symbol that is transmitted on the wire. The data symbols are often referred to as Dxx.y where xx ranges from 0-31 and y from 0-7. Standards using the 8B/10B encoding also define special symbols (or control characters) that can be sent in place of a data symbol. They are often used to indicate end-of-frame, link idle, skip and similar link-level conditions. They are referred to as Kxx.y and have different encodings from any of the Dxx.y symbols. Because 8B/10B encoding uses 10-bit symbols to encode 8-bit words, each of the 256 possible 8-bit words can be encoded in two different ways, one the bit-wise inverse of the other. Using these alternative encodings, the scheme is able to effect long-term DC-balance in the serial data stream. This permits the data stream to be transmitted through a channel with a high-pass characteristic, for example Ethernet's transformer-coupled unshielded twisted pair.
The encoding is normally done entirely in hardware, based on lookup tables. Upper layers of the software stack should be "unaware" that this encoding is being used.
[edit] Technologies that use it
Now that the IBM patent has expired, the scheme has become even more popular and is the default DC-free line code for new standards.
Among the areas in which 8B/10B encoding finds application are
- HyperTransport
- PCI Express
- IEEE 1394b
- Serial ATA
- SAS
- Fibre Channel
- SSA
- Gigabit Ethernet
- InfiniBand
- XAUI
- Serial RapidIO
- DVI (Transition Minimized Differential Signaling)
- DVB Asynchronous Serial Interface (ASI)
[edit] Heavy use in digital audio applications
The Digital Audio Tape and Digital Compact Cassette (DCC) use this modulation scheme.
- The related Eight-to-Fourteen Modulation is used in the Compact Disc standard.
[edit] Exceptions
The encoding scheme used in 10 Gigabit Ethernet's 10GBASE-R Physical Media Dependent (PMD) interfaces, 64B/66B, while similarly created with consideration of DC balance, maximum run length, transition density, electromagnetic emissions, and the like, is considerably different in design.
Note that 8B/10B is the encoding scheme, not a specific code. While many applications do use the same code, there exist some incompatible implementations; for example, Transition Minimized Differential Signaling, which also expands 8 bits to 10 bits, has some subtle differences.
[edit] Encoding tables
Note that in the following tables, "A" and "a" are the least significant bit.
[edit] 5B/6B
input | RD = -1 | RD = +1 | input | RD = -1 | RD = +1 | ||
---|---|---|---|---|---|---|---|
EDCBA | abcdei | EDCBA | abcdei | ||||
D.00 | 00000 | 100111 | 011000 | D.16 | 10000 | 011011 | 100100 |
D.01 | 00001 | 011101 | 100010 | D.17 | 10001 | 100011 | |
D.02 | 00010 | 101101 | 010010 | D.18 | 10010 | 010011 | |
D.03 | 00011 | 110001 | D.19 | 10011 | 110010 | ||
D.04 | 00100 | 110101 | 001010 | D.20 | 10100 | 001011 | |
D.05 | 00101 | 101001 | D.21 | 10101 | 101010 | ||
D.06 | 00110 | 011001 | D.22 | 10110 | 011010 | ||
D.07 | 00111 | 111000 | 000111 | D.23 | 10111 | 111010 | 000101 |
D.08 | 01000 | 111001 | 000110 | D.24 | 11000 | 110011 | 001100 |
D.09 | 01001 | 100101 | D.25 | 11001 | 100110 | ||
D.10 | 01010 | 010101 | D.26 | 11010 | 010110 | ||
D.11 | 01011 | 110100 | D.27 | 11011 | 110110 | 001001 | |
D.12 | 01100 | 001101 | D.28 | 11100 | 001110 | ||
D.13 | 01101 | 101100 | D.29 | 11101 | 101110 | 010001 | |
D.14 | 01110 | 011100 | D.30 | 11110 | 011110 | 100001 | |
D.15 | 01111 | 010111 | 101000 | D.31 | 11111 | 101011 | 010100 |
K.28 | 001111 | 110000 |
[edit] 3B/4B
input | RD = -1 | RD = +1 | |
---|---|---|---|
HGF | fghj | ||
D.x.0 | 000 | 1011 | 0100 |
D.x.1 | 001 | 1001 | |
D.x.2 | 010 | 0101 | |
D.x.3 | 011 | 1100 | 0011 |
D.x.4 | 100 | 1101 | 0010 |
D.x.5 | 101 | 1010 | |
D.x.6 | 110 | 0110 | |
D.x.P7 | 111 | 1110 | 0001 |
D.x.A7 | 111 | 0111 | 1000 |
[edit] Control symbols
input | RD = -1 | RD = +1 |
---|---|---|
abcdei fghj | abcdei fghj | |
K.28.0 | 001111 0100 | 110000 1011 |
K.28.1 † | 001111 1001 | 110000 0110 |
K.28.2 | 001111 0101 | 110000 1010 |
K.28.3 | 001111 0011 | 110000 1100 |
K.28.4 | 001111 0010 | 110000 1101 |
K.28.5 † | 001111 1010 | 110000 0101 |
K.28.6 | 001111 0110 | 110000 1001 |
K.28.7 † | 001111 1000 | 110000 0111 |
K.23.7 | 111010 1000 | 000101 0111 |
K.27.7 | 110110 1000 | 001001 0111 |
K.29.7 | 101110 1000 | 010001 0111 |
K.30.7 | 011110 1000 | 100001 0111 |
† — K28.1, K28.5, and K28.7 are comma symbols, containing the comma sequence abcdeifg = 00111110 or 11000001. K28.7 must not appear after another K.28.7, or it would form a second false comma sequence.