Talk:8B/10B encoding

From Wikipedia, the free encyclopedia

Should this be "8b/10b" instead of "8B/10B" since little b means "bit"?--SFoskett 14:47, Sep 2, 2004 (UTC)

That's a good question. The 10GE FC draft spec refers to it as 8B/10B, but the PCI Express spec refers to it as 8b/10b. However this article gets named, we need a redirect from the other, IMO. MatthewWilcox 14:54, 24 Nov 2004 (UTC)

I have also studied a 4b/5b encoding. Is it the same as this, only applied to the byte (8 bits)? I can't find it on the Line code page. Maybe it is also known by some other name.--Luc4 Fri Sep 23 12:14:02 2005 (CEST)

I'm not especially familiar with 4b/5b encoding, but from what I gather here it appears to be somewhat similar but simpler. Aluvus 17:27, 16 June 2006 (UTC)
4b/5b is listed on that page... just as 4b5b. And yes, 4b/5b is similar but simpler. Mrand 18:26, 16 June 2006 (UTC)

8b/10b coding table

For eventual cut & paste into main article. (Go ahead and delete this section from talk if/when that's done.)

The 8 input bits of the 8B/10B code are conventionally identified by upper-case letters A through H, with A the low-order bit and H the high-order bit. Thus, in standard binary notation, the byte is HGFEDCBA. The output is 10 bits, abcdei fghj, where a is transmitted first.
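The bit labelling can be illustrated with a short Python sketch. It assumes the D.x.y naming used in the tables on this page, where x is EDCBA read as a decimal number and y is HGF; the helper names `split_byte` and `symbol_name` are made up for this example.

```python
def split_byte(value: int) -> tuple[int, int]:
    """Split a byte HGFEDCBA into (x, y): x = EDCBA, y = HGF."""
    if not 0 <= value <= 0xFF:
        raise ValueError("expected a value in 0..255")
    x = value & 0x1F         # EDCBA, the five low-order bits
    y = (value >> 5) & 0x07  # HGF, the three high-order bits
    return x, y


def symbol_name(value: int) -> str:
    """Return the D.x.y name of a data byte (hypothetical helper)."""
    x, y = split_byte(value)
    return f"D.{x:02d}.{y}"


print(symbol_name(0x4A))  # 0x4A = 010 01010, i.e. HGF = 2, EDCBA = 10
```

The 5-bit part goes through the 5b/6b table and the 3-bit part through the 3b/4b table, in that order.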

Coding is done in alternating 5-bit and 3-bit sub-blocks. As coding proceeds, the encoder maintains a "running disparity", the difference between the number of 1 bits and 0 bits transmitted. At the end of each sub-block, this disparity is ±1, and can be represented by a single state bit. The input and the running disparity bit are used to select the coded bits as follows:

5b/6b code ("same" means the code does not depend on RD)
Input  EDCBA  RD = -1  RD = +1    Input  EDCBA  RD = -1  RD = +1
              abcdei   abcdei                   abcdei   abcdei
D.00   00000  100111   011000     D.16   10000  011011   100100
D.01   00001  011101   100010     D.17   10001  100011   same
D.02   00010  101101   010010     D.18   10010  010011   same
D.03   00011  110001   same       D.19   10011  110010   same
D.04   00100  110101   001010     D.20   10100  001011   same
D.05   00101  101001   same       D.21   10101  101010   same
D.06   00110  011001   same       D.22   10110  011010   same
D.07   00111  111000   000111     D.23   10111  111010   000101
D.08   01000  111001   000110     D.24   11000  110011   001100
D.09   01001  100101   same       D.25   11001  100110   same
D.10   01010  010101   same       D.26   11010  010110   same
D.11   01011  110100   same       D.27   11011  110110   001001
D.12   01100  001101   same       D.28   11100  001110   same
D.13   01101  101100   same       D.29   11101  101110   010001
D.14   01110  011100   same       D.30   11110  011110   100001
D.15   01111  010111   101000     D.31   11111  101011   010100
                                  K.28          001111   110000

3b/4b code ("same" means the code does not depend on RD)
Input   HGF  RD = -1  RD = +1
             fghj     fghj
D.x.0   000  1011     0100
D.x.1   001  1001     same
D.x.2   010  0101     same
D.x.3   011  1100     0011
D.x.4   100  1101     0010
D.x.5   101  1010     same
D.x.6   110  0110     same
D.x.P7  111  1110     0001
D.x.A7  111  0111     1000
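To show how the two tables and the running disparity fit together, here is a minimal Python sketch of a data-byte encoder. It is an illustration built from the tables on this page, not a reference implementation: the names `FIVE_SIX`, `THREE_FOUR`, `ALT7` and `encode_byte` are invented for the example, bits are kept as text strings for readability, and control (K) symbols are not handled.

```python
# Each entry is "code at RD=-1 / code at RD=+1", indexed by EDCBA (0..31).
FIVE_SIX = [tuple(p.split("/")) for p in """
100111/011000 011101/100010 101101/010010 110001/110001
110101/001010 101001/101001 011001/011001 111000/000111
111001/000110 100101/100101 010101/010101 110100/110100
001101/001101 101100/101100 011100/011100 010111/101000
011011/100100 100011/100011 010011/010011 110010/110010
001011/001011 101010/101010 011010/011010 111010/000101
110011/001100 100110/100110 010110/010110 110110/001001
001110/001110 101110/010001 011110/100001 101011/010100
""".split()]

# Indexed by HGF (0..7); entry 7 is the primary D.x.P7 code.
THREE_FOUR = [tuple(p.split("/")) for p in
              "1011/0100 1001/1001 0101/0101 1100/0011 "
              "1101/0010 1010/1010 0110/0110 1110/0001".split()]
ALT7 = ("0111", "1000")  # D.x.A7 at RD=-1 and RD=+1


def disparity(bits: str) -> int:
    """Number of 1 bits minus number of 0 bits."""
    return bits.count("1") - bits.count("0")


def encode_byte(value: int, rd: int) -> tuple[str, int]:
    """Encode one data byte; rd is -1 or +1. Returns (abcdeifghj, new rd)."""
    x, y = value & 0x1F, (value >> 5) & 0x07
    six = FIVE_SIX[x][0 if rd < 0 else 1]
    rd += disparity(six)              # RD after the 6-bit sub-block
    col = 0 if rd < 0 else 1
    four = THREE_FOUR[y][col]
    # Switch to the alternate D.x.7 code if e, i, f, g, h would all be equal.
    if y == 7 and six[4] == six[5] == four[0]:
        four = ALT7[col]
    return six + four, rd + disparity(four)


rd = -1
for byte in b"Hi":
    code, rd = encode_byte(byte, rd)
    print(f"{byte:#04x} -> {code[:6]} {code[6:]}  RD now {rd:+d}")
```

The string is written in transmission order, so the first character is bit a.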

Note that while the codes that depend on RD change it by ±2 in most cases, the encodings of D.07 and D.x.3 do not change it.
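This can be checked by counting bits. A small Python sketch, using four code pairs copied from the tables above (the helper `disparity` and the dictionary `EXAMPLES` are just names chosen for this example):

```python
def disparity(bits: str) -> int:
    """Number of 1 bits minus number of 0 bits in a code."""
    return bits.count("1") - bits.count("0")


# name: (code used at RD=-1, code used at RD=+1)
EXAMPLES = {
    "D.00": ("100111", "011000"),
    "D.07": ("111000", "000111"),
    "D.x.0": ("1011", "0100"),
    "D.x.3": ("1100", "0011"),
}

for name, (at_minus, at_plus) in EXAMPLES.items():
    after_minus = -1 + disparity(at_minus)  # RD after sending from RD=-1
    after_plus = +1 + disparity(at_plus)    # RD after sending from RD=+1
    print(f"{name}: RD -1 -> {after_minus:+d}, RD +1 -> {after_plus:+d}")
```

D.00 and D.x.0 flip the running disparity, while D.07 and D.x.3 leave it where it was even though they have two different encodings.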

There are two encodings for D.x.7, "primary" and "alternate". The alternate code is used after D.17, D.18 and D.20 when RD=-1, and after D.11, D.13 and D.14 when RD=+1. In those cases the 6-bit code ends with e = i equal to the first three bits of the primary code, so switching to the alternate ensures that bits eifgh are never all the same, and a run of five identical bits never appears at that position in normal output.
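The choice between the two D.x.7 codes can be worked out from the bit patterns themselves rather than from a list of code names. A minimal Python sketch, assuming `rd` is the running disparity after the 6-bit sub-block; `choose_dx7` is a hypothetical helper, not a name from any specification:

```python
PRIMARY_7 = {-1: "1110", +1: "0001"}    # D.x.P7
ALTERNATE_7 = {-1: "0111", +1: "1000"}  # D.x.A7


def choose_dx7(six: str, rd: int) -> str:
    """Pick the fghj code for D.x.7, given abcdei and the RD after it."""
    four = PRIMARY_7[rd]
    e, i = six[4], six[5]
    # The primary code has f = g = h, so e = i = f would make eifgh all equal.
    if e == i == four[0]:
        four = ALTERNATE_7[rd]
    return four


print(choose_dx7("100011", -1))  # D.17 at RD=-1: e = i = 1
print(choose_dx7("110100", +1))  # D.11 at RD=+1: e = i = 0
print(choose_dx7("110100", -1))  # D.11 at RD=-1: primary is fine
```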

Using an additional 6-bit output value, called K.28, and/or the alternate D.x.A7 output in contexts where it would not otherwise be required, an additional 12 "bytes" can be formed. Some of these contain a "comma sequence" abcdeifg = 00111110 or 11000001, which never appears anywhere else in the bit stream, and can be used to establish the byte boundaries in the data stream.

These symbols are used by higher-level protocols to indicate boundaries or gaps (idle time) in the data stream. The names are assigned because their encoding is a slight variant of the corresponding D.x.y codes.

Control symbols
Input     RD = -1       RD = +1
          abcdei fghj   abcdei fghj
K.28.0    001111 0100   110000 1011
K.28.1*   001111 1001   110000 0110
K.28.2    001111 0101   110000 1010
K.28.3    001111 0011   110000 1100
K.28.4    001111 0010   110000 1101
K.28.5*   001111 1010   110000 0101
K.28.6    001111 0110   110000 1001
K.28.7*   001111 1000   110000 0111
K.23.7    111010 1000   000101 0111
K.27.7    110110 1000   001001 0111
K.29.7    101110 1000   010001 0111
K.30.7    011110 1000   100001 0111

(* K.28.1, K.28.5, and K.28.7 are comma symbols, containing the comma sequence abcdeifg = 00111110 or 11000001. K.28.7 must not appear after another K.28.7, or the pair would form a second, false comma sequence that straddles the symbol boundary.) --The preceding unsigned comment was added by 192.35.100.1 (talk, contribs).
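The K.28.7 restriction can be seen by searching a bit stream for the comma sequence at every offset. A rough Python sketch, with bits held as a text string; `find_commas` is a hypothetical helper, and a real receiver would do this in a shift register rather than on strings:

```python
COMMAS = ("00111110", "11000001")  # abcdeifg patterns from the table above


def find_commas(bits: str) -> list[int]:
    """Return every bit offset at which a comma sequence starts."""
    return [i for i in range(len(bits) - 7) if bits[i:i + 8] in COMMAS]


K28_5 = "0011111010"  # K.28.5 sent at RD=-1, leaves RD=+1
D00_0 = "0110001011"  # D.00.0 sent at RD=+1
K28_7 = "0011111000"  # K.28.7 sent at RD=-1, leaves RD=-1

print(find_commas(K28_5 + D00_0))  # only the real comma at offset 0
print(find_commas(K28_7 + K28_7))  # offsets 0 and 10 are real, 5 is false
```

In the second stream the match at offset 5 starts in the middle of the first symbol, so a receiver aligning on it would pick the wrong byte boundary.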

Wow, you put a lot of work into that. However, I think it may be a little excessive for an encyclopedic article. It is a great ref tho. Maybe link to it from the main article and put it in wikisource or something like that? — RevRagnarok Talk Contrib 12:04, 17 August 2006 (UTC)
I think they are absolutely appropriate for an encyclopedia! Cburnett 03:29, 19 December 2006 (UTC)
At least one part of this more detailed explanation should be included in the article: the discussion of Running Disparity. This Acronym (RD) is given in the article, but it is never explained (note: RD is not yet defined in the above discussion either, though it is pretty obvious what it refers to). Also, I liked the following external links, should they be added? http://www.xilinx.com/ipcenter/catalog/logicore/docs/encode_8b10b.pdf, http://www.xilinx.com/ipcenter/catalog/logicore/docs/decode_8b10b.pdf DaraParsavand 18:39, 9 January 2007 (UTC)
I strongly agree about adding an explanation of Running Disparity (RD). Not only is the word used in the article but also the abbreviation in the table, and it is never explained. Actually I came here (to the discussion page) in the hope of finding just this. IMHO the article is already quite technical, and adding this information seems appropriate and gives a good indication of "how" 8B/10B "works". MJost 13:22, 10 January 2007 (UTC)