Linear code

From Wikipedia, the free encyclopedia

In mathematics and information theory, a linear code is an important type of block code used in error correction and detection schemes. Linear codes allow for more efficient encoding and decoding algorithms than other codes (cf. syndrome decoding).

1 Formal definition
2 Properties
3 Popular notation
4 Examples
5 Uses
6 Sphere packings and lattices
7 External links

[edit] Formal definition

A linear code of length n and rank k is a linear subspace C with dimension k of the vector space $\mathbb{F}_q^n$ where $\mathbb{F}_q$ is the finite field with q elements. Such a code with parameter q is called a q-ary code (e.g., when q = 5, the code is a 5-ary code). If q = 2 or q = 3, the code is described as a binary code, or a ternary code.

[edit] Properties

As a linear subspace of $\mathbb{F}_q^n$ , the entire code C (which may be very large) may be represented as the span of a minimal set of codewords (known as a basis in linear algebra). These basis codewords are often collated in the rows of a matrix known as a generating matrix for the code C.

The subspace definition also guarantees that the minimum Hamming distance d between any given codeword c₀ and the other codewords c ≠ c₀ is constant. Since the difference c − c₀ of two codewords in C is also a codeword (i.e., an element of the subspace C), and d(c, c₀) = d(c − c₀, 0), we see that

$\min_{c \in C,\ c \neq c_0}d(c,c_0)=\min_{c \in C, c \neq c_0}d(c-c_0, 0)=\min_{c \in C, c \neq 0}d(c, 0).$

[edit] Popular notation

Codes in general are often denoted by the letter C. A linear code of length n, of rank k (i.e., having k code words in its basis and k rows in its generating matrix), and of minimum Hamming weight d is referred to as an (n, k, d) code.

Remark. This (n, k, d) notation should not be confused with the [n, r, d] notation used to denote a non-linear code of length n, size r (i.e., having r code words), and minimum Hamming distance d.

[edit] Examples

Some examples of linear codes include:

Repetition codes
Parity codes
Cyclic codes
Hamming code
Golay code, both the binary and ternary versions
Reed-Solomon codes
BCH code
Reed-Muller codes

[edit] Uses

Binary linear codes (refer to formal definition above) are ubiquitous in electronic devices and digital storage media. For example the Reed-Solomon code is used to store digital data on a compact disc.

[edit] Sphere packings and lattices

Block codes are tied to the sphere packing problem which has received some attention over the years. In two dimensions, it is easy to visualize. Take a bunch of pennies flat on the table and push them together. The result is a hexagon pattern like a bee's nest. But block codes rely on more dimensions which cannot easily be visualized. The powerful Golay code used in deep space communications uses 24 dimensions. If used as a binary code (which it usually is,) the dimensions refer to the length of the codeword as defined above.

The theory of coding uses the N-dimensional sphere model. For example, how many pennies can be packed into a circle on a tabletop or in 3 dimensions, how many marbles can be packed into a globe. Other considerations enter the choice of a code. For example, hexagon packing into the constraint of a rectangular box will leave empty space at the corners. As the dimensions get larger, the percentage of empty space grows smaller. But at certain dimensions, the packing uses all the space and these codes are the so called perfect codes. There are very few of these codes.

Another item which is often overlooked is the number of neighbors a single codeword may have. Again, let's use pennies as an example. First we pack the pennies in a rectangular grid. Each penny will have 4 near neighbors (and 4 at the corners which are farther away). In a hexagon, each penny will have 6 near neighbors. When we increase the dimensions, the number of near neighbors increases very rapidly.

The result is the number of ways for noise to make the receiver choose a neighbor (hence an error) grows as well. This is a fundamental limitation of block codes, and indeed all codes. It may be harder to cause an error to a single neighbor, but the number of neighbors can be large enough so the total error probability actually suffers.