Optimal erasure codes with arbitrary parameters
From Wikipedia, the free encyclopedia
In computer science, an erasure code is used to encode data so that the original message can be decoded without receiving all the encoded data. The proportion of the encoded data that must be received before the message can be decoded is called the rate of the erasure code. An optimal erasure code is the smallest set of code words that can send a message of length n at rate r.
Contents |
[edit] Err-mail
Alice wants to send her telephone number (555629) to Bob using err-mail. Err-mail works just like e-mail, except
- About half of all the mail gets lost.[1]
- Messages longer than 5 characters are illegal.
- It is very expensive (similar to air-mail).
Instead of asking Bob to acknowledge the messages she sends, Alice devises the following scheme.
- She breaks her telephone number up into two parts a=555, b=629, and sends 2 messages – "A=555" and "B=629" – to Bob.
- She constructs a linear function, f(n) = a + (b − a)(n − 1), in this case f(n) = 555 + 74(n − 1).
- She computes the values f(3), f(4), and f(5), and then transmits three redundant messages: "C=703", "D=777" and "E=851".
Bob knows that the form of f(n) is f(n) = a + (b − a)(n − 1), where a and b are the two parts of the telephone number. Now suppose Bob receives "D=777" and "E=851".
Bob can reconstruct Alice's phone number by computing the values of a and b from the values (f(4) and f(5)) he has received. Bob can perform this procedure using any two err-mails, so the erasure code in this example has a rate of 40%.
Note that Alice cannot encode her telephone number in just one err-mail, because it contains six characters, and the maximum length of one err-mail message is five characters. If she sent her phone number in pieces, asking Bob to acknowledge receipt of each piece, at least four messages would have to be sent anyway (two from Alice, and two acknowledgments from Bob). So the erasure code in this example, which requires five messages, is quite economical.
This example is a little bit contrived. For truly generic erasure codes that work over any data set, we would need something other than the f(n) given. Nevertheless, it is a useful illustration of the general process.
[edit] Err-mail 2
When err-mail 2 arrives (the upgrade), the maximum message length is reduced to 4 characters. Alice will now start by sending "A=55", "B=56" and "C=29". She will then place markers called A through E on the floor in a configuration that is known to Bob. She will then fit a plane such that its height above A is 55, its height above B is 56 and its height above C is 29. The redundant messages are then the height of that plane above D and E.
If Bob receives three messages he can reconstruct the plane and extract Alice's phone number.
[edit] Real world implementation
This process is commonly implemented with code words constructed over a finite field using a Vandermonde matrix.
[edit] Notes
- ^ Some versions of this story refer to the err-mail daemon.