Elias omega coding
Elias omega coding is a universal code encoding the positive integers developed by Peter Elias. Like Elias gamma coding and Elias delta coding, it works by prefixing the integer with a representation of its order of magnitude in a universal code. Unlike those other two codes, however, Elias omega recursively encodes that prefix; thus, they are sometimes known as recursive Elias codes.
Omega coding is used in applications where the largest encoded value is not known ahead of time, or to compress data in which small values are much more frequent than large values.
To code a number:
- Put a "0" at the end of the representation.
- If the number to be encoded is 1, stop without recording it; if not, add the binary representation of the number as a 'group' to the beginning of the representation.
- Repeat the previous step, with the number of digits just written, minus 1, as the new number to be encoded.
To decode an Elias omega-coded integer:
- Start with a variable N, set to a value of 1.
- Read the first 'group', which will either be a single "0", or a "1" followed by N more digits. If it is a "0", it means the value of the integer is 1; if it starts with a "1", then N becomes the value of the group interpreted as a binary number.
- Read each successive group; it will either be a single "0", or a "1" followed by N more digits. If it is a "0", it means the value of the integer is N; if it starts with a "1", then N becomes the value of the group interpreted as a binary number.
Examples
The first few codes are shown below. Included is the so-called implied distribution, describing the distribution of values for which this coding yields a minimum-size code; see Relationship of universal codes to practical compression for details.
Value | Code | Implied distribution |
---|---|---|
1 | 0 | 1/2 |
2 | 10 0 | 1/8 |
3 | 11 0 | 1/8 |
4 | 10 100 0 | 1/64 |
5 | 10 101 0 | 1/64 |
6 | 10 110 0 | 1/64 |
7 | 10 111 0 | 1/64 |
8 | 11 1000 0 | 1/128 |
9 | 11 1001 0 | 1/128 |
10 | 11 1010 0 | 1/128 |
11 | 11 1011 0 | 1/128 |
12 | 11 1100 0 | 1/128 |
13 | 11 1101 0 | 1/128 |
14 | 11 1110 0 | 1/128 |
15 | 11 1111 0 | 1/128 |
16 | 10 100 10000 0 | 1/2048 |
17 | 10 100 10001 0 | 1/2048 |
... | ||
100 | 10 110 1100100 0 | 1/8192 |
1000 | 11 1001 1111101000 0 | 1/131,072 |
1,000,000 | 10 100 10011 11110100001001000000 0 | 1/2,147,483,648 |
The encoding for 1 googol, 10100, is 11 1000 101001100 followed by the binary representation of 1 googol, which is 10010 01001001 10101101 00100101 10010100 11000011 01111100 11101011 00001011 00100111 10000100 11000100 11001110 00001011 11110011 10001010 11001110 01000000 10001110 00100001 00011010 01111100 10101010 10110010 01000011 00001000 10101000 00101110 10001111 00010000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 and a trailing 0, for a total of 349 bits.
The encoding for a googol to the hundredth power, 1010000, is 33243 bits long; under Elias delta coding, the same number is 33250 bits long. Interestingly, log2(1010000) is about 33219 bits, so in this instance, omega and delta coding are, respectively, only 0.07% and 0.09% short of optimal.
Example code
Encoding
void eliasOmegaEncode(char* source, char* dest) { IntReader intreader(source); BitWriter bitwriter(dest); while (intreader.hasLeft()) { int num = intreader.getInt(); BitStack bits; while (num > 1) { int len = 0; for (int temp = num; temp > 0; temp >>= 1) // calculate 1+floor(log2(num)) len++; for (int i = 0; i < len; i++) bits.pushBit((num >> i) & 1); num = len - 1; } while (bits.length() > 0) bitwriter.putBit(bits.popBit()); bitwriter.putBit(false); // write one zero } bitwriter.close(); intreader.close(); }
Decoding
void eliasOmegaDecode(char* source, char* dest) { BitReader bitreader(source); IntWriter intwriter(dest); while (bitreader.hasLeft()) { int num = 1; while (bitreader.inputBit()) // potentially dangerous with malformed files. { int len = num; num = 1; for (int i = 0; i < len; ++i) { num <<= 1; if (bitreader.inputBit()) num |= 1; } } intwriter.putInt(num); // write out the value } bitreader.close(); intwriter.close(); }
Generalizations
Elias omega coding does not code zero or negative integers. One way to code all non negative integers is to add 1 before coding and then subtract 1 after decoding. One way to code all integers is to set up a bijection, mapping integers all integers (0, 1, -1, 2, -2, 3, -3, ...) to strictly positive integers (1, 2, 3, 4, 5, 6, 7, ...) before coding.
External references
- Peter Elias, "Universal codeword sets and representations of the integers", IEEE Trans. Information Theory 21(2):194-203, Mar 1975.
- Khalid Sayood, Lossless Compression Handbook, Elsevier, 2003. Section 3.6.
- Implementation in Python
See also
|