Run length limited

From Wikipedia, the free encyclopedia

Run length limited or RLL coding is a technique that is used to store data on recordable media. Specifically, RLL prevents long stretches of repeated bits from causing the signal recorded on media to not change for an excessive period, by modulating the data. RLL reduces the timing uncertainty in decoding the stored data which would lead to the possible errant insertion or removal of bits when reading the data back. Run length limited codes are widely used in hard disk drives, digital optical discs such as CD, DVD and Blu-ray as the mechanism to ensure that the boundaries between bits can always be accurately found while efficiently using the media to reliably store the maximum amount of data in a given space. RLL coding is also used in Hi-MDs. While other forms of coding can also achieve either or both of these goals, few if any can achieve both as well, and RLL is simpler to implement than some, which led to its becoming the de facto industry standard for hard disks by the early 1990s.

Contents

[edit] Need for RLL coding

On a hard disk, information is represented by changes in the direction of the magnetic field on the disk. Since bits are generally conceived as zeros and ones, with zero being nothing and one being something, but tracks on a hard disk are always magnetized, with only the polarity of the field varying, some code must be used to translate between bits and magnetism. One of the simplest practical codes, Modified Non-Return-to-Zero-Inverted (NRZI), simply encodes a 1 as a magnetic polarity transition, also known as a "flux reversal", and a zero as no transition. With the disk spinning at a constant rate, each bit is given an equal time period, a "data window," for the magnetic signal that represents that bit, and the flux reversal, if any, occurs at the start of this window. (Note: older hard disks used the same data window over the whole disk, but modern disks are more complicated; for more on this, see zoned bit recording.) Say the data window is 1 ns (one nanosecond, or one billionth of a second.) Then for the data pattern 101, there will be a flux transition plus the remainder of 1 ns of unchanging magnetic field to represent the first "1", then a 1 ns period of unchanging field representing the "0", and then another flux reversal and the remainder of 1 ns representing the second one. In this example, there were two ones relatively close together, and no long periods without a flux reversal occurred, so it was easy to keep track of the data window and not drift out of sync. Now, consider encoding "000000000" (a string of nine zeros in a row.) This would involve simply keeping the magnetic polarity unchanged for 9 times the data window or 9 ns. However, when decoding the data to retrieve it, timing uncertainty limits the accuracy with which the data window can be measured out onto the data. Uncertainty is introduced by small changes in disk speed, as well as by possible drift in the frequency of the clock circuit providing the time reference for the electronics (that count out data window periods), including temperature related drift and manufacturing tolerance (variation) of the electronic components. This uncertainty may cause the decoder to detect that 8 or 10 zero bits were read from the media when it should detect nine. If more zero bits follow, the risk of this sort of error only gets higher. In arbitrary data, no number of zero bits in a row is impossible, so no matter how precise the timing of the system is made, it is always possible there will be a string of zeros so long that the system cannot guarantee a correct count when reading them back.

To prevent this problem, data is coded in such a way that these long stretches without a transition do not occur. That means the source data is transformed into a code that limits the maximum number of zeros between two successive ones. At the same time, a related issue can be solved: in NRZI, encoding "111111" would be done by repeatedly changing the magnetic phase rapidly. However, there is a minimum distance between transitions possible, so if the encoding guarantees a minimum number of zeros between consecutive 1s, then it is possible to write more bits per linear distance on a disk track. This is greater efficiency, and it means more data can be stored in the same space on disk, yielding physical smaller and lighter drives or more data storage capacity in the same physical drive.

[edit] History

The RLL recording technique, specifically RLL (2,7), was originally developed by IBM engineers, and was first used commercially in 1979 on the IBM 3370 DASD,[1][2][3] for use with the 4300 series mainframe. During the late 1980s, PC hard disks began using RLL. RLL codes have found almost universal application in optical disc recording practice since 1980. In consumer electronics, RLLs like the EFM code with (Eight-to-Fourteen:rate = 8/14, d=2, k=10) are employed in the Compact Disc (CD), and the EFMPlus code (rate = 8/16, d=2, k=10) used in the DVD. Parameters d, k are the minimum and maximum allowed run-lengths. For more coverage on the storage technologies, the references cited in this article are useful.

[edit] Technical overview

Generally run-length is the number of bits for which signal remains unchanged. A run-length of 3 for bit 1, represents a sequence of '111'. For instance, the runlengths in the string '0111100111000000' are of length 1, 4, 2, 3, and 6. Run length limited sequences are characterized by two parameters, (d) and (k), which stipulate the minimum and maximum runlength that can occur in the sequence. So RLL's are generally specified as (d,k) RLL. eg: (1,3) MFM RLL.

For disk drives the situation is little different. A disk drive stores information by changing the magnetic polarization of the disk surface. A change in polarization is considered to be a '1' bit, while if there is no change for a time it is considered to be a '0' bit. The actual signal on the disk might be '+−−−−++−−−++++++' but this is usually read as '0100010100100000'. So run length limited coding for a disk is really about the upper and lower limits on how many '0's there must be between consecutive '1's. A magnetic tape might be able to support 3,200 flux reversals per inch.

A Modified Frequency Modulation or (1,3) RLL encoding stores each data bit as two bits on tape, but since there is guaranteed to be one 0 (non flux reversal) bit between any 1 (flux reversal) bits then it is possible to store 6,400 encoded bits per inch on the tape, or 3,200 data bits per inch. A (1,7) RLL encoding can also store 6,400 encoded bits per inch on the tape, but since it only takes 3 encoded bits to store 2 data bits this is 4,267 data bits per inch. A (2,7) RLL encoding can takes 2 encoded bits to store each data bit, but since there is guaranteed to be two 0 bits between any 1 bits then it is possible to store 9,600 encoded bits per inch on the tape, or 4,800 data bits per inch.

The flux reversal densities on hard drives are significantly greater, but the same improvements in storage density are seen by using different encoding systems.

[edit] Coding

In the encoded format a "1" bit indicates a flux transition, while a "0" indicates that the magnetic field on the disk does not change for that time interval.

[edit] (1,7) RLL

(1,7) RLL maps 2 bits of data onto three bits on the disk, and the encoding is done in two or four bit groups. The encoding rules are: (x, y) becomes (NOT x, x AND y, NOT y), except (x, 0, 0, y) becomes (NOT x, x AND y, NOT y, 0, 0, 0) [4]

Data Encoded
00 00 101 000
00 01 100 000
10 00 001 000
10 01 010 000
00 101
01 100
10 001
11 010

Example:

Data:    0 0 1 0 1 1 0 1 0 0 0 1 1 0
Encoded: 101 001 010 100 101 100 001

[edit] (2,7) RLL

(2,7) RLL maps n bits of data onto 2n bits on the disk, but the encoding can be done in two, three or four bit groups.

Data (2,7) RLL Encoded
11 1000
10 0100
000 000100
010 100100
011 001000
0011 00001000
0010 00100100

Example:

Data:    1 1 0 1 1 0 0 1 1
Encoded: 100000100000001000

[edit] DC Free (1,7) RLL

There is also an alternate (1,7) RLL encoding that is sometimes used to avoid a DC bias (which helps when sending a signal over a long distance or with some types of recording media).

Data Encoded
00 x01
01 010
10 x00
11 00 010 001
11 01 x00 000
11 10 x00 001
11 11 010 000

Where "x" is the complement of the previous encoded bit (i.e. 1 if the previous bit was 0, and 0 if the previous bit was 1).

Example:

Data:    0 1 0 0 1 1 0 1 0 1
Encoded: 010 101 000 000 010

[edit] Other encodings

RLL is closely related to group code recording (GCR). Other schemes include FM and MFM (Modified Frequency Modulation).

[edit] See also

[edit] References

This article was originally based on material from the Free On-line Dictionary of Computing, which is licensed under the GFDL.

  1. ^ A Quarter Century of Disk File Innovation, IBM Journal of Research and Development.
  2. ^ P.A. Franaszek (1972), “Run-Length-Limited Variable Length Coding with Error Propagation Limitation”, U.S. Patent 3,689,899 .
  3. ^ Five decades of disk drive industry firsts, DISK/TREND Reports.
  4. ^ Mee, C. Denis; Eric D. Daniel (1996). Magnetic Storage Handbook, 2nd Edition. McGraw Hill. 

[edit] External links