Cyclic redundancy check

A cyclic redundancy check (CRC), or polynomial code checksum, is a non-secure hash function designed to detect accidental changes to raw computer data. It is commonly used in digital networks and in storage devices such as hard disk drives. A CRC-enabled device calculates a short, fixed-length binary sequence, known as the CRC code or just CRC, for each block of data and sends or stores the two together. When a block is read or received, the device repeats the calculation. If the new CRC does not match the one calculated earlier, the block contains a data error, and the device may take corrective action such as rereading the block or requesting that it be sent again. Otherwise the data is assumed to be error-free (though, with some small probability, it may contain undetected errors; this is the fundamental nature of error-checking)[1].
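
The store-and-recheck cycle described above can be sketched in a few lines of Python. This is only an illustration: it uses the standard library's zlib.crc32, and the 4-byte big-endian trailer and the helper names attach_crc and check_crc are assumptions made for the example rather than part of any standard.

```python
import zlib

def attach_crc(block: bytes) -> bytes:
    """Return the block followed by its CRC-32 as a 4-byte big-endian trailer."""
    return block + zlib.crc32(block).to_bytes(4, "big")

def check_crc(frame: bytes) -> bool:
    """Recompute the CRC over the payload and compare it with the stored trailer."""
    block, stored = frame[:-4], int.from_bytes(frame[-4:], "big")
    return zlib.crc32(block) == stored

frame = attach_crc(b"hello, world")
corrupted = bytes([frame[0] ^ 0x01]) + frame[1:]  # flip one bit of the payload

print(check_crc(frame), check_crc(corrupted))  # True False
```

A real device would react to the failed check by rereading or requesting retransmission; the sketch only reports the mismatch.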

CRCs are so called because the check (data verification) code is a redundancy (it adds zero information to the message) and the algorithm is based on cyclic codes. The term CRC may refer to the check code or to the function that calculates it, which accepts data streams of any length as input but always outputs a fixed-length code. CRCs are popular because they are simple to implement in binary hardware, are easy to analyse mathematically, and are particularly good at detecting common errors caused by noise in transmission channels. The CRC was invented by W. Wesley Peterson in 1961; the 32-bit polynomial used in the CRC function of Ethernet and many other standards is the work of several researchers and was published in 1975.

Introduction

A CRC is an error-detecting code. Its computation resembles a polynomial long division operation in which the quotient is discarded and the remainder becomes the result, with the important distinction that the polynomial coefficients are calculated according to the carry-less arithmetic of a finite field. The length of the remainder is always less than the length of the divisor (called the generator polynomial), which therefore determines how long the result can be. The definition of a particular CRC specifies the divisor to be used, among other things.

Although CRCs can be constructed using any finite field, all commonly used CRCs employ the finite field GF(2). This is the field of two elements, usually called 0 and 1, comfortably matching computer architecture. The rest of this article will discuss only these binary CRCs, but the principles are more general.

An important reason for the popularity of CRCs for detecting the accidental alteration of data is their efficiency guarantee. Typically, an n-bit CRC, applied to a data block of arbitrary length, will detect any single error burst not longer than n bits (in other words, any single alteration that spans no more than n bits of the data), and will detect a fraction 1 − 2^{−n} of all longer error bursts. Errors in both data transmission channels and magnetic storage media tend to be distributed non-randomly (i.e. are "bursty"), making CRCs' properties more useful than those of alternative schemes such as multiple parity checks.

The simplest error-detection system, the parity bit, is in fact a trivial 1-bit CRC: it uses the generator polynomial x+1.
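
A quick way to see this equivalence is to divide a few bit patterns by x + 1 (binary 11) and compare the remainder with the parity of the pattern. The helper name gf2_remainder is an assumption for this sketch; it performs plain XOR long division on Python integers.

```python
def gf2_remainder(value: int, divisor: int) -> int:
    """Remainder of carry-less (XOR) long division of value by divisor."""
    while value.bit_length() >= divisor.bit_length():
        # Align the divisor under the leading 1 bit and XOR it in.
        value ^= divisor << (value.bit_length() - divisor.bit_length())
    return value

for word in (0b0, 0b1, 0b1011, 0b11010011101100):
    parity = bin(word).count("1") % 2
    print(format(word, "b"), gf2_remainder(word, 0b11), parity)
```

In each printed row the last two columns agree: the remainder modulo x + 1 is 1 exactly when the word contains an odd number of 1 bits.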

The CRC was invented by W. Wesley Peterson and described in a 1961 paper written with D. T. Brown[2].

CRCs and data integrity

CRCs are specifically designed to protect against common types of errors on communication channels, where they can provide quick and reasonable assurance of the integrity of messages delivered. However, they are not suitable for protecting against intentional alteration of data. Firstly, as there is no authentication, an attacker can edit a message and recalculate the CRC without the substitution being detected. This is even the case when the CRC is encrypted—this was one of the design flaws of the WEP protocol[3]. Secondly, the linear properties of CRC codes allow an attacker even to keep the CRC unchanged while modifying parts of the message[4][5]—this also makes calculating the CRC adjustment for small changes more efficient. Nonetheless, it is still often falsely assumed that when a message and its CRC are received from an open channel and the CRC matches the message's calculated CRC then the message cannot have been altered in transit[6].
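
The linearity mentioned above can be observed directly. The sketch below uses zlib.crc32 from the Python standard library; the two messages and the helper xor_bytes are made up for the illustration. Because of its initial value and final XOR, this CRC-32 is affine rather than strictly linear, so the relation involves the CRC of an all-zero block of the same length.

```python
import zlib

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

original = b"pay alice $0100"
forged = b"pay mallory $99"
assert len(original) == len(forged)

delta = xor_bytes(original, forged)  # the change an attacker wants to make
zeros = bytes(len(original))         # all-zero block of the same length

# The CRC of the modified message follows from the old CRC and the change
# alone, without recomputing anything over the original message.
predicted = zlib.crc32(original) ^ zlib.crc32(delta) ^ zlib.crc32(zeros)
print(predicted == zlib.crc32(forged))  # True
```

This is why XORing a chosen difference into a message, and the matching difference into its CRC, goes undetected even when both are protected only by a stream cipher.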

Cryptographic hash functions can provide stronger integrity guarantees in that they do not rely on specific error pattern assumptions. However, they are much slower than CRCs, and are therefore commonly used to protect off-line data, such as files on servers or databases.

Neither CRCs nor cryptographic hash functions by themselves protect against intentional modification of data. Any application that requires protection against such attacks must use cryptographic authentication mechanisms, such as message authentication codes.
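
As a minimal sketch of what such a mechanism looks like, the following uses HMAC-SHA-256 from the Python standard library. The key and messages are placeholders invented for the example; key distribution and storage are outside its scope.

```python
import hashlib
import hmac

key = b"shared secret key"            # placeholder key known only to sender and receiver
message = b"transfer 100 to alice"    # placeholder message

tag = hmac.new(key, message, hashlib.sha256).digest()

def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    expected = hmac.new(key, message, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)

print(verify(key, message, tag))                   # True
print(verify(key, b"transfer 900 to alice", tag))  # False
```

Unlike a CRC, the tag cannot be recomputed by someone who alters the message without knowing the key.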

Computation of CRC

To compute an n-bit binary CRC, line the bits representing the input in a row, and position the (n+1)-bit pattern representing the CRC's divisor (called a "polynomial") underneath the left-hand end of the row. Here is the first calculation for computing a 3-bit CRC:

11010011101100 <--- input
1011           <--- divisor (4 bits)
--------------
01100011101100 <--- result

If the input bit above the leftmost divisor bit is 0, do nothing and move the divisor to the right by one bit. If the input bit above the leftmost divisor bit is 1, the divisor is XORed into the input (in other words, the input bit above each 1-bit in the divisor is toggled). The divisor is then shifted one bit to the right, and the process is repeated until the divisor reaches the right-hand end of the input row. Here is the last calculation:

00000000001110 <--- result of previous step
          1011 <--- divisor
--------------
00000000000101 <--- remainder (3 bits)

Since the leftmost divisor bit zeroed every input bit it touched, when this process ends the only bits in the input row that can be nonzero are the n bits at the right-hand end of the row. These n bits are the remainder of the division step, and will also be the value of the CRC function (unless the chosen CRC specification calls for some postprocessing).
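
The procedure above can be written as a short Python sketch that works on strings of '0' and '1' so that it mirrors the rows shown. The function name crc_remainder is an assumption for the example. Many CRC definitions first append n zero bits to the input; the walkthrough above does not, so neither does this sketch.

```python
def crc_remainder(input_bits: str, divisor_bits: str) -> str:
    """Slide the divisor along the input, XORing where the leading bit is 1."""
    n = len(divisor_bits) - 1
    row = list(input_bits)
    # The divisor starts under the left-hand end and stops at the right-hand end.
    for pos in range(len(row) - n):
        if row[pos] == "1":
            for k, d in enumerate(divisor_bits):
                if d == "1":
                    # Toggle the input bit above each 1 bit of the divisor.
                    row[pos + k] = "0" if row[pos + k] == "1" else "1"
    return "".join(row[-n:])

print(crc_remainder("11010011101100", "1011"))  # 101
```

The printed value matches the 3-bit remainder in the last calculation shown above.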

Mathematics of CRC

Mathematical analysis of this division-like process reveals how to pick a divisor that guarantees good error-detection properties. In this analysis, the digits of the bit strings are thought of as the coefficients of a polynomial in some variable x, with coefficients that are elements of the finite field GF(2) instead of more familiar numbers. The set of such binary polynomials forms a ring. A ring is, loosely speaking, a set of elements somewhat like numbers that can be operated on by an operation that somewhat resembles addition and another operation that somewhat resembles multiplication, these operations possessing many of the familiar arithmetic properties of commutativity, associativity, and distributivity. Ring theory is part of abstract algebra.
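
To make the ring operations concrete, the sketch below represents a binary polynomial as a Python integer whose bit k is the coefficient of x^k. Addition is then XOR, and multiplication and division are their carry-less counterparts. The function names are assumptions made for this illustration.

```python
def gf2_add(a: int, b: int) -> int:
    """Add two polynomials over GF(2): coefficients add modulo 2, i.e. XOR."""
    return a ^ b

def gf2_mul(a: int, b: int) -> int:
    """Carry-less multiplication of two polynomials over GF(2)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result

def gf2_divmod(a: int, b: int) -> tuple[int, int]:
    """Polynomial long division over GF(2); returns (quotient, remainder)."""
    if b == 0:
        raise ZeroDivisionError("division by the zero polynomial")
    quotient = 0
    while a.bit_length() >= b.bit_length():
        shift = a.bit_length() - b.bit_length()
        quotient ^= 1 << shift
        a ^= b << shift
    return quotient, a

print(bin(gf2_mul(0b11, 0b11)))  # 0b101, i.e. (x + 1)^2 = x^2 + 1
q, r = gf2_divmod(0b11010011101100, 0b1011)
print(bin(r))                    # 0b101
```

The second result is the same remainder as in the worked division in the previous section, and multiplying the quotient by the divisor and adding the remainder recovers the dividend.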

Designing CRC polynomials

The selection of the generator polynomial is the most important part of implementing the CRC algorithm. The polynomial must be chosen to maximise the error-detecting capabilities while minimising overall collision probabilities.

The most important attribute of the polynomial is its length (the largest degree, or exponent, of any one term in the polynomial, plus 1), because of its direct influence on the length of the computed checksum.

The most commonly used polynomial lengths are:

The design of the CRC polynomial depends on the maximum total length of the block to be protected (data + CRC bits), the desired error protection features, the type of resources available for implementing the CRC, and the desired performance. A common misconception is that the "best" CRC polynomials are derived from either an irreducible polynomial or an irreducible polynomial times the factor (1 + x), which adds to the code the ability to detect all errors affecting an odd number of bits[7]. In reality, all the factors described above should enter into the selection of the polynomial.

The advantage of choosing a primitive polynomial as the generator for a CRC code is that the resulting code has maximal total block length: if r is the degree of the primitive generator polynomial, then the maximal total block length is 2^r − 1, and the associated code is able to detect any single-bit or double-bit errors. If we instead used as generator polynomial g(x) = p(x)(1 + x), where p(x) is a primitive polynomial of degree r − 1, then the maximal total block length would be 2^{r−1} − 1, but the code would be able to detect single, double, and triple errors.
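
These block lengths can be checked numerically for small cases. A double-bit error is missed only if the generator divides x^k + 1 for some spacing k inside the block, so the relevant quantity is the smallest n for which x^n leaves remainder 1 modulo g(x). The sketch below computes that n; the function name order_of_x and the choice of p(x) = x^3 + x + 1 as the example polynomial are assumptions for the illustration.

```python
def order_of_x(generator: int) -> int:
    """Smallest n > 0 such that x^n = 1 modulo the generator, over GF(2)."""
    if generator < 0b11 or generator & 1 == 0:
        # Without the +1 term, x is not invertible and the loop would never end.
        raise ValueError("generator needs degree >= 1 and a +1 term")
    degree = generator.bit_length() - 1
    value, n = 1, 0
    while True:
        value <<= 1                  # multiply by x
        if (value >> degree) & 1:    # reduce modulo the generator
            value ^= generator
        n += 1
        if value == 1:
            return n

p = 0b1011    # x^3 + x + 1, primitive, degree 3
g = 0b11101   # (x^3 + x + 1)(x + 1) = x^4 + x^3 + x^2 + 1, degree r = 4

print(order_of_x(p), order_of_x(g))  # 7 7
```

For p the result is 2^3 − 1 = 7, and for g of degree r = 4 it is 2^{r−1} − 1 = 7, in line with the two cases described above.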

A polynomial g(x) that admits other factorizations may then be chosen so as to balance the maximal total block length with a desired error detection power. A powerful class of such polynomials, which subsumes the two examples described above, is that of BCH codes. Regardless of the reducibility properties of a generator polynomial of degree r, provided that it includes the "+1" term, such an error-detecting code will be able to detect all error patterns that are confined to a window of r contiguous bits. These patterns are called "error bursts".
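
The burst property can be verified by brute force for a small generator. An error pattern goes undetected exactly when, viewed as a polynomial, it is divisible by the generator, so the sketch below tries every nonzero pattern that fits in a window of r bits at every position in a block. The function names and the 14-bit block length are assumptions chosen for the example.

```python
def gf2_remainder(value: int, divisor: int) -> int:
    """Remainder of carry-less (XOR) long division of value by divisor."""
    while value.bit_length() >= divisor.bit_length():
        value ^= divisor << (value.bit_length() - divisor.bit_length())
    return value

def detects_all_bursts(generator: int, block_len: int) -> bool:
    """True if every error confined to r contiguous bits has a nonzero remainder."""
    r = generator.bit_length() - 1
    for start in range(block_len):
        for pattern in range(1, 1 << r):
            error = pattern << start
            if error.bit_length() > block_len:
                continue  # the burst would extend past the end of the block
            if gf2_remainder(error, generator) == 0:
                return False
    return True

print(detects_all_bursts(0b1011, 14))  # True:  x^3 + x + 1 has the +1 term
print(detects_all_bursts(0b1010, 14))  # False: x^3 + x lacks the +1 term
```

The second generator fails because the 3-bit burst 101, shifted left by one position, equals the generator itself.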

Specification of CRC

The concept of the CRC as an error-detecting code gets complicated when an implementer or standards committee turns it into a practical system. Here are some of the complications:

Commonly used and standardized CRCs

Numerous varieties of cyclic redundancy check have been incorporated into technical standards. By no means does one algorithm, or one of each degree, suit every purpose; Koopman and Chakravarty recommend selecting a polynomial according to the application requirements and the expected distribution of message lengths[8]. The number of distinct CRCs in use has, however, led to confusion among developers, which authors have sought to address[7]. There are three polynomials reported for CRC-12[8], thirteen conflicting definitions of CRC-16, and six of CRC-32[9].

The polynomials commonly applied are not the most efficient ones possible. Between 1993 and 2004, Koopman, Castagnoli and others surveyed the space of polynomials up to 16 bits[8], and of 24 and 32 bits[10][11], finding examples that have much better performance (in terms of Hamming distance for a given message size) than the polynomials of earlier protocols, and publishing the best of these with the aim of improving the error detection capacity of future standards[11]. In particular, iSCSI and SCTP have adopted one of the findings of this research.

The design of the 32-bit polynomial most commonly used by standards bodies, CRC-32-IEEE, was the result of a joint effort for the Rome Laboratory and the Air Force Electronic Systems Division by J.L. Hammond, J.E. Brown, and S.S. Liu of the Georgia Institute of Technology and K. Brayer of the MITRE Corporation. The earliest known appearances of the 32-bit polynomial were in their 1975 reports: Brayer's MITRE Technical Report 2956, published in January 1975 and released for public dissemination through DTIC (then known as DDC) in August 1975[12], and the report by Hammond, Brown, and Liu for the Rome Laboratory, published in May 1975[13]. Both reports contained contributions from the other team. In December 1975, Brayer and Hammond presented their work in a paper at the IEEE National Telecommunications Conference: the IEEE CRC-32 polynomial is the generating polynomial of a Hamming code and was selected for its error detection performance[14]. Even so, the Castagnoli CRC-32C polynomial used in iSCSI and SCTP matches its performance on messages from 58 bits to 131 kbits, and outperforms it in several size ranges, including the two most common sizes of Internet packet[11]. The ITU-T G.hn standard also uses CRC-32C to detect errors in the payload (although it uses CRC-16-CCITT for PHY headers).

The table below lists only the polynomials of the various algorithms in use. Any particular protocol can impose pre-inversion, post-inversion and reversed bit ordering as described above. CRCs in proprietary protocols might use a non-trivial initial value and final XOR for obfuscation but this does not add cryptographic strength to the algorithm.
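
As an illustration of how these options combine with a polynomial from the table, the sketch below computes the CRC-32 used by zlib one bit at a time. It uses the reversed representation 0xEDB88320 (reversed bit ordering), an initial register of all ones (pre-inversion), and a final XOR with all ones (post-inversion). The function name crc32_bitwise is an assumption for the example; the result is compared against zlib.crc32 from the Python standard library.

```python
import zlib

REVERSED_POLY = 0xEDB88320  # reversed representation of the CRC-32-IEEE 802.3 polynomial

def crc32_bitwise(data: bytes) -> int:
    """Bit-at-a-time CRC-32 with pre-inversion, reflected bit order and post-inversion."""
    crc = 0xFFFFFFFF                       # pre-inversion: start with all ones
    for byte in data:
        crc ^= byte                        # bits enter at the low end (reflected order)
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ REVERSED_POLY
            else:
                crc >>= 1
    return crc ^ 0xFFFFFFFF                # post-inversion: final XOR with all ones

print(hex(crc32_bitwise(b"123456789")))                          # 0xcbf43926
print(crc32_bitwise(b"123456789") == zlib.crc32(b"123456789"))   # True
```

Changing any one of the three options while keeping the same polynomial yields a different, incompatible CRC, which is one source of the conflicting definitions mentioned earlier.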

Note: in this table the high-order bit is omitted; see Specification of CRC above.

Name | Polynomial (and uses) | Representations: normal / reversed / reverse of reciprocal
CRC-1 | x + 1 (most hardware; also known as parity bit) | 0x1 / 0x1 / 0x1
CRC-4-ITU | x^4 + x + 1 (ITU-T G.704, p. 12) | 0x3 / 0xC / 0x9
CRC-5-EPC | x^5 + x^3 + 1 (Gen 2 RFID[15]) | 0x09 / 0x12 / 0x14
CRC-5-ITU | x^5 + x^4 + x^2 + 1 (ITU-T G.704, p. 9) | 0x15 / 0x15 / 0x1A
CRC-5-USB | x^5 + x^2 + 1 (USB token packets) | 0x05 / 0x14 / 0x12
CRC-6-ITU | x^6 + x + 1 (ITU-T G.704, p. 3) | 0x03 / 0x30 / 0x21
CRC-7 | x^7 + x^3 + 1 (telecom systems, ITU-T G.707, ITU-T G.832, MMC, SD) | 0x09 / 0x48 / 0x44
CRC-8-CCITT | x^8 + x^2 + x + 1 (ATM HEC; ISDN Header Error Control and Cell Delineation, ITU-T I.432.1 (02/99)) | 0x07 / 0xE0 / 0x83
CRC-8-Dallas/Maxim | x^8 + x^5 + x^4 + 1 (1-Wire bus) | 0x31 / 0x8C / 0x98
CRC-8 | x^8 + x^7 + x^6 + x^4 + x^2 + 1 | 0xD5 / 0xAB / 0xEA[8]
CRC-8-SAE J1850 | x^8 + x^4 + x^3 + x^2 + 1 | 0x1D / 0xB8 / 0x8E
CRC-8-WCDMA | x^8 + x^7 + x^4 + x^3 + x + 1[16] | 0x9B / 0xD9 / 0xCD[8]
CRC-10 | x^{10} + x^9 + x^5 + x^4 + x + 1 (ATM; ITU-T I.610) | 0x233 / 0x331 / 0x319
CRC-11 | x^{11} + x^9 + x^8 + x^7 + x^2 + 1 (FlexRay[17]) | 0x385 / 0x50E / 0x5C2
CRC-12 | x^{12} + x^{11} + x^3 + x^2 + x + 1 (telecom systems[18][19]) | 0x80F / 0xF01 / 0xC07[8]
CRC-15-CAN | x^{15} + x^{14} + x^{10} + x^8 + x^7 + x^4 + x^3 + 1 | 0x4599 / 0x4CD1 / 0x62CC
CRC-16-IBM | x^{16} + x^{15} + x^2 + 1 (Bisync, Modbus, USB, ANSI X3.28, many others; also known as CRC-16 and CRC-16-ANSI) | 0x8005 / 0xA001 / 0xC002
CRC-16-CCITT | x^{16} + x^{12} + x^5 + 1 (X.25, HDLC, XMODEM, Bluetooth, SD, many others; known as CRC-CCITT) | 0x1021 / 0x8408 / 0x8810[8]
CRC-16-T10-DIF | x^{16} + x^{15} + x^{11} + x^{9} + x^8 + x^7 + x^5 + x^4 + x^2 + x + 1 (SCSI DIF) | 0x8BB7[20] / 0xEDD1 / 0xC5DB
CRC-16-DNP | x^{16} + x^{13} + x^{12} + x^{11} + x^{10} + x^8 + x^6 + x^5 + x^2 + 1 (DNP, IEC 870, M-Bus) | 0x3D65 / 0xA6BC / 0x9EB2
CRC-16-DECT | x^{16} + x^{10} + x^8 + x^7 + x^3 + 1 (cordless telephones)[21] | 0x0589 / 0x91A0 / 0x82C4
CRC-16-Fletcher | Not a CRC; see Fletcher's checksum | Used in Adler-32 A & B CRCs
CRC-24 | x^{24} + x^{22} + x^{20} + x^{19} + x^{18} + x^{16} + x^{14} + x^{13} + x^{11} + x^{10} + x^8 + x^7 + x^6 + x^3 + x + 1 (FlexRay[17]) | 0x5D6DCB / 0xD3B6BA / 0xAEB6E5
CRC-24-Radix-64 | x^{24} + x^{23} + x^{18} + x^{17} + x^{14} + x^{11} + x^{10} + x^7 + x^6 + x^5 + x^4 + x^3 + x + 1 (OpenPGP) | 0x864CFB / 0xDF3261 / 0xC3267D
CRC-30 | x^{30} + x^{29} + x^{21} + x^{20} + x^{15} + x^{13} + x^{12} + x^{11} + x^{8} + x^{7} + x^{6} + x^{2} + x + 1 (CDMA) | 0x2030B9C7 / 0x38E74301 / 0x30185CE3
CRC-32-Adler | Not a CRC; see Adler-32 | See Adler-32
CRC-32-IEEE 802.3 | x^{32} + x^{26} + x^{23} + x^{22} + x^{16} + x^{12} + x^{11} + x^{10} + x^8 + x^7 + x^5 + x^4 + x^2 + x + 1 (V.42, Ethernet, MPEG-2, PNG[22], POSIX cksum) | 0x04C11DB7 / 0xEDB88320 / 0x82608EDB[11]
CRC-32C (Castagnoli) | x^{32} + x^{28} + x^{27} + x^{26} + x^{25} + x^{23} + x^{22} + x^{20} + x^{19} + x^{18} + x^{14} + x^{13} + x^{11} + x^{10} + x^9 + x^8 + x^6 + 1 (iSCSI & SCTP, G.hn payload, SSE4.2) | 0x1EDC6F41 / 0x82F63B78 / 0x8F6E37A0[11]
CRC-32K (Koopman) | x^{32} + x^{30} + x^{29} + x^{28} + x^{26} + x^{20} + x^{19} + x^{17} + x^{16} + x^{15} + x^{11} + x^{10} + x^{7} + x^{6} + x^{4} + x^{2} + x + 1 | 0x741B8CD7 / 0xEB31D82E / 0xBA0DC66B[11]
CRC-32Q | x^{32} + x^{31} + x^{24} + x^{22} + x^{16} + x^{14} + x^{8} + x^{7} + x^{5} + x^{3} + x + 1 (aviation; AIXM[23]) | 0x814141AB / 0xD5828281 / 0xC0A0A0D5
CRC-64-ISO | x^{64} + x^4 + x^3 + x + 1 (HDLC, ISO 3309; Swiss-Prot/TrEMBL; considered weak for hashing[24]) | 0x000000000000001B / 0xD800000000000000 / 0x800000000000000D
CRC-64-ECMA-182 | x^{64} + x^{62} + x^{57} + x^{55} + x^{54} + x^{53} + x^{52} + x^{47} + x^{46} + x^{45} + x^{40} + x^{39} + x^{38} + x^{37} + x^{35} + x^{33} + x^{32} + x^{31} + x^{29} + x^{27} + x^{24} + x^{23} + x^{22} + x^{21} + x^{19} + x^{17} + x^{13} + x^{12} + x^{10} + x^9 + x^7 + x^4 + x + 1 (as described in ECMA-182 p. 51) | 0x42F0E1EBA9EA3693 / 0xC96C5795D7870F42 / 0xA17870F5D4F51B49
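
The three representations in the table can be derived mechanically from the exponents of a polynomial. The sketch below is one way to do so under the conventions the table appears to follow: "normal" drops the x^n term, "reversed" mirrors the normal form across n bits, and "reverse of reciprocal" keeps the x^n term but drops the +1 term. The function name representations is an assumption for the example.

```python
def representations(exponents: list[int]) -> tuple[int, int, int]:
    """Return (normal, reversed, reverse of reciprocal) for a generator polynomial."""
    n = max(exponents)                     # degree of the polynomial
    full = 0
    for e in exponents:
        full |= 1 << e                     # bit k holds the coefficient of x^k
    normal = full & ((1 << n) - 1)         # omit the high-order x^n term
    mirrored = int(format(normal, f"0{n}b")[::-1], 2)  # mirror across n bits
    reverse_of_reciprocal = full >> 1      # omit the +1 term instead
    return normal, mirrored, reverse_of_reciprocal

crc16_ccitt = representations([16, 12, 5, 0])
crc32_ieee = representations([32, 26, 23, 22, 16, 12, 11, 10, 8, 7, 5, 4, 2, 1, 0])

print([hex(v) for v in crc16_ccitt])  # ['0x1021', '0x8408', '0x8810']
print([hex(v) for v in crc32_ieee])   # ['0x4c11db7', '0xedb88320', '0x82608edb']
```

Both results agree with the corresponding rows of the table.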

References

  1. Ritter, Terry (February 1986). "The Great CRC Mystery". Dr. Dobb's Journal 11 (2): 26–34, 76–83. http://www.ciphersbyritter.com/ARTS/CRCMYST.HTM. Retrieved 21 May 2009. 
  2. Peterson, W. W. and Brown, D. T. (January 1961). "Cyclic Codes for Error Detection". Proceedings of the IRE 49: 228. doi:10.1109/JRPROC.1961.287814. 
  3. Cam-Winget, N.; Housley, R.; Wagner, D.; Walker, J. (May 2003). "Security Flaws in 802.11 Data Link Protocols". Communications of the ACM 46 (5): 35–39. doi:10.1145/769800.769823.
  4. Stigge, Martin; Plötz, Henryk; Müller, Wolf; Redlich, Jens-Peter (May 2006). Reversing CRC – Theory and Practice. Berlin: Humboldt University Berlin. pp. 24. http://sar.informatik.hu-berlin.de/research/publications/SAR-PR-2006-05/SAR-PR-2006-05_.pdf. Retrieved 21 Jul 2009. "The presented methods offer a very easy and efficient way to modify your data so that it will compute to a CRC you want or at least know in advance. This is not a very difficult task, as CRC is not a cryptographical hash algorithm [...] So you should never consider the CRC as some kind of message authentication code [...] – it can easily be forged". 
  5. Anachriz (30 April 1999). "CRC and how to Reverse it". http://www.woodmann.com/fravia/crctut1.htm. Retrieved 21 January 2010.  Online essay with example x86 assembly code.
  6. "Eurocontrol – FAQ: Technologies". European Organisation for the Safety of Air Navigation. http://www.eurocontrol.int/aim/public/faq/chain_faq3.html. Retrieved 29 April 2009. "A Cyclic Redundancy Check (CRC) is a means by which a data item may be assessed to verify that it has not been changed (either intentionally or unintentionally) since the CRC value was applied to it." 
  7. Williams, Ross N. (24 September 1996). "A Painless Guide to CRC Error Detection Algorithms V3.00" (HTML). http://www.repairfaq.org/filipg/LINK/F_crc_v3.html. Retrieved 5 June 2010. Contains a rigorous explanation of how to generate the CRC table typically found in implementations.
  8. Koopman, Philip; Chakravarty, Tridib (2004). "Cyclic Redundancy Code (CRC) Polynomial Selection For Embedded Networks". http://www.ece.cmu.edu/~koopman/roses/dsn04/koopman04_crc_poly_embedded.pdf.
  9. Greg Cook (26 March 2010). "Catalogue of parametrised CRC algorithms". http://regregex.bbcmicro.net/crc-catalogue.htm. Retrieved 5 June 2010. 
  10. Castagnoli, G.; Braeuer, S.; Herrmann, M. (June 1993). "Optimization of Cyclic Redundancy-Check Codes with 24 and 32 Parity Bits". IEEE Transactions on Communications 41 (6): 883. doi:10.1109/26.231911. Work by Castagnoli et al. on algorithmic selection of CRC polynomials.
  11. Koopman, P. (June 2002). "32-Bit Cyclic Redundancy Codes for Internet Applications". The International Conference on Dependable Systems and Networks: 459. doi:10.1109/DSN.2002.1028931. http://citeseer.ist.psu.edu/koopman02bit.html. Verification of Castagnoli's results by exhaustive search and some new good polynomials.
  12. Kenneth Brayer (August 1975). Evaluation of 32 Degree Polynomials in Error Detection on the SATIN IV Autovon Error Patterns. National Technical Information Service. p. 74. http://oai.dtic.mil/oai/oai?verb=getRecord&metadataPrefix=html&identifier=ADA014825 
  13. Joseph L. Hammond, Jr.; James E. Brown; Shyan-Shiang Liu (May 1975). Development of a Transmission Error Model and an Error Control Model. National Technical Information Service. p. 74. http://handle.dtic.mil/100.2/ADA013939 
  14. Brayer, Kenneth; Hammond, Joseph L. Jr. (December 1975). "Evaluation of error detection polynomial performance on the AUTOVON channel". Conference Record. 1. IEEE National Telecommunications Conference, New Orleans, La. New York: Institute of Electrical and Electronics Engineers. pp. 8–21 to 8–25. 
  15. Class-1 Generation-2 UHF RFID Protocol. 1.2.0. EPCglobal. 23 October 2008. p. 35. http://www.epcglobalinc.org/standards/uhfc1g2/uhfc1g2_1_0_9-standard-20050126.pdf. Retrieved 21 May 2009. 
  16. Richardson, Andrew (17 March 2005). WCDMA Handbook. Cambridge, UK: Cambridge University Press. p. 223. ISBN 0521828155. 
  17. FlexRay Protocol Specification. 2.1 Revision A. Flexray Consortium. 22 December 2005. p. 93.
  18. Perez, A.; Wismer & Becker (1983). "Byte-Wise CRC Calculations". IEEE Micro 3 (3): 40–50. doi:10.1109/MM.1983.291120. 
  19. Ramabadran, T.V.; Gaitonde, S.S. (1988). "A tutorial on CRC computations". IEEE Micro 8 (4): 62–75. doi:10.1109/40.7773. 
  20. Thaler, Pat (28 August 2003). 16-bit CRC polynomial selection. INCITS T10. http://www.t10.org/ftp/t10/document.03/03-290r0.pdf. Retrieved 11 August 2009. 
  21. ETSI EN 300 175-3. V2.2.1. Sophia Antipolis, France: European Telecommunications Standards Institute. November 2008. 
  22. Thomas Boutell, Glenn Randers-Pehrson, et al. (1998-07-14). "PNG (Portable Network Graphics) Specification, Version 1.2". http://www.libpng.org/pub/png/spec/1.2/PNG-Structure.html. Retrieved 2008-04-28. 
  23. AIXM Primer. 4.5. European Organisation for the Safety of Air Navigation. 20 March 2006. http://www.eurocontrol.int/aim/gallery/content/public/aicm_aixm_4_5/aixm_primer/AIXM_Primer_4.5.pdf. Retrieved 29 April 2009. 
  24. Jones, David T. An Improved 64-bit Cyclic Redundancy Check for Protein Sequences. University College London. http://www.cs.ucl.ac.uk/staff/d.jones/crcnote.pdf. Retrieved 15 December 2009.