Simple file verification

From Wikipedia, the free encyclopedia

Simple file verification (SFV) is a file format for storing CRC32 checksums of files so that their integrity can be verified. SFV can be used to detect random corruption in a file, but cannot be used to check authenticity in any meaningful way. SFV files typically use the .sfv extension.

Checksum

Files can become corrupted for a variety of reasons, including faulty storage media, errors in transmission, write errors during copying or moving, and software bugs. SFV verification checks that a file has not been corrupted by comparing the file's current CRC32 checksum to a previously calculated value. Because many different files share the same 32-bit checksum, a corrupted file can occasionally produce the same value as the original and pass verification undetected, but with random corruption the likelihood of such a collision is usually negligible.
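The comparison described above can be sketched in Python with the standard zlib module. This is a minimal illustration rather than the behaviour of any particular SFV utility; the function names crc32_of_file and verify, and the chunk size, are assumptions chosen for this example.

```python
import zlib


def crc32_of_file(path, chunk_size=65536):
    """Return the CRC32 of a file as eight lowercase hex digits."""
    crc = 0
    with open(path, "rb") as f:
        # Read in chunks so large files need not fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            crc = zlib.crc32(chunk, crc)
    return format(crc & 0xFFFFFFFF, "08x")


def verify(path, expected_hex):
    """Compare a file's current CRC32 with a previously recorded value."""
    return crc32_of_file(path) == expected_hex.strip().lower()
```

A mismatch means the file has changed since the checksum was recorded; a match only means that no change was detected.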

SFV cannot be used to verify the authenticity of files, because CRC32 is not a collision-resistant hash function. Even if the checksum file itself has not been tampered with, it is computationally trivial for an attacker to alter a file so that it still produces the recorded CRC32 value, meaning that a malicious change in the file is not detected by the comparison. In cryptography, forging an input to match a given hash value is called a second-preimage attack, a stronger relative of the collision attack. For this reason, the md5sum and sha1sum utilities, which use the MD5 and SHA-1 cryptographic hash functions respectively, are often preferred on Unix operating systems.
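One reason CRC32 offers no protection against deliberate changes is that it is an affine function over bits: for three inputs of equal length, the CRC32 of their bytewise XOR equals the XOR of their individual CRC32 values. The sketch below only demonstrates this property and is not a forgery tool; the helper names xor_bytes and crc32_is_affine are hypothetical and chosen for this example.

```python
import zlib


def xor_bytes(*blocks):
    """Bytewise XOR of several equal-length byte strings."""
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all blocks must have the same length")
    out = bytearray(length)
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


def crc32_is_affine(a, b, c):
    """Check crc32(a ^ b ^ c) == crc32(a) ^ crc32(b) ^ crc32(c)."""
    combined = zlib.crc32(xor_bytes(a, b, c))
    return combined == zlib.crc32(a) ^ zlib.crc32(b) ^ zlib.crc32(c)


print(crc32_is_affine(b"file_one", b"file_two", b"file_six"))  # prints True
```

Because the effect of flipping any set of bits on the checksum is predictable in this way, an attacker can compute compensating changes elsewhere in the file. A cryptographic hash function is designed so that no comparable shortcut is known.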

Even a single-bit error causes both SFV's CRC and md5sum's cryptographic hash to mismatch, typically requiring the entire file to be re-fetched from scratch. For this reason, the Parchive and rsync utilities are often preferred when a file may have been accidentally corrupted in transmission, since they can correct common small errors with a much shorter download.

Despite these weaknesses, SFV remains a popular data verification technique. This is because SFV utilities need relatively little time to calculate CRC32 checksums, especially compared with the time taken to calculate cryptographic hashes such as MD5 or SHA-1 over the same data.

One of the first programs to use the SFV format was WinSFV.

SFV uses a plain text file containing one line for each file and its checksum in the format FILENAME<whitespaces>CHECKSUM, where the checksum is the file's CRC32 value written as eight hexadecimal digits. Any line starting with a semicolon ';' is considered to be a comment and is ignored for the purposes of file verification. The delimiter between the filename and checksum is always one or more spaces; tabs are never used. A sample SFV file appears as follows:

file_one.zip   c45ad668
file_two.zip   7903b8e6
file_three.zip e99a65fb
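A file in this format can be read with a few lines of Python. The sketch below assumes that the checksum is the last space-separated token on each line, so that filenames containing spaces are kept intact; the function name parse_sfv is hypothetical, and real SFV utilities may handle malformed lines differently.

```python
def parse_sfv(text):
    """Map each filename in SFV-formatted text to its lowercase checksum."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines starting with ';'.
        if not line or line.startswith(";"):
            continue
        # The checksum is assumed to be the last space-separated token.
        name, _, checksum = line.rpartition(" ")
        name = name.rstrip()
        if not name:
            continue  # no delimiter found, so the line is malformed
        entries[name] = checksum.lower()
    return entries


sample = """; example comment line
file_one.zip   c45ad668
file_two.zip   7903b8e6
file_three.zip e99a65fb
"""

for name, checksum in parse_sfv(sample).items():
    print(name, checksum)
```

Each parsed checksum can then be compared with a freshly computed CRC32 of the named file.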
