Comparison of archive formats
Many computer data archive formats exist for creating and maintaining archive files. The tables below compare a number of popular ones.
Features
The following sections describe, column by column, the features compared in the tables below.
Purpose
Archive formats are used for backups, mobility, and archiving. Many archive formats compress the data so that it consumes less storage space and transfers more quickly, because the same data is represented by fewer bytes. Another benefit is that files are combined into one archive file, which reduces the overhead of managing or transferring them.
There are numerous compression algorithms available to losslessly compress archived data and some algorithms work better (smaller archive or faster compression) with particular data types.
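The claim that algorithms behave differently on different data types can be checked with a minimal sketch. It assumes only the codecs in Python's standard library (`zlib`, `bz2`, `lzma`); the sample inputs and the helper name `compressed_sizes` are illustrative, not taken from any archiver.

```python
import bz2
import lzma
import os
import zlib


def compressed_sizes(data: bytes) -> dict:
    """Return the compressed size in bytes for each standard-library codec."""
    return {
        "zlib (DEFLATE)": len(zlib.compress(data, 9)),
        "bz2": len(bz2.compress(data, 9)),
        "lzma": len(lzma.compress(data)),
    }


# Highly repetitive text compresses well; random bytes barely compress at all.
text_like = b"the quick brown fox jumps over the lazy dog\n" * 2000
random_like = os.urandom(len(text_like))

for label, sample in (("repetitive text", text_like), ("random bytes", random_like)):
    print(label, len(sample), compressed_sizes(sample))
```

On the repetitive sample every codec shrinks the input to a small fraction of its size, while on the random sample none of them gains anything, which is why archivers often let the user pick a method per data type.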
Archive formats are also used by most operating systems to package software for easier distribution and installation than binary executables.
Filename extension
The DOS and Windows operating systems required filenames to include a three-character extension to identify the file type and use. Filename extensions must be unique for each type of file. Many operating systems identify a file's type from its contents without the need for an extension in its name. However, the use of three-character extensions has been embraced as a useful and efficient shorthand for identifying file types—both for computer software, and for humans.
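Identifying a file's type from its contents usually means checking a short signature ("magic number") at the start of the file. The sketch below is illustrative only: the table covers a handful of well-known signatures, the function name `sniff_format` is hypothetical, and some formats need more than a prefix check (tar, for example, keeps its "ustar" marker at offset 257, and an empty ZIP file begins with a different signature).

```python
import gzip
import io
import zipfile

# A few well-known leading signatures; real tools know many more.
SIGNATURES = [
    (b"PK\x03\x04", "zip"),
    (b"7z\xbc\xaf\x27\x1c", "7z"),
    (b"Rar!\x1a\x07", "rar"),
    (b"!<arch>\n", "ar"),
    (b"MSCF", "cab"),
    (b"\x1f\x8b", "gzip"),
]


def sniff_format(header: bytes) -> str:
    """Guess the archive format from the first bytes of a file."""
    for magic, name in SIGNATURES:
        if header.startswith(magic):
            return name
    return "unknown"


# Build a small ZIP archive in memory and sniff it without using its name.
zip_buffer = io.BytesIO()
with zipfile.ZipFile(zip_buffer, "w") as archive:
    archive.writestr("hello.txt", b"hello")
zip_bytes = zip_buffer.getvalue()

print(sniff_format(zip_bytes[:8]))
print(sniff_format(gzip.compress(b"hello")[:8]))
print(sniff_format(b"plain text"))
```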
Integrity check
Archive files are often stored on magnetic media, which are subject to data storage errors. Early tape media had a higher error rate than is expected of magnetic media today. Many archive formats embed extra data, such as checksums, in the archive so that storage or transmission errors can be detected, and the software used to read the archive contains logic to detect such errors.
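A minimal sketch of such an integrity check, assuming a CRC32 checksum appended to the payload. The record layout and the function names `pack` and `verify` are invented for illustration and do not correspond to any particular archive format.

```python
import zlib


def pack(payload: bytes) -> bytes:
    """Append a 4-byte little-endian CRC32 of the payload."""
    crc = zlib.crc32(payload)
    return payload + crc.to_bytes(4, "little")


def verify(record: bytes) -> bool:
    """Recompute the CRC32 and compare it with the stored value."""
    payload, stored = record[:-4], int.from_bytes(record[-4:], "little")
    return zlib.crc32(payload) == stored


record = pack(b"archive member contents")

# Simulate a storage error by flipping a single bit in the payload.
damaged = bytearray(record)
damaged[3] ^= 0x01

print(verify(record))          # True
print(verify(bytes(damaged)))  # False
```

A checksum of this kind only reports that something is wrong; it cannot say which bytes changed or restore them.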
Recovery record
Many archive formats embed redundant data in the archive so that storage or transmission errors can be not only detected but also corrected. The software used to read the archive contains logic to detect the errors and to repair the damaged data from the redundant data.
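The idea of repairing data from redundancy can be sketched with a single XOR parity block. This is a deliberately simple stand-in: formats such as RAR are listed below as using Reed-Solomon codes, which tolerate more damage. The block contents and the helper name `xor_blocks` are illustrative.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks plus one parity block stored alongside them.
blocks = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(blocks)

# Suppose block 1 is known to be unreadable. XOR-ing the surviving
# blocks with the parity block reconstructs the missing one.
survivors = [blocks[0], blocks[2], parity]
recovered = xor_blocks(survivors)

print(recovered)  # b'BBBB'
```

One parity block can rebuild only one lost block of known position, which is why practical recovery records rely on stronger codes.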
Encryption
In order to protect the data being stored or transferred from being read if intercepted, many archive formats include the capability to encrypt the data. There are multiple mathematical algorithms available to encrypt data.
Comparison
Containers and Compression
Format | Filename extension | Created by | Introduced in | Based on | Compression | Integrity check | Recovery record | Encryption supported | Unicode filenames | Modification date resolution | Pre-processing |
---|---|---|---|---|---|---|---|---|---|---|---|
Archive (ar) | .a | CSRG | ? | Original | No | No | No | No | No | 1 s | ? |
cpio | .cpio | Bell Labs | 1983 Unix System V | ? | No | Partial, select formats only | No | No | No | 1 s | ? |
Shell Archive (shar and makeself) | .shar, .run | ? | 1994 4.4BSD | Original | No | Yes, commonly MD5 | Partial | Partial | Partial | arbitrary (typically 1 s) | ? |
Tape Archive (tar) | .tar | Bell Labs | 1975 Version 6 Unix | ? | No | Partial, metadata only. Full integrity providable by filters such as gzip. | No | No | Optional^1 | 1 s | No |
Extended TAR format (pax) | .tar | OpenGroup | 2001 | Sun proposal + TAR | No | metadata | No | No | Yes | arbitrary (typically 1 ns) | ? |
BagIt | - | The Library of Congress | 2007 | file system | No | Yes | No | No | Yes | No | ? |
7z | .7z | Igor Pavlov | 2000 | LZMA | Yes | Yes, CRC32 | No | Yes, AES-256 | Yes | 1 ms (maybe better?) | Yes |
ACE | .ace | Marcel Lemke | ? | ? | Yes | Yes | Yes | Yes, Blowfish | Yes | ? | ? |
AFA | .afa | Vicente Sánchez-Alarcos | 2009 | Original | Yes | Yes | Yes | Yes, AES and CAST | Yes | ? | ? |
ARC | .arc | Thom Henderson (SEA) | 1985 | ? | Yes | CRC16 | No | weak XOR only | No | 2 s | ? |
ARJ | .arj | Robert Jung | 1991 | AR001 and AR002 | Yes | Yes | Yes | weak XOR with initial constant | No | ? | ? |
B1 | .b1 | Catalina Group Ltd | 2011 | LZMA | Yes | Yes | No | Yes, AES | Yes | ? | ? |
Cabinet | .cab | Microsoft | 1992 Windows 3.1 | DEFLATE | Yes | Optional PKCS7 Authenticode signature | No | Optional (with SDK) | Yes | 2 s | ? |
Compact File Set | .cfs | Joe Lowe (Pismo Technic Inc.) | 2008 | ZIP/LZMA | Yes | Yes | ? | Yes | Yes | ? | ? |
Compact Pro | .cpt | Bill Goodman | 1990 (as "Compactor") | Original | Yes | Yes | No | Yes | ? | ? | ? |
Disk Archive (DAR) | .dar | Denis Corbin | 2002 | Original | Yes | Yes | Yes^2 | Yes | Yes | 1 µs | Yes |
DGCA | .dgc | Shin-ichi Tsuruta | 2001 | GCA | Yes | Yes | Yes | Yes | Yes | ? | ? |
FreeArc | .arc | Bulat Ziganshin | 2006 | LZMA, PPMD, TTA | Yes | Yes | Yes | Yes, AES, Blowfish, Twofish and Serpent | Yes | ? | ? |
LHA (also LZH) | .lzh, .lha | Haruyasu Yoshizaki | 1988 | Frozen | Yes | Only on recent LHA releases | No | No | No | 1–2 s | ? |
LZX | .lzx | Jonathan Forbes and Tomi Poutanen | 1995 | LZ77 | Yes | Only on recent LZX releases | ? | ? | ? | ? | ? |
Sparc | .arc | David Pilling | 1989 | ? | Yes | ? | ? | ? | ? | ? | ? |
WinMount format | .mou | ? | 2007 | ? | Yes | Yes | Yes | Yes | Yes | ? | ? |
Macintosh Disk Image | .dmg | Apple Computer | 2001 Mac OS X | Original | Yes | Yes | ? | Yes | ? | ? | ? |
Partition Image (PartImage) | .partimg | François Dupoux and Franck Ladurelle | 2000 | ? | Yes | ? | ? | ? | ? | ? | ? |
PAQ Family (Several formats)^4 | .paq#*, .lpaq#* | Matt Mahoney | 2002–2006 | Original | Yes | ? | ? | ? | ? | ? | ? |
PEA | .pea | Giorgio Tani | 2006 | Original, Deflate based compression | Yes | Yes, Adler32, CRC32, CRC64, MD5, SHA1, RIPEMD-160, SHA256, SHA512, Whirlpool | No | Yes, authenticated encryption, AES128 and AES256 in EAX mode | Yes, system dependent | Yes, arbitrary | ? |
PIM | .pim | Ilia Muraviev | 2004–2008 | Original | Yes | Yes | No | No | Yes | No | ? |
Quadruple D | .qda | Taku Hayase (aka sandman) | 1997 | ? | Yes | ? | ? | ? | ? | ? | ? |
RAR | .rar | Eugene Roshal | 1993 | Original | Yes | Yes, CRC32, BLAKE2 | Yes, Reed-Solomon | Yes, AES-256 | Yes, UTF-8 | 2 s, 1 s, 6.5536 ms, 25.6 µs or 100 ns^3 | Dropped |
RK | .rk | M Software, Ltd. | 2004 | Original | Yes | Yes | No | Yes, AES, Square, Twofish | Yes | 1 s | ? |
StuffIt (also SIT) | .sit | Raymond Lau | 1987 | ? | Yes | ? | ? | Yes | ? | ? | ? |
StuffIt X (also SITx) | .sitx | Aladdin/Allume Systems | 2002 | ? | Yes | ? | Optional | Yes, RC4, Blowfish, AES, DES | Yes | ? | ? |
UltraCompressor II | .uc .uc0 .uc2 .ucn .ur2 .ue2 | Nico de Vries | 1992–1996 | LZ77 and Huffman coding | Yes | Yes | Yes | Yes, triple DES | ? | ? | ? |
Windows Image | .wim | Microsoft | ? | Original | Yes | Optional | ? | No | Yes | ? | ? |
ZIP (also PKZIP) | .zip | Phil Katz | 1989 | DEFLATE | Yes | Yes | No | Yes, AES | Yes | 2 s | ? |
ZPAQ | .zpaq | Matt Mahoney | 2009 | PAQ | Yes | Yes, SHA-1 | No | Yes, AES-256 | Yes | ? | ? |
Notes
^1 While the original tar format uses the ASCII character encoding, current implementations use the UTF-8 (Unicode) encoding, which is backwards compatible with ASCII.
^2 Supports the external Parchive program (par2).
^3 As of release 3.20, RAR can store modification, creation, and last-access times with a precision of up to 0.0000001 second (0.1 µs, i.e. 100 ns).
^4 The PAQ family (with its lighter-weight derivative LPAQ) went through many revisions, and each revision suggested its own extension, for example ".paq9a".
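The 2 s modification date resolution listed for ZIP can be observed with a short sketch using Python's standard `zipfile` module. The member name and timestamp are arbitrary examples; the standard timestamp field of a ZIP entry stores seconds divided by two, so an odd second is rounded down.

```python
import io
import zipfile

buffer = io.BytesIO()

# Write one member with an odd number of seconds (12:00:01).
with zipfile.ZipFile(buffer, "w") as archive:
    info = zipfile.ZipInfo("example.txt", date_time=(2020, 1, 1, 12, 0, 1))
    archive.writestr(info, b"hello")

# Read the timestamp back from the archive.
with zipfile.ZipFile(buffer) as archive:
    stored = archive.getinfo("example.txt").date_time

print(stored)  # (2020, 1, 1, 12, 0, 0)
```

Formats with finer resolution in the table avoid this rounding by storing timestamps in a wider field.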
Software Packaging and Distribution
Format | Filename extension | Created by | Introduced in | Based on | Integrity check | Recovery record | Encryption supported | Unicode filenames | Modification date resolution |
---|---|---|---|---|---|---|---|---|---|
Debian package (deb) | .deb | Debian | 1994 Debian 0.91 | ar, tar, and gzip | Yes | No | No | Yes | 1 s |
Macintosh Installer | .pkg, .mpkg (metapackage) | NeXT | 1989 NeXTSTEP 1.0 | pax and gzip | Yes | ? | ? | Yes | ? |
RPM Package Manager (RPM) | .rpm | Red Hat | 1995 Red Hat Linux 1.0 | cpio and gzip | Yes | ? | ? | ? | 1 s |
Slackware Package | .tgz | Patrick Volkerding | 1993 Slackware 1.0 | tar and gzip | Yes | No | No | ? | ? |
Windows Installer (also MSI) | .msi | Microsoft | 2000 Windows 2000 | OLE Structured Storage, Cabinet and SQL | Optional PKCS7 Authenticode Signature | No | No | No | 2 s |
Java Archive (JAR^1) | .jar | Sun Microsystems | 1997 JDK 1.1 | PKZIP | Yes | ? | ? | Yes | ? |
Google Chrome extension package | .crx | ? | 2009 (Chrome 4.0) | Zip | ? | ? | Yes[1] | ? | ? |
Pacman | .pkg.tar.xz | Judd Vinet | ? | .tar.xz | ? | ? | ? | ? | ? |
Notes
^1 Not to be confused with the archiver JAR written by Robert K. Jung, which produces ".j" files.
Features
Archive format | Built-in compression | Self-extracting | Directory Structure | POSIX attributes | ACLs | Alternate data streams |
---|---|---|---|---|---|---|
cpio | No^1 | No | Yes | Yes | ? | ? |
tar | No^1 | No | Yes | Yes | Yes | No |
dar | Yes^3 | No | Yes | Yes | Yes | Yes |
ar | No | No | No | Yes | No | ? |
pax | No | No | Yes | Yes | Yes | ? |
dump | No^1 | No | Yes | Yes | Yes | ? |
shar | No | Yes | Yes | Yes | ? | ? |
makeself | Yes | Yes | Yes | Yes | Yes | ? |
zip | Yes | Yes^2 | Yes | No | ? | ? |
rar | Yes | Yes^2 | Yes | No | ? | ? |
ace | Yes | ? | Yes | No | ? | ? |
arj | Yes | Yes^2 | Yes | No | No | ? |
zoo | Yes | ? | Yes | No | ? | ? |
ISO 9660 (CD-ROM) | No^1 | No | Yes | (with Rock Ridge extension) | No | ? |
cab | Yes | Yes^2 | ? | No | ? | ? |
rpm | Yes | No | Yes | Yes | ? | ? |
deb | Yes | No | Yes | Yes | ? | ? |
7z | Yes | No | Yes | Yes | ? | ? |
Notes
^1 Compression is not a built-in feature of these formats; however, the resulting archive can be compressed with any algorithm of choice. Several implementations include functionality to do this automatically.
^2 That is, most implementations can optionally produce a self-extracting executable.
^3 Per-file compression with gzip, bzip2, lzo, xz, or lzma (as opposed to compressing the whole archive). Users can also choose not to compress files that are already compressed, based on their filename suffix.
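Note 1 can be illustrated with Python's standard `tarfile` module, which is one implementation that applies an external compression filter automatically. This is a sketch under simple assumptions: the payload, the member name, and the helper `build_tar` are made up for the example.

```python
import io
import tarfile

payload = b"example data\n" * 1000


def build_tar(mode: str) -> bytes:
    """Create an in-memory tar archive with one member, using the given mode."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode=mode) as archive:
        info = tarfile.TarInfo(name="example.txt")
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


plain = build_tar("w")       # tar container only, no compression
gzipped = build_tar("w:gz")  # the same container passed through gzip

print(len(plain), len(gzipped))

# The compressed archive still round-trips to the original contents.
with tarfile.open(fileobj=io.BytesIO(gzipped), mode="r:gz") as archive:
    restored = archive.extractfile("example.txt").read()
print(restored == payload)
```

Because the whole tar stream is compressed as one unit, this differs from the per-file compression described in note 3.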
References
See also
- List of archive formats
- Comparison of file archivers
- Comparison of file systems
- List of file systems