Filename extension | .7z |
---|---|
Internet media type | application/x-7z-compressed |
Magic number | '7', 'z', 0xBC, 0xAF, 0x27, 0x1C |
Developed by | Igor Pavlov |
Initial release | 1999[1] |
Type of format | Data compression |
Open format? | Yes: GNU Lesser General Public License |
7z is a compressed archive file format that supports several different data compression, encryption and pre-processing algorithms. The 7z format first appeared in the 7-Zip archiver, which is publicly available under the terms of the GNU Lesser General Public License. The LZMA SDK 4.62 was placed in the public domain in December 2008. The latest stable version of 7-Zip and the LZMA SDK is 9.20.[1]
The MIME type of 7z is application/x-7z-compressed.
The official 7z file format specification is distributed with 7-Zip's source code. It can be found as a plain text file in the doc\ subdirectory of the source code distribution.
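The six-byte magic number listed in the summary table is enough to recognize a 7z archive without parsing the rest of the header. The following is a minimal sketch of such a check; the function name `looks_like_7z` is hypothetical and is not part of 7-Zip or of the specification.

```python
# The 7z signature: '7', 'z', 0xBC, 0xAF, 0x27, 0x1C
SEVEN_ZIP_MAGIC = bytes([ord("7"), ord("z"), 0xBC, 0xAF, 0x27, 0x1C])


def looks_like_7z(data: bytes) -> bool:
    """Return True if the buffer starts with the six-byte 7z signature.

    This only inspects the magic number; it does not validate the header
    that follows, so a True result is a hint rather than a guarantee.
    """
    return data[:len(SEVEN_ZIP_MAGIC)] == SEVEN_ZIP_MAGIC
```

In practice the first six bytes of a file would be read in binary mode and passed to the function.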
The 7z format provides the following main features:
The format's open architecture allows additional future compression methods to be added to the standard.
The following compression methods are currently defined:
A suite of recompression tools called AdvanceCOMP contains a copy of the DEFLATE encoder from the 7-Zip implementation; these utilities can often be used to further reduce the size of existing gzip, ZIP, PNG, or MNG files.
The LZMA SDK includes the BCJ / BCJ2 preprocessors, which allow later stages to achieve greater compression. For x86, ARM, PowerPC (PPC), IA-64 Itanium, and ARM Thumb processors, jump targets are normalized before compression by converting relative positions into absolute values. For x86, this means that near jumps, calls and conditional jumps (but not short jumps and short conditional jumps) are converted from the machine language "jump 1655 bytes backwards" style notation to a normalized "jump to address 5554" style notation; all jumps to 5554, perhaps a common subroutine, are thus encoded identically, making them more compressible.
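The relative-to-absolute conversion can be illustrated with a deliberately simplified sketch. It is not the BCJ filter from the LZMA SDK and is not compatible with it: it assumes that every 0xE8 (CALL rel32) or 0xE9 (JMP rel32) byte it meets starts an instruction, and it skips the heuristics a real filter uses to avoid misinterpreting data bytes. The function name `bcj_like_filter` and the `base` parameter are hypothetical.

```python
import struct


def bcj_like_filter(code: bytes, encode: bool = True, base: int = 0) -> bytes:
    """Rewrite 32-bit operands of E8/E9 instructions.

    encode=True  : relative displacement -> absolute target address
    encode=False : absolute target address -> relative displacement
    """
    out = bytearray(code)
    i = 0
    while i + 5 <= len(out):
        if out[i] in (0xE8, 0xE9):
            (value,) = struct.unpack_from("<I", out, i + 1)
            # x86 displacements are relative to the next instruction.
            next_ip = (base + i + 5) & 0xFFFFFFFF
            if encode:
                value = (value + next_ip) & 0xFFFFFFFF
            else:
                value = (value - next_ip) & 0xFFFFFFFF
            struct.pack_into("<I", out, i + 1, value)
            i += 5  # skip the operand so it is never read as an opcode
        else:
            i += 1
    return bytes(out)
```

After encoding, two calls to the same subroutine from different locations carry identical four-byte operands, which gives a dictionary coder such as LZMA a repeated pattern to match. Because the operand bytes are skipped in both directions, running the function with `encode=False` restores the original bytes.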
Similar executable pre-processing technology is included in other software; the RAR compressor features displacement compression for 32-bit x86 executables and IA-64 executables, and the UPX runtime executable file compressor includes support for working with 16-bit values within DOS binary files.
The 7z format supports encryption with the AES algorithm with a 256-bit key. The key is generated from a user-supplied passphrase using an algorithm based on the SHA-256 hash function. SHA-256 is executed 2^19 (524,288) times,[3] which causes a noticeable delay on slow PCs before compression or extraction starts. This technique is called key stretching and is used to make a brute-force search for the passphrase more difficult. The 7z format also provides the option to encrypt the filenames in a 7z archive.
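The cost of key stretching can be seen in a short sketch. The input layout used here (salt, then the passphrase encoded as UTF-16LE, then a 64-bit little-endian round counter, all fed into one running SHA-256) is an assumption about how the 7-Zip implementation arranges its input and should be checked against the 7-Zip source before being relied on. The function name `derive_key_7z_style` and its parameters are hypothetical.

```python
import hashlib
import struct


def derive_key_7z_style(passphrase: str, salt: bytes = b"",
                        cycles_power: int = 19) -> bytes:
    """Derive a 256-bit key by hashing the passphrase 2**cycles_power times.

    Assumed layout per round: salt + UTF-16LE passphrase + 8-byte counter.
    """
    password_bytes = passphrase.encode("utf-16-le")
    digest = hashlib.sha256()
    for counter in range(1 << cycles_power):
        digest.update(salt)
        digest.update(password_bytes)
        digest.update(struct.pack("<Q", counter))
    return digest.digest()  # 32 bytes, usable as an AES-256 key
```

With the default `cycles_power=19` every passphrase guess costs 524,288 rounds, which is what slows a brute-force search; a smaller value such as `cycles_power=4` is convenient for experimenting.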
The 7z format does not store UNIX owner/group permissions, and hence can be inappropriate for backup or archival purposes. A workaround is to pack the data into a tar stream before compressing it with 7z. Note, however, that GNU tar (common in many UNIX environments) can also compress with the LZMA algorithm natively, without the use of 7z, and that in this case the suggested[4] file extension for the archive is ".tar.lzma" (or just ".tlz"), not ".tar.7z".
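The tar-first approach can be sketched with Python's standard `tarfile` and `lzma` modules, which stand in here for the tar and LZMA tools; this is an illustration of the idea under that assumption, not a description of how 7-Zip or GNU tar work internally. The function names `write_tar_lzma` and `read_tar_metadata` are hypothetical.

```python
import lzma
import tarfile


def write_tar_lzma(source_dir: str, archive_path: str) -> None:
    """Pack source_dir into a tar stream and compress it as legacy .lzma."""
    with lzma.open(archive_path, "wb", format=lzma.FORMAT_ALONE) as compressed:
        # "w|" writes a non-seekable tar stream into the compressor.
        with tarfile.open(fileobj=compressed, mode="w|") as tar:
            tar.add(source_dir, arcname=".")


def read_tar_metadata(archive_path: str) -> dict:
    """Return {member name: (mode, uid, gid)} stored in the tar headers."""
    with lzma.open(archive_path, "rb") as compressed:  # format auto-detected
        with tarfile.open(fileobj=compressed, mode="r|") as tar:
            return {m.name: (m.mode, m.uid, m.gid) for m in tar}
```

The permissions and ownership survive because they live in the tar headers; the compression layer only sees an opaque byte stream. The archive should be written outside `source_dir` so that it does not try to include itself.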
The 7z format does not allow extraction from some incomplete ("broken") archives. For example, if only the first segment of a multi-segment 7z archive is available, the files stored at the start of the archive cannot be extracted; extraction must wait until all segments have been downloaded. The 7z format also lacks recovery records, which can be a problem when limited file corruption has occurred.
Salomon, David (2007). Data compression: the complete reference. Springer. p. 241. ISBN 1-84628-602-6.