Crypt (Unix)

From Wikipedia, the free encyclopedia

crypt(1) is a Unix utility command, while crypt(3) is an unrelated standard library function. The (1) and (3) suffixes reflect a documentation convention among Unix writers, system administrators, programmers, and users, which disambiguates terms according to whether they are commands (documented in section 1 of the Unix man pages) or library functions (traditionally documented in section 3).

Despite the similarity in names, the two are basically unrelated.

Command filter

crypt(1) is a simple command that encrypts or decrypts data. It is usually used as a filter, and has traditionally been implemented with an algorithm based on the Enigma machine. It is considered far too cryptographically weak to provide any security against brute-force attacks by modern commodity personal computers.

Some versions of Unix shipped with an even weaker version of the crypt(1) command in order to comply with contemporaneous laws and regulations that limited the export of cryptographic software (for example, by classifying it as a munition). Some of these were simply implementations of the Caesar cipher, effectively no more secure than ROT13, which is itself a Caesar cipher with a well-known key.
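
To illustrate how little protection such a cipher offers, the following is a minimal Python sketch of a Caesar cipher. The function name `caesar` and the sample message are illustrative choices, not taken from any historical crypt(1) implementation. With a shift of 13 it reproduces ROT13, and an attacker needs at most 25 guesses to try every key.

```python
import codecs
import string

def caesar(text: str, shift: int) -> str:
    """Shift ASCII letters by `shift` positions, leaving other characters alone."""
    lower, upper = string.ascii_lowercase, string.ascii_uppercase
    k = shift % 26
    table = str.maketrans(lower + upper,
                          lower[k:] + lower[:k] + upper[k:] + upper[:k])
    return text.translate(table)

message = "Attack at dawn"
rot13 = caesar(message, 13)

# ROT13 is just the Caesar cipher with the well-known key 13.
assert rot13 == codecs.encode(message, "rot_13")
# Applying it twice restores the plaintext, since 13 + 13 = 26.
assert caesar(rot13, 13) == message

# Exhaustive search: undo every possible non-zero shift.
candidates = [caesar(rot13, -k) for k in range(1, 26)]
assert message in candidates
```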

crypt(1) under Linux

Linux distributions generally do not include a Unix-compatible version of the crypt command. This is largely due to a combination of three major factors:

  1. crypt is relatively obscure and is rarely used, whether for e-mail attachments or as a file format.
  2. crypt is considered far too cryptographically weak to withstand brute-force attacks by modern computing systems (Linux systems generally ship with GNU Privacy Guard, which is considered reasonably secure by modern standards).
  3. During the early years of Linux development and adoption there was some concern that the algorithm used by crypt, weak as it was, might still run afoul of ITAR's export controls. Mainstream distribution developers in the United States therefore generally excluded it, leaving their customers to fetch GnuPG/GPG or other strong cryptographic software from international sites, sometimes with packages or scripts provided to automate that process.

The source code of the legacy version of the crypt command does not seem to be readily available, and is apparently not included with the Heirloom Toolchest release of original Unix source code.

Enhanced symmetric encryption utilities, including mcrypt and ccrypt, are available for Linux (and should also be portable to any other Unix-like system). While these support much more sophisticated and modern algorithms, they can also encrypt and decrypt files compatible with the traditional crypt(1) command when given the correct command-line switches and options.

Library function

crypt(3) is the library function which is used to compute a password hash that can be used to store user account passwords while keeping them relatively secure. Technically the name is a misnomer since it is actually a cryptographic hash function. The output of the function is not merely the hash: it is a text string which also encodes the salt and identifies the hash algorithm used. This output forms the password record which may be stored in a plain text file. By a tricky interface, the same function is used both to generate these hashes anew for storage and also to hash a proffered password with a recorded salt for comparison.
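
The dual-purpose interface described above can be sketched in Python as follows. This is a toy stand-in, not a real crypt(3) scheme: the function names (`toy_crypt`, `new_record`, `check`), the `salt$digest` record layout, and the use of SHA-256 are all assumptions made for illustration. The point is only that one function serves both roles, because the stored record carries the salt needed to recompute it.

```python
import hashlib
import hmac
import secrets

def toy_crypt(password: str, setting: str) -> str:
    """Stand-in for crypt(3): return '<salt>$<hex digest>'.

    `setting` may be a bare salt (new hash) or a complete stored record
    (verification); only the part before the first '$' is used as the salt.
    SHA-256 is a placeholder here, NOT one of the real crypt() schemes.
    """
    salt = setting.split("$", 1)[0]
    digest = hashlib.sha256((salt + password).encode()).hexdigest()
    return f"{salt}${digest}"

def new_record(password: str) -> str:
    # Generating a hash anew: pick a fresh random salt.
    return toy_crypt(password, secrets.token_hex(4))

def check(password: str, stored: str) -> bool:
    # Verifying: pass the stored record itself as the setting, then compare.
    return hmac.compare_digest(toy_crypt(password, stored), stored)

record = new_record("hunter2")
assert check("hunter2", record)
assert not check("hunter3", record)
```

Because the salt is recovered from the record, the caller never has to store or parse it separately.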

Traditional DES-based scheme

The traditional implementation uses a modified form of the DES algorithm. The user's password is truncated to eight characters, and each character is reduced to 7 bits; together these form the 56-bit DES key. That key is then used to encrypt an all-bits-zero block, and the resulting ciphertext is encrypted again with the same key, and so on for a total of 25 DES encryptions. A 12-bit salt is used to perturb the encryption algorithm, so standard DES implementations cannot be used to implement crypt(). The salt and the final ciphertext are encoded into a printable string in a form of base 64.
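
The key derivation and salt encoding can be sketched in Python as shown below. The function names are illustrative, and two details are assumptions drawn from common implementations rather than from the text above: the 7-bit values are placed in the upper bits of each key byte (DES ignores the lowest bit of every key byte), and the two salt characters are read from the alphabet `./0-9A-Za-z` with the first character holding the low six bits. The DES encryption itself is not shown.

```python
ITOA64 = "./0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"

def des_key_bytes(password: bytes) -> bytes:
    """Build the 8 key bytes: first 8 characters, low 7 bits of each.

    Each 7-bit value is shifted left by one because DES ignores the lowest
    bit of every key byte (the parity bit), leaving 8 * 7 = 56 key bits.
    """
    padded = password[:8].ljust(8, b"\0")
    return bytes((c & 0x7F) << 1 for c in padded)

def salt_value(two_chars: str) -> int:
    """Decode a two-character salt into a 12-bit number (6 bits per character)."""
    return ITOA64.index(two_chars[0]) | (ITOA64.index(two_chars[1]) << 6)

# Truncation: anything past the eighth character has no effect on the key.
assert des_key_bytes(b"password1") == des_key_bytes(b"password2")
# Two characters of 6 bits each give 2**12 = 4096 possible salts.
assert salt_value("..") == 0 and salt_value("zz") == 4095
```

The first assertion shows a practical weakness of the scheme: passwords that share their first eight characters are interchangeable.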

This is technically not encryption since the data (all bits zero) is not being kept secret; it's widely known to all in advance. However, one of the properties of DES is that it's very resistant to key recovery even in the face of known plaintext situations. It is theoretically possible that two different passwords could result in exactly the same hash. Thus the password is never "decrypted": it is merely used to compute a result, and the matching results are presumed to be proof that the passwords were "the same."

The advantage of this method has been that the password hash can be stored as plain text and copied among Unix systems without exposing the password itself to system administrators or other users. This portability has worked for over 30 years across many generations of computing architecture, and across many versions of Unix from many vendors.

Modifications of the traditional scheme

crypt(3) was originally chosen because DES was resistant to key recovery even in the face of "known plaintext" attacks, and because it was computationally expensive. On the earliest Unix machines it took over a full second to compute a password hash. This also made it reasonably resistant to dictionary attacks in that era. At that time password hashes were commonly stored in an account file (/etc/passwd) which was readable by anyone on the system. (This account file was also used to map user ID numbers into names, user names into full names, and so on.)
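
As a rough illustration of why the account file had to be world-readable, the sketch below parses one line of it in Python. The seven colon-separated fields follow the conventional /etc/passwd layout, which is an assumption not spelled out above; the type and function names are illustrative, and the sample entry is made up (its password field is filler, not a real hash).

```python
from typing import NamedTuple

class PasswdEntry(NamedTuple):
    name: str
    password: str   # the hash in the classic layout described above
    uid: int
    gid: int
    gecos: str      # full name and similar descriptive text
    home: str
    shell: str

def parse_passwd_line(line: str) -> PasswdEntry:
    """Split one account line into its seven colon-separated fields."""
    name, password, uid, gid, gecos, home, shell = line.rstrip("\n").split(":")
    return PasswdEntry(name, password, int(uid), int(gid), gecos, home, shell)

# Made-up entry; "abXXXXXXXXXXX" is 13 characters of filler, not a real hash.
entry = parse_passwd_line(
    "alice:abXXXXXXXXXXX:1001:100:Alice Example:/home/alice:/bin/sh"
)
print(entry.uid, "->", entry.name, "->", entry.gecos)
```

Any program that wants to show a user name instead of a numeric ID needs the same file that holds the hash, which is why the two could not simply be hidden together.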

In the three decades since that time, computers have become vastly more powerful. Moore's Law has generally held true, so the computer speed and capacity available for a given financial investment has doubled over 20 times since Unix was first written. This has long since left the crypt(3) function vulnerable to dictionary attacks, and Unix and Unix-like systems such as Linux have used "shadow" files for a long time, migrating just the password hash values out of the account file (/etc/passwd) and into a file which can only be read by privileged processes (conventionally named /etc/shadow).
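
The arithmetic behind this claim can be made concrete with a short Python calculation. The figures for the original hash time and the number of doublings come from the text above; the dictionary size of 100,000 words is a hypothetical value chosen for illustration, and the calculation assumes the speedup applies directly to hashing.

```python
doublings = 20
speedup = 2 ** doublings           # 1,048,576, roughly a million-fold

seconds_per_hash_then = 1.0        # "over a full second" on early machines
seconds_per_hash_now = seconds_per_hash_then / speedup

dictionary_words = 100_000         # hypothetical word list size
hours_then = dictionary_words * seconds_per_hash_then / 3600
seconds_now = dictionary_words * seconds_per_hash_now

print(f"speedup: {speedup:,}x")
print(f"then: {hours_then:.1f} hours per salt")
print(f"now:  {seconds_now:.3f} seconds per salt")
```

Under these assumptions a dictionary run that once took more than a day for each salt value finishes in about a tenth of a second.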

To increase the computational cost of password breaking, some Unix sites privately started increasing the number of encryption rounds on an ad hoc basis. This had the side effect of making their crypt() incompatible with the standard crypt(): the hashes had the same textual form, but were now calculated using a different algorithm. Some sites also took advantage of this incompatibility effect, by modifying the initial block from the standard all-bits-zero. This did not increase the cost of hashing, but meant that precomputed hash dictionaries based on the standard crypt() could not be applied.

BSDi extended DES-based scheme

To gain greater cryptographic security and resistance to brute-force attacks, modern versions of Unix now have a variety of new password hash schemes implemented using the crypt() interface. BSDi modified the original DES-based scheme, extending the salt to 24 bits and making the number of rounds variable (up to 2^24 - 1). The chosen number of rounds is encoded in the stored password hash, avoiding the incompatibility that occurred when sites modified the number of rounds used by the original scheme. These hashes are identified by a leading underscore (_).

The BSDi algorithm also supports longer passwords, using DES to fold the initial long password down to the eight bytes supported by the original algorithm.
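
A sketch of how the round count and salt might be read back from such a record is given below in Python. The field layout (an underscore, four characters of round count, four characters of salt, then eleven characters of hash, each character carrying six bits with the least significant first) is an assumption based on common implementations, not something stated above. The function names and the sample record are hypothetical.

```python
ITOA64 = "./0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"

def decode24(four_chars: str) -> int:
    """Decode 4 characters (6 bits each, least significant first) into 24 bits."""
    value = 0
    for position, ch in enumerate(four_chars):
        value |= ITOA64.index(ch) << (6 * position)
    return value

def parse_bsdi(record: str) -> dict:
    """Split an extended DES record into its assumed fields."""
    if not record.startswith("_") or len(record) != 20:
        raise ValueError("not an extended DES record")
    return {
        "rounds": decode24(record[1:5]),
        "salt": decode24(record[5:9]),
        "hash": record[9:],
    }

# Hypothetical record: count field "J9..", salt field "salt", dummy hash text.
fields = parse_bsdi("_J9..salt" + "x" * 11)
assert fields["rounds"] == 725
# Four characters of six bits each cover the full range up to 2**24 - 1.
assert decode24("zzzz") == 2**24 - 1
```

Since the count travels with the hash, two sites that choose different round counts can still verify each other's records.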

MD5-based scheme

Poul-Henning Kamp designed a baroque and (at the time) computationally expensive algorithm based on the MD5 message digest algorithm. MD5 itself would provide good cryptographic strength for the password hash, but it is designed to be quite quick to calculate relative to the strength it provides. The crypt() scheme is designed to be expensive to calculate, to slow down dictionary attacks. The printable form of MD5 password hashes starts with $1$.

This scheme allows users to have any length password, and they can use any characters supported by their platform (not just 7-bit ASCII). (In practice many implementations limit the password length, but they generally support passwords far longer than any human being would be willing to type.) The salt is also an arbitrary string, limited only by character set considerations.

First the passphrase and salt are hashed together, yielding an MD5 message digest. Then a new digest is constructed, hashing together the passphrase, the salt, and the first digest, all in a rather complex form. Then this digest is passed through a thousand iterations of a function which rehashes it together with the passphrase and salt in a manner that varies between rounds. The output of the last of these rounds is the resulting passphrase hash.
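
The steps above can be sketched in Python with the standard `hashlib` module. This follows the widely circulated reference implementation as the editor understands it; details not given in the text (the `$1$` marker being mixed into the second digest, the length-dependent padding, the salt being cut to eight bytes, and the byte permutation in the final encoding) are assumptions, and the function names are illustrative. The output has not been checked here against a system crypt(), so treat it as a reading aid rather than a drop-in replacement.

```python
import hashlib

ITOA64 = "./0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"

def _to64(value: int, count: int) -> str:
    """Emit `count` characters, 6 bits each, least significant bits first."""
    out = []
    for _ in range(count):
        out.append(ITOA64[value & 0x3F])
        value >>= 6
    return "".join(out)

def md5_crypt(password: bytes, salt: bytes) -> str:
    magic = b"$1$"
    salt = salt[:8]  # assumed limit; the salt must be printable ASCII here

    # Step 1: first digest over passphrase, salt, passphrase.
    first = hashlib.md5(password + salt + password).digest()

    # Step 2: second digest over passphrase, marker, salt and the first digest.
    ctx = hashlib.md5(password + magic + salt)
    remaining = len(password)
    while remaining > 0:
        ctx.update(first[:min(16, remaining)])
        remaining -= 16
    n = len(password)
    while n:
        # One byte per bit of the length: NUL for a 1 bit, else password[0].
        ctx.update(b"\0" if n & 1 else password[:1])
        n >>= 1
    digest = ctx.digest()

    # Step 3: a thousand rounds whose input layout varies with the round number.
    for i in range(1000):
        ctx = hashlib.md5()
        ctx.update(password if i & 1 else digest)
        if i % 3:
            ctx.update(salt)
        if i % 7:
            ctx.update(password)
        ctx.update(digest if i & 1 else password)
        digest = ctx.digest()

    # Step 4: printable encoding using the scheme's byte permutation.
    d = digest
    groups = [(0, 6, 12), (1, 7, 13), (2, 8, 14), (3, 9, 15), (4, 10, 5)]
    encoded = "".join(_to64((d[a] << 16) | (d[b] << 8) | d[c], 4)
                      for a, b, c in groups) + _to64(d[11], 2)
    return f"$1${salt.decode('ascii')}${encoded}"

record = md5_crypt(b"correct horse", b"saltsalt")
```

The resulting record has the form `$1$<salt>$<22 characters>`, so the salt can be read back from it when a password is later checked.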

The fixed iteration count has caused this scheme to lose the computational expense that it once enjoyed. Variable numbers of rounds are now favoured.

Blowfish-based scheme

Niels Provos and David Mazieres designed a crypt() scheme based on Blowfish, and presented it at USENIX in 1999.[1] The printable form of these hashes starts with $2$ or $2a$, depending on which variant of the algorithm is used.

Blowfish is notable among block ciphers for its expensive key setup phase. It starts off with subkeys in a standard state, then uses this state to perform a block encryption using part of the key, and uses the result of that encryption (really, a hashing) to replace some of the subkeys. Then it uses this modified state to encrypt another part of the key, and uses the result to replace more of the subkeys. It proceeds in this fashion, using a progressively modified state to hash the key and replace bits of state, until all subkeys have been set.

Provos and Mazieres took advantage of this, and actually took it further. They developed a new key setup algorithm for Blowfish, dubbing the resulting cipher "Eksblowfish" ("expensive key schedule Blowfish"). The key setup begins with a modified form of the standard Blowfish key setup, in which both the salt and password are used to set all subkeys. Then there is a configurable number of rounds in which the standard Blowfish keying algorithm is applied, using alternately the salt and the password as the key, each round starting with the subkey state from the previous round. This is not cryptographically significantly stronger than the standard Blowfish key schedule; it's just very slow.

The number of rounds of keying is a power of two, which is an input to the algorithm. The number is encoded in the textual hash.
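
How the round count is carried in the textual hash can be sketched in Python as follows. The sketch assumes the common layout in which a two-digit base-2 logarithm of the round count sits between the second and third `$`; that layout, the function name, and the filler text in the sample record are assumptions, not details given above.

```python
def parse_blowfish_record(record: str) -> dict:
    """Split '$2a$NN$<salt and hash>' into variant, cost and keying rounds."""
    parts = record.split("$")
    # parts[0] is empty because the record starts with '$'.
    if len(parts) != 4 or parts[0] != "" or parts[1] not in ("2", "2a"):
        raise ValueError("not a Blowfish-based record")
    cost = int(parts[2])
    return {
        "variant": parts[1],
        "cost": cost,            # base-2 logarithm of the round count
        "rounds": 2 ** cost,     # always a power of two
        "rest": parts[3],        # salt and hash, left undecoded here
    }

# Hypothetical record: the trailing text is filler, not a real salt or hash.
info = parse_blowfish_record("$2a$12$" + "x" * 53)
assert info["rounds"] == 4096
```

Raising the cost by one doubles the keying work, so the stored value can be increased over time as hardware improves while old records remain verifiable.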

crypt(3) under Linux

The GNU C Library used by almost all Linux distributions provides an implementation of the crypt function which can transparently manage both the traditional DES-based and MD5 hashing algorithms.
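
One plausible way for a single crypt() entry point to handle several schemes transparently is to dispatch on the leading characters of the stored record. The Python sketch below is not the GNU C Library's code; the function name is hypothetical, it covers every prefix mentioned in this article rather than only the two schemes named above, and the 13-character length test for traditional DES records is an assumption.

```python
def identify_scheme(record: str) -> str:
    """Guess the hashing scheme from the leading characters of a stored record."""
    if record.startswith("$1$"):
        return "MD5-based"
    if record.startswith(("$2$", "$2a$")):
        return "Blowfish-based"
    if record.startswith("_"):
        return "BSDi extended DES"
    if len(record) == 13:
        # Assumed shape: 2 salt characters followed by 11 hash characters.
        return "traditional DES"
    return "unknown"

# All sample records below are made-up filler, not real hashes.
for sample in ("$1$salt$" + "x" * 22, "$2a$12$" + "x" * 53,
               "_J9..salt" + "x" * 11, "ab" + "x" * 11):
    print(identify_scheme(sample))
```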
