BinHex

Filename extension: .hqx
Internet media types: application/mac-binhex40, application/mac-binhex, application/binhex
Uniform Type Identifier: com.apple.binhex-archive

BinHex, short for "binary-to-hexadecimal", is a binary-to-text encoding system that was used on the classic Mac OS for sending binary files through e-mail. It is similar to Uuencode, but combines both "forks" of a Mac file, along with extended file information, into a single encoded file. BinHexed files take up more space than the originals, but are far less likely to be corrupted in transit.

History

BinHex was originally written by Tim Mann for the TRS-80, as a stand-alone version of an encoding scheme first built into a popular terminal emulator. It worked by converting the binary file contents to hexadecimal numbers, which were themselves encoded as ASCII digits and letters. BinHex files of the era were typically given the file extension .hex. BinHex was used for sending files via major online services such as CompuServe, which were not "8-bit clean", so files required ASCII armoring to survive. CompuServe addressed this problem in the mid-1980s by adding 8-bit clean file transfer protocols, and solutions like BinHex fell out of use.

The file upload problem still existed on CompuServe when the Mac was first released in 1984. William Davis ported BinHex to the Mac using Microsoft BASIC in a simple version that could encode the data fork only, ignoring the resource fork. The rise in use of Internet e-mail coincided roughly with the release of the Macintosh, and Davis's version was posted on the Info-Mac mailing list by Joel Heller in June 1984. Several newer versions were published during 1984, resulting in BinHex 3 that could encode both forks.

Yves Lempereur, author of the first assembler for the Mac, MacASM, found that in order to upload his files to CompuServe he had to use BinHex. The BASIC version was very slow, so he ported it to assembler and released it as BinHex 1.0. The program was roughly a hundred times as fast as the BASIC version, and soon upgrade requests were flooding in.

The original BinHex was a fairly simple format, and not a very efficient one, because it expanded every byte of input into two, as required by the hexadecimal representation (an 8-to-4 bit encoding). For BinHex 2.0, Lempereur used a new 8-to-6 encoding that made the encoded files about a third smaller, and took the opportunity to add a new CRC error-checking routine in place of the earlier checksum. Even though the new encoding was no longer hexadecimal in nature, the established name of the program was retained. The smaller files were incompatible with the older ones, so the extension became .hcx, with c for "compact". Unfortunately, the compact format also had its problems. The 6-bit encoding produced a number of characters that some foreign-language mail programs would convert into local versions, thereby destroying the file. In addition, the file metadata was still placed in the file in plain text, and could therefore be corrupted in the same fashion.
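The size difference between the two schemes follows from simple arithmetic. The following Python sketch counts the output characters each scheme would need for a given input, ignoring line breaks, headers, and checksums. The function names are illustrative and do not come from any BinHex program.

```python
def hex_encoded_length(n_bytes: int) -> int:
    """8-to-4 encoding: each input byte becomes two hexadecimal characters."""
    return 2 * n_bytes


def six_bit_encoded_length(n_bytes: int) -> int:
    """8-to-6 encoding: every 6 bits of input become one output character.

    The final character may carry fewer than 6 data bits, so round up.
    """
    return (n_bytes * 8 + 5) // 6


n = 3000
print(hex_encoded_length(n))      # 6000
print(six_bit_encoded_length(n))  # 4000
```

For the same input, the hexadecimal form is 50% larger than the 6-bit form, or equivalently the 6-bit form is about a third smaller.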

To solve these problems, Lempereur released BinHex 4.0 in 1985, skipping 3.0 to avoid confusion with the now long-dead BASIC version. Version 4.0 carefully selected its character mappings to avoid characters that were translated by mail software, encoded all of the information, including the file metadata, and protected everything with multiple CRCs. The resulting .hqx files were roughly the same size as the .hcx files, but much more robust.

At about the time BinHex 4 was released, most online services started supporting robust 8-bit file transfer protocols such as Zmodem, and the need for ASCII armoring went away. This left a problem on the Mac, however, as the two forks still had to be combined into one file. A team effort among Macintosh communications programmers resulted in MacBinary, which left the contents of the forks in their original 8-bit format and added a simple header for combining them on reception. MacBinary files were thus much smaller than BinHex files. Lempereur released BinHex 5.0, almost identical to 4.0 except that it used MacBinary to combine the forks before running the 8-to-6 encoding, but it saw little use, as he expected.

However, on the Internet, e-mail was still the primary method of moving files. At the time only a few people had access to the Internet, and it was an isolated community unto itself. Years later, when he first got onto the Internet, Lempereur was surprised to find that BinHex 4.0 was still extremely popular. The same ends could be achieved by first using MacBinary or AppleSingle to combine the forks, and then using Uuencode or Base64 on the resulting file, but none of these solutions ever became popular, and BinHex 4.0 survived well into the late 1990s. Various file archives of pre-Mac OS X software are still filled with BinHexed files.

Format

Looking at the contents of a BinHex file, one will notice that it has a message on the first line identifying it as BinHex, followed by many 64-character lines made up of seemingly random letters, numbers, and punctuation marks. Here is a sample of what BinHex actually looks like:

(This file must be converted with BinHex 4.0)

:$f*TEQKPH#jdCA0d,R0TG!"6594%8dP8)3#3"!&m!*!%EMa6593K!!%!!!&mFNa
KG3,r!*!$&[rr$3d,BQPZD'9i,R4PFh3!RQ+!!"AV#J#3!i!!N!@QKUjrU!#3'[q
3"&4&@&483N)f!3#Xaj6bV-H8mJ!!!B3!N!0"!*!$[3#3!cR@iiY)!*!'[I%4!!J
Fp$X%X3@J!mZE6!GRiKUi$HGKMf0U61S46%i1"AB!TI,fLl!d1X3RDDE8ALfTCbM
8UP9p4iUqY-0k4krHpk9XK@`rbj2Ti'U@5rGH@+[fr-i4T6-qXpfl26,k!H5$Nml
TIkI'(l3GI4)f8mII&01CNEbC2LrNLBeaZ1HG@$G8!Z6"k)hh,q9p"r6FC*!!Se"
(ic,Pd(4(b`pflKC`H1&JN5)GVX3mREdH55[l`%`Yhp%q092c`A(hPV)!83Dr&f4
$$L#I1aM-"VjqV-q$34KQq6$M$f8#,Zc,i),!(`*ZN!$K$rS!LA%3cL+dYi"@,K(
Z"`#3!fKi!!!:

At the start of the file there must be a text line, which is used by users and tools to recognize BinHex versions: (This file must be converted with BinHex 4.0)

The rest of the file consists of three parts: a header (containing the file name, size, etc.), a data fork (containing the file data) and a resource fork. Each part has a two-byte CRC checksum.
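As a rough illustration of a two-byte CRC, Python's standard `binascii.crc_hqx` function computes a 16-bit CRC using the CRC-CCITT polynomial, which the Python documentation describes as the one used in the binhex4 format. The sketch below only shows how such a checksum could be appended to a block of bytes and checked again. The initial value of 0, the big-endian byte order, and the helper names `append_crc` and `crc_matches` are assumptions made for illustration; exactly which bytes a real BinHex tool feeds into each CRC is not covered here.

```python
import binascii


def append_crc(block: bytes) -> bytes:
    """Return block followed by a two-byte CRC (hypothetical helper).

    Assumes an initial CRC value of 0 and big-endian byte order.
    """
    crc = binascii.crc_hqx(block, 0)
    return block + crc.to_bytes(2, "big")


def crc_matches(block_with_crc: bytes) -> bool:
    """Recompute the CRC over the payload and compare it with the last two bytes."""
    payload, stored = block_with_crc[:-2], block_with_crc[-2:]
    return binascii.crc_hqx(payload, 0).to_bytes(2, "big") == stored


protected = append_crc(b"example fork contents")
print(crc_matches(protected))   # True

# Flip one bit in the payload to simulate corruption in transit.
corrupted = bytes([protected[0] ^ 0x01]) + protected[1:]
print(crc_matches(corrupted))   # False
```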

Everything after the "(This file ..." line is treated as a block of binary data and encoded as ASCII characters. Every three input bytes are divided into four 6-bit values, much as in Base64. The values 0 to 63 are mapped to characters according to the following list: !"#$%&'()*+,-012345689@ABCDEFGHIJKLMNPQRSTUVXYZ[`abcdefhijklmpqr
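The mapping can be sketched in Python as follows. This is a minimal illustration of the 6-bit character mapping only, using the character list quoted above; it does not parse the header, forks, or CRCs, nor any other processing a real .hqx tool performs. The function names and the zero-bit padding of a final partial group are assumptions of this sketch.

```python
# Character list quoted in the text: value 0..63 -> output character.
TABLE = "!\"#$%&'()*+,-012345689@ABCDEFGHIJKLMNPQRSTUVXYZ[`abcdefhijklmpqr"
assert len(TABLE) == 64


def encode_6bit(data: bytes) -> str:
    """Map bytes to table characters, 6 bits at a time, high bits first."""
    out = []
    acc = 0    # bit accumulator
    nbits = 0  # number of unconsumed bits held in acc
    for byte in data:
        acc = (acc << 8) | byte
        nbits += 8
        while nbits >= 6:
            nbits -= 6
            out.append(TABLE[(acc >> nbits) & 0x3F])
        acc &= (1 << nbits) - 1  # keep only the unconsumed bits
    if nbits:
        # Assumption: pad the final partial group with zero bits.
        out.append(TABLE[(acc << (6 - nbits)) & 0x3F])
    return "".join(out)


def decode_6bit(text: str) -> bytes:
    """Inverse mapping; leftover bits that do not fill a byte are dropped."""
    index = {ch: i for i, ch in enumerate(TABLE)}
    out = bytearray()
    acc = 0
    nbits = 0
    for ch in text:
        acc = (acc << 6) | index[ch]
        nbits += 6
        if nbits >= 8:
            nbits -= 8
            out.append((acc >> nbits) & 0xFF)
            acc &= (1 << nbits) - 1
    return bytes(out)


print(encode_6bit(b"\x00\x00\x00"))             # !!!!
print(encode_6bit(b"\xff\xff\xff"))             # rrrr
print(decode_6bit("$f*TEQKPH#jdCA0d,R0TG!"))    # b'\x0fbinhex.test.sit'
```

The 22 characters in the last call are the ones that follow the opening colon in the sample above. Under this mapping they decode to a byte with the value 15 followed by the 15-character name binhex.test.sit, which suggests the header begins with a length-prefixed file name.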

When encoding, a <return> should be inserted after every 64 characters. After encoding, a colon is placed before and after the data.
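The framing step can be sketched as below. The sketch assumes that the opening colon counts toward the first 64-character line, which is what the sample above suggests, and it uses "\n" as the line break for convenience. The function names `frame` and `unframe` are hypothetical, and the input is assumed to be already-encoded characters.

```python
BANNER = "(This file must be converted with BinHex 4.0)"


def frame(encoded: str, width: int = 64) -> str:
    """Wrap already-encoded characters in colons and break them into lines.

    Assumption: the leading colon is counted as part of the first line.
    """
    body = ":" + encoded + ":"
    lines = [body[i:i + width] for i in range(0, len(body), width)]
    return BANNER + "\n\n" + "\n".join(lines) + "\n"


def unframe(text: str) -> str:
    """Return the characters between the two colons, with line breaks removed."""
    start = text.index(BANNER) + len(BANNER)
    first = text.index(":", start)
    last = text.index(":", first + 1)
    return "".join(text[first + 1:last].split())


framed = frame("!" * 100)
print(framed)
print(unframe(framed) == "!" * 100)   # True
```

Searching for the banner first lets a decoder skip any e-mail headers or other text that precedes the encoded data.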
