Hierarchical File System

From Wikipedia, the free encyclopedia

HFS
Developer Apple Computer
Full name Hierarchical File System
Introduced September 17, 1985 (System 2.1)
Partition identifier Apple_HFS (Apple Partition Map)
0xAF (MBR)
Structures
Directory contents B*-tree
File allocation B*-tree
Bad blocks B*-tree
Limits
Max file size 2 GiB
Max number of files 65535
Max filename length 31 characters
Max volume size 2 TiB
Allowed characters in filenames All 8-bit values except colon ":". Discouraged null and nonprints.
Features
Dates recorded Creation, modification, backup
Date range January 1, 1904 - February 6, 2040
Date resolution 1s
Forks Only 2 (data and resource)
Attributes Color (3 bits, all other flags 1 bit), locked, custom icon, bundle, invisible, alias, system, stationery, inited, no INIT resources, shared, desktop
File system permissions AppleShare
Transparent compression Yes (third-party), Stacker
Transparent encryption No
Supported operating systems Mac OS, Mac OS X

Hierarchical File System (HFS), is a file system developed by Apple Inc. for use in computer systems running Mac OS. Originally designed for use on floppy and hard disks, it can also be found on read-only media such as CD-ROMs. HFS is also referred to as Mac OS Standard (or, erroneously, “HFS Standard”), where its successor, HFS Plus, is also called Mac OS Extended (or, erroneously, “HFS Extended”).

Contents

[edit] History

HFS was introduced by Apple in September 1985 to replace Macintosh File System (MFS), the original file system which had been introduced the year before with the Macintosh computer. Developed by Patrick Dirks and Bill Bruffey, HFS shared a number of design features with MFS that were not available in other file systems of the time (such as DOS's FAT). Files could have multiple forks (normally a data and a resource fork), which allowed program code to be stored separately from resources such as icons that might need to be localised. Files were referenced with unique file IDs rather than file names, and file names could be 255 characters long (although the Finder only supported a maximum of 31 characters).

However MFS was optimised to be used on very small and slow media, namely floppy disks, so HFS was introduced to overcome some of the performance problems that arrived with the introduction of larger media, notably hard drives. The main concern was the time needed to display the contents of a folder. Under MFS all of the file and directory listing information was stored in a single file, which the system had to search to build a list of the files stored in a particular folder. This worked well with a system with a few hundred kilobytes of storage and perhaps a hundred files, but as the systems grew into megabytes and thousands of files, the performance degraded rapidly.

The solution was to replace MFS's directory structure with one more suitable to larger file systems. HFS replaced the flat table structure with the Catalog File which uses a B*-tree structure that could be searched very quickly regardless of size. HFS also re-designed various structures to be able to hold larger numbers, 16-bit integers being replaced by 32-bit almost universally. Oddly, one of the few places this "upsizing" did not take place was the file directory itself, which limits HFS to a total of 64k files.

While HFS is a proprietary file system format, it is well documented so there are usually solutions available to access HFS formatted disks from most modern operating systems.

In 1998, Apple introduced HFS Plus to address inefficient allocation of disk space in HFS and to add other improvements. HFS is still supported by current versions of Mac OS, but starting with Mac OS X an HFS volume cannot be used for booting.

[edit] Design

The Hierarchical File System divides a volume into logical blocks of 512 bytes. These logical blocks are then grouped together into allocation blocks which can contain one or more logical blocks depending on the total size of the volume. HFS uses a 16 bit value to address allocation blocks, limiting the number of allocation blocks to 65,536.

There are five structures that make up an HFS volume:

  1. Logical blocks 0 and 1 of the volume are the Boot Blocks, which contain system startup information. For example, the names of the System and Shell (usually the Finder) files which are loaded at startup.
  2. Logical block 2 contains the Master Directory Block (aka MDB). This defines a wide variety of data about the volume itself, for example date & time stamps for when the volume was created, the location of the other volume structures such as the Volume Bitmap or the size of logical structures such as allocation blocks. There is also a duplicate of the MDB called the Alternate Master Directory Block (aka Alternate MDB) located at the opposite end of the volume in the second to last logical block. This is intended mainly for use by disk utilities and is only updated when either the Catalog File or Extents Overflow File grow in size.
  3. Logical block 3 is the starting block of the Volume Bitmap, which keeps track of which allocation blocks are in use and which are free. Each allocation block on the volume is represented by a bit in the map: if the bit is set then the block is in use; if it is clear then the block is free to be used. Since the Volume Bitmap must have a bit to represent each allocation block, its size is determined by the size of the volume itself.
  4. The Extent Overflow File is a B*-tree that contains extra extents that record which allocation blocks are allocated to which files, once the initial three extents in the Catalog File are used up. Later versions also added the ability for the Extent Overflow File to store extents that record bad blocks, to prevent the file system from trying to allocate a bad block to a file.
  5. The Catalog File is another B*-tree that contains records for all the files and directories stored in the volume. It stores four types of records. Each file consists of a File Thread Record and a File Record while each directory consists of a Directory Thread Record and a Directory Record. Files and directories in the Catalog File are located by their unique Catalog Node ID (or CNID).
    • A File Thread Record stores just the name of the file and the CNID of its parent directory.
    • A File Record stores a variety of metadata about the file including its CNID, the size of the file, three timestamps (when the file was created, last modified, last backed up), the first file extents of the data and resource forks and pointers to the file's first data and resource extent records in the Extent Overflow File. The File Record also stores two 16 byte fields that are used by the Finder to store attributes about the file including things like its creator code, type code, the window the file should appear in and its location within the window.
    • A Directory Thread Record stores just the name of the directory and the CNID of its parent directory.
    • A Directory Record which stores data like the number of files stored within the directory, the CNID of the directory, three timestamps (when the directory was created, last modified, last backed up). Like the File Record, the Directory Record also stores two 16 byte fields for use by the Finder. These store things like the width & height and x & y co-ordinates for the window used to display the contents of the directory, the display mode (icon view, list view, etc) of the window and the position of the window's scroll bar.

[edit] Problems

The Catalog File, which stores all the file and directory records in a single data structure, results in performance problems when the system allows multitasking, as only one program can write to this structure at a time, meaning that many programs may be waiting in queue due to one program "hogging" the system.[1] It is also a serious reliability concern, as damage to this file can destroy the entire file system. This contrasts with other filesystems that store file and directory records in separate structures (such as DOS's FAT file system or the Unix File System), where having structure distributed across the disk means that damaging a single directory is generally non-fatal and the data may possibly be re-constructed with data held in the non-damaged portions.

Additionally, the limit of 65,535 allocation blocks resulted in files having a "minimum" size equivalent 1/65,535th the size of the disk. Thus, any given volume, no matter its size, could only store a maximum of 65,535 files. Moreover, any file would be allocated more space than it actually needed, up to the allocation block size. When disks were small, this was of little consequence, because the individual allocation block size was trivial, but as disks started to approach the 1 GB mark, the smallest amount of space that any file could occupy (a single allocation block) became excessively large, wasting significant amounts of disk space. For example, on a 1 GB disk, the allocation block size under HFS is 16 KB, so even a 1 byte file would take up 16 KB of disk space. This situation was less of a problem for users having large files (such as pictures, databases or audio) because these larger files wasted less space as a percentage of their file size. Users with many small files, on the other hand, could lose a copious amount of space due to large allocation block size. This made partitioning disks into smaller logical volumes very appealing for Mac users, because small documents stored on a smaller volume would take up much less space than if they resided on a large partition. The same problem existed in the FAT16 file system.

[edit] See also

[edit] References

[edit] External links