ISO 9660
From Wikipedia, the free encyclopedia
ISO 9660, a standard published by the International Organization for Standardization (ISO), defines a file system for CD-ROM media. It aims to support different computer operating systems such as Unix, Linux (see CDFS), Windows and Mac OS, so that data can be exchanged between them.
An extension to ISO 9660, the Joliet format, adds support for longer file names and non-ASCII character sets.
DVDs may also use the ISO 9660 file system. However, the UDF file system is more appropriate on DVDs as it has better support for the larger media and is better suited for modern operating system needs.
History
A CD-ROM may be mastered with any kind of data on it. Sun Microsystems, for example, uses the Berkeley UNIX UFS file systems on many CD-ROMs. Silicon Graphics' IRIX installation media uses EFS. Mac OS uses HFS Plus. This restricts them to the producer's operating environment, which, while beneficial in the case of platform-specific software distributions, is not appropriate for widely distributing content. Hence, the need for one volume format that would be accessible on a variety of equipment arose.
Before a standard existed, some producers used the High Sierra format on CD-ROM, which arranges file information in a dense, sequential layout to minimise nonsequential access. The High Sierra file system format uses a hierarchical tree arrangement (up to eight levels of directories deep), similar to UNIX and FAT. High Sierra has a minimal set of file attributes (directory or ordinary file, and time of recording) and name attributes (name, extension, and version). The designers realised they could never get people to agree on a unified definition of file attributes, so only the minimum common information was encoded, and a place for future optional extensions (the system use area) was defined for each file.
High Sierra was adopted in December 1986 (with changes) as an international standard by Ecma International as ECMA-119 [1] and submitted for fast-tracking to the International Organization for Standardization, where it was eventually accepted as ISO 9660:1988. The ISO 9660 file system format is now used throughout the industry.
Specifications
CD-ROM Specifications
The smallest entity in the CD format is called a frame, and holds 24 bytes. Data in a CD-ROM is organized in frames and sectors. A CD-ROM sector contains 98 frames, and holds 2352 bytes.
CD-ROM Mode 1, usually used for computer data, divides the 2352-byte data area defined by the Red Book standards into 12 bytes of synchronization information, 4 bytes of header data, 2048 bytes of user data and 288 bytes of error correction and detection codes. These codes allow read errors to be detected and corrected, which matters for data such as executables, where even a small error can make the content unusable.
CD-ROM Mode 2 Form 1, usually used for computer data, has the same user data and error correction as Mode 1, but with a slightly different layout. Its use is not recommended for compatibility reasons. [2]
CD-ROM Mode 2 Form 2, intended to be used for error-tolerant data such as audio and video, divides the 2352 bytes into 12 bytes of synchronization information, 4 bytes of header data and 2336 bytes of user data. Mode 2 provides 14% more user data space than Mode 1 by omitting error correction, since a read error in audio or video will only cause a small flaw which may not even be detectable to humans. Video CDs are classified as Mode 2 Form 2.
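The byte counts in this section can be checked with a few lines of arithmetic. The following Python sketch uses only the figures quoted above; the constant and function names are illustrative and not taken from any standard.

```python
# Sector arithmetic using only the figures quoted in this section.
FRAME_BYTES = 24
FRAMES_PER_SECTOR = 98
SECTOR_BYTES = FRAME_BYTES * FRAMES_PER_SECTOR  # 98 frames of 24 bytes

# Mode 1 layout: sync + header + user data + error detection/correction.
MODE1_LAYOUT = {"sync": 12, "header": 4, "user_data": 2048, "edc_ecc": 288}
# Layout without error correction, with the figures as given in the text.
MODE2_LAYOUT = {"sync": 12, "header": 4, "user_data": 2336}


def extra_capacity_percent(layout_a, layout_b):
    """Percentage by which layout_a's user data exceeds layout_b's."""
    a = layout_a["user_data"]
    b = layout_b["user_data"]
    return 100.0 * (a - b) / b


if __name__ == "__main__":
    print(SECTOR_BYTES)
    print(sum(MODE1_LAYOUT.values()), sum(MODE2_LAYOUT.values()))
    print(extra_capacity_percent(MODE2_LAYOUT, MODE1_LAYOUT))
```

Both layouts add up to the full sector size, and the extra capacity works out to roughly the 14% quoted above.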
ISO 9660 Specifications
The first 32768 bytes of the disk are unused by ISO 9660 data structures and are therefore available for other use. For example, a CD-ROM may contain an alternative file system descriptor in this area; hybrid CDs often use it to offer Mac OS-specific content.
Immediately afterwards, a series of volume descriptors details the contents and kind of information contained on the disk (similar to the BIOS parameter block used by FAT and NTFS formatted disks).
A volume descriptor describes the characteristics of the file system information present on a given CD-ROM, or volume. It is divided into two parts: the type of volume descriptor, and the characteristics of the descriptor.
The volume descriptor is constructed in this manner so that if a program reading the disk does not understand a particular descriptor, it can just skip over it until it finds one it recognises, thus allowing the use of many different types of information on one CD-ROM. Also, if an error were to render a descriptor unreadable, a subsequent redundant copy of a descriptor could then allow for fault recovery.
An ISO 9660 compliant disk contains at least a primary descriptor describing the ISO 9660 file system and a terminating descriptor for indicating the end of the descriptor sequence. Joliet and UDF are examples of file systems adding more descriptors to this sequence.
The primary volume descriptor acts much like the superblock of the Unix File System, providing details on the ISO 9660 compliant portion of the disk. Contained within the primary volume descriptor is the root directory record, which describes the location of the contiguous root directory. (As in UNIX, directories appear as files reserved for the operating system's special use.) The root directory is stored as an extent, or sequential series of sectors, that contains each of the directory entries appearing in the root, stored one after another. Evaluation of ISO 9660 filenames begins at this location. In addition, since ISO 9660 works by segmenting the CD-ROM into logical blocks, the size of these blocks is found in the primary volume descriptor as well.
The first field in a Volume Descriptor is the Volume Descriptor Type (type), which can have the following values:
- Number 0: shall mean that the Volume Descriptor is a Boot Record
- Number 1: shall mean that the Volume Descriptor is a Primary Volume Descriptor
- Number 2: shall mean that the Volume Descriptor is a Supplementary Volume Descriptor
- Number 3: shall mean that the Volume Descriptor is a Volume Partition Descriptor
- Number 255: shall mean that the Volume Descriptor is a Volume Descriptor Set Terminator.
The second field is called the Standard Identifier and is set to CD001 for a CD-ROM compliant with the ISO 9660 standard.
Another notable field is the Volume Space Size, which records the size of the volume as a number of logical blocks.
File attributes are very simple in ISO 9660. The most important attribute indicates whether the entry is a directory or an ordinary file. File attributes for the file described by a directory entry are stored in the directory entry and, optionally, in the extended attribute record.
There are two ways to locate a file on an ISO 9660 file system. One way is to successively interpret the directory names and look through each directory file structure to find the file (much the way MS-DOS and UNIX work to find a file). The other way is to use a precompiled path table, in which the directories on the disk are enumerated one after another in a single linear structure. Some systems do not have a mechanism for wandering through directories, and they obtain a match by consulting the table.
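The descriptor sequence described above can be walked with a short loop. The following Python sketch is an illustration rather than a complete reader: it assumes 2048-byte sectors, and the byte offset used for the Volume Space Size (80) comes from the ECMA-119 layout rather than from this article. The function names are hypothetical.

```python
import struct

SECTOR_SIZE = 2048        # assumed sector size for the descriptor area
SYSTEM_AREA_SECTORS = 16  # the first 32768 bytes are unused by ISO 9660

DESCRIPTOR_TYPES = {
    0: "Boot Record",
    1: "Primary Volume Descriptor",
    2: "Supplementary Volume Descriptor",
    3: "Volume Partition Descriptor",
    255: "Volume Descriptor Set Terminator",
}


def scan_volume_descriptors(image: bytes):
    """Return (type_code, type_name) for each descriptor up to the terminator."""
    found = []
    offset = SYSTEM_AREA_SECTORS * SECTOR_SIZE
    while offset + SECTOR_SIZE <= len(image):
        sector = image[offset:offset + SECTOR_SIZE]
        type_code = sector[0]            # first field: descriptor type
        if sector[1:6] != b"CD001":      # second field: standard identifier
            break                        # not an ISO 9660 descriptor
        found.append((type_code, DESCRIPTOR_TYPES.get(type_code, "unknown")))
        if type_code == 255:             # set terminator ends the sequence
            break
        offset += SECTOR_SIZE
    return found


def volume_space_size(primary_descriptor: bytes) -> int:
    """Read the little-endian copy of the Volume Space Size (assumed offset 80)."""
    return struct.unpack_from("<I", primary_descriptor, 80)[0]
```

A reader that does not recognise a descriptor type can simply move on to the next sector, which is how the sketch treats any type it labels "unknown".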
While a large linear table seems a bit arcane, it can be of great value, as one can quickly search without wandering across the disk (thus reducing seek time).
All multi-byte values are stored twice, in little-endian and big-endian format, either one-after-another in what the specification calls "both-endian format", or in duplicated data structures such as the path table. It is therefore theoretically possible to author an ISO-9660 image which delivers different content on different architectures.
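As an illustration of the "both-endian" layout, the following Python sketch encodes and decodes a 32-bit value stored as a little-endian copy followed by a big-endian copy. The function names are hypothetical, and raising an error when the two copies disagree is a choice made for this sketch, not something the text above requires.

```python
def encode_both_endian_32(value: int) -> bytes:
    """Store a 32-bit value as little-endian bytes followed by big-endian bytes."""
    return value.to_bytes(4, "little") + value.to_bytes(4, "big")


def decode_both_endian_32(field: bytes) -> int:
    """Decode an 8-byte both-endian field, checking that both copies agree."""
    if len(field) != 8:
        raise ValueError("a both-endian 32-bit field is 8 bytes long")
    little = int.from_bytes(field[:4], "little")
    big = int.from_bytes(field[4:], "big")
    if little != big:
        # An image with differing copies would read differently depending
        # on which half a given implementation chooses to trust.
        raise ValueError("little-endian and big-endian copies differ")
    return little
```

A reader that only ever looks at one half would never notice a mismatch, which is what makes architecture-dependent images theoretically possible.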
Restrictions
File and directory name restrictions
There are different levels to this standard.
- Level 1: File names are limited to eight characters with a three-character extension, using upper case letters, numbers and underscore only. The maximum depth of directories is eight.
- Level 2: File names are not limited to the 8.3 format (11 characters) but may be as long as the one-byte length counters in the directory entry allow (the record length and the file name length). Typically, this is close to 180 characters, depending on how many extended attributes are present.
- Level 3: Files are allowed to be non-contiguous, i.e. fragmented (principally to allow packet writing or incremental CD recording).
Other name restrictions:
- All levels restrict filenames to upper case letters, digits, underscores ("_"), and a dot. Linux converts uppercase letters to lower case while mounting ISO filesystems.
- File names shall not include spaces.
- File names shall not start or end with the dot character.
- File names shall not have more than one dot.
- Directory names shall not use dots at all.
Some CD authoring applications allow the user to use almost any character. Whilst, strictly speaking, this does not conform to the ISO 9660 standard, most operating systems which can read ISO 9660 file systems have no problem with out-of-spec names. However, the names may appear wrong to the user.
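The Level 1 rules listed above can be expressed as a small validator. The following Python sketch checks only the restrictions stated in this section (character set, 8.3 length, dot placement, no dots in directory names) and is not a full implementation of the standard's naming rules; the function name is hypothetical.

```python
import re

# Upper case letters, digits and underscore: up to 8 characters, then an
# optional dot followed by up to 3 characters.
_LEVEL1_FILE = re.compile(r"[A-Z0-9_]{1,8}(\.[A-Z0-9_]{1,3})?")
# Directory names use the same characters but may not contain a dot.
_LEVEL1_DIR = re.compile(r"[A-Z0-9_]{1,8}")


def is_valid_level1_name(name: str, is_directory: bool = False) -> bool:
    """Check a name against the Level 1 restrictions listed in this section."""
    pattern = _LEVEL1_DIR if is_directory else _LEVEL1_FILE
    return pattern.fullmatch(name) is not None
```

An authoring tool could use a check like this to warn about out-of-spec names before writing an image.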
Directory depth limit
The restrictions on filename length and directory depth (8 levels, including the root directory) are a more serious limitation of the ISO 9660 file system. Many CD authoring applications attempt to get around this by truncating filenames automatically, but do so at the risk of breaking applications that rely on a specific file structure.
The 4 GiB (or 2 GiB depending on implementation) file size limit
All numbers in ISO 9660 file systems except the single byte value used for the GMT offset are unsigned numbers. As the length of a file's extent on disk is stored in a 32-bit value [3], it allows for a maximum length of 4 GiB. (Note: Some older operating systems may handle such values incorrectly, treating them as signed instead of unsigned, which makes it impossible to access files larger than 2 GiB in size.)
Based on this, it is often assumed that a file on an ISO 9660 formatted disc cannot be larger than 2^32 - 1 bytes, as the file's size is stored in an unsigned 32-bit value, for which 2^32 - 1 is the maximum.
It is, however, possible to circumvent this limitation by using the multi-extent (fragmentation) feature of ISO 9660 Level 3. With this, files larger than 4 GiB can be split up into multiple extents (sequential series of sectors), each not exceeding the 4 GiB limit. For example, the free software mkisofs as well as Roxio Toast are able to create ISO 9660 filesystems that use multi-extent files to store files larger than 4 GiB on appropriate media such as recordable DVDs.
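The splitting of a large file into extents can be sketched as follows. This Python example is illustrative only: capping each extent at the largest multiple of 2048 bytes that fits in an unsigned 32-bit value is an assumption of the sketch, not a documented behaviour of mkisofs or Roxio Toast, and the function name is hypothetical.

```python
SECTOR_SIZE = 2048
# Largest sector-aligned length that fits in an unsigned 32-bit field
# (an assumption of this sketch).
MAX_EXTENT_BYTES = ((2**32 - 1) // SECTOR_SIZE) * SECTOR_SIZE


def split_into_extents(file_size: int, max_extent: int = MAX_EXTENT_BYTES):
    """Return the extent lengths needed to store a file of file_size bytes."""
    if file_size < 0:
        raise ValueError("file size cannot be negative")
    extents = []
    remaining = file_size
    while remaining > max_extent:
        extents.append(max_extent)   # a full extent at the 32-bit cap
        remaining -= max_extent
    extents.append(remaining)        # the final (possibly short) extent
    return extents
```

For instance, a 5 GiB file needs two extents under this scheme, while any file below the cap still occupies a single extent.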
Empirical tests with a 4.2 GiB fragmented file on DVD media have shown that Microsoft Windows XP supports this, while Mac OS X (as of 10.4.8) does not handle this case properly. In the case of Mac OS X, the driver appears not to support file fragmentation at all (i.e. it only supports ISO 9660 Level 2 but not Level 3). Linux supports multiple extents [4]; FreeBSD only shows and reads the last extent of a multi-extent file.
Limit on number of directories
There is another, less well known, limitation. The ISO image contains a structure called the path table. For each directory in the image, the path table provides the identifier of its parent directory. The problem is that the directory identifier is a 16-bit number, limiting its range from 1 to 65,535 [5]. The content of each directory is also written in a different place, so the path table is redundant and intended only for fast searching. Some operating systems (Windows) use it, while others (Linux) do not. If an ISO image or disk contains more than 65,535 directories, it will be readable in Linux, while in a Windows environment all files from the additional directories will be visible but empty (zero length). A popular application using the ISO format, mkisofs, aborts if there is a path table overflow. Nero Burning ROM (for Windows) does not check whether the problem occurs, and will produce an invalid ISO file or disk without warning. Also, isovfy cannot easily report this problem. No other 16-bit number in the ISO format causes such a limitation.
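The overflow can be illustrated by packing a parent directory number into a 16-bit field. The following Python sketch is a hypothetical helper, not part of any authoring tool; it shows only the little-endian form of the field and refuses values outside the range given above.

```python
import struct

MAX_DIRECTORY_NUMBER = 0xFFFF  # largest value a 16-bit field can hold


def pack_parent_directory_number(parent: int) -> bytes:
    """Pack a parent directory number as a 16-bit little-endian field."""
    if not 1 <= parent <= MAX_DIRECTORY_NUMBER:
        # Directories beyond this range cannot be referenced as parents.
        raise OverflowError(
            f"directory number {parent} does not fit in 16 bits"
        )
    return struct.pack("<H", parent)
```

A tool that performs a check like this can abort on overflow instead of silently writing a truncated identifier.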
Multisession support
ISO 9660 is by design a read-only, pre-mastered file system. This means that all the data has to be written to the medium in one go. Once written, there is no provision for altering the stored content. ISO 9660 is therefore not suitable for random-writable media such as hard disks.
Recordable CD media (CD-R) provides for multiple session writing. This means that data can be written to disc and made accessible, then later more data can be added to the disc as long as there is unused space left on the disc. (CD-Rs are Write Once media, so they do not support erasing or overwriting data once written.)
The Multisession extension to ISO 9660 makes use of this feature, by defining a rule for operating systems as to how to read an ISO 9660 volume from a CD-R. Instead of looking for the volume descriptor at offset 32768 (block number 16 on a CD) from the start of the disc, it starts reading from the 16th block in the first track of the latest session. Block numbers form a contiguous sequence starting at the first session, and continuing over added sessions and their gaps.
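The lookup rule above amounts to a simple offset calculation. The following Python sketch assumes 2048-byte blocks, consistent with offset 32768 being block 16; the function name and the example session start block are hypothetical.

```python
BLOCK_SIZE = 2048       # assumed block size (32768 / 16)
DESCRIPTOR_BLOCK = 16   # descriptors start at the 16th block of the track


def volume_descriptor_offset(session_start_block: int = 0) -> int:
    """Byte offset of the first volume descriptor for a given session.

    session_start_block is the block number of the first track of the
    session to read (0 for a single-session disc).
    """
    return (session_start_block + DESCRIPTOR_BLOCK) * BLOCK_SIZE


if __name__ == "__main__":
    print(volume_descriptor_offset())        # single-session disc
    print(volume_descriptor_offset(11400))   # hypothetical later session
```

Because block numbers continue across sessions, the same calculation works for any session once the start block of its first track is known.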
Hence, if a CD mastering program wants to add a single file to a CD-R that has an ISO 9660 volume, it has to append a session containing at least an updated copy of the entire directory tree, plus the new file. The duplicated directory entries can still reference the data files in the previous session(s).
In a similar way, file data can be updated or even removed. Removal is, however, only virtual: the removed content does not appear any more in the directory shown to the user, but it is still physically present on the disc. It can therefore be recovered, and it takes up space (such that the CD will become full even though appearing to still have unused space).
ISO 9660:1999
ISO 9660:1999 is the latest update to the ISO 9660 standard. It improves on the restrictions imposed by the older standard, by extending the maximum path length to 207 characters, removing the eight-level maximum directory nesting limit, and removing the special meaning of the dot character in filenames.
Disc images
ISO 9660 file system images (ISO images) are a common way to electronically transfer the contents of CD-ROMs. They often have the filename extension .iso (.iso9660 is less common, but also in use) and are commonly referred to as "ISOs". An .iso file may be:
- A single ISO 9660 file system image
- A multi-track disc image with a table of contents
Extensions
There are common extensions to ISO 9660 that address its limitations. Rock Ridge supports the preservation of Unix-style permissions and longer ASCII-coded names; Joliet supports names stored in Unicode, thus allowing almost any character to be used, even from non-Latin scripts; El Torito enables CDs to be bootable on PCs; Apple ISO 9660 Extensions adds support for Mac OS-specific file characteristics such as resource forks, file backup date and more.
ISO 13490 is basically ISO 9660 with multisession support.
For operating systems which do not support any extensions, there is a name translation file, TRANS.TBL. It should be located in each directory, including the root directory. This mechanism is now obsolete.
Operating system support
Most operating systems support reading of ISO 9660 formatted discs, and most new versions support the extensions such as Rock Ridge and Joliet. Operating systems that do not support the extensions usually show the basic (non-extended) features of a plain ISO 9660 disc.
Here are some operating systems and their support for ISO 9660 and extensions:
- DOS: access with extensions, such as MSCDEX.EXE (Microsoft CDROM Extension) or CORELCD.EXE
- Microsoft Windows 95, Windows 98, Windows ME: can read ISO 9660 Level 1, 2, 3, and Joliet
- Microsoft Windows NT 4, Windows 2000
- Windows XP can read ISO 9660 Level 1, 2, 3, Joliet, and ISO 9660:1999
- Linux and BSD: ISO 9660 Level 1, 2, 3, Joliet, Rock Ridge, and ISO 9660:1999
- GS/OS: ISO Level 1 and 2 support via the HS.FST File System Translator. [6]
- Mac OS 7 to 9: ISO Level 1, 2. Optional free software supports Rock Ridge and Joliet (including ISO Level 3): Joke Ridge and Joliet Volume Access.
- Mac OS X 10.2 Jaguar, 10.3 Panther, 10.4 Tiger: ISO Level 1, 2, Joliet and Rock Ridge Extensions. Level 3 is not currently supported, although users have been able to mount these disks.
- AmigaOS supports the "AS" extensions (which preserve the Amiga protection bits and file comments)
References
- [1] Volume and File Structure of CDROM for Information Interchange. Ecma International (December 1987).
- [2] Media Sciences - Mode and Form differences
- [3] ECMA-119 9.1.4
- [4] kern/95222: File sections on ISO9660 [sic] level 3 CDs ignored
- [5] ECMA-119 6.9
- [6] The Virtual GS: Using ISO disk images in Apple II emulators. Juiced.GS Volume 9, Issue 2 (May 2004).
External links
- ECMA-119 This is the ECMA release of the ISO 9660:1988 standard, available as a free download.
- Technical information on ISO 9660:1999
- iat – ISO 9660 Analyzer Tool
- ISO 9660 Specifications
- Description of data structures in ISO-9660
- CD Recording FAQ
- Media Sciences - Book types and compatibility, Multisession
- Mode 1 and 2:
- Sony Storage Support - What are CD-ROM Mode-1, Mode-2 and XA?
- Media Sciences - Varieties of Mode 2
- DivXLand - Mode 2 explanation and creation tools