Hard link

From Wikipedia, the free encyclopedia

In computing, a hard link is a directory entry that associates a name with a file on a file system. (A directory is itself a special kind of file that contains a list of such entries.) The term is used in file systems which allow multiple hard links to be created for the same file. This has the effect of creating multiple names for the same file, causing an aliasing effect: e.g. if the file is opened by one of its names, and changes are made to its content, then these changes will also be visible when the file is opened by an alternative name. By contrast, a soft link on such file systems is not a link to a file itself, but to a file name. This also creates aliasing, but in a different way.
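The difference between the two kinds of aliasing can be sketched in a few lines of Python. This is only an illustration for a POSIX-like system; the file names and the function demo_aliasing are invented for the example. It relies on the standard-library calls os.link, os.symlink and os.unlink.

```python
import os
import tempfile


def demo_aliasing():
    """Contrast a hard link with a symbolic link (all names are hypothetical)."""
    with tempfile.TemporaryDirectory() as d:
        original = os.path.join(d, "original.txt")
        hard = os.path.join(d, "hard.txt")
        soft = os.path.join(d, "soft.txt")

        with open(original, "w") as f:
            f.write("v1")

        os.link(original, hard)     # a second name for the same file
        os.symlink(original, soft)  # a link to the *name* original.txt

        # A change made through one name is visible through the others.
        with open(hard, "w") as f:
            f.write("v2")
        with open(original) as f:
            via_original = f.read()
        with open(soft) as f:
            via_soft = f.read()

        # Removing the original name leaves the hard link usable, but the
        # symbolic link now refers to a name that no longer exists.
        os.unlink(original)
        with open(hard) as f:
            via_hard_after = f.read()
        soft_resolves = os.path.exists(soft)  # exists() follows the symlink

        return via_original, via_soft, via_hard_after, soft_resolves
```

While the original name exists, both links show the updated content; once that name is removed, only the hard link still leads to the data.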

Directories are files, so multiple hard links to directories are possible; however, their unrestricted creation is usually avoided, because of the cyclic structures this may create.

Hard links (that is, multiple directory entries to the same file) are supported by POSIX-compliant and partially POSIX-compliant operating systems, such as GNU/Linux, Android, Apple's Mac OS X, and Windows NT4[1] and later Windows NT operating systems.

Support also depends on the type of file system being used. For instance, the NTFS file system supports hard links, while FAT and ReFS do not.

Usage

On POSIX-compliant and partially POSIX-compliant operating systems, such as all Unix-like systems, additional hard links to existing files are created with the link() system call, or the ln and link command-line utilities. The stat command can reveal how many hard links point to a given file. The link count is also included in the output of ls -l.

On Microsoft Windows, hard links can be created using the mklink /H command on Windows NT 6.0 and later systems (such as Windows Vista), and on earlier systems (Windows XP, Windows Server 2003) using fsutil.exe hardlink create. The Windows API from Windows 2000 onwards includes a CreateHardLink() call to create hard links; DeleteFile() is used to remove them, and GetFileInformationByHandle() can be used to determine the number of hard links associated with a file.[2] Hard links require an NTFS partition. Starting with Windows Vista, hard links are used by the Windows Component Store (WinSxS) to keep track of different versions of DLLs stored on the hard disk drive. Unix-like emulation or compatibility software running on Windows, such as Cygwin and Subsystem for UNIX-based Applications, allows the use of POSIX interfaces under Windows.

The process of unlinking dissociates a name from the data on the volume without destroying the associated data. The data remain accessible as long as at least one link that points to them still exists. When the last link is removed, the space is considered free.[3] A process ambiguously called undeleting allows the recreation of links to data that are no longer associated with a name. However, this process is not available on all systems and is often not reliable.

Link counter

Most file systems that support hard links use reference counting. An integer value is stored with each physical data section. This integer represents the total number of links that have been created to point to the data. When a new link is created, this value is increased by one. When a link is removed, the value is decreased by one. If the link count becomes zero, the operating system usually automatically deallocates the data space of the file if no process has the file opened for access. Maintaining this count helps prevent data loss, because the data are not released while any name still refers to them. It is also a simple method for the file system to track the use of a given area of storage, as zero values indicate free space and nonzero values indicate used space.

On POSIX-compliant operating systems, such as many Unix-variants, the reference count for a file or directory is returned by the stat() or fstat() system calls in the st_nlink field of struct stat.
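As a rough illustration of the counter described above, the following Python sketch reads st_nlink before and after creating and removing a link. The function name link_counts and the file names are made up for the example, and a POSIX-like file system that supports hard links is assumed.

```python
import os
import tempfile


def link_counts():
    """Record st_nlink as a second name is added and the first is removed."""
    with tempfile.TemporaryDirectory() as d:
        a = os.path.join(d, "a.txt")  # hypothetical file names
        b = os.path.join(d, "b.txt")

        with open(a, "w") as f:
            f.write("data")
        counts = [os.stat(a).st_nlink]      # one name so far

        os.link(a, b)                       # add a second name
        counts.append(os.stat(a).st_nlink)

        # Both names resolve to the same inode on the same device.
        same_file = (os.stat(a).st_ino == os.stat(b).st_ino
                     and os.stat(a).st_dev == os.stat(b).st_dev)

        os.unlink(a)                        # remove the first name
        counts.append(os.stat(b).st_nlink)

        return counts, same_file
```

On a typical Unix-like system the recorded counts are 1, 2 and 1; these are the same numbers reported by stat and by ls -l.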

Example

In the figure to the right, two hard links, named "LINK A.TXT" and "LINK B.TXT", point to the same physical data.

If the file "LINK A.TXT" is opened in an editor, modified and saved, then those changes will be visible if the file "LINK B.TXT" is then opened for viewing, since both filenames point to the same data. (The word "opened" matters because, on POSIX systems, an associated file descriptor remains valid after opening, even when the original file is moved.) The same is true if the file is opened as "LINK B.TXT", or under any other name associated with the data.

Some editors, however, break the hard link concept; Emacs is one example. When opening a file "LINK B.TXT" for editing, Emacs first renames "LINK B.TXT" to "LINK B.TXT~", loads "LINK B.TXT~" into the editor, and saves the modified contents to a newly created "LINK B.TXT". Using this approach, the two hard links are now "LINK A.TXT" and "LINK B.TXT~" (the backup file); "LINK B.TXT" now has just one link and no longer shares the same data as "LINK A.TXT". (This behavior can be changed using the Emacs variable backup-by-copying.)
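The effect of the two backup strategies can be modelled with a short Python sketch. This is a simplified model of "rename, then write a new file" versus "copy, then overwrite in place", not the editor's actual implementation; the functions save_with_backup and demo_backup and the file names are hypothetical.

```python
import os
import shutil
import tempfile


def save_with_backup(path, new_text, by_copying):
    """Save new_text to path, keeping the old contents in path + '~'."""
    backup = path + "~"
    if by_copying:
        shutil.copyfile(path, backup)  # backup is a new file; path keeps its data
    else:
        os.rename(path, backup)        # backup takes over the old data
    with open(path, "w") as f:         # creates a new file if path was renamed
        f.write(new_text)


def demo_backup(by_copying):
    with tempfile.TemporaryDirectory() as d:
        a = os.path.join(d, "link_a.txt")
        b = os.path.join(d, "link_b.txt")
        with open(a, "w") as f:
            f.write("old")
        os.link(a, b)                  # a and b name the same file

        save_with_backup(b, "new", by_copying)

        with open(a) as f:
            seen_via_a = f.read()
        return seen_via_a, os.path.samefile(a, b)
```

With by_copying=False the edit is not visible through the first name and the two names no longer refer to the same file; with by_copying=True the link is preserved.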

Any number of hard links to the physical data may be created. To access the data, a user only needs to specify the name of any existing link; the operating system will resolve the location of the actual data.

If one of the links is removed with the POSIX unlink function (for example, with the UNIX rm command), then the data are still accessible through any other link that remains. If all of the links are removed and no process has the file open, then the space occupied by the data is freed, allowing it to be reused in the future. These semantics allow an open file to be deleted without affecting the process that uses it. This technique is commonly used to ensure that temporary files are deleted automatically on program termination, including the case of abnormal termination.
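The temporary-file technique can be sketched as follows in Python, assuming a POSIX system (on Windows, removing a file that is still open normally fails). The function name read_after_unlink and the file name are invented for the example.

```python
import os
import tempfile


def read_after_unlink():
    """Remove the only name of an open file, then keep using the open handle."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "scratch.tmp")  # hypothetical file name
        with open(path, "w+") as f:
            f.write("still here")
            f.flush()

            os.unlink(path)                    # the last name is gone
            name_exists = os.path.exists(path)

            # The data stay allocated while this process holds the file open.
            f.seek(0)
            content = f.read()
        # Closing the file releases the last reference; the space can be reused.
        return name_exists, content
```

Because the name is removed immediately, no stray file is left behind even if the program later terminates abnormally.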

Limitations of hard links

To prevent endless recursion, most modern operating systems do not allow hard links on directories. In addition, hard links on directories would lead to inconsistency on parent directory entries. A notable exception to this is Mac OS X v10.5 (Leopard) and newer, which use hard links on directories for the Time Machine backup mechanism only. Symbolic links and NTFS junction points are generally used instead for this purpose.

Hard links can only be created to files on the same volume. If a link to a file on a different volume is needed, it may be created with a symbolic link.

The maximum number of hard links to a single file is limited by the size of the reference counter: with NTFS this is limited to 1023 because a 10-bit field is used for this purpose. On Unix-like systems the counter is usually machine-word-sized (32- or 64-bit: 4,294,967,295 or 18,446,744,073,709,551,615 links, respectively), though in some filesystems such as btrfs the number of hard links is limited more strictly by their on-disk format.[4] As of Linux 3.11, the ext4 filesystem limits the number of hard links on a file to 65,000.[5]
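The counter-size figures quoted above follow from the largest value an unsigned n-bit field can hold, 2^n - 1. A minimal check in Python (the function name max_links is made up for the example):

```python
def max_links(counter_bits):
    """Largest value that fits in an unsigned counter of the given width."""
    return (1 << counter_bits) - 1


ntfs_limit = max_links(10)    # 10-bit field
word32_limit = max_links(32)  # 32-bit machine word
word64_limit = max_links(64)  # 64-bit machine word
```

Limits such as the ext4 figure of 65,000 do not follow this pattern; they appear to be fixed by the file system's implementation or on-disk format rather than by the counter width alone.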

See also

  • Fat link
  • Symbolic link (or soft link), which, unlike a hard link, points to a file name rather than to the file data itself.
    • NTFS junction point, the NTFS implementation
    • alias (Mac OS), a method for linking files introduced in Mac OS System 7 and still available in Mac OS X which is in some ways similar to a symbolic link. Note that true symbolic links are also available in OS X.
    • shadow (OS/2), the OS/2 implementation
  • Firm link, which is between a hard link and a soft link, used in the GNU Hurd operating system.
  • ln (Unix): the ln command, which is used to create new links on Unix-like systems.
  • freedup: the freedup command, which frees up disk space by replacing duplicate data stores with automatically generated hard links.

Notes

  1. "Link Shell Extension". 
  2. "NTFS Hard Links, Directory Junctions, and Windows Shortcuts". flexhex.com. 
  3. "Freeware to find and delete hard links". 
  4. "Linux kernel source tree, fs/ext4/ext4.h, line 229". 
This article is issued from Wikipedia. The text is available under the Creative Commons Attribution/Share Alike; additional terms may apply for the media files.