Garbage Collected Filesystem
From Wikipedia, the free encyclopedia
This article is orphaned as few or no other articles link to it. Please help introduce links in articles on related topics. (September 2006) |
In computer science, a Garbage Collected Filesystem is a computer filesystem which uses garbage collection to free disk space when files or directories are deleted. Unlike traditional Unix-derived filesystems hard links to directories are allowed.
Contents |
[edit] Data & Metadata
Files and directories consist of two parts: data and metadata. In this case, data is the contents of the file and metadata is data about the file or directory such as file size, owner, and permissions.
[edit] Names and Nodes
The filesystem can be considered to be a graph. Files and directories are nodes on that graph, and names are arrows from directory nodes to file nodes or other directory nodes.
Operations that operate on names are mknod (create new node linked to by this name), open (get node from existing name), unlink (remove name from directory). Operations that operate on nodes are anything that accesses any file data or metadata (most other operations). [These operation names are not the same as unix system calls of the same name].
The special case is the link operation. On traditional systems, it takes two names and creates a link between the two. However since the first name already exists, a version that takes a node can be conceived, and should be provided.
[edit] Overview of Semantics
Local semantics are preserved from traditional hierarchical filesystems under most cases (an exception is .. which is ambiguous). Hard links exist in other filesystems: UNIX has always had them, and Windows NT has had them from at least version 4. These filesystems do not allow hard links to directories, only to files. In a garbage collected filesystem, it is always allowed to hard link directories, even if loops will be created. If /sandbox/foo and /sandbox/bar are two links to the same directory, then any /sandbox/foo/quux is also /sandbox/bar/quux. In addition, directory loops can be created. /sandbox/foo/baz can be created as a hard link to /sandbox/bar. /sandbox/foo/baz/baz/baz/quux is now the same as /sandbox/foo/quux.
Unlinking /sandbox/bar disturbs nothing: even /sandbox/foo/baz/quux works. If /sandbox/foo is unlinked the directory still exists, but it cannot be reached because all names and links to it have been removed. Also, quux is unreachable because its parent (/sandbox/foo or /sandbox/bar) is unreachable. The garbage collector will (when it next runs) find all unreachable nodes and convert them to free disk space.
[edit] See also
[edit] Compatibility
Any algorithm for performing actions on a filesystem by recursively expanding directories will fail if the filesystem contains directory loops (the algorithm will enter an infinite loop.