Versioning file system

A versioning file system is a computer file system that allows a file to exist in several versions at the same time, which makes it a form of revision control. Most versioning file systems keep a number of old copies of each file. Some limit the number of changes recorded per minute or per hour to avoid storing large numbers of trivial changes. Others instead take periodic snapshots whose contents can be accessed with semantics similar to normal file access.
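The retention policy described above (keep a bounded number of old copies, and skip versions created too soon after the previous one) can be sketched as a toy in-memory model. The class name, the `min_interval` and `max_versions` parameters, and the explicit `now` argument are all illustrative assumptions; no particular file system is being described.

```python
class ThrottledVersionStore:
    """Toy model: keeps old copies of one file, at most one per time interval."""

    def __init__(self, min_interval=60.0, max_versions=5):
        if max_versions < 1:
            raise ValueError("max_versions must be at least 1")
        self.min_interval = min_interval  # seconds between retained versions (assumed policy)
        self.max_versions = max_versions  # cap on old copies kept (assumed policy)
        self.current = None               # latest content
        self.history = []                 # list of (timestamp, old content)
        self._last_saved = None           # time at which a version was last retained

    def write(self, data, now):
        """Store new content; retain the previous content only if enough time has passed."""
        if self.current is not None:
            due = self._last_saved is None or now - self._last_saved >= self.min_interval
            if due:
                self.history.append((now, self.current))
                self._last_saved = now
                # Drop the oldest copies beyond the cap.
                del self.history[:-self.max_versions]
        self.current = data


store = ThrottledVersionStore(min_interval=60, max_versions=2)
store.write("a", now=0)
store.write("b", now=10)   # "a" is retained
store.write("c", now=20)   # too soon: "b" is overwritten without being retained
store.write("d", now=100)  # "c" is retained
```

In this sketch the intermediate content "b" is never kept, which is the kind of trivial change such a limit is meant to discard.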

Similar technologies

Backup

A versioning file system is similar to a periodic backup, with several key differences.

Revision control system

Versioning file systems provide some of the features of revision control systems. However, unlike most revision control systems, they are transparent to users, not requiring a separate "commit" step to record a new revision.

Journaling file system

Versioning file systems should not be confused with journaling file systems. A journaling file system keeps a log of pending changes before committing them to the file system (overwriting the prior version), whereas a versioning file system keeps previous copies of a file when new changes are saved. The two features serve different purposes and are not mutually exclusive.
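The contrast can be illustrated with two deliberately simplified classes. Both are hypothetical models written for this comparison: a real journal records block-level or metadata operations and is replayed after a crash, which this sketch does not attempt.

```python
class JournaledFile:
    """Logs each change before applying it; the prior content is overwritten."""

    def __init__(self, data=""):
        self.data = data
        self.journal = []  # pending change records

    def write(self, new_data):
        self.journal.append(new_data)  # 1. log the intended change
        self.data = new_data           # 2. apply it, overwriting the prior content
        self.journal.clear()           # 3. committed: the log entry is no longer needed


class VersionedFile:
    """Keeps the previous content whenever new content is saved."""

    def __init__(self, data=""):
        self.data = data
        self.previous = []  # older versions, oldest first

    def write(self, new_data):
        self.previous.append(self.data)
        self.data = new_data


journaled = JournaledFile("v1")
journaled.write("v2")   # "v1" is gone once the write commits

versioned = VersionedFile("v1")
versioned.write("v2")   # "v1" remains available in versioned.previous
```

A file system could combine both ideas, journaling the operation that creates each new version.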

Implementations

ITS

An early implementation of versioning, possibly the first, was in MIT's ITS. In ITS, a filename consisted of two six-character parts; if the second part was numeric (consisting only of digits), it was treated as a version number. When specifying a file to open for reading or writing, one could supply ">" as the second part. When reading, this meant opening the highest-numbered version of the file; when writing, it meant incrementing the highest existing version number and creating the new version for writing.
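The ">" convention can be sketched as a small name-resolution function. This is an illustration in Python, not ITS code: the function name, the set-of-pairs directory representation, and the choice to start at version 1 when no numeric version exists yet are assumptions.

```python
def resolve_its_name(existing, first, second, mode):
    """Return the (first, second) filename pair that would actually be opened.

    existing: set of (first, second) pairs already in the directory.
    mode: "read" or "write".
    """
    if second != ">":
        return (first, second)  # an explicit second part is used as given

    # Numeric second parts of files sharing the same first part are version numbers.
    versions = [
        int(s) for (f, s) in existing
        if f == first and s.isascii() and s.isdigit()
    ]

    if mode == "read":
        if not versions:
            raise FileNotFoundError(f"no numbered version of {first}")
        return (first, str(max(versions)))
    if mode == "write":
        # Assumption: numbering starts at 1 when no version exists yet.
        return (first, str(max(versions, default=0) + 1))
    raise ValueError(f"unknown mode: {mode!r}")


directory = {("FOO", "1"), ("FOO", "2"), ("FOO", "BAR"), ("BAZ", "7")}
latest = resolve_its_name(directory, "FOO", ">", "read")    # highest existing version
created = resolve_its_name(directory, "FOO", ">", "write")  # next version number
```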

Another early implementation of versioning was in TENEX, which became TOPS-20.[1]

Files-11 (RSX-11 and OpenVMS)

A powerful example of a file versioning system is built into the RSX-11 and OpenVMS operating systems from Digital Equipment Corporation. In essence, whenever an application opens a file for writing, the file system automatically creates a new instance of the file, with a version number appended to the name. Version numbers start at 1 and count upward as new instances of a file are created. When an application opens a file for reading, it can specify either the exact file name including the version number, or just the file name without the version number, in which case the most recent instance of the file is opened. The "purge" DCL/CCL command can be used at any time to manage the number of versions in a specific directory. By default, all but the highest-numbered version of each file in the current directory are deleted; this behavior can be overridden with the /keep=n switch and/or by specifying directory paths and/or filename patterns. VMS systems are often scripted to purge user directories on a regular schedule; end users sometimes mistake this for a property of the versioning system itself.
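The behavior described above (a new numbered instance on each write, the highest version on an unqualified read, and a purge that keeps the newest n versions) can be modeled in a few lines. This is a toy in-memory sketch, not the Files-11 or DCL interface: the class and method names are hypothetical, and the "name;number" spelling of a versioned name is assumed for illustration.

```python
from collections import defaultdict


class VersionedDirectory:
    """Toy model of a directory whose files carry version numbers."""

    def __init__(self):
        self._versions = defaultdict(dict)  # name -> {version number: content}

    def open_for_write(self, name, content):
        """Create a new instance numbered one above the highest existing version."""
        versions = self._versions[name]
        number = max(versions, default=0) + 1  # numbering starts at 1
        versions[number] = content
        return f"{name};{number}"

    def read(self, spec):
        """Read 'name;n' exactly, or the most recent version if no number is given."""
        name, sep, number = spec.partition(";")
        versions = self._versions.get(name)
        if not versions:
            raise FileNotFoundError(spec)
        key = int(number) if sep else max(versions)
        if key not in versions:
            raise FileNotFoundError(spec)
        return versions[key]

    def purge(self, keep=1):
        """Delete all but the `keep` highest-numbered versions of every file."""
        if keep < 1:
            raise ValueError("keep must be at least 1")
        for versions in self._versions.values():
            for number in sorted(versions)[:-keep]:
                del versions[number]

    def listing(self):
        pairs = sorted((name, n) for name, vs in self._versions.items() for n in vs)
        return [f"{name};{n}" for name, n in pairs]


d = VersionedDirectory()
for text in ("a", "b", "c"):
    d.open_for_write("LOGIN.COM", text)  # creates versions 1, 2 and 3
newest = d.read("LOGIN.COM")             # no version given: most recent instance
oldest = d.read("LOGIN.COM;1")           # exact version requested
d.purge(keep=2)                          # analogous to purging with /keep=2
```

Because the highest-numbered version always survives a purge, numbering continues upward afterwards rather than restarting.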

Linux

No mainstream Linux file system supports versioning, but a number of experimental, research, and lesser-known solutions do.

LMFS

The Lisp Machine File System supports versioning. It was provided by implementations from MIT, LMI, Symbolics and Texas Instruments. One operating system that used it was Symbolics Genera.

Mac OS X

Starting with Lion (10.7), OS X has a feature called Versions which allows Time Machine-like saving and browsing of past versions of documents for applications written to use Versions. This functionality, however, takes place at the application layer, not the filesystem layer;[2] Lion does not incorporate a true versioning file system.

SCO OpenServer

HTFS, adopted as the primary filesystem for SCO OpenServer in 1995, supports file versioning. Versioning is enabled on a per-directory basis by setting the directory's setuid bit, which is inherited when subdirectories are created. If versioning is enabled, a new file version is created when a file or directory is removed, or when an existing file is opened with truncation. Non-current versions remain in the filesystem namespace, under the name of the original file but with a suffix attached consisting of a semicolon and version sequence number. All but the current version are hidden from directory reads (unless the SHOWVERSIONS environment variable is set), but versions are otherwise accessible for all normal operations. The environment variable and general accessibility allow versions to be managed with the usual filesystem utilities, though there is also an "undelete" command that can be used to purge and restore files, enable and disable versioning on directories, etc.
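The naming and visibility rules described above can be sketched as a directory-listing filter. This is an illustrative model only, not HTFS code: the function name is hypothetical, the environment is passed in as a plain mapping, and "set" is taken to mean that the variable is present with any value.

```python
import re

# A non-current version is named like the original file plus ";<sequence number>".
_VERSION_SUFFIX = re.compile(r";\d+\Z")


def visible_entries(names, environ):
    """Return the names a directory read would show.

    Versioned names are hidden unless SHOWVERSIONS is present in `environ`
    (assumption: any value, even an empty one, counts as set).
    """
    show_all = "SHOWVERSIONS" in environ
    return [n for n in names if show_all or not _VERSION_SUFFIX.search(n)]


names = ["report.txt", "report.txt;1", "report.txt;2", "notes"]
default_view = visible_entries(names, {})
full_view = visible_entries(names, {"SHOWVERSIONS": "1"})
```

Hiding the versioned names only affects directory reads in this model; a caller that already knows a name such as "report.txt;1" could still use it directly, which matches the description of versions being otherwise accessible.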

Related software

The following are not versioning filesystems, but allow similar functionality.

References

  1. Daniel G. Bobrow, Jerry D. Burchfiel, Daniel L. Murphy, Raymond S. Tomlinson. "TENEX, A Paged Time Sharing System for the PDP-10". Communications of the ACM, Vol. 15, pp. 135-143, March 1972.
  2. "Mac OS X Lion file versions, part 2". Retrieved 28 April 2012.
  3. "Version Control with Subversion: Next Generation Open Source Version Control".
  4. "pDumpFS Homepage".
  5. "Git Internals". Quote: "Git is fundamentally a content-addressable filesystem with a VCS user interface written on top of it."