Revision control (also known as version control (system) (VCS), source control or (source) code management (SCM)) is the management of multiple revisions of the same unit of information. It is most commonly used in engineering and software development to manage ongoing development of digital documents like application source code, art resources such as blueprints or electronic models, and other projects that may be worked on by a team of people. Changes to these documents are usually identified by incrementing an associated number or letter code, termed the "revision number", "revision level", or simply "revision" and associated historically with the person making the change. A simple form of revision control, for example, has the initial issue of a drawing assigned the revision number "1". When the first change is made, the revision number is incremented to "2" and so on.
Version control systems are most commonly stand-alone applications, but revision control is also embedded in various types of software like word processors (e.g. Microsoft Word, OpenOffice.org Writer, KOffice, Pages, Google Docs), spreadsheets (e.g. OpenOffice.org Calc, Google Spreadsheets, Microsoft Excel), and in various content management systems. Integrated revision control is a key feature of wiki software packages such as MediaWiki, DokuWiki, TWiki, etc. In wikis, revision control allows for the ability to revert a page to a previous revision, which is critical for allowing editors to track each other's edits, correct mistakes, and defend public wikis against vandalism and spam.
Software tools for revision control are increasingly recognized as being necessary for the organization of multi-developer projects.[1]
Contents |
Engineering revision control developed from formalized processes based on tracking revisions of early blueprints or bluelines. Implicit in this control was the ability to return to any earlier state of the design, for cases in which an engineering dead-end was reached in the development of the design. Likewise, in computer software engineering, revision control is any practice that tracks and provides control over changes to source code. Software developers sometimes use revision control software to maintain documentation and configuration files as well as source code. Also, version control is widespread in business and law. Indeed, "contract redline" and "legal blackline" are some of the earliest forms of revision control, and are still employed with varying degrees of sophistication. An entire industry has emerged to service the document revision control needs of business and other users, and some of the revision control technology employed in these circles is subtle, powerful, and innovative. The most sophisticated techniques are beginning to be used for the electronic tracking of changes to CAD files (see Product Data Management), supplanting the "manual" electronic implementation of traditional revision control.
As software is designed, developed and deployed, it is extremely common for multiple versions of the same software to be deployed in different sites, and for the software's developers to be working simultaneously on updates. Bugs and other issues with software are often only present in certain versions (because of the fixing of some problems and the introduction of others as the program develops). Therefore, for the purposes of locating and fixing bugs, it is vitally important to be able to retrieve and run different versions of the software to determine in which version(s) the problem occurs. It may also be necessary to develop two versions of the software concurrently (for instance, where one version has bugs fixed, but no new features, while the other version is where new features are worked on).
At the simplest level, developers could simply retain multiple copies of the different versions of the program, and number them appropriately. This simple approach has been used on many large software projects. While this method can work, it is inefficient as many near-identical copies of the program have to be maintained. This requires a lot of self-discipline on the part of developers, and often leads to mistakes. Consequently, systems to automate some or all of the revision control process have been developed.
Moreover, in software development and other environments, including in legal and business practice, it is increasingly common for a single document or snippet of code to be edited by a team, the members of which may be geographically diverse and/or may pursue different and even contrary interests. Sophisticated revision control that tracks and accounts for ownership of changes to documents and code may be extremely helpful or even necessary in such situations.
Another use for revision control is to track changes to configuration files, such as those typically stored in /etc or /usr/local/etc on Unix systems. This gives system administrators another way to easily track changes to configuration files and a way to roll back to earlier versions should the need arise.
Most revision control software can use delta compression, which retains only the differences between successive versions of files. This allows for more efficient storage of many different versions of files.
Traditional revision control systems use a centralized model, where all the revision control functions are performed on a shared server. If two developers try to change the same file at the same time, without some method of managing access the developers may end up overwriting each other's work. Centralized revision control systems solve this problem in one of two different "source management models": file locking and version merging.
The simplest method of preventing "concurrent access" problems is to lock files so that only one developer at a time has write access to the central "repository" copies of those files. Once one developer "checks out" a file, others can read that file, but no one else is allowed to change that file until that developer "checks in" the updated version (or cancels the checkout).
File locking has merits and drawbacks. It can provide some protection against difficult merge conflicts when a user is making radical changes to many sections of a large file (or group of files). But if the files are left exclusively locked for too long, other developers can be tempted to simply bypass the revision control software and change the files locally anyway. That can lead to more serious problems.
Most version control systems, such as CVS, allow multiple developers to edit the same file at the same time. The first developer to "check in" changes to the central repository always succeeds. The system provides facilities to merge changes into the central repository, so the changes from the first developer are preserved when the other developers check in.
The concept of a reserved edit can provide an optional means to explicitly lock a file for exclusive write access, even though a merging capability exists.
Distributed revision control (DRCS) takes a peer-to-peer approach, as opposed to the client-server approach of centralized systems. Rather than a single, central repository on which clients synchronize, each peer's working copy of the codebase is a bona-fide repository.[2] Synchronization is conducted by exchanging patches (change-sets) from peer to peer. This results in some striking differences from a centralized system:
An open system of distributed revision control is characterized by its support for independent branches, and its heavy reliance on merge operations. Its general characteristics are:
One of the first open systems was BitKeeper, notable for its use in the development of the Linux kernel. A later decision by the makers of BitKeeper to restrict its licensing led the Linux developers on a search for a free replacement[4]. Common open systems now in free use are:
A closed system of distributed revision control is based on a replicated database. A check-in is equivalent to a distributed commit. Successful commits create a single baseline. An example of a closed distributed system is Code Co-op.
Some of the more advanced revision control tools offer many other facilities, allowing deeper integration with other tools and software engineering processes. Plugins are often available for IDEs such as IntelliJ IDEA, Eclipse and Visual Studio. NetBeans IDE and Xcode come with integrated version control support.
Terminology can vary from system to system, but here are some terms in common usage.[5][6]
Most often only one of the terms baseline, label, or tag are used in documentation or discussion and can be considered synonyms. Most revision control tools will use only one of baseline, label, or tag to refer to the action of identifying a snapshot ("label the project") or the record of the snapshot ("try it with baseline X"). However, in most projects some snapshots are more significant than others, such as those used to indicate published releases, branches, or milestones.
When both the term baseline and either of label or tag are used together in the same context, label and tag are usually used to refer to the mechanism within the tool of identifying or making the record of the snapshot and baseline is used to indicate increased significance of any given label or tag.
Baseline is the term used in most formal discussion of configuration management.