Fork (software development)

From Wikipedia, the free encyclopedia

In software engineering, a project fork happens when a developer (or a group of developers) takes a copy of the source code from one software package and starts to develop a new package independently. The term is also used more loosely to describe a similar branching of any work (for example, there are several forks of the English-language Wikipedia), although it is most often applied to free or open source software.

One kind of fork is standard practice in many projects: stable or release versions that are modified only for bug fixes, while a separate development tree receives new features. This is common practice in the Linux kernel, for instance, but has occasionally been misrepresented in the trade press as the more problematic sort of fork described above.[1] Such forks are often referred to instead as "branches", both to avoid the negative connotations of a fork and because they are closer in intent and function to the common software engineering meaning of branching.

Free software

In free software, forks often result from a schism over different goals or from personality clashes. In a fork, both parties start from nearly identical code bases, but typically only the larger group, or the one containing the original architect, retains the full original name and its associated user community. Thus there is a reputation penalty associated with forking. The relationship between the different teams can be cordial (e.g. Ubuntu and Debian), very bitter (X.Org Server and XFree86) or none to speak of (most branching Linux distributions).

Forks are considered an expression of the freedom made available by free software, but also a weakness, since they duplicate development effort and can confuse users over which forked package to use. Free software gives developers the option to collaborate and pool resources, but such cooperation is not ensured by free software licenses, only by a commitment to it. That said, many developers make the effort to put changes into all relevant forks, e.g. amongst the BSDs.

The Cathedral and the Bazaar stated in 1997[2] that "The most important characteristic of a fork is that it spawns competing projects that cannot later exchange code, splitting the potential developer community." However, this is not how the term is commonly used at present.

In some cases, a fork can merge back into the original project or replace it. The Experimental/Enhanced GNU Compiler System (EGCS) was a fork of GCC which proved more vital than the original project and was eventually "blessed" as the official GCC project. Some have attempted to invoke this effect deliberately, e.g. Mozilla Firefox was an unofficial project within Mozilla that soon replaced the Mozilla Suite as the focus of development.

On the matter of forking, the Jargon File says:

"Forking is considered a Bad Thing — not merely because it implies a lot of wasted effort in the future, but because forks tend to be accompanied by a great deal of strife and acrimony between the successor groups over issues of legitimacy, succession, and design direction. There is serious social pressure against forking. As a result, major forks (such as the Gnu-Emacs/XEmacs split, the fissionings of the 386BSD group into three daughter projects, and the short-lived GCC/EGCS split) are rare enough that they are remembered individually in hacker folklore."

It is easy to declare a fork, but it can require considerable effort to continue independent development and support. As such, forks without adequate resources can soon become inactive, e.g. GoneME, a fork of GNOME by a former developer, which was soon discontinued despite attracting some publicity. Some well-known forks have enjoyed great success, however, such as the X.Org X11 server, a fork of XFree86 which gained widespread support and notably sped up X development.

Proprietary software

In proprietary software, the copyright is usually held by the employing entity, not by the individual software developers. Proprietary code is thus more commonly forked when the owner needs to develop two or more versions, such as a windowed version and a command-line version, or versions for differing operating systems, such as a word processor for IBM PC compatible machines and Macintosh computers. Generally, such internal forks concentrate on having the same look, feel, data format, and behavior across platforms, so that a user familiar with one version can also be productive on the other or share documents between them. This is almost always an economic decision, made to gain a greater market share and thus pay back the extra development costs created by the fork.

A notable case of proprietary forking not of this kind is the many varieties of proprietary Unix, all derived from AT&T Unix and all called "Unix", but increasingly mutually incompatible. See UNIX wars.

The BSD license permits (and some say encourages) forks of F/OSS to become proprietary software. Examples include EnterpriseDB (a version of PostgreSQL with Oracle compatibility features), Fujitsu Supported PostgreSQL with Fujitsu's proprietary ESM storage system, and Netezza's proprietary, highly scalable derivative of PostgreSQL. Some of these vendors contribute changes back to the community project, while others keep their changes as a competitive advantage.

Other notable forks

  • Most Linux distributions are descended from other distributions, most being traceable back to Red Hat Linux, Debian or Slackware. Since most of the content of a distribution is free software, ideas and software interchange freely as is useful to the individual distribution. Merges (e.g. United Linux or Mandriva) are rare.
  • Pretty Good Privacy was forked outside of the United States to free it from the restrictive laws on the exportation of cryptographic software.
  • The game NetHack has spawned a number of variants using the original code, notably Slash'EM, and was itself derived from Hack, an earlier game modeled on Rogue.
  • OpenSSH was forked from SSH because the license for SSH 2.x was non-free (even though the source was available); the fork was based on an older version of SSH 1.x, the last to have been licensed as free software. Within months, virtually all Linux distributions, BSD versions and even some proprietary Unixes had replaced SSH with OpenSSH.
  • DragonFly BSD was forked from FreeBSD 4.8 by long-time FreeBSD developer Matt Dillon, due to disagreement over FreeBSD 5's technical direction.
  • Adempiere is a community-maintained fork of Compiere 2.5.3b, created due to disagreement with the commercial and technical direction of Compiere Inc.
  • Enciclopedia Libre, a fork of the Spanish-language Wikipedia, was created to avoid the possibility of advertising.
  • X.Org is a fork of XFree86, an implementation of the X Window System, created after disagreement with the new license for the final release version of XFree86.
  • Numerous forks of the PostgreSQL database have gone on to become successful commercial products, such as Netezza's data warehouse appliance and Fujitsu's FSP database, which uses PostgreSQL's SQL front end over Fujitsu's proprietary back-end storage system.

References