Fork (software development)

From Wikipedia, the free encyclopedia

In software engineering, a project fork happens when developers take a copy of source code from one software package and start independent development on it, creating a distinct piece of software.

Free and open source software is, by definition, software that may be forked without the permission of the original creator. However, licensed forks of proprietary software (e.g. Unix) can also be important.

Branching

One kind of fork is standard practice in many projects: a stable or release version is modified only for bug fixes, while new features are developed in a separate development tree. Such forks are usually referred to as "branches" instead, both to avoid the negative connotations of a fork and because they are closer in intent and function to the common software engineering meaning of branching.
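The stable/development split described above can be illustrated with a small toy model. This is only a sketch under stated assumptions: the names `Commit`, `Branch`, `branch_off` and `backport_fixes` are hypothetical and do not correspond to any real version control system's API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Commit:
    """A single change; `kind` is either "feature" or "bugfix" (toy model)."""
    message: str
    kind: str


@dataclass
class Branch:
    """A named line of development holding an ordered list of commits."""
    name: str
    commits: list = field(default_factory=list)

    def branch_off(self, name):
        # The new branch starts from a copy of the current history.
        return Branch(name, list(self.commits))


def backport_fixes(source, target):
    """Copy bug fixes (and only bug fixes) that `target` does not yet have."""
    for commit in source.commits:
        if commit.kind == "bugfix" and commit not in target.commits:
            target.commits.append(commit)


trunk = Branch("development", [Commit("initial import", "feature")])
stable = trunk.branch_off("release-1.0")

# Development continues on the trunk after the release branch is cut.
trunk.commits.append(Commit("add plugin system", "feature"))
trunk.commits.append(Commit("fix crash on empty input", "bugfix"))

backport_fixes(trunk, stable)
stable_messages = [c.message for c in stable.commits]
print(stable_messages)
```

In this model the release branch receives the crash fix but not the new plugin system, which is the sense in which a stable branch is "modified only for bug fixes".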

Free software

Free and open source software may be forked without prior permission, per the definitions of "free software" ("Freedom 3: The freedom to improve the program, and release your improvements to the public, so that the whole community benefits") and "open source" ("3. Derived Works: redistribution of modifications must be allowed. (To allow legal sharing and to permit new features or repairs.)").

In free software, forks often result from a schism over differing goals or from personality clashes. In a fork, both parties start from nearly identical code bases, but typically only the larger group, or the one containing the original architect, retains the full original name and its associated user community. Thus there is a reputation penalty associated with forking. The relationship between the different teams can be cordial (e.g., Ubuntu and Debian), very bitter (X.Org Server and XFree86, or cdrtools and cdrkit) or practically nonexistent (most branching Linux distributions).

Forks are considered an expression of the freedom made available by free software, but also a weakness, since they duplicate development effort and can confuse users over which forked package to use. Free software gives developers the option to collaborate and pool resources, but collaboration is not ensured by free software licenses, only by a commitment to cooperation. That said, many developers make the effort to put changes into all relevant forks, e.g., amongst the BSDs.[citation needed]

The Cathedral and the Bazaar stated in 1997[1] that "The most important characteristic of a fork is that it spawns competing projects that cannot later exchange code, splitting the potential developer community." However, the term is not commonly used in this narrow sense today.

In some cases, a fork can merge back into the original project or replace it. EGCS (the Experimental/Enhanced GNU Compiler System) was a fork of GCC which proved more vital than the original project and was eventually "blessed" as the official GCC project. Some have attempted to invoke this effect deliberately; for example, Mozilla Firefox started as an unofficial project within Mozilla that soon replaced the Mozilla Suite as the focus of development.

On the matter of forking, the Jargon File says:

"Forking is considered a Bad Thing—not merely because it implies a lot of wasted effort in the future, but because forks tend to be accompanied by a great deal of strife and acrimony between the successor groups over issues of legitimacy, succession, and design direction. There is serious social pressure against forking. As a result, major forks (such as the Gnu-Emacs/XEmacs split, the fissioning of the 386BSD group into three daughter projects, and the short-lived GCC/EGCS split) are rare enough that they are remembered individually in hacker folklore."

It is easy to declare a fork, but it can require considerable effort to continue independent development and support. As a result, forks without adequate resources can soon become inactive. One example is GoneME, a fork of GNOME by a former developer, which was soon discontinued despite attracting some publicity. Some well-known forks have enjoyed great success, however, such as the X.Org X11 server, a fork of XFree86 which gained widespread support from developers and users and notably sped up X development.

Proprietary software

In proprietary software, the copyright is usually held by the employing entity, not by the individual software developers. Proprietary code is thus more commonly forked when the owner needs to develop two or more versions, such as a windowed version and a command line version, or versions for differing operating systems, such as a word processor for IBM PC compatible machines and Macintosh computers. Generally, such internal forks concentrate on keeping the same look, feel, data format, and behavior across platforms, so that a user familiar with one version can also be productive with the other or share documents between them. This is almost always an economic decision, intended to gain a greater market share and thus pay back the extra development costs created by the fork.

Notable proprietary forks not of this kind are the many varieties of proprietary Unix, all derived from AT&T Unix and all called "Unix", but increasingly mutually incompatible. See UNIX wars.

The BSD licenses permit forks to become proprietary software, and some say that commercial incentives thus make proprietisation almost inevitable. Examples include Cedega and CrossOver (proprietary forks of Wine), EnterpriseDB (a fork of PostgreSQL, adding Oracle compatibility features), Fujitsu Supported PostgreSQL with its proprietary ESM storage system, and Netezza's proprietary, highly scalable derivative of PostgreSQL. Some of these vendors contribute changes back to the community project, while others keep their changes as a competitive advantage.

Other notable forks

  • Most Linux distributions are descended from other distributions, most being traceable back to Debian, Red Hat or Slackware. Since most of the content of a distribution is free software, ideas and software interchange freely as is useful to the individual distribution. Merges (e.g., United Linux or Mandriva) are rare.
  • Pretty Good Privacy was forked outside of the United States to free it from restrictive US laws on the exportation of cryptographic software.
  • The game NetHack has spawned a number of variants using the original code, notably Slash'EM, and was itself a fork of Hack.
  • OpenBSD was a fork of NetBSD 1.0 by Theo de Raadt.
  • OpenSSH was forked from SSH because the license for SSH 2.x was non-free (even though the source was available). The fork was based on an older SSH 1.x release, the last to have been licensed as free software. Within months, virtually all Linux distributions, BSD versions and even some proprietary Unixes had replaced SSH with OpenSSH.
  • DragonFly BSD was forked from FreeBSD 4.8 by long-time FreeBSD developer Matt Dillon, due to disagreement over FreeBSD 5's technical direction.
  • Adempiere is a community-maintained fork of Compiere 2.5.3b, created because of disagreement with the commercial and technical direction of Compiere Inc.
  • NeoOffice is a fork of OpenOffice.org with an incompatible license (GPL rather than LGPL), created because of disagreements about licensing and about the best method of porting OpenOffice.org to Mac OS X.
  • Funpidgin is a fork of the instant messaging client Pidgin, created because of disagreement with the Pidgin developers over automated resizing of the text input area and the lack of other user-requested features.

References