Source code

From Wikipedia, the free encyclopedia

An illustration of Java source code with prologue comments indicated in red and inline comments indicated in green. Program code is indicated in blue.

In computer science, source code (commonly just source or code) is any sequence of statements and/or declarations written in some human-readable computer programming language.

The source code which constitutes a program is usually held in one or more text files, and may also appear as code snippets printed in books or other media.

A computer program's source code is the collection of files needed to convert the program from human-readable form to some kind of computer-executable form. The source code may be converted into an executable file by a compiler, or executed on the fly from its human-readable form with the aid of an interpreter.
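The two steps described above, translating source text and then executing the result, can be sketched in Python. This is only an illustration: the variable names are made up for the example, and the translated form here is a CPython code object (bytecode run by the interpreter), not a native executable file of the kind a C compiler produces.

```python
# A tiny program held as human-readable source text.
# The names `source`, `code_object` and `namespace` are illustrative only.
source = "def square(n):\n    return n * n\n\nresult = square(7)\n"

# Step 1: translate the source text into an executable form.
# In CPython this yields a code object containing bytecode.
code_object = compile(source, "<example>", "exec")

# Step 2: execute the translated form in a fresh namespace.
namespace = {}
exec(code_object, namespace)

print(namespace["result"])  # 49
```

The same source string could equally be passed straight to `exec`, which performs both steps on the fly, much as an interpreter does.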

The code base of a programming project is the larger collection of all the source code of all the computer programs which make up the project.

Purposes

Source code is primarily used either to produce object code (which can be executed by a computer directly) or to be executed by an interpreter.

Source code has a number of other uses. It can be used to describe software. It can also be used for learning: beginning programmers often find it helpful to review existing source code to learn programming techniques. It serves as a means of communication between experienced programmers, owing to its (ideally) concise and unambiguous nature. The sharing of source code between developers is frequently cited as a contributing factor to the maturation of their programming skills. Source code can also be an expressive artistic medium; consider, for example, obfuscated code or PerlMonks.Org.

Source code is a vital component in porting software to other computer platforms. Without the source code for a particular piece of software, porting is generally so difficult as to be impractical, if not impossible. Binary translation can be used to run a program without source code, but not to maintain it, as the machine code output of a compiler is extremely difficult to read directly. Decompilation can be used to generate source code where none exists, and with some manual effort, maintainable source code can be produced (VEW04). Programmers frequently adapt source code from one piece of software for use in other projects, a practice known as software reusability.

Organization

The source code for a particular piece of software may be contained in a single file or in many files. A program's source code is not necessarily all written in the same programming language; for example, it is common for a program to be written primarily in the C programming language, with some portions written in assembly language for optimization purposes. It is also possible for some components of a piece of software to be written and compiled separately, in an arbitrary programming language, and later integrated into the software using a technique called library linking. In some languages, such as Java, this is essentially how each file is handled: each is compiled separately and linked at runtime. Yet another method is to make the main program an interpreter for a programming language, either designed specifically for the application in question or general-purpose, and then to write the bulk of the actual user functionality as macros or other forms of add-ins in this language. This approach is taken, for example, by the GNU Emacs text editor.
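The last approach, a host program that interprets add-ins written in a scripting language, can be sketched as follows. This is a minimal illustration and not how GNU Emacs is implemented: the `Editor` class, the `command` decorator and the two sample commands are hypothetical names invented for the example, and Python's built-in `exec` stands in for the embedded interpreter.

```python
class Editor:
    """A hypothetical host application: a text buffer plus a table of commands."""

    def __init__(self, text=""):
        self.text = text
        self.commands = {}

    def load_extension(self, source):
        # Run the extension's source code with the built-in interpreter,
        # handing it the `command` decorator so it can register functions.
        exec(source, {"command": self._register})

    def _register(self, func):
        # Store the function under its own name and return it unchanged.
        self.commands[func.__name__] = func
        return func

    def run(self, name):
        # Each command receives the current text and returns the new text.
        self.text = self.commands[name](self.text)


# User functionality supplied as source code, not built into the host.
EXTENSION_SOURCE = '''
@command
def upcase(text):
    return text.upper()

@command
def reverse_words(text):
    return " ".join(reversed(text.split()))
'''

editor = Editor("hello source code")
editor.load_extension(EXTENSION_SOURCE)
editor.run("upcase")
editor.run("reverse_words")
print(editor.text)  # CODE SOURCE HELLO
```

The host itself knows nothing about upper-casing or word reversal; both arrive as source text at run time. Note that `exec` provides no sandboxing, so a real application would need to consider how far it trusts its extensions.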

Moderately complex software customarily requires the compilation or assembly of several, sometimes dozens or even hundreds, different source code files. This complexity is reduced considerably by including a Makefile with the source code, which describes the relationships among the source code files and contains information about how they are to be compiled. A revision control system is another tool frequently used by developers for source code maintenance.
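The kind of relationship a Makefile records can be modelled as a dependency graph from which a valid build order is derived. The sketch below is an assumption-laden illustration rather than a description of how `make` works internally: the file names are hypothetical, and the standard-library `graphlib` module (Python 3.9 or later) is used to do the ordering.

```python
from graphlib import TopologicalSorter

# Hypothetical project: each target maps to the files it depends on,
# much like the "target: prerequisites" lines of a Makefile.
dependencies = {
    "program": {"main.o", "util.o"},
    "main.o": {"main.c", "util.h"},
    "util.o": {"util.c", "util.h"},
}


def build_order(deps):
    """Return the buildable targets so that prerequisites come first."""
    order = TopologicalSorter(deps).static_order()
    # Plain source files (main.c, util.h, ...) have no rule of their own,
    # so only entries that appear as targets are kept.
    return [target for target in order if target in deps]


order = build_order(dependencies)
print(order)
```

Both object files are guaranteed to appear before `program`, although their order relative to each other is not fixed. A real Makefile also carries the commands used to compile each target, and `make` additionally compares file timestamps so that only out-of-date targets are rebuilt.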

Licensing

Software, and its accompanying source code, typically falls within one of two licensing paradigms: free software and proprietary software. Generally speaking, software is free if the source code is free to use, distribute, modify and study, and proprietary if the source code is kept secret, or is privately owned and restricted. The provisions of the various copyright laws are often used for this purpose, though trade secrecy is also relied upon. For a further discussion of the differences between these paradigms, and the divisions within them, see software license. In addition to licensing, commercial software products frequently require protection against decompilation, reverse engineering, analysis and modification in order to prevent illegal use, typically by integrating copy protection into the application. Types of code protection include code encryption, code obfuscation and code morphing.

Legal issues in the United States

As of 2003, courts are in the process of deciding whether source code should be considered a Constitutionally protected form of free speech in the United States. Proponents of the free speech argument claim that because source code conveys information to programmers, is written in a language, and can be used to share humour and other artistic pursuits, it is a protected form of communication. The opposing view is that source code is functional rather than artistic speech, and is thus not protected by the First Amendment to the U.S. Constitution.

One of the first court cases regarding the nature of source code as free speech involved Daniel J. Bernstein, then a graduate student in mathematics at the University of California, Berkeley, who sought to publish the source code for an encryption program that he had created. At the time, encryption algorithms were classified as munitions by the United States government; exporting encryption to other countries was considered an issue of national security, and had to be approved by the State Department. The Electronic Frontier Foundation sued the U.S. government on Bernstein's behalf; the court ruled that source code was free speech, protected by the First Amendment.

In 2000, in a related court case, the issue again came under scrutiny when the Motion Picture Association of America (MPAA) sued the 'hacker' magazine 2600 and a number of other websites for distributing the source code to DeCSS, a program capable of decrypting scrambled DVDs. The program was developed to allow people to play legally purchased DVDs on the Linux operating system, which had no DVD software at the time. The US District Court decision favored the MPAA; 2600 magazine was prohibited from posting or linking to the source code on its website. This ruling was widely considered a victory for the supporters of the Digital Millennium Copyright Act, as it established a legal precedent for the notion that source code is not Constitutionally protected free speech. It was affirmed by the Appeals Court and as of late 2003 is being appealed to the US Supreme Court.

Quality

Main article: Software quality

The way a program is written can have important consequences for its maintainers. Many programming style guides, which stress readability and some language-specific conventions, are aimed at easing the maintenance of source code, which involves debugging and updating. Other issues also come into consideration when judging whether code is well written, such as the logical structuring of the code into manageable sections.

Reference

(VEW04) M. Van Emmerik and T. Waddington, "Using a Decompiler for Real-World Source Recovery", Working Conference on Reverse Engineering, Delft, Netherlands, 9-12 November 2004.
