Source code
From Wikipedia, the free encyclopedia
- Not to be confused with source coding.
Source code (commonly just source or code) is any series of statements written in a human-readable computer programming language. In modern programming languages, the source code that constitutes a program is usually held in several text files, but the same source code may also be printed in a book or recorded on tape (usually without a filesystem). The term is typically used in the context of a particular piece of computer software. A computer program's source code is the collection of files that can be converted from human-readable form to an equivalent computer-executable form. The source code is either converted into an executable file by a compiler for a particular computer architecture, or executed on the fly from its human-readable form with the aid of an interpreter.
The code base of a programming project is the larger collection of all the source code of all the computer programs that make up the project. This aggregation is useful because the same source code file is often used by more than one of a project's programs.
Purposes
Source code is primarily used either to produce object code (which, once linked into an executable, can be run by a computer directly) or to be run by an interpreter.
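The idea of translating source text into another form before running it can be imitated inside a single Python session. The following is only a minimal sketch, not a description of how any particular compiler works: the built-in compile() turns source text into a code object (CPython bytecode, which is run by a virtual machine rather than directly by the processor), and exec() then runs that object. The sample program and the variable names are illustrative assumptions.

```python
# Source code is just text until something translates or interprets it.
source_text = """
def square(n):
    return n * n

result = square(7)
"""

# Step 1: translate the human-readable text into a code object.
code_object = compile(source_text, "<example>", "exec")

# Step 2: execute the translated form in a fresh namespace.
namespace = {}
exec(code_object, namespace)

print(namespace["result"])  # prints 49
```

The same text could equally be passed straight to exec(), which performs both steps at once; separating them here only makes the translation step visible.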
Source code has a number of other uses. It can serve as a description of software. It can also be used as a learning tool; beginning programmers often find it helpful to review existing source code to learn about programming techniques and methodology. It is used as a communication tool between experienced programmers, due to its (ideally) concise and unambiguous nature. The sharing of source code between developers is frequently cited as a contributing factor to the maturation of their programming skills. Source code can also be an expressive artistic medium; consider, for example, obfuscated code or PerlMonks.Org.
Source code is a vital component in porting software to alternative computer platforms. Without the source code for a particular piece of software, porting is generally so difficult as to be impractical or even impossible. Binary translation can be used to run a program without source code, but not to maintain it, as the machine code output of a compiler is extremely difficult to read directly. Decompilation can be used to generate source code where none exists, and with some manual effort, maintainable source code can be produced (VEW04). Programmers frequently borrow source code from one piece of software to use in other projects, a practice known as software reusability.
DevCDs and non-free source code
Some companies sell a "DevCD" containing the source code for one of their programs. Usually the source is for a game, so that serious gamers can use it to release game mods. Many companies do not do this, because they fear that the game could simply be recompiled with the copy restrictions and any need for CD keys or validation removed, allowing it to be distributed via P2P technology.
An example of a company selling a DevCD is Introversion Software, the creator of the computer game Uplink. Purchasers of the source have created mods such as Onlink and the FBI Mod.
Organization
The source code for a particular piece of software may be contained in a single file or many files. A program's source code is not necessarily all written in the same programming language; for example, it is common for a program to be written primarily in the C programming language, with some portions written in assembly language for optimization purposes. It is also possible for some components of a piece of software to be written and compiled separately, in an arbitrary programming language, and later integrated into the software using a technique called library linking. Yet another method is to make the main program an interpreter for a programming language, either designed specifically for the application in question or general-purpose, and then to write the bulk of the actual user functionality as macros or other forms of add-ins in this language, an approach taken, for example, by the GNU Emacs text editor.
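The last approach, a small core program that interprets a language in which the user-visible functionality is written, can be sketched as follows. This is a toy illustration under stated assumptions, not how GNU Emacs is implemented: the Editor class, its two primitives, and the one-command-per-line macro syntax are all hypothetical.

```python
class Editor:
    """Hypothetical host application: a tiny core exposing a few primitives."""

    def __init__(self):
        self.buffer = []

    def insert(self, text):
        self.buffer.append(text)

    def upcase_last(self):
        if self.buffer:
            self.buffer[-1] = self.buffer[-1].upper()

    def run_macro(self, macro_source):
        # Each macro line is "command [argument]"; the core only dispatches.
        commands = {"insert": self.insert, "upcase-last": self.upcase_last}
        for line in macro_source.strip().splitlines():
            name, _, argument = line.strip().partition(" ")
            if argument:
                commands[name](argument)
            else:
                commands[name]()


# User functionality lives in the macro text, not in the core program.
macro = """
insert hello
upcase-last
insert world
"""

editor = Editor()
editor.run_macro(macro)
print(editor.buffer)  # prints ['HELLO', 'world']
```

New behaviour can be added by writing more macro text without recompiling or changing the core, which is the appeal of this design.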
Moderately complex software customarily requires the compilation or assembly of several, sometimes dozens or even hundreds, of source code files. This complexity is made much more manageable by including a Makefile with the source code, which describes the relationships among the source code files and contains information about how they are to be compiled. A revision control system is another tool frequently used by developers for source code maintenance.
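The kind of relationship a Makefile records can be sketched in a few lines of Python. This is a simplified model, not the algorithm used by any particular make implementation: the file names, the dependency table, and the integer modification times are all invented for illustration, and every source file is assumed to have a recorded time.

```python
# Hypothetical project: each target lists the files it is built from.
dependencies = {
    "app": ["main.o", "util.o"],
    "main.o": ["main.c", "util.h"],
    "util.o": ["util.c", "util.h"],
}


def needs_rebuild(target, mtimes, deps=dependencies):
    """Return True if target is missing or older than anything it depends on."""
    if target not in deps:      # a plain source file: nothing to build
        return False
    if target not in mtimes:    # the target has never been built
        return True
    for prerequisite in deps[target]:
        if needs_rebuild(prerequisite, mtimes, deps):
            return True
        if mtimes[prerequisite] > mtimes[target]:
            return True
    return False


# Illustrative modification times (a larger number means more recent).
mtimes = {
    "main.c": 10, "util.c": 30, "util.h": 5,
    "main.o": 20, "util.o": 20, "app": 25,
}

stale = [target for target in dependencies if needs_rebuild(target, mtimes)]
print(stale)  # prints ['app', 'util.o']
```

Here only util.c has changed since the last build, so util.o and the final program are rebuilt while main.o is left alone.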
Licensing
Software, and its accompanying source code, typically falls within one of two licensing paradigms: free software and proprietary software. Generally speaking, software is free if the source code is free to use, distribute, modify and study, and proprietary if the source code is kept secret, or is privately owned and restricted. The provisions of the various copyright laws are often used for this purpose, though trade secrecy is also relied upon. For a further discussion of the differences between these paradigms, and the divisions within them, see software license. In addition to licensing, vendors of commercial software products frequently seek to protect their code from decompilation, reverse engineering, analysis and modification, in order to prevent illegal use, often by integrating copy protection into the application. Types of code protection include code encryption, code obfuscation and code morphing.
Legal issues in the United States
As of 2003, courts in the United States are in the process of deciding whether source code should be considered a constitutionally protected form of free speech. Proponents of the free speech argument claim that because source code conveys information to programmers, is written in a language, and can be used to share humour and other artistic pursuits, it is a protected form of communication. The opposing view is that source code is functional rather than artistic speech, and is thus not protected by the First Amendment to the U.S. Constitution.
One of the first court cases regarding the nature of source code as free speech involved Daniel J. Bernstein, then a graduate student at the University of California, Berkeley, who sought to publish the source code for an encryption program that he had created. At the time, encryption algorithms were classified as munitions by the United States government; exporting encryption to other countries was considered an issue of national security, and had to be approved by the State Department. The Electronic Frontier Foundation sued the U.S. government on Bernstein's behalf; the court ruled that source code was free speech, protected by the First Amendment.
In 2000, in a related court case, the issue again came under scrutiny when the Motion Picture Association of America (MPAA) sued the "hacker" magazine 2600 and a number of other websites for distributing the source code to DeCSS, a program capable of decrypting scrambled DVDs. The program was developed to allow people to play legally purchased DVDs on the Linux operating system, which had no DVD software at the time. The U.S. District Court decision favored the MPAA; 2600 magazine was prohibited from posting or linking to the source code on its website. This ruling was widely considered a victory for the supporters of the Digital Millennium Copyright Act, as it established a legal precedent for the notion that source code is not constitutionally protected free speech. It was affirmed by the Appeals Court and as of late 2003 is being appealed to the U.S. Supreme Court.
Quality
The way a program is written can have important consequences for its maintainers. Many programming style guides, which stress readability and some language-specific conventions, are aimed at easing the maintenance of source code, which involves debugging and updating. Other factors also bear on whether code is well written, such as the logical structuring of the code into manageable sections.
Reference
(VEW04) M. Van Emmerik and T. Waddington, "Using a Decompiler for Real-World Source Recovery", Working Conference on Reverse Engineering, Delft, Netherlands, 9-12 November 2004.