Bootstrapping (compilers)

In computer science, bootstrapping is the process of writing a compiler (or assembler) in the target programming language which it is intended to compile. Applying this technique leads to a self-hosting compiler.

A large proportion of programming languages are bootstrapped, including BASIC, C, Pascal, Factor, Haskell, Modula-2, Oberon, OCaml, Common Lisp, Scheme, Python and more.

Contents

Advantages

Bootstrapping a compiler has the following advantages:[1] [2]

The chicken and egg problem

If one needs a compiler for language X to obtain a compiler for language X (which is written in language X), how did the first compiler get written? Possible methods to solving this chicken or the egg problem include:

Methods for distributing compilers in source code include providing a portable bytecode version of the compiler, so as to bootstrap the process of compiling the compiler with itself.

Such methods are also one way of detecting or avoiding (or both) the potential problem pointed out in Reflections on Trusting Trust.

The T-diagram is a notation used to explain these compiler bootstrap techniques.[2]

In some cases, the most convenient way to get a complicated compiler running on a system that has little or no software on it involves a series of ever more sophisticated assemblers and compilers.[3]

History

The first language to provide such a bootstrap was NELIAC. The first commercial language to do so was PL/I.

The first self-hosting compiler (excluding assemblers) was written for Lisp by Hart and Levin at MIT in 1962. They wrote a Lisp compiler in Lisp, testing it inside an existing Lisp interpreter. Once they had improved the compiler to the point where it could compile its own source code, it was self-hosting.[4]

The compiler as it exists on the standard compiler tape is a machine language program that was obtained by having the S-expression definition of the compiler work on itself through the interpreter. (AI Memo 39)[4]

This technique is only possible when an interpreter already exists for the very same language that is to be compiled. It borrows directly from the notion of running a program on itself as input, which is also used in various proofs in theoretical computer science, such as the proof that the halting problem is undecidable.

List of self-hosting compilers

The following languages have self-hosting compilers:

See also

References

  1. ^ Compilers and Compiler Generators: An Introduction With C++. Patrick D. Terry 1997. International Thomson Computer Press. ISBN 1850322988
  2. ^ a b "Compiler Construction and Bootstrapping" by P.D.Terry 2000. HTML. PDF. HTML.
  3. ^ "Bootstrapping a simple compiler from nothing" by Edmund GRIMLEY EVANS 2001
  4. ^ a b Tim Hart and Mike Levin. "AI Memo 39-The new compiler". ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-039.pdf. Retrieved 2008-05-23.