Compiler-compiler
From Wikipedia, the free encyclopedia
A compiler-compiler or compiler generator is a program that generates the source code of a parser, interpreter, or compiler from a programming language description. In the most general case, it takes a full machine-independent syntactic and semantic description of a programming language, along with a full language-independent description of a target instruction set architecture, and generates a compiler. In practice, most compiler generators are more or less elaborate parser generators which handle neither semantics nor target architecture.
Variants
The earliest and still most common form of compiler-compiler is a parser generator, whose input is a grammar (usually in BNF) of the language. A typical parser generator associates executable code with each rule of the grammar; the parser executes that code whenever it applies the rule. These pieces of code are sometimes referred to as semantic action routines, since they define the semantics of the syntactic structure analysed by the parser. Depending on the type of parser to be generated, these routines may construct a parse tree (or AST), or generate executable code directly.
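The pairing of grammar rules with semantic action routines can be illustrated with a minimal Python sketch. It is not the output or input format of any real tool: the grammar table, the names `make_grammar`, `parse` and `run`, and the tiny expression language are all assumptions made for illustration. A real parser generator would emit parser source code from the grammar, whereas this sketch interprets the grammar table directly with a naive backtracking top-down parser.

```python
import re

def make_grammar(add, mul, num):
    """Build a hypothetical grammar table: rule -> list of (symbols, action)."""
    return {
        # Right-recursive rules, so the naive top-down parser below terminates.
        "expr": [(["term", "+", "expr"], lambda t, _op, e: add(t, e)),
                 (["term"], lambda t: t)],
        "term": [(["NUM", "*", "term"], lambda n, _op, t: mul(num(n), t)),
                 (["NUM"], lambda n: num(n))],
    }

def parse(grammar, symbol, tokens, pos=0):
    """Match `symbol` at tokens[pos:]; return (value, new_pos) or None."""
    if symbol == "NUM":  # built-in token class for integer literals
        if pos < len(tokens) and tokens[pos].isdigit():
            return int(tokens[pos]), pos + 1
        return None
    if symbol not in grammar:  # any other unknown symbol is a literal token
        if pos < len(tokens) and tokens[pos] == symbol:
            return symbol, pos + 1
        return None
    for symbols, action in grammar[symbol]:  # try alternatives in order
        values, cur = [], pos
        for sym in symbols:
            result = parse(grammar, sym, tokens, cur)
            if result is None:
                break
            value, cur = result
            values.append(value)
        else:
            # The whole alternative matched: run its semantic action.
            return action(*values), cur
    return None

def run(grammar, text):
    """Tokenize `text`, parse it as an `expr`, and return the action result."""
    tokens = re.findall(r"\d+|[+*]", text)
    result = parse(grammar, "expr", tokens)
    if result is None or result[1] != len(tokens):
        raise SyntaxError(f"cannot parse {text!r}")
    return result[0]

# Same grammar, two sets of semantic actions.
evaluator = make_grammar(lambda a, b: a + b, lambda a, b: a * b, lambda n: n)
ast_builder = make_grammar(lambda a, b: ("+", a, b),
                           lambda a, b: ("*", a, b),
                           lambda n: ("num", n))
```

With the first set of actions, `run(evaluator, "2+3*4")` computes a value directly; with the second, `run(ast_builder, "2+3*4")` returns a nested tuple that serves as a small AST. Only the actions differ, which is the sense in which the action routines, not the grammar, carry the semantics.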
Some experimental compiler-compilers take as input a formal description of programming language semantics, typically using denotational semantics. This approach is often called 'semantics-based compiling', and was pioneered by Peter Mosses' Semantic Implementation System (SIS) in 1979. However, both the generated compiler and the code it produced were inefficient in time and space. No production compilers are currently built in this way, but research continues.
The Production Quality Compiler-Compiler (PQCC) project at Carnegie-Mellon University did not formalize semantics, but did have a semi-formal framework for machine description.
Compiler-compilers exist in many flavors. Bottom-up rewrite machine generators (see JBurg) produce code generators that tile syntax trees according to a rewrite grammar. Attribute grammar parser generators are another flavor: ANTLR, for example, can be used for simultaneous type checking, constant propagation, and more during the parsing stage.
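The idea of tiling a syntax tree according to a rewrite grammar can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rule set, costs, instruction names (`li`, `add`, `addi`, `mul`) and the functions `match` and `tile` are hypothetical and do not reflect JBurg's input format or API. A real bottom-up rewrite generator precomputes matching tables from the rules, whereas this sketch searches for the cheapest tiling by plain recursion.

```python
# Each rule rewrites a tree pattern to a register at some cost.
# "reg" inside a pattern is a hole that any already-tiled subtree can fill.
RULES = [
    (("REG",), 0, None),                    # value already in a register
    (("CONST",), 1, "li"),                  # load immediate
    (("ADD", "reg", "reg"), 1, "add"),
    (("ADD", "reg", ("CONST",)), 1, "addi"),  # add with immediate operand
    (("MUL", "reg", "reg"), 2, "mul"),
]

def match(pattern, tree):
    """Return the subtrees bound to 'reg' holes, or None if no match."""
    if pattern == "reg":
        return [tree]
    op, *kids = pattern
    if tree[0] != op:
        return None
    if not kids:                 # leaf pattern: match on operator only
        return []
    if len(tree) - 1 != len(kids):
        return None
    holes = []
    for kid_pattern, subtree in zip(kids, tree[1:]):
        sub = match(kid_pattern, subtree)
        if sub is None:
            return None
        holes.extend(sub)
    return holes

def tile(tree):
    """Return (cost, instruction names) of the cheapest tiling of `tree`."""
    best = None
    for pattern, cost, name in RULES:
        holes = match(pattern, tree)
        if holes is None:
            continue
        total, code = cost, []
        for subtree in holes:    # tile the operands first (bottom-up order)
            sub_cost, sub_code = tile(subtree)
            total += sub_cost
            code += sub_code
        if name:
            code.append(name)
        if best is None or total < best[0]:
            best = (total, code)
    if best is None:
        raise ValueError(f"no rule covers {tree!r}")
    return best
```

For the tree `("ADD", ("REG", "a"), ("CONST", 4))`, two tilings are possible: loading the constant and then using `add`, or covering the addition and the constant with the single `addi` tile. The second is cheaper under the assumed costs, so `tile` selects it.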
History
The first compiler-compiler to use that name was written by Tony Brooker in 1960 and was used to create compilers for the Atlas computer at the University of Manchester, including the Atlas Autocode compiler. However, it was rather different from modern compiler-compilers, and today would probably be described as somewhere between a highly customisable generic compiler and an extensible-syntax language. The name 'compiler-compiler' was far more appropriate for Brooker's system than it is for most modern compiler-compilers, which are more accurately described as mere parser generators. It is almost certain that the name has entered common use because of Yacc rather than because Brooker's work is remembered.[citation needed]
Other examples of parser generators in the Yacc vein are ANTLR, Coco/R, CUP, GNU Bison, Eli, FSL, META 5, MUG2, Parsley, Pre-cc, SableCC, JavaCC and MixedCC.
See also
- List of compiler-compilers
- PQCC, a compiler-compiler that is more than a parser generator
- Flex/Bison
- Lex/Yacc
- ANTLR
- LRgen
- Parser types: LL, LR, SLR, LALR, GLR, Packrat
References
This article was originally based on material from the Free On-line Dictionary of Computing, which is licensed under the GFDL.