LRgen


LRgen is a program that generates SLR, LALR and LR(1) lexers and parsers for use in compilers, translators and interpreters. Its input is a parser grammar or a lexer grammar written in Extended BNF (EBNF) or in the more advanced TBNF grammar notation. Its output is source code in the C++ programming language, or in another programming language if the parser skeleton file is modified accordingly. The LRgen package includes an integrated symbol-table builder, an abstract syntax tree (AST) constructor and tree walker, an intermediate-code generator, and C++ source code for a compiler front end.


TBNF grammar notation is a superset of EBNF that allows the complete translation process, from input text to intermediate code, to be specified. TBNF has operators that define the creation of the abstract syntax tree (AST), the traversal of the AST, and the intermediate code that a generated parser emits as output. Besides LR parsers, LRgen can create LR lexers, which are more powerful than lexers based on regular expressions: they can recognize tokens that regular expressions cannot define (e.g. nested comments).
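
The nested-comment case can be illustrated with a minimal hand-written sketch. It is not LRgen output and does not use TBNF; the function name skip_nested_comment and the /* ... */ comment delimiters are assumptions chosen for the example. The point it shows is that matching nested delimiters requires tracking an unbounded nesting depth, which a finite automaton built from a regular expression cannot do, whereas an LR lexer has a stack available for the purpose.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper, not part of LRgen.
// If text[pos..] starts with "/*", returns the index just past the matching
// "*/", treating inner "/* ... */" pairs as nested. Returns std::string::npos
// if there is no comment at pos or the comment is never closed.
std::size_t skip_nested_comment(const std::string& text, std::size_t pos) {
    assert(pos <= text.size());
    if (text.compare(pos, 2, "/*") != 0) {
        return std::string::npos;
    }
    std::size_t depth = 0;  // current nesting level; this is the unbounded state
    std::size_t i = pos;
    while (i + 1 < text.size()) {
        if (text[i] == '/' && text[i + 1] == '*') {
            ++depth;        // opening delimiter: one level deeper
            i += 2;
        } else if (text[i] == '*' && text[i + 1] == '/') {
            --depth;        // closing delimiter: one level back out
            i += 2;
            if (depth == 0) {
                return i;   // the outermost comment is closed
            }
        } else {
            ++i;
        }
    }
    return std::string::npos;  // ran out of input while still inside a comment
}
```

For the input "/* a /* b */ c */ x" with pos 0, the function returns 17, the index just past the second "*/". A regular expression that stops at the first "*/" would end the token at index 12 instead.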


The automata (lexers and parsers) generated by LRgen use a table format called the four-matrix parser table, which is firmly based in automata theory and lends itself to further optimizations such as chain-reduction elimination, thereby yielding very fast and small lexers and parsers.
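
As a rough illustration of what a table-driven LR parser looks like in general, the following sketch drives a textbook action/goto table for a toy grammar. It does not reproduce LRgen's four-matrix format, whose layout is not described here, and it applies no chain-reduction elimination. The grammar, the table encoding and the names ACTION, GOTO_S, RHS_LEN and accepts are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy grammar (an assumption for illustration, not taken from LRgen):
//   rule 1: S -> ( S )
//   rule 2: S -> x
namespace {

const int ACC = 99;  // accept marker

// SLR action table. Columns: 0 '(', 1 ')', 2 'x', 3 end of input.
// Entry > 0: shift and go to that state; entry < 0: reduce by rule -entry;
// entry 0: syntax error.
const int ACTION[6][4] = {
    {2,  0, 3,   0},  // state 0: start
    {0,  0, 0, ACC},  // state 1: S' -> S .
    {2,  0, 3,   0},  // state 2: S -> ( . S )
    {0, -2, 0,  -2},  // state 3: S -> x .
    {0,  5, 0,   0},  // state 4: S -> ( S . )
    {0, -1, 0,  -1},  // state 5: S -> ( S ) .
};

// Goto table for the only nonterminal, S (0 means no transition).
const int GOTO_S[6] = {1, 0, 4, 0, 0, 0};

// Number of right-hand-side symbols per rule (index 0 unused).
const std::size_t RHS_LEN[3] = {0, 3, 1};

// Maps an input character to its table column, or -1 if it is not a terminal.
int terminal_index(char c) {
    switch (c) {
        case '(': return 0;
        case ')': return 1;
        case 'x': return 2;
        default:  return -1;
    }
}

}  // namespace

// Returns true if the input is a sentence of the toy grammar.
bool accepts(const std::string& input) {
    std::vector<int> stack(1, 0);  // state stack, starting in state 0
    std::size_t i = 0;
    for (;;) {
        const int tok = (i < input.size()) ? terminal_index(input[i]) : 3;
        if (tok < 0) return false;                 // unknown character
        const int act = ACTION[stack.back()][tok];
        if (act == ACC) return true;
        if (act > 0) {                             // shift
            stack.push_back(act);
            ++i;
        } else if (act < 0) {                      // reduce
            const std::size_t len = RHS_LEN[-act];
            assert(stack.size() > len);            // state 0 always stays below
            stack.resize(stack.size() - len);
            const int next = GOTO_S[stack.back()];
            if (next == 0) return false;
            stack.push_back(next);
        } else {
            return false;                          // error entry
        }
    }
}
```

Calling accepts("((x))") returns true, while accepts("(x") returns false. The parser itself is only the short loop; all grammar-specific knowledge lives in the tables, which is why the table format a generator chooses largely determines the size and speed of the parsers it produces.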


LRgen was created by Parsetec.