GOLD (parser)

GOLD is a freeware parsing system that was designed to support multiple programming languages.

The system uses the LALR algorithm for parsing and a DFA for the lexer. It is designed around the principle of logically separating the generation of the LALR and DFA parse tables from the actual implementation of the parsing algorithms. This allows parsers to be implemented in different programming languages while maintaining the same grammars and development process.

Components

The GOLD system consists of three logical components: the "Builder", the "Engine", and a "Compiled Grammar Table" file type, which functions as an intermediary between the first two.

Builder

GOLD Builder Application

The Builder is the primary component and is used to read a grammar and construct the LALR and DFA parse tables. Essentially, the Builder performs the same table construction tasks as most compiler-compilers such as YACC and ANTLR.

Once the LALR and DFA parse tables are successfully constructed, the Builder can save them into a Compiled Grammar Table file. This file is later used by the "Engine".

In addition, the Builder can embed table information into a "skeleton program". Skeleton programs are generated from templates designed for each implementation of the Engine.

Currently, the Builder component is only available for Windows 32-bit operating systems.

Compiled Grammar Table File

The Compiled Grammar Table file is used to store table information generated by the Builder.
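
The idea of tables stored as data, separate from the code that interprets them, can be illustrated with a minimal sketch. It assumes JSON as a stand-in storage format and uses hypothetical table contents and function names (`save_tables`, `load_tables`); it does not reproduce the actual Compiled Grammar Table layout, which is specific to GOLD.

```python
import json

# Hypothetical, hand-written tables for a toy language. This is NOT the
# real Compiled Grammar Table layout; JSON is only a stand-in format.
TOY_TABLES = {
    "symbols": ["EOF", "Identifier"],
    "dfa": {
        "initial_state": "0",
        "states": {
            "0": {"accept": None, "edges": {"letter": "1"}},
            "1": {"accept": "Identifier", "edges": {"letter": "1", "digit": "1"}},
        },
    },
}


def save_tables(tables, path):
    """Write the tables to disk, as a builder-like tool would."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(tables, handle)


def load_tables(path):
    """Read the tables back, as an engine-like program would."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

Because the file contains only data, a program written in any language with a suitable reader could load the same tables.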

Engine

The Engine component is written in a specific programming language and/or for a specific development platform. The Engine implements the LALR and DFA algorithms and interacts with the developer's software.
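
As an illustration of the DFA half of an Engine, the following is a minimal sketch of a table-driven lexer that returns the longest match at each position. The tables `DFA_EDGES` and `DFA_ACCEPT`, the character classes, and the function names are hypothetical and written by hand for this example; they are not taken from any GOLD Engine.

```python
def char_class(ch):
    """Map a character to the (hypothetical) class used as a table key."""
    if ch.isalpha():
        return "letter"
    if ch.isdigit():
        return "digit"
    if ch.isspace():
        return "space"
    return ch


# Hypothetical DFA: state -> {character class -> next state}.
DFA_EDGES = {
    0: {"letter": 1, "digit": 2, "space": 3},
    1: {"letter": 1, "digit": 1},
    2: {"digit": 2},
    3: {"space": 3},
}
# Accepting states and the token each one produces.
DFA_ACCEPT = {1: "Identifier", 2: "Number", 3: "Whitespace"}


def tokenize(text):
    """Split text into (token name, lexeme) pairs using the tables above."""
    tokens = []
    pos = 0
    while pos < len(text):
        state = 0
        last_accept = None  # (token name, end position) of the longest match
        i = pos
        while i < len(text):
            next_state = DFA_EDGES.get(state, {}).get(char_class(text[i]))
            if next_state is None:
                break
            state = next_state
            i += 1
            if state in DFA_ACCEPT:
                last_accept = (DFA_ACCEPT[state], i)
        if last_accept is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        name, end = last_accept
        if name != "Whitespace":  # whitespace is recognized but discarded
            tokens.append((name, text[pos:end]))
        pos = end
    return tokens
```

The loop itself contains no knowledge of the language being scanned; replacing the two tables changes the tokens it recognizes.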

Because different programming languages take different approaches to program design, the approach and design of each implementation of the Engine will vary. As a result, an implementation of the Engine written for Visual Basic 6 will differ greatly from one written for ANSI C.

Grammars

GOLD grammars are based directly on Backus-Naur Form, regular expressions, and set notation.

The following grammar defines the syntax for a minimal general-purpose programming language called "Simple".

"Name"    = 'Simple'
"Author"  = 'Devin Cook'
"Version" = '2.1' 
"About"   = 'This is a very simple grammar designed for use in examples'

"Case Sensitive" = False 
"Start Symbol"   = <Statements>

{String Ch 1} = {Printable} - ['']
{String Ch 2} = {Printable} - ["]

Identifier    = {Letter}{AlphaNumeric}*    

! String allows either single or double quotes

StringLiteral = ''  {String Ch 1}* ''
              | '"' {String Ch 2}* '"'

NumberLiteral = {Number}+('.'{Number}+)?

Comment Start = '/*'
Comment End   = '*/'
Comment Line  = '//' 

<Statements>  ::= <Statements> <Statement>
               |  <Statement>

<Statement>   ::= display <Expression>
               |  display <Expression> read Identifier
               |  assign Identifier '=' <Expression>
               |  while <Expression> do <Statements> end
               |  if <Expression> then <Statements> end
               |  if <Expression> then <Statements> else <Statements> end
               
<Expression>  ::= <Expression> '>'  <Add Exp>
               |  <Expression> '<'  <Add Exp>
               |  <Expression> '<=' <Add Exp>
               |  <Expression> '>=' <Add Exp>
               |  <Expression> '==' <Add Exp>
               |  <Expression> '<>' <Add Exp>
               |  <Add Exp>

<Add Exp>     ::= <Add Exp> '+' <Mult Exp>
               |  <Add Exp> '-' <Mult Exp>
               |  <Add Exp> '&' <Mult Exp>
               |  <Mult Exp>

<Mult Exp>    ::= <Mult Exp> '*' <Negate Exp>
               |  <Mult Exp> '/' <Negate Exp>
               |  <Negate Exp>

<Negate Exp>  ::= '-' <Value>
               |  <Value>

<Value>       ::= Identifier
               |  StringLiteral
               |  NumberLiteral
               |  '(' <Expression> ')'
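
For readers more familiar with conventional regular expressions, the three terminal definitions in the grammar above can be approximated in Python as follows. This is a rough sketch under stated assumptions: {Letter}, {AlphaNumeric}, and {Number} are taken to be ASCII letters and digits, and {Printable} is approximated as any character other than a newline. The character sets predefined by GOLD may be broader, and the helper `matches` is hypothetical.

```python
import re

# Identifier = {Letter}{AlphaNumeric}*   (assumes ASCII letters and digits)
IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9]*")

# NumberLiteral = {Number}+('.'{Number}+)?
NUMBER_LITERAL = re.compile(r"[0-9]+(\.[0-9]+)?")

# StringLiteral: single- or double-quoted; each form excludes its own
# quote character. {Printable} is approximated as "anything but newline".
STRING_LITERAL = re.compile(r"'[^'\n]*'" + "|" + r'"[^"\n]*"')


def matches(pattern, text):
    """Return True if the whole text is one instance of the terminal."""
    return pattern.fullmatch(text) is not None
```

Under these assumptions, `matches(IDENTIFIER, "x1")` holds while `matches(IDENTIFIER, "1x")` does not, and a number such as `3.` is rejected because the fractional part requires at least one digit.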

Development Overview

1. Design the Grammar

The first step consists of writing and testing a grammar for the language being parsed. The grammar can be written in any text editor, such as Notepad, or in the editor that is built into the Builder. At this stage, no coding is required.

Grammars are written using the GOLD Meta-language, which is based on the notation used by Backus-Naur Form and regular expressions.

2. Generate Skeleton Program and Compiled Grammar Table File

Once the grammar is complete, it is analyzed by the Builder, the LALR and DFA parse tables are constructed, and any ambiguities or problems with the grammar are reported. Afterwards, the tables are saved to a Compiled Grammar Table file to be used later by a parsing engine. At this point, the GOLD Parser Builder is no longer needed.

3. Select a Parsing Engine

In the final stage, the tables are read by an Engine. At this point, the development process is dependent on the selected implementation language.
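
What an Engine does with the parse tables can be sketched as a generic shift-reduce loop. The example below is a minimal illustration rather than any actual GOLD Engine: the `ACTION`, `GOTO`, and `RULES` tables are hypothetical and were built by hand for the toy grammar E ::= E '+' n | n, where a real Engine would load them from a Compiled Grammar Table file.

```python
# Hypothetical hand-built tables for the toy grammar:
#   rule 1: E ::= E '+' n
#   rule 2: E ::= n
# "$" marks the end of input.
ACTION = {
    (0, "n"): ("shift", 1),
    (1, "+"): ("reduce", 2),
    (1, "$"): ("reduce", 2),
    (2, "+"): ("shift", 3),
    (2, "$"): ("accept", None),
    (3, "n"): ("shift", 4),
    (4, "+"): ("reduce", 1),
    (4, "$"): ("reduce", 1),
}
GOTO = {(0, "E"): 2}
# rule number -> (head symbol, body length, semantic action)
RULES = {
    1: ("E", 3, lambda left, _plus, right: left + right),
    2: ("E", 1, lambda number: number),
}


def parse(tokens):
    """Run the shift-reduce loop over (kind, value) tokens; return the result."""
    stream = list(tokens) + [("$", None)]
    states = [0]   # stack of parser states
    values = []    # stack of semantic values, parallel to the shifted symbols
    pos = 0
    while True:
        kind, value = stream[pos]
        action = ACTION.get((states[-1], kind))
        if action is None:
            raise SyntaxError(f"unexpected token {kind!r} at index {pos}")
        op, arg = action
        if op == "shift":
            states.append(arg)
            values.append(value)
            pos += 1
        elif op == "reduce":
            head, length, build = RULES[arg]
            args = values[-length:]
            del values[-length:]
            del states[-length:]
            values.append(build(*args))
            states.append(GOTO[(states[-1], head)])
        else:  # accept
            return values[0]
```

Given the tokens for `1 + 2 + 3`, that is `[("n", 1), ("+", None), ("n", 2), ("+", None), ("n", 3)]`, the loop reduces left to right and returns 6. As with the lexer, the loop is independent of the grammar; only the tables change from one language to another.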

Supported Programming Languages