Ambiguous grammar

From Wikipedia, the free encyclopedia

In computer science, a context-free grammar is said to be ambiguous if there is some string that it can generate in more than one way (i.e., the string has more than one parse tree or, equivalently, more than one leftmost derivation). A language is inherently ambiguous if every grammar that generates it is ambiguous.

Some programming languages have ambiguous grammars; in this case, semantic information is needed to select the intended parse of an ambiguous construct. For example, in C the following:

x * y ;

can be interpreted either as the declaration of an identifier y of type pointer to x, or as an expression in which x is multiplied by y and the result is discarded. To choose between the two possible interpretations, a compiler must consult its symbol table to find out whether x has been declared as a typedef name that is visible at this point.
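The symbol table lookup described above can be sketched as follows. This is only an illustration: the function name classify_statement and the set typedef_names are hypothetical stand-ins for a real compiler's parser and symbol table, and the sketch handles nothing beyond the four-token statement shown.

```python
def classify_statement(tokens, typedef_names):
    """Classify a statement of the form `x * y ;`.

    `typedef_names` is a hypothetical stand-in for the compiler's symbol
    table: the set of typedef names visible at this point in the program.
    """
    if len(tokens) != 4 or tokens[1] != "*" or tokens[3] != ";":
        raise ValueError("expected a statement of the form: x * y ;")
    # The grammar alone cannot decide; only the meaning of the first
    # identifier selects between the two parses.
    if tokens[0] in typedef_names:
        return "declaration"  # y is declared as a pointer to x
    return "expression"       # x is multiplied by y, result discarded
```

The same token sequence is classified differently depending only on whether x is currently known as a typedef name, which is why the choice cannot be made by the grammar alone.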

Example

The context-free grammar

A → A + A | A − A | a

is ambiguous since there are two leftmost derivations for the string a + a + a:

     A → A + A              A → A + A
       → a + A                → A + A + A
       → a + A + A            → a + A + A
       → a + a + A            → a + a + A
       → a + a + a            → a + a + a

As another example, the same grammar is ambiguous because there are two parse trees for the string a + a − a, one grouping it as (a + a) − a and the other as a + (a − a):

[Figure: the two parse trees for a + a − a]

The language that it generates, however, is not inherently ambiguous; the following is an unambiguous grammar generating the same language:

A → A + a | A − a | a
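The difference between the two grammars above can be checked by counting parse trees. The following is a minimal sketch, assuming the input is already split into tokens and that an ASCII "-" stands in for the minus sign; the function names are illustrative and the counters are written by hand for these two grammars only.

```python
from functools import lru_cache

OPS = {"+", "-"}  # ASCII "-" stands in for the minus sign of the grammar


def count_trees_ambiguous(tokens):
    """Count parse trees under A -> A + A | A - A | a."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of distinct parse trees by which A derives tokens[i:j].
        if j - i == 1:
            return 1 if tokens[i] == "a" else 0
        total = 0
        # Try each operator position k as the top-level split "A op A".
        for k in range(i + 1, j - 1):
            if tokens[k] in OPS:
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))


def count_trees_left_recursive(tokens):
    """Count parse trees under A -> A + a | A - a | a."""
    tokens = tuple(tokens)

    def count(j):
        # Number of distinct parse trees by which A derives tokens[:j].
        if j == 1:
            return 1 if tokens[0] == "a" else 0
        # The only other productions end in "op a", so the split is forced.
        if j >= 3 and tokens[j - 1] == "a" and tokens[j - 2] in OPS:
            return count(j - 2)
        return 0

    return count(len(tokens))
```

For the token sequence of a + a + a, the first function finds one tree for each choice of top-level operator, while the second never has a choice to make, because its productions fix the right operand to a single a.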

Recognizing an ambiguous grammar

The general problem of deciding whether a context-free grammar is ambiguous is undecidable. No algorithm can determine, for every grammar, whether it is ambiguous, because the undecidable Post correspondence problem can be encoded as an ambiguity problem.
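Undecidability does not rule out a one-sided check: a search that finds a string with two leftmost derivations proves a grammar ambiguous, while a search that finds nothing up to some length proves nothing. The sketch below illustrates this under stated assumptions: the grammar is a Python dict from nonterminals to lists of right-hand sides, it has no empty productions and no unit productions (so the search terminates), and the name find_ambiguous_string is hypothetical.

```python
from collections import Counter


def find_ambiguous_string(grammar, start, max_len):
    """Search for a string with at least two leftmost derivations.

    Returns a witness string of at most max_len symbols, or None if no
    witness exists up to that length. None does NOT prove the grammar
    unambiguous. Assumes no empty and no unit productions.
    """
    derivations = Counter()
    stack = [(start,)]  # sentential forms still to be expanded
    while stack:
        form = stack.pop()
        # Index of the leftmost nonterminal, if any.
        idx = next((i for i, sym in enumerate(form) if sym in grammar), None)
        if idx is None:
            # A terminal string: one more leftmost derivation reaches it.
            derivations[form] += 1
            if derivations[form] > 1:
                return " ".join(form)
            continue
        for rhs in grammar[form[idx]]:
            new_form = form[:idx] + tuple(rhs) + form[idx + 1:]
            # Forms never shrink, so anything too long can be pruned.
            if len(new_form) <= max_len:
                stack.append(new_form)
    return None


# The two example grammars from this article, with "-" for the minus sign.
AMBIGUOUS = {"A": [("A", "+", "A"), ("A", "-", "A"), ("a",)]}
UNAMBIGUOUS = {"A": [("A", "+", "a"), ("A", "-", "a"), ("a",)]}
```

The search space grows quickly with max_len, so this is useful only for small bounds; it is a brute-force illustration, not a practical ambiguity checker.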

Parsing an ambiguous grammar with a deterministic parser is problematic (see deterministic context-free grammar), but nondeterministic parsing carries a substantial efficiency penalty. Most constructs of interest in parsing can be described by unambiguous grammars. Some ambiguous grammars can be converted into unambiguous grammars, but no general procedure for doing so is possible, just as no algorithm exists for detecting ambiguous grammars. Compiler generators such as Yacc include features for resolving some kinds of ambiguity, for example through precedence and associativity declarations.
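As a rough illustration of what an associativity rule achieves, the sketch below parses strings of the ambiguous example grammar A → A + A | A − A | a while always grouping to the left, so that each input receives exactly one tree. It is a hand-written loop, not Yacc's conflict-resolution mechanism; the function name and the nested-tuple tree format are assumptions made for this example, and "-" stands in for the minus sign.

```python
def parse_left_assoc(tokens):
    """Parse `a (op a)*` into a left-nested tree of tuples.

    Both operators are treated as left-associative with equal precedence,
    so a + a - a is read as (a + a) - a.
    """
    tokens = list(tokens)
    if not tokens or tokens[0] != "a":
        raise ValueError("expected operand 'a' at position 0")
    tree = tokens[0]
    pos = 1
    while pos < len(tokens):
        op = tokens[pos]
        if op not in ("+", "-"):
            raise ValueError(f"expected operator at position {pos}")
        if pos + 1 >= len(tokens) or tokens[pos + 1] != "a":
            raise ValueError(f"expected operand 'a' at position {pos + 1}")
        # Fold the tree built so far into the left operand of the new node.
        tree = (tree, op, tokens[pos + 1])
        pos += 2
    return tree
```

Choosing right associativity instead would amount to nesting on the right operand; either rule removes the ambiguity by discarding all but one of the possible trees.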

Inherently ambiguous languages

While some languages (a language being the set of strings that a grammar can generate) have both ambiguous and unambiguous grammars, there exist languages for which no unambiguous grammar can exist. An example of an inherently ambiguous language is the union of {a^n b^m c^m d^n | n, m > 0} with {a^n b^n c^m d^m | n, m > 0}. This language is context-free, since it is a union of two context-free languages, but Introduction to Automata Theory... contains a proof that there is no way to unambiguously parse strings in the (non-context-free) subset {a^n b^n c^n d^n | n > 0}, which is the intersection of the two languages.
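To make the two component languages concrete, the following sketch tests membership directly, reading the exponents n and m as repetition counts of each letter. It only illustrates which strings fall in both languages; it says nothing about the proof of inherent ambiguity. All function names are chosen for this example.

```python
import re

# One or more a's, then b's, then c's, then d's, each captured as a block.
_BLOCKS = re.compile(r"(a+)(b+)(c+)(d+)")


def block_lengths(s):
    """Return the lengths of the a, b, c and d blocks, or None if malformed."""
    match = _BLOCKS.fullmatch(s)
    return tuple(len(g) for g in match.groups()) if match else None


def in_first(s):
    """Membership in the first language: as match ds, and bs match cs."""
    lens = block_lengths(s)
    return lens is not None and lens[0] == lens[3] and lens[1] == lens[2]


def in_second(s):
    """Membership in the second language: as match bs, and cs match ds."""
    lens = block_lengths(s)
    return lens is not None and lens[0] == lens[1] and lens[2] == lens[3]


def in_union(s):
    return in_first(s) or in_second(s)


def in_intersection(s):
    # Both conditions hold only when all four blocks have the same length.
    return in_first(s) and in_second(s)
```

A string such as aabbccdd satisfies both membership conditions at once, which is the situation in which the cited proof shows that two distinct parses cannot be avoided.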

References

Compilers: Principles, Techniques, and Tools