Useless rules

In theoretical computer science, in particular in the theory of formal languages, useless rules of a formal grammar are those rules of symbol production that are unreachable or unproductive, that is, that can or need never be applied.

Definition

Given a context-free grammar, a nonterminal symbol X is called productive, or generating, if there is a derivation X * w for some string w of terminal symbols. A nonterminal symbol X is called reachable if there is a derivation S* αXβ for some strings α, β of non-terminal and terminal symbols, and where S denotes the grammar's start symbol.

A rule with an unproductive or unreachable symbol on its left-hand side can be deleted from the grammar without changing the accepted (a.k.a. generated) language. Likewise, an alternative containing such a symbol can be deleted from the right-hand side of a rule without changing the language. Such rules and alternatives are called useless.[1]

For formal grammars that are not context-free, similar definitions apply.

Examples

Denoting nonterminal and terminal symbols by upper- and lower-case letters, respectively, in the following regular grammar with start symbol S

SBb | Cc | Ee
BBb | b
CCc | c
DBd | Cd | d
EEe

the nonterminal D (colored red for convenience) is unreachable, and E (green) is unproductive. Hence, omitting the last two rules doesn't change the language accepted by the grammar, nor does omitting the alternative "| Ee" from the right-hand side of the rule for S.

Cleaning Useless Rules

Hopcroft, et al.[2] give an algorithm to eliminate useless rules from a context-free grammar.

Aiken and Murphy[3] give a fixpoint algorithm to detect which nonterminals of a given regular tree grammar are unproductive.

References

  1. John E. Hopcroft, Rajeev Motwani, Jeffrey D. Ullman (2003). Introduction to Automata Theory, Languages, and Computation. Addison Wesley.; here: Sect.7.1.1, p.256
  2. Theorem 7.2, Sect.7.1, p.255ff
  3. Aiken, A.; Murphy, B. (1991). "Implementing Regular Tree Expressions". ACM Conference on Functional Programming Languages and Computer Architecture. pp. 427–447.; here: Sect.4