Useless rules
In theoretical computer science, in particular in the theory of formal languages, useless rules of a formal grammar are those rules of symbol production that are unreachable or unproductive, that is, that can or need never be applied.
Definition
Given a context-free grammar, a nonterminal symbol X is called productive, or generating, if there is a derivation X ⇒* w for some string w of terminal symbols. A nonterminal symbol X is called reachable if there is a derivation S ⇒* αXβ for some strings α, β of non-terminal and terminal symbols, and where S denotes the grammar's start symbol.
A rule with an unproductive or unreachable symbol on its left-hand side can be deleted from the grammar without changing the accepted (a.k.a. generated) language. Likewise, an alternative containing such a symbol can be deleted from the right-hand side of a rule without changing the language. Such rules and alternatives are called useless.[1]
For formal grammars that are not context-free, similar definitions apply.
Examples
Denoting nonterminal and terminal symbols by upper- and lower-case letters, respectively, in the following regular grammar with start symbol S
S → Bb | Cc | Ee |
B → Bb | b |
C → Cc | c |
D → Bd | Cd | d |
E → Ee |
the nonterminal D is unreachable, and E is unproductive. Hence, omitting the last two rules doesn't change the language accepted by the grammar, nor does omitting the alternative "| Ee" from the right-hand side of the rule for S.
Cleaning Useless Rules
Hopcroft, et al.[2] give an algorithm to eliminate useless rules from a context-free grammar.
Aiken and Murphy[3] give a fixpoint algorithm to detect which nonterminals of a given regular tree grammar are unproductive.
References
- ↑ John E. Hopcroft; Rajeev Motwani; Jeffrey D. Ullman (2003). Introduction to Automata Theory, Languages, and Computation. Addison Wesley.; here: Sect.7.1.1, p.256
- ↑ Theorem 7.2, Sect.7.1, p.255ff
- ↑ Aiken, A.; Murphy, B. (1991). "Implementing Regular Tree Expressions". ACM Conference on Functional Programming Languages and Computer Architecture. pp. 427–447.; here: Sect.4