Myhill–Nerode theorem

In the theory of formal languages, the Myhill–Nerode theorem provides a necessary and sufficient condition for a language to be regular. It is used almost exclusively to prove that a given language is not regular.

The theorem is named for John Myhill and Anil Nerode, who proved it at the University of Chicago in 1958.

Statement of theorem

Given a language L, define a relation R_L on strings by the rule x R_L y if there is no distinguishing extension for x and y, that is, no string z such that exactly one of the strings xz and yz is in L. It is easy to show that R_L is an equivalence relation on strings, and thus it divides the set of all finite strings into one or more equivalence classes.
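
The relation can be explored with a bounded search. The following Python sketch is illustrative only: the names `distinguishing_extension` and `in_language`, the length bound `max_len`, and the example language are assumptions introduced here. Because it tries only extensions up to a fixed length, it can show that two strings are not related by R_L, but failing to find an extension does not prove that they are related.

```python
from itertools import product

def distinguishing_extension(in_language, x, y, alphabet, max_len):
    """Return some z with len(z) <= max_len such that exactly one of
    x + z and y + z is in the language, or None if none is found."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            z = "".join(letters)
            if in_language(x + z) != in_language(y + z):
                return z
    return None

# Hypothetical example language: strings over {a, b} that end in "ab".
def ends_in_ab(s):
    return s.endswith("ab")

# "a" and "b" are separated by z = "b": "ab" is in the language, "bb" is not.
print(distinguishing_extension(ends_in_ab, "a", "b", "ab", 3))   # prints b
# No extension of length <= 3 separates "b" from "bb".
print(distinguishing_extension(ends_in_ab, "b", "bb", "ab", 3))  # prints None
```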

The Myhill–Nerode theorem states that the number of states in the smallest deterministic finite automaton accepting L is equal to the number of equivalence classes of R_L. The intuition is that if one starts with such a minimal automaton, then any strings x and y that drive it to the same state will be in the same equivalence class; and if one starts with a partition into equivalence classes, one can easily construct an automaton that uses its state to keep track of the equivalence class containing the part of the string seen so far.

Use and consequences

A consequence of the Myhill–Nerode theorem is that a language L is regular (i.e., accepted by a finite state machine) if and only if the number of equivalence classes of R_L is finite.

An immediate corollary is that if R_L has infinitely many equivalence classes, then L is not regular. It is this corollary that is frequently used to prove that a language is non-regular.
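
As an illustration of this argument, consider the language { a^n b^n : n ≥ 0 }, a standard example that is not taken from the text above. The strings a^i and a^j with i ≠ j are separated by the extension b^i, so every a^i lies in its own equivalence class and the number of classes is infinite. The Python sketch below checks this for small i and j only; the helper names and the bound `limit` are assumptions made for the example, and a finite check illustrates the proof rather than replacing it.

```python
def in_anbn(s):
    """Membership test for { a^n b^n : n >= 0 }."""
    n = len(s) // 2
    return len(s) % 2 == 0 and s == "a" * n + "b" * n

def pairwise_distinguished(limit):
    """Check that b^i separates a^i from a^j for all i != j below limit."""
    for i in range(limit):
        for j in range(limit):
            if i == j:
                continue
            z = "b" * i
            # a^i b^i should be in the language, a^j b^i should not.
            if not (in_anbn("a" * i + z) and not in_anbn("a" * j + z)):
                return False
    return True

print(pairwise_distinguished(20))  # prints True
```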

For example, the language consisting of binary numbers that are divisible by 3 is regular. Treating the empty string as representing 0, there are 3 equivalence classes: numbers that give remainders 0, 1 and 2 when divided by 3. The minimal automaton accepting this language has three states corresponding to the equivalence classes.
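
The three-state automaton can be sketched in Python as follows. This is an illustrative construction under stated assumptions, not a definition taken from the text: the state is the remainder of the number read so far, reading a bit b moves remainder r to (2r + b) mod 3, the most significant bit comes first, and the empty string is treated as 0. The function name `divisible_by_3` is hypothetical.

```python
from itertools import product

def divisible_by_3(bits):
    """Run the three-state automaton on a string of '0'/'1' characters."""
    state = 0  # remainder of the prefix read so far; the empty prefix counts as 0
    for b in bits:
        state = (2 * state + int(b)) % 3
    return state == 0  # remainder 0 is the only accepting state

# Compare against direct arithmetic for every binary string of length 1..8.
for n in range(1, 9):
    for letters in product("01", repeat=n):
        s = "".join(letters)
        assert divisible_by_3(s) == (int(s, 2) % 3 == 0)

print(divisible_by_3("110"))  # 6 is divisible by 3: prints True
print(divisible_by_3("111"))  # 7 is not: prints False
```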