Regular language

From Wikipedia, the free encyclopedia

A regular language is a formal language (i.e., a possibly infinite set of finite sequences of symbols from a finite alphabet) that satisfies the following equivalent properties:

it can be accepted by a deterministic finite state machine
it can be accepted by a nondeterministic finite state machine
it can be accepted by an alternating finite automaton
it can be described by a regular expression
it can be generated by a regular grammar
it can be generated by a prefix grammar
it can be accepted by a read-only Turing machine
it can be defined in monadic second-order logic

1 Regular languages over an alphabet
2 Closure properties
3 Deciding whether a language is regular
4 Finite languages
5 See also
6 References
7 External links

[edit] Regular languages over an alphabet

The collection of regular languages over an alphabet Σ is defined recursively as follows:

the empty language Ø is a regular language.
the empty string language { ε } is a regular language.
For each a ∈ Σ, the singleton language { a } is a regular language.
If A and B are regular languages, then A U B (union), A • B (concatenation), and A* (Kleene star) are regular languages.
No other languages over Σ are regular.

All finite languages are regular. Other typical examples include the language consisting of all strings over the alphabet {a, b} which contain an even number of a's, or the language consisting of all strings of the form: several a's followed by several b's.

If a language is not regular, it requires a machine with at least Ω(log log n) space to recognize (where n is the input size). In other words, DSPACE(o(log log n)) equals the class of regular languages. In practice, most nonregular problems are solved by machines taking at least logarithmic space.

[edit] Closure properties

The regular languages are closed under the following operations: That is, if L and P are regular languages, the following languages are regular as well:

the complement $\bar{L}$ of L
the Kleene star $L *$ of L
the image φ(L) of L under a homomorphism
the concatenation $L \circ P$ of L and P
the union $L \cup P$ of L and P
the intersection $L \cap P$ of L and P
the difference $L - P$ of L and P
the reverse $L R$ of L
HALF(L), the set of strings that are the first half of a string in L

[edit] Deciding whether a language is regular

To locate the regular languages in the Chomsky hierarchy, one notices that every regular language is context-free. The converse is not true: for example the language consisting of all strings having the same number of a's as b's is context-free but not regular. To prove that a language such as this is not regular, one uses the Myhill-Nerode theorem or the pumping lemma.

There are two purely algebraic approaches to defining regular languages. If Σ is a finite alphabet and Σ* denotes the free monoid over Σ consisting of all strings over Σ, f : Σ* → M is a monoid homomorphism where M is a finite monoid, and S is a subset of M, then the set f⁻¹(S) is regular. Every regular language arises in this fashion.

If L is any subset of Σ*, one defines an equivalence relation ~ on Σ* as follows: u ~ v is defined to mean

uw ∈ L if and only if vw ∈ L for all w ∈ Σ*

The language L is regular if and only if the number of equivalence classes of ~ is finite; if this is the case, this number is equal to the number of states of the minimal deterministic finite automaton accepting L.

[edit] Finite languages

A specific subset within the class of regular languages is the finite languages - those containing only a finite number of words. These are obviously regular as one can create a regular expression that is the union of every word in the language, and thus are regular.

[edit] See also

Pumping lemma for regular languages

[edit] References

Michael Sipser (1997). Introduction to the Theory of Computation. PWS Publishing. ISBN 0-534-94728-X. Chapter 1: Regular Languages, pp.31–90. Subsection "Decidable Problems Concerning Regular Languages" of section 4.1: Decidable Languages, pp.152–155.

[edit] External links

Department of Computer Science at the University of Western Ontario: Grail+, http://www.csd.uwo.ca/Research/grail/. A software package to manipulate regular expressions, finite-state machines and finite languages. Free for non-commercial use.

Chalchalero! http://www.ucse.edu.ar/fma/sepa/chalchalero.htm. A free visual software to manipulate regular expressions, regular grammars, finite-state machines and finite languages developed by the SEPa! Project Team (Universidad Católica de Santiago del Estero)

Automata theory: formal languages and formal grammars
Chomsky hierarchy	Grammars	Languages	Minimal automaton
Type-0	Unrestricted	Recursively enumerable	Turing machine
n/a	(no common name)	Recursive	Decider
Type-1	Context-sensitive	Context-sensitive	Linear-bounded
Type-2	Context-free	Context-free	Pushdown
Type-3	Regular	Regular	Finite
Each category of languages or grammars is a proper subset of the category directly above it.