Regular grammar

From Wikipedia, the free encyclopedia

In computer science, a right regular grammar is a formal grammar (N, Σ, P, S) such that all the production rules in P are of one of the following forms:

  1. A → a - where A is a non-terminal in N and a is a terminal in Σ
  2. A → aB - where A and B are in N and a is in Σ
  3. A → ε - where A is in N and ε denotes the empty string, i.e. the string of length 0.

In a left regular grammar, all rules obey the forms

  1. A → a - where A is a non-terminal in N and a is a terminal in Σ
  2. A → Ba - where A and B are in N and a is in Σ
  3. A → ε - where A is in N and ε is the empty string.

As an example, consider the right regular grammar G with N = {S, A}, Σ = {a, b, c}, where P consists of the following rules

S → aS
S → bA
A → ε
A → cA

and S is the start symbol. This grammar describes the same language as the regular expression a*bc*.
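The claimed equivalence with a*bc* can be spot-checked mechanically. The following is a minimal sketch, not a proof: it assumes every symbol is a single character, encodes ε as the empty string, and uses the illustrative names RULES and generates, which are not part of the article. It compares the grammar with the regular expression on every word over {a, b, c} of length at most 6.

```python
import re
from itertools import product

# Productions of the example grammar; the empty string "" stands for ε.
RULES = {"S": ["aS", "bA"], "A": ["", "cA"]}

def generates(word, nonterminal="S"):
    """Return True if `nonterminal` derives `word` using RULES."""
    for body in RULES[nonterminal]:
        if body == "":
            # A → ε: only the empty word can be derived.
            if word == "":
                return True
        elif len(body) == 1:
            # A → a: the word must be exactly this terminal.
            if word == body:
                return True
        else:
            # A → aB: consume one terminal, then continue from B.
            terminal, successor = body[0], body[1]
            if word.startswith(terminal) and generates(word[1:], successor):
                return True
    return False

# Collect every short word on which grammar and regular expression disagree.
pattern = re.compile(r"a*bc*")
mismatches = [
    "".join(letters)
    for length in range(7)
    for letters in product("abc", repeat=length)
    if generates("".join(letters)) != bool(pattern.fullmatch("".join(letters)))
]
print(mismatches)
```

If the grammar and the regular expression agree, the printed list is empty. The recursion terminates because every rule other than A → ε consumes one character of the word.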

A regular grammar is a left regular or right regular grammar.
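The rule forms listed above translate directly into a syntactic check. The sketch below is one possible encoding, assuming single-character symbols and rules given as (head, body) pairs with the empty string for ε; the function name classify and its return strings are illustrative choices.

```python
def classify(rules, nonterminals, terminals):
    """Classify a grammar by the shape of its rules.

    `rules` is a list of (head, body) pairs; `body` is a string of
    single-character symbols, and "" stands for ε.
    """
    def right_ok(body):
        # A → ε or A → a
        if len(body) <= 1:
            return all(s in terminals for s in body)
        # A → aB
        return len(body) == 2 and body[0] in terminals and body[1] in nonterminals

    def left_ok(body):
        # A → ε or A → a
        if len(body) <= 1:
            return all(s in terminals for s in body)
        # A → Ba
        return len(body) == 2 and body[0] in nonterminals and body[1] in terminals

    # Every left-hand side must be a single non-terminal.
    if any(head not in nonterminals for head, _ in rules):
        return "not regular"
    is_right = all(right_ok(body) for _, body in rules)
    is_left = all(left_ok(body) for _, body in rules)
    if is_right and is_left:
        return "left and right regular"
    if is_right:
        return "right regular"
    if is_left:
        return "left regular"
    return "not regular"

example = [("S", "aS"), ("S", "bA"), ("A", ""), ("A", "cA")]
print(classify(example, {"S", "A"}, {"a", "b", "c"}))
```

A grammar whose rules are all of the form A → a or A → ε satisfies both definitions at once, which is why the sketch reports that case separately.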

Introduction

The regular grammars describe exactly the regular languages and are in that sense equivalent to finite state automata and regular expressions. Moreover, the right regular grammars alone already generate exactly the regular languages, as do the left regular grammars alone.
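One direction of the equivalence with finite state automata can be sketched with the standard construction: each non-terminal of a right regular grammar becomes a state, a rule A → aB becomes a transition from A to B on a, a rule A → a leads to one extra accepting state, and a rule A → ε makes A accepting. The code below is an illustrative sketch under the assumption of single-character symbols; the names FINAL, grammar_to_nfa and accepts are hypothetical.

```python
FINAL = "<final>"  # extra accepting state, a name chosen for this sketch

def grammar_to_nfa(rules, start):
    """Build a nondeterministic finite automaton from right regular rules.

    `rules` is a list of (head, body) pairs with single-character symbols;
    the empty body "" stands for ε.
    """
    transitions = {}      # (state, symbol) -> set of successor states
    accepting = {FINAL}
    for head, body in rules:
        if body == "":
            accepting.add(head)                                       # A → ε
        elif len(body) == 1:
            transitions.setdefault((head, body), set()).add(FINAL)    # A → a
        else:
            transitions.setdefault((head, body[0]), set()).add(body[1])  # A → aB
    return transitions, start, accepting

def accepts(nfa, word):
    """Simulate the automaton by tracking the set of reachable states."""
    transitions, start, accepting = nfa
    current = {start}
    for symbol in word:
        current = set().union(
            *(transitions.get((state, symbol), set()) for state in current)
        )
    return bool(current & accepting)

nfa = grammar_to_nfa([("S", "aS"), ("S", "bA"), ("A", ""), ("A", "cA")], "S")
print(accepts(nfa, "aabcc"), accepts(nfa, "abca"))
```

Tracking a set of states rather than a single state keeps the simulation correct even when the grammar offers several rules for the same non-terminal and terminal.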

Every regular grammar is a context-free grammar.

If left regular and right regular rules are combined in a single grammar, the result is a linear grammar, which is context-free but not necessarily regular. Such mixed grammars can express languages that no regular grammar can. Regular grammars, which use either left regular or right regular rules but not both, can only express a smaller set of languages, called the regular languages. For illustration, the paradigmatic context-free language with strings of the form a^i b^i (i ≥ 0) is generated by the grammar G with N = {S, A}, Σ = {a, b}, and P with the rules

S → aA
A → Sb
S → ε

and S being the start symbol. This grammar has both left regular and right regular rules and is therefore no longer regular.
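The language of this mixed grammar can be explored by enumerating derivations. The sketch below assumes single-character symbols and relies on the fact that every sentential form of this grammar contains at most one non-terminal; the names RULES and derive, and the length bound max_len, are illustrative.

```python
from collections import deque

# Productions of the mixed grammar; "" stands for ε.
RULES = {"S": ["aA", ""], "A": ["Sb"]}

def derive(start="S", max_len=8):
    """Return all terminal words with at most `max_len` symbols derivable from `start`."""
    words, seen, queue = set(), {start}, deque([start])
    while queue:
        form = queue.popleft()
        # Locate the non-terminal, if any is left in the sentential form.
        pos = next((i for i, ch in enumerate(form) if ch in RULES), None)
        if pos is None:
            words.add(form)
            continue
        for body in RULES[form[pos]]:
            new_form = form[:pos] + body + form[pos + 1:]
            terminal_count = sum(ch not in RULES for ch in new_form)
            # Prune forms that already hold too many terminals.
            if terminal_count <= max_len and new_form not in seen:
                seen.add(new_form)
                queue.append(new_form)
    return words

print(sorted(derive(), key=len))
```

Every word found this way consists of some number of a's followed by the same number of b's, in line with the description of the language given above.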

Some textbooks and articles disallow empty production rules, and assume that the empty string is not present in languages.

See also

Automata theory: formal languages and formal grammars

Chomsky hierarchy  Grammars                    Languages                   Minimal automaton
Type-0             Unrestricted                Recursively enumerable      Turing machine
n/a                (no common name)            Recursive                   Decider
Type-1             Context-sensitive           Context-sensitive           Linear-bounded
n/a                Indexed                     Indexed                     Nested stack
Type-2             Context-free                Context-free                Nondeterministic Pushdown
n/a                Deterministic Context-free  Deterministic Context-free  Deterministic Pushdown
Type-3             Regular                     Regular                     Finite

Each category of languages or grammars is a proper subset of the category directly above it.