Matrix grammar

From Wikipedia, the free encyclopedia

A Matrix grammar is a formal grammar in which instead of single productions, productions are grouped together into finite sequences. A production cannot be applied separately, it must be applied in sequence. In the application of such a sequence of productions, the rewriting is done in accordance to the each production in sequence, the first one, second one etc. till the last production has been used for rewriting. The sequences are referred to as matrices.

[edit] Formal definition

A matrix grammar is an ordered quadruple

G = (VN,VT,X0,M).

where

  • VN is a finite set of non-terminals
  • VT is a finite set of terminals
  • X0 is a special element of VN, viz. the starting symbol
  • M is a finite set of non-empty sequences whose elements are ordered pairs
(P, Q), \quad P \in W(V) V_N W(V), \quad Q \in W(V), \quad V = V_N \cup V_T.

The pairs are called productions, written as P \to Q. The sequences are called matrices and can be written as

m = [P_1 \to Q_1, \ldots, P_r \to Q_r].

Let F be the set of all productions appearing in the matrices m of a matrix grammar G. Then the matrix grammar G is of type-i,i = 0,1,2,3, length-increasing, linear, λ-free, context-free or context-sensitive if and only if the grammar G1 = (VN,VT,X0,F) has the following property.

For a matrix grammar G, a binary relation \Rightarrow_G is defined; also represented as \Rightarrow. For any P, Q \in W(V), P \Rightarrow Q holds if and only if there exists an integer r \ge 1 such that the words

\alpha_1,, \ldots, \alpha_{r + 1}, \quad P_1, \ldots, P_r, \quad R_1, \ldots, R_r, \quad, R^1, \ldots, R^r

over V exist and

  • αi = P and αr + 1 = Q
  • m is one of the matrices of G
  • αi = RiPiRi and αi + 1 = RiQiRi.

If the above conditions are satisfied, it is also said that P \Rightarrow Q holds with (m,R1) as the specifications.

Let \Rightarrow^{*} be the reflexive transitive closure of the relation \Rightarrow. Then, the language generated by the matrix grammar G is given by

L(G) = \{P \in W(V_T) | X_0 \Rightarrow^{*} P\}.

[edit] Example

Consider the matrix grammar

G = ({S,A,B},{a,b,c},S,M)

where M is a collection containing the following matrices:

[S \rightarrow XY], \quad [X \rightarrow aXb, Y \rightarrow cY], \quad [X \rightarrow ab, Y \rightarrow c]

These matrices, which contain only context-free rules generate the context-sensitive language

L = \{a^nb^nc^n|n \ge 1\}.

This example can be found on pages 8 and 9 of [1].

[edit] References

[1] Ábrahám, S. Some questions of language theory. International Conference on Computational Linguistic, 1965. pp 1-11. Available at http://acl.ldc.upenn.edu/C/C65/C65-1001.pdf