Dyck language

Lattice of the 14 Dyck words of length 8 - [ and ] interpreted as up and down

In the theory of formal languages of computer science, mathematics, and linguistics, the Dyck language is the language consisting of balanced strings of square brackets [ and ]. It is important in the parsing of expressions that must have a correctly nested sequence of brackets, such as arithmetic or algebraic expressions. It is named after the mathematician Walther von Dyck.

Formal definition

Let be the alphabet consisting of the symbols [ and ] and let denote its Kleene closure. For any element with length we define partial functions and by

is with "" inserted into the th position
is with "" deleted from the th position

with the understanding that is undefined for and is undefined if . We define an equivalence relation on as follows: for elements we have if and only if there exists a finite sequence of applications of the and functions starting with and ending with , where the empty sequence is allowed. That the empty sequence is allowed accounts for the reflexivity of . Symmetry follows from the observation that any finite sequence of applications of to a string can be undone with a finite sequence of applications of . Transitivity is clear from the definition.

The equivalence relation partitions the language into equivalence classes. If we take to denote the empty string, then the language corresponding to the equivalence class is called the Dyck language.

Alternative definition

An alternative definition of the Dyck language can be formulated when we introduce the function.

for any .

where and are respectively the number of [ and ] in , i.e. counts the imbalance of [ over ]. If is positive then has more [ than ].

Now, the Dyck language can be defined as the language

Properties

Examples

We can define an equivalence relation on the Dyck language . For we have if and only if , i.e. and have the same length. This relation partitions the Dyck language where . Note that is empty for odd . E.g. there are 14 words in , i.e. [[[[]]]], [[[][]]], [[[]][]], [[][[]]], [[[]]][], [[][][]], [][[[]]], [[][]][], [[]][[]], [][[][]], [[]][][], [][[]][], [][][[]], [][][][].

Having introduced the Dyck words of length , we can introduce a relationship on them. For every we define a relation on ; for we have if and only if can be reached from by a series of proper swaps. A proper swap in a word swaps an occurrence of '][' with '[]'. For each the relation makes into a partially ordered set. The relation is reflexive because an empty sequence of proper swaps takes to . Transitivity follows because we can extend a sequence of proper swaps that takes to by concatenating it with a sequence of proper swaps that takes to forming a sequence that takes into . To see that is also antisymmetric we introduce an auxiliary function where ranges over all prefixes of . The following table illustrates that is strictly monotonic with respect to proper swaps.

Strict monotonicity of
partial sums of
] [
[ ]
partial sums of
Difference of partial sums 0 2 0 0

Hence so when there is a proper swap that takes into . Now if we assume that both and , then there are non-empty sequences of proper swaps such is taken into and vice versa. But then which is nonsensical. Therefore, whenever both and are in , we have , hence is antisymmetric.

The partial ordered set is shown in the illustration accompanying the introduction if we interpret a [ as going up and ] as going down.

See also

Notes

  1. Kambites, Communications in Algebra Volume 37 Issue 1 (2009) 193-208
  2. Barrington and Corbett, Information Processing Letters 32 (1989) 251-256

References

This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.