Pumping lemma for context-free languages
From Wikipedia, the free encyclopedia
The pumping lemma for context-free languages, also known as the Bar-Hillel lemma, is a lemma that gives a property that all context-free languages have. Its primary use is to prove a language is not context-free.
The pumping lemma for context-free languages cannot be used to prove that any arbitrary non-context-free language is not context-free. In some cases the more generalized Ogden's lemma must be used.
Contents |
[edit] Formal statement
If a language L is infinite and context-free, then there exists some integer p > 0 such that any string w in L with |w| ≥ p (where p is a pumping length) can be written as
- w = uvxyz
with strings u, v, x, y and z, such that |vxy| ≤ p, |vy| ≥ 1, and
- uv ixy iz is in L for every integer i ≥ 0.
[edit] Use of lemma
The pumping lemma for context-free languages can be used to show that certain languages are NOT context-free.
We can show that the language L = {ajbjcj, j>0} is NOT context-free.
- Suppose L is context-free
- Conditions:
- | vxy | ≤ p. That is, the middle portion is not too long.
- vy ≠ ε (empty). Since v and y are the pieces to be “pumped”, this condition says that at least one of the strings we pump must not be empty.
- For all i ≥ 0, uvixyiz is in L. That is, the two strings v and y may be “pumped” any number of times, including 0, and the resulting string will still be a member of L.
- Conditions:
- If string w ∈ L where | w | > p, it follows that w = uvxyz, where | vxy | ≤ p
- Now, choose a value for j that is greater than p.
- Then, wherever vxy occurs in the string ajbjcj, vxy cannot contain more than two distinct letters, since the rightmost a is j+1 positions away from the leftmost c.
- Ex: It can be all a’s, all b’s, or all c’s, or it can be a’s and b’s, or it can be b’s and c’s.
- Thus, the string vxy cannot contain more than two distinct letters, but by the pumping lemma it cannot be empty either, so it must contain at least one letter.
- Ex: It can be all a’s, all b’s, or all c’s, or it can be a’s and b’s, or it can be b’s and c’s.
- Now we can start "pumping"
- Since uvxyz is in L, uv2xy2z must also be in L. Since v and y can’t both be empty, | uv2xy2z | >
| uvxyz |, so we have added letters. - But since vy does not contain all three distinct letters, we cannot have added the same number of each letter. We have pumped more letters and contradicted the original definition of L. Thus,
uv2xy2z cannot be in L.
- Since uvxyz is in L, uv2xy2z must also be in L. Since v and y can’t both be empty, | uv2xy2z | >
- We have arrived at a contradiction. Therefore, our original assumption, that L is context free, is false. KJ
[edit] See also
[edit] References
- Michael Sipser (1997). Introduction to the Theory of Computation. PWS Publishing. ISBN 0-534-94728-X. Section 1.4: Nonregular Languages, pp.77–83. Section 2.3: Non-context-free Languages, pp.115–119.