Pumping lemma for context-free languages
From Wikipedia, the free encyclopedia
The pumping lemma for context-free languages, also known as the Bar-Hillel lemma, is a lemma that gives a property that all context-free languages have. Its primary use is to prove a language is not context-free.
The pumping lemma for context-free languages cannot be used to prove that any arbitrary non-context-free language is not context-free. In some cases the more generalized Ogden's lemma must be used.
Contents |
[edit] Formal statement
If a language L is context-free, then there exists some integer p > 0 such that any string s in L with |s| ≥ p (where p is a pumping length) can be written as
- s = uvxyz
with substrings u, v, x, y and z, such that |vxy| ≤ p, |vy| ≥ 1, and
- uv ixy iz is in L for every integer i ≥ 0.
[edit] Informal statement and explanation
The pumping lemma for context-free languages (called just "the pumping lemma" from now on) describes a property that all context-free languages are guaranteed to have. The property is a property of all strings in the language that are of length at least p, where p is a constant -- called the pumping length -- that varies between context-free languages. Say s is a string of length at least p that is in the language. The pumping lemma states that s can be split into five substrings, s = uvxyz, where vy is non-empty and the length of vxy is at most p, such that repeating v and y any (and the same) number of times in s produces a string that is still in the language (it's possible and often useful to repeat zero times, which removes v and y from the string). This process of "pumping in" additional copies of v and y is what gives the pumping lemma its name.
Note that finite languages (which are regular and hence context-free) obey the pumping lemma trivially by having p equal to the maximum string length in L plus one.
The pumping lemma is often used to prove languages non-context-free by showing that some string s in the language of length at least p does not have the properties outlined above, i.e. that it cannot be "pumped" without producing some strings that are not in the language.
[edit] Use of lemma
This section may require cleanup to meet Wikipedia's quality standards. Please improve this article if you can (April 2008). |
The pumping lemma for context-free languages can be used to show that certain languages are NOT context-free.
We can show that the language L = {ajbjcj, j>0} is NOT context-free.
- Suppose L is context-free
- Conditions:
- | vxy | ≤ p. That is, the length of the middle portion does not exceed the pumping length.
- vy ≠ ε (empty). Since v and y are the pieces to be “pumped”, this condition says that at least one of the strings we pump must not be empty.
- For all i ≥ 0, uvixyiz is in L. That is, the two strings v and y may be “pumped” any number of times, including 0, and the resulting string will still be a member of L.
- Conditions:
- If string w ∈ L where | w | > p, it follows that w = uvxyz, where | vxy | ≤ p
- Now, choose a value for j that is greater than p.
- Then, wherever vxy occurs in the string ajbjcj, vxy cannot contain more than two distinct letters, since the rightmost a is j+1 positions away from the leftmost c.
- Ex: It can be all a’s, all b’s, or all c’s, or it can be a’s and b’s, or it can be b’s and c’s.
- Thus, the string vxy cannot contain more than two distinct letters, but by the pumping lemma it cannot be empty either, so it must contain at least one letter.
- Ex: It can be all a’s, all b’s, or all c’s, or it can be a’s and b’s, or it can be b’s and c’s.
- Now we can start "pumping"
- Since uvxyz is in L, uv2xy2z must also be in L. Since v and y can’t both be empty, | uv2xy2z | >
| uvxyz |, so we have added letters. - But since vy does not contain all three distinct letters, we cannot have added the same number of each letter. We have pumped more letters and contradicted the original definition of L. Thus,
uv2xy2z cannot be in L.
- Since uvxyz is in L, uv2xy2z must also be in L. Since v and y can’t both be empty, | uv2xy2z | >
- We have arrived at a contradiction. Therefore, our original assumption, that L is context free, is false.
[edit] See also
[edit] References
- Michael Sipser (1997). Introduction to the Theory of Computation. PWS Publishing. ISBN 0-534-94728-X. Section 1.4: Nonregular Languages, pp.77–83. Section 2.3: Non-context-free Languages, pp.115–119.