Talk:Parsing expression grammar
From Wikipedia, the free encyclopedia
Contents |
[edit] Incorrect example
I'm not convinced that the grammar listed for {anbncn:n > 0} is correct.
- A ← (a A b)?
- B ← (b B c)?
A matches any number of a's followed by the same number of b's. B matches any number of b's followed by the same number of c's.
- S ← &(A !b) a* B !c
&(A !b) ensures that we have any number of a's followed by the exactly the same number of b's. a* consumes all the a's, and then B !c matches any number of b's followed by the exactly the same number of c's.
In other words, A and B overlap: they match the same string of b's. They are more symmetrical that the rule suggests.
The problem is that the A and B rules also match the empty string, so this grammar accepts for n=0, which per the prose, it shouldn't. (Also, since we know there should be at least one a, we can use a+ instead of a*).
I'm going to change the A and B rules not to match the null string, the S rule to use a+, and use math markup for the equation (and n >= 1 rather than n > 0, both for consistency with Context-sensitive grammar).
Malcolm rowe 13:06, 26 September 2005 (UTC)
(Removing my own earlier comment.) QTJ 05:23, 11 October 2006 (UTC)
[edit] See also REBOL?
What does REBOL have to do with parsing expression grammars? Using them or providing a mechanism to use them doesn't count.
- I agree, finding a given language in the See also section is surprising. If PEG is central/essential or at least present as standard feature in this language, explain it after this entry. Otherwise, move it to the External links (or remove it!).
- I suggest also that links in the PEG parser generators section are removed from the External links one (duplicates)
- PhiLho 08:53, 3 January 2007 (UTC)
[edit] Please clarify N and Σ in the examples
The definition of a parsing expression grammar says it consists of
a finite set N of nonterminal symbols a finite set Σ of terminal symbols that is disjoint from N a finite set P of parsing rules
Please provide clearly and explicitly in all of the examples, what all three of these are.
S <- if C then S else S / if C then S
"if" "then" and "else" are in the finite set of terminal symbols? If so, then say so. And so forth for the rest of the examples.
thank you. --LAM 66.28.244.66 17:04, 15 May 2006 (UTC)
[edit] First example is incorrect.
The first example (PEG for mathematical formulas that use the basic four operations) is (was, now) incorrect. It allows expressions such as 3.2 + ..2..3..000
Only a single period should be allowed. —Preceding unsigned comment added by HappyDog (talk • contribs)
- Thanks, that's a good point. I think the best way is just to take out the decimal point, and only allow the example to recognize non-negative integers. If it's supposed to cover real numbers, it needs a negative sign too, and some needlessly complex parsing rules. I've changed the example to just use non-negative integers. rspeer / ɹəədsɹ 23:37, 25 May 2006 (UTC)
-
- A good solution. Thanks for the prompt reply. --HappyDog 17:07, 26 May 2006 (UTC)
[edit] Suggest Merge of Packrat Parser Into This
I suggest that the packrat parser article stub be merged into this article. Since I am a parsing technology author myself, I recuse myself from the discussion beyond this suggestion, to avoid conflicts-of-interest. My rational is that packrat parsing is not widely enough adopted to get a full stub. Packrat parser could then be a redirect to this page, instead.
-- QTJ 00:40, 19 October 2006 (UTC)
I second QTJ's suggestion.
--Junkblocker 00:39, 20 January 2007 (UTC)
[edit] Mistake?
For (* comments *) the article claims:
- C ← "(*" N* "*)"
- N ← C / (!"(*" Z)
- Z ← any single character
but wouldn't N* eat everything including the closing comment bracket? Shouldn't this be:
- C ← "(*" N* "*)"
- N ← C / (!"*)" Z)
- Z ← any single character
to avoid eating the closing comment backet? The only reason to not allow N ← (* is to generate an error if unbalanced. 59.112.63.188 18:09, 15 April 2007 (UTC)
- You're right that the closing bracket needs to be avoided within N. So does the opening bracket, because you need to be able to balance comment brackets to nest them. I'll change the example. rspeer / ɹəədsɹ 19:59, 15 April 2007 (UTC)
[edit] Different interpretation
We say this:
Parsing expression grammars look similar to regular expressions or context-free grammars (CFG) in Backus-Naur form (BNF) notation, but have a different interpretation.
It's not clear to me how they have a different interpretation from regular expressions. Seems to me that's what they are, with the possible exception of the & and ! lookahead functionality. Can someone explain how they are different? --P3d0 03:33, 13 August 2007 (UTC)
- The most important difference is that they are recursive. Regular expressions can't refer to themselves or other regular expressions. rspeer / ɹəədsɹ 08:07, 25 October 2007 (UTC)
[edit] Problems with the current definition
Under disadvantages it is mentioned that PEG's cannot handle left recursive rules. I believe this statement is correct, but it is not implied by the current definition. ie. The current definition of a peg is deficient.
—Preceding unsigned comment added by 202.37.96.11 (talk) 22:54, 24 October 2007 (UTC)
The definition of a peg doesn't mention grouping of expressions (e), but the example uses it. Which is right?
I suspect the answer is the definition needs to be extended to match the example.
—Preceding unsigned comment added by 202.37.96.11 (talk) 22:47, 24 October 2007 (UTC)
[edit] Hard to see PEGs as new
Note that ordered choice interpretation of arbitrary context-free grammars is not new. One example of this is the TXL programming language (http://www.txl.ca/), which has existed since the mid 80s. Add that regular expressions can be de-sugared to context free grammars and it is hard to see PEGs as a new kind of formalism. —Preceding unsigned comment added by Thurston51 (talk • contribs) 23:40, 27 November 2007 (UTC)
[edit] PEGS vs CFGs
How do PEGs compare to CFGs in expressive power? Rp (talk) 10:42, 30 November 2007 (UTC)