Talk:Earley parser

From Wikipedia, the free encyclopedia

The article says that rules like

   a -> a b

will cause both top-down and bottom-up parsers to go into an infinite loop, but I thought it was only top-down parsers that had a problem with left-recursion. Any comments? — Cadr

Since no-one's commented, I'll go ahead and change the page... — Cadr
It's not exactly true, top down parsers like LL(1) will enter in a infinite loop, but other top-down parsers, indeterministic LL, (and I think some kind of top-down parser that uses breadth-first search also) will eventually finish.

[edit] NP-complete?

Parsing Unrestricted languages is not NP-complete, it is much harder.

It can be easily shown (simulation of Turing machines by grammars gives an immediate reduction of the halting problem to parsing of general languages) that the problem is not computable.

Unless anybody objects, I'll correct it.

--UrsusArctos 22:37, 19 Oct 2004 (UTC)

Be careful with the meaning of the word "parse", though. You could argue that you don't have to be able to decide a language in order to parse it, and consequently even unrestricted languages would be "parseable" in a sense, i.e. the sense that if there was a parse, you would get it eventually. I'm not sure there's any general agreement on whether this would count as "parsing" or not, so I think we should word it carefully. Otherwise yeah, the NP-complete thing is incorrect. Cadr 12:15, 20 Oct 2004 (UTC)


[edit] Turning the recognizer into a parser

Everyone describes Earley's algorithm as a parser, but no-one ever bothers to explain how you turn the recognizer into a parser. My understanding is that you modify the completer to record a backpointer.. but exactly what item gets the backpointer and what it is to point to is still kind of a mystery to me. I believe that the new item you are creating should receive the backpointer and it should point to the item you are copying (to advance the dot in). However this is not sufficient, you also need some mechanism to inherit backpointers, otherwise you end up with a disconnected tree. Can anyone clarify this process?

That would be very useful! Unfortunately, I haven't found any description about that yet... --ThSoft 21:39, 14 September 2006 (UTC)

[edit] Error in the algorithm description?

I believe that:

  • Prediction: For every state in S(k) of the form (X → α • Y β, j) (where j is the origin position as above), add (Y → • γ, j+1) to S(k) for every production with Y on the left-hand side.

Should be changed to:

  • Prediction: For every state in S(k) of the form (X → α • Y β, j) (where j is the origin position as above), add (Y → • γ, k) to S(k) for every production with Y on the left-hand side.

This is a change of "j+1" to "k". See the example. —Preceding unsigned comment added by 87.116.85.20 (talk) 17:14, 4 December 2007 (UTC)