Parse tree

From Wikipedia, the free encyclopedia

A parse tree or concrete syntax tree is an ordered, rooted tree that represents the syntactic structure of a string according to some formal grammar. In a parse tree, the interior nodes are labeled by non-terminals of the grammar, while the leaf nodes are labeled by terminals of the grammar. A program that produces such trees is called a parser. Parse trees may be generated for sentences in natural languages (see natural language processing), as well as during the processing of computer languages, such as programming languages. Parse trees are distinct from abstract syntax trees (also known simply as syntax trees), which are a related concept in compilers.
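As a rough illustration of how a parser produces such a tree, the following Python sketch implements a tiny recursive-descent parser. The grammar (S -> NP VP, NP -> Det N | N, VP -> V NP), the lexicon, and the names `LEXICON` and `parse_sentence` are assumptions made for this example only; real parsers use far larger grammars and different algorithms.

```python
# Toy lexicon (an assumption for this sketch): maps each word to its category.
LEXICON = {"John": "N", "hit": "V", "the": "Det", "ball": "N"}


def parse_sentence(words):
    """Parse a list of words into a nested (label, children) tuple.

    Assumed grammar:  S -> NP VP,  NP -> Det N | N,  VP -> V NP
    """
    pos = 0  # index of the next unread word

    def category_at(index):
        # Category of the word at `index`, or None past the end of the input.
        return LEXICON.get(words[index]) if index < len(words) else None

    def expect(category):
        # Consume one word of the given category and wrap it in a node.
        nonlocal pos
        if category_at(pos) != category:
            raise SyntaxError(f"expected {category} at position {pos}")
        node = (category, [words[pos]])
        pos += 1
        return node

    def parse_np():
        if category_at(pos) == "Det":
            return ("NP", [expect("Det"), expect("N")])
        return ("NP", [expect("N")])

    def parse_vp():
        return ("VP", [expect("V"), parse_np()])

    tree = ("S", [parse_np(), parse_vp()])
    if pos != len(words):
        raise SyntaxError(f"unexpected word at position {pos}")
    return tree


tree = parse_sentence("John hit the ball".split())
print(tree)
```

In the resulting structure, every tuple label (S, NP, VP, N, V, Det) is a non-terminal and sits at an interior node, while the words themselves are the terminals at the leaves.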

Basic description

A parse tree is made up of nodes and branches. Below is a linguistic parse tree, here representing the English sentence "John hit the ball". (Note: this is only one possible parse tree for this sentence; different kinds of linguistic parse trees exist.) The parse tree is the entire structure, starting from S and ending in each of the leaf nodes (John, hit, the, ball).

A simple parse tree

In a parse tree, each node is either a root node, a branch node, or a leaf node. In the above example, S is the root node, NP and VP are branch nodes, and John, hit, the, and ball are leaf nodes. (S stands for sentence, NP for noun phrase, and VP for verb phrase; for more on these labels, see [1].)
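The three kinds of node can be made concrete with a small data structure. The Python sketch below is illustrative only: the `Node` class and the helpers `leaves` and `kind` are hypothetical names, and the exact shape of the tree (including the intermediate labels N, V and Det) is an assumption consistent with the description, since the figure is not reproduced here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)


# Assumed shape of the tree for "John hit the ball".
tree = Node("S", [
    Node("NP", [Node("N", [Node("John")])]),
    Node("VP", [
        Node("V", [Node("hit")]),
        Node("NP", [
            Node("Det", [Node("the")]),
            Node("N", [Node("ball")]),
        ]),
    ]),
])


def leaves(node):
    """Return the labels of the leaf nodes, read from left to right."""
    if not node.children:
        return [node.label]
    result = []
    for child in node.children:
        result.extend(leaves(child))
    return result


def kind(node, root):
    """Classify a node as 'root', 'branch', or 'leaf'."""
    if node is root:
        return "root"
    return "branch" if node.children else "leaf"


print(" ".join(leaves(tree)))
print(kind(tree, tree), kind(tree.children[1], tree))
```

Reading the leaves from left to right recovers the original sentence, which is the sense in which the tree spans from S down to each word.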


A node can also be described as a parent (mother) node or a child (daughter) node. A parent node is one that has at least one other node linked by a branch directly below it. In the example, S is the parent of both NP and VP. A child node is one that has a node directly above it to which it is linked by a branch of the tree; every node except the root has exactly one such parent. Again from the example, hit is a child node of V.
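The parent and child relationships can be listed mechanically by walking the tree. The following Python sketch assumes the same tree shape as the example sentence, encoded as nested `(label, children)` tuples; the names `TREE` and `parent_links` are hypothetical and chosen for this illustration.

```python
# Assumed tree for "John hit the ball", as (label, children) tuples.
TREE = ("S", [
    ("NP", [("N", [("John", [])])]),
    ("VP", [
        ("V", [("hit", [])]),
        ("NP", [
            ("Det", [("the", [])]),
            ("N", [("ball", [])]),
        ]),
    ]),
])


def parent_links(node, parent=None, links=None):
    """Return (child_label, parent_label) pairs in pre-order."""
    if links is None:
        links = []
    label, children = node
    if parent is not None:
        links.append((label, parent))
    for child in children:
        parent_links(child, label, links)
    return links


links = parent_links(TREE)
for child, parent in links:
    print(f"{child} is a child of {parent}")
```

The root S never appears as a child in the output, and every other node appears exactly once, reflecting that each non-root node has a single parent.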
