Newick format
From Wikipedia, the free encyclopedia
Examples of Newick tree format:
(,,(,));                               no nodes are named
(A,B,(C,D));                           leaf nodes are named
(A,B,(C,D)E)F;                         all nodes are named
(:0.1,:0.2,(:0.3,:0.4):0.5);           all but root node have a distance to parent
(A:0.1,B:0.2,(C:0.3,D:0.4):0.5);       distances and leaf names (popular)
(A:0.1,B:0.2,(C:0.3,D:0.4)E:0.5)F;     distances and all names
A;                                     a (degenerate) tree with one named node
((B:0.2,(C:0.3,D:0.4)E:0.5)F:0.1)A;    a tree rooted on a leaf node (rare)
The Newick format is typically used by tools such as PHYLIP and is a minimal representation of a phylogenetic tree.
Rooted and Unrooted Binary Trees
Trees are generally rooted on an internal node; rooting a tree on a leaf node is rare but legal. When an unrooted tree is written in Newick format, an arbitrary internal node is chosen as its root.
In a rooted binary tree that is rooted on an internal node, the root has exactly two immediate descendants, and every other internal node has exactly two immediate descendants. In an unrooted binary tree that is rooted on an arbitrary internal node, the root has exactly three immediate descendants, and every other internal node has exactly two immediate descendants. In a binary tree rooted on a leaf, the root has at most one immediate descendant, and every other internal node has exactly two immediate descendants.
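As a rough illustration of these counts, the following Python sketch counts the immediate descendants of the root directly from a Newick string. The function name root_child_count is hypothetical, and the sketch assumes unquoted names that contain no parentheses or commas.

```python
def root_child_count(newick: str) -> int:
    """Count the immediate descendants of the root in a Newick string.

    Assumption: names are unquoted and contain no parentheses or commas.
    """
    depth = 0
    children = 0
    for ch in newick:
        if ch == "(":
            depth += 1
            if depth == 1:
                # The outermost "(" implies at least one branch.
                children = 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 1:
            # Each comma at depth 1 separates two branches of the root.
            children += 1
    return children


if __name__ == "__main__":
    for text in ("((A,B),(C,D));", "(A,B,(C,D));",
                 "((B:0.2,(C:0.3,D:0.4)E:0.5)F:0.1)A;", "A;"):
        print(text, root_child_count(text))
```

Under these assumptions, a count of two suggests a tree rooted on an internal node, three suggests an unrooted tree written with an arbitrary internal root, and one or zero suggests a tree rooted on a leaf. The count alone does not show whether the rest of the tree is binary.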
Grammar
The following grammar can be used to parse the Newick format.
The Grammar Nodes
Tree: the full input Newick format for a single tree
Subtree: an internal node and its descendant (sub)subtree, or a leaf node
Leaf: a leaf node
Internal: an internal node and its descendants
BranchList: a set of one or more Branches
Branch: a tree edge and its descendant subtree
Name: the name of a node
Length: the length of a tree edge
The Grammar Rules
Note, "|" separates alternatives.
Tree       --> Subtree ";"
Subtree    --> Leaf | Internal
Leaf       --> Name
Internal   --> "(" BranchList ")" Name
BranchList --> Branch | Branch "," BranchList
Branch     --> Subtree Length
Name       --> empty | string
Length     --> empty | ":" number
Whitespace (spaces, tabs, carriage returns, and linefeeds) within number is prohibited. Whitespace within string is often prohibited. Whitespace elsewhere is ignored.
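The grammar rules above map naturally onto a recursive-descent parser. The Python sketch below is one possible reading of them, not a reference implementation. The names Node and parse_newick are hypothetical. As a simplification, it removes all whitespace before parsing, so it assumes names contain no whitespace and it does not reject whitespace inside a number. It also assumes unquoted names and accepts any number that Python's float() understands.

```python
from dataclasses import dataclass, field
from typing import List, Optional

_DELIMS = set("(),:;")


@dataclass
class Node:
    name: str = ""
    length: Optional[float] = None  # length of the edge to the parent, if given
    children: List["Node"] = field(default_factory=list)


def parse_newick(text: str) -> Node:
    # Simplification: drop all whitespace up front (see assumptions above).
    s = "".join(text.split())
    pos = 0

    def peek() -> str:
        return s[pos] if pos < len(s) else ""

    def expect(ch: str) -> None:
        nonlocal pos
        if peek() != ch:
            raise ValueError(f"expected {ch!r} at position {pos}")
        pos += 1

    def token() -> str:
        # Consume a (possibly empty) run of non-delimiter characters.
        nonlocal pos
        start = pos
        while pos < len(s) and s[pos] not in _DELIMS:
            pos += 1
        return s[start:pos]

    def subtree() -> Node:
        # Subtree --> Leaf | Internal
        node = Node()
        if peek() == "(":
            # Internal --> "(" BranchList ")" Name
            expect("(")
            node.children.append(branch())
            while peek() == ",":
                expect(",")
                node.children.append(branch())
            expect(")")
        node.name = token()  # Name --> empty | string
        return node

    def branch() -> Node:
        # Branch --> Subtree Length
        node = subtree()
        if peek() == ":":
            expect(":")
            node.length = float(token())  # raises ValueError if not a number
        return node

    # Tree --> Subtree ";"
    root = subtree()
    expect(";")
    if pos != len(s):
        raise ValueError("unexpected text after ';'")
    return root


if __name__ == "__main__":
    tree = parse_newick("(A:0.1,B:0.2,(C:0.3,D:0.4)E:0.5)F;")
    print(tree.name, [child.name for child in tree.children])
```

Each nested function corresponds to one or two grammar rules, which keeps the code easy to compare against the grammar. Because Tree is defined as Subtree followed by ";", this sketch rejects a length on the root node.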