Newick format

From Wikipedia, the free encyclopedia

Examples of Newick tree format:

(,,(,));                               no nodes are named
(A,B,(C,D));                           leaf nodes are named
(A,B,(C,D)E)F;                         all nodes are named
(:0.1,:0.2,(:0.3,:0.4):0.5);           all but root node have a distance to parent
(A:0.1,B:0.2,(C:0.3,D:0.4):0.5);       distances and leaf names (popular)
(A:0.1,B:0.2,(C:0.3,D:0.4)E:0.5)F;     distances and all names
A;                                     a (degenerate) tree with one named node
((B:0.2,(C:0.3,D:0.4)E:0.5)F:0.1)A;    a tree rooted on a leaf node (rare)

The Newick format is typically used by tools such as PHYLIP and provides a minimal representation of a phylogenetic tree.
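The example strings above can be assembled mechanically from node names, branch lengths, and child lists. The following is a minimal sketch, not a reference implementation; the helper name `format_node` and its parameters are hypothetical, and it assumes names contain no parentheses, commas, colons, semicolons, or whitespace.

```python
from typing import Optional, Sequence


def format_node(name: str = "",
                length: Optional[float] = None,
                children: Sequence[str] = ()) -> str:
    """Format one node (hypothetical helper) without the trailing ';'.

    `children` holds child nodes that were already formatted by this function.
    """
    text = name
    if children:
        # An internal node is its parenthesised child list followed by its name.
        text = "(" + ",".join(children) + ")" + name
    if length is not None:
        # The distance to the parent follows a colon.
        text += f":{length}"
    return text


# Rebuild the "distances and all names" example.
e = format_node("E", 0.5, [format_node("C", 0.3), format_node("D", 0.4)])
named = format_node("F", children=[format_node("A", 0.1),
                                   format_node("B", 0.2), e]) + ";"

# Rebuild the "no nodes are named" example.
inner = format_node(children=[format_node(), format_node()])
unnamed = format_node(children=[format_node(), format_node(), inner]) + ";"
```

Here `named` reproduces the "distances and all names" example and `unnamed` reproduces the example in which no nodes are named.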

Rooted and Unrooted Binary Trees

Trees are generally rooted on an internal node; it is rare (but legal) to root a tree on a leaf node. When a tree is unrooted, an arbitrary internal node is chosen as its root.

Rooted binary trees that are rooted on an internal node have exactly two top-level nodes (children of the root), and each internal node has exactly two immediate descendants. Unrooted binary trees that are rooted on an arbitrary internal node have exactly three top-level nodes, and each internal node has exactly two immediate descendants. A binary tree rooted on a leaf has at most one top-level node, and each internal node has exactly two immediate descendants.
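The number of top-level nodes can be read directly from a Newick string by counting the commas at parenthesis depth one. The sketch below illustrates this; the function name `count_top_level_nodes` is hypothetical, and the code assumes that node names contain no parentheses or commas.

```python
def count_top_level_nodes(newick: str) -> int:
    """Count the nodes directly below the root of a Newick string."""
    text = newick.strip()
    if not text.startswith("("):
        return 0  # a single-node tree such as "A;" has no descendants
    depth = 0
    count = 1  # an opened branch list holds at least one branch
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0:
                break  # the root's branch list has closed
        elif ch == "," and depth == 1:
            count += 1  # a comma at depth one separates two top-level nodes
    return count


rooted = count_top_level_nodes("((A,B),(C,D));")
unrooted = count_top_level_nodes("(A,B,(C,D));")
leaf_rooted = count_top_level_nodes("((B:0.2,(C:0.3,D:0.4)E:0.5)F:0.1)A;")
```

Under the conventions described above, a binary tree gives two for a tree rooted on an internal node, three for an unrooted tree, and one for a tree rooted on a leaf.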

Grammar

The following grammar can be used to parse the Newick format.

The Grammar Nodes

   Tree: the full Newick input for a single tree
   Subtree: either a leaf node, or an internal node together with its descendants
   Leaf: a leaf node
   Internal: an internal node and its descendants
   BranchList: a list of one or more Branches
   Branch: a tree edge and its descendant subtree
   Name: the name of a node
   Length: the length of a tree edge

The Grammar Rules

Note, "|" separates alternatives.

   Tree --> Subtree ";"
   Subtree --> Leaf | Internal
   Leaf --> Name
   Internal --> "(" BranchList ")" Name
   BranchList --> Branch | Branch "," BranchList
   Branch --> Subtree Length
   Name --> empty | string
   Length --> empty | ":" number

Whitespace (spaces, tabs, carriage returns, and linefeeds) within a number is prohibited. Whitespace within a string is often prohibited. Whitespace elsewhere is ignored.
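The grammar rules map naturally onto a recursive-descent parser with one method per rule. The following is a minimal sketch under stated assumptions, not a definitive implementation: the names `Node`, `NewickParser`, and `parse_newick` are hypothetical, strings are taken to be unquoted runs of characters other than whitespace and the delimiters `( ) , : ;`, and input after the closing semicolon is not examined.

```python
from dataclasses import dataclass, field
from typing import List, Optional

WHITESPACE = " \t\r\n"
DELIMITERS = "(),:;"


@dataclass
class Node:
    name: str = ""
    length: Optional[float] = None  # distance to the parent, if given
    children: List["Node"] = field(default_factory=list)


class NewickParser:
    def __init__(self, text: str) -> None:
        self.text = text
        self.pos = 0

    def _peek(self) -> str:
        # Skip ignorable whitespace, then return the next character ("" at end).
        while self.pos < len(self.text) and self.text[self.pos] in WHITESPACE:
            self.pos += 1
        return self.text[self.pos] if self.pos < len(self.text) else ""

    def _expect(self, ch: str) -> None:
        if self._peek() != ch:
            raise ValueError(f"expected {ch!r} at position {self.pos}")
        self.pos += 1

    def _token(self) -> str:
        # Read a possibly empty run of non-delimiter, non-whitespace characters.
        self._peek()
        start = self.pos
        while (self.pos < len(self.text)
               and self.text[self.pos] not in DELIMITERS + WHITESPACE):
            self.pos += 1
        return self.text[start:self.pos]

    def parse_tree(self) -> Node:
        # Tree --> Subtree ";"
        root = self.parse_subtree()
        self._expect(";")
        return root

    def parse_subtree(self) -> Node:
        # Subtree --> Leaf | Internal
        node = Node()
        if self._peek() == "(":
            # Internal --> "(" BranchList ")" Name
            self.pos += 1
            node.children.append(self.parse_branch())
            while self._peek() == ",":
                # BranchList --> Branch | Branch "," BranchList
                self.pos += 1
                node.children.append(self.parse_branch())
            self._expect(")")
        node.name = self._token()  # Name --> empty | string
        return node

    def parse_branch(self) -> Node:
        # Branch --> Subtree Length
        node = self.parse_subtree()
        if self._peek() == ":":
            # Length --> empty | ":" number
            self.pos += 1
            node.length = float(self._token())
        return node


def parse_newick(text: str) -> Node:
    return NewickParser(text).parse_tree()


tree = parse_newick("(A:0.1,B:0.2,(C:0.3,D:0.4)E:0.5)F;")
child_names = [child.name for child in tree.children]
```

Because the sketch follows the rules literally, the root carries no length, and whitespace inside a name or a number leads to a `ValueError` rather than being silently joined. Whitespace between tokens is skipped, so `parse_newick(" ( A , B ) ; ")` is accepted.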