Tree rearrangement

From Wikipedia, the free encyclopedia

Tree rearrangements are used in heuristic algorithms devoted to searching for an optimal tree structure. They can be applied to any set of data that are naturally arranged into a tree, but have most applications in computational phylogenetics, especially in maximum parsimony and maximum likelihood searches of phylogenetic trees, which seek to identify one among many possible trees that best explains the evolutionary history of a particular gene or species.

[edit] Basic tree rearrangements

The simplest tree-rearrangement, known as nearest-neighbor interchange, exchanges the connectivity of four subtrees within the main tree. Because there are three possible ways of connecting four subtrees,^[1] and one is the original connectivity, each interchange creates two new trees. Exhaustively searching the possible nearest-neighbors for each possible set of subtrees is the slowest but most optimizing way of performing this search. An alternative, more wide-ranging search, subtree pruning and regrafting (SPR), selects and removes a subtree from the main tree and reinserts it elsewhere on the main tree to create a new node. Finally, tree bisection and reconnection (TBR) detaches a subtree from the main tree at an interior node and then attempts all possible connections between branches of the two trees thus created. The increasing complexity of the tree rearrangement technique correlates with increasing computational time required for the search, although not necessarily with their performance.^[2]

[edit] Tree fusion

The simplest type of tree fusion begins with two trees already identified as near-optimal; thus, they most likely have the majority of their nodes correct but may fail to resolve individual tree "leaves" properly; for example, the separation ((A,B),(C,D)) at a branch tip versus ((A,C),(B,D)) may be unresolved.^[1] Tree fusion swaps these two solutions between two otherwise near-optimal trees. Variants of the method use standard genetic algorithms with a defined objective function to swap high-scoring subtrees into main trees that are high-scoring overall.^[3]

[edit] References

^ ^a ^b Felsenstein J. (2004). Inferring Phylogenies Sinauer Associates: Sunderland, MA.
^ Takahashi K, Nei M. (2000). Efficiencies of fast algorithms of phylogenetic inference under the criteria of maximum parsimony, minimum evolution, and maximum likelihood when a large number of sequences are used. Mol Biol Evol 17(8):1251-8.
^ Matsuda H. (1996). Protein phylogenetic inference using maximum likelihood with a genetic algorithm. Pacific Symposium on Biocomputing 1996, pp512-23.