Generalized tree alignment

From Wikipedia, the free encyclopedia

In computational phylogenetics, generalized tree alignment is the problem of simultaneously producing a multiple sequence alignment and a phylogenetic tree for a set of sequences, rather than inferring the two separately.

Formally, generalized tree alignment is the following optimization problem.

Input: A set S of sequences and an edit distance function d between sequences.

Output: A tree T whose leaves are labeled by the sequences in S and whose internal nodes are labeled with sequences, such that \sum_{e \in T} d(e) is minimized, where d(e) is the edit distance between the sequences at the two endpoints of edge e.

Note that this is in contrast to tree alignment, where the tree is provided as input.
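
To make the objective concrete, the following minimal Python sketch evaluates the cost \sum_{e \in T} d(e) of one candidate solution. It does not solve the optimization problem; it only scores a tree that is already fully labeled. The unit-cost Levenshtein distance used for d, the helper names edit_distance and tree_cost, and the example sequences are all assumptions made for illustration; any edit distance function could be substituted.

```python
from typing import Callable, Dict, List, Tuple


def edit_distance(a: str, b: str) -> int:
    """Unit-cost Levenshtein distance (an assumed choice for d)."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # match or substitute
            ))
        prev = curr
    return prev[-1]


def tree_cost(
    edges: List[Tuple[str, str]],
    labels: Dict[str, str],
    d: Callable[[str, str], int] = edit_distance,
) -> int:
    """Sum d over all edges of a tree whose nodes all carry sequence labels."""
    return sum(d(labels[u], labels[v]) for u, v in edges)


# Hypothetical example: three leaves joined to a single internal node "x".
labels = {"a": "ACGT", "b": "ACGA", "c": "TCGT", "x": "ACGT"}
edges = [("x", "a"), ("x", "b"), ("x", "c")]
print(tree_cost(edges, labels))  # prints 2
```

In this example, relabeling the internal node x with "ACGA" would raise the cost to 3. A solver for generalized tree alignment has to search over both the tree topology and the internal labels, whereas in tree alignment the edges are fixed and only the internal labels are chosen.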