Clustal
Clustal | |
---|---|
Developed by | Gibson T. (EMBL), Thompson J. (CNRS), Higgins D. (UCD) |
Latest release | 2.0 |
OS | UNIX, Linux, Mac, MS-Windows |
Genre | Bioinformatics tool |
Licence | Free for academic users |
Website | Clustal |
Clustal is a widely used computer program for multiple sequence alignment. The latest version is 2.0. There are two main variants:
- ClustalW: has a command-line interface.
- ClustalX: has a graphical user interface and is available for Windows, Mac OS and Unix/Linux.
The program can be downloaded from the Clustal Homepage or from the European Bioinformatics Institute FTP server. Choose unix for Unix/Linux, mac for Mac OS, or dos for Windows.
Input/Output
The program accepts a wide range of input formats, including NBRF/PIR, FASTA, EMBL/Swissprot, Clustal, GCG/MSF, GCG9 RSF and GDE.
The output can be written in one or more of the following formats: Clustal, NBRF/PIR, GCG/MSF, PHYLIP, GDE, NEXUS.
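As an illustration of one of the accepted input formats, the following is a minimal sketch of a FASTA reader in Python. It is not part of Clustal; the function name `parse_fasta` and the example sequences are hypothetical, and the reader assumes that the sequence name is the first word after `>`.

```python
def parse_fasta(text):
    """Return a dict mapping each sequence name to its sequence string."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Header line: take the first word after '>' as the name.
            words = line[1:].split()
            name = words[0] if words else ""
            records[name] = []
        elif name is not None:
            # Sequence data may be wrapped over several lines.
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


# Made-up example input with two short protein sequences.
example = """>seq1 first example
MKTAYIAK
QRQISFVK
>seq2 second example
MKTAHIAKQR
"""
sequences = parse_fasta(example)
```

Each record starts with a `>` header line, and the lines that follow are joined into a single sequence string.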
Multiple sequence alignment
A multiple alignment is built in three main steps:
- Align every pair of sequences
- Create a phylogenetic tree (or use a user-defined tree)
- Use the phylogenetic tree to carry out a multiple alignment
All three steps run automatically when you select "Do Complete Alignment". The other options are "Do Alignment from guide tree" and "Produce guide tree only".
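With the command-line version, a complete alignment might be launched roughly as sketched below. This is an assumption-laden example: the executable name `clustalw2`, the file names, and the helper `build_clustalw_command` are hypothetical, and the flags (`-INFILE`, `-ALIGN`, `-OUTPUT`, `-OUTFILE`) are given as recalled from the ClustalW command-line help, so they should be checked against the installed version.

```python
def build_clustalw_command(infile, outfile, output_format="CLUSTAL",
                           program="clustalw2"):
    """Build an argument list for a full multiple alignment run.

    The program name and flags are assumptions; verify them against
    the help output of the installed ClustalW.
    """
    return [
        program,
        f"-INFILE={infile}",        # input sequences, e.g. in FASTA format
        "-ALIGN",                   # request a complete multiple alignment
        f"-OUTPUT={output_format}", # e.g. CLUSTAL, PHYLIP, NEXUS
        f"-OUTFILE={outfile}",
    ]


cmd = build_clustalw_command("seqs.fasta", "seqs.phy", output_format="PHYLIP")

# To actually run it (requires ClustalW to be installed):
# import subprocess
# subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell quoting problems when file names contain spaces.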
Profile alignments
Pairwise alignments are computed for every pair of sequences (all against all), and the resulting similarity scores are stored in a matrix. This matrix is then converted into a distance matrix, in which each entry reflects the evolutionary distance between a pair of sequences.
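The all-against-all step can be sketched as follows. This is a simplified illustration, not Clustal's actual scoring: it assumes a basic Needleman-Wunsch global alignment with a linear gap penalty, made-up match/mismatch scores, and a distance defined as one minus the fraction of identical residues among the ungapped columns. All function names are hypothetical.

```python
from itertools import combinations


def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment with a linear gap penalty."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover the alignment.
    i, j = len(a), len(b)
    out_a, out_b = [], []
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def identity(a, b):
    """Fraction of identical residues among the ungapped aligned columns."""
    x, y = global_align(a, b)
    compared = [(p, q) for p, q in zip(x, y) if p != "-" and q != "-"]
    if not compared:
        return 0.0
    return sum(p == q for p, q in compared) / len(compared)


def distance_matrix(seqs):
    """Symmetric matrix of (1 - identity) for every pair of sequences."""
    n = len(seqs)
    dist = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        dist[i][j] = dist[j][i] = 1.0 - identity(seqs[i], seqs[j])
    return dist


dist = distance_matrix(["ACGT", "ACGT", "ACGA"])
```

Identical sequences end up at distance 0, and the distance grows as fewer aligned residues match.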
From this distance matrix, a guide tree (a phylogenetic tree) is constructed using a neighbour-joining clustering algorithm. The guide tree determines the order in which pairs of sequences are aligned and combined with previous alignments. Sequences are progressively aligned at each branch point, starting from the least distant pair of sequences.
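A minimal sketch of neighbour joining on a small distance matrix is given below. It only recovers the tree topology and the order of the joins; branch lengths, tree rooting, and the alignment of profiles at each branch point are left out, so it should be read as an illustration of the clustering idea rather than as Clustal's implementation. The function name and the example distances are hypothetical.

```python
def neighbour_joining(names, dist):
    """Return (tree, joins): a nested-tuple topology and the join order."""
    # Working distances keyed by node; internal nodes are tuples of children.
    d = {a: {b: dist[i][j] for j, b in enumerate(names) if b != a}
         for i, a in enumerate(names)}
    joins = []
    while len(d) > 2:
        n = len(d)
        totals = {a: sum(d[a].values()) for a in d}
        nodes = list(d)
        # Choose the pair that minimises the neighbour-joining Q criterion:
        # Q(a, b) = (n - 2) * d(a, b) - total(a) - total(b)
        a, b = min(
            ((x, y) for k, x in enumerate(nodes) for y in nodes[k + 1:]),
            key=lambda p: (n - 2) * d[p[0]][p[1]] - totals[p[0]] - totals[p[1]],
        )
        new = (a, b)
        # Distance from the new internal node to every remaining node.
        new_row = {k: (d[a][k] + d[b][k] - d[a][b]) / 2
                   for k in d if k not in (a, b)}
        del d[a], d[b]
        for k in d:
            del d[k][a], d[k][b]
            d[k][new] = new_row[k]
        d[new] = new_row
        joins.append((a, b))
    a, b = list(d)
    joins.append((a, b))  # connect the last two nodes
    return (a, b), joins


# Made-up distances consistent with the tree ((A,B),(C,D)).
names = ["A", "B", "C", "D"]
dist = [
    [0, 3, 9, 10],
    [3, 0, 10, 11],
    [9, 10, 0, 7],
    [10, 11, 7, 0],
]
tree, joins = neighbour_joining(names, dist)
```

In a progressive aligner, a join list like `joins` could serve as the order in which sequences and intermediate alignments are merged.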
Settings
Users can align sequences with the default settings, but it is occasionally useful to customize the parameters.
The main parameters are the gap opening penalty and the gap extension penalty.
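The effect of these two parameters can be illustrated with a small sketch of an affine gap cost. The formula used here (opening penalty plus extension penalty times gap length) is one common convention and is an assumption, as are the numeric values, which are chosen for easy arithmetic and are not Clustal's defaults.

```python
def gap_cost(length, gap_open=10.0, gap_extend=0.5):
    """Penalty for a single gap spanning `length` columns (affine scheme).

    Assumed convention: gap_open + gap_extend * length.
    The default values are illustrative only.
    """
    if length <= 0:
        return 0.0
    return gap_open + gap_extend * length


one_long_gap = gap_cost(6)        # a single gap of six columns
two_short_gaps = 2 * gap_cost(3)  # the same six columns split into two gaps
```

Under this scheme one long gap costs less than two short gaps covering the same number of columns, because the opening penalty is paid only once. Raising the gap opening penalty therefore discourages many separate gaps, while raising the gap extension penalty discourages long ones.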
External links
- Clustal Homepage (free Unix/Linux, Mac, and Windows download)
- ClustalW and ClustalW mirror at the EBI (free Unix/Linux, Mac, and Windows download)
- "White Paper - Accelerating Intensive Applications at 10x-50x Speedup to Remove Bottlenecks in Computational Workflows". Progeniq Pte Ltd.