Clustal
Developer(s) | |
---|---|
Stable release | 1.2.0 / 12 June 2013 |
Written in | C++ |
Operating system | UNIX, Linux, Mac, MS-Windows |
Type | Bioinformatics tool |
Licence | GNU General Public License, version 2[1] |
Website | www |
Developer(s) | Gibson T. (EMBL), Thompson J. (CNRS), Higgins D. (UCD) |
---|---|
Stable release | 2.1 / 17 November 2010 |
Written in | C++ |
Operating system | UNIX, Linux, Mac, MS-Windows |
Type | Bioinformatics tool |
Licence | GNU Lesser General Public License [2] |
Website | www |
In bioinformatics, Clustal is a series of widely used computer programs for multiple sequence alignment.[3] There have been several incarnations of Clustal, listed below:
- Clustal: The original software for progressive alignment based on a phylogenetic tree.[4]
- ClustalV: A rewrite of the original Clustal package that included phylogenetic tree reconstruction on the final alignment for the first time.[5]
- ClustalW: This version has a command-line interface.[6]
- ClustalX: This version has a graphical user interface.[7]
- Clustal Omega: A command-line-only program.[8][9]
The papers describing the Clustal software have been very highly cited, with two appearing in a list of the most cited papers of all time.[10]
The more recent versions of the software are available for Windows, Mac OS, and Unix/Linux. They can be obtained from the Clustal Homepage or the European Bioinformatics Institute FTP server.
Input/Output
This program accepts a wide range of input formats, including NBRF/PIR, FASTA, EMBL/Swiss-Prot, Clustal, GCG/MSF, GCG9 RSF, and GDE.
The output format can be one or many of the following: Clustal, NBRF/PIR, GCG/MSF, PHYLIP, GDE, or NEXUS.
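As an illustration of one of the input formats listed above, the following is a minimal sketch of a FASTA reader in Python. It is not Clustal's own parser; the function name `parse_fasta` is hypothetical, and the sketch assumes the common convention that each record starts with a `>` header line whose first word is the sequence name.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a dict mapping name -> sequence.

    Assumption: the sequence name is the first whitespace-delimited
    word after '>' on the header line; the rest is a description.
    """
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            fields = line[1:].split()
            name = fields[0] if fields else ""
            records[name] = []
        elif name is not None:
            # Sequence data may be wrapped over several lines.
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


example = ">seq1 first example\nGATT\nACA\n>seq2\nGATTTACA\n"
sequences = parse_fasta(example)
print(sequences)
```

Wrapped sequence lines are joined, so a record split over several lines is returned as a single string.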
Multiple sequence alignment
There are three main steps:
- Perform pairwise alignments of the sequences
- Create a guide tree (or use a user-defined tree)
- Use the guide tree to carry out a multiple alignment
These are done automatically when you select "Do Complete Alignment". Other options are "Do Alignment from guide tree" and "Produce guide tree only".
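The first two steps above can be sketched in Python as follows. This is a toy illustration under stated assumptions, not Clustal's implementation: the scoring values, the identity-based distance, the function names `nw_identity` and `upgma`, and the example sequences are all chosen for illustration. UPGMA is used here as a simple clustering method for the guide tree, which is returned as nested tuples; the third step would align sequences and groups in the order given by that tree.

```python
from itertools import combinations


def nw_identity(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align a and b (Needleman-Wunsch, linear gap cost) and
    return the fraction of alignment columns that are identical."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + s,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back through the matrix, counting identical and total columns.
    i, j, same, cols = n, m, 0, 0
    while i > 0 or j > 0:
        s = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + s:
            same += a[i - 1] == b[j - 1]
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
        cols += 1
    return same / cols if cols else 1.0


def upgma(dist, names):
    """Build a guide tree (nested tuples) by repeatedly joining the two
    closest clusters, averaging distances weighted by cluster size."""
    size = {name: 1 for name in names}  # cluster label -> number of leaves
    d = {frozenset(pair): value for pair, value in dist.items()}
    while len(size) > 1:
        x, y = min(combinations(size, 2), key=lambda p: d[frozenset(p)])
        merged = (x, y)
        for z in size:
            if z not in (x, y):
                d[frozenset((merged, z))] = (
                    size[x] * d[frozenset((x, z))]
                    + size[y] * d[frozenset((y, z))]
                ) / (size[x] + size[y])
        size[merged] = size.pop(x) + size.pop(y)
    return next(iter(size))


# Step 1: pairwise alignments, converted to distances (1 - identity).
seqs = {"s1": "GATTACA", "s2": "GATTTACA", "s3": "CCGTAAG", "s4": "CCGTTAAG"}
dist = {(p, q): 1 - nw_identity(seqs[p], seqs[q])
        for p, q in combinations(seqs, 2)}

# Step 2: guide tree built from the distances.
tree = upgma(dist, list(seqs))
print(tree)
```

In this example the two similar pairs are joined first, so a progressive aligner following the tree would align `s1` with `s2` and `s3` with `s4` before aligning the two resulting groups to each other.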
Settings
Users can align the sequences using the default settings, but occasionally it may be useful to customize the parameters.
The main parameters are the gap opening penalty and the gap extension penalty.
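To show how these two parameters interact, here is a minimal sketch of an affine gap cost in Python. The function name `gap_cost` and the numeric values are illustrative assumptions, not Clustal's defaults, and the sketch assumes the convention that a gap of length n costs the opening penalty plus n times the extension penalty (some programs charge the extension only for residues after the first).

```python
def gap_cost(length, gap_open=10.0, gap_extend=0.5):
    """Penalty for a single gap spanning `length` residues.

    Assumed convention: cost = gap_open + gap_extend * length.
    The default values are illustrative only.
    """
    if length <= 0:
        return 0.0
    return gap_open + gap_extend * length


# One gap of three residues versus three separate single-residue gaps.
one_long_gap = gap_cost(3)
three_short_gaps = 3 * gap_cost(1)
print(one_long_gap, three_short_gaps)
```

With a large opening penalty and a small extension penalty, one long gap is cheaper than several short ones, so the aligner tends to group gaps together. Raising the extension penalty makes long gaps more expensive and favours shorter ones.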
Names
The guide tree in the initial programs was constructed via a UPGMA cluster analysis of the pairwise alignments, hence the name CLUSTAL.[11]cf.[12] The first four versions in 1988 had Arabic numerals (1 to 4), whereas with the fifth version Des Higgins switched to the Roman numeral V in 1992.[11]cf.[13][14] For the next two versions, in 1994 and 1997, the letters following V were used and made to stand for something: W for Weighted and X for X Window.[11]cf.[15][16] The name Omega was chosen to mark a change from the previous versions.[11]
References
- ↑ See file COPYING, in the source archive. Accessed 2014-01-15.
- ↑ "ClustalW / ClustalX: Multiple Sequence Alignment". Retrieved 1 October 2013.
- ↑ Chenna R, Sugawara H, Koike T, Lopez R, Gibson TJ, Higgins DG, Thompson JD (2003). "Multiple sequence alignment with the Clustal series of programs". Nucleic Acids Res 31 (13): 3497–3500. doi:10.1093/nar/gkg500. PMC 168907. PMID 12824352.
- ↑ Higgins DG, Sharp PM (December 1988). "CLUSTAL: a package for performing multiple sequence alignment on a microcomputer". Gene 73 (1): 237–44. doi:10.1016/0378-1119(88)90330-7. PMID 3243435.
- ↑ Higgins DG, Bleasby AJ, Fuchs R (April 1992). "CLUSTAL V: improved software for multiple sequence alignment". Comput. Appl. Biosci. 8 (2): 189–91. doi:10.1093/bioinformatics/8.2.189. PMID 1591615.
- ↑ Larkin MA, Blackshields G, Brown NP, Chenna R, McGettigan PA, McWilliam H, Valentin F, Wallace IM, Wilm A, Lopez R, Thompson JD, Gibson TJ, Higgins DG (2007). "ClustalW and ClustalX version 2". Bioinformatics 23 (21): 2947–2948. doi:10.1093/bioinformatics/btm404. PMID 17846036.
- ↑ Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins DG (1997). "The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools". Nucleic Acids Research 25 (24): 4876–4882. doi:10.1093/nar/25.24.4876. PMC 147148. PMID 9396791.
- ↑ Sievers F, Wilm A, Dineen DG, Gibson TJ, Karplus K, Li W, Lopez R, McWilliam H, Remmert M, Söding J, Thompson JD, Higgins DG (2011). "Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega". Mol Syst Biol 7 (539). doi:10.1038/msb.2011.75.
- ↑ "Clustal Omega source code". Retrieved 2014-01-15.
- ↑ Van Noorden, R.; Maher, B.; Nuzzo, R. (2014). "The top 100 papers: Nature explores the most-cited research of all time". Nature (London) 514 (7524): 550. doi:10.1038/514550a. PMID 25355343.
- ↑ Des Higgins, presentation at the SMBE 2012 conference in Dublin.
- ↑ Higgins, D. G.; Sharp, P. M. (1988). "CLUSTAL: A package for performing multiple sequence alignment on a microcomputer". Gene 73 (1): 237–244. doi:10.1016/0378-1119(88)90330-7. PMID 3243435.
- ↑ Higgins, D. G.; Sharp, P. M. (1989). "Fast and sensitive multiple sequence alignments on a microcomputer". Computer applications in the biosciences : CABIOS 5 (2): 151–153. doi:10.1093/bioinformatics/5.2.151. PMID 2720464.
- ↑ Higgins, D. G.; Bleasby, A. J.; Fuchs, R. (1992). "CLUSTAL V: Improved software for multiple sequence alignment". Computer applications in the biosciences : CABIOS 8 (2): 189–191. doi:10.1093/bioinformatics/8.2.189. PMID 1591615.
- ↑ Thompson, J. D.; Higgins, D. G.; Gibson, T. J. (1994). "CLUSTAL W: Improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice". Nucleic Acids Research 22 (22): 4673–4680. doi:10.1093/nar/22.22.4673. PMC 308517. PMID 7984417.
- ↑ Thompson, J. D.; Gibson, T. J.; Plewniak, F.; Jeanmougin, F.; Higgins, D. G. (1997). "The CLUSTAL_X windows interface: Flexible strategies for multiple sequence alignment aided by quality analysis tools". Nucleic Acids Research 25 (24): 4876–4882. doi:10.1093/nar/25.24.4876. PMC 147148. PMID 9396791.
External links
- Clustal Homepage (free Unix/Linux, Mac, and Windows download)
- Clustal Omega mirror at the EBI
- "Accelerating Intensive Applications at 10x-50x Speedup to Remove Bottlenecks in Computational Workflows" — White Paper by Progeniq Pte Ltd.