UGENE
Developer(s) | Unipro |
---|---|
Stable release | 1.13.0 / 11 December 2013 |
Written in | C++, QtScript |
Operating system | Cross-platform |
Available in | English, Russian, Czech, Chinese |
Type | Bioinformatics toolkit |
License | GPL |
Website | ugene.unipro.ru |
UGENE is free open-source cross-platform bioinformatics software.[1][2]
It integrates dozens of well-known biological tools and algorithms and provides both a graphical user interface and a command-line interface. With the UGENE Workflow Designer, the required tools and algorithms can be arranged into a workflow schema.[3]
To improve performance, UGENE uses multicore CPUs and GPUs to accelerate some of its computational routines. Computations can also be sped up by using Amazon EC2 cloud resources.[4]
Key features
The software supports the following features:
- Creating, editing and annotating nucleic acid and protein sequences
- Search through online databases: NCBI, PDB, UniProtKB/Swiss-Prot, UniProtKB/TrEMBL
- Multiple sequence alignment: Clustal, MUSCLE, Kalign, MAFFT, T-Coffee
- Online and local BLAST search
- Restriction analysis with integrated REBASE restriction enzyme database
- Integrated Primer3 package for PCR primer design
- Search for direct, inverted and tandem repeats in DNA sequences
- Constructing dotplots for nucleic acid sequences
- Search for transcription factor binding sites (TFBS) with weight matrix and SITECON algorithms
- Aligning short reads with Bowtie, BWA and UGENE Genome Aligner
- Search for ORFs
- Cloning in silico
- 3D structure viewer for files in PDB and MMDB formats, anaglyph view support
- Protein secondary structure prediction with GOR IV and PSIPRED algorithms
- HMMER2 and HMMER3 packages integration
- Building (using integrated PHYLIP package) and viewing phylogenetic trees
- Local sequence alignment with optimized Smith-Waterman algorithm
- Combining various algorithms into custom workflows with UGENE Workflow Designer
- Searching a nucleic acid sequence for a pattern that combines the results of several algorithms, with UGENE Query Designer
- Visualization of next generation sequencing data (BAM files) using UGENE Assembly Browser
User interface
The software has four main views to display biological data on the user's screen.
- The Sequence view is used to visualize, analyze and modify nucleic acid or protein sequences. Depending on the sequence type and the options selected, the following views can be presented inside the Sequence view window:
  - 3D structure view
  - Circular view
  - Chromatogram view
  - Dotplot view
- The Alignment editor is used to visualize, analyze and modify a nucleic acid or protein multiple sequence alignment.
- The Assembly Browser is used to visualize and browse next-generation sequencing data.
- The Phylogenetic tree viewer is used to display phylogenetic trees.
UGENE Workflow Designer
UGENE Workflow Designer allows creating and running complex computational workflow schemas.[5]
The elements that a schema consists of correspond to the bulk of algorithms integrated into UGENE. Using the Workflow Designer one can also create custom workflow elements.
Workflow schemas can be run locally or remotely, either from the graphical interface or from the command line.
UGENE Query Designer
UGENE Query Designer allows a user to analyze a nucleotide sequence with several algorithms at once (Repeats finder, ORF finder, Weight matrix matching, etc.) while imposing constraints on the positional relationship between the results those algorithms produce.
A schema of the algorithms and constraints is either created in the GUI or edited as plain text.
The results are saved as a set of annotations to a specified file in the GenBank format.
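The idea of combining results from several algorithms under a positional constraint can be illustrated with a short Python sketch. This is not UGENE code and does not use the Query Designer schema format; the `Annotation` class, the `find_pairs` function and the sample coordinates are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Annotation:
    """A hit reported by one algorithm, as a half-open interval [start, end)."""
    name: str
    start: int
    end: int


def find_pairs(first_hits, second_hits, min_gap, max_gap):
    """Return (a, b) pairs where b starts min_gap..max_gap bases after a ends."""
    pairs = []
    for a, b in product(first_hits, second_hits):
        gap = b.start - a.end
        if min_gap <= gap <= max_gap:
            pairs.append((a, b))
    return pairs


# Hypothetical results from two independent searches on the same sequence.
orfs = [Annotation("ORF", 100, 400), Annotation("ORF", 900, 1200)]
repeats = [Annotation("repeat", 420, 450), Annotation("repeat", 2000, 2030)]

# Constraint: a repeat must begin 0 to 50 bases downstream of the end of an ORF.
matches = find_pairs(orfs, repeats, min_gap=0, max_gap=50)
for orf, repeat in matches:
    print(orf, repeat)
```

Each surviving pair corresponds to one match of the combined pattern; in this toy data only the first ORF and the first repeat satisfy the constraint.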
UGENE Assembly Browser
The UGENE Assembly Browser project was started in 2010 as an entry for the Illumina iDEA Challenge 2011. The Assembly Browser lets a user visualize and browse large next-generation sequence assemblies of up to hundreds of millions of short reads. The only format currently supported is BAM, the binary version of SAM.
To browse assembly data in UGENE, the input file must first be converted to a UGENE database file. This approach has both advantages and disadvantages. The disadvantages are that converting a large BAM file may take time and that there must be enough disk space to store the database. On the other hand, the conversion makes it possible to get an overview of the whole assembly, navigate within it and jump to well-covered regions rapidly. In addition, before the conversion the user can choose which contigs to extract from the BAM file. This makes it possible to open large files such as 1000 Genomes Project data.
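Why converting reads into an indexed database can make region queries fast is sketched below in Python with the standard `sqlite3` module. The layout of a real UGENE database file is not described in this article; the `reads` table, its column names and the toy coordinates are assumptions made only for this example.

```python
import sqlite3

# In-memory database standing in for the converted file (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE reads (contig TEXT, start_pos INTEGER, end_pos INTEGER)"
)
# An index on (contig, start_pos) can let region queries avoid a full scan.
conn.execute("CREATE INDEX idx_reads_pos ON reads (contig, start_pos)")

# Toy reads as half-open intervals [start_pos, end_pos);
# real data would come from a BAM file.
toy_reads = [
    ("chr1", 0, 50), ("chr1", 10, 60), ("chr1", 40, 90),
    ("chr1", 500, 550), ("chr2", 5, 55),
]
conn.executemany("INSERT INTO reads VALUES (?, ?, ?)", toy_reads)


def reads_in_region(conn, contig, region_start, region_end):
    """Count reads on `contig` that overlap [region_start, region_end)."""
    row = conn.execute(
        "SELECT COUNT(*) FROM reads "
        "WHERE contig = ? AND start_pos < ? AND end_pos > ?",
        (contig, region_end, region_start),
    ).fetchone()
    return row[0]


print(reads_in_region(conn, "chr1", 0, 100))
print(reads_in_region(conn, "chr1", 100, 500))
```

The one-time cost of filling the table and building the index mirrors the conversion cost mentioned above, and the repeated cheap region counts mirror browsing and locating well-covered regions.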
Supported biological data formats
- Sequences and annotations: FASTA (.fa), GenBank (.gb), EMBL (.emb), GFF (.gff)
- Multiple sequence alignments: Clustal (.aln), MSF (.msf), Stockholm (.sto), Nexus (.nex)
- 3D structures: PDB (.pdb), MMDB (.prt)
- Chromatograms: ABIF (.abi), SCF (.scf)
- Short reads: Sequence Alignment/Map (SAM) (.sam), binary version of SAM (.bam), ACE (.ace), FASTQ (.fastq)
- Phylogenetic trees: Newick (.nwk)
- Other formats: Bairoch (enzymes info), HMM (HMMER profiles), PWM and PFM (position matrices), etc.
Release cycle
UGENE is primarily developed by Unipro LLC. Each development iteration lasts about six weeks and ends with a release. Development snapshots of the software are also available for download.
The features included in the next release are mostly requested by users.
See also
- Sequence alignment software
- Bioinformatics
- Computational biology
- List of open source bioinformatics software
References
- Okonechnikov, K.; Golosova, O.; Fursov, M.; the UGENE team (2012). "Unipro UGENE: a unified bioinformatics toolkit". Bioinformatics. doi:10.1093/bioinformatics/bts091.
- Fursov, M.; Novikova, O. (2008). "Multitasking software system for DNA analysis". Proceedings of the Sixth International Conference on Bioinformatics of Genome Regulation and Structure 1: 78. ISBN 978-5-91291-005-0.
- Fursov, M. Y.; Oshchepkov, D. Y.; Novikova, O. S. (2009). "UGENE: interactive computational schemes for genome analysis". Proceedings of the Fifth Moscow International Congress on Biotechnology 3: 14–15. ISBN 5-7237-0372-2.
- Efremov, I. E.; Fursov, M. Y.; Danilova, Yu. E. (2009). "UGENE: high performance genome analysis suite". Proceedings of the Fifth Moscow International Congress on Biotechnology 2: 405–406. ISBN 5-7237-0372-2.
- Fursov, M. Y.; Varlamov, A. (2009). "UGENE - A practical approach for complex computational analysis in molecular biology". Proceedings of the 10th Annual Bioinformatics Open Source Conference: 7.
External links
- Official website
- UGENE podcast
- UGENE documentation
- UGENE forum
- http://www.linuxformat.ru/foss-contest#foss2010-results
- http://www.t-platforms.ru/ru/about/allnews/newsarchive/87--l-r-powerxcell-8i.html