List of sequence alignment software

This list of sequence alignment software is a compilation of software tools and web portals used in pairwise sequence alignment and multiple sequence alignment. See structural alignment software for structural alignment of proteins.

Database search only

Name	Description	Sequence Type*	Link	Authors	Year
BLAST	local search with fast k-tuple heuristic (Basic Local Alignment Search Tool)	Both	NCBI EMBL-EBI DDBJ DDBJ (psi-blast) GenomeNet PIR (protein only)	Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ^[1]	1990
CS-BLAST	sequence-context specific BLAST, more sensitive than BLAST, FASTA, and SSEARCH. Position-specific iterative version CSI-BLAST more sensitive than PSI-BLAST	Protein	CS-BLAST server download	Angermueller C, Biegert A, Soeding J^[2]	2013
CUDASW++	GPU-accelerated Smith-Waterman algorithm for multiple shared-host GPUs	Protein	homepage publication	Liu Y, Maskell DL and Schmidt B	2009/2010
FASTA	local search with fast k-tuple heuristic, slower but more sensitive than BLAST	Both	EMBL-EBI DDBJ GenomeNet PIR (protein only)
GGSEARCH / GLSEARCH	Global:Global (GG), Global:Local (GL) alignment with statistics	Protein	FASTA server
HMMER	local and global search with profile Hidden Markov models, more sensitive than PSI-BLAST	Both	download	Durbin R, Eddy SR, Krogh A, Mitchison G^[3]	1998
HHpred / HHsearch	pairwise comparison of profile Hidden Markov models; very sensitive, but can only search alignment databases (Pfam, PDB, InterPro...)	Protein	server download	Söding J^[4]	2005
IDF	Inverse Document Frequency	Both	download
Infernal	profile SCFG search	RNA	download	Eddy S
KLAST	high-performance general purpose sequence similarity search tool	Both	homepage publication		2009/2014
PSI-BLAST	position-specific iterative BLAST, local search with position-specific scoring matrices, much more sensitive than BLAST	Protein	NCBI PSI-BLAST	Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ^[5]	1997
PSI-Search	Combining the Smith-Waterman search algorithm with the PSI-BLAST profile construction strategy to find distantly related protein sequences, and preventing homologous over-extension errors.	Protein	EMBL-EBI PSI-Search	Li W, McWilliam H, Goujon M, Cowley A, Lopez R, Pearson WR^[6]	2012
ScalaBLAST	Highly parallel Scalable BLAST	Both	ScalaBLAST	Oehmen et al.^[7]	2011
Sequilab	Linking and profiling sequence alignment data from NCBI-BLAST results with major sequence analysis servers/services	Nucleotide/peptide	server		2010
SAM	local and global search with profile Hidden Markov models, more sensitive than PSI-BLAST	Both	SAM	Karplus K, Krogh A^[8]	1999
SSEARCH	Smith-Waterman search, slower but more sensitive than FASTA	Both	EMBL-EBI DDBJ
SWAPHI	the first parallelized algorithm employing the emerging Intel Xeon Phis to accelerate Smith-Waterman protein database search	Protein	homepage	Liu Y and Schmidt B	2014
SWAPHI-LS	the first parallel Smith-Waterman algorithm exploiting Intel Xeon Phi clusters to accelerate the alignment of long DNA sequences	DNA	homepage	Liu Y, Tran TT, Lauenroth F, Schmidt B	2014
SWIPE	fast Smith-Waterman search using SIMD parallelization	Both	homepage	Rognes T	2011

*Sequence Type: Protein or nucleotide
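
Several entries above (BLAST, FASTA) are described as using a "fast k-tuple heuristic". As an illustration only, the seeding idea can be sketched as follows: index every length-k word of the subject sequence, then look up each length-k word of the query to find exact shared words ("seeds") that a fuller alignment could later extend. The function names and the default k=3 are assumptions made for this sketch; it is not the implementation used by any listed tool, and real programs add scoring thresholds, neighbourhood words and extension steps.

```python
from collections import defaultdict


def build_kmer_index(subject, k):
    """Map every length-k word in `subject` to the positions where it starts."""
    index = defaultdict(list)
    for i in range(len(subject) - k + 1):
        index[subject[i:i + k]].append(i)
    return index


def find_seeds(query, subject, k=3):
    """Return (query_pos, subject_pos) pairs where an exact k-tuple is shared."""
    index = build_kmer_index(subject, k)
    seeds = []
    for q in range(len(query) - k + 1):
        # .get avoids creating empty entries for words absent from the subject
        for s in index.get(query[q:q + k], []):
            seeds.append((q, s))
    return seeds


if __name__ == "__main__":
    # Seeds sharing the same value of subject_pos - query_pos lie on one diagonal.
    print(find_seeds("ACGT", "TTACGTT", k=3))
```

Seeds that fall on the same diagonal (equal subject_pos minus query_pos) are the natural candidates for extension into a longer local alignment.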

Pairwise alignment

Name	Description	Sequence Type*	Alignment Type**	Link	Author	Year
ACANA	fast heuristic anchor based pairwise alignment	Both	Both	download	Huang, Umbach, Li	2005
AlignMe	Alignments for membrane protein sequences	Protein	Both	download,server	M. Stamm, K. Khafizov, R. Staritzbichler, L.R. Forrest	2013
Bioconductor Biostrings::pairwiseAlignment	Dynamic programming	Both	Both + Ends-free	site	P. Aboyoun	2008
BioPerl dpAlign	Dynamic programming	Both	Both + Ends-free	site	Y. M. Chan	2003
BLASTZ,LASTZ	Seeded pattern-matching	Nucleotide	Local	download,download	Schwartz et al.^[9]^[10]	2004,2009
DNADot	Web-based dot-plot tool	Nucleotide	Global	server	R. Bowen	1998
DOTLET	Java-based dot-plot tool	Both	Global	applet	M. Pagni and T. Junier	1998
FEAST	Posterior based local extension with descriptive evolution model	Nucleotide	Local	site	A. K. Hudek and D. G. Brown	2010
G-PAS	GPU-based dynamic programming with backtracking	Both	Local, SemiGlobal, Global	site+download	W. Frohmberg, M. Kierzynka et al.	2011
GapMis	Pairwise sequence alignment with a single gap	Both	SemiGlobal	site	K. Frousios, T. Flouri, C. S. Iliopoulos, K. Park, S. P. Pissis, G. Tischler	2012
GGSEARCH, GLSEARCH	Global:Global (GG), Global:Local (GL) alignment with statistics	Protein	Global in query	FASTA server	W. Pearson	2007
JAligner	Open source Java implementation of Smith-Waterman	Both	Local	JWS	A. Moustafa	2005
K*Sync	Protein sequence to structure alignment that includes secondary structure, structural conservation, structure-derived sequence profiles, and consensus alignment scores	Protein	Both	Robetta server	D. Chivian & D. Baker ^[11]	2003
LALIGN	Multiple, non-overlapping, local similarity (same algorithm as SIM)	Both	Local non-overlapping	server FASTA server	W. Pearson	1991 (algorithm)
NW-align	Standard Needleman-Wunsch dynamic programming algorithm	Protein	Global	server and download	Y Zhang	2012
mAlign	modelling alignment; models the information content of the sequences	Nucleotide	Both	doc code	D. Powell, L. Allison and T. I. Dix	2004
matcher	Waterman-Eggert local alignment (based on LALIGN)	Both	Local	Pasteur	I. Longden (modified from W. Pearson)	1999
MCALIGN2	explicit models of indel evolution	DNA	Global	server	J. Wang et al.	2006
MUMmer	suffix tree based	Nucleotide	Global	download	S. Kurtz et al.	2004
needle	Needleman-Wunsch dynamic programming	Both	SemiGlobal	EMBL-EBI Pasteur	A. Bleasby	1999
Ngila	logarithmic and affine gap costs and explicit models of indel evolution	Both	Global	download	R. Cartwright	2007
Path	Smith-Waterman on protein back-translation graph (detects frameshifts at protein level)	Protein	Local	server download	M. Gîrdea et al.^[12]	2009
PatternHunter	Seeded pattern-matching	Nucleotide	Local	download	B. Ma et al.^[13]^[14]	2002–2004
ProbA (also propA)	Stochastic partition function sampling via dynamic programming	Both	Global	download	U. Mückstein	2002
PyMOL	"align" command aligns sequence & applies it to structure	Protein	Global (by selection)	site	W. L. DeLano	2007
REPuter	suffix tree based	Nucleotide	Local	download	S. Kurtz et al.	2001
SABERTOOTH	Alignment using predicted Connectivity Profiles	Protein	Global	download on request	F. Teichert, J. Minning, U. Bastolla, and M. Porto	2009
Satsuma	Parallel whole-genome synteny alignments	DNA	Local	download	M.G. Grabherr et al.	2010
SEQALN	Various dynamic programming	Both	Local or Global	server	M.S. Waterman and P. Hardy	1996
SIM, GAP, NAP, LAP	Local similarity with varying gap treatments	Both	Local or global	server	X. Huang and W. Miller	1990-6
SIM	Local similarity	Both	Local	servers	X. Huang and W. Miller	1991
SPA: Super pairwise alignment	Fast pairwise global alignment	Nucleotide	Global	available upon request	Shen, Yang, Yao, Hwang	2002
SSEARCH	Local (Smith-Waterman) alignment with statistics	Protein	Local	EMBL-EBI FASTA server	W. Pearson	1981 (Algorithm)
Sequences Studio	Java applet demonstrating various algorithms from ^[15]	Generic sequence	Local and global	code applet	A.Meskauskas	1997 (reference book)
SWIFT suit	Fast Local Alignment Searching	DNA	Local	site	K. Rasmussen,^[16] W. Gerlach	2005,2008
stretcher	Memory-optimized Needleman-Wunsch dynamic programming	Both	Global	Pasteur	I. Longden (modified from G. Myers and W. Miller)	1999
tranalign	Aligns nucleic acid sequences given a protein alignment	Nucleotide	NA	Pasteur	G. Williams (modified from B. Pearson)	2002
UGENE	Open-source Smith-Waterman for SSE/CUDA, suffix-array-based repeats finder and dotplot	Both	Both	UGENE site	UniPro	2010
water	Smith-Waterman dynamic programming	Both	Local	EMBL-EBI Pasteur	A. Bleasby	1999
wordmatch	k-tuple pairwise match	Both	NA	Pasteur	I. Longden	1998
YASS	Seeded pattern-matching	Nucleotide	Local	server download	L. Noe and G. Kucherov ^[17]	2004–2011

*Sequence Type: Protein or nucleotide. **Alignment Type: Local or global
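
Many tools in this table are described as Needleman-Wunsch (global) or Smith-Waterman (local) dynamic programming. The difference between the two recurrences can be sketched as below. This is a minimal score-only illustration with a linear gap penalty; the scoring values (match=1, mismatch=-1, gap=-2) and function names are assumptions chosen for the example, and the listed programs generally use substitution matrices, affine gap costs and traceback to recover the alignment itself.

```python
def global_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch: best score for aligning all of `a` with all of `b`."""
    # Row 0: aligning the empty prefix of `a` against b[:j] costs j gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # column 0: a[:i] against the empty prefix of `b`
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(prev[j - 1] + sub,   # align a[i-1] with b[j-1]
                            prev[j] + gap,       # gap in `b`
                            curr[j - 1] + gap))  # gap in `a`
        prev = curr
    return prev[-1]


def local_score(a, b, match=1, mismatch=-1, gap=-2):
    """Smith-Waterman: best score over all pairs of substrings of `a` and `b`."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0]
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            # The extra 0 lets an alignment restart anywhere, which makes it local.
            cell = max(0, prev[j - 1] + sub, prev[j] + gap, curr[j - 1] + gap)
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


if __name__ == "__main__":
    print(global_score("ACGT", "AGT"))
    print(local_score("TTACGTT", "GGACGGG"))
```

The only structural differences are the boundary conditions (gap penalties versus zeros) and the clamp at zero; a semi-global ("ends-free") variant would keep the global recurrence but leave terminal gaps unpenalised.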

Multiple sequence alignment

Name	Description	Sequence Type*	Alignment Type**	Link	Author	Year	License
ABA	A-Bruijn alignment	Protein	Global	download	B.Raphael et al.	2004	Proprietary, without charge for educational, research and non profit.
ALE	manual alignment; some software assistance	Nucleotides	Local	download	J. Blandy and K. Fogel	1994 (latest version 2007)	GPL2
AMAP	Sequence annealing	Both	Global	server	A. Schwartz and L. Pachter	2006
anon.	fast, optimal alignment of three sequences using linear gap costs	Nucleotides	Global	paper software	D. Powell, L. Allison and T. I. Dix	2000
BAli-Phy	Tree+Multi alignment; Probabilistic/Bayesian; Joint Estimation	Both	Global	WWW+download	BD Redelings and MA Suchard	2005 (latest version 2015)	GPL
Base-By-Base	Java-based multiple sequence alignment editor with integrated analysis tools	Both	Local or Global	download	R. Brodie et al.	2004	Free, requires registration.
CHAOS/DIALIGN	Iterative alignment	Both	Local (preferred)	server	M. Brudno and B. Morgenstern	2003
ClustalW	Progressive alignment	Both	Local or Global	download EMBL-EBI DDBJ PBIL EMBNet GenomeNet	Thompson et al.	1994	GNU Lesser GPL
CodonCode Aligner	Multi alignment; ClustalW & Phrap support	Nucleotides	Local or Global	download	P. Richterich et al.	2003 (latest version 2009)
Compass	COmparison of Multiple Protein sequence Alignments with assessment of Statistical Significance	Protein	Global	download and server	R.I. Sadreyev, et al.	2009
DECIPHER	Progressive/iterative alignment	Both	Global	download	Erik S. Wright	2014	GPL
DIALIGN-TX and DIALIGN-T	Segment-based method	Both	Local (preferred) or Global	download and server	A.R.Subramanian	2005 (latest version 2008)
DNA Alignment	Segment-based method for intraspecific alignments	Both	Local (preferred) or Global	server	A.Roehl	2005 (latest version 2008)
DNA Baser Sequence Assembler	Multi alignment; Automatic batch alignment	Nucleotides	Local or Global	www.DnaBaser.com	Heracle BioSoft	2006 (latest version 2014)
EDNA	Energy Based Multiple Sequence Alignment for DNA Binding Sites	Nucleotides	Local or Global	sourceforge.net/projects/msa-edna/	Salama, RA. et al.	2013
FSA	Sequence annealing	Both	Global	download and server	R. K. Bradley et al.	2008
Geneious	Progressive/Iterative alignment; ClustalW plugin	Both	Local or Global	download	A.J. Drummond et al.	2005 (latest version 2009)
Kalign	Progressive alignment	Both	Global	server EMBL-EBI MPItoolkit	T. Lassmann	2005
MAFFT	Progressive/iterative alignment	Both	Local or Global	GenomeNet MAFFT	K. Katoh et al.	2005
MARNA	Multiple Alignment of RNAs	RNA	Local	server download	S. Siebert et al.	2005
MAVID	Progressive alignment	Both	Global	server	N. Bray and L. Pachter	2004
MSA	Dynamic programming	Both	Local or Global	download	D.J. Lipman et al.	1989 (modified 1995)
MSAProbs	Dynamic programming	Protein	Global	download	Y. Liu, B. Schmidt, D. Maskell	2010
MULTALIN	Dynamic programming/clustering	Both	Local or Global	server download	F. Corpet	1988
Multi-LAGAN	Progressive dynamic programming alignment	Both	Global	server	M. Brudno et al.	2003
MUSCLE	Progressive/iterative alignment	Both	Local or Global	server	R. Edgar	2004
Opal	Progressive/iterative alignment	Both	Local or Global	download	T. Wheeler and J. Kececioglu	2007
Pecan	Probabilistic/consistency	DNA	Global	download	B. Paten et al.	2008
Phylo	A human computing framework for comparative genomics to solve multiple alignment	Nucleotides	Local or Global	site	McGill Bioinformatics	2010
Praline	Progressive/iterative/consistency/homology-extended alignment with pre-profiling and secondary structure prediction	Protein	Global	server	J. Heringa	1999 (latest version 2009)
PicXAA	non-progressive/maximum expected accuracy alignment	Both	Global	download and server	S.M.E. Sahraeian and B.J. Yoon	2010
POA	Partial order/hidden Markov model	Protein	Local or Global	download	C. Lee	2002
Probalign	Probabilistic/consistency with partition function probabilities	Protein	Global	server	Roshan and Livesay	2006
ProbCons	Probabilistic/consistency	Protein	Local or Global	server	C. Do et al.	2005
PROMALS3D	Progressive alignment/hidden Markov model/Secondary structure/3D structure	Protein	Global	server	J. Pei et al.	2008
PRRN/PRRP	Iterative alignment (especially refinement)	Protein	Local or Global	PRRP PRRN	Y. Totoki (based on O. Gotoh)	1991 and later
PSAlign	Alignment preserving non-heuristic	Both	Local or Global	download	S.H. Sze, Y. Lu, Q. Yang.	2006
RevTrans	Combines DNA and Protein alignment, by back translating the protein alignment to DNA.	DNA/Protein (special)	Local or Global	server	Wernersson and Pedersen	2003 (newest version 2005)
SAGA	Sequence alignment by genetic algorithm	Protein	Local or Global	download	C. Notredame et al.	1996 (new version 1998)
SAM	Hidden Markov model	Protein	Local or Global	server	A. Krogh et al.	1994 (most recent version 2002)
Se-Al	Manual alignment	Both	Local	download	A. Rambaut	2002
StatAlign	Bayesian co-estimation of alignment and phylogeny (MCMC)	Both	Global	download	A. Novak et al.	2008
Stemloc	Multiple alignment and secondary structure prediction	RNA	Local or Global	download	I. Holmes	2005	GPLv3 (part of DART)
T-Coffee	More sensitive progressive alignment	Both	Local or Global	server download	C. Notredame et al.	2000 (newest version 2008)	GPL2
UGENE	Supports multiple alignment with MUSCLE, KAlign, Clustal and MAFFT plugins	Both	Local or Global	download	UGENE team	2010 (newest version 2012)	GPL2
VectorFriends	VectorFriends Aligner, MUSCLE plugin, and ClustalW plugin	Both	Local or Global	download	BioFriends team	2013	Proprietary, but free for academic researchers
GLProbs	Adaptive pair-Hidden Markov Model based approach	Protein	Global	download	Y. Ye et al.	2013

*Sequence Type: Protein or nucleotide. **Alignment Type: Local or global

Genomics analysis

Name	Description	Sequence Type*	Link
ACT (Artemis Comparison Tool)	Synteny and comparative genomics	Nucleotide	server
AVID	Pairwise global alignment with whole genomes	Nucleotide	server
BLAT	Alignment of cDNA sequences to a genome.	Nucleotide	^[18]
GMAP	Alignment of cDNA sequences to a genome. Identifies splice site junctions with high accuracy.	Nucleotide	http://research-pub.gene.com/gmap
Splign	Alignment of cDNA sequences to a genome. Identifies splice site junctions with high accuracy. Able to recognize and separate gene duplications.	Nucleotide	http://www.ncbi.nlm.nih.gov/sutils/splign
Mauve	Multiple alignment of rearranged genomes (also available inside Geneious)	Nucleotide	download
MGA	Multiple Genome Aligner	Nucleotide	download
Mulan	Local multiple alignments of genome-length sequences	Nucleotide	server
Multiz	Multiple alignment of genomes	Nucleotide	download
PLAST-ncRNA	Search for ncRNAs in genomes by partition function local alignment	Nucleotide	server
Sequerome	Profiling sequence alignment data with major servers/services	Nucleotide/peptide	server
Sequilab	Profiling sequence alignment data from NCBI-BLAST results with major servers/services	Nucleotide/peptide	server
Shuffle-LAGAN	Pairwise glocal alignment of completed genome regions	Nucleotide	server
SIBsim4 / Sim4	A program designed to align an expressed DNA sequence with a genomic sequence, allowing for introns	Nucleotide	download
SLAM	Gene finding, alignment, annotation (human-mouse homology identification)	Nucleotide	server

*Sequence Type: Protein or nucleotide

Motif finding

Name	Description	Sequence Type*	Link
PMS	Motif search and discovery	Both	server server
FMM	Motif search and discovery (can get also positive & negative sequences as input for enriched motif search)	Nucleotide	server
BLOCKS	Ungapped motif identification from BLOCKS database	Both	server
eMOTIF	Extraction and identification of shorter motifs	Both	servers
Gibbs motif sampler	Stochastic motif extraction by statistical likelihood	Both	server server
HMMTOP	Prediction of transmembrane helices and topology of proteins	Protein	homepage & download
I-sites	Local structure motif library	Protein	server
JCoils	Prediction of Coiled coil and Leucine Zipper	Protein	homepage & download
MEME/MAST	Motif discovery and search	Both	server
CUDA-MEME	GPU accelerated MEME (v4.4.0) algorithm for GPU clusters	Both	homepage
MERCI	Discriminative motif discovery and search	Both	homepage & download
PHI-Blast	Motif search and alignment tool	Both	Pasteur
Phyloscan	Motif search tool	Nucleotide	server
PRATT	Pattern generation for use with ScanProsite	Protein	server
ScanProsite	Motif database search tool	Protein	server
TEIRESIAS	Motif extraction and database search	Both	server
BASALT	Multiple motif and regular expression search	Both	homepage

*Sequence Type: Protein or nucleotide

Benchmarking

Name	Link	Authors
BAliBASE	download	Thompson, Plewniak, Poch
HOMSTRAD	download	Mizuguchi
Oxbench	download	Raghava, Searle, Audley, Barber, Barton
PFAM	download
PREFAB	download	Edgar
SABmark	download	Van Walle, Lasters, Wyns
SMART	download	Letunic, Copley, Schmidt, Ciccarelli, Doerks, Schultz, Ponting, Bork

Alignment Viewers/Editors

Please see the List of alignment visualization software.

Short-Read Sequence Alignment

Name	Description	paired-end option	Use FASTQ quality	Gapped	Multi-threaded	License	Link	Reference	Year
BarraCUDA	A GPGPU accelerated Burrows-Wheeler transform (FM-index) short read alignment program based on BWA, supports alignment of indels with gap openings and extensions.	Yes	No	Yes	Yes (POSIX Threads and CUDA)	GPL	link
BBMap	Uses short k-mers to rapidly index the genome; no size or scaffold count limit. Higher sensitivity and specificity than Burrows-Wheeler aligners, with similar or greater speed. Performs affine-transform-optimized global alignment, which is slower but more accurate than Smith-Waterman. Handles Illumina, 454, PacBio, Sanger, and Ion Torrent data. Splice-aware; capable of processing long indels and RNA-seq. Pure Java; runs on any platform. Used by the Joint Genome Institute.	Yes	Yes	Yes	Yes	BSD	link		2010
BFAST	Explicit time and accuracy tradeoff with a prior accuracy estimation, supported by indexing the reference sequences. Optimally compresses indexes. Can handle billions of short reads. Can handle insertions, deletions, SNPs, and color errors (can map ABI SOLiD color space reads). Performs a full Smith Waterman alignment.				Yes (POSIX Threads)	GPL	link	^[19]	2009
BLASTN	BLAST's nucleotide alignment program; slow and not accurate for short reads, and uses a sequence database (EST, Sanger sequence) rather than a reference genome.						link
BLAT	Made by Jim Kent. Can handle one mismatch in initial alignment step.				Yes (client/server).	Free for academic and non-commercial use.	link	^[20]	2002
Bowtie	Uses a Burrows-Wheeler transform to create a permanent, reusable index of the genome; 1.3 GB memory footprint for human genome. Aligns more than 25 million Illumina reads in 1 CPU hour. Supports Maq-like and SOAP-like alignment policies	Yes	Yes	No	Yes (POSIX Threads)	Artistic License	link	^[21]	2009
HIVE-hexagon	Uses a hash table and bloom matrix to create and filter potential positions on the genome. For higher efficiency, uses cross-similarity between short reads and avoids realigning non-unique redundant sequences. It is faster than Bowtie and BWA and allows indels and divergent sensitive alignments on viruses and bacteria as well as more conservative eukaryotic alignments.	Yes	Yes	Yes	Yes	Free for academic and non-commercial users registered to HIVE deployment instance.	link	^[22]	2014
BWA	Uses a Burrows-Wheeler transform to create an index of the genome. Slightly slower than Bowtie but allows indels in alignment.	Yes	No	Yes	Yes	GPL	link	^[23]	2009
BWA-PSSM	A probabilistic short read aligner based on the use of position specific scoring matrices (PSSM). The aligner is adaptable in the sense that it can take into account the quality scores of the reads and models of data specific biases, such as those observed in Ancient DNA, PAR-CLIP data or genomes with biased nucleotide compositions.^[24]	Yes	Yes	Yes	Yes	GPL	link	^[24]	2014
CASHX	Quantify and manage large quantities of short-read sequence data. CASHX pipeline contains a set of tools that can be used together or as independent modules on their own. This algorithm is very accurate for perfect hits to a reference genome.				No	Free for academic and non-commercial use.	link
Cloudburst	Short-read mapping using Hadoop MapReduce				Yes (Hadoop MapReduce)	Artistic License	link
CUDA-EC	Short-read alignment error correction using GPUs.				Yes (GPU enabled)		link
CUSHAW	A CUDA compatible short read aligner to large genomes based on Burrows-Wheeler transform.	Yes	Yes	No	Yes (GPU enabled)	GPL	link	^[25]	2012
CUSHAW2	Gapped short-read and long-read alignment based on maximal exact match seeds. This aligner supports both base-space (e.g. from Illumina, 454, Ion Torrent and PacBio sequencers) and ABI SOLiD color-space read alignments.	Yes	No	Yes	Yes	GPL	link		2014
CUSHAW2-GPU	GPU-accelerated CUSHAW2 short-read aligner.	Yes	No	Yes	Yes	GPL	link
CUSHAW3	Sensitive and Accurate Base-Space and Color-Space Short-Read Alignment with Hybrid Seeding	Yes	No	Yes	Yes	GPL	link	^[26]	2012
drFAST	Read mapping alignment software that implements cache obliviousness to minimize main/cache memory transfers, like mrFAST and mrsFAST, but designed for the SOLiD sequencing platform (color space reads). It also returns all possible map locations for improved structural variation discovery.	Yes	Yes (for structural variation)	Yes	No	BSD	link
ELAND	Implemented by Illumina. Includes ungapped alignment with a finite read length.
ERNE	Extended Randomized Numerical alignEr for accurate alignment of NGS reads. It can map bisulfite-treated reads.	Yes	Low quality bases trimming	Yes	Multithreading and MPI-enabled	GPL v3	link
GASSST	Finds global alignments of short DNA sequences against large DNA banks				Multithreading	CeCILL version 2 License.	link	^[27]	2011
GEM	High-quality alignment engine (exhaustive mapping with substitutions and indels). More accurate and several times faster than BWA or Bowtie 1/2. Many standalone biological applications (mapper, split mapper, mappability, and other) provided.	Yes	Yes	Yes	Yes	Dual (free for non-commercial use); GEM source is currently unavailable	link	^[28]	2012
Genalice MAP	Ultra fast and comprehensive NGS read aligner with high precision and small storage footprint.	Yes	Low quality bases trimming	Yes	Yes	Commercial	link
Geneious Assembler	Fast, accurate overlap assembler with the ability to handle any combination of sequencing technology, read length, any pairing orientations, with any spacer size for the pairing, with or without a reference genome.				Yes	Commercial	link
GensearchNGS	Complete framework with user-friendly GUI to analyse NGS data. It integrates a proprietary high quality alignment algorithm as well as plug-in capability to integrate various public aligner into a framework allowing to import short reads, align them, detect variants and generate reports. It is geared towards re-sequencing projects, namely in a diagnostic setting.	Yes	No	Yes	Yes	Commercial	link
GMAP and GSNAP	Robust, fast short-read alignment. GMAP: longer reads, with multiple indels and splices (see entry above under Genomics analysis); GSNAP: shorter reads, with a single indel or up to two splices per read. Useful for digital gene expression, SNP and indel genotyping. Developed by Thomas Wu at Genentech. Used by the National Center for Genome Resources (NCGR) in Alpheus.	Yes	Yes	Yes	Yes	Free for academic and non-commercial use.	link
GNUMAP	Accurately performs gapped alignment of sequence data obtained from next-generation sequencing machines (specifically that of Solexa/Illumina) back to a genome of any size. Includes adaptor trimming, SNP calling and Bisulfite sequence analysis.		Yes (also supports Illumina _int.txt and _prb.txt files with all 4 quality scores for each base)		Multithreading and MPI-enabled		link	^[29]	2009
iSAAC	iSAAC has been designed to take full advantage of all the computational power available on a single server node. As a result iSAAC scales well over a broad range of hardware architectures, and alignment performance improves with hardware capabilities	Yes	Yes	Yes	Yes	BSD	github paper
LAST	LAST uses adaptive seeds and copes more efficiently with repeat-rich sequences (e.g. genomes). For example, it can align reads to genomes without repeat-masking and without becoming overwhelmed by repetitive hits.	Yes	Yes	Yes	No	GPL	link	^[30]	2011
MAQ	Ungapped alignment that takes into account quality scores for each base.					GPL	link
mrFAST and mrsFAST	Gapped (mrFAST) and ungapped (mrsFAST) alignment software that implements cache obliviousness to minimize main/cache memory transfers. They are designed for the Illumina sequencing platform and they can return all possible map locations for improved structural variation discovery.	Yes	Yes (for structural variation)	Yes	No	BSD	mrFAST mrsFAST
MOM	MOM or maximum oligonucleotide mapping is a query matching tool that captures a maximal length match within the short read.				Yes		link
MOSAIK	Fast gapped aligner and reference-guided assembler. Aligns reads using a banded Smith-Waterman algorithm seeded by results from a k-mer hashing scheme. Supports reads ranging in size from very short to very long.				Yes		link
MPscan	Fast aligner based on a filtration strategy (no indexing, use q-grams and Backward Nondeterministic DAWG Matching)						link	^[31]	2009
Novoalign & NovoalignCS	Gapped alignment of single-end and paired-end Illumina GA I & II, ABI Colour space and Ion Torrent reads. High sensitivity and specificity, using base qualities at all steps in the alignment. Includes adapter trimming, base quality calibration, Bi-Seq alignment, and an option to report multiple alignments per read.	Yes	Yes	Yes	Multi-threading and MPI versions available with paid license.	Single threaded version free for academic and non-commercial use.	Novocraft
NextGENe	NextGENe® software has been developed specifically for use by biologists performing analysis of next generation sequencing data from Roche Genome Sequencer FLX, Illumina GA/HiSeq, Life Technologies Applied BioSystems’ SOLiD™ System, PacBio and Ion Torrent platforms.	Yes	Yes	Yes	Yes	Commercial	Softgenetics
NextGenMap	Flexible and fast read mapping program (twice as fast as BWA), achieves a mapping sensitivity comparable to Stampy. Internally uses a memory efficient index structure (hash table) to store the positions of all 13-mers present in the reference genome. Mapping regions where pairwise alignments are required are dynamically determined for each read. Uses fast SIMD instructions (SSE) to accelerate the alignment calculations on the CPU. If available, alignments are computed on the GPU (using OpenCL/CUDA) resulting in an additional runtime reduction of 20 - 50%.	Yes	No	Yes	Yes (POSIX Threads, OpenCL/CUDA, SSE)	Open Source	Official GitHub Page	^[32]	2013
Omixon	The Omixon Variant Toolkit includes highly sensitive and highly accurate tools for detecting SNPs and indels. It offers a solution to map NGS short reads with a moderate distance (up to 30% sequence divergence) from reference genomes. It poses no restrictions on the size of the reference, which, combined with its high sensitivity, makes the Variant Toolkit well-suited for targeted sequencing projects and diagnostics.	Yes	Yes	Yes	Yes	Commercial	www.omixon.com
PALMapper	PALMapper efficiently computes both spliced and unspliced alignments at high accuracy. Relying on a machine learning strategy combined with fast mapping based on a banded Smith-Waterman-like algorithm, it aligns around 7 million reads per hour on a single CPU. It refines the originally proposed QPALMA approach.				Yes	GPL	link
Partek	Partek® Flow software has been developed specifically for use by biologists and bioinformaticians. It supports ungapped, gapped and splice-junction alignment of single and paired-end reads from Illumina, Life Technologies SOLiD™, Roche 454 and Ion Torrent raw data (with or without quality information). It integrates quality control at the FASTQ/Qual level and on aligned data. Additional functionality includes trimming and filtering of raw reads, SNP and indel detection, mRNA and microRNA quantification, and fusion gene detection.	Yes	Yes	Yes	Multiprocessor/Core, Client-Server installation possible	Commercial, FREE trial version
PASS	Indexes the genome, then extends seeds using pre-computed alignments of words. Works with base space as well as color space (SOLID) and can align genomic and spliced RNA-seq reads.	Yes	Yes	Yes	Yes	Free for academic and non-commercial use.	PASS_HOME
PerM	Indexes the genome with periodic seeds to quickly find alignments with full sensitivity up to four mismatches. It can map Illumina and SOLiD reads. Unlike most mapping programs, speed increases for longer read lengths.				Yes	GPL	link	^[33]
PRIMEX	Indexes the genome with a k-mer lookup table with full sensitivity up to an adjustable number of mismatches. It is best for mapping 15-60bp sequences to a genome.	No	No	Yes	No (multiple processes per search)		link		2003
QPalma	Takes advantage of quality scores, intron lengths and computational splice site predictions to perform an unbiased alignment. Can be trained to the specifics of an RNA-seq experiment and genome. Useful for splice site/intron discovery and for gene model building. (See PALMapper for a faster version.)				Yes (client/server)	GPLv2	link
RazerS	No read length limit. Hamming or edit distance mapping with configurable error rates. Configurable and predictable sensitivity (runtime/sensitivity tradeoff). Supports paired-end read mapping.					LGPL	link
REAL, cREAL	REAL is an efficient, accurate, and sensitive tool for aligning short reads obtained from next-generation sequencing. The programme can handle an enormous amount of single-end reads generated by the next-generation Illumina/Solexa Genome Analyzer. cREAL is a simple extension of REAL for aligning short reads obtained from next-generation sequencing to a genome with circular structure.		Yes		Yes	GPL	link
RMAP	Can map reads with or without error probability information (quality scores) and supports paired-end reads or bisulfite-treated read mapping. There are no limitations on read length or number of mismatches.	Yes	Yes	Yes		GPL v3	link
rNA	A randomized Numerical Aligner for Accurate alignment of NGS reads	Yes	Low quality bases trimming	Yes	Multithreading and MPI-enabled	GPL v3	link
RTG Investigator	Extremely fast, tolerant to high indel and substitution counts. Includes full read alignment. Product includes comprehensive pipelines for variant detection and metagenomic analysis with any combination of Illumina, Complete Genomics and Roche 454 data.	Yes	Yes, for variant calling	Yes	Yes	Free for individual investigator use.	link
Segemehl	Can handle insertions, deletions and mismatches. Uses enhanced suffix arrays.	Yes	No	Yes	Yes	Free for non-commercial use	link	^[34]	2009
SeqMap	Up to 5 mixed substitutions and insertions/deletions. Various tuning options and input/output formats.					Free for academic and non-commercial use.	link
Shrec	Short read error correction with a Suffix trie data structure.				Yes (Java)		link
SHRiMP	Indexes the reference genome as of version 2. Uses masks to generate possible keys. Can map ABI SOLiD color space reads.	Yes	Yes	Yes	Yes (OpenMP)	BSD derivative	link	^[35] ^[36]	2009 - 2011
SLIDER	Slider is an application for the Illumina Sequence Analyzer output that uses the "probability" files instead of the sequence files as an input for alignment to a reference sequence or a set of reference sequences.						link
SOAP, SOAP2, SOAP3 and SOAP3-dp	SOAP: Robust with a small (1-3) number of gaps and mismatches. Speed improvement over BLAT; uses a 12-letter hash table. SOAP2: uses a bidirectional BWT to build the index of the reference and is much faster than the first version. SOAP3: GPU-accelerated version that can find all 4-mismatch alignments in tens of seconds per one million reads. SOAP3-dp, also GPU-accelerated, supports an arbitrary number of mismatches and gaps according to affine gap penalty scores.	Yes	No	SOAP3-dp:Yes	Yes (POSIX Threads), SOAP3, SOAP3-dp need GPU with CUDA support.	GPL	link	^[37]^[38]
SOCS	For ABI SOLiD technologies. Significant increase in time to map reads with mismatches (or color errors). Uses an iterative version of the Rabin-Karp string search algorithm.				Yes	GPL	link
SSAHA and SSAHA2	Fast for a small number of variants.					Free for academic and non-commercial use.	link
Stampy	For Illumina reads. High specificity, and sensitive for reads with indels, structural variants, or many SNPs. Slow, but speed increases dramatically when BWA is used for the first alignment pass.	Yes	Yes	Yes	No	Free for academic and non-commercial use	link	^[39]	2010
SToRM	For Illumina or ABI SOLiD reads, with SAM native output. Highly sensitive for reads with many errors, indels (full from 0 to 15, extended support otherwise). Uses spaced seeds (single hit) and a very fast SSE/SSE2/AVX2/AVX-512 banded alignment filter. For fixed-length reads only, authors recommend SHRiMP2 otherwise.	No	Yes	Yes	Yes (OpenMP)	Open source	link	^[40]	2010
Subread and Subjunc	Superfast and accurate read aligners. Subread can be used to map both gDNA-seq and RNA-seq reads. Subjunc detects exon-exon junctions and maps RNA-seq reads. They employ a novel mapping paradigm called "seed-and-vote".	Yes	Yes	Yes	Yes	GPL3	link link
Taipan	De novo assembler for Illumina reads					Free for academic and non-commercial use.	link
UGENE	Visual interface for both Bowtie and BWA, as well as an embedded aligner	Yes	Yes	Yes	Yes	Open source, GPL	link
VelociMapper	FPGA-accelerated reference sequence alignment mapping tool from TimeLogic. Faster than Burrows-Wheeler transform-based algorithms like BWA and Bowtie. Supports up to 7 mismatches and/or indels with no performance penalty. Produces sensitive Smith-Waterman gapped alignments.	Yes	Yes	Yes	Yes	Commercial	TimeLogic
XpressAlign	FPGA based sliding window short read aligner which exploits the embarrassingly parallel property of short read alignment. Performance scales linearly with number of transistors on a chip (i.e. performance guaranteed to double with each iteration of Moore's Law without modification to algorithm). Low power consumption is useful for datacentre equipment. Predictable runtime. Better price/performance than software sliding window aligners on current hardware, but not better than software BWT-based aligners currently. Can cope with large numbers (>2) of mismatches. Will find all hit positions for all seeds. Single-FPGA experimental version, needs work to develop it into a multi-FPGA production version.					Free for academic and non-commercial use.	link
ZOOM	100% sensitivity for reads between 15 and 240 bp with practical mismatches. Very fast. Supports insertions and deletions. Works with Illumina and SOLiD instruments, not 454.				Yes (GUI) No (CLI).	Commercial	link	^[41]
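
The Rabin-Karp string search named in the SOCS entry above can be illustrated with a minimal sketch. This is the plain textbook algorithm for exact matching, not the iterative, mismatch-tolerant variant that SOCS uses; the function name `rabin_karp` and the `base` and `mod` parameters are assumptions chosen for illustration.

```python
def rabin_karp(text, pattern, base=256, mod=1_000_000_007):
    """Return the start index of every exact occurrence of pattern in text.

    Textbook Rabin-Karp with a rolling polynomial hash. `base` and `mod`
    are illustrative choices, not values taken from any listed tool.
    """
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []

    # Weight of the leading character in a window of length m.
    high = pow(base, m - 1, mod)

    # Hash the pattern and the first window of the text.
    p_hash = w_hash = 0
    for i in range(m):
        p_hash = (p_hash * base + ord(pattern[i])) % mod
        w_hash = (w_hash * base + ord(text[i])) % mod

    hits = []
    for start in range(n - m + 1):
        # Equal hashes are only a hint; compare strings to rule out collisions.
        if w_hash == p_hash and text[start:start + m] == pattern:
            hits.append(start)
        if start < n - m:
            # Roll the window: drop text[start], append text[start + m].
            w_hash = ((w_hash - ord(text[start]) * high) * base
                      + ord(text[start + m])) % mod
    return hits


if __name__ == "__main__":
    print(rabin_karp("ACGTACGTAC", "ACGT"))
```

The rolling update lets each new window be hashed in constant time, so the reference is scanned once regardless of read length; the explicit string comparison guards against hash collisions.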

References

↑ Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ (October 1990). "Basic local alignment search tool". Journal of Molecular Biology 215 (3): 403–10. doi:10.1016/S0022-2836(05)80360-2. PMID 2231712.
↑ Angermüller, C.; Biegert, A.; Söding, J. (Dec 2012). "Discriminative modelling of context-specific amino acid substitution probabilities". Bioinformatics 28 (24): 3240–7. doi:10.1093/bioinformatics/bts622. PMID 23080114.
↑ Durbin, Richard; Eddy, Sean R.; Krogh, Anders; Mitchison, Graeme, eds. (1998). Biological sequence analysis: probabilistic models of proteins and nucleic acids. Cambridge, UK: Cambridge University Press. ISBN 978-0-521-62971-3.
↑ Söding J (April 2005). "Protein homology detection by HMM-HMM comparison". Bioinformatics 21 (7): 951–60. doi:10.1093/bioinformatics/bti125. PMID 15531603.
↑ Altschul SF; Madden TL; Schäffer AA et al. (September 1997). "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs". Nucleic Acids Research 25 (17): 3389–402. doi:10.1093/nar/25.17.3389. PMC 146917. PMID 9254694.
↑ Li W; McWilliam H; Goujon M et al. (June 2012). "PSI-Search: iterative HOE-reduced profile SSEARCH searching". Bioinformatics 28 (12): 1650–1651. doi:10.1093/bioinformatics/bts240. PMC 3371869. PMID 22539666.
↑ Oehmen, C.; Nieplocha, J. (August 2006). "ScalaBLAST: A scalable implementation of BLAST for high-performance data-intensive bioinformatics analysis". IEEE Transactions on Parallel & Distributed Systems 17 (8): 740–749. doi:10.1109/TPDS.2006.112.
↑ Hughey, R.; Karplus, K.; Krogh, A. (2003). SAM: sequence alignment and modeling software system. Technical report UCSC-CRL-99-11 (Report). University of California, Santa Cruz, CA.
↑ Schwartz S, Kent WJ, Smit A, Zhang Z, Baertsch R, Hardison RC, Haussler D, Miller W (2003). "Human-mouse alignments with BLASTZ". Genome Research 13 (1): 103–107. doi:10.1101/gr.809403. PMC 430961. PMID 12529312.
↑ Harris R S (2007). Improved pairwise alignment of genomic DNA (Thesis).
↑ Chivian D, Baker D (2006). "Homology modeling using parametric alignment ensemble generation with consensus and energy-based model selection". Nucleic Acids Research 34 (17): e112. doi:10.1093/nar/gkl480. PMC 1635247. PMID 16971460.
↑ Girdea M, Noe L, Kucherov G (January 2010). "Back-translation for discovering distant protein homologies in the presence of frameshift mutations". Algorithms for Molecular Biology 5: 6. doi:10.1186/1748-7188-5-6. PMC 2821327. PMID 20047662.
↑ Ma B, Tromp J, Li M (2002). "PatternHunter: faster and more sensitive homology search". Bioinformatics 18 (3): 440–445. doi:10.1093/bioinformatics/18.3.440. PMID 11934743.
↑ Li M, Ma B, Kisman D, Tromp J (2004). "PatternHunter II: highly sensitive and fast homology search". Journal of Bioinformatics and Computational Biology 2 (3): 417–439. doi:10.1142/S0219720004000661. PMID 15359419.
↑ Gusfield, Dan (1997). Algorithms on strings, trees and sequences. Cambridge University Press. ISBN 0-521-58519-8.
↑ Rasmussen K, Stoye J, Myers EW (2006). "Efficient q-Gram Filters for Finding All epsilon-Matches over a Given Length". Journal of Computational Biology 13 (2): 296–308. doi:10.1089/cmb.2006.13.296. PMID 16597241.
↑ Noe L, Kucherov G (2005). "YASS: enhancing the sensitivity of DNA similarity search". Nucleic Acids Research 33 (Web Server issue): W540–W543. doi:10.1093/nar/gki478. PMC 1160238. PMID 15980530.
↑ http://hgdownload.cse.ucsc.edu/admin/exe/
↑ Homer, Nils; Merriman, Barry; Nelson, Stanley F. (2009). "BFAST: An Alignment Tool for Large Scale Genome Resequencing". PLOS One 4 (11): e7767. doi:10.1371/journal.pone.0007767. PMC 2770639. PMID 19907642.
↑ Kent, W. J. (2002). "BLAT---The BLAST-Like Alignment Tool". Genome Research 12 (4): 656–664. doi:10.1101/gr.229202. ISSN 1088-9051. PMC 187518. PMID 11932250.
↑ Langmead, Ben; Trapnell, Cole; Pop, Mihai; Salzberg, Steven L (2009). "Ultrafast and memory-efficient alignment of short DNA sequences to the human genome". Genome Biology 10 (3): R25. doi:10.1186/gb-2009-10-3-r25. ISSN 1465-6906. PMC 2690996. PMID 19261174.
↑ Santana-Quintero, Luis; Dingerdissen, Hayley; Thierry-Mieg, Jean; Mazumder, Raja; Simonyan, Vahan (2014). "HIVE-Hexagon: High-Performance, Parallelized Sequence Alignment for Next-Generation Sequencing Data Analysis". PLOS ONE 9 (6): e99033. doi:10.1371/journal.pone.0099033. PMID 24918764.
↑ Li, H.; Durbin, R. (2009). "Fast and accurate short read alignment with Burrows-Wheeler transform". Bioinformatics 25 (14): 1754–1760. doi:10.1093/bioinformatics/btp324. ISSN 1367-4803. PMC 2705234. PMID 19451168.
↑ Kerpedjiev, Peter; Frellsen, Jes; Lindgreen, Stinus; Krogh, Anders (2014). "Adaptable probabilistic mapping of short reads using position specific scoring matrices". BMC Bioinformatics 15 (1): 100. doi:10.1186/1471-2105-15-100. ISSN 1471-2105.
↑ Liu, Y.; Schmidt, B.; Maskell, D. L. (2012). "CUSHAW: a CUDA compatible short read aligner to large genomes based on the Burrows-Wheeler transform". Bioinformatics 28 (14): 1830–1837. doi:10.1093/bioinformatics/bts276. ISSN 1367-4803. PMID 22576173.
↑ Liu, Y.; Schmidt, B. (2012). "Long read alignment based on maximal exact match seeds". Bioinformatics 28 (18): i318–i324. doi:10.1093/bioinformatics/bts414. ISSN 1367-4803. PMID 22962447.
↑ Rizk, Guillaume; Lavenier, Dominique (2010). "GASSST: global alignment short sequence search tool". Bioinformatics 26 (20): 2534–2540. doi:10.1093/bioinformatics/btq485. PMC 2951093. PMID 20739310.
↑ Marco-Sola, Santiago; Sammeth, Michael; Guigó, Roderic; Ribeca, Paolo (2012). "The GEM mapper: fast, accurate and versatile alignment by filtration". Nature Methods 9 (12): 1185–1188. doi:10.1038/nmeth.2221. ISSN 1548-7091. PMID 23103880.
↑ Clement, N. L.; Snell, Q.; Clement, M. J.; Hollenhorst, P. C.; Purwar, J.; Graves, B. J.; Cairns, B. R.; Johnson, W. E. (2009). "The GNUMAP algorithm: unbiased probabilistic mapping of oligonucleotides from next-generation sequencing". Bioinformatics 26 (1): 38–45. doi:10.1093/bioinformatics/btp614. ISSN 1367-4803. PMID 19861355.
↑ Kielbasa, S.M.; Wan, R.; Sato, K.; Horton, P.; Frith, M.C. (2011). "Adaptive seeds tame genomic sequence comparison". Genome Research 21 (3): 487–493. doi:10.1101/gr.113985.110. PMC 3044862. PMID 21209072.
↑ Rivals, Eric; Salmela, Leena; Kiiskinen, Petteri; Kalsi, Petri; Tarhio, Jorma (2009). "mpscan: Fast Localisation of Multiple Reads in Genomes". Algorithms in Bioinformatics. Lecture Notes in Computer Science 5724: 246–260. doi:10.1007/978-3-642-04241-6_21. ISBN 978-3-642-04240-9.
↑ Sedlazeck, Fritz J.; Rescheneder, Philipp; von Haeseler, Arndt (2013). "NextGenMap: fast and accurate read mapping in highly polymorphic genomes". Bioinformatics 29 (21): 2790–2791. doi:10.1093/bioinformatics/btt468. PMID 23975764.
↑ Chen, Yangho; Souaiaia, Tade; Chen, Ting (2009). "PerM: efficient mapping of short sequencing reads with periodic full sensitive spaced seeds". Bioinformatics 25 (19): 2514–2521. doi:10.1093/bioinformatics/btp486. PMC 2752623. PMID 19675096.
↑ Hoffmann, Steve; Otto, Christian; Kurtz, Stefan; Sharma, Cynthia M.; Khaitovich, Philipp; Vogel, Jörg; Stadler, Peter F.; Hackermüller, Jörg (2009). "Fast Mapping of Short Sequences with Mismatches, Insertions and Deletions Using Index Structures". PLoS Computational Biology 5 (9): e1000502. doi:10.1371/journal.pcbi.1000502. ISSN 1553-7358. PMC 2730575. PMID 19750212.
↑ Rumble, Stephen M.; Lacroute, Phil; Dalca, Adrian V.; Fiume, Marc; Sidow, Arend; Brudno, Michael (2009). "SHRiMP: Accurate Mapping of Short Color-space Reads". PLOS Computational Biology 5 (5): e1000386. doi:10.1371/journal.pcbi.1000386. PMC 2678294. PMID 19461883.
↑ David, Matei; Dzamba, Misko; Lister, Dan; Ilie, Lucian; Brudno, Michael (2011). "SHRiMP2: Sensitive yet Practical Short Read Mapping". Bioinformatics 27 (7): 1011–1012. doi:10.1093/bioinformatics/btr046. PMID 21278192.
↑ Li, R.; Li, Y.; Kristiansen, K.; Wang, J. (2008). "SOAP: short oligonucleotide alignment program". Bioinformatics 24 (5): 713–714. doi:10.1093/bioinformatics/btn025. ISSN 1367-4803. PMID 18227114.
↑ Li, R.; Yu, C.; Li, Y.; Lam, T.-W.; Yiu, S.-M.; Kristiansen, K.; Wang, J. (2009). "SOAP2: an improved ultrafast tool for short read alignment". Bioinformatics 25 (15): 1966–1967. doi:10.1093/bioinformatics/btp336. ISSN 1367-4803. PMID 19497933.
↑ Lunter, G.; Goodson, M. (2010). "Stampy: A statistical algorithm for sensitive and fast mapping of Illumina sequence reads". Genome Research 21 (6): 936–939. doi:10.1101/gr.111120.110. ISSN 1088-9051. PMID 20980556.
↑ Noe, L.; Girdea, M.; Kucherov, G. (2010). "Designing efficient spaced seeds for SOLiD read mapping". Advances in Bioinformatics 2010: 708501. doi:10.1155/2010/708501. PMC 2945724. PMID 20936175.
↑ Lin, H.; Zhang, Z.; Zhang, M.Q.; Ma, B.; Li, M. (2008). "ZOOM! Zillions of oligos mapped". Bioinformatics 24 (21): 2431–2437. doi:10.1093/bioinformatics/btn416. PMC 2732274. PMID 18684737.

External links

Pollard, Daniel A; Bergman, Casey M; Stoye, Jens; Celniker, Susan E; Eisen, Michael B (2004). "Benchmarking tools for the alignment of functional noncoding DNA". BMC Bioinformatics 5: 6. doi:10.1186/1471-2105-5-6. PMC 344529. PMID 14736341. The authors discuss LAGAN, CHAOS, and Dialign as the most effective tools tested for certain uses.
