Talk:Sequence alignment

From Wikipedia, the free encyclopedia

Sequence alignment is a featured article; it (or a previous version of it) has been identified as one of the best articles produced by the Wikipedia community. Even so, if you can update or improve it, please do.

This article appeared on Wikipedia's Main Page as Today's featured article on August 28, 2006.

Article milestones

Date | Process | Result
July 8, 2006 | Peer review | Reviewed
July 22, 2006 | Featured article candidate | Promoted

Current Status: Featured article

MCB Portal

This article is within the scope of the Molecular and Cellular Biology WikiProject. To participate, visit the WikiProject for more information. The WikiProject's current monthly collaboration is focused on improving Restriction enzyme.

This article has been rated as FA-Class on the assessment scale.

This article is on a subject of Mid-importance within molecular and cellular biology.

Article Grading: The article has been rated for quality and/or importance but has no comments yet. If appropriate, please review the article and then leave comments here to identify the strengths and weaknesses of the article and what work it will need.

This page has been selected for Version 0.5 and subsequent release versions of Wikipedia. It has been rated FA-Class on the assessment scale (comments).

1 Pre-2004 adds
2 Needs massive software listing update
3 Reverted reference conversion
4 old external links section
5 Grammar Suggestion
6 Wording of the lead
7 Assessment of significance
8 "Bioinformatics sequence alignment"

Pre-2004 adds

Sorry. I rather forged ahead and added a lot of content to this page without suggesting it first. I hope you can forgive me - I was just rather eager to add something on a topic that I know about.

I have taken pains not to remove anything, so if you don't want what I've added it should be easy enough to get rid of my new stuff.

MockAE.

Needs massive software listing update

The software listing is horribly out of date. I'm currently working on benchmarking such alignment packages, and the ones listed here are fast but awful in quality. T-Coffee, Di-Align, MUSCLE, and others merit mention. Davidstrauss

Reverted reference conversion

Tooto helpfully refconverted this page, and I temporarily reverted that change. I meant to put a comment in the article asking people not to change the references, but I figured, what are the odds of someone converting this exact page in the next week or two?

I'm actively working on this article and find it much easier to add the references in the old style first and then use refconvert at the end, so that the reference text isn't interspersed with the article text. So I'll re-convert the references after the text is more complete. Opabinia regalis 03:38, 24 June 2006 (UTC)

old external links section

I removed the external links section from the main article pending their merger with sequence alignment software. For the time being I'm storing them here for easy reference. Opabinia regalis 04:40, 4 July 2006 (UTC)

Blast Server at the NCBI

Local alignment tools:
- Smith-Waterman (online): Emboss::WATER (full memory dynamic programming matrix) - SSEARCH - STRETCHER (optimized dynamic programming matrix) - SEQALN
- Suffix tree based (fast): REPuter
- Seed based (online): FASTA - BLAST family - human BLAT
- Spaced seed based (more accurate): PatternHunter - human BLASTZ - YASS
Global alignment tools:
- Needleman-Wunsch (online): Emboss::Needle (full memory dynamic programming matrix) - Emboss::Matcher (optimized dynamic programming matrix)
- Suffix tree based (fast): MUMmer
Multiple alignment tools (online): DIALIGN-T - Clustal - Dialign - MAFFT - Multalin - MAVID - Multi-LAGAN - Muscle - POA - ProbCons - T-Coffee

An excellent article at the NCBI web site on the methodology of the BLAST algorithm and the statistical significance of sequence alignments in general.
JAligner is an open source Java implementation of the dynamic programming algorithm Smith-Waterman for biological pairwise local sequence alignment.

Alignment of Genomes with Rearrangement: Mauve - Mulan - Shuffle-LAGAN (pairwise only)

Visualisation tools for alignments
- VISTA genome browser http://pipeline.lbl.gov
- Mauve visualization system http://gel.ahabs.wisc.edu/mauve
- STRAP - 3D-alignments and sequence alignments http://3d-alignment.eu
Open content directory of sequence alignment resources (BioDirectory)
- Sequence Alignment
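
For readers of the stored list, here is a minimal sketch of the Smith-Waterman dynamic programming that several of the local alignment tools above are based on. It is only an illustration: the function name and the scoring values (match, mismatch, and a linear gap penalty) are assumptions chosen for the example, not the defaults of Emboss::WATER, SSEARCH, JAligner, or any other listed tool, and it returns only the best score rather than the alignment itself.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between strings a and b.

    Scoring values are illustrative assumptions (linear gap penalty),
    not the parameters of any tool named in the list above.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] holds the best score of a local alignment ending at
    # a[i-1] and b[j-1]; the first row and column stay at zero.
    H = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = H[i - 1][j] + gap      # gap in b
            left = H[i][j - 1] + gap    # gap in a
            # Clamping at zero is what makes the alignment local:
            # a poor prefix is dropped instead of carried forward.
            H[i][j] = max(0, diag, up, left)
            best = max(best, H[i][j])
    return best


print(smith_waterman_score("TTACGTTT", "GGACGGG"))
```

This version keeps the full matrix in memory, which corresponds to the "full memory dynamic programming matrix" entries above; since it only needs the score, it could be reduced to two rows.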

Grammar Suggestion

I'd suggest rearranging this sentence to improve readability: "If two sequences in an alignment share a common ancestor, mismatches can be interpreted as point mutations and gaps as indels (that is, insertion or deletion mutations) introduced in one or both lineages in the time since they diverged from one another." GravityIsForSuckers 22:09, 28 August 2006 (UTC)

Do you have a suggested rewording? Perhaps removing the parenthetical explanation of indels? It sounds fine to me, but it should, since I wrote that sentence in the first place :) Opabinia regalis 05:03, 29 August 2006 (UTC)

It would be easier for me to be more specific (or I would have just changed it myself) if I knew this particular subject matter. Perhaps someone else will have an opinion on this. GravityIsForSuckers 05:29, 29 August 2006 (UTC)

How about - The differences in the aligned sequences correspond to mutations that have occurred in one or both lineages since their time of divergence. If two sequences share a common ancestor, mismatches and gaps in the aligned sequences can be interpreted as point mutations or insertion/deletion mutations (indels), respectively. Gribskov 04:04, 20 September 2007 (UTC)
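
To make the distinction in the sentence under discussion concrete, here is a small sketch that classifies the columns of a gapped pairwise alignment as matches, mismatches (candidate point mutations), or gap columns (candidate indels). The function name, the "-" gap character, and the example sequences are assumptions made for illustration. It counts gap columns rather than indel events, so a run of adjacent gaps that arose from a single insertion or deletion is counted once per column.

```python
def summarize_alignment(row_a, row_b, gap_char="-"):
    """Count matches, mismatches, and gap columns in two aligned rows.

    Both rows must already be aligned (equal length, gaps written
    with gap_char). The names here are hypothetical, for illustration.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    counts = {"matches": 0, "mismatches": 0, "gap_columns": 0}
    for x, y in zip(row_a, row_b):
        if x == gap_char or y == gap_char:
            counts["gap_columns"] += 1   # interpretable as part of an indel
        elif x == y:
            counts["matches"] += 1
        else:
            counts["mismatches"] += 1    # interpretable as a point mutation
    return counts


print(summarize_alignment("ACG-TA", "ACTTTA"))
```

Whether a mismatch or gap really reflects a mutation depends on the two sequences sharing a common ancestor, which the counting itself cannot establish.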

Wording of the lead

The lead has gone through a few changes since this hit the main page. Theuser, I take your point about arranging residues rather than sequences, but the deficiency of the "residues" wording is that it implies that the order of the residues in the sequence is altered, which is more ambiguous than the alternative "arranging primary sequences". Also, the removal of the word "may" or its equivalent in the statement about emphasizing similarity is much too strong. Spurious similarity happens and there shouldn't be an implication that the results are more definitive than they are. Opabinia regalis 00:37, 29 August 2006 (UTC)

I'm also confused by the wording "historically similar". Certainly sequence-alignment algorithms don't have any information about history, but just operate on sequences? Biologists may use sequence-alignment results to make inferences about history, but sequence-alignment itself doesn't look for things that are "historically similar"; rather, it finds things that are similar by some algorithmic metric. --Delirium 01:24, 29 August 2006 (UTC)

You're right, I reworded it closer to the original. The algorithms themselves are usually ignorant of history (except some that can use an independently-derived phylogenetic tree as input), but the results are usually interpreted as reflecting evolutionary change. Opabinia regalis 01:33, 29 August 2006 (UTC)

Looks better now; thanks! --Delirium 20:05, 31 August 2006 (UTC)

Assessment of significance

I think this section is unnecessarily vague (even for a non-technical audience). I could add a few details here. Also, the discussion of convergence, IMO, makes it sound much more likely than it really is. Patterson (I think) made a compelling argument in a paper sometime in the 80s (again, I think). I could dig this up, or reconstruct it. Gribskov 04:10, 20 September 2007 (UTC)

"Bioinformatics sequence alignment"

I think "bioinformatics sequence alignment" is a horrible name for this page. It showed up on my watchlist and my immediate reaction was "what's that?". If it absolutely has to move, it should go to "Protein and nucleic acid sequence alignment" or something like that. I think the article should stay at "sequence alignment" though. --Aranae 22:26, 24 October 2007 (UTC)

Fully agree. The move should at least have been discussed, especially since it is a featured article. Have moved it back to the original name. Shyamal 01:06, 25 October 2007 (UTC)

Me three. Thanks, Shyamal. I don't see a compelling reason for this at all; there's no outstanding ambiguity that needs to be resolved by a longer and less intuitive title. Opabinia regalis 02:38, 25 October 2007 (UTC)
