Talk:Multiple sequence alignment

Good article Multiple sequence alignment has been listed as one of the Natural sciences good articles under the good article criteria. If you can improve it further, please do. If it no longer meets these criteria, you can delist it, or ask for a reassessment.
September 12, 2006 Good article nominee Listed
Molecular and Cellular Biology WikiProject This article is within the scope of the Molecular and Cellular Biology WikiProject. To participate, visit the WikiProject for more information. The WikiProject's current monthly collaboration is focused on improving Restriction enzyme.
Good article GA This article has been rated as GA-Class on the assessment scale.
Mid This article is on a subject of mid-importance within molecular and cellular biology.

Good article nomination

My suggestions:

  • in section Dynamic programming and computational complexity + Progressive alignment construction (after finishing the article, I must say in nearly all sections), there are several external links that should be converted into internal references
  • Maybe not so important, but a reference for that statement: "PRRP performs best when refining an alignment previously constructed by a faster method." would be useful
  • "which is also available as a web portal": sentences of this kind should all be converted into references. They look bad in the article.
  • What about an external links section? I know the article is full of external links (see above), but there must be some links that discuss MSA.

Anyway, absolutely great article. I'm not an expert on the topic, but I could easily read it. Congrats! NCurse work 07:03, 10 September 2006 (UTC)

Thanks. I'm glad it's readable by a knowledgeable non-expert; bioinformatics can be very difficult to explain well. I agree that not all the external links are written/segued into very well, but integrating them into the text was a (semi-)conscious decision also followed in the parent sequence alignment and the (siblings, I guess?) structural alignment and computational phylogenetics. I figure that many readers are probably looking for a way to actually perform a sequence alignment, and if they find a method they like, it's much easier to have the external link in the relevant paragraph than at the bottom of the article. Tracking down the current download page/web portal from the original paper can also be a challenge, since these things change hosting/get absorbed by larger projects/follow a postdoc when he sets up his own lab/etc. On rereading I do notice a couple of programs that have an external link but no literature reference, so I'll clean that up.
The PRRP statement belongs to ref #9, which appears in the immediately preceding sentence as well. I added a second note - do you think that looks redundant? I'll look for useful links tomorrow - there are lots of them but the signal-to-noise ratio is not so great. Opabinia regalis 07:57, 10 September 2006 (UTC)
FYI, I added external links and references to the tools that were paperless, and cleaned up some of the web portal links. I did leave the direct external links to the relevant tools, but hopefully a bit more integrated. What do you think? Opabinia regalis 23:37, 10 September 2006 (UTC)

Some things left:

  • "A web portal and download site is available at MUSCLE."
  • "A database-search technique based on HMM methods is available in the program HMMer.[14]"
  • "A server for locating motifs in unaligned sequences is located at BLOCKS.".....etc

I think the external links (not internal references) can stay in the article, but they should be merged into the paragraphs, because now these are lonely statements. Anyway everything seems to be good to me. NCurse work 13:12, 11 September 2006 (UTC)

I think I got them all, thanks. Opabinia regalis 04:06, 12 September 2006 (UTC)

Great work, thanks for the reactions. Congrats! :) It's now a good article. NCurse work 14:47, 12 September 2006 (UTC)

Thanks for the review! Opabinia regalis 01:03, 13 September 2006 (UTC)

some clarifications

the statement

Because HMMs are probabilistic, they do not produce the same solution every time they are run on the same dataset; thus they cannot be guaranteed to converge to an optimal alignment. HMMs can produce both global and local alignments. Although HMM-based methods have been developed relatively recently, they offer significant improvements in computational speed, especially for sequences that contain overlapping regions.

is incorrect. HMMs are probabilistic in the sense that they are a statistical model; however, they are completely deterministic and will produce the same result every time on a given dataset. HMM alignments use the same algorithms as local sequence alignments and therefore have no computational speed advantage.
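To illustrate the determinism point, here is a minimal sketch of Viterbi decoding on a toy two-state model. The state names (`M`, `I`), all probabilities, and the function names are assumptions made up for this example; they are not taken from the article or from any particular HMM package. The sketch only covers decoding with fixed model parameters, and it shows the same kind of dynamic-programming recurrence used in pairwise alignment.

```python
import math

def log_table(probs):
    """Convert a dict of probabilities to log-probabilities."""
    return {k: math.log(v) for k, v in probs.items()}

# Hypothetical toy model: "M" (match-like) prefers A, "I" (insert-like) is uniform.
STATES = ("M", "I")
LOG_START = log_table({"M": 0.8, "I": 0.2})
LOG_TRANS = {
    "M": log_table({"M": 0.9, "I": 0.1}),
    "I": log_table({"M": 0.4, "I": 0.6}),
}
LOG_EMIT = {
    "M": log_table({"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1}),
    "I": log_table({"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}),
}

def viterbi(obs, states, log_start, log_trans, log_emit):
    """Return (log-probability, state path) of the single best path for obs."""
    # best[s] holds the best score and path among paths ending in state s.
    best = {s: (log_start[s] + log_emit[s][obs[0]], [s]) for s in states}
    for symbol in obs[1:]:
        nxt = {}
        for s in states:
            # Pick the best predecessor; ties go to the first state in `states`.
            score, prev = max(
                ((best[p][0] + log_trans[p][s], p) for p in states),
                key=lambda t: t[0],
            )
            nxt[s] = (score + log_emit[s][symbol], best[prev][1] + [s])
        best = nxt
    return max(best.values(), key=lambda t: t[0])

first = viterbi("ACGTAAAC", STATES, LOG_START, LOG_TRANS, LOG_EMIT)
second = viterbi("ACGTAAAC", STATES, LOG_START, LOG_TRANS, LOG_EMIT)
print(first == second)  # no random numbers are drawn anywhere in the function
```

Nothing in the recurrence is sampled, so repeated runs on the same sequence and the same parameters return the same path.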

One of the most common motif-finding tools, known as MEME, uses expectation maximization and hidden Markov methods to generate motifs that are then used as search tools by its companion MAST in the combined suite MEME/MAST.[19][20]

MEME uses a PSSM (position-specific scoring matrix), but does not contain insertion or deletion probabilities or other characteristics of a typical sequence HMM.
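As a rough illustration of the difference, the sketch below scores a sequence against a PSSM by sliding an ungapped window along it. The matrix values, the sequence, and the function name `best_pssm_hit` are invented for the example and are not MEME's actual model or output; the point is only that each column is scored independently and there is no state for insertions or deletions.

```python
# Hypothetical width-3 PSSM of log-odds scores that favours the motif "TAT".
PSSM = [
    {"A": -1.0, "C": -1.0, "G": -1.0, "T": 2.0},
    {"A": 2.0, "C": -1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": -1.0, "G": -1.0, "T": 2.0},
]

def best_pssm_hit(seq, pssm):
    """Return (score, start) of the best ungapped window of seq under pssm."""
    width = len(pssm)
    hits = []
    for start in range(len(seq) - width + 1):
        # Sum one score per motif column; no gaps can be opened inside the window.
        score = sum(pssm[i][seq[start + i]] for i in range(width))
        hits.append((score, start))
    return max(hits, key=lambda hit: hit[0])

print(best_pssm_hit("GGTATGG", PSSM))  # (6.0, 2): the window "TAT" at offset 2
```

A profile HMM would add insert and delete states with their own transition probabilities between these columns, which is the part the PSSM lacks.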

Gribskov 03:55, 20 September 2007 (UTC)