Talk:Computational phylogenetics
GA Review
- It is well written.
- a (prose): b (structure): c (MoS): d (jargon):
- It is factually accurate and verifiable.
- a (references): b (inline citations): c (reliable): d (OR):
- It is broad in its coverage.
- It follows the neutral point of view policy.
- It is stable.
- It contains images, where possible, to illustrate the topic.
- a (tagged and captioned): b (lack of images does not in itself exclude GA): c (non-free images have fair use rationales):
- Overall:
- The prose: I was starting to fall asleep, and many times I lost track of what I was reading about. It lacks flow.
- Jargon: plenty of undefined terms lying around, for example "all-to-all matrix".
- Broad in its coverage: I cannot decide on that point, as the text is too heavy to read, so I lost track of what it was all about.
- Images: there must be some images in this article showing what it's about, not just text. For example, I haven't yet understood whether it's about computer science or biomolecular science.
Result: fail →AzaToth 23:13, 14 October 2006 (UTC)
More GA Review
Sorry, I don't really get the GA formatting so I'm adding another section. For that matter I had initially intended to review this earlier, but couldn't figure out where to do so. Apparently it's here on the article talk page instead of on the GA page (as in FAs). Here is a list of comments I have concerning this article.
- The article is massively biased toward genetic data. I'm not sure where this comes from, but the article strongly suggests that computational phylogenetics is exclusively for nucleotides/genetic data. This bias is rampant throughout the article. Neighbour-joining, Fitch-Margoliash, parsimony heuristic algorithms, and almost everything else discussed here (short of ML based and tree-based alignment methods) were developed for use with morphological data and later applied to molecular data.
- I'm having trouble identifying the intended scope of the article, so I'm having issues determining what's needed. Is it about data treatment and analysis? Is it about any computerized algorithm you'd use to help you build a tree? Here is a list of things that may apply depending on the intended scope:
- Positional homology - This is treated very nicely elsewhere, and should be given a prominent set of links and a quick summary here. It's really the homology of characters that's important. The alignment is important because of homology. The present second sentence and second paragraph are awkward. They are constructed as if this subject is an offshoot of the field of sequence alignment. All the alignment stuff in the intro is very strangely placed. Discussion of homology for morphological characters should also be given a quick overview. At this level discussion of genetic and morphological characters would involve similar statements.
- Coding of characters - Again, morphological characters have been strangely omitted, but how to code characters is very important. Are they coded as binary characters? Are they ordered or unordered?
- Model selection - ML, Bayesian, and (though it's not discussed as much in the literature) NJ algorithms are really defined by the model of evolution imposed on the data. There are many approaches to determining which model should be used.
- Indels - How are gaps treated? As missing data, as an additional character state, or as binary characters? Most of this literature concerns molecular data, but it's important in dealing with morphological data as well (e.g. it's hard to count toes if you don't have a foot).
- Outgroup selection - Covered a bit, but warrants more, as outgroups are the only way to root most trees.
- Ancestral state reconstruction - It's brought up and could be applicable here, but may be better treated in depth elsewhere.
- Combining data - Different genes and/or molecular + morphology.
- Nodal support
- Setup - Right now the article headings are technique based, which is more reasonable in an article (such as Phylogenetics) that discusses the theory behind the approaches; this article should be more nuts and bolts. Perhaps an outline very roughly along the lines of:
- Homology
- Morphology
- Alignment
- Missing data and indels
- Character coding
- Morphology
- Nucleotide
- Amino Acid
- Approaches
- Phenetic
- UPGMA
- Neighbor-joining
- Fitch-Margoliash
- Parsimony
- Overview
- MALIGN and POY
- Maximum likelihood
- Model selection
- Character weighting
- Hierarchical-based model selection
- Bayesian approach to model selection
- Searching tree-space
- Exhaustive
- Branch and bound
- Heuristic approaches
- Identifying and reaching multiple islands
- Rooting
- Outgroup rooting
- Rooting without an outgroup
- Midpoint rooting
- Molecular clock
- Nodal support
- Bootstrap
- Jackknife
- Taxon jackknifing
- Bayesian posterior probability
- Phenetic
- Homology
- Other comments:
- The term "phylogenetic tree construction" should probably be replaced with "phylogenetic tree reconstruction". The tree-builder isn't making the tree for the first time, but is trying to replay history (crudely).
- The wording suggests that the gene sequence is the OTU, which may be true in some instances (such as in detecting gene duplication events or teasing apart gene trees from species trees). I think most questions are looking at the relatedness among taxa (i.e. species).
- Most algorithms build an unrooted tree and use a user-selected outgroup to impose a root after the analysis is finished. I find the wording in "Types of phylogenetic trees" to be a bit awkward.
- UPGMA has enough historical significance and is such a simple technique to understand that I think it warrants as much of a section as NJ.
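To illustrate how simple UPGMA is, here is a minimal sketch in Python. It is not taken from the article or from any particular package; the function name `upgma`, the nested-tuple tree format, and the `frozenset`-keyed distance table are all assumptions made for illustration. It repeatedly joins the two closest clusters and sets the distance to the new cluster as the size-weighted mean of the two old distances.

```python
from itertools import combinations


def upgma(names, dist):
    """Cluster taxa by UPGMA (illustrative sketch, not a reference implementation).

    names: list of taxon labels.
    dist:  dict mapping frozenset({a, b}) -> distance for every pair of labels.
    Returns (tree, root_height), where tree is a nested tuple of labels.
    """
    # Each active cluster maps to (number of leaves, node height).
    clusters = {name: (1, 0.0) for name in names}
    d = dict(dist)  # copy so the caller's table is not modified
    while len(clusters) > 1:
        # Pick the closest pair of active clusters.
        a, b = min(combinations(clusters, 2), key=lambda p: d[frozenset(p)])
        (na, _), (nb, _) = clusters.pop(a), clusters.pop(b)
        merged = (a, b)
        # The new node sits halfway along the distance between the two clusters.
        height = d[frozenset((a, b))] / 2
        # Distance to the new cluster: size-weighted mean of the old distances.
        for c in clusters:
            d[frozenset((merged, c))] = (
                na * d[frozenset((a, c))] + nb * d[frozenset((b, c))]
            ) / (na + nb)
        clusters[merged] = (na + nb, height)
    tree, (_, height) = next(iter(clusters.items()))
    return tree, height
```

Note that the result is rooted and ultrametric by construction, which is exactly the implicit molecular-clock assumption that makes UPGMA worth contrasting with NJ in the article.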
For now I'd have to agree with the Fail decision listed above. My only deal-breaking reason is the lack of coverage for morphological characters. As it's currently written, a reader might very easily think that computational phylogenetics is restricted to nucleotide/amino acid data. I think misleading a reader to this degree is fundamentally grounds for disqualification as a GA. I'm also uncomfortable with the treatment of this topic as if it is a subdiscipline of aligning sequences. Props still go to Opabinia regalis and other editors who have clearly put a lot of well-researched work into this. --Aranae 05:54, 16 October 2006 (UTC)
- Excellent, thanks for the very detailed comments! Exactly what I was hoping for :) This is what I meant about single authorship being a problem; the lack of discussion of morphology comes from the fact that I have zero experience working with morphological data, and know very little about the methodology (especially character coding, which has always seemed somewhat arbitrary to me). I will look at this in more detail tomorrow, but a couple of quick comments - I originally wrote/expanded this as a subarticle of sequence alignment, which is why it seems to lean in that direction. The usage of "phylogenetic tree construction" was intended to roughly follow the distinction in the articles phylogenetic tree vs evolutionary tree, where the merge "discussion" generally agreed that the latter is the "true" history and the former is a construction. Thanks again for such a thorough review! Opabinia regalis 06:34, 16 October 2006 (UTC)