Wikipedia:WikiProject Molecular and Cellular Biology/Proposals

Discuss proposals concerning the Molecular and Cellular Biology Wikiproject here.

1 Abbreviations
- 1.1 Gene and Protein Names
  - 1.1.1 Genes
  - 1.1.2 Proteins
- 1.2 Small molecules
2 Proteins
3 Standards of quality for biochemistry articles
4 Standardizing protein representation pictures
- 4.1 Pymol
  - 4.1.1 Pros
  - 4.1.2 Cons
- 4.2 Molmol
  - 4.2.1 Pros
  - 4.2.2 Cons
- 4.3 VMD
  - 4.3.1 Pros
  - 4.3.2 Cons
- 4.4 Chimera
  - 4.4.1 Pros
  - 4.4.2 Cons
- 4.5 MolScript
  - 4.5.1 Pros
  - 4.5.2 Cons
- 4.6 SPDBV
  - 4.6.1 Pros
  - 4.6.2 Cons
- 4.7 Cn3D
  - 4.7.1 Pros
  - 4.7.2 Cons
- 4.8 QuteMol
  - 4.8.1 Pros
  - 4.8.2 Cons
- 4.9 Blender
  - 4.9.1 Pros
  - 4.9.2 Cons
5 Collaboration history
6 MCB Barnstars
- 6.1 Okay, now what do we call this thing
- 6.2 Shall we have a vote?
7 Stubs, stubs everywhere!
8 Our scope: should it include unicellular critters?
9 External links
10 MCB image
11 Protein Representation Picture Requests
12 Protein Infobox Addition Proposal
13 Wikipedia, Pfam/Interpro and SMART

Abbreviations

Gene and Protein Names

"it is recommended that we use the ABR instead of the full name form" <-- Use of abbreviations for gene and protein names should follow the rules for jargon. --JWSchmidt 16:04, 29 December 2005 (UTC)

We don't really have a choice. We talked about the amino acids and other small molecules, so I think we are settled there, but the proteins are different: most of them have very long names. For short names the same strategy works, since whether we write insulin or INS makes no difference. But compare "Tyrosine 3-monooxygenase/tryptophan 5-monooxygenase activation protein, beta polypeptide" with "YWHAB" (it's a 14-3-3 protein): by the time readers get to the middle of the full name they will have forgotten the beginning. In a long article another problem pops up: readers will not always remember where they first came across the ABR. It was probably somewhere near the beginning, but where? Now they have to scan the text to find the link, and a link that was there yesterday may have been removed by a later edit. It might be better to list the ABRs at the bottom of the article, or at the top just under the intro, so that readers who get lost can find them easily. Or they could simply search "Wiki". -- Boris 18:14, 29 December 2005 (UTC)

I think abbreviations (which are quite unsightly) should only be used where the protein name is long and arguably more unsightly. Alternative names and abbreviations should all be given in the first sentence, immediately after the full name (isn't this a Wikipedia-wide policy?). --Username132 (talk) 08:27, 1 September 2006 (UTC)

Genes

The abbreviations (ABR) of the genes follow the HUGO Gene Nomenclature Committee and are written in ITALIC font style (the full names are also written in italic); it is recommended that we use the ABR instead of the full name form. Human gene names are in all capitals (correct me if I haven't put that well), for example ALDOA, INS, etc. Homologs of human genes in other species start with a capital letter and the rest of the letters are lower case: mouse Aldoa, bovine Ins, etc. When we write about genes, these forms are correct: "the ALDOA gene is regulated...", "the rat gene for Aldoa is regulated..." or "ALDOA is regulated...". This form should not be allowed: "the gene ALDOA is regulated", because it's redundant. And if anyone starts arguing with you about it, say "Sorry, but we have a standard to follow" and redirect them to this section. -- Boris 15:32, 29 December 2005 (UTC)
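As a rough illustration of the casing rule proposed in this comment, here is a minimal Python sketch. It assumes only the convention described here (human symbols in all capitals, every other species with an initial capital, genes in italics and proteins in normal style), which does not hold for every organism. The function names are hypothetical, and the doubled apostrophes are ordinary wiki markup for italics.

```python
def _case_symbol(symbol: str, species: str) -> str:
    """Apply the proposed casing: all caps for human, initial capital otherwise.

    Assumption: this is only the convention proposed in this thread, not a
    universal rule; many model organisms follow different conventions.
    """
    cleaned = symbol.strip()
    if not cleaned:
        raise ValueError("empty gene or protein symbol")
    if species.strip().lower() == "human":
        return cleaned.upper()
    return cleaned[0].upper() + cleaned[1:].lower()


def format_gene(symbol: str, species: str = "human") -> str:
    """Return the gene symbol wrapped in wiki italic markup ('' ... '')."""
    return "''" + _case_symbol(symbol, species) + "''"


def format_protein(symbol: str, species: str = "human") -> str:
    """Return the protein symbol: same casing as the gene, but not italic."""
    return _case_symbol(symbol, species)
```

For example, format_gene("aldoa") gives the human form in italics, while format_gene("ALDOA", "mouse") gives the mouse form with only the first letter capitalised.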

I have moved this to the talk page since it seems to be the correct place to formulate the plan. Unfortunately, different organisms use different nomenclature, so it is hard to apply a one-rule-fits-all approach. This is difficult for Wikipedia since it has to deal with all organisms, whereas scientists are familiar with the nomenclature for only a few. I agree that a standard ITALICISED ALL CAPS format is the norm for wildtype genes; that is a good start. The ALL CAPS format for proteins is not standard. C. elegans also uses a first letter capitalisation format for proteins. ALDOA and Aldoa could both be correct depending on the organism. We should also note that recessive genes should be italicised lower case. Dominant alleles are more problematic since different organisms have different rules. One possible solution is to set up a table with all the different nomenclatures commonly used in all the genetic model systems. This would be a very useful resource. At the end of the day it is important that Wikipedia follows standard nomenclature and does not make up its own unified nomenclature. Of course, that is easier said than done. David D. (Talk) 16:50, 29 December 2005 (UTC)

Below is an example of what I think we should attempt to do.

Organism	Protein	Gene	Recessive allele	Dominant allele
mouse	ALDOA	ALDOA	aldoa
human
yeast
C. elegans	Aldoa	ALDOA	aldoa
Drosophila
zebra fish
arabidopsis	SHOOT MERISTEMLESS or STM	SHOOT MERISTEMLESS or STM	shoot meristemless or stm	shoot meristemless or stm
maize	KNOTTED1 or KN1	KNOTTED1 or KN1	knotted1 or kn1	Knotted1 or Kn1

Here are sources for nomenclature in different model organisms.

arabidopsis (Arabidopsis thaliana) [1]

maize (Zea mays) [2]

Caenorhabditis elegans [3]

zebra fish (Danio rerio) [4]

Drosophila melanogaster [5]

Homo sapiens [6]

mouse (Mus musculus) [7]

yeast (Saccharomyces cerevisiae) [8]

Dave, "all caps" is used mostly for the human forms, protein and gene, as if there is some unofficial agreement or something; well, we could apply it strictly here. Although, if you look here you will see that at least for the enzymes they have a naming standard, "all caps name"_"five letter species code": ALDOA would then be ALDOA_HUMAN, the mouse Aldoa ALDOA_MOUSE, and the bovine ALDOA_BOVIN. We could use this standard, but it seems to me that it has more of a "database search" purpose than a publication/writing one. I could be wrong, though.
As for the alleles, the "wild type" form has a superscript, ALDO^WT and ALDO^WT, respectively, and should be used only when there is a lot of talk about the different alleles, mutants and all. All the rest would be designated according to the changes: ALDOA^E206K means ALDOA with glutamate (E, one letter code) at position 206 replaced by lysine (K) in the mutant (or experimentally designed) form, and ALDOA^N1numberN2, where "N1" is the wild type nucleotide, "number" is its position and "N2" is the change. I have seen that a lot. I have also seen articles using "ALDOA(GLU206->LYS)", or simply "E206K" (if the article is about ALDOA only); I guess it all depends on the content - example. Although "E206K" is my favorite, for the purpose of "Wiki" I like the superscript form best. For deletions some authors use a different approach, ALDOA^Δ50-120. I don't think that many of these additional features have a common standard yet; we could set one up here, right? - Boris 18:14, 29 December 2005 (UTC)
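To make the superscript notation in this comment concrete, the sketch below parses the three forms mentioned (wild type, single substitution, deletion range) from plain text where "^" stands in for the superscript. It is only an illustration under stated assumptions: the Variant type and parse_variant function are hypothetical names, and because the amino acid form (E206K) and the nucleotide form (N1numberN2) look identical as text, the parser deliberately does not try to tell them apart.

```python
import re
from typing import NamedTuple, Optional


class Variant(NamedTuple):
    """Parsed pieces of a label such as ALDOA^E206K (hypothetical type)."""
    symbol: str
    kind: str                          # "wild_type", "substitution" or "deletion"
    original: Optional[str] = None     # wild-type residue or nucleotide letter
    position: Optional[int] = None     # changed position, or start of a deletion
    replacement: Optional[str] = None  # residue or nucleotide after the change
    end: Optional[int] = None          # last deleted position (deletions only)


# One letter, a position, one letter: covers E206K and the N1numberN2 form alike.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")
# Greek capital delta followed by a range, as in Δ50-120.
_DELETION = re.compile(r"^Δ(\d+)-(\d+)$")


def parse_variant(text: str) -> Variant:
    """Split 'SYMBOL^LABEL' and classify the label; raise ValueError if unknown."""
    symbol, sep, label = text.strip().partition("^")
    if not sep or not symbol or not label:
        raise ValueError(f"expected SYMBOL^LABEL, got {text!r}")
    if label == "WT":
        return Variant(symbol, "wild_type")
    match = _SUBSTITUTION.match(label)
    if match:
        return Variant(symbol, "substitution", match.group(1),
                       int(match.group(2)), match.group(3))
    match = _DELETION.match(label)
    if match:
        return Variant(symbol, "deletion",
                       position=int(match.group(1)), end=int(match.group(2)))
    raise ValueError(f"unrecognised variant label: {label!r}")
```

Calling parse_variant("ALDOA^E206K") yields a substitution of E by K at position 206. Other styles mentioned above, such as "ALDOA(GLU206->LYS)", are not handled and would need their own pattern.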

I hear you. I work with plants and we do it differently. This is my main point. Nomenclature in science is NOT standardised across the board. Since this is the case, we have to be careful when editing Wikipedia articles, since it would make no sense to use human nomenclature for maize or arabidopsis. David D. (Talk) 18:19, 29 December 2005 (UTC)

So far I've been careful (I think). I try not to mess with the plant material; I had a "head-on collision" with it when I was in college and my neck still "hurts" - it's a totally different world. I'm all into the animal stuff, more specifically the human. -- Boris 19:05, 29 December 2005 (UTC)

You are the norm, I too have a very foggy idea of nomenclature outside plants. It is this very reason that a table that documents the differences and similarities in nomenclature in the different model organisms would be a fantastic resource for this project. I will try and fill out the table above, adding more categories, for the plant nomenclature. Between us we can then add the nomenclature for the other models. David D. (Talk) 19:11, 29 December 2005 (UTC)

The table should definitely be the other way around, with species forming row upon row, instead of column upon column (makes for better viewing). --Username132 (talk) 19:52, 30 August 2006 (UTC)

Proteins

The same as with the genes, but in NORMAL font style. -- Boris 15:32, 29 December 2005 (UTC)

Small molecules

Hi everyone. We have to decide whether to use the abbreviations (ABRs) for the objects (such as Glc for glucose) throughout the project's articles, to use them in the object's own article only, or not to use them at all. I prefer to use them all the way, with one exception: the introduction of the article, so that those who are not familiar with them can still understand. After that the text becomes high level anyway, so we don't need the full names. There are many reasons for that, but the main one is that I want our articles to look very well done, almost pro, so we can attract more and more scientists to join us in the project, because with their knowledge they can help us tremendously (I know some of you are scientists, but we need more; hundreds, or thousands is even better). I want these people to take us seriously and not just as some sort of online-encyclopedia joke. I want them to know that any time they need reliable and well written info they can come here and find it easily. Boris 17:00, 18 December 2005 (UTC)

I agree with the goal of attracting experts. I agree that scientists can have an adverse reaction when they see Wikipedia articles that do not look professional. However, in many cases, abbreviations are part of scientific jargon and they represent a serious barrier to the function of Wikipedia: communicating with a broad audience of non-experts (see Wikipedia:Explain jargon). I think the goal of good communication with Wikipedia readers has to be a high priority. I think that the choice between using "glucose" and "Glc" needs to be made on a case-by-case basis based on what provides the best communication. In the article Glucose, the replacement of the name of the subject of the article (glucose) with an abbreviation (Glc) does not improve communication. There is no reason not to use the word "glucose" to refer to glucose in the main text of an encyclopedia article about glucose. One reason I can think of to use "Glc" is in diagrammatic representations of oligosaccharide structures where it is conventional to use abbreviations for the names of sugars. However, in the main text of an article, "glucose" is better than "Glc". Here is an example of an article that uses a reasonable mixture of full names and abbreviations. --JWSchmidt 19:05, 18 December 2005 (UTC)

I agree with JWS that this needs to be addressed on a case by case basis. I think that the use of Glc is inferior to glucose. On the other hand F6P for fructose 6-phosphate is probably suitable. I have also noticed shorthand for pyruvate and I think that is unnecessary too. With regard to experts, I don't think there is a particular convention to use ABRs for all words in research papers. Glucose is normally written as glucose. The only thing that will turn off experts is when the articles are just plain wrong. It was encouraging to see the Nature review of the 40 science articles in Wikipedia. The number of errors they detected was much lower than I expected. David D. (Talk) 19:24, 18 December 2005 (UTC)

We have to set some sort of criteria (rules) about when it is appropriate to use the ABR form; apparently in some cases it would actually make the text harder to read. We know that the genes and the proteins have their own ABR nomenclature and we can use them here, but what about the small molecules? Any sources? PubChem? Boris 15:52, 19 December 2005 (UTC)

Only use abbreviations if the abbreviations are less cumbersome than the full form. Glc for glucose is unsightly, and it's not obvious what it means. Where the full form would disrupt the flow of the article, however, use an abbreviated form, listed in bold, in brackets, in the first sentence of the article.

Suggested policy: Only use abbreviations if the abbreviations are less cumbersome than the full-form. Where the full-form would disrupt the flow of the article, use an abbreviated form, listed in bold, in brackets, in the first sentence of the article. --Username132 (talk) 09:21, 1 September 2006 (UTC)

Proteins

Standardising Article Structure

Archived Boris' ideas (long read)

I think that standardising the structure of protein articles is a capital idea! What we have from Boris' suggestion is:

Introduction
Structure
Binding partners
Function
Regulation
Gene
Mutations
Clinical implications

From all this, I have a couple of things to say: I think there is an argument for combining the structure and function sections, which often go hand-in-hand, and the gene and mutation sections should be summarised, with the main gene article kept separate. Please add further section suggestions or rename as necessary (the list below). --Username132 (talk) 20:20, 30 August 2006 (UTC)

Introduction
Structure & function
1. Constituent domains (i.e. domain I, domain II...)
Binding partners
Regulation
Gene
Mutations
Clinical implications
Commercial exploitations

Copied from User talk:Opabinia regalis:

...my experience at Photon has set me brooding. Do you think that we as a community could come up with a standard set of questions that should be considered to make a biochemistry article "complete", e.g.,

biological/cellular function
activity (what does it do?) and passivity (what is done to it?)
synthesis/expression, modification and degradation
what does it bind to?
cellular location/transport process
biochemical pathways/biological processes involved
Close relatives, homologs

along with subquestions for each of these, such as

- regulation
- energy source/spontaneous
- what inhibits/activates?
- duration/timing
- history of discovery

It might be useful to have some kind of checklist for editors to help gauge how complete an article is. What do you think? It's more of a thought for a thought. Willow 15:15, 31 August 2006 (UTC)

This sounds like a pretty good outline for articles on proteins or complexes, with the possible exception of "history" - which is useful to know if there's something important/interesting/relevant about the protein's discovery, but sounds like it might make for a lot of stubby one-sentence sections that say "SomeGuy proposed the existence of an enzyme with SomeActivity in 19xx, but the protein was not isolated until 19xx+yy when SomePoorGradStudent purified and characterized it from SomeNastyAnimalTissue." I'm also not sure what "duration/timing" means - the lifetime of the protein? speed of the reaction?

Protein articles could really use a guideline that's more complete than just the infobox outline. I'm not sure what's the best approach to articles on other things (organelles, cellular processes, methods) because what makes this one (or its descendants) useful is its relative specificity, and the other types of articles are more heterogeneous.

On the subject of article assessment/review, would it be useful to have a system of notification for biochem-related peer reviews? I'm not sure a separate peer review system as some projects have is really warranted, but so many peer review requests seem to go by with nothing but automated suggestions and maybe somebody complaining about the lead. Opabinia regalis 03:59, 1 September 2006 (UTC)

Templates

Protein Categorisation

As a first step for this project, we can create a standardized system for presenting information about proteins and genes within Wikipedia. A logical place to start is with a system for categorization of proteins; start here: Whole proteome analysis. (See also: EC number, protein nomenclature talk, talk about protein lists, Talk:List of proteins and discussion about a specific class of non-enzyme proteins.)

For organismal biology, evolutionary relatedness of organisms provides a way to categorize all organisms. For proteins, functional categories are probably best and that is what is emphasized at Whole proteome analysis.

Standards of quality for biochemistry articles

moved to #Standardising_article_structure.

Standardizing protein representation pictures

This topic is now being voted upon!

I think it would be nice to standardize the rendered images of proteins in some way to give the same look and feel to these kinds of articles. Of course this might not be possible in all cases as there are far too many protein articles. Also there are some that have good illustrations already, which should not be removed just for the sake of standardization. However, for all those pictureless protein articles where an image might be useful, it would be nice to use the same representation. What do you think?

I suggest using the open source program PyMol to render the images. It is widely used, gives nice results and has cartoon representations of both proteins and nucleic acids. I made images for some articles I am interested in: actin, Arp2/3 complex, MreB, Profilin, Zinc finger (protein + DNA).

If you think this would be useful, I'd volunteer to render some images on request. Just leave a message on my talk page.

Any thoughts?--Splette ^Talk 12:16, 1 September 2006 (UTC)

It seems that others are also interested in what's used to make protein representation pictures. There's a discussion over on the help page about what tools everyone uses. I think it's a great idea that we standardize the rendered images. We should discuss the pros and cons of the various imaging tools and vote on them, and then produce a tutorial showing new users how to use the chosen program, plus a series of guidelines so that the images all share similar standards. GAThrawn22 22:53, 2 September 2006 (UTC)

I left some tool suggestions on the other talk page, but I wanted to say here that I'm not sure specific guidelines on protein images are really necessary or useful. Proteins are different enough that each image might want to illustrate different features of the structure. Maybe what we want is a suggestion for what the most "basic" protein image should look like, with the understanding that more illustrative images may replace the basic one for proteins with interesting features - in which case I'd say as a first approximation: cartoon or ribbon representation colored by index or secondary structure. I'd be happy to work on a tutorial/help page if the "recommended" program or programs is one that I'm familiar with. Opabinia regalis 00:13, 3 September 2006 (UTC)

I think Opabinia might be right in that when it comes to highlighting specific features, another program might be more suitable (I've never operated any of these programs). The recommended program (for which a tutorial page would be ideal) should be relatively quick and easy to use and produce good output. So... what are the pros and cons (fill in the thing at the bottom of this discussion with any points that spring to mind)? Username132 (talk) 11:10, 4 September 2006 (UTC)

Nice to get so much feedback. Seems like there is quite some interest in the topic here. I think the list below so far summarizes the pros and cons of the different programs quite well. However I guess we are not in a rush to make a decision, so should we wait a bit more before starting to vote? --Splette ^Talk 09:02, 7 September 2006 (UTC)

Yeah, I think so. There aren't enough people commenting and some of the programs' pros and cons are still blank (thanks for the work you put into that, Opabinia). The next newsletter should inform the readers of the change of discussion layout and urge people to get involved, particularly in the discussion of proposals. --Username132 (talk) 18:12, 7 September 2006 (UTC)

Hi all. I hadn't noticed this page before! Got to explore more often. I use SwissPDB viewer for simple images like HIV protease and DHFR. I tend to prefer bright cartoon colors since these images are viewed at 300 px and any detail is lost. TimVickers 15:36, 30 September 2006 (UTC)

I put some of the minor points in small print because we are discussing here which program is best for images on Wikipedia. Features like sequence alignment, calculation of electrostatics or movie export are therefore of lower interest. --Splette ^Talk 17:14, 30 September 2006 (UTC)

Something useful but easy to agree on would be a standard for secondary-structure representation. What about Helix - red, sheet - yellow and coil - blue? TimVickers 20:57, 1 October 2006 (UTC)

If I can add my small contribution, I would also suggest QuteMol as another possible tool for rendering molecules. It is fairly limited in terms of visualization modes (currently it only performs spacefill, ball-and-stick and licorice) but it has some unique rendering features that make it useful for creating high quality images, for example ambient occlusion and depth-aware silhouettes. ALoopingIcon 23:40, 3 October 2006 (UTC)

I should mention that Ravedave's nomination of my DNA clamp image as a featured picture has inspired me to stop being lazy and start doing proper raytraces of my molecules. I suppose this is obliquely an argument in favor of PyMol since its raytracer is integrated and efficient. (Though I'm not sure how it compares to POV-Ray on surface representations, there is at least the advantage of not generating a gargantuan POV file as an intermediate.)

Tim, I'd be more specific and say standard red, yellow, and blue (#FF0000, #FFFF00, #0000FF) or otherwise specify the colors. (I don't know what I think about turns being such a bright, saturated color; I usually leave them cyan so they don't obscure the other features. But I bet everyone's favorite SS color scheme is the default for whatever program they use most often :) Opabinia regalis 05:01, 4 October 2006 (UTC)
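One way to pin the suggested colors down exactly is to generate a small PyMOL command script from the hex values quoted above. The Python sketch below only builds the script text; it does not import or run PyMOL. The white background, the 1200 by 900 ray size, the "mcb_" color names and the function names are illustrative assumptions, not anything agreed in this discussion. The PyMOL commands it emits (fetch, hide, show, bg_color, set_color, color, ray, png) and the "ss h", "ss s" and "ss l+''" selections are standard PyMOL syntax.

```python
# Hex colors as quoted in the comment above; selections are PyMOL's
# secondary-structure selectors for helix, sheet and loop/unassigned.
SS_COLORS = {
    "helix": ("ss h", "#FF0000"),
    "sheet": ("ss s", "#FFFF00"),
    "coil": ("ss l+''", "#0000FF"),
}


def hex_to_rgb_fractions(hex_color: str) -> tuple:
    """Convert '#RRGGBB' to the 0.0-1.0 floats that PyMOL's set_color expects."""
    value = hex_color.lstrip("#")
    if len(value) != 6:
        raise ValueError(f"expected #RRGGBB, got {hex_color!r}")
    return tuple(int(value[i:i + 2], 16) / 255 for i in (0, 2, 4))


def build_pymol_script(pdb_id: str, png_name: str) -> str:
    """Return the text of a .pml script for a basic cartoon image.

    Assumptions (not from the discussion): white background, 1200x900 ray size.
    """
    lines = [
        f"fetch {pdb_id}",
        "hide everything",
        "show cartoon",
        "bg_color white",
    ]
    for name, (selection, hex_color) in SS_COLORS.items():
        red, green, blue = hex_to_rgb_fractions(hex_color)
        color_name = f"mcb_{name}"  # hypothetical custom color name
        lines.append(f"set_color {color_name}, [{red:.3f}, {green:.3f}, {blue:.3f}]")
        lines.append(f"color {color_name}, {selection}")
    lines.append("ray 1200, 900")
    lines.append(f"png {png_name}")
    return "\n".join(lines) + "\n"
```

Writing the returned text to a file and running it from PyMOL would render one structure; swapping the coil entry for a paler color would be a one-line change if bright blue loops turn out to obscure other features.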

Funny that you added QuteMol just now. I came across QuteMol only a few days ago and think the image quality is fantastic. The lighting and shadows are much better in comparison to other programs and all this in realtime. However the options are very limited, too. From what I saw it doesn't even have a cartoon-mode. Or am I wrong? I guess that would be a 'must' for our purposes. Splette ^Talk 19:02, 4 October 2006 (UTC)

Thanks for the kudos! Yes you are right, QuteMol is somewhat limited in terms of possible visualization modes, but it is evolving rapidly :). ALoopingIcon 21:30, 4 October 2006 (UTC)

Sadly Qutemol won't run on my computer, I'll try this at home. TimVickers 19:17, 4 October 2006 (UTC)
Nope, won't run on my home computer either. TimVickers 00:04, 5 October 2006 (UTC)

Wow, Qutemol's shading options are great! The spacefill representations look beautiful, and the ability to export a transparent PNG is very useful. It does feel a little underdeveloped though - some of the usual features that you would expect from a tool like this just aren't there. Cartoon representations being a big one (though maybe 'sparse' representations like that would benefit less from the shading) but also simple things like the ability to load multiple molecules at once or change the default color scheme. Although we can't really use it for general protein images, it would be nice to have these sorts of representations for molecules with deep active-site clefts and for small molecules. Opabinia regalis 01:14, 5 October 2006 (UTC)

I disagree with the original proposal that base representations should be standardized at the software/technical level. Because structure is intimately connected with function, the structural representation (which will be the first thing that people see, and the first thing they focus on) should be designed to immediately call attention to the primary function or defining characteristic of that protein. I'll admit this is not an easy thing to define, but I do feel it should be the general goal. The exact representation is secondary. My area of interest is certainly a special example, but it serves to illustrate the point: a viral protein which is relevant only as an assembled complex (>4 MegaDa) needs a much different representation than cyt-c.

In my opinion, a style guideline would be more useful. Colors, background, labels, etc. I personally take into consideration color blindness when using colors: avoid red/green distinctions for critical areas. Since I'm not that familiar with Wikipedia, are animated images appropriate? Are representations with multiple level-of-detail layers ok? These kinds of style guidelines would let the experts for each protein choose the best way to show them while retaining some standards for presentation.

On a different issue, I've added a software option to the visualization list. Unless you're really serious about this stuff, it's pretty much a novelty, but it does give choices and representations that just aren't available elsewhere. I've also changed some pros/cons regarding Chimera (source code is available, for example). TheTweaker 19:55, 20 October 2006 (UTC)

Thanks for the Chimera comments! The idea in that case is to provide an easy way to get a "minimal" useful image, especially when there's not much structure-specific information about the protein. (I'm not sure how useful it would be to distribute a standard script, since most of the people creating the images probably know how to use the software, but it's an idea.) At present there's quite a bit of variation in secondary structure coloration, background, rendering method, etc., and a consistent "look and feel" would be nice, though probably a pipe dream :) You have a good point about color-blindness; it's not something I consciously think about. I also don't think of this idea as applying to very large protein complexes or to small peptides.

Animations are fine if they add a lot to the article but there's a bit of a bias against them because they tend to be large files. Multiple levels of detail in a single image sounds fine to me as long as they're all represented clearly - just remember that the images in the articles tend not to be especially large, so too much fine detail may be lost. Opabinia regalis 01:14, 21 October 2006 (UTC)

Pymol

Pros

Commonly used for visualization
Publication-quality images
- for example: Zinc finger, Actin, Arp2/3 complex

Integrated internal raytracer - produces very nicely shaded images without an external raytracer such as POV-Ray
Basic manipulations are fairly straightforward for new users
Especially nice cartoon/secondary structure manipulation
Very easy scripting, either in internal commands or in Python
Free and open-source
Extensive wiki for additional documentation
Advanced users can make excellent movies surprisingly easily

Cons

Documentation significantly lags behind new releases
Beyond the basic features, not especially user-friendly
Idiosyncratic selection syntax
Looks like in the long term, the "free" version will be only distributed as source code, while only paid subscribers will have access to compiled versions
Poor selection of tools for trajectory analysis inhibits the usefulness of those movies
Electrostatic calculations are unstable and "under development"

Molmol

Pros

Publication-quality images
- For example, see Pi helix, Ferredoxin fold, Beta-propeller domain
Freely available here and open-source (not GPL, though)
Easy to use and learn; online tutorial available
Excellent selection language and control of ribbon, bond, atom, etc. parameters such as size and color
Built-in calculators for many features (secondary structure, dihedral angles, electrostatic potentials, etc.)
Excellent for NMR structures, e.g., sausage diagrams
Simple to extend with macros
Decent set of output image formats
Decent documentation built into program itself

Cons

No longer under development?
Annoyingly hard to do a few simple tasks, e.g., label side chains and make rainbow ribbon diagram
Trajectory animation is relatively primitive
Depiction of electrostatic/electron density surfaces hard to learn; Chimera is much better

VMD

Pros

Extremely common (at least in the molecular dynamics and structure prediction communities) and well-established
Most basic manipulations are fairly easily accessible to new users
Good documentation
Easy to produce standardized images; could distribute a script specifying our desired standard parameters
Compiled versions for most OSes; source code also available
Many options for image export, including POV-Ray and Tachyon files for raytracing

Cons

Licensing is free but does require registration
Many integrated tools may be overkill for our purposes
Without significant customization, VMD images are often recognizable as VMD images (like Excel plots are recognizable as Excel plots)
Can be clunky to export a basic, non-raytraced image

Chimera

Pros

High-quality images
A good balance of user-friendliness and flexibility
Good documentation
License does not require academic affiliation
Downloadable binaries for most OSes
Available source code
Lots of cool tools, especially the sequence viewer and volume/surface representations
Updated regularly for bugfixes and new features - centralized development team
Supports an internal command-line with script support.
Python scripting access to nearly all features.
Saved files are cleartext script-based definitions.

Cons

Newer and less widely used than other alternatives
"Feature creep"; some features/options aren't likely to be used and just clutter the work surface
Interactive tools work slightly differently than other programs (VMD in particular); can be difficult to switch
Stability issues under Linux with certain X configurations
New features (as add-ins) take time to be incorporated into stable releases. Documentation can lag behind feature releases.
Difficult to compile. Many dependencies on other programs.
Python implementation of features can differ from internal command line implementation.

MolScript

Pros

Publication-quality images
We could plausibly distribute a standard script so that users only need to change the target PDB file
Probably the most powerful/flexible option in terms of illustrating specific features

Cons

Clunky licensing requirements (must have, or have had, some academic affiliation)
Distributed as source code; can be fussy to compile and install
Heavy on command-style scripting
User-unfriendly, especially to those with no Linux experience

SPDBV

Pros

Often recommended to students - if someone has used only one program, it's probably either this or Rasmol
Mac and Windows versions.
Simple coloring/view options.
Calculation and display of surfaces, electrostatics.
Allows mutation and alignment of structures.

Cons

~~Not as powerful as other options; fewer analysis tools~~ I seem to have been misinformed; it apparently has more analysis functions than I thought.
Ray-tracing requires separate program.
No ability to export animations.

Cn3D

Pros

Cons

QuteMol

Pros

High-quality, publication-class images
High-end rendering effects to create comprehensible images of molecules in real time
Freely available and open source (GPL)

[edit] Cons

Not as powerful as other options; no analysis tools
Limited set of visualization modes (no cartoon)
Requires a recent graphics card
No ability to export animations.
Very recent and not widely used

[edit] Blender

[edit] Pros

Most options for lighting, colors, materials, etc.
Supports distributed rendering.
Excellent support for irregular and large surfaces.
Highly-developed GUI for manual positioning of structures.
Hardware acceleration.
Advanced animation system.
Saved files can be used as a linked library system for reuse.
Limitless structural representation options.
Internal and external raytracing engines.
Extremely large support community.
Python scripting access to all features.
Available compiled for most platforms.
Small footprint: ~10MB installed.
Distributed under the GNU GPL.
Under constant and extensive development

[edit] Cons

NOT originally designed as molecular graphics modeling software: provides raw, low-level modeling options.
Lacks import/export filters for common file formats.
No built-in support for common chemistry features: bonds, atom features/characteristics, chains, subunits, etc.
Has no historic use as a molecular graphics platform.
Steep learning curve for people without a background in 3D computer animation.

[edit] Collaboration history

Re. Wikipedia:WikiProject Molecular and Cellular Biology/Collaboration of the Month/History - could the table perhaps include links to the article before and after? Also, I'm not sure how the table works, but the cell nucleus does have a review, a fact that should be included in the appropriate data-cell. ShaiM 11:48, 22 October 2006 (UTC)

Sure, I can do that. I'll add this right now. – ClockworkSoul 16:52, 25 October 2006 (UTC)
- There - how's that look? – Clockwork Soul 17:10, 25 October 2006 (UTC)
- Ace. ShaiM 13:21, 26 October 2006 (UTC)

[edit] MCB Barnstars

The new biology "barnstar"?, from PDB 1IAS.
Just search for pentamer... From PDB 1T0T.
The so called β-star secondary fold...
...same without the background... From PDB BARN :-).
...with gfp in the middle to make it more barnstary. Zephyris 12:19, 3 November 2006 (UTC)

It would be fun to have a few more MCB-specific barnstars, as suggested above, perhaps one for every major area? The DNA Barnstar is great. For proteins, Tup1 is a possibility, but it has seven blades, not five; maybe we should try another β-propeller such as 1gyh? IgM is another five-fold possibility. Willow 11:15, 24 October 2006 (UTC)

What other areas are you thinking of? I'm thinking that the DNA star may be sufficient, but I'm open to new ideas! – ClockworkSoul 16:51, 25 October 2006 (UTC)
- How about this? Zephyris 23:55, 2 November 2006 (UTC)
  - I really like the TGF beta receptor. David D. (Talk) 00:11, 3 November 2006 (UTC)
    - I think the shape should still be recognisable as a barnstar. --Splette :) ^Talk 02:10, 3 November 2006 (UTC)
      - I'm really liking the β-star (without the background). Hope you don't mind, I arranged everything into a gallery. – Clockwork Soul 06:41, 3 November 2006 (UTC)
        
        I must say the β-star's a bit of a beauty... —The preceding unsigned comment was added by Zephyris (talk • contribs).
        
        Perfect!--Splette :) ^Talk 15:36, 3 November 2006 (UTC)
        
        The new one with the GFP core, you mean? I agree! – Clockwork Soul 15:38, 3 November 2006 (UTC)
        
        Yes. This is what I call collaboration ... --Splette :) ^Talk 16:00, 3 November 2006 (UTC)

Unorganised and unintentional nonetheless! Zephyris _Talk 00:23, 4 November 2006 (UTC)

Very cool! Dr Aaron 06:10, 6 November 2006 (UTC)

Sorry to rain on the parade here. I don't like the version with the GFP in the middle; I much prefer the simpler version. Where does the protein structure come from? David D. (Talk) 09:36, 6 November 2006 (UTC)

It's from PDB ID 1ERJ, plus some "structural remodeling" in Photoshop. --Splette :) ^Talk 11:09, 6 November 2006 (UTC)

[edit] Okay, now what do we call this thing

Like the title says... every barnstar should have a name, so what do we call it? – Clockwork Soul 18:27, 3 November 2006 (UTC)

"molecular biology", "cellular biology", "biochemistry", "protein", etc. i apologise in advance, cheap wine! something summarising general biology molecularness... Zephyris _Talk 00:34, 4 November 2006 (UTC) - tho i'm embarrassed to sign this :)

Hmm... the "Biostar" is already taken. How about something "moleculey" (hey, I made up a word!) like "The αβ-Star"? Beats me. I've actually been editing for a change, and my brain is shot. – Clockwork Soul 06:15, 6 November 2006 (UTC)

MCB βαρη-star? Not very creative either, I know... --Splette :) ^Talk 09:28, 6 November 2006 (UTC)

I think just "The Protein Barnstar" might be best... - Zephyris _Talk 12:19, 6 November 2006 (UTC)

I'm not so sure... that implies that we only give out the star for protein contributions, but we have a wider scope than that. – Clockwork Soul 14:45, 6 November 2006 (UTC)

"Molecular life barnstar" or "Biochemistry barnstar" TimVickers 18:17, 6 November 2006 (UTC)

"Molecular life barnstar" seems adequate. Maybe "Biochemistar"? (or maybe not...) – Clockwork Soul 19:36, 7 November 2006 (UTC)

The InSilico Star? TimVickers 20:23, 7 November 2006 (UTC)

I kind of like that one. – Clockwork Soul 22:29, 7 November 2006 (UTC)

But what have the star and people's MCB contributions got to do with in silico? I think we ought to call it the 'MCB Barnstar'... perfect. :) --Username132 (talk) 11:18, 9 November 2006 (UTC)

[edit] Shall we have a vote?

Might as well use the vote page... shall we vote on name and which image to use? - Zephyris _Talk 13:27, 7 November 2006 (UTC)

I think we could use a few more suggestions for the name. There is none that appeals to me so far. - Splette :) ^Talk 14:01, 7 November 2006 (UTC)

'Tis true, I was thinking more about the image. - Zephyris _Talk 20:54, 7 November 2006 (UTC)

Yes, time for image voting. Maybe during the vote someone will come up with a good name for that thing --Splette :) ^{How's my driving?} 06:03, 23 November 2006 (UTC)

[edit] Stubs, stubs everywhere!

There are a number of stub classes that relate to this project, but seem to relate to one another in a haphazard way, if they do at all. So far, I've found:

1. Category:Biochemistry stubs, which contains
  - Category:Protein stubs (~300 articles)
  - Category:Enzyme stubs (~300 articles)
2. Category:Cell biology stubs (~450 articles)
3. Category:Molecular and Cellular Biology stubs AND Category:Molecular and cellular biology stubs (both ~30 articles), which are the categories that our {{Molecular and Cellular Biology-stub}} and {{Molcellbio-stub}} templates, respectively, add articles to.

I would like to sort these stub categories by placing them in a hierarchy beneath - I'm thinking - Category:Molecular and Cellular Biology stubs. Any thoughts? – Clockwork Soul 17:10, 25 October 2006 (UTC)

I went and merged both of those in #3. That one seemed like a no brainer, at least. – ClockworkSoul 03:57, 26 October 2006 (UTC)
- What do you mean by hierarchy? ShaiM 13:24, 26 October 2006 (UTC)
  - That one category should contain the others as sub- and/or sub-sub categories, in some logical combination. – Clockwork Soul 13:33, 26 October 2006 (UTC)

If this hasn't been reorganized yet, biochem and cell bio should be subcategories of mol & cell bio, proteins should be a subcategory of biochem, and enzymes a subcategory of proteins. Ideally that would reflect however the corresponding article categories are organized. Opabinia regalis 01:40, 4 November 2006 (UTC)

Opabinia, I concur. GAThrawn22 02:38, 4 November 2006 (UTC)

Better organisation would help the stubs. Perhaps Molecular & Cellular biology and Protein being the two "big" categories, and Enzyme, Developmental Biology, Cell Biology, Molecular Biology, and Biochemistry being subcategories. Whatever the decision, clearly putting how things should be classified on the MCB page would be good. Dr Aaron 06:09, 6 November 2006 (UTC)
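For anyone who wants to see the nesting laid out, here is a rough Python sketch of the hierarchy suggested above by Opabinia regalis (biochem and cell bio under mol & cell bio, proteins under biochem, enzymes under proteins). The dictionary name, the helper functions, and the shortened category labels are all illustrative assumptions, not an agreed scheme or real category titles.

```python
# Hypothetical sketch of one proposed stub hierarchy from this thread.
# Labels are shortened for illustration and are not exact category titles.
STUB_HIERARCHY = {
    "Molecular and Cellular Biology stubs": {
        "Biochemistry stubs": {
            "Protein stubs": {
                "Enzyme stubs": {},
            },
        },
        "Cell biology stubs": {},
    },
}


def path_to(category, tree=STUB_HIERARCHY, trail=()):
    """Return the chain of categories from the top down to `category`, or None."""
    for name, children in tree.items():
        here = trail + (name,)
        if name == category:
            return list(here)
        found = path_to(category, children, here)
        if found is not None:
            return found
    return None


def render(tree=STUB_HIERARCHY, depth=0):
    """Return the hierarchy as indented list lines, one category per line."""
    lines = []
    for name, children in tree.items():
        lines.append("  " * depth + "- " + name)
        lines.extend(render(children, depth + 1))
    return lines
```

Calling `path_to("Enzyme stubs")` walks down through the parents, which is one way to check that every stub category ends up under a single top-level category. A different arrangement, such as the one Dr Aaron describes, would only need a different dictionary.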

[edit] Our scope: should it include unicellular critters?

Tim's recent efforts into improving influenza got me to wondering a simple thing: does it fit within the scope of our MCB project? If it does, then do all varieties of viruses and unicellular critters have a place on our worklist? Looking over several similar pages, none of them seem to be claimed by any particular project already. However, our project's collective plate is already pretty full. Does anybody have any thoughts on the matter? – Clockwork Soul 17:43, 25 October 2006 (UTC)

There is a Viruses project, but it seems pretty quiet. Might be best to keep focussed until we've covered the core of MCB a little more thoroughly. TimVickers 18:41, 25 October 2006 (UTC)

I'm tending to agree, although I think that Wikipedia:WikiProject Viruses may have been abandoned. – Clockwork Soul 19:51, 25 October 2006 (UTC)

Well I'm not taking that project over. My plan is to get the "big three" diseases to FA (only malaria to go) and then go back to more biochem-type subjects. TimVickers 20:54, 25 October 2006 (UTC)

I think I'm going to mark it as inactive. Let me know if you need any help, Mr. FA machine. – Clockwork Soul 21:38, 25 October 2006 (UTC)

If the Viruses project has been abandoned then I think it most definitely falls under our project. Perhaps we could send a notice to the virus project members and see if they're still working on it. Another suggestion is we can adopt the virus project as a daughter project. Remember, microbiology can be considered as a sub-specialty of cellular biology. GAThrawn22 21:41, 25 October 2006 (UTC)

I just marked it as inactive. Perhaps we should adopt it, but if we do so, we would also have to adopt the also-defunct Wikipedia:WikiProject Prokaryotes and protists. The question I have is this: what would adopting them mean for us in terms of our responsibilities? – Clockwork Soul 21:45, 25 October 2006 (UTC)

I thought MCB already covered unicellular organisms and viruses. I guess our responsibilities would expand - we can just pick up where those other projects left off, add their to-do list to ours etc. I don't think the issue needs a whole lot of consideration - let's just do it. --Username132 (talk) 12:44, 26 October 2006 (UTC)

I dunno, seeing as inclusion of those topics would substantially broaden the area covered by MCB. Also, I always took Mol. and Cel. Bio. to mean those topics relating to the working of a cell, with any interest in pathology/specific organisms being peripheral. I would be against such a move while recommending a vote, seeing as opinion depends largely on individual interest. ShaiM 13:30, 26 October 2006 (UTC)

Well, let's just do this: we'll leave those projects as separate entities, but put a bit of effort into cleaning up the main pages, making basic "this article is part of the X project" templates, and creating nice pretty assessment lists. The talk page templates alone tend to serve as a pretty solid form of advertisement, so we'll probably manage to catch the attention of at least a couple of people interested in one or both of those projects. In addition, we can post recruitment requests on high-profile pages like virus and influenza. With any luck, in a couple of weeks, the projects would have a couple of reasonably dedicated members who will be willing to take the lead, and we won't have to babysit the projects anymore. How's that sound? – Clockwork Soul 13:44, 26 October 2006 (UTC)

That sounds like an excellent plan. TimVickers 14:50, 26 October 2006 (UTC)

Okay... I cleaned up their project page a bit (really, I just recycled my design for our own page), set them up for the MathBot to build them a Viruses equivalent of our own worklist, and built them a new project template (I can't believe they didn't even have one!). If anybody bored can assist by tagging a few virus-related articles, I'm sure it would go a long way towards recruiting a few people to revive the project. – Clockwork Soul 05:15, 28 October 2006 (UTC)

[edit] External links

I think it might be a good idea to make external links to Uniprot for individual proteins, or to Pfam for groups of proteins described in Wikipedia articles. As for links to PDB, this is not so simple, because every protein can be represented by a large set of PDB entries. The references to SCOP are better (does anyone have an idea why SCOP is so outdated?). Also, it would be better to refer to PDBsum instead of the PDB, because PDBsum provides a much better interface with other databases and a lot of structural parameters of proteins that cannot be found in PDB. Biophys 19:27, 31 October 2006 (UTC)

[edit] MCB image

Re. the image seen on the templates (as put up on talk pages) and also on the user box: How come they keep changing? I think we're onto our third one. Personally I liked the one before this one, the molecule image. I don't care much for the current one; the colour scheme isn't too attractive. Any chance of reverting? ShaiM 04:28, 1 November 2006 (UTC)

Oh, you wrote about this before - sorry - I forgot all about it. :/ I rather liked the color scheme: I thought the purple image went well with the blue and purple links. Are there any images that you prefer? – Clockwork Soul 05:25, 1 November 2006 (UTC)

[edit] Protein Representation Picture Requests

User:Splette volunteered to do protein pictures upon request, but if there are others willing to do this, it might make sense to add a page like Wikipedia:WikiProject Molecular and Cellular Biology/ProteinPictureRequests where "Protein Picture Hitpersons" could watch and field requests. Are there others willing to do this? I'd be most happy, since I am terrible at rendering these things. ~Doc~ EquationDoc 16:33, 10 November 2006 (UTC)

DONE! See Wikipedia:WikiProject_Molecular_and_Cellular_Biology/Requested_proteins for protein image and pathway diagram requests--field a request or add one to the list! ~Doc~ EquationDoc 16:46, 11 November 2006 (UTC)

[edit] Protein Infobox Addition Proposal

Does anyone else see a need for aliases/alternative names/etc in the protein info box? I know this sort of works against the important goal of standardization of gene/protein names, but the reality is you often can't find references (PubMed, google, etc) without them. Moreover, in most cases, it probably makes little sense to devote space to them in article content, so adding them to the template seems reasonable (to me). Anyone? ~Doc~ EquationDoc 16:52, 10 November 2006 (UTC)

I'm not sure if you are asking if it should or shouldn't include an alias field. None of the fields (except for the standard protein name) are required in the Template:protbox. For its usage, see Protbox usage. If you are adding a protbox to a protein article, you are not required to put an alias in the other names field. I find that there are several proteins that have many names in use, and some people may be accustomed to one over another. The alternate names field will help someone be sure that they've found the right protein. Many other protein databases on the web include such a field in their tables, so I don't see why it should be a problem. GAThrawn22 00:17, 11 November 2006 (UTC)

Doh! I was using Template:Protein which doesn't allow alternate names. Thanks for pointing it out. ~Doc~ EquationDoc 02:31, 11 November 2006 (UTC)

[edit] Wikipedia, Pfam/Interpro and SMART

Most of you know about Pfam/Interpro, which provides brief but very systematic annotations (short summaries) for different protein families, and also about SMART, which does the same for different protein domains. These summaries are at the level of "stubs" or better. I understand that Pfam/Interpro and SMART operate under the same "open access" policy as Wikipedia, which means that everyone can copy and modify the content. It would be possible to identify a set of the most important protein families and domains that are missing in Wikipedia but present in Interpro and SMART, and copy their summaries as the initial Wikipedia "stubs" with a reference and link to the corresponding Interpro or SMART entries. We could also ask people from Interpro and SMART what they think about such an idea, and they might even be willing to help. Biophys 17:04, 16 November 2006 (UTC)

See list of SMART domains: [9]. Few of them can be found in Wikipedia. I think the summaries can be downloaded to Wikipedia automatically, but it is important to have consent from the SMART authors. Of course, the idea is to improve these short summaries in the future. Biophys 17:43, 16 November 2006 (UTC)

SMART uses annotation from InterPro which contains copyrighted information such as PROSITE annotation. I have e-mailed Pfam and asked about the copyright status of their database. TimVickers 19:42, 16 November 2006 (UTC)

Their reply was as follows:

Hi Tim,

Pfam is distributed under the terms of the GNU GPL license. According to that license any derivatives should also be distributed under GNU GPL. However, we tend to take a pragmatic view for small parts of the data to make Pfam maximally useful. Do you have an example of the kind of info you would take from Pfam?

Pfam is really a database of protein family annotations rather than for individual proteins. We would certainly be interested in providing links etc and whatever information we can.

Yours sincerely Alex Bateman

Good. If I understand correctly, Wikipedia operates under GNU license. What I mean is this. For example, Wikipedia has no article about C2 domains. I would go to SMART C2 domain annotation: [10], copy the annotation, maybe modify this annotation (but maybe not), make internal Wikipedia references within the annotation, and provide this link to SMART [11]. That would be a stub about C2 domains. Someone could improve it in the future. Would that be fine? I can do this for a couple of domains as an experiment, and then ask Alex Bateman if he likes it. Of course, it would be much better if people from the SMART/PFAM team generated such Wikipedia stubs automatically (but one has to make sure that the corresponding article is not already in Wikipedia). Then, someone could look through these stubs and wikify them. Biophys 22:51, 11 December 2006 (UTC)

You can't do that with SMART, because as I said earlier, this contains copyrighted information from Prosite. However, you can do this with Pfam. TimVickers 23:21, 11 December 2006 (UTC)

Then I will use Pfam if needed. Actually, the annotation in SMART consists of two parts. One part is an abstract from INTERPRO, and it is exactly the same as in Pfam. Another part is a kind of header ("Description"), which is not taken from PROSITE but can be found only in SMART. Biophys 00:56, 12 December 2006 (UTC)

I have created several new articles using this method. Pfam helps a lot, but some editing is usually required. Unfortunately, some Pfam entries are poorly annotated. Biophys 04:05, 12 December 2006 (UTC)
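For reference, the stub-making steps described in this thread (check that no article exists yet, copy the annotation, link back to the database entry) could be sketched roughly as below. This is only an illustration under assumptions: the function name, its parameters, and the layout of the generated wikitext are made up for the example, the `{{Molcellbio-stub}}` tag is simply the project stub template mentioned elsewhere on this page, and nothing here fetches data from Pfam. Per the copyright discussion above, it would be meant for Pfam text only, and the output would still need hand editing.

```python
def make_stub(title, annotation, source_name, source_url, existing_titles):
    """Return wikitext for a new stub, or None if an article already exists.

    All names and the output layout are hypothetical; adjust to taste.
    """
    # Skip titles that are already covered (case-insensitive comparison).
    known = {t.strip().lower() for t in existing_titles}
    if title.strip().lower() in known:
        return None

    # Collapse stray whitespace and line breaks in the copied annotation.
    summary = " ".join(annotation.split())

    return "\n".join([
        summary,
        "",
        "== External links ==",
        f"* [{source_url} {title} entry in {source_name}]",
        "",
        "{{Molcellbio-stub}}",
    ])


# Example with placeholder text and a placeholder URL (not a real entry).
stub = make_stub(
    title="C2 domain",
    annotation="Placeholder summary text\ncopied from the database entry.",
    source_name="Pfam",
    source_url="https://example.org/placeholder-entry",
    existing_titles=["Cell nucleus", "Influenza"],
)
```

Passing the list of existing article titles in explicitly keeps the duplicate check separate from however that list is obtained, which this sketch deliberately leaves open.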
