Wikipedia:WikiProject Molecular and Cellular Biology/Proposals

Discuss proposals concerning the Molecular and Cellular Biology Wikiproject here.

Wikipedia, Pfam/Interpro and SMART

Most of you know about Pfam/Interpro, which provides brief but very systematic annotations (short summaries) for different protein families, and also about SMART, which does the same for different protein domains. These summaries are at the level of "stubs" or better. I understand that Pfam/Interpro and SMART operate under the same "open access" policy as Wikipedia, which means that everyone can copy and modify the content. It would be possible to identify a set of the most important protein families and domains that are missing from Wikipedia but present in Interpro and SMART, and copy their summaries as the initial Wikipedia "stubs" with a reference and link to the corresponding Interpro or SMART entries. We could also ask people from Interpro and SMART what they think about such an idea, and they might even be willing to help. Biophys 17:04, 16 November 2006 (UTC)

See the list of SMART domains: [1]. Few of them can be found in Wikipedia. I think the summaries can be downloaded to Wikipedia automatically, but it is important to have consent from the SMART authors. Of course, the idea is to improve these short summaries in the future. Biophys 17:43, 16 November 2006 (UTC)
SMART uses annotation from InterPro which contains copyrighted information such as PROSITE annotation. I have e-mailed Pfam and asked about the copyright status of their database. TimVickers 19:42, 16 November 2006 (UTC)
Their reply was as follows:

Hi Tim,

Pfam is distributed under the terms of the GNU GPL license. According to that license any derivatives should also be distributed under GNU GPL. However, we tend to take a pragmatic view for small parts of the data to make Pfam maximally useful. Do you have an example of the kind of info you would take from Pfam?

Pfam is really a database of protein family annotations rather than for individual proteins. We would certainly be interested in providing links etc and whatever information we can.

Yours sincerely Alex Bateman

Good. If I understand correctly, Wikipedia operates under GNU license. What I mean is this. For example, Wikipedia has no article about C2 domains. I would go to the SMART C2 domain annotation: [2], copy the annotation, maybe modify this annotation (but maybe not), make internal Wikipedia references within the annotation, and provide this link to SMART [3]. That would be a stub about C2 domains. Someone could improve it in the future. Would that be fine? I can do this for a couple of domains as an experiment, and then ask Alex Bateman if he likes it. Of course, it would be much better if people from the SMART/Pfam team generated such Wikipedia stubs automatically (but one has to make sure that the corresponding article is not already in Wikipedia). Then, someone could look through these stubs and wikify them. Biophys 22:51, 11 December 2006 (UTC)
You can't do that with SMART, because as I said earlier, this contains copyrighted information from Prosite. However, you can do this with Pfam. TimVickers 23:21, 11 December 2006 (UTC)
Then I will use Pfam if needed. Actually, the annotation in SMART consists of two parts. One part is abstract from INTERPRO, and it is exactly the same as in Pfam. Another part is a kind of header ("Description"), which is not taken from PROSITE but can be found only in SMART. Biophys 00:56, 12 December 2006 (UTC)
I have created several new articles using this method. Pfam helps a lot, but some editing is usually required. Unfortunately, some Pfam entries are poorly annotated. Biophys 04:05, 12 December 2006 (UTC)
Pfam got back in touch this morning. TimVickers 16:58, 2 February 2007 (UTC)
Hi Tim,
I have spoken to several members of the Pfam consortium and there is unanimous support for you doing this. Please let me know if we can help with this.
Are you also interested in RNA families? I am also in charge of the Rfam database. One of our goals for the coming year is to make the annotation for Rfam into a community resource using a wiki. However if this were part of Wikipedia then so much the better. Do you think that is feasible?
Yours sincerely
Alex Bateman

Scientific citations

Would your WikiProject like to endorse Wikipedia:Scientific citation guidelines? If so, please let those editors at that guideline know. --ScienceApologist 19:07, 1 December 2006 (UTC)

I agree. Consistent referencing is important for all articles (although I can only be bothered to do Harvard and will leave it to some wikignome to convert Harvard to footnote references, which I do prefer over Harvard; just not enough to go through the bother). I predict support from the rest of MCB and am now on my way to let those editors know that we endorse the guidelines (if enough people disagree, which I think unlikely, we can always revoke our endorsement). --Username132 (talk) 22:20, 15 December 2006 (UTC)
Doesn't seem to be much activity on this issue; if there's an official stage of endorsement or whathaveyou, we can probably move on to that. Opabinia regalis 04:13, 20 December 2006 (UTC)
I'm not sure what would constitute an "official stage". These guidelines are already operational. The page currently starts off with:
This page is a guideline for Mathematics, Physics, and Chemistry.
It expresses the consensus of editors in those projects about specific details of inline citation. Editors in other scientific projects should follow the practice followed by those projects.
WikiProject Chemistry was just added today, following a "vote" of endorsement at Wikipedia talk:WikiProject Chemistry#Wikipedia:Scientific citation guidelines. The question here is: can Molecular and Cellular Biology be added to the projects explicitly listed on the guideline page? At the moment there is no indication of consensus, but only the absence of manifest opposition.  --LambiamTalk 08:37, 20 December 2006 (UTC)

Vote on proposal CLOSED I set up a vote on this. TimVickers 18:39, 20 December 2006 (UTC) Vote page

Proposal from Novartis/GNF

I got an interesting e-mail this morning.

Hi Tim,
Since it looks like you have some official (or at least very active) role in the MCB project at Wikipedia, I thought I'd try emailing you first. I'm wondering if there is a potential synergy between our two projects...
I lead the "Symatlas" project at GNF (http://symatlas.gnf.org/SymAtlas/). The goal of this application is two-fold. First, we want this to serve as a "gene portal" (with a mammalian bias) which collates all the relevant information in the public domain for all genes. Second, we use this application to release our data into the public domain. Right now, we primarily have gene expression data, centered around our "GeneAtlas" data set which measures expression across an anatomically diverse set of tissues. In the future, we will also post our data for large-scale siRNA screening.
Right now, we're in the process of rebuilding SymAtlas to improve the user interface, responsiveness, features -- pretty much everything. One of the things on our list of new features is a wiki. We were originally thinking of maintaining (and possibly coding) our own wiki for tighter integration with SymAtlas, but actually the MCB effort may be a good partner. You guys have seeded quite a bit of content and probably have a pretty broad audience. We have a bunch of custom data that we could contribute, and we also have a decent sized audience (3000 visitors and 50,000 pageviews per week).
Anyway, let me know if you think this might be mutually beneficial.
Cheers,
-andrew
Andrew Su, Ph.D.
Genomics Institute of the Novartis Research Foundation

I replied.

Hi Andrew
Thank you, this sounds like a good opportunity. We are always happy to co-ordinate with people who want to add content to Wikipedia. Obviously, anything that is added must be licensed under the GFDL and be verifiable (published elsewhere). The advantage to using Wikipedia for distributing data is that it has very high visibility and can be integrated with an unlimited number of other resources. The disadvantage is that, due to open editing, the data can be altered and is thus less reliable than information maintained on a third-party website.
With these advantages and disadvantages in mind, what information and what form of presentation were you considering? One possibility that comes to mind is the Protein Infobox
http://en.wikipedia.org/wiki/Wikipedia:WikiProject_Molecular_and_Cellular_Biology/Style_guidelines#Infoboxes
It might be possible to import data from your project to this standard format and produce a basic summary for each gene/protein in your database.
I will post this on the MCB website and canvass for other ideas. Please feel free to join and participate directly!
Thank you again for your interest.
Dr Tim Vickers

So does anybody else have other suggestions as to how we could coordinate? This could be a very valuable collaboration. TimVickers 16:56, 5 January 2007 (UTC)

Hi Tim!
It's a very tempting prospect, and I'd support it all; way to go, big N! :) I'm worried only that the uploading of new (albeit verifiable) data to Wikipedia would violate WP:NOR. Could we get a special dispensation for uploading data that anyone could verify, e.g., the results of publicly available web-servers for a given protein sequence and similar stuff? Alternatively, they could "publish" at their site and then copy the data over to us (that's probably what you were thinking?) but the unnecessary duplication pains me somewhat; Occam's razor and all that.
Perhaps they'd be willing to open up their wiki to members of the MCB WikiProject as a GFDL resource, so that we and others could add content with a good conscience? I'd devote hours to describing my favourite proteins, as would others here, I believe; at least I recall several people on our membership list who wanted to make a page for every known protein... Willow 18:05, 5 January 2007 (UTC)


Thanks Tim for putting this in the appropriate place for discussion. I thought I'd supplement my rather general email above with a few specific proposals/ideas.
First, two ideas on how we might be able to contribute content to Wikipedia that could be accomplished pretty easily (if they were deemed to be desirable).
1) For all the gene entities that are currently in Wikipedia, we could add the gene expression profile from our SymAtlas database to the Wikipedia gene page. Yes, these data have been published previously. I would need to check with our legal department to confirm that we can release it under GFDL, but I'm confident that could happen. In terms of whether the data is appropriate/interesting, we feel that information on where a gene is expressed in the human/mouse anatomy is a basic piece of gene annotation.
2) As mentioned above, one of the goals of SymAtlas is to be a gene portal, and as such we have a database of nonredundant mammalian genes, plus all the links to native data sources (Entrez Gene, Ensembl, Affymetrix chips, GO, etc.) In total, we have ~190K species-specific entries between mouse, human, and rat, which means just over 60k species-independent genes. (I know, counts are high, but so far we've been conservative with collapsing gene entities.) Anyway, we could do some sanity filtering, and then create stubs for a large number (thousands or tens of thousands) of genes with expression patterns and appropriate database links.
And now, two ideas on possible ways SymAtlas could utilize Wikipedia:
3) We'd like to embed a wiki section in a SymAtlas "gene report" (together with other gene annotation information). If the wiki were Wikipedia, then the link to edit a page would clearly just redirect to Wikipedia itself. But for display in the gene report, we'd need to figure out a way to capture the content without a lot of the surrounding elements. For example, we'd want to take off the left navigation bar and much of the top header. Is this a faux pas?
4) We'd like to create simple ways for people to add content in a structured way. For example, I'd like it if we had a simple text-box where people could enter a Pubmed ID or a URL, and hitting "submit" would trigger a bot to add that link to an appropriate area in the wiki. We get a lot of users at SymAtlas who don't have a lot of computer sophistication, so if we want them to contribute their biological info, we need a pretty darn low barrier of activation.
And finally, two ideas/thoughts on SymAtlas-specific needs that come to mind during this initial brainstorming process:
5) Although Wikipedia wants to catalog previously published findings, I think there's probably a use to have a wiki that allows contribution of less-substantiated results. (Perhaps this has already been discussed in this forum?) I think it would be cool if someone could, for example, take a list of 100 genes that were differentially expressed in their gene expression experiment, search for them in SymAtlas, create a tag that describes the preliminary finding, and post it to each of those 100 wiki pages (in a specific section, of course). This might be a decent way to foster new collaborations. However, given the less rigorous nature of these data, maybe that's an argument to set up a parallel wiki effort, and SymAtlas would combine content from them both.
6) And finally, (to answer the question of what does GNF/Novartis have to gain from all this) I'd really like an internal-only section for proprietary content. This would house data like "four small molecules targeting this gene all failed due to liver toxicity" with links to the appropriate internal reports. Perhaps this is an argument for a third parallel wiki (or building a custom wiki solution with a security model built-in).
Okay, I think that's it for now. Sorry for the long-winded reply, but we've been thinking about the possibilities for a while and are quite excited! AndrewGNF 18:27, 5 January 2007 (UTC)
A few quick notes. First, does the fact that these data have all been published satisfy the WP:NOR? Second, we don't yet have a wiki, but when we do have it set up, it will be open to the entire community. Whether we use wikipedia, the MediaWiki software, or build our own is still up for discussion. But we hope to have something in place in six months. Third, in case you want to see what a SymAtlas gene report currently looks like, this is the one for ITK (though as mentioned above, the user interface is currently undergoing a thorough refactoring, also targeted to be done in ~six months...) AndrewGNF 19:02, 5 January 2007 (UTC)
We have been discussing using Pfam summaries as the basis of short articles. If we could merge your expression data and links to other databases with the relevant Pfam description, this would be a good solid base for a gene page. TimVickers 19:46, 5 January 2007 (UTC)
That could be a great collaboration! But I just received a message from User:Where who said that Wikipedia is under the GFDL, and not the GPL, so he is not sure if we can use Pfam summaries. So, can we actually use them? Biophys 20:08, 5 January 2007 (UTC)
We do have links to protein domains, but actually through Interpro. Interpro is to protein domains what SymAtlas is to genes. Given many different data providers with different IDs for a single concept (genes/families), SymAtlas and Interpro seek to create a nonredundant index. Check out, for example, the SymAtlas page for CDK2. Half way down the annotation column on the right, you'll see a section for "Protein Family" and a link to "Protein Kinase". That InterPro ID links to Pfam, as well as the corresponding model for "Protein kinase" in Prodom and Prosite. But bottom line, our expression data is on a per-gene basis, so it wouldn't make sense to link on that level (since protein families contain multiple genes). These expression data (and all the other links to public databases we've assembled) I think would be most appropriately presented on a "gene" page (e.g., P53) AndrewGNF 20:20, 5 January 2007 (UTC)
This may be slightly off topic, but have you guys considered adding OMIM content? e.g., P53. In terms of having highly curated and referenced annotation, I can't think of a better source. Their terms of use also appear to be very permissive. AndrewGNF 21:28, 5 January 2007 (UTC)
Sadly we can't use that. All content in Wikipedia must be licensed by the GFDL licence, which allows unrestricted copy, modification and use. TimVickers 21:45, 5 January 2007 (UTC)

This collaboration looks great but I do wonder how this will work with respect to copyright issues. I am unclear on what you mean, Andrew, by an internal-only section. If this is what you need then you will certainly have to have your own wiki. I wonder if your best bet is to use Wikipedia as a testing ground for what you want to do on your own wiki. This allows all scientists (actually anyone) to have some input with regard to the format and content. In other words, develop it here and then transfer the results to your own database. In that way you will be able to tap into opinions and expertise here without running afoul of the open access. It seems you gain more than us but if you do a good job we will have access to a lot of great data and that is all that matters. David D. (Talk) 22:28, 5 January 2007 (UTC)

Obviously there are some issues to work out here, but this sounds like a great idea overall. To go through the list above..
  1. Is an excellent idea. We'll need to have a centralized list of all our gene articles (I'm sure there's some stubs in the wrong category/not in any category), but that can probably be arranged.
  2. Sounds like the MCB answer to Rambot's articles :) Andrew, have you given any specific thought to how this might be automated? (Maybe Rambot itself could be modified and repurposed.)
  3. You can reuse, redistribute, and modify Wikipedia content as much as you like as long as you stay within the terms of the GFDL. It sounds like you might want a Wikipedia mirror; note that you probably can't "remote-load" from Wikipedia's servers on-the-fly. There are instructions and information for obtaining regular static database dumps here and here if you want to have a look and see if this might suit your purposes. You can still link to Wikipedia for editing, but your local copy won't update until you load the next data dump containing any edits made by your users.
  4. I think this idea needs some fleshing out. What is the user intending to do with this PMID or URL? It's unlikely that people here will approve of users elsewhere being able to trigger a bot that systematically adds links to a large number of articles - that would be interpreted as (and has the potential for abuse as) spam, even if the links are well-meant. Adding links to appropriately titled external links sections of individual articles, however, is very easy, and possibly someone could write very simplified editing instructions just for this if you anticipate that it will be a common activity. (Bear in mind that, even if not bot-assisted, large numbers of users arriving from the same site to add related external links will set off some people's spam alerts anyway, so this may require some coordination.)
  5. I agree that this would be useful, but it doesn't sound like the sort of content that Wikipedia hosts, and doesn't sound like the sort of thing you'd want on a publicly editable wiki. (Vandalism to a list of genes could make a very big mess.)
  6. Internal data absolutely should be on your own internal wiki, not here. That's definitely not the sort of thing that Wikipedia hosts, and the nonprofit Wikimedia Foundation probably couldn't legally host that kind of information on its servers. Not to mention the fact that, from your perspective, it would be incredibly insecure to host internal data and documentation on external servers, maintained at least partially by volunteers, that also contain data routinely modified by the general public. Opabinia regalis 03:32, 6 January 2007 (UTC)
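Item 4 above (a text box that accepts a PubMed ID or a URL) can be kept well short of a link-adding bot. A minimal sketch in Python, assuming the form only prepares one line of wikitext that a person then pastes into an article's external links section; the function name `format_external_link` and the PubMed URL pattern are assumptions for illustration, not anything SymAtlas or Wikipedia provides:

```python
import re
from urllib.parse import urlparse

# Assumed URL pattern for PubMed records; adjust if the site layout differs.
PUBMED_URL = "https://pubmed.ncbi.nlm.nih.gov/{pmid}/"


def format_external_link(submission: str) -> str:
    """Turn a PubMed ID or URL into one wikitext bullet (hypothetical helper).

    Raises ValueError for anything that is neither a bare numeric PMID
    nor an http(s) URL, so junk input never reaches a page.
    """
    text = submission.strip()
    # A bare run of digits is treated as a PubMed ID.
    if re.fullmatch(r"\d{1,9}", text):
        return f"* [{PUBMED_URL.format(pmid=text)} PubMed {text}]"
    # Otherwise accept only http(s) URLs with a host and no whitespace,
    # because a space would end the URL inside [URL label] wikitext.
    parsed = urlparse(text)
    if (
        parsed.scheme in ("http", "https")
        and parsed.netloc
        and not re.search(r"\s", text)
    ):
        return f"* [{text}]"
    raise ValueError(f"not a PubMed ID or http(s) URL: {submission!r}")
```

Because the output is plain text rather than an automated edit, each link is still added by a person, which sidesteps the spam concern raised in item 4.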

A few clarifications based on questions/issues above. First, by "internal-only section", I was meaning a place for GNF/Novartis scientists to contribute to a wiki that would not be visible to the public domain. Clearly this would be independent of Wikipedia, maybe even a separate MediaWiki instance that we would host locally (and within our firewall). I'd be interested to hear if anyone has done any work creating a single user interface that combined content from two (or more) wikis.

I hadn't given any specific thought on how to automate the creation of stubs, but I assume we could create some sort of bot to do it. If anyone has any specific suggestions or knowledge (beyond the Rambot links above which we will check out), please point us in the right direction...

Based on the feedback above, the current tentative plan (for item #3 above) would be for SymAtlas to link up three separate wikis. One would be Wikipedia, which hosts well-substantiated findings. The second would be a wiki for more speculative content, and this would be where we may try to put the simple URL / PMID / keyword tagging feature described in #4 and #5 above. We would host this wiki but it would be publicly accessible. The third wiki would be for "internal GNF/Novartis content" (described in #6 above), and we would host this internally within our firewall. SymAtlas would then aggregate all content from these three wiki sources (taking into account the remote-loading policies) and display it integrated within our gene reports. Any comments on this plan are welcome...

Finally, it sounds like there is reasonable consensus and support for using our content to seed Wikipedia stubs. This would contain links to NCBI, Ensembl, Interpro, etc. and also a chart showing the gene expression pattern across anatomic regions (#1 and #2 above). I'll propose that we start with creating just 5-10 gene stubs to get feedback. Anyone have any comment on plan? Also, I don't want to commit to any timeline yet until our SymAtlas development plan is clearer. AndrewGNF 02:30, 11 January 2007 (UTC)

This sounds like a good plan. Thanks Andrew. TimVickers 03:35, 11 January 2007 (UTC)

I suggest consultation, because such a link could be misinterpreted. I think a good way to proceed if you want to use an internal wiki is to go boldly ahead and do so--MediaWiki is available for anyone, and many organizations have done so. One way to link WP to it would be a federated search engine, run by you, that merely searches WP (as mentioned). Alternatively, you could maintain a WP mirror, as many organizations do, and use it however you please. You could certainly make it available to the public, and it would be very good to do so, but I suspect using the WP name for it would not be liked. You could also use it in any combination you like privately--there is no restriction on commercial use of WP. You could use any template or infobox in article space. WP will gladly accept PD content from anywhere, especially from such a reliable source as you, and I think would make arrangements to load the material as it has many other PD sources. We can use outside software that is explicitly GFDL or PD.

Material that has been published in a peer-reviewed article is always acceptable. Information that is not, but has been taken from an authoritative web source that is known to screen material and maintains integrity can also be used--we use PubChem without concern for where exactly they obtained the data.

OK, you've got three similar views. I agree with the suggestion that the best step would be for your people to contribute to WP according to the usual WP standards. Perhaps this is already being done? DGG 04:25, 11 January 2007 (UTC)

PD = Public Domain? When you say "explicitly GFDL or PD", is there a similarly specific definition of PD? AndrewGNF 21:37, 11 January 2007 (UTC)

Migration To White Background Images

When I encounter an image on a black background, I find I have to turn up the brightness/contrast of my monitor, which is unnecessary when the background is white. Do you find it acceptable that the MCB project should support the swapping of protein representations with black backgrounds for ones with white backgrounds? I'm not saying that we should take on the task of changing all pictures with black backgrounds, but I think it is acceptable to do so (cf. with it being unacceptable, in most cases, to swap a white-background image for a black-background version).

I've been told that most visualization programs default to black backgrounds because it's easier to make colors appear to blend with black than white when not using anti-aliasing. Observation bears this out, though I'm not sure why. Raytraced shadowing and depth cuing also look more realistic to me on black, although I agree that white looks nicer in articles (though I've never had any contrast issues) and that white should be preferred in most cases. I just don't always remember to switch ;)
More generally: the recommendations section of the pymol tutorial is currently way down the bottom; should it be further up and/or on a separate page? Opabinia regalis 01:08, 31 January 2007 (UTC)
I think it's best at the bottom, since people aren't ready for recommendations when they're still learning to use the program. I have my monitor brightness turned all the way down and contrast pretty low most of the time. I find it more comfortable (I think the monitor more closely resembles paper this way). --Seans Potato Business 21:43, 31 January 2007 (UTC)

Advice and guideline subpages

Why do we have both [the help page] and Wikipedia:WikiProject Molecular and Cellular Biology/Advice? The title sounds redundant and the current content doesn't really read like a place to get advice. The contents of the advice and external links subpages seem like they ought to be merged to something called "resources"; am I missing the point of these? Opabinia regalis 03:17, 1 February 2007 (UTC)

I think you're right. We need to consider the best way to deal with our advice pages:

Wikipedia:WikiProject Molecular and Cellular Biology/Advice
Wikipedia:WikiProject Molecular and Cellular Biology/Style guidelines
Wikipedia:WikiProject Molecular and Cellular Biology/External links
Wikipedia:WikiProject Molecular and Cellular Biology/References
Wikipedia:WikiProject Molecular and Cellular Biology/Pymol tutorial
Wikipedia:WikiProject Molecular and Cellular Biology/Diagram guide
They should be integrated in whatever manner is deemed suitable rather than allowed to develop independently of each other. --Seans Potato Business 17:18, 1 February 2007 (UTC)

Proposal: Delete Page: Articles Needing Attention

I propose that the needs of the articles needing attention page are met by the article worklist and that the page should be removed. Unless I'm missing something of course... --Seans Potato Business 01:41, 14 February 2007 (UTC)

I think the idea there was to trigger immediate work on particularly abominable articles, though plainly that hasn't panned out yet. Possibly because it's hard to keep track of category contents? Opabinia regalis 02:04, 14 February 2007 (UTC)
But since the work list combines a rating of how important an article is with how complete it is, do we really need to keep this page? If we do, shouldn't we have it update automatically using the info from the worklist (i.e. high-importance yet low-state-of-completeness articles)? --Seans Potato Business 06:07, 14 February 2007 (UTC)
I would suggest to keep the page. There may be articles that need attention for other reasons, e.g. NPOV on controversial topics, incorrect statements, articles tagged by other editors/users as too technical or needing expert attention etc. - tameeria 16:00, 19 February 2007 (UTC)
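The automatic update suggested above (listing high-importance yet low-completeness articles from the worklist) could be sketched roughly as follows in Python. The rating scales, the thresholds, and the function name `needs_attention` are assumptions for illustration; the real worklist may store its ratings differently, and articles flagged for other reasons (NPOV, too technical) would still need a manual list:

```python
# Assumed rating scales, ordered from lowest to highest.
IMPORTANCE = ["Low", "Mid", "High", "Top"]
QUALITY = ["Stub", "Start", "B", "GA", "A", "FA"]


def needs_attention(worklist, min_importance="High", max_quality="Start"):
    """Return titles rated at least min_importance and at most max_quality.

    worklist is an iterable of (title, importance, quality) tuples.
    Results are ordered most important first, then least complete first.
    """
    importance_floor = IMPORTANCE.index(min_importance)
    quality_ceiling = QUALITY.index(max_quality)
    picked = [
        (title, importance, quality)
        for title, importance, quality in worklist
        if IMPORTANCE.index(importance) >= importance_floor
        and QUALITY.index(quality) <= quality_ceiling
    ]
    picked.sort(key=lambda e: (-IMPORTANCE.index(e[1]), QUALITY.index(e[2]), e[0]))
    return [title for title, _, _ in picked]
```

Keeping the thresholds as parameters lets the project tune how long the generated list gets without touching the filter itself.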

MCB Template Text - Comments Too Small

When someone makes comments that are presented on an MCB-supported article talkpage template, they are far too small. I have to struggle to read it. I'm talking about, for example, where I say: Needs more yeast-based coverage. Some sections over-represent bacterial-based methods, relative to yeast. on [Talk:Two-hybrid_screening] - could someone increase the size to normal? I had a go but couldn't produce an effect. Thanks. --Seans Potato Business 05:49, 14 February 2007 (UTC)

Does it look better now? I increased the font size to 90%. If it doesn't look any different, try refreshing your cache?
On a related note, do we want to keep the recent change that adds the collaboration of the month to the template? Announcing the collaboration article on 4000 pages that may have only a tangential relation to the collaboration topic seems a little spammy to me, but maybe it'd draw in more contributors. Opabinia regalis 17:20, 19 February 2007 (UTC)

Worklist - Some Entries May Be Redirects

I anticipate that cell metabolism will soon be merged into metabolism. This makes me wonder whether the current Worklist system is designed to detect this. Would it be an idea to enable automatic detection (every month or so) of all tagged articles that have been turned into redirects, allowing for automatic removal from the worklist (perhaps after automatically checking that the redirected-to page is already in the list, and adding it if it isn't)? --Seans Potato Business 06:06, 14 February 2007 (UTC)
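As a rough illustration of the check proposed above, here is a minimal Python sketch. It relies on the fact that a MediaWiki redirect page begins with `#REDIRECT [[Target]]`; everything else (the function names, and `fetch_wikitext` as a stand-in for however a bot would load page text) is an assumption, and only a single redirect hop is followed:

```python
import re

# A redirect page starts with "#REDIRECT [[Target]]" (case-insensitive).
# Any "#section" or "|label" part of the link is ignored.
REDIRECT_RE = re.compile(r"^\s*#REDIRECT\s*:?\s*\[\[([^\]|#]+)", re.IGNORECASE)


def normalize_title(title):
    """Apply basic title normalization: underscores to spaces, first letter upper."""
    title = title.replace("_", " ").strip()
    return title[:1].upper() + title[1:]


def redirect_target(wikitext):
    """Return the normalized redirect target, or None for an ordinary page."""
    match = REDIRECT_RE.match(wikitext)
    return normalize_title(match.group(1)) if match else None


def refresh_worklist(worklist, fetch_wikitext):
    """Replace redirected entries with their targets, dropping duplicates.

    worklist is a list of titles; fetch_wikitext is a hypothetical callable
    mapping a title to that page's wikitext.
    """
    refreshed, seen = [], set()
    for title in worklist:
        target = redirect_target(fetch_wikitext(title))
        keep = target if target is not None else normalize_title(title)
        if keep not in seen:
            seen.add(keep)
            refreshed.append(keep)
    return refreshed
```

With this shape, a merged article disappears from the worklist and its target is added only if it was not already listed.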

Userboxes

I propose that a userbox for WikiProject Molecular and Cellular Biology should be created. This may spread publicity about this project, bringing more people to work on related articles. It may also look nice on a userpage. Sodaplayer talk contributions 01:05, 23 February 2007 (UTC)

{{user Mol Cell Bio}} is probably what you're looking for? Opabinia regalis 01:10, 23 February 2007 (UTC)

new orthologs template

It's taken a while to get around to it, but I've put together a first-draft proposal of a protein info box that we (GNF) could populate in an automated fashion for ~10K genes. (Recall the previous discussion here.) I put this draft on my user page out of ignorance of a better place to put it (moved a working example to ITK (gene) AndrewGNF 19:27, 13 March 2007 (UTC)). Also, I created/modified two templates (Template:GNF_Ortholog_box and Template:GNF_Protein_box) to create this example; the "GNF" prefix was to make sure I wouldn't muck up anything existing.

In the example, I tried to integrate the ortholog box into the main protein box here, but it obviously didn't work. Anyone have thoughts on how to accomplish this? Possibly a related question, how do I find out how the "drugInfoBox" works?

Finally, any other comments/questions/suggestions would be welcome. If people generally like this, then the next step on our end would be to write a bot to create 5-10 of these stubs for further comment. (And in case it's not clear, we're definitely Wikipedia newbies, so any and all suggestions are welcome...)

AndrewGNF 18:06, 7 March 2007 (UTC)
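To make the proposed bot step concrete, here is a minimal Python sketch of how stub wikitext might be assembled from database records while skipping titles that already exist. Only the template name GNF_Protein_box comes from this discussion; the record layout, the parameter names inside the `box` dict, and the functions `build_stub` and `plan_stubs` are hypothetical, and the sketch produces text only rather than editing any page:

```python
def build_stub(record):
    """Return stub wikitext for one gene record (a plain dict).

    record["box"] holds template parameters; their names here are
    placeholders, not the real parameters of GNF_Protein_box.
    """
    box_lines = [f" | {key} = {value}" for key, value in record["box"].items()]
    intro = (
        f"'''{record['name']}''' ('''{record['symbol']}''') "
        f"is a {record['species']} gene."
    )
    return "\n".join(["{{GNF_Protein_box", *box_lines, "}}", intro])


def plan_stubs(records, existing_titles):
    """Map new page titles to stub wikitext, skipping existing articles.

    Pages are titled by gene symbol here, which is an assumption.
    """
    return {
        record["symbol"]: build_stub(record)
        for record in records
        if record["symbol"] not in existing_titles
    }
```

Generating the text separately from posting it would let the first 5-10 stubs be reviewed by hand before any automated run.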

Excellent. I'm afraid I'm not a technical person, but I will try to help in any administrative or proofreading way I can. TimVickers 19:25, 7 March 2007 (UTC)

Having a look at Wikipedia:Bot policy might help; we could also ask for somebody to write it at Wikipedia:Bot requests. TimVickers 01:42, 9 March 2007 (UTC)

Thanks, we've definitely had a look at the Bot policy. Looks like we have the expertise here to handle it. Bandwidth is a little more uncertain, but I'm trying to get a couple interns to consider this project. If not, then perhaps I will post it over there as a bot request. (Or, if there is anyone here who's interested in collaborating on writing the bot, let me know...)
BTW, I moved the example template to ITK (gene)... AndrewGNF 22:29, 9 March 2007 (UTC)
Andrew, did you get the templates to tile as you intended? Now that she's back, you might want to ask Willow if it's still not working; she's done some nice and fancy template work. Opabinia regalis 00:04, 14 March 2007 (UTC)
Nope, still haven't gotten it resolved. Thanks for the tip -- I will see if Willow can work her magic... AndrewGNF 00:53, 15 March 2007 (UTC)

FYI, I figured out the whole nested table issue and updated my example gene (ITK). Since my last post here, I've also added a bunch of other information that we have in our database. It's not pretty, but I think it has a lot of useful information. I think we're close to finding a student to take on the project of writing the bot (in collaboration with the bioinformatics program at SDSU). As always, I'd love to get any feedback... AndrewGNF 01:12, 30 March 2007 (UTC)

Oh, and in full disclosure, I also just approached the CZ folks with the same idea. Personally I'm pretty agnostic with respect to where we do this, and it's certainly not mutually exclusive either. Anyway... the CZ forum post AndrewGNF 01:17, 30 March 2007 (UTC)

Matching page titles with HUGO names

A colleague and I were looking at the entry for his favorite protein, initially called Zif268 and later renamed as Egr1. Egr1 is now the official name at HUGO (HGNC:3238), and HUGO is the "official" source for gene names. Right now, Egr1 redirects to Zif268, but I wonder if we should reverse this so that Zif268 redirects to Egr1. We'd also need to update ZENK and Early Growth Response Protein 1, other alternate symbols/names which redirect to Zif268, to avoid double redirects. Two questions -- is this the correct thing to do, and is it an important thing to do? Presumably there are many cases where the main page is not found under the HUGO title, and perhaps this is another candidate for a bot to fix (and certainly our stub creation bot should be aware of the best practice in this regard). AndrewGNF 02:26, 10 March 2007 (UTC)

I think the rule is that the gene/protein should be found under its most common name. Obviously, for most genes this is a moot point, as they have no real common name. Otherwise, if a gene has a protein product that has a famous name (such as trypsin or PRSS1) then the info should really be integrated into any existing content at the trypsin page. In practice, 99% of human genes will have no Wikipedia entry, so I think adding the genes under the HUGO names and not trying to automatically fix redirects would be safest. As you say, changing a page to a redirect is best done manually, as there are dependencies in other pages that may need to be altered. TimVickers 05:05, 10 March 2007 (UTC)
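For anyone doing such a move by hand, the bookkeeping for the Zif268/Egr1 case can be written down as a small Python sketch. It only computes which redirects would need retargeting so that no double redirects remain; the function name `retarget_redirects` and the dict representation are assumptions, and it does not touch links in other pages:

```python
def retarget_redirects(redirects, old_title, new_title):
    """Return the redirect map after the article moves from old_title to new_title.

    redirects maps each redirect page to its current target. Every alias that
    pointed at old_title is pointed at new_title instead, old_title itself
    becomes a redirect, and new_title stops being one.
    """
    updated = {}
    for alias, target in redirects.items():
        if alias == new_title:
            continue  # the new title now holds the article itself
        updated[alias] = new_title if target == old_title else target
    updated[old_title] = new_title
    return updated


current = {
    "Egr1": "Zif268",
    "ZENK": "Zif268",
    "Early Growth Response Protein 1": "Zif268",
}
planned = retarget_redirects(current, old_title="Zif268", new_title="Egr1")
# planned now maps Zif268, ZENK and Early Growth Response Protein 1 to Egr1.
```

The output is a checklist rather than an action, in keeping with the point above that turning a page into a redirect is best done manually.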