User:ProteinBoxBot
Proposal overview (brief version)
This bot will create or amend up to ~10,000 pages corresponding to mammalian genes. Pages will be created in groups of ~100 to ensure page quality. Each new page will be seeded with content from public-domain databases. This content will include the gene's symbol, description, function, genomic location, structure and identifiers. Pages will be created for genes that have no existing WP page for their symbol, aliases, or title (e.g., MMP9). Genes that do have such conflicts in the WP namespace will be flagged for manual integration (e.g., Apolipoprotein_E). More details are presented in User:ProteinBoxBot/Ideas. This bot is currently being designed and developed by AndrewGNF and JonSDSUGrad. The list of all pages which were created or edited with ProteinBoxBot content is shown here.
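The create-or-flag rule described above can be sketched roughly as follows. This is only an illustration, not the bot's actual code: the function name `plan_action`, its parameters, and the in-memory set of existing page titles are all assumptions introduced here.

```python
def plan_action(symbol, aliases, title, existing_titles):
    """Decide what to do for one gene (hypothetical helper).

    Returns ("create", []) when none of the gene's names is already a
    page title, otherwise ("flag", conflicts) so that a human can
    integrate the content manually.
    existing_titles is an assumed set of titles already on the wiki.
    """
    candidates = [symbol, *aliases, title]
    conflicts = []
    for name in candidates:
        # Record each conflicting name once, in the order it was checked.
        if name in existing_titles and name not in conflicts:
            conflicts.append(name)
    if conflicts:
        return ("flag", conflicts)
    return ("create", [])
```

In this sketch a gene whose symbol, aliases, and title are all unused yields `("create", [])`; any single match is enough to flag the gene instead.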
Requests
Please add any requests for specific genes to the requests page.
- A ProteinBoxBot link to "Bioinformatic Harvester" would be nice
- Example directly via IPI: http://harvester.fzk.de/harvester/human/IPI00218/IPI00218982.htm
- Example via query: http://harvester.fzk.de/cgi-bin/human/search.cgi?zoom_query=brca1
..a very nice bot by the way...thanks :-) Ivo (talk) 05:39, 15 May 2008 (UTC)
Trial Run
A trial run for this bot was approved. The trial was completed and the log is here: User:ProteinBoxBot/PBB_Log_Wiki_Live_Run.
After making quite a few adjustments, a second trial run was completed and the log file is here: User:ProteinBoxBot/PBB_Log_Wiki_Live_Run3_Char_Fix.
The bot was subsequently approved and granted bot status.
The eight pages created by the ProteinBoxBot in the trial are:
AKT1 | HIF1A | MAPK1 | MMP9 |
NFKB1 | PPARG | PTGS2 | TGFB1 |
In addition, these pre-existing pages were supplemented with ProteinBoxBot content in a semi-automated edit:
The discussion of the ProteinBoxBot's trial run is archived at Wikipedia:Bots/Requests_for_approval/ProteinBoxBot.
Logic Flow
The following flow charts describe the logic of Protein Box Bot:
Links
Protein Box Bot logs its activities extensively.
- Bot Log File: User:ProteinBoxBot/PBB_Log_Index
Protein Box Bot does not always know the exact name of a protein page. This page has been created to help with that.
- Bot Page Directory: User:ProteinBoxBot/Protein_Directory
Protein Box Bot Quick Manual
When dealing with wiki pages, it is often difficult to determine automatically how and what to update, especially for a bot. Therefore a group of templates was created to ensure that Protein Box Bot behaves appropriately and does not overwrite any information without permission. The templates provide update options and editing boundaries. The templates are described below:
Template: PBB_Controls (Required)
PBB_Controls does not display any information on the gene page; its sole purpose is to provide update options for PBB. PBB cannot update a gene page that is missing this template. (See the template page for further details.)
Template: PBB_Summary
This template contains the Entrez summary for the gene. If no summary is available, the template is left blank. It is suggested that a blank template be left on the gene page to provide a location for possible future summary updates. During an update, all information in this template is overwritten. See Template:PBB_Summary for more information.
Template: GNF_Protein_box
GNF_Protein_box is the core template updated by PBB. The majority of the information provided by PBB is placed in this template. While it is possible to exclude this template from a gene page, it is not recommended.
During an update, all information in the protein box is overwritten (even with blank values) with the exception of 'image' and 'image_source', which are carried over into the new box. Only if those fields are blank will the bot try to locate an image. Default image file names follow this format:
PBB_Protein_<protein symbol>_image.jpg
Where <protein symbol> is the actual symbol for the protein (such as PBB_Protein_AKT1_image.jpg).
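The overwrite rule and the default file name can be sketched as below. This is an illustrative sketch only: representing a protein box as a dictionary, and the names `default_image_name` and `merge_protein_box`, are assumptions and not the bot's actual code.

```python
# Fields that survive an update when they already hold a value.
PRESERVED_FIELDS = ("image", "image_source")


def default_image_name(symbol):
    """Build the default image file name for a protein symbol."""
    return f"PBB_Protein_{symbol}_image.jpg"


def merge_protein_box(old_box, new_box):
    """Return the box contents after an update (hypothetical helper).

    Every field is taken from new_box, blank values included, except
    the preserved fields, which are carried over from old_box when
    they are not blank.
    """
    merged = dict(new_box)
    for field in PRESERVED_FIELDS:
        old_value = (old_box.get(field) or "").strip()
        if old_value:
            merged[field] = old_value
    return merged
```

For example, `default_image_name("AKT1")` returns `PBB_Protein_AKT1_image.jpg`, matching the format given above. In this sketch, a page whose existing box already names an image keeps that image regardless of what the freshly generated box contains.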
Template: PBB_Further_reading
PBB_Further_reading is the template that PBB uses to store citation information. All entries within this template are overwritten when PBB does an update.
TAG: No Bots (Optional)
<!-- NO BOT EDITS -->
OR
{{nobots}}
Either tag causes the bot to skip the page. Because the tag stops the bot from operating on that page, it is optional and is not required for bot operation.
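A check for the two tags, combined with the PBB_Controls requirement described earlier, might look like the following sketch. The function name `may_update`, the regular expressions, and the tolerance for extra whitespace and letter case are assumptions made for illustration; the bot's real matching rules are not documented on this page.

```python
import re

# Assumed patterns; the bot may match these tags more or less strictly.
NO_BOT_COMMENT = re.compile(r"<!--\s*NO BOT EDITS\s*-->", re.IGNORECASE)
NOBOTS_TEMPLATE = re.compile(r"\{\{\s*nobots\s*(\|[^{}]*)?\}\}", re.IGNORECASE)
PBB_CONTROLS = re.compile(r"\{\{\s*PBB[_ ]Controls\b", re.IGNORECASE)


def may_update(wikitext):
    """Return True if a page looks eligible for a bot update.

    A page is skipped when it carries either no-bots tag, and it also
    cannot be updated when the PBB_Controls template is missing.
    """
    if NO_BOT_COMMENT.search(wikitext) or NOBOTS_TEMPLATE.search(wikitext):
        return False
    return bool(PBB_CONTROLS.search(wikitext))
```

Checking the opt-out tags first means a tagged page is skipped even when PBB_Controls is present.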