User:ProteinBoxBot/Specs
From Wikipedia, the free encyclopedia
ProteinBoxBot specs
- Content has already been assembled as part of a non-WP project. Data will be provided to ProteinBoxBot as an XML or CSV file. Images will be provided in a zip file or local directory.
- For each mammalian gene with significant available annotation, a new gene page will be created that corresponds to the HUGO-approved symbol.
- If a page with that name already exists:
  - If the page contains a Protein infobox or GNF_Protein_box, then changes will overwrite the previous infobox but leave the surrounding content intact.
  - If not, the gene will be flagged for manual review: the bot will write a log entry and proceed to the next gene.
- An image (when available from RCSB according to public domain use) will be uploaded.
- A protein infobox will be created and populated with relevant data. (Manually-created example: ITK (gene))
- A redirect will be created from the full gene name. (For example: IL2-inducible T-cell kinase)
  - If a page with the full gene name already exists, the gene will be flagged for manual review.
- A free-text summary from the NCBI page will be included, with wikilinks added where appropriate.
- A references section will be created based on gene2pubmed and/or GeneRIFs.
- In the trial phase, only 10 gene pages will be created. If more data is needed to define how much information a useful stub requires, a secondary trial period of ~100 genes will be proposed.
- The bot will check User_talk:ProteinBoxBot and stop editing if there are any new messages.
- Bot will cap edits at 10 per minute.
- New protein infoboxes will contain a notice that changes can/will be overwritten by further bot updates.
- If the bot encounters an agreed flag (e.g., "<!-- NO_BOT_EDITS -->"), the entry will be logged and skipped.
- The bot will maintain a log of all edits and edit times.
- All modified pages will be added to ProteinBoxBot's watchlist to track further page edits.
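
The per-page decision rules and the edit cap above could be sketched roughly as follows. This is a minimal illustration in Python, not the bot's actual code: the names `decide_action`, `EditThrottle`, `INFOBOX_RE`, and the returned action strings are hypothetical, and the regular expression assumes the infobox templates appear in wikitext as `{{Protein ...}}` or `{{GNF_Protein_box ...}}`. It also assumes the no-edit flag takes precedence over the infobox check, which the spec does not state explicitly.

```python
import re
import time

# Agreed opt-out flag, quoted from the spec.
NO_BOT_FLAG = "<!-- NO_BOT_EDITS -->"

# Assumption: the two infobox templates open as {{Protein ...}} or
# {{GNF_Protein_box ...}} (underscores or spaces, any letter case).
INFOBOX_RE = re.compile(
    r"\{\{\s*(GNF[_ ]Protein[_ ]box|Protein)\s*[|}\n]", re.IGNORECASE
)


def decide_action(page_text):
    """Return the action for one gene page (hypothetical action names).

    page_text is None when no page exists under the gene symbol.
    """
    if page_text is None:
        return "create"             # no page yet: create a new stub
    if NO_BOT_FLAG in page_text:
        return "skip"               # log the entry and skip the page
    if INFOBOX_RE.search(page_text):
        return "overwrite_infobox"  # replace infobox, keep other content
    return "manual_review"          # log the entry, go to the next gene


class EditThrottle:
    """Space edits at least 60 / max_per_minute seconds apart."""

    def __init__(self, max_per_minute=10, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 60.0 / max_per_minute
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous edit, if any

    def wait(self):
        """Block until the next edit is allowed, then record its time."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

In use, the bot would call `EditThrottle.wait()` immediately before each edit and write `decide_action`'s result to its log. The `clock` and `sleep` parameters exist only so the throttle can be exercised without real delays.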