User:WikipedianProlific/PS

From Wikipedia, the free encyclopedia

Prolificity Sentinel

Basic Outline

I am in the early stages of designing and building a bot called Prolificity Sentinel. Its function will be to alert Wikipedians to articles that have a high chance of being obsolete or degraded by vandalism. Now that Wikipedia is getting older, the attention its articles receive resembles a normal distribution. Those in the middle are frequently checked and gain credibility with successive edits. Those at the edges (the outliers) are rarely, if ever, checked by Wikipedians and degrade into a poor or non-existent state.

Whilst most articles fit into the middle, some remain outliers that are checked infrequently, if ever, and whose authors have long since abandoned them. These articles are highly vulnerable to vandalism, both accidental and deliberate. The Prolificity Sentinel will scan the most vulnerable articles, identify whether they qualify as outliers of the normal distribution and flag them for attention.

Specification

It will perform the following role:

Flag articles meeting the following criteria:

  • Neither the article nor its talk page has been edited for at least 6 months.
  • The article contains at least 1000 characters (a measure to avoid flagging stubs and minor articles, such as most biographies).
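
The two criteria above can be sketched as a single check. This is only a rough illustration, not the bot's actual code: the function name should_flag, the 183-day approximation of 6 months and the handling of a missing talk page are all my own assumptions.

```python
from datetime import datetime, timedelta
from typing import Optional

# Assumed values; the thresholds are meant to stay adjustable.
STALE_AFTER = timedelta(days=183)  # roughly 6 months
MIN_CHARS = 1000                   # shorter pages are treated as stubs


def should_flag(last_article_edit: datetime,
                last_talk_edit: Optional[datetime],
                char_count: int,
                now: datetime) -> bool:
    """Return True when a page meets both flagging criteria."""
    cutoff = now - STALE_AFTER
    # Article and talk page must both be stale, so only the newer edit matters.
    # Assumption: a page with no talk page is judged on the article alone.
    if last_talk_edit is None:
        most_recent = last_article_edit
    else:
        most_recent = max(last_article_edit, last_talk_edit)
    return most_recent <= cutoff and char_count >= MIN_CHARS
```

Passing the current time in as an argument, rather than reading the clock inside the function, makes the check easy to try out against fixed dates.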


The bot will then add all of these articles to a category viewable by all Wikipedians, called ‘Sentinel Alerts’. People can then follow the links from the category's main page and check that the flagged articles are intact and up to standard. If they are, Wikipedians will be advised to remove the ‘Sentinel Alerts’ category from the article. In doing so they will have edited the page, so the Sentinel will not flag it again until another 6 months have passed with no further edits.

It is important that the Sentinel adds the ‘Sentinel Alerts’ category to an article when it finds it. This requires editing the article, which means the Sentinel won’t flag the same article twice in one run (or in subsequent runs) unless the article has met the criteria again, which would take at least 6 months.

Computation Process

For this to work the bot will need to perform the following steps in order:

  • Login
  • Scan the database for articles, ordered by creation date if possible
  • Starting with the oldest (and thus most likely to be degraded), ask the following of each article:
  • Is the last edit to the article itself older than 6 months?
  • Is the last talk page edit older than 6 months?
  • Is the total character count at least 1000?
  • If the answer to all three is yes, add the ‘Sentinel Alerts’ category to the article and save.
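
As a rough sketch of the steps above (leaving out login and any real database access), the loop might look something like this. The Page record, the run_sentinel function and the meets_criteria callback are hypothetical names of my own; the only real syntax is the wikitext category link. Pages are held in memory purely for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Iterable, List

ALERT_TAG = "[[Category:Sentinel Alerts]]"  # wikitext that puts a page in the category


@dataclass
class Page:
    """Hypothetical in-memory stand-in for an article record."""
    title: str
    created: datetime
    last_edit: datetime
    text: str


def run_sentinel(pages: Iterable[Page],
                 meets_criteria: Callable[[Page], bool],
                 now: datetime) -> List[str]:
    """Tag qualifying pages, oldest first, and return their titles."""
    flagged = []
    for page in sorted(pages, key=lambda p: p.created):
        if ALERT_TAG in page.text:
            continue  # already waiting for a human to review it
        if meets_criteria(page):
            page.text += "\n" + ALERT_TAG
            page.last_edit = now  # tagging is itself an edit, so the clock resets
            flagged.append(page.title)
    return flagged
```

Because tagging updates the last edit time, a second run over the same pages flags nothing new, which is the behaviour described in the specification.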

Obviously this bot could in principle access all 1.2 million pages of the Wikipedia database. This is simply not necessary. It can drastically cut down on its server use (and thus avoid hogging the servers or causing an accidental DoS) by not checking articles created in the last 6 months. Criteria such as how long a page must have gone unedited, how many characters it must have and how long ago it was created must all be variable; the bot may need to adapt if too few or, more likely, too many articles are flagged by these criteria.
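
One way the variable criteria and the creation-date shortcut might fit together is sketched below. SentinelConfig, its field names, its default values and the candidates function are all assumptions made for illustration; the input is taken to be a simple list of (title, creation date) pairs.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, Iterator, Tuple


@dataclass(frozen=True)
class SentinelConfig:
    """Adjustable thresholds; the defaults mirror the figures proposed here."""
    min_page_age: timedelta = timedelta(days=183)  # skip pages created more recently
    stale_after: timedelta = timedelta(days=183)   # used by the later edit-date checks
    min_chars: int = 1000                          # used by the later length check


def candidates(creation_dates: Iterable[Tuple[str, datetime]],
               config: SentinelConfig,
               now: datetime) -> Iterator[str]:
    """Yield titles old enough to justify the more expensive per-page checks."""
    newest_allowed = now - config.min_page_age
    for title, created in creation_dates:
        if created <= newest_allowed:
            yield title
```

If too many articles turn out to be flagged, a stricter configuration such as SentinelConfig(stale_after=timedelta(days=365)) could be swapped in without touching the rest of the logic.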

Thoughts & Summary

It is my hope that if the bot can get off the ground and perform the above function, it can then be extended. For example, future versions could offer more autonomy and require less user interaction, such as scanning articles for specific content (e.g. swear words, or pages consisting of a single character and nothing more) and flagging them for immediate action in a high-priority ‘Sentinel Alerts’ category.
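
A very rough sketch of what such a content scan could look like follows. The function name needs_urgent_review and the idea of passing in a watch list are assumptions, as is treating a page of at most one character as suspicious. Matching whole words only is a deliberate choice, so that a listed word hidden inside a longer, innocent word does not trigger an alert.

```python
import re
from typing import Iterable


def needs_urgent_review(text: str, watch_words: Iterable[str]) -> bool:
    """Return True for near-empty pages or pages containing a watched word."""
    stripped = text.strip()
    # Assumption: a page reduced to one character (or nothing) has been blanked.
    if len(stripped) <= 1:
        return True
    # Compare against whole lowercase words rather than raw substrings.
    words = set(re.findall(r"[a-z']+", stripped.lower()))
    return any(word.lower() in words for word in watch_words)
```

For example, with a watch list of ["darn"], the text "This is a darn mess." would be picked up, while "The Darnley family lived here." would not.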

Obviously Prolificity Sentinel will not itself edit the content of articles or make them any better; it cannot, as it is simply a series of computational functions. But it will enable a group of Wikipedians to identify the articles most likely to be in a compromised state and bring them back towards the well-maintained middle of the distribution. Hopefully the end result will be a distribution in which only a minute percentage of articles are unreliable outliers.

Please realise that at the moment this bot is by and large a pipe dream, likely to be a month or two down the line, if the concept even proves to be doable. But if we can successfully code such a bot, we should find that it revolutionises the way Wikipedia articles are looked at, by giving them a time stamp and ensuring that 99% of articles do not fall into a state of disrepair. This will greatly add to the credibility of Wikipedia by ensuring finished and old articles do not disappear or become destroyed.