User:FritzpollBot/FAQ

On this page, I hope to answer frequently asked questions about the bot. Further questions should be directed to User talk:Fritzpoll; those asked often enough will be added!

Technical

1. What is FritzpollBot for, and why bother?

The bot is designed to help complete Wikipedia's coverage of all populated settlements in the world. At present, a number of users create stub-level articles by hand to achieve the same aim, which leaves them less time to expand those articles and otherwise improve the encyclopaedia. This work should also help to correct a systematic bias in the English Wikipedia: geographical coverage currently concentrates primarily on settlements in the USA and the United Kingdom.

2. How does it work? Surely a bot can't do the work of a human being? There will be errors, disambiguation pages to make, etc. etc.

The bot uses internet sites to index the United States National Geospatial-Intelligence Agency database, which includes a comprehensive list of all populated places in the world.

Computer programs are essentially stupid tools. Imagine a can opener trying to decide which tin to open for your dinner: quite a bizarre concept. The bot serves simply to make the job easier, not replace the human element altogether. The process runs as follows:

  1. The bot extracts the data and uploads lists to the subpages of Wikipedia:WikiProject_Missing_encyclopedic_articles/Places. These lists name every article to be created, so that editors can manually check or disambiguate ("dab") them before the bot is run.
  2. The data is then checked by human editors. This includes checking for disambiguation requirements, any spellings inconsistent with Wikipedia's existing content, and any other issues.
  3. Once the check is complete, I am notified. The bot then scans the list and creates an article for every title that is red-linked. If a page already exists, the bot does nothing beyond writing a log for me of all the places it did not create.
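Step 3 can be illustrated with a minimal Python sketch. This is not the bot's actual code: the function name `run_creation_pass`, the `page_exists` and `create_page` callbacks, and the example titles are all hypothetical stand-ins, and the "wiki" here is just an in-memory set.

```python
from __future__ import annotations

from typing import Callable, Iterable


def run_creation_pass(
    titles: Iterable[str],
    page_exists: Callable[[str], bool],
    create_page: Callable[[str], None],
) -> tuple[list[str], list[str]]:
    """Create only red-linked titles and return (created, skipped)."""
    created: list[str] = []
    skipped: list[str] = []
    for title in titles:
        if page_exists(title):
            # The page already exists: leave it untouched and log it.
            skipped.append(title)
        else:
            # Red link: create the article.
            create_page(title)
            created.append(title)
    return created, skipped


# Hypothetical checked list, with an in-memory set standing in for the wiki.
wiki = {"Springfield"}
created, skipped = run_creation_pass(
    ["Springfield", "Exampleville", "Sampleton"],
    page_exists=lambda title: title in wiki,
    create_page=wiki.add,
)
print("created:", created)
print("skipped (already existed):", skipped)
```

Passing the existence check and the page creation in as callbacks keeps the sketch independent of any particular wiki library. The skipped list plays the role of the log described in step 3.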

As part of this process, I have encouraged the involvement of WikiProjects in these areas: as I upload lists and common errors are checked, I hope to contact all interested parties. This will allow consistency within a project, more specialised templates, categories, etc. Involving others will also make it more likely that we can find extra data, such as census data, that can be built into the article from day one. I am happy to include this where available, provided it is in a readable, accessible and publicly available format.

3. Can't you add more data?

Yes! Provided it is in a readable format from a reliable source, anything can be included automatically. Part of the reason for involving as many editors as possible is the hope that more sources can be found to add details such as population, elevation, etc.

4. How many articles?

My current estimate based on existing coverage of settlements within Wikipedia is 1.8 million.

5. How long will that take?

Given the element of human intervention, plus the fact that the bot is closely supervised (so I need to be here to assign it certain blocks of tasks), I suspect it will take more than a year to complete. This is significantly quicker, however, than individual editors extracting the data manually and creating the articles by hand.

6. What about server space, the performance hit, etc.?

I don't worry about performance, and neither should you: see Wikipedia:Don't worry about performance.

7. What about the Random Page feature?

These additions will expand the encyclopaedia significantly, almost doubling its size from today's value, which raises the concern that the Random Page feature will mostly bring up these stubs. Three replies: a) Good, it means the stubs should be expanded even faster. b) Wikipedia will probably be much bigger in a couple of years anyway, diluting these articles as a percentage of the whole. c) The encyclopaedia should be comprehensive - if that means having to click the random page button N times instead of once to get a non-geographical article, then I see this as a net benefit nonetheless.
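To put a rough number on N, here is a back-of-the-envelope sketch. It assumes that Random Page picks uniformly among all articles and, as a simplification, treats every pre-existing article as the kind you are looking for. The 1.8 million figure is the estimate given above. The figure for existing articles is a hypothetical round number chosen only for illustration, and the function name `expected_clicks` is likewise made up.

```python
def expected_clicks(existing_articles: int, new_stubs: int) -> float:
    """Expected Random Page clicks until a pre-existing article appears.

    Under uniform, independent draws, the number of clicks follows a
    geometric distribution with mean 1 / p, where p is the share of
    pre-existing articles.
    """
    total = existing_articles + new_stubs
    p_existing = existing_articles / total
    return 1 / p_existing


new_stubs = 1_800_000  # estimate from this FAQ
existing = 2_000_000   # hypothetical round figure, for illustration only
print(round(expected_clicks(existing, new_stubs), 2))
```

With these assumed inputs the average comes out at just under two clicks, which is what "almost doubling" the encyclopaedia would suggest. As Wikipedia grows (reply b), the figure drifts back towards one.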

8. Don't you have to get approval?

From a technical perspective, no further approval is needed. The bot was approved by the Bot approvals group after a trial run, and the account is flagged as a bot to prevent new page patrollers being swamped during an article creation run. The bot will, however, only create lists until it is demonstrated that consensus does not oppose the bot as programmed.