User:The Anomebot2

From Wikipedia, the free encyclopedia


Note: Blocking will stop further edits. The bot will intermittently retry errors for several minutes, but should then automatically shut itself down until restarted manually; please use a block of ten minutes or longer to be sure of stopping it.

This bot is designed to add standardized machine-readable geodata records to relevant articles in the English-language Wikipedia, using data from GNS, GNIS, OSGB coordinates in UK articles, plaintext geodata scraped from article text, and interwiki-linked geotag data from other-language Wikipedias. -- The Anome 12:13, 22 September 2007 (UTC)


Status

100,000+ geotags added to date. -- The Anome (talk) 23:27, 11 May 2008 (UTC)

To do

  • Standardize existing geotags.
  • Scan for unusual/broken parameters in infoboxes.
  • Start work on standardizing infoboxes.

-- The Anome 12:12, 22 September 2007 (UTC)

Forthcoming attractions

With ~70,000 data points, I now have enough data to do a spatial analysis of the category tree, and to generate lists of possibly misclassified or mislocated outliers. The cleaned-up bounding data could then be used to build a Bayesian classifier for future work. -- The Anome 10:14, 24 August 2007 (UTC)
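
As an illustration only, the outlier listing described above might be sketched as follows. This is not the bot's actual code: the function names, the median-based centre, and the `factor` and `min_km` thresholds are all assumptions chosen for the example. It takes the geotagged articles in one category and flags any whose distance from the category's median centre is far larger than is typical for that category.

```python
import math
from statistics import median

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def find_outliers(points, factor=5.0, min_km=50.0):
    """Return titles of points lying unusually far from the category centre.

    points: list of (title, lat, lon) tuples for a single category.
    factor, min_km: hypothetical tuning knobs, not values from the bot.
    """
    if len(points) < 3:
        return []  # too few points to say what is "typical"
    # Median centre; note this naive version misbehaves for categories
    # that straddle the 180-degree meridian.
    centre_lat = median(p[1] for p in points)
    centre_lon = median(p[2] for p in points)
    dists = [(title, haversine_km(centre_lat, centre_lon, lat, lon))
             for title, lat, lon in points]
    typical = median(d for _, d in dists)
    threshold = max(min_km, factor * typical)
    return [title for title, d in dists if d > threshold]
```

A category of villages in Norfolk, England, that also contained an article geotagged in Norfolk, Virginia, would have that one article flagged for human review. The medians are used rather than means so that a single badly misplaced point does not drag the centre towards itself.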

Current problems

Because of severe name ambiguity problems, Japanese locations are now filtered out of most machine-matched geodata sets.

Recent Canadian data has had similar problems, and is now also filtered from the output of several matching algorithms. -- The Anome 12:17, 22 September 2007 (UTC)
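
A minimal sketch of this kind of filtering, under stated assumptions: the record layout, the two-letter country codes, and the function name below are hypothetical, and the extra rule that drops any title matching more than one gazetteer entry is an illustrative guess at how name ambiguity could be handled, not a description of the bot's matching algorithms.

```python
# Hypothetical exclusion list using ISO 3166-1 alpha-2 codes (Japan, Canada).
EXCLUDED_COUNTRIES = frozenset({"JP", "CA"})

def filter_matches(candidates, excluded=EXCLUDED_COUNTRIES):
    """Keep only candidate matches that are unambiguous and not excluded.

    candidates: list of dicts with at least 'title' and 'country' keys,
    one dict per (article title, gazetteer entry) match.
    """
    by_title = {}
    for cand in candidates:
        by_title.setdefault(cand["title"], []).append(cand)

    kept = []
    for title, group in by_title.items():
        if len(group) != 1:
            continue  # same name matched several places: too ambiguous
        record = group[0]
        if record["country"] in excluded:
            continue  # country currently filtered from machine-matched sets
        kept.append(record)
    return kept
```

Dropping ambiguous matches entirely trades coverage for accuracy, which seems the safer default for a bot writing coordinates into articles unattended.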