User:AlexNewArtBot

This user account is a bot operated by Alex Bakharev (talk).

It is not a sock puppet, but rather an automated or semi-automated account for making repetitive edits that would be extremely tedious to do manually.
Administrators: if this bot is malfunctioning or causing harm, please block it.


This is an account for my experiments in creating a "New Article Bot".

It is supposed to patrol New Articles and put relevant articles into the New Articles lists of Portals and Projects.

It has not received Bot Approval yet. All edits are made in semi-automatic mode.

Please direct all questions to the bot owner. Alex Bakharev 03:54, 27 January 2007 (UTC)

Currently supported

The bot currently supports (in chronological order):

  1. Portal:Russia/New article announcements (Search result, Log, Rules)
  2. Portal:Ukraine/New article announcements (Search result, Log, Rules)
  3. Portal:Poland/New article announcements (Search result, Log, Rules)
  4. Portal:Belarus/New article announcements (Search result, Log, Rules)
  5. Wikipedia:Baltic_States_notice_board#New_articles (Search result, Log, Rules)
  6. Wikipedia:WikiProject_Lithuania#New_articles_related_to_Lithuania (Search result, Log, Rules)
  7. Portal:Georgia (country)/New article announcement (Search result, Log, Rules)
  8. Portal:Armenia/New_article_announcements (Search result, Log, Rules)
  9. Portal:Azerbaijan/New article announcements (Search result, Log, Rules)
  10. Wikipedia:New articles (Australia) (Search result, Log, Rules)
  11. Wikipedia:WikiProject Turkey/New article announcements (Search result, Log, Rules)
  12. Wikipedia:Iranian_Wikipedians'_notice_board#New_Wikipedia_articles_related_to_WikiProject_Iran (Search result, Log, Rules)
  13. Portal:Romania/New article announcements (Search result, Log, Rules)
  14. Wikipedia:New articles (New Zealand) (Search result, Log, Rules)
  15. Wikipedia:Swedish Wikipedians' notice board/New articles (Search result, Log, Rules)
  16. Wikipedia:WikiProject_Assyria#New_Articles_Notifications (Search result, Log, Rules)
  17. Wikipedia:Africa-related regional notice board/New articles (Search result, Log, Rules)
  18. Wikipedia:WikiProject Greece/New articles (Search result, Log, Rules)
  19. Wikipedia:WikiProject China/New articles (Search result, Log, Rules)
  20. rather pathetic "Bad list" (Search result, Log, Rules)
  21. even more pathetic "Good List" (Search result, Log, Rules)
  22. Portal:Karnataka/New Pages (Search result, Log, Rules)
  23. Wikipedia:WikiProject Finland (Search result, Log, Rules)
  24. Wikipedia:WikiProject Netherlands (Search result, Log, Rules)
  25. Wikipedia:WikiProject Albania (Search result, Log, Rules)
  26. Portal:Cricket/New Articles (Search result, Log, Rules)
  27. Wikipedia:WikiProject_Czech_Republic/New_article_announcements (Search result, Log, Rules)
  28. Portal:Hungary (Search result, Log, Rules)
  29. WP:PLT (Search result, Log, Rules)
  30. Wikipedia:WikiProject_Bulgaria#New_articles (Search result, Log, Rules)
  31. Wikipedia:WikiProject France/New article announcements (Search result, Log, Rules)
  32. Portal:Gardening (Search result, Log, Rules)
  33. WP:PLANTS (Search result, Log, Rules)
  34. Portal:Literature (Search result, Log, Rules)
  35. Portal:Italy (Search result, Log, Rules)
  36. Law articles (Search result, Log, Rules)
  37. Wikipedia:WikiProject Military history/New articles (Search result, Log, Rules)
  38. Portal:Film (Search result, Log, Rules)
  39. WP:WPE&R (Search result, Log, Rules)
  40. Portal:Chemistry (Search result, Log, Rules)
  41. Wikipedia:WikiProject Sheffield/New articles (Search result, Log, Rules)
  42. WP:ARTH (Search result, Log, Rules)
  43. Wikipedia:WikiProject Energy (fossil fuels) (Search result, Log, Rules)
  44. Wikipedia:WikiProject Architecture (Search result, Log, Rules)
  45. Wikipedia:Conflict of interest/Noticeboard (Search result, Log, Rules)
  46. Wikipedia:WikiProject Opera (Search result, Log, Rules)
  47. Wikipedia:WikiProject_Ballet#New_articles (Search result, Log, Rules)
  48. Wikipedia:WikiProject hip hop (Search result, Log, Rules)
  49. Wikipedia:WikiProject Albums (Search result, Log, Rules)
  50. Wikipedia:WikiProject Eastern Orthodoxy (Search result, Log, Rules)
  51. Wikipedia:WikiProject Trains (Search result, Log, Rules)
  52. WP:EDUCATION (Search result, Log, Rules)
  53. Hydrology (Search result, Log, Rules)
  54. Wikipedia:WikiProject Israel (Search result, Log, Rules)
  55. Wikipedia:WikiProject Energy (nuclear) (Search result, Log, Rules)
  56. Wikipedia:WikiProject Bridges (Search result, Log, Rules)
  57. Wikipedia:WikiProject Islam (Search result, Log, Rules)
  58. Wikipedia:WikiProject California (Search result, Log, Rules)
  59. Wikipedia:WikiProject Microbiology (Search result, Log, Rules)
  60. Portal:Cooking (Search result, Log, Rules)
  61. Wikipedia:WikiProject Philosophy (Search result, Log, Rules)
  • For announcements of new feeds use {{subst:User:AlexNewArtBot/Announcement|key}}
  • Master configuration

How to add feeds to the new article bot

Yes, you can add new feeds for the bot yourself. Here I describe how to do it. It is a little bit tricky, so if you are unsure what you are doing, you had better ask the bot's owner.

Steps to create a bot feed:

Select a name for the new feed

The name must not already be used by another feed. It should not contain spaces or other non-letter symbols. It should be unambiguous, e.g. UK is not good: is it Ukraine or the United Kingdom? It should be reasonably short, and you should be able to spell it consistently throughout the next few steps. E.g. Pneumonoultramicroscopicsilicovolcanoconiosis is not a good name.

Announce the new feed on this page

Put the template {{Subst:User:AlexNewArtBot/NewFeed|FeedName|Portal Name}} at the bottom of the #Currently supported section of this page. Here FeedName stands for the name you have selected and Portal Name is the name of the Portal or Project page that carries the feed (Portal:### or Wikipedia:WikiProject ###). The template creates redlinks for the Rules, Search result and Log of the new feed. The Portal Name is linked automatically, so there is no need to put it in square brackets.

Compile the rules

Well, this is the trickiest part: you have to provide some rules for the bot. Each rule has a numerical value (which may be negative). The values of all the rules that apply to the article are added together to get a score. If a rule matches the lead of the article, the points for that rule are doubled. If the final score is above the threshold, the article is in.
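The scoring described above can be sketched in a few lines of Python. This is only an illustration, not the bot's actual code: the RuleHit type and the function names are hypothetical, and the sketch assumes that "above the threshold" means strictly greater (whether a score exactly equal to the threshold qualifies is not stated here).

```python
from dataclasses import dataclass


@dataclass
class RuleHit:
    """One rule that applies to an article (hypothetical helper type)."""
    points: int          # value of the rule, may be negative
    matches_lead: bool   # True if the rule also matched the lead


def article_score(hits):
    """Add up the points of all applicable rules, doubling lead matches."""
    return sum(h.points * 2 if h.matches_lead else h.points for h in hits)


def is_listed(hits, threshold):
    # Assumption: "above the threshold" is read as strictly greater.
    return article_score(hits) > threshold


hits = [
    RuleHit(points=7, matches_lead=True),    # contributes 14
    RuleHit(points=5, matches_lead=False),   # contributes 5
    RuleHit(points=-4, matches_lead=False),  # contributes -4
]
print(article_score(hits))            # 15
print(is_listed(hits, threshold=14))  # True
```

Note how the 7-point rule alone would stay below a 14-point threshold if it matched only the body, but reaches 14 once it matches the lead.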

Both the threshold and the rules are written on the rules page, one line per rule (and one line for the threshold).

The threshold is specified as

@@number@@

where number is the threshold (duh). If omitted, the default threshold is 10 points. E.g.

@@14@@

means a threshold of 14 points.
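As a rough sketch of how a threshold line could be read from a rules page: the function name is hypothetical, and tolerating surrounding whitespace and using the first threshold line found are assumptions of this example, not documented behaviour of the bot.

```python
import re

DEFAULT_THRESHOLD = 10  # default stated on this page

# A line consisting only of @@number@@ (surrounding whitespace tolerated
# as an assumption of this sketch).
THRESHOLD_LINE = re.compile(r"^\s*@@\s*(-?\d+)\s*@@\s*$")


def read_threshold(rules_text):
    """Return the threshold from the first @@number@@ line, else the default."""
    for line in rules_text.splitlines():
        m = THRESHOLD_LINE.match(line)
        if m:
            return int(m.group(1))
    return DEFAULT_THRESHOLD


print(read_threshold("@@14@@\n7 /Petersburg/ , /Florida/"))  # 14
print(read_threshold("7 /Petersburg/ , /Florida/"))          # 10
```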

Each rule has the format:

Points /Pattern that we should have/ , /Inhibitor1/ , /Inhibitor2/ , /Inhibitor3/ ...

Points is the number of points for the rule; if omitted, the rule is worth 10 points by default. Do not forget that the points are doubled if the lead is matched. Pattern that we should have is a Perl-style regular expression that must match the text of the article for the rule to fire. The inhibitors are patterns that "inhibit" the rule, making it inactive even if Pattern that we should have is matched. E.g. when creating the rules for Russia-related articles I want to include Saint Petersburg, the second largest city in Russia. On the other hand, many American articles mention Saint Petersburg, Florida, and other American cities. Thus, I might want to decrease the value of the rule and inhibit it completely if Florida is mentioned:

7 /Petersburg/ , /Florida/
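A minimal Python sketch of how such a rule line could be parsed and applied is shown below. It is not the bot's implementation: the Rule type and function names are hypothetical, Python's re module stands in for Perl regular expressions (the two agree for simple patterns like these), and the sketch assumes that any / inside a pattern is escaped with a backslash.

```python
import re
from dataclasses import dataclass, field

DEFAULT_POINTS = 10  # default stated on this page

# One /.../ segment; a backslash-escaped character may appear inside it.
SLASHED = re.compile(r"/((?:\\.|[^/\\])*)/")


@dataclass
class Rule:
    points: int
    pattern: str
    inhibitors: list = field(default_factory=list)


def parse_rule(line):
    """Split 'Points /Pattern/ , /Inhibitor1/ , ...' into its parts."""
    head = re.match(r"\s*(-?\d+)?\s*", line)  # optional leading points
    points = int(head.group(1)) if head.group(1) else DEFAULT_POINTS
    parts = SLASHED.findall(line[head.end():])
    if not parts:
        raise ValueError(f"no /pattern/ found in rule: {line!r}")
    return Rule(points, parts[0], parts[1:])


def rule_fires(rule, text):
    """The rule fires if its pattern matches and no inhibitor matches."""
    if not re.search(rule.pattern, text):
        return False
    return not any(re.search(inh, text) for inh in rule.inhibitors)


rule = parse_rule("7 /Petersburg/ , /Florida/")
print(rule_fires(rule, "The Hermitage is a museum in Saint Petersburg."))       # True
print(rule_fires(rule, "St. Petersburg is a city in Florida, United States."))  # False
```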

In general, the name of a country or of its capital is often mentioned in unrelated articles (e.g. somebody travelled there). But the lead of an unrelated article rarely mentions it, so usually we want the name of the country to be worth less than the threshold but more than half the threshold: a mention in the body alone is then not enough, while a mention in the lead, being doubled, is.

Categories are usually friends of the bot (if only all new article writers used them!), so they deserve a value above the threshold.

Note that \W (uppercase only; it matches a single non-word character) is needed to mark a word boundary. Without it, a rule can match any part of a word.
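The effect of \W can be tried out with Python's re module, which treats \W the same way as Perl for plain ASCII text. The sample sentences and the helper name below are made up for illustration. Because \W has to match an actual character, the pattern also fails when the word sits at the very start or end of the text; whether the bot pads the article text to avoid this is not stated here.

```python
import re


def matches(pattern, text):
    """True if the pattern is found anywhere in the text."""
    return re.search(pattern, text) is not None


in_sentence = "Lake Chad lies in the Sahel."
inside_word = "Chadwick discovered the neutron."
at_text_end = "The river flows into Lake Chad"

print(matches(r"Chad", in_sentence))      # True
print(matches(r"Chad", inside_word))      # True, although unrelated
print(matches(r"\WChad\W", in_sentence))  # True
print(matches(r"\WChad\W", inside_word))  # False
print(matches(r"\WChad\W", at_text_end))  # False: no character follows "Chad"
```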

The following symbols

{}[]()^$.|*+?\

must be preceded by \ to be taken literally. Otherwise they have special functions: . - any single character, * - zero or more repetitions of the preceding symbol or bracketed string, x? - optionality of the preceding symbol or bracketed string, (xy) - scope marking (e.g. for the purpose of | or ?), (x|y) or [xy] - alternatives, etc.
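A short Python illustration of why escaping matters: an unescaped dot matches any character, so a pattern meant for "U.S." also matches "USSR". The sample strings are made up, and re.escape is a Python convenience for producing the escaped form, not something the bot provides.

```python
import re

unescaped = r"U.S."    # each dot matches any single character
escaped = r"U\.S\."    # each dot matches only a literal "."

print(re.search(unescaped, "born in the USSR") is not None)  # True, unwanted
print(re.search(escaped, "born in the USSR") is not None)    # False
print(re.search(escaped, "moved to the U.S. in 1990") is not None)  # True

# Python can generate the escaped form of a literal string.
print(re.escape("U.S.") == escaped)  # True
```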

For other inspirations look in the rules for similar newsfeeds.

There are a few magical words in the rule file:

  • $USER is replaced by the user name
  • $SIZE>value / $SIZE<value: the pattern is matched if the comparison is true.

Inform the bot about the new job for it

Add a new line at the bottom of User:AlexNewArtBot/Master with the name of the newsfeed (the same as the name of the rules file after the /).

If the bot is supposed to feed new articles into a board, we usually do not want it to post articles already listed on that board. Give the bot the name of the board by putting =>Board name after the newsfeed name.

And that is all: the next time the bot runs, it will process your feed.
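To make the line format concrete, here is a small Python sketch that splits a Master line into the feed name and the optional board name. The function name and the feed names "Russia" and "Chemistry" are hypothetical examples, and stripping whitespace around => is an assumption of this sketch.

```python
def parse_master_line(line):
    """Split 'FeedName=>Board name' into (feed, board); board may be None."""
    name, sep, board = line.partition("=>")
    board = board.strip() if sep else ""
    return name.strip(), (board or None)


# Hypothetical Master lines, one with a board and one without.
print(parse_master_line("Russia=>Portal:Russia/New article announcements"))
# ('Russia', 'Portal:Russia/New article announcements')
print(parse_master_line("Chemistry"))
# ('Chemistry', None)
```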