Wikipedia:Bots/Requests for approval/ClueBot
- The following discussion is an archived debate. Please do not modify it. Subsequent comments should be made in a new section. The result of the discussion was Approved.
ClueBot
Operator: Winbots
Automatic or Manually Assisted: Manually assisted until I have corrected all bugs and false positives are few and far between.
Programming Language(s): PHP, my own classes to interact with Wikipedia's query.php, api.php, and index.php
Function Summary: Reverting vandalism.
Edit period(s) (e.g. Continuous, daily, one time run): Continuous
Edit rate requested: Only as often as it finds vandalism via Special:RecentChanges, but no faster than 10 edits per minute.
Already has a bot flag (Y/N): N
Function Details: Request RecentChanges and perform basic analysis on the data returned. If it finds something suspicious, request more information about that page and perform more in depth analysis on that information. If it determines that it looks like vandalization and it is in read-only mode: send information to owner. If it determines that it looks like vandalization and it is in read-write mode, request more information on user. If user has a long history on Wikipedia, log incident, but do nothing. If user is new/IP address, revert, log incident, and post a notice on the user's talk page.
The bot will also set its user page and its source code page when starting.
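The decision flow described under Function Details can be sketched roughly as follows. This is only an illustrative outline, not the bot's actual PHP code: the `Change` fields, the `Action` names, and the `decide` function are all hypothetical, and the real checks behind each flag are not specified in this request.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    IGNORE = "ignore"
    NOTIFY_OPERATOR = "send information to operator"
    LOG_ONLY = "log incident, do nothing"
    REVERT_AND_WARN = "revert, log incident, warn user"


@dataclass
class Change:
    # All three flags are hypothetical stand-ins for the bot's analysis steps.
    suspicious: bool            # outcome of the basic RecentChanges analysis
    looks_like_vandalism: bool  # outcome of the in-depth page analysis
    user_is_established: bool   # user has a long history on Wikipedia


def decide(change: Change, read_only: bool) -> Action:
    """Map one recent change to an action, following the described flow."""
    if not change.suspicious or not change.looks_like_vandalism:
        return Action.IGNORE
    if read_only:
        # Read-only mode never edits; it only reports to the operator.
        return Action.NOTIFY_OPERATOR
    if change.user_is_established:
        return Action.LOG_ONLY
    # New account or IP address.
    return Action.REVERT_AND_WARN
```

For example, `decide(Change(True, True, False), read_only=False)` yields `Action.REVERT_AND_WARN`, while the same change in read-only mode yields `Action.NOTIFY_OPERATOR`.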
Discussion
A couple questions.
- Why not use the recent changes IRC feed? It's a lot less bandwidth-intensive, both on the server and on the client.
- Why only report potential vandalism to the owner? Why not open up the communication to the community?
- Are you aware that bots like this already exist, such as AntiVandalBot and MartinBot?
- Do you envision this bot being a replacement for bots like AntiVandalBot and MartinBot, or a supplement?
- Are you confident that, as a relatively new user (57 contributions), you fully understand Wikipedia:Vandalism and the Wikipedia:Bot policy?
Thanks! — Madman bum and angel (talk – desk) 21:14, 24 July 2007 (UTC)
- A couple of answers.
- The bot does not request Special:RecentChanges but rather uses the api.php page specifically designed to be used by bots.
- I did add a /PossibleVandalism page, but what I meant by reporting it to its owner is on the command line, which is inaccessible to other users for obvious reasons.
- Yes, I am. But, this being my first Wikipedia bot, I thought I should try something reasonably easy to code. I do plan on expanding (after getting permission, of course) to make the bot more useful.
- More as a supplement. I don't see the need to deactivate the current bots. Their rules are likely much more refined. I am making this bot because I enjoy creating automated systems in my spare time, and thought I could contribute to Wikipedia.
- Yes, I am. I have read both pages, I have corrected a fair amount of vandalism during the last couple of days (as you can see here). I have implemented all the items under the "Good form" header. And I intend to fix any errors my bot makes.
- Thanks! Winbots 22:33, 24 July 2007 (UTC)
- What algorithm will your bot use to determine vandalism? Will it be using a scoring system, a dictionary? — xaosflux Talk 22:27, 24 July 2007 (UTC)
As you're probably aware, it took quite a bit of work to get our current antivandalism bots to where they are today. False positives obviously aren't the end of the world but they are still a problem. Having said that, I'd be interested in seeing what kind of scoring system you plan on implementing. -- S up? 08:35, 25 July 2007 (UTC)
- How about the bot removing reports after a certain amount of time on the page? ~ Wikihermit 08:49, 25 July 2007 (UTC)
- Currently it looks like the bot searches for large deletions of text from articles. You could also create a blacklist of words, and have the bot scan recent changes for edits that include words on that list. You should also create a whitelist if you plan to do this, to help cut down on false positives. ~ Wikihermit 08:58, 25 July 2007 (UTC)
It now does remove reports after 5 hours, otherwise the page gets very large. About the searching for large deletions, yes, that is one of the things the code currently does. It also searches for massive additions (and runs them through the scoring system), page blanks, and page replaces. I have made a blacklist/whitelist and a scoring system (documentation) (see above). Winbots 13:49, 25 July 2007 (UTC)
- The bot is having some problems. See diff. It also had a problem when the vandal replaced the page with Image:Example.jpg. It reported it as [[Image:Example.jpg]] which made the image appear on the page. See diff ~ Wikihermit 15:53, 26 July 2007 (UTC)
- Yes, sorry, I have fixed that problem now. Winbots 17:43, 26 July 2007 (UTC)
I have now implemented the maxlag feature. The bot will sleep 10 seconds, then abort the edit if the server is lagged more than 2 seconds. Winbots 01:24, 27 July 2007 (UTC)
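A minimal sketch of the maxlag behaviour described above, assuming the bot passes the MediaWiki `maxlag` parameter and the server answers with an error whose code is `maxlag` when replication lag exceeds it. The `send_edit` callable and the `edit_with_maxlag` helper are hypothetical stand-ins for the bot's own HTTP layer, which is not shown in this discussion.

```python
import time
from typing import Callable, Optional

MAXLAG_SECONDS = 2    # abort threshold stated in the comment above
BACKOFF_SECONDS = 10  # sleep duration stated in the comment above


def edit_with_maxlag(
    send_edit: Callable[[dict], dict],
    params: dict,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[dict]:
    """Attempt one edit; back off and abort if the server reports lag.

    `send_edit` is a hypothetical function that posts `params` to api.php
    and returns the decoded JSON response as a dict.
    """
    request = dict(params, maxlag=MAXLAG_SECONDS)
    response = send_edit(request)
    if response.get("error", {}).get("code") == "maxlag":
        sleep(BACKOFF_SECONDS)
        return None  # edit aborted; the caller may retry later
    return response
```

Passing `sleep` as a parameter keeps the sketch testable without actually waiting ten seconds.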
I have now coded the reverting/warning feature. It will remain disabled until such time as the bot is given a trial run, though. Winbots 16:57, 28 July 2007 (UTC)
I have also coded a feature to ask me about each revert before proceeding, if enabled. I plan on enabling this feature during the trial run, when the trial run is granted. Winbots 22:43, 28 July 2007 (UTC)
- MartinBot (AVB is out of commission now :( ) uses a custom warning. I suggest that you do the same, to make it clear that the warning has come from a bot. Also, will the bot be able to increment warnings and/or report users to WP:AIV (the bot section, ideally)? Thanks, Martinp23 23:01, 28 July 2007 (UTC)
- Any particular reason it doesn't detect others? If your bot doesn't pick up one or two, and others do, and get to final warning, then you give another level 1 warning, it's unusual and they should be reported to AIV. I don't see any good in not detecting other warnings. Matt/TheFearow (Talk) (Contribs) (Bot) 23:15, 28 July 2007 (UTC)
- It now does detect others' warnings. It also will not revert the same title more than once per day. Winbots 00:02, 29 July 2007 (UTC)
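The once-per-day rule mentioned above could be tracked along these lines. This is a sketch under assumptions: the `RevertLog` class and its method names are hypothetical, and "once per day" is interpreted here as a rolling 24-hour window per page title.

```python
from datetime import datetime, timedelta
from typing import Dict


class RevertLog:
    """Remember when each title was last reverted (hypothetical helper)."""

    def __init__(self, window: timedelta = timedelta(hours=24)) -> None:
        self.window = window
        self._last_revert: Dict[str, datetime] = {}

    def may_revert(self, title: str, now: datetime) -> bool:
        """True if the title has not been reverted within the window."""
        last = self._last_revert.get(title)
        return last is None or now - last >= self.window

    def record(self, title: str, now: datetime) -> None:
        """Note that the bot reverted this title at time `now`."""
        self._last_revert[title] = now
```

A caller would check `may_revert(title, now)` before reverting and call `record(title, now)` afterwards, so a second suspicious edit to the same page within the window is logged but left for a human.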
- It also will revert back more than one edit if the vandal has made several consecutive edits, until it has found an edit not made by the vandal. It will not try to go back further than 5 edits. Winbots 00:40, 29 July 2007 (UTC)
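The multi-edit rollback described above can be illustrated with a short sketch. The `find_revert_target` function and the `(rev_id, username)` history format are assumptions for illustration, and the 5-edit limit is interpreted here as "inspect at most the 5 newest revisions"; the bot's real implementation may count differently.

```python
from typing import Optional, Sequence, Tuple

MAX_DEPTH = 5  # limit stated in the comment above


def find_revert_target(
    history: Sequence[Tuple[int, str]],
    vandal: str,
    max_depth: int = MAX_DEPTH,
) -> Optional[int]:
    """Return the newest revision ID not authored by `vandal`.

    `history` is a hypothetical list of (rev_id, username) pairs,
    newest first. Returns None if no such revision is found within
    the newest `max_depth` revisions, in which case no revert is made.
    """
    for rev_id, user in history[:max_depth]:
        if user != vandal:
            return rev_id
    return None
```

With a history of `[(105, "Vandal"), (104, "Vandal"), (103, "Alice")]`, the target is revision 103, which skips both consecutive edits by the same user.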
- ("undenting") We don't flag bots that deal with vandalism, so the edit rate will have to be lower to not clog up recentchanges. Otherwise it seems we ironed out all of the problems. ~ Wikihermit 00:43, 29 July 2007 (UTC)
I'm approving you for a 50 revert trial. Each diff should be manually checked before allowing the bot to revert, so I expect an edit rate of no more than 2 epm. For the final 10 reverts, presuming you have had no problems, please set the bot to full-auto mode and (while watching) let it go at an edit rate of no more than 6epm (a 10 second sleep). Approved for trial. Thanks, Martinp23 00:57, 29 July 2007 (UTC)
- Before this is approved, I'd really like to see it go through a longer trial (perhaps a fortnight) so that we can pick up on any problems raised by the community as they come to see it. Of course, approval for that trial can only come after the current one! Martinp23 23:32, 30 July 2007 (UTC)
- I agree. ((BotTrial|days=14|editrate=6)). Matt/TheFearow (Talk) (Contribs) (Bot) 01:47, 31 July 2007 (UTC)
- Trial completed. All 50 edits. The 40 assisted edits at 2epm. The 10 fully automated edits at 6epm. Winbots 05:51, 1 August 2007 (UTC)
- Not completely done looking through, but it looks great! One note: it needs to detect other users' final warnings, as I have seen several cases that it should have reported to AIV. I'll comment more when I check more. Matt/TheFearow (Talk) (Contribs) (Bot) 05:55, 1 August 2007 (UTC)
- Everything else looks good, and no false positives. Approved for trial (14 days). If a higher edit rate would be useful, I will approve for that. Matt/TheFearow (Talk) (Contribs) (Bot) 05:58, 1 August 2007 (UTC)
Matt, this is a major bot, so it helps to have more than one person look over the trial results before jumping ahead. As it is, there are a few instances of things like this, which don't make sense. Also, I note that the bot is reverting to the same page more than once, which I believe I explained on IRC to not be a good idea - we often get IPs or new users blanking massive sections of articles completely legitimately, where those articles violate BLP policy. There are similar cases which mean that a bot edit war is completely undesirable (by all means have a mode, manually set, which allows you to give the bot permission to make more than one revert, but by default it should be off).
For warnings, I'd suggest that you craft your own, and issue only two before reporting the user (so, report on the third offence). It would be nice if the bot could recognise other people's warnings, but not essential. Thanks, Martinp23 12:21, 1 August 2007 (UTC)
- The bot will not revert more than once per article per 24 hour period. As I have watched it, it hasn't (that I am aware of) reverted more than once in a 24 hour period. The bot adds vandalism to /PossibleVandalism whether or not it reverts it, due to certain constraints. I agree, I need to make it check more thoroughly that it actually corrected it before saying it corrected it on the /PossibleVandalism page. That link you provided was because of a strange coincidence where the edit was within the last second of the recent changes the bot requested the first time, so when it requested all articles since then, at a later time, the recent changes page still gave it that article. Winbots 17:06, 1 August 2007 (UTC)
- OK. I've seen the "reverted by Cluebot before I saw it" message quite often now - not sure if it's something you can fix or not (it's only a cosmetic issue anyway). Martinp23 17:14, 1 August 2007 (UTC)
- Yeah, I'll see if I can fix that. Winbots 17:30, 1 August 2007 (UTC)
- Also, it does use its own warnings, based off of the official warning templates. And it does detect others' warnings. It just only honors others' warnings made within the last 48 hours. Winbots 17:29, 1 August 2007 (UTC)
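One way the 48-hour warning window could feed into escalation is sketched below. The `next_step` function and its return strings are hypothetical, and the default of two warnings before reporting follows Martinp23's suggestion earlier in this thread rather than any confirmed bot behaviour.

```python
from datetime import datetime, timedelta
from typing import Iterable

WARNING_WINDOW = timedelta(hours=48)  # window stated in the comment above


def next_step(
    warning_times: Iterable[datetime],
    now: datetime,
    max_warnings: int = 2,  # assumption: report on the third offence
) -> str:
    """Decide whether to warn again or report, counting recent warnings.

    `warning_times` holds timestamps of warnings already on the user's
    talk page, whether left by this bot or by anyone else.
    """
    recent = sum(
        1 for t in warning_times
        if timedelta(0) <= now - t <= WARNING_WINDOW
    )
    if recent >= max_warnings:
        return "report"
    return f"warn-{recent + 1}"
```

Warnings older than 48 hours are ignored, so a user whose last warning was three days ago starts again at `warn-1`.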
- Martin, sorry about that. I went through, and saw nothing wrong, so I re-approved. I'll wait a bit longer on these in the future. A single feature I would recommend, that would probably be a different bot, but it should be easy to implement, is to make it report a page to RFPP if it recognises more than 30 pieces of vandalism on it from several different names in a 48 hour period - it would be incredibly useful, and it would find a lot of pages subject to heavy vandalism. This is probably good as a different bot entirely, but it would be quite possible using this bot and its already good vandalism detecting features. Matt/TheFearow (Talk) (Contribs) (Bot) 21:35, 1 August 2007 (UTC)
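The RFPP suggestion above amounts to a simple threshold check, sketched here under assumptions. The `should_request_protection` function is hypothetical, and because "several different names" is not quantified in the comment, the `min_users` default of 3 is a placeholder, not a value from the discussion.

```python
from datetime import datetime, timedelta
from typing import Iterable, Tuple


def should_request_protection(
    incidents: Iterable[Tuple[datetime, str]],
    now: datetime,
    window: timedelta = timedelta(hours=48),
    incident_threshold: int = 30,  # "more than 30" per the suggestion
    min_users: int = 3,            # assumption: "several" is unspecified
) -> bool:
    """Check vandalism incidents on one page against the suggested rule.

    `incidents` is a hypothetical list of (timestamp, username) pairs
    for edits the bot classified as vandalism on a single page.
    """
    recent = [
        (t, user) for t, user in incidents
        if timedelta(0) <= now - t <= window
    ]
    distinct_users = {user for _, user in recent}
    return len(recent) > incident_threshold and len(distinct_users) >= min_users
```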
Looks like it's working well; I don't see why not. Approved. --ST47Talk·Desk 22:05, 12 August 2007 (UTC)
- The above discussion is preserved as an archive of the debate. Please do not modify it. Subsequent comments should be made in a new section.