User:SuggestBot
From Wikipedia, the free encyclopedia
SuggestBot is a program that attempts to help Wikipedia users find pages to edit. More detail is below.
If you want personalized recommendations, please leave your name at User:SuggestBot/Requests.
About SuggestBot
This is a Wikipedia bot belonging to ForteTuba. Its purpose is to match people with pages they might like to contribute to based on their past contributions. It uses a variety of algorithms, including standard information retrieval and collaborative filtering techniques, to make suggestions. It also sometimes points people to the Community Portal, or their past edits, as a source of inspiration.
It mostly runs at the GroupLens Research Lab on various machines, mostly using a recent copy of the Wikipedia database. It downloads people's contributions when making recommendations (to avoid recommending pages they've edited since the last dump) and it also occasionally downloads recent contributions to check whether people are taking recommendations. It is still under development, written in Perl.
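As a rough illustration of the collaborative filtering idea mentioned above, here is a minimal co-edit sketch. It is not SuggestBot's actual code (which is in Perl); the function name `coedit_recommend`, the data layout, and the scoring rule are all assumptions made for the example. It scores each article a person has not edited by how much its other editors overlap with that person's own edits.

```python
from collections import Counter

def coedit_recommend(target_edits, all_edits, k=5):
    """Rank unedited articles by the edit overlap of the people who edited them.

    target_edits: set of titles the person has edited (hypothetical input).
    all_edits: dict mapping username -> set of titles that user edited.
    """
    scores = Counter()
    for other_edits in all_edits.values():
        overlap = len(target_edits & other_edits)
        if overlap == 0:
            continue  # this editor shares nothing with the target
        for article in other_edits - target_edits:
            scores[article] += overlap
    # Highest score first; ties broken by title so the output is deterministic.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [title for title, _ in ranked[:k]]

# Made-up example data.
edits = {
    "alice": {"Tuba", "Trombone", "Brass band"},
    "bob": {"Tuba", "Euphonium"},
    "carol": {"Chess", "Go (game)"},
}
print(coedit_recommend({"Tuba", "Trombone"}, edits))
```

A real recommender would likely normalize by how prolific each editor is; this sketch leaves that out to keep the idea visible.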
It makes suggestions in two ways:
- For people who ask, it posts the suggestions directly to their talk page.
- For people it picks at random, it creates a subpage of this user page, puts the links there, and leaves a brief note on their talk page.
If this bot made personalized recommendations for you, please leave feedback on whether they were useful and how to make them better. Comments on recommendations, as well as general comments, suggestions, or complaints, are best left on the bot's talk page. Comments are welcome and valuable, as they will help SuggestBot do a better job of helping Wikipedia.
No one had strong objections on Wikipedia_talk:Bots when this bot was proposed, so I'm running it off and on. It has run a couple of pilot tests, and it has a page where people can request suggestions. Eventually, the creator of SuggestBot wants it to become a Wikipedia Tool.
Limitations/issues
- It's still not good with non-low-ASCII characters in usernames. Sigh.
- Some people would like wanted articles (redlinks). Hard because the only info we have on a redlink is a title and the pages that link to it -- no edit history to work from. Might be able to do this.
- Someone suggested removing section stubs from the stubs list. Probably the right thing to do.
- Right now you have to make requests each time you want recommendations. It should have an easy way to support repeat customers.
- Should probably remember what's been recommended to a person, and avoid re-recommending for repeat customers.
- Needs to eventually, automatically, re-download lists of articles.
- Automated posting of suggestions/notifications is broken for some talk pages, and I don't know why. Probably redirects; someone pointed this out to me.
- Only reads up to N (=500 as of Mar 6 2006) of a person's most recent edits when making recs. It tries to get older edits from a dump, in order to not recommend articles people have edited in the past, but this isn't perfect because dumps go out of date (there might be a gap between your last 500 edits and any edits it finds in the dump, and articles in that space might be recommended).
- Doesn't handle redirects (also leading to recommendation of already-edited pages). This appears to be a relatively minor issue based on a little bit of testing of recommthey were all the same page (and maybe do better).
- Ignores anything outside of main namespace (taking article talk pages into account might be interesting, a better representation of people's interests than just edits of articles directly. On the other hand, people often post on talk pages of articles they'd like to see deleted?)...
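The already-edited and redirect items in the list above can be pictured with a small sketch. Everything here is hypothetical: the function names, the `redirects` mapping, and the idea of resolving redirects before comparing titles are assumptions for illustration, not a description of what the bot currently does.

```python
def resolve(title, redirects):
    """Follow redirect entries to a final title, guarding against loops."""
    seen = set()
    while title in redirects and title not in seen:
        seen.add(title)
        title = redirects[title]
    return title

def filter_already_edited(candidates, recent_edits, dump_edits, redirects=None):
    """Drop candidates the person has already edited.

    recent_edits: titles from the live contribution list (capped at N edits).
    dump_edits: titles found for the person in an older database dump.
    Edits made after the dump but older than the N most recent appear in
    neither list, so articles in that gap can still slip through.
    """
    redirects = redirects or {}
    edited = {resolve(t, redirects) for t in set(recent_edits) | set(dump_edits)}
    return [c for c in candidates if resolve(c, redirects) not in edited]

# Made-up example: "Helicon" is assumed to redirect to "Helicon (instrument)".
result = filter_already_edited(
    candidates=["Tuba", "Sousaphone", "Helicon (instrument)"],
    recent_edits=["Tuba"],
    dump_edits=["Helicon"],
    redirects={"Helicon": "Helicon (instrument)"},
)
print(result)
```

Without the redirect mapping, "Helicon (instrument)" would still be suggested even though the person edited it under its old title.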
Changelog
- Try to improve profiles by ignoring minor and disambig edits. -- 11:28, 7 August 2006 (UTC)
- Kick over coedit recommender to 7-17 dump. -- 11:28, 7 August 2006 (UTC)
- Removed random recommendations, they were rarely followed. -- 05:01, 27 March 2006 (UTC)
- Eliminate most previously edited articles by looking at a relatively recent local dump. Many dumps fail on en, and processing them takes days, so for now we're on a mostly-processed version of the 2-19-06 dump. -- 16:24, 15 March 2006 (UTC)
- Maybe fixed all accented character issues? -- 15:58, 15 March 2006 (UTC)
- Add a filter to not recommend articles in the top N% (N=1) of edited articles -- a better way to handle the controversial article problem, and consistent across recommendation algorithms. -- 00:11, 15 March 2006 (UTC)
- Instead of recommending among all articles, focus on recommending articles tagged as stubs or needing work. (Somewhat like OpenTask but giving more weight to stubs since there are so many more of them.) -- 21:49, 14 March 2006 (UTC)
- Expand edit removing to include protection actions. -- ForteTuba 16:16, 14 March 2006 (UTC)
- Make some random recommendations, to make sure all articles get recommended eventually (a la User:Pearle's maintenance of Template:Opentask). -- 18:48, 13 March 2006 (UTC)
- Harshly penalize articles with lots of links in the link-based recommender, to recommend fewer popular pages (which presumably offer less opportunity to contribute). -- 16:35, 10 March 2006 (UTC)
- Remove many edits as input to recommendations, if the comment suggests they're reverting vandals. These edits appear to cause recommendations to zero in on controversial pages. -- 16:35, 10 March 2006 (UTC)
- Fix many accented character issues. -- 21:41, 7 March 2006 (UTC)
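Two of the filters in the changelog above (dropping minor and vandal-revert edits from a person's profile, and not recommending the most-edited articles) might look roughly like the following sketch. The regular expression, the edit record fields, and the function names are assumptions for illustration; the real bot's rules are not documented here, and the disambig filter is left out.

```python
import math
import re

# Assumed heuristic: edit comments containing these words look like vandal reverts.
REVERT_PATTERN = re.compile(
    r"\b(rv|rvv|revert(s|ed|ing)?|vandal(s|ism)?)\b", re.IGNORECASE
)

def clean_profile(edits):
    """Return titles of edits that are neither minor nor apparent reverts."""
    return {
        e["title"]
        for e in edits
        if not e.get("minor", False)
        and not REVERT_PATTERN.search(e.get("comment", ""))
    }

def drop_most_edited(candidates, edit_counts, top_percent=1.0):
    """Remove candidates that fall in the top `top_percent`% of articles by edit count."""
    ranked = sorted(edit_counts, key=edit_counts.get, reverse=True)
    cutoff = math.ceil(len(ranked) * top_percent / 100)
    hot = set(ranked[:cutoff])
    return [c for c in candidates if c not in hot]

# Made-up example data.
profile = clean_profile([
    {"title": "Tuba", "comment": "expand history section"},
    {"title": "George W. Bush", "comment": "rv vandalism"},
    {"title": "Trombone", "comment": "typo", "minor": True},
])
counts = {f"Article {i}": i for i in range(100)}  # "Article 99" is the most edited
kept = drop_most_edited(["Article 99", "Article 5"], counts, top_percent=1.0)
print(profile, kept)
```

Applying the popularity cutoff to the candidate list, rather than inside each recommender, is what would keep it consistent across recommendation algorithms.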