Wikipedia talk:Version 1.0 Editorial Team/Index/Archive 5


This is an archive of past discussions. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.


Making the main stats table 2D

Not so long ago, all the stats tables for individual projects became two-dimensional, in the sense of displaying quality vs importance data. Only the main table Wikipedia:Version 1.0 Editorial Team/Statistics is still 1D. I am considering making it 2D also. That would make it like the other tables, and would be one fewer subroutine for me to maintain.

Of course, the bigger size of the table could be a problem, but from what I saw, it shows up only at Wikipedia:Version 1.0 Editorial Team/Work via Wikiprojects and Wikipedia:Version 1.0 Editorial Team/Index as transclusion, and there it could be pushed down or up the page so that its width does not cause problems. Any comments about that? Oleg Alexandrov (talk) 20:13, 1 January 2007 (UTC)

Out of curiosity: how would it handle articles that had been given different importance ratings by different projects? Such cases are fairly numerous now. Kirill Lokshin 20:25, 1 January 2007 (UTC)
Currently, when counting for the big table, if an article has already been encountered in one project, it is ignored when it is encountered a second time in another project. That obviously means that the importance data is not perfectly accurate, but it avoids repetitions (an article being counted twice). I don't see a good way to take into account that an article has different importance ratings in different projects. Oleg Alexandrov (talk) 21:00, 1 January 2007 (UTC)
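The first-encounter-wins counting described above can be sketched as follows. This is only an illustration, not the bot's actual code: the function name `count_unique`, the data layout, and the sample ratings are all assumptions made for the example.

```python
from collections import Counter

def count_unique(projects):
    """Tally (quality, importance) cells, counting each article only once.

    `projects` maps a project name to {article: (quality, importance)}.
    The first rating encountered for an article wins; later ratings of the
    same article by other projects are ignored (hypothetical layout).
    """
    seen = set()          # articles already counted in the global table
    table = Counter()     # (quality, importance) -> number of articles
    for ratings in projects.values():
        for article, cell in ratings.items():
            if article in seen:
                continue  # already counted under another project
            seen.add(article)
            table[cell] += 1
    return table

# Made-up sample data: one article rated differently by two projects.
projects = {
    "Military history": {"Battle of Hastings": ("FA", "Top")},
    "England": {"Battle of Hastings": ("FA", "Mid"), "York": ("B", "High")},
}
table = count_unique(projects)
```

With this approach the result depends on the order in which projects are visited, which is the inaccuracy in the importance data mentioned above.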
I generally find that the higher importance rating is for articles within very specific WikiProjects, which can inflate importance that would be given to that article within a general encyclopedia (which the 1.0 release would be). The lower rating is usually given by a wikiproject that is (theoretically) only peripherally interested in the article. I've been guilty of tagging borderline articles for the attention of a WikiProject, usually on the basis that if that WikiProject won't deal with it, no-one else will. I've seen the film WikiProject 'claim' film sections of articles, which is fair enough, but the importance rating will generally relate only to that section, even if the quality rating is for the article as a whole. What happens when quality ratings differ? Carcharoth 12:34, 23 January 2007 (UTC)

"None importance" in cvg

On Wikipedia:Version 1.0 Editorial Team/Computer and video game articles by quality statistics, one of the table headers is "None", which uses {{No-Class}}, but actually refers to Category:Unknown-priority computer and video game articles. Since the statistics page is generated by Mathbot, I didn't just want to change it. Would it be ok for me to go ahead and change that label? JACOPLANE • 2007-01-2 18:03

I'm also not exactly sure what it should be replaced with. {{Needed-Class}} ? JACOPLANE • 2007-01-2 18:09
Needed-Class is used by other projects to indicate articles that need to be written. If the text in the table is changed to anything other than None, I would suggest Unknown (or Unk.) similar to the way it says Unassessed in the quality scale. Slambo (Speak) 18:19, 2 January 2007 (UTC)
Thanks for the response. Just to make sure: if I change the label to "Unknown" that won't mess up Mathbot, right? JACOPLANE • 2007-01-2 18:29
My guess is that the next bot run would overwrite the column heading back to what it shows now when it updates the counts in each field. I don't know how much work it would be to make such a change on a project-specific basis (I'm guessing that such a change would be impractical), but if the column label is changed globally... I know in WP:TWP it wouldn't make much difference if it said None or Unknown for that column, but I'm curious about other projects' opinions on this before we advocate such a change. Slambo (Speak) 18:39, 2 January 2007 (UTC)
The bot now treats the "No-importance", "Unknown-importance", "Unassessed-importance", and "Unassigned-importance" as if they were synonymous. That can be changed of course, if there are good reasons for that. Oleg Alexandrov (talk) 21:30, 7 January 2007 (UTC)
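As a rough sketch of the synonym handling described above (the names `SYNONYMS` and `normalize_importance` are hypothetical, and the bot's real implementation is not shown in this discussion), the four labels could be folded into one column like this:

```python
# Label prefixes treated as equivalent, per the comment above.
SYNONYMS = {"No", "Unknown", "Unassessed", "Unassigned"}

def normalize_importance(category_label):
    """Map e.g. 'Unknown-importance' to 'None'; return other prefixes unchanged."""
    prefix = category_label.rsplit("-importance", 1)[0]
    return "None" if prefix in SYNONYMS else prefix
```

Changing the displayed column heading would then be a one-line change to the returned string, applied to every project at once.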

Disambig Pages

Why are there two different templates for the same thing? (Template:Dab-Class and Template:Disambig-Class) Cbrown1023 18:21, 6 January 2007 (UTC)

They're both unofficial (the assessment system doesn't keep track of disambiguation pages), so it doesn't matter too much. I suspect we can redirect one to the other with no ill effects. Kirill Lokshin 19:58, 6 January 2007 (UTC)

WP China

Anyone else noticed the monstrous maze of categories created by WP China? For example: Category:Stub-Class China-related articles of High-importance.

Should anyone be in the mood for a mass CFD or indeed a spot of rogue adminship there's a target for you... --kingboyk 19:30, 10 January 2007 (UTC)

Eh, unless they're causing problems, I wouldn't bother. If the project in question finds them useful, I see no reason to get rid of them. Kirill Lokshin 19:37, 10 January 2007 (UTC)
I'll bet you a dollar they don't actually find them useful ;) Seemed like a good idea at the time though no doubt.
Seriously, I think it gives the assessment scheme a bad name if it's seen to grow to such ludicrous proportions. Just my 2c. You can have the other 98c later. --kingboyk 19:40, 10 January 2007 (UTC)
We ought to ask the project about it, at the least.
(Obviously, though, I'm a bit biased; I have my own reasons for not wanting people to start taking an axe to assessment categories. ;-) Kirill Lokshin 19:49, 10 January 2007 (UTC)
Their problem, I'd say. :) Oleg Alexandrov (talk) 03:44, 11 January 2007 (UTC)
'oly 'hit, you guys are insane. :) Titoxd(?!?) 05:52, 22 January 2007 (UTC)

WikiProject Canada template problem

This template has a variable "type", which allows the item to be labeled a template/list/category. But it doesn't work if it's assessed NA-class. See Talk:Lieutenant Governors of Nova Scotia for it working, and Template talk:St. John's landmarks for it not working. Any ideas? - Trevor MacInnis (Contribs) 01:05, 13 January 2007 (UTC)

Trial category intersection

For those wanting to intersect importance and rating (eg. to find all the unassessed top-importance articles in a WikiProject), a trial Category Intersection system is at http://aerik.com/wikintersections.php. Please don't overload it! :-) See Wikipedia talk:Category intersection for details of the person who set that up. Carcharoth 16:04, 21 January 2007 (UTC)

Oh $*%&!! "I'm using a copy of the relevant tables from November, so this isn't live data" - forget that, but hassle whoever can get this system up and running. It would be really good. See what I did at Category:Unassessed Tolkien articles. Carcharoth 16:13, 21 January 2007 (UTC)
Nice. Cbrown1023 18:54, 21 January 2007 (UTC)
The developers are somewhat aware of this work: [1]. Titoxd(?!?) 19:37, 21 January 2007 (UTC)
Would this intersection feature be that useful for this project? Oleg Alexandrov (talk) 05:43, 22 January 2007 (UTC)
It would generate a link for every row/column intersection in the individual project stats tables. WP India asked for that previously, didn't they? Titoxd(?!?) 05:53, 22 January 2007 (UTC)
And WP China seem to be doing it by hand (see a few sections above). Carcharoth 23:23, 22 January 2007 (UTC)
The reason I want this is that if you look at the stats box currently transcluded at the top right on Wikipedia talk:WikiProject Middle-earth, you can see that we made an initial pass over the ~1200 articles to find around 330 that are of top, high and mid importance. Most of the other 870 or so are of low importance or perma-stubs that will be merged. Picking out these articles as a priority was easier than assessing them at the same time, though with hindsight assessing both importance and class at the same time would have been best. Anyway, the situation yesterday was that we had 969 unassessed articles, and I wanted to pick out from those the ones that had been rated important, without wasting time by clicking on the already assessed ones. I eventually did a manual intersection using Excel to compare lists from the unassessed category and the importance category. The result was the lists at Wikipedia:WikiProject Middle-earth/Assessment/Current work. I would have much preferred not to do those lists manually (it only took 30 minutes or so), as an intersection would allow people to work on the same dynamically generated intersection without needing to manually update a list. Also, I could have used Wikipedia:Version 1.0 Editorial Team/Tolkien articles by quality/1, but this was out-of-date as assessment work had been done that day. Working from the categories was the only option other than waiting for the bot to update the list. This type of set-up probably only applies to WikiProjects with large number of articles needing to be organised, and with large numbers of stubs, and needing to pull out a core set of articles. WikiProject Biography springs to mind. Look at Wikipedia:Version 1.0 Editorial Team/Biography articles by quality statistics. They've pulled out 200 top-importance articles. But say that they eventually have another, lower tier of 1000 mid-importance articles (I know they've deprecated that, but this is an example). 
If there were 1000 unassessed mid-importance articles, how would they be separated from the 133177 other unassessed articles? How would someone find those 1000 articles that one person had labelled as of mid-importance, to enable them to assess these mid-importance articles? Maybe the Film WikiProject is a better example. See Wikipedia:Version 1.0 Editorial Team/Film articles by quality statistics. Now, can you see why someone might want to find out what the 52 high-importance stubs are, and work on getting them up to start level at least? Or the seven top-importance Good Articles, and work on improving those. Do you see what I mean by these examples? If your bot could provide links to those numbers, that would be so awesome. It could even just link to right section of the list here, if you can organise those lists by sections, rather than cut into the 45 lists at Wikipedia:Version 1.0 Editorial Team/Film articles by quality. The list is not ideal though, as that is only updated daily. A dynamically generated category intersection would still be best, as it automatically updates as people work on assessing articles. Carcharoth 11:29, 22 January 2007 (UTC)
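The manual intersection described above amounts to a set intersection of two category listings. A minimal sketch, assuming the two lists of titles have already been copied out of the categories (the function name and the sample titles are illustrative only):

```python
def intersect(unassessed, important):
    """Return the titles present in both category listings, sorted by title."""
    return sorted(set(unassessed) & set(important))

# Made-up sample listings for illustration.
unassessed = ["Elvish languages", "Gondor", "Shire calendar"]
top_importance = ["Gondor", "The Hobbit", "Elvish languages"]
todo = intersect(unassessed, top_importance)
```

Unlike a dynamically generated category intersection, such a list is only as fresh as the copies it was built from.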
To pull out a single suggestion from that rather long post, is it possible to get the bot to add div-id tags to label the points where the following transitions are in the list: FA-top, FA-high, FA-mid, FA-low, FA-unknown, A-top, A-high, etc (for all 35 permutations down to unassessed-unknown)? Then, when it writes the number into the table, it would do it in the form [[PAGENAME#FA-top|NUMBER]], but put no link if the number was 0. That sounds terribly complicated, doesn't it, especially as PAGENAME varies depending on where the cut-off point between pages is. Don't worry, I'm sure Category Intersection won't be that far away, and I don't mind doing manual intersections for now, waiting a day and copying off the list after the bot updates. Carcharoth 02:50, 23 January 2007 (UTC)
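The linking rule suggested above (link to an anchor such as FA-top, but emit no link when the count is 0) could look roughly like this. The function name `stats_cell` and the exact anchor format are assumptions for the sketch; working out which list page holds a given anchor is the hard part and is not addressed here.

```python
def stats_cell(page_name, quality, importance, count):
    """Return wikitext for one table cell: a section link, or a bare 0."""
    if count == 0:
        return "0"  # nothing to link to
    anchor = f"{quality}-{importance.lower()}"  # e.g. "FA-top"
    return f"[[{page_name}#{anchor}|{count}]]"
```

For example, `stats_cell("Some list page/1", "FA", "Top", 7)` yields `[[Some list page/1#FA-top|7]]`, where the page name is a placeholder.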
OK, so if I understand it correctly, the category intersection thing does not yet work. It will be rather simple to modify the stats table to have links to the intersections, I will work on it when that tool comes live. You are right in that the div-id tags looks like it would be complicated to implement, and the fact that it would be just a temporary solution makes me even more reluctant to work on it. Will it take long until the category intersection thing works? Oleg Alexandrov (talk) 04:13, 23 January 2007 (UTC)
I wouldn't like to say. Single unified log-in and stable versions are touted as the big things that developers are working on at the moment. After that, I don't really know. I would hazard a guess at anything from a few months to a few years, depending on what time the developers have free (and I don't know any developers, this is just from memories of what I've read elsewhere). It is possible that technical problems delay it indefinitely, but I really, really hope not. There is meta:DynamicPageList (installed on WikiNews), and meta:DynamicPageList2 (intended for WikiNews), but those are not (yet) available on en-Wikipedia. Carcharoth 12:43, 23 January 2007 (UTC)
Hi - sorry to take awhile to get over here. The issue is entirely performance. The implementation of category intersection I'm testing may have good enough performance for en, but honestly, I think it's probably borderline. Performance is why DPL isn't installed on en, too (conjecture on my part, but from a point of some knowledge - the SQL DPL uses gets really bogged down with large datasets). I think we're right there though - I'm not a full-fledged developer; this is my first and only real contribution to the codebase, but I think everyone thinks I'm on the right track. I'm going to collect more data with this test script, and also write one that uses Lucene. I'm sorry though - I don't have solid plans to update the data; it took me awhile to download and then build the table I'm using for testing.--Aerik 17:25, 23 January 2007 (UTC)

Bot missed an article?

Does the bot often miss articles? It seemed to miss Elvish languages. I assessed it here at 21:32 on 21 January 2007. The bot updated the list Wikipedia:Version 1.0 Editorial Team/Tolkien articles by quality/1 with this edit at 22:20 on 22 January 2007, but the article is still listed as unassessed? Anything to worry about? Carcharoth 23:21, 22 January 2007 (UTC)

There are two templates on that page, so it appears in both categories. The bot went with the unassessed as that is where it would likely get more attention. Cbrown1023 01:12, 23 January 2007 (UTC)
ROTFL! I've come across that before. So easy to miss that. At least I'll know next time! Any way to scan for duplicate templates? Carcharoth 02:42, 23 January 2007 (UTC)

Bot down for today

... due to scheduled computer network downtime at my work. The bot should run tomorrow as usual. Oleg Alexandrov (talk) 05:04, 24 January 2007 (UTC)

Quarter million articles assessed!

I see that we have finally made it to 250,000 articles assessed! Not bad for about 8 months work. Hats off to all of those hard working people across 300+ projects, as well as to Oleg for his patience and dedication! We should celebrate and publicise this achievement. Walkerma 07:20, 28 January 2007 (UTC)

Caribbean unassessed articles category

The statistics page for the Caribbean WikiProject links "Unassessed" to the empty Category:Unassessed-Class Caribbean articles, but it should link to Category:Unassessed Caribbean articles, which is where the unassessed articles actually are. I've tried changing the link by hand, but mathbot changed it back with the next update. Anyone know how to fix this? Jwillbur 21:33, 3 February 2007 (UTC)

Those two categories were duplicating each other. In that case, the bot links to whichever it finds first. I deleted one of them and now the bot links to the other one. Oleg Alexandrov (talk) 18:19, 4 February 2007 (UTC)

Bot rename

I created a new bot account, WP 1.0 bot, which I am considering using instead of Mathbot to update the WP 1.0 pages. That is because updating these pages takes so many edits that Mathbot's supposedly mathematical edits can barely be seen in its contributions.

Nothing should change but the bot name. Oleg Alexandrov (talk) 22:28, 4 February 2007 (UTC)

We'll probably need to update all the places where Mathbot is explicitly named as updating the statistics; but that shouldn't be a problem.
Please don't forget to clear the new bot account with the approval board before moving operations to it, incidentally. ;-) Kirill Lokshin 22:33, 4 February 2007 (UTC)
I am now asking for approval at Wikipedia:Bots/Requests for approval/WP 1.0 bot‎. I could not find any places where mathbot is mentioned by name, but perhaps I did not know where to look. Oleg Alexandrov (talk) 22:41, 4 February 2007 (UTC)
OK, mathbot was mentioned at Wikipedia:Release Version and I fixed that. Oleg Alexandrov (talk) 22:51, 4 February 2007 (UTC)

Comments are welcome at Wikipedia:Bots/Requests for approval/WP 1.0 bot on the frequency of bot runs. Oleg Alexandrov (talk) 00:22, 5 February 2007 (UTC)

new WP 1.0 bot performance

seems a tad slower if that is possible. Quite a bit of slippage from the first day's updates! :: Kevinalewis : (Talk Page)/(Desk) 13:49, 6 February 2007 (UTC)

Per the discussion at Wikipedia:Bots/Requests for approval/WP 1.0 bot I had the bot edit a page every 10 seconds only, instead of every 5 seconds.
In a sense, it is understandable that it takes longer and longer to update pages: we are talking about a number of articles which is a good fraction of a million. My proposal would be to run the bot every other day only. People who are impatient can occasionally run the cgi script which does things on demand, although it seems that it dies out half-way for very large projects. Oleg Alexandrov (talk) 02:18, 7 February 2007 (UTC)
Well, I still don't completely understand why the read requests have to hold off for two seconds, as these barely registered on the radar for the 18K hits/sec Wikipedia has been getting. Perhaps cutting the delay to 1 sec (or perhaps even 0.5 sec) would be all right? (I still don't like the added constraint for write time, though, but that's a different issue altogether.) Titoxd(?!?) 02:24, 7 February 2007 (UTC)
The bot was fetching wikicode every 2 seconds and the contents of categories every second. I now made the bot fetch wikicode every second too. Let's see if that helps. I'd be kind of reluctant to fetch things faster. Oleg Alexandrov (talk) 03:27, 7 February 2007 (UTC)
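A fixed minimum delay between requests, as discussed above, can be sketched with a small helper. This is an illustration only: the class name `Throttle` and its parameters are hypothetical, and the clock and sleep functions are injectable purely so the helper can be exercised without real waiting.

```python
import time

class Throttle:
    """Ensure at least `min_interval` seconds pass between successive calls."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self.clock = clock
        self.sleep = sleep
        self.last = None  # time of the previous request, if any

    def wait(self):
        now = self.clock()
        if self.last is not None:
            remaining = self.min_interval - (now - self.last)
            if remaining > 0:
                self.sleep(remaining)  # pause for the rest of the interval
                now = self.clock()
        self.last = now

# Usage sketch: one throttle per kind of request, e.g.
#   read_throttle = Throttle(1.0)    # page and category reads
#   write_throttle = Throttle(10.0)  # page edits
# and call read_throttle.wait() before each fetch.
```

Keeping separate throttles for reads and writes mirrors the different delays mentioned in this thread.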

Problem with bot?

Is there something wrong with the bot? It is adding articles to the LGBT log as being unassessed, but most of them already are. Dev920 (Have a nice day!) 20:10, 11 February 2007 (UTC)

That was a bug, sorry. It was affecting the log only, not the lists themselves. I fixed it now. Thanks. Oleg Alexandrov (talk) 23:08, 11 February 2007 (UTC)

Strange reporting of renames

It seems like the bot is reporting the new name of the page for both the old and new ones, resulting in a bunch of log entries like "X renamed to X"; here, for example. Kirill Lokshin 21:06, 14 February 2007 (UTC)

So one more bug. Last week I made a lot of changes to the script to make it much easier to translate to other Wikipedia languages. I tested the code (carefully, I thought) but some bugs crept in anyway. I fixed this now. Thanks. Oleg Alexandrov (talk) 03:26, 15 February 2007 (UTC)

Using WP1.0 Bot for GAs

Please read this proposal and leave comments. Thanks, Walkerma 05:01, 22 February 2007 (UTC)

Category:Computer and video game articles by quality

This project was renamed, and this is now handled by Category:Video game articles by quality. The category is listed to be deleted, but I want to make sure you're all done with it first. What is the correct way to remove this from assessment? Please respond on my talk page ... -- Prove It (talk) 15:05, 23 February 2007 (UTC)

Well, all you need to do is delete Category:Computer and video game articles by quality, Category:Computer and video game articles by importance, Category:Computer and video game articles with comments, all the pages in that category, and all the subpages in Wikipedia:Version 1.0 Editorial Team/Computer and video game articles by quality. Since you guys wanted the rename so badly, I guess you've got to do all that cleanup. :) I will reply on your talk page too. Oleg Alexandrov (talk) 02:48, 24 February 2007 (UTC)

Well, I was afraid of that. So is there any fast way to delete a bunch of subpages at once? -- Prove It (talk) 02:55, 24 February 2007 (UTC)

I don't think so. You guys wanted the rename, you've got to delete the old names (39 of them, see here). Sorry. :) Oleg Alexandrov (talk) 03:00, 24 February 2007 (UTC)

Proposal to run the bot every 48 hours

For at least two days the bot took around or more than 36 hours to run. I think that we arrived at a time when we should run the bot once every two days instead of every day. Comments? Oleg Alexandrov (talk) 03:08, 24 February 2007 (UTC)

I'd support that. Or would it be possible to clone the bot and have one just hit the big ones (MILHIST, WPBIO, Album, France, Australia, Film, India, Computer & Video Game) and the other hit the rest?↔NMajdantalk 23:00, 28 February 2007 (UTC)
I thought of this too. But then it would not be possible to compute the total number of articles. Well, that can be accomplished by saving things to disk, etc. But I doubt it would be worth the trouble, I think updating the lists every other day should keep things reasonably up to date. Oleg Alexandrov (talk) 04:21, 1 March 2007 (UTC)
Ok, that makes sense. Go for it, if you haven't already.↔NMajdantalk 14:54, 2 March 2007 (UTC)
Due to no objections, the bot has been running every other day for the last few days. Oleg Alexandrov (talk) 16:03, 2 March 2007 (UTC)

Removing the importance part from our projects assessments

The importance rating has caused enough controversy and is not being used to its full potential in the Aircraft project. What would be the easiest way of removing this part from our assessment profile? Can we just delete the related categories and remove the code from the project banner? What will the bot do after this is done? - Trevor MacInnis (Contribs) 21:26, 26 February 2007 (UTC)

The above will be enough, the bot won't complain. However, the "Importance:None" column will still show up in Wikipedia:Version 1.0 Editorial Team/Aircraft articles by quality statistics, as you see now in Wikipedia:Version 1.0 Editorial Team/Military history articles by quality statistics. Hope that helps. Oleg Alexandrov (talk) 03:09, 27 February 2007 (UTC)

Something broken or I missed a change?

Wikipedia:Version 1.0 Editorial Team/The KLF articles by quality log hasn't been updated in some weeks and it looks like we dropped off the Index too. Has something broken or has there been a change in my absence? --kingboyk 22:52, 28 February 2007 (UTC)

That's really strange. I can't see any reason why it wouldn't show up; perhaps Oleg can spot something I'm missing. Kirill Lokshin 10:57, 2 March 2007 (UTC)
Oleg? --kingboyk 18:36, 2 March 2007 (UTC)
Sorry. I noticed your comment above only this morning, but did not have time to reply. The problem seems to be that while Category:The KLF articles by quality has at the bottom the Category: Wikipedia 1.0 assessments, when actually browsing Category: Wikipedia 1.0 assessments one could not find Category:The KLF articles by quality in there. This sounds paradoxical, but this happens every now and then with pages/categories which have not been edited for a while.
I did a dummy edit to Category:The KLF articles by quality to make it pop up in the base category, reran the bot, and now it showed up in the index.
KLF is one of our oldest projects. It is quite likely that other projects whose categories have not been edited for a while will start disappearing. Something to keep an eye on. Oleg Alexandrov (talk) 03:41, 3 March 2007 (UTC)
No need to apologise Oleg, and thanks ever so much for sorting that out. Just one of those quirks I guess :) Cheers and thanks again. (And yes, if anyone else comes complaining of the same thing we know what to look for now :)) --kingboyk 10:44, 3 March 2007 (UTC)

This page needs a TOC

Can this page be divided up with a {{CompactTOC}}? It'll make looking through it a bit easier, if people are looking to do so. - Trevor MacInnis (Contribs) 22:09, 2 March 2007 (UTC)

It needs to be archived. Something I can handle later unless somebody else does it before.↔NMajdantalk 04:41, 3 March 2007 (UTC)
Sorry, I guess I should clarify, it's not the talk page that I think could use a TOC but Wikipedia:Version 1.0 Editorial Team/Index. - Trevor MacInnis (Contribs) 04:47, 3 March 2007 (UTC)
Well, there is only one section there, called "See also". Not much for a TOC, right? :) Actually the whole page is a big fat TOC already, all it has is a list of lists. I don't see how adding a TOC would help. Oleg Alexandrov (talk) 16:52, 3 March 2007 (UTC)
Ok then. I just hate having to scroll,scroll,scroll,scroll,scroll down, oops too far, scroll up, to find the projects under, say, "M". - Trevor MacInnis (Contribs) 17:29, 3 March 2007 (UTC)
That's correct. But what we need then is not a TOC per se, but rather sections, in other words, the table may need to be split into subtables, for each letter in the alphabet, and each subtable placed into its own section. That should not be hard to implement, I can do it if there is good support for this. Oleg Alexandrov (talk) 18:40, 3 March 2007 (UTC)
Yes, please, that would be very helpful.--DorisHノート 22:08, 23 March 2007 (UTC)
Wouldn't ID divs or spans work? As in having <span id="B"/> before the first B item, etc, and {{CompactTOC}} (or better {{CompactTOC8}}) at the top? It would work inside the table, see for example Towns of Alberta#T. --Qyd 21:48, 4 June 2007 (UTC)
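Either approach in this thread (per-letter subtables or per-letter anchors) starts by grouping the project names by initial letter. A minimal sketch of that step, with the function name `group_by_letter` and the sample names chosen only for illustration:

```python
from itertools import groupby

def group_by_letter(project_names):
    """Group project names alphabetically under their upper-cased first letter."""
    ordered = sorted(project_names, key=str.casefold)
    return {
        letter: list(names)
        for letter, names in groupby(ordered, key=lambda name: name[0].upper())
    }

# Sample project names taken from this page.
groups = group_by_letter(["Military history", "Aircraft", "Mathematics", "Film"])
```

Each key could then become either a section heading or an anchor id in the generated index, depending on which proposal is adopted.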

New Projects are not having their stats created

I've added a few projects, one Category:Rotorcraft articles by quality two and a half days ago, the others Category:Red Bull Air Race World Series articles by quality and Category:Gliding articles by quality more recently, and their statistics pages have yet to be created by the bot. They are using the same project banner as the aviation project, {{WPAVIATION}}, in the same way that the Military history project uses the same banner for all its projects. Could someone look over them to see if I missed something that the bot looks for in order to "do its thing". Thanks. - Trevor MacInnis (Contribs) 05:36, 5 March 2007 (UTC)

Aha! Rotorcraft just got done. I guess it just takes a lot longer than it used to to begin updates. - Trevor MacInnis (Contribs) 05:52, 5 March 2007 (UTC)

Stats

To help out with Wikipedia:WikiProject Biography/Assessment/Assessment Drive could the bot tally up the total of assessed articles? I'm trying to encourage folks to focus on how much they've achieved, not the bogus unassessed number (bogus because nearly 40,000 living person articles - and lord knows how many bios about dead people - don't have any {{WPBiography}} tag). --kingboyk 13:56, 6 March 2007 (UTC)

Well, the total number of assessed articles is the total number of articles minus the number of unassessed ones. The latter two numbers are in the stats already. I could modify the code generating the stats to print in addition the total of assessed articles also, but then that number will be printed out for each of the 400 projects stats. I can do that if people think that would be helpful overall for the projects. Otherwise as a solution specific to your project one could write a bot to read the current stats, do the subtraction, and post that number in a place where you guys could easily see. Oleg Alexandrov (talk) 16:04, 6 March 2007 (UTC)
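The subtraction described above is small enough to show directly. The function name is hypothetical and the figures in the example are invented, not taken from any project's statistics:

```python
def assessed_count(total, unassessed):
    """Return the number of assessed articles: total minus unassessed."""
    if not 0 <= unassessed <= total:
        raise ValueError("unassessed must be between 0 and total")
    return total - unassessed

# Invented figures, for illustration only.
example = assessed_count(total=1200, unassessed=950)
```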
I know how to do the maths Oleg, but I prefer to have machines do these things for me :) We're talking pretty big numbers at WPBio, and at some other Projects. I was thinking a line could be added to the Stats table to complement the (bogus) unassessed count. I'm happy to wait and see if other projects think this would be useful before you decide. Thanks for the reply, as always. --kingboyk 19:38, 6 March 2007 (UTC)
I am for it. It is a good statistic to have and more reflective of a WikiProject.↔NMajdantalk 19:40, 6 March 2007 (UTC)
OK, I will work on this in the weekend. Oleg Alexandrov (talk) 04:15, 8 March 2007 (UTC)
I agree that it would be helpful. The first thing I do when I look at a stats list is subtract unassessed from total to work out this number - it would be nice to have this actually displayed. Thanks, Walkerma 04:23, 8 March 2007 (UTC)
Sorry, I did not get to that in the weekend, and I am very consumed by the real life this week. I'll try to do it this coming weekend. Oleg Alexandrov (talk) 03:57, 13 March 2007 (UTC)
OK, I added the assessed row, as seen at Wikipedia:Version 1.0 Editorial Team/African military history articles by quality statistics. Later today, when the bot runs, it will add that row to all stats tables. Oleg Alexandrov (talk) 00:10, 18 March 2007 (UTC)
Thank you Oleg! Much appreciated. --kingboyk 15:31, 22 March 2007 (UTC)

WP 1.0 bot and article class

I have a question about Wikipedia:Version 1.0 Editorial Team/Numismatic articles by quality statistics. Pages in this project now have a category class, template, dab, etc. These new classes can be found at Category:WikiProject Numismatics articles. Do you think you can upgrade the bot to identify these classes? Thanks. --ChoChoPK (球球PK) (talk | contrib) 08:09, 11 March 2007 (UTC)

The bot is currently hard coded to accept only FA-Class, GA-Class, A-Class, B-Class, Stub-Class, and Unassessed-Class. The problem with Category-Class is that it contains categories, not articles, which would need special treatment in my code. The problem with new classes in general is that the bot needs to be told for each one how to sort it in the table relative to the other classes.
All in all, taking into account that there are more than 400 projects, and my code is being translated to other language Wikipedia too, I am very reluctant to expand the code to support project specific needs. Perhaps there are other ways you guys can keep track of those categories? Oleg Alexandrov (talk) 15:24, 11 March 2007 (UTC)
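To illustrate why each new class needs an explicit position, here is a sketch of sorting against a hard-coded list. The list simply copies the classes in the order they are named in the comment above; whether that is the bot's actual sort order is an assumption, as are the names `CLASS_ORDER` and `sort_classes`.

```python
# Classes as listed in the comment above (order assumed, not confirmed).
CLASS_ORDER = ["FA-Class", "GA-Class", "A-Class", "B-Class",
               "Stub-Class", "Unassessed-Class"]
RANK = {name: position for position, name in enumerate(CLASS_ORDER)}

def sort_classes(found):
    """Sort known classes by their fixed rank; unknown classes are dropped."""
    known = [cls for cls in found if cls in RANK]
    return sorted(known, key=RANK.__getitem__)
```

A class such as Category-Class has no entry in `RANK`, so it is silently skipped until someone decides where it belongs in the table.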
I understand your difficulty. If I simply add rows to the table, which link to the proper categories, and question marks in place of the count, would you have your bot leave that part alone (a temporary solution for the moment). --ChoChoPK (球球PK) (talk | contrib) 15:35, 11 March 2007 (UTC)
Well, the statistics table is recreated each time so any changes are overwritten. What you suggest would not be easy to implement. It would require the bot to first read the table, then decide based on some kind of algorithm what to keep and what to overwrite, and then write back the table. Definitely not impossible, but it would make the code much too complicated I think. Oleg Alexandrov (talk) 20:25, 11 March 2007 (UTC)
These parameters (list, disambig, etc.) are quite distinct from quality classes. I would suggest writing a separate bot to produce this kind of information. This bot might, perhaps, be able to post this type of category information into the /comments subpage, and then WP1.0 Bot would put the information into the main table. Alternatively, you may want to keep such pages out of the main table altogether, and have the new bot produce its own listing of non-article pages, organised by category. If you can get something like this working, please share it with us, because I know others would be interested. Cheers, Walkerma 03:22, 12 March 2007 (UTC)
I have something like that working here. It's not hard to implement from Oleg's framework. It would be much easier if each attribute had a category assigned to it. CMummert · talk 12:44, 12 March 2007 (UTC)
That looks very nice- thanks for sharing it! It shows how projects can tailor their output. We may want to use your bot for Version 0.7, where we have existing categories such as history, maths, natural sciences, etc., which are not readable by WP 1.0 bot. Thanks, Walkerma 21:57, 12 March 2007 (UTC)
CMummert, how about showing some code? :) Oleg Alexandrov (talk) 03:57, 13 March 2007 (UTC)
Here it is: [2]. It would need some editing to be useful for other people, but the idea is extremely simple. CMummert · talk 00:10, 15 March 2007 (UTC)

FA-Class article count

I believe that somewhere, 17 articles are tagged as FA class when they are not. According to Wikipedia:Featured articles there are 1307, but according to Wikipedia:Version 1.0 Editorial Team/Statistics there are 1324. Can the bot be programmed to catch this? Or is this just a problem for the projects involved to correct? - Trevor MacInnis (Contribs) 00:14, 18 March 2007 (UTC)

I think I can guess the reason for at least some of the discrepancy. When a FA gets first listed, its assessment is clear, and it will probably get upgraded to FA-Class immediately. When a FA gets delisted, the project tags may get left at FA for some time after delisting. I don't think this matters too much - I'm glad the numbers are so close! It shows that pretty much all of our FAs are being tracked by one WikiProject or another!
I just made a list from Wikipedia:Featured articles using AWB, and it gives 1307 mainspace links from that page that are true article listings. Meanwhile, Category:Wikipedia featured articles gives 1303 mainspace article talk pages. The difference is with the following:
  1. 1 − 2 + 3 − 4 + · · ·
  2. Flag of Portugal
  3. George VI of the United Kingdom
  4. The Notorious B.I.G.
Unfortunately I can't get a list of the FAs found by the 1.0 Bot directly.
I think it would be a bad idea to try and get this bot to make checks of this sort, because I think bots should have a clear purpose. This bot is so busy it takes 36 hours to complete one cycle, so we don't want to burden it with more tasks. It's also got a lot of code, we don't want to make it any more complicated. However, I think it might be worth a check to make sure it's not double counting anything, stuff like that. Is there some way to get a list of the FAs it is finding? Or is this what Category:Wikipedia featured articles is supposed to be? Thanks, Walkerma 01:20, 18 March 2007 (UTC)
I love this last paragraph (about not making the code more complicated, one can always hide behind that :)). Now, the bot does not overcount: it keeps a global hash making sure that in the total stats each article shows up only once. Oleg Alexandrov (talk) 04:00, 18 March 2007 (UTC)
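The "global hash" idea mentioned above can be illustrated with a minimal sketch. It is written in Python rather than the bot's Perl, and the function name `count_unique`, the project names, and the ratings are all hypothetical; it only shows the general technique of counting each article once by keeping the first rating encountered.

```python
def count_unique(projects):
    """Count each article once across all projects, keeping the first rating seen."""
    seen = {}  # article title -> quality class from the first project that listed it
    for project, ratings in projects.items():
        for article, quality in ratings.items():
            if article not in seen:  # later sightings in other projects are ignored
                seen[article] = quality
    totals = {}
    for quality in seen.values():
        totals[quality] = totals.get(quality, 0) + 1
    return totals

# Hypothetical data: "Article A" is rated by two projects but counted once.
projects = {
    "Project One": {"Article A": "FA", "Article B": "B"},
    "Project Two": {"Article A": "A", "Article C": "FA"},
}
totals = count_unique(projects)
print(totals)
```

As noted earlier in the discussion, this approach avoids double counting at the cost of ignoring any differing rating from a second project.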
Sure it would be possible to get a list of the articles that the WP 1.0 bot counts - fetch a list of all subcategories of Category:FA-Class_articles and then fetch the contents of those categories. Oleg has released the code for WP 1.0 bot, so this would only take a few extra lines of perl. The following works for me:
  # Collect every page that some WikiProject has rated FA-Class.
  my $Root_category = 'FA-Class_articles';
  my (@tmp_cats, @tmp_cats2, @tmp_articles);
  my %FeaturedArticles;    # page title => 1; the hash removes duplicates

  # First pass: list the subcategories of the root category.
  &fetch_articles_cats($Root_category, \@tmp_cats, \@tmp_articles);

  # Second pass: fetch the pages in each subcategory.
  foreach my $cat (@tmp_cats) {
    print "fetch 2 $cat\n";
    &fetch_articles_cats($cat, \@tmp_cats2, \@tmp_articles);
    foreach my $featured_article (@tmp_articles) { $FeaturedArticles{$featured_article} = 1; }
    print "$cat " . (scalar @tmp_articles) . "\n";
  }

  print "Count: " . (scalar keys %FeaturedArticles) . "\n";


I count 1370 featured articles this way. Category:Wikipedia featured articles is added by {{ArticleHistory}} if currentstatus is FA. So by cross referencing it would be easy to make a list of the exceptional articles. CMummert · talk 02:13, 18 March 2007 (UTC)
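The cross-referencing step described above amounts to a set difference between two lists of talk pages. A minimal sketch in Python, assuming both lists have already been fetched; the function name `find_impostors` and the page titles are placeholders, not real data:

```python
def find_impostors(bot_fa_pages, official_fa_pages):
    """Return pages the bot counts as FA-Class that lack official FA status."""
    return sorted(set(bot_fa_pages) - set(official_fa_pages))

# Placeholder titles standing in for the two fetched lists.
bot_fa_pages = ["Talk:Example A", "Talk:Example B", "Talk:Example C"]
official_fa_pages = ["Talk:Example A", "Talk:Example C", "Talk:Example D"]

impostors = find_impostors(bot_fa_pages, official_fa_pages)
print(impostors)
```

Swapping the two arguments would give the opposite list: featured articles that no project has tagged as FA-Class.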
Here are the impostors this morning:
Talk:Split infinitive • Talk:Floppy disk • Talk:Pribor-3B Rifle • Talk:National Capital Territory of Delhi • Talk:House of Lords • Talk:Blast shelter • Talk:Get Back • Talk:Legia Warszawa • Talk:Delphi • Talk:Space Race • Talk:Elizabethtown, Kentucky • Talk:Lech Poznań
Most of them seem to be former FAs where the project rating was not changed. Since the list is so short, I'll go through and fix the project ratings to non-FA. So you may have to look at the history to see what the problems were. CMummert · talk 12:15, 18 March 2007 (UTC)
There are periodic checks for the self-consistency of WP:FA, Category:Wikipedia featured articles, and transclusions of Template:Featured article, and similar for WP:FFA and its category. I have seen some projects assign "FA" class to project-specific "selected articles" though I think this is just a misunderstanding. Gimmetrow 21:19, 18 March 2007 (UTC)
The FA-Class tag is also used for featured lists, at least by some projects; that'll cause the numbers not to match up even if everything is consistent. Kirill Lokshin 21:27, 18 March 2007 (UTC)
There are also Wikipedia:Featured images. I think that WP 1.0 bot is likely to include all three in the "FA-class" count, provided that the project puts its rating template on all of them. CMummert · talk 01:14, 19 March 2007 (UTC)
Yes, of course, Talk:United States Navy enlisted rates. But there are 230-some featured lists, and the difference mentioned above is less than 70. There are some featured articles without any project tag, too. Gimmetrow 21:36, 18 March 2007 (UTC)

Unassessed question

Somehow, my project has two unassessed categories. I think this happened when a bunch of empty categories were deleted a while ago. Which one does the bot look at? My banner places unassessed articles in Category:Unassessed University of Oklahoma articles but the statistics table links to Category:Unassessed-Class University of Oklahoma articles. I want to make sure I delete the correct one. Thanks.↔NMajdantalk 21:06, 21 March 2007 (UTC)

I think the bot should be happy with either one. If you delete one of the unassessed categories, the bot will link to the other one in the stats. Oleg Alexandrov (talk) 03:33, 22 March 2007 (UTC)

WP 1.0 bot is so slow

WP 1.0 bot is so slow. Can you increase the speed of the bot? I read recently that many bots could increase their speeds. Don't remember where I read that, but I just think your bot is going too slow. --Paracit 02:37, 24 March 2007 (UTC)

The bot is slow because it has to do a lot of work (458 projects with thousands of articles each). Currently it does a read request each second and a write request each 10 seconds. I was told to not have it write more often than that.
Also, I'd think that having the bot run every other day is acceptable, no? People who want a quick bot run can do so here. Oleg Alexandrov (talk) 03:14, 24 March 2007 (UTC)
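The pacing described above (one read request per second, one write request per ten seconds) can be sketched as a small throttle. This is an illustration in Python, not the bot's actual Perl code; the class name `Throttle` and its parameters are assumptions, and the clock and sleep functions are injectable so the demonstration runs without really waiting.

```python
import time

class Throttle:
    """Enforce a minimum interval between successive calls to wait()."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self.clock = clock
        self.sleep = sleep
        self.last = None  # time of the previous request, if any

    def wait(self):
        now = self.clock()
        if self.last is not None:
            remaining = self.min_interval - (now - self.last)
            if remaining > 0:
                self.sleep(remaining)
                now = self.clock()
        self.last = now

# Demonstration with a fake clock so nothing actually sleeps.
fake_now = [0.0]
naps = []

def fake_sleep(seconds):
    naps.append(seconds)
    fake_now[0] += seconds

write_throttle = Throttle(10.0, clock=lambda: fake_now[0], sleep=fake_sleep)
write_throttle.wait()   # first write goes through immediately
fake_now[0] += 3.0      # three seconds of other work
write_throttle.wait()   # pauses for the remaining seven seconds
```

In real use one would create `Throttle(1.0)` for reads and `Throttle(10.0)` for writes with the default clock, and call `wait()` before each request.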
Note that if you run it from the toolserver, the stress on the servers themselves in both reads and writes is reduced; at least that's what I was told a while ago in #wikimedia-tech. I'm not sure about the details, but it may be something to think about. Titoxd(?!? - cool stuff) 04:25, 24 March 2007 (UTC)
Really? I have an account on the toolserver which I could use. But I am kind of skeptical that the toolserver has a more intimate relationship with the main database than other computers on the net. If this were for sure, it would be great. Oleg Alexandrov (talk) 04:34, 24 March 2007 (UTC)
It's not really a database issue (we all know that database replication there sucks); it's more of having a direct connection to the database cluster, bypassing the squid cache servers. Or something like that. However, read and write performance is much higher, AFAIK. Titoxd(?!? - cool stuff) 04:38, 24 March 2007 (UTC)
There are two things to consider. One is that my script is a resource hog, it can gobble up tens of megabytes of memory (if not more), perhaps taking resources from other programs on the toolserver. Second, and more importantly, from my experience the toolserver can be down every now and then, and even if it is down for a moment, an entire two-day run of the bot is interrupted. But one of these days I'll give it a try moving things to the toolserver, let's see if things become faster. Oleg Alexandrov (talk) 14:57, 25 March 2007 (UTC)

The bot did its last two runs on the toolserver. I can't say if it was faster because neither run was finished. Either the machine was rebooted or the script died or something. I am moving it back to my department's machines. I'll also think of ways to make the script faster. Oleg Alexandrov (talk) 02:44, 11 April 2007 (UTC)

Make it a class assignment? ;) Titoxd(?!? - cool stuff) 03:39, 11 April 2007 (UTC)

AWB script to create article assessment categories

Every so often I find myself creating a new set of assessment categories for a WP Biography workgroup. It's a dull and repetitive task, so yesterday I knocked up a script in the shape of an AWB plugin to do the job. It's not a bot, it just asks the user for some config info, creates a category list which it adds to the AWB list, and then fills in the categories with some boilerplate text. User can review the text before save and is always in full control.

The plugin should ship with the next version of AWB, and source code (VB.net) is in the AWB subversion repository. Please try it!

Some examples created with this tool: Category:Biography (baronets) articles by quality, Category:Biography (peerage) articles by quality, Category:Biography (peerage) articles by priority. --kingboyk 12:54, 26 March 2007 (UTC)

There is also my script, at Wikipedia:Version 1.0 Editorial Team/Generate categories which has been working for a while. But that one requires admin privileges. So it's nice to have the AWB alternative. Oleg Alexandrov (talk) 14:37, 26 March 2007 (UTC)
Bah, didn't know about that. Oh well, as you say there's an alt for non-admins now :) --kingboyk 14:38, 26 March 2007 (UTC)

Intersections etc on WP1.0 bot

The first two comments here were pasted from Oleg's talk page by TimNelson

More ideas:

  • Could you generate a page that lists the 10 pages in Wikipedia with the most WikiProjects attached?
  • Could you make it so that it lists the 10 Wikipedia projects that have the most overlap, but no common task force? Eg. if WP Biography and WP Mathematics have a lot of overlap, but no common task force, it might be an indication that a Mathematicians Biography task force should be set up.

-- TimNelson 09:53, 3 April 2007 (UTC)
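The second suggestion above boils down to counting shared articles for every pair of projects and ranking the pairs. A minimal Python sketch, assuming each project's article list is already available as a set; the function name `top_overlaps` and the sample data are hypothetical, and the "no common task force" check is left out because it would need information not modelled here.

```python
from itertools import combinations

def top_overlaps(project_articles, n=10):
    """Return up to n (shared_count, project_a, project_b) rows, largest overlap first."""
    overlaps = []
    for a, b in combinations(sorted(project_articles), 2):
        shared = len(project_articles[a] & project_articles[b])
        if shared:
            overlaps.append((shared, a, b))
    # Sort by descending overlap, then alphabetically for a stable order.
    overlaps.sort(key=lambda row: (-row[0], row[1], row[2]))
    return overlaps[:n]

# Hypothetical memberships.
project_articles = {
    "Biography": {"Article A", "Article B", "Article C"},
    "Mathematics": {"Article A", "Article B", "Article D"},
    "Music": {"Article C"},
}
ranking = top_overlaps(project_articles)
print(ranking)
```

With several hundred projects the number of pairs is still manageable, but the set intersections over large projects would dominate the cost.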

Thanks, these are good suggestions. But I have rather little time for coding for the moment, and if I had more, more pressing tasks would be making WP 1.0 bot faster, introducing a table of contents in the index (as requested at WT:1.0/I), and a few others. I can work on this, but it may take me a few weeks to get to it. You can also try to raise this at WT:1.0/I. There are a few perl programmers there who could implement this. Cheers, Oleg Alexandrov (talk) 15:05, 3 April 2007 (UTC)
  • I'm another one who's come here wondering what had happened to the bot for the Wine Project ;-/ but after the obligatory thanks for doing this in the first place, and bearing in mind the above comments about time, I thought I'd do a 'me too' on the original intersection idea - and specifically on the Stubs line (if that helps at all with server load), for the purpose of prioritising stub killing. A few weeks ago the Wikiwino stubs were distributed something like 2 Top, 55 High, 300 Mid, 700 Low (something like that anyway), and I ended up doing a quick and dirty VLOOKUP + filter in Excel to identify the Top and High priorities. If a filter for those stubs was a single click away, I'm sure it would help with stub killing. But having the table showing stubs ticking away is great for morale ;-) - thanks again FlagSteward 20:41, 13 April 2007 (UTC)
It seems to me like what you're talking about is (possibly) a different task. Can I point out m:CatScan, which currently doesn't update from English Wikipedia due to lack of hardware. Possibly you could contact the owner of CatScan and offer money for additional hardware or something -- I dunno. But I'd be interested in seeing CatScan going, for much the same reasons.
-- TimNelson 10:30, 14 April 2007 (UTC)
Or you could create a category structure like this Category:India articles by quality and importance on your banner that would give you a category such as this, Category:Stub-Class India articles of Top-importance. Let me know if I need to explain further. Regards, Ganeshk (talk) 14:33, 14 April 2007 (UTC)
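The stub filter discussed above (a lookup of each stub's importance, then keeping only Top and High) can be sketched in a few lines of Python. The function name `stubs_by_priority` and the ratings below are made up for illustration; the intersection categories suggested by Ganeshk would give the same result without any script.

```python
def stubs_by_priority(quality, importance, wanted=("Top", "High")):
    """List Stub-Class articles whose importance rating is in `wanted`."""
    return sorted(
        article
        for article, rating in quality.items()
        if rating == "Stub" and importance.get(article) in wanted
    )

# Made-up ratings keyed by article title.
quality = {"Article A": "Stub", "Article B": "Stub", "Article C": "Start", "Article D": "Stub"}
importance = {"Article A": "Top", "Article B": "Low", "Article C": "High", "Article D": "High"}

priority_stubs = stubs_by_priority(quality, importance)
print(priority_stubs)
```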

Interwiki on statistics page

The French Wikipedia now has about 13,000 articles assessed. Is it possible to add an interwiki link to this stats page from Wikipedia:Version 1.0 Editorial Team/Statistics? I'm hesitating because I know that page is edited daily by the bot. Can we add a noinclude section that the bot will ignore? I hope other languages will take off with bot assessments like the English & French, and if so we will want to have interwiki links. We could also use such a section to add the page to a category. Walkerma 05:50, 10 April 2007 (UTC)

Nice to know the French are doing well. :) Tomorrow I'll modify the bot code to allow a section which the bot won't overwrite. Oleg Alexandrov (talk) 03:39, 11 April 2007 (UTC)
I modified the code so that text after a bot tag in Wikipedia:Version 1.0 Editorial Team/Statistics will not be modified. I also added the "fr" link at the bottom. Oleg Alexandrov (talk) 02:17, 12 April 2007 (UTC)
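One way to picture the change described above: the bot regenerates everything before a marker and carries over whatever follows it. The sketch below is Python, not the bot's Perl, and the marker string `BOT_TAG` and function name `rebuild_page` are hypothetical; the actual tag used on the statistics page is not given in this discussion.

```python
BOT_TAG = "<!-- hypothetical bot tag: text below this line is preserved -->"

def rebuild_page(old_text, new_table, tag=BOT_TAG):
    """Replace everything before the tag with new_table; keep the tag and what follows."""
    _, found, tail = old_text.partition(tag)
    if not found:
        tail = ""  # no tag on the old page: nothing to preserve
    return new_table.rstrip("\n") + "\n" + tag + tail

old_page = "old table\n" + BOT_TAG + "\n[[fr:Example]]\n"
new_page = rebuild_page(old_page, "new table\n")
print(new_page)
```

Interwiki links and categories placed after the tag then survive every bot run.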
Thanks! I also appreciate your fixing the Hungarian link as well, I'd assumed they'd given up or something! Good to see things under way there, too. Cheers, Walkerma 05:02, 12 April 2007 (UTC)

Why no bot run?

What's wrong with the bot? Many projects haven't had a run since 7 Apr while others have had two runs since then. It's always the projects at the end of the alphabet that lose out. 23:07, 12 April 2007 (UTC)

Is the bot simply running out of time to process the later projects? I was under the impression that this shouldn't happen with a two-day run. Or has there been some change made to how it operates?
The lack of updates for nearly a week now is, admittedly, disconcerting. Kirill Lokshin 00:50, 13 April 2007 (UTC)
See the section #WP 1.0 bot is so slow above. I tried to move the bot to the toolserver to make it faster. The results were sad, the bot never finished its run (the toolserver can't be counted on to be up and running continuously for two days in a row I think). The bot is back to my school's computer from yesterday night and it's been running well since then. Oleg Alexandrov (talk) 02:12, 13 April 2007 (UTC)
Ah, ok; that explains it. (Pretty sad how unreliable the toolserver is, though.) Kirill Lokshin 02:24, 13 April 2007 (UTC)
Just as a suggestion, how about maintaining a tidemark as the bot goes through the categories, and then restarting at the tidemark if the last run failed? That way the pain would be shared if there's server problems, and every category would get an update every eg 3 days rather than Aardvarks getting a daily update and Zebras get none at all. Just saying, from the perspective of the Wine project ;-/ FlagSteward 20:45, 13 April 2007 (UTC)
This is a good idea but now that I moved the bot back to the original server, crashes and interruptions should happen very seldom (judging from past history) so I hope there is no need to implement advanced crash recovery. Oleg Alexandrov (talk) 00:29, 14 April 2007 (UTC)
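For reference, the tidemark idea suggested above can be sketched quite compactly. This is a Python illustration under stated assumptions, not a patch for the bot: the function name `run_with_tidemark`, the JSON state file, and the project names are all hypothetical.

```python
import json
import os
import tempfile

def run_with_tidemark(projects, process, state_path):
    """Process projects in order, recording progress so a failed run can resume."""
    start = 0
    if os.path.exists(state_path):
        with open(state_path) as f:
            start = json.load(f)["next"]
    for i in range(start, len(projects)):
        process(projects[i])
        with open(state_path, "w") as f:
            json.dump({"next": i + 1}, f)  # tidemark: index of the next project
    if os.path.exists(state_path):
        os.remove(state_path)  # clean finish: the next run starts from the top

# Demonstration: the first run dies on "Wine", the second resumes there.
state_path = os.path.join(tempfile.mkdtemp(), "tidemark.json")
done = []

def flaky(project):
    if project == "Wine" and "crashed" not in done:
        done.append("crashed")
        raise RuntimeError("simulated outage")
    done.append(project)

projects = ["Aircraft", "Wine", "Zebras"]
try:
    run_with_tidemark(projects, flaky, state_path)
except RuntimeError:
    pass
run_with_tidemark(projects, flaky, state_path)
```

The second call skips "Aircraft" and continues with "Wine", so projects late in the list are not starved by repeated restarts.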
Wikipedia:Version 1.0 Editorial Team/Biography articles by quality statistics hasn't been updated for 8 days. Any idea when we might see an update Oleg? (Please don't stress over this, if the answer is "not until next week/month" then c'est la vie :)) --kingboyk 21:56, 13 April 2007 (UTC)
It will get updated today. The biography is last in the list, because it is by far the hugest category. Oleg Alexandrov (talk) 00:29, 14 April 2007 (UTC)

Mathematics grading now produces lists by sub-field

Folks here might be interested in what the Maths wikiproject have done with assessment. We have a field parameter which is used to place the article in a sub field, say algebra or geometry. User:CMummert has now written a bot which reads this field and produces field specific lists like Geometry and topology. A similar scheme could be useful for other wiki projects which have a very large number of articles. --Salix alba (talk) 16:25, 22 April 2007 (UTC)

Dunno if you guys considered this, but you could avoid having to deal with your own bot simply by having the field parameter generate a 1.0-bot-readable category; see, for example, how {{WPMILHIST}} creates assessments for each task force. Kirill Lokshin 16:28, 22 April 2007 (UTC)
The bigger difficulty is that the math project uses a B+ grade but WP 1.0 does not. The bot does a lot more cross indexing than WP 1.0 does. Start with the table and click on any link that isn't a category link to see it in action. CMummert · talk 21:16, 26 April 2007 (UTC)
Why would you want an extra grade? --kingboyk 21:47, 26 April 2007 (UTC)
We found there was a very large gap between start and GA classes. B+ is generally those articles which could be considered closest to being put forward as a GA nom. --Salix alba (talk) 23:17, 26 April 2007 (UTC)
I see. Fair enough, of course, but I'd have thought "non-standard" gradings would just give you more work. Perhaps you like work, I don't know :) --kingboyk 23:51, 26 April 2007 (UTC)
That's a fantastic tool, CMummert! OK, Kirill points out one other way to do this, but I think this is a very nice alternative. It's clearly been very well thought out. Would it be possible for us to use this bot for the 1.0 project (outside Math)? We already have ten "fields" (and yes, one of them is Mathematics), the only difference is that our template uses the word "category" instead of "field". I dare say we could do an AWB sweep to change that. I would love to be able to produce a nice list of (say) all of the history articles in Version 0.7. At present all we can do (that I know about) is to use AWB on Category:History Version 0.7 articles and convert it to a list which we can then upload onto a new page. This method was used with Version 0.5 when writing navigation pages. I can imagine that some other WikiProjects may want to use it too - it depends if you're willing to support it. You know that Oleg has his hands full running the WP 1.0 bot.... Thanks, and great work, Walkerma 01:10, 27 April 2007 (UTC)
(lower indent) The Version 0.7 setup is pretty easy to adapt the script to. I uploaded examples at User:VeblenBot/Version_0.7/MainTable and User:VeblenBot/Version_0.7/History (actually, I uploaded the whole table set). I have no objection to maintaining this script - it's under 500 lines of perl and there should be no need to change it once it is set up correctly. Unfortunately it's not very modular, so configuring it means editing the source, not writing a config file like the WP 1.0 bot. I don't mind running the script for other projects if they are interested. But I would need to put in a bot request before adding to the current automatic workload. CMummert · talk 03:15, 27 April 2007 (UTC)
Hey, that's amazing! When we set up the category parameter, this was exactly the sort of thing I had in mind! Could you make the bot request? I'm pretty sure others would be interested in using this. Thank you SO much! And I just realised - it's the bot's birthday today - how appropriate! Thank you, Walkerma 04:16, 27 April 2007 (UTC)
I don't really understand the way that the release version tags are set up. There are 0.5 articles and 0.7 articles - should my tables include both? Right now they only include the 0.7 categories. It would help with the request if other people associated with the release version are interested in the tables/lists. And there are some formatting issues I need to take care of.
One reason I am suspicious is that my bot counts 2099 articles but Oleg's bot only counts 2067. I have to figure that out. My list is very thorough, which could account for the difference. Does Oleg's bot ignore NA-Class articles? It also seems to ignore the incorrect tag on Talk:Grateful Dead. CMummert · talk 04:56, 27 April 2007 (UTC)
Everything from Version 0.5 is automatically in Version 0.7, and the template should be set up to say that. As for the discrepancy, there may be some lists that don't show, or something like that. I think that Oleg's bot includes NA-Class. A useful tool is the list-comparer in AWB - I used that a lot to resolve the differences we found when putting together the Version 0.5 listings. You may want to check that things aren't being double counted - not an issue in Maths, but maybe possible if an article has BOTH 0.5 and 0.7 tags on it. Things are in a bit of a state of flux at the moment, we just switched to a new 1.0 template recently, and not everything has been changed over. Also, I haven't checked the log for talk page vandalism recently, but I'll go through that when I get a chance. Many thanks, once again! Walkerma 05:16, 27 April 2007 (UTC)
The bot does not count NA-Class articles. I was not aware of this class until now. Is it a quality or importance class? Oleg Alexandrov (talk) 06:29, 27 April 2007 (UTC)
NA-Class is a quality class for articles that don't fit into the quality system, like year and century pages (19th century). In the math project there is also Image-Class and List-Class, but the {{releaseversion}} template doesn't support those. If I subtract from my list that one exception (Talk:Grateful Dead) and the NA class articles then my count matches Oleg's count. WP 1.0 bot must not count Grateful Dead because it's in Category:Version 0.7 articles with invalid quality ratings.
I can't run AWB because my home and work computers run Linux, but I manage somehow. CMummert · talk 13:30, 27 April 2007 (UTC)
Since they don't fit into the quality system, then I guess they should not be counted by the bot. By the way, I run Linux at home and work too. :) Oleg Alexandrov (talk) 14:44, 27 April 2007 (UTC)
Indeed, Oleg, the whole point of NA is that it's not applicable to the assessments system and thus your bot needn't know about it.
I also use that value to turn off the "priority" rating when a {{WPBiography}} has more than one workgroup active with different priority params (and then kludge it by adding the priority categories manually). --kingboyk 14:54, 27 April 2007 (UTC)

Interest?

Is there interest here in VeblenBot's tables? There are examples at User:VeblenBot/Version_0.7/MainTable and subpages. If there is interest in doing something with them here, I'll put in a bot request to update them daily. CMummert · talk 13:17, 30 April 2007 (UTC)

If there is good interest in the table, one option could also be to merge CMummert's code into the main WP 1.0 bot code so that all projects could use it. But that of course would need discussion to make sure people agree with that. Oleg Alexandrov (talk) 15:19, 30 April 2007 (UTC)
I would have no objection to that. The main work that would have to be completed is to find a way to modularize the code to use a per-project configuration file. CMummert · talk 15:50, 30 April 2007 (UTC)
I would love for VeblenBot to become a standard tool for the 1.0 project. If it can be incorporated nicely into WP 1.0 Bot, that would be great. If it would be more reliable/efficient to keep them separate, let's do that. Either way, it's a really nice feature, thanks, Walkerma 02:38, 1 May 2007 (UTC)
If there is any project that would like to use the bot, I would be glad to set it up. I need to put in another bot request, though, which means I need to know exactly what is being requested. CMummert · talk 03:09, 7 May 2007 (UTC)
I was just about to post a request here - you must have read my mind! Please go ahead and set up the bot for:
All of these use the same 11 categories, see {{WP1.0}} for a list. With the GA project, the "topic" (equiv. to category) parameter is present in the template, but it has not been used so far, so the 11 individual GA categories are currently empty. Without your bot, these categories provided no useful information, but with your bot it will become worthwhile for people to include the category. Once again, thanks! Walkerma 04:35, 7 May 2007 (UTC)
OK. The table at User:VeblenBot/Version 0.7/MainTable and its subpages should take care of the Release Version project. I'll look into the vital articles project next. Could someone look over the release version pages and let me know if there are any problems? CMummert · talk 15:48, 7 May 2007 (UTC)
This looks excellent to me, I can't see anything wrong! I'm guessing that it needs both importance and quality for the top table - right? We have very few assessed for importance at present. I love the category table. Thank you very much, this is very helpful. Walkerma 04:13, 8 May 2007 (UTC)
I put in a request for this function; I expect it should be approved pretty quickly. CMummert · talk 01:39, 9 May 2007 (UTC)

I was inspired by WP:MATH in its organization by fields. I used the Military history project as the model for WP:PHILO because it permits more than one field. I am wondering if the bot can interface with this set up to produce information by field as the math project does. The banner produces categories for each field. You can view a test page which displays all options for the banner. This has resulted in these charts for assessment info by field. Greg Bard 02:43, 16 September 2007 (UTC)

Short log or temporary archive

Hi. WP:WINE has a bit of a problem at the moment - we like to have the assessment log in some inner HTML on our homepage in order to keep an eye on what other people have been up to recently, but thanks to a stub assessment drive in early March our log is currently running at over 300kb, which is slowing down our home page a little. I appreciate that the logs expire after 3 months, but we can't really wait that long. I've had a bit of a poke round the talk archives here but hadn't found anything to match this problem. I've thought of two options :

Manual kludge

I was wondering if it would be OK if I just set up Wikipedia:Version_1.0_Editorial_Team/Wine_articles_by_quality_log/Archive, manually cut out everything over a month old, set up a link to the archive from the main log and then deleted it in two months time - would that break the bot horribly? Doesn't need any work on your part, and just gets us out of this temporary hole.

New feature

A more elegant solution that might be useful on many project portals would be a separate 'shortlog' page, that just had the changes since the last botrun, or the last week or something, plus a link to the main log page. I appreciate this option involves extra coding, but I thought I'd float it.....

Of course there is a third option, to delete the log from our homepage for the next few weeks, but I'd only do that if the 'temporary manual archive' option isn't available. FlagSteward 21:58, 30 April 2007 (UTC)

The bot will not be affected no matter what you do. It usually takes the log page as it is and just appends to it. Oleg Alexandrov (talk) 04:04, 1 May 2007 (UTC)
OK - I've now set up Wikipedia:Version_1.0_Editorial_Team/Wine_articles_by_quality_log/Archive to offload 300kb-worth of log ;-/ temporarily. I've appended a note to the end of the log to explain where the old stuff has gone, and a note to the top of the archive explaining where it came from. I hope this is OK - give us a shout if it breaks anything FlagSteward 01:37, 2 May 2007 (UTC)
Unless I'm missing something, there's no need to do that. Just truncate the log. I know it has a header saying in effect "leave this page alone" but that's really only to discourage people disambiguating links. If the log is too long for transclusion you can truncate it; I've done it often enough and the bot doesn't mind. The old revisions stay in the history, so there's really no need to archive it. --kingboyk 11:25, 3 May 2007 (UTC)

Small update on the bot

I modified the bot code to fetch the latest history version of articles as suggested a while ago by Titoxd and Salix alba (I remember it was both), by doing a query of the form

[3]

which does a bunch of articles at the same time (five in this case). The bot should be faster as a result, but in the last several days since it's been running I have not noticed great improvements. Well, at least it does not get slower. :) Oleg Alexandrov (talk) 02:11, 3 May 2007 (UTC)
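The batching described above (several articles per request, five in this case) can be pictured as splitting the list of titles into fixed-size groups. A minimal Python sketch with a hypothetical helper `batches` and placeholder titles; how each group is turned into a query is not shown, since the query form itself is not reproduced in this discussion.

```python
def batches(items, size=5):
    """Yield consecutive slices of `items`, each at most `size` long."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

# Twelve placeholder titles become three requests instead of twelve.
titles = [f"Article {n}" for n in range(1, 13)]
groups = list(batches(titles))
print([len(group) for group in groups])
```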

Proposal to make the bot faster

The bot is taking around three days to do the update nowadays, which is not good. I have a proposal. If we remove the "last updated" tag and the date at the bottom of subpages (see here for an example of what I mean), then the bot won't need to update subpages on which no changes happen except the datestamp. The main indices for each subject would still get their datestamp (like the index Wikipedia:Version 1.0 Editorial Team/Aircraft articles by quality of the above subpage). Would people agree with this? Oleg Alexandrov (talk) 18:31, 19 May 2007 (UTC)

Well, it's a pain, but I think it's worth it for faster updates. The only exception I'd like to see is that entirely new pages should still be done (ie. if they've never been done before). -- TimNelson 00:54, 21 May 2007 (UTC)
They will be done, but the datestamp won't show up in subpages (one can always find the last update date from history). I now implemented this, let's see if it makes the bot faster. Oleg Alexandrov (talk) 02:01, 13 June 2007 (UTC)
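Once the datestamp is gone, deciding which subpages to write reduces to comparing the regenerated text with what is already on the wiki. A minimal Python sketch of that comparison; the function name `pages_to_write` and the page titles are hypothetical, and the bot's real logic may differ.

```python
def pages_to_write(current, regenerated):
    """Return titles whose regenerated text differs from the wiki, including new pages."""
    return sorted(
        title
        for title, text in regenerated.items()
        if current.get(title) != text  # a missing page never equals the new text
    )

current = {"Subpage 1": "table A", "Subpage 2": "table B"}
regenerated = {"Subpage 1": "table A", "Subpage 2": "table B2", "Subpage 3": "table C"}

to_write = pages_to_write(current, regenerated)
print(to_write)
```

Entirely new pages are always written, because `current.get(title)` returns `None` for them.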

Bot missed the end of the alphabet AGAIN

On 30 May, the bot didn't make it to the zebras again. Something should be done about the alligators and jackals always getting an update and the sloths and zebras missing out all too often. Rlevse 12:58, 31 May 2007 (UTC)

Articles not in categories

I just added the importance scale to a new task force I helped create. But, no articles are being added to the categories. I know this isn't an issue with the bot but I was wondering if anybody else has run into this issue and what you did to resolve it. I was able to resolve one article by simply removing the rating then re-adding it but that is not a solution for hundreds of articles. For instance, Talk:Tulsa Zoo is properly tagged and the correct category (Category:Mid-importance Tulsa articles) is at the bottom. But if you go to that cat, there is nothing in it. Any ideas?↔NMajdantalk 14:21, 31 May 2007 (UTC)

Well, they're there now. Must've been a huge backlog in Wikipedia's queue.↔NMajdantalk 15:58, 31 May 2007 (UTC)

WP 1.0 bot stopped for now due to a problem

I stopped the bot because there is something wrong with the query which finds articles in a given category. For example, consider the large Category:Stub-Class mathematics articles. To find the articles in there, one has to do several consecutive queries, each giving 200 articles. The following query

http://en.wikipedia.org/w/query.php?what=category&cptitle=Stub-Class+mathematics+articles&format=txt&cpfrom=Cl

works, but if you replace "Cl" at the end by "Cm", so instead of giving the articles starting from "Cl" on, give the articles starting from "Cm" on,

http://en.wikipedia.org/w/query.php?what=category&cptitle=Stub-Class+mathematics+articles&format=txt&cpfrom=Cm

the query gives an error. I contacted Yurik about this. Any ideas on what is going on? Oleg Alexandrov (talk) 02:46, 6 June 2007 (UTC)
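The consecutive queries described above follow a common pattern: fetch one page of results, note where the next page starts, and repeat until nothing is left. The Python sketch below shows that loop against an in-memory stand-in; `fetch_all`, `make_fake_category`, and the sample titles are hypothetical, and no real query.php call is made.

```python
import bisect

def fetch_all(fetch_page):
    """Collect every title by following continuation points until none remain."""
    titles, start = [], None
    while True:
        page, start = fetch_page(start)
        titles.extend(page)
        if start is None:
            return titles

def make_fake_category(all_titles, page_size):
    """Build a stand-in for the server: returns (page, next_start) from a given title on."""
    ordered = sorted(all_titles)

    def fetch_page(start):
        i = 0 if start is None else bisect.bisect_left(ordered, start)
        page = ordered[i:i + page_size]
        next_start = ordered[i + page_size] if i + page_size < len(ordered) else None
        return page, next_start

    return fetch_page

fake = make_fake_category(["Cl", "Cm", "Ca", "Cz", "Cb"], page_size=2)
members = fetch_all(fake)
print(members)
```

If one continuation point fails, as with "Cm" above, the loop cannot proceed past it, which is why a single bad query stalls the whole category.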

I spent some time looking at the code, but I'm not experienced enough to make any further progress without being able to see the data that query.php gets from the SQL server. I also let Yurik know about it, and there is a note at User_talk:Yurik/Query_API#Categories (copied from VPT) about the issue. — Carl (CBM · talk) 12:35, 6 June 2007 (UTC)

Query.php is now working correctly on the math-related categories. I don't know what was changed to make it work. — Carl (CBM · talk) 01:39, 10 June 2007 (UTC)

Yep, the links above work well indeed now. Carl, thanks. I restarted the bot. If anybody notices it behaving oddly, at some point, it just should be blocked. Oleg Alexandrov (talk) 05:31, 10 June 2007 (UTC)

New Stick - Biography (science and academia) articles

Soon after getting restarted it appears the Bot has come to a grinding halt again - this time just after "Biography (science and academia) articles" . thanks. :: Kevinalewis : (Talk Page)/(Desk) 09:13, 11 June 2007 (UTC)

I think it is running, see Special:Contributions/WP 1.0_bot. Oleg Alexandrov (talk) 15:02, 11 June 2007 (UTC)
It certainly appears to be. I assume you gave it a "bit of a kick". :: Kevinalewis : (Talk Page)/(Desk) 15:29, 11 June 2007 (UTC)
I guess I woke up in the middle of the night, kicked the bot in the butt, and forgot everything by morning. :) Oleg Alexandrov (talk) 15:33, 11 June 2007 (UTC)

New Stick - Former country articles

And again soon (but not as soon) after getting restarted it appears the Bot has come to a grinding halt again - this time just after "Former country articles". thanks. :: Kevinalewis : (Talk Page)/(Desk) 08:22, 12 June 2007 (UTC)

Seems to be going again after a short hiatus! :: Kevinalewis : (Talk Page)/(Desk) 12:05, 12 June 2007 (UTC)
Either I was wrong or it has stopped again. :: Kevinalewis : (Talk Page)/(Desk) 13:57, 12 June 2007 (UTC)

And once again

The bot has broken at least twice in the last 8 days, and it always restarts with "A", so once again the Aardvarks get updates while the Zebras don't. Result: Aardvarks have had two updates while the rest of us have had zero in eight days. Why can't it restart where it left off when it breaks or do one run A-Z and the next Z-A? Rlevse 10:09, 12 June 2007 (UTC)

Restart where it left off. Please. :-) Carcharoth 10:19, 12 June 2007 (UTC)
That would be rather hard to code up I think. The big problem is that the bot doesn't scale anymore, there are almost a million articles to go through every two days. You can always run the bot by hand using this form for the Zebras, but long term something needs to be done to make things more efficient, and I don't know what. Oleg Alexandrov (talk) 15:38, 12 June 2007 (UTC)
For some reason the bot will not update the Biography article count. I've tried running it manually, multiple times, but it always stops around log 440. Why would this happen? --Psychless 18:41, 12 June 2007 (UTC)
One can't run manually the biography articles by quality, since that one is too large, and the server at my school cuts the job after a while. The biography articles are the last, since that list is big. Let's see if it gets updated in a day or two. Oleg Alexandrov (talk) 01:59, 13 June 2007 (UTC)
Ditto the WP:Novels articles, I can't get this to complete either. Probably for the same reason, however it seems to be often just as it is about to finish, which is frustrating. :: Kevinalewis : (Talk Page)/(Desk) 08:10, 13 June 2007 (UTC)
  • How about splitting the tasks between multiple bots? The Biography section is probably encountering problems because it is by far the largest. Why not run the bot without the Biography section (is that possible?) and see what happens? If things work again, it is definitely a scale problem. Carcharoth 20:22, 12 June 2007 (UTC)
  • As I recall, running the bot manually doesn't update the master copy, does it? Rlevse 20:42, 12 June 2007 (UTC)
Making a couple of bots work in parallel would just increase the server load, I believe. Even now the bots run in parallel, so to speak: a complete bot run takes around 3-4 days, and since it runs once every two days, at any given time two copies of the bot are active.
Let's see if the current two runs get interrupted too. If yes, it means that something needs to be done about it.
And lastly, running the bot manually or doing bots in parallel won't update the global stats, for that a complete run is necessary. Oleg Alexandrov (talk) 01:57, 13 June 2007 (UTC)

New Stick - Military historiography articles

And another occasion (but again not as soon) after getting restarted it appears the Bot has come to a grinding halt at Military historiography articles by quality log. (Nearly at the WP:Novels articles, so near but so far - being selfish of course) :: Kevinalewis : (Talk Page)/(Desk) 09:48, 13 June 2007 (UTC)

Aardvarks 3, Zebras 0. I do appreciate all your effort on this, but something needs to be done. Rlevse 10:03, 13 June 2007 (UTC)
The bot did not come to a halt, it is running, see Special:Contributions/WP_1.0_bot. Kevin, the bot can't edit all the time, it has to read information too. That's why you see it pausing every now and then. Hopefully we'll get to the zebras too. Oleg Alexandrov (talk) 14:56, 13 June 2007 (UTC)
Ok, my mistake - but it was thinking for quite a "long" time, hence my error I suppose. Is there a log of its thinking to watch as there used to be with Mathbot!? :: Kevinalewis : (Talk Page)/(Desk) 15:00, 13 June 2007 (UTC)
The logs are large now and I can't keep them in the public-visible portion of my user account, I don't have enough room there. As a rule of thumb, a large project can take several hours. Oleg Alexandrov (talk) 15:16, 13 June 2007 (UTC)
Question: the Scouting project got a bot run today 12/13 Jun GMT, but the log entry is for 10 June, even though the history of the page shows its real date of 13 June....?? Just curious here. Rlevse 00:51, 14 June 2007 (UTC)
Because the bot has been running for three days so far, that's why. :) Today I made a couple more of changes which should make the bot a bit faster. Oleg Alexandrov (talk) 02:59, 14 June 2007 (UTC)
OK, just curious. Thanks for all your help. Rlevse 10:01, 14 June 2007 (UTC)

Log and statistics subpages

Here is a message I left at User talk:Oleg Alexandrov. He told me to raise this issue here. I've also copied his reply here. --ZeroOne (talk | @) 11:55, 15 June 2007 (UTC)

Could you modify the bot from using page names such as Wikipedia:Version 1.0 Editorial Team/Chess articles by quality statistics to using Wikipedia:Version 1.0 Editorial Team/Chess articles by quality/statistics? That would create handy back links to the statistics and log pages too. Currently those pages do not link to Wikipedia:Version 1.0 Editorial Team/Chess articles by quality which is, in my opinion, more essential than linking to Wikipedia:Version 1.0 Editorial Team which is what they do. --ZeroOne (talk | @) 15:21, 14 June 2007 (UTC)

That is a good idea, but perhaps a bit too late (I wish I thought about it earlier). It would require doing hundreds of moves to fix all the existing pages (in order to be consistent among them). You could try raising this at WT:1.0/I, but I am not sure if it is worth it given the amount of work needed to bring all the existing pages into the same naming convention. Oleg Alexandrov (talk) 15:26, 14 June 2007 (UTC)
Agree with Oleg, great idea but lots of work, esp if done by one person. Coordinating help via project coordinators would be a nightmare too. Rlevse 12:13, 15 June 2007 (UTC)
I believe there exist bots that can automatically move pages according to some rules? --ZeroOne (talk | @) 17:22, 15 June 2007 (UTC)
Right. The bigger problem is to synchronize those bots with WP 1.0 bot which runs all the time. Again, it is doable, if people agree it is worth it, we may go for it. I myself am not sure. (Note that bots need supervision too, double redirects may need to be checked, and all for hundreds of pages.) Oleg Alexandrov (talk) 01:56, 16 June 2007 (UTC)
You could stop the WP 1.0 bot for a day or two to allow the other bot to complete its task, couldn't you? It's hardly a critical bot anyway, unlike some vandal protection bots. --ZeroOne (talk | @) 11:17, 16 June 2007 (UTC)
Sure, if there is support for your proposal. Then a move bot needs to be hired too (I don't have one). So let's wait and see what others say. Oleg Alexandrov (talk) 15:04, 16 June 2007 (UTC)

Put the flags out!

We have just passed the 1,000,000 articles tagged for assessment. Is this a cause for celebration or for more oomph for this Bot??! :: Kevinalewis : (Talk Page)/(Desk) 16:13, 4 July 2007 (UTC)

Cool. Only 800,000+ to go.  :) —Preceding unsigned comment added by Stevietheman (talkcontribs)
Yes, and the great part is that over 700,000 have already been assessed! Walkerma 20:43, 4 July 2007 (UTC)

Bot to run every three days

Well, it was bound to happen. In spite of a few optimizations and an increased edit rate (one edit per five seconds), now a bot run takes almost three days (and a good chunk of CPU and memory too). I switched the bot to a run every three days. As before, people who need an instant run can use the online tool. Hope that's fine with people. Oleg Alexandrov (talk) 04:16, 6 July 2007 (UTC)

No problem at all - I would anticipate that the regular schedule would take this route further (4 or 5 days soon) and have got there earlier. However the problem with the requested run remains for those with a large article base like Novels WikiProject, it just never completes at all. So in our case the "regular" and now less frequent run is the only thing we have. :: Kevinalewis : (Talk Page)/(Desk) 07:52, 6 July 2007 (UTC)

API change heads-up

Hi. Few things:

  • I have greatly (i hope) improved the API documentation page at mw:API. Feel free to browse, add examples, correct...
  • Backlinking queries (backlinks, embeddedin, imageusage) are now using a new parameter ??title=xxx, instead of titles=xxx. Titles is still supported, but will be obsolete soon. Please update bots once the main tree goes live.

--Yurik 07:53, 6 July 2007 (UTC)

Thanks! Our project does not use backlinking queries as far as I am aware, but only category and history queries, but this is good to know. Oleg Alexandrov (talk) 15:18, 8 July 2007 (UTC)

Possible way to reduce bot workload

Would it be possible to maintain a list (with a timestamp) of projects that have not had an update in a while and not run the bot on those projects? For instance, if the bot hits a project that has not had an update to the log in 3-4 runs, it would add that project to a list with the date it was added. Then, when it ran next time, it would not process that project. When it has been 21 or 30 or however many days since it was added to the list, it would then re-run it on that project and see if there has been an update. If not, it goes back on the list; if so, it goes back to being a part of the normal process. Of course, if somebody from that project wants the bot run on their project they can either ask here for it to be removed or run the bot manually themselves with the web form. I just went through the A's and there are six projects with no updates in the last three bot runs (1, 2, 3, 4, 5, 6). Ideally, this would knock 75-100 projects off the normal bot run which should save some considerable time. Whether or not it's enough to knock the bot run back down to 2 days, I doubt it, but at least the bot won't be overloaded and it may hopefully reduce errors and crashes. The bot could still take a quick look at previous assessment counts for these skipped projects to get the overall numbers. Thoughts? ↔NMajdantalk 16:57, 3 August 2007 (UTC)
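As a rough illustration of the skip-list idea proposed above, here is a minimal Python sketch. The thresholds (`STALE_RUNS`, `RECHECK_AFTER`), the `ProjectState` record, and the function names are all assumptions made for the example; nothing here reflects how the bot is actually written.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

# Illustrative thresholds only; the proposal leaves the exact values open.
STALE_RUNS = 3                       # runs without a log update before skipping
RECHECK_AFTER = timedelta(days=30)   # how long a project stays on the skip list


@dataclass
class ProjectState:
    unchanged_runs: int = 0
    skipped_since: Optional[date] = None  # None means "not on the skip list"


def should_process(state: ProjectState, today: date) -> bool:
    """Process unless the project is on the skip list and not yet due for a recheck."""
    if state.skipped_since is None:
        return True
    return today - state.skipped_since >= RECHECK_AFTER


def record_run(state: ProjectState, log_changed: bool, today: date) -> None:
    """Update the bookkeeping after the bot has processed a project."""
    if log_changed:
        # Activity seen: back to the normal process.
        state.unchanged_runs = 0
        state.skipped_since = None
        return
    state.unchanged_runs += 1
    if state.unchanged_runs >= STALE_RUNS:
        # Put the project on the list, or back on it after a fruitless recheck.
        state.skipped_since = today
```

A manual run through the web form would simply bypass `should_process`, so members of a skipped project could still get an update on demand.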

The simplest way to reduce the workload is to remove the biography project, which I think takes a third of the time (and sometimes the bot is having trouble reading those huge categories entirely). Seriously, recently the bot has been pretty stable and I am rather happy with the three-day run.
Also, removing those projects may not affect the performance much, since I think the bot spends most of its time finding the histories of articles which changed and submitting the pages that changed.
If people want this implemented however, I can do it, won't be hard, although I have reservations about how much faster things will become and whether the members of the somewhat active projects will be happy with this. Oleg Alexandrov (talk) 05:17, 4 August 2007 (UTC)

Problem with categories and disappearing articles

To continue from here, we are having a problem with articles disappearing from categories, and then from the lists maintained by the bot.

Here's the deal. Talk:Gireum Station is obviously in Category:Stub-Class Korea-related articles, yet when one clicks on that category, the article is nowhere to be found. This is very odd, and looks like some kind of Wikipedia server trick.

The consequence is that the bot massively removes articles, see Wikipedia:Version 1.0 Editorial Team/Korea-related articles by quality log. Comments? Oleg Alexandrov (talk) 02:52, 15 August 2007 (UTC)

Seems like something that needs to be asked in higher-level (dev?) discussions, maybe. Girolamo Savonarola 02:54, 15 August 2007 (UTC)

Or maybe not. I tried to click on the "next 200" link in Category:Stub-Class Korea-related articles and I get the following link:

http://en.wikipedia.org/w/index.php?title=Category:Stub-Class_Korea-related_articles&from=3%2A

and if you click on "next 200" on the new page, the same link shows up, so one gets an infinite loop, never going beyond the first 200 articles. Something may be wrong with {{WikiProject Korea}} but I can't tell what. Oleg Alexandrov (talk) 02:59, 15 August 2007 (UTC)
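A reader that follows "next" links automatically could protect itself against this kind of loop by remembering the continuation points it has already been given. The Python sketch below is only an illustration: `fetch_page` is a hypothetical callback that returns a batch of titles and the next continuation point, and `PaginationStalled` is a name invented for the example.

```python
from typing import Callable, List, Optional, Set, Tuple

# Hypothetical fetcher: returns (titles_in_batch, next_continuation_or_None).
FetchPage = Callable[[Optional[str]], Tuple[List[str], Optional[str]]]


class PaginationStalled(Exception):
    """Raised when the server hands back a continuation point already seen."""


def list_category_guarded(fetch_page: FetchPage) -> List[str]:
    """Follow continuation points, but stop loudly if they stop advancing."""
    titles: List[str] = []
    seen: Set[str] = set()
    start: Optional[str] = None
    while True:
        batch, next_start = fetch_page(start)
        titles.extend(batch)
        if next_start is None:
            return titles
        if next_start in seen:
            # Same link as before: we would loop forever on the same batch.
            raise PaginationStalled(f"continuation did not advance: {next_start!r}")
        seen.add(next_start)
        start = next_start
```

Failing with an error in this situation seems safer than returning a partial list, since a partial list is what leads to articles being removed en masse from the logs.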

Ah, I've noticed that before but never thought anything of it. It could be the non-alphabetical sorting of these categories that is causing problems here. I'll try removing this feature from {{WikiProject Korea}} and we can see what happens next time the bot runs. PC78 14:51, 15 August 2007 (UTC)

problem with counting

Hi. Is it normal that this page says we have 1610 FAs and 2710 GAs, whereas WP:GA says 1547 and 2744? How often is the bot running? I don't remember us ever having that many FAs.--SidiLemine 11:35, 15 August 2007 (UTC)

The FA count likely includes a fair number of FLs as well, so that's not a surprising discrepancy. I have no idea what's affecting the GA count, but it may very well be a legitimate change, given how rapidly that status can be added or removed. (It may, of course, simply be articles marked as GA-Class without GA tags.) Kirill 12:29, 15 August 2007 (UTC)
FLs! Right, didn't think of that. As for the GAs, as you say it seems possible. Thanks!--SidiLemine 16:42, 15 August 2007 (UTC)

Stiffed again?

Are we sure this Bot process hasn't stiffed again, seems to get so far then stop! Maybe wrong but looks that way. :: Kevinalewis : (Talk Page)/(Desk) 08:25, 31 August 2007 (UTC)

Right, the bot did not finish its run because of some odd error a few days ago. Let's see if the current run goes well. I'll try to think of what is going on. Oleg Alexandrov (talk) 17:08, 31 August 2007 (UTC)
The bot hasn't run on LGBT articles in almost a week - should I be concerned? Should I run it manually? -- SatyrTN (talk | contribs) 13:48, 1 September 2007 (UTC)
Feel free to run it manually at any time. Apparently now the bot is running well (currently at "I"). I made the script more robust at the place it crashed last time. 15:50, 1 September 2007 (UTC) —Preceding unsigned comment added by Oleg Alexandrov (talkcontribs)
Well, if it's at "i", I'll let it run to "L" :) -- SatyrTN (talk | contribs) 17:09, 1 September 2007 (UTC)

Zebras not being done again

The bot has only made one full run in the last 12 days, Aug 31. This occurs more and more. Please find a permanent fix. Rlevse 10:53, 6 September 2007 (UTC)

I wonder if we could get a server at the Foundation dedicated to this task - would that be helpful? I think so many people have come to depend on this bot, yet we are still at the mercy of a server on a university campus. (Is that correct?) Personally, I think once a project has a stable set of articles even a monthly run will often be adequate, but I understand that when building a list people want to check things more often. In the meantime, how about renaming the project, WikiProject:Aardvarks and zebras? :) Also, I'd like to thank Oleg once again for being one of the rocks on whom we can depend - he's been helpful and available since this project began 18 months ago. Thanks for your commitment, resourcefulness and wisdom, Oleg. Walkerma 14:26, 6 September 2007 (UTC)
The stats got updated quite regularly. Is there a particular project which did not get updated?
Recently the bot did not work too well for six days (two runs), but the last time before that it happened was a couple of months ago if I remember. I am fixing bugs as I discover them (the recent problem was that the bot was not robust enough to the server being down for a while). Oleg Alexandrov (talk) 16:49, 6 September 2007 (UTC)
And I'd appreciate any programming help I could get. I'll start a rather demanding job in less than a month and I won't have a very large amount of time if something breaks. Oleg Alexandrov (talk) 17:12, 6 September 2007 (UTC)
I fixed a silly bug in the code I introduced recently when trying to make the code more robust. The bug was making the bot very slow. I'll have to restart the bot now. Oleg Alexandrov (talk) 19:57, 6 September 2007 (UTC)
Oleg, I do appreciate all you do. But we depend on this. You're a victim of your own success. Compare the Aardvark log to the zebra log and you'll see what I mean. Rlevse 22:24, 6 September 2007 (UTC)
Heh, you want a fail-safe free service, don't you? :) Seriously, any Perl programmers out there who can help maintain the code? The source code is in a google code repository. Oleg Alexandrov (talk) 22:28, 6 September 2007 (UTC)
Fail-safe and free would be nice, but I'd settle for getting updated as often as the Aardvarks; being left out makes me feel unwanted-;). Rlevse 22:30, 6 September 2007 (UTC)
The bot was running perfectly for two months. The recent troubles are because the server was not very stable and the bot did not take that very well. It took a week or so to fix because the bot runs only twice a week and it takes a while to realize it was broken. Also, while the bot is not doing regular updates, it can be run by hand. Don't you think that's good enough? :) Oleg Alexandrov (talk) 15:44, 7 September 2007 (UTC)
Oleg, is there any particular thing the code needs? My Perl is a tiny bit rusty, but I'd be glad to help out if I have something to look for or something specific to change. I too have become used to regular updates. Being a Lemur, I'm at least better off than the Zebras :) -- SatyrTN (talk | contribs) 13:54, 7 September 2007 (UTC)
That would be great. You could start by checking out the code from the google code repository above, downloading Perlwikipedia.pm, and see if the code runs for you. I can also add you as one of the developers (you'd need to give me a gmail address for that). This way, if there are issues with the bot again in the future, perhaps you could help if I am not quick enough. Oleg Alexandrov (talk) 15:44, 7 September 2007 (UTC)

A fix to the zebra problem

I modified the bot code a bit so that each time it starts it goes through the list of projects not alphabetically, but rather in the order of oldest first, meaning that the projects that have not been run for longest time will come first. Hopefully this will put to rest the problem of the projects earlier in the list being run more often. Oleg Alexandrov (talk) 22:38, 7 September 2007 (UTC)
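The oldest-first ordering described above can be illustrated with a short Python sketch. The `last_run` mapping and the rule that never-run projects go first are assumptions made for the example; the bot itself is written in Perl and may track this differently.

```python
from datetime import datetime
from typing import Dict, List, Optional


def run_order(last_run: Dict[str, Optional[datetime]]) -> List[str]:
    """Order projects so the one that has waited longest is processed first.

    Projects with no recorded run (None) come first; ties are broken by name.
    """
    return sorted(
        last_run,
        key=lambda name: (
            last_run[name] is not None,       # never-run projects sort first
            last_run[name] or datetime.min,   # then oldest run first
            name,                             # stable tie-break
        ),
    )
```

With this ordering an interrupted run costs every project roughly the same delay, whereas an alphabetical restart always favours the names at the start of the list.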

Great! Hopefully that means everyone will get done eventually, even when there are a lot of problems. Thanks a lot, Walkerma 03:27, 8 September 2007 (UTC)
Outstanding. Also, is there any progress on the idea of having a dedicated server for this at the foundation? That would likely help too. Rlevse 10:53, 8 September 2007 (UTC)
I am actually looking forward to getting the bot off my school's network since I no longer work there. A dedicated server may be too much to ask, but having access to a reasonable stable server with say 100 MB of storage would be nice. Otherwise I'll just make the bot a bit more stable to interruptions and move to the toolserver (hopefully it would perform better there than what it did last time when I tried this). Oleg Alexandrov (talk) 15:32, 8 September 2007 (UTC)
I'll see if we can get help with this. Walkerma 16:00, 8 September 2007 (UTC)

Help! My snowman's melting!

Looks like the bot is deleting the index. What's going on? Girolamo Savonarola 02:55, 11 September 2007 (UTC)

A few changes in the Query API are confusing the bot. I fixed one of the consequences (">" being replaced by "&gt;" recently), but there may be more, like plain & being replaced by &amp;.
I am in the middle of a busy relocation to my new job. I will have no time tomorrow and very little time the day after, and no internet access for a few days at least at the new home. I can't say now when I will look into this problem. If any Perl people are willing to try to fix things, the complete bot source and dependencies (and instructions) are available at a link at User:WP 1.0 bot. The problem is in any code calling the above API (do a grep for lines containing "query.php" in the bot source code directory and in wikipedia_perl_bot/bin). Sorry can't write more, wife says "lift butt up and pack, it's 10 PM". :) Oleg Alexandrov (talk) 05:06, 11 September 2007 (UTC)
Or like the brain would process it:
INT WIKI
MOV UP, BUTT
HLT
INT STAND
CALL PACK
CALL SLEEP
NOP
;) Have fun at the new job! And thanks for everything you've done so far for us. :) Titoxd(?!? - cool stuff) 19:16, 12 September 2007 (UTC)
Heh Titoxd, my wife found this amusing. :)
I fixed the bot, hopefully, and now it is running again for all the projects. Since the bot has recently been modified to run the projects in the order of oldest first, it has now started running from "F", where it stopped last time several days ago when its run got interrupted because of the API change. Oleg Alexandrov (talk) 00:31, 15 September 2007 (UTC)
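The escaping problem discussed in this thread (">" arriving as "&gt;" and "&" as "&amp;") is the kind of thing a general entity decoder handles in one place, rather than one replacement per character. A minimal Python sketch, using the standard library's `html.unescape`; the wrapper name `decode_title` is made up for the example, and the bot's real fix was made in its Perl code.

```python
import html


def decode_title(raw: str) -> str:
    """Turn HTML entities in a returned title back into plain characters."""
    return html.unescape(raw)
```

One caveat with this approach: decoding must be applied exactly once, otherwise a title that legitimately contains the text of an entity would be altered.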

Any way to detect redirects?

It would be nice if the bot could add "(redirect)" to the tables it generates when it encounters an article which is a redirect, to help users remove the wikiproject templates from redirect talk pages. --jacobolus (t) 20:49, 20 September 2007 (UTC)

The bot can't do that, for the reason that it never visits articles, all it does is collecting talk pages from categories. It could be implemented, but would require visiting a million articles each run, which is infeasible. Perhaps anybody else would have any ideas. Oleg Alexandrov (talk) 02:22, 21 September 2007 (UTC)
Maybe this is better done as a one-off job; it does not need to be done every day. Basically you would need to get a list of redirects and a list of assessed articles and find the matches. There are various ways both lists could be obtained, but they are not trivial. --Salix alba (talk) 07:26, 21 September 2007 (UTC)
I can generate a list of all assessed articles (it will be huge). A list of redirects can be obtained by querying a database dump, I guess. But indeed, this is not something that can be automated and run on a regular basis. Perhaps a big cleanup once a year could be done, if anybody's willing to come up with a list of redirects. But I am not sure overall if the presence of a few redirects is such a big problem. Oleg Alexandrov (talk) 14:48, 21 September 2007 (UTC)
I missed this conversation earlier. It is possible to efficiently tell which pages are redirects using the API (see [4]). Up to 5000 pages can be queried per HTTP request. Given a list of all the assessed articles, this API feature would make it straightforward to produce a list of assessed redirects. — Carl (CBM · talk) 14:35, 12 October 2007 (UTC)
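The batching Carl describes could look roughly like the Python sketch below. It is an illustration only: `redirects_in` is a hypothetical callback that stands in for one API request and returns the subset of a batch that are redirects; no real API call or parameter name is shown here. The default batch size of 5000 is taken from the comment above.

```python
from typing import Callable, Iterator, List, Set


def batches(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    for i in range(0, len(items), size):
        yield items[i:i + size]


def assessed_redirects(
    assessed: List[str],
    redirects_in: Callable[[List[str]], Set[str]],
    batch_size: int = 5000,
) -> List[str]:
    """Return the assessed titles that are redirects, one request per batch."""
    found: List[str] = []
    for batch in batches(assessed, batch_size):
        redirect_set = redirects_in(batch)  # hypothetical: one request per call
        found.extend(title for title in batch if title in redirect_set)
    return found
```

At 5000 titles per request, a list of about a million assessed articles would need on the order of 200 requests, which is why this fits a one-off cleanup better than a step in every bot run.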

Proposal to remove the main biography project from the bot run

See Wikipedia talk:Version 1.0 Editorial Team#Proposal to remove the main biography project from the bot run. Oleg Alexandrov (talk) 16:04, 27 September 2007 (UTC)

Importance: "None"

I think the word "None" in Template:No-Class should be changed to "Unknown". Right now, the label implies that articles in that column of the statistics table have NO importance; in fact, their importance is simply not known.

I realize that if a bot (somehow) uses this label in its data gathering, then the bot code would need to be modified; if so, that seems worth doing, since "Unknown" gives the unwary reader a much better sense of the situation. -- John Broughton (♫♫) 14:18, 12 October 2007 (UTC)

Unknown would assume that it hasn't been assessed. Some pages (generally those which aren't articles) are no importance. Girolamo Savonarola 14:47, 12 October 2007 (UTC)
I'm referring to the table at Wikipedia:Version 1.0 Editorial Team/Work via Wikiprojects (and smaller tables with the same column and row headings that exist at individual WikiProjects). As best as I can tell, only articles are being included. The statistics are limited to articles where a WikiProject has put their template on the article talk page; there is no reason to believe that WikiProjects are templating talk pages of non-articles.
To give an obvious example of the problem: of the 1864 FAs, the table shows that 755 have an importance of "None". I find it hard to believe that any FA exists on a topic of no importance. -- John Broughton (♫♫) 21:25, 12 October 2007 (UTC)
Hurricane Irene (2005).
But more seriously, many articles are assessed as "no importance" due to errors in typing the parameters. Template parameters are case sensitive, so Importance=mid is not the same as Importance=Mid. I've found many errors like that. Titoxd(?!? - cool stuff) 21:42, 12 October 2007 (UTC)
This is precisely why some of the more sophisticated banners (such as WPMILHIST) have a subsection that contains a #switch parser function in order to align most common spelling variations to the needed one for the template. (Arguably, there perhaps should be a function added to the software to allow the software to do this automatically if so specified in template code.) Girolamo Savonarola 22:00, 12 October 2007 (UTC)
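Outside of template code, the same kind of normalization can be sketched in a few lines of Python. The mapping below is an assumption made for the example (the accepted spellings and the "NA" and "???" labels are illustrative, the latter borrowed from suggestions in this thread); it is not taken from any actual banner or from the bot.

```python
from typing import Optional

# Illustrative tables: which spellings count as which importance level.
_CANONICAL = {"top": "Top", "high": "High", "mid": "Mid", "low": "Low"}
_NOT_APPLICABLE = {"na", "n/a"}


def normalize_importance(raw: Optional[str]) -> str:
    """Map a hand-typed importance value to a canonical label.

    Case and surrounding whitespace are ignored; anything unrecognised
    (including a missing value) is reported as "???" rather than "None".
    """
    value = (raw or "").strip().lower()
    if value in _CANONICAL:
        return _CANONICAL[value]
    if value in _NOT_APPLICABLE:
        return "NA"
    return "???"
```

With this in place, "mid", "Mid" and " MID " all land in the same column, and only genuinely unassessed or mistyped values remain in the catch-all.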
That is not due to the bot but rather to {{No-Class}}. Note that a long word like "Unknown" would make the table columns wider in the stats tables. Oleg Alexandrov (talk) 03:17, 13 October 2007 (UTC)
If the problem is the width of the column heading, then "Unk." or "???" or something else would still be better than "None". -- John Broughton (♫♫) 13:31, 13 October 2007 (UTC)
True, but there's another factor at work here. There are, in fact, two different "levels" of importance assessment that feed into "None":
  • "Unknown" - the importance is not currently assessed, but eventually will be
  • "Not applicable" - the importance is not currently assessed, and will not be
I'm not convinced that simply having everything show up as "Unknown" will be an improvement. Kirill 17:12, 13 October 2007 (UTC)
I think "NA" might be better than "none", it wouldn't mislead the way that "none" can. Walkerma 03:57, 17 October 2007 (UTC)
"Importance: ???" and "Importance: N/A" are more explicit, agreed. Titoxd(?!? - cool stuff) 05:15, 17 October 2007 (UTC)

Marxism task force

The bot doesn't seem to be compiling the data for the Marxism task force. The chart is produced by this template: {{philosophy task force assessment|Marxism}}, as can be seen among the Philosophy task force assessments. I'm not sure why it would be different from the rest. Pontiff Greg Bard 23:16, 16 October 2007 (UTC)

The log shows recent activity. Is there any reason to suspect something is not right? Oleg Alexandrov (talk) 03:40, 17 October 2007 (UTC)
I tagged articles in Category:Marxist theory and then I ran the bot manually. Those articles aren't showing up in the worklist. There should be a bunch more. Pontiff Greg Bard 05:20, 17 October 2007 (UTC)
Can you give an example of an article which should show up and does not? Oleg Alexandrov (talk) 03:24, 18 October 2007 (UTC)
Cultural hegemony is one example. All of the rest of Category:Marxist theory as well. Pontiff Greg Bard 03:45, 18 October 2007 (UTC)
Per the instructions, Talk:Cultural hegemony should be in a Marxism quality category, which in turn should be in the Wikipedia 1.0 category. Try to do this and see what happens. Oleg Alexandrov (talk) 03:30, 20 October 2007 (UTC)

WP 1.0 bot: tweak formatting

Hi. I'd like to request a change be made to the formatting of the statistics generated by the bot, matching this change. This will remove some excessive whitespace after the table when it is transcluded (and remove some redundant bolding). Thanks in advance. --PEJL 19:31, 19 October 2007 (UTC)

That's easy enough to implement. I wonder what people think. Oleg Alexandrov (talk) 03:32, 20 October 2007 (UTC)
Fine by me, although I'm not sure of the necessity. John Carter 16:13, 20 October 2007 (UTC)
It is needed to remove excessive whitespace below the table when transcluded, like at WP:SONG#Progress (which includes the change). --PEJL 16:34, 20 October 2007 (UTC)
Looks good to me. Kirill 21:28, 20 October 2007 (UTC)
I modified the bot, as seen in this diff. The change will propagate on the next bot run. Oleg Alexandrov (talk) 03:15, 25 October 2007 (UTC)
Thanks! --PEJL 12:38, 28 October 2007 (UTC)

Request for more work

I know it would probably be insanely difficult, but would there be any way to structure the assessments such that for the main articles by quality chart on the page here, the articles contained in a given box could be pointed out. I'm thinking particularly here that it might be useful to be able to click on the box containing stub articles of top importance and seeing exactly which articles are included there. It might make choosing collaboration topics, if nothing else, a lot easier. John Carter 16:13, 20 October 2007 (UTC)

I'm guessing that this would have to be done within the project banner template code, much the way that the current assessment does. So you'd need a whole new set of categorization in addition to the current sets. Although there probably are other ways to do it. The better question is how necessary it would be? I can't imagine a project with an overwhelming number of Top or High importance articles (because that would cheapen the value of it), and it is unused in many projects altogether anyway. Girolamo Savonarola 21:33, 20 October 2007 (UTC)
I guess you are talking about the intersection of two categories (say Top importance & Stub quality). There was a tool somewhere which did that (web-based I think). I don't know how I could implement it as part of the bot. I am also not sure there is a big need for such a thing in the table of stats. Oleg Alexandrov (talk) 03:30, 21 October 2007 (UTC)
Frankly, I'd rather see the two list classes added as proposed above - I think it would be more productive to the projects, and it would require far less work collectively to alter the bot code instead of creating hundreds (if not thousands) of new categories for this proposed change. Girolamo Savonarola 04:28, 21 October 2007 (UTC)
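For what it is worth, the category intersection Oleg mentions needs no new categories if the two member lists are already at hand. A minimal Python sketch, with the function name and inputs invented for the example:

```python
from typing import Iterable, List


def articles_in_cell(quality_members: Iterable[str],
                     importance_members: Iterable[str]) -> List[str]:
    """Articles that appear in both a quality category and an importance category."""
    return sorted(set(quality_members) & set(importance_members))
```

For the "Stub quality, Top importance" box, the inputs would be the members of the project's Stub-Class category and of its Top-importance category.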

Move of Wikipedia:WikiProject Comics/Article Classification to Wikipedia:WikiProject Comics/Assessment

I've moved Wikipedia:WikiProject Comics/Article Classification to Wikipedia:WikiProject Comics/Assessment as that seems to be the consensus name for such pages, but I was wondering if that will affect the bot. I have updated most links to the name, the only links I haven't updated are the ones in the bot generated Wikipedia:Version 1.0 Editorial Team/Comics articles by quality page and subpages. I can't work out how the bot generates the link to Wikipedia:WikiProject Comics/Article Classification, so I can't work out what parameter or variable I need to change or where that might be. Any help appreciated. Hiding Talk 14:19, 22 October 2007 (UTC)

Your change in the index is enough, the bot will read and reflect that change in the subpages at the next run. Oleg Alexandrov (talk) 14:50, 22 October 2007 (UTC)
    • Thanks, I thought it would be something like that but wanted to check. Hiding Talk 16:32, 22 October 2007 (UTC)

Should I fetch the Magnums?

I've been manually prodding the bot to assess the WP Films task forces, which it's had little trouble doing. However, it seems to stall in the middle of surveying the Stubs (which were 23k at last count), and my browser still shows "Waiting for..." in the status bar, but the tab icon indicates it's given up, and there's no continued building of the page, even after several minutes of waiting for what usually takes ~15 seconds. So is the project size choking the bot? Many thanks, Girolamo Savonarola 20:48, 23 October 2007 (UTC)

Just tried it again, last line before it dies is "Getting http://en.wikipedia.org/w/query.php?what=category&cptitle=Stub-Class+film+articles&format=txt&cpfrom=Love+on+the+Dole&cplimit=500". Girolamo Savonarola 20:55, 23 October 2007 (UTC)
That is not a problem with the bot itself. Since the interface is web-based, it has to go through the web server, and my guess is that the web server cuts the connection after a while. Therefore, long projects can't be run from the online version of the bot.
Now that I have fixed the mistake that kept the bot from updating for the last ten days, hopefully there will be less need to run the bot by hand through the web interface. Oleg Alexandrov (talk) 02:59, 25 October 2007 (UTC)
Okay, something of that ilk was sorta my second suspicion - I originally tried to test on different computer/browser combinations in the thought that it was the client that was hanging up, but since they all did, the server does seem to be the better culprit. Girolamo Savonarola 03:58, 25 October 2007 (UTC)
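
For illustration only, here is a rough sketch of why a large category takes so many round trips. The quoted URL asks for 500 titles at a time (`cplimit=500`), starting from a given title (`cpfrom=...`), so roughly 23,000 stubs would need about 46 requests in a row. This is not the bot's actual code: `fetch_page`, `make_fake_fetcher`, and `list_category` are hypothetical names, the way the next starting title is returned is an assumption, and an in-memory list stands in for the real server.

```python
import bisect


def make_fake_fetcher(all_titles):
    """Build a stand-in for the category query (hypothetical, no network)."""
    ordered = sorted(all_titles)

    def fetch_page(start, limit):
        # Begin at the first title >= start, like the cpfrom parameter.
        i = 0 if start is None else bisect.bisect_left(ordered, start)
        batch = ordered[i:i + limit]
        # Assumed convention: return the title to continue from, or None.
        next_start = ordered[i + limit] if i + limit < len(ordered) else None
        return batch, next_start

    return fetch_page


def list_category(fetch_page, limit=500):
    """Collect every title by requesting one batch after another."""
    titles, start, requests = [], None, 0
    while True:
        batch, start = fetch_page(start, limit)
        requests += 1
        titles.extend(batch)
        if start is None:
            return titles, requests


# About 23k stubs, as mentioned above, fetched 500 at a time.
fake = make_fake_fetcher(f"Film {n:05d}" for n in range(23000))
titles, requests = list_category(fake)
```

With these made-up numbers, `requests` comes out to 46. If a web server in front of the script drops the connection after a fixed time, a run this long could be cut off partway through, which would match the stall described above.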

14th since Novels

We seem to have some automated process problems again. Novels have not been updated since the 14th and no major processing appears to be happening at present. Or am I wrong? :: Kevinalewis : (Talk Page)/(Desk) 09:29, 24 October 2007 (UTC)

Sorry, I made a syntax mistake when modifying the cron job back then, that's why it would not run. I should have checked the bot every now and then of course, but did not get to it these days. I fixed that mistake now and started the bot by hand. Hopefully from now on it will run regularly again. Oleg Alexandrov (talk) 14:59, 24 October 2007 (UTC)
Thanks, always difficult when questioning someone else's truly hard work, which in itself is appreciated by many. :: Kevinalewis : (Talk Page)/(Desk) 15:08, 24 October 2007 (UTC)

Does this main table include articles in sub-projects twice?

I was wondering whether the numbers on the table on this page count an article twice if both its project and its sub-project have v1.0 tables. How does it work in the case of projects like WikiProject British Royalty, which uses the WikiProject Biography template with a parameter set to "yes" that adds it to both projects? --Arctic Gnome (talkcontribs) 21:36, 14 October 2007 (UTC)

The bot counts each article only once in the big stats table on WP:I. At least, that's how I think I programmed it. :) Oleg Alexandrov (talk) 03:36, 17 October 2007 (UTC)
If an article is ranked at different importance levels by two projects, at what importance is it listed in the table? --Arctic Gnome (talkcontribs) 20:10, 9 November 2007 (UTC)
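
As a hedged illustration of the counting rule: earlier in this archive it was explained that, for the big table, an article already encountered in one project is ignored when it turns up again in another. Under that rule, the importance that ends up in the table would be whichever one the bot sees first. The sketch below is not the bot's code; `count_once`, the data layout, and the sample ratings are all made up for the example.

```python
from collections import Counter


def count_once(projects):
    """Tally (quality, importance) pairs, counting each article one time.

    `projects` is a list of (project_name, ratings) pairs, where ratings is
    a list of (article, quality, importance) tuples. Hypothetical layout.
    """
    seen = set()
    table = Counter()
    for _project_name, ratings in projects:
        for article, quality, importance in ratings:
            if article in seen:
                continue  # already counted via an earlier project
            seen.add(article)
            table[(quality, importance)] += 1
    return table


# Made-up sample: one article tagged by a project and its sub-project.
projects = [
    ("Biography", [("Elizabeth I", "B", "High"),
                   ("Ada Lovelace", "Start", "Mid")]),
    ("British Royalty", [("Elizabeth I", "B", "Top")]),
]
table = count_once(projects)
```

Here the table holds two articles in total, and "Elizabeth I" is counted under B/High because the Biography rating was read first; the Top rating from the second project is dropped. Which project the real bot reads first is not stated in this discussion.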

Dab-class

The other article space class that we need to clean up our reports is Dab, Disamb, or Disambig class. Let's get the article space cleaned up in the reports.--TonyTheTiger (t/c/bio/tcfkaWCDbwincowtchatlotpsoplrttaDCLaM) 21:51, 6 November 2007 (UTC)

Now here's something we really don't need. Aren't disambiguation pages (only those that have talk pages in the first place, not ones created just to show a banner) simply rated as NA-Class? If they are, I don't think we need to start creating a class for every new type of page. It just makes things slightly harder for new users when they start to assess articles. The scheme has always been simple: featured lists and articles go under FA; GA, A, B, Start, and Stub go under their own classes; lists go under {{List-Class}}; and anything else (categories, portals, etc.) goes under {{NA-Class}}. That's a simple outline of assessment here, and I really don't think we should make the assessment process even more difficult. Spebi 20:06, 14 November 2007 (UTC)

Update the category-making bot to add lists

Should the category-making bot be updated to make a category for list-class articles now that they are a recognised part of the project? --Arctic Gnome (talkcontribs) 03:14, 16 November 2007 (UTC)

Done. Thanks, I forgot about that. Oleg Alexandrov (talk) 04:01, 16 November 2007 (UTC)