Wikipedia talk:Version 1.0 Editorial Team/Index
Intended meaning of importance
There was a disagreement at Talk:NPA personality theory about how to rate the article's quality and importance for Wikipedia:WikiProject Psychology. The quality issue was cleared up pretty quickly, but there was quite a discussion about what exactly "importance" is supposed to mean in this context, and some additional discussion took place at Wikipedia talk:WikiProject Psychology/Assessment. I know that the WikiProjects are not required to use the same assessment scale as the Version 1.0 Editorial Team, but I think we want it to be as consistent as possible. I wanted to ask this group, what exactly are you looking at when you assess an article's importance? How do you intend to use this assessment in your project? I appreciate any feedback you can give us to clear this up! —Cswrye 19:14, 19 October 2006 (UTC)
- For WP:TWP assessments, the rule of thumb I've been using is judging how vital an understanding of the article's topic is to understanding the history and technology of rail transport worldwide. The top-level technology definition articles on {{Train topics}} are the only articles we've marked with Top importance, while at the other end, almost all subway stations are rated as Low importance. So far, only one of the ratings that I've applied has been questioned (the other editor thought an article should be High rather than Mid importance), so the guidelines that I put on the Assessment page seem to have been accepted by the rest of the project. Slambo (Speak) 19:22, 19 October 2006 (UTC)
-
- As was stated in the NPA theory talk page, we at WP:1.0 are using importance to prioritise articles within a particular subject area. Your importance criteria look excellent to me - they fit within our broad guidelines, but they have been tailored for your particular subject. If we are to produce a DVD with 20,000 articles, perhaps 200 of those might be related to psychology. Which 200 should we pick? A featured article or even good article that is equivalent to Slambo's project's Jordanhill Railway Station (i.e., low importance) may be a nice read, but it has no place in a general encyclopedia. However a topic like behaviorism (high importance) would be appropriate, even if only B-Class quality. We are currently looking to set up a bot that will pull out all of the usable articles with a certain level of importance, while allowing for the scope of the WikiProject (thus, a "High-importance" WikiProject:Psychology article would carry more weight than a "High-importance" WikiProject:Behaviourism article, say). We will also compensate for how a project grades, to allow for a project trying to cram all its articles as Top or High importance. Once we have some pilot trials done, I'll let people here know what's happening.
-
- Scanning quickly through the comments on the talk page you cite, I see the problem is perhaps one that is more common in psychology than in many fields (my field is chemistry, where things are more hierarchical). There may be a lot of theories out there that are broad in scope, but only a few of them are widely accepted. IMHO the "widely accepted" (and known) should trump the broad scope every time. Would a fresh psychology BS/BSc/BA know about this theory? Would Britannica have an article on this topic? If the answer to both of these is NO then it is Mid or Low. If it's Mid or Low, and you were to look at a newly-written encyclopedia of psychology, would you expect to see it included? If not, then it is probably Low importance; if yes then it would be Mid. If the theory gains popularity, then it could rise in importance over time, but we shouldn't be trying to push things ahead of the psychology community - we can't assume that this particular theory will become important in the future. Does all this seem reasonable? Walkerma 20:38, 19 October 2006 (UTC)
-
-
- Makes sense to me. In fact, the way that you portrayed importance is exactly the way that I think it should be assessed. I appreciate the input. —Cswrye 15:19, 27 October 2006 (UTC)
-
- Update. The article in question was deleted and it's now a story on Slashdot, where they say (incorrectly) that it's a hoax. --Salix alba (talk) 10:53, 6 November 2006 (UTC)
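Walkerma's rule of thumb for separating Mid from Low importance, given earlier in this thread, can be written out as a small decision function. This is only a sketch: the function name and the three yes/no parameters are hypothetical, and the thread does not say how to choose between High and Top, so that branch is left to the project's own criteria.

```python
def suggest_importance(known_to_new_graduate: bool,
                       in_general_encyclopedia: bool,
                       in_subject_encyclopedia: bool) -> str:
    """Rough importance suggestion following the rule of thumb in this thread.

    known_to_new_graduate:   would a fresh BS/BSc/BA in the field know the topic?
    in_general_encyclopedia: would a general encyclopedia (e.g. Britannica) cover it?
    in_subject_encyclopedia: would a new subject-specific encyclopedia include it?
    """
    if known_to_new_graduate or in_general_encyclopedia:
        # The rule of thumb only settles the Mid/Low boundary; anything
        # above that is left to the WikiProject's own criteria.
        return "High or Top (decide per project criteria)"
    if in_subject_encyclopedia:
        return "Mid"
    return "Low"
```

For a theory that new graduates would not know and that neither kind of encyclopedia would cover, `suggest_importance(False, False, False)` gives "Low".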
Non-article class parameters
The following postings were copied by User:Walkerma from the main 1.0 talk page
Can anyone advise on how the non-article class parameters are supposed to work for the purposes of these combined WikiProject/assessment banners being placed on non-article talk pages? I have seen these banners placed on talk pages for relevant categories and templates using class=NA, class=template, class=category, and so on. But the approach doesn't seem completely consistent.
An example is Category:Template-Class_film_articles where the Film WikiProject has grouped templates that have been "rated" template-class. This imprecise wording is avoided if the "NA wording" is used to say that the template is a template and doesn't need rating. Some templates have been set up to do this, but I can't find any examples at the moment. Can anyone remind me where they are, or how to tweak the wording?
Going back to the film non-article parameters. The blurb on Category:Film_articles_by_quality shows that the system has been extended to include other classes such as List, Category and Disambig (I haven't found anyone yet using a "redirect" class to organise redirects, though see Category:Middle-earth redirects). I assume, that like the NA classification, these "non-article" classifications don't appear in the film quality statistics page and other stats pages, which I believe are maintained by a bot. I can understand why it doesn't include them directly, but what is the best way to generate statistics based on these non-article parameters such as NA, category, and template?
An alternative approach is seen at WikiProject Middle-earth, where Template:ME-project is used on article talk pages, Template:ME-category is used on category talk pages, and Template:ME-template is used on template pages.
Is there any reason to prefer putting all the parameters inside one template (as in the Film WikiProject), or to use separate banners (the Middle-earth WikiProject)? I prefer the latter approach, but was wondering if the assessment statistics approach could be adapted to include stats on the number of templates, categories and other non-article pages? Carcharoth 11:32, 21 October 2006 (UTC)
- Another approach is the one I have started at Wikipedia:WikiProject_Middle-earth/Assessment#Page_types, where I am proposing a separate set of "page type" parameters and a separate line in the banner on which to display this parameter. Would this be helpful? Part of the reason for this is that it would be helpful to be able to assess some lists (currently, people tend to mix a "list" parameter into the rating scale), and in some cases to assess some of the larger templates (though this is not essential). Carcharoth 15:00, 21 October 2006 (UTC)
End of copied comments
- Well, {{WPMILHIST}} uses the wording change trick, but it only supports the single "NA" class. If you wanted to have multiple classes, doing it with if statements would probably be too ugly to even contemplate; the cleaner solution would presumably be to use a switch statement for that line in the template:
{{#switch:{{{class}}}
 |FA |A |B |Start
 |Stub= This article has been rated as {{{class}}}-Class.
 |Dab= This article is a disambiguation page, and does not require a rating.
 |Cat= This page is a category, and does not require a rating
 ...
}}
- There doesn't seem to be any particularly good way of getting statistics for these, but I'm not entirely certain why they would be all that useful anyways.
- As far as having a separate field for these is concerned, I don't think it's a good idea. With the possible exception of lists (but those are genuine articles, and should really be assessed as any other article is), the various things that get these other tags can't be meaningfully assigned to any of the quality levels. It's meaningless, in other words, to talk about an "A-Class category" or a "FA-Class disambiguation page". If they're all just going to have "NA" in the first field, though, I don't see the point in introducing a second one; the existing "class" parameter can just as easily be used, as it's not doing anything productive there anyways. Kirill Lokshin 05:38, 22 October 2006 (UTC)
- Thanks for the sample code. That is very helpful. I found that having separate banners was easier and produces the same result, though any general wording changes have to be tweaked over 4-5 templates, but I'll have a look at the coding sometime. Someone also mentioned that some templates are complex enough to need a rating, e.g. Template:Saffir-Simpson_full. Part of the problem is that I am using the example of the Assessment bot (Mathbot) to try and generate statistics about all the pages maintained by a WikiProject, not just the pages that need assessing. That is of more interest to the WikiProject than the assessment project. I agree that class=List is not helpful, but the existence of a separate "Featured-list" process means that Class-FA is not applicable, so another quality parameter is needed for lists. Carcharoth 09:58, 23 October 2006 (UTC)
India project and Trains banner templates have the following class values.
- Disambig or Dab - The article is a disambiguation page.
- Redirect or Redir - The article is a redirect page.
- Template - The page is a template.
- Category or Cat - The page is a category.
- Image or Img - The page is an image.
- List - The page is a list
- NA - any other than the above.
If you need more information, check out the banner templates. Regards, Ganeshk (talk) 06:41, 22 October 2006 (UTC)
Those classifications have absolutely no bearing whatsoever on Wikipedia 1.0 assessments (which this page is about). Mathbot doesn't read them and it really doesn't matter how they're formatted. That's why there's no standard scheme. I agree with Kirill about class=List and said pretty much the same when somebody mentioned that my plugin doesn't support it: a list is an article, it can be featured, it should be assessed. --kingboyk 11:37, 22 October 2006 (UTC)
- These classifications help the project in grouping them into different categories. I hope Mathbot would someday check them too. :) The banners I think are useful for both 1.0 assessments and a general awareness of article count and quality within the project. Until now, there was no such project-level statistic. Some lists are just list of wiki-links. Those would be tagged class=List. Lists that have good content are given class ratings such as FA, GA etc. -- Ganeshk (talk) 16:44, 22 October 2006 (UTC)
- If they're just links they should be a category or (if they're a red link farm) a worklist in WikiProject space, imho.
-
- Agreed about the usefulness of banners and Mathbot's work; I'm just giving some historical perspective as to why that part (class=NA etc) isn't standardised. --kingboyk 18:56, 22 October 2006 (UTC)
- I understand that Mathbot is set up to deal with assessment categories. I should have made clearer that I'm using its example to set up similar statistics pages. I really like the stats pages that this bot generates. So would it be better to try and get a separate bot to run over any categories that have been set to track the templates, categories and whatnot that a WikiProject also deals with, and what would be the best way of doing that? It is fairly easy to manually cut and paste categories and set them up as a list in a Wikipedia project page to visually survey as a tree, and also as a more permanent snapshot than the toolserver CategoryTree tool, but I like the idea of getting a bot to do the counting. Carcharoth 09:58, 23 October 2006 (UTC)
- Specific WikiProjects of course have different/more needs than the WP1.0 project. I'd be rather reluctant to modify the current bot script to serve such (diverging) requests. However, I'd be more than happy to give my Perl code (or at least the subroutine for reading categories) to anybody who knows some Perl and would be willing to implement extra features for his/her specific WikiProject (although for somebody starting fresh a better idea may be to use the Python bot framework instead, which is more mature than the Perl bot framework). Oleg Alexandrov (talk) 15:00, 23 October 2006 (UTC)
-
-
Problem at Wikipedia:Version 1.0 Editorial Team/Ethnic groups articles by quality
Thread moved from Wikipedia talk:Version 1.0 Editorial Team/Work via Wikiprojects
Hi,
There's a problem with one of the comments at Wikipedia:Version 1.0 Editorial Team/Ethnic groups articles by quality. The article in question is "Obotrites," but it has weirdness in the comment section... help would be appreciated!--Ling.Nut 15:50, 21 October 2006 (UTC)
PS - it may have something to do with the fact that the comment page for "Rukai people", which probably appeared in that particular slot previously, was deleted. --Ling.Nut 16:01, 21 October 2006 (UTC)
- I'm not sure if Mathbot had formatted the page correctly or not (I think maybe not, so please check the old revision Oleg), but one immediate problem was that Talk:Rukai people/Comments was redlink (it had been deleted). Another potential problem for including comments in tables was that one of your members has a | in his signature. --kingboyk 16:13, 21 October 2006 (UTC)
-
- That was a bug in my code which was triggered by redlinked comment pages (as remarked above by Kingboyk). It was actually a big bug, I wonder how come it did not cause more trouble. Fixed now. Thanks! Oleg Alexandrov (talk) 15:33, 22 October 2006 (UTC)
Two template suggestions
I would like to make two template suggestions:
- The Jim Thorpe Problem - If you look at the referenced talk page, you'll see banners for the following WikiProjects: Beisbol, Penn, OK, Indigenous peoples, Biography and NFL. It would be good to have a template with a single assessment that allows more than one WP to be listed.
- The Overlapping Problem - If you look at WikiProject California and WikiProject Southern California, or WikiProject Pennsylvania and WikiProject Philadelphia, you'll see that they overlap. It would be good to have a template for the lower group in the hierarchy that allows it to have its articles placed in both WikiProjects.
--evrik 20:03, 24 October 2006 (UTC)
- Yes, I agree, this is a problem we've been discussing here and at WP:WVWP. The problem is the explosion of these templates and assessments. We already have a few hybrid templates - all chemical elements are in Wikipedia 0.5, so we have a joint Version 0.5/Chemistry template. The key here is not the technology - that's easy - the key is to get the projects to talk to one another. I think this is probably best handled through WP:COUNCIL, but that's a pretty new group with little clout as yet, so much of the template consolidation for now will have to happen on a case-by-case basis, with projects agreeing things between each other. Try posting a comment at the relevant WikiProject! Thanks, Walkerma 21:59, 24 October 2006 (UTC)
I found a silly example of this at List of fictional battles (I know, I know...). Given that the list can be expanded almost indefinitely, does each fictional universe WikiProject get to assess the list? But I am sure that better examples of overview articles can be found. Such as Human or Earth - several WikiProjects have probably fought pitched battles over those articles already! :-) I tend towards the share-and-share-alike mentality, but then I found someone had put a WP-Film template on Tom Bombadil, and I went livid! :-) But I do have issues with articles where there are bits about adaptations in films, eg. many of the LotR character articles have a bit about the film adaptation, so Gandalf has a "film-project" tag because there is a small section about Adaptations. For LotR, there are separate film articles, so that is not so much of a problem. But where overlap is so unbalanced, at what point can another WikiProject get a foot in the door, so to speak? Carcharoth 05:45, 25 October 2006 (UTC)
- The fundamental criterion, I think, is whether there is (or should be) content in the article related to the scope of the project. In your examples, Gandalf would be in Films because there's a discussion about him in films, while Bombadil probably isn't (but could be, if editors decide to add discussion of why he was cut from the films to the article). In general, though, the projects themselves are usually quite content to let other projects tag "their" articles; it's mostly the people who don't like the tags that complain. ;-) Kirill Lokshin 05:49, 25 October 2006 (UTC)
-
- It just seems a bit silly for the film-project tag for Gandalf to have a rating of B-class. Is that a rating for the whole article, or just the paragraph on Gandalf in films. We know it is the film bit, but how does that square with getting a rating that means anything in a list of film-related articles? The Cultural depictions of Joan of Arc mentions films about her. Does that mean that there is any point in the WP-Film people rating the article? I'd say not. Carcharoth 06:01, 25 October 2006 (UTC)
-
-
- In practice, it's a rating for the whole article. In many cases, the ratings will match among the various projects (sometimes because the same person will fill out all of them); in other scenarios, however, they may vary depending on what's being looked for. For example, suppose that Cultural depictions of Joan of Arc discussed novels at length but glossed over the films; it might then get a high rating from the Novels project (as everything they looked for would be included) but not from the Films project (members of which would probably be more likely to notice the omissions). Kirill Lokshin 06:08, 25 October 2006 (UTC)
-
-
-
-
- Right. I've also been considering whether to see if the Disaster management WP has the manpower to assess disaster articles. There, the logical course of action would seem to be to assess those articles that aren't already covered by, say, aviation, or trains, or earthquake, or hurricane, WPs. A kind of, fill-in-the-gaps approach for a broad, overview project. That could also apply to the Meteorology WP mentioned above. So having a parameter in the banner template coding that says "still part of this project, but not rated because a sub/sister project has rated it", or something? Worth thinking about? Carcharoth 06:34, 25 October 2006 (UTC)
-
-
Wikipedia:Version 1.0 Editorial Team/Germany articles by quality statistics
...has no assessed articles shown, 327 unassessed, but a total below the importance ratings of 1339. Category:Stub-class Germany articles, for example, contains hundreds of articles and is properly situated in Category:Germany articles by quality which is a subcat of Category:Wikipedia 1.0 assessments. Any idea what's going on? --kingboyk 13:55, 22 October 2006 (UTC)
- It should be "Stub-Class" and not "Stub-class" (uppercase "Class"). I now made the bot accept the lowercase version too [1]. Such things go against the nature of Wikipedia, which is case-sensitive; let us hope that won't introduce more bugs in the bot. Oleg Alexandrov (talk) 16:19, 22 October 2006 (UTC)
- Aha. I'd have just fixed the category and sent them a nasty letter personally, but you're the boss! ;) --kingboyk 18:55, 22 October 2006 (UTC)
- That was my first idea too. But then I realized that such things could happen in the future too, and the changes to the code were not big (and luckily it was Sunday morning and I had time to kill :) Oleg Alexandrov (talk)
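The fix Oleg describes, accepting "Stub-class" as well as "Stub-Class", amounts to matching category names case-insensitively and then mapping them back to one canonical spelling. The sketch below is an illustration in Python, not Mathbot's actual Perl code; the function name, the regular expression and the list of class names are all assumptions.

```python
import re

# Assumed set of class names; the real list lives in each project's templates.
KNOWN_CLASSES = ("FA", "A", "GA", "B", "Start", "Stub")

# Matches e.g. "Category:Stub-Class Germany articles", ignoring letter case.
CATEGORY_RE = re.compile(r"^(?:Category:)?(\w+)-class (.+) articles$", re.IGNORECASE)


def parse_quality_category(name):
    """Return (canonical_class, project), or None if the name does not match."""
    match = CATEGORY_RE.match(name.strip())
    if not match:
        return None
    raw_class, project = match.groups()
    for cls in KNOWN_CLASSES:
        # Compare case-insensitively, but report the canonical spelling.
        if cls.lower() == raw_class.lower():
            return cls, project
    return None
```

With this, "Category:Stub-class Germany articles" and "Category:Stub-Class Germany articles" both yield the same ("Stub", "Germany") pair, so a miscapitalised category no longer drops out of the statistics.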
150 projects milestone
The index currently shows 150 projects are participating in the bot process. Should this go to Signpost? -- Ganeshk (talk) 18:43, 24 October 2006 (UTC)
- It will help recruit more projects to participate too. -- Ganeshk (talk) 18:43, 24 October 2006 (UTC)
-
- It probably could—I doubt anyone will look at it too closely—but I'll point out that we don't actually have 150 separate projects. WP Beatles, for example, is responsible for a half-dozen different lists from the 150. Kirill Lokshin 18:47, 24 October 2006 (UTC)
- (Edit conflict) I don't think we should send things too often - it's only about a month since they covered us reaching the 100,000 assessed article mark (and now we've passed 150,000!). I think we should hold off until 200 or (preferably) 250 projects, and/or 250,000 assessed articles. I disagree with Kirill about the value of this; I know for certain that some people found out about the assessment program through our recent publicity. Walkerma 18:53, 24 October 2006 (UTC)
- Oops, I wasn't very clear there; I meant people wouldn't look too closely at the fact that we don't actually have 150 projects, not that they wouldn't care about the announcement. ;-) Kirill Lokshin 19:04, 24 October 2006 (UTC)
- Ironically enough, Kirill brings a point that recently reared its head in WP:WPTC: whether it is a good idea to make navigation a bit more hierarchical, and put, instead of Category:Tropical cyclone articles by quality and Category:Meteorology articles by quality as separate entities, Tropical cyclones inside Meteorology, but still existing as separate entities. For the bloody details of the discussion, you can see here, but it would also be helpful for WP Beatles, perhaps. Titoxd(?!?) 05:55, 25 October 2006 (UTC)
- There's also a few empty entries and at least one duplicate (see Wikipedia_talk:Version_1.0_Editorial_Team/Index#Article_counts). --kingboyk 16:11, 5 November 2006 (UTC)
Problem with WP:Dallas
I can't figure out why articles aren't showing up in Category:Dallas articles with comments. An example of a page that should is Talk:Oak Lawn, Dallas per Talk:Oak Lawn, Dallas/Comments and coding from {{WikiProject Dallas}}. Any help/advice would be great!! drumguy8800 C T 19:30, 26 October 2006 (UTC)
- This is fixed. I had to add a little space before the sort key. That did the trick. Regards, Ganeshk (talk) 04:07, 27 October 2006 (UTC)
Mixed up entries on WP:HV
I don't know why but it looks like a few entries are mixed up for WP:HV. Please let us know if there is a bug in our code and if there's any good way to avoid this. The relevant entries are: Wikipedia:Version 1.0 Editorial Team/Heraldry and vexillology articles by quality/1 (articles 2 and 3), and Wikipedia:Version 1.0 Editorial Team/Heraldry and vexillology articles by quality/2 (articles 387 and 388) and Wikipedia:Version 1.0 Editorial Team/Heraldry and vexillology articles by quality/4 (articles 37 and 38 and 391 and 392). Regards. Valentinian (talk) / (contribs) 09:27, 27 October 2006 (UTC)
- The problem appears when an article has a /Comments page (coming from another wikiproject assessment, since WP:HV doesn't use comments). It looks like there's a }} missing when the comments are transcluded into the worklist, e.g. (for the first example above)
{{assessment | page=[[Elias Ashmole]] [http://en.wikipedia.org/w/index.php?title=Elias_Ashmole&oldid=75298981 ] | importance= | date=[[October 6]], [[2006]] | class={{FA-Class}} | version=0.5 | comments={{Talk:Elias Ashmole/Comments }}
- Editing the page to add an extra }} after /Comments and before the closing }} of the {{assessment}} template fixes the problem for the current instance of the page, but this will of course get overwritten with the next update. --Dr pda 12:03, 27 October 2006 (UTC)
-
- That's a bot bug. The bot was adding the "}}" just fine, the problem is that when it was reading its own output next time it was not reading the "}}" in. I fixed the bug, see here. Thanks for the report. Oleg Alexandrov (talk) 05:31, 28 October 2006 (UTC)
-
-
- Thanks for the help. Valentinian (talk) / (contribs) 07:54, 28 October 2006 (UTC)
-
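The symptom in this thread (a closing "}}" lost when the bot re-read its own output) is the kind of thing a brace-counting check can catch before a page is saved. The following is a minimal sketch in Python, not the bot's real Perl code; the function name is hypothetical, and the check deliberately ignores triple-brace parameters such as {{{class}}}, which the worklist entries shown above do not use.

```python
def find_template_end(text, start=0):
    """Return the index just past the `}}` closing the template that opens at `start`.

    Nested templates such as {{Talk:Page/Comments}} are counted, so the
    result points past the outermost closing braces. Raises ValueError if
    the braces never balance.
    """
    if not text.startswith("{{", start):
        raise ValueError("no template opens at this position")
    depth = 0
    i = start
    while i < len(text) - 1:
        pair = text[i:i + 2]
        if pair == "{{":
            depth += 1
            i += 2
        elif pair == "}}":
            depth -= 1
            i += 2
            if depth == 0:
                return i
        else:
            i += 1
    raise ValueError("unbalanced template braces")
```

Run against an entry like the Elias Ashmole one quoted above, where the outer {{assessment}} call is never closed, the function raises ValueError instead of silently accepting the truncated text.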
Article counts
I was wondering if the bot ought to ignore WikiProjects which have 0 articles? (Index · Statistics · Log)
Also, since you collect the stats for every project on every run, it might be cool to add columns to the table on this page for total number of articles (per project) and number unassessed? --kingboyk 15:39, 18 October 2006 (UTC)
- Bump. I found another empty list today (Index · Statistics · Log; see also Index · Statistics · Log). --kingboyk 16:07, 5 November 2006 (UTC)
-
- I would think that if you want the stats for a project, you could just click on the stats link, available on every line in the index page. I'd think that having the total stats only in the index page should be enough.
- All I can say is that Mortal Kombat created the assessment page only in the middle of last month, and Taiwan only a few days ago, by the histories. I think maybe the time has come to remind Mortal Kombat that they said they were going to assess their articles, and that they might lose the option if they don't start, and Taiwan could certainly be told that we're ready for them to start assessing whenever they are, but I'd give them at least a month each before really removing them. Badbilltucker 14:25, 6 November 2006 (UTC)
-
-
- Thanks for checking these, Bill. IMHO only reminders are in order at this stage, but since these categories aren't doing any harm I don't think we should start using threats for quite a while yet! Walkerma 16:32, 6 November 2006 (UTC)
-
-
- About empty projects, I don't see what would be gained by ignoring projects which have zero articles. They are very few and the bot does not use much energy in updating those, as there is nothing to update. :) Oleg Alexandrov (talk) 16:26, 5 November 2006 (UTC)
- Just to prove I've too much time on my hands I made a combined Project league table.
- Results are interesting, showing the different approaches of some projects to grading. Some add the template to all their articles and then get on with grading; others only grade their best articles. --Salix alba (talk) 19:10, 5 November 2006 (UTC)
- ...and some use a bot (which gives large unassessed and auto-assessed-as stub numbers); some projects have assessed all the articles within their scope that they know about so have 0 unassessed both in the chart and literally. --kingboyk 19:18, 5 November 2006 (UTC)
-
-
- And you say that you have too much time on your hands? ;) Titoxd(?!?) 19:00, 6 November 2006 (UTC)
-
- How about % of articles at each grade (including unassessed)? That might be interesting. --kingboyk 19:16, 5 November 2006 (UTC)
- OK, I've added this to User:Salix alba/Project league table. --Salix alba (talk) 19:40, 5 November 2006 (UTC)
- Nice work. Now we need to ask Oleg nicely and perhaps Mathbot will write it out every day? :) --kingboyk 19:47, 5 November 2006 (UTC)
- Is it going to be useful? So far I see a large table with all kinds of stats which people may wonder about once in a while but which would not be worth updating every day, I would think. OK, if people think it would be worth updating every day, I could do it.
- By the way, Salix alba, actually it may be a nice challenge for you to write a Perl bot to read those pages, make the data go through the script you made already, and publish the data back. What do you think? Oleg Alexandrov (talk) 03:50, 6 November 2006 (UTC)
- I can help you set up the bot. :) Oleg Alexandrov (talk) 03:51, 6 November 2006 (UTC)
- Useful? Probably not, Oleg, no - I just thought it was cool in a geeky kind of way :) --kingboyk 11:48, 6 November 2006 (UTC)
-
- Kind of marginally useful. The table tells us a few useful things: there are still 33 unassessed V 0.5 articles, and there are only 22 countries articles. Both of these are things which could be quickly fixed. Having an overview makes it easier to spot such items. I know that User:Lincer is currently tagging random articles and he might find it a bit more effective to target some of the projects with big unassessed categories.
- The table does not really need to be updated daily; a monthly update would probably suffice.
- Yes, I have thought about getting a bot account, probably to help make lists of mathematics articles by field, without having to bug Oleg too much. --Salix alba (talk) 17:04, 6 November 2006 (UTC)
- Good luck with that. Setting up a bot is very easy as soon as you install WWW::Mediawiki::Client, which has a lot of dependencies. I'd say it is worth you giving it a try. :) Oleg Alexandrov (talk) 05:41, 7 November 2006 (UTC)
-
- I converted the table to an OpenOffice spreadsheet format. You can find it at Image:Project league table.sxc at Commons. This would help in running queries on the data. Regards, Ganeshk (talk) 03:24, 7 November 2006 (UTC)
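The "% of articles at each grade (including unassessed)" column suggested in this thread is a simple calculation once the per-grade counts are known. A minimal Python sketch follows; the function name and the sample counts are made up for illustration and are not taken from the league table.

```python
def grade_percentages(counts):
    """Map each grade to its share of the project's articles, in percent.

    `counts` maps a grade name (including "Unassessed") to an article count.
    A project with no articles at all gets 0.0 for every grade.
    """
    total = sum(counts.values())
    if total == 0:
        return {grade: 0.0 for grade in counts}
    return {grade: round(100.0 * n / total, 1) for grade, n in counts.items()}


# Hypothetical counts for one project.
example = grade_percentages({"FA": 1, "B": 1, "Unassessed": 2})
```

Handling the zero-article case explicitly matters here because the index does contain a few empty projects.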
Bot went mad
Today the bot was editing just fine for a while, until here, then it started mass-blanking the articles. That is something I can't explain, either there is an error in the script, then it should be all wrong, or there should be no error, then all the pages should be right. I really must go to bed now, so I just killed the bot and will look into this tomorrow. Sorry, I don't know what is going on. Oleg Alexandrov (talk) 06:06, 7 November 2006 (UTC)
As if one can sleep. :) I found the problem; the html source code for categories changed suddenly in a very subtle way, but enough to confuse the bot. I fixed it now. Here are some issues to think about.
- What to do with the roughly hundred and sixty blanked pages (starting from the link above -- album articles by quality). The admin rollback applied to an article would do a mass revert not only of the last mathbot edit, but of all the mathbot edits to that article until a non-mathbot edit. Can the javascript tool do better?
- I modified the subroutine which collects articles and subcategories from a category to just die if something is suspicious and not ruin everything. Whether this will take care of all the problems is not yet absolutely certain. The moral is that parsing data from fickle html source code is a bad idea in principle. Any ideas? Oleg Alexandrov (talk) 06:51, 7 November 2006 (UTC)
PS I will not run the bot until something is done about item 1 above. Restarting the bot would of course regenerate all pages, but it will lose the history link information (see for example the first column here for what I mean). Oleg Alexandrov (talk) 06:54, 7 November 2006 (UTC)
- Glad you found the problem! There's nothing worse than an inexplicable bug! I wonder if we went through the pages with AWB, semi-manually reverting the Mathbot changes? If it's only 160 pages, that's doable isn't it? Walkerma 06:58, 7 November 2006 (UTC)
- From my reading of WP:BOT, there are various query APIs available (meta:API and User:Yurik/Query API); these might be more stable than directly reading the categories. --Salix alba (talk) 08:20, 7 November 2006 (UTC)
- I reverted a few projects manually that I am part of that got the boot. But I think we need to revert to save the history. Shane (talk/contrib) 08:33, 7 November 2006 (UTC)
- Beware of the rouge bot network, it'll start with a few innocent edits, then... Just spamming a little, Scoo 08:58, 7 November 2006 (UTC)
- I reverted today's edits to WP:DALLAS's list. You never know what you got 'til it's gone (er, temporarily whacked) ;) drumguy8800 C T 10:23, 7 November 2006 (UTC)
- I reverted a few but I'm not sure what to revert. Should we revert all of the last Mathbot edits made to the one before? This would include log, statistics, and the list (and its subpage). Cbrown1023 14:11, 7 November 2006 (UTC)
- What needs to be reverted is all the blanked pages (take the diff by looking at bot's contributions and see if the page got blanked). I will do a bunch of them by hand tonight, any help is of course appreciated. Oleg Alexandrov (talk) 17:17, 7 November 2006 (UTC)
- OK, I produced a list of what I believe are the corrupted pages, here. I've tried using AWB to revert these to the previous version, but I don't know how to do this. I know it can be done because I've seen lots of people use AWB to revert to the previous version after vandalism/blanking, which is in effect what is needed here. Can someone who knows AWB perhaps try this? I presume that once you set things up you can revert the problem in a few minutes. Thanks, Walkerma 18:01, 7 November 2006 (UTC)
- I don't think AWB can revert; it can only edit the current version. Wikipedia:Tools/Navigation popups can be handy for reverting, though it's a two-step process: first you need to do a diff or hist, and then hover over the revision you want and click revert. --Salix alba (talk) 18:19, 7 November 2006 (UTC)
- OK, those people must use some other tool as well. Anyway, good news. Looking at that list, and also at Mathbot's "contributions", it is clear that Oleg caught it once it reached the Firefly project's list. So I think it shouldn't be too hard to go through the list of 60 Foobar articles by quality pages manually. I will do that now. Should we also revert the logs, to make sure the Biography log doesn't choke as 100,000 articles are added into it? If we fix those pages, will Mathbot be able to fix the statistics page itself? Walkerma 18:46, 7 November 2006 (UTC)
- All the "by quality" list pages are now OK. It took me exactly 30 minutes to do this using the Auto-Martin Browser. Do we still need to revert the corresponding log pages? Meanwhile, I must get on with other things at work. Walkerma 19:28, 7 November 2006 (UTC)
- Thanks everybody. I planned that after work (meaning now) I'd go and revert after my bot, and now it appears that I was saved a couple of hours of hard work.
- The bot is fixed now, and again, I programmed it to just die if it can't detect subpages. I wish I could say there would never be problems again, but unfortunately one never knows, and the consequences of the bot going nuts would be hundreds of messed up pages. I guess the best we can do is to monitor the bot and block it if it misbehaves. Other ideas? Oleg Alexandrov (talk) 03:13, 8 November 2006 (UTC)
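The "just die if something is suspicious" approach described above can be sketched roughly as follows. This is only an illustration in Python (the bot itself is a Perl script whose code is not shown on this page); the names `check_listing`, `SuspiciousCategoryError` and the `max_drop` threshold are all hypothetical.

```python
class SuspiciousCategoryError(Exception):
    """Raised when a category listing looks wrong, so the run stops before any page is written."""


def check_listing(category, articles, previous_count, max_drop=0.5):
    """Return the article list unchanged, or raise if it looks implausible.

    previous_count is the number of articles seen in the last good run.
    max_drop is a hypothetical threshold: the fraction of articles that may
    disappear between runs before the listing is treated as a parsing failure.
    """
    if previous_count > 0 and len(articles) < previous_count * (1 - max_drop):
        raise SuspiciousCategoryError(
            f"{category}: {len(articles)} articles now, {previous_count} last run"
        )
    return articles


if __name__ == "__main__":
    # A normal listing passes through untouched.
    print(check_listing("Album articles by quality", ["A", "B", "C"], 3))
    # A listing that suddenly comes back empty stops the run instead of blanking the page.
    try:
        check_listing("Album articles by quality", [], 160)
    except SuspiciousCategoryError as err:
        print("stopping:", err)
```

The design choice is that a stopped run costs one day of updates, while a run that continues on bad input can blank many pages at once.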
24-hour bot runs
What'll happen when there're so many projects and updates to do that it takes the bot more than 24 hours to do a run? Then we won't have daily runs anymore. Rlevse 13:04, 9 November 2006 (UTC)
- Well, just because yesterday's bot did not finish running should not affect the run of today's bot. But you are right, very huge runs are not a good thing.
- The script spends most time in fetching the history of each newly added article (or article whose rating changed) and creating a link to a version in history. That thing needs to be done on a per-article basis, while everything else is done on a per category or list basis. Oleg Alexandrov (talk) 16:57, 9 November 2006 (UTC)
- OK, but what about when it takes over 24 hours to do that? Will the bot crash, will we have to change to running only every 2 days, etc? I recall when my project was added there were only 30 projects involved but now there are 170+ and every week it takes longer. That's what made me think of this future possibility. Rlevse 19:06, 9 November 2006 (UTC)
- The bot won't crash in the sense that the two running bot instances won't be editing the same pages at the same time so they won't conflict. But it may crash because one bot already takes around 50% of the memory of my machine when running (it needs to keep all articles in the memory to compute the totals without repetitions). With two bots I may need to run them on different machines on alternate days or something. There could also be other issues coming up when the scale of the project increases. Let's see. :) Oleg Alexandrov (talk) 03:56, 10 November 2006 (UTC)
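To illustrate why computing "totals without repetitions" requires holding every title at once: an article tagged by two projects must be counted only once in the overall total. A minimal sketch in Python, assuming (hypothetically) that each project's assessed titles are available as a list:

```python
def count_unique_articles(projects):
    """Count assessed articles across all projects, counting shared articles once.

    projects maps a project name to the list of titles it has assessed
    (a hypothetical in-memory layout, not the bot's actual data structure).
    """
    seen = set()
    for titles in projects.values():
        seen.update(titles)
    return len(seen)


if __name__ == "__main__":
    projects = {
        "The Beatles": ["Abbey Road", "John Lennon"],
        "Albums": ["Abbey Road", "Thriller"],
    }
    print(count_unique_articles(projects))
```

The set grows with the number of distinct articles, which is consistent with memory use rising as more projects join.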
- Would this make it faster? Titoxd(?!?) 05:02, 10 November 2006 (UTC)
- This will make things faster and more robust I believe. Thanks! I think Salix alba also mentioned this earlier. I will look into implementing this alternative way of collecting categories in the near future. Oleg Alexandrov (talk) 16:11, 10 November 2006 (UTC)
- By the way, Oleg, I've contacted Yurik about this, and hopefully he can tell us if this can be done with the current Query API, or if we have to wait until the MediaWiki API is implemented. Titoxd(?!?) 05:21, 11 November 2006 (UTC)
- After talking with Yurik on IRC tonight, he said that we can request the pages in the assessed categories using the current Query API (as in the link I made above), do the back-end work the bot currently does (for example, determine which pages changed categories, which articles were added to the assessment lists, etc.) and then we can retrieve revision IDs for every change with a separate query, for example, this one for a bunch of hurricane articles: [2]. Now, I am not sure how many articles you can retrieve at once, but if you retrieve, let's say, up to 10 at a time, that should be much faster than what we currently do, and more robust, as well. Just remember that they're phasing out the Query API gradually for the MediaWiki API, so the syntax for the first query may have to be eventually changed. You can also change the format of the query at any time, to be more efficient with Perl. So, now, I shall go to sleep, as it is past 2:00 am here. Titoxd(?!?) 09:25, 11 November 2006 (UTC)
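The "up to 10 at a time" idea can be sketched as below. This is an illustration only: `fetch_batch` is a hypothetical stand-in for whatever call actually retrieves revision IDs, and no real Query API parameters are assumed.

```python
def batches(titles, size=10):
    """Yield consecutive slices of titles, each with at most size entries."""
    for start in range(0, len(titles), size):
        yield titles[start:start + size]


def fetch_revision_ids(titles, fetch_batch, size=10):
    """Collect revision IDs for all titles, one request per batch.

    fetch_batch is a caller-supplied function (hypothetical here) that takes
    a list of titles and returns a dict mapping each title to a revision ID.
    """
    result = {}
    for group in batches(titles, size):
        result.update(fetch_batch(group))
    return result


if __name__ == "__main__":
    # Fake fetcher for demonstration: invents an ID from the title length.
    def fake_fetch(group):
        return {title: len(title) for title in group}

    titles = [f"Hurricane {n}" for n in range(25)]
    ids = fetch_revision_ids(titles, fake_fetch)
    print(len(ids), "revision IDs fetched in", len(list(batches(titles))), "requests")
```

With 25 changed articles this makes 3 requests instead of 25, which is where the speed-up would come from.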
- The real problem right now is the amount of time needed to fetch the history of each newly added article. Right now, about 1/4 of the total articles are involved. As we get more and more articles in the system, if I understand this right, there will be fewer and fewer articles added to the system for the first time, and with any luck the run time will be reduced. I hope. Anyway, that's the impression I got from what he said. Badbilltucker 19:28, 9 November 2006 (UTC)
(Deindent and reply to Titoxd above.) Then perhaps we can wait until they change the syntax, as we don't want the bot to again mass blank everything because it is confused by the change. Oleg Alexandrov (talk) 19:32, 11 November 2006 (UTC)
Mathbot down
I am not able to connect to my UCLA machine today, the network is down. I guess that explains why mathbot stopped running in the middle of the night. Oleg Alexandrov (talk) 16:13, 10 November 2006 (UTC)
- My school's network came back up, and the bot should run as usual tonight. Oleg Alexandrov (talk) 03:08, 11 November 2006 (UTC)
- Today the network was down again. Now it is up. The bot should run as usual tonight. Oleg Alexandrov (talk) 19:32, 11 November 2006 (UTC)
Bot bug?
While looking through Wikipedia:Version 1.0 Editorial Team/Tropical cyclone articles by quality/1, I saw that the "edit comment" links are malformed. Am I the only one who saw that? Titoxd(?!?) 05:20, 11 November 2006 (UTC)
- That's a bug, and I guess it spread everywhere. Sorry. I fixed it now. Oleg Alexandrov (talk) 19:32, 11 November 2006 (UTC)
Articles by class and importance
Is it possible for Mathbot to create a table showing the correlation of class and importance? For example:
Table Name Goes Here?
Class \ Importance | Top | High | Mid | Low |
---|---|---|---|---|
FA | # | # | # | # |
A | # | # | # | # |
GA | # | # | # | # |
B | # | # | # | # |
Start | # | # | # | # |
Stub | # | # | # | # |
My thoughts are that it isn't really feasibly possible with the way Mathbot works. I assume it simply loads the categories instead of loading each page and implementing this would seriously convolute the code, require an entire rewrite and would cause massive load. But I figured I'd just ask anyway, in case I'm completely wrong. :) thadius856talk 01:29, 14 November 2006 (UTC)
- Implementing this feature will not require a massive rewrite or a massive load. Each article is now an object knowing its quality (FA, A, GA, etc) and its importance (Top, High, Low). I would need to iterate through all the articles in a given project (say Beatles articles) and add to the appropriate cell in the table above.
- However, it is true that implementing this feature will require an amount of work, and it will generate a large amount of pages in addition to existing ones. I would be reluctant to work on this unless people believe that this feature will be extremely helpful to the project and worth updating such pages every day. For now I am not really convinced, for not too-large projects one can do some counting by hand to estimate some numbers in the table above, as the articles are sorted nicely first by quality and then by importance (priority), see for example Wikipedia:Version 1.0 Editorial Team/The Beatles articles by quality. Oleg Alexandrov (talk) 03:21, 14 November 2006 (UTC)
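The iteration described above (go through every article in a project and add one to the matching cell) can be sketched like this. It is a Python illustration under stated assumptions: each article is represented as a hypothetical `(quality, importance)` pair, and the row and column labels are taken from the proposed table.

```python
from collections import Counter

QUALITIES = ["FA", "A", "GA", "B", "Start", "Stub"]
IMPORTANCES = ["Top", "High", "Mid", "Low"]


def cross_tabulate(articles):
    """Build the class-by-importance grid of counts.

    articles is an iterable of (quality, importance) pairs. Returns one row
    per entry in QUALITIES, each with one count per entry in IMPORTANCES.
    """
    counts = Counter(articles)
    # A Counter returns 0 for combinations that never occur.
    return [[counts[(q, i)] for i in IMPORTANCES] for q in QUALITIES]


if __name__ == "__main__":
    beatles = [("FA", "Top"), ("B", "High"), ("B", "High"), ("Stub", "Low")]
    for quality, row in zip(QUALITIES, cross_tabulate(beatles)):
        print(quality, row)
```

A single pass over the article list is enough, which matches the point that no massive rewrite or extra load is needed.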
- Couldn't the table go in the statistics page for each project? (I'm still thinking whether this is a good idea or not...) Titoxd(?!?) 03:32, 14 November 2006 (UTC)
- Probably a good idea to put it there, if it's created; it'll keep down the number of extra pages, at the least.
- If this is created, incidentally: would it be possible not to generate an extra table for projects that don't have any importance assessments? It'll be rather unhelpful in those cases. ;-) Kirill Lokshin 04:58, 14 November 2006 (UTC)
- I think this table would be a very nice way to summarise the statistics; as long as the unassessed (NA) and the totals for each type were added in as well, as shown:
Table Name Goes Here?
Class \ Importance | Top | High | Mid | Low | NA | Total |
---|---|---|---|---|---|---|
FA | # | # | # | # | # | # |
A | # | # | # | # | # | # |
GA | # | # | # | # | # | # |
B | # | # | # | # | # | # |
Start | # | # | # | # | # | # |
Stub | # | # | # | # | # | # |
NA | # | # | # | # | # | # |
Total | # | # | # | # | # | # |
- It's worth noting (yes, I realise it may be obvious to some) that this isn't just an alternative format, it actually presents information that is available but is presently hidden in the tables. I presume that this table would go on the statistics page in place of the existing table - is that what is proposed? In that case, one thing I don't understand is why it would involve generating new pages - can someone explain that to me?
- One alternative, to save making Mathbot's code (and Oleg's life!) any more complicated, is to use MartinbotII. MartinbotII has been recruited by yours truly to start generating new tables based on importance and quality criteria - you can see the test results from Chemistry, Physics, Maths and a few from Medicine here. I am proposing that this bot runs weekly, generating lists of articles suitable for WP:1.0. If Martin (not me!) is willing to do so, maybe MartinbotII could generate statistics like this, with an indication (in the table) of which cells are included in the release and which are not. Walkerma 05:35, 14 November 2006 (UTC)
On a side note, I figured I'd update it with some color, just for the sake of it. ;) thadius856talk 05:40, 14 November 2006 (UTC)
x | Top | High | Mid | Low |
---|---|---|---|---|
FA | # | # | # | # |
A | # | # | # | # |
GA | # | # | # | # |
B | # | # | # | # |
Start | # | # | # | # |
Stub | # | # | # | # |
It seems that this idea has good support so I will modify the code which generates the statistics table to include the extra columns as above with quality vs importance.
That is going to make the stats tables wider than now (see Wikipedia:Version 1.0 Editorial Team/A-League player articles by quality statistics for current layout), but I hope that won't be a problem. I will work on this the coming weekend. Oleg Alexandrov (talk) 16:02, 14 November 2006 (UTC)
Table Name Goes Here?
Class \ Importance | Top | High | Mid | Low | NA | Total |
---|---|---|---|---|---|---|
FA | # | # | # | # | # | # |
A | # | # | # | # | # | # |
GA | # | # | # | # | # | # |
B | # | # | # | # | # | # |
Start | # | # | # | # | # | # |
Stub | # | # | # | # | # | # |
NA | # | # | # | # | # | # |
Total | # | # | # | # | # | # |
Is that the combined idea, then: the information and the colour? :: Kevinalewis : (Talk Page)/(Desk) 16:26, 14 November 2006 (UTC)
- I only see one problem with that table, Kevin. I assume you're using "NA" to mean "Not Assessed". However, hopefully you remember that articles can be NA class, meaning that they're non-article pages. An example of this would be Category talk:Stub-Class airport articles, while the corresponding category would be Category:Non-article airport pages. Perhaps "???" would work better, as it's already recognized in most 1.0 talk banners. By the way, I don't think the intersection of the two totals columns really has any significance, other than totaling how many class/importance ratings have been given. All the same, removing it looks very tacky.
Assessment Statistics
Class \ Importance | Top | High | Mid | Low | ??? | Total |
---|---|---|---|---|---|---|
FA | # | # | # | # | # | # |
A | # | # | # | # | # | # |
GA | # | # | # | # | # | # |
B | # | # | # | # | # | # |
Start | # | # | # | # | # | # |
Stub | # | # | # | # | # | # |
??? | # | # | # | # | # | # |
Total | # | # | # | # | # | # |
Non-article pages | # |
Obviously a project should concentrate on having their highest importance articles improved to the highest classes (rising numbers in top-left corner and decreasing numbers in bottom-right corner), and this is what I was hoping such a format would promote when I first proposed it. If you look at WP:AIRPORTS/A, for example, you'll notice that our one FA and two GA articles are not necessarily the most important airports in the world by most definitions. thadius856talk 23:46, 14 November 2006 (UTC)
Category intersection
- Is it possible to link the numbers (#) to sub-pages that contain the articles that relate to both (the intersection)? For example, A/Top will link to a new page, "List of A-class articles with Top-importance". I would suggest the new stats table be added to a new page since the existing stats table is used at the bottom of many project navigation boxes. -- Ganeshk (talk) 20:26, 19 November 2006 (UTC)
- I agree with the latter but do you know how many categories the former would create? Cbrown1023 20:36, 19 November 2006 (UTC)
- Maybe the former can be programmed into the project banner templates to create the intersection categories. And the bot could link the numbers to the respective categories. Regards, Ganeshk (talk) 20:46, 19 November 2006 (UTC)
- I modified the India project banner to implement the 30 intersection categories. I can now find out Stub-Class India articles of Top-importance. Now can Mathbot be programmed to read these categories? What do others feel about this? Regards, Ganeshk (talk) 22:40, 19 November 2006 (UTC)
- I personally think the method we use currently is already complicated to set up, so having 30 categories is close to a nightmare, and that it will put a barrier to entry to new WikiProjects who don't want to go through roundabouts to get two or three articles assessed... Titoxd(?!?) 23:40, 19 November 2006 (UTC)
- Could it be made optional? Projects that have these intersection cats will additionally get a new page with the above statistics table? It didn't take me too long to set up these cats. If new projects don't create these cats, the rest of the system still works for them. -- Ganeshk (talk) 01:18, 20 November 2006 (UTC)
- Oh for Wikipedia:Category intersection, which would make these problems go away if implemented. I've also a slight concern that changing the size of the statistics table may mess up some of the project pages; I know the mathematics and Portugal projects transclude the statistics page into the project pages. --Salix alba (talk) 00:41, 20 November 2006 (UTC)
- I don't like the idea of more links in the table, either with subpages, or with categories. It would be a lot of work to set up, and really, I don't see the gain. And the idea of optional features is not good either, I think; it just creates bloated code with features that few people, if anybody, use. If I see good support from the community for implementing this, I will; otherwise I think it is not worth the trouble. Oleg Alexandrov (talk) 03:06, 20 November 2006 (UTC)
- Well, I personally think it is redundant with the main tables, as they are already sorted by quality, then importance. Any desired combination is already visible through there... Titoxd(?!?) 05:18, 20 November 2006 (UTC)
- The numerical count of combination is visible, but the list of articles behind the count is not. As Salix alba pointed out, Wikipedia:Category intersection may be the answer for this. But it is still in development stage. I do see a need for this however it is finally implemented. Regards, Ganeshk (talk) 19:32, 20 November 2006 (UTC)
- No, I don't mean the summary table; I mean the actual assessment tables, like Wikipedia:Version 1.0 Editorial Team/India articles by quality/1. Those include the sorted info, so the need for those pages is already fulfilled... Titoxd(?!?) 21:08, 20 November 2006 (UTC)
- I don't think Wikipedia:Version 1.0 Editorial Team/India articles by quality/1 is same as what I am asking for. What you point to is just a list of articles sorted by the quality. They don't show the intersection. Can you show me how I can see "A Class articles of Top importance" using those pages? Regards, Ganeshk (talk) 21:33, 20 November 2006 (UTC)
- The pages are ordered by quality and then by importance... so you scroll down to A-Class and then look at which ones say Top-Importance (they are all grouped together). Cbrown1023 22:29, 20 November 2006 (UTC)
Datestamp changed to GMT
Following a comment by Mike Peel on my talk page, I modified the date stamp at the bottom of lists to GMT from my local time (PST). Today that will cause the bot to jump a day, see here, but from tomorrow on, the datestamp output by the bot will actually correspond to the current day to most users at most times. Oleg Alexandrov (talk) 03:53, 16 November 2006 (UTC)
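The one-day jump mentioned above comes from the 8-hour offset between PST and GMT: an evening run in Los Angeles already falls on the next calendar day in GMT. A small Python sketch of the conversion, assuming a hypothetical ISO-style date format (the bot's actual datestamp format is not shown here):

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC-8 offset standing in for Pacific Standard Time.
PST = timezone(timedelta(hours=-8))


def datestamp(moment):
    """Format a timezone-aware datetime as a GMT/UTC calendar date."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%d")


if __name__ == "__main__":
    evening_run = datetime(2006, 11, 15, 20, 0, tzinfo=PST)
    print("local date:", evening_run.strftime("%Y-%m-%d"))
    print("GMT date:  ", datestamp(evening_run))
```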
Linking Importance categories
Oleg, Just like the class categories link to the related project's respective class categories, could the same be done with the Importance parameter? Right now they show up with no links on the statistics pages. Thanks, Ganeshk (talk) 23:47, 18 November 2006 (UTC)
- Thanks, good point. I will do that. Oleg Alexandrov (talk) 08:52, 19 November 2006 (UTC)
Class and Importance, again...
Just wondering if the update was still scheduled for this weekend. I'm getting rather antsy, but I completely understand if it's been delayed, or trashed entirely. Thanks! thadius856talk 18:52, 19 November 2006 (UTC)
- It is still Sunday here, in Los Angeles. :) If I knew somebody is dying to see that feature, I could have done it earlier. :)
- OK, one can see how things look here.
- I agree with Salix alba's comment (three sections above) that a wider stats table could be a problem for some projects. At the same time, I am reluctant to decide the table appearance on a per-project basis. Today the bot is running with the new (bigger and more detailed) table. Depending on the community feedback I will either have the bot always use the new format or revert to the old one. Oleg Alexandrov (talk) 04:33, 20 November 2006 (UTC)
- Looks great to me! Walkerma 04:37, 20 November 2006 (UTC)
- It looks awesome! One small gripe: would it be possible to put a break in between the project name and "articles" to prevent stretching the first column for longer project names? For example, Adelaide<br />articles or Adelaide<br>articles. thadius856talk 06:01, 20 November 2006 (UTC)
- Agree it looks good. A small point, I'd bold the numbers in the totals row and column, so they stand out more from the other numbers. --Salix alba (talk) 10:04, 20 November 2006 (UTC)
- I implemented these two. They indeed make the table nicer. Oleg Alexandrov (talk) 03:51, 21 November 2006 (UTC)
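The two cosmetic changes discussed above (a line break before "articles" and bold totals) amount to small string transformations. A Python sketch with hypothetical helper names; the bot's real output code is not shown on this page:

```python
def project_label(category_name):
    """Insert a line break before a trailing 'articles' so long names wrap.

    Names that do not end in ' articles' are returned unchanged.
    """
    suffix = " articles"
    if category_name.endswith(suffix):
        return category_name[: -len(suffix)] + "<br />articles"
    return category_name


def bold_total(count):
    """Wrap a total in wiki bold markup (three apostrophes on each side)."""
    return f"'''{count}'''"


if __name__ == "__main__":
    print(project_label("Adelaide articles"))
    print(project_label("A-League player articles"))
    print(bold_total(42))
```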
It looks fine to me, but could the old table still also be generated as a smaller alternative? A lot of the projects don't use the importance criteria at all, and furthermore a lot of them try to transclude the stats table in their sidebars. The new wider table pretty much breaks the nice-looking sidebars. Thanks. Girolamo Savonarola 10:31, 20 November 2006 (UTC)
- That would require generating two different statistics pages for each project. It is something I would be rather reluctant to do; I feel the script is getting a lot of feature creep and is becoming too wasteful of resources. Are two tables really necessary? Oleg Alexandrov (talk) 15:52, 20 November 2006 (UTC)
- Probably not; having more pages would just confuse things. Is there any way around it, though? For example, could the script only write columns to the statistics table if they have a non-zero total? (In other words, a project that didn't use importance ratings would get two columns ["None" and "Total"] rather than the current six.) Kirill Lokshin 16:33, 20 November 2006 (UTC)
- Please? :) Girolamo Savonarola 05:24, 21 November 2006 (UTC)
- I implemented Kirill's suggestion above. Now empty columns won't show up in the table, see for example Wikipedia:Version 1.0 Editorial Team/A-League player articles by quality statistics. That will help to some extent. Oleg Alexandrov (talk) 03:59, 22 November 2006 (UTC)
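Kirill's suggestion (write a column only if it has a non-zero total) can be sketched as a filter over the table's columns. This Python illustration assumes the table is held as a list of header labels plus rows of integer counts, which is a hypothetical layout:

```python
def drop_empty_columns(headers, rows):
    """Remove every column whose counts are all zero.

    headers is a list of column labels; rows is a list of equally long
    lists of integers. Returns the filtered (headers, rows).
    """
    keep = [i for i in range(len(headers)) if any(row[i] for row in rows)]
    return [headers[i] for i in keep], [[row[i] for i in keep] for row in rows]


if __name__ == "__main__":
    headers = ["Top", "High", "Mid", "Low", "None", "Total"]
    # A project that has not assigned any importance ratings.
    rows = [[0, 0, 0, 0, 3, 3], [0, 0, 0, 0, 12, 12]]
    print(drop_empty_columns(headers, rows))
```

For a project that uses no importance ratings, only the "None" and "Total" columns survive, as in Kirill's example.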
- Once again, Oleg, you work miracles. That looks excellent. Thank you so much for your time and effort, many of us on Wikipedia are in your debt. Walkerma 05:29, 22 November 2006 (UTC)
New stats layout
Okay, the new stats layout is good. But it creates problems for some WikiProjects.
A lot of the projects use {{PROJECT articles by quality statistics}} to blend the stats into the project page. Now this new, wider table is destroying the layout. AQu01rius (User • Talk) 16:54, 20 November 2006 (UTC)
- Yes, we expected this would occur in a few cases, but felt the upset in changing over was worthwhile for the extra information. I hope that you can re-format your project page, just as the Math folks did this morning. If you have a problem doing this, let us know at WVWP and we'll help out. Walkerma 17:57, 20 November 2006 (UTC)
- OK, I see a problem with projects like {{WPCanada_Navigation}} who are inserting the stats page by transclusion into their navigation template. Are there any suggestions from the technically smart people (i.e., not me)? Or do we just ask projects not to do this? Walkerma 05:40, 21 November 2006 (UTC)
- On {{WPIndia Navigation}}, I set the navigation box width to 325px like you advised above. That fixed it. -- Ganeshk (talk) 05:57, 21 November 2006 (UTC)
- Thanks! I'm sure you've done wonders for India-Canada relations! Walkerma 06:23, 21 November 2006 (UTC)
This doesn't look good. 325px is too wide, and eats a lot space (which ruins the purpose of sidebar). I'll try to figure out other ways around.. AQu01rius (User • Talk) 21:04, 21 November 2006 (UTC)
- Nevermind, there's no way around (I thought some <noinclude> tweak may work, but the bot replaces the entire stats page in every update).
The new detailed stats is really not suitable for inline inclusion anymore, and what I'll do is to remove it, and just leave the "view full worklist" link in the sidebar. AQu01rius (User • Talk) 19:25, 23 November 2006 (UTC)
- Perhaps we can switch to the old mode indeed, there seem to be plenty of people who are not happy with the extra columns. Oleg Alexandrov (talk) 05:27, 24 November 2006 (UTC)
- No.. Keep it. You're doing amazing work. But is it possible to make it optional? Like <noinclude> the detailed parts. AQu01rius (User • Talk) 21:19, 24 November 2006 (UTC)
- <noinclude> is not a good solution though, as most people (both those who love and those who hate the extra columns) use that template transcluded. (For the record, I don't much care which format is used, but I don't want to complicate the code by introducing per-project preferences.) Oleg Alexandrov (talk) 02:13, 25 November 2006 (UTC)
Canada
{{WikiProject Canada}} is not working with the Mathbot assessment project. Lincher 19:31, 20 November 2006 (UTC)
- Ah, I fixed the category name in the template. It should work now.. I haven't set up the Importance categorization though. AQu01rius (User • Talk) 06:05, 21 November 2006 (UTC)
- Thanks for the fix, the elements in the template make it too dangerous to touch by a n00b like me. Lincher 20:20, 21 November 2006 (UTC)
- Canada is still a new project, and probably just hasn't had all the categories and such created yet. I know it's a new project; I made the banner and userbox for them. Any objections to having the modifications dropping the assessment criteria reversed, IF the project is organized to work with the bot and regularize assessments? Badbilltucker 14:27, 22 November 2006 (UTC)
[edit] Yurik's query API
I am very tempted to switch to Yurik's API (mentioned a few sections above) for reading categories, not only because it would be faster and less confusing to my bot, but also because the way things are now, parsing HTML, sometimes results in cached (and therefore inaccurate) information. Yurik's API provides minimally formatted output delivered straight from the database, which is as good as it gets.
An example of getting the articles and subcategories in Category:Wikipedia 1.0 assessments is here. That text does not include all subcategories however (there is a limit on the number of subcategories displayed at once, for performance reasons I guess). What I was not able to figure out is what query to use to get the remainder of the category. Any ideas on that? Oleg Alexandrov (talk) 16:25, 23 November 2006 (UTC)
- In the query section of the output
[query] => Array ( [category] => Array ( [next] => Maharashtra articles by quality ) )
the next argument gives the starting point for the next query. You can do a loop, or recursion, testing to see if the query section of the output is present. --Salix alba (talk) 17:48, 23 November 2006 (UTC)
- In this instance the next url will be http://en.wikipedia.org/w/query.php?what=category&cptitle=Wikipedia_1.0_assessments&format=txt&cpfrom=Maharashtra_articles_by_quality. --Salix alba (talk) 21:12, 23 November 2006 (UTC)
- Strangely enough I tried that, but I could not make it work. Thanks! Oleg Alexandrov (talk) 05:27, 24 November 2006 (UTC)
- There is still a problem though. In the link you supplied above, Maharashtra_articles_by_quality serves as a tag to move to the next page. However, the actual subcategory Category:Maharashtra_articles_by_quality, is neither in the first page, nor in the second page. The bot then fails to read that category, as you see here. Do you know why? Oleg Alexandrov (talk) 06:25, 24 November 2006 (UTC)
-
- Curiously it does seem to work. The articles do appear to be unordered, which can be confusing. http://en.wikipedia.org/w/query.php?what=category&cptitle=Wikipedia_1.0_assessments&cpfrom=Maharashtra+articles+by+quality&format=xml works. I seem to get different results when putting the format argument before the cpfrom. --Salix alba (talk) 16:29, 24 November 2006 (UTC)
-
- That's so weird. Writing the same link with underscores instead of plus signs does not give Category:Maharashtra articles by quality as output. Oleg Alexandrov (talk) 02:23, 25 November 2006 (UTC)
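The loop Salix alba describes (keep requesting pages and pass each reply's "next" value as cpfrom until no "next" comes back) can be sketched as follows. This is only an illustration, not Mathbot's actual Perl code: the parameter names what, cptitle, cpfrom and format are taken from the URLs quoted in this thread, while fetch_page is a hypothetical helper that downloads one page and returns its titles together with the "next" marker (or None on the last page). Building the URL with urlencode also writes spaces as plus signs, the form that worked in the exchange above.

```python
from urllib.parse import urlencode

BASE = "http://en.wikipedia.org/w/query.php"  # endpoint quoted in the thread


def build_url(category, cpfrom=None):
    """Build a category query URL; urlencode writes spaces as plus signs."""
    params = {"what": "category", "cptitle": category, "format": "txt"}
    if cpfrom is not None:
        params["cpfrom"] = cpfrom
    return BASE + "?" + urlencode(params)


def read_category(category, fetch_page):
    """Collect every member of a category by following the 'next' marker.

    fetch_page(url) is a hypothetical callable returning (titles, next_title),
    where next_title is None once the last page has been read.
    """
    titles, seen = [], set()
    cpfrom = None
    while True:
        page, next_title = fetch_page(build_url(category, cpfrom))
        for title in page:
            if title not in seen:  # a page may repeat the boundary title
                seen.add(title)
                titles.append(title)
        if next_title is None:
            return titles
        if next_title == cpfrom:
            # The marker did not advance; stop instead of looping forever.
            raise RuntimeError("continuation marker did not advance")
        cpfrom = next_title
```

Deduplicating on the way keeps the result stable even if the title used as the continuation marker shows up on both sides of a page boundary.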
[edit] Switch to Yurik's API
With Salix alba's help I switched to Yurik's API for reading categories (and will do that for history revisions soon also, that will make the code faster). Let us hope that Yurik won't change the API syntax, that would confuse the bot. The new API has the advantage that it always gets up-to-date info, rather than stale html the way it was before. Oleg Alexandrov (talk) 05:42, 25 November 2006 (UTC)
[edit] Running the bot on demand
I implemented a web form that allows one to run the bot for an individual project at any time, rather than waiting a good chunk of a day until it is scheduled. I think that could be helpful for new projects, as one could then get quick feedback on whether the project was set up right. The link is here. Oleg Alexandrov (talk) 05:27, 25 November 2006 (UTC)
- So, as a common courtesy to you and your bot, projects like Biography should not run it haphazardly and should wait for your scheduled update because of their vast number of articles? Cbrown1023 14:03, 25 November 2006 (UTC)
- I guess it would be more of a courtesy to the Wikipedia servers not to overuse the bot. But I don't know, if you have a large project, but you did a lot of changes and want instant gratification, I guess you could go for it. Up to your conscience. :) Oleg Alexandrov (talk) 15:56, 25 November 2006 (UTC)
[edit] Bot source code
I posted the source code to the Perl script which updates the lists in the index together with all dependencies and instructions here.
I think it would be a good idea if perhaps somebody could try it out. I am not going anywhere any time soon, but thinking long term, considering how important the Wikipedia 1.0 project is and the amount of work put into this project by the community so far, I think it would be good if the code is public and more people than just me could run it and perhaps even have an idea of how to modify it. Following the instructions over there should take an hour at most, assuming that nothing goes wrong. So, any volunteers? :) Oleg Alexandrov (talk) 05:34, 27 November 2006 (UTC)
[edit] Linking of dates
I was just looking through the lists of all articles per WikiProject (xXx articles by quality/#). I'm just wondering why the dates were all wikilinked. It seems to me it doesn't really serve any purpose and WP:MOSDATE seems to hint that dates shouldn't be linked unless they give context, but I get the feeling that's only for ns0.
However, it seems that removing the wikilinks wouldn't trim much off the size of the file. I was just wondering if perhaps switching to a [[YYYY-MM-DD]] format would keep file sizes down any. It wouldn't change anything on the user's end, since user preferences catch that format as well. thadius856talk|airports|neutrality 06:07, 27 November 2006 (UTC)
[edit] Sortable tables
Should we get excited about this commit to SVN? Apparently, we are now able to create sortable tables... Titoxd(?!?) 00:25, 28 November 2006 (UTC)
- Here's an example, using a modified {{assessment header}} found in my sandbox: [3] Titoxd(?!?) 00:57, 28 November 2006 (UTC)
- Very cool! :) Very slow... :( But I'm assuming they're still ironing out a lot of the kinks. Definitely a good step towards resolving pesky issues of list ordering (eg chronological vs. reverse chronological). Girolamo Savonarola 01:15, 28 November 2006 (UTC)
- That's awesome. Except, doesn't it kinda take the place of India's multi-categories? It's easy to see all these this way. I hope Oleg implements it in the work lists when they iron out the kinks. Cbrown1023 01:23, 28 November 2006 (UTC)
[edit] Generating some categories automatically
When adding a new project, one should generate those FA-Class, A-Class, Top-importance and other categories. I made that step (semi-)automatic. One can visit Wikipedia:Version 1.0 Editorial Team/Generate categories and specify what categories to create, and the bot will do it for you. Only administrators can use that tool (it requires editing a protected page). That way random people can't just generate any categories they want, and if the categories were generated incorrectly the admin in question can delete them.
One can argue this tool is not that necessary, but I think it can save some work when setting up a project (and for people who did not set up a new project for the bot before there is less to learn this way). Oleg Alexandrov (talk) 06:39, 28 November 2006 (UTC)
[edit] Statistics - total number of assessed articles decreasing?
Am I missing something, or have almost 3000 articles just gone missing? See: [4]- Trevor MacInnis (Contribs) 04:49, 29 November 2006 (UTC)
- You're missing a zero; it's almost 30,000 articles gone. It may be a bug with the bot; but, based on prior experience, I suspect that some WikiProject has broken their banner code. I'll see if I can figure out what list these are disappearing from. Kirill Lokshin 04:54, 29 November 2006 (UTC)
-
- Well, Biography lost about 20,000; I'm not sure where the others are. Kirill Lokshin 05:15, 29 November 2006 (UTC)
- I killed the bot for tonight until this is sorted out. Weird indeed, I am also trying to understand what is going on. Oleg Alexandrov (talk) 05:37, 29 November 2006 (UTC)
- I browsed through a few early-alphabet projects, the only project where I saw a problem was Biography. I notice that the log has been blanked by the bot. I also checked, the statistics show only 200 B-Class articles, but the category contains many more than 200 articles. Walkerma 06:25, 29 November 2006 (UTC)
- I further modified the routine which processes the logs to try not to blank them but rather truncate them if they are too big. That may help in the future.
- I still don't know what was going on. Oleg Alexandrov (talk) 16:59, 29 November 2006 (UTC)
I looked at the stats of around 65 of the projects at the end of the list, and also at all biography projects (arts and entertainment, core, military, and other biographical). None had any significant decreases in the numbers, except the fat mother-of-all plain biography articles by quality. That one is always run last, as it is the largest.
If it is a bot bug, it could be subtle, as it shows up very seldom (once, so far). Can't be that the server was down; the bot was programmed to die if it can't repeatedly do an HTTP request. If it can't read the contents of a category, it would also die.
The bot did not crash, otherwise it would not have committed the total stats on that date. By the way, if you look at that diff, you would see that only the B-Class articles and Unassessed articles decreased, and roughly by the same amount as in the problematic Biography stats.
All in all, I don't know what is going on. Maybe the code is sound but we are pushing against the limits of Perl, or computer memory, or who knows what. Either way it appears that the only place articles disappeared from is the biography project (such a monster should not exist to start with). I won't let the bot run today either. But if we discover nothing by tomorrow I guess we could run it again and hope for the best.
This again keeps me wondering: what if one day, for one reason or another, the bot goes really mad? It would be rather hard to reverse the damage. Any comments? Oleg Alexandrov (talk) 03:54, 30 November 2006 (UTC)
- There was one change in {{WPBiography}} yesterday,[5] but I can't figure out how it would affect the listing... perhaps you can decipher something? I don't think we're reaching the limits of the MediaWiki API either, as the last change in SVN was six weeks ago...
- As for bot lunacy: the dirtiest way I could think of was having a different bot (I don't know... MissMathbot or something) update the "last updated" date field in the tables, so if Mr. Mathbot blows stuff up, we can always bot rollback him to the missus' version... Titoxd(?!?) 05:24, 30 November 2006 (UTC)
- I was wondering a similar thing - having an "antidote" bot (AntiMathbot?) ready to do a mass revert of all Mathbot's recent edits if he turns into Madbot. Just don't set such a bot loose against my edits, or I'll turn paranoid! Walkerma 05:29, 30 November 2006 (UTC)
- That would require some programming, parsing history and dates, etc. I thought of this problem too, and I came up with a rather dumb idea. How about just backing up each day's output on my local machine, say for the last 10 days? That's easy to implement, it would require creating a directory for each calendar day (November_16_2006 for example) and doing a save for each page submitted to Wikipedia. I have a few spare gigs on my work computer I could use. How's that? :) Oleg Alexandrov (talk) 05:44, 30 November 2006 (UTC)
- That sounds great to me! If you do that, it should probably be noted on Mathbot's page that these backups are available. Are you always around when Mathbot is running, or does he sometimes run when you're absent? Walkerma 06:38, 30 November 2006 (UTC)
- I will implement the hard drive backup as it is easy. Now, the backups will not be available online, as I don't have enough storage in my world-viewable directory, so only I would be able to do the reverts. Long term, I agree that it is more elegant to revert with a bot, so I will think about that too (although that's of course more work; anybody else willing to write such a bot? :) I am not always around when the bot is running, it runs on its own schedule. So, usually if something is suspicious the bot should be blocked right away. Oleg Alexandrov (talk) 16:03, 30 November 2006 (UTC)
- Would it even need a separate bot? For example, having one thread of the program write the assessment tables, and a different thread with a different bot edit just the one line in the date, as I said above, could work. That way, we can, in the worst-case scenario, admin rollback him by hand. Titoxd(?!?) 05:29, 1 December 2006 (UTC)
- That's simple indeed. But this would imply that each page must be edited twice each day, doubling the history and making the servers store an additional version of each page each day (and gosh, we have many pages). Also, if say we realized that something was wrong not right away but after more than 24 hours, then more than one previous edit would need reversal, and since two bots edited the same page in one day, the admin rollback would not work. The more elegant solution would be I think a script smart enough to actually go through a bot's contribs and revert those which happened on a given day to given projects. But that's hard to do. :) Oleg Alexandrov (talk) 05:38, 1 December 2006 (UTC)
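The selection step of such a revert script (going through the bot's contributions and picking those made on a given day to given projects) can be sketched on plain data. This is a hedged illustration only: the record keys title, timestamp and revid are hypothetical, and fetching the contributions and performing the reverts are deliberately left out.

```python
def edits_to_revert(contribs, day, project_prefixes):
    """Pick the bot's edits made on `day` to pages of the given projects.

    contribs: iterable of dicts with the hypothetical keys 'title',
    'timestamp' (a datetime.datetime) and 'revid'.
    Returns {title: revid of the bot's earliest edit that day}, so that a
    caller could restore whatever revision preceded it.
    """
    first_bad = {}
    for edit in sorted(contribs, key=lambda e: e["timestamp"]):
        if edit["timestamp"].date() != day:
            continue  # made on some other day
        if not any(edit["title"].startswith(p) for p in project_prefixes):
            continue  # belongs to a project we were not asked to revert
        # Keep only the first edit of the day for each page.
        first_bad.setdefault(edit["title"], edit["revid"])
    return first_bad
```

Recording the earliest edit per page, rather than every edit, covers the case raised above where a page was edited more than once before the problem was noticed.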
I am almost finished with the "backup on disk feature", I think I will complete it tomorrow.
I reverted by hand the first 25 pages of Wikipedia:Version 1.0 Editorial Team/Biography articles by quality, containing all FA-Class, A-Class, GA-Class, B-Class and a bunch of Start-Class articles (note by the way the corrupted text at Wikipedia:Version 1.0 Editorial Team/Biography articles by quality/6 (specific version link), I don't understand what is going on).
I tested the bot just in case on the Adelaide project, and nothing strange happens.
So, with the evidence so far that the problem is most likely not with the bot but with the Biography project, I restarted the bot for today. Hopefully nothing wrong will happen. Once the backup feature is finished, even if something wrong happens it will be easier to revert (not that I should become more sloppy as a coder :)
Good night, all. :) Oleg Alexandrov (talk) 05:38, 1 December 2006 (UTC)
- Good night, Oleg. For tomorrow: looking at the Biography table link you gave us, someone is adding huge templates to the Comments pages of some Congressional biographies... could that have an adverse effect? Titoxd(?!?) 05:44, 1 December 2006 (UTC)
-
- It should not. As far as the bot is concerned, it does not matter what is inside a comments page, all it sees is the link to it. But something in the comments pages is messing up the rendering of tables, that should be fixed eventually. Oleg Alexandrov (talk) 15:55, 1 December 2006 (UTC)
May just be that it has stopped again - this time at about 7:30 at the end of "Albums" :: Kevinalewis : (Talk Page)/(Desk) 10:54, 1 December 2006 (UTC)
- It is happily running now. (I checked a bunch of statistics, and articles are not disappearing.) I guess it appeared to have stopped because it was slowed down by having to read the most recent version of a bunch of newly added articles. Once I switch to Yurik's API for that, it should go much faster.
- By the way, as the bot runs it is writing log info to disk. The log is publicly available here so one can tell if the bot is stuck or died or something. Oleg Alexandrov (talk) 15:55, 1 December 2006 (UTC)
-
- OK, the back-up feature works now. From now on each list of the form "Foobar articles by quality" will be backed up for the last five days. So, if anything goes wrong with the list, be it bot's fault or not, we can recover the data. I think backing up the logs and the stats is not that important, and I am not sure if going beyond five days is worth it. Oleg Alexandrov (talk) 05:36, 2 December 2006 (UTC)
- Hmm. Biography just recovered about 40,000 articles.[6] That was weird... Titoxd(?!?) 06:54, 2 December 2006 (UTC)
-
- Why is that weird? It recovered the B class articles lost before, a bunch of unassessed articles, and much more. I think that's fine. Oleg Alexandrov (talk) 17:01, 2 December 2006 (UTC)
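The on-disk backup discussed in this section (one directory per calendar day, named like November_16_2006, keeping only the last five days) could look roughly like the sketch below. It is not the bot's Perl implementation; the file-naming rule and the function names are assumptions made for illustration.

```python
import shutil
from datetime import datetime
from pathlib import Path

DIR_FORMAT = "%B_%d_%Y"  # e.g. November_16_2006, as in the thread


def backup_page(root, day, title, text):
    """Save one submitted page under the directory for the given day."""
    folder = Path(root) / day.strftime(DIR_FORMAT)
    folder.mkdir(parents=True, exist_ok=True)
    # Hypothetical naming rule: make the page title safe as a file name.
    name = title.replace("/", "_").replace(" ", "_") + ".wiki"
    path = folder / name
    path.write_text(text, encoding="utf-8")
    return path


def prune_backups(root, keep_days=5):
    """Delete all but the newest keep_days daily directories."""
    dated = []
    for child in Path(root).iterdir():
        if not child.is_dir():
            continue
        try:
            day = datetime.strptime(child.name, DIR_FORMAT).date()
        except ValueError:
            continue  # not a backup directory; leave it alone
        dated.append((day, child))
    dated.sort(key=lambda pair: pair[0])
    stale = dated[:-keep_days] if keep_days > 0 else dated
    for _, folder in stale:
        shutil.rmtree(folder)
    return [folder.name for _, folder in stale]
```

Parsing the directory names back into dates, instead of sorting them as strings, keeps the pruning correct across month and year boundaries.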
[edit] Film stats
The film stats didn't update last night so I tried to run the bot on demand. It worked until it got here:
Getting http://en.wikipedia.org/w/query.php?what=category&cptitle=Unassessed+importance+film+articles&format=txt&cpfrom=List+of+films+made+into+television+programs
It just stopped. Do you know what the problem is? Cbrown1023 17:04, 2 December 2006 (UTC)
- I started it "on demand" and it works now. Did you actually keep that window open where the code was running to see what messages it was outputting?
- By the way, I have no idea why the code stopped last night after doing just a few projects. I am trying to figure that out. Oleg Alexandrov (talk) 17:48, 2 December 2006 (UTC)
- About the "on demand" thing yeah, the window is still open right now. (Just in case you need to know something else about the code.) Cbrown1023 17:53, 2 December 2006 (UTC)
- Whatever you did, it worked. It is now fixed (/updated). Cbrown1023 18:26, 2 December 2006 (UTC)
- I could not make the online "on demand" tool work for the film project either. It works though for the "Beatles" project. And I managed to make the film project work from the command line (that is what you saw).
- In short, the codebase seems sound, but when called online for large projects it does not want to work. I will try to think of what is going on. Oleg Alexandrov (talk) 18:30, 2 December 2006 (UTC)
[edit] Bot confused again
Biography's in trouble again. ;-) Kirill Lokshin 20:58, 3 December 2006 (UTC)
- I just went through the December 3 log. It appears that only the biography articles have problems. The other log, what the bot actually writes to disk as it does stuff, also has nothing suspicious. I am truly at a loss. I will try to investigate what is going on. The biggest mystery is why other projects are not affected, only this one, which is also by far the biggest? Oleg Alexandrov (talk) 23:40, 3 December 2006 (UTC)
- Is there a way to sort the results of a query.php query? Because if the "by quality" categories are modified (e.g. a page is added to them), then that may jumble the results of the SQL query, because in PHP, database entries do not have any particular order unless specified otherwise, and that may be causing some issues. (I just thought about that right now...) Titoxd(?!?) 00:13, 4 December 2006 (UTC)
- Do you mean to say that if a category is modified while the bot reads subpage by subpage then the bot may not read it correctly? Oleg Alexandrov (talk) 00:22, 4 December 2006 (UTC)
- Yes, kind of. If the category is modified, let's say, Unassessed biography articles, then the order in the SQL database is jumbled, if I remember my PHP correctly. Titoxd(?!?) 01:13, 4 December 2006 (UTC)
I will keep that in mind. But there's got to be more to it. Per the log, the bot started reading the B-Class biography articles, read the articles starting with A, then B, then C, all the way to J, and then simply did not go on. Here's the relevant part in the log
Getting http://en.wikipedia.org/w/query.php?what=category&cptitle=B-Class+biography+articles&format=txt
Getting http://en.wikipedia.org/w/query.php?what=category&cptitle=B-Class+biography+articles&format=txt&cpfrom=Andy+Warhol
Getting http://en.wikipedia.org/w/query.php?what=category&cptitle=B-Class+biography+articles&format=txt&cpfrom=Bo+Schembechler
Getting http://en.wikipedia.org/w/query.php?what=category&cptitle=B-Class+biography+articles&format=txt&cpfrom=Constantine+Maroulis
Getting http://en.wikipedia.org/w/query.php?what=category&cptitle=B-Class+biography+articles&format=txt&cpfrom=El+Greco
Getting http://en.wikipedia.org/w/query.php?what=category&cptitle=B-Class+biography+articles&format=txt&cpfrom=Gessius+Florus
Getting http://en.wikipedia.org/w/query.php?what=category&cptitle=B-Class+biography+articles&format=txt&cpfrom=Isabella%2c+Countess+of+Atholl
Getting http://en.wikipedia.org/w/query.php?what=category&cptitle=B-Class+biography+articles&format=txt&cpfrom=John+H.+Russell%2c+Jr.
Getting http://en.wikipedia.org/w/query.php?what=category&cptitle=Start-Class+biography+articles&format=txt
(look at the last two lines, it just went from B-Class to Start-Class). Even if the B-Class category was jumbled, it should have still yielded some articles beyond J. When I ran the bot manually this afternoon, it easily went beyond J. Oleg Alexandrov (talk) 01:57, 4 December 2006 (UTC)
I was doing some testing for the bot and found articles disappearing, see Wikipedia:Version 1.0 Editorial Team/Pokémon Collaborative Project articles by quality/1 history. That is very strange. I am taking the bot down for today, hopefully one of these days I'll find out what is going on. Oleg Alexandrov (talk) 02:29, 4 December 2006 (UTC)
- Just to clarify what I wrote above. Yesterday I did this test, ran the bot, got this (articles disappearing). Reran the bot again, got this (more articles disappearing), then this (articles popped right back).
- Today, I did exactly the same test here. I ran the bot, and this time I got the correct result here, the bot just put back the articles I removed (with new history revision link, obviously).
- I did not change the code from yesterday to today.
- Now, all I can think about is that what is going on is not bot's fault, but rather, the server does not always give it accurate information about what is inside categories. I cannot come up with any other explanation. Comments? Oleg Alexandrov (talk) 03:31, 5 December 2006 (UTC)
- Ask the people at Wikipedia:Village pump (technical)? They should be able to help if the problem's on Wikipedia's side. Mike Peel 10:17, 5 December 2006 (UTC)
- I don't even know how to phrase the question properly. :) I think Yurik would know this stuff, because we ultimately use his query format. But I decided to chicken out for now and not use it anymore and hope that it would solve the problems (see below). Oleg Alexandrov (talk) 04:09, 6 December 2006 (UTC)
-
So, the bot has not been working well recently, and I think this started with me switching to Yurik's query format for reading categories. I decided to go back to the old way of reading categories, parsing html source. That also has problems, like sometimes the wiki servers change the format of html (and the bot is confused) and sometimes they serve cached info, but let us see if the switch back will deal with the problems with the biography articles.
I implemented a couple of routines which will hopefully make the bot automatically recover if articles go missing.
The way things are now, if say the bot did not read a category well (be it either its fault or the server's fault), some articles will disappear from the worklists. Next time the bot runs it may read that category well and recover an article in the list. What would be lost, however, would be the history link and the date (columns 1 and 3 here for example).
In addition to doing the backup mentioned somewhere in the previous sections, I now store on disk the history links and dates for the last five days (all in one single file for each project). So, if an article goes missing, and pops up back in a day or two, the bot will check on disk if that article has been around recently. If yes, and if the quality assessment did not change, the bot will recover the old history link and date from disk, so that info won't be lost.
In short, now the bot not only writes backups, it also reads backups, and vital info is not stored only within the Wikipedia worklists but also on disk. Here's a demonstration: I removed a few articles, and the bot put them back without information loss.
I will let the bot run today. Let's see what happens. Oleg Alexandrov (talk) 04:09, 6 December 2006 (UTC)
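The recovery rule described above (reuse the stored history link and date only if the article was seen within the last five days and its quality assessment has not changed) can be sketched as a single lookup. This is an illustration under assumptions, not the bot's code: the store layout and the key names quality, history_link, date and last_seen are hypothetical stand-ins for whatever the per-project file actually holds.

```python
from datetime import timedelta


def recover_entry(title, quality, today, store, max_age_days=5):
    """Return (history_link, assessed_date) for a reappearing article, or None.

    store maps title -> dict with the hypothetical keys 'quality',
    'history_link', 'date' and 'last_seen' (a datetime.date), as loaded
    from the per-project file kept on disk.
    """
    saved = store.get(title)
    if saved is None:
        return None  # never seen before: treat it as a new article
    if today - saved["last_seen"] > timedelta(days=max_age_days):
        return None  # gone for too long; the stored info has expired
    if saved["quality"] != quality:
        return None  # assessment changed, so a fresh link and date are due
    return saved["history_link"], saved["date"]
```

Returning None in every doubtful case means the bot falls back to its normal behavior of assigning a new history link and date.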
[edit] Bot error when articles are deleted
While loading the Album articles and identifying their versions, the bot raises an automatic error whenever it can't find a version, as with Noxious Saucy Beast and Here We Are (Swizzle Tree). Those have been fixed ... I removed the tag, since there is no article attached to the talk page. I think this might be what causes the problem. Lincher 07:07, 6 December 2006 (UTC)
- That's right. :( Recently I switched to Yurik's query format also for reading history version information, and in the process I had programmed the bot to die if it can't find a history version. That was meant to test Yurik's query format and I did not get to it because of other recent problems (ironically also caused by the switch to Yurik's format for reading categories).
- Tonight I'll revert to the good html way of finding the history version too. Hopefully this will put all the recent problems behind us. For today, I think one can still use the on demand version of the bot to run it for your specific project unless the bot dies on you as above.
- I am sorry for all the recent problems. I'll do my best to get over that as soon as possible. Oleg Alexandrov (talk) 16:45, 6 December 2006 (UTC)
-
-
- It literally died: it didn't go into its normal sleep period, and it gave me a message that looked like this, the log. Lincher 17:12, 6 December 2006 (UTC)
- To reply to Titoxd: the fact that sometime in the past the bot stopped reading a category and switched to another one is not a problem of my making (but the bot dying when it should not have is an unintended consequence of what I did).
- OK, so I now went back to the old way of reading history links and categories (fixing a bug in my history-reading routine along the way), so hopefully we are back to normal. I did not give up on using Yurik's query features, but I will be much more cautious and will spend more time understanding how it works. Let us see how the bot runs tonight. Oleg Alexandrov (talk) 01:56, 7 December 2006 (UTC)
[edit] No update in 3 days
What's up with the bot? There hasn't been a full run in 3 days. Rlevse 23:30, 6 December 2006 (UTC)
- Problems, see the above posts... Cbrown1023 01:36, 7 December 2006 (UTC)
- However, if you would like to get a bot run now, try the automated version. Cbrown1023 01:55, 7 December 2006 (UTC)
[edit] Bot did not finish updating the biography articles
... because of a power outage at my work. The power is still out, perhaps it will come back later tonight. Oleg Alexandrov (talk) 01:29, 8 December 2006 (UTC)
- And this grandiously proves Murphy's Law... :| Titoxd(?!?) 03:59, 8 December 2006 (UTC)
-
- Whatever, as long as it is not bot's fault. :) Oleg Alexandrov (talk) 05:18, 8 December 2006 (UTC)
[edit] Article moves
Until now, when an article got renamed, the bot would consider that the old article disappeared and a new article got created. Now I modified the script to actually copy over the history link and date to the new article. I don't know how necessary that is, but I would think it makes more sense that way. Oleg Alexandrov (talk) 05:18, 8 December 2006 (UTC)
- It definitely makes more sense to me. How would the logs record such a change? Currently you see [[Oldname]] removed and [[Newname]] added as two separate entries, would it remain like that? Walkerma 05:32, 8 December 2006 (UTC)
-
- The bot will say 'Old name moved to new name'. Oleg Alexandrov (talk) 06:12, 8 December 2006 (UTC)
- It would be nice if, when the bot notices a move in article namespace without a talk page move, it also moved the talk page. That would prevent redirect pages from being associated with an assessment on the old talk page even though the article is at another place. I.e. [[Article a]] is moved to [[Article b]] but [[Talk:Article a]] isn't moved to [[Talk:Article b]], and a user also adds an assessment to [[Talk:Article b]], leaving both talk pages full while only Article b contains the article. Lincher 05:55, 8 December 2006 (UTC)
- The bot cannot do page moves, unfortunately.
- Your point is I think that we should eliminate redirects from the lists. So, if a talk page is say B-class, but the article itself is a redirect, the article must be removed from the list. I will think on whether that's possible. Oleg Alexandrov (talk) 06:12, 8 December 2006 (UTC)
- Yes that is what I meant, just couldn't make it concise enough. Don't burn yourself trying it though. Best of luck and great job on the bot. Lincher 16:17, 8 December 2006 (UTC)
[edit] Redlinks
Quite a lot of the articles in the list are redlinks, but their talk pages list the article as B-Class, etc. I wrote a script to tag such talk pages for speedy deletion. They will also show up in Category:Wikipedia 1.0 problematic articles for people to take a closer look. Oleg Alexandrov (talk) 06:12, 8 December 2006 (UTC)
- I see that Mathbot is blocked, due to CSD'ing several inappropriate talk pages. Can I assume this is the reason why? thadius856talk|airports|neutrality 09:35, 8 December 2006 (UTC)
- The articles tagged so far seem to be mostly Talk archives. It would be better if you wrote a script to remove tags from archives. Dev920 (Have a nice day!) 10:31, 8 December 2006 (UTC)
- Sorry! The bot was doing the right thing for a while, then I went to bed, and then the mess started. The bot was rightfully blocked and hopefully I'll learn my lesson to supervise it more properly. My dumb tagging script finished by now, so I unblocked the bot and dealt with the few improperly tagged articles in Category:Wikipedia 1.0 problematic articles which were not cleaned up by others. I also restarted the WP1.0 script. Oleg Alexandrov (talk) 16:14, 8 December 2006 (UTC)
- But for the sake of completeness, and in reply to Dev920 above, most articles were tagged correctly. The archives were a minority, but they were the only ones left in the bot's contributions after most of the other talk pages were speedied. Not that it matters, of course. Oleg Alexandrov (talk) 16:30, 8 December 2006 (UTC)
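A tagging pass that skips archives, as suggested in this thread, might look something like the sketch below. It is a guess at the logic in Python, not the script that was actually run: the helper names are hypothetical, the set of existing articles is assumed to be available already, and an archive is assumed to be any talk subpage whose last path segment starts with "Archive".

```python
from typing import Iterable, List, Set

TALK_PREFIX = "Talk:"


def is_archive(talk_title: str) -> bool:
    """Assumed convention: archives are subpages like 'Talk:Foo/Archive 1'."""
    if "/" not in talk_title:
        return False
    last_segment = talk_title.rsplit("/", 1)[1]
    return last_segment.lower().startswith("archive")


def talk_pages_to_flag(assessed_talk_titles: Iterable[str],
                       existing_articles: Set[str]) -> List[str]:
    """Return assessed talk pages whose article does not exist (a redlink).

    Archive subpages are skipped, since they never have an article of their own.
    """
    flagged = []
    for talk_title in assessed_talk_titles:
        if not talk_title.startswith(TALK_PREFIX) or is_archive(talk_title):
            continue
        article = talk_title[len(TALK_PREFIX):]
        if article not in existing_articles:
            flagged.append(talk_title)
    return flagged


if __name__ == "__main__":
    talks = ["Talk:Foo", "Talk:Foo/Archive 1", "Talk:Bar", "Talk:AC/DC"]
    print(talk_pages_to_flag(talks, existing_articles={"Bar", "AC/DC"}))
```

Returning the list instead of tagging directly would leave room for a person to review it before anything is marked for speedy deletion.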
Wikipedia Version 0.5
Category:Wikipedia Version 0.5 is huge, I think. Is it perhaps time to split it according to individual projects, say Category:Military history version 0.5 articles? That would require modifying the assessment templates for all the projects (e.g., {{WPMILHIST}}), but I think this may need to be done at some point. Comments? Oleg Alexandrov (talk) 20:30, 9 December 2006 (UTC)
- It already has a bunch of sub-categories for the broad topic areas; I'm not sure why the tag is also putting everything directly into the main category, but it would be trivial to change that. Kirill Lokshin 20:48, 9 December 2006 (UTC)
- There are 10 subcategories (and a "Misc") - such as History, Arts, etc. See {{V0.5}} for a full list. I'm in fact using these to make navigation pages for the CD such as Wikipedia:Version 0.5/Language and Literature. There are around 2000 articles in the main category - I didn't think that would count as huge, does it? I think there was a reason for having a global category, though I forget the reason now - maybe Tito knows. Is there a problem with it? Walkerma 03:20, 10 December 2006 (UTC)
- Each time the bot runs on demand, or I need to debug the code, I have to wait until all version 0.5 articles are read. Ideally, next to Category:Physics articles by quality, Category:Physics articles by importance, and Category:Physics articles with comments there would also be a Category:Physics version 0.5 articles, so instead of all the version 0.5 articles being read in bulk at the beginning, they would be read separately for each project when needed. Not a big reason, of course, but I thought it would be nice if things worked that way. Oleg Alexandrov (talk) 05:11, 10 December 2006 (UTC)
- I don't mean that you should reshuffle the entire V0.5 naming scheme. If a physics article shows up both in Category:Physics version 0.5 articles, and in Category:Natural sciences Version 0.5 articles, and in Category:Wikipedia Version 0.5, that would be perfectly fine. The bot would simply read Category:Physics version 0.5 articles to get version information for physics articles, and ignore the other categories. Oleg Alexandrov (talk) 05:17, 10 December 2006 (UTC)
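The per-project reading proposed here could be sketched as a lazy, cached lookup. This Python sketch is illustrative only: `VersionMembership` and the `fetch_category` callable are assumed names standing in for whatever the bot uses to list a category's members, and the category name simply follows the pattern given in the comment above.

```python
from typing import Callable, Dict, Set


class VersionMembership:
    """Reads a project's version 0.5 category only when that project is processed."""

    def __init__(self, fetch_category: Callable[[str], Set[str]]):
        # fetch_category is a stand-in for the real category reader (assumption).
        self._fetch = fetch_category
        self._cache: Dict[str, Set[str]] = {}

    def articles_for(self, project: str) -> Set[str]:
        """Return the project's version 0.5 articles, fetching at most once."""
        if project not in self._cache:
            category = f"Category:{project} version 0.5 articles"
            self._cache[project] = self._fetch(category)
        return self._cache[project]


if __name__ == "__main__":
    fake_wiki = {"Category:Physics version 0.5 articles": {"Quark", "Photon"}}
    membership = VersionMembership(lambda name: fake_wiki.get(name, set()))
    print(sorted(membership.articles_for("Physics")))
```

With this shape, a debugging run for a single project would read one small category instead of the whole root category.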
- The reason we had the root category was that it allowed us to catch articles that "fell through the cracks"; similar to the No-Class importance categories many projects have. If anyone can think of a nice way to ensure that articles won't fall through the cracks, then I see no problem with this change... :) Titoxd(?!?) 05:09, 11 December 2006 (UTC)
- Thanks Tito! I thought there was some reason like that. Oleg, Version 0.5 is small enough that we didn't see the need to break down the categories any smaller. We may need to switch to small categories (such as Physics) for Version 0.7 and later releases, as these releases may get pretty big. I will be working on this sort of thing in the next six weeks, and I'll bear it in mind when I'm setting up the new systems for 0.7. Cheers, Walkerma 07:04, 11 December 2006 (UTC)
- Is there any reason we can't just have the tag add a "Miscellaneous" category explicitly when a valid one isn't provided? That seems much more useful for finding things that fell through the cracks than simply putting everything in one category, no? Kirill Lokshin 10:20, 11 December 2006 (UTC)
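The explicit fallback suggested above can be illustrated with a small sketch. It is written in Python purely for illustration (the real tag is a wiki template), and everything in it is an assumption: `KNOWN_TOPICS` is only a partial, made-up subset of the topic list, and "Category:Miscellaneous Version 0.5 articles" is a guessed name following the pattern of the other categories.

```python
from typing import Optional

# Illustrative subset only; the full topic list lives in the {{V0.5}} template.
KNOWN_TOPICS = {"History", "Arts", "Natural sciences"}

# Assumed name for the catch-all category.
FALLBACK_CATEGORY = "Category:Miscellaneous Version 0.5 articles"


def version_category(topic: Optional[str]) -> str:
    """Pick the topic category, or the catch-all when the topic is missing or invalid."""
    if topic in KNOWN_TOPICS:
        return f"Category:{topic} Version 0.5 articles"
    return FALLBACK_CATEGORY


if __name__ == "__main__":
    for value in ("History", "Hstory", None):
        print(repr(value), "->", version_category(value))
```

Articles with a mistyped or missing topic would then collect in one small category that can be checked by hand, rather than being findable only through the root category.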
(Moved "BozMo's first cut" to Wikipedia talk:Version 1.0 Editorial Team#BozMo's first cut)