Wikipedia talk:WikiProject Ships/New articles
Proposed changes
Hi - Can you add the following keywords to the ruleset? For article titles: USAT, USAR, USAHS, (ship), MV, MS, M/V, M/S. For article content: DANFS, {{DANFS}}, [[DANFS]]. Hopefully that will pick up some articles that are slipping past the bot. At least, I don't see those keywords in the current ruleset, so I assume that is the problem. Thanks --Brad (talk) 20:46, 1 February 2008 (UTC)
- Sure; dunno how I missed some of those, especially USAT. I'll start working on it. By chance, can you give me any actual examples of articles that didn't get picked up? Might help me catch them while avoiding false positives. Thanks. Maralia (talk) 22:06, 1 February 2008 (UTC)
- This was your bot? I didn't know :) What brought this up was our friend Woodruff. If you look at his last 50 edits, only one or two of his new articles show up on the new list. That's what led to me finding out how many articles he's been creating. There were also a few articles that had (ship) in the article title. --Brad (talk) 13:03, 2 February 2008 (UTC)
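For illustration, the requested keywords could be expressed as regular expressions roughly like this. This is a minimal Python sketch, not the bot's actual rule syntax. The names `TITLE_PATTERN`, `CONTENT_PATTERN`, `title_matches` and `content_matches` are hypothetical, and the sketch assumes the ship prefixes appear at the start of a title and "(ship)" at the end.

```python
import re

# Assumption: prefixes such as "USAT" start the title and are followed by a space;
# "(ship)" is a trailing disambiguator. Neither detail is confirmed by the bot's docs.
TITLE_PATTERN = re.compile(
    r"^(?:USAT|USAR|USAHS|M/V|M/S|MV|MS)\s"  # ship prefix, then whitespace
    r"|\(ship\)\s*$"                         # title ending in "(ship)"
)

# A word-bounded "DANFS" also matches inside {{DANFS}} and [[DANFS]].
CONTENT_PATTERN = re.compile(r"\bDANFS\b")


def title_matches(title: str) -> bool:
    """Return True if the article title contains one of the requested keywords."""
    return TITLE_PATTERN.search(title) is not None


def content_matches(text: str) -> bool:
    """Return True if the article text mentions DANFS in any of the three forms."""
    return CONTENT_PATTERN.search(text) is not None
```

Requiring whitespace after the prefix is one way to avoid false positives such as a title that merely begins with the letters "MS".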
Point threshold
The point threshold on the filter was set at 50: an article had to score at least that much to register as a new ship article. Seeing as many articles have been getting past the filter lately, I recently lowered the point threshold to 40. That seems to have helped some; at least it appears there are more articles hitting the New list. We'll give this a few days to see how things go from here. Reading more about how the filter works, I think it would be a good idea for the ruleset changes to give the various "DANFS" mentions in article body text a point score equal to the filter threshold, meaning an instant hit. --Brad (talk) 00:16, 2 March 2008 (UTC)
- Yeah I noticed the threshold change this morning. I've been rewriting the ruleset offline, testing regex in AWB to tighten up the syntax and rules. Adding catches for Categories, and improving the catch for stub types, should help (although of course not all new articles will have either). I'm also adding USAHS etc. per your suggestions above, and various terms that didn't occur to me the first time (barque, sidewheeler, sternwheeler, etc). DANFS actually needs some qualifiers - DANFS info is used on a lot of biography articles, and other naval articles like various battles, operations, convoys, etc.
- Going through the logs, which show articles that were 'caught' but failed to meet the threshold, I really don't see very many false fails. While it's of course true that an article has to have at least one match to be listed in the logs even as a fail, with extremely generic rules like 'ship' and 'submarine', it seems highly unlikely that more than a few articles would fail to be caught at all. I really can't explain the Prof mystery. Maralia (talk) 03:43, 2 March 2008 (UTC)
- There is some conversation at User_talk:AlexNewArtBot#Bot missing matches (New Zealand) and the topic below that one regarding missing articles. Apparently there have been some network problems resulting in the bot not being able to do its work. That might explain why articles are getting missed, if the ruleset isn't at fault. As far as DANFS is concerned, a bio article with DANFS content is likely to have some relation to ship names or class names; in those cases I've been assigning a low-importance project tag. --Brad (talk) 07:45, 2 March 2008 (UTC)
- Hmm, mind bringing up tagging bios at WT:SHIPS? I don't tag them except in very rare circumstances, and I wouldn't guess that most anyone else does either. I don't know if it's a good idea or not—I don't feel very strongly about it either way—but it would be good to discuss it so we can all be consistent. Maralia (talk) 15:39, 2 March 2008 (UTC)
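The threshold idea discussed in this section can be sketched in Python as follows. This is only an illustration under stated assumptions: the rule list, the point values and the names `RULES`, `score` and `is_hit` are hypothetical, and the sketch assumes each rule counts at most once per article. Only the threshold of 40 and the proposal to score a DANFS mention at the full threshold come from the discussion.

```python
import re

THRESHOLD = 40  # from the discussion: lowered from 50 to 40

# (pattern, points) pairs. The point values are invented for illustration only.
RULES = [
    (re.compile(r"\bship\b", re.IGNORECASE), 10),
    (re.compile(r"\bsubmarine\b", re.IGNORECASE), 10),
    # Proposed: a DANFS mention scores the whole threshold, i.e. an instant hit.
    (re.compile(r"\bDANFS\b"), THRESHOLD),
]


def score(text: str) -> int:
    """Sum the points of every rule that matches at least once (assumed behaviour)."""
    return sum(points for pattern, points in RULES if pattern.search(text))


def is_hit(text: str) -> bool:
    """An article is listed when its score reaches the threshold."""
    return score(text) >= THRESHOLD
```

The sketch also shows the concern raised above: with an instant-hit rule, a biography or battle article that merely cites DANFS would be listed too, so the DANFS rule would probably need qualifiers.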
Missed articles
- USS Sargent George D Keathley (APC-117) which was moved to USNS Sargent George D Keathley (T-APC-117); USNS article had no hits --Brad (talk) 21:20, 2 March 2008 (UTC)
- Aha, finally a specific case to dig into! My suspicion after a glance at the article is that it may have been caught but failed the threshold, because none of the existing big terms (ship etc) were in the lead (where they get counted double towards the threshold). Will look up logs etc now. Maralia (talk) 01:32, 3 March 2008 (UTC)
- I was off, but it's starting to make sense. It seems that the bot only checks literally new articles, not moved articles (as, logically, they are preexisting content just renamed). This is likely the explanation for Prof's articles being 'missed' - they were often getting moved due to poor naming. Considering that in months of tagging new articles from the feed, I never noticed any old articles pop up due to renaming, it seems highly likely that this is the case.
- The USS article was caught by the bot, but failed to meet the threshold as it only matched the single term 'USNS' - because the only content of the USS article is the redirect to the USNS article name.
- I'm not sure yet how we can work around this, but I'm glad to have something more than mysterious regex failures or network problems to work with. Maralia (talk) 02:08, 3 March 2008 (UTC)
- SS Talune --Brad (talk) 21:35, 3 March 2008 (UTC)
- Brookes (ship)
- Sub alarms
- Wasa 30
- CCGC Cape Sutil
- Supermarine Sea Eagle
- Eurydice (S644)
- Pohang class corvette --Brad (talk) 00:44, 8 March 2008 (UTC)
- Okay. A few of those will be caught with better search terms; some of them, though, won't be caught unless we want the list to include, like, any article that uses the word 'ship' or 'submarine' a single time. I'm not sure that's a great idea. In any case, I've just hugely revamped the search terms; let's see how terrifying the list looks at the next run. Maralia (talk) 05:31, 8 March 2008 (UTC)
- It would be silly to expect any filter to be perfect, but I'm sure your changes will help. Until now I hadn't paid much attention to the workings of the bot, just to what the bot reported. I think the bot itself is a bit flaky at times, which contributes to it missing articles. --Brad (talk) 11:17, 8 March 2008 (UTC)
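The Keathley case above can be reproduced with a small Python sketch. It assumes, as described in the thread, that a term found in the lead counts double. Everything else is hypothetical: the point values, the rule that the lead is whatever precedes the first "==" heading, the choice to count a term once (at the lead rate if it appears there), and the names `RULES`, `split_lead` and `score`.

```python
import re

THRESHOLD = 40

# Hypothetical point values, chosen only to make the arithmetic easy to follow.
RULES = [
    (re.compile(r"\bship\b", re.IGNORECASE), 15),
    (re.compile(r"\bUSNS\b"), 10),
]


def split_lead(wikitext: str):
    """Return (lead, rest); the lead is assumed to end at the first '==' heading."""
    match = re.search(r"^==", wikitext, flags=re.MULTILINE)
    if match is None:
        return wikitext, ""
    return wikitext[:match.start()], wikitext[match.start():]


def score(wikitext: str) -> int:
    """Score a page, doubling the points of any term that appears in the lead."""
    lead, rest = split_lead(wikitext)
    total = 0
    for pattern, points in RULES:
        if pattern.search(lead):
            total += 2 * points  # lead matches count double
        elif pattern.search(rest):
            total += points
    return total


# The page left behind by the move contains only a redirect to the new title.
redirect_page = "#REDIRECT [[USNS Sargent George D Keathley (T-APC-117)]]"
```

Under these assumptions the redirect page matches only 'USNS' and stays below the threshold, while an article that mentions "ship" only in its body earns just the undoubled points for that term.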
Watching for templates
Is it possible to add parameters to watch for newly created templates? I'm basing this on the number of infoboxes that wp:ships is trying to replace. If a new template were created that duplicated one we already have, being aware of it early would prevent many articles from having to be converted later. --Brad (talk) 04:04, 20 May 2008 (UTC)