User:DOI bot/bugs

Please read this before reporting a bug

Before listing your bug, it would be appreciated if you considered the likely extent of the problem. Most new bugs have resulted from quirky, non-standard ways of specifying data. The bot has already made thousands of edits without your bug being detected – despite very thorough checking of its edits. In many cases, it would probably be quicker to change all the affected references to use the standard format than to re-write extensive portions of the bot.

As I'm now hoping to use my time for more constructive editing of the pedia, please consider that in the space of 2 hours I could either (1) fix a bug or (2) expand an article to GA status.

Current issues

Please report instances of the bot malfunctioning here, including a link to the errant edit. If you like, it would be helpful if you could drop the problematic citation in its entirety in the bot's sandbox.

Please check the date of the edit before reporting it here. If the edit was made before 30 May, the bug has probably been fixed; please check the list of fixed bugs to confirm that it has been reported.

Issues with the manual interface

I just asked DOI bot to make this edit to Dutch famine of 1944. It shouldn't have made that edit without making sure the Wikipedia username field had *some* text in it. I was skim-reading through the page as I'd used DOI bot before, and didn't notice the place to type my username. Also, in the part about PMID parameters, "exhaustative" should be "exhaustive". Graham87 14:55, 2 June 2008 (UTC)

DOI not found: parsing page numbers

This DOI bot edit produced DOI references that get "Error – DOI Not Found", for example here. What's going wrong? --EPadmirateur (talk) 19:24, 7 June 2008 (UTC)

It looks like the publisher has yet to register the DOI with the linking system. Smith609 Talk 21:11, 7 June 2008 (UTC)
Another example in this edit to here. If it's a matter of the publisher not registering with the DOI system... this article was published in 1992, it seems unlikely to be updated in a way that would make that DOI page start working. LyrlTalk C 03:39, 14 June 2008 (UTC)
Oh, I see the issue of broken DOIs may be better addressed by the feature request discussed immediately below. Hm. I hope it works out that the DOI people are able to fix this type of failure to link to a webpage. LyrlTalk C 03:41, 14 June 2008 (UTC)

Bot added non-working DOI to Philitas of Cos

This change to Philitas of Cos added two DOIs, but the first one (to doi:10.1093/cq/46.1.308 from a Classical Quarterly citation) does not work, because the broken DOI resolves to a domain name (cq.oupjournals.org) that no longer exists. CHANGES TO OXFORD JOURNALS reports that The Classical Quarterly transferred to a different publisher; perhaps this explains why the DOI doesn't work any more. So much for DOIs being "stable", huh?

I fixed the problem by removing the broken DOI by hand, but am worried that the DOI bot will just add it again. Is there any way to stop the DOI bot from adding DOIs that resolve to domain names that no longer exist? Short of that, is there any way to inform the DOI bot of DOIs that are known not to work, or at least to tell the DOI bot not to mess with this particular citation when it traverses this page again? Eubulides (talk) 23:40, 8 June 2008 (UTC)

I guess the bot could follow each DOI and see if it encountered a "DOI not found" error (which would slow it down a significant amount). I suspect that the publisher will eventually update the DOI if notified that it is broken. Theoretically, I suppose the bot could provide doi.org with a list of broken DOIs – I'll get in touch with them and see what their response to a long list of broken DOIs would be! Smith609 Talk 08:41, 9 June 2008 (UTC)
Indeed, arguably the DOI itself isn't broken, and is fulfilling its primary function as a document identifier just fine. What's broken is the database entry that allows dx.doi.org to translate that identifier into a working web link, and presumably that will get fixed eventually, at least if someone reports the problem. After all, no-one would call the ISBN for an obscure book "broken" just because Google or Special:BookSources didn't give any results for it. —Ilmari Karonen (talk) 10:16, 9 June 2008 (UTC)
Indeed. The DOI people sound quite helpful, so I think the best way forwards is to leave DOIs which aren't recognised by the database, and report them for fixing. I'll implement this soon! Smith609 Talk 12:06, 9 June 2008 (UTC)
OK, thanks, this all helps. I am a bit skeptical that DOI problems like this one will be fixed quickly in the DOI database. I suspect that what happened here is that the new publisher didn't want to pay the DOI fees, or whatever. Anyway, I'll turn this bug report into a feature request in #Deny DOI bot on a particular citation below. Eubulides (talk) 18:17, 9 June 2008 (UTC)
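
For illustration, the "follow each DOI and see if it encountered a DOI not found error" idea from this thread could look roughly like the Python sketch below. It is not the bot's actual code: the function names, the three status strings, and the use of https://doi.org/ as the resolver (the thread itself refers to dx.doi.org) are all assumptions. Note that it separates a resolver 404 from a target site that cannot be reached, which is the Philitas of Cos situation described above.

```python
import urllib.error
import urllib.parse
import urllib.request

# Assumed resolver address; the discussion above mentions dx.doi.org.
RESOLVER = "https://doi.org/"


def doi_url(doi):
    """Build the resolver URL, percent-encoding characters such as < > [ ]."""
    return RESOLVER + urllib.parse.quote(doi, safe="/")


def check_doi(doi, opener=urllib.request.urlopen, timeout=10):
    """Return "ok", "not_found" or "unreachable" for a DOI (hypothetical helper).

    `opener` is injectable so the function can be exercised without a network.
    """
    request = urllib.request.Request(doi_url(doi), method="HEAD")
    try:
        with opener(request, timeout=timeout):
            return "ok"
    except urllib.error.HTTPError as err:
        # HTTPError must be caught before URLError, which is its parent class.
        return "not_found" if err.code == 404 else "unreachable"
    except urllib.error.URLError:
        # DNS failure, refused connection, etc. (e.g. a publisher domain
        # that no longer exists after the redirect).
        return "unreachable"
```

One request per DOI is what makes this slow, as noted above; a bot would probably want to cache results or only check DOIs it is about to add.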

DOI bot not properly handling refs that already use the doi template

Here are some examples:

Kaldari (talk) 15:07, 9 June 2008 (UTC)

Fixed 17:10, 9 June 2008 (UTC). The bot did its task correctly, but did not account for the pre-existing (additional) templates. It now does so.

Modifying "pages" in single page inline citations

In the following edits that I have noticed, here and here, the bot alters the word "page" to "pages" despite the fact that the inline citation is a reference to a single page and is not the number of pages in the source.

I am not sure if the problem is that the bot does not know the difference between an inline citation and a source entry, or that it just looks at a citation template and assumes in all cases that the number listed is the total number of pages of the cited source. ww2censor (talk) 03:06, 12 June 2008 (UTC)

There is no "page" parameter.

*{{cite journal|title=Using "page" parameter|page=page number}}
*{{cite journal|title=Using "pages" parameter|pages=page number}}
  • "Using "page" parameter" . 
  • "Using "pages" parameter" : page number. 

Smith609 Talk 09:34, 12 June 2008 (UTC)

What are you actually saying here? If there is no "page" parameter, then why is the bot changing "page" to "pages" when a single page is being referenced? Or is the citation template being used incorrectly, even though it comes from the citation templates example page? Sorry, but the answer does not enlighten me one iota! ww2censor (talk) 04:11, 13 June 2008 (UTC)
There is no "page=" parameter with Template:Cite journal. There is only a "pages=" parameter. If you write "page=10", it is ignored. You are supposed to write "pages=10". Admittedly this is a bit weird, but that's how the template works. Where is the "citation templates example page"? If it uses "page=" it should get fixed. Eubulides (talk) 04:46, 13 June 2008 (UTC)
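
As a rough illustration of the rename discussed in this thread, here is a minimal Python sketch that turns "page=" into "pages=" within a single cite journal template, so that the page number is actually displayed. The function name is hypothetical and this is not the bot's own code; it assumes it is given the text of one template and that parameters are introduced by a "|".

```python
import re

# Matches "|page=" (with optional spaces) but not "|pages=".
_PAGE_PARAM = re.compile(r"\|(\s*)page(\s*)=")
_PAGES_PARAM = re.compile(r"\|\s*pages\s*=")


def rename_page_parameter(citation):
    """Rename page= to pages= in one citation template (hypothetical helper)."""
    if _PAGES_PARAM.search(citation):
        # A pages= parameter already exists; leave the citation untouched.
        return citation
    return _PAGE_PARAM.sub(r"|\1pages\2=", citation, count=1)
```

The word "page" inside a title is not affected, because only text directly after a "|" is treated as a parameter name.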

Dead link

Why does DOI bot in this edit claim that the link to the Panero & Funk paper is dead? It works fine for me. Furthermore, the supplied Google Scholar search is broken: it has a spurious "author:" at the start (removing this makes the search work). The cite journal template seems to specify authors in one of the (many) acceptable formats. It would be nice to straighten this out, as this is a very widely cited paper (both in Wikipedia and outside it). Kingdon (talk) 04:19, 12 June 2008 (UTC)

The problem was that the URL in the source code contained an &amp; in place of an &; the URL as written in the source returned a 404 page. I've now recoded the bot to replace &amp;s in URLs with &s, and fixed the scholar search. Thanks for spotting this! Smith609 Talk 09:15, 12 June 2008 (UTC)
Thanks for the prompt fix. Kingdon (talk) 11:16, 12 June 2008 (UTC)
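
A minimal Python sketch of the kind of clean-up described in this thread, assuming the only problem is an HTML-escaped ampersand in the URL. The function name is hypothetical and the bot's real implementation may differ.

```python
def clean_url(url):
    """Replace HTML-escaped ampersands in a URL before testing it (hypothetical helper)."""
    # Repeat until stable so that double-escaped forms such as "&amp;amp;"
    # are also reduced to a plain "&".
    while "&amp;" in url:
        url = url.replace("&amp;", "&")
    return url
```

Only the ampersand entity is touched; decoding every HTML entity would risk altering URLs that legitimately contain other escaped characters.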

Another Dead link

DOI bot has twice flagged a reference in Archaeoastronomy as a dead link [1]. The link is in fact good. Could you correct this error? --SteveMcCluskey (talk) 19:50, 13 June 2008 (UTC)

I'll see what I can do. Smith609 Talk 20:44, 13 June 2008 (UTC)

Feature requests

  • Any ideas for improvement? Pop them in here.

Forbid DOI bot

  1. Create a comment parameter <!-- doibot = disable --> one can add to the top of an article so that the DOI bot doesn't traverse it. This would be useful for those preferring to manually control {{cite journal}} internal links. -- alexgieg (talk) 16:34, 3 May 2008 (UTC)
The bot respects the {{bots}} tag. Use "bots|deny=DOI bot".
Ah! I didn't know about that template. Thank you! Not that I'm actually going to deny yours, but it's good to know about the existence of such nice features. I'm slowly, slowly leaving newbie-land. :-) -- alexgieg (talk) 13:02, 12 May 2008 (UTC)

DOI bot ignores bots tag

As this edit shows, the {{bots|deny=DOI bot}} tag does not work: the DOI bot ignores the tag and updates the article anyway. I see from this edit that User:Smith609 knows of the problem, but thought I'd document it here for others to see. Eubulides (talk) 15:37, 12 June 2008 (UTC)

Just to confirm that this was fixed as soon as I spotted it. Smith609 Talk 18:24, 13 June 2008 (UTC)
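
For readers wondering what respecting the tag involves, here is a simplified Python sketch of a check a bot might run before editing a page. It is an illustration under assumptions, not the bot's actual code: the function name is hypothetical, and it only handles {{nobots}} and the deny= form of {{bots}} quoted above, not the allow= form or other variants.

```python
import re

_NOBOTS = re.compile(r"\{\{\s*nobots\s*\}\}", re.IGNORECASE)
_BOTS_DENY = re.compile(r"\{\{\s*bots\s*\|\s*deny\s*=([^}|]*)", re.IGNORECASE)


def bot_may_edit(wikitext, bot_name):
    """Return False if the page asks this bot to stay away (hypothetical helper)."""
    if _NOBOTS.search(wikitext):
        return False
    match = _BOTS_DENY.search(wikitext)
    if match:
        # deny= holds a comma-separated list of bot names, or "all".
        denied = {name.strip().lower() for name in match.group(1).split(",")}
        if "all" in denied or bot_name.lower() in denied:
            return False
    return True
```

For example, bot_may_edit("{{bots|deny=DOI bot}} ...", "DOI bot") is False, so the page would be skipped.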

PMID addition

  • Query PMID and add a PMID parameter, where one exists.
Done using thorough search mode.

Details from SICI

  • Extract useful information from DOIs of the format 10.1666/0094-8373(2008)034[0210:EBTICM]2.0.CO;2
i.e. ISSN(year)volume[page:LETTERS]2.0.CO;2
or 0016-7398(193702)89:2<156:LUIEAT>2.0.CO;2-C
i.e. ISSN(year0Month)Volume:issue<startPage:LETTERS>2.0.CO;2
Done 14:21, 31 May 2008 (UTC)
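
To make the two layouts above concrete, here is a Python sketch that pulls the fields out with regular expressions. It is an illustration only, not the bot's implementation: the function name and field names are hypothetical, it assumes purely numeric volume, issue and page values, and it covers only the two patterns listed in this request rather than the full SICI standard.

```python
import re

_ISSN = r"(?P<issn>\d{4}-\d{3}[\dXx])"
_TAIL = r"2\.0\.CO;2"

# ISSN(year)volume[page:LETTERS]2.0.CO;2
_BRACKET_FORM = re.compile(
    _ISSN + r"\((?P<year>\d{4})\)(?P<volume>\d+)"
    r"\[(?P<page>\d+):(?P<letters>\w+)\]" + _TAIL
)
# ISSN(year + month)volume:issue<startPage:LETTERS>2.0.CO;2
# Any digits after the month (e.g. a day) are matched but ignored.
_ANGLE_FORM = re.compile(
    _ISSN + r"\((?P<year>\d{4})(?P<month>\d{2})?\d*\)"
    r"(?P<volume>\d+):(?P<issue>\d+)"
    r"<(?P<page>\d+):(?P<letters>\w+)>" + _TAIL
)


def parse_sici(doi):
    """Extract citation details from a SICI-style DOI, or return None."""
    # Some copies of these DOIs carry an en dash in the ISSN; normalise it.
    text = doi.replace("\u2013", "-")
    for pattern in (_BRACKET_FORM, _ANGLE_FORM):
        match = pattern.search(text)
        if match:
            fields = {k: v for k, v in match.groupdict().items() if v is not None}
            for key in ("year", "month", "volume", "issue", "page"):
                if key in fields:
                    fields[key] = int(fields[key])  # drops leading zeros
            return fields
    return None
```

Applied to the first example, this yields ISSN 0094-8373, year 2008, volume 34 and start page 210; applied to the second, ISSN 0016-7398, year 1937, month 2, volume 89, issue 2 and start page 156.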

Deny DOI bot on a particular citation

As discussed in #Bot added non-working DOI to Philitas of Cos, it is sometimes preferable to omit a DOI from a citation, even when one is available, because the DOI does not actually work. Perhaps it's temporary that the DOI does not work, and perhaps not, but either way we need to tell the DOI bot not to add that DOI. For Philitas of Cos I worked around the problem with this hack, which replaced this:

  • |doi= 10.1093/cq/46.1.308}}

with this:

  • }}<!-- On 2008-06-08 the DOI bot added doi=10.1093/cq/46.1.308 but this does not work, so disable the bot for now. Too bad that we have to disable it for the whole article. -->{{bots|deny=DOI bot}}

However, this denies the DOI bot for the entire article, whereas it'd be better to deny the bot only for this particular citation.

Here's a suggestion. Support a syntax like "|broken_doi=10.1093/cq/46.1.308", to tell the DOI bot that we know the DOI and that it's broken. The DOI could even appear in the article, but as plain text (not as a link that can be followed). Eubulides (talk) 18:17, 9 June 2008 (UTC)

I think I'm with Ilmari Karonen on this. The DOI is not broken; crossref is – therefore the DOI should be left intact. However, there may be scope for a new parameter which tells the template not to render the DOI as a link, and perhaps to add a flag beside it. However, this is not the place to discuss changes to the cite journal template: I'd suggest you propose your changes there; the bot will be updated as soon as consensus is reached there. Smith609 Talk 09:32, 10 June 2008 (UTC)
Thanks for the suggestion. I followed up at Template talk:Cite journal #Request for doi_broken parameter. Eubulides (talk) 16:50, 10 June 2008 (UTC)
In followup to that suggestion, a further suggestion was made on Template talk:Cite journal #Request for doi_broken parameter to modify Template:Cite journal to ignore doi= if it has the special value "deadlink", and to modify the DOI bot to not change doi= if its value is "deadlink". Does this suggestion seem good? I will follow up there rather than here. Eubulides (talk) 21:22, 13 June 2008 (UTC)