Wikipedia:Bots/Requests for approval/Lightbot
- The following discussion is an archived debate. Please do not modify it. Subsequent comments should be made in a new section. The result of the discussion was Approved.
Lightbot
Operator: Lightmouse (talk)
Automatic or Manually Assisted: Manually assisted
Programming Language(s): Monobook or AWB
Function Summary: Janitorial edits mainly to units and dates.
Edit period(s) (e.g. Continuous, daily, one time run): Continuous
Already has a bot flag (Y/N): No.
Function Details: Janitorial edits mainly to units and dates. Examples include:
- Changing '|sqm|', '|cum|' and '|knot|' to '|m2|', '|m3|' and '|kn|' when in the convert template (no visible effect to the reader but rationalises the template). Very low false positive rate.
- Fixing damaged date links that damage autoformatting e.g. [[November 5th]] (should be [[November 5]]). Very low false positive rate.
- Fixing dates that are damaged by autoformatting such as date ranges e.g. [[1 May|1]] to [[4 May]] should be simply '1 to 4 May' (to stop autoformatting converting it to "1 to May 4"). Low false positive rate.
- Unlinking date fragments such as links to solitary months ([[February]]), solitary days of the week ([[Tuesday]]), digits ([[16]]). Some false positives possible but I know some of the common ones and will check by eye when doing these.
I have done thousands of script assisted edits of this kind as Lightmouse. Low error rate tasks will be transferred to Lightbot. See contributions of Lightbot.
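The first two tasks listed above can be sketched in Python as a pair of regex substitutions. This is only an illustration of the idea, not the operator's actual Monobook or AWB rules, which are not shown in this request. The function names, the `UNIT_CODES` mapping, and the exact patterns are assumptions made for the example.

```python
import re

# Assumed mapping, taken from the three unit codes named in the request.
UNIT_CODES = {"sqm": "m2", "cum": "m3", "knot": "kn"}

# Matches a simple (non-nested) {{convert|...}} transclusion.
CONVERT_RE = re.compile(r"\{\{\s*convert\s*\|[^{}]*\}\}", re.IGNORECASE)
# Matches an old unit code only when it fills a whole template parameter.
UNIT_RE = re.compile(r"\|(sqm|cum|knot)(?=\||\}\})")

MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")
# Matches month-first date links with an ordinal suffix, e.g. [[November 5th]].
ORDINAL_LINK_RE = re.compile(
    r"\[\[(" + MONTHS + r") (\d{1,2})(?:st|nd|rd|th)\]\]"
)


def fix_convert_units(wikitext: str) -> str:
    """Rewrite old unit codes, but only inside convert templates."""
    def fix_template(match: re.Match) -> str:
        return UNIT_RE.sub(lambda m: "|" + UNIT_CODES[m.group(1)], match.group(0))
    return CONVERT_RE.sub(fix_template, wikitext)


def fix_ordinal_date_links(wikitext: str) -> str:
    """Drop the ordinal suffix so the link can be autoformatted."""
    return ORDINAL_LINK_RE.sub(r"[[\1 \2]]", wikitext)
```

Restricting the unit substitution to text inside a convert template is what keeps the false positive rate low in this sketch: a stray `|sqm|` in a table or another template is left alone.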
Discussion
Seems fine, assuming you check manually for false positives on the last bullet point. --Apoc2400 (talk) 22:00, 25 May 2008 (UTC)
- You say "Very low false positive rate." a few times. Can you give examples of these false positives? dihydrogen monoxide (H2O) 09:35, 26 May 2008 (UTC)
- Actually, I say 'Very low' for the first two bullets. I say plain 'low' for the third bullet. The fourth bullet has different wording. I base my estimates on thousands of edits as Lightmouse. I will use low error rate parts of the same script.
- "Changing '|sqm|'": I have yet to detect or imagine a false positive scenario with the proposed regex. It would be naive to predict zero. So I hypothesised 'very low'.
- "Date links that damage autoformatting": I have yet to detect or imagine a false positive scenario with the proposed regex. It would be naive to predict zero. So I hypothesised 'very low'.
- "Fixing dates that are damaged by autoformatting": This is a more difficult regex problem. With range examples, it is easy to address the first half of a range (or just bad formatting) such as [[1 May|1]]. It is more difficult to correctly address the second half of a range [[4 May]]. That is where the theoretical possibility of false positives exists i.e. I want to delink '4 May' in a date range but not otherwise.
- Unlinking date fragments. For example false positives for delinking day names occur in references to calendars and gods i.e. I want to delink 'Thursday' when it is just the day that a TV show airs but not when referring to the god 'Thor'. Of the four bullet points here, this one needs the most care. I have done thousands of these and I know what to check for. I would be happy to do test runs. I hope that helps. Lightmouse (talk) 11:13, 26 May 2008 (UTC)
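The two harder cases described above, date ranges and solitary date fragments, might be approached along the following lines in Python. This is a hedged sketch rather than the regex actually proposed for Lightbot. The names `fix_date_range` and `unlink_fragments`, and the `KEEP_CUES` list of context words, are hypothetical. The cue list is a crude stand-in for the manual check by eye that the operator describes.

```python
import re

MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")
DAYS = "Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday"

# [[1 May|1]] to [[4 May]]: the second link is only touched when it directly
# follows a piped first half that names the same month (backreference \2).
RANGE_RE = re.compile(
    r"\[\[(\d{1,2}) (" + MONTHS + r")\|\1\]\]"
    r"(\s*(?:to|-|–)\s*)"
    r"\[\[(\d{1,2}) \2\]\]"
)

# Solitary months, day names, or one- or two-digit numbers linked on their own.
FRAGMENT_RE = re.compile(r"\[\[(" + MONTHS + "|" + DAYS + r"|\d{1,2})\]\]")

# Assumed cue words suggesting the link is about the calendar or mythology.
KEEP_CUES = ("calendar", "god", "named after", "etymology")


def fix_date_range(wikitext: str) -> str:
    """Turn '[[1 May|1]] to [[4 May]]' into '1 to 4 May'."""
    return RANGE_RE.sub(r"\1\3\4 \2", wikitext)


def unlink_fragments(line: str) -> str:
    """Unlink date fragments unless the line looks like a genuine reference."""
    if any(cue in line.lower() for cue in KEEP_CUES):
        return line  # leave for manual review
    return FRAGMENT_RE.sub(r"\1", line)
```

Anchoring the second half of the range to the first half is one way to delink '4 May' in a range but not elsewhere; a standalone `[[4 May]]` does not match either pattern.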
While you're editing units, I wonder whether you would be able to implement a procedure that places a non-breaking space between any digit and an ensuing unit, as per the manual of style. I think that using a regexp as simple as \d\s\w would suffice; even if false positives did occur, the change in space style would cause no user inconvenience. The scope could be extended with a regexp such as (\s\d+)(\w*) (replaced by $1 $2), but I wonder whether that would produce more false positives. Cheers, Smith609 Talk 13:53, 26 May 2008 (UTC)
- I am not convinced that non-breaking spaces are a net benefit once the upsides and downsides are weighed. So I do not choose to write, check, debug and maintain regex for them. I know that mine is a minority view. However, please note that I use the convert template. That template includes non-breaking spaces as per the MOS. The net effect of any edit that adds the convert template is to give you what you want. Lightmouse (talk) 17:32, 26 May 2008 (UTC)
I think there would be many false positives when a digit is used inside a name or in codes of various kinds. --Apoc2400 (talk) 19:42, 26 May 2008 (UTC)
- Please set Smith609's non-breaking space question and his/her proposed code to one side. It is not part of my request for bot approval. Lightmouse (talk) 19:54, 26 May 2008 (UTC)
Okay, thanks for considering it, and sorry for sidetracking discussion! Smith609 Talk 08:18, 27 May 2008 (UTC)
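For readers following the side discussion above, the false positive concern with `\d\s\w` can be illustrated with a short Python sketch. It is not part of the bot request. The `UNITS` whitelist and both function names are assumptions made for the example, and the narrower pattern is just one possible way to limit matches to digits followed by a known unit symbol.

```python
import re

# The broad pattern suggested in the thread: any digit, whitespace, word character.
BROAD_RE = re.compile(r"(\d)\s(\w)")

# Hypothetical whitelist of unit symbols; longest first so 'mm' wins over 'm'.
UNITS = ("km", "kg", "cm", "mm", "kn", "m", "g")
NARROW_RE = re.compile(
    r"(\d) (" + "|".join(sorted(UNITS, key=len, reverse=True)) + r")\b"
)


def add_nbsp_broad(text: str) -> str:
    """Insert &nbsp; after any digit that is followed by a space and a word."""
    return BROAD_RE.sub(r"\1&nbsp;\2", text)


def add_nbsp_narrow(text: str) -> str:
    """Insert &nbsp; only between a digit and a whitelisted unit symbol."""
    return NARROW_RE.sub(r"\1&nbsp;\2", text)


sample = "Windows 7 Ultimate weighs 5 kg"
print(add_nbsp_broad(sample))   # Windows 7&nbsp;Ultimate weighs 5&nbsp;kg
print(add_nbsp_narrow(sample))  # Windows 7 Ultimate weighs 5&nbsp;kg
```

The broad pattern also rewrites the space inside the product name, which is the kind of digit-in-a-name case raised in the thread.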
- Sounds good to me. A bot to clean up {{convert}} transclusions would be useful. JIMp talk·cont 20:27, 29 May 2008 (UTC)
Any news on this? Lightmouse (talk) 21:06, 4 June 2008 (UTC)
- Let's try an Approved for trial (100 edits) to see how it works. MBisanz talk 21:18, 4 June 2008 (UTC)
Trial edits complete. Lightmouse (talk) 09:55, 5 June 2008 (UTC)
- {{BAGAssistanceNeeded}} I've taken a look at 1/4 of the edits and see no mistakes; I propose approval. BJTalk 10:03, 5 June 2008 (UTC)
- The above discussion is preserved as an archive of the debate. Please do not modify it. Subsequent comments should be made in a new section.