Wikipedia talk:AutoWikiBrowser/Typos/Archive

Womens'

I'm trying to correct many instances of "womens'" or "womens" to "women's", but I'm having trouble grabbing that trailing apostrophe in the regex. Can someone help me with the syntax? I'm wondering whether this is an AWB bug or whether apostrophes need special handling. --Thiseye 00:19, 13 January 2007 (UTC)

It seems there is some sort of problem related to identifying the end of a word. However, using a whitespace instead of a wordbreak seems to work.
"\b(W|w)omens'(\s)" -->> "$1omen's$2"
Gaius Cornelius 13:08, 13 January 2007 (UTC)
Thanks, that did the trick. :) --Thiseye 01:59, 14 January 2007 (UTC)
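
The word-break behaviour discussed above can be reproduced outside AWB. This is a minimal sketch that uses Python's `re` module as a stand-in for AWB's regex engine (an assumption; `$1` is written `\1` in Python). The sample sentence is invented for illustration.

```python
import re

text = "the womens' team and the Womens' league"

# \b only matches between a word character and a non-word character.
# The apostrophe and the space after it are both non-word characters,
# so there is no boundary after "womens'" and this pattern never matches.
with_boundary = re.compile(r"\b(W|w)omens'\b")
boundary_hit = with_boundary.search(text)

# The suggested rule anchors on the following whitespace instead and
# writes it back through the second group.
with_space = re.compile(r"\b(W|w)omens'(\s)")
fixed = with_space.sub(r"\1omen's\2", text)

print(boundary_hit)
print(fixed)
```

Note that the whitespace version will not fire when the word is followed directly by punctuation or the end of the text.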

Greece

There's an error in the "Greece" entry. It should have $1, not $2. --Thiseye 01:43, 3 January 2007 (UTC)

Gandhi

There are two entries for Gandhi. I believe the newest one was added to avoid some false positives, but the old one wasn't removed. --Thiseye 01:26, 2 January 2007 (UTC)

Poss reconsider

Bizarre as in Some Bizarre Records. Rich Farmbrough 22:53 11 August 2006 (GMT).

Attempt

If a fix for attemp is desired, "\b(A|a)tt?em(p|t)(|ed|ing|s)\b" --> "$1ttempt$3" seems to work for all cases. I don't think it matches any real words.—Mrkwcz 17:23, 12 August 2006 (UTC)
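
A quick way to check the claim is to run the proposed rule over a few misspellings and over the correct forms. This sketch assumes Python's `re` behaves like AWB's engine for this pattern; the word list is made up.

```python
import re

# Proposed rule, with "$1ttempt$3" written in Python's "\1ttempt\3" form.
rule = re.compile(r"\b(A|a)tt?em(p|t)(|ed|ing|s)\b")

def fix(word):
    return rule.sub(r"\1ttempt\3", word)

misspelled = {w: fix(w) for w in ["attemp", "Atemted", "attemping", "attemps"]}
# Correct forms should pass through untouched.
correct = {w: fix(w) for w in ["attempt", "attempts", "attempted"]}

print(misspelled)
print(correct)
```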

Opposites

What about alternative beginnings to words, as in opposites like accessible and inaccessible? Instead of having two separate entries to check and maintain, we could easily just have one:

<Typo word="Accessible/Inaccessible" find="\b(A|a|Ina|ina)ccessab(le|ility)\b" replace="$1ccessib$2" />

This would simply require a rule that opposites (starting with in-, un-, etc) should not be placed alphabetically, but placed with their root word, and in many cases in the same regex.

Another strategy would be a rule that any word covered like this outside its normal alphabetical order should have a comment line placed in the alphabetical list where it would have gone.

Euchiasmus 12:55, 20 August 2006 (UTC)

Sounds like a great idea; reducing duplication is always good. Thanks. Martin 18:58, 20 August 2006 (UTC)
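
The combined entry proposed above can be tried on both the root word and its opposite. A minimal sketch, assuming Python's `re` as a stand-in for AWB's engine; the sample words are illustrative only.

```python
import re

# One rule for Accessible and Inaccessible; "\1" is Python's form of "$1".
rule = re.compile(r"\b(A|a|Ina|ina)ccessab(le|ility)\b")

def fix(word):
    return rule.sub(r"\1ccessib\2", word)

samples = ["accessable", "Inaccessable", "inaccessability", "accessible"]
results = {s: fix(s) for s in samples}
print(results)
```

The first group carries the whole prefix, so the replacement restores "A", "a", "Ina" or "ina" exactly as found.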

Victuals and eke

I removed these new additions:

Typo word="Victuals" find="\b(V|v)ittles\b" replace="$1ictuals"

Typo word="Eke" find="\b(E|e)e(ke|ked|kes|king)\b" replace="$1$2"

Typo word="Eke" find="\b(E|e)e(k)\b" replace="$1ke"

Typo word="Ekes" find="\b(E|e)eks\b" replace="$1kes"

"Vittles" is so old a misspelling that it's kind of its own word now, not to mention the cat food Tender Vittles, etc. (see the Google search).

"Eek" is a really common onomatopoeia for screaming, among other things. There are a lot of false positives on this Google search, and the words on the list should have 0 false positives. "Eeks" seems the same (lots of legit uses), there seem to be two legit uses of "eeke", and there are only 4 mainspace results for "eeked", 2 for "eeking", and none for "eekes". --Galaxiaad 18:34, 25 August 2006 (UTC)

Ah, sorry. I even happen to know that Eek is a town in Alaska (you can't get there from here, or even from there—Google Maps fails!). But has "eek"-the-onomatopoeia been verbed? "The scream queen eeked out a living"? BTW, I've just made probably a few hundred changes to the list—I gather that you're genuinely interested, so you might want to take a gander.--BillFlis 23:36, 25 August 2006 (UTC)
Yeah, all the changes are impressive and a bit overwhelming. I definitely want to look though. I didn't mean to sound harsh in my previous comment; sometimes it's hard for me to sound human instead of just stating facts, heh. Hm, doesn't look like it's been verbed, but there is the plural in "Eeks and Squeaks". (The instances of "eeked" actually were typos for "eked" but there were only 4, which isn't enough to merit inclusion.) Hey, I'm just wondering and you'd probably know: what does the word="whatever" bit actually do? --Galaxiaad 13:58, 26 August 2006 (UTC)
Actually, I thought your points were well-taken. I figure the word="whatever" bit is just informational. My understanding of the AWB is that it's not a bot, it just helps someone make the same kind of edit over and over very quickly. I have a question about how it uses this typo list: I've noticed that some of the rules here have sort of the opposite of a false positive; that is, the correct spelling will trigger a change, back to the correct spelling. There's no harm done, but isn't this inefficient? Should I be stamping these cases out?--BillFlis 14:27, 26 August 2006 (UTC)
The word property means they can be sorted in proper alphabetic order (sorting by order of the typo was very difficult to deal with, as duplicates were not adjacent to each other). It also allows easy location of a specific word, which will hopefully avoid future duplication, and probably explains the enormous amount of duplication that previously existed. Matching the correct spelling is much more efficient than having 2 separate regexes (which is how it used to be) but not as efficient as having a single regex that manages to avoid the correct spelling, so yes, avoid them when possible, but if it is becoming complicated then it doesn't really matter. And thanks for all the work you have done on this! Martin 14:48, 26 August 2006 (UTC)

Airbourne

I got a false positive running AWB when Airbourne was changed to Airborne. Should this be removed? — Loudsox 16:46, 27 August 2006 (UTC)

I think it should be removed, or maybe changed. I think a more likely misspelling is airborn. What would be really nice is some way to tag a word within the encyclopedia as a deliberate misspelling, like adding "[sic]".--BillFlis 17:34, 27 August 2006 (UTC)

Regexes that match the correct spelling

Sometimes a regex, in providing matches for a variety of possible misspellings, matches the correct spelling. As best I can tell, AWB stops on an article when the regex matches the correct spelling and therefore makes no change.

Example: for "Apparel", the regex

(A|a)pp?arr?e(l|ls|ling|lling|led|lled) 

corrects "Aparrel", "Aparel", and "Apparrel". Unfortunately, those alternatives allow "Apparel" to match, so AWB stops on "Apparel" but shows no diff. Example article: Jones Apparel Group.

So, 1) is what I'm saying true; 2) is there a preference against such regexes; 3) is there a way to fix the regex (while keeping only one regex) to avoid this? (And/or, can AWB be programmed to realize that a null edit has occurred?) Thanks, –Outriggr § 01:23, 16 September 2006 (UTC)

Well, I just played with the "Skip article when no change made" setting (which I could swear was on by default, or that I have always had it on), and I see that AWB no longer stops in the above case. Not such an issue then? –Outriggr § 01:31, 16 September 2006 (UTC)
I've been told that the regex does make the "change" (to the same correct spelling, thus useless work) and is thus wasteful of resources. Have I been given some bad info? I've been trying to stamp out such cases, but maybe the program is smart enough to recognize (i.e., it checks) whether any real change is made, and I'm the one doing the useless work!--BillFlis 20:15, 16 September 2006 (UTC)
The program is smart enough to know if a change was actually made, but it is slightly preferable not to match the correct spelling, though not critical. I suppose it might be more critical in the future if some other software wanted to make use of this list though. Martin 09:31, 17 September 2006 (UTC)
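
The "match but no change" case can be detected mechanically by comparing the text before and after substitution. The sketch below uses the Apparel pattern quoted above; the replacement string is an assumption, since the thread only gives the find expression, and `apply_rule` is a hypothetical helper, not an AWB function.

```python
import re

# Find expression as quoted in the thread.
find = re.compile(r"(A|a)pp?arr?e(l|ls|ling|lling|led|lled)")
# Assumed replacement ("$1ppare$2" in AWB notation); not given in the thread.
replace = r"\1ppare\2"

def apply_rule(text):
    """Return (new_text, number_of_matches, whether_text_changed)."""
    new_text, count = find.subn(replace, text)
    return new_text, count, new_text != text

typo = apply_rule("Jones Aparrel Group")
correct = apply_rule("Jones Apparel Group")

print(typo)
print(correct)
```

For the correctly spelled title the rule still reports one match, but the text is unchanged, which is the situation the "Skip article when no change made" setting handles.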

Suggestion of a change

How about "alot" to "a lot"? I am not sure how to program it, though.--Esprit15d 17:50, 27 September 2006 (UTC)

But it might be "allot".--BillFlis 19:46, 27 September 2006 (UTC)

I suppose:

<Typo word="Alot" find="\b(A|a)lot\b" replace="$1 lot" />

<Typo word="Allot" find="\b(A|a)llot\b" replace="$1 lot" />

Reedy Boy 17:10, 16 October 2006 (UTC)

When I did it manually with AWB find and replace, the words allotment and ballots came up, causing a problem with the search on Allot.

Would running the rules as written ensure that only the whole word is matched? Or would they also match words that contain alot/allot?

Reedy Boy 17:11, 16 October 2006 (UTC)


Seems some people use allot instead of allocate...?

Reedy Boy 17:14, 16 October 2006 (UTC)

Reject. Allot comes from the sense of "assigning by lot" and therefore implies random allocation. Allotment has a specific political meaning of "to select by random selection" - aka "jury" selection and "sortition". Allocation does not have any sense of chance and e.g. to allocate a person to a jury rather than allot them would imply they were chosen rather than selected at random (which would dramatically change their nature). The two words are very different and in my view to replace "allot" with "a lot" was just vandalism. --Mike 16:10, 18 October 2006 (UTC)

I think what you intended was:

<Typo word="Alot" find="\b(A|a)lot\b" replace="$1 lot" />
<Typo word="Allot" find="\b(A|a)lot\b" replace="$1llot" />

Then you'd have to run AWB manually (isn't this always how it's run?), and decide which rule to accept: alot --> a lot or alot --> allot. Yes, allot means allocate, as "within the allotted time". This would be safe to add, I think:

<Typo word="Allot_" find="\b(A|a)lot(ted|ting|ments?|tees?)\b" replace="$1llot$2" />

where we add the low-line character (_) to signal that only certain endings are being treated.--BillFlis 17:22, 16 October 2006 (UTC)
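
The question of whether the word-boundary anchors keep these rules away from "allotment" and "ballots" can be checked directly. A minimal sketch, assuming Python's `re` matches AWB's engine on `\b`; the test sentence is invented.

```python
import re

text = "They alotted alot of ballots for the allotment."

# The endings-only rule suggested above ("$1llot$2" written as "\1llot\2").
allot_endings = re.compile(r"\b(A|a)lot(ted|ting|ments?|tees?)\b")
after_endings = allot_endings.sub(r"\1llot\2", text)

# The whole-word rule for "alot" -> "a lot".
alot_word = re.compile(r"\b(A|a)lot\b")
after_alot = alot_word.sub(r"\1 lot", text)

# \b on both sides means "allot" inside a longer word is never matched.
allot_word = re.compile(r"\b(A|a)llot\b")
inside_longer_word = allot_word.search("allotment ballots")

print(after_endings)
print(after_alot)
print(inside_longer_word)
```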

Reconsider

Rich Farmbrough, 19:33 3 October 2006 (GMT).

    • I'm a bit concerned that people—both those who use AWB, and those who see bad edits—forget that this system is semi-automated. In conjunction with the fact that the AWB user is reviewing his edits, I don't see why it is necessary to get rid of a spelling correction rule even if there are very rare exceptions to that rule. I managed not to "correct" Garry Tallent (in another article) once. I'm not pressing for the removal of the spelling error "tallent". –Outriggr § 00:27, 4 October 2006 (UTC)
      • Simply because the stated aim is to have no false positives. "The lofty goal of RETF is to be completely automatic." It is a courtesy to the creator to report problems here. Rich Farmbrough, 21:58 7 October 2006 (GMT).

Two questions

  1. Is "first-hand" really bad? dictionary.com
  2. Comunal->Communal breaks Estadio Comunal de Aixovall, do we care?

Rich Farmbrough, 21:58 7 October 2006 (GMT).

Also, "first hand" can occur together. "I won the first hand."--BillFlis 12:01, 8 October 2006 (UTC)
Actually, "first hand" occurs in Canasta.
Each player is dealt a hand of 11 and a second hand of 13, sometimes referred to as the "hand" and the "foot", respectively. The hand with the lowest bottom card is played first. Once a player plays all cards from his first hand he picks up the second and continues normal play.
It has caused a false positive. Punainen Nörtti 18:15, 25 October 2006 (UTC)

Countries

I've added entries to convert names of countries to Title Case. My process was:

  • copy list of countries from List of countries
  • process to remove text in () or []
  • process "See * for *" lines
  • change lines with "1, 2" into "2 1" (eg "Congo, Republic of")
  • manually inspect and make special changes (eg Taiwan)
  • add to AutoWikiBrowser/Typos and test
  • remove duplicates that had already been put onto the list
  • remove country names that are also words that can be in lowercase (chad, guinea, jersey)

I guess that many of the lines could be manually tweaked to give greater coverage of variants - but this is a start, anyway...

Hope this doesn't generate too many erroneous matches that I haven't thought of...

Euchiasmus 07:40, 8 October 2006 (UTC)

"wale(s)" and "coco(s)" have uncapitalized meanings in http://www.m-w.com. "chile" is a valid spelling of "chili" (capsicum). "india" (occasionally before "ink" and "rubber") isn't always capitalized.--BillFlis 11:54, 8 October 2006 (UTC)

Thanks, Bill - I've removed those. I also realised about turkey and took that out too. Euchiasmus 19:51, 9 October 2006 (UTC)

Because this is an issue of capitalisation rather than spelling, I suggest that these entries are placed in a separate section rather than being distributed into the A, B, C, sections. Gaius Cornelius 13:21, 6 November 2006 (UTC)

Predominately?

Suggested addition - replacing "predominately" (not a word) with "predominantly." | Mr. Darcy talk 20:22, 6 November 2006 (UTC)

Sorry, but "predominately" is indeed a word, meaning--guess what?--"predominantly". See here.--BillFlis 19:58, 10 November 2006 (UTC)

'Logical' punctuation in quotations

I'm changing punctuation at the end of quotations to 'logical' style, per Wikipedia:Manual of Style#Quotations by replacing <," > (comma-quote-space) with <", > (quote-comma-space) throughout (e.g. <"Yes," he said.> to <"Yes", he said.>). I haven't come across any false positives yet. A similar replacement might be possible for embedded full stops at the end of quotations, but that's more controversial and would produce too many false positives, I think, unless someone could suggest a clever method to exclude the case where an entire sentence, including its final punctuation, is being quoted. Colonies Chris 22:59, 6 November 2006 (UTC)

Orignal --> Original

There is a town in Ontario called L'Orignal, mentioned in a few articles, so the regex should exclude this if possible. Colonies Chris 08:23, 9 November 2006 (UTC)

Problem with "definitions"

When presented with the misspelling "defintions" it tries to replace it with "definitons" which is still not the correct spelling. I took a look at the RegEx and I am not quite sure how to fix this problem, so if somebody with more experience can fix it, that would be great. --Maelnuneb (Talk) 19:49, 10 November 2006 (UTC)

OK, fixed, thanks.--BillFlis 19:58, 10 November 2006 (UTC)

Firsthand

I am getting a ton of false-positives with this one. Card game pages are a real big source of false-positives. I am going to remove it from the list due to this. Code for the RegEx was: <Typo word="Firsthand" find="\b(F|f)irst[ -]hand\b" replace="$1irsthand" /> Possible fix: only match first-hand, but I'm not positive that version isn't an acceptable spelling. Any comment on that would be great. --Maelnuneb (Talk) 20:59, 13 November 2006 (UTC)

After looking up first-hand on [1], it suggested firsthand, so I will add checking for "first-hand" back into the system, but not "first hand" as the possibility of a false positive for "first-hand" is non-existent. If people believe that "first hand" should be included still, please debate here. --Maelnuneb (Talk) 21:05, 13 November 2006 (UTC)
And the OED and Webster Unabridged, both more reliable dictionaries, have "first hand" and "first-hand". This is certainly not a typo, and at the very least is an acceptable alternative spelling, if not the better spelling. —Centrxtalk • 21:29, 14 November 2006 (UTC)
Given that, I would agree to not have firsthand in the list of typos. I personally didn't write the rule in the first place, just tweaked it to get rid of false positives and then did a quick search to see if "first-hand" was a correct spelling, running on the assumption that the original contributor that added the rule for firsthand was in fact correct. Centrx, thank you very much for finding evidence of the other spellings and bringing them here. --Maelnuneb (Talk) 17:46, 15 November 2006 (UTC)

Also, this list really does need to be restricted to typos, not bad usage, because quotations and normal sentences will be filled with cases that should not be "corrected". Also, with compound words there are common sentences (such as actually referring to the first hand of something, as in a game of cards or something about physiology) that would never warrant changing. —Centrxtalk • 06:34, 16 November 2006 (UTC)

Typos would still show up in those cases unfortunately. That is the entire reason that the process of fixing typos is not automated. Your point about "first hand" was exactly why I changed the rule to match only "first-hand" actually. I was getting tired of fixing false positives, so I changed the rule to prevent it. --Maelnuneb (Talk) 18:00, 17 November 2006 (UTC)
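
The effect of narrowing the rule from "space or hyphen" to "hyphen only" can be seen on a sentence that contains both forms. This is only a sketch of that narrowing step, assuming Python's `re` as a stand-in for AWB's engine; it is not an argument that either rule belongs on the list, and the sentence is invented.

```python
import re

text = "He won the first hand and gave a first-hand account."

# Rule as originally listed: matches "first hand" and "first-hand".
listed = re.compile(r"\b(F|f)irst[ -]hand\b")
# Narrowed variant described above: hyphenated form only.
hyphen_only = re.compile(r"\b(F|f)irst-hand\b")

after_listed = listed.sub(r"\1irsthand", text)
after_hyphen_only = hyphen_only.sub(r"\1irsthand", text)

print(after_listed)
print(after_hyphen_only)
```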

referrences -> referencces

<Typo word="Reference" find="\b(R|r)efe(?:rr?a|rre)n(ce[ds]?|cing|ts?)\b" replace="$1eferenc$2" />
should likely be
<Typo word="Reference" find="\b(R|r)efe(?:rr?a|rre)n(ce[ds]?|cing|ts?)\b" replace="$1eferen$2" />
~ BigrTex 20:19, 15 November 2006 (UTC)

Thank you for your suggestion! When you feel an article needs improvement, please feel free to make those changes. Wikipedia is a wiki, so anyone can edit almost any article by simply following the Edit this page link at the top. You don't even need to log in (although there are many reasons why you might want to). The Wikipedia community encourages you to be bold in updating pages. Don't worry too much about making honest mistakes — they're likely to be found and corrected quickly. If you're not sure how editing works, check out how to edit a page, or use the sandbox to try out your editing skills. New contributors are always welcome. ~ BigrTex 20:00, 16 November 2006 (UTC)
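
The doubled "c" reported in this section comes from the replacement string, not the find expression: the second group already begins with "c" (or "t"), so a replacement ending in "c" adds one too many. A small sketch comparing the two replacements, assuming Python's `re` in place of AWB's engine; the sample misspellings are illustrative.

```python
import re

find = re.compile(r"\b(R|r)efe(?:rr?a|rre)n(ce[ds]?|cing|ts?)\b")

listed = r"\1eferenc\2"    # "$1eferenc$2": appends "c" before the group
proposed = r"\1eferen\2"   # "$1eferen$2": lets the group supply "c" or "t"

words = ["referrences", "referance", "referrant"]
with_listed = [find.sub(listed, w) for w in words]
with_proposed = [find.sub(proposed, w) for w in words]

print(with_listed)
print(with_proposed)
```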

Society, abundant

  • Societ -> Society
  • abundandt -> abundant
  • abundandtly -> abundantly

I stumbled across "Societ" today, and I have a tendency to add an unnecessary d to abundant as well, but I don't know how to add these to the filters myself. --Lethargy 00:14, 16 November 2006 (UTC)

I have just added <Typo word="Abundant" find="\b(A|a)bundand(t|tly)\b" replace="$1bundan$2" /> Tankred 00:38, 16 November 2006 (UTC)

<Typo word="Oft(en)times" find="\b(O|o)ft(|en)[- ]times\b" replace="$1ft$2times" />

Often Times to Oftentimes ???

It might be me, but that seems like a use that would be sparsely used?

Or is it just me?

Reedy Boy 15:32, 19 November 2006 (UTC)

New additions section

Can we be more explicit about whether new additions should be put at the beginning or at the end of the "New additions" section? People put them in both places, which makes the chronology of the section a bit hard to follow. The section is fairly large now, and it would perhaps be a good idea to check the oldest additions again and then move them to the main body. Tankred 16:55, 19 November 2006 (UTC)

Increase

Suggested addition: While fixing other typos I stumbled upon 'increse' (missing a).

<Typo word="Increase" find="\b(I|i)ncres(e|ed|ing|ingly)\b" replace="$1ncreas$2" />

Thanks. ChrisCork 06:51, 28 November 2006 (UTC)

Added, with the handling of "Decrease" as well.--BillFlis 12:52, 28 November 2006 (UTC)

Super Bowl

Superbowl -> Super Bowl. I see that one a lot, not just on the Wiki. I'm not sure how to add listings that split into two words, so I'm adding it here. --cholmes75 (chit chat) 20:56, 28 November 2006 (UTC)

Done!--BillFlis 21:02, 28 November 2006 (UTC)

Guerilla

<Typo word="Guerilla" find="\b(G|g)uer(?:r?i|ril?)l(as?)\b" replace="$1uerill$2" />

We are replacing Guerrilla with Guerilla, even though the article spells it the 'wrong' way. I have removed the line. ~ BigrTex 00:12, 1 December 2006 (UTC)
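
The behaviour described above follows from the pattern itself: the double-r spelling is among the forms it matches, and the replacement always writes a single r. A minimal sketch, assuming Python's `re` behaves like AWB's engine here.

```python
import re

# Removed rule, with "$1uerill$2" written as "\1uerill\2".
rule = re.compile(r"\b(G|g)uer(?:r?i|ril?)l(as?)\b")

def fix(word):
    return rule.sub(r"\1uerill\2", word)

results = {w: fix(w) for w in ["guerrilla", "Guerrillas", "guerila", "guerilla"]}
print(results)
```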

Problem with kW, kJ, Hz

I'm getting problems with kW, kJ, Hz because AWB now changes (eg on the Bible page)

[[kw:Bibel]] to [[kW:Bibel]]
[[kj:Ombibeli]] to [[kJ:Ombibeli]]
[[hz:Ombeibela]] to [[Hz:Ombeibela]]

They then get moved out of sequence. I suggest the regex be amended to exclude situations where the word is preceded by square brackets and followed by a colon.

Sorry haven't got time to do it at present - I'm rushing off to work!

Cheers - Euchiasmus 07:08, 1 December 2006 (UTC)
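
One way to express the suggested exclusion is with a lookbehind and a lookahead. The thread does not quote the listed rule, so `naive` below is a hypothetical stand-in for it, and `guarded` is only a sketch of the idea in Python's `re` syntax, not a tested AWB entry. It skips a match only when "[[" comes directly before and ":" directly after.

```python
import re

text = "[[kw:Bibel]] rated at 5 kw"

# Hypothetical stand-in for the listed rule (not quoted in the thread).
naive = re.compile(r"\bkw\b")

# Match unless BOTH conditions hold: preceded by "[[" and followed by ":".
# First alternative: not preceded by "[[". Second: not followed by ":".
guarded = re.compile(r"(?<!\[\[)\bkw\b|\bkw\b(?!:)")

after_naive = naive.sub("kW", text)
after_guarded = guarded.sub("kW", text)

print(after_naive)
print(after_guarded)
```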

Rule Problems

  • The rule as written changes governement to governmen. -- Saaber 04:07, 4 December 2006 (UTC)
  • The rule as written changes quanity to quantituanit. -- Saaber 11:02, 4 December 2006 (UTC)
  • The rule as written changes 'dominican' to 'Dominica' -- ChrisCork 15:48, 15 December 2006 (UTC)

Miniscule

... is cool, listed as a variant of "minuscule" here and here.--BillFlis 12:50, 9 December 2006 (UTC)

The misspelling has become so widespread that some authorities are listing it as an alternative. However, there is still a clear majority in favour of the correct spelling. I vote we go with the majority and stick to minuscule. Euchiasmus 16:07, 9 December 2006 (UTC)
Dictionary.com shows "miniscule" in three different sources here, which makes a total of at least four, since M-W isn't one of them. Given the policy against changing from one spelling of the same word to another, I don't think we should be automatically changing this. —Krellis 17:31, 11 December 2006 (UTC)
Whatever you do, don't change the occurrences of "miniscule" in the minuscule article. This article does indeed say that "miniscule" has been "traditionally regarded as a spelling mistake," although no reference is offered for this contention. Some discussion with references may be found here.--BillFlis 19:03, 11 December 2006 (UTC)

Changing ordinals to cardinals in dates

Please can we remove the ordinal to cardinal conversion in dates? Maybe the Americans don't habitually use dates like "1st May", but we British do use them and I can't see anything wrong with them. When I read "1 May" it looks very strange, especially in narrative prose.

Here in the UK the use of st|nd|rd|th is very common in dates. For example, glancing through filed correspondence I find that the majority of my documents (insurance policies, bank statements, nominet registration, etc) use ordinal numbers in dates. With other regional variations WP allows alternative forms - why not in dates?

Euchiasmus 14:18, 10 December 2006 (UTC)

I personally have mixed feelings about adding things to the typo list that aren't typos or misspellings, but the intention here was clearly to go with the Manual of Style guideline on ordinal suffixes in dates (relevant section here). So you'd really probably be better off bringing it up there. Hope this helps. --Galaxiaad 19:02, 10 December 2006 (UTC)
  • Here are a couple of points:
    • because WP:DATE is a guideline, consensus was reached about the date format to be used. While a guideline is not a rule, we should be striving towards the suggestions given unless there is a strong push for a change, which would mean that there is no longer consensus. Therefore, while consensus still exists, there is no reason to remove the rules removing ordinals from dates.
    • A note to users of WP:AWB/T: be careful not to remove ordinals in direct quotes. --Maelnuneb (Talk) 17:44, 12 December 2006 (UTC)

Error in proclaim rule?

The current rule for proclaim:

word="Proclaim" find="\b(P|p)roclam(e[dsr]?|ing)\b" replace="$1roclaim$2"

changes proclame to proclaime. Was this intended? Euchiasmus 11:36, 17 December 2006 (UTC)

I think not. The "?" shouldn't be there.--BillFlis 13:36, 17 December 2006 (UTC)
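
The effect of the "?" can be checked by running the rule with and without it. A minimal sketch, assuming Python's `re` in place of AWB's engine; `without_q` is simply the listed pattern with the "?" removed, as suggested above.

```python
import re

listed = re.compile(r"\b(P|p)roclam(e[dsr]?|ing)\b")
without_q = re.compile(r"\b(P|p)roclam(e[dsr]|ing)\b")
repl = r"\1roclaim\2"  # "$1roclaim$2" in AWB notation

words = ["proclame", "proclamed", "proclaming"]
with_listed = [listed.sub(repl, w) for w in words]
with_fix = [without_q.sub(repl, w) for w in words]

print(with_listed)
print(with_fix)
```

Without the "?", the bare form "proclame" no longer matches, so it is left alone rather than turned into "proclaime".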

'Receive' typo

I see there've been some recent changes to the way 'receive' is corrected, but unfortunately it's now broken. I'm not too hot on regexp, so could someone take a look for me please? ^_^ ShakingSpirittalk 07:18, 19 December 2006 (UTC)

New words

I'm looking at Wikipedia:Lists of common misspellings and am trying to fix some of them using AWB. To that end, I'd like someone more skilled with regexes than me to add:

  • Sacrifice
  • Satellite
  • Sandwich
  • Sergeant

Come to think about it, someone with enough time on their hands could just go ahead and look through everything in Wikipedia:Lists of common misspellings. Obviously, I was looking at S, but there's probably a lot missing elsewhere too. Thank you! Jobjörn (Talk ° contribs) 02:00, 25 December 2006 (UTC)

Jobjörn: I am currently working through all the 'S' typos myself. I am about halfway through a dump of the 30-Nov-2006 database. It might make more sense for us not to duplicate this effort - would you mind working on another letter? There are plenty to go round. If you wish I can help you with a whole bunch of regexes. Personally, I like to work on a set of regexes to make sure that there are not too many errors or false positives before submitting them to the Wikipedia:AutoWikiBrowser/Typos list. Still, I have added sacrifice, sandwich and satellite for you - but not sergent because it generates false positives against a common surname. You might like to try this regex for lowercase only:
"sargant(s?)" --> "sergeant$1"
Let me know what you think - but it is Christmas and I will be away for a few days! Gaius Cornelius 13:25, 25 December 2006 (UTC)
No, definitely. I'll grab some other letter. Jobjörn (Talk ° contribs) 17:06, 25 December 2006 (UTC)

Targetting/targeting

I don't have the right dictionaries handy to confirm, but AFAIK 'targetting' and 'targetted' are accepted spellings in UK English (and possibly Australian English as well). Could somebody with access to the OED and/or Macquarie please check this and remove them from the list if this is so? --Calair 05:13, 30 December 2006 (UTC)

Typicaly & Essentialy

If someone could add 'typicaly' (typically) & 'essentialy' (essentially) to the regex list that would be great, there seem to be a lot of these errors at the moment.--Hooperbloob 07:31, 4 January 2007 (UTC)

Done. Gaius Cornelius 21:11, 4 January 2007 (UTC)

Manoeuver

I just merged (Out)Manoeuver into Maneuver as (Out)Maneuver. This is the line I deleted:

<Typo word="(Out)Manoeuver" find="\b([Oo]utm|M|m)an(?:[oeu]{1,2})ver(s?|ing|e[dr]|abl[ey]|ability)\b" replace="$1anoeuver$2" />

If someone could double-check my merge, I'd appreciate it. ~ BigrTex 21:23, 5 January 2007 (UTC)

AFAIK, the British spelling is 'manoeuvre' (first typed here as 'manouevre', since corrected), so it's probably not a good idea to auto-correct a spelling halfway between the two to the US option without checking context. --Calair 23:19, 5 January 2007 (UTC)
My big American dictionary here has "manoeuvre" and "manoeuver" (but not "manouevre"--are you sure that's right?) as variants of "maneuver", without any indication that they are only British spellings. However, this dictionary says "manoeuvre" is "Chiefly British"; no listing for "manoeuver".--BillFlis 00:05, 6 January 2007 (UTC)
Oops, typo fixed, thanks :-)
I don't have good references handy, but as per American_and_British_English_spelling_differences#-re_.2F_-er the usual UK spelling is 'manoeuvre' and the US spelling is 'maneuver'. (This comes from a combination of US/UK differences on whether to end words with '-re' or '-er', combined with different rules on rendering the ligature 'œ' in a modern alphabet - UK spellings tend to split it into two letters, US spellings go with a single phonetic 'e'.)
'Manoeuver' is halfway between the two; it probably should be corrected where it appears, but I'd recommend checking context (i.e. the subject matter of the article, and failing that the style of the rest of it) to judge which way the correction should go. --Calair 01:31, 6 January 2007 (UTC)

Prepubescent or pre-pubescent

I'm not sure which is the correct format but both exist in quantity here.--Hooperbloob 03:02, 6 January 2007 (UTC)

Comital

While it's a common misspelling for "committal", "comital" is also a legitimate word meaning "pertaining to the count". I don't know enough about regexps to fix this, but perhaps something should be done; I've seen this change made twice in the past month or so. Choess 15:56, 12 January 2007 (UTC)

Sponser

Over 300 of these last time I checked. Should be 'sponsor', 'co-sponsor', 'sponsored', 'sponsoring', etc. --Hooperbloob 08:01, 14 January 2007 (UTC)

Only just under 100 in mainspace, according to wikisearch, I'll take a stab at them and report back. —Krellis 23:13, 15 January 2007 (UTC)
All of these in mainspace and Images: should now be taken care of. —Krellis 00:05, 16 January 2007 (UTC)

Trailor

trailor -> trailer --Hooperbloob 08:25, 14 January 2007 (UTC)

"_Strange" Pattern

I just removed the following pattern:

<Typo word="_Strange" find="(?<!\b([A-Z][a-z]*))(\s[Ss])tange\b" replace="$1trange" />

For two reasons:

  1. "Stange" is a last name that I've run across a number of times, particularly in Major League Baseball articles.
  2. The pattern is broken, replacing "Stange" with "trange" - the negative lookbehind assertion appears to be capturing, so the $1 would need to be $2.

Replacing "stange" to "strange" is probably fine, as long as we don't replace the capitalized version. I don't quite understand why this pattern has the lookbehind stuff, rather than just using word boundaries like other patterns, so I don't feel comfortable replacing it - if the original author (or anyone else) wants to do so, please go ahead, as long as you preserve "Stange" and make sure it replaces the right captured string. —Krellis 20:53, 15 January 2007 (UTC)

I originally added this fix. The purpose of the lookbehind was to eliminate instances of Stange preceded by a word that begins with a capital letter - which may be a first name. I found this pretty effective at reducing false positives. Gaius Cornelius 21:09, 15 January 2007 (UTC)
Aha, okay, that makes so much more sense now. My brain just wasn't in a regex parsing mood earlier, I guess. Unfortunately, I've come across at least four or five false positives in the past few days - many articles use just the last name to identify individuals once they have been introduced. At least some of the FPs I've seen have been at the beginning of a sentence or line, so matching that in a lookbehind could theoretically help avoid some more, though of course it would probably prevent legitimate errors from being found as well. Given the advice of "don't add if there is one (false positive)" at the top of the list, I would suggest "Stange" be considered a lost cause, and just the lower case version be re-added. —Krellis 23:01, 15 January 2007 (UTC)

((In)De/In/Af)Finite misbehaves!

The list of typos includes the almost impossibly complicated:

<Typo word="((In)De/In/Af)Finite" find="\b([Ii]n|)(F|f|[Dd]ef|[Aa]ff)(?:finite?|f?in[ae]te?|f?init)(s?|ly|ness|y)\b" replace="$1$2init$3" />

It changes infinetly to infinitly - (for example, try it with the Home Construction article).

If I could work out what it was doing right and what it was doing wrong, I would correct it! I think my example is not the only thing it does wrong. Somebody please help! Thanks. - Euchiasmus 20:17, 25 January 2007 (UTC)
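
Running the listed rule on a few words shows where it goes wrong. Whatever the middle of the word looked like, the replacement always writes "init" followed by the captured suffix, so any "-ly" form comes out as "...initly". This sketch assumes Python's `re` matches AWB's engine for this pattern; the word list is illustrative.

```python
import re

# Listed rule, split across two string literals for readability only.
rule = re.compile(
    r"\b([Ii]n|)(F|f|[Dd]ef|[Aa]ff)"
    r"(?:finite?|f?in[ae]te?|f?init)(s?|ly|ness|y)\b"
)

def fix(word):
    # "$1$2init$3" in AWB notation.
    return rule.sub(r"\1\2init\3", word)

results = {w: fix(w) for w in ["infinetly", "definately", "infinite"]}
print(results)
```

A repair would need the replacement to supply the final "e" for the suffixes that require it, which the single replacement string above cannot do.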

Light Year

I ended up having a problem with the light year regex, so I removed it. Here is the original code: <Typo word="_Light year" find="(?<!\b(Buzz ))(L|l)ig?h?tyea(rs?)\b" replace="$1ight yea$2" /> This is what was happening when it ran for me: AWB found "lightyears" and wanted to replace it with "ight yeal". Obviously a problem with the substitution. I tried changing the $1→$2 and the $2→$3, but that did not end up working for me, which does not make any sense to me. If somebody with more experience can attempt to fix this one, that would be great. --Maelnuneb (Talk) 20:26, 26 January 2007 (UTC)

My fix was actually correct. I just had a cache problem getting in the way of having an updated set of typo rules. Problem solved. --Maelnuneb (Talk) 20:31, 26 January 2007 (UTC)
This dictionary says that it's "light-year", with a hyphen.--BillFlis 13:22, 27 January 2007 (UTC)
My home dictionary gives it as two words whereas the wikipedia article says it is either one word or hyphenated. I guess that typo fix had better come out. Gaius Cornelius 19:04, 27 January 2007 (UTC)
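
The "ight yeal" result reported above is a group-numbering effect: the parentheses inside the lookbehind count as group 1, which shifts the other two groups to 2 and 3. The sketch below reproduces it with Python's `re` (Python 3.5 or later is assumed, where an unmatched group substitutes as an empty string; AWB's engine is assumed to behave the same way here).

```python
import re

rule = re.compile(r"(?<!\b(Buzz ))(L|l)ig?h?tyea(rs?)\b")

# Replacement as listed ("$1ight yea$2"): group 1 is the unmatched
# lookbehind capture, group 2 is the initial letter.
as_listed = rule.sub(r"\1ight yea\2", "lightyears")

# Renumbered replacement ("$2ight yea$3").
renumbered = rule.sub(r"\2ight yea\3", "lightyears")

# The lookbehind still protects the name it was written for.
protected = rule.sub(r"\2ight yea\3", "Buzz Lightyear")

print(rule.groups)
print(as_listed)
print(renumbered)
print(protected)
```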

Peleton

peleton -> peloton

Thanks, Mk3severo 00:55, 2 February 2007 (UTC)

ususally --> usually

Please add this typo to the list. Harryboyles 05:03, 2 February 2007 (UTC)

Added. Wow, a quick search turned up 380 instances of this weird misspelling!--BillFlis 13:02, 2 February 2007 (UTC)

Simalar -> similar

not really sure how to add that... -ΖαππερΝαππερ BabelAlexandria 14:18, 13 February 2007 (UTC)

 <Typo word="Similar" find="\b(S|s)imalar\b" replace="$1imilar" /> 
Reedy Boy 14:43, 13 February 2007 (UTC)
Just looked; there is already:
 <Typo word="(Dis)Similar" find="\b(S|s|[Dd]iss)im(?:mi|u)lar(|ly|ity)\b" replace="$1imilar$2" /> 

So, possibly incorporate with that?

 <Typo word="(Dis)Similar" find="\b(S|s|[Dd]iss)im(?:mi|u|a)lar(|ly|ity)\b" replace="$1imilar$2" /> 

I think. Addition of |a to the middle of the word Reedy Boy 14:46, 13 February 2007 (UTC)

Moniter -> Monitor

Need to handle moniter, monitering, monitered, etc.--Hooperbloob 23:48, 28 February 2007 (UTC)