Talk:CAPTCHA
From Wikipedia, the free encyclopedia
CAPTCHA Compromised?
I don't know if this is something that should be noted, but the following is from an article posted on Slashdot today:
Hell Yeah! reminds us of a 2-week-old development that somehow escaped notice here. A team of Russian hackers has found a way to decipher a Yahoo CAPTCHA, thought to be one of the most difficult, with 35% accuracy. The Russian group's notice, posted by one "John Wane," is dated January 16. This site hosts a rapidshare link to what looks to be demonstration software for Windows, and quotes the Russian researchers: "It's not necessary to achieve high degree of accuracy when designing automated recognition software. The accuracy of 15% is enough when attacker is able to run 100,000 tries per day, taking into the consideration the price of not automated recognition — one cent per one CAPTCHA."
Source: http://it.slashdot.org/article.pl?sid=08/01/30/0037254 206.180.38.20 (talk) 17:21, 30 January 2008 (UTC)
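For anyone who wants to check the arithmetic in the quoted claim, here is a rough sketch in Python. The figures (15% accuracy, 100,000 tries per day, one cent per human-solved CAPTCHA) are taken from the quote; the function name and parameters are made up for illustration and do not come from the Russian group's software.

```python
def daily_value_of_solver(accuracy_percent, tries_per_day, cents_per_human_solve):
    """Return (CAPTCHAs solved per day, equivalent cost of human solving in dollars).

    Integer percentages are used so the arithmetic stays exact.
    """
    solved = tries_per_day * accuracy_percent // 100
    dollars = solved * cents_per_human_solve / 100
    return solved, dollars


# Figures from the quote: 15% accuracy, 100,000 tries/day, 1 cent per human solve.
solved, dollars = daily_value_of_solver(15, 100_000, 1)
print(f"{solved} solved per day, comparable to ${dollars:.2f} of paid human solving")
```

With the quoted numbers this comes to 15,000 solved CAPTCHAs a day, roughly what $150 a day of paid human solving would buy, which is presumably why the researchers say high accuracy is unnecessary.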
Scholarly work on more accessible CAPTCHAs, especially CAPTCHAs in the text domain
I think the section under "Accessibility->Attempts at more accessible CAPTCHAs" is incomplete, mainly because it completely ignores the research work that has been done on the subject. I took the liberty of writing a small paragraph about it, but haven't edited anything yet out of respect for the encyclopedia and for the other, more active contributors. I'll leave it here so you guys can decide.
- There have been MANY papers on CAPTCHAs, this article would become unmanageable if they were all listed. Given that no serious implementations of these concepts exist, it does not seem notable enough to include in the article. 128.2.101.39 (talk) 03:48, 17 November 2007 (UTC)
HERE IS MY SUGGESTION:
Additionally, research work has been done towards finding a viable CAPTCHA alternative for the text domain. Richard Bergmair and Stefan Katzenbeisser have proposed using word sense disambiguation as a form of CAPTCHA [1]. Also, Pablo Ximenes and colleagues have proposed using phonetic pun riddles (as in the case of knock-knock jokes) as a way to distinguish men and machines [2,3]. Although these works indicate possible paths for the design of text-only CAPTCHAs, their ideas are still too preliminary to offer any practical viability.
References:
[1] Richard Bergmair and Stefan Katzenbeisser, "Towards Human Interactive Proofs in the Text-Domain: Using the Problem of Sense-Ambiguity for Security". In: Information Security. Lecture Notes in Computer Science v. 3225/2004, pp. 257-267. Berlin: Springer-Verlag, 2005.
[2] Pablo Ximenes, A. dos Santos, M. Fernandez, and J. Celestino Jr., "A Proposal of Human Interactive Proof in the Text Domain". In: Proceedings of the 5th Brazilian Symposium on Information Security - SBSeg 2005, Florianópolis, Brazil, 2005.
[3] Pablo Ximenes, A. Santos, M. Fernandez, and J. Celestino Jr., "A CAPTCHA in the Text Domain". In: On the Move to Meaningful Internet Systems 2006: OTM 2006 Workshops, Montpellier, France, Proceedings. Lecture Notes in Computer Science v. 4277/2006, pp. 605-615. Berlin: Springer-Verlag, 2006.
-l
Who keeps deleting the captcha.biz external link ?
Whoever you are - you could at least offer a reasonable reason why this is being done and please do so publicly so that we / I know who you think you are / or really are:
Criteria for a reasonable reason for deleting the above mentioned external link might be : -
1 - you have at least 10 websites where you have implemented captcha BY YOURSELF.
Meaning you didn't just delete an external link about a subject you know much about in theory, but know zilch about in practice.
- It is irrelevant whether the editor/contributor of an article has implemented 10 websites, or none, whether he or she did so alone or as part of a group. The only issue of relevance is whether the external link was both relevant to the article and of sufficient notability or quality as to be widely considered appropriate to Wikipedia as an encyclopedia.
- Wikipedia is not a link farm.
2 - you are not a programmer - i.e. have no php, and almost no html knowledge. If you are none of these then captcha.biz is for you if you need a captcha solution. If you are some or all of these then captcha.biz is not for you - but assuming that others won't find it useful is presumptious in the least.
- It's irrelevant if the editor/contributor is a programmer. It is hardly "presumptious"[sic] nor would it be presumptuous for members of our community to decide what links are notable, authoritative, and appropriate for inclusion into our encyclopedia.
- Conversely it is presumptuous for anyone to post links to their own external sites into articles in Wikipedia ... and extremely presumptuous for them to dispute removal by other maintainers of the article.
3 - you have done your little own bit on the web to fight spam [ where, when and how ? ]
- It's not Wikipedia's mission to "fight spam" nor is it our mission to advertise your site. The only question at hand is whether the external link in question was relevant and important to the article. It appears that the sole proponent of including this link is the understandably biased owner of its target.
- Wikipedia is not a political or activist platform. It's not a soapbox.
4 - you have read through captcha.biz and can honestly say that the content there is no good to 99% of the average webmaster(s) who has/have no degree in IT, and all he / she wants is an easy implementation of captcha on their website which can be added in 15 minutes.
- Whether the site is useful is irrelevant. It may be useful to many people. However, to be appropriate for inclusion in a Wikipedia article it must be notable, important, and relevant to that article. There may be a multitude of "useful" sites which are, nonetheless, not appropriate for inclusion in our articles.
- Wikipedia is not a link farm. It's not a search engine.
5 - you are aware that besides the academic arguments about the pros and cons of captcha - there is also a practical element where many web sites are inundated by form spam and need not an academic, highbrow solution full of gobbledegook which the average person does not understand - but something that works. This is what www.captcha.biz is about. And having implemented CAPTCHA on 40+ of my websites which reduced form spam from 300 form spams a day to 20 a day - I know what I am talking about. Do you ?
- The article covers academic and practical aspects of the topic. The topic of the article is CAPTCHA, not www.captcha.biz nor any other website in particular. Your personal experience using your software is as irrelevant as your alleged knowledge about what you are "talking about." You are talking about the inclusion of a link to your own website. We are talking about Wikipedia's content and quality standards.
99 - Appreciated would be an honest opinion from anyone / everyone who have taken it on themselves to judge what is low or high quality wikipedia article (external link) content disregarding completely the fact that this so called high quality content - such as complex explanations of complex issues - are often useless to the layman who would appreciate simple explanations. And going by all the other external links in the wikipedia CAPTCHA section, 99% of them do not cater to the average web master. Meaning : this section on wikipedia about captcha has gone all metaphysical and completely out of hand skywards with no regards to simplifying an already complex subject for most.
- Wikipedia is not a webmaster's HOWTO guide. It's an encyclopedia. Webmasters and "laymen" who are looking for "simple explanations" of how to use (implement) CAPTCHA systems on their sites are welcome to avail themselves of many other resources, such as Google.
- Wikipedia is also not a polling place. Perhaps you'd appreciate a plethora of opinions on this discussion page. Most of us would not. The honest opinion of the article maintainers, to date, seems to have been that this external link is not sufficiently important to this article to be included.
101 - Last point - I have created a wikipedia user account today so that I am no longer anonymous - does that mean I get a contact to the god editor who keeps deleting my external link ? ... and have probably broken a thousand wikipedia laws by writing here. Sorry, I am new - but also very peeved off about the way as I see it some sincere and useful content or links are just deleted by the wikipedia gods. Before I took this step to outline my protest - I did a search on Google about wikipedia deletions and similar ---- the results are not very complimentary to the wikipedia editors and to the so called wikipedia editing democracy that appears to be in the hands of the selected few.
So - who was the selected few that decide that an external link is of low quality ? And do I get a chance for an appeal?
Pete - wiki user = captchap --Captchap 14:01, 28 March 2007 (UTC)
- The only wikipedia "laws" you have broken are by allowing your biases to motivate your dispute with established editors and maintainers of some of the content here. If the link to your site was held to be of sufficient quality and importance by any significant number of the others here, it would be included.
- While it's not a "law" of participation in the Wikipedia community it's a generally held convention that one should refrain from the appearance of bias and avoid anything that would be likely to be considered "self-serving."
—The preceding unsigned comment was added by Captchap (talk • contribs) 00:12, 28 March 2007 (UTC).
- You know, after the whole "Essjay" hoo-ha I thought people would realize that argument by authority ("I know what I am talking about. Do you ?") just doesn't wash. Bi 08:43, 28 March 2007 (UTC)
My apology for not signing - which has now been done belatedly.
Concerning my 'argument of authority' ("I know what I am talking about. Do you ?") if not taken out of the above context translates into two parts:
1. I know what I am talking about - when we are talking about showing non expert webmasters how to add captcha to their websites - having had to solve the above mentioned form spamming problem for 40 of my websites by adding Captcha myself, with no php knowledge and very little html know-how. [ This is a simple provable fact - not an argument of / for authority. And this section on CAPTCHA is about all aspects of captcha and not only the mathematics, history and how captcha can be solved ... or not ? ]
- We are not talking about "showing non expert[sic] webmasters how to add captcha[sic] to their websites." This article is not a HOWTO. It's an encyclopedia article.
2. 'Do You ?' (know what you are talking about) is a question for the editor who took it on to himself to delete an external link to easy to understand explanations, examples and ready to use downloads of Captcha - and is a fair question to be asking.
- It cannot be considered to be a "fair question" for at least two reasons. First, it's clearly ad hominem. More importantly, we have to consider the source. Every editor of Wikipedia "takes it upon himself (or herself)" to judge the appropriateness, importance and relevance of each change he or she makes. Whichever editor(s) removed the link did so based on that judgement. There has been no evidence of any malicious or unethical intent motivating this removal.
- Since you can reasonably and understandably be considered biased about the topic (whether your site is linked from this article), it's reasonable for the rest of the community to ask that you abstain from the decision.
If someone can delete an external link which offers useful practical information on a subject that tends to be technical for most - without offering the other party any chance of at least discussing the issue and putting his opinion up for consideration then that takes the 'argument' out of the equation and we are left with pure authority.
Having also attempted to contact the editor I think deleted the external link(s) without any reply - can someone please explain the following:
Who decides which external links are deleted ? On what basis and under which criteria ? Can these be disagreed with ? How can I know who removed an external link and ask him/her directly, without having to have a discussion about it here?
thank you - Pete --Captchap 14:01, 28 March 2007 (UTC)
- We each of us decide what links are added, deleted and maintained.
- The two primary criteria are relevance and importance (notability).
- Most disputes are handled in the discussions pages by maintainers and contributors to any particular article. However, feel free to read Wikipedia:About#Handling_disputes_and_abuse for further details.
- You can see who removed a link by viewing the article's history. However, this is the appropriate place to hold any discussion about the contents of this article.
The captchas on captcha.biz are trivially OCR'd (there is no distortion, the letters are in a constant position, the lines are also in a constant position and easily removed). While they probably seem to prevent you from getting spam, if lots of people used the captcha (say, because a large website linked to it), then a spammer would have incentive to write the OCR code.
- By itself this would not be relevant to the inclusion of the link. If the site was notable, perhaps even of historical interest, then the effectiveness of the implementation or methods described there might not be the reason to include it.
Further, captcha.biz is very hard to navigate and looks spammy. The meta tags and extensive use of Google ads suggest that the site exists primarily to generate revenue rather than to inform. 128.2.101.104 05:27, 30 March 2007 (UTC)
- This is relevant. Sites which "seem spammy" are generally excluded from Wikipedia links. Another way of saying this is that they are not sufficiently important, notable, and appropriate to the topic. Naturally this is the least objective criterion when settling disputes.
- All the foregoing comments added by User:JimD. Conclusion: the link to www.captcha.biz doesn't appear to be of sufficient importance to include in the article.
- I'd also propose that this discussion page be trimmed down to a reasonable size! JimD 20:24, 25 May 2007 (UTC)
---
I too am getting contributions and my link deleted for identiPIC.com by the same zealous editor. His game is quite clear and will not be allowed to continue. Why keep on the non-novel and easily guessed kittenauth, yet delete everything else?? The answer should be clear to most. I will be taking this to the highest authority to stop his anti-community behaviour and his obvious agenda.
Kittenauth has 35,000 results on Google. identiPIC has 19. You created identiPIC on Apr 21, 2007 -- about three weeks ago. identiPIC simply documents the fact that "one can put up a captcha that uses images and offers N choices per image". That's it. One sentence. 128.237.238.114 15:20, 4 May 2007 (UTC)
Aural v. Oral!
aural=oral? Or am I just too american? Ilyanep 14:39, 18 Jun 2004 (UTC)
- "aural" means of the ear; "oral" means of the mouth. Aural captchas are captchas you listen to, as opposed to visual ones. Marnanel 14:57, 18 Jun 2004 (UTC)
- Yeah, but it's given to you orally, that's the confusion factor, thanks for the help Ilyanep 17:41, 18 Jun 2004 (UTC)
- Aural captchas need not have been anywhere near anyone's mouth, though... Marnanel 18:12, 18 Jun 2004 (UTC)
- It's not "given to you orally". "Orally" would be either by sticking it in your mouth or by telling you *in person*. 80.128.126.123 (talk) 16:56, 16 May 2008 (UTC)
- With digital voices reading the letters, the aural captcha is no more generated orally than the visual captcha is hand-drawn... --- Arancaytar - avá artanhé (reply) 13:12, 17 May 2008 (UTC)
It's not just visual impairments that can make captchas unusable-- entering a long string of random characters is hard enough for someone with dyslexia, even without the added distortion that most of these scripts use. I'm notorious for transposing digits when copying down numbers...
Also, the choice of font can be crucial. What if a 1 looks too much like a 7? And what about one versus ell, zero versus oh, and so on? --Codeman38 16:18, 11 Oct 2004 (UTC)
The "free porn" weakness?
As much as we all hate spammers, you've got to give 'em credit for using free porn to break captchas. I just can't get over how brilliant that is. Still, that free porn can't be accessed by the blind, dyslexic, or elderly. Poor people. The bellman 13:01, 2005 Apr 23 (UTC)
Has anyone demonstrated this "free porn" scheme actually being used? Cory Doctorow's proposal was theoretical, not evidence of an actual implementation.
- One crucial flaw with this method of defeating captchas is that the "free porn" site presenting the borrowed captcha does not even know the correct answer/verification code itself, so it would invariably allow access regardless of the answer/code given. 150.101.115.231 22:51, 27 October 2005 (UTC)
- If the porn site gave the wrong answer to the email site, wouldn't the email site have some response to indicate that it was wrong? I don't think that the porn site would necessarily allow access regardless of the answer/code given.
Furthermore, most current captchas can be broken without much effort by using OCR or trivial image comparison techniques, so there's little point.
- With this in mind, as well as the need for audio support for blind users, the Implementations list of captcha generators perhaps should be split into two? One listing "Visual Only" implementations, and one listing those that include audio implementation. Currently it relies on each entry haphazardly mentioning audio support. If not two lists, then perhaps a table with a column for audio support to have yes/no inserted for each implementation? 150.101.115.231 22:51, 27 October 2005 (UTC)
And the "free porn" approach is easy to circumvent; place a short timeout on the captcha before it becomes invalid and the user has to try a different one (like, say, 10 seconds).
I'm not saying captchas can't be cheaply circumvented; if you want to do it badly enough, hire sweatshop workers at $3/hour. Free porn ain't the way.
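The timeout idea can be sketched roughly as below. This is only an illustration under stated assumptions: the class name, the in-memory dictionary, and the injectable clock are all hypothetical, and the 10-second figure is just the one suggested above, not a recommendation from any real CAPTCHA library.

```python
import secrets
import time

CHALLENGE_TTL_SECONDS = 10  # the "10 seconds" figure suggested above


class ChallengeStore:
    """Hypothetical in-memory store of issued CAPTCHA challenges with expiry."""

    def __init__(self, ttl=CHALLENGE_TTL_SECONDS, clock=time.monotonic):
        self._ttl = ttl
        self._clock = clock
        self._pending = {}  # token -> (expected answer, time issued)

    def issue(self, answer):
        """Record a new challenge and return an unguessable token for it."""
        token = secrets.token_hex(16)
        self._pending[token] = (answer, self._clock())
        return token

    def verify(self, token, response):
        """Accept a response only once, and only within the time limit."""
        entry = self._pending.pop(token, None)  # pop makes each token single-use
        if entry is None:
            return False
        answer, issued_at = entry
        if self._clock() - issued_at > self._ttl:
            return False  # too slow: the user must request a fresh challenge
        return response.strip().lower() == answer.lower()
```

A short limit makes relaying harder because the relayed human has to answer inside the window, but the same limit also penalizes slow legitimate users, so the value is a trade-off rather than a fix.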
- I don't know if the "free porn" method is currently used, but it's definitely technically possible. Say a spammer wants hotmail accounts. He/she/it sets up this "free porn" site with a captcha relay. When a visitor wants porn, the spammer's site visits hotmail, grabs a hotmail captcha (maintaining the proper cookies), and relays it to the visitor. The human solves the captcha and gives the spammer's site the answer. The spammer's site relays the answer to hotmail along with any relevant account registration info, and hotmail confirms or denies their answer (if it's confirmed, presto - they've got a new account). This confirmation or denial is then relayed back to the human, who reacts naturally. Alternatively, the spammer could present a fake denial several times, or even indefinitely, and get 3 or 4 captchas solved with each visitor until they realize that they'll never, in fact, be getting any porn. My point is, whether it works well or not, it takes practically no work to operate once it's set up. I think this is a legitimate concern worthy of presentation as a weakness on the article page. Courtarro 19:26, 17 March 2006 (UTC)
- I love the idea of "free porn" being the secret to CAPTCHAs. Please keep :) Mathiastck 16:30, 28 August 2006 (UTC)
- Wow, this is such a cool topic that I've just created an account to stop being an anonymous reader of wikipedia and instead start to be a content provider :P —The preceding unsigned comment was added by 74.56.161.7 (talk) 06:24, 10 May 2007 (UTC).
people without sight how can you register online????
people without sight how can you register online????
- Usually you simply cannot. But some websites provide an audio Captcha, or a way to interact with a human operator in order to prove that you are human. Sam Hocevar 06:56, 26 Apr 2005 (UTC)
I added the origin section a few days ago.. much of it comes from research, but parts are from andrei broder's talk at a workshop that I attended (similar info from another attendee) Matt Casey 21:32, 17 August 2005 (UTC)
What is the source of the example CAPTCHA image? The distortion is so extreme that I'd have trouble reading it in a real situation, and the gradient is obviously differentiable from the letters, making it useless. I think a better example could be found. MrVacBob 03:26, 17 December 2005 (UTC)
I have no trouble reading the image, and I feel it is typical of good captcha. That MrVacBob (and probably many others, too) has trouble reading it only underscores the unfairness of captcha. David 15:53, 17 February 2006 (UTC)
Invention credit
How come, of the people involved in the CAPTCHA project at CMU, the ones that have pages are Manuel Blum, Nick Hopper, and John Langford, while Luis von Ahn, who had more to do with the project than the last two, does not have a link from the site? He was featured in a NY Times article with Manuel Blum, unlike the other two, so it seems to me that he merits a stub more than they do. Just wondering. -- Jorge Vittes 17:36, 17 Dec 2005 (PST)
- This is the nature of research. Once a scientist is famous, they manage other folks rather than do lower-level work. There's a great photo of William Shockley smiling and sitting at a microscope (which he hadn't used before the photo shoot) while his underlings, who did the actual work of inventing the transistor, stand around and look frustrated and useless. Compare the length of those three men's Wikipedia articles.--Joel 22:51, 9 May 2006 (UTC)
Article quality
The percentage of this article given over to external links instead of content ain't great...--BozMotalk 11:15, 5 January 2006 (UTC)
Agreed. It's useful to have some links, but the number currently there is a bit silly. Imagine the same happening on the 'guestbook' page. I suggest either a separate page, or that someone goes through the list and decides which are the (two?) best for each language. I would do it, but I wrote one of the PHP ones, so obviously I'm biased. (user24)
Is there a Wiki Captcha extension?
Is there a Wiki extension for Captcha?
Not on the English Wikipedia. I have accounts on other Wikipedias, and of the 6 where I registered to keep someone else from using my name, 4 had a CAPTCHA: [1], [2], [3], [4] --Ávril ʃáη 04:48, 21 July 2006 (UTC)
- There are several MediaWiki extensions for Captcha, including reCAPTCHA.
- However, if you have read "Inaccessibility of CAPTCHA: Alternatives", you might consider implementing other ways of combating spam -- some are listed at MediaWiki: Manual:Combating_spam.
- If you are using some other wiki engine software, it is quite likely to have a CAPTCHA or some other anti-spam technique. For example, Oddmuse has the QuestionAsker Extension.
- --75.19.73.101 13:38, 25 October 2007 (UTC)
Does the use of captcha violate civil liberties?
Most people these days agree that discrimination against people for jobs or housing on the basis of skin color is a violation of Civil rights and Civil liberties. Yet widespread discrimination against people who are blind or visually impaired is politely noted and (usually) ignored.
Examples are captcha (see the Accessibility section of this article), inaccessible voting (usually one has to get a sighted friend or relative to vote for them), unequal access to education (textbooks are frequently not available in braille or large print in time for the classes that use them), and the barring of service animals from restaurants and other public places.
Some of these are civil rights violations, and some civil liberties violations, and some just plain inconveniences, but don't they deserve some real attention? Why can major website companies, such as Google and Yahoo, use purely visual captcha without serious challenge from society? Why is no one developing a challenge-response system that is text only?
I have no answers, only questions.
David 15:53, 17 February 2006 (UTC)
- This is not really something for Wikipedia to decide. If you are worried about the accessibility of Captchas, but nobody else is, then you should start a separate forum to discuss this. —Quarl (talk) 2006-02-17 22:35Z
Link involving goatse
The "PWNtcha" site linked to includes a goatse image... it should probably have a warning or some such. (There's also been questions involving its legitimacy, that I haven't done the research on yet.) Is there a standard warning for "potentially hazardous to your retinas" images, or a policy on this? --Piquan 00:46, 9 March 2006 (UTC)
- If by legitimacy you mean whether the goatse image was intentionally put there by sam.zoy.org, check out http://sam.zoy.org/goatse/
If you wonder whether pwntcha is actually a fake, there's now an online demo that should allay your fears. --User24
- Thanks for the info. The legitimacy question was mostly a sidenote; as I said, I hadn't done the research. I'm okay with the idea that a legitimate software demo includes an offensive image; I just think that viewers should be warned before visiting pages with shock images. --08:49, 20 March 2006 (UTC)
Article title: acronym styling
This should be "CAPTCHA" rather than "Captcha", shouldn't it? —Ashley Y 06:14, 11 March 2006 (UTC)
I think that's a technical limitation of the wiki software, but yeah, it should -User24 27 March 2006
- Then shouldn't it be CAPTCHA throughout the article? -- Calion | Talk 03:34, 6 April 2006 (UTC)
- Yes. —Locke Cole • t • c 03:39, 6 April 2006 (UTC)
- Well, now it is. Except for the Links section, which was too dangerous to blanket convert. -- Calion | Talk 03:45, 6 April 2006 (UTC)
- Looks like we both went to do it at the same time. =) But I took longer because I went through each of the external links slowly (didn't want to break any URLs). ;) FWIW, it looks like I caught one URL you uppercased, so it's all good I guess. =) —Locke Cole • t • c 04:10, 6 April 2006 (UTC)
- The present all-caps styling (CAPTCHA) is perfectly fine, and probably would be considered preferred by most people. But I just wanted to point out, purely as FYI, that c/lc styling (Captcha) can be logically defended as explained at wikt:Category:Acronyms. If you read through the whole page, it gives a nice overview of the epistemology of styling for acronyms and initialisms. (NB: I do not advocate changing this article's styling; it is fine as-is and would be fine either way.) Cheers! — Lumbercutter 13:26, 4 October 2007 (UTC)
Backronym? No.
Why is CAPTCHA a backronym?? That would mean the abbreviation had another meaning before? Or is it derived from capture? --Abe Lincoln 10:16, 21 April 2006 (UTC)
- It is not. It is a contrived acronym. --FOo 03:25, 22 April 2006 (UTC)
- Contrived acronym redirects to backronym. --Amit 08:42, 5 October 2006 (UTC)
- Follow-up: Redirect was corrected on 2007-05-05. — Lumbercutter 14:54, 4 October 2007 (UTC)
For this article?
I uploaded this image a while ago. Would including it here be too self-referential (a little bigger, obviously)? Is another image even needed? Thanks. --LV (Dark Mark) 23:57, 26 April 2006 (UTC)
Patent vs Copyright
There was a note stating that an algorithm related to CAPTCHA may be patented. It was recently changed to "copyright", with an edit note that algorithms and source code are copyrighted, not patented. While I do agree that source code is generally copyrighted, my understanding is that algorithms can be patented, at least in the US. Famous examples include XORing images (U.S. Patent 4,197,590), the LZW compression patent (for which there were two distinct holders, but I only recall U.S. Patent 4,558,302), and the Fraunhofer MP3 patent. While this practice is disputed (most strongly by the League for Programming Freedom), software patents are available in the US. See software patent for more discussion. I've reverted the edit, but put this note in to explain my reasoning. --Piquan 00:08, 3 May 2006 (UTC)
Did PayPal invent this?
Did PayPal invent the online thing where there are blurry numbers and letters and it takes 5-10 tries before a human guesses them right (not the leetspeak ones in this article)? In the book The PayPal Wars, Eric M. Jackson wrote that PayPal did, but I don't know. I added it long ago to Auction_sniping. I don't know when this message will be seen by someone who knows, but if PayPal didn't, then the phrase "that Eric M. Jackson stated was invented by Paypal" should be removed from Auction_sniping. DyslexicEditor 21:32, 24 May 2006 (UTC)
- Yes, PayPal invented this sometime in 1999 or so. It was called the "Gausebeck-Levchin Test" at the time, after its creators, and was originally used to prevent fraudsters from signing up multiple accounts using automated scripts. The numbers were not blurry, but rather placed on an inconsistently broken grid with small gaps to foil automated OCR programs. 204.15.20.244 01:30, 14 September 2006 (UTC)
Removed 'javascript client-side' link
I removed this link: custom.programming-in.net/articles/art20-turingNumbers.asp
(1) it is not fully client-side (the image is generated on above site) (2) it does not protect at all, since it is an onclick event on pressing the submit button, so it will only work in a browser, not in a script that just submits form data.
Han-Kwang 08:58, 8 June 2006 (UTC)
- It was added by a pretty notorious programming-in.net spammer. Full support. Haakon 09:23, 8 June 2006 (UTC)
OCR :( :( :(
I tried to implement a captcha on my website, but bots are smart enough to crack it. Maybe the OCR is too high-tech nowadays. I want to try a different method, like maybe a question in English 'What is the third letter of the word hippopotamus?' and you have to type in 'p' or 'P'... would that work better? --Sonjaaa 12:04, 8 June 2006 (UTC)
- Not only would it work better, but it would treat blind and visually impaired people fairly. 141.157.163.11 13:54, 24 June 2006 (UTC)
- As long as there is a predictable pattern to the question, it can be cracked. --Amit 08:49, 5 October 2006 (UTC)
- There cannot be a perfect solution, since spammers can hire people to process CAPTCHAs, and since any puzzle solvable by a human can be (theoretically) solved by a computer system. Only when the technologies distinguish between people and bots throughout the development cycle in a way that cannot be faked (probably using cryptography) can the distinction be relied upon. David 22:44, 2 January 2007 (UTC)
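To make the trade-off in this thread concrete, here is a small sketch of the letter-question idea together with the objection about predictable patterns. Everything in it is hypothetical: the word list, the function names, and the fixed question template are made up for illustration only.

```python
import random
import re

ORDINALS = ["first", "second", "third", "fourth", "fifth"]
WORDS = ["hippopotamus", "wikipedia", "keyboard"]  # hypothetical word list


def make_question(rng=random):
    """Build a question of the proposed form and return (question, answer)."""
    word = rng.choice(WORDS)
    index = rng.randrange(len(ORDINALS))  # every word has at least five letters
    question = f"What is the {ORDINALS[index]} letter of the word {word}?"
    return question, word[index]


def check_answer(expected, response):
    """Accept the answer in either case, e.g. 'p' or 'P'."""
    return response.strip().lower() == expected.lower()


def solve(question):
    """What a bot could do once it knows the fixed template."""
    match = re.fullmatch(r"What is the (\w+) letter of the word (\w+)\?", question)
    ordinal, word = match.group(1), match.group(2)
    return word[ORDINALS.index(ordinal)]
```

The checker is only a couple of lines and works without images, which is the accessibility argument. The `solve` function is just as short, which is the point about predictable patterns: a fixed template is easy for a bot author who bothers to look at it.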
Cleanup external links section
I think the external links section is getting out of hand. Wikipedia is not a collection of links. A large number of them can be defeated by a trivial noise-removal algorithm (color filter or dot removal) followed by standard OCR. A few sites don't even bother to show an example. Since there are so many free implementations available, I think we can be a bit more selective here.
Unless someone can give me a good reason to keep the links, I will remove the links below a few days from now. Han-Kwang 19:56, 24 June 2006 (UTC)
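As a rough illustration of what "dot removal" means here, the sketch below clears isolated pixels from a black-and-white image before it would be handed to OCR. It is a minimal example under stated assumptions: the image is assumed to be a list of rows of 0/1 values, the function name is made up, and it is not taken from any of the linked implementations.

```python
def remove_isolated_dots(image):
    """Return a copy of a 0/1 pixel grid with isolated 1-pixels cleared.

    A pixel counts as isolated when none of its eight neighbours is set.
    """
    height, width = len(image), len(image[0])
    cleaned = [row[:] for row in image]
    for y in range(height):
        for x in range(width):
            if image[y][x] != 1:
                continue
            neighbours = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    if dy == 0 and dx == 0:
                        continue
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < height and 0 <= nx < width:
                        neighbours += image[ny][nx]
            if neighbours == 0:
                cleaned[y][x] = 0  # a lone speck, treated as noise
    return cleaned


noisy = [
    [0, 1, 0, 0, 0],
    [0, 1, 0, 1, 0],  # the 1 in the fourth column is a lone noise dot
    [0, 1, 0, 0, 0],
]
cleaned = remove_isolated_dots(noisy)
```

In this toy grid the vertical stroke survives and the lone dot is removed, which is all the noise in many of the linked generators amounts to.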
Question to the captcha editors present here: REGARDING: - 05:27, 6 March 2007 38.114.140.10 (Talk) (→External links - removed the spam-y captcha.biz link. the site contains very low quality content and much advertising.)
The point of captcha.biz was due to myself not finding an easy to implement Captcha solution after several days of searching for one. And by easily implemented I mean for the non programmer, no php knowledge and very little html knowledge. captcha.biz is such a website, and has helped many webmasters implement captcha on their websites in 15 minutes. It contains step by step examples and free functional Captcha sequences which are easily implemented by the non programmer webmaster. The above IP editor who deleted the external link states that the content is of low quality. I beg to differ: if by high quality he means technical captcha stuff, then that is not the scope of captcha.biz. To keep the site alive some Google ads were placed there, as there are on millions of other web sites included in wikipedia. And if the deleting editor took the trouble to find simple explanations on the web on how to implement Captcha easily - and there are almost none - he might have had second thoughts. Someone here stated that they were looking for Captcha resources that list a large number of implementations of Captcha. Is simple Captcha implementation not considered a resource ? My impression is that this Captcha section here is supposed to explain also how to use and implement captcha and not only the complexities, history and problematics of Captcha. Before I go and re-edit the previous deletion I wanted to hear some opinions from you captcha expert analysts on what is considered quality - if dealing with a complex matter and trying to simplify it for everyone to understand. - thank you - Pete
Replies copied from HK's Talk page:
- You are absolutely right in your assessment of the external links on the CAPTCHA page. I would do exactly the same....so feel free to trim those links off that you talked about. That would be fantastic.
--Ownlyanangel 00:21, 26 June 2006 (UTC)
- I would kill all of the implementation and services links citing WP:NOT a web directory. Replace them all with a link to something that is really a web directory, like Dmoz. --GraemeL (talk) 00:36, 26 June 2006 (UTC)
- I see, that's even more rigorous. Is there any good web resource on CAPTCHAs? DMOZ is not very helpful for this type of things, since DMOZ in principle only indexes complete sites, rather than sections of sites. (Not to mention that it usually has a 6-month backlog) Of course, I can start my own CAPTCHA index page (outside Wikipedia), but it would be kind of a conflict of interests. :-) Han-Kwang 13:41, 26 June 2006 (UTC)
- Attempting to find a site or two that lists, and perhaps reviews, a large number of implementations would still be an advantage. Even the cut down list below would still be around half the size of the actual article text and would likely grow again over time. --GraemeL (talk) 13:54, 26 June 2006 (UTC)
I just removed the links listed below. As soon as I find a good reviewed CAPTCHA index page I will remove the other links to implementations as well, but I haven't been searching yet. Han-Kwang 21:49, 1 July 2006 (UTC)
[edit] PHP section
- Cryptographp - http://www.cryptographp.com/ - not in English
- Captha script - http://andyydev.com/project.php?file=captcha - weak
- Image verification - http://www.pscode.com/vb/scripts/ShowCode.asp?txtCodeId=762&lngWId=8 - commercial site, ads, weak
- Auditor - http://php.webmaster-kit.com/ - weak
- Forms generation - http://www.phpclasses.org/browse/package/1.html - weak, excessive ads
- PEAR - http://pear.php.net/package/Text_CAPTCHA/docs - no example provided
- Captcha PHP - http://freshmeat.net/p/captchaphp - no example provided
- CAPTCHA for Drupal - http://drupal.org/project/captcha - no example, works only in Drupal CMS
- PHP CAPTCHA AllSyntax - http://www.allsyntax.com/code/php/57/CAPTCHA-Image/1.php - weak
[edit] .NET
Three of the four links seem to be basically the same Captcha engine. One website appears 3 times in the list:
- ASP.NET Control - http://www.newtonsoft.com/blog/archive/2006/05/29/283.aspx - weak, duplicate
- Captcha Image - http://www.codeproject.com/aspnet/CaptchaImage.asp - weak, duplicate
- Custom control - http://www.codeproject.com/aspnet/CaptchaControl.asp - weak, duplicate
- ASP.NET Control - http://www.dotnetfreak.co.uk/blog/archive/2004/11/06/166.aspx - broken link (now works)
[edit] Classic ASP
- Poor man's - http://www.u229.no/stuff/Captcha/ - weak
- ASP Captcha script - http://www.motobit.com/util/captcha/ - somewhat weak
- ASP-CAPTCHA project - http://sourceforge.net/projects/asp-captcha - very weak
[edit] Java
- reCaptcha - http://www.crt.realtors.org/projects/reCaptcha/ - weak
- kaptcha - http://code.google.com/p/kaptcha/ - Strong. Easy to implement in your webapp. Produces output similar to Yahoo.
[edit] Coldfusion
- Open Source Captcha CFC - http://www.compoundtheory.com/?action=captcha.index - no example
[edit] C
- Obfuscated image - http://freshmeat.net/projects/obfuscatedimage/ - no clear example
[edit] Perl
- Authen - http://search.cpan.org/dist/Authen-Captcha/ - no example
- SecurityImage - http://search.cpan.org/dist/GD-SecurityImage/ - no example
- ImageCode - http://www.progland.com/protect_forms.htm - weak (vulnerable to reusing hidden form variables)
- Message Image - http://www.scss.com.au/family/andrew/webdesign/msgimg/ - weak
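The "reusing hidden form variables" weakness noted in the Perl list can be sketched as follows. This assumes the weakness means that the hidden field values from one solved challenge can be replayed on later submissions; the class `ChallengeStore` and its methods are hypothetical and do not describe how ImageCode or any other listed script works. The idea is that the server keeps the expected answer and invalidates each token on first use.

```python
import hmac
import secrets


class ChallengeStore:
    """Hypothetical server-side store of single-use CAPTCHA challenges."""

    def __init__(self):
        self._answers = {}  # token -> expected answer

    def issue(self, answer: str) -> str:
        """Register a new challenge and return the token for the hidden field."""
        token = secrets.token_hex(16)
        self._answers[token] = answer
        return token

    def verify(self, token: str, response: str) -> bool:
        """Check a response; the token is discarded whether or not it matches."""
        expected = self._answers.pop(token, None)
        if expected is None:
            return False  # unknown or already used token
        return hmac.compare_digest(expected.encode(), response.encode())
```

Because `verify` removes the token before comparing, replaying the same hidden value with the same answer fails the second time.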
[edit] Python
- recipe, activestate - http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/440588 - no example
[edit] Ruby
- http://captcha.rubyforge.org/ - no example
- http://frankhale.org/CAPTCHA_RoR_Tutorial.html - broken link
- http://tagifieds.com/permatags/josh/ror-captcha-howto - broken link
[edit] Smalltalk
- http://www.squeaksource.com/SW2Captcha/ - no example
[edit] Lasso
- http://www.lassoforge.com/projects.lasso?PR=44 - no example
[edit] CAPTCHA services
Remove all links. It makes no sense to outsource a captcha to a different server since form verification has to be done on your own server. If you can do the latter, you can use one of the free implementations above as well. Moreover, these are not really CAPTCHAs in the sense that the generating algorithm is not disclosed. So only keep the ones that offer something extra and remove the following links:
- http://www.protectwebform.com/ - weak
- http://w2.syronex.com/jmr/safemailto/ - this is not a captcha
- http://www.cerospam.com.ar/demo/ - weak
Han-Kwang 19:56, 24 June 2006 (UTC), update 25 Jun
External links are a little messed up, but the defeating links are valuable in learning more about Captchas, especially when developing them (to avoid the pitfalls of creating ones that are easily defeated). (209.87.176.132 17:25, 18 October 2006 (UTC))
[edit] E-mail distinguisher
This site has a captcha which makes you solve an equation. The bellman 15:42, 25 June 2006 (UTC)
- It took me maybe 20 minutes to write a perl script to defeat this thing, including the time to look up the Perl API for HTTP requests. After testing it a couple of times I realized that you probably aren't related to the above website (which seemed to contain a big wikipedia link). Han-Kwang 17:05, 25 June 2006 (UTC)
- I am the creator of the utility mentioned above, which is described in full on this page. It is not a CAPTCHA and makes no claim to be one. It is a form and CGI program which allows visitors to a site to send E-mail to an address which is nowhere disclosed on a Web page, requiring them to first solve a problem. Obviously, one can write a program to solve such problems—that is how the feedback form works itself! The purpose of the program is not to distinguish computers from people, but rather people whose mail is likely to be worth reading from idiots. --John Walker (fourmilab.ch) 17:46, 25 June 2006 (UTC)
[edit] Only one T for "Turing test to tell"‽
Wouldn't "Turing test to tell" generally become three T's under the usual initialism rules, with only "to" being optional? This acronym seems a bit too loose to be true. I suspect it's actually Completely Automated Procedure to Tell Computers and Humans Apart. It's not exactly a Turing test, since there's no human judge; maybe that bit was just thrown in during backronymization to impress non-experts. SeahenNeonMerlin 22:22, 2 July 2006 (UTC)
- See www.captcha.net[5], in the first publication listed. I've worked for Aladdin before, they have a history of coming up with acronyms that select letters to come up with what they want. --JVittes 22:44, 2 July 2006 (UTC)
- Considering it has been almost a week and there is no further discussion I'll remove the disputedAssertion tag soon. --JVittes 17:27, 8 July 2006 (UTC)
[edit] Why use trademark, not generic term?
Since CAPTCHA is registered as a trademark, why is it used in this article as the generic term for these programs? The community should come up with an unencumbered name and use that. A direct, un-cute name would be fine.
For what it's worth, CMU's case for the trademark is weakened by the sort of usage that's going on here. That's fine by me. I dislike the annexation of portions of our natural language namespace without good reason. —Preceding unsigned comment added by Knackers (talk • contribs)
- They are also called HIP, for 'human interaction proof', but this word has a wider meaning than Captcha. I can't find anything on the web about CMU complaining about the use of the word. Han-Kwang 03:27, 16 July 2006 (UTC)
Agreed. I'll just make a better version of this technology, give it a new name, and claim to never have even heard of the term "CAPTCHA" :3. I have a technique planned out that will prevent character scanners from telling the difference. (Though as complex as this idea is, its use has to be VERY limited.) ThymeCypher 16:08, 1 July 2007 (UTC)
[edit] Accessibility
The article says:
Even some of the demo CAPTCHAs at the software sites listed below are indecipherable to many if not all humans.
I assume that "software sites listed below" is referring to the links at the bottom of the article. However, I'm not sure which of the links is a "software site". It certainly isn't all of them--I only found one that has more than one sample (Breaking a Visual CAPTCHA), and all the CAPTCHAs there were relatively easy to read. (I recently saw one somewhere--I don't recall where--that took me five tries to get right :-(.)
- That phrase refers to a section in an older version of the article that contained external links to implementation. I removed the entire section since it was attracting too much linkspam, intentional or not. I'll reword the phrase. Han-Kwang 18:46, 10 August 2006 (UTC)
Also, in the sample CAPTCHA at the beginning of the article, the caption says:
This CAPTCHA of "smwm" obscures its message from computer interpretation by twisting the letters and adding a background color gradient
What color gradient?
--69.140.23.118 16:06, 2 August 2006 (UTC)
- There is a gradient. If it is not showing up on your monitor, your system may not be in the highest standard color resolution. David 22:47, 2 January 2007 (UTC)
[edit] Paying the human operators with access to pornography instead of money has also been considered.
That line needed some sort of comment. Keep, notable ;) Mathiastck 18:35, 22 August 2006 (UTC)
[edit] I like the leet speak example
Hey, I just re-added the leet speak example. I think the leet speak example aptly explains why CAPTCHAs work. It's easy for humans to recognize text that machines have more trouble with. Mathiastck 16:29, 28 August 2006 (UTC)
[edit] Should this be in Guessing Games?
I don't think CAPTCHA should be in guessing games even though it is a bit of a guessing game sometimes.
- Why not? Nothing wrong with being in a lot of categories. The category system is underused. Mathiastck 12:07, 3 September 2006 (UTC)
from user24: freeCap has been turned into exactly that - a children's guessing game! check out http://www.jambav.com/jambav/flashy/cap/index.php?source=gamepage
[edit] Deafblind?
How would one who is deafblind access the internet in the first place? --66.220.237.102 15:10, 6 September 2006 (UTC)
- With a braille terminal or some other accessibility device. —Keenan Pepper 22:21, 6 September 2006 (UTC)
[edit] Spammers comment is misleading
The following is misleading in the context it appears in: "but the technology can also be exploited by spammers by impeding OCR detection of spam in images attached to email messages." These distorted messages that appear in spam messages are not an application of Captcha. The only way they are related to Captcha is that both use the same technology, so this comment - if it appears at all - belongs in a different section. —The preceding unsigned comment was added by Mahemoff (talk • contribs) 06:29, 7 December 2006 (UTC).
[edit] Statistics Regarding Deaf/Blind
The accessibility section features a section which contains several statistics regarding the deaf/blind. This strikes me as unnecessary and is not useful to the article, since these statistics do not tell us anything about CAPTCHA itself. We know that deaf/blind people exist - the specific number of people in the UK who are both deaf and blind is not pertinent to CAPTCHA itself. Would anyone take issue with me deleting these lines from the article? —The preceding unsigned comment was added by Vonkwink (talk • contribs) 07:16, 6 January 2007 (UTC).
[edit] Is this a site used to solve CAPTCHA?
The article talks about a «[...] technique used consists of using a script to re-post the target site's CAPTCHA as a CAPTCHA to a site owned by the attacker, which unsuspecting humans visit and correctly solve within a short while for the script to use.» I found this site, http://mortgage-and-remortgage.info/, while surfing the Internet. I don't know, but is this site using unsuspecting human visitors to solve CAPTCHAs? If so, would it be of any interest to use the site for the article, for example with a screenshot of the site? Would that be a copyright violation? Jayme 18:45, 6 January 2007 (UTC)
[edit] HTML encoded Captcha
I've just copy-edited a new article on HTML encoded Captcha, but it has only one primary source, so I would suggest evaluating its content and, insofar as it is notable, merging it here. Tikiwont 14:03, 11 January 2007 (UTC)
- Has anyone ever tried ascii Captcha (IE, like the example ascii art message at http://www.network-science.de/ascii/)? If different fonts are used, it would be hard to decode yet wouldn't take up much HTML, and it could be based in javascript. 24.107.235.184 21:07, 13 January 2007 (UTC)
- I don't think it's notable (esp. given that it only refers to one implementation), and as a 'CAPTCHA' doesn't necessarily need to be image based, there's nothing fundamentally different about HEC. Also the implementation linked is very weak; see the discussion on /. and the link in the HEC page. Sounds like advertising to me. The idea that HTML is less OCR-able than JPG is totally flawed; it doesn't take long to printscreen. —The preceding unsigned comment was added by 194.80.176.253 (talk) 09:55, 14 January 2007 (UTC).
- sorry, that was me forgetting to sign my posts as usual. -User24 03:08, 19 January 2007 (UTC)
- Merge per Tikiwont. RupertMillard (Talk) 09:49, 20 January 2007 (UTC)
- I think there is a little information in the proposed text and I do not think it would be helpful to merge it. Theo Pavlidis
24.186.167.222 15:44, 26 January 2007 (UTC)
The article has been removed as per Wikipedia:Articles for deletion/HEC (html), so I removed the merge tag. Tikiwont 10:39, 13 February 2007 (UTC)
[edit] Spam image
I'm pretty sure that the purpose of the colored streaks in the spam image is not to defeat OCR, but to add some randomness to each picture so spam recognition software can't fingerprint and filter them. --CyBot 17:34, 14 January 2007 (UTC)
- Both things are the same, at least from my view point.
- they are not the same at all. While totally ineffective at stopping OCR, the obfuscation is perfect for avoiding, say, md5 hash based spam image checks. I would bet money on this being the motivation for them adding these streaks. --User24 03:06, 19 January 2007 (UTC)
- As it happens, the earliest techniques did use MD5 hashing to fingerprint, but note that only one bit needs to be different to get a "unique" MD5 hash (a bottom row of random pixels in two shades of white would do the job fine). I think anon's point is that a truly general-purpose image-similarity detector—no matter how it happens to operate—becomes an OCR program when run against a dictionary of strings that have been converted to images. CAPTCHAs and image spam have in common fundamentally the goal of diverse digital representations of something that can still be read by intelligent beings. Look at it this way: if static and noise could be added to an image so that it didn't make it any less vulnerable to OCR, it would still improve a CAPTCHA algorithm by creating more unique variants (so that a database of CAPTCHA-hashes mapped to solution text would be less useful). Metaeducation 23:51, 18 February 2007 (UTC)
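The point about a single differing bit can be checked with a short sketch using Python's standard `hashlib`. The byte string below is only a stand-in for image data, and `flip_lowest_bit` is a hypothetical helper that changes one bit, much as a slightly different shade of white in one pixel would.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Hex MD5 digest, as an exact-match fingerprinting filter might compute it."""
    return hashlib.md5(data).hexdigest()


def flip_lowest_bit(data: bytes, index: int) -> bytes:
    """Return a copy of data with the lowest bit of one byte inverted."""
    changed = bytearray(data)
    changed[index] ^= 0x01
    return bytes(changed)


# Stand-in for the raw bytes of a spam image (not real image data).
original = bytes(range(256)) * 4
variant = flip_lowest_bit(original, len(original) - 1)

# The two inputs differ in exactly one bit, yet the fingerprints do not match.
fingerprints_match = md5_hex(original) == md5_hex(variant)
```

An exact-hash filter treats `original` and `variant` as unrelated images, which is why matching on MD5 alone is easy to sidestep.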
According to Ziff Davis and some other sources (search on OCR+"image spam") lots of filter software does indeed use OCR and then just run the result text through the usual algorithms. Counterintuitive though it may seem, existing OCR was easier to adapt than to write general-purpose image "fuzzy matching" (algorithms that could also—say—accurately determine if two pictures were of the same cat). Plus, in the domain of the kinds of images they were dealing with, extracting and comparing the text was probably better than any other approach they had on hand. (Note that it doesn't take much to screw up OCR software that hasn't been tailored to a specific kind of image, so intuition that the commonly-seen distortions are "easy" to compensate for isn't necessarily correct.) Metaeducation 23:51, 18 February 2007 (UTC)
[edit] Merge
This article was nominated for merge. I agree. Nothing more to say :) (I don't think an article is needed for every type of captcha.)
—Nethac DIU, would never stop to talk here—
18:41, 15 January 2007 (UTC)
We have a very fast bot. I ran into an edit conflict when adding my signature, 5 seconds later.
- The page that has been proposed for merging into this one looks like an advertisement. ("no production website has ever used it.") Gazpacho 05:35, 17 January 2007 (UTC)
Someone who knows what they are doing should edit the links on this page, removing the ones that go to phishing sites. Then lock the page.
[edit] Dating the external links
The external links do not include a date (publishing, posting, proposal etc.), which is normal for all publications. I suggest that each external link have a visible date in WP, to reflect the relative position and development of the linked illustrations. Barefact 09:28, 5 March 2007 (UTC)
[edit] Security of CAPTCHAs
There are various comments about different implementations using security through obscurity, could somebody knowledgeable add some info (or a sub-article) on how to implement a secure CAPTCHA, or where to find an existing, secure implementation? -- Lee Carré 19:57, 29 April 2007 (UTC)
[edit] Identipic
I think that the Identipic link that one user is adding should stay removed. The person has a "system" where he recommends the webmaster add 3 static pictures to their website and give 10 choices. It's not really a CAPTCHA in any sense.
I'm starting to wonder if CAPTCHA needs to be more carefully defined. For example, 1+2=? is being called a type of CAPTCHA under this article. I think the definition needs to be pruned down to bot-preventing challenges which provide some level of challenge to a simple computer program. The issue is defining simple.
128.2.223.159 00:55, 3 May 2007 (UTC)
- It certainly seems like link spam to me and should be removed. Russeasby 19:50, 4 May 2007 (UTC)
- I started to clarify the definition of a CAPTCHA to require two properties: 1) breaking the CAPTCHA must require AI. 2) there must be an automatic means of generating new CAPTCHAs. I believe this helps to differentiate effective CAPTCHAs from some of the CAPTCHA-like systems that have been proposed. The "breaking it needs AI" requirement eliminates obscurity based challenges (like what is 1+1). While the definition of "needs AI" is not exactly formal, I think it helps draw the line. I'm wondering if there is a more formal definition that can be used. The second requirement, automation, differentiates between a make-shift solution and something that could be deployed on a large site. Without this requirement, any aspect of the website could be a CAPTCHA. 128.237.238.114 16:30, 6 May 2007 (UTC)
- Clarifying the definition of CAPTCHA is of course good, if it is accurate and representative of the general view of what CAPTCHA is. I am not at all knowledgeable on this subject so I cannot comment on your changes specifically. But it does make it clear to me that this section needs proper references to support it, can you provide those? Russeasby 16:54, 6 May 2007 (UTC)
- Well the captcha.net site states that "A CAPTCHA is a program that can generate and grade tests that most humans can pass, but current computer programs can't pass." I think that definition implies both of the properties ("a program that can generate" => automation, "current computer programs can't pass" => needs AI). The "current programs can't pass" part of the definition is a bit problematic. Strictly following this definition, there is no such thing as a weak captcha. The captcha is either strong, or can be broken by current programs and is then not a CAPTCHA. 128.2.223.159 20:07, 6 May 2007 (UTC)
- I came here from WP:RFC. The Identipic link should definitely stay removed as it's a clear violation of Wikipedia's External link policy (WP:EL) If you look here: WP:EL#Links_normally_to_be_avoided the identipic link fails on 1, 3 and 5. Also, there's no reason to believe that Identipic could be considered a reliable source. This isn't really a close call. --JayHenry 20:07, 7 May 2007 (UTC)
- I came here from RfC too. I also think the link should stay removed. Please remove this article's listing from Wikipedia:Requests for comment/Maths, science, and technology when this dispute is resolved.--Daveswagon 01:04, 7 June 2007 (UTC)
- Has this been settled? It looks like the answer is yes. 199.125.109.127 05:50, 14 July 2007 (UTC)
[edit] TUR.ID. alternative to CAPTCHA
From http://turid.sf.net/ and http://sf.net/projects/turid
TURing human IDentification, a textual highly accessible alternative to image CAPTCHAs involving the usage of simple phrases and based on the language recognition features of the user, supposedly human.
Should it be added? --151.75.175.234 12:28, 7 May 2007 (UTC)
Uh, huge problem, all of these sources can be googled. For example, the first quote can be found with:
http://www.google.com/search?q=Quante+%27l+villan+ch%27al+poggio+si+riposa%2C&
If passages from googleable books are used, then the test isn't really a CAPTCHA, there's an easy back door. If non-googleable passages are used, it's not clear how the test is automated. Further, it's hard to ensure that this CAPTCHA is solvable by some audiences, e.g. children or non-native speakers of the language.
Also, I'm not sure how the program ensures that the sentence only makes sense with the chosen words. How is the word that is to be replaced chosen?
There also need to be many more choices, for example the example on the projects homepage has 36 possible choices. For the CAPTCHA to be effective, there probably needs to be at least 10,000 choices (meaning 4 slots of 10 choices each).
I think this article already has too many references to ideas for CAPTCHAs that haven't been fully realized.
128.237.238.114 15:43, 7 May 2007 (UTC)
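The "4 slots of 10 choices each" figure is a simple product of independent choices. A minimal sketch of that arithmetic follows; the function names are illustrative, and it assumes every slot has the same number of options and exactly one correct combination exists.

```python
from fractions import Fraction


def answer_space(slots: int, choices_per_slot: int) -> int:
    """Number of distinct answers when every slot is chosen independently."""
    return choices_per_slot ** slots


def blind_guess_probability(slots: int, choices_per_slot: int) -> Fraction:
    """Chance that one uniformly random guess hits the single correct answer."""
    return Fraction(1, answer_space(slots, choices_per_slot))


# Four combo boxes with ten options each give 10 * 10 * 10 * 10 answers.
suggested_space = answer_space(4, 10)
suggested_chance = blind_guess_probability(4, 10)
```

With 36 possible answers, however they are arranged, the same reasoning gives a blind guess a 1 in 36 chance.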
If passages from googleable books are used, then the test isn't really a CAPTCHA, there's an easy back door. If non-googleable passages are used, it's not clear how the test is automated. Further, it's hard to ensure that this CAPTCHA is solveable by some audiences, eg children or non-native speakers of the language.
I think that the default sources should not be used. The ideal source would be a random blog (for example, daily cached in database) in the user's language.
Also, I'm not sure how the program ensures that the sentence only makes sense with the chosen words. How is the word that is to be replaced chosen?
The choices are 4 random words from the same text + the correct word. The correct word is replaced with the combo box
There also need to be many more choices, for example the example on the projects homepage has 36 possible choices. For the CAPTCHA to be effective, there probably needs to be at least 10,000 choices (meaning 4 slots of 10 choices each).
It is an idea; the implementation offers customization. The one online is NOT a ready-to-use solution, I guess.
I think this article already has too many references to ideas for CAPTCHAs that haven't been fully realized. 128.237.238.114 15:43, 7 May 2007 (UTC)
I agree. It's better to wait for some real-world successful application. However, I proposed it as a theoretical alternative to CAPTCHA --151.75.175.234 16:29, 7 May 2007 (UTC)
About randomness: the implementation does not allow brute-forcing a captcha, since it is generated again for each request. It should be very effective if the source texts are news, for example - just my 2 cents --151.75.175.234 16:31, 7 May 2007 (UTC)
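The scheme described in this reply (the correct word is replaced by a combo box offering it alongside 4 random words from the same text) can be sketched as below. This is not TUR.ID's actual code; `make_cloze_challenge` and its parameters are hypothetical, and splitting on whitespace is a simplifying assumption that leaves punctuation attached to words.

```python
import random


def make_cloze_challenge(text, rng, distractor_count=4):
    """Blank out one word and offer it among random words from the same text.

    Returns (prompt, options, answer). rng is a random.Random instance, so a
    fresh challenge can be drawn for every request.
    """
    words = text.split()
    vocabulary = sorted(set(words))
    if len(vocabulary) < distractor_count + 1:
        raise ValueError("text has too few distinct words")
    position = rng.randrange(len(words))
    answer = words[position]
    # Distractors are distinct words from the same text, never the answer.
    candidates = [word for word in vocabulary if word != answer]
    options = rng.sample(candidates, distractor_count) + [answer]
    rng.shuffle(options)
    prompt = " ".join(words[:position] + ["____"] + words[position + 1:])
    return prompt, options, answer
```

Note that nothing here checks whether a distractor also fits the blank, and with five options a blind guess succeeds one time in five for a single blank.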
"I think that the default sources should not be used. The ideal source would be a random blog (for example, daily cached in database) in the user's language"
It still needs to be non-googleable. A random blog is sure to be in Google. Further, to meet the requirement of being automated, the CAPTCHA must have a web crawler built in or something. But then Google can find it!
"The choices are 4 random words from the same text + the correct word. The correct word is replaced with the combo box"
There is no assurance that there aren't multiple correct solutions.
"It is an idea"
And a decent one at that. But there are lots of interesting ideas about how one might make a CAPTCHA that haven't been practically implemented. I think that in order to be in the WP article we need simple criteria about the CAPTCHA type:
- There exists an implementation of the CAPTCHA that is either used on a major website, or a freely available implementation that is used by a large number of small sites.
- The CAPTCHA poses a sizable challenge to an attacker who spends a reasonable amount of effort breaking the specific implementation.
"About randomness: the implementation does not allow to brute force a captcha since it is generated again for each request. It should be very effective if the source text are news for example"
If there are 36 possible answers then a randomly guessing computer will be able to get through 1/36 of the time. Or put another way, a botnet based attack on the website would be 1/36th as effective as if the website didn't have a CAPTCHA at all.
This article seems to be headed into a discussion of what might be a usable CAPTCHA. We really need criteria to help us document things that are in use, but not might-work ideas.
128.2.223.159 17:38, 7 May 2007 (UTC)
"I think that the default sources should not be used. The ideal source would be a random blog (for example, daily cached in database) in the user's language" It still needs to be non-googleable. A random blog is sure to be in Google. Further, to meet the requirement of being automated, the CAPTCHA must have a web crawler built in or something. But then Google can find it!
Google cannot be fast enough to index the latest entries of the same day; but, you could say, the attacker could track the sources which are used and defeat the randomness. I have no real solution in such a case; a good random text source should be used, that's the only valid scenario.
"The choices are 4 random words from the same text + the correct word. The correct word is replaced with the combo box" There is no assurance that there aren't multiple correct solutions.
Good point.
"It is an idea" And a decent one at that. But there are lots of interesting ideas about how one might make a CAPTCHA that haven't been practically implemented. I think that in order to be in the WP article we need a simple criteria about the CAPTCHA type: - There exists an implementation of the CAPTCHA that is either used on a major website, or a freely available implementation that is used by a large number of small sites. - The CAPTCHA poses a sizable challenge to an attacker who spends a reasonable amount of effort breaking the specific implementation
I agree
"About randomness: the implementation does not allow to brute force a captcha since it is generated again for each request. It should be very effective if the source text are news for example" If there are 36 possible answers than a randomly guessing computer will be able to get through 1/36 times. Or put another way, a botnet based attack on the website would be 1/36th as effective as if the website didn't ahve a CAPTCHA at all
Again, the botnet would not work since the captcha session is not re-usable. Most modern captchas don't allow it nowadays.
- Just want to clarify, the issue is that for EACH unique captcha the bot has a 1/36th chance. Let's put it like this: because I'm really nice, I put up a website that will give you a prize of 1 cent if you solve a CAPTCHA correctly. A bot would just randomly guess, and each time expect to win 1/36th of a cent. I'd be broke :-).
This article seems to be headed into a discussion of what might be a usable CAPTCHA. We really need criteria to help us document things that are in use, but not might-work ideas. 128.2.223.159 17:38, 7 May 2007 (UTC)
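The 1-cent thought experiment can be written out as a small sketch. It assumes a fresh challenge with 36 equally likely answers on every request and a bot that guesses uniformly at random; the function names and the attempt counts are illustrative only.

```python
import random
from fractions import Fraction


def expected_payout_cents(attempts: int, answers: int, prize_cents: int = 1) -> Fraction:
    """Expected total winnings when each attempt wins with chance 1/answers."""
    return Fraction(attempts * prize_cents, answers)


def simulate_blind_guessing(attempts: int, answers: int, rng: random.Random) -> int:
    """Count wins when every attempt faces a freshly generated challenge."""
    wins = 0
    for _ in range(attempts):
        correct = rng.randrange(answers)  # new challenge for each request
        guess = rng.randrange(answers)    # the bot does not look at it
        if guess == correct:
            wins += 1
    return wins
```

Regenerating the challenge per request does not change the per-attempt odds, so a bot that simply keeps requesting new challenges still wins about once in every 36 tries.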
No I did not want to propose it as a CAPTCHA, however the discussion has been satisfactory to me up to now. Thank you. --151.75.175.234 22:00, 7 May 2007 (UTC)
[edit] Policy on listing implementations
Recently, there have been lots of issues with the article becoming a list of links to CAPTCHA implementations. I'd like to suggest that only CAPTCHA implementations recommended by a research group associated with CAPTCHAs (e.g., the one at Carnegie Mellon, or the one that broke GIMPY) be listed. This allows reasonable implementations of CAPTCHAs to be listed here (as it is the first search result for "CAPTCHA", many people coming here may be looking for an implementation).
Right now reCAPTCHA seems to fit the bill here, as it is recommended by the people who created some of the first CAPTCHAs. This seems to make it as close as anything to an "official" implementation, which I've tried to change the article to reflect.
Comments?
[edit] Peas + hits picture
I thought the peas hits picture might contribute to the article on the potential hazards of using totally random captchas (in this case it could be offensive to some). I won't add this image back unless someone else agrees with me, but I feel strongly that this article could use that image, and not for some schoolboy lol reason. Bassgoonist Talk 07:52, 30 June 2007 (UTC)
- Photo in question http://en.wikipedia.org/wiki/Image:Peashits.png Bassgoonist Talk 07:54, 30 June 2007 (UTC)
- Well, there was no text in the article about it, so it didn't add anything as it was put there. I'm not sure if we can say anything more than "By randomly generating CAPTCHAs, there is a non-zero, but small, probability of offensive text being shown". However, I don't think this adds that much to the article: 1) many web technologies have similar risks (Google risks displaying porn, as does Flickr), and given that many major services have CAPTCHAs, it doesn't seem like this risk carries much weight; 2) the risk can be greatly mitigated with a good blacklist. 64.9.236.172 18:04, 30 June 2007 (UTC)
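A blacklist of the kind mentioned here could be applied at generation time, roughly as in the sketch below. The function name, the lowercase-letters alphabet and the retry limit are assumptions for illustration; the blacklist contents are left to the site operator.

```python
import random
import string


def generate_captcha_text(rng, length, blacklist, max_attempts=1000):
    """Draw random lowercase strings until one contains no blacklisted substring."""
    for _ in range(max_attempts):
        candidate = "".join(rng.choice(string.ascii_lowercase) for _ in range(length))
        if not any(bad in candidate for bad in blacklist):
            return candidate
    raise RuntimeError("no acceptable text found; the blacklist may be too broad")
```

Rejecting and redrawing slightly shrinks the set of possible challenge strings, which is usually negligible for a short blacklist.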
[edit] As nuisance to humans
There is no real reference in the article to one of the biggest disadvantages of Captchas besides accessibility:
They are a nuisance to humans if they prevail on a web site and no effort is made to make them interesting.
Captchas for email registration etc. are not that bad (since you only have to enter them once), but some pages (e.g. the German equivalent to Facebook) use Captchas too frequently, thereby annoying users.
This constitutes a disadvantage of most random Captchas. Captchas that require some brain and not only eyes (e.g. the often-mentioned "simple questions") have a slight advantage over the random ones (but can still grow pretty annoying if you have to solve one every 4 minutes)--Ruben 14:19, 2 July 2007 (UTC) (forgot to sign in)
- The issue you're talking about is simply a design flaw of a given site -- it is over-aggressive in presenting CAPTCHAs. Nothing intrinsic in CAPTCHAs prompts this -- a CAPTCHA is simply a way that the website can place a roadblock to robots in any web page that they want. The responsible use of the technology is the job of those deploying it. A similar analogy would be jackhammers. Sometimes, when doing construction, a jackhammer will be used early in the morning and wake you up. While it's worth noting in Wikipedia that the jackhammer is a loud tool, the issue of jackhammers waking people up is not -- by scheduling construction at reasonable times, the issue can be solved.
[edit] Newest type of captcha not yet in article?
I wonder if this is something we should include in the article? BigNate37(T) 23:36, 12 July 2007 (UTC)
[edit] 3D and Isometric Captchas
3D provides another dimension (pun intended) to CAPTCHA generation that is simple to create, and easier for humans to read than for OCR to break.
3D CAPTCHAs provides pros/cons of 3D and Isometric Captchas, with a working example written in Perl.
I propose adding this to the Defeating CAPTCHAs section. Grafman 06:25, 15 July 2007 (UTC)
- The site doesn't have much discussion in it (no detailed analysis of whether the CAPTCHA proposed is actually harder to beat with OCR). I don't think it adds much to the article.
[edit] CAPTCHA Killer - Commercial Services
Please do not add the CAPTCHA killer link again. The site is spammy and does not contain any content about how it defeats CAPTCHAs. The "service" it provides is also illegal -- in any site, defeating the CAPTCHA violates the terms of service of the given site. If you disagree, bring it up for discussion on the talk page.
Respectfully, I disagree --
- There is content describing the methods -- we do first pass OCR and second pass with human verification.
- CAPTCHA outsourcing - is not illegal
- CAPTCHA outsourcing - is not against the Terms of Service of Myspace, Facebook, or any other site I have researched
- CAPTCHA outsourcing - is a widely used practice at almost every SEO and SEM organization. CAPTCHA outsourcing is used to manage Myspace accounts for many politicians, celebrities and various enterprises (even Fortune 2000.)
68.83.255.13 06:57, 22 July 2007 (UTC)
- CAPTCHA Outsourcing is against the terms of service of many sites. For example, Facebook states: "you agree not to use the Service or the Site to:... use automated scripts to collect information from or otherwise interact with the Service or the Site;". CAPTCHA outsourcing clearly falls under this category. Software that exists for the sole purpose of violating a site's terms of service is illegal. If an organization has legitimate needs on the website (e.g., a politician), I'm sure a legal means of using the site can be arranged. Further, CAPTCHAs are copyrighted content. Proxying a site's CAPTCHAs to other users (for human verification) is a violation of copyright law.
Wikipedia is not a place to advertise services for spammers. From your statements it is clear that you are affiliated with this website. As such, adding your link to Wikipedia violates the principle of neutral point of view and should be done by those in the discussion board.
A reader of this article could gain nothing from your link short of how-to information for spamming. As the concept of human-based captcha circumvention is discussed in the article, it seems that adding your link adds nothing. 64.9.234.175
[edit] Question
Could these be on all login pages in future? 'Cuz that WILL be awesome! PNiddy Go! 0 16:31, 25 July 2007 (UTC)
[edit] External links
Regarding some of the recent edit warring over external links, let me just say this. An external link (excluding references) should not be placed inline where the website is mentioned. References should be cited properly according to Wikipedia:Citing sources and Wikipedia:Citation templates. In the specific case I am referring to with recaptcha.net, it is already in the External links section and need not be duplicated awkwardly in the article text. The relevant Wikipedia:Manual of Style page is Wikipedia:External links, moreover its External links section. Accordingly, Firefoxman (talk · contribs) removed the inline link (the bold part) from this sentence:
- Currently, [[reCAPTCHA]] ([http://recaptcha.net/ external site]) is recommended by the CAPTCHA creators as an official CAPTCHA implementation.<ref>http://captcha.net/</ref>
67.171.102.44 (talk • contribs • info • WHOIS) reverted this, and I subsequently restored it. Please note that the link is already in the References section and the External links section; there is no reason to add a third link for this sentence. BigNate37(T) 21:33, 29 July 2007 (UTC)
External Link = OK?
Hi, the "creator" of CAPTCHA (Luis von Ahn) was recently featured in an episode of Wired Science on PBS. This external site provides a nice summary of his appearance on that show. I have added a link to the External Links section. If you feel it is not appropriate, please delete it. Please do not cut and paste the summary from this other site. Thank you.
[edit] Binary choice
"Some current image recognition CAPTCHAs ask the user to make a binary choice (is this a cat or a dog?). Even with 16 images, a bot has a 1 in 65536 chance of getting the image right."
Doesn't a bot have a 1 in 2 chance of getting the image right if the choice is binary?
Never mind, I didn't read it as 2^16. IRbaboon 12:15, 8 August 2007 (UTC)
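For anyone else tripped up by the same sentence, the arithmetic can be checked with a short sketch. It assumes the bot guesses each image independently and uniformly at random; the function name `blind_guess_probability` is made up for this illustration.

```python
from fractions import Fraction

def blind_guess_probability(num_images, choices_per_image=2):
    """Chance of guessing every image correctly by independent uniform guessing."""
    return Fraction(1, choices_per_image) ** num_images

print(blind_guess_probability(1))   # 1/2
print(blind_guess_probability(16))  # 1/65536
```

So a single binary choice does give a bot a 1 in 2 chance, and it is only the requirement to get all 16 images right that brings the odds down to 1 in 65536.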
[edit] Image Recognition CAPTCHAs
Neopets uses an image recognition CAPTCHA in their NPC shops. I suspect Neopets would probably count as a major website.
[edit] reCAPTCHA and a parallel project published earlier
Hi all, my name is Bruno da Silva and I published a paper at the AAAI Conference on Artificial Intelligence (the most recognized scientific conference on AI) showing how it is possible to extend the CAPTCHA concept to enable knowledge acquisition from Web users.
A few months after I had this paper accepted, I emailed it to the CMU CAPTCHA creators asking their opinion about my paper. It turned out that they were about to release a very similar system that is now known as reCAPTCHA. Therefore, I can say that we both developed the same idea in parallel. I published earlier, showing formally the architecture of an extended CAPTCHA with knowledge acquisition. And they developed a larger system, not only implementing the idea that I formalized before but also making a large-scale contribution to the digital archive.
My question is: I think my paper should be referenced in the CAPTCHA article of Wikipedia (it can be found on Google by its name, "KA-CAPTCHA: An Opportunity for Knowledge Acquisition on the Web"); however, it seems like it would be against WP policy for me to edit the CAPTCHA article myself and reference my own paper there. Is that correct? In this case, what should I do?
Bruno da Silva 04:43, 23 August 2007 (UTC)
- It's not clear to me that this paper would be relevant enough to the CAPTCHA article. Google Scholar lists ~550 CAPTCHA papers and it's obviously not possible to link to all of them in the article. Only papers which are specifically notable should be included here. Colin M. 19:44, 23 August 2007 (UTC)
- Bruno, the idea of human computation (using human brain 'cycles' to perform computational tasks) isn't new at all. Actually, this is the main idea behind Von Ahn's PhD thesis.
[edit] captcha "Completely Automated Public Turing test to tell Computers and Humans Apart"
- Because CAPTCHAs rely on perception, users unable to perceive a CAPTCHA (for example, due to a disability or because it is difficult to read) will be unable to perform the task protected by a CAPTCHA. As such, sites implementing CAPTCHAs should provide an audio version of the CAPTCHA in addition to the visual method. The official CAPTCHA site [4] recommends providing an audio CAPTCHA for accessibility reasons.
If that would be disability-access, then wikipedia would be disability-accessible, as would the internet, & all of society, & there would be no forced homelessness,....
But, such oversimplifications cause such articles to be deceptive, delusional. Does wiki seek to be disability-accessible, or, an illusion of such access?? Such a crock paragraph, therefore, the article, as well.
[[ hopiakuta Please do sign your signature on your message. ~~ Thank You. -]] 01:40, 21 October 2007 (UTC)
- I don't really understand what you're trying to say. Visual-only CAPTCHAs are a problem for people who have difficulty seeing or reading (whether blind or just severe visual difficulties or perhaps even those with dyslexia or otherwise unable to read), i.e. those who need to rely on screen readers or similar devices, as a screen reader obviously can't read a CAPTCHA. By having an audio CAPTCHA, those who rely on screen readers have an alternative which they can hopefully use to prevent login problems due to CAPTCHAs. Obviously this isn't perfect; deaf/blind people or people who can't rely on audio or visual feedback still won't have a solution. And since the vast majority of CAPTCHAs by far use the Roman alphabet, and in many cases knowledge of the English language also helps greatly, they likely cause problems for people who don't understand English, particularly those who don't even use the Roman alphabet. I presume it's even worse in the audio CAPTCHA field. In any case, this is obviously only one component of making a site accessible, and a site clearly needs to do other things to make sure it is properly accessible. I see no suggestion in the quoted text that CAPTCHAs somehow magically solve other accessibility problems that may arise; obviously they don't, and other things such as overuse of Flash may destroy any accessibility and make audio CAPTCHAs somewhat pointless, but this is a separate issue. If you do have suggestions as to how to improve accessibility on Wikipedia, you are welcome to propose them on the WP:VP. We already have, I believe, several editors who are doing their best to ensure Wikipedia remains as accessible as possible. Nil Einne (talk) 18:52, 30 April 2008 (UTC)
[edit] official implementation
I have removed this sentence from the lead section, but it has been added again:
Currently, [[reCAPTCHA]] is recommended as the official CAPTCHA implementation by the original CAPTCHA creators. <ref>http://www.captcha.net/</ref>
I don't want to remove it again; instead I would like to discuss it. My reasons for removal of this information were:
- we already discuss recaptcha in 'collateral benefit' section
- it is original research; recaptcha site does not say anything about 'official implementation'; it is just saying recaptcha has been developed by the same people who invented original captcha (we mention this in 'collateral benefit' too)
Miko3k 17:37, 29 October 2007 (UTC)
IMO, it's not original research; the captcha.net page (from the original creators of CAPTCHA) says the following under the "Guidelines" section:
In general, making your own CAPTCHA script (e.g., using PHP, Perl or .Net) is a bad idea, as there are many failure modes. We recommend that you use a well-tested implementation such as reCAPTCHA.
Colin M. 15:28, 1 November 2007 (UTC)
[edit] bank of america captcha
the article contains the following text:
Some researchers promote image recognition CAPTCHAs as a possible alternative for text based CAPTCHAs. The U.S. financial institution Bank of America has used image-recognition CAPTCHAs as part of the secure login process for their personal banking website.
how does the site key as implemented qualify as a captcha when it also says
A CAPTCHA system is a means of automatically generating new challenges
yet the site key is chosen by the user and remains consistent.--71.131.31.243 (talk) 09:17, 21 December 2007 (UTC)
- This is correct, SiteKey is not a CAPTCHA. I've fixed this up 68.164.124.45 (talk) 06:52, 24 December 2007 (UTC)
[edit] The 3-D CAPTCHA
I added the section on the 3-D CAPTCHA and I added the external link to The 3-D CAPTCHA because it explains in detail this CAPTCHA.
Note that this 3-D CAPTCHA is an image recognition CAPTCHA and it is unrelated to a different 3-D CAPTCHA mentioned earlier on this page. The other 3-D CAPTCHA is simply a way to give alphanumeric characters a 3-D blocky appearance so in reality it is a conventional 2-D alphanumeric CAPTCHA - it is not an image recognition CAPTCHA. CrunchyChewy (talk) 19:25, 25 January 2008 (UTC)
[edit] Ethical to reference?
I was wondering whether it would be considered ethical to add a reference to a site of ours. The site is a catalog of open source CAPTCHAs in PHP. We installed the CAPTCHAs on the site so that people could evaluate them (the url is http://www.trycaptcha.com). Ronp001 (talk) 14:34, 11 February 2008 (UTC)
[edit] Artificial Intelligence
This kind of cat and mouse arms race will, no doubt, lead eventually to Artificial Intelligence, or at the very least, improved visual recognition software —Preceding unsigned comment added by 156.80.75.250 (talk) 19:54, 14 February 2008 (UTC)
[edit] Worst CAPTCHAs
I did a fair amount of research looking for a single article that pointed out the worst CAPTCHAs. This post has raised a lot of attention and is pretty funny as well. I think it would make a good addition to the external links section. If there are any questions please feel free to make an objection in this section and we can debate (the url is http://www.johnmwillis.com/other/top-10-worst-captchas/). —Preceding unsigned comment added by 24.99.46.201 (talk) 14:20, 1 March 2008 (UTC)
[edit] Audio CAPTCHA for Wikipedia?
Could somebody please tell me, why we don't use this Mediawiki extension from reCAPTCHA for Wikipedia? It provides an alternative audio CAPTCHA and I would like to use it for a recently started Blind Wiki at Wikia as well. There are certainly good arguments for not using it within the great Wikipedia but I would like to learn about the reasons. Thank you. -- Lalue (talk) 09:52, 9 April 2008 (UTC)
- This has nothing to do with improving the article. I suggest WP:VP instead for problems relating to Wikipedia. For those relating to Wikia, I suggest somewhere on Wikia. I see you have already received an answer here [6]. There is also some discussion somewhere in the thread here [7]. Nil Einne (talk) 13:18, 29 April 2008 (UTC)
[edit] concept hijacking
Well before the uni mob "coined" their trademark "CAPTCHA" acronym, everyone called these "Turing" tests - why are we using the trademarked copy instead of the original name here? —Preceding unsigned comment added by 203.206.137.129 (talk) 14:36, 16 April 2008 (UTC)
- Firstly, please follow the normal practice on Wikipedia and add new sections to the bottom, not the top. Secondly, as the article mentions (did you read it??) a CAPTCHA is not really a Turing test but a reverse Turing test. Thirdly, a CAPTCHA is a specific form of a reverse Turing test designed in a fairly specific way with a fairly specific purpose in mind and used in specific circumstances. In particular, CAPTCHAs are designed to be reasonably simple for humans to follow and not take too much time, which of course means that a lot of programs which are unlikely to be considered intelligent and can easily be recognised as programs will be able to 'pass' a CAPTCHA. Fourthly and perhaps most importantly, everyone calls them CAPTCHAs. No one calls these reverse Turing tests. Wikipedia nearly always follows the common name, even more so when there is no well accepted other name, trademark or not. Nil Einne (talk) 19:02, 30 April 2008 (UTC)
[edit] Article does not discuss the serious problems with CAPTCHA
As members of the Baby Boom generation age, their hearing and vision tend to degrade. Increasingly, users of major Web-based services like Yahoo are beginning to have serious difficulty recognizing CAPTCHA images or sounds. This problem can only grow worse in the future.
The article may be said to lack neutrality in that it projects a uniform acceptance of the use of CAPTCHA and makes little mention of the problems that image and audio CAPTCHAs pose for older users (as well as discriminating against people who are blind, deaf, and dyslexic).
While it is true that text-based CAPTCHA poses greater technical challenges (including, for example, the possible need for periodic maintenance) for website designers than do image and audio CAPTCHAs, it is also true that even simple text CAPTCHAs (such as requiring the user to copy a short series of digits) can be sufficient protection for websites and services having low traffic. Only high-traffic sites such as Yahoo Games and email account registration services truly need a large collection of text-based challenges.
The article is also missing any mention of alternative means of keeping bots out of websites, such as the use of registered and trackable tokens that indicate use by a human (the tracking ensuring that use by a bot can be easily tracked back to the act of stealing the token from a human, which could be made legally actionable). (Note: since I just made up this example, it is most likely flawed. That fact does not argue against making the article balanced by discussing such alternatives.)
David (talk) 14:26, 17 May 2008 (UTC)
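To make the "copy a short series of digits" suggestion above concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the function names `new_digit_challenge` and `check_answer` are hypothetical, the five-digit length is arbitrary, and how the site displays the digits and remembers the expected answer between requests is deliberately left out.

```python
import hmac
import secrets

def new_digit_challenge(length=5):
    """Return a fresh random string of decimal digits for the user to copy."""
    return "".join(secrets.choice("0123456789") for _ in range(length))

def check_answer(expected, submitted):
    """Compare the submitted text with the expected digits, ignoring outer spaces."""
    # Encode to bytes so the comparison also works if the user submits non-ASCII text.
    return hmac.compare_digest(expected.encode(), submitted.strip().encode())
```

A site would call `new_digit_challenge()` when rendering the form, keep the result on the server side (for example in the visitor's session), and pass it to `check_answer` together with whatever the visitor typed.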
[edit] Can't see Wikipedia's Captcha
I try to surf and edit securely, and allow images from wikipedia, and cannot see the captcha words in the wikipedia box any more. First I could edit, and did things like add latest W.H.O. info to articles on H5N1. Then, I started getting flaky responses where sometimes things would work and sometimes not. Now, in the little box where the Captcha words should be, there is nothing.
Where would one suggest that a troubleshooting link next to the Captcha box would be a good thing? —Preceding unsigned comment added by 68.166.205.59 (talk) 21:18, 3 June 2008 (UTC)