Talk:Password strength


dubious edits

A series of recent edits by 87.66.184.17 make sensible points, but are not integrated into the article at all well. I suggest that they be modified to fit better, or be removed for another try. Comment? ww 16:58, 22 September 2006 (UTC)


Obsolete Advice

The following URL, Choosing Good Passwords -- A User Guide, provides almost dangerously obsolete password advice. Precomputed reverse lookup tables, called rainbow tables, are widely and inexpensively available on the internet. Most of these tables cover passwords that are 8 characters long or less and use letters, numbers and common keyboard symbols.

Extending the set of symbols used in a password and increasing its length have become vital considerations for "High" security passwords.

Even the guidance provided by NIST, often cited by IT auditors, is now more than 30 years out of date. Please consider an update of this page. Arctific 22:01, 21 December 2006 (UTC)

The rainbow table attack will work only if the attacker has access to the hashed password (i.e. requires a break-in or information disclosure), and if the password hashes are stored in unsalted form.
In other words, whether a rainbow table attack will work is independent of the quality of the password. —Preceding unsigned comment added by 87.182.35.203 (talk) 19:59, 14 April 2008 (UTC)
I agree, the entire section on "time to crack a password" is pretty poor and redundant at this point. The Rainbow tables section, in particular, is wrong and not relevant as you point out. --Musides (talk) 05:12, 15 April 2008 (UTC)
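To make the point about salting concrete, here is a minimal sketch. It is not taken from the article or any cited source; the choice of SHA-256, the three-word list and the names `unsalted_hash`, `salted_hash` and `precomputed` are all assumptions for illustration. A plain dictionary stands in for a real rainbow table, which uses hash chains rather than a direct lookup.

```python
import hashlib
import os

def unsalted_hash(password: str) -> str:
    # Hash of the bare password: identical for every user with that password.
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

def salted_hash(password: str, salt: bytes) -> str:
    # Hash of salt + password: differs for every distinct salt value.
    return hashlib.sha256(salt + password.encode("utf-8")).hexdigest()

# Stand-in for a precomputed table: hash -> password, built once in advance.
precomputed = {unsalted_hash(p): p for p in ["password", "letmein", "qwerty"]}

# The same weak password, stored without and with a random per-user salt.
stolen_unsalted = unsalted_hash("letmein")
salt = os.urandom(16)
stolen_salted = salted_hash("letmein", salt)

# The table only helps against the unsalted hash.
print("unsalted lookup:", precomputed.get(stolen_unsalted))
print("salted lookup:  ", precomputed.get(stolen_salted))
```

In this sketch the lookup succeeds or fails because of how the hash was stored, which is the situation the comment above describes. A weak password behind a salt can still fall to ordinary guessing.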

Shadow passwords

How often is the encrypted form of the user's password actually divulged in modern usage? Shadow passwords have been around for twenty years, and so in theory, a hacker needs to have root access before he can see the password file. Now as that entry says, they could be divulged somehow on a network (which is remarkably silly), or they might not even be used, but how often does that really occur? Of course, the issue is always relevant if you use the same password on many machines, but to beat that the hacker doesn't even need root access - he just needs a Web facade that looks like the other twenty you've decided to register at. 204.186.117.158 00:40, 8 May 2007 (UTC)

Lawyer's point of view

Nearly everything I read on this subject, including nearly all of this article, is written from a lawyer's point of view. The user's point of view is more like this: Most websites that demand a user id and a password do so for no obvious reason, so the more obvious the password, the better. If we choose 50 really long passwords, full of random letters and special characters, change them every month, and never write them down, then why wouldn't the next logical step be to throw the computer away? Recite from memory the license plate numbers of 50 friends for me, and then I'll take you seriously. Isn't the risk that someone might impersonate me at some obscure website trivial, compared to the alternative risk that I won't be able to get back into the website at all? Why would James Bond want to impersonate me anyway, at any site other than online banking for instance? Yeah, I know, this is WP:OR. Art LaPella 00:43, 8 May 2007 (UTC)

I agree with some of these sentiments. Passwords are used these days for so many useless trivialities just because they can be. I personally don't give a rat's aspirin if someone stole almost any of my passwords. Who really cares what Bill Roper, aged 19, has in his Hotmail account? Online e-mail accounts are almost universally disposable anyway; if one was compromised you'd just abandon it and set up another. And who wants to post in HypAluvA12's place on any given forum? Unless you're at work or logging on to a shopping or bank website, they're all pretty meaningless. The chances that anybody would care about finding out one of these passwords, to one of these sites, and for just one user are infinitesimal. The over-use of passwords by organisations and websites that don't need them is in tune with the mass hysteria that's going on all over the world to do with computer security. Shouldn't this be mentioned in this article? It doesn't say anything about the fact that not all passwords are serving a purpose. Mr.bonus 13:48, 8 August 2007 (UTC)

trick passwords

Although the article says that trick passwords, like password or anything similar, are bad passwords because someone might be able to guess them, I don't think so. A person can choose a password such as: wordpass.

That is a password I would suggest, in my opinion.

Please list what you think.

—The preceding unsigned comment was added by SteveNash11 (talkcontribs) 04:00, 8 May 2007 (UTC).
  • Try using the most absolute random word in the dictionary!!!!! I know what it is, but I ain't sayin'!!!!! (not-so-evil laugh) Chef Clover MyTalk 14:16, 8 May 2007 (UTC)
    • Of course, there's always the simple double-letter-doohickey, like "rraannddoomm." (giggles) Just try to make sure that it's not one of the most used passwords listed in the article, please. Chef Clover MyTalk 23:32, 8 May 2007 (UTC)
  • Common permutations of dictionary words are standard things to try with most cracking software, and have been for ages: as long ago as 1993 I pointed out to my university lecturer that his password, "drowssap" was insecure and had been cracked in seconds. Concatenations of two short words, like "wordpass" are equally insecure. So, no: if you have a problem with people cracking your account, don't use passwords like this. If you don't mind being hacked, then by all means use the easiest password to type: something like "12345678" or the letter "a" repeated for long enough to meet the minimum password length. Just be aware that these common and simple terms will be in the dictionaries of any decent cracker. —Preceding unsigned comment added by DewiMorgan (talkcontribs) 20:44, 13 December 2007 (UTC)
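As a rough illustration of the kind of variations described above, here is a minimal sketch. It is not how any particular cracking program works; the function name `candidates` and the three-word list are made up for the example, and real tools apply far larger rule sets.

```python
from itertools import permutations

def candidates(words):
    """Yield simple variations of dictionary words that a guesser might try."""
    for w in words:
        yield w            # the word itself
        yield w[::-1]      # reversal, e.g. "password" -> "drowssap"
    for a, b in permutations(words, 2):
        yield a + b        # concatenation of two words, e.g. "word" + "pass"

wordlist = ["password", "word", "pass"]   # hypothetical tiny dictionary
guesses = set(candidates(wordlist))

print("drowssap" in guesses, "wordpass" in guesses)
print(len(guesses), "guesses generated from", len(wordlist), "words")
```

Even with only these two rules, both passwords mentioned in this thread appear in the guess list.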

objection on pattern passwords

Many of you say that pattern passwords can be easily guessed by a variety of methods.

I myself don't agree: is a password like 13795 easily guessed?

It is basically just a pattern on the number square, but most hackers don't even try it out. However, I just wanted to make sure that all of you agree on it before it is changed. --The preceding unsigned comment was added by SteveNash11 (talkcontribs) 04:09, 8 May 2007 (UTC).

Yes, this and other patterns on a Pump It Up or Dance Dance Revolution pad are easily guessed. Pseudocode follows:
for i in (1..99999999):
    if hash(numToString(i)) == passwordHash:
        write "Your password might be " + numToString(i) + linebreak to log
Computers can process this loop very quickly, especially on systems that can distribute for...in to multiple processing cores in a machine or to multiple machines in a cluster. --Damian Yerrick (talk | stalk) 12:29, 8 May 2007 (UTC)
Yeah, even with a numeric sequence twice as long (viz., 1652479803), it takes only seconds to find it:
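#include <stdio.h>

int main(void)
{
  /* The ten-digit numeric "password" being searched for. */
  unsigned int i, pass = 1652479803;

  /* Exhaustively try every candidate below two billion. */
  for (i = 0; i < 2000000000u; ++i) {
    if (i == pass) {
      printf("Password: %u\n", i);  /* %u matches unsigned int */
      break;
    }
  }
  return 0;
}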
64.234.1.144 (talk) 04:46, 1 April 2008 (UTC)
It seems as though this has already been corrected in the article. A reference, however, would be ideal.--Musides (talk) 19:07, 1 April 2008 (UTC)
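For anyone who wants to try the loop from the pseudocode above, here is a minimal runnable sketch. The use of SHA-256 and the names `sha256_hex` and `crack_numeric` are assumptions for illustration only; the target hash is computed locally as a stand-in for a stolen, unsalted hash, and candidates with leading zeros are not tried.

```python
import hashlib

def sha256_hex(text: str) -> str:
    # Hex digest of the candidate password.
    return hashlib.sha256(text.encode("ascii")).hexdigest()

def crack_numeric(target_hash: str, max_digits: int):
    """Try every decimal number below 10**max_digits; return the match or None."""
    for i in range(10 ** max_digits):
        guess = str(i)
        if sha256_hex(guess) == target_hash:
            return guess
    return None

# Stand-in for a leaked hash of the keypad pattern discussed above.
target = sha256_hex("13795")
print("Your password might be", crack_numeric(target, 5))
```

The search space for a five-digit pattern is only 100,000 candidates, so the loop finishes almost immediately on ordinary hardware.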

Impossible

This quote: "impossible to guess are considered strong" -- I believe is stated in the wrong way. A password cannot be impossible to guess, as long as the chance is there. --Thus Spake Anittas 19:55, 8 May 2007 (UTC)

Lol. I think you might be right. If you use the brute-force technique, then any password will be breakable sooner or later; but the bad thing is that it may take years to figure one out. SSBM Pro 00:53, 9 May 2007 (UTC)SteveNash11

This article is pretty bad

and looks pasted from someplace (could someone do a copyvio check by googling some phrases from it?). Also, it's a tutorial: even if it can be improved, it should probably be turned into a wikibook.

The documentation for diceware has a good explanation of password entropy. 71.141.91.251 07:10, 9 May 2007 (UTC)

Thank you for your suggestion! When you feel an article needs improvement, please feel free to make those changes. Wikipedia is a wiki, so anyone can edit almost any article by simply following the Edit this page link at the top. You don't even need to log in (although there are many reasons why you might want to). The Wikipedia community encourages you to be bold in updating pages. Don't worry too much about making honest mistakes — they're likely to be found and corrected quickly. If you're not sure how editing works, check out how to edit a page, or use the sandbox to try out your editing skills. New contributors are always welcome. +A.0u 22:54, 12 May 2007 (UTC)

Microsoft Passwordchecker removed

I removed the Microsoft Passwordchecker from the External links section.

Reason: You should never ever give your password to a third party on the internet without a really really really good reason.

We could add a comment to the link, something like "don't send your real password to Microsoft!", but that is WP:BEANS.

Fantasy 06:45, 10 May 2007 (UTC)

Are you sure that this is a WP:BEANS case? The article was talking about things that people would be tempted to do. I can see why someone would be tempted to crash Wikipedia, but sending your password to a third party is not exactly tempting. I might be wrong, I would appreciate commentary on this issue.

Zaglith 03:42, 7 June 2007 (UTC)

This is not a WP:BEANS issue. Computer security is extremely difficult, is widely misunderstood, and users have some (commonly distorted) sense of some of this. They routinely avail themselves of anything which may, even if improbably from a more fully informed perspective, help with this murky stuff. In addition, there are many, not so honorable, 'advisors' willing to offer such advice, often with impressive looking or sounding presentations. Distinguishing between the incompetent (or fraudulent) advisor and those offering something solid is nearly always beyond the ability of the average user. This article takes the position that information should be provided directly to the reader which will strengthen their sense of what's underlying the entire issue. In that context, it is worth noting that sending a password across the public internet is not secure, even if it's headed toward a Microsoft password advising site. ww 02:09, 9 August 2007 (UTC)
The Microsoft password checker does not send anything over the internet, but rather downloads an applet that checks the passwords locally. Unless Microsoft lies about that, there should be no security risk in trying real passwords there. -- Meni Rosenfeld (talk) 11:45, 10 August 2007 (UTC)
Alright, let me understand this. You download an applet from somewhere (that it's Microsoft has little weight, as they have committed major security blunders and been the victims of major security attacks which affected their users in the past -- the track record is splotched at best), and give it your password. You do not control this applet, have not seen its source code, do not really know what it does, and are willing to entrust the security of your
  • computer system (assuming it's an admin password) or
  • your finances (assuming it's an online banking password) or
  • national security (assuming it's a password to some supersecret uber important official computer) or
  • your company's future (assuming it's an admin password to your company's most important research/marketing/product database)
to that applet. I'd say there's a security problem, whether or not a first pass guesstimate says so, and that's without even discussing whether some attacker, totally disinterested in you (your company, your country, ...), is out collecting passwords just for fun.
That old saw about even paranoids having enemies is funny, but in a threat environment evaluation exercise, it's simple sanity. Ross Anderson's Security Engineering (Wiley and available online) is an enlightening read. ww (talk) 19:31, 2 January 2008 (UTC)

I concur that this isn't a WP:BEANS issue. Some external testing sites are useful for testing types of passwords, after all! kencf0618 07:52, 25 August 2007 (UTC)

entropy

The information entropy figures given don't seem to make any sense, so I'll remove them. (Can you really encode '1382465304H' in just over 3 bits?) My guess is that these numbers are in bits per character, which may be relevant to compressing large files, but not to the strength of passwords. The total entropy of a password/passphrase is much more important, that is why longer passwords are generally stronger. Feel free to put entropy back in, but please:

  • consider what statistic is most appropriate for this article
  • explain exactly what the numbers refer to, and
  • provide enough detail for other editors to reproduce your calculations.

I would recommend also adding an example passphrase of about a dozen words, with entropy indicated for comparison.
Townmouse 20:14, 15 September 2007 (UTC)

Information entropy is a practical, over-all measurement of a password's strength (is it sufficiently intractable?), so I didn't include the mathematics (why is it intractable?). All of the strong passwords listed had an information entropy, at least according to CertainKey's site, of more than 3. I thought this useful information.kencf0618 03:36, 18 September 2007 (UTC)

I appreciate your effort, kencf, but I'm strongly with the person who originally removed the numbers, and am removing them again.
First, programs like this are inherently not an authoritative measure of the entropy in a password. The operative measure of "entropy" here is, "how many guesses will the attacker need to find your password?" If the attacker's cracking program has knowledge about common passwords that isn't wired into your strength meter, the meter can give a false sense of security.
As if to prove this point, CertainKey's meter thinks two of the "weak" passwords in this article (p@$$\/\/0rd and 0-375-72626-8) are strong -- specifically, it says they would take over 100,000 days to crack.
Second, endorsing an unverified password meter by citing it is worse than not citing one at all. At best, it gives a false sense of security. At worst, it could be a phishing scam. (See [1].) Yes, CertainKey claims the password doesn't leave your computer, but a phisher would make the same claim.
Third, there's a units problem. You pasted the entropy number in without a unit; the app itself claims the entropy is in "bits", but it appears to mean bits per symbol. That's not the right measure of password strength; total entropy is what we really want.
Fourth, even if none of the other problems raised here applied, we'd need citations other than CertainKey's own marketing before we could treat its results as verified enough to cite. See WP:NOR. (Update: Also see WP:SPS)
Sorry for being so insistent -- again, I appreciate the effort to add useful info, I just don't think we can cite this or a similar password strength meter's output this way. 75.24.110.207 (talk) 03:39, 30 December 2007 (UTC)
Your analysis is well taken. The article does need further parsing; there are too many technical details which need to be put into too many contexts, etc. The likes of CertainKey are just one possible puzzle piece in this particular editing puzzle – it's certainly no magic bullet! (Update: It's also a WP:Beans issue, come to think of it.) kencf0618 (talk) 05:16, 30 December 2007 (UTC)
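To illustrate the units problem raised in this thread (bits per symbol versus total bits), here is a minimal sketch. It assumes each character is chosen uniformly at random from a fixed alphabet, which is the only case where this simple formula applies; it says nothing about human-chosen passwords or about how any strength meter computes its numbers. The function names are made up for the example.

```python
import math

def bits_per_symbol(alphabet_size: int) -> float:
    # Entropy contributed by one uniformly random character.
    return math.log2(alphabet_size)

def total_bits(length: int, alphabet_size: int) -> float:
    # Total entropy grows linearly with the number of random characters.
    return length * bits_per_symbol(alphabet_size)

# Eleven random decimal digits: a little over 3 bits per symbol,
# but a much larger total once the length is taken into account.
print(f"per symbol: {bits_per_symbol(10):.2f} bits")
print(f"total:      {total_bits(11, 10):.1f} bits")

# The same comparison for a few lengths over the 95 printable keys.
for length in (8, 12, 16):
    print(length, "chars:", f"{total_bits(length, 95):.1f} bits")
```

Under these assumptions a per-symbol figure of "just over 3" is what a string of digits would give, and the total, not the per-symbol value, is what changes as the password gets longer.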


I've added a fairly large section on entropy. In order to effectively explain password strength, one must talk about entropy. I've sourced most of it from the National Institute for Standards and Technology, which I think is about as unassailable a source of information as exists! It looks like the past problems have revolved around how people have included entropy, without good sourcing, in particular regarding some vendor/neutrality issues. I hope this new data puts that to rest, because really, we have to lay out entropy to have this definition make sense. --Musides (talk) 21:37, 27 March 2008 (UTC)

Proposed reorganization

Something I'm seeing here is lots of very specific technical info -- LM hashes, rainbow tables, and specific ways that passwords can be weak. The article includes some theoretical grounding for what makes strong passwords strong (information entropy) and some effort to look at PW strength in the larger security context (problems with shared passwords, etc.), but that's spread throughout a very long article.

So I think I'd like to consolidate some theoretical stuff at the top, make the core of the article useful to folks trying to set good password policies, and close by putting this in the larger information-security context. Detailed information about LM hashes and rainbow tables should probably be in other articles, and linked from here rather than included. I'm not sure what should happen to the list of ways that a password can be weak; I feel like we should have just a brief summary here, and go in depth elsewhere (Password cracking?).

FWIW, here that is in outline form.

  • Definition and theory (information entropy, etc.)
  • Effective password-protection policy
    • Choosing strong passwords (good methods and bad)
    • Preventing password disclosure
    • (etc.?)
  • Password strength in context
    • Threat models, difference between encryption keys and login passwords
    • Other system vulnerabilities
    • (etc.?)

That's all just a dream at this point, but I'm interested in anyone's comments on it. —Preceding unsigned comment added by 75.24.110.207 (talk) 04:18, 30 December 2007 (UTC)

I've started reorganizing. 69.181.124.144 (talk) 23:15, 30 December 2007 (UTC)

I'm going to suggest that you set up an account, as it seems you are editing from two (so far) IP addresses.
The internal WP mechanisms work much better then.
In addition, I applaud your impulse to clean up an article which has accumulated quite a bit of cruft -- some of it quite dubious/wrong/unreasonable.... But I'd note that, as this is supposed to be an encyclopedia article, we ought to concentrate on understanding in the mind of the reader rather than concentrating on technical precision or historical completeness or some such. We are attempting to increase understanding, not provide a step by step procedure for creating/using/whatever passwords. I think the effort should be to bring the reader up to some sort of speed about the context in which passwords are used (ie, the eternal battle betwixt the good guys and the bad guys, with users accidental, unwitting, if inevitable participants in that largely invisible struggle), rather than, as now, dribble this or that collection of facts about passwords under their unprepared noses. Those facts and such become mere red herrings, and a detriment to the reader's nascent understanding of fundamental issues. For instance, Kline's studies of password vulnerability to automated guessing deserve prominence as an illustration of the minefield in which users find themselves. ww (talk) 19:12, 2 January 2008 (UTC)

Removed content missing citations

This is really interesting stuff, but it's been here without a citation since May 2007, and I couldn't find a non-Wiki source with a quick search. If anyone can find one, sweet.

In large password files the awareness of password strength may be roughly measured by the time it takes to crack 50% of the password hashes present. With a large password file, using a standard Pentium 4 system, Microsoft password hashing techniques and an under-trained user group making password choices, it may be safely estimated that 50% of the password hashes will crack in less than two minutes.[citation needed] However, with careful attention to password strength advice, password choices can outlast the patience of a password cracking effort.
User training versus a cracking time metric for large user base password files:[citation needed]
User training actions        Metric: 50% password hashes cracked
No Training                  Time ≤ 2 minutes
Sporadic User Training       Time ≤ 4 minutes
General User Training        Time ≥ 45 minutes
Focused User Training        Time ≥ 24 hours
Motivated User Population    Time ≥ 1 week
Random Password Group        Time ≥ 2 weeks
Contest Winners              Time ≥ 1 month
Note: times are based on reports from authorized password cracking efforts that used John the Ripper, without a dictionary, on Microsoft LM Hashes, run on a Pentium 4, 2 GHz system. Only results from password files containing greater than 3,000 password hashes are included.[citation needed]

69.181.124.144 (talk) 23:22, 30 December 2007 (UTC)

Just gave the same treatment to the claim that World of Warcraft has an extremely secure login screen.69.181.124.144 (talk) 23:22, 30 December 2007 (UTC)

95^10 possibilities

I'd like to flesh out the 95^10 possibilities line. What is that in terms of the powers of two? As a rule of thumb, the longer the password, the better the combinatorial explosion, but the sense of scale could use some more color. kencf0618 (talk) 03:32, 31 December 2007 (UTC)

It's about 2^66. (Type log(95^10)/log(2) into Google.) Changed the sentence:
A ten letter password such as '4pRte!ai@3', because there are about 95 keys available, is one of 95^10 (or about 2^66) possibilities, which would take approximately 632,860 years to be found assuming purely random possible password generation.
Fair warning: I might mess with this section more soon (as in the next few days). No plans set in stone yet. —Twotwotwo (talk) 06:35, 31 December 2007 (UTC)
The permutations probably deserve inclusion in Orders of magnitude (numbers) in keeping with their (everyday) ludicrous computational intractability. The venerable and infamous Rubik's Cube is in the 10^18 range, whereas a simple 95^10 password is already at 10^20 and 95^15 is in the 10^30 order of magnitude. A standard keyboard and good password practice arguably have more important permutations than those of a deck of cards! kencf0618 (talk) 06:35, 1 January 2008 (UTC)
I've included 95^10 in the 10^18 section of Orders of magnitude (numbers) and linked us. Correct my mathematics as needed, of course! kencf0618 (talk) 18:28, 1 January 2008 (UTC)
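The arithmetic in this thread can be checked with a few lines. This is only a sketch: the constant names are made up, and the guess rate is a back-calculation from the quoted 632,860-year figure (assuming it means trying every possibility, with 365-day years), not something stated in the article.

```python
import math

KEYS = 95       # printable keys assumed in the quoted sentence
LENGTH = 10     # characters in the example password

possibilities = KEYS ** LENGTH
bits = math.log2(possibilities)      # same calculation as log(95^10)/log(2)

# Back-calculation (an assumption, not from the article): the guess rate
# implied if 632,860 years is the time needed to try every possibility.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60
implied_rate = possibilities / (632_860 * SECONDS_PER_YEAR)

print(f"{possibilities:.3e} possibilities, about 2^{bits:.1f}")
print(f"implied rate: about {implied_rate:,.0f} guesses per second")
```

If that back-calculation is right, the quoted duration corresponds to roughly three million guesses per second, so the figure scales directly with whatever guess rate one assumes.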

Time needed for password cracking

In the following sentence (in the section "Time needed for password searches"):

But as a practical matter, password strength may meet its strength objectives if the time needed to crack the password exceeds the time investment for cracking the password and/or if the protected information becomes obsolete before a cracking effort completes.

The phrase, "the time needed to crack the password exceeds the time investment for cracking the password," looks like a tautology to me.

I think what is intended is something along the lines of, "the value of the time needed to crack the password exceeds the value of what is accessed by cracking the password." Does this make sense? Or am I misunderstanding the intent? --Preceding unsigned comment added by STarry (talkcontribs) 20:03, 30 January 2008 (UTC)

Why is a section on the time required to crack passwords in an article about password strength? There are many ancillaries to a particular topic such as password strength, but that doesn't mean that an article on the subject should be full of such ancillaries. --Musides (talk) 17:32, 27 March 2008 (UTC)

Wikipedia password limits

Wikipedia's page for logging on has a link to the article "password strength". I have the following questions:

  • What is the set of all characters acceptable for use in a Wikipedia password? (Does it include: Greek letters? Russian letters? Armenian letters? Devanagari letters? Japanese characters? Korean characters? mathematical symbols? webdings?)
  • What is the maximum allowable length for a Wikipedia password? (Does one Chinese character count as much as a Latin letter?)
  • Does Wikipedia truncate the password after a specific number of characters, and disregard the rest?
  • Why are the answers to these questions not visible on the page for logging on? -- Wavelength (talk) 00:24, 6 February 2008 (UTC)
I just changed my Wikipedia password using KeePassX 0.2.2 (new version 0.3.1! with AutoType!) with maxed out password strength, which has the following options:
1000 characters length
Upper letters
Lower letters
Numbers
Special characters
White spaces
Minus
Underline
higher ANSI characters
KeePassX reports a password generated with these options as either 8000 bit or 12816 bit quality, either way Wikipedia happily accepted it. Slark (talk) 07:39, 20 March 2008 (UTC)

Case sensitive passwords

Changed the password examples here to "better" ones. I selected them as "better" on the basis that: 1) They avoided using words and common phrases ("quick brown fox"), so avoiding the issue of how that affects password entropy 2) The two passwords clearly have similar entropy. One is eight characters, one is the same eight characters at one bit-per-character less entropy, plus another two characters with somewhat more than four bits per character (26 possibilities). Since it's not a round number of bits per character I couldn't use two that had the exact same entropy, but they're close. This makes comparison of memorability easier, which is the point of that section I think. —Preceding unsigned comment added by DewiMorgan (talkcontribs) 19:03, 21 February 2008 (UTC)
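The entropy comparison described above can be sketched numerically. This assumes, for illustration only, that the eight characters are letters chosen uniformly at random, so dropping case sensitivity halves the choices per character (52 down to 26), and that the two extra characters are random lowercase letters; the variable names are made up.

```python
import math

# Dropping case: 52 letter choices become 26, one bit less per character.
bits_lost = 8 * (math.log2(52) - math.log2(26))

# Two extra lowercase letters, each with 26 possibilities.
bits_gained = 2 * math.log2(26)

print(f"lost by ignoring case on 8 letters: {bits_lost:.1f} bits")
print(f"gained from 2 extra letters:        {bits_gained:.1f} bits")
print(f"net difference:                     {bits_gained - bits_lost:+.1f} bits")
```

Under those assumptions the two examples differ by less than two bits, which fits the description of them as close but not exactly equal.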

Major clean-up needed!

Most of this article is *not* about password strength. Some sections are about cracking, others about password policy, and so on. I suggest that all sections including "Choosing, remembering, and using strong passwords" and beyond be moved into other articles. I will make a case for each section if necessary. I'm not 100% clear on how to use the "merge" syntax for sections of an article.

One further note. In the intro, I don't think the sentence starting "Strong passwords can still be exploited by..." is appropriate for an introduction. It doesn't seem to fit because it is a broad statement of opinion, and while it is true, I don't see how it has anything to do with password strength!

M, The situation is not so conceptually pristine as you would have it. Strength, in the context of passwords, is like all computer and crypto security, dependent on context. Classically, if your only opponent is your kid sister, just about any password will be strong. Though if she's a young Emmy Noether or Hypatia... If your opponent is the NSA, or GCHQ, or Moscow Centre, the situation is quite different. They may use attack techniques unknown to the public against which particular levels of strength would be insufficient. And the classic (and wrong-headed) plaint that 'no one like that is interested in me' is simply silly since no one can be sure, in a networked world, who is attempting to break one's security, including passwords. It could be an attack bot started by a script kiddie, the Mallory level equivalent of Ken Thompson acting privately or for hire, or it could be a more clueful than usual private eye sort attempting to find evidence of your affair with someone about whom your spouse is concerned.
So your objection to coverage of potentially useful attack techniques is inapposite. No Gentle Reader will acquire from this article a sense of the issues underlying the somewhat reified concept of password strength if all these issues are elided from the article. The article should address common misconceptions about password strength (lots of urban legends thereabouts) and explain why. This leads to more than the taut and concise article you seem to have in mind, but WP articles describe reality, not a Platonic ideal one.
I oppose your program on these grounds. Improve the writing, please, if it needs it (why are tech types near functional illiterates??). Or the logical organization of this or that. But do not remove useful and encyclopedic content. ww (talk) 20:42, 1 April 2008 (UTC)
Your discussion on "context" is a discussion about subjectivity versus objectivity. Subjective discussions are not particularly useful for an Encyclopedia. Yet, is it possible to objectively talk about password strength? An objective approach is not "pristine" or beyond critique, as you suggest, it is merely an attempt to explain reality. One way to resolve your critique is to modify the bit-strength threshold section: NIST may recommend 80-bits, but perhaps you can find a source that says 10-bits is just fine for certain applications. That works fine -- but don't throw the baby out with the bathwater.
Regarding your defense of various attack techniques which include dumpster diving: explain to me how this relates to password strength? We both understand how these relate to security, but password strength? This is part of the reason this article is such a mess: it tries to cover cracking, policy, etc, all at once.
We both agree that covering myths about password strength is a good idea -- do you have sources for this that haven't been covered already? Regarding the other sections of the article, are you defending every section as being relevant? For example, can you explain why "Time needed for password searches" is more appropriate here, than it is in an article about cracking, which covers this subject with much better depth, quality, and sources than the poor attempt made here?
Btw, your references to being "taut" and "platonic" are not only misplaced, but they are personal, and detract from what matters here: a discussion about the actual subject. Let's not get into name calling, okay?


--Musides (talk) 22:26, 1 April 2008 (UTC)
Speaking of context, I think it would be helpful if we could agree, in the overall context of: (a) security and (b) passwords, that the particular subject of "password strength" is fairly particular and narrow. In other words, it is only about the strength of passwords and: how strength is measured; how to compose strong passwords. As an encyclopedia, it seems safe to let other terms, such as password cracking, talk about how password strength is compromised. If we can agree on this basic principle and goal for this word, then I think it is possible to clean up this article. If we can't, then it will remain a mess.
--Musides (talk) 22:36, 1 April 2008 (UTC)
Clearly, there's a more or less complete disconnect between us, as well as between our positions.
Taut means tight, and in the context of prose writing, implies careful composition, minimal discursion, little rhetorical flourish, great clarity, ... in the prose. Platonic refers to Plato's philosophy which was built around the conviction that only ideals had reality and that practical reality was an illusion. Neither usage has any aspect of name calling or insult. Certainly not in this context, though I suppose I can imagine a Post-Modernist who... Well, never mind about that.
As to subjectivity vs objectivity, we are here enjoined by WP to describe things as they are. If lots of system administrators subjectively advocate, for subjective reasons, that passwords have upper and lower case letters, we can report that, as long as we can cite a source. By so doing we have achieved WP style objectivity. Even if you, or I, think it's wrong. Your particular view on the subject, if not so backed up, is subjective and, because it's original research, is not acceptable on WP. Editors are enjoined to achieve consensus while editing; a terminal case of lack of consensus is grounds for referral to peer review etc.
As for your suggestion that we agree password strength is a narrow subject and should be left to a discussion of password bit length, it is precisely this which I suggest would be a disservice to our Readers, for whom such a discussion would likely be opaque if the Reader is not already an expert in the field, or alternatively (in the untypical case, the skill being uncommon and still less actually used) a Reader willing and able to chase through several other WP articles to acquire the necessary context for understanding.
And no, I'm not defending any particular section or paragraph here. Instead, I've been attempting to settle a logically prior difference to avoid editing conflicts, since I realized the degree of difference between our positions.
Do you think the distance between consensus and our situation is such that we should refer the conflict here to the Arbitration Committee? Or perhaps peer review? Comments? ww (talk) 01:49, 2 April 2008 (UTC)
He he he, well, it just goes to show the nature of text communication. :) I come from a different place philosophically, where platonic is indeed an insult, and taut has a far different connotation. My fault, I've clearly been a bit too subsumed in such a world. C'est la vie, nice to know it was meant well.
Regarding our goals here: I didn't suggest we limit ourselves to bit strength. Instead I'm suggesting that we define some kind of limitations, which we obviously agree on doing. The next step then, is what is a proper delimiter? Is everything regarding the subject of passwords appropriate? Stated in another way: what about passwords is not appropriate in an article on password strength? Because, when talking about strength in particular, one could make a case for any aspect of passwords being pertinent: naturally, since strength is a subset, there are various and sundry supporting components. One can, as we have here, go beyond just passwords and talk about security in general. In this sense, we could talk about social engineering, physical security, and on and on.
Regarding your logic of readers and services to them, I don't fully follow. If we cannot expect readers to check references to learn more, to explore other portions of the Encyclopedia where they need to improve their knowledge, then I think that is a bit of a disservice to Wikipedia, and the capabilities of readers, isn't it? I certainly support not being erudite or assuming/requiring a certain level of knowledge in an article, but that is one thing, and supporting information is another I think.
Finally, ww, I'm still not entirely sure of your basis for supporting all the content in this article, or put another way, I'm not sure we've explored that completely. You say, for example, that you aren't defending anything in particular. In that case, I'm not sure what your objection is to my efforts to clean this up by removing some sections which I feel are clutter. To take a different example, how is a discussion on "guarding user passwords" appropriate for an article on password strength? Hopefully, through continued dialogue, we can zero in on the issue and try to produce a better end result here. :)
--Musides (talk) 03:17, 2 April 2008 (UTC)

<-- I'll try to clarify what seems to have been my unclarity. Our Gentle Readers are not to be presumed to be specialists in the topic of an article. Hence we are not writing for the cognoscenti, but for the Average Joe. Furthermore, we are not writing for specialist scholars either, who can be trusted to (or should be trustable to) monitor their own learning and to follow up on links and references, assembling an understanding of a topic from various sources. And furthermore to withhold assent to some claim until sufficiently confirmed from primary sources and maybe secondary sources. In the absence of such Readers, we editors are obligated to provide something more like a capsule view of the topic, if possible, somewhat able to stand alone if the article is the only thing consulted, as is likely the case. We are not constrained (much) by space considerations (eg, 1 printed page maximum or some such) being a virtual encyclopedia, except by more or less arbitrary community standards of (length, nr of illustrations, ...). In the case of this article, password strength being an important topic in an era of security breach after security breach, we should, if it is possible, provide a view sufficient for our assumed Gentle Reader to make sensible evaluations of comments made. Since adequate computer security is completely contingent on the attack environment, this logically requires that we explain (briefly to be sure) the background behind the usual recommendations. And, because the figure is not the ground and the perception of one is often blocked by the other, we should also deal with urban myths of password strength, explaining why they are so. Our Gentle Reader should be able to leave the article with a working understanding (including the context for this or that password strength verity) of the issue and sufficient to avoid the more egregious errors which plague the field. 
It's for this reason that the context you've been trimming assiduously is not mere surplusage to be excised, but a service to the Readers of this article. I trust this helps make clear my perspective. ww (talk) 19:01, 8 April 2008 (UTC)

ww, your comments are extremely vague. Can we talk about this article? Do you even agree with the heading of the article, that cleanup is required? If you do, can you identify the sections that you think need clean up? Then we will have something concrete to discuss.
--Musides (talk) 19:45, 9 April 2008 (UTC)
M, Sorry you perceive vagueness after two tries. However, after the back and forth here, I actually attempted to make a few edits. You have essentially reverted them in toto leaving at least one edit summary suggesting topic ignorance on my part, which suggests to me that the disagreement here is deep. In addition, your phrasing is technically incorrect in several places, possibly in an attempt at brevity.
I agree that we should talk, though since there has been quite a bit here thus far, perhaps we ought to invoke a peer review, or ask the folks in the crypto corner for some 3rd party assistance? ww (talk) 01:42, 10 April 2008 (UTC)
Thanks for the email offline. I think we have some room for improvement in how we work together. I think we can make some progress here, or over email, as opposed to edit wars. That said, when I saw you made edits without referencing them here, I tried to take that in the best light, and incorporate your ideas while staying true to the referenced sources. This is the second time you've mentioned intervention. I think we have plenty of opportunity to work together on this, but if you think a peer review is our best path forward, then what can I say? --Musides (talk) 02:49, 10 April 2008 (UTC)
I've done some additions today, let's see where we stand. Note that, with the exception of the introduction, everything I have done so far is purely new material. All the new material comes from very strong sources (NIST, Gartner, Schneier) that have doubled the number of references in this article. I think we are on the right track. Hopefully this will build up enough good faith for us to be able to clean-up the older parts of the article. --Musides (talk) 20:19, 10 April 2008 (UTC)
Having now a bit of time, I'd like to attempt a resolution to a statement or two I've corrected and which have been reinstated. Strength of encryption is not measured by bit strength of key; indeed there is no measure for it aside from induction from failure of serious attempts to find weaknesses, a point Schneier makes in many places including Crypto-gram. The doghouse entry citation has more to do with attempts at marketing flim-flam and so is a psychological point, not a cryptographic or security one.
A very long key for a low quality encryption algorithm does not increase the strength of the algorithm, and so does not increase the strength of the encryption. What high bit strength "for" an encryption algorithm key does, just as for passwords, is to increase the difficulty of a brute force attack, and only that difficulty. Even so, it does not increase that difficulty if the key or password is chosen poorly (ie, with low entropy). There are other possible attacks to which a high entropy key may be quite vulnerable.
In addition, there is a difference, technical and tricky but very important, between entropy and randomness. Several recent edits and reversions have confused this difference, to the likely confusion of those Gentle Readers not fully up to speed on these subtle issues (we must write on the assumption that this is our Reader, not an expert on the difference). To make it clear, let me give an example. The sequence of digits in pi (3.1415...) is random (or at least as far as is known), as there is no method of predicting the next digit with any greater than chance results, aside from calculating it directly. These digits pass all the statistical tests for randomness that have been used (as of my last inquiry). Yet that sequence has very low entropy, and any portion of it would be a poor choice as an encryption key or a password. Our Gentle Readers should benefit from our very serious and sustained efforts not to confuse the two meanings. This is, in a conceptual sense, a technical article. In fact, it would be worthwhile here to make the difference as clear as possible (as was, IIRC, once done here) so that this portion of the context for password strength be made patent for them. ww (talk) 18:27, 11 April 2008 (UTC)
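
The pi example above can be made concrete. What follows is a minimal sketch, not taken from any source cited in this discussion: it hard-codes the first 50 decimal digits of pi and assumes, purely for illustration, that the attacker has guessed the password is an 8-digit run from that sequence. The names `PI_DIGITS` and `windows`, and the chosen offset, are hypothetical.

```python
# First 50 decimal digits of pi (after the leading "3.").
PI_DIGITS = "14159265358979323846264338327950288419716939937510"

def windows(digits, length):
    """Every contiguous run of `length` digits, in order of starting offset."""
    return [digits[i:i + length] for i in range(len(digits) - length + 1)]

# A "password" that looks patternless: 8 digits starting at offset 10.
password = PI_DIGITS[10:18]

# Assumption: the attacker suspects pi and simply tries each offset in turn.
candidates = windows(PI_DIGITS, 8)
attempts = candidates.index(password) + 1

# An attacker with no such suspicion faces every 8-digit string.
blind_search_space = 10 ** 8

print(password, attempts, len(candidates), blind_search_space)
```

Under that assumption the attacker has a few dozen candidates to try rather than 10^8, even though the eight digits themselves show no obvious pattern. The digits pass for random, but the choice carries very little entropy against this attacker.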

encryption bit strength

Regarding encryption bit strength, this article is about passwords, and that is the method our sources use. In any event, fwiw, bit strength is simply one measure of the strength of encryption. Encryption, of course, is quite a different subject from passwords. We seem to agree on this, of course, because encryption algorithm is critical to strength (hello 128-bit WEP), etc. We don't agree on brute force -- this is where I think you misunderstand. NIST does a pretty good job of distinguishing this difference. Since password bit-strength is a function of human entropy, crackers first and foremost guess passwords -- and bit strength does a pretty good job of reflecting this. Secondly, a dictionary attack, again, entropy. Thirdly, pattern testing is employed, which you could loosely equate to encryption algorithm cracking -- but as you know, patterns for password generation are well known and extremely weak. Again, the whole point of the NIST (or Gartner if you prefer) entropy data is to account for patterns. Regarding pi, we certainly agree. How about we make a section on "randomness and entropy"? --Musides (talk) 18:51, 11 April 2008 (UTC)

Disagree absolutely on bit strength being a measure of the strength of encryption. You are in error in this respect.
I disagree also on your comments about brute force. You are correct that the usual attacker (I suppose, but don't really know) will follow the sequence you note. But this is not really relevant to this article. Attackers do what they do, we can't in principle know what it is that they do. Worth a comment perhaps, but not to be reflected in the article structure, as that is merely confusing to the Average Joe. NIST's recommendations make a practical estimate only, and we should not write as though such an estimate procedure is definitive or true. Again, worth a comment, or use as an example, but not more. Certainly not as an authoritative account. It's an estimate, purely heuristic and nothing more.
I'm glad we agree on pi, but the distinction between randomness and entropy bears very heavily on the urban legendry surrounding password choices and so the distinction should not be confined to a section. Otherwise, I'm for it. We can steal organization and some content from several other articles, I suspect.
I'm pleased to see that you did not revert my edits wholesale this time. The stylistic points you made are obiter dicta, by and large, and though I disagree on some of them, they're not important in the current situation. Much more important are issues of content, coverage, contextual contingency, in my view. ww (talk) 20:11, 11 April 2008 (UTC)
So you do not believe there is a difference between 128-bit AES and 256-bit AES; or a more practical example: 56-bit DES and 72-bit DES? I don't see how you can sustain the notion that "bit strength [is not] a measure of the strength of encryption". In any event, that is a bit OT.
We agree the NIST data is estimated, though it has been empirically tested. As with any scientific method, it isn't a "fact" nor is it "proof"; it is accurate until empirical data forces an alternate method. You are welcome to cite alternate methods -- I'm not aware of any. Calling their method heuristic, however, is quite unfair and inaccurate. --Musides (talk) 21:10, 11 April 2008 (UTC)
You are confusing key length with encryption strength. Some secret key algorithms allow different key lengths, like those you cite. Blowfish is still more liberal. During operation, there is an operation called key scheduling in which the keys used for each round are generated. It is they which control the encryption. It may be, that given a particular key scheduling algorithm, an encryption algorithm may be harder to break with one length key than another, but this is irrelevant to the case of passwords, in which analytic breaks of the type noted here (and on which you rely implicitly) are irrelevant to the attack being discussed (also generally implicitly), namely brute force search. One is an issue of encryption strength against analytic attack, one is an issue of brute force key search attack, which is a quite different kettle of fish. Confusable conceptually, obviously, but quite different.
NIST's estimate was heuristic as it was a rule of thumb, with some support from analysis. At least one meaning of heuristic is 'rule of thumb'. It's not a scientific issue really since I can't think of a test of their hypothesis in the current security environment which might falsify it. It won't apply to other languages as I understand it, or if so only by accident and so is parochial.
Which brings up a deferred question. How is it that your milieu takes either or both of taut and Platonic as insults? ww (talk) 21:50, 11 April 2008 (UTC)
Re: encryption, I think we are splitting hairs, so I'll let the matter rest. We both agree encryption and passwords are two different things. I agree that making any kind of relation between the two is more confusing than it is helpful.
Re NIST: You make a strange statement. Falsification is quite easy. Just take a sample of passwords, rate them based on these criteria, and then use different cracking methods. The NIST statement of 18-bits is quite easy to falsify in a real world experiment. Now, sure, you can call a hypothesis a rule of thumb I suppose, but let's not get into semantics, because that doesn't mean there isn't science behind their method.
Re Plato: Platonic is a pretty well discredited school of philosophical idealism, and idealism as a whole is particularly antediluvian. Labels can take on different meanings, for example, when people invoke Nietzsche. So, there are the popular views, and the philosophical critiques. By way of analogy, imagine saying to someone "that is very libertarian of you", or "that was a very catholic thing to do". These labels have different meanings, both to people contingent on their places politically or theologically, but also to the agnostic or apolitical. Oh, and "taut" can be read as shorthand for "tautological" in philosophy speak, which is a pretty bad thing. :)--Musides (talk) 22:23, 11 April 2008 (UTC)

Outline proposed by Musides

Keep in article:

  • 1 Bit Strength
1.1 Calculating Bit Strength
1.2 Bit-Strength Threshold
  • 2 Guidelines for strong passwords
2.1 Examples of weak passwords
2.2 Examples that follow guidelines
  • 8 References
  • 9 External links
9.1 Password generators

Merge with password article:

  • 3 Choosing, remembering, and using strong passwords
3.1 Random passwords
3.2 Mnemonic passwords
3.3 Patterned passwords
  • 6 Guarding user passwords

Merge with cracking article:

  • 5 Time needed for password searches
5.1 Rainbow tables
  • 7 Password discovery

Delete:

  • 4 Case-sensitive passwords

--Musides (talk) 21:48, 27 March 2008 (UTC)

Removal of "case sensitive passwords"

I would like to remove this section. Firstly, it contains no citations, and states several dubious facts. Secondly, it contradicts most of the sources that we do have. In particular:

It is often argued that

These are Weasel words.

hurt usability and cause users to write down passwords without substantially improving password strength.

No source that we have supports this. Sure, some sources admonish written down passwords, but they don't suggest that affects the strength of the password.

In theory, case sensitivity nearly doubles the number of characters available for passwords.

Inaccurate: dependent on context. If using only lowercase, it exactly doubles, not nearly. Out of 95 characters, uppercase represents only 27% of the total.

Doubling the size of the alphabet makes each symbol one more bit of information, thus adding one bit of information per character.

No, that isn't how exponential math works.

I could go on, but this should be sufficient? --Musides (talk) 17:17, 11 April 2008 (UTC)
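
For reference, the arithmetic behind the quoted sentences can be checked directly. This is a small sketch that assumes each character is chosen uniformly and independently; the 26, 52, and 95 character alphabets are the ones mentioned above, and the helper name `bits_per_char` is made up for illustration.

```python
import math

def bits_per_char(alphabet_size):
    """Bits of information per character under uniform, independent choice."""
    return math.log2(alphabet_size)

lower_only = bits_per_char(26)   # lowercase letters only
mixed_case = bits_per_char(52)   # lowercase plus uppercase
printable = bits_per_char(95)    # the 95-character set discussed above

gain_from_case = mixed_case - lower_only   # effect of doubling 26 to 52
uppercase_share = 26 / 95                  # uppercase as a fraction of 95

print(f"26 symbols: {lower_only:.2f} bits per character")
print(f"52 symbols: {mixed_case:.2f} bits per character")
print(f"95 symbols: {printable:.2f} bits per character")
print(f"gain from adding a second case: {gain_from_case:.2f} bits per character")
print(f"uppercase share of 95 characters: {uppercase_share:.1%}")
```

Under these assumptions, going from 26 to 52 symbols adds log2(2) = 1 bit per character, and going from 52 to 95 adds a little under one more bit. Whether users actually choose characters uniformly is a separate question that this calculation does not address.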

I agree that this is sloppy writing. I've had it on my to-do list for quite a while. So much to-do, so little time.
But, I suggest that the issue of allowing both cases is relevant to increased brute force resistance, and that this point is worth making for the benefit of the Gentle Reader attempting to decide on a password or attempting to develop an opinion about the local acceptable password policy for one of his/her systems. I think that this would be best handled by a section discussing such policies and their effect on password choice. Not by a wholesale deletion of this section and its content as punishment for poor writing. ww (talk) 17:58, 11 April 2008 (UTC)
Character selection isn't just a matter of brute force, it is relevant for any cracking method. So, we can make a section on character selection, I suppose, and talk about why it matters in detail. That seems fair enough. --Musides (talk) 18:38, 11 April 2008 (UTC)
By character selection I presume you mean something like universe of choice. If so, I think I agree, but in general I think this subject deserves very little coverage as, once understood, it's trivial and is mostly relevant to acceptable password policies imposed on users. It is the easiest aspect of password choice to discuss, which accounts for the journalistic verbiage devoted to it, almost all skew to actual resistance to brute force attack, and is entirely misleading in concentrating on this one measure of password strength to the neglect of other issues which are harder to write about in a journalistic context. We, not being limited to that context, should do better by our Readers. ww (talk) 20:17, 11 April 2008 (UTC)

<-- The WP ASCII article notes 7 national characters, and I remember writing a paper on ASCII (ca '80) which listed 4 national characters. Without further research, for which I haven't the time just now, I left it indeterminate, which I suspect is correct anyway, given the proliferation of ASCII variants. As for the typical policy of no spaces in passwords, none of the systems I'm currently using allow spaces. I seem to remember that DEC's VMS didn't either, original Unix did not, and DEC's RSX and RSTS didn't either. Some website registration might, I suppose -- not much experience with them -- but this point is not worth a dubious or citation needed flag. No harm, no foul -- there's no violation of WP purity in equity here. There is a useful flag for the Average User's brain that the permitted characters list may be odd. A useful note to make, though not worth much wording. ww (talk) 22:10, 11 April 2008 (UTC)

passwords vs pass phrases

The distinction between these is a distinction without much difference. In one sense a pass phrase is simply a long password with some spaces in it -- generally not permitted by most acceptable password content policies. In another sense, a pass phrase is a generator for an acceptable password. Eg "I was twentyone when I first went to Paris" --> "IwtwIfwtP". A poor password for several reasons, none having to do with its origins as a pass phrase. A recent edit comment suggests that pass phrase issues have no place in this article. I disagree for the reasons given. Worth only a comment or two perhaps, but not exile. ww (talk) 20:24, 11 April 2008 (UTC)

I agree. There are two definitions for "pass phrase", as you've identified. The edit I made was to remove a passphrase as an example of a "good password" -- for exactly the reasons you've just identified. So, I think we both agree on the edit, you just took exception to the comment made describing the edit! :) --Musides (talk) 21:04, 11 April 2008 (UTC)

Removal of "Password discovery"

In the introduction we talk about reasons why strong passwords won't help a user. Devoting an entire section to this in an article about strong passwords is confusing to the average reader. This section isn't only about password discovery -- it is about the various and myriad ways security can be compromised. If addressing these things in the introduction is for some reason not satisfactory enough, then at the very least this section should be renamed "Ways computer security can be compromised". I still don't think that belongs, but let's call a spade a spade. --Musides (talk) 21:41, 11 April 2008 (UTC)

Whatever we call a spade, we are discussing an aspect of password strength, although one some classifiers would find not strictly kosher. Since our Gentle Reader is not likely to care about kosher, we ought to note that a password carelessly handled/stored is a weak password in practice. The length of the current section seems to be an accretion of comments rather than clearly written. Better writing solves the problem you note, I think. ww (talk) 21:54, 11 April 2008 (UTC)
Then it seems to me this section should be renamed "Password Policy", since handling and storing passwords is a function of policy (written or otherwise). My point in this is that if we can figure out what we are trying to accomplish in this section, we'll do a much better job at it. Agreed? --Musides (talk) 22:27, 11 April 2008 (UTC)

Brute force, etc

Concerning password length, characters and entropy, brute force is not the only applicable attack. In fact, it is just the least common denominator. Entropy has no effect on brute force; only length and composition matter. While entropy is the most important defense against guessing and pattern matching, length and composition still matter. Identifying patterns quickly is entirely contingent on the number of permutations being handled. The greater the number of permutations, the more difficult pattern matching becomes. For example, entropy being equal, a pattern contained in a 4 character number is far easier to discover than a pattern contained in a 15 character alphanumeric plus symbols.--Musides (talk) 22:58, 11 April 2008 (UTC)

M, 96 x 96 ... (for 8 characters) gives the search space for maximal ASCII, which is 96^8. With 10 choices (for one character), 52 x 52 ... (for 5 characters), which is 52^5, and 33 x 33 ... (for 2 characters), which is 33^2, the total universe of choice is 10 x 52^5 x 33^2 <<<< 96^8. Thus a password patterned in proportion to the proportion of characters in the ASCII set has rather less than the maximum entropy of an 8 character password chosen from the ASCII character set.
Your reversion is in error.
In fact, I think that the point is not necessary to even approach, though it is worth noting that it is easy to mislead oneself about the entropy of a particular password design algorithm. So easy that one should probably not even attempt it. Though this may be too much POV (unless we can find a source) for WP. ww (talk) 01:37, 15 April 2008 (UTC)
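
The comparison in this comment can be reproduced with exact integer arithmetic. This is a sketch taking the 96, 10, 52, and 33 figures as given above; the final step, which lets the digit, letters, and symbols fall in any positions, is an added assumption not made in the comment.

```python
from math import factorial

# Any of 96 characters in each of 8 positions.
full_space = 96 ** 8

# Pattern as stated above: 1 digit, 5 letters, 2 symbols, with positions fixed.
patterned_fixed = 10 * 52 ** 5 * 33 ** 2

# Assumption (not in the comment): the pattern fixes only how many characters
# come from each group, so the groups may be arranged in any order.
arrangements = factorial(8) // (factorial(1) * factorial(5) * factorial(2))
patterned_any_order = patterned_fixed * arrangements

print("unrestricted:         ", full_space)
print("patterned, fixed slots:", patterned_fixed)
print("patterned, any order:  ", patterned_any_order)
print("unrestricted / patterned (any order):", full_space / patterned_any_order)
```

Even with the more generous any-order assumption, the patterned space stays smaller than the unrestricted one, which is the direction of the inequality claimed above.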
I do not understand what you are talking about. Are you talking about the manual checking of entropy data? If so, I'm not clear on the point you are trying to make. But look, you talk about the "Gentle Reader" often enough, and I think something needs to be said in regard to vetting entropy. One must have some sort of loose guidelines: after all, a randomly generated password can certainly contain a pattern. I can provide a source on that section if necessary. Please clarify what we are talking about, and I'll move us forward with a source. --Musides (talk) 05:19, 15 April 2008 (UTC)

Is it possible to check entropy?

Consider a universe of choice having 10 possible values. A password 8 characters long from that universe will have at most 10^8 values if each value is equiprobable. That's the best one can do with those conditions. Hard to assure the equiprobability though, which is the heart of the entropy business. If one increases the universe of choice by 10 characters, but excludes 5 of them because they aren't suitable for some reason, then the number of possible 8 char passwords is NOT 20^8, but 15^8. ANY pattern imposed on a password reduces the number of possible passwords and makes a brute force search easier, perhaps too easy. Your suggested pattern, to match the group proportions in the printable ASCII character set, is a pattern, and so makes breaking easier, not harder.
Entropy cannot be checked for automatically or manually. We have no method, even a theoretical one, which can do so. Looking random may or may not be high entropy, and passing this or that statistical test for randomness (not actually a test for randomness, but rather for possible patterns, all of which are non-random) is no assurance of high entropy either. See Knuth, Seminumerical Algorithms, vol 2, or Schneier's discussions of randomness (and entropy) in Applied Crypto. It's a subtle issue and distinction which is easy to get tangled up with. Probably something about the wiring of the human brain -- Kahneman and Tversky established that we can't reason clearly about risk even in simple situations (papers beginning in the early 80s IIRC) and Kahneman finally got the Nobel for it (too bad Tversky had died before the Committee got around to it). That we have trouble with tricky situations like those in this article is probably par for the course.
We shouldn't even bring up a false reasoning trail, even as a bad example, because it's bad writing in a general article which is to be suited to the novice Gentle Reader. Poor pedagogy.
This article should leave the Reader with some impression (accurate, and not overlaid with red herrings) of the meaning of password strength (and that it's mostly a chimera anyway) and why. It's this last that involves a brief introduction to entropy as distinct from randomness or random appearing. We should also leave that Reader with a reasonable procedure for producing a password which is likely to prove both satisfactorily difficult to break, yet possible to remember. And we should leave our Gentle Reader with some sense of the ways a password can be lost, betrayed, guessed, socially engineered, etc so the password(s) chosen has some increased chance of not being discovered.
We should not speculate, because it's useless to bother, about sequences of cracking methods which crackers might apply. A successful password will remain unknown regardless of the method of attack used, and regardless of whether we editors can even list the attacks which might be used. This rather simplifies the account needed, I think.
Our writing should thread the following needle. We are writing for novice Gentle Readers and so cannot lose them in the weeds, and we should write correctly so the expert Gentle Readers won't be outraged at our imprecision of thought and presentation. This is a high standard to which to repair, but I think it can be done. ww (talk) 06:45, 16 April 2008 (UTC)
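
The counting argument at the start of this comment (10^8, and 15^8 rather than 20^8) can be written out as follows. This is a sketch that assumes every allowed character is equiprobable, which, as the comment says, is the hard part to assure; `search_space` is a hypothetical helper.

```python
import math

def search_space(universe_size, length, excluded=0):
    """Number of possible passwords when each position is drawn uniformly
    from the universe of choice minus any excluded characters."""
    return (universe_size - excluded) ** length

base = search_space(10, 8)                 # 10 values, 8 characters
nominal = search_space(20, 8)              # 10 more values, none excluded
actual = search_space(20, 8, excluded=5)   # 5 of the 20 ruled out

for label, size in [("10 values", base),
                    ("20 values", nominal),
                    ("20 values, 5 excluded", actual)]:
    print(f"{label}: {size} passwords, about {math.log2(size):.1f} bits")
```

The bit figures printed are upper bounds under the equiprobability assumption only. Any further restriction on how characters are chosen lowers them.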

I agree that qualifying entropy does not imply a suggested pattern. We disagree, however, in that I think it is possible for readers to think in terms of entropy quality. The data that leads you to the notion that it is impossible to identify entropy tends to the opposite conclusion: humans are talented at pattern detection, thus the absence of a pattern is a qualitative indicator. The lack of perfection is not an indication of impossibility: reasonable objectives can be achieved with accurate indications. For example, suppose that in a 14 character password, generated randomly based on a 95 character set, all characters happen to be lowercase. This means, in practice, that your entropy is only a function of a 26 character set, and further is contingent on any random patterns that may be present, lowering it further. This represents a common sense way to qualify entropy for the average reader. Moreover, the method of statistical distribution is pretty well-known and used in real-world policies such as in the US Military. I mean, it is nice to talk about the theory, and how strong things should be and so on, but when it comes to implementation, certain things have to be established, such as the 2+2+2+2+4 rule, etc, etc, etc. It just isn't credible to argue that such implementations produce anything less than the best quality entropy one can expect. --Musides (talk) 22:40, 16 April 2008 (UTC)

No sir. A randomly chosen 8 char password that happens to have all lowercase alfas presents a brute force attacker with a 95^8 search space. In fact, a randomly chosen 8 char password which is aaaaaaaa presents such an attacker with the same search space. The entropy may or may not be high as well. Consider the older example of the digits of pi. If that password comes from that sequence the entropy will be very low, but the randomness will remain high (if, as seems to be the case, the digits of pi are a random sequence).
As for humans having good judgement of entropy, this is simply not so, experimentally. We cannot choose random numbers from 1 - 10 or 1 - 100 or random letters, etc. It was this very difficulty which made it necessary to produce such things as the Rand Corp book of random numbers in the mid 50s (we've an article on it, no less!). We do indeed detect patterns (some patterns) very well, as we have inherited the neural net machinery which made it possible for great great uncle Ug to see that snake in time, and so survive to have kids, including great uncle Og. We do far more poorly at higher dimensional patterns, so our failure to see a pattern is not evidence of randomness, nor of the related but quite different, entropy.
The best quality entropy (ie, the highest value for entropy, I trust) is entirely contingent on the situation. If the attacker expects a password to be from the digits of pi, there will be very little entropy. If the attacker is so stupid as to never consider that possibility (and how could we tell? there's no way, so we must perforce assume that the attacker is brighter than the average rock), then there would be considerable entropy.
It may help to think of entropy, the relevant measure in cases like this, as more or less the inverse of redundancy. So, a language with no redundancy would suffer severely from noise on the communications line. On the other hand, it would be terse, much terser than all natural languages, which have considerable redundancy. Low entropy signals are more robust in the face of noise on the line since their internal redundancy makes guessing what was meant much more reliable. High entropy (low redundancy) languages are just the opposite. Most of this paragraph is a quick and dirty restatement of Shannon's work in which this meaning of entropy was first defined with precision, and from which the field of information theory arose.
Since this is tricky stuff to think about, I have my doubts that Readers will be able to do so successfully. They are far more likely to find themselves wandering the quicksand wilderness of seemingly sensible but wrong ideas. As for "qualifying entropy", what???? There is no test for identifying randomness, only theoretical reasons to treat the output of this or that generator as random, and so there can be no test of entropy as it is, in the context of this article, a subset of random. We should not say anything which may mislead Readers into thinking that there is such a thing. They will be sent on many a snark hunt, a very undesirable result for editors writing here. ww (talk) 23:21, 16 April 2008 (UTC)
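
The distinction drawn in this comment can be illustrated with Shannon's formula applied to the characters of a single string. This is only a sketch: `empirical_entropy` (a hypothetical helper) measures redundancy within the one string it is given, which is not the same thing as the entropy of the process that chose the string. The 95-character set and the 8-character length follow the example above, and uniform random choice is assumed.

```python
import math
from collections import Counter

def empirical_entropy(text):
    """Shannon entropy, in bits per character, of the character
    frequencies observed in this one string."""
    counts = Counter(text)
    total = len(text)
    return sum((c / total) * math.log2(total / c) for c in counts.values())

def probability_under_uniform_choice(text, alphabet_size=95):
    """Chance that a uniform random generator emits exactly this string."""
    return (1 / alphabet_size) ** len(text)

for candidate in ("aaaaaaaa", "abcdefgh"):
    print(candidate,
          empirical_entropy(candidate),
          probability_under_uniform_choice(candidate))
```

Both strings are equally likely outputs of a uniform generator over 95 characters, yet their per-string statistics differ sharply. A frequency count on one password therefore says something about visible redundancy, but cannot by itself certify the entropy of how the password was chosen.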
First, if you are only using lowercase, explain to me why you would crack for a 95 character set? You are assuming the cracker knows that you are using a 95 character set, which is a dubious assumption, or that crackers are not going to crack lower entropy first, again, a dubious assumption.
The cracker need not know anything about the universe of choice. If it's, say, 95 characters, the size of the search space is 95^x, regardless of what an attacker might assume. The attacker's knowledge is irrelevant.
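For reference, the 95^x figure can be worked out directly. This is only a sketch: the function names are invented here, and the bits figure assumes every character is chosen independently and uniformly, which is an assumption and not something a policy can guarantee.

```python
import math

def search_space(charset_size, length):
    """Number of distinct strings of exactly `length` characters drawn from the set."""
    return charset_size ** length

def uniform_entropy_bits(charset_size, length):
    """Bits of entropy, assuming each character is chosen uniformly and independently."""
    return length * math.log2(charset_size)

for size in (26, 95):
    for length in (8, 12):
        print(
            f"{size}^{length} = {search_space(size, length):,} "
            f"(about {uniform_entropy_bits(size, length):.1f} bits)"
        )
```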
Secondly, I didn't say humans have a good judgement of entropy. On the contrary, I said humans are good at pattern detection, and they can use that skill in a negative way to identify low entropy. My point is that it is possible to judge entropy, which we obviously disagree on.
If that's what you meant we agree. My point in addition was that the patterns humans can detect are a small subset of the possible patterns and so insufficiently reliable in this context.
Thirdly, entropy is an objective criterion, not a subjective one. You are doomed to pseudo science by making math and empirical data subjective. There is no dispute that crackers use pattern detection (dictionary, word substitution, etc) first and foremost. You are approaching this subject with the assumption that password strength is fundamentally unknowable, unmeasurable, and purely subjective. You are doing this despite all the sources to the contrary. You would be better off making this article a case for why password strength doesn't exist, for example, rather than simply foiling data counter to your own.--Musides (talk) 00:59, 17 April 2008 (UTC)
In the sense being used here, entropy is contingent, indeterminate, and dependent on particular situations. It is not an objective measure, being so contingent. A random number known to an attacker has low entropy. The point of choosing randomly is to increase the entropy to the unknown attacker. The point of rejecting patterns of any kind is to avoid giving an attacker a shortcut. We wish to make brute force search his best (and not very attractive) attack choice. But if we do, the attacker will find that bribery, etc is a better choice. No messy brute force searches. ww (talk) 02:32, 17 April 2008 (UTC)

I just want to add that the intent of statistical distribution is (a) ensuring full use of the character set and, in so doing, (b) breaking most common pattern methods. If the password is randomly generated based on that policy, you are pushing pretty close to the upper limit of permutations since any particular character could be one of 95. If it is user selected you have considerably more in your favor like common letters, numbers and symbols, but you also won't come across a harder to crack password in the user space. That is the whole point of password strength, and the whole point in the distinction between randomness and entropy. It isn't a pattern to suggest that a password utilize the character space, on the contrary! The 2+ policy does not dictate character placement, though one could theoretically infer common patterns from placement under such policies, which helps reduce entropy. Yet, as you know, entropy research in that field isn't exactly common. :) --Musides (talk) 23:07, 16 April 2008 (UTC)

Anything other than random selection from the universe of choice (the character set in this context) reduces randomness. The password space must include the patterned as well as all other possibilities, else the search space will be less than that possible. No limitation to some statistical standard will ensure higher entropy, including that you suggest here. As for using the character space (do you mean set?), the act of choosing randomly does so. No further massaging is needed, and any further massaging is very likely to reduce entropy and so make the attacker's task easier. Your suggestion as to the "whole point of randomness and entropy" is obscure. Can you try again? ww (talk) 23:26, 16 April 2008 (UTC)
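To make "random selection from the universe of choice" concrete, here is a minimal sketch. It assumes the 95 printable ASCII characters as the character set and relies on Python's standard `secrets` module as the randomness source; both choices are assumptions made for illustration, and nobody in this thread has proposed this particular generator.

```python
import secrets
import string

# The 95 printable ASCII characters: 52 letters, 10 digits, 32 punctuation marks, and space.
CHARSET = string.ascii_letters + string.digits + string.punctuation + " "

def random_password(length=8, charset=CHARSET):
    """Pick each character independently and uniformly from `charset`."""
    return "".join(secrets.choice(charset) for _ in range(length))

candidate = random_password(12)
print(len(CHARSET), len(candidate))
```

No filtering is applied after selection, so every string in the space, patterned or not, remains equally likely.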
Yes, methods to improve entropy reduce randomness, but the objective is improving entropy not maintaining the purity of randomness. Random selection can be problematic for entropy, just as user selection can be. Given a false dichotomy, sure, randomness is better. Randomness alone, however, is by definition an inconsistent method for creating entropy.
This is not clear at all. Entropy, as much as possible, is the object, and random selection is a way to achieve it. Otherwise, ???
The idea that the password space should include patterns is problematic. If you take patterns as a function of possible permutations, we are talking about a small fraction of the total space. That is the whole point of patterns being a problem -- because they are easy to check for and represent an incredibly small space. This is why random selection looks good superficially, because the chances are good that your random selection will not be a pattern, which lends the impression that randomness equals entropy.
I think I agree, but I'm not sure because I don't follow your comments. Please clarify.
For example, a random selection of 8 letters is very unlikely to be an actual word. Why? Let's accept the estimate of 1 million words in English. 26^8 represents 208,827,064,576 possible permutations. That means that there is about a 0.00048% chance a random selection will happen to contain at least a portion of a word. This makes randomness look really good. If we then add in all the other common patterns like double words, basic substitution, complex substitution, words with numbers appended and so on, we might, if we really try, get to around 1%. Again, randomness looks good. Yet, we have to then account for the problem of numbers: the 208 billion is a very small number of permutations, so this kind of randomness, even though 99% looks really good, doesn't mean anything in reality. When we talk about 95^8, we start looking at pretty good sized numbers that become meaningful in some cracking contexts. Yet, the likelihood that randomness will choose from just lowers, for example, is common enough to be problematic for any serious approach to password policy. This is why we need to qualify for entropy, because dealing with numbers that we have fundamental difficulty grasping often leads to false assumptions.
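The quantities in this comment can be checked with a short calculation. This is a sketch under the comment's own assumptions: the 1 million word estimate is taken as given, and every word is treated as a distinct 8-letter string, which makes the first ratio an upper bound. The variable names are invented here.

```python
from fractions import Fraction

WORDS = 1_000_000      # the estimate of English words used in the comment above
LOWER_8 = 26 ** 8      # 8-character strings over lowercase letters only
FULL_8 = 95 ** 8       # 8-character strings over the 95-character set

# Share of the lowercase space occupied by words (upper bound, see lead-in).
word_fraction = Fraction(WORDS, LOWER_8)

# Chance that 8 uniform draws from the 95-character set are all lowercase letters.
all_lower_fraction = Fraction(26, 95) ** 8

print(f"26^8 = {LOWER_8:,}")
print(f"95^8 = {FULL_8:,}")
print(f"words / 26^8      = {float(word_fraction):.2e}")
print(f"P(all lowercase)  = {float(all_lower_fraction):.2e}")
```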

I'm afraid I don't follow this. Yes words are unlikely after a random selection of characters, but ... Can you try again? ww (talk) 02:32, 17 April 2008 (UTC) --Musides (talk) 02:05, 17 April 2008 (UTC)

Okay, I think we are making good progress. First, I want to throw this out as an example of statistical distribution being employed: [2]. They choose not to use symbols in their policy, hence the slight difference.

The objective of a strong password is to force a cracker to check a prohibitively large space. Success is contingent on this movement, which tells us that the counter-point is where the cracker must narrow the space as much as possible. Our first assumption is that, all things being equal at the start, no side has a particular advantage. Instead, the strength or weakness of their approaches is contingent on the decisions they make, and these decisions are generally made independent of either side.

The attacker can do absolutely nothing to change the size of the search space. That is determined by the universe of choice size and is thereafter 'fixed'. What an attacker will try to do, of course, is to check the easy possibilities. Dictionary words, lists of common passwords, username, nickname, kids, pets, ... If that fails, brute force search becomes the most reasonable choice. The point of high entropy in a password (ie, high strength password) is to force this choice on an attacker. None of this, except in a very odd reading, narrows the search space.

This leads us to the first question: what must a cracker do to narrow the space? We both know that answer: they start with the dictionary and move onto patterns. The key point of interest on patterns is that they are a function of password policy. Thus, for example, the pattern of 7 lowers and a number has a high probability of producing a pattern of 7 + 1 at the end, but any particular permutation of the 7+1 pattern is not a serious obstacle, and thus low hanging fruit. In combination with letter and number selection patterns inherent to people (1, a, e, etc, etc), the space is reduced further. Thus, for any given set of passwords on such a policy, the cracker will have the majority resolved quickly, since they play to the strength of the cracker by operating in such a small space.
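The size of the "7 lowers and a number" pattern can be compared against the full 8-character space with a few lines of arithmetic. This is an illustrative sketch only; the names are invented here, and it counts candidate strings without modelling how people actually pick letters or digits.

```python
import math

def pattern_space(class_sizes):
    """Candidates for a fixed pattern: the product of the per-position class sizes."""
    return math.prod(class_sizes)

# Seven lowercase letters followed by one digit: 26^7 * 10.
seven_lower_then_digit = pattern_space([26] * 7 + [10])

# Same ingredients, but the single digit may sit in any of the 8 positions.
digit_anywhere = 8 * seven_lower_then_digit

full_space = 95 ** 8

print(f"7 lowers + trailing digit: {seven_lower_then_digit:,}")
print(f"digit in any position:     {digit_anywhere:,}")
print(f"share of 95^8 (trailing):  {seven_lower_then_digit / full_space:.2e}")
```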

Yes, patterns are a bad thing in all cases, as the attacker may exploit them and short the brute force search we are trying to force on them. No controversy here.

We need to put this example into the proper context. We don't know the "random" characters people will select in their password, by definition. By the same token, however, we can follow people's patterns without fail. While every instance has an exception, Shannon's work is predicated on the mean, and this has been empirically supported for the past 50 years. This means that success in strong passwords cannot be achieved as a function of user selection, but instead must be predicated on the negative problem of patterns, or put positively, on entropy.

Perhaps not quite 'without fail'. Shannon's work is not predicated on the mean. He attempted an analysis based on random choice as compared to non-random choice and the consequences this had for the attacker. He is not making mere statistical points, but something far more fundamental.

Let's return to why what the cracker knows about their adversary doesn't matter. The method of the cracker is to reduce the set -- to attack the smallest space possible, or put another way, the attacker must focus on low entropy. In this sense, it is an advantage for the defender to broadcast a strong policy that shows a reasonable end point in medium to high entropy. As Sun-Tzu says, if you convince your opponent of the futility of his effort before the engagement begins, you've won in the best way possible. Meanwhile, if the cracker engages, you've done them no favors at all -- the fact that they don't have to spend time on their tiny cracking set and puny rainbow tables is an exchange *you want*. After all, this means that in exchange for the few days they would have spent on the easy stuff, you are forcing them into a time frame of years or decades. When cracking passwords becomes a war of attrition, you've lowered your risk profile considerably through strong passwords.

There may be mere term confusion here. The attacker cannot 'attack the smallest space'; they are stuck with the search space 'selected' by the password chooser.

The problem between randomness and entropy can best be viewed from the information perspective, where information is a reduction in uncertainty. Let's use a random password produced in accordance with a particular policy on the 95 space. The clarity of the information produced in a given set of 100, for example, is improved contingent on certain results. Why? Because in practice, some results will feature a reduced set, and reduced sets imply improved certainty. For example, random selections that end up featuring just lowers and numbers have precipitously fallen straight through to low entropy.

This is purely opaque. Maximum uncertainty is indeed the point, and is the definition of entropy. If random choice delivers that, it is good, otherwise not. All else is surplusage.

Why did I try to illustrate this with the dictionary example? Because that is one example of information certainty, but certainty goes much further. All lowers is an example of certainty, but here it is an example of where random data correlates with human patterns. Patterns are not limited to mathematical constructs such as abcd, but constitute also patterns in human choice. In some kind of abstract sense you can justify that the source or method is a random one, but you can't justify that the result is random -- because any result of only lowers, for example -- is quite the opposite of a random result: it is exactly what we expect to see from human input in a password.

'random data correlates with human patterns'. ? There's a definitional problem here, or exotic use of terms. Otherwise this is opaque. This subject, being tricky, requires considerable care in phrasing lest meaning be lost. ww (talk) 16:41, 18 April 2008 (UTC)

Have a read of this: [3]

--Musides (talk) 17:33, 17 April 2008 (UTC)


Great! I think we have found a lot of common ground. :) We have some minor semantic differences (such is the problem of an autodidact such as myself) in my phrasing about the attacker reducing the space, which you interpret literally while my intent is purely figurative. We agree on the substance, of course.

Figurative speech about a topic such as this leads quickly to a bog of confusion, illustrated above on this talk page.

Randomness seems to be our last hurdle in this phase, and semantics seems to be the main obstacle. We agree on the key point that randomness can both deliver high and low entropy. We both agree that poor entropy can be correlated with typical patterns, or "easy possibilities". We also agree that people are quite bad at being random, and instead choose patterns, which results in low entropy.

Poor entropy is a matter of lack of uncertainty. So, my password, drowssap has high entropy if you (or your automatic minions) are unable to come up with it, or take a long time to come up with it. If you can do so, it didn't have high entropy after all. In practice, we must choose passwords randomly, and avoid easy to attack words, names, and such. Our password generating scheme should avoid these, but still generate rememberable passwords -- this means that we will likely not have purely random passwords as these are rarely memorable. How to make random choices is very difficult; there is no known algorithm, and the randomness sources most everyone believes to be actually random are too hard to use for passwords (eg, radioactive decay).
I trust we agree so far?

If my assumptions about our state of agreement are correct, then we can deduce that the innate ability people have to identify patterns can serve to identify low entropy resulting from random selection. Do we have agreement up to this point?

Actually human ability to detect low entropy (or for that matter, high entropy) is, in my view, sufficiently identical to nil that it cannot, in the real world, be relied upon. Not even a little. Other than noting this fact, I think we should not further discuss this possibility as it is of no use, save for confusion, to the Average Reader and will merely irritate the Expert Reader. ww (talk) 18:08, 18 April 2008 (UTC)

--Musides (talk) 17:22, 18 April 2008 (UTC)

Okay, two steps forward, one step back hopefully? Regarding your password, I'll make two points. The first is purely empirical: the example you provided is one of your "easy possibilities". This is an objective fact because the extremely small number of permutations for 8 lowercase letters makes it a no-brainer, so entropy isn't even worth the effort of worrying about. That said, FWIW, "r, o, a and s" are all among the five most common letters chosen in passwords according to Burnett, which accounts for 5 of your characters. Thus, we can objectively say, based on the source data that we have, that the password you have cited has low entropy. The second point is that the construct of your argument is a tautology: you are essentially saying it is strong if it can't be cracked. I think that reflects a relativistic point of view that simply isn't supported by any of the research or data that we are citing.

On your last point, you can't have it both ways. You can't state that humans are pattern bound and cannot reproduce randomness, while also stating that humans are unable to detect patterns in a given result.--Musides (talk) 19:14, 18 April 2008 (UTC)

Nope, we're farther away than you expect. My contrived password example is not low entropy because of common letters (after all even a high entropy very long and completely random password will have them as well), but because attackers will hit upon it first or early as you suggest, even if it were to turn up in a randomly generated list of passwords. On the other hand, an attacker in Poland will not be likely to think of the same collection of letters, words, whatever, as very likely choices, so its entropy will be higher for such attackers. As for strong if not attackable, well, that's the definition of high entropy. The attacker has been forced to fall back on brute force search as the best possible attack technique, and that's the best any password chooser can manage. Lots of ambiguity here, for there are no known ways to anticipate what attackers will do or consider to be reasonable early tries. So choosing a high strength password is more than a little of a shot in the dark. All one can do is to avoid known bad passwords (no dictionary words, names, pets, too short, large enough search space (ie, big enough character set or long enough password or both), etc) and make sure what is actually used was randomly chosen, or as close as possible considering rememberability and other such practical considerations. Warring criteria to be sure.
Humans are good at noting some kinds of patterns and very bad at recognizing or avoiding others. No contradiction there, just empirical observation. We don't do well at estimating risk either, even in simple situations, another empirical fact. ww (talk) 22:52, 20 April 2008 (UTC)
Okay, so this discussion is without merit, unfortunately. Your insistence on the notion that entropy is purely subjective runs directly counter to the NIST and Gartner sources in particular, or most recently, the common letters issue. You are quite cavalier in pushing aside published books and articles with your own opinion. Since referenced, published sources are not enough to shift your relativistic point of view, I don't suppose there is any way to come to an accord on this subject. --Musides (talk) 20:21, 21 April 2008 (UTC)
Having bogged down as much as it seems to have been, we may need outside help to resolve differences so that the article can be improved. You will note that I have suspected some such aid might eventually be needed for some time. The article is currently in a most unsatisfactory state, partly because of our back and forthing. I certainly have attempted few improvement edits since it became clear that they would be contentious until we editors have agreed on fundamental issues.
As for "You are quite cavalier in pushing aside published books and articles with your own opinion", we are veering into lack of good faith territory here. Not a good place to head toward. I am not quoting my own opinion, I am harking back to Shannon's definition of information entropy, which involves surprise at the content received at the destination. Indeed the reference you linked to above makes this point (eventually) in discussing language redundancy. The divergence between us seems to me (intermittently) to have been one of fundamental definition, and until that is settled, there is little point in referring to texts and papers which do not attempt to address fundamentals except inferentially. Such sources are, generally, feeble reeds on which to rely in covering fundamental territory. Does this clarify what you perceive as my cavalier attempts to ignore the field and substitute my own opinion? ww (talk) 05:11, 22 April 2008 (UTC)
Well ww, let's not miss the forest for the trees. We have had a ridiculously long and pointless discussion regarding one particular issue: can entropy be checked. Specifically, this is in the "entropy sources: Manual entropy" section. So, we are talking about a couple of lines in the article. Regarding the sources, I've used several in this discussion by inference (NIST, Gartner) and some direct ([4]), but that doesn't seem to matter. I don't really care if that section gets ripped out: it is pretty obvious we are in an area of interpretation and opinion, which is an untenable endeavor here.
Now, in the overall picture of the article, I think things are looking pretty good. Do you have other issues? --Musides (talk) 22:00, 22 April 2008 (UTC)

Online Research Guide

Generally, Wikipedia does use both types of English. I would guess that we have more contributors coming in from the United States, so I would imagine there is a bias towards American English. However, the general policy is to use whichever spelling of a word is more appropriate for the given article. So, for example, Harry Potter articles and Winston Churchill would use British spelling, being British topics. The other main thing is to make sure that an article is consistent in the type of English it uses. Cheers! Evilphoenix —Preceding unsigned comment added by 216.237.225.194 (talk) 19:29, 21 April 2008 (UTC)

password generator section

I removed a recently added generator and took a look at the two remaining. Neither inspires confidence and both involve passing around passwords over the Net. Furthermore, the software which handles users' passwords is not available for inspection. The security implications here are not good. We might have a section discussing the issues of such generators, but I don't think we should be endorsing or appearing to endorse any of them. Comments? ww (talk) 08:30, 10 June 2008 (UTC)

Nice work. This article may be yet another place for spyware vendors to hawk their wares. It is important for us to be extra critical. --Musides (talk) 19:23, 12 June 2008 (UTC)