(fwd) Re: PGP Pass Phrase Security
I thought this FAQ from Grady Ward (sometimes on our list, sometimes not) might fit with the discussion of password and passphrase security. There's a lot of crunching needed to determine if a selected passphrase has enough entropy. (And to some extent, it is not computable to determine if a string has entropy L, as a sufficiently clever attacker may realize a seemingly complex string actually is much simpler, more predictable, lower entropy than other analyses might suggest.) As others have said, using these sources for passphrases is a Bad Idea: - phrases from popular songs (and several levels of permutations) - famous quotes (and permutations, e.g, "Four scored but seven didn't" is not a very good passphrase, in comparison with "Fully weaSSel lampshop 3856fq3") - lines from novels, television These all have much less entropy than the "shocking nonsense" that many recommend. Memorizing good passphrases is expected to be hard. Personal information leaks bits. Finding personal information that is meaningful to one, but has not been revealed to others (or included in databases) is tough. Anyway, here is Grady's FAQ on this: PASSPHRASE FAQ V. 1.0 1 November 1993 '"PGP," warns Dorothy Denning, a Georgetown University professor who has worked closely with the National Security Agency, "could potentially become a widespread problem.' -- (E. Dexheimer) Comments to: Grady Ward, grady@netcom.com Contributors: John Kelsey, c445585@mizzou1.missouri.edu (Appendix A.) RSA Data Security (Appendix C. The MD5 Algorithm) Jim Gillogly (Appendix D. The Secure Hash Algorithm) FAQ: How do I choose a good password or phrase? ANS: Shocking nonsense makes the most sense With the intrinsic strength of some of the modern encryption, authentication, and message digest algorithms such as RSA, MD5, SHS and IDEA the user password or phrase is becoming more and more the focus of vulnerability. For example, Deputy Ponder with the Los Angeles County Sheriff's Department admitted in early 1993 that both they and the FBI despaired of breaking the PGP 1.0 system except through a successful dictionary attack (trying many possible passwords or phrases from lists of probable choices and their variations) rather than "breaking" the underlying cryptographic algorithm mathematically. The fundamental reason why attacking or trying to guess the user's password or phrase will increasingly be the focus of cryptanalysis is that the user's choice of password may represent a much simpler cryptographic key than optimal for the encryption algorithm being used. This weakness of the user's password choice provides the potential cryptanalytic wedge. For example, suppose a user chooses the password 'david.' On the surface the entropy of this key (or the number of different equiprobable key states) appears to be five characters chosen from a set of twenty-six with replacements: 26^5 or 1.188 x 10^7. But since the user is apparently biased toward common given names, which a majority appear in lists numbering only 6,000-7,000 entries, the true entropy is undoubtedly much closer to 6.5 x 10^3, or about four orders of magnitude smaller than the raw length might suggest. (In fact this password probably possesses a much smaller entropy than even this for the very common name "david" would be one of the first names to be checked by an optimized dictionary attack program.) In other words the "entropy" of a keyspace is not a fixed physical quantity: the cryptanalyst can exploit whole cultural biases and contexts, not just byte frequencies, digraphs, or even whole-word correlations to reduce the key space he or she is trying to explore. To thwart this avenue of attack we would like to discover a method of selecting passwords or phrases that have at least as many bits of entropy (or "hard-to-guessness") as the entropy of the cryptographic key of the underlying algorithm being used. To compare, DES (Data Encryption Standard) is believed to have about 54-55 bits (~4 x 10 ^16) of entropy while the IDEA algorithm is believed to have about 128 bits (~3.5 x 10^38) of entropy. The closer the entropy of the user's password or phrase is to the intrinsic entropy of the cryptographic key of the underlying algorithm being used, the more likely an attacker would need to search a substantially larger portion of the algorithm's key space in order to rediscover the key. Unfortunately many documents suggest choosing passwords or phrases that are distinctly inferior to the latest method. For example, one white paper widely archived on the internet suggests selecting an original password by constructing an acronym from a popular song lyric or from a line of script from, for example, the SF movie "Star Wars". Both of these ideas turn out to be weak because both the entire script to Stars Wars and entire sets of song lyrics to thousands of popular songs are available on-line to everyone and, in some cases, are already embedded into "crack" dictionary attack programs (See ftp.uwp.edu). However, the conflict between choosing an easy-to-remember key and choosing a key with a high level of entropy is not a hopeless task if we exploit mnemonic devices that have been used for a long time outside the field of cryptography. With the goal of making up a passphrase not included in any existing corpus yet very easy to remember, an effective technique is one known as "shocking nonsense." "Shocking nonsense" means to make up a short phrase or sentence that is both nonsensical and shocking in the culture of the user, that is, it contains grossly obscene, racist, impossible or other extreme juxtaposition of ideas. This technique is permissable because the passphrase, by its nature, is never revealed to anyone with sensibilities to be offended. Shocking nonsense is unlikely to be duplicated anywhere because it does not describe a matter-of-fact that could be accidentally rediscovered by anyone else and the emotional evocation makes it difficult for the creator to forget. A mild example of such shocking nonsense might be: "mollusks peck my galloping genitals ." The reader can undoubtedly make up many far more shocking or entertaining examples for himself or herself. Even relatively short phrases offer acceptable entropy because the far larger "alphabet" pool of word symbols that may be chosen than the 26 characters that form the Roman alphabet. Even choosing from a vocabulary of a few thousand words a five word phrase might have on the order of 58 to 60 bits of entropy -- more than what is needed for the DES algorithm, for example. When you are permitted to use passphrases of arbitrary length (in PGP for example) it is not necessary to further perturb your 'shocking nonsense' passphrase to include numbers or special symbols because the pool of word choices is already very high. Not needing those special symbols or numbers (that are not intrinsically meaningful) makes the shocking nonsense passphrase that much easier to remember. If you are forced to use, say, a Unix password utility that permits only passwords of restricted length, one good strategy is to process a your secret passphrase using MD5 or SHA, then UUENCODE the result and select your shorter key from the output. See Appendix C and D for actual MD5 and SHA source implmentations. Appendix A. For software developers For software developers designing "front-ends" or user interfaces to conventional short-password applications, very good results will come from permitting the user arbitrary length passphrases that are then "crunched" or processed using a strong digest algorithm such as the 160-bit SHS (Secure Hash Standard) or the 128-bit MD5 (Message Digest rev.5).[See following Appendices] The interface program then chooses the appropriate number of bits from the digest and supplies them to the engine enforcing a short password. This 'key crunching' technique will assure the developer that even the short password key space will have a far greater opportunity of being fully exploited by the user. John Kelsey writes: "I think it's a really good idea to use a randomly-generated salt to generate a key from a password, and that this salt should be as large as possible. Basically, this is to keep an attacker from spending lots of computer power *once* to generate a dictionary of likely keys. If users use good techniques to choose passwords, this won't matter much, but if they don't, this may save them from having their encrypted files or transmissions routinely read. The simplest scheme I can see for this is simply to prepend a 128-bit salt (generated as strongly as possible) to each encrypted file. Generate the key from the password by pre- filling a buffer with the 128-bit salt, then XORing in the keyed- in password, or by appending the key to the keyed-in password. Then, run SHA or MD5 or whatever to get the key. A secondary point: Adding a random salt ensures that people who use the same password/passphrase for lots of files/transmissions don't get the same key every time. Since most successful attacks against modern encryption schemes use *lots* of ciphertext from the same key, this might add some practical security, at relatively low cost." --John Kelsey, c445585@mizzou1.missouri.edu Appendix B. A tool to experimentally investigate entropy A practical Unix tool for investigating the entropy of typical user keys can be found in Wu and Manber's 'agrep' (approximate grep) similarity pattern matching tool available in C source from cs.arizona.edu [192.12.69.5]. This tool can determine the "edit distance," that is, the number of insertions, substitutions, or deletions that would be required of an arbitrary pattern in order for it to match any of a large corpus of words or phrases, say the usr/dict word list, or over the set of Star Trek trivia archives. The user can then adjust the pattern to give an arbitrary high threshold difference between it and common words and phrases in the corpus to make crack programs that systematically vary known strings less likely to succeed. It is often surprising to discover that a substring pattern like "hxirtes" is only of edit distance two from as many as forty separate words ranging from "bushfires" to "whitest." Certainly no password or phrase ought to be chosen as a working password or phrase that is within two or fewer edit distance from a known string or substring in any on-line collection. Appendix C. & D. not included for bandwidth reasons -- Grady Ward | For information and free samples on | "Look!" grady@netcom.com | royalty-free Moby natural language | -- Madame Sosostris +1 707 826 7715 | development core rules, run: | A91F2740531E6801 (voice/24hr FAX) | finger grady@netcom.com | 5B117D084B916B27 .......................................................................... Timothy C. May | Crypto Anarchy: encryption, digital money, tcmay@netcom.com | anonymous networks, digital pseudonyms, zero 408-688-5409 | knowledge, reputations, information markets, W.A.S.T.E.: Aptos, CA | black markets, collapse of governments. Higher Power: 2^859433 | Public Key: PGP and MailSafe available. "National borders are just speed bumps on the information superhighway."
participants (1)
-
tcmay@netcom.com