On Sat, Jul 07, 2001 at 08:03:30PM -0700, IT3 Stuart B. Tener, USNR-R wrote:
> Mr. Touloumtzis, et al.:
>
> Some ideas to increase entropy:
>
> 1) Randomizing the location of the number within each word add more entropy?
> I noticed you consistently placed the number at the beginning of every word
> 2) Randomizing the capitalization change anything?
> 3) Random non-alphanumeric characters in random positions of each of the
> words help?

Hello Mr. Tener,

Sandy Harris already responded to this, so I'll mostly just agree with
him: these things all add entropy, but (IMHO) they disrupt the balance
between getting enough entropy and being able to remember the
passphrase.

The simple "number + word" combination has the advantage that
remembering each element of the passphrase just means remembering the
number and the word. In practice, this often means remembering the word
and relying on memory's inherent associativity to bring the number
along.

When you start "tacking on" more information in each group, I think the
passphrase gets harder to remember out of proportion to the amount of
entropy added. For example, randomly putting the number at the beginning
or end of the word adds only 5 bits of entropy to the passphrase
construction algorithm, as Sandy pointed out; if you use 8 punctuation
characters in addition to the 8 numbers, that's only another 5 bits
(16 values instead of 8 add only 1 bit of entropy per group). In
contrast, adding another number + word pair adds 18 bits, and would
probably be easier to remember.

Generating strong passphrase data is easy; just choose a truly random
value N bits long. The challenge is in constructing a useful mapping
from the space of N-bit strings to the space of character strings which
can (a) be typed, and (b) be memorized.
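To make the arithmetic concrete, here is a rough Python sketch of a
"number + word" generator. The word list is a stand-in I made up for
illustration (32768 pseudo-words, so 15 bits each), and the choice of
which 8 digits to use is also an assumption; only the "8 numbers" and
"18 bits per pair" figures come from the discussion above.

```python
import math
import secrets
from itertools import product

# Stand-in word list (an assumption, not a real dictionary): 32 syllables
# combined three at a time give 32**3 = 32768 pseudo-words, 15 bits each.
SYLLABLES = [c + v for c in "bdfgklmn" for v in "aeio"]
WORDS = ["".join(p) for p in product(SYLLABLES, repeat=3)]

# 8 possible numbers, 3 bits each; which digits are used is an assumption.
DIGITS = "23456789"


def bits_per_group():
    """Entropy of one independently chosen number + word group."""
    return math.log2(len(DIGITS)) + math.log2(len(WORDS))


def make_passphrase(groups=5):
    """Pick each number and each word independently and uniformly."""
    parts = [secrets.choice(DIGITS) + secrets.choice(WORDS)
             for _ in range(groups)]
    return " ".join(parts)


if __name__ == "__main__":
    print(make_passphrase(5))
    print(bits_per_group() * 5, "bits")
```

With these sizes each group carries 3 + 15 = 18 bits, so five groups
give 90 bits, while randomizing the number's position would add only
1 bit per group.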
In my previous mail, I neglected to mention another advantage of
random-word-selection algorithms and similar approaches, as opposed to
trying to come up with unguessable natural language passphrases: it's
much safer to split the passphrase in half (thus requiring two people to
be present to type in the two halves) if you don't use natural language.

In order to be able to split a passphrase P into two substrings P1 and
P2, you want the conditional probability of P1 given P2 to be the same
as the probability of P1, and vice versa. In other words, you don't want
any redundancy across the split.

In a random word selection algorithm, you can safely split between
words, since the selection of each word is independent of the others.
For example, you could generate a 10-word+number phrase given the
algorithm I described and give 5 words to each person. Each of those
people would still have to confront a 90 bit search space to get the
other person's half (that is, they would have to brute force it).

In contrast, natural language strings contain redundancy that often
spans the length of the string (even if it's an entire document, in the
case of well-known documents), making such a simple split very unsafe.

As an example: suppose your passphrase selection algorithm was to
interleave word-by-word four lines chosen independently and at random
from the complete works of Shakespeare. This would result in obnoxiously
long passphrases, but they would be reasonably strong (about 70 bits of
entropy) because Shakespeare wrote a heck of a lot :-). However, you
couldn't split such a phrase into two substrings safely; at least one of
the recipients (both, if it's a 1/2-1/2 split) would be able to compute
the entire passphrase trivially, since there is so much redundancy
within the phrase.

You can split such a phrase safely by generating a random string R and
using [R, XOR(passphrase, R)] as the split, but I'm assuming you want
each of the people to be able to memorize her half.
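For illustration, the two kinds of split mentioned above might look
roughly like this in Python. The function names are my own, and the
example phrase in the usage note is made up; this is a sketch of the
idea, not a vetted tool.

```python
import secrets


def split_between_groups(passphrase):
    """Substring split at a group boundary. Only safe when every group
    was chosen independently of the others."""
    groups = passphrase.split()
    half = len(groups) // 2
    return " ".join(groups[:half]), " ".join(groups[half:])


def xor_split(passphrase):
    """Return [R, XOR(passphrase, R)] for a fresh random R.
    Neither share alone reveals anything about the passphrase,
    but neither share is memorable."""
    data = passphrase.encode("utf-8")
    r = secrets.token_bytes(len(data))
    masked = bytes(a ^ b for a, b in zip(data, r))
    return r, masked


def xor_join(r, masked):
    """Recombine the two XOR shares into the original passphrase."""
    return bytes(a ^ b for a, b in zip(r, masked)).decode("utf-8")
```

For example, split_between_groups("3apple 7river 5stone 2cloud") hands
"3apple 7river" to one person and "5stone 2cloud" to the other, while
xor_split() produces two random-looking byte strings that have to be
stored rather than memorized.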
Of course, the Right Way to do this (and the only way to do it if your
requirements are more involved) is to use an M-of-N secret splitting
algorithm. But a simple substring split can be convenient, for example
in an office context where people are basically considered trustworthy,
but where you still want to have two people present to generate a
signature for a software release.

regards,
miket

Linux-crypto:  cryptography in and on the Linux system
Archive:       http://mail.nl.linux.org/linux-crypto/