Re: Entropy, Randomness, etc.
Good questions.
My understanding of a random number is a number generated from two or more unrelated events.
No. This may be one general category of ways to manufacture random numbers, but specifically a random number is just an arbitrary number, typically drawn from a sequence of independent arbitrary numbers. The quality of 'randomness' is a measure of the independence of the elements in the stream. Therefore, there is no such thing as a random number except as an element of a sequence or other context from which to establish independence.
In order for this number to be most useful cryptographically, it needs an even distribution.
No. In order for this number to be cryptographically useful, it must cost more to guess the number (perhaps knowing the numbers that came before) than the reward for guessing it correctly. It happens that a non-flat distribution of a sequence is a lever for cheaper guessing; thus a flat distribution is a natural characteristic of high-quality random sequences.
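To make the cost argument concrete: an optimal attacker guesses candidate values in order of decreasing probability, so the expected number of guesses drops as the distribution skews. A minimal sketch in Python (the two distributions are illustrative assumptions, not from anywhere in particular):

# An optimal attacker guesses candidate values in order of decreasing
# probability; the expected number of guesses is sum_i i * p_(i), with
# the p's sorted from most to least likely.

def expected_guesses(probs):
    ordered = sorted(probs, reverse=True)
    return sum(rank * p for rank, p in enumerate(ordered, start=1))

flat   = [1 / 8] * 8                                        # uniform, 8 values
skewed = [0.50, 0.20, 0.10, 0.08, 0.05, 0.04, 0.02, 0.01]   # same 8 values

print(expected_guesses(flat))    # 4.5 guesses on average
print(expected_guesses(skewed))  # ~2.23 guesses: the skew pays the attacker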
Does this make this distribution a gaussian distribution?
No. Gaussian (a.k.a. 'normal') distribution is the bell curve (and clearly indicates a relation between samples). Math texts describing this distribution often use the phrase 'distribution of some random variable x', by which they in fact mean 'distribution of samples from a varying source'.
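For intuition: start summing independent flat samples and the bell curve appears on its own (the central limit theorem), which is one reason a gaussian-shaped histogram suggests your 'samples' are really combinations of other samples. A toy sketch using only the Python standard library:

import random
from collections import Counter

random.seed(1)

# Raw die rolls: the histogram is flat.
raw = [random.randrange(6) for _ in range(30000)]

# Sums of three rolls: the histogram is bell-shaped -- the central
# limit theorem at work, and exactly the kind of 'relation between
# samples' a gaussian shape hints at.
sums = [sum(random.randrange(6) for _ in range(3)) for _ in range(30000)]

def show(label, data, width=50):
    hist = Counter(data)
    peak = max(hist.values())
    print(label)
    for value in sorted(hist):
        print(f"{value:3d} {'#' * (hist[value] * width // peak)}")

show("raw (flat):", raw)
show("sums (bell curve):", sums)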
Also how are these statistical measurements done?
See Knuth, "The Art of Computer Programming", Volume 2: Seminumerical Algorithms, Chapter 3: Random Numbers.
Is it as simple as a histogram?
Yes.
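In its simplest form this is Knuth's equidistribution (chi-square) test: bucket the samples and compare the observed counts against the flat expectation. A minimal sketch, assuming byte-valued samples (/dev/urandom here is just a stand-in source):

import os
from collections import Counter

def chi_square_flat(samples, buckets=256):
    # Compare observed bucket counts against a flat expectation.
    # With 256 buckets (255 degrees of freedom) the statistic should
    # land near 255; values wildly larger or smaller are both suspect.
    expected = len(samples) / buckets
    counts = Counter(samples)
    return sum((counts.get(b, 0) - expected) ** 2 / expected
               for b in range(buckets))

data = os.urandom(100000)        # stand-in source; test your own instead
print(chi_square_flat(data))     # expect roughly 255, give or take ~45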
Are we talking frequency analysis with FFTs and more advanced things?
Yes.
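For example, a sketch of the crudest spectral check: FFT the centered samples and look for bins that tower over the rest, which betray periodic structure. NumPy is assumed here purely for the FFT; the 'bad' generator is contrived for contrast:

import numpy as np

def spectral_peak_ratio(samples):
    # Center the data so the DC component doesn't swamp the spectrum,
    # then compare the tallest frequency bin to the average bin.
    x = np.asarray(samples, dtype=float)
    x -= x.mean()
    spectrum = np.abs(np.fft.rfft(x))
    return spectrum.max() / spectrum.mean()

rng = np.random.default_rng(0)
good = rng.integers(0, 256, 65536)
bad = (np.arange(65536) * 37) % 256      # obviously periodic sequence
print(spectral_peak_ratio(good))         # modest ratio
print(spectral_peak_ratio(bad))          # huge spike: periodicity exposed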
How do we measure the entropy of a random number, or a series of random numbers?
Ah. Now we're talking. Entropy is closely related but not equal to 'randomness'. Entropy is a measure of information, often expressed as the ratio of information-size to data-size. Randomness is a measure of unpredictability. A sufficiently random sequence will be of very high entropy from the perspective of the 'guesser', though not necessarily from that of the generator (e.g. a PRNG).

The best way to measure entropy (if that is what you want to measure) is to build a sufficiently powerful Markov model, or the equivalent, to predict the sequence, and treat it like a compressor. The number of bits output is the entropy of the sequence with respect to that model. If you can't build a model as smart as your presumed attackers (as smart as them, not as smart as any model they might build), then you will have to use more tests to assure yourself of independence of elements in the sequence (see Knuth, et al).

In practice, however, most of these methods represent very low bars over which any RNG _must_ jump, and which even poor ones often can. Most RNGs are broken by understanding how they work, and by exploiting weaknesses in their construction and context (e.g. poor 'seed' selection).
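As a concrete (and deliberately weak) instance of the model-as-compressor idea: an order-1 byte model that charges -log2(p) bits per symbol. The Laplace smoothing and the test inputs are my own choices:

import math
import os
from collections import defaultdict

def markov_entropy_bits(data, alphabet=256):
    # Order-1 adaptive model: predict each byte from its predecessor,
    # updating counts as we go, and charge -log2(p) bits per symbol.
    counts = defaultdict(lambda: [1] * alphabet)   # Laplace-smoothed counts
    bits, prev = 0.0, 0
    for b in data:
        row = counts[prev]
        bits += -math.log2(row[b] / sum(row))
        row[b] += 1
        prev = b
    return bits

noise = os.urandom(4096)
print(markov_entropy_bits(noise) / (len(noise) * 8))   # near 1.0 bit/bit

text = b"ab" * 2048
print(markov_entropy_bits(text) / (len(text) * 8))     # far below 1.0

A smarter model can only lower the number, which is why the measurement is always "entropy with respect to that model", never an absolute.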
Have people on the list done this, or is this still in the range of people that do math and number theory for a living?
Yes, and Yes.
These topics are probably covered in some of the basic books in the field, but all of the references I've been able to locate don't go into specifics of how to measure the quality of random numbers.
See Knuth.
Unless some measurements are made, you can't really be sure that those /dev/mem MD5 hashes don't come up the same 10%, 30%, or more of the time. It seems that a lot of assumptions are being made about what is good and what isn't.
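That duplicate rate is at least cheap to measure before trusting it. A sketch, with /dev/urandom standing in for whatever source actually feeds the hashes:

import hashlib
from collections import Counter

# Hash many samples from the source and count duplicate digests.
# A healthy source should show essentially zero.
with open("/dev/urandom", "rb") as src:          # stand-in for the real source
    digests = [hashlib.md5(src.read(16)).hexdigest() for _ in range(100000)]

dupes = sum(c - 1 for c in Counter(digests).values() if c > 1)
print(f"{dupes} duplicates in {len(digests)} samples "
      f"({100.0 * dupes / len(digests):.2f}%)")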
You should be exactly as paranoid as it is cost effective to be.

Hope this helps.

Scott Collins | "Few people realize what tremendous power there
              |  is in one of these things."        -- Willy Wonka
BUSINESS. voice: 408.862.0540  fax: 974.6094  collins@newton.apple.com
Apple Computer, Inc.  5 Infinite Loop, MS 305-2B  Cupertino, CA 95014
PERSONAL. voice/fax: 408.257.1746  1024:669687  catalyst@netcom.com