Re: Entropy Estimator

At 9:06 PM 4/12/96, rick hoselton wrote:
At 12:19 AM 4/12/96 -0400, JonWienke@aol.com wrote:
I just added a feature to my entropy graphing program that estimates the number of bits of entropy in the file,
Hey, that's just what I need. I have these two 8-million byte files. One is a recording made by a geiger counter, every bit is uncorrelated with anything else in the universe and each bit is equally likely to be a one or a zero. The second file is an IDEA encryption of all the four-byte numbers from one to two-million.
Here's my problem. I can't remember which file is which, and I've forgotten sixty-four bits of the key I used to produce the encrypted file.
That's where your technique come in. The first file has sixty-four million bits of entropy. The second file has only sixty-four bits of entropy, total (the missing key bits). Surely, your technique can tell me which file is which.
No, this is not the case. Suppose your file of 64 million bits of entropy is stored as "Rick's Geiger Counter File," perhaps on your Web site. (Great for use by various people as a "virtual one time pad" (patent pending).) A year or so from now someone asks a program to measure the entropy of this file. Nearly all programs will report that the file has lots of bits of entropy... However, is this the "true" entropy? A clever program, or a person, might well remember that this file is Rick's Geiger Counter File, making the bits _very_ predictable (or, equivalently, little "surprise," low entropy, great compressibility, etc.). Once again, one can never know for sure that a file, sequence, string is maximally compressed. The application to your stated problem is that the two files might very well have similar statistics (a good cryptographic hash function is likely to produce a regular output, for example) and that a program cannot tell which one is the "really random" file and which is the "seemingly random" file is not a failing of the program, necessarily, but is implicit in some inevitable limitations of all programs. --Tim May Boycott "Big Brother Inside" software! We got computers, we're tapping phone lines, we know that that ain't allowed. ---------:---------:---------:---------:---------:---------:---------:---- Timothy C. May | Crypto Anarchy: encryption, digital money, tcmay@got.net 408-728-0152 | anonymous networks, digital pseudonyms, zero W.A.S.T.E.: Corralitos, CA | knowledge, reputations, information markets, Higher Power: 2^756839 - 1 | black markets, collapse of governments. "National borders aren't even speed bumps on the information superhighway."
participants (1)
-
tcmay@got.net