(I'm gonna take a breather on this "randomness of a bit string" thread after sending this post off. I agree with what many folks have written, and was especially glad to see Scott Collins' nice summary earlier today about the difficulties in describing randomness. It's a fascinating topic, with even some practical consequences for Cypherpunks....maybe.) Ray Cromwell writes:
Tim writes:
But can he ever say "I can prove the number is random"? No. There's always some chance an even-cleverer puzzle solver will find the pattern, the key that unlocks the randomness. For example, most ciphertexts pass nearly all statistical tests for randomness, "look" random, and even _act_ like random numbers (recall the Blum-Blum-Shub pseudorandom number generator and how good it is). But simple application of the key turns the seemingly random "100010001010110010101" into "ATTACK."
But can we say that "100010001010110010101" has been ``compressed'' into "ATTACK"? How do we know? Let IC(x) stand for the amount of information
Let me first point out that _any_ string can be "compressed" into "ATTACK" with the right mapping. My house could be stormed my Reno's Raiders and the number 100010001010110010101 subjected to thorough scrutiny at the Fort. Lo and behold, they could find the string which when applied to my string (by some process) outputs "ATTACK." There are some subtle issues of "relevance" that need to be addressed. As an example, if a number written down somewhere in my house produces the transformation into "ATTACK," that's presumably of more relevance than if the NSA finds some number lying around (and of course they can _construct_ such a number easily). I'm sure cryptanalysts take such things into account, but formal theories don't seem to have addressed this (but I may just be unaware of papers along these lines). And certainly the courts have yet to touch on this issue, so far as I know. Scott Collins nicely summarized the difficulties in calling any number random (echoing the points I was making, perhaps less formally), and Phil Karns was right when he said "Randomness is in the eye of the beholder." (He may've been making an ironic point about my arguments, but he was still right.) Back to Ray's point:
storage used by x. Is
IC(100010001010110010101) > IC(ATTACK) + IC(key) + IC(algorithm)?
It is not at all clear that this relationship would hold. (in fact, I don't think it will even begin to work out unless the cyphertext is much longer than the plaintext) So in fact, cryptorandom numbers can be considered incompressible if you take into account the algorithm required to perform the operation -- just as if I had used a 100 terabyte dictionary to compress via lookup, or better yet, a one time pad.
Yeah, but the complexity of the algorithm, and the "CPU effort" needed to mount the analysis is not considered part of "Kolmogorov complexity." That's just the formalism. Since the effort is indeed important (e.g., the complexity of DNA strings, for example, gives evidence that many billions of years of compression, massaging, more compression, etc. happened), others have developed measures of complexity which take into account the effort, the CPU cycles, if you will. Greg Chaitin first looked at this in 1966, but it was left to fellow IBM researcher Charles Bennett (whom Cypherpunks may know as the coinventor with Gilles Brassard of "quantum cryptography," and also a pioneer in reversible computation) to label the idea "logical depth" and explore the ramifications more deeply (pun intended). Logical depth addresses the issues Ray is raising. A good summary is in "The Turing Machine: A Half-Century Survey," edited by Rolf Herken, and published in about 1991.
All of this is meaningless anyway. Information theory was proven wrong by WEB technologies when they invented a compression program that can recursively compress any input data down to 64k. Harddrives are now obsolete.
Yes, as Perry Metzger once showed on this list, even the longest of posts can be compressed into the period at the end of this sentence. --Tim May -- .......................................................................... Timothy C. May | Crypto Anarchy: encryption, digital money, tcmay@netcom.com | anonymous networks, digital pseudonyms, zero 408-688-5409 | knowledge, reputations, information markets, W.A.S.T.E.: Aptos, CA | black markets, collapse of governments. Higher Power:2**859433 | Public Key: PGP and MailSafe available.