Here's a short article I wrote for sci.crypt aboout "randomness" of a bit string and the Kolmogorov-Chaitin definition that a string is random if and only if it has no shorter description than itself. This has some fascinating tie-ins to "cryptoregular" strings, which are strings which appear to be "regular" (a variant of randomness, meaning all digits are equally represented...high entropy) but which, with the right transformation, suddenly lose their regularity. (For you practical engineering folks, noise sources and other physical randomness sources will in most cases be enough, even if the randomness can never be "proved.") --Tim May Newsgroups: sci.crypt From: tcmay@netcom.com (Timothy C. May) Subject: Re: Randomness of a bit string Message-ID: <tcmayCK5CtF.23H@netcom.com> Date: Mon, 24 Jan 1994 18:32:03 GMT Bruce Grant (bgrant@umcc.umcc.umich.edu) wrote: : The usefulness of a one-time pad seems to hinge on whether the sequence : of key bits is really random. Could someone post a short, not too : technical definition of randomness of a bit string? In particular, is : this a mathematical property, or just a general measure of whether the : string is "predictable"? Does it depend on the nature of the cryptanalyst : or only on the string of bits? (In other words, if the key is based on : an Albanian translation of "Mary had a little lamb" is it random if you : don't know Albanian?) : Could a program test a key for randomness, or is this meaningless? A fascinating question! The answer lies at the heart of what we mean by randomness, complexity, predictability, regularity, and falls into the field of Kolmogorov-Chaitin complexity, or algorithmic information theory. Also called "descriptive complexity." Basic definition: A random string has no shorter description than itself. That is, it is incompressible. (Practically, we know "random strings" won't compress much...sometimes a compressor will shorten them, sometimes it will lengthen them. The notion above, that random strings will not compress, is very general and applies in the limit, not for some particular instance of a string--and some particular instance, e.g., "1 0 0 0 1 1 0" will of course have a good chance of having some particular compressions, some short description.) One consequence is "regularity": all digits of a base will be equally represented in the limit. Another consequence, as noted in one of the other followups to this question, is unpredictability of the next element or bit in a sequence. (Predictability of bits would imply a compression.) Cryptography is an interesting situtation. Charles Bennett talks about "cryptoregular" strings in a paper in the "Physics of Computation" Proceedings (1992, IEEE Press). A cryptoregular string _appears_ to have high entropy ("maximum randomness") and regularity (all symbols equally represented), and thus to be "random." But application of the _key_ will show the string is actually low entropy ("Mary had a little lamb, it's fleece was white as snow...") and is very compressible (the name of the song is the compressed version, for example). Good cryptography means cryptoregular strings. A fascinating discovery by Chaitin and others (Kolmogorov, Solomnoff, Martin-Lof, Levin all worked in this area) is that one can never prove a given sequence or string is "random." As in some diabolically clever IQ test, an apparently random sequence may have some shorter description, or compression, that means it does not fit this definition of randomness. Having said this, it is clear that for practical purposes, many sources used to generate "random numbers," e.g., noise diodes, alpha particles, tosses of a coin, etc., are "effectively random" (don't ask me to define this!) in that no compression/prediction will ever be done, though we can never be absolutely certain one does not exist! A nice book on this stuff just came out: "An Introduction to Kolmogorov Complexity and Its Applications," by Li and Vitanyi, 1993, Springer-Verlag. Cryptography per se is not mentioned (a disappointing lapse), but the ideas are widely applicable. --Tim May -- .......................................................................... Timothy C. May | Crypto Anarchy: encryption, digital money, tcmay@netcom.com | anonymous networks, digital pseudonyms, zero 408-688-5409 | knowledge, reputations, information markets, W.A.S.T.E.: Aptos, CA | black markets, collapse of governments. Higher Power:2**859433 | Public Key: PGP and MailSafe available.