*** GMX Spamverdacht *** Re: *** GMX Spamverdacht *** Re: *** GMX Spamverdacht *** Re: paradoxes of randomness

Tim May timcmay at got.net
Mon Aug 18 09:53:22 PDT 2003


(Will whomever prepending this "Re: *** GMX Spamverdacht ***" header 
please STOP.)


On Monday, August 18, 2003, at 09:01  AM, Dave Howe wrote:
> randomness is a funny thing. a truely random file can be *anything* 
> that
> is the right length - including a excerpt from william shakespere
> (provided you have enough monkeys)
> it is most likely to be random garbage, for a large sample - but for a
> very small sample the odds of it being an ordered sample are 
> surprisingly
> good.

Quibble: only surprising if one misunderstands probability. (Not saying 
you do, just quibbling with any claim that readily calculated 
probabilities can be "surprising.")



> the obvious example here is a double coin test - two bits. an ordered
> sample would be both heads or both tails. a disordered sample would be 
> one
> head and one tail. in practice, you would expect to see half the 
> results
> from a larger trial (say 32 double-throws) with a ordered sample, and 
> half
> disordered.
> as you reach three coins, the odds of a completely ordered result 
> decrease
> (from 2 in 2^2, to 2 in 2^3). for four coins, you still have the same 
> two
> compressable results. consider:
> HHHH    HHHT    HHTH    HHTT
> HTHH    HTHT    HTTH    HTTT
> THHH    THHT    THTH    THTT
> TTHH    TTHT    TTTH    TTTT
>
> in a large trial, you would expect to see each of these once every 2^4
> tests, on the average. obviously HHHH and TTTT are very compressable.
> assuming a runlength encoding, I don't think any of the others are
> compressable at all...."We should not march into Baghdad. To occupy 
> Iraq would
instantly shatter our coalition, turning the whole Arab
world against us and make a broken tyrant into a latter-
day Arab hero. Assigning young soldiers to a fruitless
hunt for a securely entrenched dictator and condemning
them to fight in what would be an unwinable urban guerilla
war, it could only plunge that part of the world into ever
greater instability."
--George H. W. Bush, "A World Transformed", 1998
For any sequence of n fair tosses, the "length after compression" of 
the outcome is roughly, modulo some constants and factor, (1 - 2^(-n)), 
where "1" is the uncompressed length. In other words, as n gets large 
nearly all strings have a "length after compression" that is close to 
the original length, i.e., little compression. As n gets arbitrarily 
large, an arbitrarily small fraction of strings have a short, concise 
description (are "compressed").

--Tim May





More information about the cypherpunks-legacy mailing list