*** GMX Spamverdacht *** Re: *** GMX Spamverdacht *** Re: *** GMX Spamverdacht *** Re: paradoxes of randomness
Tim May
timcmay at got.net
Mon Aug 18 09:53:22 PDT 2003
(Will whomever prepending this "Re: *** GMX Spamverdacht ***" header
please STOP.)
On Monday, August 18, 2003, at 09:01 AM, Dave Howe wrote:
> randomness is a funny thing. a truely random file can be *anything*
> that
> is the right length - including a excerpt from william shakespere
> (provided you have enough monkeys)
> it is most likely to be random garbage, for a large sample - but for a
> very small sample the odds of it being an ordered sample are
> surprisingly
> good.
Quibble: only surprising if one misunderstands probability. (Not saying
you do, just quibbling with any claim that readily calculated
probabilities can be "surprising.")
> the obvious example here is a double coin test - two bits. an ordered
> sample would be both heads or both tails. a disordered sample would be
> one
> head and one tail. in practice, you would expect to see half the
> results
> from a larger trial (say 32 double-throws) with a ordered sample, and
> half
> disordered.
> as you reach three coins, the odds of a completely ordered result
> decrease
> (from 2 in 2^2, to 2 in 2^3). for four coins, you still have the same
> two
> compressable results. consider:
> HHHH HHHT HHTH HHTT
> HTHH HTHT HTTH HTTT
> THHH THHT THTH THTT
> TTHH TTHT TTTH TTTT
>
> in a large trial, you would expect to see each of these once every 2^4
> tests, on the average. obviously HHHH and TTTT are very compressable.
> assuming a runlength encoding, I don't think any of the others are
> compressable at all...."We should not march into Baghdad. To occupy
> Iraq would
instantly shatter our coalition, turning the whole Arab
world against us and make a broken tyrant into a latter-
day Arab hero. Assigning young soldiers to a fruitless
hunt for a securely entrenched dictator and condemning
them to fight in what would be an unwinable urban guerilla
war, it could only plunge that part of the world into ever
greater instability."
--George H. W. Bush, "A World Transformed", 1998
For any sequence of n fair tosses, the "length after compression" of
the outcome is roughly, modulo some constants and factor, (1 - 2^(-n)),
where "1" is the uncompressed length. In other words, as n gets large
nearly all strings have a "length after compression" that is close to
the original length, i.e., little compression. As n gets arbitrarily
large, an arbitrarily small fraction of strings have a short, concise
description (are "compressed").
--Tim May
More information about the cypherpunks-legacy
mailing list