Random Data Compressed 100:1 (Guffaw)

georgemw at speakeasy.net georgemw at speakeasy.net
Wed Jan 9 12:43:49 PST 2002


On 9 Jan 2002, at 10:24, Michael Motyka wrote:


> Perhaps my universe is off in all six degrees but answer this - which is
> it : 
> 
> random data cannot be compressed 
> 
> or
> 
> random data may include recognizable and even repeating patterns 
> 
Both.

I'll try this one last time and then give up on you.

Any algorithm which maps files onto files will be AT LEAST
as likely to increase the size of a file as to decrease it
if fed random data.  Further, the chance of being able to
significantly decrease the size of a file derived from a random
source with any given algorithm is vanishingly small:
since there are 256 times as many N-byte files as there are
(N-1)-byte files, the odds of a random file being compressible
by even a single byte by a given algorithm can't be any better
than 1 in 256.  The odds of being able to compress it by two
bytes are even worse (at best 1 in 65,536).
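
You can check the counting argument empirically.  Here's a quick
sketch in Python using the standard zlib module (exact byte counts
will vary a little from run to run):

    import os, zlib

    # Draw 1 MB from the OS entropy source and try to compress it.
    data = os.urandom(1_000_000)
    compressed = zlib.compress(data, 9)

    print(len(data), len(compressed))
    # Typical result: the "compressed" output is slightly LARGER
    # than the input, because zlib adds header/framing overhead
    # and finds no exploitable structure in random bytes.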

> A string of random data may include anything. Compressibility will vary
> with the particular data and the algorithm used. Lucky with one input
> set, unlucky with another.

No.  Unlucky with virtually all of them.  There aren't words to
describe how unlikely it is that a 1 megabyte file of random data
could be compressed by a factor of 100 by any given algorithm.
It's not like winning the lottery.  It's not even like
winning the lottery 100 times in a row.  It's incomparably
less likely than either.
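
To put rough numbers on that, here's a back-of-the-envelope count
sketched in Python (the 1-in-10**8 lottery odds are an assumed
figure, just for the comparison):

    from math import log10

    # Fraction of 1,000,000-byte files that ANY fixed algorithm
    # can map to outputs of at most 10,000 bytes (100:1 ratio):
    # at most 256**10_000 outputs must cover 256**1_000_000 inputs.
    exponent = 10_000 - 1_000_000        # log base 256 of that fraction
    print(exponent * log10(256))         # about -2.4 million

    # Compare: winning a 1-in-10**8 lottery 100 times in a row
    # is "only" about 10**-800.
    print(-8 * 100)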

> I wouldn't try to base a business on getting
> lucky with the customer's input data but the absolute statement of
> incompressibility is wrong unless you preselect your data sets. Then
> they're no longer random and the definition of random becomes circular.
> 
> Mike

George