On 9 Jan 2002, at 10:24, Michael Motyka wrote:
> Perhaps my universe is off in all six degrees, but answer this - which is it:
> random data cannot be compressed
> or
> random data may include recognizable and even repeating patterns
Both. I'll try this one last time and then give up on you. Any algorithm that maps files onto files will be AT LEAST as likely to increase the size of a file as to decrease it if fed random data. Further, the chance of significantly decreasing the size of a file derived from a random source with any given algorithm is vanishingly small (since there are 256 times as many N-byte files as there are (N-1)-byte files, the odds of a random file being compressible by even a single byte by a given algorithm can't be any better than 1 in 256, and the odds of compressing it by two bytes are even worse).
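If you'd rather see the counting argument in practice than argue about it, here's a quick sketch (just my own illustration, with zlib standing in for "any given algorithm") that runs blocks of os.urandom output through zlib at its highest setting and counts how many come out smaller:

    import os
    import zlib

    TRIALS = 1000        # number of random blocks to test
    BLOCK_SIZE = 4096    # bytes per block

    shrank = 0
    for _ in range(TRIALS):
        block = os.urandom(BLOCK_SIZE)       # high-quality random bytes
        packed = zlib.compress(block, 9)     # best-effort DEFLATE
        if len(packed) < len(block):
            shrank += 1

    print(shrank, "of", TRIALS, "blocks got smaller")

I'd expect the answer to be 0 of 1000 every time you run it: zlib's header and framing add a few bytes, and random input has no redundancy to pay for them, so the "compressed" output comes out slightly larger than the input.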
> A string of random data may include anything. Compressibility will vary with the particular data and the algorithm used. Lucky with one input set, unlucky with another.
No. Unlucky with virtually all of them. There aren't words to describe how unlikely it is that a 1 megabyte file of random data could be compressed by a factor of 100 by any given algorithm. It's not like winning the lottery. It's not even like winning the lottery 100 times in a row. We're talking vastly improbable.
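To put even a rough number on it (back-of-the-envelope counting only, using the 1 megabyte figure above): there are 256^1048576 possible one-megabyte files but only about 256^10486 possible outputs a hundred times shorter, so no fixed algorithm can compress more than roughly one in 10^2,500,000 of them by that factor. A lottery jackpot is roughly 1 in 10^8; a hundred jackpots in a row is still only about 1 in 10^800.

    import math

    MB = 1_048_576              # bytes in a 1 MB file
    TARGET = MB // 100 + 1      # ~10,486 bytes, i.e. 100x compression

    # Pigeonhole counting: roughly 256**TARGET outputs of that length or
    # shorter exist, against 256**MB equally likely random inputs, so the
    # fraction of inputs any one algorithm can shrink that far is at most
    # 256**(TARGET - MB).
    log10_odds = (MB - TARGET) * math.log10(256)
    print("chance of 100x compression: at most 1 in 10^%d" % round(log10_odds))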
> I wouldn't try to base a business on getting lucky with the customer's input data, but the absolute statement of incompressibility is wrong unless you preselect your data sets. Then they're no longer random and the definition of random becomes circular.
> Mike
George