entropy measures

Eric Hollander hh at soda.berkeley.edu
Sun Oct 25 21:41:19 PST 1992



>uuencoding will have a slightly lower single-character entropy than
>the ASCII armor PGP uses because just about every line begins with the
>letter 'M'.  This will skew the distribution slightly.  But a better
>way of distinguishing uuencoding and ascii armor is to see that it
>falls in the same entropy class, and then just look at the
>alphabetic subsets used.

It's not that simple.  The entropy of a byte stream is the average number of
bits needed to represent each byte, so it depends on what was encoded.  If
what is uuencoded is extremely repetitive, the entropy will be low, maybe
even less than one bit per character.  On the other hand, if it were random
data, the entropy would be just slightly lower than that of ascii armor.
Binaries are somewhat repetitive, so they have somewhat less entropy than
random data.  English has a lot of redundancy, so it has low entropy.
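
To make the point concrete, here is a rough sketch in Python of the
single-character (order-0) entropy measurement being discussed.  The helper
names byte_entropy and uuencode_body are made up for illustration, and plain
base64 is only a stand-in for PGP's ascii armor (real armor adds headers and a
checksum line).  The uuencoded output here is body lines only, without the
begin/end lines.

```python
import base64
import binascii
import math
import os
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Order-0 Shannon entropy of data, in bits per byte."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    # H = sum over symbols of p * log2(1/p), with p = n / total
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def uuencode_body(data: bytes) -> bytes:
    """uuencode data 45 bytes per line; full lines start with 'M'."""
    return b"".join(
        binascii.b2a_uu(data[i:i + 45]) for i in range(0, len(data), 45)
    )


# Illustrative inputs (assumptions, not data from the thread).
repetitive = b"\x00" * 45000
random_data = os.urandom(45000)

print("uuencoded zeros:  %.3f" % byte_entropy(uuencode_body(repetitive)))
print("uuencoded random: %.3f" % byte_entropy(uuencode_body(random_data)))
print("base64 random:    %.3f" % byte_entropy(base64.encodebytes(random_data)))
```

The uuencoded run of zeros comes out well under one bit per character, while
the two encodings of random data both land near six bits, so the measured
value says more about the input than about which encoding was used.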

e





