Randomness of a bit string

Mon Jan 24 23:08:30 PST 1994

Continuing the discussion on whether there may exist a few random strings
that can be proven random...

Tim May writes:

> To see this another way, suppose an algorithm existed to always know
> if a given number is "random" or not.  [Paradoxes follow]

But that's not what I was talking about; I specifically acknowledged
that there was no such algorithm that ALWAYS gives you the
answer.  But even in the absence of a general algorithm to decide a
problem, it may be possible to decide some specific instances.  For
example, a basic result of computability theory is that there is
no algorithm that will, for any program P and input x, tell you if
P eventually halts on input x.  Yet there are many SPECIFIC instances
of programs P and inputs x for which it has been proven that P halts on
input x; this is what the whole business of formal proofs of program
correctness is about.

> If someone claims they can "prove" the sequence "0
> 1101100110111100010" is really random, ask them _how_. Ask them if the
> compression "Chaitin 27," meaning the example number given on page 27
> of Chaitin's book is not that same number, making it hardly random.

This argument is invalid.  To see why, let's review the definition of
a random string.  Randomness is defined in terms of Kolmogorov
complexity, which is defined relative to any universal function U.  (A
universal function U takes as input an encoding of a Turing machine T,
together with its input z; its output is undefined if T does not halt
on input z, otherwise its output is the value T outputs on input z.
Each different effective encoding of program-input pairs defines a
different universal function.)  The Kolmogorov complexity C_U(x) of a
string x (relative to U) is defined to be the length of the shortest
string y such that U(y) is defined and U(y) = x.  In a sense, it
doesn't matter which universal function you use, since it turns out
that for any two universal functions U and V there exist constants c1
and c2 such that

   C_U(x) <= C_V(x) + c1  for all x, and
   C_V(x) <= C_U(x) + c2  for all x.

A string x is defined to be random (w.r.t. U) if C_U(x) >= x.

Trivially then, the empty string is a random string.  Also, Tim's
example is meaningless, since it does not give an algorithm.
(Caveat: you COULD construct a universal function U that has
Chaitin's book built in to it, but it is certainly NOT the case that
every universal function has this property.)

To prove that a nonempty string x is nonempty, it suffices to prove
that for all strings y shorter than x, either U(y) is undefined
or U(y) != x.  This amounts to proving the output (and halting
behavior) of a finite number of program-input pairs.  For some strings
x and universal functions U this task may be absolutely trivial.  Consider
a Turing machine T that always halts and always outputs the empty
string, regardless of its input.  Let z_1,...,z_m be m arbitrary
strings, where m exceeds the number of strings shorter than x.  It is
straightforward to construct an effective encoding of program-input
pairs for which (T,z_i) is encoded as the i-th bit-string in
lexicographic order.  Suppose that U is the corresponding universal
function, and let y_i be the encoding of the program-input pair
(T,z_i).  Then U(y_i) is the empty string, for all 1 <= i <= m.  Since
the set { y_i : 1 <= i <= m } includes every string shorter than x,
and x is nonempty, we then see that x is random (relative to U.)

-----------------------------------------------------------------------------
Kevin S. Van Horn     | It is the means that determine the ends.
kevin at bert.cs.byu.edu |