paradoxes of randomness

Sarad AV jtrjtrjtr2001 at yahoo.com
Sat Aug 16 04:21:15 PDT 2003


hi,


I was reading the following paper 

Complexity, Vol. 7, No. 5, May/June 2002, pp. 14-21
PARADOXES OF RANDOMNESS
Gregory Chaitin

http://www.cs.umaine.edu/~chaitin/summer.html

Under the subheading Program-Size Complexity, it says:

"Okay, so what does it mean then for a number to be
interesting or uninteresting, now that I'm giving you
a better idea of what I'm talking about. Well,
interesting means it stands out some way from the
herd, and uninteresting means it can't be
distinguished really, it's sort of an average, typical
number, one that isn't worth a second glance. So how
do you define that mathematically using this notion of
the size of computer programs? Well, it's very simple:
a number is uninteresting or algorithmically random or
irreducible or incompressible if there's no way to
name it that's more concise than just writing out the
number directly. That's the idea. 

In other words, if the most concise computer program
for calculating a number just says to print 123796402,
in that case, if that's the best you can do, then that
number is uninteresting. And that's typically what
happens. On the other hand, if there is a small,
concise computer program that calculates the number,
that's atypical, that means that it has some quality
or characteristic that enables you to pick it out and
to compress it into a smaller algorithmic description.
So that's unusual, that's an interesting number. "

However, if I take the set of positive whole numbers (zero included), i.e. S = {0, 1, 2, ...}, I can compress every positive whole number using a computer program which says



f(x) = 2x      for all x = 0, 1, 2, ...;
g(x) = 2x + 1  for all x = 0, 1, 2, ...;

and I simply put them in an infinite loop, as shown in the following pseudocode:

int x = 0;
while (1)
{
    generate f(x) = 2x;      /* the even number 2x   */
    generate g(x) = 2x + 1;  /* the odd number 2x+1  */
    x = x + 1;               /* move on to the next x */
}


In this way we can compress all the positive whole numbers.

For example, say I have the number 10, and I try to find out whether it is interesting or not.

Decimal 10 = 1010 in binary, so it takes 4 bits to store the decimal number 10.

I can write a computer program to generate decimal 10 using

f(x) = 2x = 2*5 = 10    for x = 5.

Decimal number 5 can be represented in binary as 101, so we need 3 bits to represent 5. Using these 3 bits of information, I can get the decimal number 10 when I use f(x) = 2x.

Similarly, any positive whole number can be compressed using the above pseudocode.

This is because any positive whole number multiplied by 2 gains exactly one extra bit.

For example, take f(x) = 2x and choose x = 2, which is 10 in binary (2 bits). Then 2*2 = 4, which is 100 in binary (3 bits). There is an extra bit that gets added when we multiply by 2; I am sure this is very obvious indeed.
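
Here is a quick check of that claim in C (just a sketch of my own, not anything from the paper; the bit_length helper is something I wrote for illustration):

#include <stdio.h>

/* My own check of the observation above: multiplying x by 2 shifts its
   binary representation left by one place, so 2x always needs exactly
   one more bit than x (for x >= 1). */
static int bit_length(unsigned int v)
{
    int bits = 0;
    while (v > 0) { bits++; v >>= 1; }
    return bits;
}

int main(void)
{
    unsigned int examples[] = { 2, 5, 10, 123796402 };
    for (int i = 0; i < 4; i++) {
        unsigned int x = examples[i];
        printf("x = %u needs %d bits, 2x = %u needs %d bits\n",
               x, bit_length(x), 2 * x, bit_length(2 * x));
    }
    return 0;
}

For x = 5 it reports 3 bits against 4 bits for 10, which is the one-bit saving used above.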

So in this way I can compress any positive whole number, and thus I can show that all the whole numbers are interesting, contradicting the claim in the paper that most whole numbers are uninteresting.

He uses this concept of uninteresting numbers to
demonstrate algorithmic randomness.


The paper further says:

"Once you set up this theory properly, it turns out
that most numbers, the great majority of positive
integers, are uninteresting. You can prove that as a
theorem. It's not a hard theorem, it's a counting
argument. There can't be a lot of interesting numbers,
because there aren't enough concise programs. You
know, there are a lot of positive integers, and if you
look at programs with the same size in bits, there are
only about as many programs of the same size as there
are integers, and if the programs have to be smaller,
then there just aren't enough of them to name all of
those different positive integers."
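
If I understand the counting argument he refers to, it works roughly like this (a rough sketch of my own in C, not code from the paper):

#include <stdio.h>

/* My own illustration of the counting argument in the quote: there are
   2^n bit strings of length n, but only 2^(n-c) - 1 programs with fewer
   than n - c bits, so at most roughly a 2^(-c) fraction of the n-bit
   strings can be produced by a program of fewer than n - c bits. */
int main(void)
{
    int n = 20;                                    /* size of the numbers, in bits */
    for (int c = 1; c <= 4; c++) {
        double strings  = (double)(1UL << n);              /* count of n-bit strings      */
        double programs = (double)(1UL << (n - c)) - 1.0;  /* programs of < n - c bits    */
        printf("c = %d: at most a fraction %.4f of all %d-bit strings have a program of fewer than %d bits\n",
               c, programs / strings, n, n - c);
    }
    return 0;
}

If I read him correctly, that is all he means by "there aren't enough concise programs".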

But we have already developed a small, concise program, as in the pseudocode shown above, that generates any number of whole numbers. This contradicts the above argument.


Further down, under the subheading Redundant, he says:

"Is this K bits of mathematical information? K
instances of the halting problem will give us K bits
of Turing's number. Are these K bits independent
pieces of information? Well, the answer is no, they
never are. Why not? Because you don't really need to
know K yes/no answers, it's not really K full bits of
information. There's a lot less information. It can be
compressed. Why? "

All logs are base 2.
"And this is going to be a number between zero and K,
a number that's between zero and K. 

    0 <= (# that halt) <= K

And if you write this number in binary it's really
only about log K bits. 

# that halt = log K bits
"



Now, how can there be redundancy in various instances of the halting problem?

Say I have a sack with 64 programs, of which I know that 32 will halt and 32 will not halt. I pick, without any order, 32 programs from my sack to check whether they halt or not.

If the input programs are picked truly at random, then I expect 16 of the picked programs to halt (i.e. 50% of the programs halt).
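
A quick simulation I wrote to check that figure (my own sketch; the seed and trial count are arbitrary):

#include <stdio.h>
#include <stdlib.h>

/* My own simulation of the sack experiment above (not from the paper):
   64 programs, 32 of which halt; draw 32 of them without replacement
   many times and average how many of the drawn ones halt.
   The expected value is 32 * (32/64) = 16. */
int main(void)
{
    int trials = 100000;
    long total = 0;
    srand(1);                                /* arbitrary fixed seed */
    for (int t = 0; t < trials; t++) {
        int sack[64];
        for (int i = 0; i < 64; i++)
            sack[i] = (i < 32) ? 1 : 0;      /* 1 = halts, 0 = does not halt */
        /* Fisher-Yates shuffle, then take the first 32 as the draw */
        for (int i = 63; i > 0; i--) {
            int j = rand() % (i + 1);
            int tmp = sack[i]; sack[i] = sack[j]; sack[j] = tmp;
        }
        for (int i = 0; i < 32; i++)
            total += sack[i];
    }
    printf("average halting programs per draw of 32: %.3f\n",
           (double)total / trials);
    return 0;
}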

So where is the redundancy in different instances of
the halting problem? I don't see any redundancy.

If the input programs are picked independently, i.e. without any mathematical structure, the output is by no means redundant.

In the section Redundant, he obtains a number called K, which he transforms into K bits of information. Then he says that there is redundancy in the output and that there are only log K (base 2) bits of information. Here he treats K as a number and says that he only needs log K (base 2) bits to represent the whole number K. But that is just the minimum number of bits required to hold the number K in binary representation, in its information-theoretic sense. What does it have to do with algorithmic randomness?
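
To be concrete about that representation claim, here is a small sketch of my own in C (the bits_needed helper is just something I wrote for illustration):

#include <stdio.h>

/* My own sanity check of the "about log K bits" statement: a count that
   lies between 0 and K can be written in binary using about log2(K)
   bits (bits_needed(K) to be exact). */
static int bits_needed(unsigned long max_value)
{
    int bits = 0;
    while (max_value > 0) { bits++; max_value >>= 1; }
    return bits;                 /* bits required to write max_value in binary */
}

int main(void)
{
    unsigned long K_values[] = { 64, 1000, 1000000 };
    for (int i = 0; i < 3; i++)
        printf("K = %lu: the count of halting programs fits in %d bits\n",
               K_values[i], bits_needed(K_values[i]));
    return 0;
}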

It comes down to the following question:

I do a fair coin-tossing experiment with 64 coins.

To represent 64 coins, I need 6 bits of information.

If the experiment is truly random, I will get 32 of them as heads and 32 of them as tails, though I don't know which ones are heads and which ones are tails.

Now I check which ones are heads. While counting, once I reach 32 heads, I can stop counting, since I know the remaining 32 coins are tails.
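
A small sketch of this stop-early counting in C (my own illustration; the coin outcomes here are just a stand-in pattern with exactly 32 heads, since real outcomes would be random):

#include <stdio.h>

/* My own sketch of the idea above: scan the coins, report which ones
   are heads, and stop as soon as 32 heads have been seen, since the
   remaining coins must all be tails. */
int main(void)
{
    int coins[64];
    for (int i = 0; i < 64; i++)
        coins[i] = (i < 32) ? 1 : 0;   /* stand-in outcome: exactly 32 heads */

    int heads_seen = 0;
    for (int i = 0; i < 64 && heads_seen < 32; i++) {
        if (coins[i] == 1) {
            printf("coin %d is heads\n", i);
            heads_seen++;
        }
    }
    /* every coin not printed above must be tails */
    return 0;
}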

To represent 32 coins, I need 5 bits of information. Since the experiment is truly random, I know half of them will be heads, so in this case, using 5 bits of information, I can determine all the coins that are heads and all that are tails.

So the question is: what is the minimum number of bits, or entropy, required to determine which coins are heads and which coins are tails? Is it 5 bits or 6 bits of information?

What Chaitin does in his K-instance halting problem is compress the 6 bits to 5 bits, if we draw an equivalence with our coin-tossing experiment, which we can, as it is equivalent to the coin-tossing problem.

It appears that it is along these lines that he obtains the Omega number, though the exact details are not given in this paper.

This is what I think. I would like to know more.

Regards Sarath.





