Taking The Square Root of a Pile of Spam Sent To a List

A naive algorithm bumped into a task.

def reward(value):
    square = value * value
    dist = square - pile_of_spam_sent_to_list
    return -dist * dist

Hmm.  What a reward function.
Finding a value that squares near a pile of spam sent to a list!

All we have to do is extract the pile of Spam from the globals object of the reward function, and it take its square root!  No problem at all right?

The naive algorithm looked at the actions and state available to it.  Some algebra.  Some object crafting.  Some other strange things relating to spam and lists and such.  No globals object!  What to do?  What to do?

How could they have tasked me with such a difficult task, without giving me access to the data needed to do the job? the naive algorithm wondered.

The algorithm had nothing to do, so it began trying some of its state and actions together to see how the reward function behaved.

It tried letting value be 3.  The reward function threw an error: apparently a pile of spam can't be subtracted from the square of 3.  How frustrating!