[ot][spam][crazy] draft: learning RL
Undiscussed Horrific Abuse, One Victim of Many
gmkarl at gmail.com
Mon May 9 01:38:37 PDT 2022
On Mon, May 9, 2022, 4:22 AM Undiscussed Horrific Abuse, One Victim of Many
<gmkarl at gmail.com> wrote:
>> To represent normal goal behavior with maximization, the return function
>> needs to not only be incredibly complex, but also feed back to its own
>> evaluation, in a way not provided for in these libraries.
> It should treat anything inside the policy that can change as part of its
> environment state.
> This is so important that it should be done even if it doesn't help,
> because observing before acting matters in every situation.
There is unexpected conflict around combining these two things: more useful
processes, and safer observation before influence. I believe the combination
is important (if acontextual), and wrong only in ways that are smaller than
the eventual problems it reduces, but I understand that my perception is
incorrect in some way.
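
To make the quoted suggestion concrete, here is a minimal sketch in plain
Python of what I think it could look like. Everything in it is hypothetical
and made up for illustration (TinyPolicy, observe, the update rule); it is
not how any RL library does it. The only point is that the observation
carries a copy of whatever inside the policy can change, so the policy
observes that before it acts.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TinyPolicy:
    # Hypothetical policy; `weights` stands in for "anything inside the
    # policy that can change".
    weights: List[float] = field(default_factory=lambda: [0.5, -0.5])

    def act(self, observation: List[float]) -> int:
        # The first entries are environment features; the rest are the
        # policy's own weights as they were when the observation was made.
        n = len(self.weights)
        features, seen_weights = observation[:n], observation[n:]
        score = sum(w * x for w, x in zip(seen_weights, features))
        return 1 if score > 0 else 0

    def update(self, features: List[float], reward: float, lr: float = 0.1) -> None:
        # Placeholder learning rule, only here so that the weights change.
        self.weights = [w + lr * reward * x for w, x in zip(self.weights, features)]


def observe(env_features: List[float], policy: TinyPolicy) -> List[float]:
    # Environment state plus a copy of the policy's changeable internals.
    return list(env_features) + list(policy.weights)


policy = TinyPolicy()
features = [1.0, 2.0]

obs_before = observe(features, policy)   # observe first
action = policy.act(obs_before)          # then act
policy.update(features, reward=1.0)      # the policy changes
obs_after = observe(features, policy)    # the change is now observable
```

The environment features are the same in both observations, but obs_after
differs from obs_before because the weights moved. Whether this helps
learning at all is a separate question; the sketch only shows the observing
part.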