[ot][spam][crazy] draft: learning RL
Undiscussed Horrific Abuse, One Victim of Many
gmkarl at gmail.com
Mon May 9 01:22:42 PDT 2022
> To represent normal goal behavior with maximization, the return function
> needs to not only be incredibly complex, but also feed back to its own
> evaluation, in a way not provided for in these libraries.
It should have anything inside the policy that can change as part of its
This is important enough that it should be done even if it doesn't help,
because observing before acting matters in every situation.
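
A minimal sketch of one way to read this, assuming the point is that the
policy's own changeable internals get folded into what it observes before
each action. Everything here is hypothetical and for illustration only:
SelfObservingPolicy, internal_state, the epsilon/decay fields, and the
placeholder action rule are not taken from any of the libraries discussed.

```python
import random


class SelfObservingPolicy:
    """Toy policy that observes its own mutable internals (hypothetical)."""

    def __init__(self, n_actions, epsilon=0.5, decay=0.9):
        self.n_actions = n_actions
        self.epsilon = epsilon  # exploration rate, changes after every step
        self.decay = decay
        self.steps = 0          # step counter, changes after every step

    def internal_state(self):
        # Anything inside the policy that can change is exposed here.
        return (self.epsilon, float(self.steps))

    def observe(self, env_obs):
        # Full observation = environment observation + the policy's own state.
        return tuple(env_obs) + self.internal_state()

    def act(self, env_obs, rng):
        # Always observe before acting.
        obs = self.observe(env_obs)
        if rng.random() < self.epsilon:
            action = rng.randrange(self.n_actions)
        else:
            # Placeholder for a learned mapping from obs to action.
            action = int(sum(obs)) % self.n_actions
        # Acting changes the internals, so the next observation differs.
        self.epsilon *= self.decay
        self.steps += 1
        return obs, action


policy = SelfObservingPolicy(n_actions=3)
rng = random.Random(0)
for _ in range(3):
    obs, action = policy.act((1.0, 2.0), rng)
    print(obs, action)
```

The environment observation stays the same across the three calls, but the
observed tuple changes each time because the policy's own state is part of it.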