On Mon, May 9, 2022, 4:40 AM Undiscussed Horrific Abuse, One Victim of Many <
gmkarl@gmail.com> wrote:
On Mon, May 9, 2022, 4:38 AM Undiscussed Horrific Abuse, One Victim of Many <
gmkarl@gmail.com> wrote:
On Mon, May 9, 2022, 4:22 AM Undiscussed Horrific Abuse, One Victim of Many <
gmkarl@gmail.com> wrote:
To represent normal goal behavior with maximization, the return function needs not only to be incredibly complex but also to feed back into its own evaluation, which these libraries do not provide for.
Its environment state should include anything inside the policy that can change.
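
A rough sketch of what that could look like, in plain Python with no RL library. Every name here (TabularPolicy, SelfObservingEnv, the "overconfidence" penalty, the reward numbers) is made up for illustration and is not from any of these libraries. The only point is that the policy's mutable weights are part of the state, and that the reward reads them, so updating the policy changes both the state and how the next action is evaluated.

```python
from dataclasses import dataclass, field


@dataclass
class TabularPolicy:
    # Hypothetical toy policy: one preference weight per action.
    weights: list = field(default_factory=lambda: [0.0, 0.0])

    def act(self):
        # Greedy choice; ties go to the lowest action index.
        return max(range(len(self.weights)), key=lambda a: self.weights[a])

    def update(self, action, reward, lr=0.5):
        # Move the chosen action's weight toward the observed reward.
        self.weights[action] += lr * (reward - self.weights[action])


class SelfObservingEnv:
    """Toy environment whose state includes the policy's mutable internals."""

    def __init__(self, policy):
        self.policy = policy
        self.world = 0  # ordinary external state (just a step counter here)

    def state(self):
        # Anything in the policy that can change is exposed as state.
        return (self.world, tuple(self.policy.weights))

    def reward(self, action):
        # Assumed reward shape: an external term plus a term that reads
        # the policy's own weights, so evaluation feeds back on learning.
        external = 1.0 if action == 1 else 0.2
        overconfidence = max(abs(w) for w in self.policy.weights)
        return external - 0.5 * overconfidence

    def step(self, action):
        r = self.reward(action)
        self.world += 1
        return self.state(), r


def rollout(steps):
    # Returns (state_before_update, reward, state_after_update) per step.
    policy = TabularPolicy()
    env = SelfObservingEnv(policy)
    trace = []
    for _ in range(steps):
        a = policy.act()
        before, r = env.step(a)
        policy.update(a, r)
        trace.append((before, r, env.state()))
    return trace
```

Calling rollout(3) gives a trace in which the policy part of the state changes after every update even when the external part does not.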
There is censorship here: many important parts of the idea are left out, and the focus is on only one projection of the error.
The concern is a strong norm of acting before observing, a habit known to cause severe errors regardless of training and practice.