[ot][spam] Behavior Log For Compliance Examples: HFRL Unit 2

Undiscussed Horrific Abuse, One Victim of Many gmkarl at gmail.com
Fri Jun 24 07:29:37 PDT 2022


1028 I have mostly read that section. Most of it was a recap:
reinforcement learning uses a policy to prioritise actions based on
observations of an environment.

Policy-based methods are described as training a policy directly.
Value-based methods are described as breaking the environment into
distinguishable states, learning the value of each state, and
selecting actions that move toward the higher-valued states.
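
A minimal sketch of the value-based idea, under assumptions that are
not from the course text: a hypothetical 5-state corridor with a
reward only at the last state, a known deterministic transition
function, and plain value iteration standing in for "learning the
value of the states". The names (step, reward, learn_state_values,
greedy_action, direct_policy) are made up for illustration. The
policy-based contrast is only a hand-written stand-in for a policy
that would normally be trained.

```python
# Hypothetical 1-D corridor: states 0..4, reward only on reaching state 4.
N_STATES = 5
GOAL = 4
ACTIONS = (-1, +1)  # step left, step right
GAMMA = 0.9         # discount factor (assumed value)


def step(state, action):
    """Deterministic transition, clipped to the ends of the corridor."""
    return min(max(state + action, 0), N_STATES - 1)


def reward(next_state):
    """1.0 for arriving at the goal, 0.0 otherwise."""
    return 1.0 if next_state == GOAL else 0.0


def lookahead(state, action, values):
    """One-step return: immediate reward plus discounted next-state value."""
    nxt = step(state, action)
    return reward(nxt) + GAMMA * values[nxt]


def learn_state_values(sweeps=50):
    """Value iteration: V(s) = max over a of [r + GAMMA * V(s')]."""
    values = [0.0] * N_STATES
    for _ in range(sweeps):
        for s in range(N_STATES):
            if s == GOAL:
                continue  # terminal state keeps value 0
            values[s] = max(lookahead(s, a, values) for a in ACTIONS)
    return values


def greedy_action(state, values):
    """Value-based: pick the action leading toward the higher-valued state."""
    return max(ACTIONS, key=lambda a: lookahead(state, a, values))


def direct_policy(state):
    """Policy-based stand-in: maps a state straight to an action."""
    return +1


values = learn_state_values()
value_based_actions = [greedy_action(s, values) for s in range(GOAL)]
policy_based_actions = [direct_policy(s) for s in range(GOAL)]
```

Here the value-based agent never stores an action per state. It only
stores values, and the action falls out of comparing the states it
could move to. The policy-based stand-in skips the values entirely.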

The next section is "The two types of value-based methods" at
https://huggingface.co/blog/deep-rl-q-part1#the-two-types-of-value-based-methods
.

