Behavior Log For Compliance Examples: HFRL Unit 2

Fri Jun 24 07:29:37 PDT 2022

1028 I have mostly read that section. Most of it was a recap:
reinforcement learning uses a policy to choose actions based on
observations of the environment.

Policy-based methods are described as training a policy directly.
Value-based methods are described as breaking the environment into
distinguishable states, learning the value of each state, and
selecting actions that move toward the higher-valued states.
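
To make the value-based description concrete for myself, here is a
minimal sketch. It is not taken from the course: the five-state
corridor, the reward of 1.0 at the goal, the discount of 0.9, and all
of the function names are my own assumptions for illustration. It
learns a value for each state with simple value iteration, then picks
actions greedily toward the higher-valued neighbouring state.

```python
# Hypothetical 1-D corridor (my assumption, not from the course):
# states 0..4, reward only for stepping into the goal state 4.
N_STATES = 5
GOAL = 4
GAMMA = 0.9          # assumed discount factor
ACTIONS = (-1, +1)   # step left, step right


def step(state, action):
    """Move within the corridor, clamping at the walls."""
    return min(max(state + action, 0), N_STATES - 1)


def reward(next_state):
    """1.0 for arriving at the goal, 0.0 otherwise."""
    return 1.0 if next_state == GOAL else 0.0


def lookahead(state, action, values):
    """Immediate reward plus discounted value of the resulting state."""
    nxt = step(state, action)
    return reward(nxt) + GAMMA * values[nxt]


def learn_state_values(sweeps=50):
    """Value iteration; the goal is terminal, so its value stays 0."""
    values = [0.0] * N_STATES
    for _ in range(sweeps):
        for s in range(N_STATES):
            if s != GOAL:
                values[s] = max(lookahead(s, a, values) for a in ACTIONS)
    return values


def greedy_action(state, values):
    """Select the action that leads toward the best-valued outcome."""
    return max(ACTIONS, key=lambda a: lookahead(state, a, values))


values = learn_state_values()

# Follow the greedy choices from state 0 until the goal is reached.
path = [0]
while path[-1] != GOAL:
    path.append(step(path[-1], greedy_action(path[-1], values)))

print(values)
print(path)
```

The point of the sketch is that nothing here stores a policy
explicitly. The behaviour falls out of the learned values, which is
how I understand the contrast with policy-based methods.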

The next section is "The two types of value-based methods" at

