Re: [ot][spam] Behavior Log For Compliance Examples: HFRL Unit 2

24 Jun 2022

      1038 I am now on the state-value function section at
https://huggingface.co/blog/deep-rl-q-part1#the-state-value-function .

The bit of information I missed writing down in the last section was
that in value-based methods the policy is defined by hand (for
example, a greedy rule over the values), whereas the value function is
modularised as a neural network; in policy-based methods, the policy
itself is the neural network. [limiting hardcoded heuristics]
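
A minimal sketch of that distinction, for my own notes. Everything in
it is an assumption for illustration: the lookup tables stand in for
the neural networks, and the names (Q_VALUES, POLICY_PROBS,
greedy_policy, epsilon_greedy_policy, sample_from_policy) and all the
numbers are made up, not taken from the course.

```python
import random

# Hypothetical toy problem: 3 states, 2 actions. The numbers are invented.
# In a real value-based method this table would be a learned value network.
Q_VALUES = {
    0: [0.1, 0.5],
    1: [0.7, 0.2],
    2: [0.0, 0.0],
}

# In a real policy-based method this table would be the policy network's
# output: action probabilities for each state, learned directly.
POLICY_PROBS = {
    0: [0.0, 1.0],
    1: [1.0, 0.0],
    2: [0.5, 0.5],
}


def greedy_policy(q_values, state):
    """Value-based: a hand-written rule that picks the highest-valued action."""
    action_values = q_values[state]
    # max() returns the first maximal index, so ties resolve to the lowest action.
    return max(range(len(action_values)), key=lambda a: action_values[a])


def epsilon_greedy_policy(q_values, state, epsilon, rng):
    """Value-based with exploration: still a hand-written rule, not learned."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values[state]))
    return greedy_policy(q_values, state)


def sample_from_policy(policy_probs, state, rng):
    """Policy-based: sample straight from the learned action distribution."""
    probs = policy_probs[state]
    return rng.choices(range(len(probs)), weights=probs)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    for s in sorted(Q_VALUES):
        print(s, greedy_policy(Q_VALUES, s), sample_from_policy(POLICY_PROBS, s, rng))
```

In the value-based half, only Q_VALUES would be trained; the greedy
and epsilon-greedy rules are the hand-defined part. In the
policy-based half there is no such rule, only the learned
distribution to sample from.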

Undiscussed Horrific Abuse, One Victim of Many