24 Jun
2022
2:41 p.m.
1038 I am now on the state-value function section at https://huggingface.co/blog/deep-rl-q-part1#the-state-value-function . One thing I forgot to write down in the last section: in value-based methods, the policy is defined by hand (for example, a greedy policy that picks the action with the highest value), and it is the value function that is learned by a neural network. In policy-based methods, by contrast, the policy itself is the neural network. [limiting hardcoded heuristics]
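
To make the distinction concrete for myself, here is a minimal Python sketch. It is not taken from the blog post: the state `s0`, the actions, and all the numbers are made up, and plain dictionaries stand in for what would be neural networks in practice. In the value-based half, only the values would be learned and the greedy rule is written by hand; in the policy-based half, the learned parameters directly define the action probabilities.

```python
import math
import random

# Value-based: the learned object is the value function. A lookup table with
# hypothetical numbers stands in for the neural network here.
q_values = {"s0": {"left": 0.2, "right": 0.8}}


def greedy_policy(state):
    """Hand-written policy: pick the action with the highest estimated value."""
    actions = q_values[state]
    return max(actions, key=actions.get)


# Policy-based: the learned object is the policy itself. These hypothetical
# logits stand in for the output of a policy network.
policy_logits = {"s0": {"left": 0.0, "right": 1.0}}


def action_probabilities(state):
    """Softmax over the logits gives a probability for each action."""
    logits = policy_logits[state]
    highest = max(logits.values())  # subtracted for numerical stability
    exps = {a: math.exp(v - highest) for a, v in logits.items()}
    total = sum(exps.values())
    return {a: e / total for a, e in exps.items()}


def sample_action(state, rng):
    """Draw an action from the policy's distribution; no hand-written rule."""
    probs = action_probabilities(state)
    actions = list(probs)
    weights = [probs[a] for a in actions]
    return rng.choices(actions, weights=weights, k=1)[0]


if __name__ == "__main__":
    print("value-based, greedy action:", greedy_policy("s0"))
    print("policy-based, probabilities:", action_probabilities("s0"))
    print("policy-based, sampled action:", sample_action("s0", random.Random(0)))
```

The point of the sketch is only where the hand-written part sits: `greedy_policy` is a fixed rule on top of learned values, whereas `sample_action` has no such rule and just follows whatever distribution the parameters produce.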