24 Jun
2022
2:41 p.m.
1038 I am now on the state-value function section at https://huggingface.co/blog/deep-rl-q-part1#the-state-value-function . One point I forgot to write down in the last section: in value-based methods, the policy is defined by hand (for example, a greedy rule over the predicted values), whereas the value function is what the neural network models; in policy-based methods, the policy itself is the neural network. [limiting hardcoded heuristics]
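
A minimal sketch of that distinction, to make it concrete for myself. Everything here is an assumption for illustration: the numbers stand in for the outputs of a hypothetical trained network for one state with three actions, and the names `greedy_policy`, `softmax` and `sample_action` are my own, not from the course.

```python
import math
import random


def greedy_policy(q_values):
    """Value-based: a hand-written rule applied on top of learned values.

    Picks the index of the action with the highest estimated value.
    """
    return max(range(len(q_values)), key=lambda a: q_values[a])


def softmax(logits):
    """Turn raw action preferences into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_action(probs, rng):
    """Policy-based: the network output *is* the policy, so just sample from it."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical network outputs for a single state (illustrative numbers only).
q_values = [0.1, 0.7, 0.3]  # value-based: estimated value per action
logits = [0.1, 0.7, 0.3]    # policy-based: raw preference per action

# Value-based: the network is trained, the policy is the fixed greedy rule.
value_based_action = greedy_policy(q_values)

# Policy-based: the network is trained to output the distribution itself.
action_probs = softmax(logits)
policy_based_action = sample_action(action_probs, random.Random(0))
```

In the first case only the value estimates would change during training and the greedy rule stays fixed; in the second case training changes the distribution directly, with no hand-written selection rule in between.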