24 Jun
2022
2:56 p.m.
1052 I went through the help desk to get help staying on task. They might need to add something to their FAQ, or maybe reorder it; I am not sure. The Bellman equation simplifies the calculation of the state-value and state-action value functions. The examples in this section are simplified by leaving out discounting of the reward. Note: getting the reward for each state so that the rewards can be summed is not hard, because the environment provides this information. I may have confused the terms "reward" and "return" in earlier notes. The return is the sum of the rewards collected while following the policy. Bellman equation: $V(S_t) = R_{t+1} + \gamma V(S_{t+1})$. The value of a state is the