24 Jun
3:06 p.m.
1104 - this is the last section of part 1. There are two ways of learning: Monte Carlo and Temporal Difference Learning are two different training strategies, both based on the experiences of the agent. Monte Carlo waits for an entire episode of experiences before it updates. Temporal Difference updates after a single step (a quadruple of state, action, reward, next state). One of the sentences could imply that these might also apply to policy-based approaches.
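To make the difference concrete for myself, here is a minimal sketch of the two update styles for a state-value table. This is my own illustration, not from the course: the names `mc_update` and `td0_update`, the dict-based table `V`, and the default `alpha` and `gamma` values are all assumptions, and the Monte Carlo variant shown is every-visit.

```python
from typing import Dict, List, Tuple

# One experience step: (state, action, reward, next_state).
Step = Tuple[str, str, float, str]


def mc_update(V: Dict[str, float], episode: List[Step],
              alpha: float = 0.1, gamma: float = 1.0) -> Dict[str, float]:
    """Every-visit Monte Carlo: needs the whole finished episode.

    Walks the episode backwards to accumulate the actual return G,
    then moves each visited state's value toward that return.
    """
    G = 0.0
    for state, _action, reward, _next_state in reversed(episode):
        G = reward + gamma * G
        v = V.get(state, 0.0)
        V[state] = v + alpha * (G - v)
    return V


def td0_update(V: Dict[str, float], step: Step, alpha: float = 0.1,
               gamma: float = 1.0, terminal: bool = False) -> Dict[str, float]:
    """TD(0): needs only one step.

    The target is the reward plus the current estimate of the next
    state's value (zero if the next state is terminal).
    """
    state, _action, reward, next_state = step
    bootstrap = 0.0 if terminal else V.get(next_state, 0.0)
    target = reward + gamma * bootstrap
    v = V.get(state, 0.0)
    V[state] = v + alpha * (target - v)
    return V


# Hypothetical two-step episode: A -> B -> end, reward only on the last step.
episode = [("A", "right", 0.0, "B"), ("B", "right", 1.0, "end")]

V_mc = mc_update({}, episode, alpha=0.5)

V_td: Dict[str, float] = {}
for i, step in enumerate(episode):
    td0_update(V_td, step, alpha=0.5, terminal=(i == len(episode) - 1))
```

In this toy episode Monte Carlo already credits state A after one pass, because it sees the full return. TD(0) leaves A at zero on the first pass, since B's estimate was still zero when A was updated; the reward only reaches A on later passes.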