[ot][spam] Behavior Log For Compliance Examples: HFRL Unit 2

Undiscussed Horrific Abuse, One Victim of Many gmkarl at gmail.com
Fri Jun 24 07:24:36 PDT 2022


1023

I have moved through the introduction, which also listed some of the
subparts of the unit. It described Q-learning as the first algorithm
able to beat humans at some video games, and it roughly said that this
unit is important if you want to be able to work with Q-learning
algorithms. My perception is that Q-learning is less useful than PPO;
I could be wrong. This perception creates difficulty for me.

