Re: [ot][spam] Behavior Log For Control Data: HFRL Unit 1 Lab

24 Jun 2022

      0859 The next task is this:

Step 6: Train the PPO agent 🏃
Let's train our agent for 500,000 timesteps, don't forget to use GPU
on Colab. It will take approximately ~10min, but you can use less
timesteps if you just want to try it out.

I plan to try it out with a small number of timesteps. My first
approach for finding out how to do this will be to scroll up in the lab.
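
As a rough guide for choosing a smaller number, here is a minimal sketch
that scales the lab's own estimate (about 10 minutes for 500,000
timesteps). The linear scaling is my assumption, not something the lab
states, and the helper name estimate_minutes is hypothetical.

```python
# Rough estimate only: assumes wall-clock time scales linearly with the
# number of timesteps, starting from the lab's figure of about 10 minutes
# for 500,000 timesteps on a Colab GPU.
REFERENCE_TIMESTEPS = 500_000
REFERENCE_MINUTES = 10.0


def estimate_minutes(timesteps: int) -> float:
    """Hypothetical helper: scale the lab's estimate to another step count."""
    if timesteps < 0:
        raise ValueError("timesteps must be non-negative")
    return REFERENCE_MINUTES * timesteps / REFERENCE_TIMESTEPS


if __name__ == "__main__":
    # Print the estimate for a few candidate training lengths.
    for steps in (10_000, 50_000, 500_000):
        print(f"{steps:>7} timesteps: about {estimate_minutes(steps):.1f} min")
```

If the lab turns out to use Stable-Baselines3, the chosen number would
most likely be passed as the total_timesteps argument of the model's
learn() call, but I have not confirmed that in the lab yet.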

Undiscussed Horrific Abuse, One Victim of Many