[ot][spam][crazy] Quickly autotranscribing xkcd 4/1 correctly

Undiscussed Horrific Abuse, One Victim of Many gmkarl at gmail.com
Sat Apr 2 05:02:12 PDT 2022


Regarding the generation code: I know that long generate function can
look daunting, but it is almost entirely boilerplate. It calls other
boilerplate functions that eventually call the model's forward pass
in a loop.
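
To make the "forward pass in a loop" part concrete, here is a minimal
sketch of what is left once the boilerplate is stripped away. It is not
the real generate function: toy_forward is a made-up stand-in for a
model (it just scores "previous token + 1" highest), and the names
generate_sketch, max_new_tokens and eos_token are my own assumptions
for illustration.

```python
from typing import Callable, List, Optional, Sequence


def toy_forward(tokens: Sequence[int], vocab_size: int = 5) -> List[float]:
    """Hypothetical stand-in for a model forward pass.

    Returns one score per vocabulary entry for the next position.
    This toy simply prefers (last token + 1) modulo vocab_size.
    """
    preferred = (tokens[-1] + 1) % vocab_size
    return [1.0 if i == preferred else 0.0 for i in range(vocab_size)]


def generate_sketch(
    forward: Callable[[Sequence[int]], List[float]],
    prompt: Sequence[int],
    max_new_tokens: int,
    eos_token: Optional[int] = None,
) -> List[int]:
    """Call the forward pass once per new token and append the result."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        scores = forward(tokens)  # one forward pass per step
        # Pick the index of the highest score.
        next_token = max(range(len(scores)), key=scores.__getitem__)
        tokens.append(next_token)
        if next_token == eos_token:  # optional early stop
            break
    return tokens


if __name__ == "__main__":
    print(generate_sketch(toy_forward, [0], 4))
```

Everything else in a real generate function (batching, caching,
stopping criteria, device handling) wraps around a loop of this shape.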

Basic generation is called "greedy": at every step it picks the single
most likely next token. The other, non-greedy options are mostly ways
of producing more diverse output, or occasionally output with
properties other than each next token being the likely one.
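
As a rough illustration of the difference, the sketch below contrasts
greedy selection with one common non-greedy option, temperature
sampling with an optional top-k cutoff. The function names
(pick_greedy, pick_sampled, softmax) and the example scores are
assumptions of mine, not taken from any particular library; real
generation code offers more options than this.

```python
import math
import random
from typing import List, Optional, Sequence


def softmax(scores: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Turn raw scores into probabilities; higher temperature flattens them."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def pick_greedy(scores: Sequence[float]) -> int:
    """Greedy: always take the single most likely token."""
    return max(range(len(scores)), key=scores.__getitem__)


def pick_sampled(
    scores: Sequence[float],
    temperature: float = 1.0,
    top_k: Optional[int] = None,
    rng: Optional[random.Random] = None,
) -> int:
    """Non-greedy: draw a token at random, weighted by its probability."""
    rng = rng or random.Random()
    # Rank token ids from most to least likely.
    candidates = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)
    if top_k is not None:
        candidates = candidates[:top_k]  # keep only the k best tokens
    probs = softmax([scores[i] for i in candidates], temperature)
    return rng.choices(candidates, weights=probs, k=1)[0]


if __name__ == "__main__":
    scores = [0.1, 2.0, 0.5]  # made-up scores for a 3-token vocabulary
    print("greedy: ", pick_greedy(scores))
    print("sampled:", [pick_sampled(scores, temperature=1.5) for _ in range(10)])
```

Greedy returns the same token every time for the same scores, while
the sampled version will sometimes return the less likely ones, which
is where the extra diversity comes from. With top_k=1 the sampler
collapses back to greedy.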


More information about the cypherpunks mailing list