[ot][spam][crazy] crazylogs: STaR

Undiscussed Horrific Abuse, One Victim of Many gmkarl at gmail.com
Tue Jul 5 13:04:24 PDT 2022


my existing work is at https://github.com/xloem/rnlp .

i just got composer_nb_alter_2.py to work.

i took an existing training sample for a masked language model, and
made it work with a generative language model.

the smallest pretrained generative models i found are bigger than the
masked model i started from. they won't fit on my gpu, so training has
to run on the cpu, where it takes forever.

finetuning a pretrained model is normally a fast and effective
operation.

we don't actually need to use a generative model for this. STaR can be
done with a classifier: the model doesn't need to write out the
answer, just pick it.
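
a rough sketch of what "just pick it" could look like, as i read the
filtering step of STaR: score every candidate answer, take the
highest-scoring one, and keep a sample for finetuning only when the
pick matches the known label. everything named here (star_filter,
overlap_scorer, the dict keys) is made up for illustration and is not
code from rnlp; the word-overlap scorer is a toy stand-in for a real
classifier head.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pick_answer(logits):
    """Return the index of the most probable choice (first one on ties)."""
    probs = softmax(logits)
    return max(range(len(probs)), key=probs.__getitem__)


def star_filter(examples, score_choices):
    """Keep only the examples whose picked answer equals the gold label.

    score_choices is a hypothetical callable: it takes
    (question, rationale, choices) and returns one score per choice.
    """
    kept = []
    for ex in examples:
        logits = score_choices(ex["question"], ex["rationale"], ex["choices"])
        if pick_answer(logits) == ex["label"]:
            kept.append(ex)
    return kept


def overlap_scorer(question, rationale, choices):
    """Toy stand-in for a classifier: count words shared with the rationale."""
    words = set(rationale.lower().split())
    return [float(len(words & set(c.lower().split()))) for c in choices]


examples = [
    {"question": "what do birds use to fly?",
     "rationale": "birds flap their wings to fly",
     "choices": ["wings", "wheels"], "label": 0},
    {"question": "where do fish live?",
     "rationale": "fish need wheels",
     "choices": ["water", "wheels"], "label": 0},
]

kept = star_filter(examples, overlap_scorer)
print(len(kept), "of", len(examples), "samples kept for finetuning")
```

the kept samples would then be the finetuning data for the next round.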

however, the authors of the paper used a generative model.

i'm thinking about generalising my code to use either a generative
model or a classifier. the two setups are very similar. i was probably
going to do this anyway.
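
one way the generalisation could look, as a sketch only: put both
kinds of model behind the same "score each choice" interface, so the
rest of the code doesn't care which one is used. the class names and
the two callables (logits_fn, token_logprob_fn) are assumptions made
up for this example, not anything that exists in rnlp. the generative
variant scores a choice by summing the log-probabilities of its
tokens given the prompt.

```python
from typing import Callable, List, Sequence


class ClassifierScorer:
    """Wraps a model that directly outputs one logit per choice."""

    def __init__(self, logits_fn: Callable[[str, Sequence[str]], List[float]]):
        # logits_fn is hypothetical: (prompt, choices) -> one logit per choice
        self.logits_fn = logits_fn

    def score(self, prompt: str, choices: Sequence[str]) -> List[float]:
        return list(self.logits_fn(prompt, choices))


class GenerativeScorer:
    """Wraps a generative model: score = summed token log-probabilities."""

    def __init__(self, token_logprob_fn: Callable[[str, str], List[float]]):
        # token_logprob_fn is hypothetical: (prompt, choice) -> log-prob of
        # each token of the choice when it follows the prompt
        self.token_logprob_fn = token_logprob_fn

    def score(self, prompt: str, choices: Sequence[str]) -> List[float]:
        return [sum(self.token_logprob_fn(prompt, c)) for c in choices]


def pick(scorer, prompt: str, choices: Sequence[str]) -> int:
    """Index of the best-scoring choice, whichever scorer is plugged in."""
    scores = scorer.score(prompt, choices)
    return max(range(len(scores)), key=scores.__getitem__)


# stub models so the sketch runs without downloading anything
clf = ClassifierScorer(lambda prompt, choices: [0.2, 1.5])
gen = GenerativeScorer(
    lambda prompt, choice: [-0.1, -0.2] if choice == "wings" else [-2.0, -3.0]
)

print(pick(clf, "what do birds use to fly?", ["wheels", "wings"]))
print(pick(gen, "what do birds use to fly?", ["wheels", "wings"]))
```

summed log-probabilities favour shorter choices, so dividing by the
token count may be worth trying if the choices differ a lot in length.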


More information about the cypherpunks mailing list