[ot][spam][crazy] Quickly autotranscribing xkcd 4/1 correctly

Undiscussed Horrific Abuse, One Victim of Many gmkarl at gmail.com
Sat Apr 2 04:37:17 PDT 2022


This crashes after using too much RAM, so I'm thinking it's not the
way to go. It seems silly to build something that relies on paying
money to run it.

Finetuning the model on new data would require learning the new
data format, but the work of backpropagating information into the
model weights would be reusable, and that's something I'm planning
to do anyway.
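To make "backpropagating information into the model weights" concrete, here is a minimal toy sketch of that core step: one gradient-descent update on a one-parameter linear model, in pure Python. Everything here (the model, the data, the learning rate) is illustrative and not from the post; the actual model in question is a large speech-transcription network.

```python
# Toy illustration of backpropagating information into model weights:
# a gradient-descent step on squared error for a linear model y ~ w*x + b.
# All names and values are illustrative, not from the original post.

def finetune_step(w, b, x, y, lr=0.1):
    """One gradient step: push the error on (x, y) into the weights."""
    pred = w * x + b
    err = pred - y
    # Gradients of the loss 0.5 * err**2 with respect to w and b.
    grad_w = err * x
    grad_b = err
    return w - lr * grad_w, b - lr * grad_b

# A few steps on a single (input, target) pair: the prediction moves
# toward the target as information from the example enters the weights.
w, b = 0.0, 0.0
for _ in range(50):
    w, b = finetune_step(w, b, x=1.0, y=2.0)
```

Finetuning a real transcription model is this same loop at scale: the reusable part is the update machinery, while the data-format-specific part is how examples are turned into inputs and targets.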

But maybe it would make sense to just finetune the model around
something small, such as the heuristic idea. I'm thinking of spending
some time just looking into things. Unsure.

What seems exciting is maybe just finetuning the model on human
data :/ . There's already a lot of manual transcription available for
this recording, and it is easy to generate more by hand.
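One way the existing manual transcription could become finetuning data is by pairing timestamped audio segments with the hand-written transcript lines. This is a hypothetical sketch with invented segment boundaries and text, just to show the shape of the resulting supervised set:

```python
# Hypothetical sketch: turn manual transcription into supervised
# training pairs. Segments and transcript lines are invented examples.

def build_pairs(segments, transcript_lines):
    """Pair (start, end) audio segments with human transcript lines.

    Only positions where both sides exist become training examples,
    so even a little hand transcription yields a usable small dataset.
    """
    return [
        {"start": start, "end": end, "text": text}
        for (start, end), text in zip(segments, transcript_lines)
    ]

segments = [(0.0, 2.5), (2.5, 5.0)]
lines = ["hello and welcome", "to the recording"]
pairs = build_pairs(segments, lines)
```

Because only a small amount of example data is needed here, this fits the idea below of letting the model learn the rest of the pattern itself.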

I'm thinking this approach could be generalised so as to require only
a little example data, and learn the rest. The pattern space is small
compared to the capacity of the model, since the output is limited and
the speech is repetitive. ("Pattern space" is a phrase I just made up
for the space of relations between the data and its intended meaning.)

This is hard for me. It's hard to think about nonnormative transfer
learning. It's hard! But I just posted about it to the list. What did
I post?


More information about the cypherpunks mailing list