[ot][spam][crazy] Quickly autotranscribing xkcd 4/1 correctly

Undiscussed Horrific Abuse, One Victim of Many gmkarl at gmail.com
Sat Apr 2 05:55:51 PDT 2022


I'm basically thinking of this:
I'd train a new tokenizer on the expected data (short sentences and a
ton of Logo code), swap it in for the old one, and then finetune the
model until the whole system behaved the same as it did before.
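
As a rough illustration of the first step only (training a tokenizer on
the expected data), here is a toy byte-pair-encoding trainer in plain
Python. It is a sketch under assumptions, not the setup described above:
the sample corpus, the function names (train_bpe, merge_pair, tokenize)
and the merge count are all made up for the example, and a real run
would more likely use an existing tokenizer library. Swapping the
tokenizer into the model and finetuning are not shown.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace each adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def train_bpe(corpus, num_merges):
    """Learn up to `num_merges` merges from a list of strings."""
    # Every whitespace-separated word starts as a tuple of single characters.
    words = Counter(tuple(word) for line in corpus for word in line.split())
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for symbols, count in words.items():
            for pair in zip(symbols, symbols[1:]):
                pairs[pair] += count
        if not pairs:
            break  # every word is already a single symbol
        # Most frequent pair; ties are broken by the pair itself for determinism.
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        merged_words = Counter()
        for symbols, count in words.items():
            merged_words[merge_pair(symbols, best)] += count
        words = merged_words
    return merges


def tokenize(text, merges):
    """Split `text` on whitespace and apply the learned merges in order."""
    tokens = []
    for word in text.split():
        symbols = tuple(word)
        for pair in merges:
            symbols = merge_pair(symbols, pair)
        tokens.extend(symbols)
    return tokens


# Hypothetical stand-in for "small sentences and a ton of logo code".
corpus = [
    "draw a square",
    "repeat 4 [ forward 50 right 90 ]",
    "forward 100 left 90 forward 100",
    "repeat 36 [ forward 10 right 10 ]",
]
merges = train_bpe(corpus, num_merges=30)
print(tokenize("repeat 4 [ forward 50 right 90 ]", merges))
```

With a corpus this repetitive, frequent commands tend to collapse into
one or a few tokens, which is the point of fitting the tokenizer to the
expected data.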

I'd like to include more goals in the approach, but it's so satisfying
to come through the difficulty that it might be fine with just that
one for now.


More information about the cypherpunks mailing list