[ot][spam][crazy] Being a tiny part of a goal

Undiscussed Horrific Abuse, One Victim of Many gmkarl at gmail.com
Sat Apr 2 08:16:17 PDT 2022


i think i'll try setting it up with 2 models first.

i did that a little and am still working on it. the detokenization
function doesn't seem to do batching yet. i also enabled raw bytes
so the model can theoretically produce any word, but that will eat into
the capacity of the model if too many new words show up during
training. this would mean another finetuning cycle would be needed.
i don't remember if that's planned. but i could work around it by uhhhh
not using random data, and instead using samples from the recording in
question.
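
for illustration only, here is a rough sketch of the two ideas above:
falling back to raw bytes for unknown words, and wrapping a
one-sequence-at-a-time detokenizer so it accepts a batch. the toy vocab
and every name in it (WORDS, tokenize, detokenize, detokenize_batch) are
made up for this sketch and are not the actual tokenizer's api.

```python
# toy sketch: hypothetical vocab and helper names, not the real tokenizer
from typing import Dict, List

# ids 0..255 are reserved for raw bytes; known words start at 256
WORDS = ["hello", "world", " "]
WORD_TO_ID: Dict[str, int] = {w: 256 + i for i, w in enumerate(WORDS)}
ID_TO_WORD: Dict[int, str] = {i: w for w, i in WORD_TO_ID.items()}
LONGEST_FIRST = sorted(WORDS, key=len, reverse=True)


def tokenize(text: str) -> List[int]:
    """greedy longest match on known words, else raw utf-8 bytes."""
    ids: List[int] = []
    pos = 0
    while pos < len(text):
        match = next((w for w in LONGEST_FIRST if text.startswith(w, pos)), None)
        if match is not None:
            ids.append(WORD_TO_ID[match])
            pos += len(match)
        else:
            # unknown character: emit one id per utf-8 byte
            ids.extend(text[pos].encode("utf-8"))
            pos += 1
    return ids


def detokenize(ids: List[int]) -> str:
    """turn a single sequence of ids back into text."""
    out = bytearray()
    for i in ids:
        if i < 256:
            out.append(i)  # raw byte id
        else:
            out.extend(ID_TO_WORD[i].encode("utf-8"))
    # a model could emit invalid byte runs, so replace rather than raise
    return out.decode("utf-8", errors="replace")


def detokenize_batch(batch: List[List[int]]) -> List[str]:
    """batched wrapper: just applies detokenize to each sequence."""
    return [detokenize(ids) for ids in batch]


if __name__ == "__main__":
    batch = [tokenize("hello world"), tokenize("hello wörld")]
    print(batch[0])                 # [256, 258, 257]
    print(len(batch[1]))            # longer: unknown word costs many byte ids
    print(detokenize_batch(batch))  # ['hello world', 'hello wörld']
```

the second sequence shows the capacity cost: a word outside the vocab
is spelled out as one id per byte, so the more of those there are, the
more the model has to learn at the byte level.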

right. the original model produces letters described as words. this is
not correct, but it isn't the current focus. when this is
addressed later it may resolve some or all of these partial concerns.

