[crazy][hobby][spam] Automated Reverse Engineering

k gmkarl at gmail.com
Sat Jan 22 11:09:39 PST 2022


this has been going slower than needed because colab was bailing
during compilation when i tried to run the model on google's tpus.

today i made a google cloud vm and precompiled the model in their
shell, and added precompilation support to the notebook.  it was
_really_ hard to make the vm, my psychosis kept having me forget i was
doing it and do something else, over and over and over again.  but
now, using the tpus, it is much faster, days turn into minutes.  i
haven't poked around with it much yet.
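a hypothetical sketch of what "precompilation support" can look like in
a notebook: run the expensive compile once per input shape, cache the
resulting executable, and reuse it on later calls.  the
`expensive_compile` function below is a stand-in for the real xla/tpu
compile step, which isn't shown in this post.

```python
import time
from functools import lru_cache

# stand-in for the expensive xla/tpu compile step (hypothetical;
# the actual notebook's compile path is not shown in the post)
def expensive_compile(input_shape: tuple) -> str:
    time.sleep(0.01)  # pretend this takes minutes on real hardware
    return f"executable-for-{input_shape}"

@lru_cache(maxsize=None)
def get_executable(input_shape: tuple) -> str:
    # compile once per shape; later calls with the same shape
    # return the cached executable immediately
    return expensive_compile(input_shape)
```

with this pattern the slow compile happens a single time up front (e.g.
in the cloud shell), and every later run with the same input shape skips
straight to execution.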

i also found there is a series of t5 models pretrained on individual
bytes instead of tokens:
https://huggingface.co/docs/transformers/model_doc/byt5
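byt5 operates directly on utf-8 bytes rather than learned tokens; per
its docs, the tokenizer maps each byte b to id b + 3, with ids 0, 1,
and 2 reserved for the pad, eos, and unk special tokens.  a minimal
sketch of that mapping (the helper names here are mine, not the
library's):

```python
# byt5-style byte tokenization: each utf-8 byte b maps to id b + 3,
# because ids 0, 1, 2 are reserved for pad/eos/unk special tokens
def byt5_encode(text: str) -> list:
    return [b + 3 for b in text.encode("utf-8")]

def byt5_decode(ids: list) -> str:
    # drop special ids (< 3), shift the rest back to raw bytes
    return bytes(i - 3 for i in ids if i >= 3).decode("utf-8")

print(byt5_encode("hi"))  # 'h' is byte 104 -> id 107, 'i' is 105 -> 108
```

since every byte is a valid input, this kind of model needs no
vocabulary at all, which seems like a natural fit for feeding it raw
machine code.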

exciting developments.

big next steps:
- have the code gather much much more data.
- try a bigger model that can learn more complex things.

i've been running the model in their shell for maybe an hour and the
loss is down to 0.5 or so.  they charge by the hour so i should really
turn it off.  i'm using the lowest-end tpus so that it models how the
notebooks should perform after i terminate the vm.


More information about the cypherpunks mailing list