26 Jul
2022
2:01 a.m.
some trouble retaining my thoughts - the loss is decreasing again (it's at 1.10-1.14) - i'm training a tokenizer, but don't have code to save it if the vm times out before i use it. it looks like it will take 3-4 hours just to preprocess this data. ok, i should move the tokenizer training to a system i control, without a gpu - i also started using cppgit2, which can diff trees, produce merge output, etc. it's an api to learn. the api looks like it is in the header files, luckily.
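
a rough sketch of the "save it before the vm can time out" idea, in python with only the standard library. everything here is an assumption for illustration: `train_toy_vocab` is a hypothetical stand-in for the real tokenizer training (it just counts whitespace tokens), and `save_atomically` / `load_or_train` are made-up helper names. the point is only the pattern: write the result to disk the moment training finishes, and reuse the file on the next run instead of retraining.

```python
import json
import os
import tempfile
from collections import Counter


def train_toy_vocab(lines, vocab_size):
    """Hypothetical stand-in for tokenizer training: keep the most frequent tokens."""
    counts = Counter(tok for line in lines for tok in line.split())
    # Map each kept token to an integer id, most frequent first.
    return {tok: i for i, (tok, _) in enumerate(counts.most_common(vocab_size))}


def save_atomically(obj, path):
    """Write JSON to a temp file in the target directory, then rename it into place.

    If the process dies mid-write, the previous file (if any) is left untouched.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(obj, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the partial temp file before re-raising.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_or_train(path, lines, vocab_size):
    """Reuse a saved vocab if one exists; otherwise train and save immediately."""
    if os.path.exists(path):
        with open(path, encoding="utf-8") as f:
            return json.load(f)
    vocab = train_toy_vocab(lines, vocab_size)
    save_atomically(vocab, path)
    return vocab
```

with a real tokenizer library, the library's own save and load calls would take the place of the `json` calls; the save-immediately-then-reuse structure stays the same, and the saved file can be copied off the vm or produced on the cpu-only machine in the first place.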