https://bafkreie7gyyy3alribjyl72hlm4pk4allyul7xem7yqmpl66yzcidumfnq.ipfs.dwe...
I modified this to save the trained model by adding `model.save_pretrained('words2nums')` at the end, which creates a folder and serializes the trained parameters into it in torch's zip-based format. I also changed the batch size to 128 when I ran it, though it seemed to perform well with a much smaller batch size. One pass over all 500k examples took about 3,850 steps. The loss doesn't drop below roughly 0.08 - maybe it would do better with a learning rate that decays over time (sketches of both changes are below) - but it gets the answers right, it just doesn't reach 100% certainty.

nftp has stopped working for me on Colab: when I run `npm install -g nftp`, a dependency error crops up and terminates the install.
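A minimal sketch of the save/load round trip, assuming the notebook fine-tunes a Hugging Face model; `t5-small` here is only a stand-in for whatever checkpoint it actually uses:

```python
from transformers import AutoModelForSeq2SeqLM

# Placeholder checkpoint; the notebook's actual model class/checkpoint may differ.
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

# ... training loop runs here ...

# The one-line addition at the end of the script: writes the config and the
# trained weights into a words2nums/ directory.
model.save_pretrained("words2nums")

# Reloading the fine-tuned model later from that folder:
model = AutoModelForSeq2SeqLM.from_pretrained("words2nums")
```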
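And a sketch of the decaying-learning-rate idea, not what the notebook currently does. It assumes a plain PyTorch loop where `model` and `dataloader` come from the notebook, and the initial learning rate is an arbitrary placeholder:

```python
import torch
from transformers import get_linear_schedule_with_warmup

optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)  # lr is an assumption
num_training_steps = len(dataloader)  # ~3,850 steps at batch size 128 over 500k examples
scheduler = get_linear_schedule_with_warmup(
    optimizer, num_warmup_steps=100, num_training_steps=num_training_steps
)

for batch in dataloader:
    loss = model(**batch).loss   # HF models return a loss when labels are in the batch
    loss.backward()
    optimizer.step()
    scheduler.step()             # linearly decay the learning rate every step
    optimizer.zero_grad()
```

Whether this actually pushes the loss below 0.08 is untested; it's just the obvious first thing to try.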