20 Jan
2022
10:52 p.m.
I wrote up some notes on reviewing Google's masked language modeling implementation, but lost them during a spasm, despite saving drafts. Anyway, I think the most useful approach will be to port Google's example pretrained model to PyTorch and see how it runs. I glanced at Hugging Face's test for this use case, and it doesn't actually verify the output data yet, so they could have a bug. Google's example model is at https://storage.googleapis.com/perceiver_io/language_perceiver_io_bytes.pick... I guess I'll build Haiku for arm64, since the pickle uses Haiku objects.
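
A rough sketch of what the first porting step might look like, assuming the unpickled checkpoint is a nested mapping of module names to parameter arrays (I haven't confirmed that layout). The `flatten_params` helper and the toy `example` tree are made up for illustration; a real port would still need each name mapped by hand onto the PyTorch model, and each leaf converted to a tensor.

```python
from collections.abc import Mapping


def flatten_params(tree, prefix=""):
    """Flatten a nested mapping of parameters into {dotted_name: leaf}.

    Hypothetical helper: assumes the checkpoint is a nested mapping whose
    leaves are arrays. The real pickle layout is not verified here.
    """
    flat = {}
    for name, value in tree.items():
        # Assumption: module paths use "/" as a separator; swap it for "."
        # so the keys look like PyTorch state-dict names.
        key = name.replace("/", ".")
        full = f"{prefix}.{key}" if prefix else key
        if isinstance(value, Mapping):
            # Recurse into sub-mappings, carrying the path so far.
            flat.update(flatten_params(value, full))
        else:
            # Leaf: keep the array as-is; tensor conversion happens later.
            flat[full] = value
    return flat


# Toy stand-in for an unpickled parameter tree (not the real model).
example = {"perceiver/encoder": {"w": [[1.0, 2.0]], "b": [0.0]}}
flat = flatten_params(example)
```

With the names flattened, comparing the ported model's output against the original on the same input bytes would be the actual check that the existing test skips.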