[spam] [personal] perceiver model notes

Thu Jan 20 15:08:55 PST 2022

pip3 install git+https://github.com/deepmind/dm-haiku

>>> import pickle; params = pickle.load(open('language_perceiver_io_bytes.pickle','rb'))
>>> params.keys()

the haiku weights look very similar to pytorch weights: a dictionary
where each key is a path to a tensor in the model.  they likely map
1-to-1 if the architectures are the same.
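
a rough sketch of how the weight paths could be listed for comparison
against a pytorch state_dict.  it assumes the pickle follows haiku's
usual two-level layout (module name -> parameter name -> array), which
I have not verified for this file.  FakeArray, flatten_params and every
name in the toy tree are made up for illustration; with the real
pickle, pass the loaded params instead of toy_params.

```python
from collections import namedtuple

# Hypothetical stand-in for an array; real haiku params hold jax/numpy
# arrays, which also expose a .shape attribute.
FakeArray = namedtuple("FakeArray", ["shape"])


def flatten_params(params):
    """Flatten {module: {name: array}} into {"module/name": shape}."""
    flat = {}
    for module_name, module_params in params.items():
        for param_name, value in module_params.items():
            flat[f"{module_name}/{param_name}"] = tuple(value.shape)
    return flat


# Toy tree; module and parameter names are invented, not taken from
# the actual perceiver checkpoint.
toy_params = {
    "toy_module/linear": {"w": FakeArray((4, 8)), "b": FakeArray((8,))},
    "toy_module/embed": {"embeddings": FakeArray((262, 4))},
}

flat = flatten_params(toy_params)
for path, shape in sorted(flat.items()):
    print(path, shape)
```

the flattened paths are the closest analogue to pytorch's dotted
state_dict keys, so comparing the two sorted lists (names and shapes)
should show quickly whether a 1-to-1 mapping is plausible.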

I typed a bunch more but lost it in more spasms.  The above text came
back when I reopened the last closed tab.

Parameter name guesses from reviewing the apply_perceiver function
that defines the model in masked_language_modelling.ipynb (read as
plain text with jupytext, installed via pip3 install jupytext):

huggingface's TextPreprocessor: 'embed' and 'trainable_position_encoding'