PerceiverEncoder: perceiver_encoder
BasicDecoder: basic_decoder
EmbeddingDecoder: embedding_decoder
so 5 objects at the root level that map to huggingface objects in order.
i'll try to instantiate a huggingface model with the same
configuration and compare the names (sketched in code after the
config notes below):
unclear configuration values, unsure how to map:
encoder.z_index_dim=256
encoder.num_z_channels=d_latents
config values that need to be set, starting from the transformers.PerceiverConfig() defaults:
config.qk_channels = 8 * 32
config.v_channels = config.d_latents
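
a rough sketch of that comparison, assuming the transformers classes are
named PerceiverConfig and PerceiverForMaskedLM (the names in huggingface's
perceiver port); it just applies the two overrides above and dumps the
pytorch parameter names and shapes to compare against:

from transformers import PerceiverConfig, PerceiverForMaskedLM

config = PerceiverConfig()
config.qk_channels = 8 * 32           # the two values noted above
config.v_channels = config.d_latents

model = PerceiverForMaskedLM(config)
for name, tensor in model.state_dict().items():
    print(name, tuple(tensor.shape))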
in google's example, the decoder is configurable:
output_num_channels = config.d_latents
position_encoding_type = 'trainable'
output_index_dims = config.max_position_embeddings
num_z_channels = config.d_latents
qk_channels = 8*32
v_channels = config.d_model
num_heads = 8
final_project = False
trainable_position_encoding_kwargs=dict(num_channels=config.d_model)
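
the same list restated as a valid python dict (a sketch only: the values
are taken straight from the notes above, expressed via the huggingface
config fields they appear to correspond to; these would roughly be the
kwargs google's example hands to its decoder):

from transformers import PerceiverConfig

config = PerceiverConfig()
# decoder settings from google's example, mapped onto huggingface config fields
decoder_kwargs = dict(
    output_num_channels=config.d_latents,
    position_encoding_type='trainable',
    output_index_dims=config.max_position_embeddings,
    num_z_channels=config.d_latents,
    qk_channels=8 * 32,
    v_channels=config.d_model,
    num_heads=8,
    final_project=False,
    trainable_position_encoding_kwargs=dict(num_channels=config.d_model),
)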
huggingface likely copied google's code, so i may be able to just
instantiate a model and have it look like google's example already
On 1/20/22, k
pip3 install git+https://github.com/deepmind/dm-haiku
import pickle; params = pickle.load(open('language_perceiver_io_bytes.pickle','rb')); params.keys()
the haiku weights look very similar to pytorch weights: a dictionary where each key is a path to a tensor in the model. they likely map 1-to-1 if the architectures are the same.
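
a sketch of flattening that dictionary for a side-by-side with the pytorch
names; this assumes the pickle is a standard nested haiku params dict
({module_path: {param_name: array}}), which is what the keys suggest:

import pickle

with open('language_perceiver_io_bytes.pickle', 'rb') as f:
    haiku_params = pickle.load(f)

# haiku params are nested one level: module path -> {param name: array},
# so flatten to "module_path/param_name" to line up against state_dict() keys
for module_path, entries in haiku_params.items():
    for param_name, array in entries.items():
        print(f'{module_path}/{param_name}', getattr(array, 'shape', None))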
I typed a bunch more but lost it in more spasms. The above text came back when I reopened the last closed tab.
Parameter name guesses from reviewing the apply_perceiver function that defines the model in masked_language_modelling.ipynb (converted to plain text with jupytext, via pip3 install jupytext):
huggingface's TextPreprocessor: 'embed' and 'trainable_position_encoding'