[spam] [personal] perceiver model notes

k gmkarl at gmail.com
Thu Jan 20 15:18:54 PST 2022


PerceiverEncoder: perceiver_encoder
BasicDecoder: basic_decoder
EmbeddingDecoder: embedding_decoder

so, counting the 'embed' and 'trainable_position_encoding' entries from
my last mail, that's 5 objects at the root level that map to
huggingface objects in order.
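
to sanity check that, something like the below should list the root
prefixes of the pickle from my last mail (sketch, untested; assumes the
haiku keys are '/'-separated paths):

  import pickle
  params = pickle.load(open('language_perceiver_io_bytes.pickle', 'rb'))
  # haiku keys look like 'perceiver_encoder/~/...'; the first path
  # component of each key should be one of the root-level objects
  roots = sorted({key.split('/')[0] for key in params.keys()})
  print(roots)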

i'll try to instantiate a huggingface model with the same
configuration and compare the names:
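
something like this is what i mean (sketch; assumes a recent
transformers release that includes the perceiver classes):

  from transformers import PerceiverConfig, PerceiverForMaskedLM

  config = PerceiverConfig()            # defaults; adjusted per the notes below
  model = PerceiverForMaskedLM(config)

  # pytorch names are dotted paths, analogous to the haiku '/'-separated keys
  for name, tensor in model.state_dict().items():
      print(name, tuple(tensor.shape))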

unclear configuration values, unsure how to map:
 encoder.z_index_dim=256
 encoder.num_z_channels=d_latents
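
if i had to guess (unverified), those two are just the latent array
shape, which transformers exposes as num_latents and d_latents:

  # guess only: z_index_dim looks like the number of latents, and
  # num_z_channels like their width, so maybe:
  config.num_latents = 256    # encoder.z_index_dim=256 ?
  # num_z_channels=d_latents would then already hold by definition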

config values that need changing from the transformers.PerceiverConfig() defaults:
 config.qk_channels = 8 * 32
 config.v_channels = config.d_latents
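
these are plain attributes on the config, so continuing the sketch
above they can just be set before building the model (setting them
after construction wouldn't rebuild the layers):

  config.qk_channels = 8 * 32
  config.v_channels = config.d_latents
  model = PerceiverForMaskedLM(config)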

in google's example, the decoder is configured with:
 output_num_channels = config.d_latents
 position_encoding_type = 'trainable'
 output_index_dims = config.max_position_embeddings
 num_z_channels = config.d_latents
 qk_channels = 8*32
 v_channels = config.d_model
 num_heads = 8
 final_project = False
 trainable_position_encoding_kwargs = dict(num_channels=config.d_model)
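
transcribed back into code, that decoder is roughly the below (assuming
deepmind's perceiver package and its BasicDecoder keyword names, which
is where this list came from):

  # assuming the colab's imports, e.g.: from perceiver import perceiver
  decoder = perceiver.BasicDecoder(
      output_num_channels=config.d_latents,
      position_encoding_type='trainable',
      output_index_dims=config.max_position_embeddings,
      num_z_channels=config.d_latents,
      qk_channels=8 * 32,
      v_channels=config.d_model,
      num_heads=8,
      final_project=False,
      trainable_position_encoding_kwargs=dict(num_channels=config.d_model))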

huggingface likely copied google's code, so i may be able to just
instantiate a model and have it match google's example already.

On 1/20/22, k <gmkarl at gmail.com> wrote:
> pip3 install git+https://github.com/deepmind/dm-haiku
>
> >>> import pickle; params = pickle.load(open('language_perceiver_io_bytes.pickle','rb'))
> >>> params.keys()
>
> the haiku weights look very similar to pytorch weights: a dictionary
> where each key is a path to a tensor in the model.  they likely map
> 1-to-1 if the architectures are the same.
>
> I typed a bunch more but lost it in more spasms.  The above text came
> back when I reopened the last closed tab.
>
> Parameter name guesses from reviewing the apply_perceiver function
> that defines the model in masked_language_modelling.ipynb (converted
> to reviewable source with pip3 install jupytext):
>
> huggingface's TextPreprocessor: 'embed' and 'trainable_position_encoding'
>

