PerceiverEncoder: perceiver_encoder
BasicDecoder: basic_decoder
EmbeddingDecoder: embedding_decoder

so 5 objects at the root level that map to huggingface objects in
order. i'll try to instantiate a huggingface model with the same
configuration and compare the names (sketched at the bottom of this
message, below the quote):

unclear configuration values, unsure how to map:
encoder.z_index_dim = 256
encoder.num_z_channels = d_latents

config values needing setting, relative to the transformers.PerceiverConfig() defaults:
config.qk_channels = 8 * 32
config.v_channels = config.d_latents

in google's example, the decoder is configured with:
output_num_channels = config.d_latents
position_encoding_type = 'trainable'
output_index_dims = config.max_position_embeddings
num_z_channels = config.d_latents
qk_channels = 8 * 32
v_channels = config.d_model
num_heads = 8
final_project = False
trainable_position_encoding_kwargs = {'num_channels': config.d_model}

huggingface likely copied google's code, so i may be able to just
instantiate a model and have it look like google's example already.

On 1/20/22, k <gmkarl@gmail.com> wrote:
pip3 install git+https://github.com/deepmind/dm-haiku
import pickle
params = pickle.load(open('language_perceiver_io_bytes.pickle', 'rb'))
params.keys()
the haiku weights look very similar to pytorch weights: a dictionary where each key is a path to a tensor in the model. they likely map 1-to-1 if the architectures are the same.
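for example, assuming the pickle holds haiku's usual two-level nesting of {module path: {parameter name: array}}, the whole thing flattens to pytorch-style names in a few lines (a sketch; the key name in the comment is illustrative, not checked against the file):

import pickle
import numpy as np

params = pickle.load(open('language_perceiver_io_bytes.pickle', 'rb'))

# haiku nests parameters two levels deep, e.g. params['embed']['embeddings'];
# flatten to single path-like names, analogous to a pytorch state_dict
flat = {
    f'{module_path}/{param_name}': np.asarray(array)
    for module_path, module in params.items()
    for param_name, array in module.items()
}

for name, array in sorted(flat.items()):
    print(name, array.shape)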
I typed a bunch more but lost it in more spasms. The above text came back when I reopened the last closed tab.
Parameter name guesses from reviewing the apply_perceiver function that defines the model in masked_language_modelling.ipynb (converted to plain source for reading via pip3 install jupytext; jupytext --to py masked_language_modelling.ipynb):
huggingface's TextPreprocessor: 'embed' and 'trainable_position_encoding'
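putting the name guesses so far into one place as a dict -- the left side is the haiku root modules, the right side is my unverified reading of the submodule paths in transformers' modeling_perceiver.py:

# guessed haiku-root-module -> huggingface-submodule correspondence;
# the right-hand paths are guesses, not yet confirmed against the code
root_map = {
    'embed': 'perceiver.input_preprocessor',             # TextPreprocessor
    'trainable_position_encoding': 'perceiver.input_preprocessor',
    'perceiver_encoder': 'perceiver.encoder',
    'basic_decoder': 'perceiver.decoder',
    'embedding_decoder': 'embedding_decoder',
}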
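and a minimal sketch of the comparison mentioned at the top of this message: instantiate the huggingface model with the config overrides noted there and dump both parameter namespaces side by side. this assumes transformers >= 4.15, where the perceiver models landed:

import pickle
from transformers import PerceiverConfig, PerceiverForMaskedLM

config = PerceiverConfig()
# overrides noted above, relative to the PerceiverConfig() defaults
config.qk_channels = 8 * 32
config.v_channels = config.d_latents

model = PerceiverForMaskedLM(config)
hf_names = sorted(name for name, _ in model.named_parameters())

params = pickle.load(open('language_perceiver_io_bytes.pickle', 'rb'))
haiku_names = sorted(
    f'{module_path}/{param_name}'
    for module_path, module in params.items()
    for param_name in module
)

# dump both namespaces to eyeball how they line up
for name in haiku_names:
    print('haiku:', name)
for name in hf_names:
    print('hf:   ', name)

if huggingface really did copy google's code, the two sorted lists should pair up nearly positionally, and the conversion becomes a mechanical rename (plus the odd transpose, since haiku linear weights are (in, out) where pytorch's nn.Linear stores (out, in)).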