[spam] [personal] perceiver model notes
k
gmkarl at gmail.com
Thu Jan 20 15:18:54 PST 2022
PerceiverEncoder: perceiver_encoder
BasicDecoder: basic_decoder
EmbeddingDecoder: embedding_decoder
so there are 5 objects at the root level that map to huggingface objects in order (the other two root keys, 'embed' and 'trainable_position_encoding', are the preprocessor guesses in the quoted message below).
i'll try to instantiate a huggingface model with the same
configuration and compare the names:
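a minimal sketch of that comparison, assuming recent transformers releases (PerceiverConfig and PerceiverForMaskedLM are the class names in the library's Perceiver port; whether the defaults really match google's language model is the thing being checked):

```python
# Sketch: build an untrained HF Perceiver from the default config and
# dump its parameter names, to eyeball against the haiku params.keys().
# Assumes `pip3 install torch transformers`; no weights are downloaded,
# the model is randomly initialized locally.
from transformers import PerceiverConfig, PerceiverForMaskedLM

config = PerceiverConfig()            # defaults assumed to track the paper
model = PerceiverForMaskedLM(config)  # random init, built from config alone

for name in model.state_dict():
    print(name)
```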
unclear configuration values, unsure how to map:
encoder.z_index_dim=256
encoder.num_z_channels=d_latents
config values that need setting by hand on top of the transformers.PerceiverConfig() defaults:
config.qk_channels = 8 * 32
config.v_channels = config.d_latents
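as a sketch, with a SimpleNamespace standing in for transformers.PerceiverConfig (d_latents=1280 is the assumed library default for the language Perceiver IO):

```python
from types import SimpleNamespace

# Stand-in for transformers.PerceiverConfig(); d_latents=1280 is an
# assumed default, not read from the library here.
config = SimpleNamespace(d_latents=1280)

# Values google's example fixes but the HF defaults leave unset (None):
config.qk_channels = 8 * 32           # 256 query/key channels
config.v_channels = config.d_latents  # value channels match the latents
```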
in google's example, the decoder is configured like this:
output_num_channels = config.d_latents
position_encoding_type = 'trainable'
output_index_dims = config.max_position_embeddings
num_z_channels = config.d_latents
qk_channels = 8*32
v_channels = config.d_model
num_heads = 8
final_project = False
trainable_position_encoding_kwargs={num_channels: config.d_model}
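collected as a kwargs dict, the decoder settings above might look like this (the d_latents/d_model/max_position_embeddings values are assumed defaults, and the kwarg names follow google's BasicDecoder, not necessarily huggingface's):

```python
# google's decoder configuration from the masked-LM example, as kwargs.
# 1280/768/2048 are assumed PerceiverConfig defaults, not verified here.
d_latents, d_model, max_position_embeddings = 1280, 768, 2048

decoder_kwargs = dict(
    output_num_channels=d_latents,
    position_encoding_type="trainable",
    output_index_dims=max_position_embeddings,
    num_z_channels=d_latents,
    qk_channels=8 * 32,
    v_channels=d_model,
    num_heads=8,
    final_project=False,
    trainable_position_encoding_kwargs=dict(num_channels=d_model),
)
```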
huggingface likely copied google's code, so i may be able to just
instantiate a model and have it look like google's example already
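putting the root-level guesses above together with the preprocessor guess in the quoted message, one candidate prefix mapping (every huggingface name on the right is a hypothesis to verify against an instantiated model's state_dict(), not a confirmed module path):

```python
# Candidate mapping from haiku root-level param prefixes to huggingface
# submodules. Left: keys from params.keys() on the DeepMind pickle.
# Right: guessed HF module paths -- to be checked, not trusted.
haiku_to_hf = {
    "embed": "perceiver.input_preprocessor.embeddings",
    "trainable_position_encoding": "perceiver.input_preprocessor.position_embeddings",
    "perceiver_encoder": "perceiver.encoder",
    "basic_decoder": "perceiver.decoder",
    "embedding_decoder": "embedding_decoder",
}
```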
On 1/20/22, k <gmkarl at gmail.com> wrote:
> pip3 install git+https://github.com/deepmind/dm-haiku
>
> >>> import pickle
> >>> params = pickle.load(open('language_perceiver_io_bytes.pickle','rb'))
> >>> params.keys()
>
> the haiku weights look very similar to pytorch weights. a dictionary,
> where each key is a path to tensor in a model. they likely map 1-to-1
> if the architectures are the same.
>
> I typed a bunch more but lost it in more spasms. The above text came
> back when I reopened the last closed tab.
>
> Parameter name guesses from reviewing the apply_perceiver function
> that defines the model in masked_language_modelling.ipynb using pip3
> install jupytext:
>
> huggingface's TextPreprocessor: 'embed' and 'trainable_position_encoding'
>
More information about the cypherpunks mailing list