[spam] [personal] perceiver model notes

k gmkarl at gmail.com
Tue Jan 18 05:35:25 PST 2022


let's check out perceiver's masked language modeling architecture just
a little bit

PerceiverForMaskedLM (in github.com/huggingface/transformers file
src/transformers/models/perceiver/modeling_perceiver.py )

i'm skipping down to the implementation of the forward() function, as
this will give me a quick description of what the model does with its
parts.

summary in pseudocode:
outputs = self.perceiver(inputs)
logits = self.embedding_decoder(outputs.logits if return_dict else outputs[0],
                                embedding_layer=self.perceiver.input_preprocessor.embeddings)
loss = CrossEntropyLoss()(logits, labels), only if labels were passed
return logits and loss
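
as a sanity check, this is roughly how the class gets driven from the
outside. the checkpoint name and mask indices are from the transformers
docs example, so treat them as illustrative rather than load-bearing:

from transformers import PerceiverTokenizer, PerceiverForMaskedLM

tokenizer = PerceiverTokenizer.from_pretrained("deepmind/language-perceiver")
model = PerceiverForMaskedLM.from_pretrained("deepmind/language-perceiver")

text = "This is an incomplete sentence where some words are missing."
inputs = tokenizer(text, padding="max_length", return_tensors="pt")

# the tokenizer is byte-level; mask the span of byte positions to fill in
inputs["input_ids"][0, 52:61] = tokenizer.mask_token_id
labels = tokenizer(text, padding="max_length", return_tensors="pt").input_ids

outputs = model(inputs=inputs["input_ids"],
                attention_mask=inputs["attention_mask"],
                labels=labels)
print(outputs.loss, outputs.logits.shape)  # logits: (batch, 2048, 262)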

ok so it basically has a 'perceiver' member that does the bulk of the
work, and an 'embedding_decoder' member that postprocesses the data.
so i flip up to __init__ where these objects are created:

self.perceiver = PerceiverModel(
  config,
  input_preprocessor = PerceiverTextPreprocessor(config),
  decoder = PerceiverBasicDecoder(config, plus lots of dimension information)
)
self.embedding_decoder = PerceiverEmbeddingDecoder(config)

so it looks like there are 4 parts the information likely flows
through (a shape sketch follows the list):
1. Maybe PerceiverTextPreprocessor preprocesses the input
2. Maybe PerceiverModel processes the preprocessed data
3. Maybe PerceiverBasicDecoder postprocesses the data
4. Finally PerceiverEmbeddingDecoder likely converts high-dimensional
outputs to simple byte probabilities
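
to make that flow concrete, here's my guess at the tensor shapes at
each stage. the numbers assume the deepmind/language-perceiver config
(d_model=768, d_latents=1280, num_latents=256, sequence length 2048,
byte vocab 262), and the stage names are placeholders, not real API
calls:

inputs   = ...                          # (batch, 2048)       byte ids
embedded = text_preprocessor(inputs)    # (batch, 2048, 768)  byte + position embeddings
latents  = perceiver_encoder(embedded)  # (batch, 256, 1280)  cross-attend into latents, then self-attend
decoded  = basic_decoder(latents)       # (batch, 2048, 768)  output queries cross-attend back out
logits   = embedding_decoder(decoded)   # (batch, 2048, 262)  scores per byte position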

Next is to look at PerceiverModel's forward() function, to see how
much of the work happens in the class itself, outside the parts that
are passed in as constructor parameters.
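
(quick aside before doing that: part 4 is probably simple enough to
sketch blind. the embedding_layer kwarg in forward() suggests the
embedding decoder just reuses the input embedding matrix transposed
(weight tying), plus a bias. a minimal sketch of what i'm assuming
PerceiverEmbeddingDecoder does; the real class may differ in details:)

import torch
from torch import nn

class EmbeddingDecoderSketch(nn.Module):
    """weight-tied output head: project hidden states back onto the byte vocab"""
    def __init__(self, vocab_size: int):
        super().__init__()
        self.bias = nn.Parameter(torch.zeros(vocab_size))

    def forward(self, hidden_states, embedding_layer: nn.Embedding):
        # (batch, seq, d_model) @ (d_model, vocab) -> (batch, seq, vocab)
        return torch.matmul(hidden_states, embedding_layer.weight.T) + self.bias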

