[spam] [personal] perceiver model notes
k
gmkarl at gmail.com
Tue Jan 18 05:35:25 PST 2022
let's check out perceiver's masked language modeling architecture just
a little bit
PerceiverForMaskedLM (in github.com/huggingface/transformers file
src/transformers/models/perceiver/modeling_perceiver.py )
i'm skipping down to the implementation of the forward() function, as
this will give me a quick description of what the model does with its
parts.
summary in pseudocode:
outputs = self.perceiver(inputs)
logits = self.embedding_decoder(the output, selected depending on the return_dict flag)
loss = CrossEntropyLoss()(logits, labels), if labels were passed in
return logits and loss
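the loss line can be sketched concretely. this is a toy plain-python
reimplementation of the cross-entropy step, not the library's code
(PyTorch's CrossEntropyLoss does this on tensors), just to show how the
default ignore_index of -100 skips unmasked positions:

```python
import math

def cross_entropy(logits, label):
    # softmax over the vocabulary, then negative log-likelihood of the label
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    return -math.log(exps[label] / sum(exps))

def masked_lm_loss(logits, labels, ignore_index=-100):
    # average cross-entropy over positions whose label != ignore_index,
    # mirroring how PyTorch's CrossEntropyLoss treats -100 by default
    losses = [cross_entropy(l, y)
              for l, y in zip(logits, labels) if y != ignore_index]
    return sum(losses) / len(losses)

# two token positions, toy vocab of 3; second position is unmasked (-100)
logits = [[2.0, 0.5, 0.1], [0.0, 3.0, 0.0]]
labels = [0, -100]
print(masked_lm_loss(logits, labels))  # loss from the single masked position
```

only the masked position contributes, which is why mlm labels are mostly -100.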
ok so it basically has a 'perceiver' member that does the bulk of the work,
and an 'embedding decoder' that postprocesses the data. so i flip up
to __init__ where these objects are created
self.perceiver = PerceiverModel(
    input_preprocessor=PerceiverTextPreprocessor(),
    decoder=PerceiverBasicDecoder(lots of dimension information),
)
self.embedding_decoder = PerceiverEmbeddingDecoder()
so it looks like there are 4 parts the information likely flows through:
1. Maybe PerceiverTextPreprocessor processes the input
2. Maybe PerceiverModel processes the preprocessed data
3. Maybe PerceiverBasicDecoder post-processes the data
4. Finally PerceiverEmbeddingDecoder likely converts high-dimensional
outputs to simple byte probabilities
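as a sanity check on that guessed flow, here's a stub pipeline. the class
names mirror the real ones but every body is an invented placeholder --
only the ordering reflects the guess above, nothing here is the real math:

```python
class TextPreprocessor:
    """1. turn raw byte ids into embedding vectors (fake: 2-dim vectors)"""
    def __call__(self, byte_ids):
        return [[float(b), float(b) * 0.5] for b in byte_ids]

class PerceiverCore:
    """2. the main model: attention over latents (stubbed as identity)"""
    def __call__(self, embeddings):
        return embeddings

class BasicDecoder:
    """3. map latents back to one output vector per position (stubbed)"""
    def __call__(self, latents):
        return latents

class EmbeddingDecoder:
    """4. turn each output vector into per-byte logits (fake vocab of 4)"""
    def __call__(self, outputs):
        return [[sum(v) * k for k in range(4)] for v in outputs]

def run(byte_ids):
    # data flows through the four suspected stages in order
    x = byte_ids
    for stage in (TextPreprocessor(), PerceiverCore(),
                  BasicDecoder(), EmbeddingDecoder()):
        x = stage(x)
    return x

logits = run([104, 105])            # two input bytes
print(len(logits), len(logits[0]))  # prints: 2 4
```

i.e. one row of logits comes out per input byte, which is what step 4 would
need to produce byte probabilities.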
Next is to look at PerceiverModel's forward function to see how
important the non-parameterised behavior of the class is.
More information about the cypherpunks mailing list