[spam] [personal] perceiver model notes

k gmkarl at gmail.com
Tue Jan 18 07:11:20 PST 2022


Figuring out Step 3: Decoding

Step 3 is interesting because it takes both the unencoded embedding
vectors (the inputs) and the encoded hidden states as input.

PerceiverBasicDecoder configuration parameters:
output_num_channels = d_latents
output_index_dims = max_position_embeddings
num_channels = d_model
v_channels = d_model
qk_channels = 8 * 32
num_heads = 8
use_query_residual = False
final_project = False
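
The parameters above can be collected into a plain dict for reference. The
concrete values for d_latents, d_model, and max_position_embeddings are my
assumptions (the defaults I believe PerceiverConfig uses), not something
stated in these notes:

```python
# Assumed PerceiverConfig defaults -- treat these three as placeholders.
d_latents = 1280
d_model = 768
max_position_embeddings = 2048

decoder_kwargs = dict(
    output_num_channels=d_latents,
    output_index_dims=max_position_embeddings,
    num_channels=d_model,
    v_channels=d_model,
    qk_channels=8 * 32,   # 256
    num_heads=8,
    use_query_residual=False,
    final_project=False,
)
```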

Step 3a: Forming the decoder query
Hopefully this is incredibly simple.

# decoder_query(inputs):
pos_emb = self.output_position_encodings(batch_size)
pos_emb = self.positions_projection(pos_emb)
pos_emb = [inputs_without_pos, pos_emb]  # concatenated, conditioned on a flag I haven't checked
return pos_emb

# __init__
output_position_encodings, positions_projection = build_position_encoding('trainable', **)

# build_position_encoding
return PerceiverTrainablePositionEncoding(), nn.Linear(channels, project_pos_dim)

PerceiverTrainablePositionEncoding is just some trainable parameters
like before.
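
A minimal sketch of what "just some trainable parameters" means here,
assuming the real PerceiverTrainablePositionEncoding is essentially an
nn.Parameter table broadcast to the batch size (the class name and shapes
below are my paraphrase, not the exact transformers API):

```python
import torch
from torch import nn

class TrainablePositionEncoding(nn.Module):
    """A learned (index_dims, num_channels) table, shared across the batch."""

    def __init__(self, index_dim, num_channels):
        super().__init__()
        self.position_embeddings = nn.Parameter(torch.randn(index_dim, num_channels))

    def forward(self, batch_size):
        # expand() prepends a batch dimension without copying the table
        return self.position_embeddings.expand(batch_size, -1, -1)

pos_enc = TrainablePositionEncoding(index_dim=2048, num_channels=768)
out = pos_enc(batch_size=4)  # shape (4, 2048, 768)
```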

So basically `decoder_query` takes inputs_without_pos and some
trainable parameters and concatenates them.

I'd been leaving inputs_without_pos out.  Revisiting
PerceiverTextPreprocessor, it looks like it returns None for this
output.  So `decoder_query` just returns some trainable parameters,
conditioned on the input only for dimension sizing (the batch size).
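
Putting the pieces together, here is a sketch of the whole of `decoder_query`
under that conclusion: with inputs_without_pos = None, the query is just the
projected, batch-broadcast trainable table. Function and argument names are
paraphrases of the transformers code, not its exact API:

```python
import torch
from torch import nn

def decoder_query(pos_emb_table, positions_projection, batch_size,
                  inputs_without_pos=None):
    # broadcast the learned table to the batch: (B, N, C)
    pos_emb = pos_emb_table.expand(batch_size, -1, -1)
    pos_emb = positions_projection(pos_emb)
    if inputs_without_pos is not None:
        # only taken when the preprocessor actually supplies extra features
        pos_emb = torch.cat([inputs_without_pos, pos_emb], dim=-1)
    return pos_emb

table = nn.Parameter(torch.randn(2048, 768))
proj = nn.Linear(768, 768)
# text path: inputs_without_pos is None, so only the table matters
query = decoder_query(table, proj, batch_size=2)  # shape (2, 2048, 768)
```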
