[spam] [personal] perceiver model notes

k gmkarl at gmail.com
Thu Jan 20 07:08:44 PST 2022


this version produces the same value at every output position:
https://bafkreic6ulo7ahnblyxdrklw7lcshmaob4yfawznnprrdis4nop7lw6nxu.ipfs.dweb.link/

it's quite clear that the model isn't differentiating between output
positions.  maybe this has to do with the decoder, which includes
output position encodings.
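the intuition can be sketched with a toy cross-attention decode in
numpy (an illustration under assumptions, not the perceiver code
itself): if the decoder's output queries carry no distinct position
encodings, every output position attends to the latents identically
and collapses to the same value.

```python
import numpy as np

rng = np.random.default_rng(0)

def cross_attention_decode(queries, latents):
    # toy single-head cross-attention: output queries attend over the latents
    scores = queries @ latents.T / np.sqrt(queries.shape[1])
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ latents

latents = rng.normal(size=(8, 16))        # stand-in for the encoder's latent array

# queries WITHOUT position encodings: every output position is the same vector
flat_queries = np.ones((4, 16))
flat_out = cross_attention_decode(flat_queries, latents)

# queries WITH distinct position encodings (here just random vectors)
pos_queries = rng.normal(size=(4, 16))
pos_out = cross_attention_decode(pos_queries, latents)

print(np.allclose(flat_out[0], flat_out[1]))  # True: positions collapse
print(np.allclose(pos_out[0], pos_out[1]))    # almost surely False
```

so if the learned output position encodings aren't reaching the
decoder queries (or aren't being trained), this collapse is exactly
what you'd see.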

the problem still seemed to occur when i used the huggingface
language model class rather than my own, which could mean i am using
it incorrectly.

it could make sense to compare the decoder with google's original
language model decoder, and see how they use it:
https://github.com/deepmind/deepmind-research/tree/master/perceiver

