20 Jan
2022
20 Jan
'22
3:08 p.m.
this version has all the outputs fixed to a single value: https://bafkreic6ulo7ahnblyxdrklw7lcshmaob4yfawznnprrdis4nop7lw6nxu.ipfs.dwe... it's quite clear that the model isn't differentiating between output positions. maybe this has to do with the decoder, which included output position encodings. the problem still seemed to occur when i used the [huggingface language model class rather than my own] which could mean i am using it wrongly. it could make sense to compare the decoder with google's original language model decoder, and see how they use it: https://github.com/deepmind/deepmind-research/tree/master/perceiver