2 Apr 2022
3:21 p.m.
one idea being considered is a small model that translates the pretrained model's output into the correct output. this would ensure backpropagation doesn't alter the pretrained model's knowledge. this could possibly be simplified by, e.g., adding a layer to the model and enabling training only on that layer. but extra components, like the Perceiver decoder, make the setup a little confusing to reason about. a rough sketch of the frozen-backbone version is below.
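a minimal sketch of that simplified version, assuming PyTorch; `pretrained`, `hidden_dim`, and `out_dim` are hypothetical placeholders, not names from the project:

```python
import torch
import torch.nn as nn

class OutputTranslator(nn.Module):
    """Wraps a frozen pretrained model with a single trainable layer
    that maps its output to the desired output."""

    def __init__(self, pretrained: nn.Module, hidden_dim: int, out_dim: int):
        super().__init__()
        self.pretrained = pretrained
        # Freeze the backbone so backprop can't touch its weights.
        for p in self.pretrained.parameters():
            p.requires_grad = False
        # The only trainable part: the added translation layer.
        self.translate = nn.Linear(hidden_dim, out_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # no_grad keeps the backbone entirely out of the autograd graph.
        with torch.no_grad():
            h = self.pretrained(x)
        return self.translate(h)

# Only the new layer's parameters go to the optimizer, so training
# cannot disturb the pretrained knowledge:
#   model = OutputTranslator(pretrained, hidden_dim=768, out_dim=10)
#   opt = torch.optim.Adam(model.translate.parameters(), lr=1e-3)
```

this is the single-added-layer variant; the standalone "small translation model" version would be the same wrapper with `translate` swapped for a small network instead of one linear layer.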