29 Dec 2021, 9:44 p.m.
I summarized some things at https://github.com/xloem/techsketball/blob/main/README.md, including a link to that memory-reducing attention paper at https://arxiv.org/abs/2112.05682 and some Python import statements. There's code for this paper at https://github.com/AminRezaei0x443/memory-efficient-attention, but I think that implementation leaves out some oft-used attention features and may need adaptation; I haven't verified this either way. (A sketch of the paper's chunking idea is below.) I'm also looking at the HuggingFace source and seeing that this model doesn't, by default, accept raw embeddings as inputs. Some models do; a sketch of that follows as well.
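To pin down what the paper is actually doing, here's a minimal sketch of its chunked attention in plain PyTorch. This is my own toy rendering of the algorithm, not the linked library's API: queries are processed in chunks, and for each query chunk we stream over key/value chunks while keeping a running max, a running softmax-weight sum, and a running weighted-value sum, so the full seq-by-seq attention matrix is never materialized. Chunk sizes here are arbitrary choices.

```python
import torch

def chunked_attention(q, k, v, q_chunk=256, kv_chunk=1024):
    # q, k, v: (batch, seq, dim); returns softmax(q k^T / sqrt(d)) v
    scale = q.shape[-1] ** -0.5
    outs = []
    for qs in range(0, q.shape[1], q_chunk):
        qc = q[:, qs:qs + q_chunk] * scale
        # running statistics for this query chunk
        m = torch.full(qc.shape[:2], float('-inf'), device=q.device)  # running max
        s = torch.zeros_like(m)                                       # sum of exp weights
        o = torch.zeros(*qc.shape[:2], v.shape[-1], device=q.device)  # weighted values
        for ks in range(0, k.shape[1], kv_chunk):
            kc = k[:, ks:ks + kv_chunk]
            vc = v[:, ks:ks + kv_chunk]
            logits = torch.einsum('bqd,bkd->bqk', qc, kc)
            m_new = torch.maximum(m, logits.amax(dim=-1))
            # rescale the old accumulators to the new max, then fold in this chunk
            alpha = torch.exp(m - m_new)
            p = torch.exp(logits - m_new[..., None])
            s = s * alpha + p.sum(dim=-1)
            o = o * alpha[..., None] + torch.einsum('bqk,bkd->bqd', p, vc)
            m = m_new
        outs.append(o / s[..., None])
    return torch.cat(outs, dim=1)
```

For small inputs this should match the naive `torch.softmax(q @ k.transpose(-2, -1) * scale, -1) @ v`, which is an easy sanity check before trusting any chunked version.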
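On the raw-embeddings question: many (not all) HuggingFace transformers models accept an `inputs_embeds` keyword in place of `input_ids`, and the embedding layer is reachable via `get_input_embeddings()`. A minimal sketch using GPT-2, which does support it; whether the model I'm looking at does is exactly the open question:

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('gpt2')
model = AutoModel.from_pretrained('gpt2')

ids = tokenizer('hello world', return_tensors='pt').input_ids
# look up token embeddings by hand via the model's embedding layer ...
embeds = model.get_input_embeddings()(ids)  # (batch, seq, hidden)
# ... and pass them in place of input_ids; this is the hook for driving
# a model with arbitrary continuous vectors instead of token lookups
out = model(inputs_embeds=embeds)
print(out.last_hidden_state.shape)
```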