[ot][spam][crazy][data] transformer model 'attention' improvement

k gmkarl at gmail.com
Wed Jan 26 07:16:45 PST 2022


https://github.com/xloem/transformers/commit/3f2b78f787dd1d204d842f26cdc026c6e41c25ba

Added configuration parameters and an import for memory-efficient attention.
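
For context, the usual trick behind memory-efficient attention is to
compute attention scores for one block of queries at a time instead of
materializing the full seq x seq score matrix. A minimal PyTorch sketch
of that general idea follows; the function and parameter names here are
illustrative, not the commit's actual API:

    import torch

    def chunked_attention(q, k, v, chunk_size=1024):
        # q, k, v: (batch, seq_len, dim)
        scale = q.shape[-1] ** -0.5
        out_chunks = []
        for start in range(0, q.shape[1], chunk_size):
            q_chunk = q[:, start:start + chunk_size]        # (b, c, d)
            scores = q_chunk @ k.transpose(-2, -1) * scale  # (b, c, seq)
            weights = scores.softmax(dim=-1)
            out_chunks.append(weights @ v)                  # (b, c, d)
        # Peak memory holds one (c, seq) score block rather than
        # the full (seq, seq) matrix.
        return torch.cat(out_chunks, dim=1)

Full memory-efficient attention also chunks the keys using an online
softmax, but query chunking alone already replaces the quadratic score
matrix with a chunk-sized slice.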

There is an existing setup for feedforward chunking. I'll likely try
to copy any reasonable patterns it establishes.
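
That existing setup is exposed in transformers as
apply_chunking_to_forward, which splits a tensor along the sequence
dimension, runs a forward function on each slice, and concatenates the
results. A hedged sketch of the pattern, with an illustrative module
wrapped around the real helper (imported from
transformers.modeling_utils in releases current at the time; newer
releases move it to transformers.pytorch_utils):

    import torch
    from torch import nn
    from transformers.modeling_utils import apply_chunking_to_forward

    class ChunkedFeedForward(nn.Module):
        # Illustrative module; real model layers wire this up via config.
        def __init__(self, dim, hidden_dim, chunk_size_feed_forward=0):
            super().__init__()
            self.chunk_size_feed_forward = chunk_size_feed_forward  # 0 disables chunking
            self.seq_len_dim = 1
            self.dense_in = nn.Linear(dim, hidden_dim)
            self.dense_out = nn.Linear(hidden_dim, dim)

        def feed_forward_chunk(self, hidden_states):
            return self.dense_out(nn.functional.gelu(self.dense_in(hidden_states)))

        def forward(self, hidden_states):
            # Applies feed_forward_chunk to chunk-sized slices along
            # dim 1, so only one slice's hidden activations are alive
            # at a time.
            return apply_chunking_to_forward(
                self.feed_forward_chunk,
                self.chunk_size_feed_forward,
                self.seq_len_dim,
                hidden_states,
            )

Models like BERT follow this same shape: a chunk_size_feed_forward
config value, a seq_len_dim attribute, and a feed_forward_chunk method.
Presumably that is the pattern worth copying for attention.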

