[ot][spam][crazy][data] transformer model 'attention' improvement

k gmkarl at gmail.com
Wed Jan 26 07:16:45 PST 2022


https://github.com/xloem/transformers/commit/3f2b78f787dd1d204d842f26cdc026c6e41c25ba

Added configuration parameters and an import for memory-efficient attention.
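
For context, the usual trick behind memory-efficient attention is to
compute attention scores for one block of queries at a time instead of
materializing the full seq x seq score matrix. A minimal PyTorch sketch
of that general idea follows; the function and parameter names here are
illustrative, not the commit's actual API:

    import torch

    def chunked_attention(q, k, v, chunk_size=1024):
        # q, k, v: (batch, seq_len, dim)
        scale = q.shape[-1] ** -0.5
        out_chunks = []
        for start in range(0, q.shape[1], chunk_size):
            q_chunk = q[:, start:start + chunk_size]        # (b, c, d)
            scores = q_chunk @ k.transpose(-2, -1) * scale  # (b, c, seq)
            weights = scores.softmax(dim=-1)
            out_chunks.append(weights @ v)                  # (b, c, d)
        # Peak memory holds one (c, seq) score block rather than
        # the full (seq, seq) matrix.
        return torch.cat(out_chunks, dim=1)

Full memory-efficient attention also chunks the keys using an online
softmax, but query chunking alone already replaces the quadratic score
matrix with a chunk-sized slice.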

There is an existing setup for feedforward chunking. I'll likely try
to copy any reasonable patterns it establishes.
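
That existing setup is exposed in transformers as
apply_chunking_to_forward, which splits a tensor along the sequence
dimension, runs a forward function on each slice, and concatenates the
results. A hedged sketch of the pattern, with an illustrative module
wrapped around the real helper (imported from
transformers.modeling_utils in releases current at the time; newer
releases move it to transformers.pytorch_utils):

    import torch
    from torch import nn
    from transformers.modeling_utils import apply_chunking_to_forward

    class ChunkedFeedForward(nn.Module):
        # Illustrative module; real model layers wire this up via config.
        def __init__(self, dim, hidden_dim, chunk_size_feed_forward=0):
            super().__init__()
            self.chunk_size_feed_forward = chunk_size_feed_forward  # 0 disables chunking
            self.seq_len_dim = 1
            self.dense_in = nn.Linear(dim, hidden_dim)
            self.dense_out = nn.Linear(hidden_dim, dim)

        def feed_forward_chunk(self, hidden_states):
            return self.dense_out(nn.functional.gelu(self.dense_in(hidden_states)))

        def forward(self, hidden_states):
            # Applies feed_forward_chunk to chunk-sized slices along
            # dim 1, so only one slice's hidden activations are alive
            # at a time.
            return apply_chunking_to_forward(
                self.feed_forward_chunk,
                self.chunk_size_feed_forward,
                self.seq_len_dim,
                hidden_states,
            )

Models like BERT follow this same shape: a chunk_size_feed_forward
config value, a seq_len_dim attribute, and a feed_forward_chunk method.
Presumably that is the pattern worth copying for attention.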

