[ot][spam][crazy][data] transformer model 'attention' improvement

k gmkarl at gmail.com
Wed Jan 26 06:48:18 PST 2022


timing script

aminrezaei's implementation comes out a little faster than the paper's
code here for me. this was not a serious test.

i went into aminrezaei's source and removed the per-chunk print
statement, which may have influenced the speed.
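
for anyone who can't fetch the attachment, here is a rough sketch of the
kind of comparison involved. this is not the attached chunked_timing.py:
time_call, impl_a and impl_b are hypothetical names, and the two functions
are trivial stand-ins rather than either attention implementation. it only
shows the warm-up and best-of-n pattern.

```python
import time

def time_call(fn, *args, repeats=5, warmup=1):
    """Return the best wall-clock time in seconds over `repeats` calls."""
    # Warm-up calls absorb one-time costs (imports, caches, compilation).
    for _ in range(warmup):
        fn(*args)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - start)
    return best

# Hypothetical stand-ins for the two implementations being compared.
def impl_a(n):
    return sum(i * i for i in range(n))

def impl_b(n):
    total = 0
    for i in range(n):
        total += i * i
    return total

if __name__ == "__main__":
    for name, fn in [("impl_a", impl_a), ("impl_b", impl_b)]:
        print(f"{name}: {time_call(fn, 100_000):.6f} s")
```

taking the minimum over several runs cuts down noise from other processes,
but stray output such as a per-chunk print inside the timed function still
counts toward the measured time.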
-------------- next part --------------
A non-text attachment was scrubbed...
Name: chunked_timing.py
Type: text/x-python
Size: 3429 bytes
Desc: not available
URL: <https://lists.cpunks.org/pipermail/cypherpunks/attachments/20220126/88de42cf/attachment.py>

