[ot][spam][crazy][data] transformer model 'attention' improvement

k gmkarl at gmail.com
Wed Jan 26 05:24:00 PST 2022


The moskomule attention implementation matches my results if I apply
the square-root scaling myself, outside the call.  The project does
not appear to install properly; the test and the library file have to
sit side by side in the same folder.
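A minimal sketch of what "performing the square root myself outside it"
likely means: if the library computes softmax attention without the usual
1/sqrt(d) factor, you can recover standard scaled dot-product attention by
dividing the queries by sqrt(d) before the call.  The function names here
are my own illustration, not the moskomule API.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_unscaled(q, k, v):
    # hypothetical stand-in for an implementation that omits the
    # 1/sqrt(d) factor
    return softmax(q @ k.T) @ v

def attention_scaled(q, k, v):
    # standard scaled dot-product attention for comparison
    d = q.shape[-1]
    return softmax((q @ k.T) / np.sqrt(d)) @ v

rng = np.random.default_rng(0)
q = rng.normal(size=(4, 8))
k = rng.normal(size=(5, 8))
v = rng.normal(size=(5, 8))

# pre-scaling the queries makes the unscaled version match
out_pre = attention_unscaled(q / np.sqrt(q.shape[-1]), k, v)
out_ref = attention_scaled(q, k, v)
assert np.allclose(out_pre, out_ref)
```

The pre-scaling works because the sqrt factor multiplies the whole
q.k^T score matrix, so it can be folded into q before the product.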
-------------- next part --------------
A non-text attachment was scrubbed...
Name: chunked_lib.py
Type: text/x-python
Size: 922 bytes
Desc: not available
URL: <https://lists.cpunks.org/pipermail/cypherpunks/attachments/20220126/57c9fec7/attachment.py>


More information about the cypherpunks mailing list