[ot][spam][crazy][data] transformer model 'attention' improvement

Undiscussed Horrific Abuse, One Victim & Survivor of Many gmkarl at gmail.com
Tue Feb 1 13:20:15 PST 2022


oops! it may be that when masks and biases are expanded to be dense, no
additional memory is actually allocated.
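
a quick way to check this kind of thing, sketched in NumPy rather than in
whatever framework the attention code actually uses (that substitution is
an assumption, as are the sizes and variable names below). in NumPy,
broadcasting a compact bias to a dense shape returns a view with stride 0
along the expanded axis, so nothing new is allocated until something
forces a copy. other frameworks may or may not behave the same way.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
queries, keys = 1024, 1024

# A bias stored compactly: one row shared by every query position.
bias = np.zeros((1, keys), dtype=np.float32)

# "Expand" it to the dense (queries, keys) shape without copying.
dense_view = np.broadcast_to(bias, (queries, keys))

print(dense_view.shape)                    # (1024, 1024)
print(dense_view.strides)                  # (0, 4): stride 0 on the expanded axis
print(np.shares_memory(bias, dense_view))  # True: same underlying buffer

# An explicit copy is what actually allocates queries * keys elements.
dense_copy = dense_view.copy()
print(dense_copy.nbytes // bias.nbytes)    # 1024
```

if the real code only ever reads the expanded mask or bias, the view is
all it needs; memory would only grow where a copy like the last step
happens.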

uhhh !

