[ot][spam][crazy][data] transformer model 'attention' improvement
Undiscussed Horrific Abuse, One Victim & Survivor of Many
gmkarl at gmail.com
Tue Feb 1 13:20:15 PST 2022
oops! it may be that when masks and biases are expanded to be dense, no
additional memory is actually allocated.
uhhh !
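
A quick way to check this kind of guess, sketched in NumPy rather than whatever framework the model code actually uses (that substitution is an assumption, and so are the shapes and variable names below). In NumPy, broadcasting a compact bias to the dense attention shape returns a view whose broadcast axes have stride 0, so no dense buffer is allocated until something forces a copy. Other frameworks may or may not behave the same way.

```python
import numpy as np

# Hypothetical shapes, chosen only for illustration.
heads, queries, keys = 8, 128, 128

# A bias stored compactly: size 1 along the heads and queries axes.
bias = np.zeros((1, 1, keys), dtype=np.float32)

# "Expand to dense": broadcast to the full attention-weights shape.
dense_view = np.broadcast_to(bias, (heads, queries, keys))

# The view reports the dense shape, but the broadcast axes have stride 0,
# so every (head, query) row aliases the same underlying buffer.
strides = dense_view.strides
shares_memory = np.shares_memory(bias, dense_view)

# Forcing a contiguous copy is the step that allocates the dense buffer.
dense_copy = np.ascontiguousarray(dense_view)

print("view strides:", strides)
print("view shares memory with bias:", shares_memory)
print("bias bytes:", bias.nbytes, "dense copy bytes:", dense_copy.nbytes)
```

If the strides on the expanded axes come back as 0 and the view shares memory with the original, the expansion itself costs nothing; the cost only shows up if a later operation materializes the array.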