22 Jul
2022
7:33 a.m.
i am risking my focus to spend a little time learning the difference between local attention and transient global attention in longt5. as a side note, it is notable that it looks a little like people are using longt5 heavily.