[spam] [personal] perceiver model notes

Undiscussed Horrific Abuse, One Victim & Survivor of gmkarl at gmail.com
Thu Jan 27 07:39:10 PST 2022


https://github.com/AminRezaei0x443/memory-efficient-attention/issues/1
feat: return_weights #1
xloem opened this issue 36 minutes ago

I'm looking into hacking some of the models in the transformers
library to use this library for attention, and I don't see a way to
support return_weights yet. This is a flag passed in transformers;
when it is set, the pre-softmax attention weights are preserved and
returned to the user.
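As a rough sketch of what the flag means, here is a minimal dense
attention function with an optional return_weights flag. This is
illustrative only, not the transformers or memory-efficient-attention
API: numpy stands in for torch, and, following the description above,
the raw pre-softmax scores are what get returned alongside the output.

```python
import numpy as np

def attention(q, k, v, return_weights=False):
    """Plain scaled dot-product attention (illustrative sketch).

    q: (n_q, d), k: (n_k, d), v: (n_k, d_v).
    If return_weights is set, the pre-softmax scores are also
    returned, mirroring the flag described in the issue.
    """
    # raw pre-softmax attention scores
    scores = q @ k.T / np.sqrt(q.shape[-1])
    # numerically stable softmax over the key axis
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    out = weights @ v
    if return_weights:
        return out, scores
    return out
```

The point of the issue is that the chunked formulation discards these
scores chunk by chunk, so some extra plumbing is needed to preserve
them.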

I looked a little at implementing this in the torch backend, and I
note that the scan() function provides for only a single tensor return
value. It seems to me that scan() would be most clearly replaced by a
for loop, but it could also be modified to handle tuples, or
return_weights could be handled by accessing nonlocal data in some way
instead of returning the weights through the chunk scanner.
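The for-loop and output-parameter options can be sketched together.
The code below is an assumption-laden toy, not the library's actual
code: a plain Python loop over query chunks stands in for scan(), and
a preallocated weights_out array is filled in as a side effect, so the
chunk body still returns a single tensor. For simplicity it chunks
only over queries and writes post-softmax weights; the real backend
also chunks over keys, which would require renormalizing partial
weights.

```python
import numpy as np

def chunked_attention(q, k, v, chunk_size, weights_out=None):
    """Attention computed one query chunk at a time.

    A for loop replaces scan(); if weights_out (a preallocated
    (n_q, n_k) array) is given, each chunk's attention weights are
    written into it -- the 'output parameter' approach.  All names
    here are illustrative, not the library's API.
    """
    outputs = []
    for start in range(0, q.shape[0], chunk_size):
        q_chunk = q[start:start + chunk_size]
        scores = q_chunk @ k.T / np.sqrt(q.shape[-1])
        # stable softmax within the chunk
        w = np.exp(scores - scores.max(axis=-1, keepdims=True))
        w /= w.sum(axis=-1, keepdims=True)
        if weights_out is not None:
            # side-channel write instead of a tuple return from the
            # chunk body, so the loop still yields a single tensor
            weights_out[start:start + chunk_size] = w
        outputs.append(w @ v)
    return np.concatenate(outputs, axis=0)
```

A tuple-returning scan() would instead stack (output_chunk, w) pairs,
at the cost of changing the scanner's signature for all callers.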

commit 63fb607e6e0ed41934fbc1e1e150993b261903eb (HEAD ->
return_weights, origin/return_weights)
Author: xloem <0xloem at gmail.com>
Date:   Thu Jan 27 15:37:09 2022 +0000

    reimplemented using an output parameter. not passing test yet

