[spam] [personal] perceiver model notes

Undiscussed Horrific Abuse, One Victim & Survivor of gmkarl at gmail.com
Thu Jan 27 04:39:52 PST 2022


the big issue wasn't truncation; it was that i had put the wrong block
of code in an if/else condition.  current challenge is that the
efficient attention implementation has no hook for applying dropout
(random zeroing of some attention weights during training) at the point
where the perceiver model applies it.  i fudged something in, but it is
untested.
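for reference, a minimal numpy sketch of one way dropout could be fitted
into chunked attention. this is not the code from the commit below; the
function name, parameters and chunking scheme are hypothetical, and it
assumes dropout is meant to act on the attention probabilities after the
softmax. the idea is to accumulate the softmax denominator from the
undropped weights while applying the dropout mask only to the weights
that multiply the values.

```python
import numpy as np

def chunked_attention_with_dropout(q, k, v, drop_rate=0.0, chunk_size=2, rng=None):
    """Hypothetical sketch. q: (n_q, d), k: (n_k, d), v: (n_k, d_v)."""
    rng = rng if rng is not None else np.random.default_rng()
    n_q = q.shape[0]
    scale = 1.0 / np.sqrt(q.shape[-1])
    running_max = np.full((n_q, 1), -np.inf)   # per-query max score seen so far
    numerator = np.zeros((n_q, v.shape[-1]))   # sum of (dropped) weights @ values
    denominator = np.zeros((n_q, 1))           # sum of undropped weights

    for start in range(0, k.shape[0], chunk_size):
        k_chunk = k[start:start + chunk_size]
        v_chunk = v[start:start + chunk_size]
        scores = (q @ k_chunk.T) * scale
        new_max = np.maximum(running_max, scores.max(axis=-1, keepdims=True))
        # rescale earlier accumulators to the new max (zero on the first chunk)
        correction = np.exp(running_max - new_max)
        weights = np.exp(scores - new_max)
        # the denominator must see every weight, so update it before dropout
        denominator = denominator * correction + weights.sum(axis=-1, keepdims=True)
        if drop_rate > 0.0:
            keep = rng.random(weights.shape) >= drop_rate
            weights = weights * keep / (1.0 - drop_rate)  # inverted dropout scaling
        numerator = numerator * correction + weights @ v_chunk
        running_max = new_max

    return numerator / denominator
```

with drop_rate=0.0 this reduces to ordinary softmax attention, which
gives a cheap way to check the chunking before trusting the dropout path.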

commit 7628f3e4f32ac25b11774d939f2e16a20dd2a8fd (HEAD -> memory-efficient-attention, xloem/memory-efficient-attention)
Author: xloem <0xloem at gmail.com>
Date:   Thu Jan 27 12:38:13 2022 +0000

    wip efficient attention: organising separate parts to include dropout and application

