30 Jan
7:22 p.m.
Here's the commit hash. In my test, all the correct attention weights are 1.0. So maybe I'll run the huge pretrained model through my mempickle project, which lets it run on low-end systems, using this new code I'm writing, and verify that all the weights are output correctly before opening a pull request.

commit 84724e1de4721ea0333d6bdbb91e8bce74fbeac2 (HEAD -> return_weights, origin/return_weights)
Author: xloem <0xloem@gmail.com>
Date:   Sun Jan 30 10:14:37 2022 +0000

    return the post-softmax weights rather than the pre-softmax weights
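
For anyone following along, here is a minimal sketch of the distinction the commit message draws between pre-softmax scores and post-softmax weights. It is plain Python and is not the code from the commit; the function names and sample vectors are made up for illustration, and scaled dot-product scoring is an assumption.

```python
import math

def attention_scores(query, keys):
    """Pre-softmax scores: scaled dot product of one query with each key."""
    scale = 1.0 / math.sqrt(len(query))
    return [scale * sum(q * k for q, k in zip(query, key)) for key in keys]

def softmax(scores):
    """Numerically stable softmax; the outputs are the post-softmax weights."""
    peak = max(scores)  # subtract the max so exp() cannot overflow
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical sample data: one query and three keys.
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]

pre = attention_scores(query, keys)   # unbounded, can be negative
post = softmax(pre)                   # each in [0, 1], summing to 1

print("pre-softmax scores: ", pre)
print("post-softmax weights:", post)
```

The post-softmax weights are the ones that sum to 1 across the keys for a given query, which makes them easier to sanity-check than the raw scores. Note that a query with only a single key to attend to always gets a weight of exactly 1.0.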