On 6/18/23, Karl Semich <0xloem@gmail.com> wrote:
here’s the code for the bayesian one: https://github.com/gmum/few-shot-hypernets-public/tree/master/methods/hypern...
i’m wondering if there’s more private than public research in this area; hard to say.
anyway, these papers kind of say that how the weights are encoded is a hyperparameter that hasn’t been studied enough
it seems it would make sense to tack a linear layer onto a transformer to emit the target weights (rough sketch below). the google paper uses the raw outputs of a transformer, and the hypernets paper says it uses linear heads; it’s notable, i think, that it uses generators simpler than a transformer. it’s also notable that one of them said they had to remove almost all of the parts to prevent overfitting when there was very little data
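here’s a minimal sketch of what i mean by a linear head generating weights, assuming a PyTorch-style setup. the class name, the pooled-embedding input, and the target layer shapes are all made up for illustration, not taken from either paper:

```python
import torch
import torch.nn as nn

class LinearWeightHead(nn.Module):
    """Hypothetical hypernetwork head: maps a transformer's pooled embedding
    to the flat weights and bias of one small target linear layer."""
    def __init__(self, embed_dim, target_in, target_out):
        super().__init__()
        self.target_in = target_in
        self.target_out = target_out
        # a single linear head emits every weight and bias of the target layer
        self.head = nn.Linear(embed_dim, target_out * target_in + target_out)

    def forward(self, embedding, x):
        flat = self.head(embedding)                       # (batch, out*in + out)
        w = flat[:, : self.target_out * self.target_in]
        b = flat[:, self.target_out * self.target_in :]
        w = w.view(-1, self.target_out, self.target_in)   # (batch, out, in)
        # apply the generated layer to x: (batch, in) -> (batch, out)
        return torch.bmm(w, x.unsqueeze(-1)).squeeze(-1) + b

# hypothetical usage: 'embedding' would be the transformer's pooled output for a task/support set
head = LinearWeightHead(embed_dim=512, target_in=64, target_out=10)
logits = head(torch.randn(4, 512), torch.randn(4, 64))   # -> shape (4, 10)
```

the point is just that the “encoding” choice here (one flat vector per layer) is arbitrary; chunking, sharing, or normalizing that output differently is exactly the kind of hyperparameter the papers don’t seem to explore much.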
both papers are tackling image recognition using CNNs, which is kind of specific
i guess i’d like to try training something in ggml next, or maybe later; unsure.
note that it seems worthwhile to study meta-learning more extensively