[spam][crazy][ot] MCBoss Spinoffs 3

Sun Jun 18 16:41:23 PDT 2023

On 6/18/23, Karl Semich <0xloem at gmail.com> wrote:
> here’s the code for the bayesian one:
> https://github.com/gmum/few-shot-hypernets-public/tree/master/methods/hypernets
>
> i’m wondering if there’s more private than public research here dunno
>
> anyway it seems like these papers kind of say the encoding of weights
> is a hyperparameter that hasn’t been sufficiently studied
>
> it seems it would make sense to tack a linear layer onto a
> transformer. the google paper uses raw outputs of a transformer, and
> the hypernets paper says it uses linear heads. it’s notable i think it
> uses things simpler than a transformer.
it’s also notable that one of them said they actually had to remove
almost all of the parts to prevent overfitting when there was very
little data.
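
a rough sketch of what "linear heads that output weights" could look like, in plain python so it runs without any libraries. this is not the code from either paper: the class name, the sizes, and the constant task embedding are all made up for illustration, and the embedding just stands in for whatever a transformer or other encoder would produce.

```python
import random


def linear(x, W, b):
    """Apply y = W x + b to a single vector x (W is a list of rows)."""
    return [sum(w * xi for w, xi in zip(row, x)) + bj
            for row, bj in zip(W, b)]


def init_linear(n_in, n_out, rng):
    """Small random weights and a zero bias for one linear layer."""
    W = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return W, [0.0] * n_out


class LinearHeadHypernet:
    """Hypothetical hypernetwork: maps a task embedding to the weights
    of a small target classifier, using one linear head per tensor."""

    def __init__(self, embed_dim, target_in, target_out, seed=0):
        rng = random.Random(seed)
        self.target_in = target_in
        self.target_out = target_out
        # head that emits the flattened target weight matrix
        self.w_head = init_linear(embed_dim, target_in * target_out, rng)
        # head that emits the target bias vector
        self.b_head = init_linear(embed_dim, target_out, rng)

    def generate(self, task_embedding):
        flat_w = linear(task_embedding, *self.w_head)
        bias = linear(task_embedding, *self.b_head)
        # reshape the flat output into target_out rows of target_in values
        W = [flat_w[i * self.target_in:(i + 1) * self.target_in]
             for i in range(self.target_out)]
        return W, bias


hyper = LinearHeadHypernet(embed_dim=8, target_in=4, target_out=3)
task_embedding = [0.5] * 8  # assumption: stand-in for an encoder's output
W, b = hyper.generate(task_embedding)

# the generated weights are then used as an ordinary linear classifier
logits = linear([1.0, 0.0, -1.0, 2.0], W, b)
```

the only trained parameters here would be the two heads (plus whatever encoder makes the embedding); the target classifier has no parameters of its own, which is the part that seems like an understudied design choice.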
>
> both papers are tackling image recognition using cnns which is kind of
> specific
>
> i guess i’d like to try to train something in ggml next. maybe later
> unsure.
>
> note that it seems worthwhile to study metalearning more extensively
>