[spam][crazy][ot] MCBoss Spinoffs 3

Sun Jun 18 16:39:05 PDT 2023

here’s the code for the bayesian one:
https://github.com/gmum/few-shot-hypernets-public/tree/master/methods/hypernets

i wonder if there’s more private than public research here, dunno

anyway it seems like these papers kind of say the encoding of weights
is a hyperparameter that hasn’t been sufficiently studied

it seems it would make sense to tack a linear layer onto a
transformer. the google paper uses raw outputs of a transformer, and
the hypernets paper says it uses linear heads. it’s notable, i think,
that the hypernets paper uses things simpler than a transformer.
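
rough sketch of the "linear head that emits weights" idea, in plain
python so it runs without any libraries. this is not code from either
paper: the names (hyper_head, make_linear, EMBED, T_IN, T_OUT) and the
sizes are made up for illustration, and the random embedding is only a
stand-in for whatever a transformer or other encoder would output.

```python
import random


def linear(x, weight, bias):
    """Compute weight @ x + bias using plain Python lists."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weight, bias)]


def make_linear(n_in, n_out, rng):
    """Create a small randomly initialised linear layer (hypothetical init)."""
    weight = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
              for _ in range(n_out)]
    bias = [0.0] * n_out
    return weight, bias


def hyper_head(embedding, head, target_in, target_out):
    """Map an embedding to the weight matrix and bias of a target linear layer.

    The head outputs one flat vector; the first target_in * target_out
    entries are reshaped into the weight matrix, the rest are the bias.
    """
    flat = linear(embedding, *head)
    n_w = target_in * target_out
    weight = [flat[i * target_in:(i + 1) * target_in]
              for i in range(target_out)]
    bias = flat[n_w:n_w + target_out]
    return weight, bias


rng = random.Random(0)
EMBED, T_IN, T_OUT = 8, 4, 3  # assumed sizes, chosen only for the demo

# The head must emit every parameter of the target layer at once.
head = make_linear(EMBED, T_IN * T_OUT + T_OUT, rng)

# Stand-in for an encoder output (e.g. a transformer's final hidden state).
embedding = [rng.uniform(-1, 1) for _ in range(EMBED)]

# Generate the target layer's parameters, then use them like normal weights.
weight, bias = hyper_head(embedding, head, T_IN, T_OUT)
logits = linear([1.0, 0.0, -1.0, 0.5], weight, bias)
```

how the flat output gets split into weights and bias here is one
arbitrary choice of encoding, which is the kind of thing that seems
understudied.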

both papers are tackling image recognition using cnns which is kind of specific

i guess i’d like to try to train something in ggml next. maybe later unsure.

note that it seems worthwhile to study metalearning more extensively