here’s the code for the bayesian one: https://github.com/gmum/few-shot-hypernets-public/tree/master/methods/hypern...

i’m wondering if there’s more private than public research on this; not sure. anyway, these papers seem to treat the encoding of weights as a hyperparameter that hasn’t been studied enough. it seems like it would make sense to tack a linear layer onto a transformer and have it emit the target weights directly; a rough sketch of that idea is below. the google paper uses the raw outputs of a transformer, while the hypernets paper says it uses linear heads, and notably its hypernetworks are simpler than a transformer. both papers tackle image recognition with cnns, which is kind of specific. i’d like to try training something in ggml next, maybe later, unsure. note that it seems worthwhile to study meta-learning more extensively.
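to make the “linear layer tacked onto a transformer” idea concrete, here’s a minimal pytorch sketch. it is my own illustration, not code from either paper or the linked repo: the class name `WeightGeneratorHead`, the pooling choice, and all dimensions are assumptions. a small transformer encodes the support-set embeddings, and a single linear head maps the pooled output to the flattened weights and bias of a target linear classifier.

```python
import torch
import torch.nn as nn

class WeightGeneratorHead(nn.Module):
    """Hypothetical hypernetwork head (illustration only): a transformer
    encoder over support embeddings, plus one linear layer that emits the
    flattened weight matrix and bias of a target nn.Linear classifier."""

    def __init__(self, embed_dim: int, target_in: int, target_out: int):
        super().__init__()
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=4, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=2)
        self.target_in = target_in
        self.target_out = target_out
        # the "linear layer tacked onto a transformer": one projection from
        # the pooled transformer output to all target parameters at once
        self.weight_head = nn.Linear(embed_dim, target_out * target_in + target_out)

    def forward(self, support_embeddings: torch.Tensor):
        # support_embeddings: (batch, n_support, embed_dim)
        encoded = self.encoder(support_embeddings)
        pooled = encoded.mean(dim=1)               # (batch, embed_dim)
        params = self.weight_head(pooled)          # (batch, out*in + out)
        w = params[:, : self.target_out * self.target_in]
        b = params[:, self.target_out * self.target_in :]
        weight = w.view(-1, self.target_out, self.target_in)
        bias = b.view(-1, self.target_out)
        return weight, bias


if __name__ == "__main__":
    head = WeightGeneratorHead(embed_dim=64, target_in=128, target_out=5)
    support = torch.randn(2, 10, 64)   # 2 tasks, 10 support embeddings each
    weight, bias = head(support)
    query = torch.randn(2, 7, 128)     # 7 query features per task
    # apply the generated classifier to the query features
    logits = torch.einsum("bqi,boi->bqo", query, weight) + bias.unsqueeze(1)
    print(logits.shape)                # torch.Size([2, 7, 5])
```

whether the head should be a single linear projection like this, per-layer heads, or the raw transformer outputs themselves is exactly the “weight encoding” choice that these papers leave underexplored.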