[ot][spam][crazy] can commonsensebot make itself for me

Undescribed Horrific Abuse, One Victim & Survivor of Many gmkarl at gmail.com
Sun Nov 13 12:40:09 PST 2022


In their paper ( https://arxiv.org/pdf/2208.00635.pdf ) they say the
highest scores on CommonSenseQA were acquired via what they call
"DictRoBERTa + LWA(K+V)".

LWA means "Layer-wise Extra-hop Attention"

....

well i misplaced that.

i think i'll try to adapt bloom-560m to do this.
my plan is to give it a small dataset that i add to by hand,
have it split the dataset into train/test, and keep training an adapter
for as long as the loss on the test split keeps dropping.
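
a rough sketch of that loop, only to pin down the control flow. it does
not touch bloom-560m or any adapter library: split_dataset,
train_while_test_loss_drops, train_epoch and test_loss are all
hypothetical names, and the two callables stand in for whatever would
really update the adapter and measure loss on the held-out split.

```python
import random
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_dataset(
    examples: Sequence[T], test_fraction: float = 0.2, seed: int = 0
) -> Tuple[List[T], List[T]]:
    """Shuffle a copy of the examples and return (train, test).

    At least one example always goes to the test split, so the input
    needs two or more examples for the train split to be non-empty.
    """
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def train_while_test_loss_drops(
    train_epoch: Callable[[], None],  # assumed: one pass of adapter training
    test_loss: Callable[[], float],   # assumed: mean loss on the test split
    max_epochs: int = 100,
) -> Tuple[int, float]:
    """Train until the test loss stops improving or max_epochs is hit.

    Returns (number of epochs that improved the test loss, best test loss).
    When the loop stops early, the last epoch run was the one that failed
    to improve, so the weights are one epoch past the best point unless
    the caller saved a checkpoint.
    """
    best = test_loss()  # baseline before any training
    improved_epochs = 0
    for _ in range(max_epochs):
        train_epoch()
        loss = test_loss()
        if loss >= best:
            break
        best = loss
        improved_epochs += 1
    return improved_epochs, best
```

this stops at the first epoch where the test loss fails to drop, which is
the literal reading of the plan above.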

i infer there is something wrong with that plan, but it's a start

