[ot][spam][crazy] can commonsensebot make itself for me
Undescribed Horrific Abuse, One Victim & Survivor of Many
gmkarl at gmail.com
Sun Nov 13 12:40:09 PST 2022
In their paper ( https://arxiv.org/pdf/2208.00635.pdf ) they say the
highest scores on CommonSenseQA came from what they call
"DictRoBERTa + LWA(K+V)".
LWA means "Layer-wise Extra-hop Attention"
well i misplaced that.
i think i'll try to adapt bloom-560m to do this.
my plan is to give it a small dataset that i add to by hand,
have it split the dataset into train/test, and keep training an adapter
for as long as the loss on the test split keeps dropping.
i infer there is something wrong with that plan, but it's a start
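
a rough sketch of that loop, just to pin the plan down. nothing here is
bloom-specific: train_step and eval_loss are hypothetical callables standing
in for "run one epoch of adapter training" and "measure loss on the held-out
split", and split_dataset, test_fraction and max_epochs are names made up for
illustration.

```python
import random


def split_dataset(examples, test_fraction=0.2, seed=0):
    """Shuffle a copy of the examples and split it into (train, test)."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    # Always hold out at least one example, even for a tiny dataset.
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def train_while_improving(train_step, eval_loss, train, test, max_epochs=100):
    """Run epochs until the held-out loss stops dropping.

    train_step(train) and eval_loss(test) are hypothetical stand-ins for
    one adapter-training epoch and one loss evaluation (assumptions, not
    any library's API). Returns the held-out loss after each epoch run.
    """
    history = []
    best = float("inf")
    for _ in range(max_epochs):
        train_step(train)
        loss = eval_loss(test)
        history.append(loss)
        if loss >= best:
            # Held-out loss did not improve on the best seen: stop here.
            break
        best = loss
    return history
```

one likely candidate for what is wrong with the plan: if the same test split
decides when to stop, the final test loss is no longer an unbiased score.
holding out a third split for the stopping decision would be the usual fix,
though with a small hand-built dataset that may cost more data than it is
worth.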