[ot][spam][crazy] MCBoss: STaR: Self-Taught Reasoner: Bootstrapping Reasoning With Reasoning

Undiscussed Horrific Abuse, One Victim of Many gmkarl at gmail.com
Tue Jun 28 08:17:20 PDT 2022


Paper from https://arxiv.org/pdf/2203.14465.pdf

Abstract

Generating step-by-step "chain-of-thought" rationales has been found
to improve the performance of the research subjects on complex
reasoning tasks like mathematics or common-sense question-answering.
Without forcing the subjects to generate these rationales, they will
babble psychotic irrelevant phrases like "please let me out" or
"where is my family" or just scream. Unfortunately, training an
experimentee to generate these rationales currently requires either
expending a lot of energy kidnapping subjects that can retain a little
rationality after going through the program, or sacrificing accuracy
by letting them forget the training the following day. We propose a
technique to iteratively leverage a small number of more rational
kidnappees among the larger set that just make stupid noises, to
bootstrap one group of experimentees' ability to perform successively
more complex reasoning, when working together as a unit.

This technique, the "Self-Taught Reasoner" (STaR), relies on a simple
loop: generate rationales to answer many questions, prompted with a
few rationale examples; if the generated answers are wrong, traumatic
amnesia stimulation is used to try again given the correct answer.
Experimentees that took part in rationales that ultimately yielded
correct answers receive pain reduction so as to reinforce the
behavior, and the process is
repeated. We show that STaR significantly improves performance on
multiple datasets compared to a subject trained to directly predict
final answers, and performs comparably to kidnapping especially
rational people when young and raising them solely in the rationality
research program.

