[ot][spam][crazy] Quickly autotranscribing xkcd 4/1 correctly

Sat Apr 2 04:56:44 PDT 2022

So, the place to use a little bit of hand-curated data, at least at
first, would be the tokenizer. Then we can see whether high-confidence
areas of logo code can be used to update low-confidence areas. We
could plan to pass the confidence through other things, like a logo
parser and a heuristic, to improve accuracy.
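
A rough sketch of what passing confidence through a parser could look
like. Everything here is made up for illustration: the command subset,
the `parses` check (a stand-in for a real logo parser), the `repair`
name, and the 0.9 threshold. High-confidence tokens are held fixed,
and low-confidence positions are swapped for whichever alternatives
make the line parse.

```python
import math
from itertools import product

# Hypothetical subset of logo commands that take one integer argument.
COMMANDS = {"fd", "bk", "rt", "lt"}

def parses(tokens):
    """Stand-in for a real logo parser: command/integer pairs only."""
    if len(tokens) % 2:
        return False
    pairs = zip(tokens[::2], tokens[1::2])
    return all(cmd in COMMANDS and arg.isdigit() for cmd, arg in pairs)

def repair(candidates, threshold=0.9):
    """candidates[i] is a list of (token, prob), most likely first.

    Positions whose top probability is >= threshold are kept as-is;
    the others may be replaced by any listed alternative. Returns the
    most probable combination that parses, or the raw top-1 guess if
    nothing parses.
    """
    top1 = [cands[0][0] for cands in candidates]
    if parses(top1):
        return top1
    options = [c if c[0][1] < threshold else c[:1] for c in candidates]
    best_score, best_tokens = -1.0, top1
    for combo in product(*options):
        tokens = [tok for tok, _ in combo]
        score = math.prod(p for _, p in combo)
        if parses(tokens) and score > best_score:
            best_score, best_tokens = score, tokens
    return best_tokens

# The model confused "100" with "l00" at a low-confidence position.
guess = [
    [("fd", 0.99)],
    [("l00", 0.5), ("100", 0.4)],
    [("rt", 0.95)],
    [("90", 0.97)],
]
fixed = repair(guess)
```

The exhaustive `product` search is only reasonable when few positions
are uncertain; a beam or a proper grammar-constrained decoder would be
the next step.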

We need a detokenizer that can produce logo code.
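
Something like the following might be a starting point. The vocabulary
is an assumption (the actual token set isn't decided yet), as are the
`<pad>`/`<eos>` specials and the choice to emit numbers one digit at a
time. Adjacent digit tokens are merged into a single number, so two
numbers in a row would need a separator token that this sketch does
not have.

```python
# Hypothetical vocabulary; the real tokenizer's token set is not decided.
VOCAB = (
    ["<pad>", "<eos>", "fd", "bk", "rt", "lt", "pu", "pd", "repeat", "[", "]"]
    + [str(d) for d in range(10)]
)

def detokenize(token_ids):
    """Turn a sequence of token ids into a line of logo code."""
    words = []
    number = ""  # digits collected so far for the current number
    for tid in token_ids:
        tok = VOCAB[tid]
        if tok == "<eos>":
            break
        if tok == "<pad>":
            continue
        if tok.isdigit():
            number += tok
            continue
        if number:
            words.append(number)
            number = ""
        words.append(tok)
    if number:
        words.append(number)
    return " ".join(words)

# Build ids from a token list so the example does not hard-code indices.
example_tokens = ["repeat", "4", "[", "fd", "1", "0", "0",
                  "rt", "9", "0", "]", "<eos>", "<pad>"]
example_ids = [VOCAB.index(t) for t in example_tokens]
code = detokenize(example_ids)
```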

It's likely also helpful to unroll the generation code a little bit,
so as to see the information that relates to confidence, and to access
more of the places where loss is calculated and backpropagation is
performed during training.
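
To make the confidence side of that concrete, here is a minimal
unrolled greedy loop in plain Python. It is a sketch, not the actual
generation code: `step_logits` stands in for one forward pass of
whatever model is used, and `toy_step_logits` is a fake model that
exists only so the loop runs. The point is that once the loop is
written out by hand, the per-token probability is right there to
record.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def greedy_decode(step_logits, eos_id, max_len=32):
    """Greedy generation that also returns each token's probability.

    step_logits(prefix_ids) -> list of logits for the next token.
    """
    ids, confidences = [], []
    for _ in range(max_len):
        probs = softmax(step_logits(ids))
        next_id = max(range(len(probs)), key=probs.__getitem__)
        ids.append(next_id)
        confidences.append(probs[next_id])
        if next_id == eos_id:
            break
    return ids, confidences

def toy_step_logits(prefix):
    """Fake 4-token model: emits 1, 2, 3, then eos (id 0)."""
    logits = [0.0, 0.0, 0.0, 0.0]
    target = len(prefix) + 1 if len(prefix) < 3 else 0
    logits[target] = 5.0
    return logits

ids, confidences = greedy_decode(toy_step_logits, eos_id=0)
```

With a real model the same loop shape would also be where a
per-position loss could be computed against reference tokens, but that
part is left out here.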