[how about the idea of encoded layers of data? we have _some_ rote encodings that we can display to the system, but not all.]
[okay, kind of like 'views' of it, maybe? i dunno.]
[that's a little helpful for how the ideas could be brought closer. let's think about the confidence approach. what were the two examples of extra confidence? a prediction model and ....]
[for the new idea, let's use the parser. we seem to like the parser. let's assume it's a process we can call to convert correct data into some other representation (see the first sketch below).]
[ok, i think i see the edge spot here. we basically need some correct data to work off of. we could in theory generate that, in which case we'd need some way to move toward it or discover it. but it makes sense here to require the transcription to contain a little compilable Logo, which basically means providing some correct Logo data with timestamps.]
[there is a way to discover it if the model understands letters. for example, the _existing model_ may have been trained on recordings of people saying the names of letters, e.g. reading out a form. if so, the letter or symbol tokens would carry weight in the output logits (see the second sketch below).]
[it sounds a little fun.]
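A minimal sketch of the parser-as-verifier idea, assuming we have some callable Logo parser. Nothing here is a real API: `Segment`, `parse_logo`, and `has_compilable_logo` are hypothetical names, and `parse_logo` is only a stand-in for whatever process converts correct data into another representation.

```python
# Sketch: require the transcription to contain some compilable Logo,
# expressed as timestamped segments plus a parser check.
# All names here are hypothetical; `parse_logo` stands in for a real parser.
from dataclasses import dataclass


@dataclass
class Segment:
    text: str     # candidate transcription for this span
    start: float  # start time, seconds
    end: float    # end time, seconds


def parse_logo(source: str) -> bool:
    """Placeholder for the real parser/compiler call.

    Here it only checks for a trivially Logo-looking command; a real
    version would run the actual parser and report success or failure.
    """
    return source.strip().lower().startswith(("fd", "forward", "rt", "repeat"))


def has_compilable_logo(segments: list[Segment]) -> list[Segment]:
    """Keep only the timestamped segments whose text parses as Logo."""
    return [s for s in segments if parse_logo(s.text)]


segments = [
    Segment("okay so first we", 0.0, 1.2),
    Segment("repeat 4 [fd 50 rt 90]", 1.2, 3.5),
]
print(has_compilable_logo(segments))  # keeps only the compilable span
```

The timestamp fields are there because the note asks for "correct Logo data with timestamps"; the parser check itself is the only hard requirement.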
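And a toy sketch of the logits idea: if the existing model's vocabulary includes letter tokens, we can measure how much probability mass they get per output frame and use that to discover spelled-out letters. The vocabulary, the fake logits, and the function names are all invented for illustration; a real model has its own tokenizer and output shape.

```python
# Sketch: estimate how much probability the model puts on single-letter
# tokens at each frame. High mass suggests the speaker is naming letters.
# The vocabulary and logits below are fabricated for the example.
import numpy as np

VOCAB = ["<blank>", "the", "a", "b", "c", "forward", "fifty"]
LETTER_IDS = [i for i, tok in enumerate(VOCAB) if len(tok) == 1]


def softmax(x: np.ndarray) -> np.ndarray:
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)


def letter_mass(frame_logits: np.ndarray) -> np.ndarray:
    """Total probability assigned to single-letter tokens, per frame."""
    probs = softmax(frame_logits)
    return probs[:, LETTER_IDS].sum(axis=-1)


rng = np.random.default_rng(0)
logits = rng.normal(size=(5, len(VOCAB)))  # 5 frames of fake logits
logits[2, VOCAB.index("b")] += 4.0         # pretend frame 2 clearly says "b"
print(letter_mass(logits).round(3))        # frame 2 should stand out
```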