Oh, and for completing the vocab: I suppose you'd pass the whole recording through the original model and then add its output to the tokenizer. That should pick up enough words, I think. The random-noise stuff would then preserve the non-code text, but it'd be more interesting to map state machines to logit matrices.
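
To make the state-machine idea a bit more concrete, here's a rough sketch of one way it could work. This is only my guess at a reasonable reading: each state gets a row over the vocab, with 0 for tokens the machine allows from that state and -inf for everything else, and you add that row to the model's logits before the softmax. The names (`fsm_to_logit_matrix`, `masked_softmax`) and the toy 2-state, 3-token machine are made up for illustration, not taken from any library.

```python
import math

NEG_INF = float("-inf")


def fsm_to_logit_matrix(transitions, num_states, vocab_size):
    """Build a num_states x vocab_size additive mask from a state machine.

    `transitions` maps (state, token_id) -> next_state (hypothetical format).
    Allowed tokens get 0.0, disallowed tokens get -inf.
    """
    matrix = [[NEG_INF] * vocab_size for _ in range(num_states)]
    for (state, token_id) in transitions:
        matrix[state][token_id] = 0.0
    return matrix


def masked_softmax(logits, mask_row):
    """Add the mask row to the logits, then apply a numerically stable softmax."""
    shifted = [logit + mask for logit, mask in zip(logits, mask_row)]
    top = max(shifted)
    if top == NEG_INF:
        raise ValueError("state allows no tokens")
    exps = [math.exp(value - top) for value in shifted]  # exp(-inf) is 0.0
    total = sum(exps)
    return [e / total for e in exps]


# Toy machine (assumed for illustration): tokens 0="a", 1="b", 2="c".
# From state 0 only "a" is allowed; from state 1, "b" or "c".
transitions = {(0, 0): 1, (1, 1): 0, (1, 2): 1}
logit_matrix = fsm_to_logit_matrix(transitions, num_states=2, vocab_size=3)

probs_state0 = masked_softmax([1.0, 2.0, 3.0], logit_matrix[0])
probs_state1 = masked_softmax([1.0, 2.0, 3.0], logit_matrix[1])
```

In state 0 all the probability mass lands on token 0 no matter what the raw logits say, and in state 1 token 0 is zeroed out while "b" and "c" keep their relative odds. Advancing the machine is then just looking up `transitions[(state, sampled_token)]`.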