[ot][spam][crazy] Quickly autotranscribing xkcd 4/1 correctly

Undiscussed Horrific Abuse, One Victim of Many gmkarl at gmail.com
Sat Apr 2 03:35:50 PDT 2022


In an attempt to address the length issue, I visited the model from the
example, and it looks like the helpful information is in its
configuration files at
https://huggingface.co/facebook/s2t-small-librispeech-asr/tree/main .

In preprocessor_config.json we have feature_size = 80, and in
config.json we have max_input_features = 6000 or something similar. I
don't know for sure what those are: I know I need the size of the
model's input position embeddings and the downsampling ratio of the
tokenizer, and I'm guessing those correspond to the max_input_features
and feature_size configs, or something like that.
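
For anyone following along, here is a minimal sketch of how those two
values could be read with the transformers library. The attribute names
(feature_size on the feature extractor, max_source_positions on the
model config) are my best guess at what those JSON keys map to, not
something verified against this exact checkpoint:

    # sketch: inspect the two values mentioned above
    from transformers import Speech2TextConfig, Speech2TextFeatureExtractor

    model_id = "facebook/s2t-small-librispeech-asr"

    # preprocessor_config.json -> feature extractor settings
    extractor = Speech2TextFeatureExtractor.from_pretrained(model_id)
    print(extractor.feature_size)        # 80 mel filterbank bins per frame

    # config.json -> model hyperparameters, including the encoder's
    # position embedding limit (presumably the 6000 mentioned above)
    config = Speech2TextConfig.from_pretrained(model_id)
    print(config.max_source_positions)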

When I change the chunk size from 128 * 1024 to 80 * 600, I get much
better output. It does still begin making errors when it gets to the
end of the chunk, where the Logo code is likely to start:

["and here we want to show you that you can program a picture right
along with us we'll use a single color some unorthodox functions and
each line we'll put a bit of nature's masterpieces right here on our
canvas to day we'll have them run all the functions across the stream
right now that you need to program along with us starting with the
simple one to dist colin why zero coal and one"]
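
For reference, a minimal sketch of the kind of chunked loop described
above (not the exact script: the filename and the soundfile-based
loading are placeholders, and the model expects 16 kHz mono audio):

    import soundfile as sf
    from transformers import (Speech2TextForConditionalGeneration,
                              Speech2TextProcessor)

    model_id = "facebook/s2t-small-librispeech-asr"
    processor = Speech2TextProcessor.from_pretrained(model_id)
    model = Speech2TextForConditionalGeneration.from_pretrained(model_id)

    audio, rate = sf.read("xkcd_2022_04_01.wav")  # placeholder filename
    chunk_size = 80 * 600  # the chunk size that gave better output

    pieces = []
    for start in range(0, len(audio), chunk_size):
        chunk = audio[start:start + chunk_size]
        inputs = processor(chunk, sampling_rate=rate, return_tensors="pt")
        generated = model.generate(inputs["input_features"],
                                   attention_mask=inputs["attention_mask"])
        pieces.extend(processor.batch_decode(generated,
                                             skip_special_tokens=True))

    print(" ".join(pieces))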

Next: add a different model, hopefully one trained on different data

