[ot][crazy][draft] Friday's xkcd was a challenge to regenerate the comic

Undiscussed Horrific Abuse, One Victim of Many gmkarl at gmail.com
Sat Apr 2 01:41:49 PDT 2022


The comic is audio of Logo code that draws a comic.

The xkcd community I visit is explainxkcd.com, so it's not too advanced.

xkcd is the only webcomic I've read regularly since 2014 or so. I'm also
reading seed a bit, which just recently pushed through a block regarding
shielded rooms; it remains to be seen if it keeps going.

what's relevant here is the idea of recovering material when you do not
have a reference for what is correct. you could do it otherwise, of course,
but there is more return in learning to do it with only the available
material.

thoughts:
whatever technique you choose to approach the problem with initially will
offer some way of extracting confidence in its output. with transformer
models, this is usually an explicit part of the output, in the form of
the probabilities assigned to each output token.

if we have multiple indications of confidence, is it reasonable then to use
this information to refine and update the models autonomously, and have
them autocomplete the image reconstruction?

--

here's an example:

- speech-to-text produces Logo code with an associated confidence
- a Logo interpreter can then mark each piece of code as having correct or
  incorrect syntax
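
A minimal sketch of those two signals side by side, assuming the
speech-to-text step yields one text segment per command together with a
confidence in [0, 1]. The Segment type, the ARITY table and the tiny
command subset are hypothetical, chosen only for illustration; a real Logo
dialect has many more commands.

```python
from dataclasses import dataclass

# Hypothetical subset of Logo: command name -> number of numeric arguments.
ARITY = {"fd": 1, "bk": 1, "rt": 1, "lt": 1, "pu": 0, "pd": 0}


@dataclass
class Segment:
    text: str          # transcribed command, e.g. "fd 10"
    confidence: float  # speech-to-text confidence in [0, 1] (assumed available)


def is_number(token: str) -> bool:
    """True if the token parses as a number."""
    try:
        float(token)
        return True
    except ValueError:
        return False


def syntax_ok(text: str) -> bool:
    """Check a single command against the small ARITY table above."""
    parts = text.lower().split()
    if not parts or parts[0] not in ARITY:
        return False
    args = parts[1:]
    return len(args) == ARITY[parts[0]] and all(is_number(a) for a in args)


def annotate(segments):
    """Pair each segment's model confidence with its syntax verdict."""
    return [(s.text, s.confidence, syntax_ok(s.text)) for s in segments]
```

Each tuple now carries two independent indications of correctness: one
from the model, one from the language itself.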

then we can actually make a 3rd metric out of a simple heuristic. for
example: how likely are very long, straight lines to be correct? the
screenshot at https://github.com/somebody1234/xkcd2601 has many long,
straight lines that look incorrect.
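
One way such a heuristic might look, as a sketch only. It assumes the
transcript has already been parsed into (name, value) pairs, merges
consecutive pen-down "fd" moves into one straight run, and flags runs much
longer than the median run. The function names and the factor of 5 are
assumptions, not anything taken from the comic or the linked repository.

```python
from statistics import median


def straight_runs(commands):
    """Return (start_index, end_index, length) for each straight pen-down run.

    Only "fd" extends a run; any other command (a turn, "bk", pen change)
    ends it. That is a simplification made for this sketch.
    """
    runs = []
    pen_down = True
    start = None
    length = 0.0

    def flush(end):
        nonlocal start, length
        if start is not None and length > 0:
            runs.append((start, end, length))
        start, length = None, 0.0

    for i, (name, value) in enumerate(commands):
        if name == "fd" and pen_down:
            if start is None:
                start = i
            length += value
        else:
            flush(i - 1)
            if name == "pu":
                pen_down = False
            elif name == "pd":
                pen_down = True
    flush(len(commands) - 1)
    return runs


def suspicious_runs(commands, factor=5.0):
    """Flag runs longer than `factor` times the median run length."""
    runs = straight_runs(commands)
    if not runs:
        return []
    typical = median(length for _, _, length in runs)
    return [run for run in runs if run[2] > factor * typical]
```

The start and end indices point back at the commands that produced the
line, so a flagged run can be traced to the audio it was transcribed from.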

Using the existing confidence values, we can tune a confidence metric for
our heuristic.

Then this heuristic can provide additional confidence information on the
output.
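
A rough sketch of that tuning step, under the assumption that the existing
confidence can stand in as a soft label for "this output is correct". The
function name and the returned keys are hypothetical. It simply compares
the mean existing confidence of output the heuristic flags against output
it leaves alone.

```python
def calibrate_flag(confidences, flags):
    """Estimate how trustworthy output is when the heuristic flags it or not.

    confidences: existing per-item confidence values in [0, 1]
    flags:       per-item booleans, True where the heuristic flags the item
    Returns mean confidence for each group, or None for an empty group.
    """
    flagged = [c for c, f in zip(confidences, flags) if f]
    clear = [c for c, f in zip(confidences, flags) if not f]

    def mean(values):
        return sum(values) / len(values) if values else None

    return {"flagged": mean(flagged), "clear": mean(clear)}
```

If the "flagged" mean comes out well below the "clear" mean, the heuristic
agrees with the other signals and its flags can be given more weight; if
the two are close, the heuristic is telling us little.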

By finetuning the speech-to-text model on this additional information, the
model can learn properties of the speech in the recording from knowing
which output is correct and which is not. This improves its output,
autocorrecting even errors that the heuristic did not detect, because the
correction information transfers to the speech data.
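
A sketch of how training examples for that finetuning might be chosen.
The tuple layout, the multiplication of the two confidences and the 0.6
threshold are all assumptions made for illustration; the actual finetuning
call depends on the speech-to-text model and is left out.

```python
def select_pseudo_labels(items, threshold=0.6):
    """Pick transcribed segments trusted enough to finetune on.

    Each item is (audio_id, text, stt_conf, syntax_ok, heuristic_conf).
    Segments with bad syntax get a combined score of 0; otherwise the two
    confidences are multiplied, treating them as roughly independent.
    """
    selected = []
    for audio_id, text, stt_conf, syntax_ok, heuristic_conf in items:
        combined = stt_conf * heuristic_conf if syntax_ok else 0.0
        if combined >= threshold:
            selected.append((audio_id, text, combined))
    return selected
```

The selected (audio, text) pairs would serve as positive examples, and the
rejected ones mark audio where the model's current output should not be
reinforced.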

--

It's part of something I've thought about some that is hard to transcribe.
There's a way to produce further feedback and improvement after that first
step, and a way to automate the heuristic generation, by simply assuming
that some of the output is correct in different ways and that the
correctness is transferable.