the model with the funky tokenizer is starting to look useful, as if it produced code on its own. i have not actually seen it do that yet. [it's hard handling how it's starting to look useful] bugs and personal issues prevent checking how useful it actually is.

- i accidentally dropped the output length on colab to a very short number, and the loss dropped to 0.22, which would be very accurate here imo. it was likely because it was only backpropagating on the patch header, and not the actual data.

- i worked on forward.py a bit, and generated novel data. after the patch header, it hit something wrong and terminated without outputting anything else. but it is the first time i've seen a good patch header (mostly because i rarely work on forward.py).

i typed this:

hello world example <pad>hello_world.py</pad>

and it output this:

<pad>diff --git a/hello_world.py b/hello_world.py
--- a/hello_world.py
+++ b/hello_world.py
@@ -31,7 +31,7 @@<CRASH>

the output actually shows a bug: since my input didn't include any preceding content, it should have patched /dev/null with new data.
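
for reference, this is my understanding of git's normal format for a brand-new file (not something the model produced; the hunk line count here is just a placeholder):

diff --git a/hello_world.py b/hello_world.py
new file mode 100644
--- /dev/null
+++ b/hello_world.py
@@ -0,0 +1,N @@

so the model emitting "--- a/hello_world.py" and an "@@ -31,7 +31,7 @@" hunk implies it thinks it is modifying existing content that was never given to it.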