[ot][spam][log] poking at lm-infinite was: attempt bits: talk to a language model on list
Undescribed Horrific Abuse, One Victim & Survivor of Many
gmkarl at gmail.com
Mon Sep 11 09:22:04 PDT 2023
i’m looking at lm-infinite a little bit, maybe i can make steps on
making it work
paper: https://arxiv.org/pdf/2308.16137.pdf
partial implementation:
https://github.com/kyegomez/LM-Infinite/blob/main/infinite/main.py
it seems like the theory is that if out-of-context tokens are moved to
the very start of the context window in some empirically-determined
way, output quality on long contexts improves dramatically
the partial implementation doesn’t include the new calculation of
position encodings, which is different depending on the model the
length extension is applied to.
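
a rough sketch of the two pieces as i read the paper: an attention mask
that always keeps the first few tokens plus a recent window visible, and
a cap on the relative distance that gets fed to the position encoding.
the names here (lambda_mask, clamped_distance, n_global, n_local,
max_dist) and the exact window boundary are my own assumptions, not
taken from the linked repo. how the capped distance is actually applied
is the model-specific part and is left out.

```python
# sketch only: function and parameter names are illustrative assumptions

def lambda_mask(seq_len, n_global, n_local):
    """mask[i][j] is True when query position i may attend to key position j."""
    mask = []
    for i in range(seq_len):
        row = []
        for j in range(seq_len):
            causal = j <= i                 # never look at future tokens
            is_global = j < n_global        # the first few tokens stay visible
            is_local = (i - j) < n_local    # the most recent tokens stay visible
            row.append(causal and (is_global or is_local))
        mask.append(row)
    return mask


def clamped_distance(i, j, max_dist):
    """relative distance between query i and key j, capped at max_dist."""
    return min(i - j, max_dist)


if __name__ == "__main__":
    # print the mask for a toy sequence: x = attended, . = masked out
    for row in lambda_mask(8, n_global=2, n_local=3):
        print("".join("x" if allowed else "." for allowed in row))
```

with these toy numbers the last query sees positions 0-1 and 5-7 and
nothing in between. in a real model max_dist would presumably be tied
to the pretraining context length.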