[ot][spam][log] poking at lm-infinite was: attempt bits: talk to a language model on list
i’m looking at lm-infinite a little bit; maybe i can make steps on making it work.

paper: https://arxiv.org/pdf/2308.16137.pdf
partial implementation: https://github.com/kyegomez/LM-Infinite/blob/main/infinite/main.py

it seems like the theory is that if out-of-window tokens are moved to the very start of the context window, in some empirically-determined way, the quality of long-context outputs radically increases. a rough sketch of my reading is below. the partial implementation doesn’t include the recalculation of position encodings, which differs depending on the model the length extension is applied to.
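as far as i can tell the paper frames this as a Λ-shaped attention mask plus a distance ceiling: every query attends to the first few tokens of the sequence and to a local window behind it, with relative distances clamped so they never exceed what the model saw in training. here’s a minimal sketch of that reading; it is my reconstruction, not the authors’ code, and the parameter names and default values (n_global, n_local, d_ceiling) are mine:

```python
import torch

def lambda_mask_and_distances(seq_len: int, n_global: int = 10,
                              n_local: int = 2048, d_ceiling: int = 2048):
    """Build a Lambda-shaped attention mask and clamped relative distances.

    Returns:
        mask:      (seq_len, seq_len) bool, True where attention is allowed
        distances: (seq_len, seq_len) long, query-minus-key distance,
                   clamped so no value exceeds d_ceiling
    """
    q = torch.arange(seq_len).unsqueeze(1)   # query positions, column vector
    k = torch.arange(seq_len).unsqueeze(0)   # key positions, row vector
    causal = k <= q                          # standard causal constraint
    is_global = k < n_global                 # the retained starting tokens
    is_local = (q - k) < n_local             # sliding local window
    mask = causal & (is_global | is_local)
    # the "moved to the start" part: far-away starting tokens get their
    # distance capped at the ceiling instead of their true distance
    distances = (q - k).clamp(min=0, max=d_ceiling)
    return mask, distances
```

with a clamp like this, a starting token ten thousand positions back looks to the model like it sits exactly d_ceiling positions away, i.e. at the far edge of the trained window.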
i guess a reason for poking at this is that the yarn paper did not include lm-infinite’s result quality in its comparison. maybe i can find numbers to compare. (i’m second-guessing because the extension algorithm does not look very powerful, despite people being excited about its results.)
ok, i found some numbers that i think indicate lm-infinite performs poorly, which is sad because it was fun trying to learn to implement it. in the lm-infinite paper they have this data: actually no, i am not sure. grumph, how confusing these things are. i don’t fully understand the lm-infinite approach at this time, but basically these are all heuristics that try to keep attention distances within the range the model saw during training.
so now i’m looking at https://github.com/huggingface/transformers/blob/main/src/transformers/model...

the lm-infinite implementation actually sums the token distance into its attention logit, i think before taking the softmax, and i think the paper talks this way too. this seems quite strange to me, but maybe i have forgotten something. maybe the implementation was autogenerated or something ;p
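here’s a toy sketch of what i understand that to mean for a single attention head; this is my reconstruction, not the repository’s code, and the function name and tensor shapes are made up. `distances` would be the clamped matrix from the earlier sketch:

```python
import torch
import torch.nn.functional as F

def attention_with_distance_term(q, k, v, mask, distances):
    # q, k, v: (seq_len, head_dim); mask, distances: (seq_len, seq_len)
    logits = (q @ k.transpose(-1, -2)) / (q.shape[-1] ** 0.5)
    # the strange part: the raw token distance summed in before the softmax
    logits = logits + distances.to(logits.dtype)
    logits = logits.masked_fill(~mask, float("-inf"))
    weights = F.softmax(logits, dim=-1)
    return weights @ v
```

for comparison, alibi adds a *negatively* scaled distance to the logits, so farther tokens score lower; summing a raw positive distance would do the opposite, which might be part of why it reads strangely.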
different kind of inhibition now :( curious how much this spreads to other tasks. looks like a change of context reduces it, but possibly less than i expect.
after the dropped post (which talked about) maybe some energy to just use a basic model (to try to protect the habit in a new space)