[ot][spam][log] poking at km-infinite was: attempt bits: talk to a language model on list

Undescribed Horrific Abuse, One Victim & Survivor of Many gmkarl at gmail.com
Mon Sep 11 09:35:08 PDT 2023


so now i’m looking at
https://github.com/huggingface/transformers/blob/main/src/transformers/models/llama/modeling_llama.py

the lm-infinite implementation actually sums the token distance into the
attention logit, i think before taking the softmax, and i think the
paper describes it this way too. this seems quite strange to me, but
maybe i have forgotten about it. maybe the implementation was
autogenerated or something ;p
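to be concrete about what "summing the distance into the logit before
the softmax" would look like, here's a minimal toy sketch in
numpy. this is my own illustration of the idea as i read it, not the
lm-infinite or transformers code; the function name and shapes are made
up for the example.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention_with_distance_bias(q, k, v):
    """toy single-head causal attention where the query/key token
    distance is added directly to the pre-softmax logit, as described
    above. illustrative sketch only, not the lm-infinite code."""
    n, d = q.shape
    scores = q @ k.T / np.sqrt(d)                # (n, n) raw logits
    i = np.arange(n)[:, None]
    j = np.arange(n)[None, :]
    dist = (i - j).astype(float)                 # token distance, >= 0 on/below diagonal
    scores = scores + dist                       # distance summed into the logit
    scores = np.where(j <= i, scores, -np.inf)   # causal mask
    return softmax(scores, axis=-1) @ v
```

the strange part is that an additive distance term like this grows the
logit for *farther* tokens rather than shrinking it, which is the
opposite of what you'd expect from a recency bias.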


More information about the cypherpunks mailing list