[ot][spam][crazy][data] transformer model 'attention' improvement
Undiscussed Horrific Abuse, Victim & Survivor of
gmkarl at gmail.com
Mon Jan 31 05:28:23 PST 2022
I am suffering too much psychologically in connection with this, and need
time to heal.
I have opened a pull request at
https://github.com/huggingface/transformers/pull/15425 with my current
work.
With this change it should be possible to finetune the GPT-J-6B model
more memory-efficiently on lower-end systems. The chunking
configuration parameters need to be set.
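The memory saving comes from chunking the attention computation: instead of materializing the full (sequence x sequence) score matrix at once, queries are processed a chunk at a time, so peak memory scales with the chunk size rather than the full sequence length. A minimal NumPy sketch of the idea follows; the function and parameter names here (e.g. `query_chunk`) are illustrative only, not the actual configuration parameters added in the pull request:

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def full_attention(q, k, v):
    # standard attention: softmax(Q K^T / sqrt(d)) V
    # materializes the full (seq x seq) score matrix
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ v

def chunked_attention(q, k, v, query_chunk=4):
    # same result, but only a (query_chunk x seq) score matrix
    # exists at any one time, bounding peak memory
    out = []
    for i in range(0, q.shape[0], query_chunk):
        qc = q[i:i + query_chunk]
        scores = qc @ k.T / np.sqrt(q.shape[-1])
        out.append(softmax(scores) @ v)
    return np.concatenate(out, axis=0)

rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((16, 8)) for _ in range(3))
assert np.allclose(full_attention(q, k, v), chunked_attention(q, k, v))
```

Because each softmax row depends only on its own query's scores, chunking over queries is exact, not an approximation; the trade-off is extra passes over the keys and values, i.e. compute time for memory.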
I have not verified that the GPT-J output is correct. The model is too
large for my Raspberry Pi, even when loaded from disk.
More information about the cypherpunks mailing list