[ot][spam][crazy][data] transformer model 'attention' improvement

Undiscussed Horrific Abuse, Victim & Survivor of gmkarl at gmail.com
Mon Jan 31 05:28:23 PST 2022


I am suffering psychologically too much associated with this, and need
time to heal.

I have opened a pull request at
https://github.com/huggingface/transformers/pull/15425 with my current
work.

It should be possible to finetune the cutting-edge GPT-J-6B model more
effectively on lower-end systems with this change. To use it, the
chunking configuration parameters need to be set.
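The change chunks the attention computation so the full score matrix is
never materialized at once. A minimal sketch of the general
query-chunking idea in plain NumPy (the function names, shapes, and
chunk size here are illustrative assumptions, not code from the PR):

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the last axis.
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attention(q, k, v):
    # Standard attention: materializes the full (Lq, Lk) score matrix.
    return softmax(q @ k.T / np.sqrt(q.shape[-1])) @ v

def chunked_attention(q, k, v, query_chunk_size=64):
    # Same output, but only query_chunk_size rows of the score matrix
    # exist at any moment, so peak memory scales with
    # query_chunk_size * Lk instead of Lq * Lk.
    outs = []
    for start in range(0, q.shape[0], query_chunk_size):
        q_chunk = q[start:start + query_chunk_size]
        outs.append(softmax(q_chunk @ k.T / np.sqrt(q.shape[-1])) @ v)
    return np.concatenate(outs, axis=0)
```

The trade-off is extra passes over the keys and values in exchange for
a smaller peak activation footprint, which is what makes finetuning
feasible on memory-constrained hardware.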

I have not verified that the GPT-J output is correct. The model is
large for my Raspberry Pi, even when used from disk.
