[ot][spam][crazy] adapters for semibalanced trees?

Undiscussed Horrific Abuse, One Victim of Many gmkarl at gmail.com
Fri Jul 22 04:51:17 PDT 2022


below is as far as i got.

when i tried running it on xsum, i ran out of gpu ram. i imagine there
are a number of ways to address that.
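for what it's worth, a few run_summarization.py flags usually cut gpu memory; whether each one exists depends on the transformers version the longt5 branch tracks, so treat this as an untested sketch rather than something i ran:

```shell
# hedged sketch: memory-reduction flags for run_summarization.py.
# --max_source_length truncates long xsum articles,
# --gradient_checkpointing trades compute for activation memory,
# --fp16 halves activation size, --adafactor shrinks optimizer state.
python3 adapter-transformers/examples/pytorch/summarization/run_summarization.py \
    --model_name_or_path google/long-t5-tglobal-base \
    --max_source_length 512 \
    --gradient_checkpointing \
    --fp16 \
    --adafactor \
    ... # remaining flags as in the training command below
```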

i'm thinking the simplest approach might be to use a smaller model, even
if it doesn't have the long-context support.
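concretely that could be as small a change as swapping the checkpoint — t5-small here is just my example pick, not something i tested:

```shell
# hedged sketch: t5-small as a drop-in smaller checkpoint. it lacks
# long-t5's transient-global attention, so inputs truncate at 512 tokens.
python3 adapter-transformers/examples/pytorch/summarization/run_summarization.py \
    --model_name_or_path t5-small \
    ... # other flags unchanged
```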

another idea is to produce a shorter set of encodings of the data.
this might mesh with other uses.

i didn't try using the trained model in a forward pass, so i don't know
for sure whether training actually succeeded.
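a forward check might look like the sketch below. it assumes this fork keeps the standard adapter-transformers api (load_adapter / set_active_adapters) and that the adapter landed under the output_dir with the name "summarization" — both assumptions, since i haven't run it:

```python
# untested sketch: load the trained adapter and generate once.
# assumptions: the longt5 branch keeps the standard adapter-transformers
# api, and the adapter was saved under output_dir "test" as "summarization".
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/long-t5-tglobal-base")
model = AutoModelForSeq2SeqLM.from_pretrained("google/long-t5-tglobal-base")

# attach the trained adapter weights; the path is a guess
name = model.load_adapter("test/summarization")
model.set_active_adapters(name)

inputs = tokenizer("summarize: a", return_tensors="pt")
out = model.generate(**inputs, max_length=16)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```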

# training
git clone https://github.com/xloem/adapter-transformers --branch=longt5
pip3 install ./adapter-transformers
for ((ct=0; ct<4; ct++)); do echo a,b >> test.csv; done
python3 adapter-transformers/examples/pytorch/summarization/run_summarization.py \
    --model_name_or_path google/long-t5-tglobal-base \
    --do_train \
    --do_eval \
    --output_dir test \
    --per_device_train_batch_size=1 \
    --per_device_eval_batch_size=1 \
    --overwrite_output_dir \
    --predict_with_generate \
    --train_file test.csv \
    --validation_file test.csv \
    --train_adapter True \
    --num_train_epochs 100


More information about the cypherpunks mailing list