Re: [ot][spam][crazy] crazylogs: STaR
looking at this i'm imagining that a number of people likely already have AGI libraries that combine adapted pretrained models. of course i can't know that, but there are billions of people on the planet, so it seems very likely. the adapters library seems to have established a little maturity internally.
i'm having a dull feeling and stopping. i don't like it. my mind part says i could be risking the wellbeing of the world and so it is stopping me. i don't like the dull feeling at all.
i was reading https://docs.adapterhub.ml/training.html . its link to run_glue is broken. i was copying from https://github.com/adapter-hub/adapter-transformers/blob/master/examples/pyt... which is the right link now. i had gotten this far and it was lots of fun:

import transformers

def go(model_name, task_name=None, labels=None, config_name=None, tokenizer_name=None, revision=None):
    # only pass num_labels when we actually have labels; passing None breaks the config
    config_kwargs = dict(num_labels=len(labels)) if labels else {}
    config = transformers.AutoConfig.from_pretrained(
        config_name if config_name else model_name,
        finetuning_task=task_name,
        revision=revision,
        **config_kwargs,
    )
    tokenizer = transformers.AutoTokenizer.from_pretrained(
        tokenizer_name if tokenizer_name else model_name,
        revision=revision,
        use_fast=True,
    )
    model = transformers.AutoAdapterModel.from_pretrained(
        model_name,
        config=config,
        revision=revision,
    )
    if labels:
        model.add_classification_head(
            task_name or '',
            num_labels=len(labels),
            id2label={i: v for i, v in enumerate(labels)},
        )
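(not lines from my draft above: as far as i can tell from the adapterhub training docs, the next step after the head would be adding and activating the adapter itself, roughly the lines below inside go(). the default add_adapter config should be pfeiffer, but treat this as a guess at the docs' intent, not their exact code.)

    if task_name:
        model.add_adapter(task_name)
        model.train_adapter(task_name)        # freeze the pretrained weights, leave only the adapter trainable
        model.set_active_adapters(task_name)  # route the forward pass through the new adapter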
i'm now at https://github.com/adapter-hub/adapter-transformers/blob/master/examples/pyt... because it shows the training loop more clearly
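(a sketch, not lines from that file: just the shape of the loop it runs, reduced to a toy batch and plain pytorch. the adapter-transformers calls are the ones the docs name; the model, head name, data, and hyperparameters are all placeholders i made up.)

import torch
import transformers

tokenizer = transformers.AutoTokenizer.from_pretrained('bert-base-uncased')
model = transformers.AutoAdapterModel.from_pretrained('bert-base-uncased')
model.add_classification_head('toy', num_labels=3)
model.add_adapter('toy')
model.train_adapter('toy')          # freeze the base model, train only the adapter (and head)
model.set_active_adapters('toy')

# a toy batch: three words mapped to three classes
batch = tokenizer(['one', 'two', 'three'] * 8, padding=True, return_tensors='pt')
batch['labels'] = torch.tensor([0, 1, 2] * 8)

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
model.train()
for epoch in range(10):
    optimizer.zero_grad()
    loss = model(**batch).loss     # the classification head computes the loss from the labels
    loss.backward()
    optimizer.step()
    print(epoch, loss.item())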
i'm poking at adapter transformers again. i have a simpler goal of just making a model pick a legal chess move.
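(an aside with python-chess as an assumed dependency, not something from this thread: the legal moves of a position are easy to enumerate, so one framing is position text in, move label out, in the same csv shape run_glue.py takes.)

import chess  # assumption: the python-chess package, separate from adapter-transformers

board = chess.Board()
legal = [board.san(m) for m in board.legal_moves]  # the 20 legal first moves
print(board.fen())  # a text encoding of the position, usable as the model input
print(legal)        # candidate labels; the move actually played would be the target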
from wrong thread, below. i got adapter finetuning to run without crashing by simply using the existing script.

$ cd adapter-transformers/examples/pytorch/text-classification
$ { echo input,label; for ((b=0; b<16; b++)); do echo -e 'one,1\ntwo,2\nthree,3'; done; } > test.csv
$ python3 run_glue.py \
    --model_name_or_path bert-base-uncased \
    --do_train --do_eval \
    --max_seq_length 128 \
    --per_device_train_batch_size 2 \
    --learning_rate 1e-4 \
    --num_train_epochs 10.0 \
    --train_adapter \
    --adapter_config pfeiffer \
    --output_dir test_output \
    --overwrite_output_dir \
    --train_file test.csv \
    --validation_file test.csv

presumably the loss is decreasing as it trains. the parameters are all from the example in their docs. i dropped the batch size to quickly make it run on my old gpu.
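(a guess at how to check the loss rather than presume it: if this copy of run_glue.py matches the upstream example, it saves the final training metrics as json into --output_dir. the filename and key below are from the upstream script, not verified against this fork.)

import json

with open('test_output/train_results.json') as f:
    print(json.load(f)['train_loss'])  # average training loss over the run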