1106 "Using DeepSpeed Optimizer param name {} as basic optimizer".format( 1107 self.optimizer_name())) (Pdb) up
/home/user/.local/lib/python3.9/site-packages/deepspeed/runtime/engine.py(291)__init__() -> self._configure_optimizer(optimizer, model_parameters) (Pdb) list 286 # Configure optimizer and scheduler 287 self.optimizer = None 288 self.basic_optimizer = None 289 self.lr_scheduler = None 290 if model_parameters or optimizer: 291 -> self._configure_optimizer(optimizer, model_parameters) 292 self._configure_lr_scheduler(lr_scheduler) 293 self._report_progress(0) 294 elif self.zero_optimization(): 295 # no optim selected but zero is enabled 296 self.optimizer = self._configure_zero_optimizer(optimizer=None) (Pdb)
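For context on what the debugger is sitting in: the engine only reaches _configure_optimizer because deepspeed.initialize was handed model_parameters (and/or an optimizer), so __init__ takes the `if model_parameters or optimizer:` branch shown above. A minimal sketch of that entry point with a toy model and config of my own (not the actual training script I'm debugging), assuming a DeepSpeed version whose initialize accepts a config dict:

import torch
import deepspeed

# Toy model and inline config, just to illustrate the call path in the listing:
# deepspeed.initialize builds a DeepSpeedEngine, and since model_parameters is
# non-empty, engine.__init__ calls self._configure_optimizer(optimizer, model_parameters),
# which in turn sets up the "basic optimizer" named in the log line at 1106.
model = torch.nn.Linear(16, 4)

ds_config = {
    "train_batch_size": 8,
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
}

engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)
print(type(engine.basic_optimizer))  # the attribute initialized to None at line 288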
I'm learning to run SwissArmyTokenizer, a repurposable library made by Chinese language model researchers. It can load mainstream language models under a shared, generalized transformer architecture, which is a huge breath of fresh air after huggingface. Unfortunately it looks like they mostly only used it for their own research, so the parts they use are the parts that are actually maintained. By default it assumes you have 8 GPUs and write access to a global folder at /data/qingsong.
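Since /data/qingsong is a hardcoded expectation rather than something I configured, the laziest workaround I can think of is to satisfy the path with a symlink into a directory I actually own. This is only a sketch of that idea; the library's real path handling may be more involved, and ~/sat_data is just a name I picked:

import os

# Hypothetical workaround: the library wants a writable /data/qingsong, so point
# that path at a folder under my home directory. Creating the link itself still
# needs write access to /data, so this is a one-time step for whoever has root.
expected = "/data/qingsong"
mine = os.path.expanduser("~/sat_data")

os.makedirs(mine, exist_ok=True)
if not os.path.lexists(expected):
    os.symlink(mine, expected)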