i suspect some people have been using fairseq for things like this: [1]https://github.com/pytorch/fairseq

it's a facebook project focused on training sequence models such as transformers. i also noticed there was a deep-learning-related repo on the old gitopia, so it could be worth looking through places like that for community-built efforts.

References

1. https://github.com/pytorch/fairseq