Via Twitter, Feb 2022 (from Nov 2021)

https://github.com/VHellendoorn/Code-LMs

https://arxiv.org/abs/2202.13169

We release a new model, PolyCoder, with 2.7B parameters based on the GPT-2 architecture, trained on 249 GB of code across 12 programming languages on a single machine. In the C programming language, PolyCoder outperforms all models, including Codex.
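
Since PolyCoder is a left-to-right GPT-2-style language model, querying it amounts to feeding a code prefix and sampling a continuation. Below is a minimal sketch of doing this with the Hugging Face transformers library; the model id "NinedayWang/PolyCoder-2.7B", the C prompt, and the sampling settings are illustrative assumptions, not details taken from the repo or paper above (the official checkpoints are distributed through the Code-LMs repository).

```python
# Minimal sketch: sample a code completion from a PolyCoder checkpoint.
# Assumption: a transformers-compatible checkpoint is available under the
# model id below; adjust to whatever checkpoint you actually use.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "NinedayWang/PolyCoder-2.7B"  # assumed model id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# A C prompt, since the C results are what the blurb highlights.
prompt = "int binary_search(int *arr, int n, int target) {\n"
inputs = tokenizer(prompt, return_tensors="pt")

# Left-to-right completion: low temperature keeps the continuation close
# to the most likely tokens, which is a common choice for code.
outputs = model.generate(
    **inputs,
    max_new_tokens=64,
    do_sample=True,
    temperature=0.2,
    top_p=0.95,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```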