9 Jan 2023, 8:46 p.m.
The Galactica model is trained on scientific documents. Its largest form has 120B parameters, and each layer is about 10GB in float16 precision. The bigscience/petals system for p2p inference splits a model by layers, so each node would need about 5GB of VRAM for 8-bit inference, 10GB for float16, or 20GB for full precision. bigscience/petals currently requires manual coding of the client and server model interfaces for each new architecture. Inference alone is unlikely to make it easy to feed Galactica new full documents without adaptation or finetuning, since like most models it was trained with a context window of only about 2k tokens. Adaptation and finetuning can themselves be done p2p.
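
A quick back-of-the-envelope check of those per-node figures, taking the ~10GB-per-layer float16 estimate above as the starting assumption. The helper and the 5e9-parameters-per-slice number are just illustration derived from that assumption, not anything from petals itself:

```python
# Rough VRAM for one model slice under the assumption above:
# ~10 GB in float16 implies roughly 5e9 parameters per slice (2 bytes each).
BYTES_PER_PARAM = {"int8": 1, "float16": 2, "float32": 4}

def slice_vram_gb(n_params: float, dtype: str) -> float:
    """Weight memory only -- ignores activations and the attention KV cache."""
    return n_params * BYTES_PER_PARAM[dtype] / 1e9

params_per_slice = 5e9  # assumed, back-computed from the ~10 GB float16 figure
for dtype in ("int8", "float16", "float32"):
    print(f"{dtype:>7}: ~{slice_vram_gb(params_per_slice, dtype):.0f} GB per node")
# int8 ~5 GB, float16 ~10 GB, float32 ~20 GB
```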
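
For a sense of what the client-side interface looks like, a minimal sketch of generating against the public BLOOM swarm, which is what petals ships with now. Class and checkpoint names are as I recall them from the petals README and would need checking; a Galactica port would need analogous classes written by hand:

```python
# Minimal petals client sketch (names from memory of the README, unverified).
from transformers import AutoTokenizer
from petals import DistributedBloomForCausalLM

MODEL_NAME = "bigscience/bloom-petals"  # public swarm checkpoint

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = DistributedBloomForCausalLM.from_pretrained(MODEL_NAME)

# Only the embeddings and head live locally; the transformer blocks are
# served by whichever peers in the swarm hold them.
inputs = tokenizer("The attention mechanism is", return_tensors="pt")["input_ids"]
outputs = model.generate(inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0]))
```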