Note on Decentralized Automated Scientific Inference
The Galactica model is trained on scientific documents. Its largest form has 120B parameters. Each layer is about 10GB in float16 precision. The bigscience/petals system for p2p inference splits a model by layers, so each node would need 5GB of VRAM per layer for 8-bit inference, or 20GB for full float32 precision. bigscience/petals currently requires manual coding of the client and server model interfaces for each new model architecture. Galactica inference alone is unlikely to make adding new full documents easy without adaptation or finetuning, since, like most models, it was trained with a context size of only 2k tokens. Adaptation and finetuning can be done p2p.
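
For concreteness, here is a minimal sketch of the Petals client pattern as it exists today for the supported BLOOM checkpoint. Galactica is not supported out of the box; the client/server model interfaces mentioned above would have to be written first, so take this as the shape of the code rather than a working Galactica client.

    # Minimal sketch of the Petals client pattern, using the supported
    # BLOOM checkpoint. A Galactica equivalent would need the client/server
    # model wrappers written first, so that part remains hypothetical.
    from transformers import BloomTokenizerFast
    from petals import DistributedBloomForCausalLM

    MODEL_NAME = "bigscience/bloom-petals"  # the public swarm's checkpoint

    tokenizer = BloomTokenizerFast.from_pretrained(MODEL_NAME)
    model = DistributedBloomForCausalLM.from_pretrained(MODEL_NAME)

    # Embeddings and the prompt stay on the local machine; the transformer
    # blocks run remotely on peers, each serving a slice of the layers.
    inputs = tokenizer("The attention mechanism", return_tensors="pt")["input_ids"]
    outputs = model.generate(inputs, max_new_tokens=32)
    print(tokenizer.decode(outputs[0]))

On the serving side, each peer runs roughly "python -m petals.cli.run_server bigscience/bloom-petals" and holds only its assigned block of layers in VRAM; the prompt-tuning and adapter finetuning modes run over the same swarm.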
This post focused on Galactica, a scientific language model that is easy for me to think about. There are of course likely other approaches as well.