information on using petals:

- petals is slow to release and doesn't have forward-compatible coding norms, so install it from the git repository.
- petals depends on hivemind, which bundles a libp2p binary. for me this failed to install and i had to acquire the binary manually and place it at hivemind/hivemind_cli/p2pd prior to install.

so:

1. get the latest petals from https://github.com/bigscience-workshop/petals . for me this may be revision 22afba627a7eb4fcfe9418c49472c6a51334b8ac
2. look in petals' setup.cfg for the version of hivemind it depends on; for me this was https://github.com/learning-at-home/hivemind.git@213bff98a62accb91f254e2afdc...
3. check out the appropriate version of hivemind and install it with the right p2pd binary at hivemind/hivemind_cli/p2pd . hivemind checks the sha256 of that binary prior to install unless a flag is passed to build it from source; both paths default to downloading it from github using urllib.
4. install petals

1553

meanwhile, petals will only access llama 405b if you have gated access, which appears to be granted to most people by default. there are likely clones of this model that would work as drop-in replacements; i haven't looked closely. transformers basically fetches specific git-lfs files from the repository, and huggingface.co only serves repos to people who have certain tokens enabled. so until i investigate the 1 or 2 steps involved in unrestrained access, it still resembles gpt-4 in that there's a gate.

1556

holy frogs, it's downloading the entire model just to use petals with it O_O it used to download only the embeddings. it has 191 4.8GB files to download (roughly 900 GB total). i'm not sure this is the ideal approach. i wonder how many emails i have left to the list
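p.s. the install steps above, sketched as shell commands. this is a sketch, not a tested recipe: the hivemind revision placeholder has to come from petals' setup.cfg, and the p2pd path is wherever you manually acquired the binary.

```shell
# sketch only; <revision-from-setup.cfg> and /path/to/p2pd are placeholders
git clone https://github.com/bigscience-workshop/petals
grep hivemind petals/setup.cfg        # shows the pinned hivemind revision
git clone https://github.com/learning-at-home/hivemind
cd hivemind
git checkout <revision-from-setup.cfg>
# drop the manually-acquired binary in place before installing,
# so hivemind's setup doesn't try to fetch it with urllib:
cp /path/to/p2pd hivemind/hivemind_cli/p2pd
pip install .
cd ..
pip install ./petals
```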
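p.p.s. the sha256 check in step 3 is easy to reproduce by hand before running the install, to confirm the binary you dropped in is the one hivemind expects. a minimal sketch (the expected digest isn't shown here — you'd pull the real one out of hivemind's setup machinery):

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Stream a file through sha256 so a large binary never loads into RAM."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def check_p2pd(path, expected_hex):
    """True if the binary at `path` matches the expected hex digest."""
    return sha256_of_file(path) == expected_hex
```

if the digest doesn't match, swap in a different binary or pass the build-from-source flag before installing.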