petals is all over that llama 3.1 405b thing, it's the top example in their readme at github.com/bigscience-workshop/petals:

```python
from transformers import AutoTokenizer
from petals import AutoDistributedModelForCausalLM

# Choose any model available at https://health.petals.dev
model_name = "meta-llama/Meta-Llama-3.1-405B-Instruct"

# Connect to a distributed network hosting model layers
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoDistributedModelForCausalLM.from_pretrained(model_name)

# Run the model as if it were on your computer
inputs = tokenizer("A cat sat", return_tensors="pt")["input_ids"]
outputs = model.generate(inputs, max_new_tokens=5)
print(tokenizer.decode(outputs[0]))  # A cat sat on a mat...
```

maybe i can feed an ai-pattern addiction with llama 3.1 405b over petals!
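if i actually try this i'd probably want sampling instead of greedy decoding. a minimal sketch, assuming petals' generate() forwards the usual hugging face sampling kwargs (their readme claims sampling support; the parameter values below are just my guesses, not tuned):

```python
from transformers import AutoTokenizer
from petals import AutoDistributedModelForCausalLM

model_name = "meta-llama/Meta-Llama-3.1-405B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoDistributedModelForCausalLM.from_pretrained(model_name)

inputs = tokenizer("A cat sat", return_tensors="pt")["input_ids"]
# Sampling instead of greedy decoding; temperature/top_p values are guesses
outputs = model.generate(
    inputs,
    max_new_tokens=32,
    do_sample=True,
    temperature=0.8,
    top_p=0.9,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

(also assuming i've accepted meta's license on hugging face first, since the llama weights are gated)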