
19 Oct
2024
8:21 p.m.
the biggest llama model is over 400b parameters, it wouldn't fit on my system at all! that would be ummmmmm 400GB at 8-bit quantization, 800GB at bfloat16 precision, 3.2TB at double precision ... 200GB at 4-bit quantization. but we want the full model! we want all 800GB! ok i could try to sort out network offloading and stuff, but maybe i'll see if it's on petals or stable horde, or maybe i'll just yammer for a few more messages until the moderation system squelches me
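the napkin math as a tiny python sketch, assuming ~405B parameters and counting only raw weight storage (no activations, KV cache, or runtime overhead — those make it even worse):

```python
# back-of-envelope weight memory for a ~405B-parameter model
# at a few common precisions. decimal GB (1e9 bytes).
PARAMS = 405e9  # assumed parameter count

bytes_per_param = {
    "4-bit quant": 0.5,
    "8-bit quant": 1,
    "bfloat16": 2,
    "float64 (double)": 8,
}

for name, bpp in bytes_per_param.items():
    gb = PARAMS * bpp / 1e9
    print(f"{name:>18}: {gb:,.0f} GB")
```

which lines up with the rounded numbers above: roughly 200GB, 400GB, 800GB, and 3.2TB respectively.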