6 Apr 2023
7:33 p.m.
also, llama.cpp is better in many ways, but in python, huggingface accelerate plus the transformers package will spread a model between gpu vram and cpu ram (giving you more total memory to work with) if you pass device_map='auto', and it will use fast mmap loading if the model is stored as safetensors. note that huggingface's libraries do tend to be somewhat crippled, user-focused things; maybe that's why i know them
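a minimal sketch of what that looks like, assuming transformers and accelerate are installed; the model id and the memory caps here are hypothetical placeholders, and max_memory is optional (without it, accelerate fills the gpu first and spills the rest to cpu):

```python
def build_load_kwargs(gpu_gb: int, cpu_gb: int) -> dict:
    """Kwargs for from_pretrained that let accelerate split the model
    across devices. device_map='auto' picks placements automatically;
    max_memory caps each device (keys are gpu index or 'cpu')."""
    return {
        "device_map": "auto",
        "max_memory": {0: f"{gpu_gb}GiB", "cpu": f"{cpu_gb}GiB"},
    }


if __name__ == "__main__":
    # needs: pip install transformers accelerate
    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained(
        "your-org/your-llama-model",  # hypothetical model id
        **build_load_kwargs(gpu_gb=10, cpu_gb=24),
    )
    # hf_device_map shows which layers landed on gpu vs cpu
    print(model.hf_device_map)
```

if the checkpoint on the hub has a safetensors copy, from_pretrained prefers it and loading goes through mmap rather than a full torch.load deserialize.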