On 6/18/23, Karl Semich <0xloem@gmail.com> wrote:
I actually had trouble creating the model and building ggml. By "creating the model" I mean constructing an empty basic transformer architecture here, not an actual trained model of anything. If I had ignored file storage, I could probably have skipped most of the trouble. I was thinking of linking to ctransformers, which is made for Python but builds a library with load and eval functions for popular transformers.
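To make "an empty basic transformer architecture, not an actual model of anything" concrete: here is a minimal sketch in numpy of that idea, i.e. random untrained weights pushed through one attention-plus-feedforward block. This is not ggml or ctransformers code; all dimensions and weight names are made up for illustration, and layernorm, biases, and causal masking are omitted for brevity.

```python
import numpy as np

# Hypothetical toy dimensions -- not from any real model.
n_vocab, n_embd, n_head = 16, 32, 4

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# "Empty" model: randomly initialised weights, never trained.
wte  = rng.standard_normal((n_vocab, n_embd)) * 0.02       # token embeddings
wqkv = rng.standard_normal((n_embd, 3 * n_embd)) * 0.02    # fused q/k/v projection
wo   = rng.standard_normal((n_embd, n_embd)) * 0.02        # attention output proj
w1   = rng.standard_normal((n_embd, 4 * n_embd)) * 0.02    # feed-forward up
w2   = rng.standard_normal((4 * n_embd, n_embd)) * 0.02    # feed-forward down

def block(x):
    # multi-head self-attention (no layernorm/bias/mask, to keep it short)
    q, k, v = np.split(x @ wqkv, 3, axis=-1)
    d = n_embd // n_head
    def heads(t):  # (T, E) -> (H, T, d)
        return t.reshape(-1, n_head, d).transpose(1, 0, 2)
    q, k, v = heads(q), heads(k), heads(v)
    att = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(d))
    y = (att @ v).transpose(1, 0, 2).reshape(-1, n_embd)
    x = x + y @ wo                                  # residual around attention
    return x + np.maximum(x @ w1, 0) @ w2           # residual around ReLU FFN

tokens = np.array([1, 2, 3])
logits = block(wte[tokens]) @ wte.T   # tied embedding as output head
print(logits.shape)                   # (3, 16): one logit row per token
```

The point is that none of this needs file storage at all; the trouble comes from serialising and reloading those tensors, which is exactly the part that could have been skipped.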
The middle sticking point was trying to load a single-file model with llama.cpp tip. I thought it would be organized, maybe, to use a llama transformer and reuse all the high-activity llama code, but it seems like convert.py generates a single-file model while llama.cpp only loads multi-file models right now (datatype=f32). I could be wrong, but that was my experience.