[spam][crazy][fiction][random] Non-Canon MCBoss Spinoffs

Undescribed Horrific Abuse, One Victim & Survivor of Many gmkarl at gmail.com
Sun Jan 7 14:04:51 PST 2024


y'know, for some time now, after i started being a little able to write code
again, it's been _so_ _so_ much easier to start code than to improve code,
or integrate code, etc etc. it partly relates to exposure to things i've
associated with issues, etc,
but one of the similarities is that that's also how gpt made code: openai
would have it generate code from start to finish. it talks this way too.

maybe it would be nice to reorganize something like that, so that it
actually produces things in a sensible way, rather than starting from the
start and calling it good. sadly this is similar to a lot of false starts,
but when considered as valuable for how we do things in general, it seems
a little more accessible.

1639

1701

well, llama.cpp does finetuning etc now. they have an example program that
makes checkpoints and loras using adam.
also, tinyllama has released versions; i downloaded a 2-bit quantized one
(400-500MB).
i finally got it to make visible finetuning progress when i set it to a
context length of a single token. it said it would complete in 19 minutes.
sadly that's a little more time than i have right now.
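
for reference, a finetune run there looks roughly like this (flag names are
from the llama.cpp finetune example's readme and may differ between
versions; the model and data filenames are placeholders, not what i
actually ran):

    ./finetune \
        --model-base tinyllama-q2_k.gguf \
        --train-data "train.txt" \
        --checkpoint-in chk-LATEST.gguf \
        --checkpoint-out chk-ITERATION.gguf \
        --lora-out lora-ITERATION.bin \
        --save-every 10 \
        --threads 4 --adam-iter 30 --batch 4 --ctx 1 \
        --use-checkpointing

with --ctx 1 being the single-token context length mentioned above.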
i did notice these models still use hugely wide matrices for the input and
output embeddings. these are among the slowest things to train because they
don't share information across different tokens. i saw on the
[waning feeds?] that there is a new approach to this where the embeddings
are not trained with the model, maybe a research paper by meta/facebook,
not sure.
personally i wouldn't use so much vocab, it's a waste of training energy;
i'd maybe do something random like keep a database of near-words and turn
the embeddings into two tiny layers, something like the sketch below.
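
roughly what i mean by "two tiny layers", as a pytorch sketch (all names
and sizes here are made up for illustration, and the near-words database
part isn't shown). the idea is that the huge vocab-by-width matrix gets
factored through a small bottleneck, so the big projection is shared across
every token instead of each token owning its own wide row:

    import torch
    import torch.nn as nn

    class FactoredEmbedding(nn.Module):
        # replace one vocab_size x d_model matrix with two much smaller layers:
        # a tiny per-token table plus one up-projection shared by all tokens.
        def __init__(self, vocab_size, d_model, bottleneck=64):
            super().__init__()
            self.table = nn.Embedding(vocab_size, bottleneck)     # per-token, tiny
            self.up = nn.Linear(bottleneck, d_model, bias=False)  # shared weights

        def forward(self, token_ids):
            return self.up(self.table(token_ids))

    # a 32000 x 4096 embedding would be ~131M parameters; this is ~2.3M
    emb = FactoredEmbedding(vocab_size=32000, d_model=4096)
    print(sum(p.numel() for p in emb.parameters()))
    print(emb(torch.tensor([[1, 2, 3]])).shape)  # torch.Size([1, 3, 4096])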