note: the huggingface demo passes information to the model as token ids. token ids are just indices into a vocabulary of character sequences that occur together frequently (the tokenizer counts these and decides the vocabulary). since the model learns via linear algebra, i'm wondering if it might make sense to retain the numeric value of the inputs rather than mapping them to arbitrary integer ids. this could mean bypassing the integer input ids entirely. at the very start of the model, the integer input ids are converted into high-dimensional vectors by an 'embedding' matrix. that matrix could be hand-altered by locating it among the model's parameters, or removed/skipped entirely, with vectors supplied directly instead. a thought. the coefficients in the embedding matrix are trained via backpropagated gradients to determine the vectors. i think they end up being roughly random, with some of their dimensions clustering similar tokens near each other and such.
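a minimal numpy sketch of the idea above, assuming a toy vocabulary and embedding size: the id-to-vector step is just a row lookup in the embedding matrix (equivalent to a one-hot matmul), and "skipping" it just means handing the model body vectors directly. huggingface transformers models do expose this via the `inputs_embeds` keyword argument, which accepts vectors in place of `input_ids`.

```python
import numpy as np

# toy sizes; a real model has vocab_size ~ tens of thousands, d_model in the hundreds+
vocab_size, d_model = 10, 4
rng = np.random.default_rng(0)

# the embedding matrix: in a real model these coefficients are learned by backprop
E = rng.normal(size=(vocab_size, d_model))

# token ids are just indices: each id selects one row of E
token_ids = np.array([2, 7, 2])
embedded = E[token_ids]                      # shape (3, d_model)

# the lookup is equivalent to multiplying a one-hot matrix by E
one_hot = np.eye(vocab_size)[token_ids]      # shape (3, vocab_size)
assert np.allclose(embedded, one_hot @ E)

# 'bypassing' the lookup: hand-crafted vectors fed straight to the model body,
# e.g. encoding a number's actual value in one dimension instead of an arbitrary id
custom_vectors = np.array([[3.0, 0.0, 0.0, 0.0]])  # hypothetical encoding
```

the `custom_vectors` encoding here is purely illustrative; nothing in a pretrained model's weights would know how to interpret it without further training.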