[ot][spam][crazy] Being a tiny part of a goal

Undiscussed Horrific Abuse, One Victim of Many gmkarl at gmail.com
Sat Apr 2 06:13:29 PDT 2022


# Karl wants to learn around this first block but [doesn't understand
that his working memory doesn't have space for it under these
conditions right now], so it is commented out.
#from tokenizers.pre_tokenizers import Whitespace
#
#tokenizer.pre_tokenizer = Whitespace()

# this is the training bit for using the example code:

files = [f"data/wikitext-103-raw/wiki.{split}.raw" for split in
["test", "train", "valid"]]
tokenizer.train(files, trainer)
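
# For context: the tokenizer and trainer used above aren't defined in this
# snippet. I'm assuming they come from the tokenizers quicktour setup, i.e.
# a BPE model and a BpeTrainer; a minimal sketch of that assumed setup:

from tokenizers import Tokenizer
from tokenizers.models import BPE
from tokenizers.trainers import BpeTrainer

# a BPE tokenizer with an unknown-token placeholder
tokenizer = Tokenizer(BPE(unk_token="[UNK]"))
# trainer that learns merges from the raw wikitext files
trainer = BpeTrainer(special_tokens=["[UNK]", "[CLS]", "[SEP]", "[PAD]", "[MASK]"])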

