[ot][spam][crazy] Being a tiny part of a goal

Undiscussed Horrific Abuse, One Victim of Many gmkarl at gmail.com
Sat Apr 2 06:52:45 PDT 2022


This goes in a Colab notebook, referenced at [git notes url in a thread].

import urllib.request
import io
import sentencepiece as spm

# Streams the training text from the URL as a sentence iterator and
# writes the trained model into an in-memory BytesIO buffer.
model = io.BytesIO()
with urllib.request.urlopen(
    'https://raw.githubusercontent.com/google/sentencepiece/master/data/botchan.txt'
) as response:
  spm.SentencePieceTrainer.train(
      sentence_iterator=response, model_writer=model, vocab_size=1000)

# Optionally save the serialized model to a file.
# with open('out.model', 'wb') as f:
#   f.write(model.getvalue())

# Load the model directly from the serialized bytes held in memory.
sp = spm.SentencePieceProcessor(model_proto=model.getvalue())
print(sp.encode('this is test'))
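
The snippet above works without touching the disk because the HTTP response is a binary file-like object that yields one line per iteration, and BytesIO collects whatever is written to it. A minimal standard-library sketch of those two behaviours follows; it is only an illustration, not sentencepiece itself. The names corpus and sink and the byte strings are made up, with a BytesIO standing in for the network response.

```python
import io

# Hypothetical stand-in for the HTTP response: iterating a binary
# file-like object yields one line at a time as bytes, newline included.
corpus = io.BytesIO(b"first sentence\nsecond sentence\n")
lines = list(corpus)

# Hypothetical stand-in for the model writer: BytesIO accumulates every
# write() call, and getvalue() returns all of the bytes written so far.
sink = io.BytesIO()
sink.write(b"serialized ")
sink.write(b"model bytes")
blob = sink.getvalue()

print(lines)
print(blob)
```

In the real snippet, the bytes returned by model.getvalue() are what gets passed as model_proto.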