Once upon a time, Whoops! After people say "once upon a time", they often drift into a mode of communication where a slow fantasy is relayed. Maybe we can visit a text generation model and see whether it can generate a story we can edit. Let's try ai21, because its API gives back some per-token information and I have a little exposure to it.

$ git clone https://github.com/xloem/codesynth
$ cd codesynth
$ python3
>>> from codesynth.causal_language_model import ai21
>>> ai21()('Once upon a time,')
[{'prompt_text': 'Once upon a time,', 'prompt_tokens': [{'generatedToken': {'token': '▁Once▁upon▁a▁time', 'logprob': -9.43914794921875}, 'topTokens': None, 'textRange': {'start': 0, 'end': 16}}, {'generatedToken': {'token': ',', 'logprob': -0.5171183943748474}, 'topTokens': None, 'textRange': {'start': 16, 'end': 17}}, {'generatedToken': {'token': '▁there▁was▁a', 'logprob': -2.4541430473327637}, 'topTokens': None, 'textRange': {'start': 0, 'end': 12}}, {'generatedToken': {'token': '▁little▁girl', 'logprob': -3.0728108882904053}, 'topTokens': None, 'textRange': {'start': 12, 'end': 24}}, {'generatedToken': {'token': '▁named', 'logprob': -1.7874130010604858}, 'topTokens': None, 'textRange': {'start': 24, 'end': 30}}, {'generatedToken': {'token': '▁Sarah', 'logprob': -4.059205055236816}, 'topTokens': None, 'textRange': {'start': 30, 'end': 36}}, {'generatedToken': {'token': '.', 'logprob': -0.6026005148887634}, 'topTokens': None, 'textRange': {'start': 36, 'end': 37}}, {'generatedToken': {'token': '▁Sarah', 'logprob': -1.6510628461837769}, 'topTokens': None, 'textRange': {'start': 37, 'end': 43}}, {'generatedToken': {'token': '▁was', 'logprob': -1.4048548936843872}, 'topTokens': None, 'textRange': {'start': 43, 'end': 47}}, {'generatedToken': {'token': '▁', 'logprob': -3.338338613510132}, 'topTokens': None, 'textRange': {'start': 47, 'end': 48}}], 'generated_text': 'Once upon a time, there was a little girl named Sarah. Sarah was ', 'tokens': [{'generatedToken': {'token': '▁Once▁upon▁a▁time', 'logprob': -9.43914794921875}, 'topTokens': None, 'textRange': {'start': 0, 'end': 16}}, {'generatedToken': {'token': ',', 'logprob': -0.5171183943748474}, 'topTokens': None, 'textRange': {'start': 16, 'end': 17}}, {'generatedToken': {'token': '▁there▁was▁a', 'logprob': -2.4541430473327637}, 'topTokens': None, 'textRange': {'start': 0, 'end': 12}}, {'generatedToken': {'token': '▁little▁girl', 'logprob': -3.0728108882904053}, 'topTokens': None, 'textRange': {'start': 12, 'end': 24}}, {'generatedToken': {'token': '▁named', 'logprob': -1.7874130010604858}, 'topTokens': None, 'textRange': {'start': 24, 'end': 30}}, {'generatedToken': {'token': '▁Sarah', 'logprob': -4.059205055236816}, 'topTokens': None, 'textRange': {'start': 30, 'end': 36}}, {'generatedToken': {'token': '.', 'logprob': -0.6026005148887634}, 'topTokens': None, 'textRange': {'start': 36, 'end': 37}}, {'generatedToken': {'token': '▁Sarah', 'logprob': -1.6510628461837769}, 'topTokens': None, 'textRange': {'start': 37, 'end': 43}}, {'generatedToken': {'token': '▁was', 'logprob': -1.4048548936843872}, 'topTokens': None, 'textRange': {'start': 43, 'end': 47}}, {'generatedToken': {'token': '▁', 'logprob': -3.338338613510132}, 'topTokens': None, 'textRange': {'start': 47, 'end': 48}}], 'finish_reason': {'reason': 'length', 'length': 8}}]
Blurg! What a mess! I'd like to see other options at every token: words that could replace each word of 'Once upon a time, there was a little girl named Sarah. Sarah was ', which can be picked out of the above as the 'generated_text' field. I don't know whether ai21 will show alternative tokens, but I'm guessing there's an option to, since every token entry has a 'topTokens' field that came back as None. I wonder how to enable it? Maybe I can find their API page.
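In the meantime, here's a minimal sketch of pulling the per-token information out of that blob. It assumes the structure printed above (a list of result dicts, each with a 'generated_text' string and a 'tokens' list of 'generatedToken' entries); what 'topTokens' would actually hold if it were filled in is only my guess.

# A minimal sketch of walking the result structure printed above.
from codesynth.causal_language_model import ai21

results = ai21()('Once upon a time,')  # re-queries the api, so the story may differ
for completion in results:
    print(completion['generated_text'])
    for entry in completion['tokens']:
        tok = entry['generatedToken']
        # '▁' is the tokenizer's marker for a leading space
        text = tok['token'].replace('\u2581', ' ')
        print(repr(text), 'logprob =', round(tok['logprob'], 3))
        # 'topTokens' came back None above; if the api can be told to return
        # alternatives, I'd guess each entry would hold candidate tokens with
        # their own logprobs, so just print whatever shows up
        if entry['topTokens']:
            for alt in entry['topTokens']:
                print('    alternative:', alt)

Against the dump above this just prints each chunk ('▁little▁girl' comes out as ' little girl') next to its logprob, with no alternatives, since 'topTokens' is None everywhere.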