[crazy][fiction][spam] Once upon a time,
Once upon a time,
Whoops! After people say "once upon a time", often they go off into a mode of communication where a slow fantasy is relayed. Maybe let's visit a text generation model and see if it can generate a story we edit. Let's try ai21, because its api gives some per-generated-token information and I have a little exposure to it.
$ git clone https://github.com/xloem/codesynth
$ cd codesynth
$ python3
from codesynth.causal_language_model import ai21
ai21()('Once upon a time,')
[{'prompt_text': 'Once upon a time,', 'prompt_tokens': [{'generatedToken': {'token': '▁Once▁upon▁a▁time', 'logprob': -9.43914794921875}, 'topTokens': None, 'textRange': {'start': 0, 'end': 16}}, {'generatedToken': {'token': ',', 'logprob': -0.5171183943748474}, 'topTokens': None, 'textRange': {'start': 16, 'end': 17}}, {'generatedToken': {'token': '▁there▁was▁a', 'logprob': -2.4541430473327637}, 'topTokens': None, 'textRange': {'start': 0, 'end': 12}}, {'generatedToken': {'token': '▁little▁girl', 'logprob': -3.0728108882904053}, 'topTokens': None, 'textRange': {'start': 12, 'end': 24}}, {'generatedToken': {'token': '▁named', 'logprob': -1.7874130010604858}, 'topTokens': None, 'textRange': {'start': 24, 'end': 30}}, {'generatedToken': {'token': '▁Sarah', 'logprob': -4.059205055236816}, 'topTokens': None, 'textRange': {'start': 30, 'end': 36}}, {'generatedToken': {'token': '.', 'logprob': -0.6026005148887634}, 'topTokens': None, 'textRange': {'start': 36, 'end': 37}}, {'generatedToken': {'token': '▁Sarah', 'logprob': -1.6510628461837769}, 'topTokens': None, 'textRange': {'start': 37, 'end': 43}}, {'generatedToken': {'token': '▁was', 'logprob': -1.4048548936843872}, 'topTokens': None, 'textRange': {'start': 43, 'end': 47}}, {'generatedToken': {'token': '▁', 'logprob': -3.338338613510132}, 'topTokens': None, 'textRange': {'start': 47, 'end': 48}}], 'generated_text': 'Once upon a time, there was a little girl named Sarah. 
Sarah was ', 'tokens': [{'generatedToken': {'token': '▁Once▁upon▁a▁time', 'logprob': -9.43914794921875}, 'topTokens': None, 'textRange': {'start': 0, 'end': 16}}, {'generatedToken': {'token': ',', 'logprob': -0.5171183943748474}, 'topTokens': None, 'textRange': {'start': 16, 'end': 17}}, {'generatedToken': {'token': '▁there▁was▁a', 'logprob': -2.4541430473327637}, 'topTokens': None, 'textRange': {'start': 0, 'end': 12}}, {'generatedToken': {'token': '▁little▁girl', 'logprob': -3.0728108882904053}, 'topTokens': None, 'textRange': {'start': 12, 'end': 24}}, {'generatedToken': {'token': '▁named', 'logprob': -1.7874130010604858}, 'topTokens': None, 'textRange': {'start': 24, 'end': 30}}, {'generatedToken': {'token': '▁Sarah', 'logprob': -4.059205055236816}, 'topTokens': None, 'textRange': {'start': 30, 'end': 36}}, {'generatedToken': {'token': '.', 'logprob': -0.6026005148887634}, 'topTokens': None, 'textRange': {'start': 36, 'end': 37}}, {'generatedToken': {'token': '▁Sarah', 'logprob': -1.6510628461837769}, 'topTokens': None, 'textRange': {'start': 37, 'end': 43}}, {'generatedToken': {'token': '▁was', 'logprob': -1.4048548936843872}, 'topTokens': None, 'textRange': {'start': 43, 'end': 47}}, {'generatedToken': {'token': '▁', 'logprob': -3.338338613510132}, 'topTokens': None, 'textRange': {'start': 47, 'end': 48}}], 'finish_reason': {'reason': 'length', 'length': 8}}]
Blurg! What a mess! I'd like to see other options at every token: words that could replace each word of 'Once upon a time, there was a little girl named Sarah. Sarah was ', which can be picked out from the above as the 'generated_text' field. I don't know whether ai21 shows alternative tokens, but I'm guessing there's an option to, since there's a field called 'topTokens' that is None everywhere above. I wonder how to enable it? Maybe I can find their API page.
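In the meantime, here's a rough sketch of picking the readable parts back out of a dump like that. It only assumes the field names visible in the output above ('generated_text', 'tokens', 'generatedToken', 'logprob'); the sample data is a hand-trimmed copy of the first five tokens, and the helper names token_to_text and summarize are made up for illustration. I'm also assuming the logprobs are natural logs, so exp() turns them back into probabilities.

```python
import math

# Hand-trimmed copy of the result above: first five tokens only, same fields.
result = [{
    'prompt_text': 'Once upon a time,',
    'generated_text': 'Once upon a time, there was a little girl named',
    'tokens': [
        {'generatedToken': {'token': '▁Once▁upon▁a▁time', 'logprob': -9.43914794921875},
         'topTokens': None, 'textRange': {'start': 0, 'end': 16}},
        {'generatedToken': {'token': ',', 'logprob': -0.5171183943748474},
         'topTokens': None, 'textRange': {'start': 16, 'end': 17}},
        {'generatedToken': {'token': '▁there▁was▁a', 'logprob': -2.4541430473327637},
         'topTokens': None, 'textRange': {'start': 0, 'end': 12}},
        {'generatedToken': {'token': '▁little▁girl', 'logprob': -3.0728108882904053},
         'topTokens': None, 'textRange': {'start': 12, 'end': 24}},
        {'generatedToken': {'token': '▁named', 'logprob': -1.7874130010604858},
         'topTokens': None, 'textRange': {'start': 24, 'end': 30}},
    ],
}]


def token_to_text(token):
    """Turn the '▁' word-boundary marker back into a plain space."""
    return token.replace('▁', ' ')


def summarize(entry):
    """Return (text, probability) pairs for each token in one result entry."""
    rows = []
    for item in entry['tokens']:
        generated = item['generatedToken']
        # Assumption: logprob is a natural log, so exp() recovers a probability.
        rows.append((token_to_text(generated['token']),
                     math.exp(generated['logprob'])))
    return rows


rows = summarize(result[0])
print(result[0]['generated_text'])
for text, prob in rows:
    print(f'{prob:7.2%}  {text!r}')
```

Joining the token texts and stripping the leading space gives back the 'generated_text' string, which is a handy sanity check that nothing was dropped.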
Ok! In this wrapper you can pass 'top_k' to say how many candidate tokens to see at each position. For each position it then lists the k most likely tokens (in the output below that list mostly includes the token that was actually picked), and choosing any of the others would produce a different stream of text. Here's the mess:
ai21()('Once upon a time,', top_k=2)
[{'prompt_text': 'Once upon a time,', 'prompt_tokens': [{'generatedToken': {'token': '▁Once▁upon▁a▁time', 'logprob': -9.43914794921875}, 'topTokens': [{'token': '▁The', 'logprob': -2.837585926055908}, {'token': '▁', 'logprob': -3.064148426055908}], 'textRange': {'start': 0, 'end': 16}}, {'generatedToken': {'token': ',', 'logprob': -0.5171183943748474}, 'topTokens': [{'token': ',', 'logprob': -0.5171183943748474}, {'token': '▁there▁was▁a', 'logprob': -2.571805953979492}], 'textRange': {'start': 16, 'end': 17}}, {'generatedToken': {'token': '▁there▁was▁a', 'logprob': -2.4541430473327637}, 'topTokens': [{'token': '▁there▁was▁a', 'logprob': -2.4541430473327637}, {'token': '▁in▁a', 'logprob': -3.4072680473327637}], 'textRange': {'start': 0, 'end': 12}}, {'generatedToken': {'token': '▁little▁girl', 'logprob': -3.0728108882904053}, 'topTokens': [{'token': '▁little▁girl', 'logprob': -3.0728108882904053}, {'token': '▁beautiful', 'logprob': -3.4634358882904053}], 'textRange': {'start': 12, 'end': 24}}, {'generatedToken': {'token': '▁named', 'logprob': -1.7874130010604858}, 'topTokens': [{'token': '▁named', 'logprob': -1.7874130010604858}, {'token': '▁who', 'logprob': -2.0686631202697754}], 'textRange': {'start': 24, 'end': 30}}, {'generatedToken': {'token': '▁Sarah', 'logprob': -4.059205055236816}, 'topTokens': [{'token': '▁Sarah', 'logprob': -4.059205055236816}, {'token': '▁', 'logprob': -4.207642555236816}], 'textRange': {'start': 30, 'end': 36}}, {'generatedToken': {'token': '.', 'logprob': -0.6026005148887634}, 'topTokens': [{'token': '.', 'logprob': -0.6026005148887634}, {'token': '▁who', 'logprob': -2.797913074493408}], 'textRange': {'start': 36, 'end': 37}}, {'generatedToken': {'token': '▁Sarah', 'logprob': -1.6510628461837769}, 'topTokens': [{'token': '▁Sarah', 'logprob': -1.6510628461837769}, {'token': '▁She▁was', 'logprob': -1.7057503461837769}], 'textRange': {'start': 37, 'end': 43}}, {'generatedToken': {'token': '▁was', 
'logprob': -1.4048548936843872}, 'topTokens': [{'token': '▁was', 'logprob': -1.4048548936843872}, {'token': '▁loved', 'logprob': -2.1236047744750977}], 'textRange': {'start': 43, 'end': 47}}, {'generatedToken': {'token': '▁', 'logprob': -3.338338613510132}, 'topTokens': [{'token': '▁', 'logprob': -3.338338613510132}, {'token': '▁a▁happy', 'logprob': -3.963338613510132}], 'textRange': {'start': 47, 'end': 48}}], 'generated_text': 'Once upon a time, there was a little girl named Sarah. Sarah was ', 'tokens': [{'generatedToken': {'token': '▁Once▁upon▁a▁time', 'logprob': -9.43914794921875}, 'topTokens': [{'token': '▁The', 'logprob': -2.837585926055908}, {'token': '▁', 'logprob': -3.064148426055908}], 'textRange': {'start': 0, 'end': 16}}, {'generatedToken': {'token': ',', 'logprob': -0.5171183943748474}, 'topTokens': [{'token': ',', 'logprob': -0.5171183943748474}, {'token': '▁there▁was▁a', 'logprob': -2.571805953979492}], 'textRange': {'start': 16, 'end': 17}}, {'generatedToken': {'token': '▁there▁was▁a', 'logprob': -2.4541430473327637}, 'topTokens': [{'token': '▁there▁was▁a', 'logprob': -2.4541430473327637}, {'token': '▁in▁a', 'logprob': -3.4072680473327637}], 'textRange': {'start': 0, 'end': 12}}, {'generatedToken': {'token': '▁little▁girl', 'logprob': -3.0728108882904053}, 'topTokens': [{'token': '▁little▁girl', 'logprob': -3.0728108882904053}, {'token': '▁beautiful', 'logprob': -3.4634358882904053}], 'textRange': {'start': 12, 'end': 24}}, {'generatedToken': {'token': '▁named', 'logprob': -1.7874130010604858}, 'topTokens': [{'token': '▁named', 'logprob': -1.7874130010604858}, {'token': '▁who', 'logprob': -2.0686631202697754}], 'textRange': {'start': 24, 'end': 30}}, {'generatedToken': {'token': '▁Sarah', 'logprob': -4.059205055236816}, 'topTokens': [{'token': '▁Sarah', 'logprob': -4.059205055236816}, {'token': '▁', 'logprob': -4.207642555236816}], 'textRange': {'start': 30, 'end': 36}}, {'generatedToken': {'token': '.', 'logprob': -0.6026005148887634}, 
'topTokens': [{'token': '.', 'logprob': -0.6026005148887634}, {'token': '▁who', 'logprob': -2.797913074493408}], 'textRange': {'start': 36, 'end': 37}}, {'generatedToken': {'token': '▁Sarah', 'logprob': -1.6510628461837769}, 'topTokens': [{'token': '▁Sarah', 'logprob': -1.6510628461837769}, {'token': '▁She▁was', 'logprob': -1.7057503461837769}], 'textRange': {'start': 37, 'end': 43}}, {'generatedToken': {'token': '▁was', 'logprob': -1.4048548936843872}, 'topTokens': [{'token': '▁was', 'logprob': -1.4048548936843872}, {'token': '▁loved', 'logprob': -2.1236047744750977}], 'textRange': {'start': 43, 'end': 47}}, {'generatedToken': {'token': '▁', 'logprob': -3.338338613510132}, 'topTokens': [{'token': '▁', 'logprob': -3.338338613510132}, {'token': '▁a▁happy', 'logprob': -3.963338613510132}], 'textRange': {'start': 47, 'end': 48}}], 'finish_reason': {'reason': 'length', 'length': 8}}]
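A slightly less messy way to look at that could be one line per position, with the chosen token and whatever else was on offer. This is only a sketch over a hand-trimmed copy of the top_k=2 output above (four positions, 'textRange' left out); the helper names alternatives and show are made up here and are not part of the wrapper.

```python
# Hand-trimmed from the top_k=2 output above: four positions, same field names.
tokens = [
    {'generatedToken': {'token': '▁Once▁upon▁a▁time', 'logprob': -9.43914794921875},
     'topTokens': [{'token': '▁The', 'logprob': -2.837585926055908},
                   {'token': '▁', 'logprob': -3.064148426055908}]},
    {'generatedToken': {'token': ',', 'logprob': -0.5171183943748474},
     'topTokens': [{'token': ',', 'logprob': -0.5171183943748474},
                   {'token': '▁there▁was▁a', 'logprob': -2.571805953979492}]},
    {'generatedToken': {'token': '▁there▁was▁a', 'logprob': -2.4541430473327637},
     'topTokens': [{'token': '▁there▁was▁a', 'logprob': -2.4541430473327637},
                   {'token': '▁in▁a', 'logprob': -3.4072680473327637}]},
    {'generatedToken': {'token': '▁little▁girl', 'logprob': -3.0728108882904053},
     'topTokens': [{'token': '▁little▁girl', 'logprob': -3.0728108882904053},
                   {'token': '▁beautiful', 'logprob': -3.4634358882904053}]},
]


def alternatives(item):
    """List the topTokens entries that differ from the token actually chosen."""
    chosen = item['generatedToken']['token']
    # topTokens is None when top_k was not requested, so fall back to [].
    return [alt['token'] for alt in (item['topTokens'] or [])
            if alt['token'] != chosen]


def show(items):
    """Print one line per position: the chosen token, then its alternatives."""
    for position, item in enumerate(items):
        chosen = item['generatedToken']['token'].replace('▁', ' ')
        others = [alt.replace('▁', ' ') for alt in alternatives(item)]
        print(f'{position:2d} {chosen!r:22} alternatives: {others}')


show(tokens)
```

With top_k=2 most positions end up with a single real alternative, because the chosen token takes one of the two slots; the prompt's first token is the exception, since it was not among the model's top two guesses.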
How would I like to see this? Maybe as a longer string of text, where I could select a word and look for alternative words? I'm not really sure. It could be nice if it generated a tree that one could explore. I dunno :S
Once upon a time, ....
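One step of the select-a-word idea might look something like the sketch below: keep the tokens before the selected position, swap in an alternative, and use the result as a new prompt. Doing that repeatedly is one way a tree of stories could grow. The function name branch_prompt is made up for illustration, and the token list is copied by hand from the output above.

```python
def branch_prompt(chosen_tokens, index, alternative):
    """Build a new prompt: the tokens before `index`, then `alternative`."""
    if not 0 <= index < len(chosen_tokens):
        raise IndexError('index must point at one of the chosen tokens')
    pieces = chosen_tokens[:index] + [alternative]
    # '▁' marks a word boundary; turn it into a space and drop the leading one.
    return ''.join(pieces).replace('▁', ' ').lstrip()


# Tokens the model chose, copied by hand from the output above.
chosen = ['▁Once▁upon▁a▁time', ',', '▁there▁was▁a', '▁little▁girl', '▁named']

# Select position 3 ('▁little▁girl') and swap in its alternative '▁beautiful'.
new_prompt = branch_prompt(chosen, 3, '▁beautiful')
print(new_prompt)

# Feeding new_prompt back through the same call as earlier (not run here,
# since it needs API access) would generate a sibling branch of the story:
#   ai21()(new_prompt, top_k=2)
```

Each branch costs another request, so a tree explored this way would probably want to expand nodes only when someone actually selects them.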
Undiscussed Horrific Abuse, One Victim of Many