[ot][crazy] Mainstream AI news snippets
Everybody seems to like synthetic imagery.
- genmo.ai claims to provide a text-to-video service with a free demo. I haven't visited it.
- an early text-to-3D-scene paper called DreamFusion came out some months ago. This is like Stable Diffusion, but it generates virtual scenes and models from text rather than just images. An open-source variant is at https://github.com/ashawkey/stable-dreamfusion . I tried the colab briefly, but for some reason it kept reporting a loss of 0 for me; I didn't look at the results, so I'm not sure whether that's a bug or not.
lucidrains' most recent commit is for a medical ChatGPT: https://github.com/lucidrains/medical-chatgpt . Note: OpenAI's ChatGPT RLHF training approach looks incredibly slow to me; I haven't checked whether the same approach is being pursued here. The repository describes itself as: "Explorations into training a ChatGPT, but tailored towards primary care medicine, with the reward being able to collect patient histories in a thorough and efficient manner and come up with a differential diagnosis. May also explore to see if it can be further fine-tuned on pirated copies of Up-To-Date for specialist knowledge."
Google added a "v2" subfolder to their FLAN repository last week: https://github.com/google-research/FLAN/tree/main/flan/v2 . FLAN-T5 is a family of large downloadable text models finetuned to follow instructions; BigScience has similar models.
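For anyone who wants to poke at FLAN-T5 locally, here is a minimal sketch of loading the smallest published checkpoint with Hugging Face transformers and running one instruction; the model id "google/flan-t5-small" is the published one, the prompt is just an example, and the larger variants (base/large/xl/xxl) follow the same pattern.

# minimal FLAN-T5 sketch using Hugging Face transformers
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-small")

# give the model a plain-language instruction and decode its reply
inputs = tokenizer("Translate English to German: The house is small.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))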
https://github.com/hwchase17/langchain appears to be a python package for using and managing pretrained language models; it supports both local and cloud backends (most of the examples are cloud) in a pluggable manner. (GitHub told me Jay Hack forked this. It is quite popular, and nothing similar had previously turned up for me.) A minimal usage sketch follows the list below. Looking at projects that use it is one way to find other useful things, for example:
- https://github.com/namuan/dr-doc-search uses langchain and the paid OpenAI API to turn books into systems that can answer questions about their content
- https://github.com/hwchase17/chat-langchain uses the paid OpenAI API to produce a chatbot that answers questions about langchain's documentation
- https://github.com/jagilley/fact-checker is a script that feeds language-model output back on itself so as to correct false information
…
- https://github.com/BlackHC/llm-strategy lets you concisely write python classes with methods that call out to language models for their implementations
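Here is that usage sketch of langchain's pluggable-backend idea as I understand its current API; the class names are from recent langchain versions and may shift, the OpenAI backend needs a paid key in the OPENAI_API_KEY environment variable, and the prompt is just an example.

# minimal langchain sketch: a prompt template chained to a swappable LLM backend
from langchain.llms import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

prompt = PromptTemplate(
    input_variables=["topic"],
    template="Write one sentence summarizing {topic}.",
)
llm = OpenAI(temperature=0)  # cloud backend; other LLM classes plug in the same way
chain = LLMChain(llm=llm, prompt=prompt)
print(chain.run("the langchain python package"))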
https://github.com/openai/shap-e OpenAI publicly released a text-to-3D-mesh model. The weights are about 5GB, the code is somewhat obscure, and reimplementations can usually be expected within the week.
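From memory of the repo's text-to-3D sample notebook, the sampling flow looks roughly like this; the function names and arguments below are recalled rather than verified, so treat this as a sketch, not the definitive interface.

# rough shap-e text-to-latent sketch (names recalled from the sample notebook)
import torch
from shap_e.diffusion.sample import sample_latents
from shap_e.diffusion.gaussian_diffusion import diffusion_from_config
from shap_e.models.download import load_model, load_config

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
xm = load_model("transmitter", device=device)     # latent-to-geometry decoder
model = load_model("text300M", device=device)     # text-conditioned prior
diffusion = diffusion_from_config(load_config("diffusion"))

prompt = "a shark"  # example prompt, not from the post
latents = sample_latents(
    batch_size=1,
    model=model,
    diffusion=diffusion,
    guidance_scale=15.0,
    model_kwargs=dict(texts=[prompt]),
    progress=True,
    clip_denoised=True,
    use_fp16=True,
    use_karras=True,
    karras_steps=64,
    sigma_min=1e-3,
    sigma_max=160,
    s_churn=0,
)
# the repo's notebook utilities (e.g. decode_latent_mesh) then turn each
# latent into a triangle mesh for export.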
https://discord.gg/gpt4free https://discord.com/channels/1099825897697189898/1101201318053412934/1107599... <@&1100751124325216256> "google bard just got an insane upgrade, it has no char limit, and is really good at deobfuscating code and reasoning. you can join the waiting list; if it's not available in your country, use a vpn, like proton, and set location to USA or UK. pics below, with comparison to gpt-4. I will be integrating it in g4f, and I am working on getting autogpt running with the interference api, but I think you just have to change the api_base param in the autogpt repo to your localhost url."
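For context on that last point, pointing an OpenAI-style client at a local "interference" server generally just means overriding the base URL; the localhost URL and port below are placeholders I made up, not values confirmed by the message.

# sketch: redirect the openai python client (0.x interface) to a local endpoint
import openai

openai.api_base = "http://localhost:1337/v1"  # hypothetical local interference server
openai.api_key = "not-needed-locally"         # placeholder key

resp = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "hello"}],
)
print(resp["choices"][0]["message"]["content"])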
TinyStories: How Small Can Language Models Be and Still Speak Coherent English?
via reddit: https://www.reddit.com/r/MachineLearning/comments/13j0spj/r_tiny_language_mo...
https://arxiv.org/pdf/2305.07759
Language models (LMs) are powerful tools for natural language processing, but they often struggle to produce coherent and fluent text when they are small. Models with around 125M parameters such as GPT-Neo (small) or GPT-2 (small) can rarely generate coherent and consistent English text beyond a few words even after extensive training. This raises the question of whether the emergence of the ability to produce coherent English text only occurs at larger scales (with hundreds of millions of parameters or more) and complex architectures (with many layers of global attention). In this work, we introduce TinyStories, a synthetic dataset of short stories that only contain words that a typical 3 to 4-year-old usually understands, generated by GPT-3.5 and GPT-4. We show that TinyStories can be used to train and evaluate LMs that are much smaller than the state-of-the-art models (below 10 million total parameters), or have much simpler architectures (with only one transformer block), yet still produce fluent and consistent stories with several paragraphs that are diverse and have almost perfect grammar, and demonstrate reasoning capabilities.
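If you want to try one of the tiny checkpoints yourself, here is a hedged sketch; I'm assuming the roneneldan/TinyStories-1M model id and the GPT-Neo tokenizer pairing from the authors' Hugging Face release, neither of which I've verified.

# sketch: sampling from a TinyStories checkpoint (model/tokenizer ids assumed)
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neo-125M")  # tokenizer the TinyStories models are said to use
model = AutoModelForCausalLM.from_pretrained("roneneldan/TinyStories-1M")

inputs = tokenizer("Once upon a time there was", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=60, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))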