Automated Image Generation

"Everybody" knows that current neural networks can make any photorealistic image you want. But access to this is not universal.

OpenAI made DALL-E some time ago: https://openai.com/blog/dall-e/ . Their big hallmark was an "armchair in the shape of an avocado". If you look at their examples with some experience, it looks kinda like they trained on only a subset of what's possible, maybe to stimulate competing research; that would mean they needed fewer resources, since they only had to produce the images they demonstrate.

That stagnated for a while. People made public approaches such as VQGAN and diffusion guidance, mostly (but not all) steering generation with a model called CLIP, which was released by OpenAI and has significant limitations that can be worked around; CLIP's role in these pipelines is to score how well an image matches the text prompt, and the generator is nudged to raise that score (see the sketch at the end of this section). A recent development in this direction is OpenAI's own GLIDE: https://github.com/openai/glide-text2im . Many of the community methods were aided by work by Katherine Crowson with EleutherAI, and a good community-cooperated result of them is maybe https://github.com/pixray/pixray .

A community attempt to replicate DALL-E itself sprang up eventually, and their work was eventually made mainstream, but wasn't very powerful when I last looked: https://github.com/borisdayma/dalle-mini . It's quite inspiring to see the hard work from random peeps, and I know they are still training their model to be better.

All of a sudden, Russia's Sberbank comes in and releases a public model, ruDALL-E, that at a biased glance looks like somebody just threw a goldmine of resources at it. The encouraged way to use it is to visit a site in Russian with JavaScript and captchas, though the weights themselves are public: https://huggingface.co/sberbank-ai/rudalle-Malevich (a rough local-use sketch follows at the end of this section).

Meanwhile, researchers have finally gotten on board with training networks in the open, as can be seen from the new research image model that is being trained live right now as we speak: https://huggingface.co/training-transformers-together/dalle-demo-v1
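
To make the CLIP-guidance idea above concrete, here is a minimal sketch of the scoring step these methods optimize against, using the openai/CLIP package. The file name candidate.png and the prompt are placeholders of mine; the actual guidance loop, which repeatedly backpropagates this score into a VQGAN latent or a diffusion sampler, is omitted.

```python
# Minimal sketch: CLIP as the judge that guidance methods optimize against.
# Assumes `pip install torch git+https://github.com/openai/CLIP.git` and a
# placeholder image `candidate.png`; the prompt is likewise illustrative.
import torch
import clip
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

image = preprocess(Image.open("candidate.png")).unsqueeze(0).to(device)
text = clip.tokenize(["an armchair in the shape of an avocado"]).to(device)

with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)

# Cosine similarity between image and prompt: VQGAN+CLIP-style methods
# repeatedly tweak the generator's latent to push this score up.
image_features = image_features / image_features.norm(dim=-1, keepdim=True)
text_features = text_features / text_features.norm(dim=-1, keepdim=True)
score = (image_features @ text_features.T).item()
print(f"CLIP similarity: {score:.3f}")
```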
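
And for anyone who would rather skip the captcha site, the ruDALL-E weights on that Hugging Face page can be driven locally through the accompanying rudalle Python package. The entry points below (get_rudalle_model, get_tokenizer, get_vae, generate_images) follow the project's README as I recall it; treat them as assumptions and check the repo for the current API.

```python
# Rough sketch of local ruDALL-E generation, assuming the entry points the
# project's README documents (they may have changed; verify against the repo).
# Needs `pip install rudalle` and a CUDA GPU for fp16.
from rudalle import get_rudalle_model, get_tokenizer, get_vae
from rudalle.pipelines import generate_images
from rudalle.utils import seed_everything

device = 'cuda'
seed_everything(42)

dalle = get_rudalle_model('Malevich', pretrained=True, fp16=True, device=device)
tokenizer = get_tokenizer()
vae = get_vae().to(device)

# Prompts are Russian; this one is "an armchair in the shape of an avocado".
text = 'кресло в форме авокадо'

pil_images, scores = generate_images(
    text, tokenizer, dalle, vae,
    top_k=2048, top_p=0.995, images_num=4,
)
pil_images[0].save('rudalle_out.png')
```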