cypherpunks Digest, Vol 106, Issue 95

Gunnar Larson g at xny.io
Fri Apr 8 17:51:25 PDT 2022


Oh yes he did, he did do it.

Gunnar Larson raped him. He had to ...

On Fri, Apr 8, 2022, 8:32 PM <cypherpunks-request at lists.cpunks.org> wrote:

> Send cypherpunks mailing list submissions to
>         cypherpunks at lists.cpunks.org
>
> To subscribe or unsubscribe via the World Wide Web, visit
>         https://lists.cpunks.org/mailman/listinfo/cypherpunks
> or, via email, send a message with subject or body 'help' to
>         cypherpunks-request at lists.cpunks.org
>
> You can reach the person managing the list at
>         cypherpunks-owner at lists.cpunks.org
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of cypherpunks digest..."
>
>
> Today's Topics:
>
>    1. Re: cypherpunks Digest, Vol 106, Issue 94 (Gunnar Larson)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Fri, 8 Apr 2022 20:29:43 -0400
> From: Gunnar Larson <g at xny.io>
> To: cypherpunks <cypherpunks at lists.cpunks.org>
> Subject: Re: cypherpunks Digest, Vol 106, Issue 94
> Message-ID: <CAPc8xwO4+uLbaR52tWvdyRckPtLWO49uxSHk-boT0HwUkMmUVw at mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> Did Gunnar Larson, rape Mr. Mark Zuckerburg? Or, was it fair game?
>
> Finders keepers?
>
> On Fri, Apr 8, 2022, 7:56 PM <cypherpunks-request at lists.cpunks.org> wrote:
>
> > Send cypherpunks mailing list submissions to
> >         cypherpunks at lists.cpunks.org
> >
> > To subscribe or unsubscribe via the World Wide Web, visit
> >         https://lists.cpunks.org/mailman/listinfo/cypherpunks
> > or, via email, send a message with subject or body 'help' to
> >         cypherpunks-request at lists.cpunks.org
> >
> > You can reach the person managing the list at
> >         cypherpunks-owner at lists.cpunks.org
> >
> > When replying, please edit your Subject line so it is more specific
> > than "Re: Contents of cypherpunks digest..."
> >
> >
> > Today's Topics:
> >
> >    1. Re: cypherpunks Digest, Vol 106, Issue 93 (Gunnar Larson)
> >
> >
> > ----------------------------------------------------------------------
> >
> > Message: 1
> > Date: Fri, 8 Apr 2022 19:54:15 -0400
> > From: Gunnar Larson <g at xny.io>
> > To: cypherpunks <cypherpunks at lists.cpunks.org>
> > Subject: Re: cypherpunks Digest, Vol 106, Issue 93
> > Message-ID: <CAPc8xwPsCK2cA3tT1U-wjuV09T5kc=TBMcrzLz46uyNHJXV9cg at mail.gmail.com>
> > Content-Type: text/plain; charset="utf-8"
> >
> > At first glance, this was a great article.
> >
> > On Fri, Apr 8, 2022, 7:52 PM <cypherpunks-request at lists.cpunks.org>
> wrote:
> >
> > > Send cypherpunks mailing list submissions to
> > >         cypherpunks at lists.cpunks.org
> > >
> > > To subscribe or unsubscribe via the World Wide Web, visit
> > >         https://lists.cpunks.org/mailman/listinfo/cypherpunks
> > > or, via email, send a message with subject or body 'help' to
> > >         cypherpunks-request at lists.cpunks.org
> > >
> > > You can reach the person managing the list at
> > >         cypherpunks-owner at lists.cpunks.org
> > >
> > > When replying, please edit your Subject line so it is more specific
> > > than "Re: Contents of cypherpunks digest..."
> > >
> > >
> > > Today's Topics:
> > >
> > >    1. Re: DALL-E (coderman)
> > >
> > >
> > > ----------------------------------------------------------------------
> > >
> > > Message: 1
> > > Date: Fri, 08 Apr 2022 23:50:53 +0000
> > > From: coderman <coderman at protonmail.com>
> > > To: coderman <coderman at protonmail.com>
> > > Cc: "cy\"Cypherpunks" <cypherpunks at cpunks.org>
> > > Subject: Re: DALL-E
> > > Message-ID: <a9WeFGpr9g422W0Uym9aQZyxT6mqWNzNwLsG6yKqqlD4BLpH6NxuARXLOMvBY8IdZF9HMetBKZYGjdH--qJRFZDIWnXdMRQVqr3pmMYVo5I=@protonmail.com>
> > > Content-Type: text/plain; charset="utf-8"
> > >
> > > DALL·E[1](https://openai.com/blog/dall-e/#fn1) is a 12-billion parameter
> > > version of [GPT-3](https://arxiv.org/abs/2005.14165) trained to generate
> > > images from text descriptions, using a dataset of text–image pairs. We’ve
> > > found that it has a diverse set of capabilities, including creating
> > > anthropomorphized versions of animals and objects, combining unrelated
> > > concepts in plausible ways, rendering text, and applying transformations
> > > to existing images.
> > >
> > > ---------------------------------------------------------------
> > >
> > > [Interactive figure: AI-generated images for the following prompts]
> > >
> > > Text prompt: an illustration of a baby daikon radish in a tutu walking a dog
> > > Text prompt: an armchair in the shape of an avocado. . . .
> > > Text prompt: a store front that has the word ‘openai’ written on it. . . .
> > > Text & image prompt: the exact same cat on the top as a sketch on the bottom
> > >
> > > ---------------------------------------------------------------
> > >
> > > GPT-3 showed that language can be used to instruct a large neural
> > > network to perform a variety of text generation tasks. [Image GPT](
> > > https://openai.com/blog/image-gpt) showed that the same type of neural
> > > network can also be used to generate images with high fidelity. We extend
> > > these findings to show that manipulating visual concepts through language
> > > is now within reach.
> > >
> > > Overview
> > >
> > > Like GPT-3, DALL·E is a transformer language model. It receives both the
> > > text and the image as a single stream of data containing up to 1280
> > > tokens, and is trained using maximum likelihood to generate all of the
> > > tokens, one after another.[2](https://openai.com/blog/dall-e/#fn2)
> > >
> > > A token is any symbol from a discrete vocabulary; for humans, each
> > > English letter is a token from a 26-letter alphabet. DALL·E’s vocabulary
> > > has tokens for both text and image concepts. Specifically, each image
> > > caption is represented using a maximum of 256 BPE-encoded tokens with a
> > > vocabulary size of 16384, and the image is represented using 1024 tokens
> > > with a vocabulary size of 8192.
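> > >
> > > As a rough illustration of the objective described above (a sketch under
> > > stated assumptions, not the released DALL·E code; the `model` wrapper and
> > > shapes are hypothetical), a maximum-likelihood step over the combined
> > > 1280-token stream could look like:
> > >
> > >     # Illustrative only: the numbers mirror those quoted in the post.
> > >     import torch
> > >     import torch.nn.functional as F
> > >
> > >     TEXT_LEN, TEXT_VOCAB = 256, 16384   # BPE-encoded caption tokens
> > >     IMG_LEN, IMG_VOCAB = 1024, 8192     # 32x32 grid of discrete image codes
> > >     TOTAL_LEN = TEXT_LEN + IMG_LEN      # 1280-token stream
> > >
> > >     def training_step(model, text_tokens, image_tokens):
> > >         """One maximum-likelihood step on the concatenated stream.
> > >         text_tokens:  (batch, 256) ints in [0, 16384)
> > >         image_tokens: (batch, 1024) ints in [0, 8192)
> > >         model: any autoregressive transformer returning next-token logits."""
> > >         # Share one vocabulary by offsetting image codes past the text codes.
> > >         stream = torch.cat([text_tokens, image_tokens + TEXT_VOCAB], dim=1)
> > >         logits = model(stream[:, :-1])          # predict token t from tokens < t
> > >         loss = F.cross_entropy(logits.reshape(-1, logits.size(-1)),
> > >                                stream[:, 1:].reshape(-1))
> > >         return loss                             # negative log-likelihood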
> > >
> > > The images are preprocessed to 256x256 resolution during training.
> > > Similar to VQVAE,[14](https://openai.com/blog/dall-e/#rf14)[15](
> > > https://openai.com/blog/dall-e/#rf15) each image is compressed to a 32x32
> > > grid of discrete latent codes using a discrete VAE[10](
> > > https://openai.com/blog/dall-e/#rf10)[11](https://openai.com/blog/dall-e/#rf11)
> > > that we pretrained using a continuous relaxation.[12](
> > > https://openai.com/blog/dall-e/#rf12)[13](https://openai.com/blog/dall-e/#rf13)
> > > We found that training using the relaxation obviates the need for an
> > > explicit codebook, EMA loss, or tricks like dead code revival, and can
> > > scale up to large vocabulary sizes.
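> > >
> > > As a rough sketch of the relaxation mentioned above (a toy encoder with
> > > hypothetical layer widths, not the pretrained dVAE described here),
> > > compressing a 256x256 image to a 32x32 grid of codes with a
> > > Gumbel-softmax relaxation could look like:
> > >
> > >     # Illustrative only: relaxed discrete codes, no explicit codebook or EMA loss.
> > >     import torch
> > >     import torch.nn.functional as F
> > >
> > >     NUM_CODES = 8192   # image-token vocabulary size
> > >
> > >     class ToyDVAEEncoder(torch.nn.Module):
> > >         def __init__(self):
> > >             super().__init__()
> > >             # 256x256 -> 32x32 via three stride-2 convolutions (widths are made up).
> > >             self.net = torch.nn.Sequential(
> > >                 torch.nn.Conv2d(3, 64, 4, stride=2, padding=1), torch.nn.ReLU(),
> > >                 torch.nn.Conv2d(64, 128, 4, stride=2, padding=1), torch.nn.ReLU(),
> > >                 torch.nn.Conv2d(128, NUM_CODES, 4, stride=2, padding=1))
> > >
> > >         def forward(self, images, tau=1.0, hard=False):
> > >             logits = self.net(images)                   # (B, 8192, 32, 32)
> > >             soft = F.gumbel_softmax(logits, tau=tau, hard=hard, dim=1)
> > >             tokens = soft.argmax(dim=1)                 # (B, 32, 32) code ids
> > >             return soft, tokens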
> > >
> > > This training procedure allows DALL·E not only to generate an image from
> > > scratch, but also to regenerate any rectangular region of an existing
> > > image that extends to the bottom-right corner, in a way that is
> > > consistent with the text prompt.
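> > >
> > > A minimal sketch of how that completion could be driven at sampling time
> > > (the helper below is hypothetical and omits vocabulary offsets and
> > > temperature details):
> > >
> > >     # Illustrative only: keep the caption plus the known region of image
> > >     # tokens, then sample the remaining image tokens autoregressively.
> > >     import torch
> > >
> > >     def complete_image(model, caption_tokens, known_image_tokens, img_len=1024):
> > >         """caption_tokens: (256,) BPE ids; known_image_tokens: ids already fixed."""
> > >         stream = torch.cat([caption_tokens, known_image_tokens]).unsqueeze(0)
> > >         while stream.size(1) < 256 + img_len:
> > >             logits = model(stream)[:, -1]               # next-token distribution
> > >             nxt = torch.multinomial(torch.softmax(logits, dim=-1), 1)
> > >             stream = torch.cat([stream, nxt], dim=1)
> > >         return stream[0, 256:]                          # 1024 ids -> decode with the dVAE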
> > >
> > > We recognize that work involving generative models has the potential for
> > > significant, broad societal impacts. In the future, we plan to analyze
> > > how models like DALL·E relate to societal issues like economic impact on
> > > certain work processes and professions, the potential for bias in the
> > > model outputs, and the longer term ethical challenges implied by this
> > > technology.
> > >
> > > Capabilities
> > >
> > > We find that DALL·E is able to create plausible images for a great
> > > variety of sentences that explore the compositional structure of
> > > language. We illustrate this using a series of interactive visuals in the
> > > next section. The samples shown for each caption in the visuals are
> > > obtained by taking the top 32 of 512 after reranking with [CLIP](
> > > https://openai.com/blog/clip/), but we do not use any manual
> > > cherry-picking, aside from the thumbnails and standalone images that
> > > appear outside.[3](https://openai.com/blog/dall-e/#fn3)
> > >
> > > Further details are provided in [a later section](https://openai.com/blog/dall-e/#summary).
> > >
> > > Controlling Attributes
> > >
> > > We test DALL·E’s ability to modify several of an object’s attributes, as
> > > well as the number of times that it appears.
> > >
> > > a pentagonal green clock. a green clock in the shape of a pentagon.
> > >
> > > a cube made of porcupine. a cube with the texture of a porcupine.
> > >
> > > a collection of glasses is sitting on a table
> > >
> > > Drawing Multiple Objects
> > >
> > > Simultaneously controlling multiple objects, their attributes, and their
> > > spatial relationships presents a new challenge. For example, consider the
> > > phrase “a hedgehog wearing a red hat, yellow gloves, blue shirt, and
> > > green pants.” To correctly interpret this sentence, DALL·E must not only
> > > correctly compose each piece of apparel with the animal, but also form
> > > the associations (hat, red), (gloves, yellow), (shirt, blue), and (pants,
> > > green) without mixing them up.[4](https://openai.com/blog/dall-e/#fn4)
> > >
> > > This task is called variable binding, and has been extensively studied in
> > > the literature.[17](https://openai.com/blog/dall-e/#rf17)[18](
> > > https://openai.com/blog/dall-e/#rf18)[19](https://openai.com/blog/dall-e/#rf19)[20](
> > > https://openai.com/blog/dall-e/#rf20)
> > >
> > > We test DALL·E’s ability to do this for relative positioning, stacking
> > > objects, and controlling multiple attributes.
> > >
> > > a small red block sitting on a large green block
> > >
> > > a stack of 3 cubes. a red cube is on the top, sitting on a green cube.
> > > the green cube is in the middle, sitting on a blue cube. the blue cube
> > > is on the bottom.
> > >
> > > an emoji of a baby penguin wearing a blue hat, red gloves, green shirt,
> > > and yellow pants
> > >
> > > While DALL·E does offer some level of controllability over the attributes
> > > and positions of a small number of objects, the success rate can depend
> > > on how the caption is phrased. As more objects are introduced, DALL·E is
> > > prone to confusing the associations between the objects and their colors,
> > > and the success rate decreases sharply. We also note that DALL·E is
> > > brittle with respect to rephrasing of the caption in these scenarios:
> > > alternative, semantically equivalent captions often yield no correct
> > > interpretations.
> > >
> > > Visualizing Perspective and Three-Dimensionality
> > >
> > > We find that DALL·E also allows for control over the viewpoint of a scene
> > > and the 3D style in which a scene is rendered.
> > >
> > > an extreme close-up view of a capybara sitting in a field
> > >
> > > a capybara made of voxels sitting in a field
> > >
> > > To push this further, we test DALL·E’s ability to repeatedly draw the
> > > head of a well-known figure at each angle from a sequence of equally
> > > spaced angles, and find that we can recover a smooth animation of the
> > > rotating head.
> > >
> > > a photograph of a bust of homer
> > >
> > > DALL·E appears to be able to apply some types of optical distortions to
> > > scenes, as we see with the options “fisheye lens view” and “a spherical
> > > panorama.” This motivated us to explore its ability to generate
> > > reflections.
> > >
> > > a plain white cube looking at its own reflection in a mirror. a plain
> > > white cube gazing at itself in a mirror.
> > >
> > > Visualizing Internal and External Structure
> > >
> > > The samples from the “extreme close-up view” and “x-ray” style led us to
> > > further explore DALL·E’s ability to render internal structure with
> > > cross-sectional views, and external structure with macro photographs.
> > >
> > > a cross-section view of a walnut
> > >
> > > a macro photograph of brain coral
> > >
> > > Inferring Contextual Details
> > >
> > > The task of translating text to images is underspecified: a single
> > > caption generally corresponds to an infinitude of plausible images, so
> > > the image is not uniquely determined. For instance, consider the caption
> > > “a painting of a capybara sitting on a field at sunrise.” Depending on
> > > the orientation of the capybara, it may be necessary to draw a shadow,
> > > though this detail is never mentioned explicitly. We explore DALL·E’s
> > > ability to resolve underspecification in three cases: changing style,
> > > setting, and time; drawing the same object in a variety of different
> > > situations; and generating an image of an object with specific text
> > > written on it.
> > >
> > > a painting of a capybara sitting in a field at sunrise
> > >
> > > a stained glass window with an image of a blue strawberry
> > >
> > > a store front that has the word ‘openai’ written on it. a store front
> > > that has the word ‘openai’ written on it. a store front that has the
> > > word ‘openai’ written on it. ‘openai’ store front.
> > >
> > > With varying degrees of reliability, DALL·E provides access to a subset
> > > of the capabilities of a 3D rendering engine via natural language. It can
> > > independently control the attributes of a small number of objects, and to
> > > a limited extent, how many there are, and how they are arranged with
> > > respect to one another. It can also control the location and angle from
> > > which a scene is rendered, and can generate known objects in compliance
> > > with precise specifications of angle and lighting conditions.
> > >
> > > Unlike a 3D rendering engine, whose inputs must be specified
> > > unambiguously and in complete detail, DALL·E is often able to “fill in
> > > the blanks” when the caption implies that the image must contain a
> > > certain detail that is not explicitly stated.
> > >
> > > Applications of Preceding Capabilities
> > >
> > > Next, we explore the use of the preceding capabilities for fashion and
> > > interior design.
> > >
> > > a male mannequin dressed in an orange and black flannel shirt
> > >
> > > a female mannequin dressed in a black leather jacket and gold pleated
> > > skirt
> > >
> > > a living room with two white armchairs and a painting of the colosseum.
> > > the painting is mounted above a modern fireplace.
> > >
> > > a loft bedroom with a white bed next to a nightstand. there is a fish
> > > tank beside the bed.
> > >
> > > Combining Unrelated Concepts
> > >
> > > The compositional nature of language allows us to put together concepts
> > > to describe both real and imaginary things. We find that DALL·E also has
> > > the ability to combine disparate ideas to synthesize objects, some of
> > > which are unlikely to exist in the real world. We explore this ability in
> > > two instances: transferring qualities from various concepts to animals,
> > > and designing products by taking inspiration from unrelated concepts.
> > >
> > > a snail made of harp. a snail with the texture of a harp.
> > >
> > > an armchair in the shape of an avocado. an armchair imitating an avocado.
> > >
> > > Animal Illustrations
> > >
> > > In the previous section, we explored DALL·E’s ability to combine
> > > unrelated concepts when generating images of real-world objects. Here, we
> > > explore this ability in the context of art, for three kinds of
> > > illustrations: anthropomorphized versions of animals and objects, animal
> > > chimeras, and emojis.
> > >
> > > an illustration of a baby daikon radish in a tutu walking a dog
> > >
> > > a professional high quality illustration of a giraffe turtle chimera. a
> > > giraffe imitating a turtle. a giraffe made of turtle.
> > >
> > > a professional high quality emoji of a lovestruck cup of boba
> > >
> > > Zero-Shot Visual Reasoning
> > >
> > > GPT-3 can be instructed to perform many kinds of tasks solely from a
> > > description and a cue to generate the answer supplied in its prompt,
> > > without any additional training. For example, when prompted with the
> > > phrase “here is the sentence ‘a person walking his dog in the park’
> > > translated into French:”, GPT-3 answers “un homme qui promène son chien
> > > dans le parc.” This capability is called zero-shot reasoning. We find
> > > that DALL·E extends this capability to the visual domain, and is able to
> > > perform several kinds of image-to-image translation tasks when prompted
> > > in the right way.
> > >
> > > the exact same cat on the top as a sketch on the bottom
> > >
> > > the exact same teapot on the top with ’gpt’ written on it on the bottom
> > >
> > > We did not anticipate that this capability would emerge, and made no
> > > modifications to the neural network or training procedure to encourage
> > > it. Motivated by these results, we measure DALL·E’s aptitude for
> > > analogical reasoning problems by testing it on Raven’s progressive
> > > matrices, a visual IQ test that saw widespread use in the 20th century.
> > >
> > > a sequence of geometric shapes.
> > >
> > > Geographic Knowledge
> > >
> > > We find that DALL·E has learned about geographic facts, landmarks, and
> > > neighborhoods. Its knowledge of these concepts is surprisingly precise in
> > > some ways and flawed in others.
> > >
> > > a photo of the food of china
> > >
> > > a photo of alamo square, san francisco, from a street at night
> > >
> > > a photo of san francisco’s golden gate bridge
> > >
> > > Temporal Knowledge
> > >
> > > In addition to exploring DALL·E’s knowledge of concepts that vary over
> > > space, we also explore its knowledge of concepts that vary over time.
> > >
> > > a photo of a phone from the 20s
> > >
> > >
> > > Summary of Approach and Prior Work
> > >
> > > DALL·E is a simple decoder-only transformer that receives both the text
> > > and the image as a single stream of 1280 tokens—256 for the text and
> > > 1024 for the image—and models all of them autoregressively. The
> > > attention mask at each of its 64 self-attention layers allows each image
> > > token to attend to all text tokens. DALL·E uses the standard causal mask
> > > for the text tokens, and sparse attention for the image tokens with
> > > either a row, column, or convolutional attention pattern, depending on
> > > the layer. We provide more details about the architecture and training
> > > procedure in our [paper](https://arxiv.org/abs/2102.12092).
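> > >
> > > As a simplified sketch of the masking scheme just described (dense row
> > > attention standing in for the sparse row/column/convolutional patterns;
> > > not the actual implementation):
> > >
> > >     # Illustrative only: text uses a causal mask, every image token may attend
> > >     # to all text tokens, and image tokens here attend to earlier tokens in the
> > >     # same row of the 32x32 grid.
> > >     import torch
> > >
> > >     TEXT_LEN, GRID = 256, 32
> > >     IMG_LEN = GRID * GRID
> > >     TOTAL = TEXT_LEN + IMG_LEN
> > >
> > >     def row_attention_mask():
> > >         mask = torch.zeros(TOTAL, TOTAL, dtype=torch.bool)
> > >         causal = torch.tril(torch.ones(TOTAL, TOTAL, dtype=torch.bool))
> > >         mask[:TEXT_LEN] = causal[:TEXT_LEN]        # causal over the text prefix
> > >         mask[TEXT_LEN:, :TEXT_LEN] = True          # image tokens see all text
> > >         for i in range(IMG_LEN):
> > >             row_start = TEXT_LEN + (i // GRID) * GRID
> > >             mask[TEXT_LEN + i, row_start:TEXT_LEN + i + 1] = True
> > >         return mask                                # True = attention allowed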
> > >
> > > Text-to-image synthesis has been an active area of research since the
> > > pioneering work of Reed et al.,[1](https://openai.com/blog/dall-e/#rf1)
> > > whose approach uses a GAN conditioned on text embeddings. The embeddings
> > > are produced by an encoder pretrained using a contrastive loss, not
> > > unlike CLIP. StackGAN[3](https://openai.com/blog/dall-e/#rf3) and
> > > StackGAN++[4](https://openai.com/blog/dall-e/#rf4) use multi-scale GANs
> > > to scale up the image resolution and improve visual fidelity. AttnGAN[5](
> > > https://openai.com/blog/dall-e/#rf5) incorporates attention between the
> > > text and image features, and proposes a contrastive text-image feature
> > > matching loss as an auxiliary objective. This is interesting to compare
> > > to our reranking with CLIP, which is done offline. Other work[2](
> > > https://openai.com/blog/dall-e/#rf2)[6](https://openai.com/blog/dall-e/#rf6)[7](
> > > https://openai.com/blog/dall-e/#rf7) incorporates additional sources of
> > > supervision during training to improve image quality. Finally, work by
> > > Nguyen et al.[8](https://openai.com/blog/dall-e/#rf8) and Cho et al.[9](
> > > https://openai.com/blog/dall-e/#rf9) explores sampling-based strategies
> > > for image generation that leverage pretrained multimodal discriminative
> > > models.
> > >
> > > Similar to the rejection sampling used in [VQVAE-2](
> > > https://arxiv.org/abs/1906.00446), we use [CLIP](
> > > https://openai.com/blog/clip/) to rerank the top 32 of 512 samples for
> > > each caption in all of the interactive visuals. This procedure can also
> > > be seen as a kind of language-guided search,[16](
> > > https://openai.com/blog/dall-e/#rf16) and can have a dramatic impact on
> > > sample quality.
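> > >
> > > A minimal sketch of that reranking step (the scoring and decoding helpers
> > > below are placeholders, not the actual CLIP or dVAE APIs):
> > >
> > >     # Illustrative only: decode candidate token grids, score each image against
> > >     # the caption with a contrastive text-image model, and keep the best k.
> > >     import torch
> > >
> > >     def rerank(clip_score, decode, candidates, caption, k=32):
> > >         """candidates: 512 sampled image-token grids; clip_score(img, text) -> float."""
> > >         images = [decode(tokens) for tokens in candidates]
> > >         scores = torch.tensor([clip_score(img, caption) for img in images])
> > >         top = torch.topk(scores, k=k).indices      # best 32 of 512
> > >         return [images[i] for i in top]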
> > >
> > > an illustration of a baby daikon radish in a tutu walking a dog [caption
> > > 1, best 8 of 2048]
> > >
> > > ---------------------------------------------------------------
> > >
> > > Footnotes
> > >
> > > 1. We decided to name our model using a portmanteau of the artist
> > > Salvador Dalí and Pixar’s WALL·E.
> > >
> > > 2. A token is any symbol from a discrete vocabulary; for humans, each
> > > English letter is a token from a 26-letter alphabet. DALL·E’s vocabulary
> > > has tokens for both text and image concepts. Specifically, each image
> > > caption is represented using a maximum of 256 BPE-encoded tokens with a
> > > vocabulary size of 16384, and the image is represented using 1024 tokens
> > > with a vocabulary size of 8192.
> > >
> > > The images are preprocessed to 256x256 resolution during training.
> > > Similar to VQVAE,[14](https://openai.com/blog/dall-e/#rf14)[15](
> > > https://openai.com/blog/dall-e/#rf15) each image is compressed to a 32x32
> > > grid of discrete latent codes using a discrete VAE[10](
> > > https://openai.com/blog/dall-e/#rf10)[11](https://openai.com/blog/dall-e/#rf11)
> > > that we pretrained using a continuous relaxation.[12](
> > > https://openai.com/blog/dall-e/#rf12)[13](https://openai.com/blog/dall-e/#rf13)
> > > We found that training using the relaxation obviates the need for an
> > > explicit codebook, EMA loss, or tricks like dead code revival, and can
> > > scale up to large vocabulary sizes.
> > >
> > > 3. Further details are provided in [a later section](
> > > https://openai.com/blog/dall-e/#summary).
> > >
> > > 4. This task is called variable binding, and has been extensively studied
> > > in the literature.[17](https://openai.com/blog/dall-e/#rf17)[18](
> > > https://openai.com/blog/dall-e/#rf18)[19](https://openai.com/blog/dall-e/#rf19)[20](
> > > https://openai.com/blog/dall-e/#rf20)
> > >
> > > ---------------------------------------------------------------
> > >
> > > References
> > >
> > > 1. Reed, S., Akata, Z., Yan, X., Logeswaran, L., Schiele, B., Lee, H. (2016).
> > > “[Generative adversarial text to image synthesis](https://arxiv.org/abs/1605.05396)”.
> > > In ICML 2016.
> > >
> > > 2. Reed, S., Akata, Z., Mohan, S., Tenka, S., Schiele, B., Lee, H. (2016).
> > > “[Learning what and where to draw](https://arxiv.org/abs/1610.02454)”. In NIPS 2016.
> > >
> > > 3. Zhang, H., Xu, T., Li, H., Zhang, S., Wang, X., Huang, X., Metaxas, D. (2016).
> > > “[StackGAN: Text to photo-realistic image synthesis with stacked generative
> > > adversarial networks](https://arxiv.org/abs/1612.03242)”. In ICCV 2017.
> > >
> > > 4. Zhang, H., Xu, T., Li, H., Zhang, S., Wang, X., Huang, X., Metaxas, D. (2017).
> > > “[StackGAN++: realistic image synthesis with stacked generative adversarial
> > > networks](https://arxiv.org/abs/1710.10916)”. In IEEE TPAMI 2018.
> > >
> > > 5. Xu, T., Zhang, P., Huang, Q., Zhang, H., Gan, Z., Huang, X., He, X. (2017).
> > > “[AttnGAN: Fine-grained text to image generation with attentional generative
> > > adversarial networks](https://arxiv.org/abs/1711.10485)”.
> > >
> > > 6. Li, W., Zhang, P., Zhang, L., Huang, Q., He, X., Lyu, S., Gao, J. (2019).
> > > “[Object-driven text-to-image synthesis via adversarial training](https://arxiv.org/abs/1902.10740)”.
> > > In CVPR 2019.
> > >
> > > 7. Koh, J. Y., Baldridge, J., Lee, H., Yang, Y. (2020). “[Text-to-image generation
> > > grounded by fine-grained user attention](https://arxiv.org/abs/2011.03775)”. In WACV 2021.
> > >
> > > 8. Nguyen, A., Clune, J., Bengio, Y., Dosovitskiy, A., Yosinski, J. (2016).
> > > “[Plug & play generative networks: conditional iterative generation of images
> > > in latent space](https://arxiv.org/abs/1612.00005)”.
> > >
> > > 9. Cho, J., Lu, J., Schwenk, D., Hajishirzi, H., Kembhavi, A. (2020). “[X-LXMERT:
> > > Paint, caption, and answer questions with multi-modal transformers](https://arxiv.org/abs/2009.11278)”.
> > > EMNLP 2020.
> > >
> > > 10. Kingma, Diederik P., and Max Welling. “[Auto-encoding variational
> > > bayes](https://arxiv.org/abs/1312.6114)”. arXiv preprint (2013).
> > >
> > > 11. Rezende, Danilo Jimenez, Shakir Mohamed, and Daan Wierstra. “[Stochastic
> > > backpropagation and approximate inference in deep generative
> > > models](https://arxiv.org/abs/1401.4082)”. arXiv preprint (2014).
> > >
> > > 12. Jang, E., Gu, S., Poole, B. (2016). “[Categorical reparametrization with
> > > Gumbel-softmax](https://arxiv.org/abs/1611.01144)”.
> > >
> > > 13. Maddison, C., Mnih, A., Teh, Y. W. (2016). “[The Concrete distribution: a
> > > continuous relaxation of discrete random variables](https://arxiv.org/abs/1611.00712)”.
> > >
> > > 14. van den Oord, A., Vinyals, O., Kavukcuoglu, K. (2017). “[Neural discrete
> > > representation learning](https://arxiv.org/abs/1711.00937)”.
> > >
> > > 15. Razavi, A., van den Oord, A., Vinyals, O. (2019). “[Generating diverse
> > > high-fidelity images with VQ-VAE-2](https://arxiv.org/abs/1906.00446)”.
> > >
> > > 16. Andreas, J., Klein, D., Levine, S. (2017). “[Learning with Latent
> > > Language](https://arxiv.org/abs/1711.00482)”.
> > >
> > > 17. Smolensky, P. (1990). “[Tensor product variable binding and the representation
> > > of symbolic structures in connectionist systems](http://www.lscp.net/persons/dupoux/teaching/AT1_2014/papers/Smolensky_1990_TensorProductVariableBinding.AI.pdf)”.
> > >
> > > 18. Plate, T. (1995). “[Holographic reduced representations: convolution algebra
> > > for compositional distributed representations](https://www.ijcai.org/Proceedings/91-1/Papers/006.pdf)”.
> > >
> > > 19. Gayler, R. (1998). “[Multiplicative binding, representation operators &
> > > analogy](http://cogprints.org/502/)”.
> > >
> > > 20. Kanerva, P. (1997). “[Fully distributed representations](http://www.cap-lore.com/RWC97-kanerva.pdf)”.
> > >
> > > ---------------------------------------------------------------
> > >
> > > Authors
> > >
> > > Primary authors: [Aditya Ramesh](https://openai.com/blog/authors/aditya/),
> > > [Mikhail Pavlov](https://openai.com/blog/authors/mikhail/),
> > > [Gabriel Goh](https://openai.com/blog/authors/gabriel/),
> > > [Scott Gray](https://openai.com/blog/authors/scott/)
> > >
> > > Supporting authors: [Mark Chen](https://openai.com/blog/authors/mark/),
> > > [Rewon Child](https://openai.com/blog/authors/rewon/),
> > > [Vedant Misra](https://openai.com/blog/authors/vedant/),
> > > [Pamela Mishkin](https://openai.com/blog/authors/pamela/),
> > > [Gretchen Krueger](https://openai.com/blog/authors/gretchen/),
> > > [Sandhini Agarwal](https://openai.com/blog/authors/sandhini/),
> > > [Ilya Sutskever](https://openai.com/blog/authors/ilya/)
> > >
> > > ---------------------------------------------------------------
> > >
> > > Filed Under
> > > [Research](https://openai.com/blog/tags/research/), [Milestones](
> > > https://openai.com/blog/tags/milestones/), [Multimodal](https://openai.com/blog/tags/multimodal/)
> > > ---------------------------------------------------------------
> > >
> > > Cover Artwork
> > >
> > > Justin Jay Wang
> > >
> > > ---------------------------------------------------------------
> > >
> > > Acknowledgments
> > >
> > > Thanks to the following for their feedback on this work and
> > > contributions to this release: Alec Radford, Andrew Mayne, Jeff Clune,
> > > Ashley Pilipiszyn, Steve Dowling, Jong Wook Kim, Lei Pan, Heewoo Jun,
> > > John Schulman, Michael Tabatowski, Preetum Nakkiran, Jack Clark, Fraser
> > > Kelton, Jacob Jackson, Greg Brockman, Wojciech Zaremba, Justin Mao-Jones,
> > > David Luan, Shantanu Jain, Prafulla Dhariwal, Sam Altman, Pranav Shyam,
> > > Miles Brundage, Jakub Pachocki, and Ryan Lowe.
> > >
> > > ---------------------------------------------------------------
> > >
> > > Contributions
> > >
> > > Aditya Ramesh was the project lead: he developed the approach, trained
> > > the models, and wrote most of the blog copy.
> > >
> > > Aditya Ramesh, Mikhail Pavlov, and Scott Gray worked together to scale up
> > > the model to 12 billion parameters, and designed the infrastructure used
> > > to draw samples from the model.
> > >
> > > Aditya Ramesh, Gabriel Goh, and Justin Jay Wang worked together to create
> > > the interactive visuals for the blog.
> > >
> > > Mark Chen and Aditya Ramesh created the images for Raven’s Progressive
> > > Matrices.
> > >
> > > Rewon Child and Vedant Misra assisted in writing the blog.
> > >
> > > Pamela Mishkin, Gretchen Krueger, and Sandhini Agarwal advised on broader
> > > impacts of the work and assisted in writing the blog.
> > >
> > > Ilya Sutskever oversaw the project and assisted in writing the blog.
> > > -------------- next part --------------
> > > A non-text attachment was scrubbed...
> > > Name: not available
> > > Type: text/html
> > > Size: 45019 bytes
> > > Desc: not available
> > > URL: <
> > >
> >
> https://lists.cpunks.org/pipermail/cypherpunks/attachments/20220408/96a4e98c/attachment.txt
> > > >
> > >
> > > ------------------------------
> > >
> > > Subject: Digest Footer
> > >
> > > _______________________________________________
> > > cypherpunks mailing list
> > > cypherpunks at lists.cpunks.org
> > > https://lists.cpunks.org/mailman/listinfo/cypherpunks
> > >
> > >
> > > ------------------------------
> > >
> > > End of cypherpunks Digest, Vol 106, Issue 93
> > > ********************************************
> > >
> > -------------- next part --------------
> > A non-text attachment was scrubbed...
> > Name: not available
> > Type: text/html
> > Size: 39458 bytes
> > Desc: not available
> > URL: <
> >
> https://lists.cpunks.org/pipermail/cypherpunks/attachments/20220408/66867ca0/attachment.txt
> > >
> >
> > ------------------------------
> >
> > Subject: Digest Footer
> >
> > _______________________________________________
> > cypherpunks mailing list
> > cypherpunks at lists.cpunks.org
> > https://lists.cpunks.org/mailman/listinfo/cypherpunks
> >
> >
> > ------------------------------
> >
> > End of cypherpunks Digest, Vol 106, Issue 94
> > ********************************************
> >
> -------------- next part --------------
> A non-text attachment was scrubbed...
> Name: not available
> Type: text/html
> Size: 46836 bytes
> Desc: not available
> URL: <
> https://lists.cpunks.org/pipermail/cypherpunks/attachments/20220408/eea82fed/attachment.txt
> >
>
> ------------------------------
>
> Subject: Digest Footer
>
> _______________________________________________
> cypherpunks mailing list
> cypherpunks at lists.cpunks.org
> https://lists.cpunks.org/mailman/listinfo/cypherpunks
>
>
> ------------------------------
>
> End of cypherpunks Digest, Vol 106, Issue 95
> ********************************************
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: text/html
Size: 54321 bytes
Desc: not available
URL: <https://lists.cpunks.org/pipermail/cypherpunks/attachments/20220408/0951276d/attachment.txt>


More information about the cypherpunks mailing list