AI promises to mark a before and after in many areas, but one in particular has been drawing most of the attention this year. In April we talked about the huge possibilities of DALL-E 2, an artificial intelligence capable of generating images from text. Later came DALL-E Mini, a generator that surprised us with its wild creations. Now it’s Parti’s turn, an alternative that bets on a promising new model for photorealistic generation.
Unlike DALL-E 2, which uses a diffusion model to generate images from text, Parti (Pathways Autoregressive Text-to-Image) is based on an autoregressive model that accepts longer text inputs and can handle complex compositions. As we can see in the featured image, Parti’s results look more like works of art than the amorphous figures offered by DALL-E Mini (image below).
Google’s new image creator
Google researchers explain in a blog post that they tested Parti at four scales (350M, 750M, 3B, and 20B parameters) under the same conditions, i.e. using the same text inputs. In those tests, they found that the largest scale particularly excels at prompts that are abstract or that require world knowledge, specific perspectives, or the rendering of writing and symbols.
In one test, they used the following input text: “A map of the United States made of sushi. It is on a table next to a glass of red wine.” As we can see, the 350M model produces a bewildering rendering, things improve at 750M, some “creativity” appears at 3B, and the 20B model delivers an impressive result.
We can also see a test in which the researchers evaluated Parti’s output across various complex scenarios. They entered the text “Picture of a tiger wearing a train conductor’s hat and holding a skateboard with a yin-yang symbol.”
They requested variations in the style of a photograph, a comic, an oil painting, and a marble statue, among others. Strikingly, the AI proved able to stick to specific image formats and styles, although results are not always this good. “While Parti produces high-quality results for a wide range of prompts, the model nonetheless has many limitations,” Google notes.
The Mountain View giant will continue to train and improve its AI models to “improve human creativity and productivity.” It should be noted that for safety reasons (Google wants to prevent abuse), Parti is not publicly available, unlike DALL-E Mini, so we will not be able to create our own images from text. However, we are left with the alternative of browsing a large number of examples on the project page and reading the full research paper.