How do these two AI image generators compare side by side?
At the Google I/O 2024 event, Google announced a slew of new products and major AI updates. One of the headline announcements was a new version of its text-to-image AI tool, Imagen 3.
Based on what they showcased during the announcement, there has been a significant improvement in visual quality. Imagen 3 has reached a level where it can easily compete with Midjourney v6.
But how do these two AI image generators compare side by side?
Let's dive in and find out.
Three women stand together laughing, with one woman slightly out of focus in the foreground. The sun is setting behind the women, creating a lens flare and a warm glow that highlights their hair and creates a bokeh effect in the background. The photography style is candid and captures a genuine moment of connection and happiness between friends. The warm light of golden hour lends a nostalgic and intimate feel to the image
Both images look gorgeous, and the people in the frames are incredibly photorealistic. If I had to choose between the two, I'd still prefer the image generated by Midjourney. The specular reflection looks better, and the skin texture is smoother, giving a more natural feel to the candid moment.
A large, colorful bouquet of flowers in an old blue glass vase on the table. In front is one beautiful peony flower surrounded by various other blossoms like roses, lilies, daisies, orchids, fruits, berries, green leaves. The background is dark gray. Oil painting in the style of the Dutch Golden Age.
Imagen 3 takes the win here. The softer and warmer tone of the overall image makes me want to hang it on my wall. While Midjourney also did a great job, it often uses wildly saturated colors that can take away from the naturalism of the result.
A weathered, wooden mech robot covered in flowering vines stands peacefully in a field of tall wildflowers, with a small bluebird resting on its outstretched hand. Digital cartoon, with warm colors and soft lines. A large cliff with a waterfall looms behind.
Imagen 3 did a better job on this one. Despite several attempts, Midjourney consistently failed to adhere completely to the prompt: the robot does not stretch out its hand and is not looking at the bird, which diminishes the emotional impact present in the first image.
A view of a person's hand as they hold a little clay figurine of a bird in their hand and sculpt it with a modeling tool in their other hand. You can see the sculptor's scarf. Their hands are covered in clay dust. a macro DSLR image highlighting the texture and craftsmanship.
I remember the days when everyone was talking about how badly AI image generators rendered hands and limbs. Today, almost all AI models have improved a lot in that respect, and the examples above reflect that progress.
Comparing the two images, the sculptor's hand is covered in clay dust in the Midjourney-generated image, as the prompt requested, while it's very clean in the Imagen 3 version.
A single comic book panel of a boy and his father on a grassy hill, staring at the sunset. A speech bubble points from the boy's mouth and says: "The sun will rise again". Muted, late 1990s coloring style
To be fair to Midjourney, I tried generating this image five times, but it never rendered the text correctly. Even after I added quotes around the text to fit Midjourney's text rendering rules, it wasn't able to render the text properly.
Elephant amigurumi walking in savanna, a professional photograph, blurry background
Both results are stunning, with mind-blowing levels of detail on the yarn loops. It's easy to mistake them for real photographs. However, if I had to choose which one is better, I'd say the one from Midjourney edges out Imagen 3 in this case. Do you agree?
Word "light" made from various colorful feathers, black background
This is a good example of just how much better Imagen 3 is at text rendering. It was a nice try from Midjourney, but the result isn't very legible and contains unwanted artifacts.
This is a cherry-picked result from Imagen, though. I don't know how many times they had to generate the image with the same prompt to get that awesome image.
Detailed illustration of majestic lion roaring proudly in a dream-like jungle, purple white line art background, clipart on light violet paper texture
Comparing the two images, Imagen 3 demonstrates more consistency as a line art piece, with colors much closer to the requested light violet compared to Midjourney's result. Both look very cool, though, and it's impressive to see AI handle various art styles.
Claymation scene. A medium wide shot of an elderly woman. She is wearing flowing clothing. She is standing in a lush garden watering the plants with an orange watering can
Both images adhere to the prompt, but the one from Imagen 3 looks more polished. In the Midjourney version, the elderly woman's hand holding the watering can doesn't look quite right, and the water doesn't come out directly from the can's spout.
Photographic portrait of a real life dragon resting peacefully in a zoo, curled up next to its pet sheep. Cinematic movie still, high quality DSLR photo.
While Imagen 3 has improved significantly in generating creatures, Midjourney is still the king in this category. Just look at how cute the dragon and sheep look together in the Midjourney image.
To try Imagen 3 yourself, head over to Google's official blog post about Imagen 3 and click on the "Sign up to try on ImageFX" button.
ImageFX is part of Google's test kitchen for its AI tools.
You can also request access to Imagen 3 from the ImageFX dashboard.
Okay, that's about it. I hope you found this comparison article helpful. If you want me to compare Imagen 3 against other image generators like OpenAI's DALL-E 3 or Adobe Firefly 2.0, let me know.
Overall, it's great to see these two image models performing really well. The images are very detailed, coherent, and overall stunning.
From an aesthetic perspective, I still find Midjourney superior, but we have now reached a saturation point in text-to-image models, and Imagen 3's text rendering is on the level of OpenAI's DALL-E 3.
While it's important to keep in mind that the example images are cherry-picked by Google and may not be fully representative of Imagen 3's performance once it's publicly available, I must admit that I'm impressed by what I've seen so far.
Software engineer, writer, solopreneur