Google’s most advanced image generator has arrived, months after the tech giant announced the model at this year’s Google I/O event. The Imagen 3 model is available now via Google’s Gemini AI platform, in both the free version and the subscription-based Gemini Advanced service, as well as within Google’s enterprise products. Google clearly wants Imagen 3, with its own approach to turning words into images, to hold its own in the rapidly growing field of AI image generators.
Like its predecessors, Imagen 3 can create images in a variety of styles, including the photorealistic landscapes and cartoonish claymation seen above. The new version improves on Imagen 2 in many ways, particularly when it comes to generating images of people. The company has been very vocal about the fact that Imagen 3 won’t repeat the historical inaccuracies that embarrassed it earlier this year. That said, “photorealistic, identifiable individuals” are still prohibited.
Imagen 3 also brings the real-time editing options that were spotted in the code last month. You can tell Gemini what you think of generated images and ask the AI to alter them however you want. The company didn’t mention being able to circle the part of the image you want altered, but that may come later. Imagen 3 is integrated into Gemini, starting in English, with more languages to come. It should be a major draw for Gemini, which Google apparently wants people to default to, much as so many people mindlessly head to its search engine.
AI Image War
Imagen 3 also continues Google’s image marking efforts with the SynthID tool for watermarking AI-generated images created with Gemini. SynthID embeds watermarks that are invisible to the eye but detectable by software, so any attempt to pass an image off as a real photo or something you painted would be quickly debunked. Google describes it as a way to fight back against misinformation and bring transparency to the world of AI-generated images. SynthID is another safety measure Google is using for Imagen 3, along with protections against producing photorealistic images of identifiable people, violent images, and other problematic scenes.
Imagen 3 is a clear indicator of the rapid advancements in AI image creation and their integration into all sorts of content creation platforms. This is an area where Google has a head start on most of its competition. Ideogram, Midjourney, and other AI image makers tend to be standalone tools. On the other hand, OpenAI has DALL-E as a key feature of ChatGPT, and X recently built Flux into its Grok AI chatbot. Imagen 3 combined with Gemini gives Google a clear boost, but there’s no way to know which, if any, of the AI image generators will dominate the race. It’ll be a photo(realistic) finish.