Google explains how Gemini’s AI image generation went wrong and how it will be fixed
A few weeks ago, Google launched a new image generation tool for Gemini (the suite of AI tools formerly known as Bard and Duet) that allowed users to generate all kinds of images from simple text prompts. Unfortunately, Google's AI tool repeatedly missed the mark, generating inaccurate and even offensive images and leaving many of us wondering: how did the bot get things so wrong? Well, the company has finally released a statement explaining what went wrong and how it plans to fix Gemini.
The official blog post addressing the issue states that, when designing the text-to-image feature for Gemini, the team "wanted to ensure that it doesn't fall into the pitfalls we've seen in the past with image generation technology – such as creating violent or sexually explicit images, or images of real people." The post goes on to explain that users probably don't want to keep seeing images of people of only one ethnicity or with some other single distinguishing characteristic.
So, to give a fairly simple explanation of what's been going on: Gemini was producing images of people of color when asked to generate images of white historical figures, serving users results like racially diverse Nazis, or simply ignoring the part of the prompt where you specified exactly what you were looking for. Although Gemini's image generation is currently suspended, when the feature was live you could specify exactly who you were trying to generate (Google uses the example "a white veterinarian with a dog") and Gemini would seemingly ignore the first half of the prompt and generate veterinarians of every ethnicity except the one you asked for.
Google further explained that this was the result of two crucial flaws: first, Gemini's tuning to show a range of different people failed to account for cases that clearly should not show a range. Additionally, in an effort to create a more aware, less biased generative AI, Google admits that "the model became much more cautious than we intended and refused to answer certain prompts entirely – wrongly interpreting some very anodyne prompts as sensitive."
So what now?
At the time of writing, the ability to generate images of people on Gemini has been paused while the Gemini team works to fix the inaccuracies and conduct further testing. The blog post notes that AI “hallucinations” are nothing new when it comes to complex deep learning models — even Bard and ChatGPT had some questionable tantrums as those bots’ creators worked out the kinks.
The post ends with a promise from Google to keep working on Gemini's AI-powered generation of people until the issue is resolved, noting that while the team can't promise Gemini will never again produce "embarrassing, inaccurate or offensive results", it is taking action to make sure this happens as rarely as possible.
Overall, this whole episode puts into perspective that AI is only as smart as we make it. Our Editor-in-Chief Lance Ulanoff succinctly noted: "If an AI doesn't know history, you can't blame the AI." With how quickly artificial intelligence has emerged and inserted itself into various facets of our daily lives – whether we like it or not – it's easy to forget that the public proliferation of AI only began around eighteen months ago. As impressive as the tools currently available to us are, we are ultimately still in the early days of artificial intelligence.
We can't rain on Google Gemini's parade just because its errors were more visually striking than, say, ChatGPT's recent gibberish-filled meltdown. Google's temporary pause and rework will ultimately lead to a better product, and sooner or later we'll see the tool as it was intended.