Grok gets glasses to see what you’re talking about
X (formerly Twitter) Premium subscribers can now ask the Grok AI assistant to describe images, not just create them. Elon Musk’s company xAI has unveiled a visual analysis feature that lets Grok describe photos, diagrams and other images using Grok-2, the AI model that powers the chatbot and its Flux-based image creation.
The feature puts Grok on par with ChatGPT, Gemini and other rivals. If you subscribe to one of X’s paid plans, you can try it out now by clicking a button on an image post within X and asking Grok questions about the image, or simply asking for a plain description.
Along with the new feature, xAI showed off a new benchmark called RealWorldQA, which is meant to measure how well a model understands a real-world image, including the spatial relationships between objects. The company claims that RealWorldQA shows Grok is as good as or better than its rivals at explaining images, even though the feature is still in development. Below is an example of how it works, shared on Elon Musk’s X.
Grok now understands images and even explains the meaning of a joke. This is an early version. It will improve soon. https://t.co/gQ5BBISVRc
October 28, 2024
See and Grok
As the screenshot illustrates, Grok is able to break down a complex, multi-part image and explain what’s happening within it. It can even spell out the humor of the joke, though, as is almost always the case, explaining the joke makes it much less funny. Still, it’s a sign that xAI isn’t done releasing new features for Grok, especially multimodal tools. This could be a step toward Grok being able to explain audio and video content in the same way it does with images.
One element not mentioned is how Grok’s visual analysis might reflect the chatbot’s freewheeling image creation, which seems to show little or no regard for copyright issues. It’s something that users who made images of Mario had to deal with when Nintendo’s copyright hunter, Tracer, came after them for infringement. It would be interesting to discover whether an AI image of Mario or any other intellectual property would be described as such or in more general terms.
Because the owner of xAI is who he is, there is also clear potential for the feature to be put to use at other tech companies owned by Musk. Tesla’s semi-autonomous driving would certainly benefit from being able to identify the people and objects around a car and how far apart they are. The same applies to the long-promised humanoid robots that Tesla has been developing in recent years.