The 7 Biggest AI Announcements from Google I/O 2024 – From Gemini to Project Astra

The Google I/O 2024 keynote was a packed Gemini-fest, and CEO Sundar Pichai wasn’t far off when, right at the top, he described it as Google’s version of The Eras Tour – specifically the ‘Gemini Era’.

The entire keynote was about Gemini and AI; Google even said the word ‘AI’ 121 times. From the unveiling of a futuristic AI assistant called “Project Astra” that can run on a phone (and maybe one day, glasses too) to the fact that Gemini is being woven into virtually every service or product the company offers: AI was definitely the big theme.

It was all enough to melt the minds of all but the most ardent LLM enthusiasts, so we’ve broken down the top seven things Google revealed and discussed at its main I/O 2024 keynote.

1. Google dropped Project Astra – an ‘AI agent’ for everyday life

So it turns out that Google does have an answer to OpenAI’s GPT-4o and Microsoft’s Copilot. Dubbed an ‘AI agent’ for everyday life, Project Astra is essentially Google Lens on steroids, and it looks seriously impressive, capable of understanding, reasoning about and reacting to live video and audio.

In a recorded video demonstration on a Pixel phone, a user walked around an office, pointing the rear camera at objects and asking Astra questions about the live feed. Gemini viewed and understood the images and answered the questions in the same breath.

It speaks to the multimodal, long-context capabilities of Gemini’s backend, which identify what the camera sees and deliver a response in a snap. During the demonstration, Astra knew what a specific part of a speaker was and could even identify a neighborhood in London. It’s also generative: it quickly came up with a band name for a cute puppy sitting next to a stuffed animal (see the video above).

It won’t be rolling out immediately, but developers and press like us at Ny Breaking will be able to try it out at I/O 2024. And while Google didn’t provide clarity, there was a sneak peek of glasses built for Astra, which could mean Google Glass might make a comeback.

Still, even as a demo at Google I/O, it’s seriously impressive and potentially very compelling. It could supercharge smartphones and the current assistants from Google and even Apple. It also shows Google’s true AI ambition: a tool that can be immensely useful and is no hassle to use at all.

2. Google Photos got a useful AI boost from Gemini

I really want to know if this is the real child of a Google employee or a Gemini-generated one… (Image credit: Google)

Have you ever wanted to quickly find a specific photo you took somewhere in the distant past? Maybe it’s a note from a loved one, an early photo of a dog as a puppy, or even your license plate number. Well, Google is making that wish a reality with a major update that brings Gemini into Google Photos. This gives Gemini access to your library, lets it search through it, and easily surfaces the result you’re looking for.

In an onstage demo, Sundar Pichai showed that you can ask for your license plate number, and Photos will provide an image of it along with the characters that make up the plate. Likewise, you can ask for photos of when your child learned to swim, along with any further details. It should make even the most disorganized photo libraries a little easier to search.

Google has called this feature ‘Ask Photos’ and will roll it out to all users in the ‘coming weeks’. And it will almost certainly come in handy and make people who don’t use Google Photos a little jealous.

3. Your child’s homework just got a lot easier thanks to NotebookLM

(Image credit: Google)

All parents know what a horror it is to help children with homework; even if you knew this stuff back in the day, there’s no way the knowledge is still lurking in your brain twenty years later. But Google may have made the task a lot easier, thanks to an upgrade to its NotebookLM note-taking app.

NotebookLM now has access to Gemini 1.5 Pro, and based on the demo given at I/O 2024, it will now be a better teacher than you’ve ever been. The demo showed Google’s Josh Woodward loading a notebook full of notes on a learning topic – in this case, science. At the touch of a button, he could create a detailed study guide, with further outputs including quizzes and frequently asked questions, all drawn from the source material.

Impressive – but it was about to get much better. A new feature – still a prototype for now – was able to output all that content as audio, essentially creating a podcast-style discussion. Better still, the audio featured more than one speaker, talking naturally about the topic in a way that would definitely be more helpful than a frustrated parent trying to play the role of teacher.

Woodward could even interrupt and ask a question – in this case, “give us a basketball example” – after which the AI changed course and came up with clever basketball metaphors that put the subject in an accessible context. The parents on the Ny Breaking team are eager to try this one out.

4. Soon you will be able to search Google with a video

(Image credit: Google)

In a slightly odd on-stage demo involving a record player, Google showed off a very impressive new search trick. You can now record a video and use it as your search query to get results and, hopefully, an answer.

In this case, a Googler was wondering how to use a record player; she pressed record, filmed the unit in question while asking her question aloud, and then submitted it. Google worked its search magic and provided an answer in text that could also be read aloud. It’s a brand-new way of searching – like Google Lens for video – and also distinctly different from Project Astra’s upcoming everyday AI agent, as it involves recording and then searching rather than working in real time.

Still, it’s part of the Gemini and generative AI infusion into Google Search, which aims to keep you on the results page and make it easier to get answers. Before this video search demo, Google showed off a new generative experience for recipes and dining. It allows you to search for something in natural language and get recipes or even food recommendations on the results page.

Simply put, Google is going full throttle with generative AI in search, both for results and different ways to get the results.

5. Veo is Google’s answer to OpenAI’s Sora

We’ve been marveling at the creations of OpenAI’s text-to-video tool Sora for the past few months, and now Google is joining the generative video party with its new tool, Veo. Like Sora, Veo can generate minute-long videos in 1080p quality, all from a simple prompt.

That prompt can include cinematic effects, such as a request for a time-lapse or aerial view, and the early samples look impressive. You don’t have to start from scratch either: upload an input video with a command, and Veo can edit the clip to match your request. There is also the option to add masks and adjust specific parts of a video.

The bad news? Like Sora, Veo is not yet available everywhere. Google says it will be available to select creators through VideoFX, one of its experimental Labs features, “in the coming weeks.” It may be a while before we see a widespread rollout, but Google has promised to bring the feature to YouTube Shorts and other apps. And that will make Adobe shift uncomfortably in its AI-generated chair.

6. Android got a major Gemini infusion

(Image credit: Google)

Just as Google’s “Circle to Search” feature sits on top of any application, Gemini is now integrated into the core of Android so it can fit into your flow. As demonstrated, Gemini can now view, read and understand what’s on your phone’s screen, so it can anticipate questions about whatever you’re viewing.

So it can get the context of a video you’re watching, anticipate a summary request when viewing a long PDF, or be prepared for countless questions about an app you’re in. It’s definitely not a bad thing and could be super helpful.

In addition to Gemini being integrated at the system level, Gemini Nano with Multimodality will launch on Pixel devices later this year. What will it make possible? Well, it should speed things up, but the headline feature for now is that Gemini can listen to calls and alert you in real time if they’re spam. That’s pretty cool, and it builds on call screening, a long-standing feature of Pixel phones. It’s also set to be faster, processing more on the device instead of sending data to the cloud.

7. Google Workspace is getting a lot smarter

(Image credit: Google)

Workspace users are getting a wealth of Gemini integrations and useful features that can make a big impact on a daily basis. Within Gmail, a new side panel lets you ask Gemini to summarize all recent conversations with a colleague; the result comes back as bullet points highlighting the most important aspects.

Gemini in Google Meet can give you the highlights of a meeting or field questions that other people ask during the call, so you no longer have to take notes – especially useful when a meeting runs long. Within Google Sheets, Gemini can help make sense of data and process requests, such as retrieving a specific figure or data set.

The virtual teammate ‘Chip’ is perhaps the most futuristic example: it can live in a Google Chat space and be summoned for various tasks or questions. While these tools will make their way to Workspace, likely first through Labs, the remaining question is when they will be available to regular Gmail and Drive customers. Given Google’s AI-for-all approach and how hard it is pushing search, that’s likely just a matter of time.
