Google’s Gemini AI has only been around for two months at the time of writing, and the company is already launching its next-generation model, called Gemini 1.5.
The announcement post explains the improvements in detail. It’s all quite technical, but the main takeaway is that Gemini 1.5 will deliver “dramatically improved performance.” This was achieved by implementing a “Mixture-of-Experts architecture” (or MoE for short), in which multiple smaller “expert” networks work together inside the model. According to Google, this structure makes Gemini easier to train and faster at learning complicated tasks than before.
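For readers curious about what “experts working together” can mean in practice, here is a minimal toy sketch of top-k expert routing, a common way to build MoE layers. It is not Google’s implementation: the function names, the gate weights and the three toy experts are all made up for illustration, and real experts are neural networks rather than constants.

```python
import math

def softmax(scores):
    """Turn raw gate scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(x, experts, gate_weights, top_k=2):
    """Route input x to the top_k highest-scoring experts and blend their outputs.

    Hypothetical helper for illustration only.
    """
    # One gate score per expert: dot product of the input with that expert's gate vector.
    scores = [sum(w * xi for w, xi in zip(ws, x)) for ws in gate_weights]
    probs = softmax(scores)
    # Only the top_k experts run; the rest are skipped, which saves compute.
    chosen = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in chosen)
    output = sum(probs[i] / norm * experts[i](x) for i in chosen)
    return output, chosen

# Toy experts (assumed): each just returns a constant.
experts = [lambda x: 10.0, lambda x: 20.0, lambda x: 30.0]
gate_weights = [[5.0, 0.0], [0.0, 5.0], [1.0, 0.0]]

output, chosen = moe_forward([1.0, 0.0], experts, gate_weights, top_k=2)
print(chosen, round(output, 2))
```

The point of the design is that only a few experts are active for any given input, so the model can grow in total size without every part of it running on every request.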
There are plans to roll out the upgrade to all three major versions of the AI, but the only one being released for early testing today is Gemini 1.5 Pro.
What is unique about it is that the model has “a context window of up to 1 million tokens.” Tokens, as they relate to generative AI, are the smallest pieces of data that LLMs (large language models) use “to process and generate text.” Larger context windows allow the AI to process more information at once. And a million tokens is huge, much bigger than what GPT-4 Turbo can handle. For comparison, OpenAI’s engine has a context window limit of 128,000 tokens.
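To put those two limits side by side, here is a quick back-of-the-envelope sketch. The two context sizes come from the figures above; the 300,000-token document and the helper name are hypothetical, chosen only to show a case that fits one window but not the other.

```python
# Context window limits quoted above, in tokens.
GEMINI_1_5_PRO_CONTEXT = 1_000_000
GPT_4_TURBO_CONTEXT = 128_000

def fits_in_context(token_count, context_window):
    """Return True if a prompt of token_count tokens fits in the window."""
    return token_count <= context_window

# How many times larger is the bigger window?
ratio = GEMINI_1_5_PRO_CONTEXT / GPT_4_TURBO_CONTEXT
print(f"Gemini 1.5 Pro's window is about {ratio:.1f}x larger")

# Hypothetical document size, for illustration only.
document_tokens = 300_000
print(fits_in_context(document_tokens, GEMINI_1_5_PRO_CONTEXT))
print(fits_in_context(document_tokens, GPT_4_TURBO_CONTEXT))
```

In other words, the larger window holds roughly eight times as much text in a single prompt.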
Gemini Pro in action
With all these numbers being thrown around, the question is: what does Gemini 1.5 Pro look like in action? Google has made several videos showing the AI’s capabilities. They are quite interesting, as they show how the upgraded model can analyze and summarize large amounts of text according to a prompt.
In one example, they gave Gemini 1.5 Pro the 400+ page transcript of the Apollo 11 moon mission. It showed that the AI could “understand, reason about and identify” certain details in the document. The prompter asks the AI to pinpoint “comic moments” during the mission. After 30 seconds, Gemini 1.5 Pro managed to find a few jokes the astronauts made in space, noting who told each one and explaining any references.
These analysis skills can be applied to other modalities, too. In another demo, the development team gave the AI a 44-minute Buster Keaton movie. They uploaded a rough sketch of a water tower and then asked for the timestamp of a scene with a water tower. Sure enough, the AI found the exact part, ten minutes into the film. Note that this was done without any explanation of the drawing itself or any other text besides the question. Gemini 1.5 Pro understood that it was a water tower without any additional help.
Experimental technology
The model is not yet available to the general public. For now, it is being offered as an early preview to “developers and enterprise customers” through Google’s AI Studio and Vertex AI platforms for free. The company warns testers that they may experience long latency times as it is still experimental. However, there are plans to improve speeds down the line.
We reached out to Google asking for information on when people can expect the launch of Gemini 1.5 and Gemini 1.5 Ultra, plus the wider release of these next-generation AI models. This story will be updated at a later date. Until then, check out Ny Breaking’s roundup of the best AI content generators for 2024.