While previous editions of Google I/O had already focused on artificial intelligence, this year Google took it a step further by unveiling innovations and updates to its Gemini models, once again underscoring its commitment to a technology that is now ubiquitous.
This year’s conference was not only a technology showcase, but also revealed Google’s strategic depth in leveraging artificial intelligence to strengthen and diversify its products. By integrating Gemini into platforms such as Android, Google Workspace, and even Google Search, the company is proving that AI is no longer just a side project, but a central pillar of its product ecosystem.
Gemini 1.5: Up to 2 million tokens for the Pro version
In Google’s technological arsenal, the Gemini 1.5 Flash and Pro models now occupy a prominent place, and it is on the latter that the keynote’s most impressive demos were built. First presented at the beginning of the year, this iteration of Gemini relies on a Mixture of Experts (MoE) architecture to deliver reasoning performance close to its competitors’, but above all a context window of one million tokens, well beyond GPT-4o’s 128k or Claude 3’s 200k. Gemini 1.5 Flash, for its part, is a lighter model distilled from its big brother, optimized for low-latency, high-frequency tasks.
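In broad strokes, an MoE layer routes each token to a small subset of specialist sub-networks instead of running the entire model on every token, which is how such architectures keep inference cost down at large scale. The toy top-1 router below sketches the idea; the dimensions, expert count, and gating scheme are purely illustrative and say nothing about Gemini's actual configuration.

```python
# Toy Mixture-of-Experts layer with top-1 routing (illustrative only).
import numpy as np

rng = np.random.default_rng(0)
d_model, n_experts = 8, 4

# Each "expert" is reduced here to a single weight matrix.
experts = [rng.standard_normal((d_model, d_model)) for _ in range(n_experts)]
gate_w = rng.standard_normal((d_model, n_experts))

def moe_forward(x: np.ndarray) -> np.ndarray:
    """Route each token (row of x) to its single best-scoring expert."""
    logits = x @ gate_w                 # (tokens, n_experts) gating scores
    chosen = logits.argmax(axis=1)      # top-1 routing decision per token
    out = np.empty_like(x)
    for i, e in enumerate(chosen):
        out[i] = x[i] @ experts[e]      # only one expert runs per token
    return out

tokens = rng.standard_normal((5, d_model))
y = moe_forward(tokens)
print(y.shape)  # (5, 8)
```

The key property is that compute per token stays constant as experts are added: each token touches one expert, while total model capacity grows with `n_experts`.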
Gemini 1.5 Pro goes further still, doubling the context window to 2 million tokens, a significantly greater capacity for analysis and understanding that is critical for applications such as real-time translation or in-depth personal assistance. Initially reserved for a small circle of partners, Gemini 1.5 is now available to all developers, as well as to Gemini Advanced subscribers, in 35 languages.
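For developers, access goes through the Gemini API. The sketch below assumes the `google-generativeai` Python SDK and a `GOOGLE_API_KEY` environment variable; the 2-million-token figure is the announced limit, and the four-characters-per-token estimate is only a rough pre-flight heuristic, not the API's actual tokenizer.

```python
# Sketch: sending a long document to Gemini 1.5 Pro.
# Assumes: `pip install google-generativeai` and GOOGLE_API_KEY set.
import os

GEMINI_15_PRO_WINDOW = 2_000_000  # announced context window, in tokens

def fits_in_window(text: str, window: int = GEMINI_15_PRO_WINDOW) -> bool:
    """Crude size check: ~4 characters per token on average."""
    return len(text) / 4 <= window

def summarize(document: str) -> str:
    import google.generativeai as genai
    genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
    model = genai.GenerativeModel("gemini-1.5-pro")
    if not fits_in_window(document):
        raise ValueError("document likely exceeds the 2M-token window")
    return model.generate_content(f"Summarize:\n\n{document}").text

if __name__ == "__main__":
    print(fits_in_window("hello world"))  # True: tiny input
```

In practice the SDK also exposes a `count_tokens` call for an exact count, which is preferable to a character heuristic when the input is anywhere near the limit.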
But is all this enough to fuel Google’s multimodal AI ambitions? The Mountain View company firmly believes so, and that is what it set out to demonstrate during its keynote. With multimodality, Google intends to offer artificial intelligence that can ingest sound, images, and video, all while compressing long stretches of time into a single context.