Google Is Done Building Assistants. It Is Building Agents.


Tech & AI | May 23, 2026

At Google I/O 2026, the company’s agentic AI ambitions were impossible to miss: more than 100 announcements across four days at Shoreline Amphitheatre, all pointing toward the same conclusion. The era of conversational AI, where users type questions and read the answers, is being replaced by something more autonomous, more persistent, and considerably harder to compete against.

Strip away the product names and version numbers, and the architecture underneath is consistent. Google is building systems that do not wait to be asked. They plan. They act. They run in the background while you are asleep. And they are woven, at every layer, into the products that two billion people already use every day.

The Model That Kicked Everything Off

The announcement that defined the conference’s technical register was Gemini 3.5 Flash, the first model in Google’s new 3.5 series and, by any measure, an unusual proposition: a model priced and positioned as a lightweight option that outperforms last year’s flagship.

On the benchmark scorecard, 3.5 Flash hits 76.2 percent on Terminal-Bench 2.1, 83.6 percent on MCP Atlas, and 84.2 percent on CharXiv Reasoning, all above the Gemini 3.1 Pro scores that Google uses as its comparison point. Its output speed runs at 289 tokens per second, which Google claims is four times faster than comparable frontier models and 70 percent faster than the previous Flash. Pricing lands at $1.50 per million input tokens and $9.00 per million output tokens, with cached input at $0.15 per million tokens, a cost structure Google describes as less than half that of comparable alternatives.
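To make the pricing concrete, here is a minimal sketch of a per-request cost estimate using the three rates quoted above. The function name, the example token counts, and the assumption that cached tokens are simply billed at the cached rate instead of the standard input rate are illustrative, not taken from Google’s documentation.

```python
# Prices quoted in the article, in US dollars per million tokens.
INPUT_PER_M = 1.50
CACHED_INPUT_PER_M = 0.15
OUTPUT_PER_M = 9.00


def request_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate the cost of one call in US dollars.

    cached_tokens is the part of input_tokens billed at the cached rate.
    This billing rule is an assumption made for illustration.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    fresh_tokens = input_tokens - cached_tokens
    total = (
        fresh_tokens * INPUT_PER_M
        + cached_tokens * CACHED_INPUT_PER_M
        + output_tokens * OUTPUT_PER_M
    )
    return total / 1_000_000


# Hypothetical workload: 200,000 input tokens and 4,000 output tokens.
uncached = request_cost(200_000, 4_000)
mostly_cached = request_cost(200_000, 4_000, cached_tokens=180_000)
print(f"uncached: ${uncached:.3f}  90% cached: ${mostly_cached:.3f}")
```

Under these assumptions the uncached call costs about 34 cents and the mostly cached one about 9 cents, which is why caching matters most for workloads that resend the same long context.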

The 1-million-token context window, paired with a 64,000-token output cap, makes it capable of processing large codebases, long document sets, and extended multi-step instructions in a single pass. During a developer demonstration, Google showed Flash running 93 parallel subagents handling 15,000-plus requests across a 12-hour window for under $1,000 in API costs. For the developers in the audience who have been waiting for a frontier-capable model at sub-frontier prices, that figure was the headline.
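The demonstration figures can be turned into rough unit economics. The sketch below uses only the reported numbers; because "15,000-plus" and "under $1,000" are bounds, every result is a ceiling or a floor rather than an exact value, and the variable names are illustrative.

```python
# Figures as reported from the demonstration. "15,000-plus" and "under $1,000"
# are bounds, so every result below is a limit, not an exact value.
SUBAGENTS = 93
MIN_REQUESTS = 15_000
WINDOW_HOURS = 12
MAX_COST_USD = 1_000.0

max_cost_per_request = MAX_COST_USD / MIN_REQUESTS
min_requests_per_hour = MIN_REQUESTS / WINDOW_HOURS
min_requests_per_agent = MIN_REQUESTS / SUBAGENTS
max_cost_per_agent_hour = MAX_COST_USD / (SUBAGENTS * WINDOW_HOURS)

print(f"cost per request:        under ${max_cost_per_request:.3f}")
print(f"requests per hour:       at least {min_requests_per_hour:.0f}")
print(f"requests per subagent:   at least {min_requests_per_agent:.0f}")
print(f"cost per subagent-hour:  under ${max_cost_per_agent_hour:.2f}")
```

Read this way, the demo works out to less than seven cents per request and less than a dollar per subagent per hour, assuming the load was spread evenly across the 93 subagents.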

Gemini 3.5 Flash became generally available the same day it was announced. By May 20 it was the default model inside both the Gemini app and AI Mode in Google Search globally. The next model in the series, Gemini 3.5 Pro, is already in internal testing at Google and is expected to roll out next month.

Gemini Spark and the Always-On Agent

The consumer announcement that will carry the most long-term significance is Gemini Spark, which Google introduced not as a model but as an agent: a personal agent that runs around the clock on a dedicated Google Cloud virtual machine, whether or not the user has their phone open or their laptop on.

Where existing AI products are reactive, waiting for input before doing anything, Spark is proactive by design. It reads incoming email and surfaces what needs action before the user asks. It drafts replies and follow-ups across days and weeks. It tracks ongoing tasks across multiple applications and executes multi-step workflows without being prompted at each stage.

At launch, Spark connects to the full suite of Google Workspace: Gmail, Google Docs, Sheets, Slides, and Drive. The expansion to third-party services, including Canva, OpenTable, and Instacart, is planned for within weeks and will run through the Model Context Protocol, the open standard that Anthropic introduced in late 2024 and that has since been adopted across the agent industry as the common interface between AI models and external tools.
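For readers unfamiliar with the Model Context Protocol, the sketch below builds the kind of message an agent sends to invoke a tool. MCP messages are JSON-RPC 2.0, and tool invocations use the `tools/call` method with a tool name and an arguments object. The tool name `reserve_table` and its arguments are hypothetical, chosen only for illustration; nothing here describes how Spark or any partner service actually exposes its tools.

```python
import json
from itertools import count

# JSON-RPC requests carry an id so the response can be matched to the request.
_request_ids = count(1)


def build_tool_call(tool_name: str, arguments: dict) -> str:
    """Serialise an MCP tool invocation as a JSON-RPC 2.0 request."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(message)


# Hypothetical tool and arguments, for illustration only.
request = build_tool_call("reserve_table", {"party_size": 2, "time": "19:30"})
print(request)
```

The point of the shared format is that the agent does not need service-specific integration code: any server that speaks the protocol can advertise its tools and receive calls in this shape.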

Spark is available initially to Google AI Ultra subscribers in the United States. That subscription tier was repriced from $250 per month to $100 per month at the same I/O keynote, bundled with 20 terabytes of cloud storage, YouTube Premium, and the Spark beta. The pricing move is a direct response to competition at the premium AI subscription layer, where OpenAI’s Pro tier at $200 per month and Anthropic’s comparable offering have been cannibalising the segment Google wants.

The agent architecture Spark represents is fundamentally different from the assistant architecture it is replacing. An assistant waits; an agent acts. An assistant holds context for a single conversation; an agent holds context across weeks. An assistant ends when the chat window closes; an agent runs on a virtual machine whether or not any chat window is open. This is a significant structural shift in what AI products are, and Google is moving to this model at the same moment its core competitors are making similar transitions.

Developer Infrastructure: Antigravity and the Agent Stack

For developers building on top of Google’s models, I/O 2026 delivered Antigravity 2.0, a standalone desktop application positioned as the central hub for agent development. Antigravity supports parallel subagent execution, meaning developers can orchestrate multiple specialised agents running simultaneously rather than sequentially. It integrates scheduled tasks for background automation, an architecture that aligns directly with the always-on design of Spark. The updated harness is co-optimised with Gemini 3.5 Flash specifically.
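The difference between sequential and parallel subagent execution can be illustrated with Python’s standard `asyncio` library. This is a generic sketch of the concurrency pattern, not Antigravity’s API: `run_subagent` is a hypothetical stand-in whose sleep simulates model and tool latency, and the task names and durations are made up.

```python
import asyncio
import time


async def run_subagent(name: str, seconds: float) -> str:
    """Hypothetical subagent; the sleep stands in for model and tool latency."""
    await asyncio.sleep(seconds)
    return f"{name}: done"


async def run_sequentially(tasks: dict[str, float]) -> list[str]:
    # Each subagent starts only after the previous one has finished.
    return [await run_subagent(name, secs) for name, secs in tasks.items()]


async def run_in_parallel(tasks: dict[str, float]) -> list[str]:
    # All subagents start together; gather returns results in input order.
    return await asyncio.gather(
        *(run_subagent(name, secs) for name, secs in tasks.items())
    )


def timed(coro):
    """Run a coroutine to completion and return (results, elapsed seconds)."""
    start = time.perf_counter()
    results = asyncio.run(coro)
    return results, time.perf_counter() - start


TASKS = {"research": 0.2, "draft": 0.2, "review": 0.2}
seq_results, seq_time = timed(run_sequentially(TASKS))
par_results, par_time = timed(run_in_parallel(TASKS))
print(f"sequential: {seq_time:.2f}s  parallel: {par_time:.2f}s")
```

Sequential time grows with the sum of the subagents’ latencies, while parallel time tracks the slowest one. That only holds when the subagents do not depend on each other’s output, which is the design constraint any orchestration tool has to manage.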

Google also released a lightweight terminal product for rapid agent creation and an updated programmatic SDK. The full developer stack is built around a premise: that most serious applications in the next two years will not be single-turn queries but multi-step workflows involving many model calls, tool invocations, and asynchronous background processes. Antigravity is the workbench for building those workflows.

On Google Cloud, the I/O announcements included expanded access to model tuning, extended context caching for cost reduction on repeated long-context calls, and deeper integration between Gemini models and BigQuery for enterprise data workflows. The cloud strategy is aimed at the enterprise customer who wants frontier model capabilities without paying the cost structure of running everything at full context every time.

Android XR and the Hardware Bet

Away from models and agents, Google announced two new categories of Android XR hardware. The first, display glasses built with Samsung, were previewed last year and are moving toward a launch window later in 2026. The second, audio glasses without displays, is a new product category announced at this I/O, with Samsung, Warby Parker, and Gentle Monster named as hardware partners.

The audio glasses will connect to both Android and iOS devices. They are positioned below the display glasses in price and capability but above existing Bluetooth earbuds in terms of ambient intelligence, able to hear what is around the user and provide contextual information through audio without requiring the user to look at a screen.

The XR strategy is a direct response to Meta’s Ray-Ban smart glasses, which have become one of Meta’s highest-selling consumer hardware products. Google ceded the audio glasses market to Meta for roughly two years while focusing on more ambitious display hardware. The move to launch audio glasses with Warby Parker and Gentle Monster, both brands with established fashion credentials, suggests a more deliberate approach to consumer positioning than Google’s previous hardware efforts.

SynthID and the Watermarking Infrastructure

One announcement that received less keynote time but carries broad industry implications was the expansion of SynthID, Google’s AI content watermarking technology. SynthID embeds imperceptible signals into AI-generated images, audio, and video that survive typical post-processing, allowing content to be verified as AI-generated without visible markers.

At I/O 2026, Google announced that SynthID verification for images, video, and audio had been added to the Gemini app itself, and that verification capability was expanding to Google Search and Chrome. The technology has now been used 50 million times globally. More significantly, OpenAI, Kakao, and ElevenLabs have committed to bringing SynthID technology to their own AI-generated content, marking the first meaningful cross-industry adoption of a shared watermarking standard.

The provenance problem, knowing whether an image, audio clip, or video is AI-generated, is becoming a practical policy and legal question as AI-generated media proliferates. SynthID is Google’s attempt to make itself the infrastructure layer for answering that question, positioning its verification standard as the default across the industry rather than just one of several competing approaches.

What This Means for Google’s Competitors

The strategic logic of I/O 2026 is clearest when read against what Google’s competitors are doing.

OpenAI has been moving in the same agentic direction, with Operator and the recently announced self-serve Ads Manager inside ChatGPT. Anthropic’s Claude has gained significant ground in enterprise coding and reasoning tasks. Meta is building agents into WhatsApp and its broader social infrastructure. Microsoft has Copilot embedded across Office and Windows.

What Google has that none of these competitors can replicate quickly is the existing user base. Gemini 3.5 Flash became the default model in Google Search globally on the same day it was announced. That is not a deployment in the traditional sense; it is a distribution event touching billions of queries per day. Spark launching inside Gmail gives it access to a communications graph that no new-entrant AI company has.

The deeper competitive question is whether Google can execute on the agentic vision at the pace its announcements suggest. Google has historically struggled to ship consumer AI products that match its research capabilities. Google Assistant, released in 2016, never fulfilled the potential its technology suggested. Duplex, demonstrated in 2018, never scaled to the ambition of its original demo. The gap between what Google announces and what it ships has been a consistent pattern.

The agents announced at I/O 2026 are, by design, harder to evaluate at announcement time than previous AI products. An assistant that answers questions can be tested immediately. An agent that manages workflows over days and weeks requires weeks to evaluate. That timeline difference gives Google some runway, but it also means the honest verdict on Gemini Spark will not be available until the summer at the earliest.

What Remains Contested

The benchmarks used to validate Gemini 3.5 Flash, including Terminal-Bench 2.1 and MCP Atlas, were not widely familiar before this announcement. Releasing models alongside the benchmarks that validate them is a practice the AI industry has not resolved cleanly. Independent third-party testing, from Artificial Analysis and similar organisations, will be the more reliable source of comparative performance data over the next few weeks.

The economics of always-on agents are also unresolved. Running a personal agent on a cloud virtual machine, 24 hours a day, at the kind of capability level Spark claims, has real infrastructure costs. Whether $100 per month covers those costs at scale, or whether the pricing is intentionally below cost to drive adoption, is an open question. The history of consumer AI subscriptions suggests the latter.

The privacy architecture of an agent that reads your email continuously and executes actions on your behalf is also not fully specified. Google’s privacy disclosures at I/O were high-level. The detailed data handling documentation will matter for enterprise adoption and for any regulatory scrutiny that follows in Europe, where the AI Act creates compliance obligations for systems of this kind.

What Google I/O 2026’s Agentic AI Shift Actually Means

Taken together, the I/O 2026 announcements represent something more significant than a product refresh cycle. Google is rebuilding the premise of what an AI product is, shifting from software you invoke to infrastructure that operates continuously on your behalf.

That shift is real. The questions that remain, around execution, privacy, cost structure, and the reliability of always-on agents across complex real-world workflows, are also real. The companies watching most carefully are the ones that have built their business models on the assumption that Google would continue to be slow.

Sources: 100 things we announced at Google I/O 2026, Google Blog | Gemini 3.5 Flash launch details, MarkTechPost | Google introduces Gemini Spark, TechCrunch | Everything announced at Google I/O 2026, 9to5Google | Biggest Google I/O 2026 announcements, Tom’s Guide | Gemini 3.5 Flash, Google DeepMind | Gemini 3.5 Flash benchmarks and pricing, NxCode

signalmoss is an independent editorial publication covering technology, finance, business, gaming, luxury, science, and culture. Our writers follow the stories that matter - from AI's impact on the workforce to the resale markets behind a sold-out watch drop - with clear analysis and no filler. We believe good journalism doesn't require jargon, and that curious readers deserve writing that respects their intelligence.