Google Unveils Gemini 2.5 Computer Use That Clicks, Types, and Scrolls Like Humans

Image Credit: Google
In Short
  • Google's latest Gemini 2.5 Computer Use AI model is designed to perform actions on web browsers and Android UIs.
  • It outperforms OpenAI's Computer-Using Agent and Anthropic's Claude Sonnet 4.5 in key benchmarks.
  • Versions of this model are already powering Project Mariner and AI Mode in Google Search.

Today, Google released the Gemini 2.5 Computer Use AI model, which is designed to interact with user interfaces (UIs). It’s built on the flagship Gemini 2.5 Pro model and brings that model’s visual understanding and reasoning capabilities to AI agents. The Gemini 2.5 Computer Use model can navigate web browsers as well as Android UIs.

Google says the new Gemini 2.5 Computer Use AI model can click, type, and scroll the way a human would to complete a task. On the WebVoyager benchmark, the Gemini 2.5 Computer Use model scores 88.9%, while OpenAI’s Computer-Using Agent achieves 87%. On the Online-Mind2Web benchmark, Google again outperforms OpenAI’s Operator agent.

Image Credit: Google

The results suggest that Google has trained a leading AI model for powering agents that can reliably perform tasks in browsers. Google also has the upper hand over Claude Sonnet 4.5 and OpenAI’s Computer-Using Agent on both accuracy and latency.
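
To picture what clicking, typing, and scrolling to complete a task looks like in practice, here is a minimal Python sketch of an agent loop. It is only an illustration and not Google's API: the Action, FakeBrowser, scripted_model, and run_agent names are all hypothetical, and a fixed script stands in for the model, which in a real agent would pick each action from a screenshot of the page.

```python
from dataclasses import dataclass, field
from typing import Optional, Tuple


@dataclass
class Action:
    """One UI action (hypothetical format, not the real API schema)."""
    kind: str              # "click", "type", "scroll", or "done"
    x: int = 0
    y: int = 0
    text: str = ""
    dy: int = 0            # vertical scroll distance in pixels


@dataclass
class FakeBrowser:
    """Stand-in for a real browser; it only records what was done to it."""
    scroll_y: int = 0
    focused: Optional[Tuple[int, int]] = None
    typed: str = ""
    log: list = field(default_factory=list)

    def apply(self, action: Action) -> None:
        if action.kind == "click":
            self.focused = (action.x, action.y)
        elif action.kind == "type":
            self.typed += action.text
        elif action.kind == "scroll":
            self.scroll_y = max(0, self.scroll_y + action.dy)
        else:
            raise ValueError(f"unknown action: {action.kind}")
        self.log.append(action.kind)


def scripted_model(goal: str, browser: FakeBrowser) -> Action:
    """Hypothetical policy: a fixed plan instead of a real model call."""
    plan = [
        Action("click", x=120, y=40),   # focus the search box
        Action("type", text=goal),      # enter the query
        Action("scroll", dy=600),       # scroll down to the results
    ]
    step = len(browser.log)
    return plan[step] if step < len(plan) else Action("done")


def run_agent(goal: str, browser: FakeBrowser, model, max_steps: int = 10) -> bool:
    """Ask the model for an action, apply it, and repeat until it says done."""
    for _ in range(max_steps):
        action = model(goal, browser)
        if action.kind == "done":
            return True
        browser.apply(action)
    return False
```

Calling run_agent("gemini", FakeBrowser(), scripted_model) walks through the three scripted actions and then stops. The max_steps cap is a design choice in this sketch so that a model that never signals completion cannot loop forever.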

Google has already deployed versions of this model in Project Mariner and AI Mode in Google Search. In addition, the API for Gemini 2.5 Computer Use is available via Google AI Studio and Vertex AI.
