It is the next revolution after chatbots. That of “agents”, that is, AI capable of performing tasks instead of users. Several companies have embarked on this adventure, including Google. The company, which already offers AI capable of coding as well as agent capabilities in its AI mode and elsewhere, wants to go further.
On October 7, it presented a new model capable of interacting with the web, Gemini 2.5 Computer Use. Based on the Gemini 2.5 Pro’s visual understanding and reasoning capabilities, you can scroll through pages, navigate drop-down menus, and even fill out forms.
A model that is not yet fully operational
However, this model is far from being fully operational. It is only available as a demo, through Browserbase, a browser designed specifically for AI agents and applications. Users only need to enter their queries to view the Gemini 2.5 computer usage while browsing the web.
Google also has developers to enrich its model. They will have access to a previous version thanks to a tool called “computer_use”, they will be able to “create browser control agents that interact with tasks by automating them”, as the company explains on a dedicated page. In this way they will be able to create AIs in charge of carrying out searches on different websites or that will automate the entry of repetitive data or the completion of forms.
If Gemini 2.5 Computer Use is not yet ready for public release, Google is already using it to power some of its projects. As the company revealed, it powers some of the agent capabilities of the AI mode, but also Project Mariner, an agent that browses the web in place of the user.
With this project, it once again competes with OpenAI and Anthropic, which launched similar features earlier this year. The first, in particular, seems to be far ahead of the American giant, with an operator that can reserve a restaurant, order food or even make purchases.
Companies are still at the beginning of this agent revolution. “The ability to natively fill out forms, manipulate interactive elements like drop-down menus and filters, and operate behind login credentials is a crucial step in creating powerful and versatile agents,” Google said.
Source: BFM TV
