While Google is keen to show that Gemini has its place in changing our daily lives, its intelligent assistant is also being put to work, in a different way, in the field of robotics. In March, the company introduced a new family of models called Gemini Robotics, which allow robots to perceive, reason, use tools and interact with humans. These machines can also solve complex tasks.
But Google wants to go further and develop truly versatile robots. To that end, DeepMind, its subsidiary specialised in AI, has presented two new models offering robots agentic experiences "thanks to advanced thinking", as explained in a blog post. To perform more elaborate tasks, robots equipped with these models can connect to the internet to get help.
Think before acting
In a video, for example, a robot manages to sort rubbish, compost and recyclables based on an internet search tailored to the specific rules of a given place (San Francisco, in this case). This task, like many others, requires contextual information and several steps to complete. After searching the web for local recycling guidelines, the machine examines the objects in front of it, determines how to sort them using the information it has gathered, and then acts.
To help a person pack a bag for a trip to London, another robot checks the weather forecast, tells them whether it will rain and reminds them to put an umbrella in the bag. To achieve this, the two models work together. The first, Gemini Robotics-ER 1.5, orchestrates a robot's activities "like a high-level brain", according to DeepMind. It is a vision-language model (VLM) capable of reasoning about the physical world, but also of using digital tools and creating detailed multi-step plans to carry out a mission.
Once Gemini Robotics-ER 1.5 has finished thinking, it hands its instructions to Gemini Robotics 1.5, a vision-language-action (VLA) model capable of turning those instructions into motor commands for the robot. The latter uses its vision and language understanding to carry out the specific actions directly.
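To make this division of labour concrete, here is a minimal Python sketch of such a planner/executor pipeline. It assumes nothing about the real Gemini Robotics API: the PlannerVLM, ExecutorVLA, Step and run_task names, the web-search stub and the placeholder joint command are all hypothetical, purely to illustrate how a high-level reasoning model could hand structured steps to a low-level action model.

```python
# Hypothetical sketch of the two-model split described above: an "ER"-style
# planner that reasons, can consult the web and produces a step-by-step plan,
# and a "VLA"-style executor that turns each step into motor commands.
# None of these classes or calls are the real Gemini Robotics API;
# they only illustrate the orchestrator/executor division of labour.

from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    description: str  # natural-language instruction, e.g. "put the can in the blue bin"


class PlannerVLM:
    """High-level 'brain': reasons about the task and may call external tools."""

    def plan(self, goal: str, web_search: Callable[[str], str]) -> List[Step]:
        # E.g. look up local recycling rules, then break the goal
        # into ordered steps the executor can act on.
        local_rules = web_search(f"local rules for: {goal}")
        return [Step(f"sort the item in front of you according to: {local_rules}")]


class ExecutorVLA:
    """Vision-language-action model: one step in, low-level motor commands out."""

    def act(self, step: Step) -> List[float]:
        # A real VLA would condition on the live camera feed and the step text;
        # here we just return a placeholder 7-value joint command.
        return [0.0] * 7


def run_task(goal: str, web_search: Callable[[str], str]) -> None:
    planner, executor = PlannerVLM(), ExecutorVLA()
    for step in planner.plan(goal, web_search):
        commands = executor.act(step)
        print(f"{step.description} -> {commands}")


if __name__ == "__main__":
    # Stubbed web search so the sketch runs without any network access.
    run_task("throw away a soda can in San Francisco",
             web_search=lambda q: "cans go in the blue recycling bin")
```

The design point the sketch tries to capture is simply that planning and acting are separate models with a narrow interface (a list of natural-language steps) between them.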
Even more striking, this AI model can transfer motions learned on one robot to another, without requiring specialised tuning for each new embodiment. "This advance accelerates the learning of new behaviours, helping robots become smarter and more useful," says DeepMind.
Like other companies, Google is aiming for an AI that would be as intelligent as humans and capable of performing complex tasks as we do. Some players, such as OpenAI, believe they are getting close with AI agents and LLMs, to the point of being only a few years away from such an advent. Others, like Yann LeCun, think that artificial general intelligence is ultimately inevitable, but that the road ahead is still long...
Source: BFM TV
