Google DeepMind launches two new AI models based on Gemini 2.0
Published: March 17, 2025 17:14
Google DeepMind recently unveiled two new AI models built on Gemini 2.0. By leveraging the reasoning abilities of large language models, they are designed to help robots complete complex real-world tasks and adapt to demanding environments.
The first model is called Gemini Robotics. According to official sources, it is a vision-language-action model. The second is Gemini Robotics-ER (Embodied Reasoning), which adds enhanced spatial understanding and draws on the reasoning ability of large language models to help robots complete increasingly complex real-world tasks, enabling them to work efficiently and precisely in highly challenging environments.
Google DeepMind believes that for a robot AI model to be useful to humans, it must possess three core traits: generalization (the ability to adapt to varied scenarios), interactivity (the ability to quickly understand and respond to commands or environmental changes), and dexterity (the ability to perform fine manipulation comparable to human hands). The two newly released models allow a wide range of robots to take on a broader set of real-world tasks than ever before.
Gemini Robotics
DeepMind's Gemini Robotics project is designed to address the challenges of applying traditional robot technology in the real world. Traditional robotic systems usually rely on pre-programmed algorithms to perform tasks, but these methods often fail in complex and dynamic physical environments. The core innovation of Gemini Robotics lies in incorporating AI reasoning abilities into robots, allowing them to proactively perceive their environment, adapt to changes, and execute complex tasks.
One of the major highlights of this project is the Gemini 2.0 series, which combines deep learning, reinforcement learning, and large language model reasoning abilities, enabling robots to reason and make decisions like humans. This allows them to more flexibly and intelligently adapt to different environments and task requirements.
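At a high level, that "reason and decide" behavior amounts to a perceive-reason-act loop. The sketch below is a hypothetical illustration of such a loop, not DeepMind's code; the `reasoning_model` stub stands in for a Gemini 2.0-based policy that turns an observation and a goal into the next action.

```python
import random

def perceive() -> dict:
    """Hypothetical sensor read; a real robot would fuse camera, tactile and
    proprioceptive data into a structured observation."""
    return {"object_visible": random.random() > 0.3, "holding_object": False}

def reasoning_model(observation: dict, goal: str) -> str:
    """Stand-in for a Gemini-style policy mapping (observation, goal) to an
    action name; here it is just two hand-written rules."""
    if not observation["object_visible"]:
        return "search_for_object"
    return "place_object" if observation["holding_object"] else "grasp_object"

def act(action: str) -> None:
    print(f"Executing: {action}")

def control_loop(goal: str, steps: int = 5) -> None:
    """Run the perceive -> reason -> act cycle a fixed number of times."""
    for _ in range(steps):
        act(reasoning_model(perceive(), goal))

if __name__ == "__main__":
    control_loop("put the apple in the bowl")
```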
Gemini Robotics-ER
Gemini Robotics-ER (Embodied Reasoning) is a significant technological breakthrough within DeepMind’s Gemini Robotics project. It enhances a robot’s ability to reason in complex scenarios, enabling it to perform not only traditional tasks but also those requiring deep understanding and multi-step reasoning.
The core advantage of this extended model is its ability to combine the real-time environmental data a robot senses with the reasoning capabilities of large language models to make well-grounded decisions. This means robots can respond flexibly to voice or text commands and, when faced with environmental changes, make more accurate predictions and plans. For example, when a robot faces multiple operational choices, Gemini Robotics-ER can analyze complex environmental information, predict the likely outcome of each choice, and help the robot select the best course of action.
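To make the outcome-prediction idea concrete, here is a minimal, purely hypothetical sketch (not DeepMind's implementation): the `predict_outcome` stub stands in for whatever Gemini-based reasoning model scores how well each candidate action is likely to turn out in the current scene, and the robot simply picks the highest-scoring one.

```python
from dataclasses import dataclass

@dataclass
class Action:
    name: str
    description: str

def predict_outcome(scene: str, action: Action) -> float:
    """Hypothetical stand-in for a reasoning model that scores (0.0-1.0) how
    well an action is likely to turn out in the described scene. A real system
    would query a vision-language-action model here."""
    # Toy heuristic: prefer actions whose description avoids the fragile object.
    return 0.9 if "avoids" in action.description else 0.4

def choose_action(scene: str, candidates: list[Action]) -> Action:
    """Pick the candidate action with the highest predicted outcome score."""
    return max(candidates, key=lambda a: predict_outcome(scene, a))

if __name__ == "__main__":
    scene = "A mug sits behind a fragile glass on a cluttered table."
    candidates = [
        Action("reach_direct", "reach straight for the mug"),
        Action("reach_around", "reach for the mug on a path that avoids the glass"),
    ]
    print("Selected action:", choose_action(scene, candidates).name)
```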
Additionally, Gemini Robotics-ER demonstrates advantages over traditional reasoning models when robots are performing high-difficulty tasks. For instance, in manufacturing, healthcare, and warehousing environments, robots can use the extended reasoning model to understand the multi-layered context of a scene and make more efficient and precise decisions.
Large Language Models Make Robots Smarter
A core innovation of Gemini 2.0 and its extended reasoning model is that they empower robots with powerful language reasoning capabilities. Traditional robots often rely on programmed behavior control, which can be inflexible when facing complex real-world environments. With the introduction of large language models, Gemini Robotics enables robots to engage in language reasoning, allowing them to understand, generate, and execute more complex and abstract tasks.
Robots receive instructions in natural language and reason about the current task in the context of what they perceive. When a robot receives a command, it does not simply execute it; it also analyzes variables in the environment, such as the location of obstacles or the properties of objects, to decide which action will achieve the best result. In a warehouse, for instance, a robot can plan how to quickly locate a target item even when the item's exact position is uncertain.
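As a concrete, hypothetical illustration of that warehouse example (none of these names come from the Gemini Robotics API), the sketch below turns candidate shelf locations with uncertain priors into a search plan, visiting likely, nearby shelves first.

```python
from dataclasses import dataclass

@dataclass
class Shelf:
    shelf_id: str
    distance_m: float      # travel distance from the robot's current position
    prob_item_here: float  # prior belief that the target item is on this shelf

def plan_search(command: str, shelves: list[Shelf]) -> list[str]:
    """Order candidate shelves so likely, nearby locations are visited first.
    Score = probability of success per metre of travel (a simple toy trade-off).
    In a real system the natural-language command would shape the priors; here
    they are given directly."""
    ranked = sorted(shelves,
                    key=lambda s: s.prob_item_here / max(s.distance_m, 0.1),
                    reverse=True)
    return [s.shelf_id for s in ranked]

if __name__ == "__main__":
    shelves = [
        Shelf("A3", distance_m=12.0, prob_item_here=0.5),
        Shelf("B1", distance_m=4.0,  prob_item_here=0.3),
        Shelf("C7", distance_m=20.0, prob_item_here=0.2),
    ]
    plan = plan_search("Bring me the blue storage bin", shelves)
    print("Visit order:", " -> ".join(plan))  # B1 first: close and fairly likely
```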
Adapting to Complex Tasks in the Real World
The true value of Gemini Robotics lies in its ability to adapt to complex tasks in the real world. In many industry applications, robots are required not only to complete standardized tasks but also to adapt to constantly changing environmental conditions. Whether in manufacturing assembly tasks or precision operations in healthcare, robots must have flexibility and efficient decision-making capabilities.
Gemini Robotics-ER excels in this area. Through the extended reasoning model, robots can perceive and process data from various sensors, such as visual, tactile, and auditory information, in real time. This data, combined with reasoning models, helps robots make dynamic adjustments while performing tasks. For example, on an automated production line, robots can recognize parts of different shapes and materials and adjust their grasping strategies in real time to avoid errors or damage.
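The real-time grasp adjustment described above can be pictured as a small sense-decide-adjust step. The following is a hypothetical sketch rather than DeepMind's control code: it fuses a visual material estimate with a tactile slip reading to choose a grip force.

```python
from dataclasses import dataclass

@dataclass
class Perception:
    material: str        # e.g. "glass", "cardboard", "metal" (estimated from vision)
    slip_detected: bool  # reported by a tactile / force sensor during the grasp

# Baseline grip forces in newtons per material class (illustrative values only).
BASE_FORCE_N = {"glass": 5.0, "cardboard": 8.0, "metal": 15.0}

def choose_grip_force(p: Perception) -> float:
    """Pick a grip force from the visual material estimate, then tighten it
    slightly if the tactile sensor reports the object slipping."""
    force = BASE_FORCE_N.get(p.material, 10.0)
    if p.slip_detected:
        force *= 1.2  # small correction instead of re-planning the whole grasp
    return force

if __name__ == "__main__":
    reading = Perception(material="glass", slip_detected=True)
    print(f"Commanded grip force: {choose_grip_force(reading):.1f} N")
```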
Through this technology, Gemini Robotics not only performs tasks in simple environments but also maintains high efficiency in changing and challenging environments. The robots' learning capabilities are greatly enhanced, enabling them to autonomously reason and complete tasks when faced with unknown challenges.
Bridging the Gap Between the Virtual and Physical Worlds
Although Gemini Robotics and its extended reasoning model represent a significant leap in AI and robotics technology, many challenges remain in achieving a seamless transition from the virtual to the physical world. Especially in dynamic, complex real-world environments, ensuring that robots can not only execute tasks efficiently but also adapt to change remains a key challenge for the technology's development.
However, with the continuous optimization of Gemini Robotics-ER, future robots will provide more efficient services in more complex environments. For instance, in logistics, robots will not only move items but also adjust their paths flexibly according to environmental changes; in healthcare, robots will be able to reason and understand doctors' instructions to perform surgeries efficiently.
As technology progresses, DeepMind’s Gemini Robotics project will continue to drive the development of robotics technology, helping robots become not only essential tools in industrial production but also widely used in healthcare, home, education, and other fields.
The Future of AI and Human Collaboration
The launch of Gemini Robotics represents more than just a breakthrough in robotics technology; it also paves the way for a new direction in AI and human collaboration. With the introduction of Gemini Robotics-ER, robots will no longer be mere tools but will become partners that work alongside humans. Robots will perform precise operations in complex environments, helping humans with dangerous, strenuous tasks and improving work efficiency.
In the future, robots will work side by side with humans in fields such as healthcare, manufacturing, and agriculture to address global challenges. Gemini Robotics-ER will accelerate this process, promoting the practical application of AI across industries and transforming work modes, thereby enhancing human productivity.
DeepMind’s Gemini Robotics project and its extended reasoning model Gemini Robotics-ER empower robots to adapt to more complex and dynamic real-world tasks. As the reasoning capabilities of large language models and deep learning technologies continue to improve, robots in the future will be able to provide efficient and precise services in more industries, becoming valuable assistants in human life and work. Through the ongoing innovation of these technologies, robots will not only perform preset tasks but also think, reason, and make decisions autonomously, playing an increasingly important role in the intelligent age.