After Parting Ways with OpenAI, Figure AI Unveils Its “Secret Weapon”: Helix

Published: February 21, 2025 17:31

On February 20, Figure AI announced the end of its collaboration with OpenAI and unveiled its next-generation general-purpose embodied intelligence model, Helix. The release of Helix marks a significant breakthrough in robotics technology, not only enabling more precise and flexible robot motion control but also greatly enhancing robot-human collaboration capabilities.

 

Simply put, Helix tightly integrates visual perception, language understanding, and motion control in a single model, addressing a series of long-standing technical bottlenecks in robotics, particularly in manipulation precision and multi-robot collaboration.

 

Core Innovations of Helix

Figure AI describes Helix as the first Vision-Language-Action (VLA) model to achieve high-frequency, continuous control of a humanoid robot’s entire upper body. Helix controls the robot’s fingers, wrists, head, and torso, and coordinates these movements precisely enough for fine-grained manipulation.

 

Specifically, Helix can convert natural language instructions into specific robot actions without requiring separate training for each movement. This capability addresses a trade-off in traditional robot models: Vision-Language Models (VLMs) are general but not fast, while visuomotor policies are fast but not general. By combining the two, Helix enables robots to adapt quickly to new tasks, overcoming the limitations that traditional models face in complex situations.


 

Helix adopts a "System 1, System 2" architecture that divides robot control into two parts. System 2 is responsible for scene and language understanding and operates at 7 to 9 Hz. System 1 executes actions in real time at 200 Hz, converting the semantic information from System 2 into specific movements. The advantage of this architecture is that each system runs at the frequency that suits it best: the slower system has time to reason about the scene, while the faster system keeps the robot's responses quick and precise.
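
The two-rate split described above can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not Figure AI's implementation: the names `slow_system`, `fast_system`, and `Latent` are hypothetical, the 8 Hz value is simply an assumed midpoint of the 7 to 9 Hz range, and both "systems" are stand-ins that do no real perception or control. The point is only to show a fast loop reusing the most recent output of a slow loop.

```python
from dataclasses import dataclass

S1_HZ = 200  # fast control rate, as stated in the article
S2_HZ = 8    # ASSUMPTION: midpoint of the article's 7 to 9 Hz range


@dataclass
class Latent:
    """Hypothetical semantic summary produced by the slow system."""
    step_created: int
    goal: str


def slow_system(step: int, instruction: str) -> Latent:
    # Stand-in for scene and language understanding (System 2).
    return Latent(step_created=step, goal=instruction)


def fast_system(latent: Latent, step: int) -> dict:
    # Stand-in for the visuomotor policy (System 1): it always acts on
    # the most recent latent, however many fast ticks old it is.
    return {"step": step, "goal": latent.goal,
            "latent_age": step - latent.step_created}


def run(instruction: str, seconds: float = 1.0) -> list[dict]:
    ticks_per_update = S1_HZ // S2_HZ  # 25 fast ticks per slow update
    latent = slow_system(0, instruction)
    actions = []
    for step in range(int(seconds * S1_HZ)):
        # Refresh the latent only on every 25th fast tick.
        if step > 0 and step % ticks_per_update == 0:
            latent = slow_system(step, instruction)
        actions.append(fast_system(latent, step))
    return actions


actions = run("pick up the cup")
print(len(actions), max(a["latent_age"] for a in actions))
```

In this sketch, one simulated second produces 200 fast-loop actions, and no action ever relies on a latent more than 24 fast ticks old. A real system would run the two loops concurrently rather than in a single loop as shown here.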

 

In practical operation, Helix can precisely coordinate the robot’s hand, torso, and head movements, ensuring that the robot maintains high precision and fluidity when completing complex tasks. For example, when performing an object-picking task, Helix adjusts the robot’s fingers and wrists accurately to ensure a firm grip on the target object. At the same time, Helix can coordinate the robot’s head and torso, optimizing its movement path and field of view to perform more refined operations. This precise coordination, particularly in complex, high-dimensional action spaces, breaks through previous bottlenecks in robotics technology.

 

Zero-Shot Multi-Robot Coordination

In addition to precise control of individual robots, Helix also demonstrates strong capabilities in multi-robot collaboration. In one demonstration, Figure AI showed two robots completing a zero-shot collaboration task using Helix. Zero-shot here means that, without any training on this specific task, the two robots can coordinate through natural language instructions such as "Hand the bag of cookies to the robot on your right" or "Receive the bag of cookies from the robot on your left and place it in the open drawer."


 

The innovation here is that Helix enables multi-robot collaboration without task-specific training or explicit role assignments for each robot. Both robots run identical Helix model weights and differ only in the instructions they receive. The demonstration suggests that robots can adapt flexibly to entirely new objects and coordinate their work in shared, multi-robot environments.
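
The idea of "identical weights, different prompts" can be sketched in a few lines. Everything below is hypothetical and purely illustrative: `SharedPolicy`, its `act` method, and the `weights_id` value are invented names, and the method only echoes the instruction instead of producing motor commands. The sketch shows the structural point only, namely that both robots call the same policy object and are distinguished solely by the text they are given.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SharedPolicy:
    """Hypothetical stand-in for one set of model weights."""
    weights_id: str

    def act(self, robot_name: str, instruction: str) -> str:
        # A real policy would map camera images and robot state to motor
        # commands; this stub only reports who executes which instruction.
        return f"{robot_name} executes: {instruction}"


# Placeholder identifier, not a real checkpoint name.
policy = SharedPolicy(weights_id="helix-demo")

# Prompts quoted from the article's demonstration.
prompts = {
    "left_robot": "Hand the bag of cookies to the robot on your right",
    "right_robot": ("Receive the bag of cookies from the robot on your "
                    "left and place it in the open drawer"),
}

# Both robots use the same policy object; only the prompt differs.
plans = {name: policy.act(name, text) for name, text in prompts.items()}
for line in plans.values():
    print(line)
```

No per-robot role is configured anywhere in this sketch; the role emerges entirely from the instruction text, which mirrors how the article describes the demonstration.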

 

Precise Item Pickup Capability

Helix also shows an impressive ability to pick up almost any item, including objects it has not seen before. When the robot receives the instruction “Pick up the desert item,” it can recognize an object that matches this description, such as a toy cactus, and select the appropriate hand to grasp it. This ability demonstrates not only Helix’s object recognition but also how it can interpret an abstract concept in a natural language instruction and turn it into a precise action.


 

This capability makes robots more efficient in everyday tasks, especially in household environments. Robots can automatically pick up unseen objects based on natural language instructions, without the need for extra demonstrations or programming. This seamless conversion from language to action boosts the robot's intelligence, allowing it to operate in more complex, unstructured environments.

 

Efficient Training and Resource Utilization

Compared to traditional VLA systems, Helix shows exceptionally high training efficiency. Figure AI trained Helix with approximately 500 hours of high-quality supervised data, which is less than 5% of the size of traditional VLA datasets. Furthermore, Helix’s training does not rely on data collected across multiple robot embodiments or on multi-stage training, making it far more resource-efficient. Unlike traditional robot models, which require extensive manual programming or demonstrations, Helix adapts to multiple tasks with a single training process.

This efficiency allows the robot to adapt to different tasks in less time and with fewer resources, while still handling complex upper-body control and maintaining high-frequency, high-precision output in a high-dimensional action space.
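
As a quick sanity check on the "500 hours, less than 5%" figure above, the short calculation below derives the dataset size that the comparison implies. It uses only the two numbers given in the article; the variable names are illustrative, and the result is an implied lower bound, not a figure reported by Figure AI.

```python
helix_hours = 500   # training data size stated in the article
max_percent = 5     # "less than 5%" of traditional VLA datasets

# If 500 hours is under 5% of a traditional dataset, that dataset
# must exceed 500 * 100 / 5 hours.
implied_min_hours = helix_hours * 100 / max_percent
print(f"Traditional VLA datasets would exceed {implied_min_hours:.0f} hours")
```

In other words, the comparison only holds if the datasets being compared against contain more than 10,000 hours of data.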

 

Single Neural Network Weights

Another highlight of Helix is that it uses a single set of neural network weights for all tasks. Helix requires neither specialized fine-tuning for different tasks nor separate action heads for each task. With one unified set of weights, it can complete tasks such as picking up objects, placing them, and even multi-robot collaboration. This simplified architecture improves training efficiency and enhances the robot’s flexibility and adaptability.

 

Future Scaling and Commercialization

As Helix technology continues to evolve, Figure AI looks toward scaling its applications to more robotics fields. Helix not only demonstrates a robot’s ability to handle complex tasks but also opens up broad possibilities for robots in household, healthcare, and logistics environments. Figure AI believes that with the ongoing upgrades to Helix, robots will play an increasingly important role in everyday life.

 

The release of Helix is a breakthrough for Figure AI in the field of embodied intelligence and a significant milestone in robotics technology. Going forward, Helix is expected to become a core platform for embodied AI and robotics, helping robots move from laboratories into everyday life.