Technology
Karan Kamble
Apr 09, 2024, 11:45 AM | Updated Apr 12, 2024, 12:25 PM IST
Handling eggs and folding clothes are ordinary chores, a couple among the hundreds we do every day almost unthinkingly. If robots are to free humans of these chores, it stands to reason that robots must be trained in each of these small tasks. Such training makes a robot highly task-specific and unsuitable for other work.
But what if a robot can learn — nearly on the go? That is where Tesla's Optimus scores over other humanoid robots. And this ability is key to introducing millions of humanoid robots among us in the near future.
But before we understand how Tesla's Optimus 'learns', let's first look at some of the robot's other broad features.
In a video released in December 2023, Elon Musk’s Tesla unveiled the latest version of its general-purpose robot, called “Optimus Gen 2” (second generation). The robot wore a more polished and “human” look than in its earlier avatars — the exploratory prototype “Bumblebee” and the Tesla-designed platform “Optimus Gen 1.”
The video showed the humanoid robot capable of picking up an egg from a crate delicately using two fingers, transferring the egg gingerly from one hand to the other, and then laying it down precisely into an egg boiler.
This might count as a delicate task even for humans, especially the young ones.
Optimus is also able to do squats, walk more like a human than the early mechanised robots, gaze intensely at its dexterous hand in a way resembling the Marvel Comics character Thanos, and even dance as if it were letting its non-existent hair down at a club.
The robot, 5 feet and 8 inches tall and weighing 50 kilograms (kg), has reached this impressive stage, thanks to various incremental improvements made over many months.
Tesla took the design of two critical elements of a robot, the actuators and sensors, in-house. Actuators are responsible for moving and controlling a robot’s parts, while sensors provide robots with information — light, sound, temperature, proximity — about their surroundings. Together, they enable robots to interact with and adapt to their environment.
Earlier, for the exploratory prototype, the artificial intelligence (AI) and robotics company (as Tesla likes to describe itself) bought components off the shelf, forcing it to work with what was already available rather than customise components to its needs.
Optimus Gen 2 has been enhanced in several other ways: the neck moves up and down or side to side ("2 DoF", or two degrees of freedom) for smoother rotation; the hands are more mobile, enjoying 11 DoF in addition to arm articulation; the gait is more natural, with a 30 per cent rise in walking speed; the feet, designed on human foot geometry, sense the surface they walk on for a more adaptive response; and all fingers have tactile sensing, enabling the manipulation of delicate objects, like eggs.
“In a year, it will be able to thread a needle,” Musk said on X about how improved the hands are from an engineering perspective.
The robot is also 10 kg lighter, with better overall body control and balance.
Self Learning
While these specifications represent impressive progress in a relatively short period of time, Optimus has gained an edge over other advanced humanoid robots currently in development, such as Boston Dynamics’ “Atlas”, due to one particularly special feature: self-learning.
Optimus learns how to do things all by itself using end-to-end neural networks. “A neural network is a machine learning program, or model, that makes decisions in a manner similar to the human brain,” as per IBM. It does so “by using processes that mimic the way biological neurons work together to identify phenomena, weigh options and arrive at conclusions.”
Neural networks are a subset of machine learning and are integral to deep learning models. They feed on training data to learn and improve over time. Once they achieve a satisfactory level of accuracy, they can carry out their tasks at great speed. Google’s search algorithm is a popular example of a system powered by neural networks.
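The idea of a network improving with training data can be seen in miniature. The following sketch, which has nothing to do with Tesla's actual (far larger) models, trains a tiny one-hidden-layer network on the classic XOR problem; the error shrinks as the weights are adjusted, which is the "learn and improve over time" the passage describes.

```python
import numpy as np

# Illustrative only: a tiny neural network learning XOR from examples.
rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)  # inputs
y = np.array([[0], [1], [1], [0]], dtype=float)              # targets

W1 = rng.normal(0, 1, (2, 8)); b1 = np.zeros(8)   # hidden-layer weights
W2 = rng.normal(0, 1, (8, 1)); b2 = np.zeros(1)   # output-layer weights

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

losses = []
lr = 2.0
for step in range(5000):
    # Forward pass: weighted sums passed through nonlinearities,
    # loosely mimicking neurons working together.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    losses.append(np.mean((out - y) ** 2))

    # Backward pass: nudge every weight to reduce the error.
    d_out = 2 * (out - y) / len(X) * out * (1 - out)
    dW2, db2g = h.T @ d_out, d_out.sum(0)
    d_h = d_out @ W2.T * h * (1 - h)
    dW1, db1g = X.T @ d_h, d_h.sum(0)
    W2 -= lr * dW2; b2 -= lr * db2g
    W1 -= lr * dW1; b1 -= lr * db1g

print(f"loss at start: {losses[0]:.3f}, after training: {losses[-1]:.4f}")
```

The point is not the arithmetic but the shape of the process: no rule for XOR is ever written down; the behaviour emerges from examples.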
“We’ve designed, trained and deployed some of the first end-to-end neural nets for humanoid robots ever demonstrated to autonomously perform tasks requiring coordinated control of humanoid torso, arms, and full hands with fingers,” Milan Kovac, an engineer at Tesla, said on X.
The neural network in Tesla's Optimus is trained fully end-to-end, with video input and control output. The robot watches video content of humans carrying out physical tasks and mimics it over and over again until it gets it right. In this way, it is able to watch, learn, and improve by practice rather than act strictly according to specific pre-programming, which is the usual fare with humanoid robot training.
Put simply, not every one of Optimus’ actions is explicitly coded.
It can go beyond its ‘mandate’, as it were, making it versatile and adaptable. In contrast, Atlas, though advanced in many other ways, such as in its physical abilities, needs to be programmed for each task.
The Boston Dynamics robot is trained using a combination of traditional control approaches, machine learning algorithms, and rigid-body dynamics libraries.
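The contrast between the two approaches can be caricatured in a few lines of code. Everything below is invented for illustration; the names, dimensions, and "camera frame" are stand-ins, not any robot's real interfaces.

```python
import numpy as np

rng = np.random.default_rng(1)

# --- Traditional, per-task programming ---
def detect_object(frame):
    # An engineer writes explicit perception code for each task:
    # here we pretend the brightest pixel marks the object.
    return int(frame.argmax())

def plan_motion(object_pos):
    # ...and explicit motion rules for each task.
    return np.array([0.1 * object_pos, -0.1 * object_pos])

# --- End-to-end learned policy ---
# One network maps raw pixels straight to control outputs; its
# behaviour comes from training data, not task-specific code.
W = rng.normal(0, 0.01, (2, 64))  # learned weights: pixels -> joint commands

def end_to_end_policy(frame):
    return W @ frame

frame = rng.random(64)  # stand-in for one flattened 8x8 camera frame
pipeline_cmd = plan_motion(detect_object(frame))
learned_cmd = end_to_end_policy(frame)
```

In the first style, supporting a new task means writing new detection and planning code; in the second, it means collecting new training data and retraining the same network, which is exactly the versatility the article attributes to Optimus.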
Through self-learning, Optimus is able to sort objects by itself, for instance. In a popular demonstration, Optimus Gen 1 is able to pick up and place green blocks in a green tray and blue blocks in a blue tray by relying on its vision-based neural net on board.
The robot is able to sort blocks despite intentional trickery — a human keeps moving the blocks around on the table while the robot is in the middle of picking them up. Optimus is able to locate the blocks at their new locations, pick them up, and place them in the designated tray.
What’s more, it is even able to correct its course autonomously if something goes wrong; for instance, when a block flipped onto its side after being placed in the tray, the robot paused for a moment, picked it up, and set it upright. The implications are huge: the robot can respond to a dynamic, real-world environment without first being told (coded) to change.
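The logic of the demonstration, stripped of all the robotics, looks something like the loop below. This is a toy model for illustration; the block and tray representation is invented, and the real system perceives colour and pose from camera images rather than from labelled dictionaries.

```python
# Toy version of the sort-and-correct behaviour described above.
def sort_blocks(blocks):
    trays = {"green": [], "blue": []}
    for block in blocks:
        # Re-perceive each block at pick-up time, so a human moving
        # blocks around mid-task does not break the loop.
        colour = block["colour"]
        trays[colour].append(block)
        # Autonomous correction: if the block landed on its side,
        # notice the bad pose and stand it upright.
        if block["pose"] != "upright":
            block["pose"] = "upright"
    return trays

blocks = [
    {"colour": "green", "pose": "upright"},
    {"colour": "blue", "pose": "on_side"},  # will be corrected
    {"colour": "green", "pose": "upright"},
]
trays = sort_blocks(blocks)
print(len(trays["green"]), len(trays["blue"]))  # → 2 1
```

The crucial difference from this sketch is that Optimus was never given such a script: the check-and-correct behaviour emerged from training, which is why the course correction counts as news.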
“Just collect more data and we can train a new complex task without changing any code!” Tesla’s Julian Ibarz said on X.
In a reverse scenario, the robot can also unsort objects if trained to do so, or learn to do any other physical, manual task carried out by humans, for that matter. Its self-learning superpower enables the robot to acquire new skills and capabilities independently — a bit like us humans.
Tesla, with its manufacturing prowess and use of similar technologies in its electric vehicles, can make, train, and deploy autonomous humanoid robots in large numbers in the future to perform unsafe, repetitive, or boring tasks, in line with Musk’s vision.
Some of the earliest prototypes would likely be put to use at Tesla factories. A job post in March indicates that Tesla's humanoid robots are ready for trials in factories. In the future, the company might sell or lease these robots to whoever may want to make use of them.
Our society is set to look and work very differently in the years to come. At the very least, simple manual labour, whether at home, in the factory, or on the surface of the Moon, could be offloaded to self-learning robots.
Karan Kamble writes on science and technology. He occasionally wears the hat of a video anchor for Swarajya's online video programmes.