
On June 3 local time, Tesla CEO Elon Musk announced on Twitter that a humanoid robot prototype will be unveiled at Tesla AI Day on September 30. Called "Optimus", also known as "Tesla Bot", it is Tesla's most important product this year.
The emergence of humanoid robots can empower thousands of industries and represents the next wave of artificial intelligence scenarios. As the technology matures and is commercialized, it is expected to open up an unprecedented trillion-level blue ocean market.
For this issue's internal reference, we recommend the West China Securities report "Tesla Bot: AI's Star Sea", which analyzes the market prospects of humanoid robots.
Source: West China Securities
Original title:
"Tesla Bot: AI's Star Sea"
Author: Liu Zejing
1. A new chapter of AI: humanoid robots
Tesla may launch its first humanoid robot prototype, named "Optimus", on September 30, 2022. As early as August 19, 2021, at Tesla's AI Day, Musk proposed building humanoid robots to take over dangerous and boring tasks.
Musk's entry into the field of AI robots means that Tesla is not just an electric vehicle company but also an AI company. Musk also claims that, over time, Tesla's robots will become more important than its car business.
The Tesla robot can be broadly divided into two domains: the AI domain and the technical domain.
AI domain: The FSD computer serves as the core of its computing power, with eight Autopilot cameras as sensors, supporting deep learning, big data analysis, Dojo training, automatic labeling, and other algorithms.
Technical domain: The robot's head contains a screen for displaying information. The robot is built from lightweight materials, its limbs contain about 40 electromechanical actuators, and a force-feedback sensing system lets it walk smoothly and nimbly on two feet.
According to Musk, the robot is about 1.73 meters tall and weighs about 56.7 kg. It can pick up about 20.4 kg of goods and walk at a top speed of about 8 km/h.
The AI domain is the core of a humanoid robot, because a robot can only complete its designated tasks through continuous machine learning training. Tesla's humanoid robot is also the culmination of Tesla's autonomous driving technology. Because the robot's core shares the FSD system with intelligent driving, we expect many of the neural network systems built for intelligent driving to be applied to humanoid robots.
Data is the foundation of both intelligent driving and intelligent robots, and computing power drives machine learning and neural networks. As the volume of data Tesla processes grew exponentially, the company moved away from using Nvidia A100 GPU arrays as its training supercomputer because of power consumption. Instead, drawing on its strong vertical integration capabilities, it developed the Dojo D1 chip, which focuses on deep learning training. This is how Tesla's Dojo supercomputer came into being.
1. Brain: D1 chip
As the key unit of the Dojo supercomputer, the D1 chip delivers very high computing power and ultra-high bandwidth, and strikes a balance between space and time. The chip uses a distributed architecture and a 7-nanometer process, and packs 50 billion transistors and 354 training nodes. Its internal wiring alone is 17.7 kilometers long.
The Dojo supercomputer is a true "performance beast", with computing power of up to 9 PFLOPs. Its training module is composed of 1,500 D1 chips, with a total of more than 530,000 training nodes. Latency between adjacent chips is low, and combined with Tesla's own high-bandwidth, low-latency connectors, this makes it a world-leading supercomputer. Compared with the industry, it offers 4 times the performance at the same cost, 1.3 times the performance at the same energy consumption, and a fivefold reduction in footprint.
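The node count quoted above can be checked with simple arithmetic. The short Python sketch below uses only the figures given in the report (354 training nodes per D1 chip, 1,500 chips per training module); the variable names are illustrative.

```python
# Figures as quoted in the report; the names are illustrative.
NODES_PER_D1_CHIP = 354
CHIPS_PER_MODULE = 1_500

total_nodes = NODES_PER_D1_CHIP * CHIPS_PER_MODULE
print(f"Training nodes per module: {total_nodes:,}")  # 531,000
```

The product, 531,000, is consistent with the "more than 530,000 training nodes" stated in the report.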

Tesla Dojo D1 chip schematic
The Tesla Dojo D1 chip can be broken down into four main parts: CPU, Switch, MatMult, and SIMD.
The CPU, or central processing unit, is the computing and control core of the system and the unit that ultimately executes instructions for information processing and program operation.
The Switch is the bridge between chips and handles data transmission.
SIMD stands for single instruction, multiple data, and can be understood as a form of parallel computing. One controller drives multiple processing units, which achieves spatial parallelism. Put simply, a single instruction processes multiple data items at once.
MatMult is a computing unit dedicated to neural network calculations, which it accelerates. It is one of the fundamental reasons for the enormous computing power of Tesla's computer. This unit can be understood as an artificial intelligence chip, that is, an AI processor: a chip built specifically for machine learning algorithms and neural network operations, usable for both training and inference. Compared with contemporary CPUs and GPUs, it can achieve a 15-30 times performance improvement and a 30-80 times efficiency (performance per watt) improvement.
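To make the SIMD and MatMult descriptions concrete, here is a minimal pure-Python sketch. It only models the idea and does not reflect how the D1 chip actually implements these units; the function names are hypothetical. `simd_add` stands in for a single instruction applied to every element of its operands, and `matmul` shows the multiply-accumulate pattern that a matrix unit is built to accelerate.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def simd_add(a: Vector, b: Vector) -> Vector:
    """Model one 'add' instruction applied to all lanes of a and b."""
    if len(a) != len(b):
        raise ValueError("operands must have the same number of lanes")
    return [x + y for x, y in zip(a, b)]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Multiply an (n x k) matrix by a (k x m) matrix."""
    if any(len(row) != len(b) for row in a):
        raise ValueError("inner dimensions must match")
    cols = list(zip(*b))  # columns of b
    # Each output entry is a sum of products (multiply-accumulate).
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


lanes = simd_add([1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0, 40.0])
product = matmul([[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]])
print(lanes)    # [11.0, 22.0, 33.0, 44.0]
print(product)  # [[19.0, 22.0], [43.0, 50.0]]
```

In Python the list comprehension still runs element by element; real SIMD hardware performs the per-lane additions in parallel.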

Dojo D1 chip architecture
2. Soul: AI machine vision
Machine vision is an application and technical direction of AI deep learning. Both humanoid robots and intelligent driving are applications of machine vision.
Neural networks are an important class of algorithms for AI deep learning, covering a humanoid robot's entire process from recognition to generating instructions. A neural network is an artificial system that models and connects neurons, the basic units of the human brain, to simulate the functions of the brain; it offers intelligent information processing capabilities such as learning, association, memory, and pattern recognition. Neural networks are widely used in intelligent robots, mainly for object recognition, planning, hypothesis, and training/testing.
The most important feature of a neural network is that it can learn from its environment and store what it learns in the network's synaptic connections. Learning is a process: stimulated by the environment, sample patterns are fed into the network one after another at the input layer, and the weight matrix of each layer is adjusted according to a learning algorithm. When the weights of each layer converge to stable values, the learning process ends and the network produces its result at the output layer.
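The learning loop described above (feed samples in, adjust the weights by a learning rule, stop once the weights settle) can be sketched with a single linear neuron trained by gradient descent. This is a toy illustration, not Tesla's training procedure; the data, learning rate, and tolerance are assumptions chosen for the example.

```python
def train_neuron(samples, lr=0.05, tol=1e-9, max_epochs=100_000):
    """Fit y = w * x + b by gradient descent on mean squared error.

    Stops when one update changes both w and b by less than tol.
    """
    w, b = 0.0, 0.0
    n = len(samples)
    for epoch in range(1, max_epochs + 1):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in samples) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in samples) / n
        new_w, new_b = w - lr * grad_w, b - lr * grad_b
        converged = abs(new_w - w) < tol and abs(new_b - b) < tol
        w, b = new_w, new_b
        if converged:
            return w, b, epoch
    return w, b, max_epochs


# Toy samples generated from y = 2x + 1 (assumed for illustration).
data = [(x, 2.0 * x + 1.0) for x in (0.0, 1.0, 2.0, 3.0)]
w, b, epochs = train_neuron(data)
print(f"w={w:.3f}, b={b:.3f}, epochs={epochs}")
```

The learned weight and bias approach the values used to generate the samples. A real network repeats the same kind of update across many layers and far more weights.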
Tesla follows the same machine vision path for intelligent driving and humanoid robots. A complete training and testing (working) cycle includes five parts: sensing, perception, evaluation, planning, and actuation.

Tesla machine vision full process diagram
Tesla's best-known AI approach is its pure-vision solution for machine vision, which it carries over into its humanoid robots.
Image-based object detection: The goal is to determine whether an image contains any instance of a given target category, whether dynamic or static, and if so, to return the spatial location and extent of each instance. Object detection is the basis for more complex, higher-level visual tasks such as segmentation, scene understanding, object tracking, image description, event detection, and activity recognition.
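As a rough illustration of the object detection task, the sketch below checks whether any instance of a given category is present in a set of detections and, if so, returns each instance's location and extent as a bounding box. The `Detection` type, the `find_instances` helper, the score threshold, and the sample data are all hypothetical; a real detector would produce such records from an image using a trained neural network.

```python
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max) in pixels


@dataclass(frozen=True)
class Detection:
    category: str
    score: float  # detector confidence in [0, 1]
    box: Box


def find_instances(detections: List[Detection], category: str,
                   min_score: float = 0.5) -> List[Box]:
    """Return boxes of confident detections of `category` (empty if none)."""
    return [d.box for d in detections
            if d.category == category and d.score >= min_score]


# Made-up detector output for one image.
frame = [
    Detection("pedestrian", 0.92, (40, 60, 90, 200)),
    Detection("car", 0.88, (120, 80, 300, 190)),
    Detection("pedestrian", 0.31, (310, 70, 340, 150)),  # below threshold
]

print(find_instances(frame, "pedestrian"))  # [(40, 60, 90, 200)]
print(find_instances(frame, "bicycle"))     # []
```

An empty list means no instance of the category was found. Downstream tasks such as tracking or scene understanding would take the returned boxes as input.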


Schematic diagram of Tesla's process from 2D object recognition to 3D object recognition
2. Broad prospects, ready for takeoff
The AI wave has moved from smart cities to intelligent driving, and humanoid robots are expected to be the next application scenario for artificial intelligence.
Big Data Era: In 2016, AI defeated Ke Jie. At the same time, as basic computing power improved, China saw a new round of artificial intelligence enthusiasm, the big data era. Policy and capital led the way, and application scenarios gradually multiplied. Drones, AI translators, and similar products were launched one after another.
Intelligent Driving: With the explosion of data and the evolution of basic computing power and chips, Tesla Autopilot formally ushered in the era of intelligent driving through its complete functional definition, algorithms that learn continuously from data, and software upgrades delivered over the air (OTA). At the same time, technology giants such as Google, Baidu, Tencent, and Huawei have entered the market one after another, accelerating the development of intelligent driving. Supportive policies also continue to promote the commercial operation of autonomous driving. Chinese manufacturers have now made practical breakthroughs in smart cockpits, driving, and related areas, and the domestic ecosystem has great potential.
Humanoid Robot: Once humanoid robots are deployed, boring and repetitive tasks such as buying groceries and doing housework are likely to be taken over by them. We believe this is the next wave of artificial intelligence, and domestic companies are very likely to replicate the achievements they have made in intelligent driving.

Changes in the wave of artificial intelligence
According to McKinsey data, as AI continues to advance, around 375 million people worldwide are expected to need re-employment by 2030 because of technological breakthroughs in AI. In China, 12 million to 102 million people will need to be re-employed. The global average share of the labor force replaced is 15%; China, as a populous country, is roughly in line with the world at 16%.
In addition, Musk revealed that the actual cost of humanoid robots will not be very high and may be lower than that of cars. Andrew predicts that it will be $25,000, about RMB 160,000. The minimum selling price of the Tesla Model 3 is about 280,000 yuan, and Optimus is conservatively estimated to be priced at 200,000 yuan. In the long run, it is conservatively estimated that the global humanoid robot market can reach the trillion level by 2030, which would be another unprecedented blue ocean for AI after intelligent electric vehicles.
The emergence of humanoid robots can empower thousands of industries, and they are expected to take over tedious, highly repetitive, and monotonous work. Dangerous tasks such as search and rescue are also expected to be addressed, and scenarios such as express delivery, housekeeping, services, and industry are expected to be the first to see deployment. Humanoid robots are the next wave of AI scenarios. As the technology matures and is commercialized, it is expected to open up an unprecedented trillion-level blue ocean. With China's intelligent driving ecosystem gradually breaking through and maturing, humanoid robots are the natural next step, and domestic companies are very likely to replicate the achievements they have made in intelligent driving.
Manufacturers with self-developed AI processors can provide computing power for the neural networks of humanoid robots. Compared with AI algorithms, data is the top priority. Computing power is what drives accelerated data processing, and its importance is self-evident.
Machine learning can be divided into two stages according to its algorithmic steps: training and inference. The training stage requires an extremely large data input to support a complex neural network model. Because of the complex network structure and massive training data, the amount of computation during training is huge, which places extremely high demands on the processor's computing power and energy efficiency.
AI processor chips support deep neural network learning and accelerated computing, and offer order-of-magnitude performance improvements and much lower power consumption than GPUs and CPUs. The inference stage involves far less computation than training, but it still requires a large number of matrix operations, so artificial intelligence chips play a major role there as well.
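The gap between training and inference workloads can be made concrete with a back-of-the-envelope count of multiply-accumulate (MAC) operations for a small fully connected network. The layer sizes, dataset size, and epoch count below are invented for illustration, and the assumption that a backward pass costs about twice a forward pass is a common rule of thumb, not a figure from the report.

```python
def forward_macs(layer_sizes):
    """MACs for one forward pass through dense layers (biases ignored)."""
    return sum(n_in * n_out for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


def training_macs(layer_sizes, num_samples, epochs, backward_factor=2):
    """Rough MACs for training: (forward + backward) per sample per epoch."""
    per_sample = forward_macs(layer_sizes) * (1 + backward_factor)
    return per_sample * num_samples * epochs


layers = [1024, 512, 10]          # hypothetical network: input, hidden, output
inference = forward_macs(layers)  # one input, one forward pass
training = training_macs(layers, num_samples=100_000, epochs=10)

print(f"Inference, one sample: {inference:,} MACs")
print(f"Training run:          {training:,} MACs")
print(f"Ratio: {training // inference:,}x")
```

Even the single forward pass is a chain of matrix operations, which is why inference also benefits from dedicated hardware. The training run simply repeats that work, plus the backward pass, for every sample in every epoch.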
Deploying humanoid robots requires integrating data from downstream scenarios, and manufacturers with AI algorithm capabilities have a comparative advantage. Tesla has implemented a pure-vision solution in intelligent driving, and the related FSD system can be applied directly to the machine vision of humanoid robots.
However, for humanoid robots to succeed commercially, their data must be closely integrated with downstream segmented scenarios, and the robots must be trained iteratively with high-quality data and algorithms from those scenarios before they can provide valuable commercial services. Humanoid robot makers cannot directly access these segmented scenarios, whereas companies that have already commercialized AI algorithms have the advantage of being connected to them. The two sides can cooperate to jointly empower customers, thereby accelerating the commercialization of humanoid robots.
新大小时 believes that Tesla's humanoid robots are aimed at repetitive, boring, and dangerous environments and working conditions, and will eventually enter our homes to address the persistent shortage of labor once and for all. Although Tesla has a strong technical foundation in AI and robotics, whether Musk's bold claims come true remains to be seen in September.