At this year's AI Day, Elon Musk made the "Optimus" humanoid robot the centerpiece of the show. Rightly so: if it can be mass-produced at the $20,000 price Musk quoted, Optimus could have a profound impact on daily life and on society as a whole. But the most substantial news came from a less conspicuous part of the presentation: the Dojo supercomputer, which is likely to change the world sooner than a bipedal robot will.

Each Dojo tray holds six D1 processor tiles.

The first thing to emphasize is that Tesla is at heart a software company that happens to build the hardware to match its software. As the driving force behind the "software-defined car", Tesla was the first to bring computing systems and connectivity into the vehicle. This cuts costs, adds features, and makes system updates easier. Whatever its standing in other areas, the strongest hand Tesla holds over its competitors is its excellent software development capability.

The most important emerging capability in cars is autonomous driving, which is fundamentally a software problem. Tesla's FSD beta is controversial because it treats owners as test subjects, but just as people cannot learn to drive without going out on the road, autonomous cars need exposure to real situations to develop ways of coping with them. Companies developing autonomous driving systems can build simulations and test models from real-world data to speed up training. For FSD to truly work, though, it still has to be tested against the chaos of real traffic and have its responses improved accordingly.

This is where Dojo comes in. Tesla already uses a large supercomputer powered by Nvidia GPUs to process its FSD data and build stronger autonomous driving models. It contains 5,760 Nvidia A100 GPUs installed in 720 nodes of 8 GPUs each. With 1.8 exaflops of performance, it is one of the fastest supercomputers in the world. One of the system's key tasks is "auto-labeling": adding labels to raw data so that it can become part of the decision-making system. Although an autonomous vehicle performs some recognition on its own while driving, most sensor data has to be matched against a pre-built model of the world, which then triggers predefined actions for specific situations. Just as humans judge road conditions and respond based on past experience, autonomous vehicles rely on the driving experience captured in AI models to decide how to act.
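The cluster figures above can be sanity-checked with a few lines of arithmetic. This is only a rough sketch: the node and GPU counts come from the article, while the per-GPU peak of 312 teraflops (Nvidia's published dense FP16/BF16 Tensor Core figure for the A100) is an assumption brought in from outside the article, and the result is a theoretical peak, not measured performance.

```python
# Figures quoted in the article.
NODES = 720
GPUS_PER_NODE = 8

# Assumption (not from the article): Nvidia's published peak for one A100,
# dense FP16/BF16 on Tensor Cores, in teraflops.
A100_PEAK_TFLOPS = 312


def cluster_gpus(nodes: int, gpus_per_node: int) -> int:
    """Total GPUs in a cluster of identical nodes."""
    return nodes * gpus_per_node


def peak_exaflops(gpus: int, tflops_per_gpu: float) -> float:
    """Aggregate theoretical peak in exaflops (1 exaflops = 1e6 teraflops)."""
    return gpus * tflops_per_gpu / 1e6


total_gpus = cluster_gpus(NODES, GPUS_PER_NODE)
print(total_gpus)                                              # 5760
print(round(peak_exaflops(total_gpus, A100_PEAK_TFLOPS), 2))   # 1.8
```

Under that assumption, 720 nodes of 8 GPUs do add up to 5,760 GPUs and to a peak of roughly 1.8 exaflops, which matches the performance figure reported for the cluster.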

Each Tesla exapod consists of 10 cabinets, each holding two trays.

Dojo promises to speed up the improvement of these models significantly. At AI Day, Tesla claimed that just four Dojo cabinets could match the auto-labeling performance of 4,000 GPUs in 72 conventional racks, and it promised similar gains for other stages of autonomous driving model training. Tesla will deploy Dojo in clusters it calls "exapods", each made up of 10 cabinets, and plans to install seven of them in its Palo Alto data center. Each exapod delivers 1.1 exaflops. Once all seven are building AI models for Tesla's self-driving cars (and possibly for the Optimus robot too), total capacity will approach 8 exaflops.

Dojo's design is very different from that of traditional CPU- or GPU-based supercomputers. It is built from numerous "tiles" that are quite unlike an ordinary CPU or GPU. A CPU generally integrates multiple processing cores on a single chip, each able to perform complex software operations at high frequency. Today's mainstream CPUs, however, top out at 64 cores, and a single node holds at most 2 CPUs, or 128 cores. CPU-based supercomputers gather large numbers of such nodes in one system. Frontier, the world's fastest supercomputer, launched this year, has about 9,400 nodes and 602,112 CPU cores.
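The headline numbers in the two paragraphs above can be cross-checked with simple arithmetic. The sketch below takes the per-exapod figure as 1.1 exaflops and the Frontier core count as 602,112; the assumption that each Frontier node carries a single 64-core CPU is not stated in the article and is labeled as such in the code.

```python
# Figures quoted in the article.
EXAPODS_PLANNED = 7
CABINETS_PER_EXAPOD = 10
EXAFLOPS_PER_EXAPOD = 1.1

DOJO_CABINETS_FOR_LABELING = 4   # Tesla's auto-labeling claim
GPU_RACKS_FOR_LABELING = 72      # conventional racks said to be matched

FRONTIER_CPU_CORES = 602_112

# Assumption (not from the article): one 64-core CPU per Frontier node.
FRONTIER_CORES_PER_NODE = 64

total_cabinets = EXAPODS_PLANNED * CABINETS_PER_EXAPOD
total_exaflops = EXAPODS_PLANNED * EXAFLOPS_PER_EXAPOD
racks_per_cabinet = GPU_RACKS_FOR_LABELING / DOJO_CABINETS_FOR_LABELING
frontier_nodes = FRONTIER_CPU_CORES // FRONTIER_CORES_PER_NODE

print(total_cabinets)            # 70
print(round(total_exaflops, 1))  # 7.7
print(racks_per_cabinet)         # 18.0
print(frontier_nodes)            # 9408
```

Seven exapods at 1.1 exaflops each come to about 7.7 exaflops, which is where the "approaching 8" total comes from. The auto-labeling claim works out to one Dojo cabinet standing in for 18 GPU racks. The Frontier core count is consistent with roughly 9,400 nodes only if each node holds one 64-core CPU rather than two.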

Modern GPUs have far higher core counts. The recently released Nvidia GeForce RTX 4090 has 16,384 cores, and the Nvidia A100 used in Tesla's latest GPU-based supercomputer has 6,912. Unlike CPU cores, though, GPU cores can perform only very simple operations, but they do so extremely fast. That is why GPUs are so popular for AI and machine learning workloads, including building autonomous driving models. A typical node holds up to 8 GPUs, and Tesla's latest GPU-based cluster contains nearly 40 million GPU cores.

What is special about Dojo is that its D1 tile is not assembled from many small chips but is a single large chip with 354 cores, designed specifically for AI and machine learning. Six D1 tiles, plus supporting computing hardware, fit in one tray, and each cabinet holds two trays. That gives each cabinet 4,248 cores and a 10-cabinet exapod 42,480 cores. A CPU-based supercomputer has far fewer cores in the same space, while a GPU-based one has far more. But because Dojo is optimized specifically for AI and machine learning, it is orders of magnitude faster than a traditional CPU- or GPU-based supercomputer occupying the same data center space.
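The core counts quoted for the GPU cluster and for Dojo follow directly from the per-unit figures. The short sketch below simply multiplies them out, using the article's numbers as given (six D1 tiles per tray, or bay, and two trays per cabinet); the constant names are illustrative only. A GPU core and a D1 core are not equivalent units of work, so the ratio at the end compares counts, not performance.

```python
# GPU cluster, as described in the article.
GPUS_IN_CLUSTER = 5_760
CORES_PER_A100 = 6_912

# Dojo, as described in the article.
CORES_PER_D1 = 354
D1_PER_TRAY = 6
TRAYS_PER_CABINET = 2
CABINETS_PER_EXAPOD = 10

gpu_cluster_cores = GPUS_IN_CLUSTER * CORES_PER_A100
cores_per_cabinet = CORES_PER_D1 * D1_PER_TRAY * TRAYS_PER_CABINET
cores_per_exapod = cores_per_cabinet * CABINETS_PER_EXAPOD

print(f"{gpu_cluster_cores:,}")   # 39,813,120
print(f"{cores_per_cabinet:,}")   # 4,248
print(f"{cores_per_exapod:,}")    # 42,480

# Count ratio only: how many GPU cores the cluster has per exapod core.
print(round(gpu_cluster_cores / cores_per_exapod))  # 937
```

The GPU cluster's 39,813,120 cores are the "nearly 40 million" in the text, and the exapod total of 42,480 is smaller by a factor of about 937, which is why raw core counts say little about how the two designs compare on AI workloads.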

Tesla's "Optimus" robot will also benefit from Dojo's faster AI model processing.

Tesla aims to deploy the first Dojo exapod in the first quarter of 2023, but it remains unclear when the other six will follow. Once this level of processing power is in place, I expect Tesla's FSD model training to accelerate greatly, driving significant progress in autonomous vehicles. More than 160,000 Tesla owners around the world currently take part in the FSD beta, collecting real-world driving data for the company. Dojo exapods will use that data to build new models and push a steady stream of system updates to those 160,000 users, creating a virtuous cycle. If the results are good, the program will attract more testers, accelerating development further.

So we think the real big news from Tesla's AI Day 2022 was Dojo, not "Optimus". At the previous AI Day in 2021, Tesla announced the D1 chip's specifications and showed early samples. A year on, much has changed. Musk's claims are often too hyped to take at face value, but if Dojo really does start delivering next year, Tesla's FSD beta should iterate and improve faster, and the commercial rollout of autonomous driving may truly exceed our earlier expectations.