
2025/04/05 20:17:37 digitals 1471

  The last GTC conference is not far behind us. Such an intensive conference reminds us of the debate over Moore's Law. To be precise, Moore's Law itself is not outdated, but computing power is now advancing at a pace that surpasses it, while bringing more innovative power to thousands of industries. What this GTC conference brings us are the latest achievements of the accelerated computing revolution.

  From September 19th to 22nd, the 2022 fall GTC conference took place. It brought together a large number of new achievements in AI, computer graphics, data science and more, allowing developers, researchers, corporate leaders, creators, IT decision makers and students to experience first-hand the transformative role of AI across industries and society as a whole.

  From the debut of the RTX 4090 graphics card, which pushes GPUs deeper into the RTX era, to the long-awaited Hopper-architecture servers, to a large language model cloud service expected to put LLMs in the hands of many more users, a series of innovations made this GTC conference full of surprises. Let us take a look at them below.

  The new surprise brought by the GeForce RTX 40 series

  What counts as a surprise? The newly released GeForce RTX 40 series does. Users had high expectations for the RTX 40 series, not only for its more flexible power consumption, but also because its pricing promised to surprise them.

  The RTX 4090 in this series is billed as the world's fastest gaming GPU, with 76 billion transistors, 16,384 CUDA cores and 24GB of high-speed GDDR6X memory, sustaining more than 100FPS in games at 4K resolution. Compared with the previous generation, the RTX 4090's performance in ray-traced games can be up to 4 times that of the RTX 3090 Ti. In rasterized games, the RTX 4090 also delivers up to 2 times the performance, while keeping the same 450W power consumption.
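  Since the RTX 3090 Ti and the RTX 4090 share the same 450W rating, the claimed speedups translate one-to-one into performance-per-watt gains. A minimal back-of-the-envelope sketch in Python, using the article's headline figures (the 30FPS baseline for the RTX 3090 Ti is a hypothetical illustration, not a quoted benchmark):

```python
# Back-of-the-envelope: what the claimed RTX 4090 speedups imply when
# power draw stays at 450W. Multipliers are from the article; the 30FPS
# baseline is purely illustrative.

TDP_W = 450                  # both RTX 3090 Ti and RTX 4090, per the article

baseline_fps = 30.0          # hypothetical RTX 3090 Ti ray-traced 4K figure
rt_speedup = 4.0             # claimed ray-tracing uplift
raster_speedup = 2.0         # claimed raster uplift

rt_fps = baseline_fps * rt_speedup
print(f"Implied ray-traced FPS: {rt_fps:.0f}")

# At equal wattage, perf-per-watt scales with the speedup itself.
rt_perf_per_watt_gain = (rt_speedup / TDP_W) * TDP_W
raster_perf_per_watt_gain = (raster_speedup / TDP_W) * TDP_W
print(f"Ray-tracing perf/W gain: {rt_perf_per_watt_gain:.1f}x")
print(f"Raster perf/W gain: {raster_perf_per_watt_gain:.1f}x")
```

  In other words, the headline claim is not just more frames but more frames per joule, which is why the unchanged 450W figure matters.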


  From another perspective, graphics cards are by no means just game display hardware; they bring broader creative power, accelerating applications such as AI and visual capture. With the GeForce RTX 40 series, NVIDIA can be said to have completely redefined the GPU it invented. A new era of real-time ray tracing and neural rendering, in which AI generates pixels, has arrived.

  NVIDIA founder and CEO Jensen Huang said in his keynote at the GTC conference: "The era of RTX ray tracing and neural rendering is in full swing, and the new NVIDIA Ada Lovelace architecture takes it to new heights."

  NVIDIA Hopper is officially unveiled

  In April this year, the NVIDIA Hopper architecture was officially announced, drawing much attention as the successor to the NVIDIA Ampere architecture launched two years earlier. The corresponding core was codenamed "GH100". With six major innovations, including the new chip itself, the Transformer Engine, second-generation secure Multi-Instance GPU, confidential computing, fourth-generation NVIDIA NVLink and DPX instructions, users in the fields of HPC and AI have been full of anticipation for Hopper-architecture products. Long before the official announcement, the core's specifications had already been dug up.

  At this GTC, NVIDIA announced that the NVIDIA H100 Tensor Core GPU is in full production, and NVIDIA's global technology partners plan to launch the first batch of products and services based on the pioneering NVIDIA Hopper architecture in October. For users in need, Hopper-architecture products will no longer be just a legend.


  H100-equipped systems from computer makers are expected to ship in the coming weeks, with more than 50 server models on the market by the end of this year and dozens more in the first half of 2023. Partners that have already built systems include Atos, Cisco, Dell Technologies, Fujitsu, GIGABYTE, HPE, Lenovo and Supermicro.

  In addition, the H100 has also begun to move to the cloud. AWS, Google Cloud, Microsoft Azure and Oracle Cloud Infrastructure will be the first to deploy H100-based cloud instances starting next year. Higher education and research institutions, including the Barcelona Supercomputing Center, Los Alamos National Laboratory, the Swiss National Supercomputing Centre (CSCS), the Texas Advanced Computing Center and the University of Tsukuba, will also adopt the H100 in their next-generation supercomputers.

  The H100 enables enterprises to cut the cost of deploying AI. Compared with the previous generation, at the same AI performance it delivers 3.5 times the energy efficiency, reduces total cost of ownership to one third, and uses one fifth as many server nodes. On the other hand, new application problems also arise.
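  The three H100 claims above can be laid side by side with simple arithmetic. A minimal sketch (the baseline node count, power and TCO figures are hypothetical illustrations in arbitrary units, not NVIDIA numbers):

```python
# Illustrate the article's H100 claims at equal AI performance versus
# the previous generation: 3.5x energy efficiency, total cost of
# ownership down to 1/3, server nodes down to 1/5. Baseline figures
# are purely illustrative.

baseline_nodes = 100            # hypothetical previous-generation node count
baseline_power_per_node = 1.0   # hypothetical, arbitrary units
baseline_tco = 300.0            # hypothetical, arbitrary units

h100_nodes = baseline_nodes / 5          # 1/5 the nodes for the same work
h100_tco = baseline_tco / 3              # 1/3 the total cost of ownership

# 3.5x energy efficiency at the same total performance means the whole
# deployment draws 1/3.5 of the energy for the same work.
baseline_energy = baseline_nodes * baseline_power_per_node
h100_energy = baseline_energy / 3.5

print(f"Nodes:  {baseline_nodes} -> {h100_nodes:.0f}")
print(f"Energy: {baseline_energy:.1f} -> {h100_energy:.2f}")
print(f"TCO:    {baseline_tco:.0f} -> {h100_tco:.2f}")
```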

  Although the H100's Transformer Engine technology can help enterprises quickly develop large language models with higher accuracy, as these models continue to grow, so does their complexity, and some now take months to train. To address this, large language models and deep learning frameworks such as NVIDIA NeMo Megatron, Microsoft DeepSpeed, Google JAX, PyTorch, TensorFlow and XLA are being optimized for the H100. Combined with the Hopper architecture, these frameworks can significantly improve AI performance and reduce the training time of large language models to days or even hours. Once the complexity problem is solved, the path to deploying Hopper-architecture products will be smoother.

  Large language model services are expected to become popular

  In the past few years, everyone from AI experts to the general public has been captivated by the remarkable output of large language models (LLMs). From descriptive input, these models can produce everything from convincing images to stories and poetry. However, research labs in academia, nonprofits and small companies find it difficult to create, study, or even use LLMs, because only the few industrial labs with the necessary resources and exclusive rights have full access to them.

  At this conference, NVIDIA released two new large language model (LLM) cloud AI services, the NVIDIA NeMo LLM Service and the NVIDIA BioNeMo LLM Service, which let developers easily adapt LLMs and deploy customized AI applications for content generation, text summarization, chatbots, code development, and the prediction of protein structure and biomolecular properties.

  With the NeMo LLM service, developers can quickly customize a range of pre-trained foundation models on NVIDIA-managed infrastructure, using a training method called prompt learning. The NVIDIA BioNeMo service is a cloud application programming interface (API) that extends LLM use cases beyond language to scientific applications, speeding up drug discovery for pharmaceutical and biotech companies.
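  The appeal of prompt learning is that the pre-trained model stays frozen; only a small set of "soft prompt" vectors prepended to every input is trained. The following toy NumPy sketch illustrates the idea with a frozen linear "model" and hypothetical data; it is a conceptual illustration, not NVIDIA's NeMo service API:

```python
import numpy as np

# Toy illustration of prompt learning ("prompt tuning"): the model
# weights W stay frozen; only a small soft-prompt vector prepended to
# every input is trained by gradient descent.

rng = np.random.default_rng(0)
D_PROMPT, D_INPUT = 4, 3

W = rng.normal(size=(1, D_PROMPT + D_INPUT))   # frozen "pre-trained" weights
X = rng.normal(size=(32, D_INPUT))             # hypothetical task inputs
# Hypothetical task: same input mapping, shifted by a constant that the
# prompt must learn to reproduce through the frozen weights.
y = X @ W[:, D_PROMPT:].T + 2.0

prompt = np.zeros(D_PROMPT)                    # the only trainable parameters
lr = 0.05

for _ in range(500):
    inputs = np.hstack([np.tile(prompt, (len(X), 1)), X])  # prepend prompt
    err = inputs @ W.T - y                     # frozen forward pass
    # Gradient flows only into the prompt slots; W is never updated.
    prompt -= lr * (err * W[:, :D_PROMPT]).mean(axis=0)

final_loss = float((err ** 2).mean())
print(f"final loss: {final_loss:.6f}")         # close to 0
```

  Because only the prompt vectors are updated, one frozen foundation model can serve many tasks, each with its own tiny learned prompt, which is why the process takes minutes to hours rather than a full retraining run.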


  With the NeMo LLM service, developers can use their own training data to customize foundation models ranging from 3 billion parameters up to Megatron 530B, one of the world's largest LLMs. The process takes minutes to hours, compared with the weeks or months needed to train a model from scratch.

  The BioNeMo LLM service includes two new BioNeMo language models for chemistry and biology applications. The service supports protein, DNA and biochemistry data, helping researchers discover patterns and insights in biological sequences.

  BioNeMo enables researchers to broaden their work by using models with billions of parameters. These larger models can store more information about protein structure and the evolutionary relationships between genes, and can even generate novel biomolecules for therapeutic use.

  Beyond tuning foundation models, the LLM services also offer ready-made and customized models through a cloud API. This gives developers access to a range of pre-trained LLMs, including Megatron 530B, as well as T5 and GPT-3 models created with the NVIDIA NeMo Megatron framework. The NVIDIA NeMo Megatron framework is now in open beta and supports a variety of application and multilingual needs. The era of widely available LLM services is no longer far away.

  Focusing on the smart car market

  Cars are becoming more and more like large smartphones. As intelligent applications multiply, the appetite for edge computing power keeps growing. NVIDIA's current flagship automotive system-on-chip is DRIVE Orin. In a conventional car, vehicle functions are controlled by dozens of electronic control units distributed throughout the car. Orin replaces these components by centralizing control of these core areas, simplifying automakers' already highly complex supply chains.

  DRIVE Orin is designed to be software-defined, so it can be continuously upgraded throughout the car's entire life cycle. NVIDIA DRIVE Orin has also been expanding further in the Chinese automotive market.


  QCraft announced its latest generation of automotive-grade, pre-installed, mass-production autonomous driving solutions built on NVIDIA DRIVE Orin, achieving the first deployment of an L4 passenger-car fleet in China. QCraft will work with T3 Travel to launch public Robotaxi operations in Suzhou in September, providing citizens with safe and efficient shuttle services, and has become the first company in the industry to deploy and operate a Robotaxi fleet based on DRIVE Orin.

  At the same time, XPeng Motors' latest flagship model, the ultra-fast-charging, fully smart SUV G9, officially launched in China and will be delivered to users in the fourth quarter. The fourth model in XPeng's smart electric vehicle lineup, the G9 is built on the NVIDIA DRIVE centralized computing platform with the DRIVE Orin system-on-chip (SoC), alongside XPeng's latest in-house technology. The DRIVE Orin hardware upgrade will help the XPeng G9 unlock more of the in-vehicle system's potential and improve XPeng's closed-loop, iterative development.

  Beyond the DRIVE Orin chip, at this GTC conference NVIDIA released DRIVE Thor, with 2000 TFLOPS of computing power in a single chip, roughly 8 times that of Orin and 14 times that of Tesla's FSD chip. DRIVE Thor is built for the car's centralized computing architecture; clearly, NVIDIA wants this new generation of chips to be the one chip that runs everything in the car.
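  The multipliers quoted above can be inverted to see what per-chip compute they imply for the chips Thor is compared against. A minimal sketch using only the figures in this article (note that such marketing comparisons often mix numeric precisions, so treat the implied values as rough):

```python
# Back out the per-chip compute implied by the article's comparison:
# DRIVE Thor at 2000 TFLOPS is said to be 8x Orin and 14x Tesla FSD.
thor_tflops = 2000.0

implied_orin = thor_tflops / 8     # compute implied for DRIVE Orin
implied_fsd = thor_tflops / 14     # compute implied for Tesla FSD

print(f"Implied Orin compute:      {implied_orin:.1f} TFLOPS")
print(f"Implied Tesla FSD compute: {implied_fsd:.1f} TFLOPS")
```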

  At this GTC conference, NVIDIA also announced more joint innovations with its partners. Among them, NVIDIA and Deloitte announced an expanded collaboration to help enterprises worldwide develop, implement and deploy hybrid-cloud solutions using NVIDIA AI and the NVIDIA Omniverse Enterprise platform; NVIDIA also expanded its cooperation with Booz Allen Hamilton to provide an AI-enabled, GPU-accelerated cybersecurity platform for public- and private-sector network customers.

  Reviewing this conference, we find that intelligence and computing power are two parallel themes. As the wave of intelligence sweeps across society, the vision of accelerated computing propels AI forward, which in turn benefits industries around the world. In this cycle, new ideas, new products and new applications keep emerging. The appeal of the 2022 fall GTC conference lies in providing a stage for these innovations.
