"Engineer's Metaverse": NVIDIA's deep learning technology makes 2D images instantly "stand up" to 3D

2021/04/27 20:19:07

Fasten your seat belts, your flat car is ready to go!

GANverse3D, developed by NVIDIA's AI research lab in Toronto as an extension to the NVIDIA Omniverse platform, converts flat images into photorealistic 3D models and places them in a virtual space where people can freely manipulate them.

GANverse3D has breathed new life into KITT, the AI car from the American TV series Knight Rider. The researchers fed a flat image of KITT into the model; by predicting a corresponding 3D textured mesh, GANverse3D generated a 3D model of the car that drives through a virtual space, complete with realistic headlights, taillights, and turn signals.

In addition, the NVIDIA Omniverse suite and NVIDIA PhysX tools can convert the predicted textures into high-quality materials, giving KITT a more realistic look and "feel".

(Source: NVIDIA)


With GANverse3D, architects, creators, game developers, and designers can easily convert flat designs into 3D models on a minimal rendering budget, even without any knowledge of 3D modeling.

"the police catch the thief"-style learning model

The basic principle behind GANverse3D is the Generative Adversarial Network (GAN).

The generative adversarial network is a deep learning model first proposed by Ian Goodfellow in 2014, and research on GANs has flourished ever since.

The GAN framework includes at least two modules: a generative model (the Generator) and a discriminative model (the Discriminator).

Goodfellow compares the principle to a contest between a counterfeiter (the Generator) who makes fake banknotes and a police officer (the Discriminator) who detects them. The counterfeiter constantly refines his technique to produce more convincing fakes, while the police must adopt ever more advanced methods to identify them. Through this confrontation, both sides continually sharpen their skills.


Accordingly, during GAN training, the task of the generator network G (Generator) is to produce realistic images that deceive the discriminator network D (Discriminator), while D's goal is to distinguish the images G generates from real ones. In this way, G and D play out a dynamic "game". Ultimately, in the ideal case, G produces fake images that D can no longer tell apart from the real thing.
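Goodfellow's original formulation makes this game precise: G and D jointly optimize a single minimax value function (standard notation, reproduced here for reference):

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]
```

D maximizes V by scoring real samples near 1 and generated samples near 0, while G minimizes V by making D(G(z)) large. At the ideal equilibrium, the generator's distribution matches p_data and D outputs 1/2 everywhere.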

In the original theory, G and D need not be neural networks; any functions that can fit the corresponding models will do. In practice, however, developers mostly use deep neural networks for both. A successful GAN application also requires a sound, effective training procedure, since the flexibility of neural network models can otherwise lead to poor output.
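As a concrete (and deliberately tiny) illustration of this alternating training scheme, the sketch below trains a GAN on a 1-D toy problem with NumPy: the "images" are just scalar samples from a Gaussian, the generator is an affine map on noise, and the discriminator is logistic regression. The whole setup, including the target distribution, hyperparameters, and the non-saturating generator loss, is an assumption for demonstration and has nothing to do with NVIDIA's actual models.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

# "Real" data: scalar samples from N(4, 1.25); the generator must learn to
# map standard-normal noise onto this distribution.
def sample_real(n):
    return rng.normal(4.0, 1.25, size=n)

w_g, b_g = 0.1, 0.0   # generator G(z) = w_g * z + b_g
w_d, b_d = 0.1, 0.0   # discriminator D(x) = sigmoid(w_d * x + b_d)

lr, batch = 0.05, 64
for step in range(2000):
    # Discriminator step: ascend log D(x_real) + log(1 - D(G(z))).
    x_real = sample_real(batch)
    z = rng.standard_normal(batch)
    x_fake = w_g * z + b_g
    d_real = sigmoid(w_d * x_real + b_d)
    d_fake = sigmoid(w_d * x_fake + b_d)
    g_real = d_real - 1.0   # d/d(logit) of -log D(x_real)
    g_fake = d_fake         # d/d(logit) of -log(1 - D(x_fake))
    w_d -= lr * np.mean(g_real * x_real + g_fake * x_fake)
    b_d -= lr * np.mean(g_real + g_fake)

    # Generator step: descend the non-saturating loss -log D(G(z)).
    z = rng.standard_normal(batch)
    x_fake = w_g * z + b_g
    d_fake = sigmoid(w_d * x_fake + b_d)
    g_logit = d_fake - 1.0  # d/d(logit) of -log D(x_fake)
    w_g -= lr * np.mean(g_logit * w_d * z)
    b_g -= lr * np.mean(g_logit * w_d)

fakes = w_g * rng.standard_normal(1000) + b_g
print("mean of generated samples:", fakes.mean())
```

Even in this toy setting, the two hand-derived gradient steps mirror the counterfeiter-and-police dynamic: D's parameters move to separate real samples from fakes, while G's parameters move the fakes toward whatever D currently finds most "real".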

NVIDIA uses generative adversarial networks to build its training datasets. Like a photographer walking around a parked car and shooting from different angles, the GAN produces multi-view images of an object. These multi-view images are then fed into an inverse-graphics rendering framework to synthesize 3D models.

Inverse graphics refers to the process of inferring a 3D mesh model from 2D images. Whereas previous inverse-graphics models relied on 3D shapes as training data, NVIDIA's GANverse3D is trained on real images. NVIDIA researcher Jun Gao said that AI models trained on real images "can generalize better to the real world."
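The core idea of inverse graphics, fitting 3D parameters so that their rendering matches 2D observations, can be shown with a toy that strips away all the deep learning: recover a disk's center and radius from two soft 1-D silhouette projections by gradient descent on a rendering loss. Everything here (the disk parameterization, the sigmoid "renderer", finite-difference gradients) is an assumption for illustration, far simpler than GANverse3D's textured-mesh prediction.

```python
import numpy as np

# Toy inverse graphics: infer a disk's center (cx, cy) and radius r from two
# 1-D orthographic "silhouette" views (projections onto the x and y axes).
u = np.linspace(-5.0, 5.0, 200)

def soft_silhouette(center, radius, k=8.0):
    # Differentiable 1-D silhouette: ~1 inside [center - r, center + r].
    return 1.0 / (1.0 + np.exp(-k * (radius - np.abs(u - center))))

def render(params):
    cx, cy, r = params
    return np.concatenate([soft_silhouette(cx, r), soft_silhouette(cy, r)])

true_params = np.array([1.0, -0.5, 2.0])
target = render(true_params)          # the "photographs" of the unknown object

def loss(params):
    return np.mean((render(params) - target) ** 2)

# Gradient descent with central finite-difference gradients.
params = np.array([0.0, 0.0, 1.0])    # deliberately wrong initial guess
eps, lr = 1e-4, 1.0
for _ in range(2000):
    grad = np.array([
        (loss(params + eps * e) - loss(params - eps * e)) / (2.0 * eps)
        for e in np.eye(3)
    ])
    params -= lr * grad
print("recovered (cx, cy, r):", params)
```

GANverse3D replaces this toy renderer with a differentiable graphics pipeline and the three scalars with a full textured mesh, but the optimization principle, adjusting 3D parameters until their renderings agree with the 2D views, is the same.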

Without the help of 3D elements, "we turned the GAN model into a very efficient data generator so we can create 3D objects based on any 2D image on the web," said Wenzheng Chen, a researcher at NVIDIA.

The GANverse3D research results will be presented at the ICLR conference in May and the CVPR conference in June.

Building a Metaverse for Engineers

NVIDIA CEO Jensen Huang: "The Metaverse of science fiction is near."


Many technology companies are drawing inspiration from the concept of the Metaverse, and NVIDIA is no exception: striking while the iron is hot, it has launched an engineer's edition of the Metaverse.

Jensen Huang announced at the 2021 GTC conference that NVIDIA will bring Omniverse, its real-time collaborative simulation platform, to the enterprise market as a virtual work platform for businesses.

NVIDIA launched the Omniverse beta as early as last October, and it has already been used in gaming, architecture, and other fields. More than 17,000 users from companies including BMW, Ericsson, Volvo, and Adobe have tried the beta.

Jensen Huang said in an earlier interview that Omniverse lets game developers easily handle complex pipeline work, improving their efficiency. On the Omniverse platform, the pipeline stages of conventional game production, such as animation, texturing, lighting, and geometry, are linked together in one stroke: everyone can see what everyone else is doing, and seeing is believing.

(Source: NVIDIA)

Creators in games, architecture, and design rely heavily on virtual worlds, and they need to test and visualize prototypes before their products are finished. But rendering even a single car or street requires capturing hundreds or thousands of multi-view images, at significant cost, so not every creator has the resources and energy to turn drawn images into 3D models.

For this reason, NVIDIA developed the GANverse3D application, which uses real-time ray tracing to quickly create realistic virtual worlds.

Any 2D image can be converted in Omniverse into 3D graphics that can be customized and animated.

NVIDIA deep learning engineer Jean-Francois Lafleche said: "Omniverse enables researchers to bring exciting cutting-edge research directly to creators and end users. As an extension to Omniverse, GANverse3D will help artists create richer virtual worlds for game development, urban planning, and even for training new machine learning models."
