source: Zhongguancun Online
The first RTX 40 series graphics card to come out from under embargo has arrived, and the performance of the super-flagship "90"-class product has left everyone thrilled. Today we bring you our first review of a non-reference card: the Colorful iGame GeForce RTX 4090 Vulcan OC.
The launch order of the RTX 40 series is somewhat intriguing. In the past, although the "90"-class products were announced in the first batch, the "80"-class cards that usually represent the gaming flagship went on sale first; this time the RTX 4090 leads.
In fact, judging from the current situation, launching the RTX 4090 first and discontinuing the RTX 3090 and its "Ti" variant are both meant to protect the RTX 30 series products still on the market.
Of course, NVIDIA also has high expectations for the RTX 4090's performance this time and has provided a large number of games for DLSS 3 testing.
iGame GeForce RTX 4090 Vulcan OC
The on-sale time of this RTX 4090 coincides with the lifting of the review embargo, which means you can start trying to buy a card as soon as you see the first reviews. Colorful has released three RTX 4090 models: the iGame Vulcan OC, the iGame Neptune OC, and the Colorful Battle-Ax Deluxe at the reference price. Interested players can keep an eye out.
01 iGame GeForce RTX 4090 Vulcan OC Overview
In recent generations of graphics cards, iGame's design concepts have been interesting every time. Take the Vulcan series we are testing here: in the RTX 30 series it dropped the earlier armor design and worked cyberpunk elements, with their futuristic, high-tech style, into the card. The shroud has a strong sense of tension and crisp edges.
Cyberpunk era
In the RTX 40 series, although the Vulcan continues the cyberpunk style, the overall design evokes a collision between reality and fantasy: this is the post-cyberpunk era.
I believe everyone is familiar with the concept of cyberpunk. "High tech, low life" sums up people in the cyberpunk world well. Under the neon night sky, wet roads become reflective surfaces that highlight the city's distorted prosperity, and the whole city is full of fractures and dislocation.
Post-cyberpunk era
Although the post-cyberpunk era still inherits the basic pattern of "high tech, low life", its view of "high tech" is more realistic and its attitude toward "low life" is more optimistic. It is more about finding joy amid hardship than about anger. The most obvious feature of the post-cyberpunk era is that it drops many of the strongly clashing colors and instead relies mainly on black, white and gray, with a few colorful light sources to embellish the "lying flat" life.
Interestingly, the post-cyberpunk era is more of a parody of, and a self-deprecating take on, cyberpunk.
The recently popular game "Stray" is actually the best interpretation of the post-cyberpunk era. Its residents have never seen the real blue sky, and their yearning for nature lives on only in pictures left over from a hundred years ago.
"Stray"
While humans were enjoying the huge convenience brought by technology, the earth's climate became unfit for human habitation. To escape a plague, humans built an underground fortress divided, from top to bottom, into the top floor, the Midtown area, Antvillage, and the Dead City.
At the same time, because the underground fortress is sealed, people never see sunlight and can only retreat into a virtual-reality world to escape reality.
The biggest upgrade to the iGame Vulcan series in appearance comes down to two things: the shroud and the screen.
As mentioned earlier, this generation of Vulcan looks more realistic: it is the "reality" of a fortress built from reinforced concrete. The overall palette is black, white and gray, like the solid support people rely on most.
Fantasy, meanwhile, is the "smart screen" that the Vulcan has upgraded this time. It is no longer limited to flipping up on the card; it connects to the computer with a USB cable and can be placed anywhere. This closely resembles the screens seen everywhere in the cyberpunk era.
First, let's take a look at the accessories of this iGame GeForce RTX 4090 Vulcan OC. On the far left are the two screens; the smart screen is thoughtfully protected to avoid surface scratches that would spoil its looks. There are also a metal graphics card support bracket and a screwdriver.
Also included are the indispensable 16-pin to 4×8-pin power adapter cable, the light-sync and material-upload cable, and the magnetic base for the smart screen.
Although the recommended power supply for the reference RTX 4090 is 850W, the same as for the RTX 3090 Ti, an overclocked card plus the CPU and other power-hungry components makes the officially recommended 1000W power supply the safer choice, so check whether your own power supply is up to it.
In addition, iGame has launched a fun accessory: the iGame PC building-block set. Although it is just a small ornament shaped like a desktop PC, it is no easier to assemble than a miniature street-scene kit. Interested players may wish to check it out.
The iGame GeForce RTX 4090 Vulcan OC measures 348.5×159.5×70.4mm. That is relatively compact compared with the other partner RTX 4090 cards we have reviewed, especially in length. Even so, this RTX 4090 is still a giant, larger than any previous iGame graphics card. Although the shroud is mainly dark gray in a "concrete" style, metal elements are still visible, as if they were the last line of defense protecting humans inside a solid fortress.
For cooling, the iGame GeForce RTX 4090 Vulcan OC uses a Vortex cooler with three large 104mm fans for active cooling. The new "wind-gathering sickle ring" fan blades are the key to the improvement: the reinforced blades are joined by an outer ring to maximize cooling performance.
Inside the cooling module, the iGame GeForce RTX 4090 Vulcan OC uses flow-guiding fins and a generous configuration of nine 8mm heat pipes. A reflow soldering process bonds the heat pipes more tightly to the fins, improving heat transfer.
Vapor chamber technology
A vacuum vapor chamber is used inside. The ultra-flat sealed cavity is filled with coolant which, after absorbing heat, dissipates it through phase change. The vapor chamber is integrated with the heat pipes and fins into a single unit for better cooling efficiency.
For video outputs, the iGame GeForce RTX 4090 Vulcan OC uses the same four-port layout as the reference card: one HDMI 2.1 and three DP 1.4a.
As for the much-requested DP 2.0, most consumer gaming monitors do not yet support it, and DP 1.4a can already drive 8K 60Hz monitors. So overall it is definitely enough.
Colorful's iconic one-click overclocking button is naturally carried over to the RTX 40 series. The good mechanical feedback and the ice-blue backlight give it a real sense of ceremony.
To the left of the 16-pin auxiliary power connector is the material-upload and light-sync interface
The TDP of the iGame GeForce RTX 4090 Vulcan OC is officially rated at 550W/515W, fed by a single 16-pin auxiliary power connector. The reference card uses a 20+3 phase power design, while the iGame card uses 24+4 phases and runs at higher clocks, so the recommended 1000W power supply is no exaggeration.
Some power supply manufacturers have released high-end units built to the latest ATX 3.0 standard with a 16-pin 12VHPWR connector, and a single port can deliver up to 600W. So if nothing unexpected happens, the next generation of graphics cards will probably also be powered by a single 16-pin connector.
Although all graphics card manufacturers currently bundle an adapter cable, the clutter of four 8-pin cables is easy to imagine. If possible, an ATX 3.0 power supply is far tidier.
Note that the 12-pin connector and power adapters made for the RTX 30 series are incompatible with RTX 40 series graphics cards.
In addition, NVLink is no longer supported on RTX 40 series graphics cards, so the four-way Titan setups of the past cannot be reproduced.
Now let's look at the backplate. Through the cutout on the right of the backplate you can see a large number of cooling fins and heat pipes. The iGame GeForce RTX 4090 Vulcan OC also uses a shorter PCB to make room for the fins and improve overall cooling.
02 Smart Screen and iGame Center Software
With the release of the RTX 40 series, the iGame Center software has also been upgraded, and the interface layout of the new 2022 version is tidier.
The software homepage displays all hardware information in great detail; items such as the number of CUDA cores and the amount of video memory are clearly visible.
Under hardware control, you mainly adjust the lighting system, either globally or per component. The per-component control is where this review's focus, the smart screen, is configured.
The default lighting effects are actually very good this time, especially the GPU and CPU parameter display, which has a strong cyberpunk feel.
Under custom pictures, players can upload their own images. Thanks to the screen upgrade, the resolution has risen from the previous generation's 480×128px to 800×216px, for more visual impact.
On the side of the card, where the RTX 30 series had its flip-up screen, there is now a magnetic contact design that can hold either the iGame Vulcan light control component or the smart screen, horizontally or vertically.
Because the upgraded smart screen can sit on a base outside the case, this spot would otherwise be empty. When the smart screen is not on the card, the iGame Vulcan light control component can be attached there directly. Note, however, that the light control component is not compatible with the base.
The iGame Vulcan smart screen has magnetic contacts on its bottom and back, so it simply snaps on and is ready to use.
The advantage of the external base is that the screen can be placed anywhere on the desk as an ornament, and the upgraded high-resolution display looks better. Uploading ordinary pictures works without issue; below, I uploaded a video of NVIDIA Racer RTX.
Note, however, that it is best to use the material-upload cable included with the card; otherwise uploads take longer.
03 Who is Ada Lovelace?
Before looking at the NVIDIA Ada Lovelace architecture, let's start with Ada Lovelace herself. Compared with Ampère, she is probably less familiar to most people.
Ada Lovelace (1815-1852) was a British mathematician and a pioneer of computer programming. She established the concepts of loops and subroutines and is known as the world's first programmer.
Ada showed great talent for mathematics from childhood. Her mother was nicknamed the "Princess of Parallelograms" by her father, Lord Byron, and her later collaborator Charles Babbage called Ada the "Enchantress of Number". At the age of 19, Ada married William King, later Earl of Lovelace, and she remained enthusiastic about mathematics after her marriage.
From 1842 to 1843 she spent nine months translating an article on Babbage's Analytical Engine and added extensive notes, which explained in detail how the machine could compute Bernoulli numbers. As a result, Ada is widely regarded as the world's first programmer.
The language named after her, Ada, has been used by the US military to develop cutting-edge weapons such as fighter jets.
Even from these few lines of biography, it is clear that although Ada's life lasted fewer than 37 years, it was enough to be remembered by later generations.
This is why the slogan "saluting a legend with the future" was used in the launch publicity for the NVIDIA RTX 40 series. Let's look in detail at what this Ada Lovelace architecture innovates and improves.
04 NVIDIA Ada Lovelace architecture
The GeForce RTX 40 series is built on the brand-new NVIDIA Ada Lovelace architecture, using TSMC's 4N custom process (TSMC 4N NVIDIA Custom Process). The flagship AD102 core reaches a staggering 76 billion transistors, versus 28 billion for the RTX 30 series' GA102.
Compared with the previous-generation NVIDIA Ampere, NVIDIA Ada Lovelace delivers more than twice the performance at the same power. Shader throughput can reach up to 90 TFLOPS; the GeForce RTX 4090 released this time reaches 83 TFLOPS, versus only 40 TFLOPS for the previous generation.
The full AD102 core has 18,432 CUDA cores across 12 graphics processing clusters (GPCs), 72 texture processing clusters (TPCs) and 144 streaming multiprocessors (SMs), along with 144 third-generation ray tracing cores (RT Cores) and 576 fourth-generation Tensor Cores. In addition, the boost clock has risen from 1.9GHz to 2.5GHz.
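The TFLOPS figures above can be roughly reproduced from the core counts and clocks. The sketch below is only a back-of-the-envelope check, assuming the usual convention of 2 floating-point operations (one fused multiply-add) per CUDA core per clock; the function name is made up for illustration, and the RTX 4090 values (16,384 CUDA cores, 2520MHz reference boost) are the ones quoted elsewhere in this review.

```python
def fp32_tflops(cuda_cores: int, boost_ghz: float) -> float:
    """Peak FP32 throughput in TFLOPS.

    Assumption: each CUDA core retires one fused multiply-add
    (counted as 2 floating-point operations) per clock.
    """
    return cuda_cores * 2 * boost_ghz / 1000.0


# Full AD102 at the 2.5GHz boost clock mentioned in the text.
full_ad102 = fp32_tflops(18432, 2.5)
# RTX 4090 at its 2520MHz reference boost clock.
rtx_4090 = fp32_tflops(16384, 2.52)

print(f"Full AD102: {full_ad102:.1f} TFLOPS")
print(f"RTX 4090:   {rtx_4090:.1f} TFLOPS")
```

Under these assumptions the full core lands slightly above the "up to 90 TFLOPS" figure and the RTX 4090 rounds to the quoted 83 TFLOPS.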
One point not shown in the architecture diagram is that the AD102 core also contains 288 FP64 double-precision cores (2 per SM), included to ensure that FP64 code, including FP64 Tensor Core code, runs correctly.
Generally speaking, single-precision floating point is used for deep learning training, while double precision is used for numerical simulation. Gaming cards usually cut FP64 down, which saves cost and has no effect on games; professional cards keep FP64 for higher-precision training and computation.
The available information only mentions that the AD102 core carries 288 FP64 cores; it is not known whether this will change in later products.
Now that we understand the full AD102 core, let's look at the core of the RTX 4090. In fact, once we know the RTX 4090's parameters, we can roughly guess how a future "Ti" model might differ.
Compared with the full AD102, the RTX 4090 has 16,384 CUDA cores across 11 GPCs, 64 TPCs and 128 SMs, with 128 third-generation RT Cores and 512 fourth-generation Tensor Cores.
In fact, the full architecture diagram shows that the overall structure of the Ada architecture has not changed much. The SM makes this clear: the same FP32 CUDA cores, the same FP32/INT32 hybrid CUDA cores, the same L1 cache, and so on. Of course, the Tensor Cores inside each SM are upgraded to the fourth generation.
The most significant change, however, is the third-generation ray tracing core. Comparing the two generations: in the second-generation RT Core, the Box Intersection Engine handles bounding-box intersection tests and the Triangle Intersection Engine handles triangle intersection tests.
The third-generation RT Core adds two new engines: the Opacity Micro-Map (OMM) Engine and the Displaced Micro-Mesh (DMM) Engine. These two new hardware units can greatly improve ray tracing performance (the principles are described in detail later).
As before, every 2 SMs form a TPC, and every 6 TPCs form a complete top-level GPC (in some cores, 5 TPCs form a GPC). Each GPC has its own raster engine and two ROP partitions (each containing 8 ROP units).
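The unit counts quoted in this section follow directly from this hierarchy. The sketch below recomputes them as a consistency check; the per-SM constants are not stated in the article but are derived from its totals (18,432 CUDA cores and 576 Tensor Cores over 144 SMs), and the class and constant names are made up for illustration.

```python
from dataclasses import dataclass

SMS_PER_TPC = 2        # stated in the text: 2 SMs per TPC
CUDA_PER_SM = 128      # derived from the article's totals: 18432 / 144
TENSOR_PER_SM = 4      # derived from the article's totals: 576 / 144
ROPS_PER_GPC = 2 * 8   # two ROP partitions of 8 ROP units per GPC


@dataclass(frozen=True)
class Core:
    name: str
    gpcs: int
    tpcs: int

    @property
    def sms(self) -> int:
        return self.tpcs * SMS_PER_TPC

    @property
    def cuda_cores(self) -> int:
        return self.sms * CUDA_PER_SM

    @property
    def tensor_cores(self) -> int:
        return self.sms * TENSOR_PER_SM

    @property
    def rops(self) -> int:
        return self.gpcs * ROPS_PER_GPC


full_ad102 = Core("AD102 (full)", gpcs=12, tpcs=72)
rtx_4090 = Core("RTX 4090", gpcs=11, tpcs=64)

for core in (full_ad102, rtx_4090):
    print(core.name, core.sms, core.cuda_cores, core.tensor_cores, core.rops)
```

For the RTX 4090 this yields 128 SMs, 16,384 CUDA cores, 512 Tensor Cores and 176 ROPs, in line with the figures given in this review.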
I won't go further into the unit counts; after all, the overall layout is basically the same as NVIDIA Ampere's. Let's see what the Ada architecture has upgraded besides raw performance.
Shader Execution Reordering (SER)
SER's main job is to improve shader performance by dynamically reorganizing inefficient workloads into more efficient ones. The gains are especially large for ray tracing.
Simply put, a GPU is most efficient when it executes similar work together. But as ray tracing effects grow more ambitious, millions of rays may strike different materials in each scene, and different materials have different reflectivity and reflection behavior. This creates a large number of divergent, inefficient workloads for the shaders.
SER can regroup these scattered instructions and dynamically reorganize them into more efficient workloads. According to NVIDIA, SER can improve shader performance by up to 2 times and game frame rates by up to 25%.
To give a simple example: a primary ray traveling from the emitter to its first hit is very regular, but the secondary rays spawned after hitting an object scatter in many divergent, irregular directions, which puts a very heavy load on ray tracing. As the figure shows, SER re-sorts these instructions to maximize shader performance.
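The idea of grouping similar work can be illustrated with a toy model. This is not NVIDIA's implementation, only a sketch under simple assumptions: threads run in lockstep groups (warps) of 32, a warp has to run once for every distinct material shader among its rays, and the ray hits below are randomly generated, hypothetical data.

```python
import random

WARP_SIZE = 32  # threads that execute in lockstep on NVIDIA GPUs


def shader_passes(hit_shader_ids, warp_size=WARP_SIZE):
    """Count serialized passes: each warp runs once per distinct shader it contains."""
    passes = 0
    for start in range(0, len(hit_shader_ids), warp_size):
        warp = hit_shader_ids[start:start + warp_size]
        passes += len(set(warp))
    return passes


random.seed(0)
# Hypothetical secondary-ray hits: 4096 rays scattered over 8 material shaders.
hits = [random.randrange(8) for _ in range(4096)]

before = shader_passes(hits)          # divergent: almost every warp mixes all shaders
after = shader_passes(sorted(hits))   # reordered: rays needing the same shader sit together

print(f"passes before reordering: {before}")
print(f"passes after reordering:  {after}")
```

In this model, sorting brings the pass count down to little more than one per warp, which is the kind of coherence gain the reordering is meant to deliver.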
Fortunately, such a practical feature is not exclusive to the RTX 40 series. It is an easy-to-integrate SDK that game developers currently have to build into their games. And because the logic is general, it may be integrated directly into the Windows API in the future so that developers can call it through the system API without special handling.
SER is therefore good news for NVIDIA users with RTX 20 series or later cards (those that can enable ray tracing). After all, who doesn't like a free improvement in ray tracing performance?
The third generation RT Cores
The point of the third-generation RT Core is faster ray tracing compute. If enjoying 4K high-frame-rate games was a bit of a struggle on RTX 30 series cards, it becomes easy on the RTX 40 series.
On the GeForce RTX 4090, peak throughput reaches 191 RT-TFLOPS, versus 78 RT-TFLOPS for the fastest RTX 30 series card, about 2.4 times as much. Yet according to NVIDIA's official statement, the peak RT-TFLOPS of the third-generation RT Core is 2.8 times that of the previous generation. This suggests that the 4090 is not the final form of the Ada Lovelace architecture.
Opacity Micro-Map Engines (OMM)
The third-generation RT Cores introduce two important hardware units. The first is the Opacity Micro-Map Engine. Its main function is to optimize ray-traced rendering, and it can greatly reduce the workload of the shaders.
For complex objects such as leaves, each ray behaves differently depending on where it hits, and light bounces between the leaves, so the ray tracing workload is huge.
The Opacity Micro-Map Engine, however, can bake this information into opacity masks, so irregularly shaped and translucent objects can be rendered faster and more accurately, greatly reducing the shader workload.
Displaced Micro-Mesh Engines (DMM)
The Displaced Micro-Mesh Engine handles displaced micro-meshes. It can build the ray tracing BVH (bounding volume hierarchy) up to 10 times faster while using up to 20 times less video memory.
DMM is processed natively by the third-generation RT Core. Complex geometry is rendered from only a set of base triangles, which greatly reduces storage and processing requirements compared with previous generations.
The working principle is clear from the figure: the new DMM can reduce a complex model with a very large number of faces to a simple one while leaving the overall ray tracing result unchanged.
Some model data shows in detail how much the new DMM simplifies a model. An original model of 11 million triangles needs only about 150,000 micro-meshes after simplification; the BVH builds 8.5 times faster and is 6.5 times smaller.
And this is not the most extreme case. The more complex the model, the better the optimization. In the comparison examples shown officially, the best case builds more than 15 times faster and shrinks storage by 20 times.
Fourth-generation Tensor Cores
Beyond the ray tracing upgrades, the fourth-generation Tensor Core upgrade is even more striking. It adds a new FP8 tensor engine, and throughput on the GeForce RTX 4090 reaches 1.32 Tensor petaFLOPS, a 5-fold increase.
Note the unit here: petaFLOPS. One TFLOPS is a trillion floating-point operations per second, while one petaFLOPS is a quadrillion, that is, a thousand trillion.
DLSS 3: A New Era of Neural Network Rendering
DLSS 3 is also a major selling point of the RTX 40 series. The jump straight from DLSS 2.3 to version 3 shows how big this upgrade is, and NVIDIA officially calls DLSS 3 a new era of neural network rendering.
The new DLSS 3 adds Optical Multi Frame Generation, which generates entirely new frames, unlike the original DLSS Super Resolution.
DLSS 3 combines three technologies: DLSS Super Resolution, DLSS Frame Generation and NVIDIA Reflex. Together they can reconstruct seven-eighths of the displayed pixels and greatly improve performance.
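The seven-eighths figure can be reproduced with simple fractions. The two inputs below are assumptions that the article does not spell out: Super Resolution rendering at half the width and half the height of the output, and Frame Generation adding one generated frame for every rendered frame.

```python
from fractions import Fraction

# Assumption: Super Resolution renders at half width and half height,
# so only 1/4 of each output frame's pixels are rendered traditionally.
rendered_per_frame = Fraction(1, 2) * Fraction(1, 2)

# Assumption: Frame Generation inserts one generated frame per rendered frame,
# so only every second displayed frame is rendered at all.
rendered_frames = Fraction(1, 2)

rendered = rendered_per_frame * rendered_frames   # share of displayed pixels rendered
reconstructed = 1 - rendered                      # share produced by DLSS 3

print(f"rendered: {rendered}, reconstructed: {reconstructed}")
```

With these assumptions only one-eighth of the displayed pixels is rendered the traditional way, leaving seven-eighths to be reconstructed.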
In GPU-limited games, for example at 2K resolution and above, DLSS 2 can raise the frame rate by up to 2 times and DLSS 3 by up to 4 times.
DLSS 3 is a major version jump and another step forward in both concept and principle. "Guessing" one frame is simple to explain, but implementing it requires a great deal of inference and computation, as well as genuinely advanced ideas.
However, a frame generated "out of thin air" inevitably brings higher latency than DLSS 2. So the complete DLSS 3 bundles NVIDIA Reflex, which effectively helps reduce latency.
DLSS 3 lives up to the name NVIDIA gave it, "a new era of neural network rendering". Compared with the XeSS and FSR technologies currently on the market, DLSS can fairly be said to stand on the shoulders of giants. Of course, the hard part of years of innovation is that players with previous-generation cards who want to experience DLSS 3 frame generation currently have only one option: buy an RTX 40 series graphics card.
New Optical Flow Accelerator
The new Optical Flow Accelerator arrives together with the fourth-generation Tensor Cores, which is why frame generation in DLSS 3 is exclusive to RTX 40 series graphics cards.
Building on DLSS 2, the Optical Flow Accelerator computes the optical flow field between two consecutive frames. It captures the direction and speed at which the image moves from the first frame to the second, along with pixel-level information such as particles, reflections and lighting. Motion vectors and optical flow are computed separately to reconstruct shadows accurately.
Take "Cyberpunk 2077" as an example. In the first frame, the optical flow accelerator captures information such as particles, reflections and lighting for each pixel, then finds the matching pixel regions in the second frame and calculates the difference between the frames.
If DLSS 2 can "guess" the remaining pixels of a frame, DLSS 3 can additionally "guess" an entire extra frame.
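The matching step described above can be illustrated with a deliberately tiny model. This is not how DLSS 3 or the Optical Flow Accelerator is implemented; it is only a one-dimensional block-matching sketch with made-up frames and hypothetical function names: find the shift that best maps the first frame onto the second, then move the content half that distance to synthesize a frame in between.

```python
def estimate_shift(prev, curr, max_shift=3):
    """Return the integer shift that minimizes the mean absolute difference
    between curr[i] and prev[i - shift] over the overlapping pixels."""
    best_shift, best_cost = 0, float("inf")
    n = len(prev)
    for shift in range(-max_shift, max_shift + 1):
        diffs = [abs(curr[i] - prev[i - shift])
                 for i in range(n) if 0 <= i - shift < n]
        cost = sum(diffs) / len(diffs)
        if cost < best_cost:
            best_shift, best_cost = shift, cost
    return best_shift


def shift_frame(frame, shift, fill=0):
    """Move every pixel of the frame by `shift` positions, padding with `fill`."""
    n = len(frame)
    return [frame[i - shift] if 0 <= i - shift < n else fill for i in range(n)]


# Hypothetical 1-D "frames": a bright two-pixel object moves 2 pixels to the right.
prev_frame = [0, 0, 9, 9, 0, 0, 0, 0, 0, 0, 0, 0]
curr_frame = [0, 0, 0, 0, 9, 9, 0, 0, 0, 0, 0, 0]

flow = estimate_shift(prev_frame, curr_frame)    # estimated motion between the frames
middle = shift_frame(prev_frame, flow // 2)      # generated in-between frame

print("estimated flow:", flow)
print("generated frame:", middle)
```

Real frame generation works on two-dimensional images with per-pixel flow and combines it with the game's motion vectors, but the match-then-move idea is the same.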
In addition, because DLSS 3 frame generation runs on the GPU, AI can still raise the frame rate even when a game is CPU-bound. This is why the launch event claimed that DLSS 3 can break through CPU limits to increase frame rates.
Dual AV1 Encoder
The upgraded eighth-generation NVENC encoder is great news for people who work in live streaming, video and post-production. It adds support for AV1 encoding for the first time, and the most visible benefit is in live streaming.
Compared with traditional H.264 encoding, AV1 is on average 40% more efficient, so its image quality is better at the same bit rate. Currently, the resolution and clarity of most live streams are limited by the platform's maximum bit rate. Taking Twitch's 8Mbps limit as an example, at the same bandwidth and the same 2K 60 FPS, AV1 is visibly sharper than H.264.
Speaking of live streaming, I believe everyone is familiar with OBS. In an update due in October, OBS adds support for NVENC AV1 encoding.
Of course, live streaming is just the AV1 advantage that is easiest for us to see. AV1 encoding can bring great improvements to every part of video work.
As the figure shows, NVIDIA has laid out a complete ecosystem for users: from encoding APIs and software to platforms and players, AV1 encoding will be fully supported.
Let's also talk about the dual AV1 encoders that NVIDIA keeps emphasizing. As the name suggests, some graphics cards carry two encoders, and the benefit is obvious.
First, according to official figures, the RTX 4090 is 2.2 times as fast as the RTX 3090 Ti when exporting 4K H.265 and 2.5 times as fast when exporting 8K H.265. These gains also apply to Jianying (CapCut), the editor many people use. Interested users may wish to try it for themselves.
Besides export speed, recording 8K 60 FPS video was simply unimaginable in the past. The advantage of dual encoders is that the image can be split in two: each encoder processes a 7680×2160 half, and the halves are then stitched back together seamlessly.
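The split described here is easy to put into numbers. The sketch below only illustrates the arithmetic of dividing an 8K frame into two horizontal strips; it makes no claim about how NVENC schedules the work internally, and the helper name is hypothetical.

```python
WIDTH, HEIGHT, FPS = 7680, 4320, 60   # 8K at 60 frames per second


def split_rows(height: int, encoders: int = 2):
    """Split a frame into equal horizontal strips, one per encoder.

    Returns a list of (first_row, row_count) tuples.
    Assumes the height divides evenly by the number of encoders.
    """
    rows = height // encoders
    return [(i * rows, rows) for i in range(encoders)]


strips = split_rows(HEIGHT)
for index, (first_row, rows) in enumerate(strips):
    pixels_per_second = WIDTH * rows * FPS
    print(f"encoder {index}: rows {first_row}-{first_row + rows - 1}, "
          f"{WIDTH}x{rows}, {pixels_per_second / 1e6:.0f} Mpixel/s")

# Stitching back together: the strips must cover every row exactly once.
covered = sum(rows for _, rows in strips)
print("rows covered:", covered)
```

Each encoder then handles a 7680×2160 strip, roughly a billion pixels per second at 60 FPS, instead of one encoder having to handle twice that.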
Most users may not notice the encoder, but the day you want to record your screen and find that your graphics card does not support it, you will realize its importance...
As video moves into the ultra-high-definition era, hardware encoding and rendering have become almost indispensable. Although hardware encoding still does not match CPU software encoding in quality, software encoding pays for its ultimate picture quality with enormous encoding times.
For an 8K render, the time gap between the two encoding methods can reach several hours, even for a CG animation only 10 seconds long. As hardware encoding keeps advancing, the limits of quality and time are constantly being pushed.
05 Introduction to the test platform
First, the test platform. To let the monstrous iGame GeForce RTX 4090 Vulcan OC perform at its best, our platform has been fully updated again.
However, since we have no flagship processor on hand, we used this generation's mid-to-high-end product and focused the upgrade on the power supply: a Xingu (Segotep) 1250W Gold-rated fully modular unit.
First, the GPU-Z readout. The iGame GeForce RTX 4090 Vulcan OC uses the AD102 core, built on TSMC's 4N custom process (TSMC 4N NVIDIA Custom Process). The die area is 608 square millimeters, smaller than the 628 square millimeters of the RTX 30 series GA102.
It has 16,384 CUDA cores, 52% more than the 10,752 of the RTX 3090 Ti. The boost clock reaches 2625MHz, a clear step up from the reference card's 2520MHz.
It uses 24GB of Micron GDDR6X memory on a 384-bit bus, for 1008.4 GB/s of memory bandwidth, along with 176 ROPs and 512 texture units.
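The bandwidth figure follows from the bus width and the per-pin data rate. The data rate is not given in this article; the sketch assumes the RTX 4090's nominal 21 Gbps GDDR6X, which is why the result is a round 1008 GB/s rather than the 1008.4 GB/s that GPU-Z derives from the exact memory clock.

```python
def memory_bandwidth_gbs(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak memory bandwidth in GB/s: bytes moved per transfer times transfers per second."""
    return bus_width_bits / 8 * data_rate_gbps


# Assumption: 21 Gbps effective data rate per pin (nominal value, not from the article).
bandwidth = memory_bandwidth_gbs(bus_width_bits=384, data_rate_gbps=21.0)
print(f"{bandwidth:.1f} GB/s")
```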
06 Theoretical performance test
The 3DMark Fire Strike suite below measures theoretical DX11 performance: FS, FSE and FSU correspond to 1080p, 2K and 4K respectively. The measured results are as follows:
In the 3DMark Fire Strike suite, which measures DX11 performance, the improvement of the iGame GeForce RTX 4090 Vulcan OC is striking. The higher the resolution, the greater the gain: FS is up 49%, FSE 67% and FSU a dramatic 78%.
Overall, across the entire Fire Strike suite, the iGame GeForce RTX 4090 Vulcan OC is about 65% faster than the GeForce RTX 3090 Ti.
In the Time Spy and Time Spy Extreme tests under DX12, the iGame GeForce RTX 4090 Vulcan OC improves on the GeForce RTX 3090 Ti by 57% in TS and 69% in TSE, about 63% overall.
Port Royal is 3DMark's dedicated ray tracing test. Here the iGame GeForce RTX 4090 Vulcan OC is about 56% faster than the GeForce RTX 3090 Ti.
Overall, the theoretical performance of iGame GeForce RTX 4090 Vulcan OC has improved by about 61% compared to the GeForce RTX 3090 Ti.
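The summary percentages in this section are consistent with plain arithmetic means of the individual results. The sketch below reproduces them under that assumption; the article does not state its averaging method, so this is an inference from the numbers.

```python
from statistics import mean

# Per-test gains over the GeForce RTX 3090 Ti, in percent, as quoted above.
fire_strike = [49, 67, 78]   # FS, FSE, FSU
time_spy = [57, 69]          # TS, TSE
port_royal = 56

fs_avg = mean(fire_strike)
ts_avg = mean(time_spy)
# Assumption: the overall figure averages the three rounded suite results.
overall = mean([round(fs_avg), round(ts_avg), port_royal])

print(f"Fire Strike suite: {fs_avg:.1f}%")
print(f"Time Spy suite:    {ts_avg:.1f}%")
print(f"Overall:           {overall:.1f}%")
```

Rounded to whole percentages, these come out at 65%, 63% and 61%, matching the figures in the text.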
iGame GeForce RTX 4090 Vulcan OC DLSS 3 4K
In this test, we used a beta version of 3DMark to test DLSS 3. At 4K resolution, the result is 52.34 FPS with DLSS off and 156.56 FPS with DLSS 3 on.
RTX 3090 Ti DLSS 2 4K
We also tested the GeForce RTX 3090 Ti with the same program: 32.73 FPS with DLSS off. Since it does not support DLSS 3, the result with DLSS 2 is 83.63 FPS.
With DLSS 3 turned on, the iGame GeForce RTX 4090 Vulcan OC gains 199% over DLSS off, while the GeForce RTX 3090 Ti gains 155% with DLSS 2 turned on.
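These percentages come straight from the 4K frame rates quoted above. A minimal sketch of the calculation, with a made-up helper name; the quoted whole-number figures appear to drop the decimals rather than round them.

```python
def gain_percent(fps_off: float, fps_on: float) -> float:
    """Relative frame rate increase, in percent, from turning DLSS on."""
    return (fps_on / fps_off - 1) * 100


rtx_4090_gain = gain_percent(52.34, 156.56)     # DLSS off vs DLSS 3 at 4K
rtx_3090_ti_gain = gain_percent(32.73, 83.63)   # DLSS off vs DLSS 2 at 4K

print(f"RTX 4090 with DLSS 3:    +{rtx_4090_gain:.1f}%")
print(f"RTX 3090 Ti with DLSS 2: +{rtx_3090_ti_gain:.1f}%")
```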
Of course, what is most striking about DLSS 3 goes beyond these numbers. Let's take a look at the next picture.
iGame GeForce RTX 4090 Vulcan OC DLSS 3 8K
In the DLSS 3 test at 8K (7680×4320) resolution, the iGame GeForce RTX 4090 Vulcan OC managed only 12.66 FPS with DLSS off, too low to play normally. With DLSS 3 on, it reached a smooth 87.24 FPS, an increase of 582%!
The DLSS tests honestly shocked me. It is easy to see that the higher the resolution, the greater the frame rate gain. 8K at 60 FPS is no longer out of reach for current graphics cards, and at 4K even AAA games can reach esports-level frame rates. We will test games in detail later.
07 Regular gaming performance test
Because DLSS 3 is new to the RTX 40 series, it is tested separately later. Here we still use several mainstream AAA titles to compare gaming performance.
First, "Forza Horizon 5" shows a clear CPU limit not only at 1080p but even at 2K. As a standard AAA game it still runs at 135 FPS at 4K, which was unimaginable before.
In terms of performance, the iGame GeForce RTX 4090 Vulcan OC improves on the GeForce RTX 3090 Ti by 35% at 1080p, 39% at 2K and 59% at 4K, for a combined gain of 44%.
In "Assassin's Creed: Valhalla", the iGame GeForce RTX 4090 Vulcan OC improves on the GeForce RTX 3090 Ti by 49% at 1080p, 53% at 2K and 44% at 4K, for a combined gain of 49%.
In "Borderlands 3", the improvements of iGame GeForce RTX 4090 Vulcan OC compared to GeForce RTX 3090 Ti are: 51% improvement in 1080p; 64% improvement in 2K; 70% improvement in 4K, and 62% improvement in overall.
The ray tracing benchmark for "Bright Memory: Infinite" is a standalone tool, separate from the game. It uses more ray tracing effects than the game itself, and the test runs at "RTX highest/DLSS quality". The benchmark frame rate is therefore relatively low, while the actual game's hardware requirements are quite modest. In terms of performance, the iGame GeForce RTX 4090 Vulcan OC improves on the GeForce RTX 3090 Ti by 42% at 1080p, 58% at 2K, and 67% at 4K, or 56% overall.
In another Chinese-developed game, "Boundary", the situation is basically the same as in "Bright Memory: Infinite", and the test likewise runs at "RTX highest/DLSS quality".
In "Boundary", the iGame GeForce RTX 4090 Vulcan OC improves on the GeForce RTX 3090 Ti by 49% at 1080p, 67% at 2K, and 81% at 4K, or 66% overall.
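The "overall" figure quoted for each test appears to be the simple arithmetic mean of the three per-resolution gains, rounded to the nearest whole percent. That is an inference from the numbers, not something the review states. The short Python check below uses only the percentages quoted above; the variable and function names are illustrative.

```python
# Gains (percent) of the RTX 4090 card over the RTX 3090 Ti, in the order
# the five tests are discussed above: (1080p, 2K, 4K, quoted overall).
quoted = [
    (35, 39, 59, 44),
    (49, 53, 44, 49),
    (51, 64, 70, 62),
    (42, 58, 67, 56),
    (49, 67, 81, 66),
]


def mean_gain(gains):
    """Arithmetic mean of the gains, rounded to the nearest whole percent."""
    return round(sum(gains) / len(gains))


# Pair each computed mean with the overall figure quoted in the text.
results = [(mean_gain(row[:3]), row[3]) for row in quoted]

for computed, stated in results:
    print(f"computed {computed}%  quoted {stated}%")
```

Under this assumption every computed mean matches the quoted overall figure, so each resolution carries equal weight in the summary number.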
08 DLSS 3 Performance Test
With the launch of DLSS 3, 35 games will add DLSS 3 support in the near future. We obtained beta versions of some of them for this review.
In addition, "Super People", "Loopmancer", "Fuyun Court", "Microsoft Flight Simulator", and "A Plague Tale: Requiem" will release versions that support DLSS 3 in October.
Of these, we ran DLSS 3 tests on "Cyberpunk 2077", "F1 22", "A Plague Tale: Requiem", "Microsoft Flight Simulator", and "Justice". Unity and Unreal Engine also provided test programs.
The DLSS 3 test charts are fairly dense, because they add 1% Low FPS and latency measurements. Ordinary FPS is easy to understand, so what does 1% Low FPS mean?
First, the FPS that a game benchmark usually reports is the average frame rate over a period of time. 1% Low FPS sorts the frame rates recorded over that period from highest to lowest, takes the lowest 1%, and averages them.
Simply put, neither value tells us how the game feels at any particular moment. Average FPS describes the overall picture, while 1% Low FPS averages the worst cases and is therefore the more conservative figure.
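The two metrics can be sketched in a few lines of Python. This is a minimal illustration that follows the description above literally, working on a list of per-sample frame rates; real capture tools typically work from per-frame times and may define the metric slightly differently. The sample data and function names are hypothetical.

```python
def average_fps(fps_samples):
    """Mean frame rate over the whole capture."""
    return sum(fps_samples) / len(fps_samples)


def one_percent_low(fps_samples):
    """Average of the lowest 1% of samples (at least one sample)."""
    ordered = sorted(fps_samples, reverse=True)  # highest to lowest
    count = max(1, len(ordered) // 100)          # size of the lowest 1%
    worst = ordered[-count:]                     # tail of the sorted list
    return sum(worst) / len(worst)


# Hypothetical capture: 198 samples at 120 fps plus two stutters at 30 fps.
samples = [120.0] * 198 + [30.0] * 2

print(f"average FPS: {average_fps(samples):.1f}")
print(f"1% Low FPS:  {one_percent_low(samples):.1f}")
```

In this made-up capture the average barely moves because of the two stutters, while the 1% Low figure drops to the stutter level, which is why the two values are reported side by side.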
Now that 1% Low FPS is clear, let's look at the chart again. Latency is on the left side of the axis (lower is better) and frame rate is on the right side (higher is better). Because the two sides run in opposite directions from the axis, their scales may differ.
FrameView reports results to three decimal places. For readability, frame rates are rounded to whole numbers here and latency is kept to one decimal place. Since every game currently tested with DLSS 3 is a beta version, bugs are inevitable.
In "Microsoft Flight Simulator", the scores are almost unchanged whether DLSS 2 is on or off. This game is extremely CPU-intensive. When the processor is the bottleneck, traditional DLSS 2 cannot deliver much of a frame rate gain.
With DLSS 3, we can clearly see a significant increase in frame rate. Keep in mind that all our DLSS 3 tests are performed at 4K resolution.
However, frame generation is not without drawbacks, which is why this test adds latency measurements. Turning on DLSS 3 also enables NVIDIA Reflex. In actual play, the latency added relative to DLSS 2 is hard to notice.
Take "Cyberpunk 2077". With ray tracing at its highest setting and DLSS off, even the iGame GeForce RTX 4090 Vulcan OC manages only 43 fps, and latency reaches 85.3 milliseconds.
With DLSS 3 on, the frame rate is 129 fps, an increase of 200%. Latency is about 6 milliseconds higher than with DLSS 2, but it remains lower than with DLSS off.
"A Plague Tale: Requiem" is an upcoming game. Here DLSS 3 raises the frame rate by 128% over DLSS off. In this game the latency of DLSS 3 is 18.4ms higher than that of DLSS 2, but it is still much lower than with DLSS off.
The "F1 22" data currently has problems as well: no latency data is available for either DLSS off or DLSS 2.
For this group we therefore look mainly at the frame rate gain. DLSS 3 is 121% faster than DLSS off and 57% faster than DLSS 2.
Finally, there is the ray tracing test of the Chinese-developed game "Justice". The test demo we selected uses true global illumination.
When I tried turning DLSS off, the computer crashed and restarted the first time. The second time it ran, but the frame rate was in single digits and the latency reading was in the tens of thousands.
Recall that "Bright Memory: Infinite" and "Boundary", tested with dedicated ray tracing benchmarks, can reach about 80 fps with DLSS 2 alone. The true global illumination of "Justice" reaches only about 42 fps with DLSS 2 on, which is daunting.
In addition, the earlier DLSS 3 test reached about 80 fps at 8K resolution. In the "Justice" ray tracing test, DLSS 3 reaches only 69 fps at 4K resolution.
After testing, I couldn't help asking: is global illumination really the future of games? NVIDIA's official promotional video shows extremely realistic game scenes, but the hardware requirements are beyond imagination.
Personally, I think global illumination games will be hard to popularize, at least in the short term, unless the next-generation architecture brings major changes.
Of course, we also compared image quality. In the picture above, we captured a character in "Cyberpunk 2077". In both DLSS modes there is almost no visible change from the original image quality, and the lighting and shadows differ only at the fence. Given such a large frame rate gain, this difference is almost negligible.
RTX 3090 Ti: real-time frame rate 39 fps
iGame GeForce RTX 4090 Vulcan OC: real-time frame rate 77 fps
Unity's test program includes a real-time frame rate comparison with ray tracing and DLSS. With DLSS 3 on, the iGame GeForce RTX 4090 Vulcan OC runs at 77 fps; with DLSS 2 on, the RTX 3090 Ti runs at 39 fps. That is a gain of about 97%.
DLSS Off: 70 fps
DLSS 2: 121 fps
DLSS 3: 160 fps
The UE5 test program conveniently includes a quick DLSS test with three modes: DLSS off (Super Resolution off + Frame Generation off + Reflex off); DLSS 2 (Super Resolution Performance + Frame Generation off + Reflex on); and DLSS 3 (Super Resolution Performance + Frame Generation on + Reflex on).
The iGame GeForce RTX 4090 Vulcan OC runs at 70 fps with DLSS off, 121 fps with DLSS 2, and 160 fps with DLSS 3. However, DLSS 3 latency in this UE5 test is 58.6ms against 19.9ms for DLSS 2, which is relatively high.
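To put the UE5 trade-off in numbers, the short Python sketch below derives the relative frame rate gains and the added latency from the figures quoted above. Only the quoted values come from the review; the dictionary layout and helper name are illustrative.

```python
# Figures quoted for the UE5 test program on the RTX 4090 card.
fps = {"DLSS off": 70, "DLSS 2": 121, "DLSS 3": 160}
latency_ms = {"DLSS 2": 19.9, "DLSS 3": 58.6}


def percent_gain(new, old):
    """Relative increase of new over old, in percent."""
    return (new / old - 1) * 100


gain_dlss2 = percent_gain(fps["DLSS 2"], fps["DLSS off"])
gain_dlss3 = percent_gain(fps["DLSS 3"], fps["DLSS off"])
gain_3_over_2 = percent_gain(fps["DLSS 3"], fps["DLSS 2"])
added_latency = latency_ms["DLSS 3"] - latency_ms["DLSS 2"]

print(f"DLSS 2 vs off:    +{gain_dlss2:.0f}% fps")
print(f"DLSS 3 vs off:    +{gain_dlss3:.0f}% fps")
print(f"DLSS 3 vs DLSS 2: +{gain_3_over_2:.0f}% fps, +{added_latency:.1f} ms latency")
```

On these numbers, frame generation adds roughly a third more frames over DLSS 2 in this test while latency nearly triples, which is the cost the text refers to.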
09 Professional software test
As a "90" class graphics card with 24GB of video memory, it has to handle content creation workloads as well. We use SPECviewperf 13, a professional graphics benchmark, for scoring.
The cards compared are the iGame GeForce RTX 4090 Vulcan OC, the previous-generation flagship GeForce RTX 3090 Ti, and the previous-generation gaming flagship GeForce RTX 3080 Ti.
The SPECviewperf 13 results still show many problems. Performance gains vary from one professional application to the next, and a newly launched graphics card inevitably has some software compatibility issues.
3ds Max even showed a regression, so we will retest after the software is updated. Still, some scores do reflect the strength of the iGame GeForce RTX 4090 Vulcan OC: CATIA gains about 55% over the RTX 3090 Ti.
iGame GeForce RTX 4090 Vulcan OC Test score
RTX 3090 Ti Test score
Blender is professional 3D rendering software. It now offers a standalone benchmark tool, which saves the hassle of installing the software and downloading assets. You only need to download the launcher, and the tool automatically renders and scores three scenes: monster/junkshop/classroom.
The picture above shows the scores of the iGame GeForce RTX 4090 Vulcan OC: 6324/2908/2964 points, for an average of 4065 points. The picture below shows the scores of the GeForce RTX 3090 Ti: 3136/1812/1549 points, for an average of 2165 points. Comparing the averages, the improvement is a very clear 88%, which can save a great deal of time on animations, where rendering is done frame by frame.
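The averages and the 88% figure can be reproduced from the per-scene scores. The Python sketch below does that, and also estimates the relative render time under the assumption, not stated in the review, that render time scales inversely with the benchmark score. Names are illustrative.

```python
# Per-scene Blender benchmark scores quoted above (three scenes each).
rtx_4090 = [6324, 2908, 2964]
rtx_3090_ti = [3136, 1812, 1549]


def mean(scores):
    """Arithmetic mean of a list of scores."""
    return sum(scores) / len(scores)


avg_4090 = mean(rtx_4090)
avg_3090_ti = mean(rtx_3090_ti)

# Relative improvement of the average score, in percent.
gain = (avg_4090 / avg_3090_ti - 1) * 100

# Assumption: time per frame is inversely proportional to the score.
relative_render_time = avg_3090_ti / avg_4090

print(f"RTX 4090 average:    {avg_4090:.1f}")
print(f"RTX 3090 Ti average: {avg_3090_ti:.1f}")
print(f"improvement:         {gain:.0f}%")
print(f"relative render time (assumed): {relative_render_time:.2f}x")
```

The computed averages match the quoted ones to within rounding. If the inverse-scaling assumption holds, a frame that took a given time on the RTX 3090 Ti would take a little over half as long on the RTX 4090 card.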
10 Power consumption and temperature test
In the power consumption test, we ran FurMark as the stress test and used GPU-Z to monitor temperature. The power figure covers the graphics card alone.
In this stress test, the iGame GeForce RTX 4090 Vulcan OC performed well. At 100% TDP under full load, power consumption reached about 530W. Clearly, this overclocked RTX 4090 places heavy demands on the power supply.
In addition, during the full-load stress test, the peak temperature of the iGame GeForce RTX 4090 Vulcan OC is 71℃ and the peak hot spot temperature is 82℃, which is already very good for the RTX 4090 and its AD102 core.
11 Cyber Fortress Blade Upgrade
This first test of the iGame GeForce RTX 4090 Vulcan OC focuses on theoretical and game testing around the newly added DLSS 3, and the actual results are indeed impressive.
However, the tests also show that the beta versions still have various problems. NVIDIA has already announced a number of games that will support DLSS 3 soon: "Super People", "Loopmancer", "Fuyun Court", "Microsoft Flight Simulator", and "A Plague Tale: Requiem" will launch upgraded versions in October. You can try them yourself and enjoy the surge in frame rate.
In addition, well-known titles such as "Chernobylite", "Atomic Heart", "Conqueror's Blade", "Cyberpunk 2077", "Black Myth: Wukong", "Bright Memory: Infinite", "Naraka: Bladepoint", "Dying Light 2 Stay Human", and "The Witcher 3: Wild Hunt" will all add DLSS 3 support in later versions.
As for the Ada architecture behind the RTX 40 series graphics cards, people were not optimistic about it at first. After all, the overall architecture diagram shows almost no change. Has NVIDIA stopped innovating?
The addition of the OMM and DMM engines has greatly raised game frame rates again, bringing a multi-fold increase over RTX 30 series graphics cards. Even CPU-limited games can break through the hardware limit: "Microsoft Flight Simulator" can exceed 144 fps at 4K resolution, which is astonishing.
It seems that NVIDIA has not hit any bottleneck in advancing graphics. Other frame-boosting technologies on the market are a step behind NVIDIA every time.
In terms of performance, the rasterization gains are decent, which is expected given that the overall architecture is largely unchanged. The real improvement of the RTX 40 series lies in the upgraded RT Cores and Tensor Cores. This generation is therefore best enjoyed together with DLSS 3.
However, players who want to upgrade still need to check whether their current hardware is compatible. Most AIC partners recommend a 1,000W power supply. In our testing, power supplies below this rating can still run the card, but there is no guarantee if a problem does occur.
Finally, let's talk about the appearance and smart screen upgrade of the Vulcan series. Compared with the RTX 30 series cards, this generation is less sharp-edged and its overall look is much more restrained, giving a sense of déjà vu. The upgraded iGame Vulcan smart screen looks very much like a virtual world from the future. I have to say that this upgrade is consistent in both form and intent.