(Nweon, September 27, 2022) Interactive 3D models are the next media type after images and videos, and the industry is increasingly using model-viewer and other renderers to display 3D models in commercial settings, museums, and elsewhere. Users want assurance that the rendered pixels accurately represent the real object the model is based on, so a quality assurance process is needed to verify both that the 3D model itself was authored accurately and that it is presented realistically.

Emmett Lalish, a senior software engineer at Google who leads work on model-viewer, recently wrote an article on the color accuracy of glTF, providing the background needed to establish such a process and set expectations:

Khronos' glTF is the first 3D model format to specify physically-based rendering (PBR), meaning it contains material properties that define how light should be reflected and refracted in real-world units. Renderers are therefore free to innovate in their GPU shaders and create ever more accurate approximations of the underlying physics, because glTF does not mandate any single approximation.

This also means that while different renderers may make different tradeoffs between accuracy and speed, you can be confident that a glTF will look consistent even across unrelated codebases (though not pixel-identical). We call glTF the JPEG of 3D because it is compressed for efficient web delivery and can be rendered consistently by a large number of viewers.

At first glance, different glTF viewers may look inconsistent, but this is usually not due to rendering differences; it comes from different default scene settings. Physically-based rendering means the scene takes ambient lighting as an input, just like a real camera, so getting a consistent image requires not only the same object but also the same lighting and camera settings. There is no standard for viewer defaults, so it is important to set them deliberately to consistent values. That is exactly what we did to show that glTF rendering has converged across various popular renderers.

1. What is color accuracy?

The goal of PBR is to create color-accurate images at interactive frame rates while the user manipulates the 3D model. Of course, the renderer can only be as accurate as the 3D model it is given, so it is equally critical to determine whether that model accurately represents the real-world object it is based on.

The most rigorous approach is to set up a photo shoot of the real object, capture the surrounding ambient lighting in its full dynamic range, record the camera settings and position, then reproduce the same scene in the renderer and compare the output images. Unfortunately, this is usually prohibitively expensive.

Even in this ideal comparison scenario, you run into an important question: what metric do you use to compare the images? Simple pixel-comparison metrics like peak signal-to-noise ratio tend to give too much weight to insignificant differences. Perceptual metrics are better, but they are fuzzier and harder to define.

Since most products are designed against RGB material specifications, the usual idea is to simply carry those values into the glTF baseColor. This is a perfectly good approach to accurate color modeling, provided the RGB values are in the proper color space. The glTF specification says the baseColor factor (normalized between 0 and 1) is exactly the fraction of light in a given channel (wavelength) that the material reflects. The baseColor texture is defined the same way, except that the linear values are first extracted through the sRGB transfer function. It should not be assumed that a given swatch RGB value is defined in the same way.
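As a minimal sketch of that last point, here is what decoding an 8-bit sRGB texture channel into the linear reflectance value actually used in the lighting math looks like (TypeScript; the function name is ours):

```ts
// Decode one 8-bit sRGB channel to linear, per the sRGB transfer function.
function srgbToLinear(channel8bit: number): number {
  const c = channel8bit / 255;
  return c <= 0.04045 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

// A "mid grey" texel of sRGB 128 reflects only about 22% of the light:
const linearReflectance = srgbToLinear(128); // ≈ 0.216
// A baseColorFactor, by contrast, is already linear and needs no decoding.
```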

When verifying the color accuracy of a render, the first instinct is usually to check whether the output image contains the same RGB pixel values as the glTF baseColor (or the expected swatch RGB). This is a fundamentally mistaken expectation, because it defeats the purpose of physically-based rendering. The details and examples below support this assertion and point the way toward a more useful verification scheme.

2. What is the problem with rendered colors matching baseColor?

The most important thing to understand about PBR is that it accurately represents the interaction between incident light and material properties, of which metallic and roughness matter just as much as baseColor. The rendered output for a given pixel, however, is only an RGB value, so if that value were required to match the baseColor RGB, then by definition the incident light and the other material properties could not affect the resulting image at all.

Let's start with a simple example: six spheres with uniform materials. The top row is white (baseColor RGB: [1, 1, 1]) and the bottom row is yellow (baseColor RGB: [1, 1, 0]). From left to right they are shiny metal (metallic: 1, roughness: 0), shiny plastic (metallic: 0, roughness: 0), and matte plastic (metallic: 0, roughness: 1). The leftmost pair can be thought of as roughly polished silver and gold.

Basic example of spheres with different uniform materials
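As a sketch (not the article's actual asset), those six materials map onto glTF's metallic-roughness parameters roughly as follows, written here as TypeScript object literals mirroring the glTF JSON:

```ts
// baseColorFactor is RGBA, normalized 0..1, and already linear.
const white = [1, 1, 1, 1];
const yellow = [1, 1, 0, 1];

const materials = [
  { name: "shiny metal",   pbrMetallicRoughness: { baseColorFactor: white, metallicFactor: 1, roughnessFactor: 0 } },
  { name: "shiny plastic", pbrMetallicRoughness: { baseColorFactor: white, metallicFactor: 0, roughnessFactor: 0 } },
  { name: "matte plastic", pbrMetallicRoughness: { baseColorFactor: white, metallicFactor: 0, roughnessFactor: 1 } },
  // ...plus the same three with `yellow` for the bottom row.
];
```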

Notice that different materials with the same baseColor render differently, so which pixels are supposed to match the baseColor RGB? If you really do want rendered pixels to match the baseColor RGB values, glTF has a dedicated extension for that: KHR_materials_unlit. This extension is not physically based, and is therefore intended for cases where only an RGB texture is available, such as labels and 3D scans where all of the lighting was baked in during capture. This is what the model above looks like with the unlit extension applied:

The same spheres as above, rendered with KHR_materials_unlit
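For reference, a material opts into this behavior by declaring the (empty) KHR_materials_unlit extension object; a hedged sketch of the structure, again as a TypeScript literal mirroring the glTF JSON:

```ts
const unlitYellow = {
  pbrMetallicRoughness: { baseColorFactor: [1, 1, 0, 1] },
  extensions: { KHR_materials_unlit: {} }, // lighting is ignored; baseColor is displayed directly
};
// The asset must also list "KHR_materials_unlit" in its top-level extensionsUsed array.
```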

Clearly, lighting matters for the fidelity of a 3D model. The next common idea is to pick a nice, uniform, neutral lighting environment so the output RGB values come out "close to" the expected baseColor. It is easy enough to produce a uniform lighting environment, but with PBR the result may be surprising:

Physically-based rendering under uniform lighting

It looks like the white spheres have disappeared, but they are still there; in a tilted view you can see them occluding the yellow spheres. They are simply perfectly camouflaged. This scene is called a furnace test and is mainly used to check a renderer's energy conservation. Physics dictates that under perfectly uniform light, a white sphere must reflect exactly the same light as arrives from the environment, and therefore cannot be distinguished from it.
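As a sketch of why: for a matte (Lambertian) surface with albedo $\rho$ lit by a perfectly uniform environment of radiance $L$, the outgoing radiance is

$$L_o = \int_{\Omega} \frac{\rho}{\pi}\, L \cos\theta \, d\omega = \frac{\rho}{\pi}\, L \int_{\Omega} \cos\theta \, d\omega = \frac{\rho}{\pi}\, L \cdot \pi = \rho\, L,$$

so a white sphere ($\rho = 1$) returns exactly the radiance $L$ of the environment behind it and vanishes against it.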

Note that the yellow spheres come out very close to the unlit result, and there is little noticeable difference between glossy and matte, or metal and plastic. This is actually accurate: if you really could find a place with perfectly uniform lighting, you would be equally unable to distinguish these materials by eye (assuming you could hide your own reflection, since you are part of the lighting environment). What makes it feel so unreal is that such an environment is nearly impossible to find. Real environments have texture, and we rely on reflections of that texture to judge how reflective a surface is.

Next, let's go back to the original version, which uses model-viewer's "neutral" environment image. For more information on how this environment was designed, see here and here. The lighting is designed to be even (coming from every side) but not perfectly uniform, providing enough texture to distinguish the material types.

It is pure grayscale, so it does not shift the hue of the materials. This is in contrast to real lighting, where indoor light tends to be yellow, outdoor light blue, and sunsets red. PBR will faithfully reproduce the colors a camera would capture in such a scene, which of course look different from the same object under neutral lighting.

Back to the original example, but using the exposure slider

Note that the sphere in the top right could be called paper white: a perfect matte reflector. Yet despite its white baseColor (sRGB [255, 255, 255]), its rendered colors range from [177, 177, 177] to [237, 237, 237] and never reach pure white. Why? The top middle sphere does reflect some pure white from its glossy surface. Specular reflection comes on top of diffuse reflection (which is all a matte surface gives you), so if the diffuse reflection had already saturated our pixels, glossy and matte objects would be impossible to tell apart. Try raising the exposure slider to see this effect, which photographers call overexposure.

You may notice that exposure seems to affect the middle values more than the blacks and whites, even though exposure is a linear multiplication of the light. That is correct, and it is caused by the final, nonlinear tone mapping step of the rendering pipeline. Tone mapping is a complex subject in its own right, but it is essential to understanding PBR. Before diving in, let's compare the rendering and photography pipelines.

3. How does rendering compare to photography?

3D rendering (particularly PBR) is designed to simulate photography, which in turn is designed to simulate the human eye and brain. The brain part is essential, because the goal of a realistic photo is that looking at it evokes the same perception as viewing the original scene. This is very difficult, because the light reflected by a printed photo or emitted by a display is dramatically dimmer than the real world and has far lower contrast. Even HDR displays have orders of magnitude less contrast than what our eyes routinely see outdoors.

Thankfully, our brains adjust our perception in many ways, including correcting for contrast. This is what allows a photo printed in a heavily compressed dynamic range to still convey the feeling of a real sunset. We call this compression of dynamic range tone mapping. In photography, you can think of it as the conversion from the camera's raw image to the final image. It becomes even more important with modern exposure stacking, which produces raw images with a higher dynamic range than the sensor can capture in a single shot.

Comparison of photography and 3D rendering pipelines

In 3D rendering there is no sensor; the computation is done in floating point, which means the raw image is effectively full HDR, with a range even greater than exposure stacking can usually achieve. The histogram of such a raw image typically shows a long tail, representing small, shiny glints far brighter than the rest of the scene. To preserve perception while compressing down to SDR, a nonlinear tone mapping curve is used.

4. How does tone mapping work?

Tone mapping is a general term that can refer to essentially any color conversion function. Even the separate operations a photographer might apply in post-processing, such as gamma correction, saturation, and contrast, can be combined into a single resulting function, which is what we call tone mapping here. However, we are only interested in color-neutral functions, which is the only type of tone mapping discussed here, so the focus will be on luminance.

The tone mapping function model-viewer uses is ACES, a standard developed by the film industry and widely used in 3D rendering. Like most tone mapping curves, it is fairly linear around the central focus of the contrast range and rolls off gradually toward the extremes, smoothly compressing the long bright and dark tails into the desired zero-to-one output range. The idea is that humans perceive differences in the very bright and very dark regions less acutely than in the bulk of the scene. However, because some of the output range is reserved for extra-bright highlights, the range left to represent the matte baseColor inputs is reduced. This is why the paper-white sphere does not produce white pixels.
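To make the shape of such a curve concrete, here is a widely used single-curve approximation of ACES filmic tone mapping (Krzysztof Narkowicz's fit); it is a sketch, not necessarily the exact curve model-viewer applies:

```ts
// Approximate ACES filmic tone mapping for one linear-light value.
function acesFilmApprox(x: number): number {
  const a = 2.51, b = 0.03, c = 2.43, d = 0.59, e = 0.14;
  return Math.min(1, Math.max(0, (x * (a * x + b)) / (x * (c * x + d) + e)));
}

acesFilmApprox(0.18); // ≈ 0.27  mid-tones pass through roughly linearly
acesFilmApprox(1.0);  // ≈ 0.80  "paper white" input never reaches output 1.0
acesFilmApprox(4.0);  // ≈ 0.97  bright highlights are strongly compressed
acesFilmApprox(8.0);  // ≈ 1.00
```

This also shows why exposure, a linear multiply applied before tone mapping, visibly shifts mid-tones while barely changing values that are already very bright.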

Sometimes, when working with matte objects and comparing output colors to baseColor, you will notice this tone-mapping compression and identify it as the source of the apparent color differences. The solution that usually comes to mind is to simply not apply tone mapping.

But the problem is that "no tone mapping" does not exist: somehow the unbounded input range must be converted into the zero-to-one range the encoder expects. Without this step, the encoder simply clamps the values, which amounts to a piecewise-linear tone mapping function with sharp corners and introduces perceptual errors for glossy objects, as the following example shows.

Note that you can no longer distinguish the shiny and matte white plastic spheres, even without adding exposure. Since half of the matte white sphere now renders as pure white, there is no headroom left for the glossy highlights. Likewise, because clamping removes the shading, the upper halves of the spheres lose their 3D appearance. Toggle the checkbox to switch back to ACES tone mapping for a quick comparison, and keep looking for a moment after you switch: another quirk of human perception is its reliance on anchoring. The yellow looks washed out immediately after switching away from the saturated version, but that impression fades once you look around.

This example also highlights the second key element of a good tone mapping function: desaturating overexposed colors. Look at the gold sphere (bottom left) and compare it with the earlier version using ACES tone mapping. A metal's baseColor multiplies the incident light, so white light hitting the gold sphere produces a yellow reflection (here fully saturated yellow, since the baseColor is fully saturated). With clamping, the highlight is indeed saturated yellow, but it does not look right visually, even though you can argue it is physically correct.

A good tone mapping curve (such as ACES) not only compresses luminance but also pushes overexposed colors toward white, which is why the gold sphere's highlight turns white instead of yellow. This mirrors how camera sensors and our own eyes respond to overexposed colored light: watch a candle flame or sparks and the brightest parts look white, despite their different colors. For interested readers, Nvidia provides more details on tone mapping and HDR here.
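To make the "baseColor multiplies the incident light" point concrete, here is a small sketch using illustrative (assumed) light values:

```ts
// For a metal, the specular reflection is (approximately) the incident light
// multiplied componentwise by baseColor. The values below are illustrative only.
const goldBase = [1, 1, 0];            // the fully saturated yellow from the example
const brightWhiteLight = [20, 20, 20]; // a strong highlight in linear units

const reflected = brightWhiteLight.map((l, i) => l * goldBase[i]); // [20, 20, 0]

// Naive clamping keeps this fully saturated no matter how bright it gets:
const clamped = reflected.map((c) => Math.min(1, c)); // [1, 1, 0] -- pure yellow
// Pushing such over-bright colors toward white is therefore a job for the
// tone mapping step, which is why the highlight renders white under ACES.
```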

A final way to try to avoid tone mapping is to choose the exposure such that all pixels fall within [0, 1], so that nothing is clamped. For matte objects under low-dynamic-range lighting this gives passable results, but for glossy objects it falls apart completely, as the following screenshot shows.

The spheres above rendered without tone mapping, with exposure chosen to avoid clamping

The problem is that the specular highlights are orders of magnitude brighter than most of the scene, so fitting them into the output range requires reducing the exposure by more than 50 times. That darkens and flattens most of the image, since the highlights only occupy a few small areas. And this neutral environment does not even have a particularly high dynamic range; with an outdoor environment containing the sun, the exposure would have to be so low that nearly the entire rendering would be black.
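A tiny sketch of this headroom problem, using illustrative (assumed) linear-light values rather than measurements from the article's scene:

```ts
const specularHighlight = 50; // a small, very bright glint
const matteMidtone = 0.5;     // typical diffuse shading on the rest of the object

// Without a tone map, the only way to avoid clamping is to scale exposure so
// the brightest pixel lands exactly at 1.0.
const exposure = 1 / specularHighlight; // 0.02, i.e. a 50x reduction

const highlightOut = specularHighlight * exposure; // 1.0
const midtoneOut = matteMidtone * exposure;        // 0.01 -> nearly black on screen
```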

Everything shown here is rendered to 8-bit sRGB output, but HDR displays and formats are becoming more common. Could we avoid tone mapping by carrying the raw image's HDR through to an HDR output format? In short, no.

HDR displays do have a higher dynamic range than traditional SDR displays, but they are still orders of magnitude short of what our eyes experience in the real world, so all the same reasons for tone mapping still apply. Note, though, that the choice of tone mapping function should depend on the output format. Ideally it would even depend on the display's contrast and brightness settings and on the ambient light level, but that data is rarely available.

5. How do we verify glTF 3D rendering?

Hopefully the discussion so far has convinced you that simply checking the colors of some rendered pixels against the "correct" color of the object is not a useful or valid process, as convenient as that would be. Instead, you need to look at your whole pipeline and think about what your ultimate goal is.

To make this measurable, let's assume the ultimate goal is to minimize the return rate for online purchases in which the shopper's primary visual cue is the 3D model (assuming our rendering is good enough not to hurt the purchase rate).

To succeed, the image we render must be perceived by the shopper as equivalent to their perception of the actual product when it arrives. This is exactly the same goal as product photography for a magazine or website.

Product photography has a much longer history, so let's start there. What does a product photographer's pipeline look like? The details vary, but the overall steps tend to be: 1) set up the lighting; 2) set up the camera; 3) take the pictures; 4) post-process. Ideally, you could capture the lighting environment, convert the camera settings into a view matrix and exposure, apply the same post-processing, author the glTF from measured material properties, and produce a rendering that matches the photo very closely (especially with a path-traced renderer).

It is easy to blame the renderer for any differences observed, especially a real-time rasterizer, but in fact a great deal of research has gone into making them remarkably accurate and consistent physically, as you can see in the fidelity comparison. Almost always, the worst errors come from differences in lighting, materials, exposure, and post-processing.

Let's consider what happens in the photographic post-processing step. Some of it does not apply to rendering, such as masking out shadows to make them transparent, which a 3D renderer can do automatically. Beyond color-neutral tone mapping, colors are also sometimes deliberately "corrected". Why? After all, if the light actually measured in the scene is wrong, what counts as right?

Probably some of these post-processing color corrections are simply expedients. Before digital photography, getting the right look meant adjusting the lights and the scene by hand; digital post-processing means the lighting no longer has to be so precise. In 3D rendering, however, the lighting is digital too, so it is usually better to keep the post-processing simple (for example, ACES tone mapping only) and adjust the environment image instead if needed. And since 3D rendering is fully automatic and real-time, manually tailoring color adjustments to every frame is impossible anyway.

6. What role does perception play?

As in photography, the goal is the viewer's perception of the image, and no mathematical metric can stand in for the perception of a real human. Isolated pixel metrics are problematic because so much of human perception is influenced by an object's background and surroundings. This is because our brains are effectively trying to factor out the lighting so that we perceive the underlying material properties consistently, and they estimate which lighting to remove based on the surrounding context.

When rendering a product in AR with a camera image as the background, achieving the most consistent perception of the product means the actual rendered pixel colors need to shift, ideally according to the real local lighting environment. This helps ensure that when our brains "correct" for the lighting, the perception we get back matches the real object's properties. Ideally, the same tone mapping and white balance applied to the camera's raw image would also be applied in the rendering pipeline; unfortunately, that data can be very hard to access.

Of course, human perception is not even consistent, which makes all of this harder. A good example is the famous photo of the dress, which is actually a blue and black dress in an overexposed photo under yellowish light. Although the pixel colors are roughly purple and brown, more than half of viewers see the dress as blue and black, while another 30% see it as white and gold. Obviously such ambiguous images are undesirable in a commercial setting, but the example helps illustrate the difference between perceived color and pixel color.

7. How do we verify the glTF model?

Generally speaking, part of the point of 3D rendering is to avoid photography costs by digitizing the whole process. For an artist, a 3D model is expensive to create, but once made it can quickly produce many different photos and interactive experiences. So how do we verify the accuracy of such a 3D model against the physical product? Size and shape can of course be measured (glTF is always in meters, so check your units), but here we will focus on material properties like baseColor and roughness.

The most accurate way to choose material properties is to measure them, since all glTF material properties are defined in physical terms. However, such measurements are difficult to perform. There are products on the market that can scan flat material samples and fit the relevant properties to them, and modern 3D scanning algorithms can recover material properties as well, but these machine-learning systems are not perfect and are hard to benchmark. Until the technology matures, properly measured material properties will often be out of reach.

The most common workflow for creating materials is to adjust the properties by hand until the rendering looks right. Ideally, the artist puts the physical object on their desk and compares it to the rendering, focusing on perception rather than pixel values. Most important of all, their authoring software should be set up to render the object the same way it will be presented to end users.

This matters because when the rendering looks wrong, either the material or the lighting may need to change. If the material was measured, one can safely focus on changing the lighting; if it was not, choosing a fixed lighting environment up front to remove that variable helps enormously. It is equally important that the tone mapping function used be consistent.

Ideally, the artist should also check the model's rendering under a variety of lighting environments, since particular kinds of lighting tend to hide or expose particular material errors. At minimum, it is best to test in a neutral indoor environment and a sunny outdoor one. The sun provides very high dynamic range as well as colored light, which helps uncover many kinds of material problems, and together these scenes cover a wide variety of the real lighting situations a virtual photo shoot or AR session might encounter.

Finally, the usual caveats about color spaces and display differences apply. The glTF format specifies sRGB as its internal color space, which is also the most common output format, but as HDR displays become more widespread this is likely to get more complicated soon.
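For completeness, a sketch of the final encoding step: after tone mapping, the linear value is passed through the sRGB transfer function before landing in an 8-bit output buffer (the function name is ours):

```ts
// Encode a tone-mapped linear value into an 8-bit sRGB channel for display.
function linearToSrgb8(linear: number): number {
  const clamped = Math.min(1, Math.max(0, linear));
  const c = clamped <= 0.0031308
    ? 12.92 * clamped
    : 1.055 * Math.pow(clamped, 1 / 2.4) - 0.055;
  return Math.round(c * 255);
}

linearToSrgb8(0.8); // ≈ 231 -- the tone-mapped "paper white" from the earlier example
```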

8. What conclusions can we draw?

Unfortunately, there is no simple answer, because color and perception are complicated, arguably even more so than physically-based rendering. Hopefully this background provides a framework for your creation and validation processes. The most important thing to remember about PBR is that the rendered colors will differ from the material's baseColor, and that is exactly what makes the demonstration below look realistic.

Note how different the product's colors are in each environment; that is exactly what makes it look realistic
