Monday, January 24, 2011

Let’s get physical

[Note, this was originally posted here on AltDevBlogADay]
Not just a catchy Olivia Newton-John slice of the ’80s but also a motto I find myself increasingly trying to incorporate into my professional work life these days. Before our HR dept. comes down on me, let me elaborate. A conversation I had with one of our cinematic guys last year went something like this:
Me: “What do you mean you needed to remake all the shaders on that character for the real-time cinematic?”
Him: “Well, we added some extra lights for the cinematic but then that made the specular response too bright, so I tweaked the shaders a little so they looked good, but now they don’t look good in game anymore.”
Me: “Hmmm, that’s not good.”
The issue in this case was that as he was fine-tuning the specular response on these surfaces, the renderer was inadvertently adding light energy into the system, in effect making those shaders “hardcoded” to that particular lighting environment. The solution involved examining how we handled light in our shaders and reworking the process using physical principles. Essentially, by not allowing surfaces to reflect more light than is incident on them, we have, for the most part, been able to avoid this problem. In hindsight, this sounds very simple, but physically based approaches can have deep implications throughout a rendering pipeline.
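As a minimal sketch of that constraint (not the shader code from our engine), the Python below assumes a Lambert diffuse term plus a normalized Phong lobe, and rescales the two albedos so that they never sum to more than one. The function names, the choice of BRDF and the normal-incidence check are all assumptions made for illustration.

```python
import math


def conserve_energy(diffuse_albedo, specular_albedo):
    """Rescale the two albedos so that together they reflect at most 100% of the light."""
    total = diffuse_albedo + specular_albedo
    if total <= 1.0:
        return diffuse_albedo, specular_albedo
    return diffuse_albedo / total, specular_albedo / total


def brdf(cos_alpha, diffuse_albedo, specular_albedo, shininess):
    """Lambert diffuse plus a normalized Phong lobe (an illustrative choice of BRDF).

    cos_alpha is the cosine of the angle between the mirror direction
    and the outgoing direction.
    """
    diffuse = diffuse_albedo / math.pi
    lobe = max(cos_alpha, 0.0) ** shininess
    specular = specular_albedo * (shininess + 2.0) / (2.0 * math.pi) * lobe
    return diffuse + specular


def reflectance_at_normal_incidence(diffuse_albedo, specular_albedo, shininess, steps=20000):
    """Fraction of incident light reflected when the light arrives along the normal.

    Integrates brdf * cos(theta) over the hemisphere with the midpoint rule.
    At normal incidence the mirror direction is the normal, so cos_alpha == cos(theta).
    """
    d_theta = (math.pi / 2.0) / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * d_theta
        cos_theta = math.cos(theta)
        value = brdf(cos_theta, diffuse_albedo, specular_albedo, shininess)
        total += value * cos_theta * math.sin(theta) * d_theta
    return 2.0 * math.pi * total  # the integrand does not depend on azimuth


# An artist has pushed both sliders up: 0.8 + 0.6 would reflect 140% of the light.
tweaked = reflectance_at_normal_incidence(0.8, 0.6, shininess=50.0)
kd, ks = conserve_energy(0.8, 0.6)
conserved = reflectance_at_normal_incidence(kd, ks, shininess=50.0)
print(f"tweaked: {tweaked:.3f}, conserved: {conserved:.3f}")
```

The tweaked material reflects roughly 1.4 times the light that hits it, which is exactly the kind of setting that only looks right under one lighting setup. After rescaling, the reflectance comes back to about 1.0. A real engine would likely want a more careful split than a uniform rescale, but the principle is the same.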
It’s all about context
As with almost every other area of programming, one of the particularly useful questions to ask is “what’s my context?”, and this topic is no different. This discussion only applies to games whose goal is to create authentic renderings of reality, which seems to be the majority of 3D games and is especially the case with AAA titles. The points here may not apply if your goal is non-photorealistic rendering.
Why should we do it
Here are some reasons why physically based rendering is a desirable property of your rendering engine (assuming reality is your goal). Of course, a true physical basis isn’t always achievable in practice but making decisions based on a physically motivated thought process is a very powerful tool indeed.
  • We need to start somewhere. That somewhere might as well be physically based rather than arbitrary.
  • Author once, reuse everywhere. In nature, if we move an object from an indoor environment to an outdoor environment, it just looks right. You can’t make it look wrong. This is a very desirable goal for your renderer also.
  • Games are getting bigger. The manpower needed to make an AAA title is huge and we need to be smarter about asset creation. In practical terms, this means exposing fewer sliders for artists to tweak. The more sliders you expose, the larger the space of possible permutations they can use to get a desired result, and the more onus on them to home in on the optimal combination. On top of that, if one slider needs to be tweaked depending on the value of another slider, as is commonly the case, then these kinds of dependencies can add up to a lot of extra work over the course of a project. One goal therefore would be to remove the bad, unhelpful permutations and just expose useful sliders and ranges to work with. One solid way to accomplish this is to base your rendering on reality, and let nature fill in most of the blanks.
  • We can quantify rendering algorithms. Almost everything we do rendering-wise in games is an approximation of what’s going on in reality. At the end of the day, we’ve still got to hit 30/60fps and so we strive to make the smartest tradeoffs. I’ve had discussions with co-workers where we’re intuiting about the pros and cons of a particular rendering algorithm. Both of us would have a mental picture of what the results would look like, but in a hazy, abstract kind of way. Well, if you can compare it to ground truth images (which I’ll talk about later) then it’s much easier to visualize and quantify the shortcomings of different approaches.
  • Going forward, as processing power increases, real-time renderers are only going to become more and more physically based. With this generation of consoles, we’ve made large strides forward in achieving realism due to having the horsepower to incorporate more physically based behavior than we could in the past. For example, linear light computations and HDR buffers are now commonplace. Even with these current-gen consoles, we’ve already seen some progress toward dynamic global illumination solutions. SSAO convincingly mimics short range occlusion. Crytek’s light propagation volumes and Geomerics’ Enlighten spring to mind also. Next generation is where we can expect this type of tech to really take center stage.
But what about artistic expression?
This is a common reaction when discussing physically based rendering and I totally see the point. In the past, sliders could go up to 11 and beyond, and sometimes that can be just what a particular scene in the game needs. So to be clear, I’m not necessarily advocating that we disallow that, just that we’re aware that with this freedom comes extra responsibility. The pros and cons of anything which prevents reusability, such as the specular example above, should be seriously considered. The more often an asset could potentially be reused, the more consideration this should be given.
It can be instructive to look to the world of film to see where the creative limits of physically based rendering might lie. It’s plain to see that a film like The Matrix has a very different look and feel than something like a Wallace and Gromit film. Yet both are results of filming real light bouncing off of real surfaces through real cameras. Animated films from the likes of Pixar and DreamWorks, and games such as LittleBigPlanet, are very much physically motivated in terms of rendering. Yet, contrast these to the latest Hollywood action blockbuster and we can see there’s actually quite a wide creative palette at our disposal (as shown below), all without violating (well maybe we’re violating, but hopefully not abusing) physical laws. Post effects such as motion blur and depth of field, as well as cinematography and lighting setups, will play a much larger part in achieving a desired mood going forward.

Examples of physically based scenes: (1) the stylized post processing of Sin City, (2) stop motion in Wallace and Gromit, (3) synthetic image creation in Up, and (4) runtime global illumination in LittleBigPlanet.
So, yes, control is diminished in an absolute sense in terms of what outputs you can get from the lighting pipeline, but hopefully we are removing a large chunk of the undesirable outputs and leaving primarily the desirable ones.
First steps toward physical rendering
When I first embarked down the path of physical correctness, my initial approach was to be very thorough when implementing lighting-related code, to make sure equations and units were done “by the book” at each stage and leave it at that. In hindsight, this is the wrong way to approach the problem. I never knew for sure if there was a bug or not since I had no concrete frame of reference. Even if something looked right, I didn’t know definitively if it was right.
A much easier and more robust approach is to simply compare your results to “ground truth” images (akin to running regression tests on a codebase). A ground truth image is a rendering of what the scene should look like, that is, what you would get if you removed all resource constraints and computed the image to the best of our working knowledge of light (*). For this purpose, I recommend taking a look at the book Physically Based Rendering: From Theory to Implementation. I was drawn to it since the full source code is freely available and impeccably documented. As well as being thoroughly modern, it has a nice simple scene description language and obviously, as implied by the name, the principles and units are physically based throughout.
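To make the regression-test analogy concrete, here is a minimal Python sketch of such a comparison. It assumes both images have already been loaded as equally sized flat lists of linear RGB tuples; the function names, the RMSE metric and the tolerance value are illustrative assumptions, not something prescribed by the book or its renderer.

```python
import math


def rmse(runtime_image, reference_image):
    """Root-mean-square error over every channel of two equally sized images.

    Each image is a flat list of (r, g, b) tuples holding linear values.
    """
    if not runtime_image or len(runtime_image) != len(reference_image):
        raise ValueError("images must be non-empty and contain the same number of pixels")
    squared_error = 0.0
    channel_count = 0
    for runtime_pixel, reference_pixel in zip(runtime_image, reference_image):
        for runtime_value, reference_value in zip(runtime_pixel, reference_pixel):
            squared_error += (runtime_value - reference_value) ** 2
            channel_count += 1
    return math.sqrt(squared_error / channel_count)


def matches_ground_truth(runtime_image, reference_image, tolerance=0.02):
    """True when the runtime frame stays within a chosen error budget (hypothetical threshold)."""
    return rmse(runtime_image, reference_image) <= tolerance


# Two tiny 2-pixel "images": the runtime result is slightly too bright in one channel.
reference = [(0.20, 0.20, 0.20), (0.80, 0.80, 0.80)]
runtime = [(0.20, 0.20, 0.20), (0.80, 0.80, 0.86)]
print(rmse(runtime, reference), matches_ground_truth(runtime, reference))
```

A single number like this is crude, and in practice a difference image is often more informative, but even a crude metric turns “does this look right?” into something you can track from build to build.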
I plan to delve deeper into the specifics in my next post. I’ll take a closer look at the rendering equation and walk through the process of how we can isolate each of the various terms for purposes of comparing them with what your runtime renderer is doing. There are some prerequisites for a renderer before even starting down this path, however. All shader computations should be done in linear space, using HDR render targets where appropriate. You can find a good AltDevBlogADay post on this topic by Richard Sim here. Essentially, we can store data whatever way makes sense, but we want to convert it so that we’re dealing with conceptually linear data in the shader stages of the pipeline.
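As a small illustration of why the conversion matters, the sketch below uses the standard sRGB transfer functions (assuming your textures and display are sRGB encoded, which is an assumption on my part here) to show that blending two values gives different answers in encoded and linear space. The function names are made up for this example.

```python
def srgb_to_linear(encoded):
    """Decode one sRGB channel value in [0, 1] to linear light."""
    if encoded <= 0.04045:
        return encoded / 12.92
    return ((encoded + 0.055) / 1.055) ** 2.4


def linear_to_srgb(linear):
    """Encode one linear channel value in [0, 1] back to sRGB for storage or display."""
    if linear <= 0.0031308:
        return linear * 12.92
    return 1.055 * linear ** (1.0 / 2.4) - 0.055


# Blend a black texel and a white texel 50/50.
black, white = 0.0, 1.0
naive_blend = (black + white) / 2.0  # averaging the encoded values directly
linear_blend = linear_to_srgb((srgb_to_linear(black) + srgb_to_linear(white)) / 2.0)
print(naive_blend, linear_blend)
```

The naive blend gives an encoded value of 0.5, while doing the math on linear values and re-encoding gives roughly 0.735. Any lighting sum computed on encoded values is off in the same way, which is why this is a prerequisite rather than a nicety.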
Additionally, a goal is to have inputs authored or generated in known radiometric units. For example, background environment maps would ideally be in units of radiance, convolved diffuse cubemaps would ideally be in units of irradiance, point and spot lights ideally in units of intensity, etc. Hopefully the conversion to such units should fall out from the process of comparing the runtime against ground truth. And yes, the above steps can amount to a tremendous amount of work and pipeline modifications if you’re not already set up for it.
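To illustrate how these quantities relate, here is a rough Python sketch. It assumes an idealized point light with intensity in W/sr and an environment of uniform radiance; the function names and sample values are hypothetical and only meant to show the units lining up.

```python
import math


def point_light_irradiance(intensity, distance, cos_theta):
    """Irradiance (W/m^2) on a surface from a point light of given intensity (W/sr).

    Inverse-square falloff times the cosine of the angle between
    the surface normal and the direction to the light.
    """
    return intensity * max(cos_theta, 0.0) / (distance * distance)


def irradiance_from_uniform_radiance(radiance, steps=20000):
    """Irradiance (W/m^2) from an environment with constant radiance (W/(m^2 sr)).

    Integrates radiance * cos(theta) over the hemisphere with the midpoint rule.
    Conceptually this is the convolution a diffuse cubemap holds for one normal.
    The analytic answer is pi * radiance.
    """
    d_theta = (math.pi / 2.0) / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * d_theta
        total += radiance * math.cos(theta) * math.sin(theta) * d_theta
    return 2.0 * math.pi * total  # the integrand does not depend on azimuth


print(point_light_irradiance(intensity=100.0, distance=2.0, cos_theta=1.0))
print(irradiance_from_uniform_radiance(radiance=1.0), math.pi)
```

The factor of pi in the second function is a common source of mismatches between a runtime and a reference render, and it tends to show up quickly once you start comparing against ground truth.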
Traditional gamedev rendering wisdom states that if it looks good, then it is good. Well, this is only part of the story; I would suggest that if it looks good and scales to large-scale production, then it is good. Until next time…
(*) In practice, even offline renderers don’t account for all physical phenomena, because for some of them the reward doesn’t justify the cost. For example, they generally collapse the visible light spectrum down to 3 coefficients and ignore polarization effects, amongst other things. This is totally fine for our purposes.