>> KEN: I would like to say that Paul has moved on after Berkeley to the University of Southern California's Institute for Creative Technologies and is the associate director of graphics research over there, involved with photorealistic rendering of environments and, more recently, virtual actors. He was the Computer Animation Festival chair for SIGGRAPH 2007 and is now here at Google to tell us about his latest work. So, without further ado, Paul Debevec.
>> DEBEVEC: Thank you very much, Ken. All right. Hey, thanks, folks. So, it's very exciting to be here, and another reason that I'm up here in the Bay Area is we've been doing a couple of screenings of the Computer Animation Festival, the Electronic Theater for SIGGRAPH. This is the show that originally premiered in San Diego, and there's going to be a showing tonight, if you are a fan of computer graphics or know any friends who might be, at San Francisco State University. If you just go to the San Francisco SIGGRAPH website, sfsig.org, I think the show starts at 7:30, and it's going to have Sony SXRD 4K projection off of our HDCAM SR master tape. And it's going to look absolutely gorgeous. So, if you want to see a very faithful rendition of the SIGGRAPH Computer Animation Festival, the best computer animations of the last year, that's a great opportunity to check it out.
So, what I brought to talk about is some of the computer graphics work that we've been doing in my group at the Institute for Creative Technologies. It's kind of a sampling of a couple of recent projects, and to motivate sort of how we got here, I have a little bit of more historical material. The take on computer graphics that my group has mostly been involved in is graphics that tries to make a lot of use of images. And when I was doing my PhD at Berkeley, I got interested in trying to do pretty realistic renditions of things like the Berkeley campus, and it seemed like photographs were a good way to do that. I was very lucky to be helped out by a professor there, Chris Benton, who did kite aerial photography. And eventually, when we got our kite off the ground, we were able to take some aerial photographs of the Berkeley campus, and in particular the Berkeley Campanile, which I thought would be a good focal point for some image-based modeling and rendering. And I also got to climb up in the lantern there and take some photographs of the campus all the way around as well. And these photographs you can see here; we also found a couple of photographs from an aerial mapping survey, so we had some parallax away from the top of the tower. And then using a system that I developed with C.J. Taylor and my adviser, Jitendra Malik, we were able to build interactively, with not too much effort, a three-dimensional model of the campus from these 20 photographs here. It's not a terribly detailed model. It's kind of a lot of, you know, boxy buildings, maybe with a roof on them; the Campanile is by far our best model. But the idea, of course, is that we're not going to look at them this way, we're going to look at them with the textures applied to them and projected on. So if we take that geometry-based model, put the texture maps on there, maybe do a little view-dependent texture mapping depending on where you're looking at it from, all of a sudden it became a really exciting thing for us to take a look at, because it looked really real. We had no idea how we could have made this model look as realistic any other way. And since I was careful enough to take all the photos in the same lighting conditions, we sort of are reproducing the appearance of the reflectance properties of all of the surfaces relatively realistically. And the light transport, how the light bounced between all of the surfaces, is replicated there. And I think somebody over here on the left was pointing out maybe a hole-filling algorithm artifact that you can see. So anyway, there's plenty of little artifacts all around, but it was real enough that, at least back then, we could believe we were really flying around the campus. And I had the chance to make a short film. This is a clip from it that took some real video of campus as well. This is me standing on top of the Campanile, and we do a little match move shot back to kind of a fly-around here in order to create a virtual fly-around of the campus. So when I have a chance to play around with Google Earth, I really love seeing that a lot of these same kinds of effects are there, and they're right there in software that everybody has. And the idea of that continuing to the point where everything in Google Earth, you know, looks every bit as photoreal as this does right here, I think, is a really amazing vision for the future, and obviously, Google is in the front seat for doing that.
So, this was an exciting project that had some applications in the movie industry as well, for kind of doing virtual backgrounds and reconstructing various kinds of sets for virtual cinematography. What I got interested in after that was trying to reconstruct other kinds of environments. Another place that I tried to do a reconstruction of was the interior of St. Peter's Basilica, and the idea was to do a little dynamic simulation of some objects that would be there in the Basilica. And I got interested in trying to record real-world illumination conditions so that I could illuminate new objects to insert into these image-based modeled scenes. So this is a scene from a film we did for the SIGGRAPH '99 Electronic Theater, where the interior of St. Peter's was modeled from basically two panoramas, one in the nave, one near the altar here; some rough geometry built with the Façade system to project onto; and then high dynamic range imagery to record the range of light that was in there and illuminate these new computer-generated objects with the light that was actually there. And the way that that technology really works is that you would basically try to use one of these several available techniques for taking an omnidirectional photograph. And one of the ones that was particularly successful in the early days is something that was originally used for purposes of environment mapping, which is to take a photograph of a mirrored ball, which it turns out gives you a view of the entire scene all around. If you shoot that as a series of exposures, from underexposed images that can see into the bright areas of the scene, like the sky and any direct light sources that you would have, to longer-exposure photographs that can see the indirect light coming from the ground and the trees, and any shadow detail if you happen to want that, you've really scientifically recorded the full range of illumination. And we have some algorithms that will put that together into an image that has pixel values that go not 0 to 255 but 0 to whatever they need to go to: a hundred thousand, a million. And if you then hook these into a good computer graphics rendering algorithm, such as, at the time, Greg Ward's Radiance system, we could take these image-based lighting environments, wrap them around the CG scene, and then illuminate those objects with that illumination and get relatively realistic renderings of things that otherwise might just look like very computer-generated objects.
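In code, that exposure-merging step might look something like the following minimal sketch, assuming already-linearized images with known shutter times; the function and variable names here are illustrative, not from the actual system:

```python
import numpy as np

def merge_hdr(images, exposure_times):
    """Merge a bracketed exposure series into one HDR radiance map.

    images: list of float arrays in [0, 1], already linearized
    (i.e., the camera response curve has been inverted).
    exposure_times: shutter time in seconds for each image.
    """
    hdr = np.zeros_like(images[0])
    weight_sum = np.zeros_like(images[0])
    for img, t in zip(images, exposure_times):
        # Trust mid-range pixels; down-weight clipped shadows and highlights.
        w = 1.0 - np.abs(2.0 * img - 1.0)
        hdr += w * (img / t)  # each exposure estimates radiance as value / time
        weight_sum += w
    return hdr / np.maximum(weight_sum, 1e-6)
```

The merged result is exactly the kind of image whose pixel values run from 0 up to a hundred thousand or a million rather than stopping at 255.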
So, this is just a very simple scene that I modeled in 3ds Max, illuminated by a couple of different kinds of illumination environments that we took. So, this is, I think, Funston Beach, just a little bit north of here; this is the eucalyptus grove at Berkeley. This is Grace Cathedral in San Francisco, which is one of the prettiest illumination environments we ever had a chance to shoot. And this is the Galleria degli Uffizi, which is from one of the middle scenes in Fiat Lux. So, these lighting environments are posted on our website, and rather frequently, actually, people use them to do some of their own renderings, and I get to see things like this: you know, a very realistically rendered amplifier that a fellow in France did, illuminated by the light of my kitchen when I was living in Berkeley, because I posted it on there. And I've seen quite a number of objects strangely rendered into my kitchen, which is a cool thing. Even more fun is, occasionally you see someone has gone and shot their own light probes for their own crazy little idea, and this is a series of light probes that surfaced on the internet a while ago, shot by a young fellow, 17 years old at the time, named Nick Bertke, who went around his parents' house with an inexpensive Canon PowerShot G2 camera and shot a high dynamic range image series of this mirrored sphere. This is showing that, you know, this is bright enough that you can see the indirect light coming from the walls and the floor. The light from the window is totally blown out, so you wouldn't be able to accurately light objects using that as your record of illumination. But he shot it with these bracketed exposures, and as you get to shorter and shorter exposures, you can actually finally see, correctly within the range of the sensor, what the illumination was: you know, the blue light from the sky and some of the other less blue light bouncing off of the concrete outside. And what he wanted to do with this, he had a little idea. He's a fan of the game Half-Life 2, and he knew how to hack the game so you can output the character models and then load them up into your own modeling software, and he had the idea that he'd make some of these Half-Life 2 characters basically come and visit him at his parents' house. And he had quite a bit of luck with them. So, here these characters are added in to some background plate photography of the scene, and they're lit by the light that was actually there at the time. So with relatively little effort, he could make it look like these guys were kind of hanging out and spending the day with him there. He's getting kind of comfortable on the couch here. And then maybe they stayed a while and watched some TV later, and here he is.
And the exciting thing about this is that, you know, you can see kind of a little bit of the light from the TV on this coffee cup, which I think was actually there, and that's completely consistent with the light that you see on the character. And also all the indirect illumination and the soft shadowing that you get from the characters is basically consistent. So what he was able to do is take this crazy idea he had in his head and communicate it visually in a way that the first thing you see in the image is not whether it looks real or not but the idea that he was trying to communicate. And that's really, you know, the most exciting thing I see when some of these technologies get used for creative purposes. And this technique has also been used quite a bit for feature film production. The Academy Award-winning visual effects in the film The Golden Compass, done by Mike Fink and his team this year, used extensive image-based lighting techniques to render folks like the digital animals and the polar bears and such, and with their great artistry, they got some amazing results with that as well. So one of the things that you need to create really compelling computer graphics is to be able to render people, and there's been a lot of exciting work over the years, recently, in the area of digital people, and some of our work has actually played into some of that as well. The first project we got interested in, in rendering people, was actually the last project that I did at UC Berkeley in 2000, before I went down to the Institute for Creative Technologies, and we built this device called Light Stage 1, which had the goal of basically taking a data set of how a person's face looks lit from light coming from every direction that light can come from. And with this set of plastic pipes and wood that we got at Home Depot over the course of about ten visits, as we figured out how to build this thing, we could rotate this light in a spiral. It took about a minute to go from top to bottom, but we would get a data set that would show the person's face lit from all of these different illumination directions. We just recorded it live to a MiniDV camera and then pulled the footage off of that. We see the face lit from the front, the sides, above and below, and even from behind. As it turns out, lighting objects from close to behind is also really important, because you get these kind of rim lighting effects that cinematographers like to exploit when they're lighting their real characters. And the idea was, let's take this face and try to light it with one of our light probe images that we got, like the Grace Cathedral light probe.
So what would it take to illuminate this face with this illumination environment? Well, as it turns out, there's actually a very straightforward way to do it, which is that if you take this image here, it's an omnidirectional image, you can resample it to a different coordinate mapping. This here is a latitude-longitude mapping, and essentially, if you multiply this data set by this data set, they're in the same coordinate space, they're sampled the same way now. You actually end up lighting the face with that lighting environment one piece of the environment at a time. So these images of the face here are now bright and yellow, because there was bright yellow light coming from the corresponding directions in the environment. These images here are bright and kind of a cool color, because there's bright, cool illumination coming from the stained glass windows above. And all of these other images of the face here are sort of dim browns and yellows and purples, because those are the different colors of the indirect illumination bouncing up from the floor and coming in from the walls onto this fellow here. So now that we've illuminated his face by that environment one little piece at a time, we can just take advantage of the linearity of light and the superposition principle and simulate him lit by the entire environment at the same time, just by adding all of those images together. And the result is that you get an image of his face lit by that lighting environment without him ever having to go over to Grace Cathedral and actually get lit by the light.
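That weighted sum is very little code; here's a minimal sketch, assuming you already have one image per light direction and the environment map resampled at those same directions (the names are mine, not from the talk):

```python
import numpy as np

def relight(basis_images, env_weights):
    """Image-based relighting by superposition.

    basis_images: array of shape (num_lights, H, W, 3), the face lit
    by each light stage direction in turn (linear radiance values).
    env_weights: array of shape (num_lights, 3), the RGB intensity of
    the environment map resampled at those same directions.
    """
    # Scale each basis image by its environment color, then sum them:
    # light is linear, so the sum is the face under the whole environment.
    return np.einsum('lc,lhwc->hwc', env_weights, basis_images)
```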
And it gets nice things like, you know, the yellow kind of rim lighting effect here from the light bouncing off the altar. You can see the stained glass windows reflecting in his forehead and in his hair. You get all the right effects of how light hits skin and is able to have, you know, a specular component, a diffuse component, subsurface scattering, self-shadowing, interreflections, and that's just because it's all there in the original data that you capture. So let's see here. So we're going to go through a couple of slides here that we're not going to be able to put on the webcast, but I wanted to talk about a chance that we had to work with Sony Pictures Imageworks to apply this kind of technique to some of their digital stunt double characters for a couple of films that they did. Basically, through a collaborator on our first project named Mark Sagar, who went to work at Sony Imageworks sometime after this, and also visual effects supervisor Scott Stokdyk, who had seen the SIGGRAPH paper, they thought this could be a good way to get realistic skin reflectance for some digital actors that they wanted to shoot, first of all for the movie Spider-Man 2. So, starting in about late 2002 or early 2003, we did some tests, and then they brought over a couple of the actors from the film. This is Alfred Molina, who played Doc Ock in the movie. And we captured the data set on film with him and one of our new light stages. This is Light Stage 2, and it had kind of a semicircle of strobe lights that go around. So this is actually a long exposure photograph that makes it look like there's a whole sphere of light around him, but it's actually just a semicircle of lights. We captured that kind of data, and then what we did for this is we had a rough kind of Cyberware-scanned 3D geometry of the face, and we projected these images from the sides and from the front as basically relightable texture maps that you could then put on the 3D geometry and get a relatively realistic rendering of the face lit by any kind of illumination environment that you want. We had to be a little bit careful to separate the specular reflection of the face from the subsurface scattering component of the face, because when you change your viewpoint around, the specular reflection actually needs to shift around according to the surface normals and your viewpoint. So, we actually do a color-based separation of those and then re-synthesize the specular component according to the new viewpoints, so it would shift around on the digital character. It wouldn't just seem plastered onto the face. But it was relatively successful for them. They were able to use it in about 40 shots in the film for a totally digital Doc Ock character, for all of the skin.
They augmented it with traditional computer graphics, with some nice cloth simulations for the rest of the body. They added, you know, digital sunglasses, so they had to figure out how to make the light interact with a computer-generated object that's close to one of these image-based objects. And they had a full-screen digital close-up for his death scene, when he's floating back in the water, which they thought would be a little dangerous to film for real. And we also shot Tobey Maguire for a couple of scenes where he has to stop a train and has his mask off there. More recently, we got to work with this fellow here, who played the new Superman in the Superman movie that came out in 2006. And with this, they scanned the film at higher resolution, they had the pipeline more refined, and they were able to get some even better results. So when he needs to throw the space shuttle back into space, that's a digital Superman looking at the scene there. There's a scene toward the end of the movie where things have worked out reasonably well and he's pretty satisfied, so he has kind of a nice satisfied expression flying around Metropolis. In this case, they actually started with a neutral scan of his face and then animated it using their animation system to put a little bit of a pleasant expression on there. That works pretty well. If he needed to really act and talk and go into extreme expressions, that's not going to work as well, because it's too far away from a neutral pose to look quite as realistic. So some of the work we're doing now is actually looking at trying to capture this data live, people in different expressions and positions. There's also one shot where they came up really, really close to him. He's out in space, he's thinking really hard, we have to dramatize this, and so the camera flies in really close; he's probably thinking right over in this area here. And because it's an image-based data set, they can, you know, design the lighting to have a little bit of a rim light here and a little bit of warm light coming from below. I'm not sure where that is from in outer space, but it's somewhere at this point. And you can see they really can get a lot of skin detail and a lot of nice skin reflectance effects that would otherwise be kind of difficult.
So, we've had a chance to continue working on these on the research side, and one of the devices here, which we can totally put on our web presentation, is Light Stage 6, which was an idea to try to extend the light stage idea from just capturing faces to capturing the whole human body. And one of the reasons this actually happened is that my institute had some extra space in a satellite facility that they had to find something to do with, and they said, "Hey Paul, you were always kind of talking about building that big light stage. Well, it turns out this could actually be a useful thing for us, because we've got a good valid project to do with the rest of the space." So at that point, we had to figure out how to turn the various talk that we had into a realizable plan, and pretty soon we had a [INDISTINCT] model that looked like this, which was a somewhat daunting thing to put together for just one research group. But we were lucky that a fellow named Sebastian Sylwan, who's now at Autodesk and had been head of a virtual stage facility in Italy, wanted to work with us. He knew how to get bigger projects together and make bigger things happen. So, I worked with him very closely to actually get the design here. He found all the places that could source all the parts, and within about two months and a little bit of sore shoulders, we were able to put together Light Stage 6. And the idea of Light Stage 6 being an entire sphere of lights is that we wanted to very rapidly be able to capture these data sets. We wanted to very quickly go from being able to light somebody from this direction of light to this direction of light and capture this kind of image-based relighting data set in real time.
Now, if we have some luck here, this video will play and we'll see here. Now, I'm going to pop out of the program and play it from here. And what we have here is one initial project where we actually were capturing what amounts to a seven-dimensional data set of people going through natural motions. So, this here is Bruce Lamond, one of the researchers in our laboratory. We have him on a treadmill here. He normally is a relatively serious fellow, and it looks like he's concentrating a little harder than usual because he's trying not to fall off of this treadmill. He's actually paying attention to some grooves that we cut underneath the treadmill belt, so we can tell where he is left and right and forward and back. But the idea is that if we spun him around for about 45 seconds and shot him with high speed cameras under time-multiplexed illumination, we would get a data set of him under all lighting conditions. So, if we see him in slow motion, we're going to slow this video down, you can see what's really happening is we're very rapidly going from one lighting direction to another. Sometimes it's dark, then another, then another, and we're interleaving all of these data sets at the frame rate of the camera. In this experiment we're actually capturing 33 different lighting conditions every 30th of a second, and over a 30th of a second, these are all the lighting conditions here of Bruce.
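Conceptually, pulling that capture apart is just de-interleaving the high-speed frames by light index; a minimal sketch under that assumption (the helper name is mine):

```python
import numpy as np

def deinterleave(frames, num_lights=33):
    """Split a time-multiplexed high-speed sequence into per-light videos.

    frames: array of shape (num_frames, H, W, 3), captured while the
    lighting cycles through num_lights conditions, one per frame.
    Returns shape (num_lights, num_frames // num_lights, H, W, 3):
    one lower-rate video per lighting direction.
    """
    usable = (len(frames) // num_lights) * num_lights
    f = frames[:usable]
    # Frame k was shot under lighting condition k % num_lights.
    return np.stack([f[i::num_lights] for i in range(num_lights)])
```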
And from that kind of data, we can use that image-based relighting process under any kind of illumination. And the reason that we have him spinning around on the treadmill, if we skip forward, you can see, is we actually got him from all different angles as well. It's so that we could also have virtual control of his viewpoint. So, we had our high speed camera at about chest level, and then from a little bit above, we had a borrowed high speed camera, courtesy of Vision Research. And then from floor level, we had another borrowed high speed camera. So, we really had three cameras, and the idea was we'll just have him repeat his motion 36 times as we rotate him 10 degrees over the course of each one of those. And we effectively get this three-by-thirty-six light field of Bruce, also for every frame of his animation and for every one of these illumination conditions. Now, we wanted to eventually take Bruce out of the light stage and then make it look like he's running across some place that he's never been to, with complete control of viewpoint and illumination. And part of what we would need for that is to get an alpha channel, or a matte, for him. So, for one of our lighting conditions, we turned off all the lights on Bruce and we just turned on lights on this piece of gray paper that's behind him, and that gives us the silhouette, and that's exactly the right image that you need if you want to composite him out of that environment and into another environment. The problem was, in this case here, that didn't give us a good matte for his feet, and we couldn't really think of a way to take this, you know, treadmill that we'd found at the local Sport Chalet and get it to, you know, glow brightly for a 1000th of a second, 30 times per second. So, what we did for that is we actually covered the entire turntable and the belt of the treadmill with retroreflective cloth and then put some ring lights around the camera that were also time-multiplexed in. So, when we're shooting the matte frames, that actually glows back toward the camera as well, and then we get a good matte for the entire body at that point.
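With a matte in hand, putting the relit subject over a new background is the standard "over" composite; here's a minimal sketch, assuming the relit foreground is black outside the subject (premultiplied-style) and the alpha comes from those backlit frames (names are mine):

```python
import numpy as np

def composite_over(foreground, alpha, background):
    """Alpha-composite a relit subject over a new background plate.

    foreground: (H, W, 3) relit image of the subject (black outside him).
    alpha: (H, W) matte in [0, 1] from the backlit silhouette frames.
    background: (H, W, 3) plate of the new environment.
    """
    a = alpha[..., None]
    # Standard over operator: keep the subject where alpha is high,
    # let the plate show through where it is low.
    return foreground + (1.0 - a) * background
```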
So, going back to the relighting idea, here's Bruce walking forward in the stage, but we can relight him to show him lit from, you know, any direction that we want, or we can play that image-based relighting trick and show him under the light of Grace Cathedral or the Uffizi Gallery. And if we want to change the viewpoint, then essentially what we're going to do is morph between the different viewpoints that we have. We run optical flow between adjacent viewpoints, and then we actually combine the idea of view interpolation, which is one of the inventions that Lance Williams made at Apple back in 1993, a very important paper, with another very important idea, which is the light field concept, developed at Stanford in 1996 and also by some researchers at Microsoft Research contemporaneously. And we basically combine view interpolation with light fields such that the light field quadrilinear interpolation coefficients are also used to scale the displacement vectors that we get as we morph from one view to another. Then we can actually put both of these things together and, from a relatively sparsely sampled light field, smoothly generate views that are farther away than we originally captured, closer than we originally captured, and in any direction all the way around.
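The key trick is that the same interpolation weight both blends the pixel colors and scales the flow vectors; a minimal two-view sketch of that idea, simplified to a single pair of neighboring views and one channel (the names, and the usual approximation of evaluating flow at the destination pixel, are mine):

```python
import numpy as np
from scipy.ndimage import map_coordinates

def view_morph(img_a, img_b, flow_ab, t):
    """Flow-based morph between two adjacent light field views.

    img_a, img_b: (H, W) neighboring captured views.
    flow_ab: (2, H, W) optical flow from view A to view B (dy, dx).
    t: interpolation weight in [0, 1]; in the full system this is one
    of the quadrilinear light field coefficients.
    """
    ys, xs = np.mgrid[0:img_a.shape[0], 0:img_a.shape[1]].astype(float)
    # Backward-warp A partway toward the intermediate view...
    warped_a = map_coordinates(img_a, [ys - t * flow_ab[0], xs - t * flow_ab[1]], order=1)
    # ...and B the rest of the way in the opposite direction.
    warped_b = map_coordinates(img_b, [ys + (1 - t) * flow_ab[0], xs + (1 - t) * flow_ab[1]], order=1)
    # The same weight blends the colors, keeping warp and blend consistent.
    return (1 - t) * warped_a + t * warped_b
```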
So, this is actually real-time rendering on an NVIDIA card, a demo done by Charles-Félix Chabert in our group, kind of pushing in to the scene and then doing a slow rotation around it. So, finally, on to our goal: here is a location that we thought it would be cool to watch Bruce running through. I shot it as a high dynamic range, omnidirectional image, this time actually using a Canon still camera and a fisheye lens aimed in a couple of different directions, put that back together into this high dynamic range lighting environment, and then we're going to drop Bruce into the scene and see the results. This is one of the first results we got with the technique, and here he is. So, what we did is I animated kind of a camera pan across the scene. We're matching the viewpoint on Bruce as the camera pans across. We've also illuminated him with the light from that environment. So hopefully it looks like, you know, the color balance and the light directions are about correct. We have simultaneous questions from Ken and Lance here. Let's see if it's the same question. What do we have?
>> Shadows?
>> DEBEVEC: Shadows, very good. And Lance?
>> Motion blur.
>> DEBEVEC: Motion blur, okay. Different questions, both very good. Let's see here. On shadows, I'll show you in a second what we did with that. For motion blur, we did not add motion blur to the scene. In some of our earlier facial time-multiplexed illumination work, we actually did use the optical flow vectors to re-synthesize the appropriate 180-degree-shutter motion blur. We just didn't do that for this project, because we were using an NVIDIA card to do the rendering and trying to keep it real-time. But we did get some very nice results in some of the facial work that we did. For the shadows, if we go a little bit further, he is actually casting a soft shadow. He also has a friend over here, just to prove that this is all virtual. And the shadows here, they're not terribly high resolution shadows, but what we did is we actually used the silhouettes that we got from all around to carve out a basic volume of him, which, by the way, is the first use of any notion of his geometry that we have in any of these renderings. We get a basic kind of voxel model of him going along. And then we use that to cast rays from all of our basis lighting directions to figure out a shadow map that you get from each basis illumination condition, and then we essentially do an image-based relighting combination of those shadow maps to figure out how much light would be blocked in one direction versus the other. So you actually get kind of a warm-colored shadow when you're blocking the skylight, and you get a cool-colored shadow when you're blocking the indirect light from the warm-colored building that's behind us.
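A minimal sketch of that silhouette-carving step, assuming calibrated cameras with a `project` function that maps world points into each view (all names here are mine, not from the talk):

```python
import numpy as np

def carve_visual_hull(silhouettes, projectors, grid_points):
    """Carve a voxel occupancy volume from multi-view silhouettes.

    silhouettes: list of (H, W) boolean mattes, one per camera.
    projectors: list of functions mapping (N, 3) world points to (N, 2)
    integer pixel coordinates (x, y) in the matching view.
    grid_points: (N, 3) world-space voxel centers to test.
    """
    occupied = np.ones(len(grid_points), dtype=bool)
    for sil, project in zip(silhouettes, projectors):
        px = project(grid_points)
        inside = ((px[:, 0] >= 0) & (px[:, 0] < sil.shape[1])
                  & (px[:, 1] >= 0) & (px[:, 1] < sil.shape[0]))
        hit = np.zeros(len(grid_points), dtype=bool)
        hit[inside] = sil[px[inside, 1], px[inside, 0]]
        # A voxel survives only if every camera sees it inside the silhouette.
        occupied &= hit
    return occupied
```

Rays cast from each basis light direction against the surviving voxels then give one shadow map per lighting condition, which are combined with the same environment weights as the relighting itself.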
And thinking that you can't have too much of a good thing, we put a couple of Bruces together here. They're not actually interreflecting light off each other or self-shadowing each other; that's kind of future work at this point. But this was enough to at least amuse Bruce quite a bit. And he actually did all the work on the compositing and the matte extraction there. This is a reversed time lapse of building our light stage over the course of four days. There we go. All right. So, let's go back to some slides here.
And what I wanted to talk about is a more recent project that we've done that's face related, which was inspired a little bit by a completely different kind of facial rendering pipeline that has also shown a lot of promise, which is the idea that if we're going to try to do a digital model of an actor, maybe the first thing you should do is take a life cast of the actor's face in plaster and then get that scanned, since now it doesn't move and it's diffuse; it's a good surface to digitize. You can do that at very, very high resolution. This is something that's commonly done, and there's a company, XYZ RGB, that does this really, really amazingly well. There's a digital face [INDISTINCT] project that Lance Williams was involved in, a test at Disney, I think in like 2000 or so, when he was working on this. That was actually the first time that this really high resolution face casting process was applied to creating a digital actor, and they got some amazing results with that. The standard problem here is that it's pretty much not good for getting, like, you know, a live performance of an actor, since it requires taking the cast and such. There's a bit of inconvenience involved. Some people say it kind of changes the shape of the face a little bit, and the other problem is you don't easily get aligned texture maps for the face. So you might ask, why not just, you know, scan the face itself at really high resolution, and you could use a really fast laser scanner for that. And one of the problems associated with that is the fact that skin is not as nice a surface to scan as gray plaster. The problem is, of course, subsurface scattering, and so if you have a little laser line on a piece of paper, it might give you a nice sharp line. But once it actually hits skin, it's going to diffuse out and get blurry. So if you try to measure the geometry of the fine-scale skin wrinkling and such based off of that blurry line, you're going to have some trouble. Now as it turns out, there actually is some light that reflects off the skin that does not get affected by subsurface scattering, and that is the specular reflection of the skin. And this is an image that we found. This is actress Hilary Swank.
It's not an attractively lit image, because it's flash right from the front, but it demonstrates this point: where you can see the specular reflection of the light, that's where you can actually see the skin detail, the shape of the pores and the fine wrinkles. And presumably, on her forehead next to where the specular reflection is, she has a similar kind of skin texture right there, but you don't see it at all. That's because the subsurface scattering blurs it all out. It's in the specular reflection that you see this. And in fact, it's because you see it in specular reflections that it's actually important for rendering digital characters. If we didn't have any specular reflection, you'd never see this effect, and you could probably get away with not modeling it. But if you want to get that realistic skin look in the specularities, you need to get that kind of geometry. So, our idea was maybe there's a way that we can photograph just the specular reflection of somebody's face and then figure out the detailed shape of the face just from that.
And we thought back to some of the work we did for our first Light Stage paper, where we had done a little experiment using cross-polarization to remove specular reflections from someone's face. This is Holly Ken, one of our undergrad students working with us at the time. And she's lit by a single light and photographed by a camera right in front of her. You can see that we've got both the specular reflection and the subsurface reflection of the face here. As it turns out, if you put linear polarizers on both the light source and the camera, and they're at perpendicular orientations, the specular reflection maintains the polarization of the light, and so it can't make it through that second polarizer, and it doesn't show up in the photograph. So, this is a cross-polarized image of Holly without any specular reflection. The subsurface light, since it actually gets underneath the surface and scatters around a couple of times, gets depolarized, and about half of it will make it through that second polarizer. So, the result is that we can actually observe only the subsurface scattering on its own. And if you radiometrically calibrate your cameras, which is something that fortunately we knew how to do at the time, and you take the difference between the diffuse-only, or subsurface-only, image and the image that includes the specular, you can get an image of only the specular reflection just on its own from just two photographs.
So, the thought was, well, let's try to take some photos of just the specular light on the face and try to figure out what the shape of the face is from that. The problem with doing it with just a single light is that you only see specular reflections from certain areas of the face; in this case, here, you know, on one cheek but not the other cheek. And what we really want is to see the specular reflection coming from the entire face at the same time. So, as it turns out, we have devices in our lab that can illuminate a face from all the directions that light can come from at the same time. This is our Light Stage 5 device, which was just for faces but otherwise similar in a lot of ways to Light Stage 6. And we asked ourselves, could we cross-polarize out the entire sphere of illumination at the same time? And as it turns out, first empirically, and then actually figuring this out, there's a specific pattern of linear polarizers you can put on every single light of the stage such that the specular reflection from every possible surface normal will end up with the same orientation of polarization by the time it gets to a camera in the front. And it looks like this, basically: there's kind of a bit of a whirl around the Brewster angle here, and otherwise they're vertical here, horizontal here. It takes us about an hour or so to get all of these oriented correctly. But the result is that we can actually light somebody from every direction of light at the same time and observe them without any specular reflection whatsoever. So, here, I think the video projectors are brightening this up a little bit extra, but this is an image of Tom, a producer in our group, lit from the entire sphere of light with no specular reflection whatsoever. So, if you are, for example, making a digital character model, this could be a very useful image to start with as your diffuse texture map, because it has very little in the way of, you know, specularity or variable shading, which is usually a challenge when creating characters. Now, this is with the polarizers crossed; if we rotate the camera's polarizer the other way so that we have parallel polarizers, then we can actually bring the specular reflection back in. And here, it's definitely too bright.
You can see the specular light comes in. But if we take the difference, hopefully this will show up pretty well, we can get an image of just the specular reflection of the face from the entire sphere of illumination at the same time. This looks like a black and white image; it's actually a shot in color. And since the specular light hasn't had a chance to interact with, you know, your melanin or your hemoglobin, it doesn't pick up any skin color. So, this does correctly look like it has basically no chromaticity to it. And as you can see, we're actually picking up a lot of the detail of the skin shape and shading from this specular-only channel. If you're again creating a digital character and you need your specular intensity map, this could be a very good image to use as a start for that as well. So, the last part of the project was the idea of, let's try to do a variant of photometric stereo. This is a computer vision technique where, if you light an object from different directions, by analyzing the diffuse reflection you can figure out what the surface normal is, because there's going to be only one surface normal that would explain the different colors that those different light directions would produce at the camera. And what we came up with is, in the specular case, you actually have to use full spherical illumination patterns, because the specular lobe is narrow enough that you might miss it entirely if you use point sources. But we came up with a technique that uses four spherical gradient patterns: a full sphere, a gradient of light from top to bottom, a gradient of light from front to back, and a gradient of light from left to right. Essentially, we just shoot these four images, and of course, we really also have to shoot these images here of the corresponding patterns in the diffuse channel in order to compute these. But from just these images and some very simple math (essentially, you take this image, divide it by this image, and scale it so it's between minus one and one), it reads out the reflection vector at every one of the pixels on the face based on just the specular reflection. And then, if you just tilt that halfway back toward the camera, you have an estimate of the surface normal.
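A minimal sketch of that gradient-ratio math, assuming you already have the four specular-only spherical illumination images separated out (function and variable names are mine):

```python
import numpy as np

def gradient_normals(full, grad_x, grad_y, grad_z):
    """Estimate per-pixel normals from spherical gradient images.

    full: (H, W) image under constant full-sphere illumination.
    grad_x, grad_y, grad_z: (H, W) images under linear gradients of
    light along each axis. All are specular-only and linear.
    """
    eps = 1e-6
    # Ratio of each gradient image to the full-on image, rescaled from
    # [0, 1] to [-1, 1], reads out the per-pixel reflection vector.
    r = np.stack([2.0 * g / np.maximum(full, eps) - 1.0
                  for g in (grad_x, grad_y, grad_z)], axis=-1)
    r /= np.maximum(np.linalg.norm(r, axis=-1, keepdims=True), eps)
    # The normal is the half vector between the reflection direction
    # and the view direction (camera along +z here).
    n = r + np.array([0.0, 0.0, 1.0])
    return n / np.maximum(np.linalg.norm(n, axis=-1, keepdims=True), eps)
```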
You can also do this, of course, with just the diffuse channel, with these patterns here, and figure out where the diffuse light is coming from. If you do it with just the diffuse channel, you can get a normal map that will shade an image that looks something like this. You see a little bit of detail where you've got, like, you know, whiskers and such that darken the image, but you don't see nearly the kind of detail that you see from the specular map. So, this is actually shaded with a normal that's obtained from just the specular component, and you can see it actually picks up all of these fine wrinkle details and all the skin pore detail. And the final thing that we needed to do was to figure out a way to actually apply this kind of map (let's skip over a couple of slides here) to some geometry. And as it turns out, there are some techniques out there, which we'd experimented with in our group in about 2001, where you start with a low resolution face scan, the kind that you do get from the laser scanner, which doesn't have skin pore detail on it. And since you know what the surface normal map should be for that geometry, you can essentially emboss that surface normal map onto the geometric model and put that kind of skin pore and fine wrinkle detail onto your 3D geometry. The nice thing about it is that we not only can get this high-res geometry, but we have perfectly aligned texture maps for the diffuse component, the specular intensity component, and such, and we can map those onto the face as well.
Another thing that we realized, and again, I apologize, it's a little blown out on the video projector here, but this is actually a real-time rendering, just using our diffuse map and our specular maps. Since we get surface normal maps for both the specular component and the diffuse component, we can actually render those two components of the face with their corresponding surface normal maps. And the diffuse component's normal map is actually going to have less surface normal variation than the specular component, because of the fact that the light is scattered, which effectively blurs where the light is reflecting from, and it has less to do with the surface shape of the skin than with what's going on underneath the skin. So, as it turns out, if you render with these hybrid normal maps, with the smoother normal map for the diffuse and the sharper normal map for the specular, it actually gives you a first-order approximation to the correct subsurface scattering behavior of what the skin is doing, in, you know, just a local shading model. It won't get you light bleeding into the shadowed regions or the ears glowing when they're lit from behind. But for, you know, the convex areas of the face, it gives you a very close approximation to how it will look with the full subsurface scattering under full-on illumination conditions.
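A minimal sketch of shading with hybrid normal maps, using a simple Lambertian-plus-Blinn-Phong local model as a stand-in (the talk doesn't specify the exact shading model, and all the names here are mine):

```python
import numpy as np

def shade_hybrid(diffuse_albedo, spec_intensity, n_diffuse, n_specular,
                 light_dir, view_dir, shininess=50.0):
    """Shade skin with separate normal maps per reflectance component.

    diffuse_albedo: (H, W, 3) diffuse texture map.
    spec_intensity: (H, W) specular intensity map.
    n_diffuse: (H, W, 3) smoother normals measured from the diffuse channel.
    n_specular: (H, W, 3) sharper normals measured from the specular channel.
    light_dir, view_dir: unit 3-vectors.
    """
    l = np.asarray(light_dir, float)
    v = np.asarray(view_dir, float)
    h = (l + v) / np.linalg.norm(l + v)  # Blinn-Phong half vector
    # Diffuse term uses the blurred normals: a cheap proxy for subsurface scattering.
    n_dot_l = np.clip(np.einsum('hwc,c->hw', n_diffuse, l), 0.0, None)
    diffuse = diffuse_albedo * n_dot_l[..., None]
    # Specular term uses the sharp normals, which carry the pore-level detail.
    n_dot_h = np.clip(np.einsum('hwc,c->hw', n_specular, h), 0.0, None)
    specular = spec_intensity * (n_dot_h ** shininess)
    return diffuse + specular[..., None]
```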
You can also take the model and show it with a subsurface scattering rendering as well, rendering just from the specular map and then using a subsurface scattering method, such as the Jensen and Buhler 2002 technique, to get a very nice rendering as well. For one of our data sets, we got interested in trying to get the data of a person's hand. So, this is Hideshi Yamada's hand that he put up in our light stage. You can see we got the details of sort of the, you know, pretty fine skin wrinkles and such. If we render it with the hybrid normal maps, we get a pretty nice rendering there with the specular component. In this case, the subsurface scattering rendering was particularly compelling, because it gets that skin color bleed into the shadows here and into the shadow here, and that really helps sell it quite a bit as well. So, we've gotten excited about the fact that this photometric-only technique, as it turns out, requires eight photographs for the spherical illumination conditions and then just five more photographs to do a structured light scan at the end. And with that small number of photographs we're getting very high resolution geometry and registered, calibrated texture maps for diffuse and specular and these normal maps, so it seems like a good way to capture faces. We've now started to capture faces in different expressions, which we think we can use to drive digital actor models. And we're also looking at taking a variant of this and running it in real time; shooting it with even not all that high speed photography, we think we can shoot this kind of data set, or close to it, at frame rate and capture it for actors' performances. So that's some of the direction in which the work is continuing.
I think I have five more minutes, and I have one more thing that I could talk about, which is on a little bit of a different topic, but if we can play the video, I'll try to tie it into what we've been talking about before. This is a project we did in collaboration with Mark Bolas, who's at the USC School of Cinematic Arts, and Ian McDowall from Fakespace Labs, which is near here, Hideshi Yamada from Sony, you saw his hand just a while ago, and Andrew Jones, the lead author from our research group. And it was an idea to use some high speed video projection techniques that we've been using for doing real-time structured light scanning of faces and try to adapt them into becoming a 3D display. And this is the kind of 3D display we showed at SIGGRAPH this last summer. The idea was, let's make a 3D display that doesn't necessarily have a terribly large image. In fact, it's a pretty small image; it's about five inches tall right here, and people are peering in to it. But it's one that you can see from any direction all the way around, and it does not require 3D glasses. So you can see here, this is actually a lot of our friends from the software department at Digital Domain who came for a visit. I think there are about 11 people around the display here, and they're all getting their own individual 3D view of the scene from all these different angles. And the basic way that this works is that we have a video projector on top that projects imagery down onto a spinning mirror that's at 45 degrees, and the mirror is spinning around the y-axis so that it kicks the light, the image from the video projector, out to all different directions around it relatively quickly. We spin the mirror at about 15 to 20 frames per second, so it is a little bit flickery, but that's fast enough that you can enjoy the 3D. The next versions will be 30 to 40 frames per second. And the video projector is projecting onto the mirror fast enough so that in this 15th of a second that it takes to do a rotation, we actually get an individual image for every degree and a quarter all the way around the circle. So that's 288 images at 15 times per second; you'll find from that that we actually have to project imagery onto this mirror at 4,320 frames per second. We have a question.
>> What projector have you used?
>> DEBEVEC: This actually started its life as a 2,000-lumen Optoma DLP projector, and it got seriously hot-wired in order to make it play this imagery as quickly as it does. And the technique that we use for that actually takes advantage of DVI. We decided, you know, for these first versions, let's not worry about color. We took the color filter wheel out of the projector. Let's not worry about gray levels for the moment. Let's just go for binary images; it shows some nice wireframes and stuff. And what we're doing is we're actually rendering imagery out of an NVIDIA graphics card that's encoded like this. Normally, you send 24-bit color images over your DVI. What we're doing is we're sending 24 1-bit images, each with a different view of the scene. So this is our model here. These are 24 different views, each a degree and a quarter apart around the circle, and they're all packed into one 24-bit color image. So we actually send this to the projector. We render them just by setting the bit pattern, and then the projector automatically plays each image as a 24-frame movie. We set the refresh rate of the card up to about 180 Hertz, or even higher than that; it doesn't really start to flake out until about 240 Hertz. And then we can actually get these 4,320-frame-per-second movies. And the digital micromirror devices, the TI DMD chip mirrors, have no problem with any of this. When they're showing color, they're usually going at around 9,000 flips per second or more.
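A minimal sketch of that DVI bit-packing trick, assuming 24 prerendered binary views per refresh (the NumPy names are mine):

```python
import numpy as np

def pack_views(binary_views):
    """Pack 24 one-bit views into a single 24-bit RGB frame for DVI.

    binary_views: (24, H, W) boolean array, one view per bit plane.
    Returns an (H, W, 3) uint8 image; the hacked projector unpacks the
    bit planes and flashes them as 24 sequential binary frames.
    """
    assert binary_views.shape[0] == 24
    frame = np.zeros(binary_views.shape[1:] + (3,), dtype=np.uint8)
    for i, view in enumerate(binary_views.astype(np.uint8)):
        channel, bit = divmod(i, 8)  # 8 bit planes per color channel
        frame[..., channel] |= view << bit
    return frame
```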
So I have a video that shows basically how this works here. This is the mirror before it spins up. The mirror has an anisotropic diffuser on it, so when the light hits it, it gets spread out vertically and a little bit horizontally, but this makes it so you can see it even if you're a little above or below. We don't get vertical parallax naturally with this device; it's a horizontal-parallax-only display. But it gives us a chance to get views out to everyone. And this actually does mean it's a little bit more of a complicated story figuring out how to project imagery onto the display. We actually are running a custom vertex shader to render somewhat unusual multiple-center-of-projection images out there, so that when this anisotropic diffuser kind of re-bins the light out into space, you end up seeing correct perspective. And that's, you know, sort of what section three of the paper is all about. But if we actually get this spinning up in concert with the video projector, here it goes, we get a three-dimensional scene on there. And this is just me shooting handheld, walking around. Now, one of the nice qualities of this is that since we're sending completely independent images out in all directions, we have no problem getting occlusion in this display. Some other kinds of volumetric displays have a nice three-dimensional image, but it's kind of ghostly: all the light is lit up, the space is lit up, and you can see things through other things. Here, when we're looking at the back of his head, we don't see the face anymore, because, you know, what you see from one side really doesn't have anything to do with what you're seeing from the back side. We're just rendering this out of the graphics card. And the other cool thing is that since graphics cards are so fast, and this is about a 2,000 or 3,000 polygon model, we can actually render it at 5,000 frames per second on the graphics card natively. So that can let us make this an interactive display. This is Andrew Jones with the Polhemus device, actually interactively moving the model around, because it's live off the NVIDIA card. We did one experiment where we used the tracker to actually track just the vertical position of where the camera was and then had the NVIDIA card interactively adjust the vertical viewpoint, so we can sort of simulate vertical parallax.
So this is with the tracked camera, and we can make it so you see it from above when you actually go above there. As you can see, so far this is black and white. We made one slightly desperate attempt at color, which was this tent mirror, where we actually had two faces; we kind of split the spectrum down the middle, and we had one that was sort of orange-ish and one that was kind of bluish. And the idea is that as this mirror turns around, we can do--I think--oh, we have a connection successful. Hi, there. I guess I might have called myself. And there we go, and we're right back to the video. Thank you. So we spun this thing around, and by doing two channels of light, we got at least the two-channel color that we had. Now, the right way to do this is really to use a three-chip DLP and just put, you know, red, green, and blue light down onto the thing. We're talking to Texas Instruments, and they seem kind of interested in our project, so maybe this year we'll have something like that. The other cool thing was that when we had the tent mirror, for the stuff that was whitish, it actually gave us two displays of the image for every rotation, so this became a much more stable image, going at 30 to 40 frames per second. Two more examples: one is, we got interested in trying to show not just wireframe imagery but maybe photographically acquired imagery. So going back to this light field concept, we shot a light field of this tourist souvenir that one of our folks brought back. And we dithered it using Victor Ostromoukhov's dithering algorithm and loaded up the entire NVIDIA graphics card's memory with different views of it. And we're actually doing live re-binning of this imagery here according to the vertical tracking, and then we're putting the light field back out into space. So, this is the real object sitting next to the virtual version of it. And that's sort of looking pretty cool, and we realized that the fact that we've only got black and white pixels isn't so bad, because the little bit of blur that you get from the motion blur and the diffuser kind of starts to make it look like pretty good gray levels. And we liked that so much, we thought what we really need to try to do is something that actually, you know, is animated and moves. And we wondered, is there some kind of photographic data set that we can shoot from all directions of something that's moving? And we realized that actually we have that kind of data set. So calling Bruce back into service here, we got our Light Stage 6 data set of Bruce re-binned onto the display, and then we're able to sort of [INDISTINCT], thank you. So, not Princess Leia yet, but maybe we're getting there. Very cool. All right.
Well, that's all I brought today. So, thanks to everyone here; there are some websites with all the videos. Thank you very much. And if there's time for any questions--I think they're going to do another talk here right away, but I'm happy to answer questions as long as there's time. Yes.
>> I think probably [INDISTINCT].
>> DEBEVEC: You know--okay. So one of the members of our Computer Animation Festival jury was Randal Kleiser, who I'm sure you know; he's a film director. And when we showed it to the whole jury, when they were choosing films, he said, "Have you ever thought of doing, you know, Princess Leia on this? I know Carrie Fisher. I mean, I'm sure she'd be into it." And at that point I thought, A, that would be incredibly cool; B, I'm not so sure she would be into it. But we'll see. Maybe someday we can at least demonstrate it for her. That would be cool enough. Okay. Thank you very much.