>> KEN: I would like to say that Paul has moved on after Berkeley to the University of Southern California's Institute for Creative Technologies and is the associate director of graphics research over there, involved with photorealistic rendering of environments and, more recently, virtual actors. He was the Computer Animation Festival chair for SIGGRAPH 2007 and is now here at Google to tell us about his latest work. So, without further ado, Paul Debevec.
>> DEBEVEC: Thank you very much, Ken. All right. Hey, thanks, folks. So, it's very exciting to be here, and another reason that I'm up here in the Bay Area is we've been doing a couple of screenings of the Computer Animation Festival, the Electronic Theater for SIGGRAPH. This is the show that originally premiered in San Diego, and there's going to be a showing tonight, if you are a fan of computer graphics or know any friends who might be, at San Francisco State University. If you just go to the San Francisco SIGGRAPH website, sfsig.org, I think the show starts at 7:30, and it's going to have Sony SXRD 4K projection off of our HDCAM SR master tape. And it's going to look absolutely gorgeous. So, if you want to see a very faithful rendition of the SIGGRAPH Computer Animation Festival, the best computer animations of the last year, that's a great opportunity to check it out.
So, what I brought to talk about is some of the computer graphics work that we've been doing in my group at the Institute for Creative Technologies. It's kind of a sampling of a couple of recent projects, and to motivate sort of how we got here, I have a little bit of more historical material. The take on computer graphics that my group has mostly been involved in is graphics that tries to make a lot of use of images. And when I was doing my PhD at Berkeley, I got interested in trying to do pretty realistic renditions of things like the Berkeley campus, and it seemed like photographs were a good way to do that. I was very lucky to be helped out by a professor there, Chris Benton, who did kite aerial photography. And eventually, when we got our kite off the ground, we were able to take some aerial photographs of the Berkeley campus, and in particular the Berkeley Campanile, which I thought would be a good focal point for some image-based modeling and rendering. And I also got to climb up in the lantern there and take some photographs of the campus all the way around as well. And these photographs you can see here; we also found a couple of photographs from an aerial mapping survey, so we had some parallax away from the top of the tower. And then using a system that I developed with C.J. Taylor and my adviser, Jitendra Malik, we were able to build interactively, with not too much effort, a three-dimensional model of the campus from these 20 photographs here. It's not a terribly detailed model. It's kind of a lot of, you know, boxy buildings, maybe with a roof on them; the Campanile is by far our best model. But the idea, of course, is that we're not going to look at them this way, we're going to look at them with the textures applied to them and projected on. So if we take that geometry-based model, put the texture maps on there, maybe do a little view-dependent texture mapping depending on where you're looking at it from, all of a sudden it became a really exciting thing for us to take a look at, because it looked really real. We had no idea how we could have made this model look as realistic any other way. And since I was careful enough to take all the photos in the same lighting conditions, we sort of are reproducing the appearance of the reflectance properties of all of the surfaces relatively realistically. And the light transport, how the light bounced between all of the surfaces, is replicated there. And I think somebody over here on the left was pointing out maybe a hole-filling algorithm artifact that you can see. So anyway, there's plenty of little artifacts all around, but it was real enough that, at least back then, we could believe we were really flying around the campus. And I had the chance to make a short film. This is a clip from it that took some real video of campus as well. This is me standing on top of the Campanile, and we do a little match move shot back to kind of a fly-around here in order to create a virtual fly-around of the campus. So when I have a chance to play around with Google Earth, I really love seeing that a lot of these same kinds of effects are there, and they're right there in software that everybody has. And the idea of that continuing to the point where everything in Google Earth, you know, looks every bit as photoreal as this does right here, I think, is a really amazing vision for the future, and obviously, Google is in the front seat for doing that.
So, this was an exciting project that had some applications in the movie industry as well, for kind of doing virtual backgrounds and reconstructing various kinds of sets for virtual cinematography. What I got interested in after that was trying to reconstruct other kinds of environments. Another place that I tried to do a reconstruction of was the interior of St. Peter's Basilica, and the idea was to do a little dynamic simulation of some objects that would be there in the Basilica. And I got interested in trying to record real-world illumination conditions so that I could illuminate new objects to insert into these image-based modeled scenes. So this is a scene from a film we did for the SIGGRAPH '99 Electronic Theater, where the interior of St. Peter's was modeled from basically two panoramas, one in the nave, one near the altar here; some rough geometry built with the Façade system to project onto; and then high dynamic range imagery to record the range of light that was in there and illuminate these new computer-generated objects with the light that was actually there. And the way that that technology really works is that you would basically try to use one of these several available techniques for taking an omnidirectional photograph. And one of the ones that was particularly successful in the early days is something that was originally used for purposes of environment mapping, which is to take a photograph of a mirrored ball, which it turns out gives you a view of the entire scene all around. If you shoot that as a series of exposures, from underexposed images that can see into the bright areas of the scene, like the sky and any direct light sources that you would have, to longer-exposure photographs that can see the indirect light coming from the ground and the trees, and any shadow detail if you happen to want that, you've really scientifically recorded the full range of illumination. And we have some algorithms that will put that together into an image that has pixel values that go not 0 to 255 but 0 to whatever they need to go to: a hundred thousand, a million. And if you then hook these into a good computer graphics rendering algorithm, such as, at the time, Greg Ward's Radiance system, we could take these image-based lighting environments, wrap them around the CG scene, and then illuminate those objects with that illumination and get relatively realistic renderings of things that otherwise might just look like very computer-generated objects.
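In code, that exposure-merging step might look something like the following minimal sketch, assuming already-linearized images with known shutter times; the function and variable names here are illustrative, not from the actual system:

```python
import numpy as np

def merge_hdr(images, exposure_times):
    """Merge a bracketed exposure series into one HDR radiance map.

    images: list of float arrays in [0, 1], already linearized
    (i.e., the camera response curve has been inverted).
    exposure_times: shutter time in seconds for each image.
    """
    hdr = np.zeros_like(images[0])
    weight_sum = np.zeros_like(images[0])
    for img, t in zip(images, exposure_times):
        # Trust mid-range pixels; down-weight clipped shadows and highlights.
        w = 1.0 - np.abs(2.0 * img - 1.0)
        hdr += w * (img / t)  # each exposure estimates radiance as value / time
        weight_sum += w
    return hdr / np.maximum(weight_sum, 1e-6)
```

The merged result is exactly the kind of image whose pixel values run from 0 up to a hundred thousand or a million rather than stopping at 255.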
So, this is just a very simple scene that I modeled in 3ds Max, illuminated by a couple of different kinds of illumination environments that we took. So, this is, I think, Funston Beach, just a little bit north of here; this is the eucalyptus grove at Berkeley. This is Grace Cathedral in San Francisco, which is one of the prettiest illumination environments we ever had a chance to shoot. And this is the Galleria degli Uffizi, which is from one of the middle scenes in Fiat Lux. So, these lighting environments are posted on our website, and rather frequently, actually, people use them to do some of their own renderings, and I get to see things like this: you know, a very realistically rendered amplifier that a fellow in France did, illuminated by the light of my kitchen when I was living in Berkeley, because I posted it on there. And I've seen quite a number of objects strangely rendered into my kitchen, which is a cool thing. Even more fun is, occasionally you see someone has gone and shot their own light probes for their own crazy little idea, and this is a series of light probes that surfaced on the internet a while ago, shot by a young fellow, 17 years old at the time, named Nick Bertke, who went around his parents' house with an inexpensive Canon PowerShot G2 camera and shot a high dynamic range image series of this mirrored sphere. This is showing that, you know, this is bright enough that you can see the indirect light coming from the walls and the floor. The light from the window is totally blown out, so you wouldn't be able to accurately light objects using that as your record of illumination. But he shot it with these bracketed exposures, and as you get to shorter and shorter exposures, you can actually finally see, correctly within the range of the sensor, what the illumination was: you know, the blue light from the sky and some of the other less blue light bouncing off of the concrete outside. And what he wanted to do with this, he had a little idea. He's a fan of the game Half-Life 2, and he knew how to hack the game so you can output the character models and then load them up into your own modeling software, and he had the idea that he'd make some of these Half-Life 2 characters basically come and visit him at his parents' house. And he had quite a bit of luck with them. So, here these characters are added in to some background plate photography of the scene, and they're lit by the light that was actually there at the time. So with relatively little effort, he could make it look like these guys were kind of hanging out and spending the day with him there. He's getting kind of comfortable on the couch here. And then maybe they stayed a while and watched some TV later, and here he is.
And the exciting thing about this is that, you know, you can see kind of a little bit of the light from the TV on this coffee cup, which I think was actually there, and that's completely consistent with the light that you see on the character. And also all the indirect illumination and the soft shadowing that you get from the characters is basically consistent. So what he was able to do is take this crazy idea he had in his head and communicate it visually in a way that the first thing you see in the image is not whether it looks real or not but the idea that he was trying to communicate. And that's really, you know, the most exciting thing I see when some of these technologies get used for creative purposes. And this technique has also been used quite a bit for feature film production. The Academy Award-winning visual effects in the film The Golden Compass, done by Mike Fink and his team this year, used extensive image-based lighting techniques to render folks like the digital animals and the polar bears and such, and with their great artistry, they got some amazing results with that as well. So one of the things that you need to create really compelling computer graphics is to be able to render people, and there's been a lot of exciting work over the years, recently, in the area of digital people, and some of our work has actually played into some of that as well. The first project we got interested in, in rendering people, was actually the last project that I did at UC Berkeley in 2000, before I went down to the Institute for Creative Technologies, and we built this device called Light Stage 1, which had the goal of basically taking a data set of how a person's face looks lit from light coming from every direction that light can come from. And with this set of plastic pipes and wood that we got at Home Depot over the course of about ten visits, as we figured out how to build this thing, we could rotate this light in a spiral. It took about a minute to go from top to bottom, but we would get a data set that would show the person's face lit from all of these different illumination directions. We just recorded it live to a MiniDV camera and then pulled the footage off of that. We see the face lit from the front, the sides, above and below, and even from behind. As it turns out, lighting objects from close to behind is also really important, because you get these kind of rim lighting effects that cinematographers like to exploit when they're lighting their real characters. And the idea was, let's take this face and try to light it with one of our light probe images that we got, like the Grace Cathedral light probe.
So what would it take to illuminate this face with this illumination environment? Well, as it turns out, there's actually a very straightforward way to do it, which is that if you take this image here, it's an omnidirectional image, you can resample it to a different coordinate mapping. This here is a latitude-longitude mapping, and essentially, if you multiply this data set by this data set, they're in the same coordinate space, they're sampled the same way now. You actually end up lighting the face with that lighting environment one piece of the environment at a time. So these images of the face here are now bright and yellow, because there was bright yellow light coming from the corresponding directions in the environment. These images here are bright and kind of a cool color, because there's bright, cool illumination coming from the stained glass windows above. And all of these other images of the face here are sort of dim browns and yellows and purples, because those are the different colors of the indirect illumination bouncing up from the floor and coming in from the walls onto this fellow here. So now that we've illuminated his face by that environment one little piece at a time, we can just take advantage of the linearity of light and the superposition principle and simulate him lit by the entire environment at the same time, just by adding all of those images together. And the result is that you get an image of his face lit by that lighting environment without him ever having to go over to Grace Cathedral and actually get lit by the light.
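That weighted sum is very little code; here's a minimal sketch, assuming you already have one image per light direction and the environment map resampled at those same directions (the names are mine, not from the talk):

```python
import numpy as np

def relight(basis_images, env_weights):
    """Image-based relighting by superposition.

    basis_images: array of shape (num_lights, H, W, 3), the face lit
    by each light stage direction in turn (linear radiance values).
    env_weights: array of shape (num_lights, 3), the RGB intensity of
    the environment map resampled at those same directions.
    """
    # Scale each basis image by its environment color, then sum them:
    # light is linear, so the sum is the face under the whole environment.
    return np.einsum('lc,lhwc->hwc', env_weights, basis_images)
```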
And it gets nice things like, you know, the yellow kind of rim lighting effect here from the light bouncing off the altar. You can see the stained glass windows reflecting in his forehead and in his hair. You get all the right effects of how light hits skin and is able to have, you know, a specular component, a diffuse component, subsurface scattering, self-shadowing, interreflections, and that's just because it's all there in the original data that you capture. So let's see here. So we're going to go through a couple of slides here that we're not going to be able to put on the webcast, but I wanted to talk about a chance that we had to work with Sony Pictures Imageworks to apply this kind of technique to some of their digital stunt double characters for a couple of films that they did. Basically, through a collaborator on our first project named Mark Sagar, who went to work at Sony Imageworks sometime after this, and also visual effects supervisor Scott Stokdyk, who had seen the SIGGRAPH paper, they thought this could be a good way to get realistic skin reflectance for some digital actors that they wanted to shoot, first of all for the movie Spider-Man 2. So, starting in about late 2002 or early 2003, we did some tests, and then they brought over a couple of the actors from the film. This is Alfred Molina, who played Doc Ock in the movie. And we captured the data set on film with him and one of our new light stages. This is Light Stage 2, and it had kind of a semicircle of strobe lights that go around. So this is actually a long exposure photograph that makes it look like there's a whole sphere of light around him, but it's actually just a semicircle of lights. We captured that kind of data, and then what we did for this is we had a rough kind of Cyberware-scanned 3D geometry of the face, and we projected these images from the sides and from the front as basically relightable texture maps that you could then put on the 3D geometry and get a relatively realistic rendering of the face lit by any kind of illumination environment that you want. We had to be a little bit careful to separate the specular reflection of the face from the subsurface scattering component of the face, because when you change your viewpoint around, the specular reflection actually needs to shift around according to the surface normals and your viewpoint. So, we actually do a color-based separation of those and then re-synthesize the specular component according to the new viewpoints, so it would shift around on the digital character. It wouldn't just seem plastered onto the face. But it was relatively successful for them. They were able to use it in about 40 shots in the film for a totally digital Doc Ock character, for all of the skin.
They augmented it with traditional computer graphics, with some nice cloth simulations for the rest of the body. They added, you know, digital sunglasses, so they had to figure out how to make the light interact with a computer-generated object that's close to one of these image-based objects. And they had a full-screen digital close-up for his death scene, when he's floating back in the water, which they thought would be a little dangerous to film for real. And we also shot Tobey Maguire for a couple of scenes where he has to stop a train and has his mask off there. More recently, we got to work with this fellow here, who played the new Superman in the Superman movie that came out in 2006. And with this, they scanned the film at higher resolution, they had the pipeline more refined, and they were able to get some even better results. So when he needs to throw the space shuttle back into space, that's a digital Superman looking at the scene there. There's a scene toward the end of the movie where things have worked out reasonably well and he's pretty satisfied, so he has kind of a nice satisfied expression flying around Metropolis. In this case, they actually started with a neutral scan of his face and then animated it using their animation system to put a little bit of a pleasant expression on there. That works pretty well. If he needed to really act and talk and go into extreme expressions, that's not going to work as well, because it's too far away from a neutral pose to look quite as realistic. So some of the work we're doing now is actually looking at trying to capture this data live, people in different expressions and positions. There's also one shot where they came up really, really close to him. He's out in space, he's thinking really hard, we have to dramatize this, and so the camera flies in really close; he's probably thinking right over in this area here. And because it's an image-based data set, they can, you know, design the lighting to have a little bit of a rim light here and a little bit of warm light coming from below. I'm not sure where that is from in outer space, but it's somewhere at this point. And you can see they really can get a lot of skin detail and a lot of nice skin reflectance effects that would otherwise be kind of difficult.
So, we've had a chance to continue working on these on the research side, and one of the devices here, which we can totally put on our web presentation, is Light Stage 6, which was an idea to try to extend the light stage idea from just capturing faces to capturing the whole human body. And one of the reasons this actually happened is that my institute had some extra space in a satellite facility that they had to find something to do with, and they said, "Hey Paul, you were always kind of talking about building that big light stage. Well, it turns out this could actually be a useful thing for us, because we've got a good valid project to do with the rest of the space." So at that point, we had to figure out how to turn the various talk that we had into a realizable plan, and pretty soon we had a [INDISTINCT] model that looked like this, which was a somewhat daunting thing to put together for just one research group. But we were lucky that a fellow named Sebastian Sylwan, who's now at Autodesk and had been head of a virtual stage facility in Italy, wanted to work with us. He knew how to get bigger projects together and make bigger things happen. So, I worked with him very closely to actually get the design here. He found all the places that could source all the parts, and within about two months and a little bit of sore shoulders, we were able to put together Light Stage 6. And the idea of Light Stage 6 being an entire sphere of lights is that we wanted to very rapidly be able to capture these data sets. We wanted to very quickly go from being able to light somebody from this direction of light to this direction of light and capture this kind of image-based relighting data set in real time.
Now, if we have some luck here, this video will play and we'll see here. Now, I'm going to pop out of the program and play it from here. And what we have here is one initial project where we actually were capturing what amounts to a seven-dimensional data set of people going through natural motions. So, this here is Bruce Lamond, one of the researchers in our laboratory. We have him on a treadmill here. He normally is a relatively serious fellow, and it looks like he's concentrating a little harder than usual because he's trying not to fall off of this treadmill. He's actually paying attention to some grooves that we cut underneath the treadmill belt, so we can tell where he is left and right and forward and back. But the idea is that if we spun him around for about 45 seconds and shot him with high speed cameras under time-multiplexed illumination, we would get a data set of him under all lighting conditions. So, if we see him in slow motion, we're going to slow this video down, you can see what's really happening is we're very rapidly going from one lighting direction to another. Sometimes it's dark, then another, then another, and we're interleaving all of these data sets at the frame rate of the camera. In this experiment we're actually capturing 33 different lighting conditions every 30th of a second, and over a 30th of a second, these are all the lighting conditions here of Bruce.
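Conceptually, pulling that capture apart is just de-interleaving the high-speed frames by light index; a minimal sketch under that assumption (the helper name is mine):

```python
import numpy as np

def deinterleave(frames, num_lights=33):
    """Split a time-multiplexed high-speed sequence into per-light videos.

    frames: array of shape (num_frames, H, W, 3), captured while the
    lighting cycles through num_lights conditions, one per frame.
    Returns shape (num_lights, num_frames // num_lights, H, W, 3):
    one lower-rate video per lighting direction.
    """
    usable = (len(frames) // num_lights) * num_lights
    f = frames[:usable]
    # Frame k was shot under lighting condition k % num_lights.
    return np.stack([f[i::num_lights] for i in range(num_lights)])
```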
And from that kind of data, we can use that image-based relighting process under any kind of illumination. And the reason that we have him spinning around on the treadmill, if we skip forward, you can see, is we actually got him from all different angles as well. It's so that we could also have virtual control of his viewpoint. So, we had our high speed camera at about chest level, and then from a little bit above, we had a borrowed high speed camera, courtesy of Vision Research. And then from floor level, we had another borrowed high speed camera. So, we really had three cameras, and the idea was we'll just have him repeat his motion 36 times as we rotate him 10 degrees over the course of each one of those. And we effectively get this three-by-thirty-six light field of Bruce, also for every frame of his animation and for every one of these illumination conditions. Now, we wanted to eventually take Bruce out of the light stage and then make it look like he's running across some place that he's never been to, with complete control of viewpoint and illumination. And part of what we would need for that is to get an alpha channel, or a matte, for him. So, for one of our lighting conditions, we turned off all the lights on Bruce and we just turned on lights on this piece of gray paper that's behind him, and that gives us the silhouette, and that's exactly the right image that you need if you want to composite him out of that environment and into another environment. The problem was, in this case here, that didn't give us a good matte for his feet, and we couldn't really think of a way to take this, you know, treadmill that we'd found at the local Sport Chalet and get it to, you know, glow brightly for a 1000th of a second, 30 times per second. So, what we did for that is we actually covered the entire turntable and the belt of the treadmill with retroreflective cloth and then put some ring lights around the camera that were also time-multiplexed in. So, when we're shooting the matte frames, that actually glows back toward the camera as well, and then we get a good matte for the entire body at that point.
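With a matte in hand, putting the relit subject over a new background is the standard "over" composite; here's a minimal sketch, assuming the relit foreground is black outside the subject (premultiplied-style) and the alpha comes from those backlit frames (names are mine):

```python
import numpy as np

def composite_over(foreground, alpha, background):
    """Alpha-composite a relit subject over a new background plate.

    foreground: (H, W, 3) relit image of the subject (black outside him).
    alpha: (H, W) matte in [0, 1] from the backlit silhouette frames.
    background: (H, W, 3) plate of the new environment.
    """
    a = alpha[..., None]
    # Standard over operator: keep the subject where alpha is high,
    # let the plate show through where it is low.
    return foreground + (1.0 - a) * background
```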
So, going back to the relighting idea, here's Bruce walking forward in the stage, but we can relight him to show him lit from, you know, any direction that we want, or we can play that image-based relighting trick and show him under the light of Grace Cathedral or the Uffizi Gallery. And if we want to change the viewpoint, then essentially what we're going to do is morph between the different viewpoints that we have. We run optical flow between adjacent viewpoints, and then we actually combine the idea of view interpolation, which is one of the inventions that Lance Williams made at Apple back in 1993, a very important paper, with another very important idea, which is the light field concept, developed at Stanford in 1996 and also by some researchers at Microsoft Research contemporaneously. And we basically combine view interpolation with light fields such that the light field quadrilinear interpolation coefficients are also used to scale the displacement vectors that we get as we morph from one view to another. Then we can actually put both of these things together and, from a relatively sparsely sampled light field, smoothly generate views that are farther away than we originally captured, closer than we originally captured, and in any direction all the way around.
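The key trick is that the same interpolation weight both blends the pixel colors and scales the flow vectors; a minimal two-view sketch of that idea, simplified to a single pair of neighboring views and one channel (the names, and the usual approximation of evaluating flow at the destination pixel, are mine):

```python
import numpy as np
from scipy.ndimage import map_coordinates

def view_morph(img_a, img_b, flow_ab, t):
    """Flow-based morph between two adjacent light field views.

    img_a, img_b: (H, W) neighboring captured views.
    flow_ab: (2, H, W) optical flow from view A to view B (dy, dx).
    t: interpolation weight in [0, 1]; in the full system this is one
    of the quadrilinear light field coefficients.
    """
    ys, xs = np.mgrid[0:img_a.shape[0], 0:img_a.shape[1]].astype(float)
    # Backward-warp A partway toward the intermediate view...
    warped_a = map_coordinates(img_a, [ys - t * flow_ab[0], xs - t * flow_ab[1]], order=1)
    # ...and B the rest of the way in the opposite direction.
    warped_b = map_coordinates(img_b, [ys + (1 - t) * flow_ab[0], xs + (1 - t) * flow_ab[1]], order=1)
    # The same weight blends the colors, keeping warp and blend consistent.
    return (1 - t) * warped_a + t * warped_b
```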
So, this is actually real-time rendering on an NVIDIA card, a demo done by Charles-Félix Chabert in our group, kind of pushing in to the scene and then doing a slow rotation around it. So, finally, on to our goal: here is a location that we thought it would be cool to watch Bruce running through. I shot it as a high dynamic range, omnidirectional image, this time actually using a Canon still camera and a fisheye lens aimed in a couple of different directions, put that back together into this high dynamic range lighting environment, and then we're going to drop Bruce into the scene and see the results. This is one of the first results we got with the technique, and here he is. So, what we did is I animated kind of a camera pan across the scene. We're matching the viewpoint on Bruce as the camera pans across. We've also illuminated him with the light from that environment. So hopefully it looks like, you know, the color balance and the light directions are about correct. We have simultaneous questions from Ken and Lance here. Let's see if it's the same question. What do we have?
>> Shadows?
>> DEBEVEC: Shadows, very good. And Lance?
>> Motion blur.
>> DEBEVEC: Motion blur, okay. Different questions, both very good. Let's see here. On shadows, I'll show you in a second what we did with that. For motion blur, we did not add motion blur to the scene. In some of our earlier facial time-multiplexed illumination work, we actually did use the optical flow vectors to re-synthesize the appropriate 180-degree-shutter motion blur. We just didn't do that for this project, because we were using an NVIDIA card to do the rendering and trying to keep it real-time. But we did get some very nice results in some of the facial work that we did. For the shadows, if we go a little bit further, he is actually casting a soft shadow. He also has a friend over here, just to prove that this is all virtual. And the shadows here, they're not terribly high resolution shadows, but what we did is we actually used the silhouettes that we got from all around to carve out a basic volume of him, which, by the way, is the first use of any notion of his geometry that we have in any of these renderings. We get a basic kind of voxel model of him going along. And then we use that to cast rays from all of our basis lighting directions to figure out a shadow map that you get from each basis illumination condition, and then we essentially do an image-based relighting combination of those shadow maps to figure out how much light would be blocked in one direction versus the other. So you actually get kind of a warm-colored shadow when you're blocking the skylight, and you get a cool-colored shadow when you're blocking the indirect light from the warm-colored building that's behind us.
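A minimal sketch of that silhouette-carving step, assuming calibrated cameras with a `project` function that maps world points into each view (all names here are mine, not from the talk):

```python
import numpy as np

def carve_visual_hull(silhouettes, projectors, grid_points):
    """Carve a voxel occupancy volume from multi-view silhouettes.

    silhouettes: list of (H, W) boolean mattes, one per camera.
    projectors: list of functions mapping (N, 3) world points to (N, 2)
    integer pixel coordinates (x, y) in the matching view.
    grid_points: (N, 3) world-space voxel centers to test.
    """
    occupied = np.ones(len(grid_points), dtype=bool)
    for sil, project in zip(silhouettes, projectors):
        px = project(grid_points)
        inside = ((px[:, 0] >= 0) & (px[:, 0] < sil.shape[1])
                  & (px[:, 1] >= 0) & (px[:, 1] < sil.shape[0]))
        hit = np.zeros(len(grid_points), dtype=bool)
        hit[inside] = sil[px[inside, 1], px[inside, 0]]
        # A voxel survives only if every camera sees it inside the silhouette.
        occupied &= hit
    return occupied
```

Rays cast from each basis light direction against the surviving voxels then give one shadow map per lighting condition, which are combined with the same environment weights as the relighting itself.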
And thinking that you can't have too much of a good thing, we put a couple of Bruces together here. They're not actually interreflecting light off each other or self-shadowing each other; that's kind of future work at this point. But this was enough to at least amuse Bruce quite a bit. And he actually did all the work on the compositing and the matte extraction there. This is a reversed time lapse of building our light stage over the course of four days. There we go. All right. So, let's go back to some slides here.
And what I wanted to talk about is a more recent project that we've done that's face related, which was inspired a little bit by a completely different kind of facial rendering pipeline that has also shown a lot of promise, which is the idea that if we're going to try to do a digital model of an actor, maybe the first thing you should do is take a life cast of the actor's face in plaster and then get that scanned, since now it doesn't move and it's diffuse; it's a good surface to digitize. You can do that at very, very high resolution. This is something that's commonly done, and there's a company, XYZ RGB, that does this really, really amazingly well. There's a digital face [INDISTINCT] project that Lance Williams was involved in, a test at Disney, I think in like 2000 or so, when he was working on this. That was actually the first time that this really high resolution face casting process was applied to creating a digital actor, and they got some amazing results with that. The standard problem here is that it's pretty much not good for getting, like, you know, a live performance of an actor, since it requires taking the cast and such. There's a bit of inconvenience involved. Some people say it kind of changes the shape of the face a little bit, and the other problem is you don't easily get aligned texture maps for the face. So you might ask, why not just, you know, scan the face itself at really high resolution, and you could use a really fast laser scanner for that. And one of the problems associated with that is the fact that skin is not as nice a surface to scan as gray plaster. The problem is, of course, subsurface scattering, and so if you have a little laser line on a piece of paper, it might give you a nice sharp line. But once it actually hits skin, it's going to diffuse out and get blurry. So if you try to measure the geometry of the fine-scale skin wrinkling and such based off of that blurry line, you're going to have some trouble. Now as it turns out, there actually is some light that reflects off the skin that does not get affected by subsurface scattering, and that is the specular reflection of the skin. And this is an image that we found. This is actress Hilary Swank.
It's not an attractively lit image, because it's flash right from the front, but it demonstrates this point: where you can see the specular reflection of the light, that's where you can actually see the skin detail, the shape of the pores and the fine wrinkles. And presumably, on her forehead next to where the specular reflection is, she has a similar kind of skin texture right there, but you don't see it at all. That's because the subsurface scattering blurs it all out. It's in the specular reflection that you see this. And in fact, it's because you see it in specular reflections that it's actually important for rendering digital characters. If we didn't have any specular reflection, you'd never see this effect, and you could probably get away with not modeling it. But if you want to get that realistic skin look in the specularities, you need to get that kind of geometry. So, our idea was maybe there's a way that we can photograph just the specular reflection of somebody's face and then figure out the detailed shape of the face just from that.
And we thought back to some of the work we did for our first Light Stage paper, where we had done a little experiment using cross-polarization to remove specular reflections from someone's face. This is Holly Ken, one of our undergrad students working with us at the time. And she's lit by a single light and photographed by a camera right in front of her. You can see that we've got both the specular reflection and the subsurface reflection of the face here. As it turns out, if you put linear polarizers on both the light source and the camera, and they're at perpendicular orientations, the specular reflection maintains the polarization of the light, and so it can't make it through that second polarizer, and it doesn't show up in the photograph. So, this is a cross-polarized image of Holly without any specular reflection. The subsurface light, since it actually gets underneath the surface and scatters around a couple of times, gets depolarized, and about half of it will make it through that second polarizer. So, the result is that we can actually observe only the subsurface scattering on its own. And if you radiometrically calibrate your cameras, which is something that fortunately we knew how to do at the time, and you take the difference between the diffuse-only, or subsurface-only, image and the image that includes the specular, you can get an image of only the specular reflection just on its own from just two photographs.
So, the thought was, well, let's try to take some photos of just the specular light on the face and try to figure out what the shape of the face is from that. The problem with doing it with just a single light is that you only see specular reflections from certain areas of the face; in this case, here, you know, on one cheek but not the other cheek. And what we really want is to see the specular reflection coming from the entire face at the same time. So, as it turns out, we have devices in our lab that can illuminate a face from all the directions that light can come from at the same time. This is our Light Stage 5 device, which was just for faces but otherwise similar in a lot of ways to Light Stage 6. And we asked ourselves, could we cross-polarize out the entire sphere of illumination at the same time? And as it turns out, first empirically, and then actually figuring this out, there's a specific pattern of linear polarizers you can put on every single light of the stage such that the specular reflection from every possible surface normal will end up with the same orientation of polarization by the time it gets to a camera in the front. And it looks like this, basically: there's kind of a bit of a whirl around the Brewster angle here, and otherwise they're vertical here, horizontal here. It takes us about an hour or so to get all of these oriented correctly. But the result is that we can actually light somebody from every direction of light at the same time and observe them without any specular reflection whatsoever. So, here, I think the video projectors are brightening this up a little bit extra, but this is an image of Tom, a producer in our group, lit from the entire sphere of light with no specular reflection whatsoever. So, if you are, for example, making a digital character model, this could be a very useful image to start with as your diffuse texture map, because it has very little in the way of, you know, specularity or variable shading, which is usually a challenge when creating characters. Now, this is with the polarizers crossed; if we rotate the camera's polarizer the other way so that we have parallel polarizers, then we can actually bring the specular reflection back in. And here, it's definitely too bright.
You can see the specular light comes in. But if we take the difference, hopefully this will show up pretty well, we can get an image of just the specular reflection of the face from the entire sphere of illumination at the same time. This looks like a black and white image; it's actually a shot in color. And since the specular light hasn't had a chance to interact with, you know, your melanin or your hemoglobin, it doesn't pick up any skin color. So, this does correctly look like it has basically no chromaticity to it. And as you can see, we're actually picking up a lot of the detail of the skin shape and shading from this specular-only channel. If you're again creating a digital character and you need your specular intensity map, this could be a very good image to use as a start for that as well. So, the last part of the project was the idea of, let's try to do a variant of photometric stereo. This is a computer vision technique where, if you light an object from different directions, by analyzing the diffuse reflection you can figure out what the surface normal is, because there's going to be only one surface normal that would explain the different colors that those different light directions would produce at the camera. And what we came up with is, in the specular case, you actually have to use full spherical illumination patterns, because the specular lobe is narrow enough that you might miss it entirely if you use point sources. But we came up with a technique that uses four spherical gradient patterns: a full sphere, a gradient of light from top to bottom, a gradient of light from front to back, and a gradient of light from left to right. Essentially, we just shoot these four images, and of course, we really also have to shoot these images here of the corresponding patterns in the diffuse channel in order to compute these. But from just these images and some very simple math (essentially, you take this image, divide it by this image, and scale it so it's between minus one and one), it reads out the reflection vector at every one of the pixels on the face based on just the specular reflection. And then, if you just tilt that halfway back toward the camera, you have an estimate of the surface normal.
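A minimal sketch of that gradient-ratio math, assuming you already have the four specular-only spherical illumination images separated out (function and variable names are mine):

```python
import numpy as np

def gradient_normals(full, grad_x, grad_y, grad_z):
    """Estimate per-pixel normals from spherical gradient images.

    full: (H, W) image under constant full-sphere illumination.
    grad_x, grad_y, grad_z: (H, W) images under linear gradients of
    light along each axis. All are specular-only and linear.
    """
    eps = 1e-6
    # Ratio of each gradient image to the full-on image, rescaled from
    # [0, 1] to [-1, 1], reads out the per-pixel reflection vector.
    r = np.stack([2.0 * g / np.maximum(full, eps) - 1.0
                  for g in (grad_x, grad_y, grad_z)], axis=-1)
    r /= np.maximum(np.linalg.norm(r, axis=-1, keepdims=True), eps)
    # The normal is the half vector between the reflection direction
    # and the view direction (camera along +z here).
    n = r + np.array([0.0, 0.0, 1.0])
    return n / np.maximum(np.linalg.norm(n, axis=-1, keepdims=True), eps)
```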
You can also do this, of course, with just the diffuse channel, with these patterns here, and figure out where the diffuse light is coming from. If you do it with just the diffuse channel, you can get a normal map that will shade an image that looks something like this. You see a little bit of detail where you've got, like, you know, whiskers and such that darken the image, but you don't see nearly the kind of detail that you see from the specular map. So, this is actually shaded with a normal that's obtained from just the specular component, and you can see it actually picks up all of these fine wrinkle details and all the skin pore detail. And the final thing that we needed to do was to figure out a way to actually apply this kind of map (let's skip over a couple of slides here) to some geometry. And as it turns out, there are some techniques out there, which we'd experimented with in our group in about 2001, where you start with a low resolution face scan, the kind that you do get from the laser scanner, which doesn't have skin pore detail on it. And since you know what the surface normal map should be for that geometry, you can essentially emboss that surface normal map onto the geometric model and put that kind of skin pore and fine wrinkle detail onto your 3D geometry. The nice thing about it is that we not only can get this high-res geometry, but we have perfectly aligned texture maps for the diffuse component, the specular intensity component, and such, and we can map those onto the face as well.
Another thing that we realized, and again, I apologize, it's a little blown out on the video projector here, but this is actually a real-time rendering, just using our diffuse map and our specular maps. Since we get surface normal maps for both the specular component and the diffuse component, we can actually render those two components of the face with their corresponding surface normal maps. And the diffuse component's normal map is actually going to have less surface normal variation than the specular component, because of the fact that the light is scattered, which effectively blurs where the light is reflecting from, and it has less to do with the surface shape of the skin than with what's going on underneath the skin. So, as it turns out, if you render with these hybrid normal maps, with the smoother normal map for the diffuse and the sharper normal map for the specular, it actually gives you a first-order approximation to the correct subsurface scattering behavior of what the skin is doing, in, you know, just a local shading model. It won't get you light bleeding into the shadowed regions or the ears glowing when they're lit from behind. But for, you know, the convex areas of the face, it gives you a very close approximation to how it will look with the full subsurface scattering under full-on illumination conditions.
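A minimal sketch of shading with hybrid normal maps, using a simple Lambertian-plus-Blinn-Phong local model as a stand-in (the talk doesn't specify the exact shading model, and all the names here are mine):

```python
import numpy as np

def shade_hybrid(diffuse_albedo, spec_intensity, n_diffuse, n_specular,
                 light_dir, view_dir, shininess=50.0):
    """Shade skin with separate normal maps per reflectance component.

    diffuse_albedo: (H, W, 3) diffuse texture map.
    spec_intensity: (H, W) specular intensity map.
    n_diffuse: (H, W, 3) smoother normals measured from the diffuse channel.
    n_specular: (H, W, 3) sharper normals measured from the specular channel.
    light_dir, view_dir: unit 3-vectors.
    """
    l = np.asarray(light_dir, float)
    v = np.asarray(view_dir, float)
    h = (l + v) / np.linalg.norm(l + v)  # Blinn-Phong half vector
    # Diffuse term uses the blurred normals: a cheap proxy for subsurface scattering.
    n_dot_l = np.clip(np.einsum('hwc,c->hw', n_diffuse, l), 0.0, None)
    diffuse = diffuse_albedo * n_dot_l[..., None]
    # Specular term uses the sharp normals, which carry the pore-level detail.
    n_dot_h = np.clip(np.einsum('hwc,c->hw', n_specular, h), 0.0, None)
    specular = spec_intensity * (n_dot_h ** shininess)
    return diffuse + specular[..., None]
```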
You can also take the model and show it with a subsurface scattering rendering as well, rendering just from the specular map and then using a subsurface scattering method, such as the Jensen and Buhler 2002 technique, to get a very nice rendering as well. For one of our data sets, we got interested in trying to get the data of a person's hand. So, this is Hideshi Yamada's hand that he put up in our light stage. You can see we got the details of sort of the, you know, pretty fine skin wrinkles and such. If we render it with the hybrid normal maps, we get a pretty nice rendering there with the specular component. In this case, the subsurface scattering rendering was particularly compelling, because it gets that skin color bleed into the shadows here and into the shadow here, and that really helps sell it quite a bit as well. So, we've gotten excited about the fact that this photometric-only technique, as it turns out, requires eight photographs for the spherical illumination conditions and then just five more photographs to do a structured light scan at the end. And with that small number of photographs we're getting very high resolution geometry and registered, calibrated texture maps for diffuse and specular and these normal maps, so it seems like a good way to capture faces. We've now started to capture faces in different expressions, which we think we can use to drive digital actor models. And we're also looking at taking a variant of this and running it in real time; shooting it with even not all that high speed photography, we think we can shoot this kind of data set, or close to it, at frame rate and capture it for actors' performances. So that's some of the direction in which the work is continuing.
I think I have five more minutes, and I have one more thing that I could talk about, which is on a little bit of a different topic, but if we can play the video, I'll try to tie it into what we've been talking about before. This is a project we did in collaboration with Mark Bolas, who's at the USC School of Cinematic Arts, and Ian McDowall from Fakespace Labs, which is near here, Hideshi Yamada from Sony, you saw his hand just a while ago, and Andrew Jones, the lead author from our research group. And it was an idea to use some high speed video projection techniques that we've been using for doing real-time structured light scanning of faces and try to adapt them into becoming a 3D display. And this is the kind of 3D display we showed at SIGGRAPH this last summer. The idea was, let's make a 3D display that doesn't necessarily have a terribly large image. In fact, it's a pretty small image; it's about five inches tall right here, and people are peering in to it. But it's one that you can see from any direction all the way around, and it does not require 3D glasses. So you can see here, this is actually a lot of our friends from the software department at Digital Domain who came for a visit. I think there are about 11 people around the display here, and they're all getting their own individual 3D view of the scene from all these different angles. And the basic way that this works is that we have a video projector on top that projects imagery down onto a spinning mirror that's at 45 degrees, and the mirror is spinning around the y-axis so that it kicks the light, the image from the video projector, out to all different directions around it relatively quickly. We spin the mirror at about 15 to 20 frames per second, so it is a little bit flickery, but that's fast enough that you can enjoy the 3D. The next versions will be 30 to 40 frames per second. And the video projector is projecting onto the mirror fast enough so that in this 15th of a second that it takes to do a rotation, we actually get an individual image for every degree and a quarter all the way around the circle. So that's 288 images at 15 times per second; you'll find from that that we actually have to project imagery onto this mirror at 4,320 frames per second. We have a question.
>> What projector have you used?
>> DEBEVEC: This actually started its life as a 2,000-lumen Optoma DLP projector, and it got seriously hot-wired in order to make it play this imagery as quickly as it does. And the technique that we use for that actually takes advantage of DVI. We decided, you know, for these first versions, let's not worry about color. We took the color filter wheel out of the projector. Let's not worry about gray levels for the moment. Let's just go for binary images; it shows some nice wireframes and stuff. And what we're doing is we're actually rendering imagery out of an NVIDIA graphics card that's encoded like this. Normally, you send 24-bit color images over your DVI. What we're doing is we're sending 24 1-bit images, each with a different view of the scene. So this is our model here. These are 24 different views, each a degree and a quarter apart around the circle, and they're all packed into one 24-bit color image. So we actually send this to the projector. We render them just by setting the bit pattern, and then the projector automatically plays each image as a 24-frame movie. We set the refresh rate of the card up to about 180 Hertz, or even higher than that; it doesn't really start to flake out until about 240 Hertz. And then we can actually get these 4,320-frame-per-second movies. And the digital micromirror devices, the TI DMD chip mirrors, have no problem with any of this. When they're showing color, they're usually going at around 9,000 flips per second or more.
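A minimal sketch of that DVI bit-packing trick, assuming 24 prerendered binary views per refresh (the NumPy names are mine):

```python
import numpy as np

def pack_views(binary_views):
    """Pack 24 one-bit views into a single 24-bit RGB frame for DVI.

    binary_views: (24, H, W) boolean array, one view per bit plane.
    Returns an (H, W, 3) uint8 image; the hacked projector unpacks the
    bit planes and flashes them as 24 sequential binary frames.
    """
    assert binary_views.shape[0] == 24
    frame = np.zeros(binary_views.shape[1:] + (3,), dtype=np.uint8)
    for i, view in enumerate(binary_views.astype(np.uint8)):
        channel, bit = divmod(i, 8)  # 8 bit planes per color channel
        frame[..., channel] |= view << bit
    return frame
```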
So I have a video that shows basically how this works here. This is the mirror before it spins up. The mirror has an anisotropic diffuser on it, so when the light hits it, it gets spread out vertically and a little bit horizontally, but this makes it so you can see it even if you're a little above or below. We don't get vertical parallax naturally with this device; it's a horizontal-parallax-only display. But it gives us a chance to get views out to everyone. And this actually does mean it's a little bit more of a complicated story figuring out how to project imagery onto the display. We actually are running a custom vertex shader to render somewhat unusual multiple-center-of-projection images out there, so that when this anisotropic diffuser kind of re-bins the light out into space, you end up seeing correct perspective. And that's, you know, sort of what section three of the paper is all about. But if we actually get this spinning up in concert with the video projector, here it goes, we get a three-dimensional scene on there. And this is just me shooting handheld, walking around. Now, one of the nice qualities of this is that since we're sending completely independent images out in all directions, we have no problem getting occlusion in this display. Some other kinds of volumetric displays have a nice three-dimensional image, but it's kind of ghostly: all the light is lit up, the space is lit up, and you can see things through other things. Here, when we're looking at the back of his head, we don't see the face anymore, because, you know, what you see from one side really doesn't have anything to do with what you're seeing from the back side. We're just rendering this out of the graphics card. And the other cool thing is that since graphics cards are so fast, and this is about a 2,000 or 3,000 polygon model, we can actually render it at 5,000 frames per second on the graphics card natively. So that can let us make this an interactive display. This is Andrew Jones with the Polhemus device, actually interactively moving the model around, because it's live off the NVIDIA card. We did one experiment where we used the tracker to actually track just the vertical position of where the camera was and then had the NVIDIA card interactively adjust the vertical viewpoint, so we can sort of simulate vertical parallax.
So this is with the tracked camera, and we can make it so you see it from above when you actually go above there. As you can see, so far this is black and white. We made one slightly desperate attempt at color, which was this tent mirror, where we actually had two faces; we kind of split the spectrum down the middle, and we had one that was sort of orange-ish and one that was kind of bluish. And the idea is that as this mirror turns around, we can do--I think--oh, we have a connection successful. Hi, there. I guess I might have called myself. And there we go, and we're right back to the video. Thank you. So we spun this thing around, and by doing two channels of light, we got at least the two-channel color that we had. Now, the right way to do this is really to use a three-chip DLP and just put, you know, red, green, and blue light down onto the thing. We're talking to Texas Instruments, and they seem kind of interested in our project, so maybe this year we'll have something like that. The other cool thing was that when we had the tent mirror, for the stuff that was whitish, it actually gave us two displays of the image for every rotation, so this became a much more stable image, going at 30 to 40 frames per second. Two more examples: one is, we got interested in trying to show not just wireframe imagery but maybe photographically acquired imagery. So going back to this light field concept, we shot a light field of this tourist souvenir that one of our folks brought back. And we dithered it using Victor Ostromoukhov's dithering algorithm and loaded up the entire NVIDIA graphics card's memory with different views of it. And we're actually doing live re-binning of this imagery here according to the vertical tracking, and then we're putting the light field back out into space. So, this is the real object sitting next to the virtual version of it. And that's sort of looking pretty cool, and we realized that the fact that we've only got black and white pixels isn't so bad, because the little bit of blur that you get from the motion blur and the diffuser kind of starts to make it look like pretty good gray levels. And we liked that so much, we thought what we really need to try to do is something that actually, you know, is animated and moves. And we wondered, is there some kind of photographic data set that we can shoot from all directions of something that's moving? And we realized that actually we have that kind of data set. So calling Bruce back into service here, we got our Light Stage 6 data set of Bruce re-binned onto the display, and then we're able to sort of [INDISTINCT], thank you. So, not Princess Leia yet, but maybe we're getting there. Very cool. All right.
Well, that's all I brought today. So, thanks to everyone here; there are some websites with all the videos. Thank you very much. And if there's time for any questions--I think they're going to do another talk here right away, but I'm happy to answer questions as long as there's time. Yes.
>> I think probably [INDISTINCT].
>> DEBEVEC: You know--okay. So one of the members of our Computer Animation Festival jury was Randal Kleiser, who I'm sure you know; he's a film director. And when we showed it to the whole jury, when they were choosing films, he said, "Have you ever thought of doing, you know, Princess Leia on this? I know Carrie Fisher. I mean, I'm sure she'd be into it." And at that point I thought, A, that would be incredibly cool; B, I'm not so sure she would be into it. But we'll see. Maybe someday we can at least demonstrate it for her. That would be cool enough. Okay. Thank you very much.