>>Vijay Kumar: So unmanned aerial vehicles. We've been working
on this for about 15 years now,
and I want to show you this picture which illustrates
the number of unmanned aerial vehicles and how it's grown over the years.
This is a picture from 2010, and the smaller vehicles that I'm going to talk to you about,
we started playing around with these in 2005.
You can see the exponential growth in these vehicles.
In the 1980s we did not have commercial GPS, and therefore
it was really hard to develop autonomous flying robots.
So in 2010, people said, well, this is going to be a $10 billion industry,
and people projected all kinds of uses. But they were primarily military uses.
It was about surveillance, about spying, force protection, warfare,
all the kinds of things that many of us are not really interested in.
Certainly, from a scientific standpoint, these don't present the challenges
or the potential for impact, no pun intended.
The FAA famously predicted that we'll have 15,000 civilian drones by 2020.
So fast forward five years, and now they are all over the place.
I don't have to explain to anyone what an unmanned aerial vehicle is
or what a drone is. And going back to the number 15,000,
15,000 drones were sold in a single month last year.
And people estimate that over a million drones were sold
just in December, in the Christmas season.
So it's a $15 billion industry already, and again,
people make famous predictions that it's going to grow to $20 billion, $25 billion by 2020.
And you know that that number is also going to be wrong.
But what's exciting now is that the applications have grown to areas that we never imagined.
So agriculture, inspecting different aspects of our civilian infrastructure,
border patrols, photography, construction and so on, so forth.
So it's really grown to be a very exciting field.
Of course, different people call them different things.
We prefer to use the term aerial robots, because we think of robots as being smart,
able to make their own decisions and so on.
The military calls them remotely piloted vehicles, because in fact,
they're not drones. There are human beings that are controlling these vehicles
every step of the way. So it's actually a misnomer to call them drones.
But of course, the popular press and all of us call them drones.
So to me, they're all now pretty much the same thing.
It seemed appropriate to show this picture in this museum.
You think about the evolution of aerial robots, and we are just starting.
And really, we want to be further along.
In my lab, we look at what I call the five S's of aerial robotics.
So the first S is we want to make them small.
If you want to navigate an environment with humans, you want to be small.
You want to be able to maneuver. You want to go through doorways.
You want to go across rubble in buildings. So we are looking for making them small.
We also want to make them safe.
Clearly, we don't want something banging into humans causing harm,
so therefore we're trying to make them as safe as possible.
Smart, this is an obvious thing. If you're building a robot you want it to be smart.
The fourth S is speed: we want these robots to move quickly.
We want to create robots that move so fast that you actually have to slow the video down
to see what happens, just like NFL replays.
So those are the kinds of robots we're shooting for.
And then finally, we think about swarms, so that's the fifth S in our vernacular.
So having just given you a flavor of the kinds of problems we're interested in,
I want to tell you a little bit about how we think about autonomous control.
First, you're trying to control something which really lives in six dimensions.
There are three positions and three orientations that you have to simultaneously control.
And a robot like this has four rotors, so you only have four inputs.
You have four motors, and you're trying to control six different things
with those four motors. The system is under-actuated.
It's sort of an unfair problem mathematically, because you're trying to do six things with
only four inputs.
So these robots are called quadrotors because they have four rotors.
Even if you add more rotors you are fundamentally under-actuated.
And so we spend a lot of time just attacking this problem.
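To make the under-actuation concrete, here is the standard rigid-body quadrotor model from the literature (a textbook formulation, not necessarily the exact one used in the lab):

$$
m\ddot{\mathbf{r}} = f\,R\,\mathbf{e}_3 - mg\,\mathbf{e}_3, \qquad
J\dot{\boldsymbol{\omega}} = \boldsymbol{\tau} - \boldsymbol{\omega}\times J\boldsymbol{\omega},
$$

where $\mathbf{r}$ is the position, $R$ the rotation matrix describing the orientation, and $\boldsymbol{\omega}$ the angular velocity. The four rotor speeds map to exactly four inputs: the total thrust $f$ and the three body torques $\boldsymbol{\tau}$. With coplanar rotors, adding more rotors only changes how rotor speeds map to those same four quantities, which is why the system stays under-actuated.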
The second thing we do is think about how to
design software that runs in real time. So I'm showing you a picture here.
I'm just trying to impress you with a block diagram,
but the thing I want to get across to you is that you have these feedback loops.
So if you look at the innermost feedback loop you will see that
it operates at roughly a millisecond.
That means every millisecond the robot is estimating its rotation,
its attitude in the real world and its angular velocity,
and trying to regulate that to get the precise orientation it wants.
So then the intermediate feedback loop in the middle,
you're feeding back position and velocity,
and that operates roughly at 10 milliseconds.
And then finally, the outermost loop, you're thinking about trajectories in the real world
and how to plan trajectories, and that operates roughly at 100 milliseconds.
So those are the three levels of intelligence that need to be built into the system.
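A minimal sketch of how those three rates can nest in software, assuming hypothetical robot, planner, and controller objects (illustrative only, not the lab's actual stack):

```python
import time

ATTITUDE_DT = 0.001  # innermost loop period: ~1 ms

def control_loop(robot, planner):
    """Run the three nested loops from the block diagram at 1/10/100 ms."""
    tick = 0
    t0 = time.monotonic()
    while robot.flying:
        if tick % 100 == 0:
            # Outermost loop (~100 ms): plan trajectories in the world frame.
            robot.trajectory = planner.plan(robot.state)
        if tick % 10 == 0:
            # Middle loop (~10 ms): regulate position and velocity.
            robot.attitude_setpoint = robot.position_control(robot.trajectory)
        # Innermost loop (~1 ms): estimate and regulate attitude
        # and angular velocity to get the precise orientation.
        robot.motor_commands = robot.attitude_control(robot.attitude_setpoint)
        tick += 1
        time.sleep(max(0.0, t0 + tick * ATTITUDE_DT - time.monotonic()))
```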
And although all of our students are engineers,
they spend 80 percent of their time thinking about software.
So therefore, they also become computer scientists in this field.
The most critical things are the computations that happen onboard.
These are the orientation calculations to figure out how to control the orientation.
Some of the other computations actually don't have to happen onboard.
So if you look at the position, sometimes we get estimates of positions
from external cameras, and those computations can actually happen off board.
In fact, a lot of the trajectory planning software
that we write and test in the lab happens on a laptop like mine.
Today we hear a lot about cloud infrastructure.
Well, we use the cloud infrastructure to run real time control loops
for vehicles like this as they maneuver through the environment.
We try to make these robots as small as possible.
This is work of Yash Mulgaonkar. This is the smallest robot we've built.
It's only 11 centimeters tip to tip. It has a max speed of about six meters per second.
And in terms of body lengths per second,
its speed is equivalent to a Boeing 787 flying 50 times the speed of sound.
So it actually flies pretty fast for something this small.
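As a quick sanity check on the body-lengths metric (my arithmetic from the numbers just quoted):

$$
\frac{v}{L} = \frac{6\ \text{m/s}}{0.11\ \text{m}} \approx 55\ \text{body lengths per second}.
$$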
By making it small, we automatically make it safe.
And by making it small, we also make it more maneuverable, as I'll explain in a minute.
The inspiration from this really comes from nature.
So if you look at honeybees, for example, they're extremely small.
They're very maneuverable. In fact, unlike all the big robots we build,
they don't even think about avoiding collisions.
So for us, the nightmare is, we build this big behemoth,
and the first thing we have to think about is not bumping into features
in the environment or into each other.
Well, you look at these honeybees and they love collisions, because by colliding, they
learn.
So by contacting your neighbor you actually know who your neighbor is
and you know a little bit about your environment.
So we'd love to be able to create robots of this scale.
In our lab, we actually think about how to scale things down.
Here you see work of Yash Mulgaonkar. And that's the sound of the robots.
One of the things you notice is that when the robots become small,
they're able to respond more quickly to perturbations.
In fact, we can show through scaling laws that the maximum acceleration
you can get goes as one over the characteristic length.
In other words, you make a robot half the size and their ability
to maneuver in the rotational direction doubles.
Likewise, the robustness, which we call the basin of attraction,
grows dramatically as you shrink the size of the vehicle.
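One back-of-the-envelope way to see the acceleration scaling, under the common assumption that rotor tip speed stays fixed as the vehicle shrinks:

$$
F \propto L^2 \ (\text{thrust scales with rotor disk area}), \qquad
m \propto L^3, \qquad
a_{\max} = \frac{F}{m} \propto \frac{1}{L},
$$

so halving the characteristic length $L$ doubles the achievable acceleration, consistent with the claim above.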
This might be counterintuitive. All of us who fly on large aircraft
prefer those to smaller turboprops, but the turboprops and the large aircraft
are never subject to perturbations like this.
If you want to respond to collisions and react to them and be robust to them,
you really want to think about sizing things that are much smaller,
and that's what we try to do.
Anecdotally, when we first started working on this we realized we needed to have
a first aid kit, so we started buying Band-Aids and things like that.
If you plot a histogram of Band-Aid usage over the years, it has tailed off
now that we've moved to these little guys; people don't get hurt, which is great.
And this is, again, at 1/20th speed, probably the first mid-air collision -
planned mid-air collision - with vehicles bumping into each other and recovering from these
collisions.
These robots are traveling at roughly two meters per second, walking speed.
So imagine one person standing still and you walk right into that person.
You feel the impact. Well, these robots feel it, but
they're able to recover from it quite spontaneously.
So that's the advantage of size. In terms of figuring out how to plan these motions,
we think a lot about how to represent the dynamics of these vehicles.
So if you look on the right hand side, there's this huge vector of things that the robot stores:
its position, its velocity, its rotation, its angular velocity.
And we think of clever ways in which to abstract from this a smaller dimensional representation,
which consists only of the position and the orientation, the heading angle,
much like you would when you drive a car.
When you drive a car you think of your position on the road
and you think of the yaw angle of the car, and that's roughly what you see
in this left hand side picture.
If you work in the smaller abstraction,
then you can think about planning trajectories in that space that are safe,
and then use some fancy mathematics to ensure that these trajectories are as smooth as possible.
So again, the intuition is, if you have lots of inertia,
you don't want your trajectories to be jerky. You want them to be smooth.
We try to minimize what is called snap, which is the fourth derivative of position
with respect to time.
So the first derivative is velocity, the second is acceleration, the third is jerk, and the fourth is snap.
You can also do crackle and pop, but we don't.
But we try to minimize that and then find the right trajectory.
So that's the essence of the planning problem.
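In the form this group's minimum-snap work popularized, the planning problem can be written schematically as

$$
\min_{\mathbf{r}(t)} \int_0^T \left\|\frac{d^4\mathbf{r}}{dt^4}\right\|^2 dt
\quad \text{subject to waypoint and obstacle constraints,}
$$

and with $\mathbf{r}(t)$ represented as piecewise polynomials this becomes a quadratic program, which is what makes the fraction-of-a-second replanning described below possible. (A schematic statement; the exact constraints vary by application.)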
Once you do that in the simpler space, which is a problem in computational geometry,
we transform the result over into the more complex space and then execute it.
And that's basically what you see in these videos.
You see the robot going through these planned obstacles.
So if the robot knows where the obstacles are,
it can plan these minimum snap trajectories in a fraction of a second, often 20 or 30 times
a second.
And it doesn't matter if the obstacle is moving.
If the robot knows how the obstacle is moving,
it can determine how to plan trajectories to go through the obstacle.
So this is the bread and butter of all the planning algorithms that we use.
Some of you may have seen videos of birds fishing to catch their prey,
and this is amazing. Look at this bird coordinating its flight, its vision and so on,
and its claws. We try to do the same thing with robots.
So here's a robot fishing for Philly cheesesteak hoagies,
and it's able to pick that out. So again, we focus on the split second timing,
coordinating vision, coordinating arms, coordinating hands,
and flight as you fly through complex environments.
Then finally, work of Sarah Tang, where she's able to use this framework
to think about transporting suspended payloads
whose length is more than the height of the window.
So you have to figure out how to get the momentum of the object to be such that
the suspended payload swings through first before the robot actually goes through it.
So these calculations look complicated,
but by abstracting the dynamics to a simpler space,
we're able to solve this in real time and then feed it to the robot.
And then lastly, this problem of trying to perch in complex environments.
Again, you want to perch to save energy, to rest.
And the challenge for us is to perch on vertical surfaces.
So we have a gripper which is made out of a dry adhesive.
I call these the Spiderman claws, and they're able to hold onto flat surfaces;
a gripper designed by colleagues in Mark Cutkosky's lab at Stanford.
And again, this framework allows us to land on any vertical surface,
or any tilted surface for that matter, at just the right velocity to achieve perching.
So we're able to get autonomy in a wide variety of settings.
Not just in flight, but also perching, grasping and things of that nature.
I want to tell you a little bit about the problem of state estimation.
In everything I've shown you thus far, we have cheated.
We have cheated in the following way.
In the lab, our robots are equipped with motion capture cameras and reflective markers.
So the cameras see the reflective markers and they compute the position
of the robot a hundred to two hundred times a second,
and then deliver that information to the robot.
The robot knows where it is at all times. It's like having GPS on steroids.
You know exactly where you are, and unlike in the city when you're going around
and you lose GPS, here you never lose GPS.
So this gives the robots an unfair advantage, and they're able to do all the things
that you just saw with amazing precision.
In the real world, and here's a typical building on the Penn campus,
it becomes really challenging. Without external cameras you don't know where you are.
In fact, in this building, GPS doesn't work. My cellphone doesn't work.
We barely get Wi-Fi coverage.
So how do you get robots to localize in complex environments like this?
So we work a lot on this problem, and I want to show you a prototype
that was built by a former student who is now a professor
at Hong Kong University of Science and Technology, Shaojie Shen.
The system he built consists of two forward facing cameras,
and you can see the GPS receiver on top. You can see a laser scanner,
which is this orange band on the top. And then there's a downward facing camera, too,
which you don't see.
So this package allows the vehicle to sense features in the environment
and determine where it is relative to those features. Then as it moves,
much like humans, when we walk we're looking at things in the environment.
We take steps. We know roughly how far we've walked.
And then we look at how these features are flying past our retina.
We integrate that motion to then figure out where we are in the real world.
And that's what this robot is able to do.
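In spirit, that dead-reckoning loop looks something like the toy sketch below: compose small relative motions, each one a blend of what the inertial sensors predict and what the tracked features suggest. (Entirely illustrative; the measurements are made up, and a real visual-inertial estimator is far more involved.)

```python
import numpy as np

def se2(x, y, theta):
    """Homogeneous 2D transform, enough to illustrate pose composition."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, x], [s, c, y], [0.0, 0.0, 1.0]])

def fuse(imu_motion, visual_motion, w=0.5):
    """Toy fusion: blend the two relative-motion estimates.
    A real system would weight each one by its estimated uncertainty."""
    return w * imu_motion + (1.0 - w) * visual_motion

pose = se2(0.0, 0.0, 0.0)  # start at the origin
# Hypothetical per-frame measurements (dx, dy, dtheta) in the body frame:
# one from integrating the IMU, one from the flow of tracked features.
frames = [((0.10, 0.0, 0.010), (0.09, 0.0, 0.012)),
          ((0.10, 0.0, 0.010), (0.11, 0.0, 0.008))]
for imu_m, vis_m in frames:
    dx, dy, dth = fuse(np.array(imu_m), np.array(vis_m))
    pose = pose @ se2(dx, dy, dth)  # compose the incremental motion
print(pose[:2, 2])  # current position estimate in the world frame
```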
Here you'll see work of Sikang Liu that essentially takes this information
coming from the robots and constructs three dimensional maps.
This is just outside our lab. You can see it build high resolution maps
at five centimeter resolution as it enters the lab,
with all the clutter - obviously our lab. At the bottom, you see the map that it's building,
and the color of the objects that it sees is overlaid on this map.
So this is now leading to "smart." And it's not really smart in the sense that
it's not making any intelligent decisions in this particular experiment.
But it's smart enough now that it's able to perceive the environment
and represent it in terms of this three dimensional map.
Which is a great starting point. Imagine being outside a building and then
deploying the vehicle inside the building where you have a complete picture
of what's inside the building. You know something about its structural integrity.
If there's an active shooter in the building you can probably detect that shooter.
And if there are victims in the building you can localize
and tell rescue workers where they are. So this basic technology,
while it might not appear to be very smart to us, is actually smart enough to do lots
of useful things.
Here's the same type of technology in an outdoor flight.
Many of us have now heard about Amazon and Google
wanting to deliver packages to our doorstep. This, in theory, works.
It works when you are flying at let's say 400 feet, just at the FAA ceiling,
where GPS is clear and you're relatively unobstructed.
But what happens when you get to features such as trees, where your GPS might not work?
And in our case, when you have cameras,
the cameras might not have enough illumination to function.
So we look at combinations of sensors, as you see on the top left.
And at every instant, the robot is able to estimate its error.
So if you see that ellipsoid in the middle, this is not unlike the ellipsoid you see
in Google Maps
which tells you the error in your position.
So the vehicle not only calculates where it is, it's also able to tell the software
what its estimate of the error is. And as it goes around this complex,
indoor and outdoor, using lasers indoors where cameras and GPS don't work,
and outdoors in bright sunlight where the camera doesn't work but maybe GPS works,
it's able to navigate its way through a fairly complex environment.
So this is a half-kilometer flight at roughly walking speed,
and it's able to do all of this autonomously.
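That per-instant error estimate is just the covariance of the state estimator; the ellipsoid is one of its level sets. A minimal predict/update sketch in the style of a Kalman filter (illustrative noise values, not the lab's actual estimator):

```python
import numpy as np

# State: 2D position. P is its covariance; the "error ellipsoid"
# drawn on the map is a level set of P.
x = np.zeros(2)
P = np.eye(2) * 0.01

F = np.eye(2)          # motion model (identity, for simplicity)
Q = np.eye(2) * 0.05   # process noise: uncertainty grows while flying blind
H = np.eye(2)          # we observe position directly
R = np.eye(2) * 0.10   # measurement noise (GPS outdoors, laser indoors, ...)

def predict(x, P):
    """No measurement available: the ellipsoid inflates."""
    return F @ x, F @ P @ F.T + Q

def update(x, P, z):
    """A sensor fix arrives: the ellipsoid shrinks."""
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)   # Kalman gain
    return x + K @ (z - H @ x), (np.eye(2) - K @ H) @ P

x, P = predict(x, P)                          # e.g. camera blinded by sunlight
x, P = update(x, P, np.array([0.10, -0.05]))  # e.g. a GPS fix comes back
print(np.sqrt(np.diag(P)))                    # 1-sigma axes of the ellipsoid
```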
So this is a very important technology as you get close to human-built environments,
like buildings, or trees that just happened to grow since the last time you were there.
So you need to detect that. You have to react to it and then behave in a safe way.
So one problem that we run into is that these vehicles burn a lot of power.
So if you look at rotorcraft, they burn roughly 200 watts per kilo. That's a lot.
That's like four light bulbs for every kilo of payload you carry.
And part of the problem is that all this hardware I've shown you is actually quite heavy.
The cameras are about 80 grams, and the laser range finder that I showed you is about 370 grams.
Our Intel processor board is about 200 grams.
So you add all of this up, not only are you burning power to power the devices,
you're also burning power just to carry these devices.
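Adding up just those components using the figures above (my arithmetic):

$$
(80 + 370 + 200)\ \text{g} = 650\ \text{g}, \qquad
0.65\ \text{kg} \times 200\ \text{W/kg} = 130\ \text{W},
$$

so roughly 130 watts are burned simply keeping that sensing and computing payload in the air, before the devices themselves draw a single watt.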
So a big challenge for us is to actually limit the power consumption.
If you don't limit the power consumption you have to carry bigger batteries,
And if you carry bigger batteries, that's extra payload and you're burning even more power.
All the things I've been telling you about, being small,
being safe, that goes right out the window because your devices keep getting bigger.
So this is a big challenge for us. But consumer electronics sometimes comes to the rescue.
So if you ask yourself the question, what is an inexpensive device that you can buy
today
that has sensing and computing in a lightweight package and low power,
of course it's your smart phone. So we started asking the question,
could we build something that's powered exclusively through smart phones?
So we came up with this idea of a "phlone". So you buy an off-the-shelf,
in this case Samsung Galaxy S5 phone, and you download our app.
And then you buy a USB cable - and make sure the USB cable is as short as possible
because you want to limit the weight - and you plug it into a drone.
This just happens to be the robot that we built, but it will work with most drones.
Then you can actually power the device using a smart phone.
So I want to show you - this is Giuseppe Loianno's work - the robot that he built,
where this phone is actually taking pictures of everything it sees in the environment
30 times a second, calculating features in the environment, estimating distance to the
features,
and from that, estimating its position.
So all the computation and all the sensing is done onboard using the phone's camera,
the phone's processor and the phone's inertial measurement unit,
which is basically a system of accelerometers and gyros that measure accelerations
and angular velocities.
So this is in collaboration with Qualcomm, but you can see three meters per second
autonomous flight, all planned by Giuseppe through his software.
And of course, you can get it to do whatever you want, and you can just imagine,
you can take the mother of all selfies if you position it wherever you want.
So this gives us some hope that you can actually build really lightweight devices
with off-the-shelf hardware. So it's inexpensive, lightweight, and also safe.
The other S-word I talk about is speed. So this is what we'd like to be able to do.
This is actually being driven by an expert pilot.
Imagine again responding to 911 calls and getting there
and responding to things quickly, finding out where the bad guys are.
We'd like to be able to do this autonomously.
There's only one small segment of this, and I'll show you this in a minute,
where the flight is autonomous.
So this piece, flying down the hallway. This is about three or four meters per second,
maybe a little more than that. So we know how to do that autonomously.
But navigating these bends at high speeds and going up and down the stairs,
these are things we're still working on. But that is something we'd like to do before this
becomes an effective tool that we might imagine using in search and rescue and first
response.
Finally, I'd like to talk a little bit about cooperative control,
where we look at the problem of how to get all of these robots to collaborate
and do something useful. And of course, once again, we're inspired by nature.
So this is a picture with half a million to a million starlings off the coast of Denmark,
and you can see them form these incredible patterns in the sky.
To my knowledge, they don't use a whole lot of mathematics to do this, right,
but mathematics is the tool that we have at our disposal.
So it's a real challenge to be inspired by nature, and then work with tools
that we know to create these kinds of behaviors.
Instead of trying to mimic them, what we have tried to do is
understand some basic organizing principles that we believe allow us to
accomplish these kinds of movements. So it's not just about flight.
We can see ants cooperatively carrying objects, and they carry this object back to their nest.
They think it's food. The reason they think it's food is because this plastic object
we created is coated with the juice from figs, so they think it's food
and they carry it back to their nest. But this allows us to study cooperation.
This is actually an elastic disk, so it allows us to see which ants are pulling,
as you can see on the top, and which ants are pushing at the bottom.
You can also see which ants are not doing anything.
They're just goofing off and they're there for the ride.
But it's really intriguing how these ants spontaneously form teams
and are able to accomplish these incredibly complex tasks.
At least from a robotics standpoint, these are very complex tasks.
So again, the organizing principle is, first, each ant, each bird acts independently.
So we want robots to think about being completely autonomous and being self-contained.
Second, we'd really like them to work with local information.
There is no way in a room like this, if we had to make decisions,
that we wait for consensus to emerge and do something as a group.
Maybe that's what the government does today. That's why they don't do anything.
But it's very hard to achieve that. So you really have to work based on
what you know locally, and then act based on that.
The third idea is also fairly simple. This notion of anonymity.
We want individuals to be agnostic to who their neighbors are.
So if you think about a completely altruistic society of robots,
then the robots shouldn't care who their neighbors are.
We want them to collaborate in exactly the same way,
independent of which specific individuals they're surrounded by.
So we try to incorporate all of these elements into our software.
You can see here a demonstration of the first idea.
This is Katie Powers' work, where she has encoded these leader/follower behaviors into
the robots.
So the first robot is literally hijacked by David Pogue, and he is able to manipulate
it.
The other robots are basically responding to their neighbors.
And they don't care that one of them has actually been picked up by a human being
who is moving it around. They're just reacting to its position.
The simple idea here is that a single individual can actually manipulate,
maybe not quite a swarm, but in principle a swarm.
So the control computations that have to be done don't scale with the number of robots.
It's just the same computations you'd have to do if you just had a single robot.
And then everything else follows, because every robot is following a leader,
and then that robot has another leader and so on and so forth.
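A toy version of that leader/follower rule, to show why the computation per robot stays constant regardless of swarm size (hypothetical gains and positions, not Katie Powers' actual controller):

```python
import numpy as np

def follower_step(my_pos, leader_pos, offset, gain=1.0, dt=0.02):
    """One follower update: steer toward the leader's position plus a fixed
    offset. The cost of this rule is constant per robot, so the control
    computation doesn't grow with the number of robots."""
    error = (leader_pos + offset) - my_pos
    return my_pos + gain * error * dt

# A chain of five robots: robot i follows robot i-1; robot 0 is the leader
# (the one a human can pick up and move). Positions are hypothetical.
positions = [np.array([float(-i), 0.0]) for i in range(5)]
offset = np.array([-1.0, 0.0])   # each follower sits 1 m behind its leader

positions[0] = positions[0] + np.array([0.0, 0.5])  # "hijack" the leader
for _ in range(300):             # everyone else just reacts to a neighbor
    for i in range(1, len(positions)):
        positions[i] = follower_step(positions[i], positions[i - 1], offset)
print(np.round(positions, 2))    # the chain has re-formed behind the leader
```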
The second idea is this concept of anonymity, Matt Turpin's work.
And here you could see that the robots have been asked to form a circular pattern.
They know the pattern that they have to form, but again, they are agnostic to their specific
neighbors.
They're agnostic to even the number of robots on the team.
So as long as they know where the pattern has to be formed and what the shape
of the pattern is, they're able to find their place, adjust their spacing
with respect to their neighbors.
And now we're beginning to see something that might resemble the pattern formation
that we saw in the starlings. Admittedly, this is a very simple circular pattern,
but still, they're doing it autonomously without worrying about the number of robots on the
team.
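One standard way to get that anonymity, in the spirit of concurrent-assignment work (a sketch, not necessarily Matt Turpin's exact algorithm): treat "which robot fills which slot in the pattern" as an assignment problem over travel costs.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def assign_goals(robot_positions, goal_positions):
    """Anonymous goal assignment: minimize total squared travel distance.
    No robot cares which slot it gets, only that every slot is filled."""
    diff = robot_positions[:, None, :] - goal_positions[None, :, :]
    cost = (diff ** 2).sum(axis=2)            # pairwise squared distances
    rows, cols = linear_sum_assignment(cost)  # optimal robot -> slot pairing
    return cols

# A circular pattern with as many slots as there happen to be robots.
n = 8
robots = np.random.rand(n, 2) * 10.0
angles = np.linspace(0, 2 * np.pi, n, endpoint=False)
circle = 5.0 + 3.0 * np.stack([np.cos(angles), np.sin(angles)], axis=1)
print(assign_goals(robots, circle))  # slot index assigned to each robot
```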
Then finally, you see some of these things put together where the pattern
actually changes shape, starting with a rectangle, then into an ellipse,
into a straight line, back into a circle. In all of these computations,
a programmer is essentially telling the robots what patterns to form by giving the robots
different shapes as a function of time. And the robots figure out which robot
needs to be where in order to describe the shape and they adapt to the commands.
So you could see how these kinds of algorithms might be used now
for half a million robots if we had them. And if there was a place
we could do these kinds of experiments,
these algorithms would scale to those large numbers.
I want to talk a little bit about why we're doing what we're doing,
besides creating these cool videos and publishing them on YouTube.
And of course, everybody loves those, but ultimately,
we're interested in solving some real problems.
The first problem area that we're very excited about is agriculture.
If you look at the challenges facing society, you quickly come to the conclusion that
water and food, and actually these challenges are related,
are our number one challenge. The efficiency of almost all production systems in the world
has gone up over time, but for food it's actually going down for a variety of reasons.
So one thing we're really interested in is trying to see how we can use robots
to monitor and tend crops. Here's our robot flying in an apple orchard
carrying all kinds of sensors. And they're able to, in this environment,
do fairly simple things. On the bottom left, they're gathering infrared information.
On the bottom right, they're building three dimensional maps of apple trees.
And in the center, they're computing an index called NDVI.
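For reference, NDVI is the normalized difference vegetation index, computed per pixel from the near-infrared and red reflectance bands:

$$
\text{NDVI} = \frac{\text{NIR} - \text{Red}}{\text{NIR} + \text{Red}}.
$$

Healthy vegetation reflects strongly in the near-infrared and absorbs red light, so vigorous plants push the index toward 1.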
So each of these pieces of information is useful in order to assess the health of a
plant.
So for instance, if you know something about the size of the plant,
if you have a three dimensional map, you can fly by that plant week to week
and measure the state of growth, and you can estimate how healthy it is.
If you look at the NDVI, the central image, that essentially tells you something about
the vigor of the plant. Something even more basic: flying past these plants,
we can count apples and estimate the yield,
which helps farmers plan for downstream picking, harvesting and then shipping;
something quite basic that every manufacturing facility has, but farmers don't.
Another thing we're working on, and this is Kartik Mohta's work,
is this notion of robot first responders. So imagine you have a 911 call
from a building. You can imagine a swarm of robots equipped with cameras
getting to the building and surrounding it long before search and rescue workers
come to the scene, long before first responder police officers come to the scene.
What we are really trying to do is shown on the top left: the operator interface,
what the dispatcher might see before he or she even reacts to the 911 call.
The robots are surrounding the building deciding who takes up
what position around what ingress or egress point,
all the time assimilating information and building a mosaic,
as you see on the top right, and a three dimensional map on the bottom.
So now, if a police car were to drive up to the scene,
they would be equipped with all this information before they even get there.
And they would know what to do before they got there.
This is a very important tool in operations,
where oftentimes speed of response is so critical.
This is not true just for outdoor operations, but also indoor operations.
I want to show you some experiments we did.
This was about five years ago, after the Fukushima earthquake.
This was in a town not too far from Fukushima where our aerial robot is hitching a ride
on one of our Japanese colleagues' ground robots. And by the way,
the reason it hitches a ride is because our robots are programmed to be lazy.
They burn a lot of power, so anytime they can ride on top of something else, they do.
But you can see in this collapsed doorway they quickly realize that the team
cannot go through. So the aerial robot takes off, is able
to cross over the bookshelf, see what's on the other side,
all the time creating a three dimensional map.
And this kind of information can then be made available to somebody
who is standing outside the room or outside the building,
providing valuable information in terms of the structural integrity of the collapsed building,
in terms of potential victims, and assessing the state of the building.
In this particular experiment, again, this was five years ago -
we were able to build three dimensional maps.
And this is three stories - the seventh, eighth and ninth floor of a nine story building.
So the map is a five centimeter resolution map, and this took a long time to build.
This experiment lasted about two and a half hours, and that's one of the challenges of
robotics.
If I tell a search and rescue worker or a first responder that I want you
to give me two and a half hours so I go into this building and give you this
wonderful map, nobody is going to give me that time. I'll be lucky if they give me two
and a half minutes, or maybe two and a half seconds.
That's where this idea of swarms comes in. We really want systems
that can go in really quickly, collect the data, and by the time they come out,
have assimilated this information and built a three dimensional map.
And that's the kind of thing we're shooting for.
So let me just conclude with a poster of an upcoming Warner Brothers movie
called The Swarm.
Actually, some of you might be old enough to actually remember this movie.
Has anyone seen this? If you've seen it you probably know
that you won't recommend it to your friends. It's actually a terrible movie.
It's about killer bees that attack mankind and so on.
But I love the poster because everything about this poster is true.
The size is immeasurable. I hope I've convinced you the power is limitless.
Even that last piece, its enemy is man, which is true.
We have the technology and we have to find a way to harness the technology
and use it in a way that could be beneficial to society and to mankind.
So even that part is true.
So thank you very much.