Hello, everyone.
My name is Brian Dorsey, and we're going to
talk about load balancing.
So as you're probably aware, load balancing is a critical
part of nearly every scalable service you might want to run
on the internet.
And luckily, Google Cloud Platform Compute Engine has a
very scalable, powerful load balancer built right in.
Basically what you do is you configure a
pool of your instances.
And the Load Balancer will automatically direct traffic
among them, spreading TCP connections and UDP
packets across all of your instances.
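For reference, a pool-plus-load-balancer setup like that can be sketched with the gcloud CLI. The pool name, region, zone, and instance names below are placeholders for illustration, not the ones used in this demo:

```shell
# Define what "healthy" means: poll / on port 80, expect a 200.
gcloud compute http-health-checks create demo-health-check \
    --port 80 --request-path /

# Group your instances into a target pool guarded by that check.
gcloud compute target-pools create demo-pool \
    --region us-central1 --http-health-check demo-health-check

gcloud compute target-pools add-instances demo-pool \
    --instances demo-1,demo-2 --zone us-central1-a

# One external IP in front; traffic is spread across healthy instances.
gcloud compute forwarding-rules create demo-lb \
    --region us-central1 --ports 80 --target-pool demo-pool
```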
As long as they stay healthy, they get traffic.
And if they become unhealthy, the Load Balancer will no
longer send them traffic until they become healthy again.
And you get to define what exactly
healthy is for your instances.
What happens is Compute Engine will send HTTP requests to
your instances.
And when an instance sends a 200 response back,
it's considered healthy.
And if anything else comes back, it's unhealthy.
So it's completely under your control.
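As a sketch of what that contract looks like from the instance's side, here is a minimal Go health endpoint. The handler name and the `healthy` flag are our own illustration, not part of the demo; in a real service the flag would reflect whatever "healthy" means for your application:

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// healthy stands in for your app's own definition of healthy; it
// could check a database connection, queue depth, and so on.
var healthy atomic.Bool

// healthHandler is the endpoint the health checker polls. Only a 200
// keeps the instance in rotation; any other status (or a timeout)
// marks it unhealthy until it answers 200 again.
func healthHandler(w http.ResponseWriter, r *http.Request) {
	if healthy.Load() {
		w.WriteHeader(http.StatusOK)
		return
	}
	w.WriteHeader(http.StatusServiceUnavailable)
}

// probe simulates one health check against an in-process test server
// and returns the HTTP status code it saw.
func probe() int {
	srv := httptest.NewServer(http.HandlerFunc(healthHandler))
	defer srv.Close()
	resp, err := http.Get(srv.URL)
	if err != nil {
		return 0
	}
	defer resp.Body.Close()
	return resp.StatusCode
}

func main() {
	healthy.Store(true)
	fmt.Println(probe()) // 200: instance stays in rotation

	healthy.Store(false)
	fmt.Println(probe()) // 503: drained until it reports healthy again
}
```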
Let's go ahead and take a look at the demo.
So what we've got here is an App Engine application.
When I hit this Start VMs button, it makes a request to
the Compute Engine API, and we spin up some
instances to run this demo workload.
It's a fractal generator.
And as the instances come up, they run a startup
script that downloads a Go-language fractal generator
and starts it running.
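A startup script for that might look roughly like the following. The download URL and binary name are made up for illustration, since the demo's actual script isn't shown:

```shell
#!/bin/bash
# Hypothetical startup script: fetch the fractal server and run it.
curl -sSL -o /opt/fractal https://example.com/fractal-server
chmod +x /opt/fractal
/opt/fractal --port 80 &
```

You attach a script like this to an instance as startup-script metadata, and Compute Engine runs it on boot.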
When an instance goes green here on the demo, that
means it's up and starting to go through
its boot process.
And as soon as we start seeing check marks, those
instances are actually running the fractal software.
And this is basically intended as a proxy for your
application.
You can imagine any sort of CPU-heavy application--
this is taking the place of that in the demo.
So let me go ahead and show the fractals here.
On the left-hand side, we've got a single instance
serving this up.
And on the right-hand side, right
now we have 10 instances.
So we can go ahead and add more VMs.
And as we zoom around, even though we have new VMs coming
up, we see both sides working well.
And on the right, things are coming in faster
because multiple instances are actually cooperating
to serve this up.
And that's all transparent as far as this client
application's concerned.
Each side is just hitting one IP address, and the answers
are coming back.
If something were to go wrong with one of your instances --
in this case, we'll head behind the scenes and cause the
failure ourselves -- the Load Balancer will route around it.
We're going to go ahead and get rid of number two.
And the numbering is zero-based here, so we should see this
one drop out, and there it goes.
And we can still zoom around, and nothing has changed as far
as our client application's concerned.
It's still sending and receiving requests, still
moving fast.
We've added new instances in and we've dropped one out, and
all of that's transparent to both our application and all
of our clients.
So let's come back.
Thanks for that.
And I also want to stress that this is not just a virtual
machine serving this traffic up.
This is Google's network infrastructure acting as your
load balancer.
So our network infrastructure is actually routing your data
to your instances.
And what that means is, if you have a big spike in traffic,
you don't have to wait for virtual machines to warm up in
order to handle that traffic.
Google's network infrastructure will
handle it for you.
So please give it a try.
We'll have some links to the docs and other details down
below the video.
So take care, and happy load balancing.