CRAIG MCLUCKIE: All right.
Welcome everybody.
Oh, a little bit loud.
I'm going to take a little bit of time to go through
something that was announced today, which is
Google Compute Engine.
During the keynote, Urs introduced our new
Infrastructure-as-a-Service product to the
Google Cloud platform.
I'm going to take a few minutes to elaborate on some
of the stuff he shared with you.
He showed how Google Compute Engine has been used by the
Institute for Systems Biology to take a life-saving cancer
research process and move it from something that took hours
to minutes to seconds.
I'm now going to take a little bit of time to share with you
how Google Compute Engine can offer you that same set of
capabilities, how you can tackle some really big
computing problems on Google's infrastructure.
And during the session, I'm going to provide a very broad
brush stroke sense of what we've been up to so you
understand what the product looks like.
I'm going to invite some of our partners up on stage to
share some of their experiences and the
technologies that they're offering.
And I'm also going to give you some demos so you can actually
see the product in action.
So let's jump right in.
Google Compute Engine is Infrastructure-as-a-Service.
So when describing this product, I think the logical
place to start is with Google's infrastructure.
Now, as you know, Google runs some very big internet-scale
businesses.
And to be successful in these big businesses, we need a
really large infrastructure.
Take running search, for instance: every day we index billions and billions of web pages.
Our Caffeine index is over 100 million gigabytes.
It's huge.
And we're able to provide results within a quarter of a
second to those search queries.
To do that requires some very interesting infrastructure.
To be successful as a business, it's not just about
the size of the infrastructure, the
performance of the infrastructure, the scale of
the infrastructure.
It's also about the efficiency of infrastructure.
And the efficiency to us is important for two reasons.
One is environmental impact, obviously.
We're very green by nature.
And our data centers actually consume only about half the
energy of traditional data centers.
But it also translates into money.
And it means more
profitability for the business.
So we have spent a tremendous amount of time and effort
innovating relentlessly, focusing on bringing new
capabilities to market and new technologies
to market for ourselves.
And we've also focused on refinement of every process in
our data centers, whether it's the design of the hardware
down to the silicon; whether it's our software-- and we've
brought some really neat and interesting technologies to
fuse together commodity hardware and provide very high
quality of service offerings on top of it--
to our data center design.
Whether it's dealing with cooling; whether it's dealing
with power distribution; or to our operations, rolling out
new infrastructure, doing that efficiently, dealing with the
life cycle of our hardware, recycling it.
So we've relentlessly refined across this.
And what we brought together is this incredible power, this
incredible scale, and incredible efficiency.
And now you can access that and run your processes on our
infrastructure.
So what is Google Compute Engine?
At its heart, it's Infrastructure-as-a-Service.
Infrastructure-as-a-Service starts with Compute.
We provide Linux virtual machines that you can rent by
the hour on demand.
You can configure them the way you want, and you can run
traditional workloads as if they were running
in your own data center.
It's about storage: providing options that emulate local
disk, whether that's durably replicated, reliable, highly
performant off-instance storage, or large, efficient local
storage for data-centric applications, all the way up to our
internet-scale cloud storage offering, which lets you keep
shared content in an internet-scale object store.
It's about network: being able to take these virtual machines
and expose them to the network so that you can access them,
and to do that in such a way that you have flexible control,
using firewalls, over who can speak to them
and who can see them.
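A firewall rule in this model can be sketched as a simple resource description. The field names below approximate the general shape of the Compute Engine REST API but are illustrative assumptions, not the exact schema.

```python
# Illustrative sketch of a firewall rule controlling who can reach a
# set of virtual machines. Field names approximate the shape of the
# Compute Engine REST API; they are assumptions, not a verbatim schema.

def make_firewall_rule(name, network, source_ranges, allowed_ports):
    """Build a firewall-rule description allowing TCP traffic on the
    given ports from the given source CIDR ranges."""
    return {
        "name": name,
        "network": network,
        "sourceRanges": source_ranges,
        "allowed": [
            {"IPProtocol": "tcp", "ports": [str(p) for p in allowed_ports]}
        ],
    }

rule = make_firewall_rule(
    name="allow-http-ssh",
    network="default",
    source_ranges=["0.0.0.0/0"],   # open to the internet
    allowed_ports=[22, 80],
)
print(rule["allowed"][0]["ports"])
```

A real rule would be POSTed to the service; the point here is simply that reachability is declarative, scoped per network, and separate from the machines themselves.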
And it's also about being able to fuse these virtual machines
together to form very powerful compute clusters, so you can
tackle large-scale data-processing problems the
same way Google does.
And then, of course, it's about the tools.
You need to be able to command and configure and control
these virtual machines.
And so we've provided a portfolio of tools so that you can
jump in there and configure them exactly the way you want
with a low-level command-line tool, or get easy access
using a simple UI tool.
And we'll spend some time jumping into these tools and
explaining the service to you more.
And so what I'd like to do is invite Chris up.
And I think the easiest way to really understand the product
is just to see it.
That will really give you a sense of what exactly this is.
So Chris will do a little demo for us.
CHRIS: All right.
Let's flip these guys.
CRAIG MCLUCKIE: There we go.
CHRIS: All right.
Thanks, Craig.
I'll be giving a brief tour of our UI today, showing you
around a virtual machine instance.
So let's go ahead and dive right in.
Our UI is located in the Google Developer APIs site.
We have our compute nodes here in the form of
virtual machine instances.
I only have a single instance running right now.
We can look at our durable storage in the form of
persistent disks that we list here.
Again, just a single example.
And then our globally available networks.
And right now, I just have my default network defined here.
We also have our resource zones and a list of operations
used for auditing what actions have been
taken on your resources.
We'll get into those in some later talks.
So all these are bundled together in a
Google Developer project.
So here in the overview pane, we can see that my project is
provisioned for 20,000 instances,
20,000 CPUs, et cetera.
So let's go ahead and start an instance.
All right.
I like fractals, so I'll name this guy Mandelbrot.
And I'm going to boot this with an Ubuntu image that has
some fractal tools baked into it, which I created just
before this talk.
I'm also going to turn on a service account.
This enables seamless authentication to other cloud
services that we have without you needing to manage keys,
push keys into your VMs, anything like that.
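The instance Chris is launching here, with a service account attached, can be pictured as a request body like the following. The field and scope names are assumptions modeled on the public REST API, offered only to make the "no key management" idea concrete.

```python
# Hypothetical sketch of the JSON body for creating an instance with a
# service account attached, so code on the VM can reach Cloud Storage
# without managing or pushing keys. Field names are assumptions modeled
# on the public REST API, not a verbatim schema.
import json

instance_body = {
    "name": "mandelbrot",
    "machineType": "n1-standard-1",         # assumed machine-type name
    "image": "ubuntu-with-fractal-tools",   # the custom image from the talk
    "serviceAccounts": [{
        "email": "default",
        # Scopes determine which Google services the VM can get tokens for.
        "scopes": ["https://www.googleapis.com/auth/devstorage.read_write"],
    }],
}
print(json.dumps(instance_body, indent=2))
```

Code running on the VM then obtains short-lived credentials for the scoped services automatically, which is the "seamless authentication" being described.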
So we'll go ahead and get that guy started.
So while that's spinning up, it's worth noting that our UI
is actually an App Engine application making calls
against Compute Engine's public REST API.
There's no back doors in use here.
We'll be open sourcing this UI to you.
So you'll be able to use it as the basis for specialized
platforms or more sophisticated UIs.
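Since the UI is just a client of the public REST API, any program can make the same calls. The sketch below builds the kind of URL such a client might use; the endpoint path and version string are illustrative assumptions, not a documented guarantee.

```python
# Sketch of how a UI client might address the public REST API to list
# instances in a project. The path shape and version string are
# illustrative assumptions; there are no private back doors involved,
# just an authorized HTTP GET.

def list_instances_url(project, api_version="v1beta12"):
    """Compose the (assumed) resource URL for listing a project's instances."""
    base = "https://www.googleapis.com/compute"
    return f"{base}/{api_version}/projects/{project}/instances"

url = list_instances_url("my-project")
print(url)
# A client such as the App Engine UI would send:
#   GET <url>   with an OAuth2 Bearer token in the Authorization header.
```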
In addition to this, I'll show you a brief look at our
command-line tool written in Python.
We will be open sourcing that tool as well.
All right.
So we're up and running now.
I've got a SSH command line here I can paste in.
And here we go.
I'm in my VM.
So we can poke around here, see all the running processes.
We can see what kind of kernel we're running on.
We can check out the CPU that's in here, so a full
Linux virtual machine available for you to use very
quickly, very easily.
I'm going to go ahead and have this instance do a little bit
of work for me.
So he's going to generate a series of
image tiles for a fractal.
And you can see here that we're uploading these to
Google Cloud Storage using that seamless service account.
And then as those tiles hit Google Cloud Storage, I have a
local server picking them up and displaying them here in a
pretty basic web UI, so pretty quick and easy.
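The fractal work the instance is doing boils down to the classic Mandelbrot escape-time computation. This minimal, self-contained version renders one small "tile" as ASCII; the real pipeline in the demo writes image tiles and uploads them to Google Cloud Storage.

```python
# Minimal Mandelbrot escape-time computation, the core of the tile
# generation described in the demo. Self-contained and offline; a real
# worker would render image tiles and upload them to Cloud Storage.

def escape_time(c, max_iter=50):
    """Iterations before z = z^2 + c escapes |z| > 2 (max_iter if never)."""
    z = 0j
    for n in range(max_iter):
        z = z * z + c
        if abs(z) > 2:
            return n
    return max_iter

def render_tile(x0, x1, y0, y1, width=40, height=20):
    """Render one tile of the complex plane as ASCII art."""
    rows = []
    for j in range(height):
        y = y0 + (y1 - y0) * j / (height - 1)
        row = ""
        for i in range(width):
            x = x0 + (x1 - x0) * i / (width - 1)
            # '#' marks points that never escaped (inside the set).
            row += "#" if escape_time(complex(x, y)) == 50 else "."
        rows.append(row)
    return "\n".join(rows)

print(render_tile(-2.0, 0.5, -1.25, 1.25))
```

Each tile is independent of every other, which is exactly why the workload parallelizes so naturally across a fleet of VMs.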
CRAIG MCLUCKIE: Thanks, Chris.
So as you see, what we have here is a very simple Linux
virtual machine as a service offering.
This is Infrastructure-as-a-Service as
you'd expect it to be.
And I'd like to sort of contextualize a little bit and
help you think about what we're focused on, what the
value proposition of this is from our perspective.
It comes down to this--
single Linux virtual machines are great.
But what we do at Google is we throw a lot of infrastructure
at problems.
So we've developed a set of technologies, and we've
derived a tremendous amount of benefit from having very large
amounts of accessible, affordable infrastructure that
we can throw at large-scale computing problems.
And in many ways, that's one of the toughest challenges we
have to face.
It's like, how do we build a service like this that can
really scale?
So that it enables you to tackle problems the same way
we tackle problems on the same infrastructure that we use.
So this first version of Google Compute Engine, this
initial offering, is really focused on large-scale compute
problems, helping you solve these large-scale
data-processing problems.
And for us, that means scalability, that means being
able to stand up these virtual machine instances quickly so
that you can quickly bring up one of these clusters, do your
work, and then turn it down when you're done with it.
And it also means being able to perform effectively.
It's not just about performing fast in a straight line.
We actually have some pretty cool toys that Urs has built
us over the last decade.
It's pretty easy to go fast singly on this kind of
infrastructure.
We've really focused on being able to do this at scale and
being able to perform better the bigger and bigger the
cluster is.
And so that's one of the things that's really unique
about this technology.
And then, of course, it's about affordability.
If you're buying a large number of virtual machines,
you're buying a lot of CPU, and you want to be able to do
that affordably.
And we really think about this as a utility.
You should be able to plug into this thing, consume as
much resource as you want, just like a utility, and be able
to get that resource affordably, just like you
do from a traditional utility company.
Because that utility company is able to deliver it at scale
and achieve very high efficiencies of scale.
And so that's what Google Compute Engine is about.
It's about bringing these three things together--
scale, performance at scale, and affordability.
All right.
I think to really understand the service, it makes sense to
step back and describe some of the guiding principles.
These are the architectural principles that we used to
steer our design and steer every decision we've made in
delivering the service.
I think it's useful for two reasons.
One is it'll help you understand the service better.
But it will also help you understand where
the service is going.
Because these principles are still in effect.
These principles are driving our
day-to-day decisions today.
And they will continue to drive them in the future.
And we believe that these guiding principles are very
effectively manifest in the service.
And as you get a chance to play with it, as you get a
chance to use it, you'll experience it for yourself.
And our first guiding principle here, the thing that
trumps everything, is security.
We recognize that your data is your business.
And to win your custom, to win your trust, we have to treat
that data with a tremendous amount of respect, and we have
to create very strong controls to ensure the security of your
data and to ensure the privacy of your users.
And so at every stage of the game, we focused on delivering
strong security controls.
A couple of examples of this are our network.
Any data that transits between virtual machines, whether
they're sitting on the same rack right next to each other
or whether they're going thousands of miles across
regions, runs on Google's global private network.
It's secure.
We've applied very high levels of encapsulation to ensure
that no one can intercept those packets, which drives the
security of the system.
We also encrypt all data at rest.
This is really important.
Our customers have asked for this.
Our customers demand it, and so we've provided it.
So any data that's written to our block devices, whether
it's the system block device or our local disk, is
encrypted at rest.
And we're able to do that very efficiently, so you don't
see a noticeable performance impact from that control.
The next criteria that we focused on is
tremendously important.
This is a little bit nuanced.
But it actually has steered us and is probably one of the
most impactful things we've really focused on.
And that's the idea of consistency.
When you're building real-world solutions, it pays
to be operating in a consistent environment.
Traditionally, you have to design for
the worst-case scenario.
Sometimes you don't even know what that's going to be.
If you're operating in an unstable environment, you have
to design in controls to deal with rapid-order scaling, and
it adds a lot of degrees of freedom to the solution.
And so at every stage of the game, we focused on creating a
consistent environment.
Our target here is not a massively multi-tenant cloud.
That's not what we're trying to deliver.
Our target here is to deliver an experience, an environment
that feels like you're in a data center.
That's the class of consistency that we're trying
to achieve.
And to support this, one of the things we've been very
concerned about as we've seen what's been happening in the
cloud is the movement away from the separation of storage
and compute.
We've seen 20 years of SAN development, where enterprises
have moved to a world where they're able to effectively
decouple these two things.
And it creates a great degree of control and
flexibility for them.
It made us a little bit sad when we observed that people
were moving to a world where storage was getting pushed
back into the compute container because the storage
block devices were just not able to achieve the
performance and the consistency of performance
necessary to meet enterprise-level SLAs.
So that's one of the things we've focused on
significantly, and we've paid a tremendous amount of
attention to, is creating a persistent block device that
offers better characteristics in terms of performance,
better characteristics in terms of throughput than
writing to our local block device on disk.
And that's really important to us.
The next principle, and this is kind of obvious to us and
it's really steered our decisions, is that
we have to be open.
We have to be flexible.
There's a tremendous amount of code that exists out there.
People have invested probably trillions of dollars in
developing code.
It makes sense for them to be able to take that code and run
it in our data centers on this awesome infrastructure.
And the open source community is just this incredible source
of innovation.
It's a tremendously innovative space.
We think our customers should be able to take whatever the
open source community is producing and be able to run
it on our infrastructure and achieve the benefits of what
we're doing.
So we focused on creating an open ecosystem.
We've also focused on things like an open API.
As Chris mentioned, our tools are all built on an open API.
We don't have any back doors.
There's no cheating.
You can get in there, and you can extend our
stack at any level.
And we really like the idea of being able to plug into our
stack at any level.
And it's also about the ecosystem.
Our goal is to deliver beautiful, pure,
high-performance, affordable infrastructure.
And we really want to create a vibrant, strong, partner
ecosystem that can deliver great experiences when
managing this and great experiences for our users when
they want to move workloads around.
And then it's about being proven, and being proven to us
means one thing, really.
This has to be a technology that we're willing to bet our
own business on.
And it's still early days for us with this stack; we're
just bringing it to market now with
this initial offering.
But we are running Google businesses on
this technology today.
So it's one technology for internal businesses, and it's
one technology for the outside business.
And it's the same technology.
One example of this is Invite Media.
Urs mentioned it today.
And we're very lucky to have Hansa, who will be talking in
a later tech session about their experiences when moving
to this technology stack.
Long story short, they've had a great experience.
They've been able to benefit from the power, the consistent
power of the platform.
They've been able to reduce the number of cores they
needed to actually run their workload by half for similarly
configured instances.
And they've also been able to benefit from the consistency
of experience, where some of the variance that resulted in
some errors, in bidding in their case, just went away
because of this much more robust, consistent platform
than what they'd had previously building in another
multi-tenant cloud environment.
So I mentioned this when I was talking a little bit about
being open.
And it's worth just really calling out
to our partner ecosystem.
Again, we're focused on pure, beautiful infrastructure.
We're not going to do all of everything, certainly not
coming out of the gate.
And so we're tapping into the power of the ecosystem,
working with partners that create the best experiences
around managing workloads and supporting their mobility
between on-premises and the cloud, and
between clouds as well.
We believe very strongly that you should be able to pick the
cloud that makes sense based on the merits of the core
technology.
We feel that very strongly.
And so we've worked hard to find a set of partners that
share those values and are able to support that mission.
And ultimately for us, it really comes down to this--
we're going to deliver you a set of servers that you can
configure and run any way you want.
We recognize that it makes sense for you to be--
you probably want to think about this in terms of
services, not servers.
And so this ecosystem is really going to enable that
and support that idea.
So what I'd like to do now is invite Michael Crandell, the
CEO of RightScale, on stage.
Here you go, Michael.
MICHAEL CRANDELL: Chris, thank you very much.
Wow.
It is really great to be here at the launch of Google
Compute Engine and on behalf of RightScale to announce our
support for that.
We do have a live demo for you today.
But before we launch into that, a word about RightScale.
RightScale pioneered the whole category of cloud management.
And our vision since we got started five years ago was
really to open up this new world of cloud services to
everyone by making it easier and more efficient and more
reliable to run your apps and workloads on low-level cloud
infrastructure resources, so compute, storage, and networking
resources across a variety of different providers.
And we do this by offering a web-based, pretty broad cloud
management platform that encompasses everything from
automation to configuration, to monitoring, user
management, and so on and so forth that really acts as an
environment that's a bridge between your apps and
workloads and the low-level infrastructures, whether
they're public clouds, or private clouds, or hybrid
clouds that you want to run them on.
And so over those five years, we've gained a lot of
experience.
We've had lots of customers, a lot of traction.
We've, on behalf of our customers, launched more than
4 million servers in the cloud.
We've also powered some of the biggest scaling events that
have occurred.
And then finally, we've enabled large migrations from
one cloud infrastructure to another in excess of 20,000
different servers.
I'd like to highlight what we think is really special about
Google Compute Engine, and specifically three things that
we believe are real differentiators.
The first one has to do with the global private network
that Google is offering, right?
I think we've all known for some time that Google's
infrastructure is a key part of the secret
sauce of the company.
And it's pretty amazing that they're actually opening that
up to all of us to access and to use on an as-you-go basis.
But this global private network, which is part of it,
really enables global deployment in a
much easier fashion.
You've got basically what looks like a private network
spanning the globe.
And specifically it makes replication, disaster
recovery, and failure-isolation strategies much, much easier.
Second big point I wanted to emphasize is fast boot times.
Why does that matter?
Well, it obviously matters in terms of auto scaling.
When you need to scale up quickly, the faster that
servers boot, the faster you auto scale, but also in terms
of day-to-day buildup and tear-down of environments that
we all use in the dev test production life cycle.
And we've consistently seen boot times of two minutes.
Very consistent, as Craig mentioned, across the board,
loading from cloud storage.
Third area that I'd like to emphasize is, harking back to
the point about security, encrypting data at rest.
So we all know that security is probably the top item on
the minds of larger companies.
But really it should be for all of us as a requirement for
using cloud.
And because data is encrypted at rest, whether on
local volumes or network-attached storage, it
couldn't be much simpler or easier.
And there's virtually no impact on the performance, so
very excited about that.
There are three things I wanted to emphasize from
RightScale's point of view that are core principles about
operating in the cloud that we've embraced as a company
since we got started.
And I'd like you to keep these in mind as we get into the demo
in a second here.
The first one is what we call usable stuff.
And what we mean by that is that you, as developers and
users, should have access to cloud-ready components that
you can get off the shelf, so to speak, customize, if you
like, but put to work very quickly.
Whether those are at a script level, a recipe level, a
server template level, or deployments of many, many
servers, you should be able to just pick from a library, if
you will, and utilize them quickly.
What comes out of that is automation.
We really believe automation is probably the key core
principle behind all of this cloud revolution.
And it stems from the famous concept of auto scaling all
the way through to the notion of launching complex
multiserver deployments that know how to configure
themselves as they come up.
And then finally, what we call workload liberation.
And what we mean by that is simply you should have freedom
of choice to run your apps and workloads on whatever cloud
resource pool you choose to based on your set of
requirements.
So those are three key principles to keep in mind as
we launch into the demo.
And with that, I'd like to invite my colleague Shivan up.
And we'll get right into it.
Shivan, can you tell us what we have to see
today as a live demo?
SHIVAN BINDAL: Yes.
Thank you, Michael.
So we have a customer's typical video transcoding
application that we're going to walk through.
And this slide here just kind of talks about the
architecture of this application.
We're talking about taking video files from the web,
creating transcoding jobs that are then placed onto a queue,
and then having a multitude of consumer servers taking those
jobs, downloading the videos, transcoding them, and then
placing them for retrieval from Google storage.
So they're actually sending those video
files to Google storage.
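The producer, queue, and consumer roles just described can be sketched with Python's standard library. This is a toy simulation of the architecture, not RightScale's actual implementation: each "job" is simulated rather than really downloading and transcoding video.

```python
# Toy sketch of the transcoding architecture: one producer puts jobs on
# a queue, and a pool of consumers pulls them off and processes them.
# Real workers would download, transcode, and upload to Cloud Storage.
import queue
import threading

jobs = queue.Queue()
results = []
lock = threading.Lock()

def producer(n_jobs):
    """Create transcoding jobs and place them on the queue."""
    for i in range(n_jobs):
        jobs.put(f"video-{i}.mp4")

def consumer():
    """Pull jobs until the queue is drained, simulating transcoding."""
    while True:
        try:
            name = jobs.get(timeout=0.1)
        except queue.Empty:
            return
        # Stand-in for: download -> transcode -> upload result.
        with lock:
            results.append(name.replace(".mp4", ".webm"))
        jobs.task_done()

producer(8)
workers = [threading.Thread(target=consumer) for _ in range(4)]
for w in workers:
    w.start()
for w in workers:
    w.join()
print(sorted(results))
```

Scaling the real system is a matter of adding more consumer VMs against the same queue, which is exactly what the 300-plus-server deployment in the demo does.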
Let's see what that looks like in RightScale.
Here I have the RightScale system.
And this is that single pane of glass, Michael, that you
talked about, where you're managing your cloud resources.
So we're talking about compute,
networking, and storage.
Here we have in the clouds menu Google Compute Engine
connected to RightScale.
So I've provided my credentials and made the
resources that Google makes available available here in
RightScale.
And that's really where RightScale starts in terms of
the cloud management and automation
capabilities we offer.
MICHAEL CRANDELL: So can you show us that automation at
work in the actual app?
SHIVAN BINDAL: So with any application, you have a
variety of servers that comprise that application.
And here what we're looking at is the set of all servers for
this application.
We've got three different server types, but really a
multitude of servers.
In RightScale, that is displayed as a deployment.
So it's a collection of servers for the use case here
of your application.
Three different types of servers, as I mentioned.
We've got that producer who's creating all the jobs.
We've got the queue server that's
holding all of the jobs.
And then we've got the consumers down here that are
running all the jobs and doing all the transcoding.
Each of these types of servers is running on top of what's
called a server template within RightScale.
A server template is a blueprint for how your servers will
reliably and consistently be launched every single time in
the cloud, so that they come up with the same state and the
same configuration, and things are actually usable.
And to really understand how we're operating here at scale,
I've got about 325-odd servers that are
running this workload.
And here, with RightScale's built-in monitoring, you'll see
the capability that that compute power provides.
So each of these graphs is a single CPU from the servers
that we're looking at.
And the blue here shows the actual load on the CPUs for
doing the video transcoding.
So, Michael, here you have, at scale, an application running
on Google Compute Engine managed via RightScale.
MICHAEL CRANDELL: So cool.
And by the way, audiences normally
applaud at this juncture.
[LAUGHTER AND APPLAUSE]
MICHAEL CRANDELL: It's very, very exciting to us.
So thanks for your forbearance.
What else do we have to show?
SHIVAN BINDAL: Any application is living and breathing.
This is one example of how you might transcode some video.
But what happens if you have new sources of video that you
need to start transcoding?
How quickly can you reuse this environment for a new use case
or new instantiation of the same application?
Well, with RightScale, it's fairly simple.
I'm going to take what we call the deployment, that
collection of servers, and clone it.
By cloning it, I'm taking the configurations that we had and
making them available for configuration in the new case
or the new instance.
And what I'm going to do with just a few clicks is I'm going
to start launching that entire deployment of servers so that
they're available to start doing these jobs again.
What I've just done is I've enabled what RightScale calls
a server array, which has all of those workers
now starting to launch.
And now I'm just going to go ahead and launch up the queue
and video producer or the server that's going to create
all these jobs.
I'm going to launch that as well.
And what you're going to see--
I actually clicked it.
It does look like it's clicked.
So what actually happens now is you'll see these
servers come up.
And as this comes up-- it'll take a few
minutes to actually happen.
But from my perspective as a user, that's all the
interaction I need with RightScale.
You'll see they're already in pending state.
And now, again Michael, these 325 servers plus these other
two servers that are doing a lot of the command and control
stuff all launched through RightScale.
MICHAEL CRANDELL: Thank you.
And on the left-hand side there, you can see in the
events window the activity starting to propagate.
Well, behind every demo there's usually a little bit
of an interesting story.
Could you tell us the back-story of what really
happened to put this together?
SHIVAN BINDAL: Two things that I'd like to point out.
One is, as with any demo, we were building this demo out as
we were completing our integration with Google
Compute Engine.
And so--
funny story-- we built this application on another cloud.
But with RightScale's technology of server templates
and the configuration management that we bring to
the table, it was fairly simple for us to take that
application and launch it on Google Compute Engine.
And it really just worked, which was really very
fascinating and exciting.
The second point is we didn't have a lot of time.
So when we built the application to do the video
transcoding, we wanted to keep things simple.
We started with using only one core, making things very
streamlined in terms of one video per job.
And so what that meant was we ended up launching a whole ton
of instances or virtual machines on Google Compute
Engine to do all of the work for us and keep the
application layer very simple.
MICHAEL CRANDELL: So it makes sense to make the
infrastructure do the work, not the developers.
That probably resonates with some of you
out there, that idea.
So just to prove the point and show that we're doing
something real, can you show some of the output that came
out of the transcoder?
SHIVAN BINDAL: So here I have Google Storage,
Google Cloud Storage.
This is the storage browser that's available by Google.
And anybody who has a project and access to Google
Cloud can see this.
This is one of the videos that we've transcoded.
And, well, we can play it really quickly.
But what I wanted to highlight is that we saw some really
great performance in terms of the communication between
Google Compute Engine virtual machines and Google Storage,
just very low-latency connectivity
and very high bandwidth.
So it was very quick.
And to play this video, let's see if I can
full screen it here.
[VIDEO PLAYBACK]
-When we started working with the team at Google, it was
very clear from the very beginning that this is an
all-out effort.
This is serious.
This is worldwide.
This leverages the full depth of Google engineering.
[END VIDEO PLAYBACK]
MICHAEL CRANDELL: Cool.
There you have it.
That's the video.
[APPLAUSE]
MICHAEL CRANDELL: That, by the way, is our CTO and co-founder
Thorsten, if you'll wave your hand.
We're all around through Friday.
We have a sandbox here.
We also have our private beta of support for Compute Engine
at rightscale.com/google.
Very much looking forward to partnering.
Thank you so much.
CRAIG MCLUCKIE: Wonderful.
Thank you, Michael.
It's wonderful that you can deploy such large, complex
services with the click of a button.
MICHAEL CRANDELL: Absolutely.
CRAIG MCLUCKIE: And we also love the fact you can move
between clouds, so--
MICHAEL CRANDELL: All right.
CRAIG MCLUCKIE: Thank you for your time.
MICHAEL CRANDELL: Thank you.
[APPLAUSE]
CRAIG MCLUCKIE: All right.
So let's get into a little bit more detail about the
technology and help you understand some of the
specifics of what we're actually doing here.
I'm going to start off and just walk through the stack.
And the logical place to start, again, is Compute.
You've already seen the virtual machines.
These are KVM-based machines, so we're running on the KVM
hypervisor.
And we've worked pretty closely with Red Hat for quite
a while to get to a point where we have a very secure,
highly performant, high-consistency environment
to host this class of virtual machines.
We really appreciate the leadership that Red Hat has
shown in this space.
And we'll continue to work with them in the future.
These virtual machines are available in multiple sizes,
so you can get them in one, two, four, and eight cores.
And they come with 3.75 gigabytes of RAM per core.
So these are pretty beefy virtual machines.
Our smallest virtual machine is actually quite a lot bigger
than the smallest virtual machines you'll see elsewhere.
So that's something to think about as you actually look at
what we're doing.
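The lineup just described, one, two, four, and eight cores at 3.75 gigabytes of RAM per core, implies the memory sizes computed below. The core counts and RAM ratio come from the talk; everything else is just arithmetic.

```python
# Memory sizes implied by the machine-type lineup in the talk:
# 1, 2, 4, and 8 cores, each with 3.75 GB of RAM per core.
RAM_PER_CORE_GB = 3.75

machine_types = {cores: cores * RAM_PER_CORE_GB for cores in (1, 2, 4, 8)}
for cores, ram in machine_types.items():
    print(f"{cores}-core: {ram:g} GB RAM")
# 8-core: 30 GB RAM
```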
And that makes sense here, because what we've really
focused on is delivering high-performance computing:
lots of compute power that you can access and tap into
because of these data-center characteristics.
And we offer two versions of Linux.
We have Ubuntu and CentOS out of the gate.
But you can actually take whatever you want, any
Linux-based image, and create bootable images and run what
you want to in this environment.
Storage.
I mentioned this earlier, and we have a variety of storage
capabilities to meet different needs, including two block
devices that enable you to run familiar workloads that
require a block device to support them.
The first is persistent disk, our off-instance, durably
replicated storage medium.
This is a very high-consistency,
high-throughput solution that enables you to store data
securely that lives beyond the life of your
virtual machine instance.
So this is the place where you would want to write stuff,
like if you wanted to run a database, for example, this
would be a great backing store for a database.
We also provide a kind of cheap and
cheerful local disk option.
We recognize that our focus workloads are very
data-centric.
So it makes sense to have access to a lot of affordable
storage that's coupled to your virtual machines to store data
that you're processing as a cache.
This data is bound to the life cycle of your virtual machine.
So if you stop your virtual machine, this data is
permanently gone, because we encrypt it at rest, and that
key is only ever stored in the virtual machine.
So when the virtual machine goes away, the data goes away.
That's something to consider as you're building solutions
on this technology.
If you have something you want to store beyond the life
cycle of the virtual machine, persistent disk is the way to go.
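The encrypt-at-rest scheme described above can be sketched in miniature: a toy cipher (illustrative only, not Google's actual implementation) where the key exists only in memory, so discarding the key leaves the on-disk bytes unrecoverable.

```python
import hashlib
import os

def keystream(key: bytes, length: int) -> bytes:
    """Derive a toy keystream from the key (illustration only, not real crypto)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def xor(data: bytes, ks: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(data, ks))

# The key lives only in the VM's memory, never on the disk itself.
key = os.urandom(32)

plaintext = b"scratch data cached on local disk"
on_disk = xor(plaintext, keystream(key, len(plaintext)))  # what the disk stores

# While the VM (and its key) exists, the data is readable.
recovered = xor(on_disk, keystream(key, len(on_disk)))

# When the VM stops, the key is discarded; the ciphertext alone is useless.
key = None
```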
And then there's Google Cloud Storage.
Google Cloud Storage is our enterprise-grade, internet
object store.
So this is the place you can go to write objects that need
to be accessible by the internet.
It has some really interesting characteristics.
I'll pick on two of them, but it's a
pretty interesting service.
The first characteristic is that it benefits from our
global high-performance network backbone.
So it almost comes with a CDN baked in.
So the content that you want to access will be replicated
to the place where you need it and will be
accessible very quickly.
And we invite our customers to try this out.
We think the performance
characteristics are really good.
The other thing that it offers is read-your-writes
consistency, which is an interesting characteristic
when you're trying to build a system at scale.
It's nice to have an object store that you can write
content to with that level of predictability.
And we found it very useful as we've developed our own
solutions in this space.
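A minimal sketch of what read-your-writes consistency guarantees, using a toy in-memory object store (illustrative only, not Google Cloud Storage's implementation): a client that completes a write is guaranteed to observe it on its next read.

```python
class ObjectStore:
    """Toy object store with read-your-writes consistency:
    a reader always sees the effect of its own completed write."""
    def __init__(self):
        self._objects = {}

    def put(self, name, data):
        # The write is fully applied before put() returns, so any
        # subsequent get() by the same client observes it.
        self._objects[name] = data

    def get(self, name):
        return self._objects.get(name)

store = ObjectStore()
store.put("results/run-1.csv", b"a,b\n1,2\n")
assert store.get("results/run-1.csv") == b"a,b\n1,2\n"
```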
Network.
Our network really shines.
It's one of the things that I think truly
distinguishes the service.
Our private network enables you to fuse these virtual
machines together very efficiently.
And it offers tremendous network
cross-sectional bandwidth.
We really invite our early customers to try this out.
We think this is going to be something that's very distinct
about the service.
And we'll have a demo in a little bit that kind of
showcases some of these interesting performance
characteristics in this cloud.
And then, obviously, no virtual machine stands alone,
so we give you the ability to assign a static IP address
to the machine that you can maintain for a long time.
And these IP addresses are actually global.
You can remap that IP address to another machine in another
region that's geographically isolated.
And Google's network will just take care of making sure that
the traffic gets to the right location.
And then, obviously, firewalls.
You have to be able to secure your virtual machines.
You want to be able to control who can access them.
So we have a very simple intuitive firewall setup that
enables you to very specifically control who talks
to what in the system.
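The firewall model described here, allow rules matched against source range, protocol, and port with default deny, can be sketched like this (field names are illustrative, not the actual API):

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network

@dataclass
class FirewallRule:
    source_range: str   # e.g. "10.0.0.0/8"
    protocol: str       # "tcp" or "udp"
    ports: set          # allowed destination ports

def is_allowed(rules, src_ip, protocol, port):
    """Allow the packet if any rule matches; default-deny otherwise."""
    for rule in rules:
        if (ip_address(src_ip) in ip_network(rule.source_range)
                and protocol == rule.protocol
                and port in rule.ports):
            return True
    return False

rules = [
    FirewallRule("0.0.0.0/0", "tcp", {80, 443}),   # web open to the world
    FirewallRule("10.0.0.0/8", "tcp", {22}),       # SSH from internal only
]

assert is_allowed(rules, "203.0.113.7", "tcp", 443)
assert not is_allowed(rules, "203.0.113.7", "tcp", 22)
assert is_allowed(rules, "10.1.2.3", "tcp", 22)
```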
And tooling.
Chris demonstrated our tools briefly.
And he showed you the little UI tool that we've developed.
And we think that's an interesting metaphor for us.
It really demonstrates our
commitment to openness.
By all means, we're going to make that source code available.
It runs in App Engine.
We think App Engine's a great environment to write this
class of management tool, to control clusters, and to get a
view of the clusters.
So the fusion of those two technologies together is
actually pretty powerful.
And then for the command line, we
have a very nice little command-line tool that
exposes the full flexibility of the API.
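As a rough illustration of what such a tool sends to the API, here's a sketch that builds an instance-creation request body; the field names, zone, and image values are assumptions modeled on REST conventions, not the exact wire format of this release.

```python
import json

def make_instance_request(project, zone, name, machine_type, image):
    """Build a JSON request body for an instance-creation call.
    Field names here are illustrative, not the exact wire format."""
    return {
        "name": name,
        "machineType": f"zones/{zone}/machineTypes/{machine_type}",
        "disks": [{"boot": True, "sourceImage": image}],
        "networkInterfaces": [
            {"network": f"projects/{project}/global/networks/default"}
        ],
    }

body = make_instance_request("my-project", "us-east1-b", "worker-1",
                             "n1-standard-8", "ubuntu")
print(json.dumps(body, indent=2))
```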
And then, of course, we have our partners.
Our partners are contributing tremendously
to our tooling story.
They've enabled us to really focus on getting the
infrastructure right, knowing that folks that need a
best-of-breed management story have someplace to go.
And with that, I'd like to invite our friends from MapR
on stage to do a little demo.
And they're going to show--
as I mentioned, this service is really about execution at
scale and experiencing the power and efficiency of
Google's data centers to solve some large problems.
And they have some really interesting
technology to do just that.
JOHN SCHROEDER: All right.
Thanks, Craig.
And thanks to Google for inviting us
to be up here today.
I'm John Schroeder.
I'm the CEO and co-founder of MapR Technologies.
And MC Srivas is the chief technology officer and founded
the company with me a little over three years ago.
And actually, Srivas is an ex-Googler as well.
So let's see, how do I advance this?
So before I get into MapR, I thought I'd give some of our
impressions of the Google Compute Engine.
And when we first got access to Google Compute Engine, the
engineers came to me the same day and said John, this is
blazing fast.
Performance is really important to MapR.
And we can tell it's also important to
Google Compute Engine.
With our platform, we're able to drive a lot of bandwidth to
disk and really drive network bandwidth as well.
And we're impressed with the performance there.
And then our technology runs on
large clusters of computers.
And we put a lot of effort into building our own QA lab
internally to do testing of very large deployments and our
customer base as well.
But with Google Compute Engine, we could quickly spin
up 1,000, 1,200, 1,600 servers in a matter of minutes and run
those types of scalable applications.
And then finally, I got a glimpse at the pricing over
the last few days.
And it's very compelling.
I think that it really changes the game.
And any organization that's looking to move some or all of
their on-premise computing into the cloud really should
take a hard look at Google Compute Engine.
So we're going to use MapR to demonstrate some of the speed
and scale of Google Compute Engine.
So let me tell you a little bit about MapR.
We're a provider of an open enterprise-grade Hadoop
distribution.
So we take Hadoop and have really transformed it
into a very reliable compute platform and
dependable data store.
We've done a lot of standards-based extensions to
Hadoop to really broaden the use cases
it's appropriate for.
We've got deployments at thousands of companies,
customers in most of the major vertical market segments,
including financial services, a lot of Web 2.0, telco,
federal government, aerospace, and such.
If you don't know what Hadoop is-- probably most of you do,
it's pretty popular--
but it's a big data analytics platform.
So it's a platform for being able to run analytics at a
very large scale, petabytes or even 100 petabytes of data.
And that data may be structured or semi-structured.
It runs generally on commodity hardware in
large clusters of computers.
And since we're at a Google conference, we want to make
sure that we also identify that really the whole Hadoop
project was inspired by a paper about MapReduce that was
published by a couple of Google scientists back in
2004, Jeffrey Dean and Sanjay Ghemawat.
And we're really happy to have MapR now running as part of
the Google Compute Engine.
So a few days ago, Craig said could you put together a demo
that would show some of the capabilities of
Google Compute Engine?
Also, it'll allow you to show your product a bit, too.
So Srivas and I talked.
And he said, well, why don't we just run a big sort?
And TeraSort is a standard benchmark that's run to
measure performance of applications.
And it's a very popular benchmark for demonstrating
the performance of Hadoop distributions.
So we stitched together quickly a 1,250-node cluster
and ran a TeraSort.
And I think we'll do a demo of that right now, and Srivas can
walk you through how that worked.
MC SRIVAS: So what we have here is each of these green
squares represents a node in the Google Compute cluster.
So we have a full cluster of about 1,250 nodes with five
control nodes.
And this is the heat map that we have, which lets you look
at the cluster from one console and put different
views on it.
And one of the views we're putting on out here will be,
what does the CPU load on the system look like when we run a TeraSort?
And, of course, it turns from green to red and back again.
On the other side is the actual TeraSort benchmark.
It's a command line that shows the elapsed time as it runs.
So what's very impressive about this was-- let
me start this up.
So here we are launching a TeraSort.
And as you can see, it fans out the work to all
the nodes very fast.
The Google Compute network was incredibly efficient.
We could just go and load up the clusters almost instantly.
A TeraSort runs within a minute.
And so you have to go and load up several thousand
computers almost instantly with data and then have them
all do a criss-cross section of the data going across.
And then you can see how there are no
nodes that are green.
Every node is lit up well.
And what it shows is the cross-cluster bandwidth of
thousands of machines talking to each other in a complete
crossbar working really well.
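The trick that lets a sort like this fan out across thousands of nodes is range partitioning on sampled keys: every node sorts a disjoint key range, so concatenating the partitions gives a global sort. A toy sketch of that idea (not MapR's implementation):

```python
import bisect
import random

def sample_boundaries(keys, num_partitions, sample_size=1000):
    """Pick range-partition split points by sampling the input keys,
    the way TeraSort assigns disjoint key ranges to reducers."""
    sample = sorted(random.sample(keys, min(sample_size, len(keys))))
    step = len(sample) // num_partitions
    return [sample[i * step] for i in range(1, num_partitions)]

def partition(key, boundaries):
    # bisect is monotone in the key, so partitions cover disjoint,
    # ordered key ranges.
    return bisect.bisect_right(boundaries, key)

random.seed(7)
keys = [random.randrange(10**9) for _ in range(100_000)]
boundaries = sample_boundaries(keys, num_partitions=8)

# Each node sorts its own partition; concatenating them yields a global sort.
parts = [[] for _ in range(8)]
for k in keys:
    parts[partition(k, boundaries)].append(k)
merged = [k for p in parts for k in sorted(p)]
assert merged == sorted(keys)
```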
So this was kind of an interesting benchmark because
if you had told me this three months ago, hey, do this
benchmark on a cloud, I would have said ha, I mean, you can
never do this.
And when we were invited to work on Google Compute Engine,
we were just blown away by its performance, its scale, how
easy it was to deploy.
And we literally thought of doing this last week.
And to pull this off in a week, we could not have done
it without the ease, the simplicity, the
reliability, and the consistent performance of
Google Compute.
It is just incredible.
So it's still finishing.
Usually the problem is the last
stragglers take a long time.
So this took about a minute, I think 80 seconds, a minute and
20 seconds, and it's probably--
I'll hand this off to John to talk about what's
the impact of this.
JOHN SCHROEDER: OK, so if you look at running TeraSort, the
fastest TeraSort I'd seen recorded on physical hardware
is described in the middle column there.
And it was deployed at 1,460 physical servers running on
bare metal Linux with a high number of disks, 5,800 disks,
and almost 12,000 cores.
And that's a record that took a long time for a company to
put together in their own internal data center to run.
Here in a really short period of time in a virtual
environment running on fewer servers, 1,256 servers, one
disk per server and about 5,000 cores, we're right in
the neighborhood with that.
So I'd love to come back here next year and maybe raise the
bar and do a petabyte benchmark instead, but a
really great performance.
And on-premise Hadoop implementations rarely
run in virtualized environments because of the
performance overhead.
So it's also another indication of the performance
of Google Compute Engine.
[APPLAUSE]
JOHN SCHROEDER: Now, if you look at putting this together,
those two benchmarks, if you tried to assemble 1,460
servers in your data center, it'd take you months.
I mean, I'd be negotiating with my server vendor probably
for months.
And then I'd have to rack and stack 50 to 70 racks of
servers, switching our infrastructure, get the
electrical brought into the data center to handle that
server load, probably 50 to 75 tons of air conditioning.
So it's a massive project.
It'd take just months to do.
And with the Google Compute Engine, we're basically up and
running on over 1,200 instances
in a matter of minutes.
So the time to deploy is very, very fast.
And then, if you noticed, we're not paying for that
anymore now either.
Once we're done running the TeraSort, we give those
resources back to Google, where with the physical
servers, they'd be mine.
And three to five years from now, I'd have to swap all
those servers out for newer models.
And we leave all that to Google as the
cloud provider as well.
And from a cost perspective, just to
take a really, really--
[LAUGHTER AND APPLAUSE]
JOHN SCHROEDER: Just a very conservative run.
If I just allocate $4,000 a server, which is probably 50%
lower than what most Hadoop servers go for.
And this doesn't count racks, PDUs, switching
infrastructure, or ongoing OPEX. Yeah, $6 million. A minute 20
of instance time?
$16.
So a tremendous cost advantage.
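The back-of-envelope arithmetic behind that on-premise figure checks out:

```python
# Rough check of the on-premise number quoted above.
servers = 1460              # physical servers in the record-holding cluster
cost_per_server = 4000      # dollars, the conservative allocation used
on_prem = servers * cost_per_server
print(f"${on_prem:,}")      # -> $5,840,000, i.e. "yeah, $6 million"
```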
Finally we are part of the private beta.
If you'd like to try running MapR Hadoop on Google Compute
Engine, really simple URL to remember--
mapr.com/google.
So thanks much for your time.
Thanks, Craig.
[APPLAUSE]
CRAIG MCLUCKIE: Thanks, John.
Thank you.
That's a great demo.
And I'm definitely looking forward to having you come
back and do some more interesting stuff with our
data centers.
And this is really early days for us.
This is the first offering of the service.
It's just going to keep getting better.
We're going to bring more and more of the awesome
infrastructure technologies of Google to you so that you can
benefit from them and benefit from the efficiencies.
So John already talked about this a little bit.
A big part of our commitment with the service is to
transfer the efficiencies we benefit from to you.
And we believe this is reflected in our price sheet.
If you take a look at this price sheet,
it may not be obvious.
But when you look at this compared to what other cloud
providers are offering, you get up to 50% more compute for
your money.
And we really invite you to get on board, try this out,
and take a look at it.
That standard, that one, is not a small instance.
There's a whole bunch of compute power in that thing.
So we definitely want you to take a look at it.
Take it for a spin and see if you experience the same kind
of benefits that we've seen when running our processes on
these things.
And with that, I do thank you for your time.
And I invite you to join our limited preview program.
We're available now.
And I expect there'll be a fair amount of demand because
access to the program comes with a pretty generous quota
of free compute.
And so while we have very, very big data centers, we do
have finite spots in our compute program, so we beg
your patience.
And do apply, and we'll try to service everyone who wants to
participate in this program.
For folks that reach the end and really like the service,
it will be available commercially.
And we will be offering an SLA.
And we will be offering support to
our commercial customers.
So you can absolutely contact our sales team, and they'd be
delighted to speak to you about the service, tell you a
little bit more about it, and tell you about
how to get into this program.
And so go ahead and do apply for access.
And we invite you to come to some other sessions.
If you want to learn more about Google Compute and you
want to get into some of the details, Joe Beda, who's the
technical lead, one of the original architects on the
project, will be delivering a session called 313 at 5:30
this afternoon.
And then tomorrow, we'll have another session, 308, which
will really clarify the relationship between Google
Compute Engine and Google App Engine and help you understand
how to build very powerful systems that bring together
the best of Platform-as-a-Service and
Infrastructure-as-a-Service in the Google Cloud platform.
And with that, I'd like to invite my colleagues on stage.
We'll do a little Q&A. I think we have about 10 minutes.
So if you have questions, please let us know.
[APPLAUSE]
AUDIENCE: It's obvious that it does scale very quickly.
CRAIG MCLUCKIE: Yes.
AUDIENCE: But what about certain applications that need
32 cores on one machine?
CRAIG MCLUCKIE: So yes, it does scale quickly.
Right now, we offer instance sizes up to eight cores.
We're absolutely looking at ways to bring even more of our
infrastructure capabilities to market.
But that's just where we started.
So right now, it's really about lots of work that can be
parallelized, and we will provide vertical
scaling in the future.
AUDIENCE: Thank you.
AUDIENCE: So when MapR showed their cost calculation of $16,
they actually broke it down by second.
CRAIG MCLUCKIE: Yes.
AUDIENCE: So are you going to be billing at seconds,
minutes, or to the hour?
CRAIG MCLUCKIE: It's a great point.
At this point, it's by the hour.
So what they really should have said is you could have
done this 60 times in a row for that price.
Right now, it is on demand by the hour.
But that's definitely something
we're thinking about.
AUDIENCE: OK, thanks.
AUDIENCE: My question is about how it plays well with Google
App Engine, specifically in the area of load balancing.
So once I've integrated it with Google App Engine, will
it work with the same authentication?
The second being how it works with memcache.
CRAIG MCLUCKIE: OK.
So obviously, we have two technologies, Google App
Engine and Google Compute Engine.
We're going to be doing a session tomorrow.
We'll get into much more of the details.
But I'll provide you a very brief sense of where we are.
Coming out of the gate, Google Compute Engine is about being
able to deploy a lot of servers to solve some
computationally hard problems.
App Engine is our web-facing, high-productivity,
high-efficiency environment for application development.
The two together actually work pretty well.
We have the ability to flow credentials using OAuth so
that you can use service accounts to authenticate
across them.
And App Engine becomes a very natural orchestrator and
management framework for these Compute containers.
And Compute Engine offers the ability to open App Engine up
and run general workloads that can connect as part of your
App Engine application.
Now over time, we will work to create even richer
connectivity between the two services.
For now, it works very naturally.
And a great example of that is our UI.
That was an App Engine application.
That was an App Engine application that was just
spinning up these virtual machines.
So App Engine's a very natural way to do things like auto
scaling and to deal with some of those constituent parts.
And it'll just get better with time.
Hope that answered your question.
Thank you.
AUDIENCE: Hi.
Are the data centers located internationally?
And if so, can I specify a failover site?
CRAIG MCLUCKIE: So at this point, we only have domestic
data centers, US data centers.
We are working on creating a global data center footprint.
AUDIENCE: Even within the US, can I specify a failover site
on the East Coast or West Coast?
CRAIG MCLUCKIE: Yeah, you can pick.
We actually publish what region they're in.
And so you can pick which specific data center you want
to be deployed in, so that you can be
close to your customers.
We're launching initially with three data center locations
that are Central and East Coast, which works well for
this class of workload.
And we're looking at bringing on more as the service matures
and as we bring more capabilities to market.
AUDIENCE: OK.
Thank you.
AUDIENCE: Hi.
I saw in the flow diagram that there were connections to the
outside world from the VMs.
But I didn't see anywhere in the API console when he
flipped through that showed like IP address allocation or
anything like that.
Is there--
CRAIG MCLUCKIE: Oh yeah.
So go ahead, Joe.
JOE BEDA: I'm going to be talking about that a lot more
in my session at 5:15, not 5:30.
CRAIG MCLUCKIE: Oh, was it 5:15?
Sorry.
I apologize.
JOE BEDA: But yeah we do provide public IP addresses,
IPv4 addresses that are mapped one-to-one to instances.
And you can either have those be statically allocated and
attached to your project, or you can have those be
ephemeral, and they come and go with
that particular instance.
AUDIENCE: How do we sign up?
[LAUGHTER]
CRAIG MCLUCKIE: Go to cloud.google.com, press the
button, and do it quick.
JOE BEDA: I just want to drive home the point that these
things, these IP addresses, are actually
global across our service.
So they can float from region to region, which we think is,
especially going towards the disaster recovery or failover,
it's an important feature for that.
AUDIENCE: By what time do you expect that a small business
can rent out, like let's say around 10 servers of--
medium-size servers?
CRAIG MCLUCKIE: Smaller allocations.
So we're in limited preview right now.
Folks that have workloads that we think will really benefit
from the scale and efficiencies of the data
centers we will take on board as
commercial customers.
Based on our experiences with the limited preview program,
as we grow this up, we will decide when to make the
product generally available.
So our aspiration is to make it available soon, but we want
to create a great experience for all our customers.
And we think that the best experience will be with these
large-scale workloads right now.
AUDIENCE: But is it probable that by the next Google I/O
you'll be there?
CRAIG MCLUCKIE: I'm not going to make any statements about
our future, so I'm sorry.
[LAUGHTER]
AUDIENCE: Good try.
Can you touch on how you would upload a lot of data, what
tools are available, if it's GSUtil based or FTP?
Or if you have a lot of data that you want to move up to
disk, how that whole process works?
CRAIG MCLUCKIE: Yeah.
So as you've heard, all of our services interact on the
Google global network.
So we have pretty high-capacity
pipes to Google Storage.
And from inside the virtual machine, you're actually able
to get OAuth credentials seamlessly.
Our version of GSUtil, for instance, will seamlessly just
reach into our metadata server, grab some OAuth
credentials, and then enable you to grab content from
Google Storage very quickly.
So it's a very easy way to deal with this.
It also means you don't have to deal with pushing keys into
virtual machine images and manage them.
So that we think is a great experience.
And then again, it's on the Google network.
We have some pretty interesting network
technology.
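The credential flow Craig describes, a VM asking its metadata server for an OAuth token, can be sketched as below. The URL and header are modeled on the current GCE metadata service and may differ from what shipped at the time; the fetcher is injected so the sketch runs anywhere, not just inside a VM.

```python
def get_access_token(fetch):
    """Obtain an OAuth access token from the instance metadata service.
    `fetch(url, headers)` is injected so the flow can be shown (and
    tested) outside a real VM."""
    url = ("http://metadata.google.internal/computeMetadata/v1/"
           "instance/service-accounts/default/token")
    response = fetch(url, headers={"Metadata-Flavor": "Google"})
    return response["access_token"]

# A fake fetcher standing in for the metadata server.
def fake_fetch(url, headers):
    assert headers["Metadata-Flavor"] == "Google"
    return {"access_token": "ya29.example", "expires_in": 3600}

token = get_access_token(fake_fetch)
```

Because the token comes from the platform itself, no long-lived keys have to be baked into the image, which is exactly the point being made above.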
AUDIENCE: Right.
But getting data before it's on the Google network, so--
CRAIG MCLUCKIE: Yeah, so if you can get to the Google
network, awesome things happen.
AUDIENCE: So for many high-performance computing
jobs that utilize a heterogeneous compute
environment that includes GPUs as well as a traditional CPUs,
does Google have any plans for a
heterogeneous compute product?
CRAIG MCLUCKIE: I mean, obviously, there's a lot of
workloads that benefit from GPU acceleration.
We don't yet have an offering.
And I can't really talk about our GPU futures at this point.
But we do recognize the value proposition of that.
AUDIENCE: Great.
Thanks.
CRAIG MCLUCKIE: Thank you.
AUDIENCE: I was going to ask a GPU question as well.
CRAIG MCLUCKIE: OK, so pick another one.
AUDIENCE: How many--
what's the maximum number of cores in the largest instance?
CRAIG MCLUCKIE: The maximum number of cores in the largest
instance is eight.
And those are pretty beefy cores.
So you actually get quite a lot of compute power for that.
We'll look at vertical scalability and providing
bigger options in time.
But we figure that's a really good place to start.
AUDIENCE: Sometimes with continuous delivery, we need
to spin up rapidly a large number of servers all at once.
Do you guys have any kind of limits or
delays in large requests?
CRAIG MCLUCKIE: What do you believe is large?
[LAUGHTER]
AUDIENCE: Um, say, 50, 100.
CRAIG MCLUCKIE: Yeah, no, that's not--
I mean, that is a lot of servers, but it's not that
many servers.
AUDIENCE: What is your limits around some kind of delays?
CRAIG MCLUCKIE: Well, we tend do think in the tens of
thousands at this point.
MALE SPEAKER: The API itself is currently limited to about
20 QPS in our initial offering.
JOE BEDA: Per project.
MALE SPEAKER: Per project.
Not total.
JOE BEDA: Not total.
[LAUGHTER]
MALE SPEAKER: No.
And then we'll be spinning that up
as the service matures.
We're also going to look at improving the APIs to do more
operations in batches.
So you could make a single request for N
instances or N disks.
JOE BEDA: So it's good, and it'll get better.
AUDIENCE: So for application service, we may have the
internal load balancer.
Do you provide that?
So different--
CRAIG MCLUCKIE: So for applications, you're looking
at a sort of elastic load-balancing capability?
AUDIENCE: Yeah, internal load balancing, so one internal
cluster can talk to another cluster but stays in internal
access mode.
JOE BEDA: That's not something that we're offering right out
of the gate.
But it's definitely an architectural pattern that
we've seen.
And it mirrors a lot of what we do internally at Google
also where you'll have different tiers of your
application using load-balancing technologies.
And we're going to keep looking to see how we can
apply Google's experience, not just at the hardware level,
but also at the software and distributed system levels to
start solving problems like that.
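The tiered pattern Joe describes, one tier spreading traffic across the instances of the next, reduces at its simplest to round-robin selection; a minimal sketch (instance addresses are made up):

```python
import itertools

class RoundRobinBalancer:
    """Minimal internal load balancer: one tier spreads requests
    across the instances of the next tier in turn."""
    def __init__(self, backends):
        self._cycle = itertools.cycle(backends)

    def pick(self):
        return next(self._cycle)

web_tier = RoundRobinBalancer(["10.0.1.1", "10.0.1.2", "10.0.1.3"])
picks = [web_tier.pick() for _ in range(6)]
```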
AUDIENCE: OK, thanks.
JOE BEDA: Thank you.
AUDIENCE: You guys have mentioned a
lot about your network.
What kind of between-instance latency do you have compared
to, say, EC2?
CRAIG MCLUCKIE: So it's a great question.
We are very proud of our network capabilities, and it
will continue to get even more awesome with time.
We don't like to publish
comparative performance numbers.
We invite you to apply for the program, try it out.
We think you'll like it.
We think you'll like it particularly when you try to
build large clusters and you want to see very strong,
consistent cross-sectional bandwidth
across large clusters.
AUDIENCE: Will the App Engine team ever be developing on top
of the Compute Engine?
So provisioning for App Engine using the Compute Engine.
CRAIG MCLUCKIE: So the question is App Engine, will
they be building on top of Compute Engine?
So at this time, the two technologies work well
together, and we've ensured a good degree of integration.
But they are discrete products.
I mean, Google Compute Engine is pure infrastructure, and it
benefits tremendously from having that
Platform-as-a-Service component.
Over time, I expect you'll see the lines become more blurred.
And we're working on creating a much more natural fusion of
the technologies.
For now, the best applications are, for instance, using App
Engine as a way to build and scale and manage large-scale
applications and serve as a front end for those
large-scale Compute clusters, and then using Compute Engine
to open up App Engine and run proprietary code, for
instance, like a transcoding application, et cetera.
And over time, I think you'll find them--
well, we're working on making them much more aligned.
AUDIENCE: Thank you.
AUDIENCE: IPv6.
What are your plans for offering it?
JOE BEDA: We are going to do IPv6.
We had to prioritize what's important to people now, right
now, as we worked to get the product ready for Google I/O.
But Google is a very big proponent of IPv6.
And I can guarantee you that there are many people inside
of Google who would love to see us have IPv6 yesterday.
So yeah, it's a priority for us.
CRAIG MCLUCKIE: Absolutely.
Right.
Any other questions?
Wonderful.
Well, thank you so much for your time.
[APPLAUSE]
CRAIG MCLUCKIE: And do visit us in the later sessions.
Joe's session at 5:15.