JEROMY CARRIERE: Welcome.
Before I get started, I want to ask a quick question.
Developers?
OK.
Sort of management types?
Which I can say without any slight,
since I am such a thing.
Oh, good.
OK.
Anybody I didn't get with those two categories
that's willing to admit it?
No.
OK.
Fair enough.
So, as Tom said, I'm Jeromy Carriere.
I'm an engineering director here at Google in New York.
I'm responsible for all of Google's monitoring, logging,
alerting, sort of operational systems,
both facing Google's internal users
and facing our cloud customers.
So if you've seen Google Cloud Monitoring,
that's my team's project.
I was told to offer a little bit of human interest,
so my human interest is my daughter's joining us today.
She brought me to a One Direction concert last night,
so this is payback.
Yeah.
She's going to have to stay all day.
So, does anybody-- I was going through these slides.
I'm thinking, like, does anybody really
need to be sold on why cloud is a thing?
Does anybody not believe it?
OK.
Good.
Anyway.
I'm forced to go through the slides.
CEOs.
We've talked to lots of CEOs.
They see technology change as driving their business.
That's likely been true for all time.
But right now, the key thing that's
driving businesses globally is the cloud.
Again, I probably don't need to sell you too much.
There are a few key elements to that, things that are really
pushing cloud as a thing today.
Mobile, obviously, is a huge one, right?
People are using their own devices in the workplace.
Common pattern, 53%, I think, is the number
of people bringing their own mobile devices
into the workplace.
Any time.
People need to be able to access data wherever they are,
whenever.
Any time of day.
From anywhere in the world.
We need people.
We need collaboration.
That's a huge driver today.
The key thing that I like to focus on is this idea of speed.
Right?
So I sort of subscribe to the philosophy
that velocity is paramount.
Everything else you can fix if you can deliver quickly.
So spending time, as most enterprises
do-- 67% of engineering effort goes
into maintaining existing systems-- that
just saps velocity.
So how do we fix that?
What is it that we can do about it?
And that's where we come-- ah, sorry.
One more motivational slide.
Mobile.
People don't really even appreciate
that cloud services are part of their daily lives.
Whether it's iCloud or pick your favorite start up.
It's likely running on our cloud or on AWS.
Not a surprise.
So it's in everybody's lives all the time.
A vast majority of the workload that's
running on Google's Cloud Platform today
is actually being driven by mobile applications.
IT trends are the sort of fundamental building blocks
that are pushing this momentum today.
So the first is affordable capacity.
Today-- this is a stat I didn't even actually know
until I read these slides-- $600 can buy
enough storage for the world's music.
That's pretty neat.
$500-$600.
On demand.
You can get compute capacity when you need it,
with basically no ramp, right?
There's no time.
You don't have to wait to get capacity.
As a startup, as any enterprise, you
can get going with zero capital expenditure.
You can rent your way to massive scale.
And then, finally, global networks
in their current state-- Google having one of the best--
mean we can deliver information, anywhere in the world,
to any device, extremely quickly.
So these are the trends that are pushing all this together.
In case you're not sold, I found this also
very compelling.
In 1957, the average age of a company joining the S&P 500
was 75 years.
In 2013, it was 10 years.
So if you don't believe that the enterprise is changing,
I think this should be compelling.
Google.
Everyone knows Google's mission statement.
We're forced to memorize it.
Tattoo it.
Organize the world's information and make it universally
accessible and useful.
Of course, this started with Search.
Google Search was Sergey and Larry
in their, whatever, dorm at Stanford.
That was where Google began.
And that required a lot of infrastructure.
Over the years since Google began,
there has been an enormous amount of infrastructure built,
rebuilt, rebuilt again, globally to deliver on that mission
statement.
That means that we are running some of the largest distributed
systems in the world, with extremely
stringent requirements in terms of latency,
in terms of reliability.
Doing this required solving a lot of problems, right?
How do you not just store multiple copies
of the web, which is one thing-- you can put it on a hard drive
and stick it in the closet, right?--
but make it accessible globally with low latency?
Making it queryable in rich ways.
Applying the Knowledge Graph to overlay
the raw index of the web.
And then expanding that, of course,
into the rest of our product portfolio.
That's required us to solve some really interesting problems.
And just to look at one-- networking.
So Google has a fantastic global network
in terms of reliability, in terms of latency,
in terms of raw throughput, spanning the globe.
To get there, Google actually had
to reinvent the way telecom was done.
For the most part, we actually have our own dedicated fiber
spanning the globe.
That's kind of one aspect of the sort of physical plant
side of things.
But there's a lot of software here, right?
Google engineers are exceptionally smart.
Some examples are shown here.
And these are the ones that we've
talked about most publicly.
And you see, as you go to the right,
you're moving toward more public offerings like Compute Engine.
But back in 2002, the problem was, how do you
store multiple copies of the web and make them accessible?
So the Google File System.
Then, how do you actually process that much data?
That's MapReduce.
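The MapReduce model referred to here can be sketched in a few lines of plain Python. This is a toy word-count illustration of the map, shuffle, and reduce phases only, not Google's actual implementation:

```python
from collections import defaultdict

def map_phase(documents):
    # Map: emit a (word, 1) pair for every word in every document.
    for doc in documents:
        for word in doc.split():
            yield (word, 1)

def shuffle_phase(pairs):
    # Shuffle: group all emitted values by key.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: sum the counts for each word.
    return {word: sum(counts) for word, counts in groups.items()}

docs = ["the web the index", "the crawl"]
counts = reduce_phase(shuffle_phase(map_phase(docs)))
print(counts["the"])  # → 3
```

In a real MapReduce system, the map and reduce phases run in parallel across thousands of machines, and the shuffle moves data between them over the network; the programming model, though, is exactly this simple.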
And then how do you actually make
it queryable in an online fashion?
Building a storage system is one thing,
but then actually making it queryable for online access
is another thing.
That's Bigtable.
But how do you make it expressive?
How do you actually allow users that
want to ask questions of that data--
how do we make them productive?
Well, that's Dremel, otherwise known externally
as BigQuery.
And then another interesting point
here is that Colossus is the replacement for Google File
System.
So even in this one slide, there are already
cycles of wax and wane of a given technology.
So as we learned about GFS, it taught us
what we needed to do in the next generation.
That's Colossus.
And then you see Spanner, which is
a strongly consistent global database-- which sort of flies
in the face of much of the accumulated wisdom of software
engineering these days, which counsels
us to relax consistency constraints
and favor availability over consistency.
And that's the Bigtable model.
But as we observed how Googlers build applications,
we discovered that actually finding a way
to build a global, consistent database
makes many problems easier to solve.
And that's embodied in Spanner.
And then, finally, taking everything
we've learned about building global compute infrastructures,
global physical compute clusters,
has led us to Compute Engine, Google Compute Engine.
So that's all well and good, right?
That's fun for us.
But what does it mean to you?
Well, the good news is that the Google Cloud Platform--
Compute Engine, which I mentioned on the previous slide,
is one example--
is actually built on the same infrastructure
that powers Google itself.
So underneath, as I said, BigQuery is Dremel.
Underneath another-- just quick example-- underneath Google
Cloud Storage is a product that we use internally
called Blobstore.
And I can sort of repeat this pattern over and over again.
So the portfolio.
The products that we offer as part of the Google Cloud
Platform.
Loosely, in three large buckets.
Compute, that's hosting applications.
App Engine and Compute Engine, which
we'll get into in a second.
And you'll hear more, of course, throughout the rest
of the afternoon on these things in depth.
The storage products.
Cloud Storage, Cloud SQL, Cloud Datastore.
So Datastore, with an affinity to App Engine.
Cloud SQL, hosted SQL offering.
Cloud Storage, the Blobstore I mentioned a second ago.
And then high-level application services.
Cloud Endpoints.
This is the facility we have for allowing you
developers to build APIs hosted by App Engine.
And BigQuery, again, is our SQL-like large data query
capability.
Sort of a columnar-inspired query system.
So that's the quickest possible pass through the platform.
Just to reflect on how Google Cloud Platform is evolving.
So I've only been at Google 18 months.
So 18 months ago, coming into Google, Cloud, in all honesty,
seemed like a bit of a novelty to me.
It didn't seem like it was really core to our business.
That, I can swear, has changed.
We, as an organization, as an engineering organization,
as a business, have applied an enormous amount of energy
to the cloud platform.
And you can see that reflected in just this small slice
of the products that we've delivered just
in the last year.
Since last August.
Everything from Encryption at Rest,
which seems like kind of a table-stakes thing,
all the way through the things Tom mentioned before:
Containers, and Google Cloud Monitoring, which
I'll talk about, I think, on the next slide.
And all through this, there's been a consistent theme
of reducing prices to better reflect the costs that we
actually incur at Google, which I'll get into in a second.
So developers are moving to Cloud,
because it's always going to be lower cost.
We can always, as a company like Google,
run it more inexpensively than you, an enterprise, can.
And that sounds like it's easy to say.
But it's actually true.
The economics are fundamentally different for an organization
like Google.
And I'll come back to that in a second.
Then there's a question of flexibility and adaptability.
We are striving to build a cloud platform that
eliminates lock-in.
Now, there's only so far you can go, right?
If I said everything
you build on the Google platform is 100% portable,
you'd be right to question that.
But we are striving to not just adhere to standards,
but drive standards to make that true.
And then finally, we want to allow you, the developer,
to focus on customers.
Energy you spend constructing, managing, monitoring,
and maintaining infrastructure is
time you're not spending building value
for your business.
Something that your customers are going to love.
And we want to obviate that.
Cloud economics.
So-- this is the point I made a second ago-- even if you're
thinking cloud-wise and building a private cloud,
you can only imagine driving that cost curve down
towards the public cloud curve if you have
an extremely diverse workload
and you're a very large enterprise.
But we, as a cloud provider, have the opportunity
to optimize in terms of economies of scale.
We have the opportunity to hire in a way that is really just
not possible unless you're running a cloud platform.
And we have an opportunity to multiplex workloads.
And this is a really interesting and subtle and rich
topic of study.
But we have an opportunity to multiplex workloads
across this global infrastructure in a way that
lets us pass on cost savings to our customers.
As an example, here are a few of many potential computing
patterns, right?
There's the on-off sort of batch every night.
Once a week, run this job.
You have the pattern of growth, right?
Which is of course what we all want in our businesses.
Up and to the right.
That means you need more capacity.
And then there's a third workload type--
this idea of burst.
And every workload, generally, that's continuously on
is going to have some pattern, right?
Absent any other capability, you have to provision for the peak.
Or, more cautiously, you provision
for some multiple of the peak, right,
to avoid problems when you have an unexpected spike in demand.
So all of these things, all of these computing patterns,
if you have to manage them yourselves,
you have to take on either dramatic overprovisioning,
or provisioning for a certain amount of growth,
or overprovisioning for these batch periodic workloads.
With Google Cloud Platform, you can
pass on all of that complexity to Google.
You no longer need to plan for every cycle of capacity
you need to run your business.
I'm not saying you don't have to do some work,
but we are striving to take away as much of that work
as possible.
Concretely, since 2006, we've seen 6% to 8% price decreases
in public cloud offerings annually.
But we don't think that's fair, because the underlying costs
that actually drive the business have been dropping at 20%
to 30%.
So that remaining red area there is just
profit for the cloud provider.
Our intent is to actually drive that red line
much closer to the blue line, and pass on those savings
to our customers.
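To see how fast that gap between price and cost compounds, here's a back-of-the-envelope sketch. The 7% and 25% rates are hypothetical midpoints of the ranges quoted above, and the eight-year window is illustrative:

```python
# Hypothetical midpoints of the ranges quoted in the talk:
price_drop = 0.07   # ~6-8% annual public cloud price decrease
cost_drop = 0.25    # ~20-30% annual underlying cost decrease

price = 1.0
cost = 1.0
for year in range(8):       # e.g. roughly 2006 through 2014
    price *= (1 - price_drop)
    cost *= (1 - cost_drop)

print(round(price, 2))      # → 0.56: prices roughly halve
print(round(cost, 2))       # → 0.1: costs fall roughly 10x
```

Under these assumed rates, after eight years prices have only halved while underlying costs have fallen an order of magnitude, which is exactly the widening "red area" of provider profit the slide describes.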
And you've seen that recently as we announced new pricing.
And one concrete example of that is our switch
in Compute Engine pricing for On Demand.
Historically, you had to choose.
You had to choose between On Demand and reserved compute
instances.
And generally, to do that, if you're
running a complex enterprise, you have to hire
a capacity engineer-- or more than one person--
to actually plan for that capacity and optimize.
How much do I run on demand, which is more expensive,
versus how much do I run reserved,
which is less expensive, but commits me
for some period of time?
So what we've done with our new On Demand pricing
is as its usage increases towards 100%, prices decrease.
So without you having to do any specific engineering,
you just buy On Demand.
Right?
You buy On Demand instances, and if you use them
like reserved instances, they become
priced like reserved instances.
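The idea can be sketched with a hypothetical tiered discount schedule. The tier boundaries and multipliers below are invented for illustration, not Google's actual price list:

```python
def effective_rate(base_hourly_rate, fraction_of_month_used):
    """Hypothetical sustained-use pricing: the more of the month
    an instance runs, the cheaper each incremental hour gets.
    Tiers are (usage threshold, multiplier on the base rate)."""
    tiers = [(0.25, 1.0), (0.50, 0.8), (0.75, 0.6), (1.0, 0.4)]
    total = 0.0
    prev = 0.0
    for threshold, multiplier in tiers:
        if fraction_of_month_used <= prev:
            break
        used_in_tier = min(fraction_of_month_used, threshold) - prev
        total += used_in_tier * multiplier * base_hourly_rate
        prev = threshold
    # Blended per-hour price across everything actually used:
    return total / fraction_of_month_used

# Running flat-out all month is cheaper per hour than running 25%:
print(effective_rate(0.10, 1.0) < effective_rate(0.10, 0.25))  # → True
```

The point of the design is in the last line: no capacity engineering is needed. An instance that happens to run all month automatically ends up priced like a reserved one, without anyone committing to anything up front.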
And that is just sort of one data
point to hopefully give you some feeling for how Google
is thinking about pricing for the cloud platform.
But cloud is still too hard.
We want to price it so you love it.
We want to make it easy to get at.
But actually building applications for the cloud
is still particularly difficult in many ways.
Specifically developers need to make trade-offs to get around
the way cloud platforms in general--
not just ours-- are deficient in a variety of ways.
And here are three kind of classical trade-offs
that we see today.
A trade-off between time to market or scalability.
Do I build my application, as a startup, as a new project,
on day one to scale, when I don't
know if it's going to scale?
Or do I build it so I get to market most quickly?
And then deal with, ugh, maybe it's not going to scale.
I'll do it later.
And often you have to fall into one of those two buckets--
either build it quick or build it so it's going to scale.
Second example is flexibility or automatic management.
So this, back to something Tom said before, we believe
that the line between infrastructure as a service
and platform as a service-- infrastructure as a service
being maximum flexibility, platform as a service
being minimal management-- we believe
that that line is blurring.
And Julia is going to talk about that in a little while.
And then, finally, we also have to often choose
between big data-- store it efficiently,
make it cost-effective to ingest and process--
or make it easy to access in an ad hoc, real-time fashion.
Another trade-off that's commonly made.
But we think in most of these cases,
we can change that "or" to an "and."
We can obviate the trade-off in many cases.
So first one.
Time to market versus scale.
How do you smash those two things together?
Well, we want to make it possible to use
the tools that you as developers know and love.
Make deployments fast, reliable.
And then make it easy to fix problems in production.
And one big component of that is Google Cloud Monitoring.
This is my team's project, so I'm just
going to leave this slide up here for a while.
Let you bask in its glow of those nice graphs.
So the idea here is to have a single pane of glass
for monitoring all of your workload
on the Google Cloud Platform.
Give you rich dashboards, alerting,
provide custom metrics, give you an incident management facility,
make it possible to quickly find and solve problems
in production systems.
Whether they're running on App Engine or Compute Engine,
whether you have Cloud SQL databases,
or you've got stuff deployed globally across our network.
Stackdriver is the company that we acquired back
in May that is the foundation of this offering.
Another super-compelling piece of this
is recognizing that if you're making that trade-off, right?
Either I have super-flexible or I have completely managed.
One of the common complaints there
is that if I go to the super-managed end
of the spectrum, I lose visibility.
I lose the ability to debug my application, right?
We've all typed gdb binary core file.
Like, go in there, find the stack trace.
See which pointer dereference is null.
You lose that, in most cases, in a cloud environment.
You're running on App Engine.
You're running on managed VMs, which
I'll talk about in a second.
You've got your processes running wherever.
You don't have any idea where it is.
You can't log into the box and attach to the running process
and inspect the stack.
Cloud Debugger offers the ability
to do exactly that in a globally deployed application.
So you fire up the Cloud Debugger.
If we have your source, you set a breakpoint.
And we catch when the breakpoint is
hit by the process running, wherever
it happens to be on our platform.
You see it.
You can inspect the stack.
Incredibly, incredibly compelling capability.
And this was also announced at I/O.
This is not generally available yet.
And then another, final example is Cloud Trace.
So we've all wanted to look at traces--
a hierarchical decomposition of a request.
So we have an offering-- again, not yet generally available,
but will be soon-- called Cloud Trace,
that lets you inspect, for a given App Engine request,
its complete trace of calls into Datastore
and into memcache, to look for latency,
to look for deviant behavior in your application.
And compare from release to release.
Say you had release four and release five,
and release five seems to be relatively less performant
than release four.
Where is the delta?
This tool will tell you.
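The release-to-release comparison boils down to a per-call latency diff between two traces. A minimal sketch of that idea, with span names and latency numbers invented for illustration:

```python
def trace_delta(trace_a, trace_b):
    """Compare per-call latencies (in ms) between two releases and
    return the calls sorted by how much slower release B got."""
    deltas = {span: trace_b.get(span, 0) - trace_a.get(span, 0)
              for span in set(trace_a) | set(trace_b)}
    return sorted(deltas.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical per-span latencies for the same request in two releases:
release_4 = {"datastore.query": 40, "memcache.get": 2, "render": 15}
release_5 = {"datastore.query": 95, "memcache.get": 2, "render": 16}

worst_span, slowdown = trace_delta(release_4, release_5)[0]
print(worst_span, slowdown)  # → datastore.query 55
```

The real tool does this over aggregated production traces rather than a single pair of dictionaries, but the diagnostic question it answers is the same: which call in the request tree accounts for the regression.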
OK.
The next trade-off that we want to make go away--
that, again, Julia will dig into--
is the trade-off between flexibility and management.
Making it easy to manage your application, right?
The App Engine kind of platform as a service model.
Versus run whatever you want, however you want,
but you lose the management facility.
Again, we want to make that go away
and blend together these two models, the Turnkey App Engine
platform and the fully flexible Compute Engine platform.
This is kind of the conceptual view
of how this might be accomplished.
At the bottom, you've got Compute Engine.
We manage your infrastructure.
Then you get the idea of replica pools, which
is a new facility in our Compute Engine offering, where we take
care of multiple instances sort of stamped out
of the same virtual machine image.
And you can tell us to scale it.
We'll make sure that there are n healthy instances at any given
point in time, by doing health checks.
That's one additional level of management.
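Conceptually, a replica pool is a reconciliation loop: check health, drop anything unhealthy, and create replacements until the pool is back at its target size. A toy sketch of one pass of that loop, with the instance representation and health check invented for illustration:

```python
import itertools

_ids = itertools.count(1)

def create_instance():
    # Stand-in for booting a VM from the pool's image template.
    return {"id": next(_ids), "healthy": True}

def reconcile(pool, target_size, is_healthy):
    """One pass of the control loop: remove unhealthy replicas,
    then create new ones until the pool is back at target_size."""
    pool = [inst for inst in pool if is_healthy(inst)]
    while len(pool) < target_size:
        pool.append(create_instance())
    return pool

pool = [create_instance() for _ in range(3)]
pool[0]["healthy"] = False               # simulate a failed health check
pool = reconcile(pool, 3, lambda i: i["healthy"])
print(len(pool))                          # → 3
```

The appeal of the pattern is that you declare the desired state (n healthy replicas of this image) and the system converges toward it, rather than you scripting each repair by hand.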
Then up another level is managed VMs,
where you give us your App Engine app.
We run it in a virtual machine.
You don't have to log into that virtual machine.
You don't have to install software on it.
We take care of that for you.
But it's got your App Engine app deployed in it.
And then at the top of the stack,
we have so-called managed runtimes.
This is the classic App Engine model.
So in each of these layers, you have--
maybe not less code; to me that's not
quite the right representation--
but less management to do at each level.
One more plug for my team's project.
We will provide logging and monitoring across the spectrum.
So, regardless of whether you're running a raw VM
or running on top of App Engine, or anywhere in between,
we will give you visibility into your complete application
portfolio.
So managed VMs, which I mentioned before--
this is the idea that we can offer you
a virtual machine that is managed like an App Engine app.
Containers.
This is one of the newest things that we've
been doing in Google Cloud Platform:
offering facilities to make it
easy to run containerized applications on the Cloud
Platform, by offering special VM images that
are tuned for running containers.
And, at the richest end of the spectrum,
offering this new system called Kubernetes,
which is a sort of miniaturized cluster management system that
reflects the lessons that Google has learned
over many years of running large-scale,
globally managed cluster systems.
So this lets us offer better predictability.
Lets us do things with resource allocation
that increase efficiency, give you visibility
into how your resources are being used.
So this I think is extremely exciting,
an extremely exciting offering from GCP.
And, perhaps more importantly-- and you
might have seen this in the press-- we're actually
working with a consortium of tech companies-- including
Microsoft, somewhat surprisingly--
to make this a portable offering.
So that it operates the same, regardless
of which cloud platform you're running on.
So you can take your containerized Kubernetes
application and run it in any cloud with minimal pain.
Networking.
I think I went through this a little bit already.
Our networks are great.
I won't dwell on it too much.
But I think this is generally self-evident in the data.
Finally, big data.
So, again, trade-off between big and fast, big and ad hoc
real time.
There's a variety of reasons why big data is hard, right?
On one side of the coin, you need specialized expertise.
You need complex distributed infrastructure.
It takes a lot of energy to build an efficient big data
system.
And then it's expensive.
The people that you have to hire that have specialized expertise
are expensive.
Storage is expensive.
Compute is expensive.
So what we've decided to do is, to the best of our ability,
make this easy and affordable by bringing
to market a number of tools, again based on the things
that we've done internally at Google.
Based on our expertise with MapReduce,
with large-scale SQL-like systems like Spanner.
Dataflow, which is an evolution of an internal system that
has largely supplanted raw MapReduce
for our internal application workloads.
And, at the same time, make it easy to run open-source systems
like Hadoop on Google Cloud Platform.
So we're trying to put together a portfolio of building blocks
to make big data accessible to our customers.
Last point on this is we're also working
on a fusion of streaming computation--
that's the thing I mentioned before, Dataflow-- batch,
typical kind of MapReduce-like applications,
and graph analysis, graph databases.
Putting those together into a portfolio that, again, gives
you the building blocks that you need
to build the kinds of applications
your business requires.
So, in summary, cloud's real.
I didn't have to sell that one too hard, which is good.
The idea of the Google Cloud Platform
is harnessing the technology that we at Google
have used to build Google.
And, finally, all of these software and infrastructure
innovations are really coming of age.
And the Google Cloud Platform has
as its mission to bring those to our customers.
And to put our money where our mouth is,
we have a so-called starter pack credit available now--
apply the promotion code at the URL there
and get $500 to begin working with the Cloud Platform out
of the gate, without any upfront expense.
Thank you.