[music] Hello, network students. Welcome to week two of quiz sections.
Today we have with us Will Scott. Will is a graduate student at the
University of Washington, and he is a former employee at Google.
>> And he's an expert on all things to do with Google.
So today we are going to talk about some tweaks and optimizations that Google
makes to the regular network protocol stack, and Will is going to tell us about
that. So Will, why don't we start with why these tweaks are required?
TCP is a 20-year-old protocol, so isn't
TCP/IP a solved problem?
>> Yeah, so let me preface my answer by saying that I didn't work on Google's TCP
stack, but rather worked on some of their actual products. But when you think
about TCP, TCP is a very broad protocol that is responsible for most of the
traffic that goes across the internet, but traffic that is going on inside of a
data center is going to look very different from the traffic that goes between
your home computer and that data center. Right, the computers are much closer
together, in that packets are going to get to their destination much faster,
like a matter of milliseconds, or one millisecond, something like that. And
also there's a lot of bandwidth between those, right? Data centers are going to
have a gigabit connection between any two computers. It's going to be fast.
That's like a given. And so one thing you're going to see in TCP is we've got
the slow start mechanism, which sort of probes the connection and sees how much
bandwidth is available. And that's something you need on the internet, because
you don't know how much bandwidth you've got. But in a data center, you've got
much more of a guarantee of what your connection looks like, and so you can set
your parameters so you largely skip that. You start with a large window to
send. For most of your connections, they're going to send before you need to
wait for any of your acknowledgements to come back to you. And so that lets you
be more efficient, and that happens because you have a better sense of what
traffic you can be sending.
>> So can you tell us about anycast? What is anycast, how does it work, and how
does it help us?
>> Sure. So anycast is a way where you can have multiple hosts that all respond
to the same IP address. So if you've got a piece of content, or a service like,
for instance, Google, but you can think of any piece of content, and you want
hosts around the world to be able to get to that content quickly. One of the
problems that we've had with the model that we understand so far is that all of
these computers around the world are going to have to go to one place to get
that content, right? The IP address identifies one machine, and that's sort of
suboptimal, in that you don't really want the computers in Europe to have to go
to the US to get their content if you've got data centers in
Europe. And so there's a trick that gets played, and it's called anycast. What
anycast allows computers and services to do is advertise the same IP address
from multiple places. So the BGP protocol, the Border Gateway Protocol, which I
believe has been discussed in your lectures, talks about how an AS, an
autonomous system, is going to advertise a block of IP addresses that it owns.
And so Google, then, at the multiple places where it peers with other autonomous
systems, can advertise that it owns and has a route to this same block of IP
addresses. And so when your host goes to send packets to that IP address, they
will follow the path to the closest Google data center, and so they will end up
following that shorter data path. And anycast gets used in a lot of places. It
gets used by the root DNS servers and many sort of replicated, widely available
services. It's a common trick that gets played to allow for this locality in
global internet services.
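
As a rough illustration of the anycast idea described above, the following Python sketch models two sites announcing the same prefix and a client network picking the announcement with the shortest AS path. The site labels, path lengths, and the `192.0.2.0/24` documentation prefix are assumptions made for the example, and real BGP route selection weighs more criteria than path length alone.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Advertisement:
    """One BGP-style announcement of a prefix, as seen by a client network."""
    prefix: str          # the advertised address block
    origin_site: str     # hypothetical label for the site announcing it
    as_path_length: int  # number of AS hops from the client to that site


def pick_route(advertisements, prefix):
    """Return the advertisement with the shortest AS path for the prefix."""
    candidates = [a for a in advertisements if a.prefix == prefix]
    if not candidates:
        raise LookupError(f"no route to {prefix}")
    return min(candidates, key=lambda a: a.as_path_length)


# Documentation prefix used as a stand-in; not a real Google address block.
ANYCAST_PREFIX = "192.0.2.0/24"

# The same prefix is announced from two sites. Each client network sees
# different path lengths to them (values are made up).
seen_from_europe = [
    Advertisement(ANYCAST_PREFIX, "europe-dc", 2),
    Advertisement(ANYCAST_PREFIX, "us-dc", 5),
]
seen_from_us = [
    Advertisement(ANYCAST_PREFIX, "europe-dc", 5),
    Advertisement(ANYCAST_PREFIX, "us-dc", 1),
]

print(pick_route(seen_from_europe, ANYCAST_PREFIX).origin_site)
print(pick_route(seen_from_us, ANYCAST_PREFIX).origin_site)
```

Both clients send to the same destination address, yet each ends up at the site that is nearer in routing terms, which is the locality property mentioned in the answer.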
>> Thank you, Will. So before we talk about [INAUDIBLE] sensitive DNS and what
Google does with it, can you give us a very brief [INAUDIBLE] about what DNS is?
Because we haven't studied the application layer.
>> Sure. So DNS stands for the Domain Name System, and it's the internet system
that lets you resolve a domain name to a [INAUDIBLE]. So the google.com, or the
washington.edu, or the coursera.org name that we type into our browser, and
that gets used in many applications. When you think about the internet packet
and the IP header that you've learned, there's no place for that to fit, right?
The destination is one of these IP addresses, like 128.208.4.252 or something
like that. And so the DNS system is a set of servers. They form a server tree,
and the thought is it's a way to resolve one of those names, like
washington.edu, back to an IP address that you can route to.
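
To make the "server tree" idea concrete, here is a toy Python sketch that resolves a name by walking a nested dictionary from the root, one label at a time, right to left. The tree contents are assumptions for illustration: the address for washington.edu is simply the example address mentioned above, not a verified record, and the coursera.org entry is a placeholder.

```python
# Toy name-server tree: each level is a zone that either delegates to a child
# zone or holds an address record under the key "A". All data is illustrative.
ROOT = {
    "edu": {
        "washington": {"A": "128.208.4.252"},  # example address from the talk
    },
    "org": {
        "coursera": {"A": "192.0.2.10"},  # placeholder documentation address
    },
}


def resolve(name, root=ROOT):
    """Walk the tree from the root, following labels right to left."""
    zone = root
    for label in reversed(name.lower().rstrip(".").split(".")):
        if label not in zone:
            raise LookupError(f"{name} does not exist")
        zone = zone[label]
    if "A" not in zone:
        raise LookupError(f"{name} has no address record")
    return zone["A"]


print(resolve("washington.edu"))
```

A real resolver asks a different server at each level of the tree and caches the answers; the dictionary walk only mirrors the order in which the labels are consulted.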
>> So what does Google do differently from the DNS you just described?
>> Okay, so one thing that you notice with Google, if you look up their record,
is that they tend to set the time-to-live field very short, something like five
minutes or even one minute. And what that does is it tells intermediate DNS
servers that they shouldn't cache the result very long, that Google's IP
addresses are very volatile, that they'll change quickly. Another thing that
you'll notice is that when you ask for Google you'll get something like 10 IP
addresses back, not just one, which gives your computer some additional freedom
in load balancing. But I think the caching is one of the interesting points.
DNS ends up caching a lot. There's a lot of sort of intermediation between
Google's server and, you know, your ISP, your local network, and its cache. And
so Google isn't able to directly say, you should visit this IP address when you
want to visit us. And that ends up making their traffic management issue
tricky, because there are these intermediate systems that they can't control in
between them and the client.
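
The effect of a short time to live on an intermediate cache can be sketched in a few lines of Python. This is a minimal model, not how any particular resolver is implemented: the `TtlCache` class, the name `service.example`, and the addresses are all hypothetical, and the clock is passed in explicitly so the behavior is easy to follow.

```python
class TtlCache:
    """Minimal resolver-side cache that honors each record's time to live."""

    def __init__(self):
        self._entries = {}  # name -> (addresses, expiry time in seconds)

    def put(self, name, addresses, ttl_seconds, now):
        """Store an answer that stays valid until now + ttl_seconds."""
        self._entries[name] = (list(addresses), now + ttl_seconds)

    def get(self, name, now):
        """Return cached addresses, or None if missing or expired."""
        entry = self._entries.get(name)
        if entry is None:
            return None
        addresses, expires_at = entry
        if now >= expires_at:
            del self._entries[name]  # expired: forces a fresh lookup
            return None
        return addresses


cache = TtlCache()
# A one-minute TTL with several addresses in the answer (made-up values).
cache.put("service.example", ["192.0.2.1", "192.0.2.2"], ttl_seconds=60, now=0)

print(cache.get("service.example", now=30))  # within the TTL: served from cache
print(cache.get("service.example", now=61))  # past the TTL: None, ask again
```

With a short TTL the cache has to go back to the authoritative server often, which gives the operator frequent chances to hand out different addresses.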
>> Another thing that you see from them to try and prevent this is the use of
CNAMEs, alias domain names. And so what they'll do sometimes is that a domain
that tends to have logged-in users will actually be an alias for some longer,
more personal domain. And what that does is it means that your client will sort
of save that, and in the future it will look up that specific domain. And that
lets the DNS server route you to the data center that has your data, so that
they don't have to move it around to a different data center if your traffic
ends up there.
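
The alias mechanism can be sketched as a lookup that follows CNAME records until it reaches an address record. Everything in the record table below is hypothetical (the names, the "shard" label, and the address are invented for the example), and the sketch says nothing about how Google actually names its per-user domains.

```python
# Hypothetical records: names and addresses are made up for illustration.
RECORDS = {
    "mail.example": ("CNAME", "user-shard-eu.mail.example"),
    "user-shard-eu.mail.example": ("A", "192.0.2.20"),
}


def resolve_with_aliases(name, records=RECORDS, max_hops=8):
    """Follow CNAME records until an address record is reached."""
    chain = [name]
    for _ in range(max_hops):
        if name not in records:
            raise LookupError(f"{name} does not exist")
        record_type, value = records[name]
        if record_type == "A":
            return chain, value
        name = value  # CNAME: continue the lookup at the alias target
        chain.append(name)
    raise RuntimeError("too many aliases; possible CNAME loop")


chain, address = resolve_with_aliases("mail.example")
print(" -> ".join(chain), "=", address)
```

Because the client ends up looking up the longer, more specific name, the server answering for that name can return the address of the data center that holds that user's data.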
>> So it seems like the gist of these optimizations that we saw was that in
today's internet, finding the content is more important than finding the right
host. So, supposing a hypothetical scenario: if the internet were to be
redesigned and built from the bottom up today, what do you think would be
different?
>> Well, that's a good question.
And a lot of thought went into the internet, and we've come up with something
that's really cool. I mean, I think one thing that we've noticed, right, is that
the internet was purposefully designed with all of the functionality pushed out
to the edges, and that's given us a set of properties. We have really good
evolvability. We can evolve protocols without having to change the
intermediate, you know, the core routers. And we also get good properties of
reliability and availability. But there are other things that end up not
working as well in that model. Things like quality of service. So that's when,
if you want to watch a video, and you want to sort of pre-set up that channel
and say, I really want to make sure that the video keeps coming at the same
rate the whole time, and guarantee that. Or mobility, if you want to be able to
move your laptop to different points in the network and have the connection not
be lost. The internet isn't set up to handle that
very well right now. So I guess, to answer your question, I think we're at a
point in the internet where we now have a better sense of what it's doing and
are happy with that. And so that means that maybe we can start pushing some of
the functionality back into the middle. I think we're seeing that already with
middleboxes. We've got this proliferation of smart machines in the middle of
the internet. I think in a ground-up design you'd probably see this as well.
Also because we've got a lot of big company players, and it definitely makes
their life easier if they have more intelligence there.
>> Thanks a lot. And thank you for listening to this
section. Goodbye.
>> Thanks.