Hello and welcome to Unit 6 of our course,
where we're going to cover the use of address spaces
in IPOP to support multiple virtual networks
sharing the same peer-to-peer overlay.
[pause]
So the goal of using address spaces in
the virtual network is to support this feature
where you can have multiple
virtual networks sharing the same peer-to-peer overlay.
So this example illustrates a scenario where we have
three users with three different virtual networks,
VN1, 2, and 3. And they are sharing the same
underlying P2P overlay. So the machines in blue
belonging to virtual network 3, for example,
can communicate within their virtual networks.
They can have IP addresses that are isolated from
the addresses that are allocated to other virtual networks,
and are able to communicate without worrying about
collisions with the namespace or the address space
of the other virtual networks.
So how is this supported in the context of IPOP?
[pause]
The idea is we introduce this abstraction of a namespace,
which is a user-provided string that uniquely identifies
a virtual network that is multiplexing a P2P overlay.
And then every IPOP node is bound to a namespace.
Now one thing that's important to keep in mind
is that this namespace is not seen by applications.
The applications only see the IP address
that's bound to a virtual network interface.
It is part of the configuration of an IPOP router
when it is started up, and it's provided by
the user who creates this virtual network.
So then the key primitive that's used to help
in the mapping of addresses to
peer-to-peer identifiers is a lookup
of the IPOPid, which is, as you recall,
the unique identifier of every node in the P2P network.
So to perform this lookup we need two parameters.
One is the namespace and the other is
the IP address within the virtual network.
For example, let's look at a scenario where
virtual network 3, VN3, has two nodes
that have been allocated the
IP addresses 10.10.1.5 and 10.10.1.6.
So when 10.10.1.5 needs to communicate
with 10.10.1.6 in VN3, the first thing that it does
is a lookup in the first row here of the slide.
And this lookup should yield the value n3,
which is the IPOP identifier bound to the machine
with that IP address on that namespace.
And conversely, if 10.10.1.6 wants to communicate
with 10.10.1.5, it performs a lookup using
the virtual network's namespace and that IP address.
Now because we use unique
virtual network namespaces, it's possible
for different virtual networks to have nodes
which have the same virtual IP address.
So in this example, virtual network 2 has a node
with also the address 10.10.1.6, but that is
mapped to a different IPOP identifier, n5.
And that's because it belongs to a different namespace.
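As a rough sketch of this lookup, the table below holds only the two mappings
named in the example. The table layout and the function name `lookup_ipop_id`
are assumptions made for illustration, and a plain dictionary stands in for
the decentralized lookup that IPOP actually uses.

```python
from typing import Optional

# Only these two mappings come from the example on the slide.
MAPPINGS = {
    ("VN3", "10.10.1.6"): "n3",
    ("VN2", "10.10.1.6"): "n5",
}


def lookup_ipop_id(namespace: str, virtual_ip: str) -> Optional[str]:
    """Return the IPOPid bound to (namespace, virtual IP), or None."""
    return MAPPINGS.get((namespace, virtual_ip))


# Same virtual IP, different namespaces, different IPOP identifiers.
print(lookup_ipop_id("VN3", "10.10.1.6"))
print(lookup_ipop_id("VN2", "10.10.1.6"))
```

The namespace is part of the lookup key, so the same virtual IP address in
two virtual networks never collides.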
Now to accomplish a lookup, what you basically
need is a table. And a hash table would be
one approach to implementing this lookup.
But we don't want to have a centralized server
that's responsible for performing these lookups.
So the goal is to perform the lookup in a decentralized way.
And to do this, we leverage the fact that
we can use a distributed hash table,
which is a data structure that's supported by
a structured peer-to-peer system like IPOP.
So a distributed hash table works similarly
to a regular hash table, but its entries are
distributed across peers in the network.
And to look up this table we're going to use
a combination of the namespace and the
virtual IP address. And the value that we store
in this table is the 160-bit IPOP identifier.
[pause]
So in a distributed hash table, the key
that is used to look up and store data
is hashed to a value that fits within
the peer-to-peer identifier address space.
In the case of IPOP, it's 160 bits, and we use
the [unknown] hash function to compute
a hash of the key that's being inserted.
And that information is stored in the nodes that are
to the left and right of this identifier, this hashed value.
And to support additional redundancy, we can
append unique values to the key and hash the key
multiple times. Let's say you hash four times:
you'd have a total of eight replicas of a DHT entry,
because you store in the two nodes around
the value of each hashed entry. So if you
hash 'k' times, you have k times 2 entries,
or four times two here, in the DHT.
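The replica arithmetic can be sketched as follows. This is only an
illustration: the transcript does not name the hash function, so SHA-1 is
used here purely as an assumption because it produces 160 bits, and the
`#0`, `#1`, ... suffixes and the function names are hypothetical.

```python
import hashlib
from typing import List

ID_BITS = 160  # size of the peer-to-peer identifier space


def replica_ids(key: str, k: int) -> List[int]:
    """Hash the key k times, each time with a different appended value."""
    ids = []
    for i in range(k):
        salted = f"{key}#{i}".encode("utf-8")   # hypothetical suffix format
        digest = hashlib.sha1(salted).digest()  # assumption: 20 bytes = 160 bits
        ids.append(int.from_bytes(digest, "big"))
    return ids


def replica_count(k: int, neighbors_per_hash: int = 2) -> int:
    """Each hashed value is stored on the two nodes around it."""
    return k * neighbors_per_hash


print(len(replica_ids("foo", 4)), replica_count(4))
```

With four hashes and two neighbors per hashed value, this gives the eight
replicas mentioned above.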
But the basic primitives of a distributed hash table,
or DHT, are very simple. There's a PUT of a key/value pair,
which translates into sending a message addressed to
the hash of the key. And again,
potentially appending values and recomputing
the hash multiple times for more redundancy.
And the value is, in this particular case, our IPOPid.
And a GET, to look up a value from the DHT,
is also a simple operation. You provide the key
and what's returned is the value associated with that key.
Or nothing, if there's no value associated with the key.
[pause]
So in general, in the DHT, you would have,
for example, node n1 in this case holding
a key-value pair of "foo" as a key and "bar" as the value.
So the first thing we could do is append
a unique value to this key,
and compute a hash of that key.
Now in this example, let's say that the hash
of this value is 107. Now there's no node in this network
that has exactly the value 107. So storing
the value "bar" associated with the key "foo"
ends up being a message that is delivered to
both neighbors of this value, 107, which in this example
would be n6, which has id 101, and n7, which has id 205.
And again, recall that these identifiers are ordered
in ascending order. So once you find
the identifier that's immediately smaller than 107
and the one immediately larger, you know that you can
stop routing this message and you can
store the information on these two nodes.
And again, to look up a value for this "foo" key,
all we need to do is, again, compute the hash,
send a message to that identifier, and
obtain the result. In this case it could come from
either n6 or n7; they're both storing this information.
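Here is a minimal sketch of that PUT and GET on a toy ring. The ids 101 and
205 for n6 and n7 and the hashed value 107 come from the example; the class
name `ToyRing`, its methods, and the two extra nodes with ids 12 and 310 are
hypothetical, and real routing across the overlay is replaced by a sorted list.

```python
import bisect
from typing import Dict, Optional, Tuple


class ToyRing:
    def __init__(self, node_ids: Dict[str, int]) -> None:
        self.names = {nid: name for name, nid in node_ids.items()}
        self.ids = sorted(self.names)
        self.store: Dict[str, Dict[int, str]] = {n: {} for n in node_ids}

    def neighbors(self, point: int) -> Tuple[str, str]:
        """Nodes just below and at-or-above `point`, wrapping around the ring."""
        i = bisect.bisect_left(self.ids, point)
        left = self.ids[i - 1]                # i == 0 wraps to the largest id
        right = self.ids[i % len(self.ids)]   # past the end wraps to the smallest
        return self.names[left], self.names[right]

    def put(self, hashed_key: int, value: str) -> None:
        for name in self.neighbors(hashed_key):
            self.store[name][hashed_key] = value

    def get(self, hashed_key: int) -> Optional[str]:
        for name in self.neighbors(hashed_key):
            if hashed_key in self.store[name]:
                return self.store[name][hashed_key]
        return None


# "other_a" and "other_b" are made-up nodes added so the ring has more members.
ring = ToyRing({"other_a": 12, "n6": 101, "n7": 205, "other_b": 310})
ring.put(107, "bar")          # 107 stands for the hash of the salted "foo" key
print(ring.neighbors(107), ring.get(107))
```

Either neighbor can answer the GET, which is why the result may come from
n6 or from n7.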
[pause]
So that's the general behavior of a DHT.
In the case of IPOP, the specific key
and values that are used are as follows:
the key is a concatenation of a keyword, 'dhcp',
with the namespace and the IP address, separated by colons.
And the value is a string that has, again, a keyword,
'brunet:node', followed by a 160-bit IPOPid
encoded in ASCII format. So that's the long string
that you see in green in this example.
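A small sketch of how such strings might be assembled is shown below. The
function names are hypothetical, the colon between 'brunet:node' and the
identifier is an assumption, and the ASCII encoding of the IPOPid is not
specified here, so the encoded identifier is passed in as an opaque string.

```python
def make_dht_key(namespace: str, virtual_ip: str) -> str:
    """Keyword 'dhcp', the namespace, and the virtual IP, joined by colons."""
    return ":".join(["dhcp", namespace, virtual_ip])


def make_dht_value(encoded_ipop_id: str) -> str:
    """Keyword 'brunet:node' followed by the already-encoded IPOPid.

    Assumption: a colon separates the keyword from the identifier.
    """
    return "brunet:node:" + encoded_ipop_id


print(make_dht_key("VN3", "10.10.1.6"))   # dhcp:VN3:10.10.1.6
print(make_dht_value("<ASCII-encoded-IPOPid>"))
```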
[pause]
So having this basic capability of looking up
addresses also allows us to implement...
[pause]
flexible ways of managing the allocation
of the IP addresses. And IPOP supports
two different approaches. One is dynamic assignment,
where we have a DHCP proxy that understands
DHCP messages, such as a DHCP request
that comes in on the IPOP tap device, and
[pause]
provides the functionality of dynamic address
assignment without having a single centralized server.
So how this is done is the node itself generates
an IP address at random within the range
of the DHCP configuration. And then it attempts
to store it in the distributed hash table.
Now it ensures that a majority of the replicas
in the DHT, that is, half plus one... let's say if you have
eight replicas total, you expect at least five replies
to acknowledge that the value has been inserted
before binding the address to the interface.
If it's not possible to insert this mapping
in the DHT, the DHCP proxy will generate
another address and retry, and continue this process
until it finds a value that has not been
allocated to any other node.
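The retry loop can be sketched roughly like this. The function name
`allocate_address`, the `try_insert` callback (which is assumed to return the
number of replicas that acknowledged the insert), and the `max_attempts` cap
are all hypothetical; the cap is only there so the sketch always terminates.

```python
import ipaddress
import random
from typing import Callable, Optional


def allocate_address(
    network: str,
    try_insert: Callable[[str], int],
    total_replicas: int = 8,
    max_attempts: int = 100,
    rng: Optional[random.Random] = None,
) -> Optional[str]:
    """Pick random addresses until enough replicas acknowledge one."""
    rng = rng or random.Random()
    hosts = list(ipaddress.ip_network(network).hosts())
    needed = total_replicas // 2 + 1          # five acknowledgements out of eight
    for _ in range(max_attempts):
        candidate = str(rng.choice(hosts))    # random address within the range
        if try_insert(candidate) >= needed:
            return candidate                  # safe to bind to the interface
        # Not enough acknowledgements: pick another address and retry.
    return None
```

A caller would pass a `try_insert` that attempts the DHT insert for the
candidate address and counts the replies; only when at least five of the
eight replicas acknowledge does the proxy bind the address.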
It's also possible to support static addresses
by inserting them into the DHT mapping.
And in both cases it's important to keep in mind
that the DHT stores a value only for a certain
amount of time, a 'time to live', or TTL.
So these values have to be refreshed in the DHT,
or these mappings expire and the node
will not be addressable anymore.
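To illustrate the expiry and refresh behavior, here is a toy store with a
time to live. The class `TtlStore` and its methods are hypothetical, and the
current time is passed in explicitly so the behavior is easy to follow; a
single in-memory dictionary stands in for the DHT.

```python
from typing import Dict, Optional, Tuple


class TtlStore:
    def __init__(self) -> None:
        # key -> (value, expiry time)
        self._entries: Dict[str, Tuple[str, float]] = {}

    def put(self, key: str, value: str, ttl: float, now: float) -> None:
        """Insert or refresh a mapping; it expires at now + ttl."""
        self._entries[key] = (value, now + ttl)

    def get(self, key: str, now: float) -> Optional[str]:
        """Return the value, or None if it is missing or has expired."""
        entry = self._entries.get(key)
        if entry is None or now >= entry[1]:
            self._entries.pop(key, None)
            return None
        return entry[0]


store = TtlStore()
store.put("dhcp:VN3:10.10.1.6", "n3", ttl=60, now=0)
store.put("dhcp:VN3:10.10.1.6", "n3", ttl=60, now=50)   # refresh before expiry
print(store.get("dhcp:VN3:10.10.1.6", now=100))         # still present
print(store.get("dhcp:VN3:10.10.1.6", now=120))         # expired
```

Without the refresh at time 50, the mapping would already have expired at
time 60 and the node would no longer be addressable.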
Nodes in the IPOP network can be moved across
physical links and maintain their IPOP identifier.
For example, this is useful if you're migrating
virtual machines from one data center to another.
They can have a different physical address
on the physical network, but IPOP will maintain
the virtual IP address at the destination.
The key idea is that the IPOP identifier
remains the same, and also the mapping
between the virtual IP address and the IPOPid
can remain the same in a distributed hash table.
What changes is that at the destination,
the node will re-initiate the process of
creating edges with its neighbors, if edges
have been torn down during the migration.
So the node will go through the process we saw
in the previous unit, learn the URIs for itself,
the endpoints that it has in the new network,
begin creating edges with its left and right neighbors
and the far edges, and eventually reconnect,
becoming routable again on the virtual network.
All of this, again, without losing connectivity,
without losing the IP address allocation in the process.
[pause]
So that concludes this unit,
and in the next unit we're going to look at
some of the performance optimizations in the system.