FutureGrid MOOC - IPOP Unit 5 - IPOP architecture - Topology and routing 1

Hello and welcome to Unit 5 of our course. In this unit we're going to cover the IPOP baseline architecture: the peer-to-peer overlay that IPOP uses to implement the virtual network, including topology and routing. [pause] So in the last unit we looked at some examples of inter-cloud virtual networks, and IPOP is one example of such a virtual network. The main goal driving design decisions in IPOP has been isolation: providing an address space that is decoupled from the Internet, the same way that VIOLIN does, for example. What differentiates IPOP from other systems is self-organization and the peer-to-peer architecture. Self-organization in IPOP covers the routing tables in the overlay and the connections of the overlay. As nodes join and leave the network, there's no need for manual configuration; the system configures itself through mechanisms that we're going to discuss later. And it's a decentralized messaging architecture with no global state, no single point of failure, and with scalable routing techniques. This is also used in the process of traversing NATs, as we'll describe later. [pause] So, in essence, IPOP uses the tunneling mechanism that we saw earlier as a primitive for implementing a virtual network. It uses a virtual network device at the endpoints to capture and inject IP packets. This is a device that you would find in a typical operating system as a TAP device or a TUN device. IPOP then tunnels these captured IP packets over potentially multiple connections, and each one of these could be over a different protocol: typically UDP, but it could also be TCP. This is similar to other VPNs. What is different in IPOP is that these tunnels are not centrally managed and configured. They are self-organized: they are discovered, established and maintained by the overlay itself.
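To make the tunneling idea concrete, here is a minimal sketch of wrapping a captured IP packet in an overlay header before it is sent as the payload of a UDP datagram. The header layout, the `encapsulate` and `decapsulate` names, and the 2-byte length field are all assumptions made for illustration; this is not IPOP's actual wire format. The 20-byte identifiers are chosen only to match the 160-bit node identifiers mentioned later in this unit.

```python
import struct

# Hypothetical overlay header (NOT IPOP's real format):
#   20-byte source ID, 20-byte destination ID, 2-byte payload length.
ID_BYTES = 20
HEADER = struct.Struct("!20s20sH")  # network byte order, no padding


def encapsulate(src_id: bytes, dst_id: bytes, ip_packet: bytes) -> bytes:
    """Prefix a captured IP packet with the (assumed) overlay header."""
    if len(src_id) != ID_BYTES or len(dst_id) != ID_BYTES:
        raise ValueError("overlay identifiers must be 20 bytes (160 bits)")
    return HEADER.pack(src_id, dst_id, len(ip_packet)) + ip_packet


def decapsulate(datagram: bytes):
    """Split a tunneled datagram back into (src_id, dst_id, ip_packet)."""
    src_id, dst_id, length = HEADER.unpack_from(datagram)
    payload = datagram[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated datagram")
    return src_id, dst_id, payload


if __name__ == "__main__":
    src = bytes(range(20))                # placeholder identifiers
    dst = bytes(range(20, 40))
    packet = b"\x45" + b"\x00" * 19       # stand-in for a captured IP packet
    tunneled = encapsulate(src, dst, packet)
    print(decapsulate(tunneled) == (src, dst, packet))
```

In a real deployment the captured packet would come from the TAP or TUN device and the tunneled bytes would go out over a UDP or TCP socket; both ends are left out here to keep the sketch self-contained.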
And there's no distinction between a client and a server: each IPOP peer is capable of picking up packets, injecting packets and also routing packets over the network. So the architecture looks like this. You have, in this example, two endpoints running the IPOP virtual router, and the virtual network interface that it exposes for connectivity within the virtual network. This virtual network interface is configured with an isolated private address space that the user or the owner of this virtual network can set up. Applications can run unmodified. For example, a web client on the top left connecting to a web server on the bottom right will use the same code and the same protocols that they would normally use on the public Internet. And all the intelligence needed by IPOP to capture, tunnel, and route tunneled packets over the overlay is encapsulated in this virtual router, which is the main software that runs the IPOP infrastructure. So again, an application sends a message; that message is captured by the virtual network interface, goes through the router, is encapsulated, perhaps encrypted, and sent over perhaps multiple hops, eventually getting to a destination at another IPOP virtual router, where the message is decapsulated and delivered through the virtual network interface to the application. So the question is what happens between these steps, when a message goes from one virtual router to another virtual router. That's what differentiates IPOP from most other VPNs. [pause] What you have now is an overlay that's formed as a structured peer-to-peer system. The virtual routers in the previous slide would be the nodes n1, n2, etc., in this example. And IPOP follows an approach that other structured P2P systems have followed, which is to order them as a bi-directional ring, where each node has a unique identifier with a large address space.
In our case, 160 bits. And the nodes are arranged around the ring in increasing order of these identifiers. And that's important for routing, as we'll see. So there are two major types of edges, and the edges here are the connections between these nodes that allow messages to be transmitted from, let's say, n4 to n5. The two important kinds of edges are the near edges, which connect nodes with their immediate neighbors in the IPOP identifier address space, and the far edges, which connect nodes across the ring. In this example, n11 and n12 are connected by a near edge, and a far edge connects n12 to n4. And the reason for these far edges is to support scalable routing. If you only have near connections, routing would take on the order of n hops, where n is the number of nodes in the network. With these far edges, it's possible to reduce the complexity to log squared n, or log n. So when a node, in this example n1, wants to send a message to n7, it uses multiple hops, but each node can make local decisions based on its local routing table to make forward progress towards the destination node. And that's a process called greedy routing. So in this case, n1, for example, looks at its three edges. It has edges with n2, n12 and n5. It chooses the one that brings the message closer to the destination, which is n7, and forwards the message to n5. And this process goes on recursively at each node that is visited along the way. [pause] So the overlay edges can be tunneled over multiple transports, UDP and TCP, and we'll see later an example of a tunnel edge. UDP is preferred because it supports decentralized NAT traversal using hole punching. Greedy routing is used, like I described in the previous slide: each node forwards the message through the edge that brings the message closer to the destination in the identifier address space.
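The n1-to-n7 example can be sketched in a few lines. This is a toy model under stated assumptions, not IPOP's implementation: identifiers are the small integers 1 to 12 on a ring of size 12 instead of 160-bit values, the far edges n1-n5 and n12-n4 are taken from the example in the lecture, and the function names (`ring_distance`, `next_hop`, `greedy_route`, `build_toy_ring`) are hypothetical.

```python
RING_SIZE = 2 ** 160  # size of the identifier space given in the lecture


def ring_distance(a, b, ring_size=RING_SIZE):
    """Shortest distance between two identifiers on a bidirectional ring."""
    d = (a - b) % ring_size
    return min(d, ring_size - d)


def next_hop(current, dest, edges, ring_size=RING_SIZE):
    """Pick the neighbor closest to dest; None if no neighbor is closer."""
    best = min(sorted(edges[current]),
               key=lambda n: ring_distance(n, dest, ring_size))
    if ring_distance(best, dest, ring_size) < ring_distance(current, dest, ring_size):
        return best
    return None


def greedy_route(src, dest, edges, ring_size=RING_SIZE):
    """Follow greedy next hops from src; returns the list of visited nodes."""
    path = [src]
    current = src
    while current != dest:
        hop = next_hop(current, dest, edges, ring_size)
        if hop is None:  # stuck: this node is the closest one reachable
            break
        path.append(hop)
        current = hop
    return path


def build_toy_ring(n=12, far_edges=((1, 5), (12, 4))):
    """Near edges link each node to its ring neighbors; far edges cut across."""
    edges = {i: set() for i in range(1, n + 1)}
    for i in range(1, n + 1):
        succ = i % n + 1
        edges[i].add(succ)
        edges[succ].add(i)
    for a, b in far_edges:
        edges[a].add(b)
        edges[b].add(a)
    return edges


if __name__ == "__main__":
    with_far = build_toy_ring()
    near_only = build_toy_ring(far_edges=())
    print(greedy_route(1, 7, with_far, ring_size=12))   # [1, 5, 6, 7]
    print(greedy_route(1, 7, near_only, ring_size=12))  # [1, 2, 3, 4, 5, 6, 7]
```

With the far edge to n5, the message reaches n7 in three hops; with near edges only, it takes six. Each decision uses only the current node's own edge list, which is the point of greedy routing.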
With a constant number of edges and following a small-world network, the complexity of routing is O((1/k) log²(n)), where k is the number of edges per node and n is the number of nodes in the network. If you make k, instead of a constant, log(n), then the complexity of routing is O(log(n)). And there are adaptive shortcuts and tunnel edges that are added in addition to the near and far edges, and those will be discussed later.
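The routing complexity figures quoted above can be written out side by side. The last line is only an illustration, assuming base-2 logarithms and ignoring constant factors, neither of which the lecture specifies.

```latex
\begin{align*}
\text{near edges only:} \quad & O(n) \text{ hops} \\
\text{constant } k \text{ edges per node:} \quad & O\!\left(\tfrac{1}{k}\,\log^{2} n\right) \text{ hops} \\
k = \log n: \quad & O\!\left(\tfrac{1}{\log n}\,\log^{2} n\right) = O(\log n) \text{ hops} \\
\text{e.g. } n = 1024,\ k = \log_2 n = 10: \quad & \tfrac{1}{10}\cdot 10^{2} = 10 \text{ hops, versus on the order of } 1024 \text{ with near edges only}
\end{align*}
```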