1a Why Social Network Analysis

Welcome to Social Network Analysis. My name is Lada Adamic. I'm an associate professor at the University of Michigan. I'm affiliated with the School of Information, the Center for the Study of Complex Systems, and the Computer Science department. What I'd like to show you in this course is what we can get out of modelling the world around us as networks. Now, the world is very complex, and when you represent it as a network, it may not really look any less complex. But indeed, we can gain very useful insights. We can start to understand how information diffuses in social networks. We can also understand how resilient different infrastructure networks, such as roads or the electrical power grid, are to random or intentional failures.
Here's one example I'd like to start with. These are hand-drawn networks made by the artist Mark Lombardi. He constructed them by poring over news articles in the 1980s and 1990s, making connections between political entities and different financial institutions and corporations. When you lay them out, you can see connections that might not otherwise be obvious just from reading the news articles one by one. Here is Michael Kimmelman, a columnist for the New York Times, commenting on having encountered a few folks from the Department of Homeland Security at an exhibit of Mark Lombardi's art. They found the work revelatory, not because the financial and political connections he mapped were new to them, but because Lombardi showed them an elegant way to array disparate information and make sense of things, which they thought might be useful to their security efforts.
Now, in this class I'm not going to make you do hand-drawn network layouts, even though they're really hard to beat. But Mark Lombardi would spend days drawing these over and over until they were perfect. Instead, what we're going to be doing is using automated layout algorithms in software such as Gephi.
Now here is a nice example of how automated layout algorithms make things very apparent. All they're doing is placing nodes that are connected through edges close together, while other nodes are repelled. Actually, all nodes experience a repulsion force, so that they're not all clumped together unless the ties bring them together. This is a data set of political blogs prior to the 2004 presidential election. This is who follows whom, that is, who has whom on their blogrolls. The liberal blogs are colored blue. The conservatives are colored red. Liberal-to-liberal ties are blue, conservative-to-conservative ties are red. Liberal-to-conservative ties are purple. Conservative-to-liberal ties are orangish-yellow. And what is apparent right away is that, to some extent, there's an echo chamber effect, where liberals are primarily talking to liberals and conservatives are primarily talking to conservatives. And all I really had to do was run one layout algorithm. In this case I did not even have to do any calculations for this pattern to be apparent, although we did do them in our study.
So here is another example of a data set I have gathered. This is an organization, Hewlett-Packard Labs, so a bunch of researchers. And what we looked at was their e-mail communications. If two people had exchanged at least a couple of e-mails back and forth over a period of a few months, they get a grey edge. Overlaid here are black edges, which represent the formal organization: who reports to whom. Now, what is immediately apparent from this visualization, and what we confirmed in the study, is that e-mail communication is more likely to occur between individuals who are closer together in the organizational hierarchy. But there are enough shortcuts across the organization that any two individuals are connected through a small number of hops, and the fact that those hops roughly follow the organizational hierarchy makes the e-mail network navigable.
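To make the layout idea from the blog example concrete (all nodes repel each other, and ties pull connected nodes together), here is a minimal sketch in Python. It is a toy in the spirit of force-directed layouts, not the algorithm used for the figures in this lecture or the one implemented in any particular software. The function name `force_layout`, the spacing constant `k`, the step cap, and the example nodes and edges are all assumptions made for illustration.

```python
import math
import random


def force_layout(nodes, edges, iterations=200, seed=0):
    """Toy force-directed layout: all pairs repel, edges attract.

    nodes is a list of labels, edges a list of (a, b) pairs.
    Returns a dict mapping each node to an [x, y] position.
    """
    rng = random.Random(seed)
    pos = {n: [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)] for n in nodes}
    k = 1.0 / math.sqrt(len(nodes))  # rough preferred spacing (an assumption)

    for step in range(iterations):
        move = {n: [0.0, 0.0] for n in nodes}

        # Repulsion: every pair of nodes pushes apart, more strongly when close.
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                dx = pos[a][0] - pos[b][0]
                dy = pos[a][1] - pos[b][1]
                dist = math.hypot(dx, dy) or 1e-9
                push = k * k / dist
                move[a][0] += dx / dist * push
                move[a][1] += dy / dist * push
                move[b][0] -= dx / dist * push
                move[b][1] -= dy / dist * push

        # Attraction: each edge pulls its two endpoints together.
        for a, b in edges:
            dx = pos[a][0] - pos[b][0]
            dy = pos[a][1] - pos[b][1]
            dist = math.hypot(dx, dy) or 1e-9
            pull = dist * dist / k
            move[a][0] -= dx / dist * pull
            move[a][1] -= dy / dist * pull
            move[b][0] += dx / dist * pull
            move[b][1] += dy / dist * pull

        # Cap each node's step; the cap shrinks over time so the layout settles.
        cap = 0.1 * (1.0 - step / iterations)
        for n in nodes:
            length = math.hypot(move[n][0], move[n][1]) or 1e-9
            scale = min(length, cap) / length
            pos[n][0] += move[n][0] * scale
            pos[n][1] += move[n][1] * scale

    return pos


# Hypothetical example: two tightly knit triangles joined by a single tie.
nodes = ["a1", "a2", "a3", "b1", "b2", "b3"]
edges = [("a1", "a2"), ("a2", "a3"), ("a1", "a3"),
         ("b1", "b2"), ("b2", "b3"), ("b1", "b3"),
         ("a3", "b1")]
layout = force_layout(nodes, edges)
```

In a layout like this, the two triangles should end up as two visible clumps, which is the same effect that makes the liberal and conservative blogs separate in the picture.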
So the informal collaboration of getting the job done is reflected in this network.
This next network is my Facebook network, and what I've done here is use an automated community detection algorithm, in addition to an automated layout algorithm, to lay out my Facebook friends. And what the automated community detection algorithm did was say: oh, there seem to be some people in your network who are tied together, so they're connected more to each other than they are to the rest of your network. And indeed, once you're working with an anonymized version of this data, you'll see that the different groups roughly correspond to the different contexts in which I have met people, from school to work to activities outside of school and work. And you can just tell this without really knowing anything about my life. You can look at my network and understand quite a bit of it.
The final network is one of recipe ingredients. We analyzed tens of thousands of recipes to figure out which ingredients go well together, and then we made a network. In fact, we made several networks, and in this one you can see that there are two main communities: one of savory ingredients and one of sweet ingredients. And actually, at the very top there's a smaller community, and you'll play with this data in time. That is the mixed drink community, where you have ingredients such as *** and lime juice.
So what I've shown you so far is that even just visualization can buy you a lot in understanding what the myriad connections that we know are there can represent. They might be rather invisible to us until we represent them as a network where they are all connected together. And this is where we're going to get the nice insights.
I am going to be using maybe somewhat inconsistent terminology. So I may alternate between the words network and graph.
Graph is the terminology from the field of mathematics, where it all started. But I'm more likely to use the word network. For example, the new emerging field now is Network Science. That doesn't make a network any less of a graph; it's just that you can use both terms. Similarly, I'm going to primarily use the words nodes and edges. However, nodes can also be referred to as vertices. If you're talking about a sociological phenomenon, or you're talking to a sociologist, they might use the word actor. Similarly for edges, a sociologist might say ties or relations. A physicist might talk about sites and bonds, although physicists who work on networks tend to say nodes and edges. And finally, in computer science, you might be talking about links, especially if you're talking about networks such as the World Wide Web. So we have a variety of terminology. It's very easy to get used to, and all we're talking about in the end is that you have different entities and the connections between those entities, and that is what we're going to analyze.
Let me get to the goals of this course. In addition to making pretty pictures, we need to really understand what the structure of a network is. So we're going to do some measurements. In these measurements, we're going to look at whether nodes are connected through the network. We're going to look at how far apart they are in the network: how many hops does it take, following these different connections? We're going to look at whether some nodes are more important than others due to their position in the network. And we're going to look at whether there are communities in the network, that is, sets of nodes that are especially densely connected. We are not going to be satisfied just knowing that there's this structure. We want to know where this structure comes from. What kinds of processes shape a network?
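As a rough illustration of the first few measurements just listed (whether nodes are connected, how many hops apart they are, and one very simple notion of importance), here is a minimal Python sketch on a made-up five-node network. The network, the function names, and the use of degree (number of ties) as a stand-in for importance are assumptions for illustration only; other measures of position capture importance in different ways.

```python
from collections import deque

# A small undirected toy network, stored as an adjacency list (hypothetical data).
graph = {
    "A": ["B", "C"],
    "B": ["A", "C"],
    "C": ["A", "B", "D"],
    "D": ["C", "E"],
    "E": ["D"],
}


def hop_distances(graph, source):
    """Breadth-first search: hops from source to every node it can reach."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def is_connected(graph):
    """True if every node can be reached from an arbitrary starting node."""
    start = next(iter(graph))
    return len(hop_distances(graph, start)) == len(graph)


def degree(graph):
    """Number of ties per node, the simplest proxy for importance."""
    return {node: len(neighbors) for node, neighbors in graph.items()}


print(is_connected(graph))        # True
print(hop_distances(graph, "A"))  # {'A': 0, 'B': 1, 'C': 1, 'D': 2, 'E': 3}
print(max(degree(graph), key=degree(graph).get))  # C
```

Here node C has the most ties and also sits between the A-B-C triangle and the D-E tail, which hints at why position in the network matters.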
So we're going to start with randomly generated networks, where you're just throwing edges at random and connecting different nodes. Then we're going to look at preferential attachment, which is a rich-get-richer phenomenon. As new edges are added, they're more likely to be attached to the nodes that are already popular, in the sense that they already have many other edges. We're going to be looking at small world networks as well. There you might have processes such as a friend of a friend being likely to be a friend, because friends tend to introduce their friends to each other. And yet, any two people in the world are connected through a small number of hops. Recently a Facebook study showed that any two people in the Facebook graph are connected by an average of 4.7 hops. We're going to see how certain processes might shape such small world structure. For example, how do small worlds arise out of optimization? Airline networks might be optimized to ferry passengers back and forth in a way that is efficient and doesn't cost the airlines much money. You might also have strategic network formation at the level of the individual, so the individual is getting something out of participating in the network, and they may choose to connect to some nodes and not others.
Okay, so we've described the network structure, and we've figured out where that network structure comes from. The final goal is to understand how that network structure in turn influences different processes occurring on the network. So, for example, we're going to learn how information diffusion is affected by the network structure. If any two people are connected through a small number of hops, does this mean that information will readily diffuse? Sometimes, it's not information that's diffusing but something that we don't want to diffuse, such as a virus.
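To see why a small number of hops matters for spreading, here is a deliberately simplified Python sketch. It assumes a toy process in which, in every round, each node that has been reached passes the item on to all of its neighbors. That is an assumption for illustration, not a model from this lecture, and real diffusion is usually probabilistic. The ring network, the two shortcut ties, and the function names are likewise hypothetical.

```python
def ring(n):
    """Ring network: each node is tied only to its two nearest neighbors."""
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}


def add_shortcut(graph, a, b):
    """Add one undirected long-range tie between nodes a and b."""
    graph[a].add(b)
    graph[b].add(a)


def rounds_to_reach_all(graph, source):
    """Rounds needed until everyone is reached, if every reached node
    passes the item on to all of its neighbors in each round."""
    reached = {source}
    frontier = {source}
    rounds = 0
    while len(reached) < len(graph):
        frontier = {nb for node in frontier for nb in graph[node]} - reached
        if not frontier:
            raise ValueError("network is disconnected; some nodes are never reached")
        reached |= frontier
        rounds += 1
    return rounds


plain = ring(20)
shortcut = ring(20)
add_shortcut(shortcut, 0, 10)
add_shortcut(shortcut, 5, 15)

print(rounds_to_reach_all(plain, 0))     # 10
print(rounds_to_reach_all(shortcut, 0))  # 5
```

Two extra ties are enough to halve the time it takes to reach everyone in this toy setting, which is the kind of structural effect the questions above are about.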
So how does the social network actually influence how quickly a virus is going to spread, and what immunization strategies can you use once you know what the structure of the network is? We are also going to study processes such as opinion formation. This can be a kind of consensus that is reached across the network as individuals continuously update their beliefs, or it may be just a single shot: you form your opinion only once, but it is influenced by what your friends think. We're also going to be looking at coordination and cooperation. If you have a certain task that you'd like to do, but it depends on inputs from the nodes that you're tied to, how quickly can you accomplish the task? And finally, we're going to look at resilience to attack. If for some reason a certain subset of the nodes is removed from the network, can the network still function?
Now, I've filled in the first six weeks, so you might be wondering what we're going to do in weeks seven and eight. In week seven, we are going to look at cool and unusual applications of social network analysis. We are going to be looking at things such as recipe ingredient networks for predicting recipe ratings. We're going to be looking at the social networks of dolphins, and we're also going to look at economic development. So if you have the network of countries and the products that they produce, can you actually make predictions about which products the countries are going to produce in the future and how rapidly those countries are going to develop economically? In the final week, we're going to look at how social network analysis is used by companies such as Google, Facebook, LinkedIn, Twitter, and CouchSurfing to enhance their product offerings. So what kind of social network analysis do they do? What kinds of research have they done? And how has this impacted and benefited the features of the social networks that they enable?
So that's the outline of the course. In the next video I'm going to dive right into it, and we're going to visualize some networks and see what it's all about.