Welcome to Social Network Analysis. My name is Lada Adamic.
I'm an associate professor at the University of Michigan.
I'm affiliated with the School of Information, the Center for the Study of
Complex Systems, and the Computer Science department.
What I'd like to show you in this course is what we can get out of modelling the
world around us as networks. Now, the world is very complex, and when you
represent it as a network, it may not really look any less complex.
But indeed, we can gain very useful insights.
We can start to understand how information diffuses in social networks.
We can also understand how resilient different
infrastructure networks, such as roads or the electrical power grid, are to random
or intentional failures. Here's one example I'd like to start with.
These are hand drawn networks made by the artist Mark Lombardi.
He constructed them by poring over news articles in the 1980s and 1990s, making
connections between political entities and different financial institutions and
corporations. Once they were laid out, you could see
connections that might not otherwise be obvious, just by reading the news articles
one by one. Here is Michael Kimmelman, a columnist for
the New York Times, commenting on having encountered a few folks from the
Department of Homeland Security at an exhibit of Mark Lombardi's art.
They found the work revelatory, not because the financial and political
connections he mapped were new to them, but because Lombardi showed them an
elegant way to array disparate information and make sense of things, which they
thought might be useful to their security efforts.
Now in this class I'm not going to make you do hand drawn network layouts, even
though they're really hard to beat. But Mark Lombardi would spend days drawing
these over and over until they were perfect.
Instead, what we're going to be doing is using automated layout algorithms in
software such as Gephi. Now here is a nice example of how
automated layout algorithms make things very apparent.
All they're doing is placing nodes that are connected through edges close together
and other nodes are repelled. Actually, all nodes experience a repulsion
force so that they're not all clumped together unless the ties bring them
together. This is a data set of political blogs
prior to the 2004 Presidential Election. This shows who follows whom, that is, who has whom
on their blogrolls. The liberals, that is the liberal blogs, are colored
blue. The conservatives are colored red.
Liberal-to-liberal ties are blue, conservative-to-conservative ties are red.
Liberal-to-conservative ties are purple, and conservative-to-liberal ties are orangish-yellow.
And what is apparent right away is that to some extent, there's an echo chamber
effect, where liberals are primarily talking to liberals and conservatives are
primarily talking to conservatives. And all I really had to do was run one
layout algorithm. In this case I did not even have to do any calculations,
although we did do them in our study, for this pattern to be
apparent. So here is another example of a data set I
have gathered. This is an organization of Hewlett-Packard
Labs, so a bunch of researchers. And what we looked at was their
e-mail communication. If two people had exchanged at least a
couple of e-mails back and forth over the period of a few months, they get a grey
edge. Overlaid here are black edges, which
represent the formal organization: who reports to whom.
Now, what is immediately apparent from this visualization and what we confirmed
in the study is that the e-mail communication is more likely to occur
between individuals who are closer together in the organizational hierarchy,
but there are enough shortcuts across the organization that any two individuals are
connected through a short number of hops. The fact that those hops roughly
follow the organizational hierarchy makes the e-mail network navigable.
So informal collaboration of getting the job done is reflected in this network.
This network is my Facebook network, and what I've done here is use an
automated community detection algorithm, in addition to an automated layout
algorithm, to lay out my Facebook friends.
And what the automated community detection algorithm did was say:
there seem to be some groups of people in your network that are more
connected to one another than they are to the rest of your network.
And indeed, once you're working with this data, with an anonymized version of this
data, you'll see that the different groups roughly correspond to the different contexts
in which I have met people, from school to work to activities outside of school and
work. And this you can just tell without
really knowing anything about my life: you can look at my network
and understand quite a bit of it. The final network is one of ingredients,
recipe ingredients. We analyzed tens of thousands of recipes
to figure out which ingredients go well together, and then we made a network.
In fact, we made several networks and in this one, you can see that there are two
main communities. One of savory ingredients, and one of
sweet ingredients. And actually, at the very top, there's a
smaller community, and you'll play with this data in time.
That is the mixed drink community, where you have ingredients such as *** and
lime juice. So, what I've shown you so far is that
even just visualization can buy you a lot in understanding what the myriad
connections that we know are there can represent, connections that might be rather
invisible to us until we represent them as a network where they are all connected
together. And this is where we're going to get the
nice insight. I am going to be using maybe somewhat
inconsistent terminology. So I may alternate between the words
network and graph. Graph is the terminology from the field of mathematics,
where it all started.
But I'm more likely to use the word network.
For example, the newly emerging field is called Network Science. That doesn't make a
network any less of a graph; it's just that you can use both terms.
Similarly, I'm going to primarily use the words nodes and edges.
However, nodes can also be referred to as vertices. If you're talking about a
sociological phenomenon, or you're talking to a sociologist, they might use the word
actor. Similarly for edges, a sociologist might
say ties or relations. A physicist might talk about sites and
bonds, although physicists who work on networks tend to say nodes and edges.
And finally, in computer science, you might be talking about links, especially
if you're talking about networks such as the World Wide Web.
So we have a variety of terminology. It's very easy to get used to,
and all we're talking about in the end is that you have different entities
and the connections between those entities, and that is what we're going to
analyze. Let me get to the goals of this course.
In addition to making pretty pictures, we need to really understand what the
structure of a network is. So we're going to do some measurements.
And in these measurements we're going to look at whether nodes are connected
through the network. We're going to look at how far apart they
are in the network. How many hops does it take following these
different connections? We're going to look at whether some nodes
are more important than others due to their position in the network.
And we're going to look at whether there are communities in the network, that
is, sets of nodes that are especially densely connected.
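As a rough illustration of the "how many hops" measurement, here is a minimal Python sketch that counts hops with a breadth-first search. The tiny network, the node names, and the function name hop_distances are all made up for this example and are not taken from the course materials.

```python
from collections import deque


def hop_distances(adjacency, source):
    """Return the number of hops from source to every node reachable from it."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in distances:
                # First time we reach this neighbor, so this is its shortest hop count.
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    return distances


# Hypothetical undirected network stored as adjacency lists.
toy_network = {
    "ann": ["bob", "cat"],
    "bob": ["ann", "dan"],
    "cat": ["ann"],
    "dan": ["bob"],
    "eve": [],  # isolated node, not connected to anyone
}

distances_from_ann = hop_distances(toy_network, "ann")
print(distances_from_ann)
```

A node that never shows up in the result, such as the isolated one here, is not connected to the starting node at all, which answers the connectivity question at the same time.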
We are not going to be satisfied just knowing that there's this structure, we
want to know where does this structure come from.
What kinds of processes shape a network? So we're going to start with randomly
generated networks, where you're just throwing edges at random and connecting
different nodes. Then we're going to look at preferential
attachment, which is a rich-get-richer phenomenon.
As new edges are added they're more likely to be added to the nodes that are already
popular in a sense. They already have many other edges.
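To make the rich-get-richer idea concrete, here is a minimal Python sketch of one simple way to grow such a network. It assumes each new node brings exactly one edge and picks its target with probability proportional to the target's current number of edges. The function names and that one-edge-per-node simplification are assumptions for illustration, not the exact model used later in the course.

```python
import random


def grow_preferential_network(num_nodes, seed=None):
    """Grow a network one node at a time, attaching each new node to an
    existing node chosen in proportion to how many edges it already has."""
    rng = random.Random(seed)
    edges = [(0, 1)]  # start from two connected nodes
    # Each node appears in this list once per edge it has, so a uniform
    # draw from the list favors nodes that already have many edges.
    endpoints = [0, 1]
    for new_node in range(2, num_nodes):
        target = rng.choice(endpoints)
        edges.append((new_node, target))
        endpoints.extend([new_node, target])
    return edges


def degrees(edges):
    """Count how many edges touch each node."""
    counts = {}
    for a, b in edges:
        counts[a] = counts.get(a, 0) + 1
        counts[b] = counts.get(b, 0) + 1
    return counts


edges = grow_preferential_network(1000, seed=42)
degree_of = degrees(edges)
print("largest number of edges on one node:", max(degree_of.values()))
```

Running this with different seeds and comparing the largest degree with the typical degree gives a feel for how a few early nodes end up with many more edges than the rest.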
We're going to be looking at small world networks as well. There you might have
processes where a friend of a friend is likely to be a friend, because
friends tend to introduce their friends to each other.
And yet, any two people in the world are connected through a short number of hops.
So recently a Facebook study showed that any two people in the Facebook graph are
connected with an average of 4.7 hops. We're going to see how certain processes
might shape such small world structure. For example, how do small worlds arise out
of optimization? Airline networks, for instance, might be
optimized to ferry passengers back and forth in a way that is efficient and
doesn't cost the airlines much money. You might also have strategic network
formation at the level of the individual, so the individual is getting something out
of participating in the network, and so they may choose to connect to some nodes
and not others. Okay, so we've described the
network structure, we've figured out where that network structure comes from.
And the final goal is to understand how that network structure now influences
different processes occurring on the network in turn.
So, for example, we're going to learn how information diffusion is affected by the
network structure. If any two people are connected
through a short number of hops, does this mean that information will readily
diffuse? Sometimes, it's not information that's
diffusing but something that we don't want to diffuse such as a virus.
So how does the social network actually influence how quickly a virus is going to
spread and what immunization strategies can you use once you know what the
structure of the network is? We are also going
to study processes such as opinion formation.
This can be a kind of consensus that is reached across the network as individuals
continuously update their beliefs, or it may be just a single shot:
you form your opinion only once, but it is influenced by what your friends
think. We're also going to be looking at
coordination and cooperation. If you have a certain task that you'd like
to do, but it depends on inputs from the nodes that you're tied to, how
quickly can you accomplish the task? And finally we're going to look at
resilience to attack. If, for some reason, a certain subset of the nodes is
removed from the network, can the network still function?
Now, I filled in the first six weeks, so you might be wondering what we're going
to do in weeks seven and eight. In week seven, we are going to look at
cool and unusual applications of social network analysis.
We are going to be looking at things such as recipe ingredient networks for
predicting recipe ratings. We're going to be looking at the social
networks of dolphins and we're also going to look at economic development.
So if you have the network of countries and the products that they produce, can
you actually make predictions about which products the countries are going to
produce in the future and how rapidly those countries are going to develop
economically? In the final week, we're going to look at
how social network analysis is used by companies such as Google, Facebook,
LinkedIn, Twitter, and CouchSurfing to enhance their product offerings.
So what kind of social network analysis do they do?
What kinds of research have they done? And how has this impacted and benefited
the features of the social networks that they enable?
So that's the outline of the
course.
In the next video I'm going to dive right into it, and we're going to visualize
some networks and see what it's all about.