PHILIP: Thank you everyone for coming.
My name is Philip.
I'm a software engineer here, and I am very happy that Leo
is coming to speak.
So Leo and I were in grad school around the same time.
And we were actually officemates one summer at
another company.
And we just had a really good time that summer.
And Leo has really impressed me by both the breadth of
stuff that he's really interested in, in terms of
research engineering, and also the depth in which he goes
into stuff.
So he's one of these rare individuals that has a lot of
diversity and breadth, and also [INAUDIBLE].
This is one of his several projects that he goes in with
a good amount of depth.
So I'm really excited for this talk.
I hope everyone else is as well.
So go for it.
LEO A. MEYEROVICH: Philip didn't say
he's also like that.
Hi, I'm Leo.
I'm from over in the East Bay, at UC Berkeley.
And this is work I've been doing with Ari Rabkin who was
at Berkeley and now is at Princeton.
And we've been looking at programming language adoption.
I'm going to be talking about two different ways we've been
looking at it.
One is about quantitative analysis, and the other was
looking at what the sociologists might say to us,
and seeing if we can cherry pick theories that apply to
our domain.
So before I get into what that really means, I want to ask
why do we care?
How is this interesting to us?
And so for that, there's a really cool paper by Erik
Meijer, where he found this principle
called the change function.
This is by Pip Coburn, and the sociologists--
they actually call it something called a switching
cost, but neither of them appear to have
known that at the time.
But basically, what it says is if you're looking at some sort
of new technology, it's going to have some benefit.
And when that benefit is greater than the cost and that
pain of going through that adoption process, then it's a
rational choice to go forward.
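The benefit-versus-cost idea can be written as a toy sketch. This is just our paraphrase of the change function, not Coburn's exact formulation, and all of the numbers are invented for illustration:

```python
# A toy reading of the change function described above: adoption is a
# rational choice when the perceived benefit of a new technology
# outweighs the perceived pain (the "switching cost") of adopting it.
def rational_to_adopt(perceived_benefit, perceived_pain):
    return perceived_benefit > perceived_pain

# Haskell in this story: lots of benefit, but lots of adoption pain.
print(rational_to_adopt(perceived_benefit=8.0, perceived_pain=9.0))  # False

# Meijer's move: keep the benefit, drive the pain toward zero.
print(rational_to_adopt(perceived_benefit=8.0, perceived_pain=0.5))  # True
```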
And you might not have known the guy I was talking about--
Erik Meijer--
but you probably know the language that he was one of
the key designers of, which is Haskell.
And when he looked at this change function here, he
realized Haskell has all this functional programming
goodness, but you also have to pick this new language.
And there's a lot of pain involved there.
And what he decided after that was that, from then on, his goal
in life would be to drive that denominator down to zero.
And what he meant here is that instead of doing all this cool
functional programming stuff and designing these new
features for Haskell, he'd actually join the Visual Basic
team, do all this cool new functional programming
research there, but do it in that same language that is
easy for developers to already use and lower
that cost of adoption.
So what I want to talk about today is two different ways
I've been looking at how adoption goes and how we
should think about adoption.
And by we, I mean language designers and language
researchers.
So we've been looking at two ways.
One is doing large scale data analysis.
We've been going out into the wild and seeing how the stuff
actually happens.
And instead of just doing random safari, just looking at
random facts, we've been trying to keep all of our work
informed by how actual sociologists look at adoption
in general and in particular fields that are somewhat
related to languages.
So let's first talk about the fun numbers, and then we'll
get to the models behind it in a bit.
So when I'm talking about adoption, I'm interested in
two things.
One is a language like Haskell.
But also another thing is a feature like functional
programming.
And throughout this talk, I'm going to switch in and out of
those two different types of adoptions.
And we've done a lot of cool quantitative analysis.
Here, I just want to talk about three particular types.
One was we want to look at how people pick domain specific
languages over general purpose languages, which matters when
we're trying to pick what type of language to design.
Another thing is in the small-- when a programmer
makes one small decision, how did they actually make it?
And then finally, as an educator or somebody who wants
people to know languages, we're interested in what
actually influences people's ability to do so and what's
important there.
So let's talk about
domain-specific languages first.
So when I say domain-specific language, I mean something
like Excel, which is good for, maybe,
doing accounting formulas.
And then, there's this spectrum-- that you can make
it more and more general, which is something called
general purpose languages.
And oftentimes, we might say this is something like a
Turing complete language, where you can compute
anything you want.
So maybe there's like a library for
everything in Perl.
You just have to look for [INAUDIBLE]
and find it.
And then there are--
and this is a spectrum.
Maybe there's something in between, like MapReduce here
at Google, where you can plug in whatever function you want
into your map and reduce.
And it will just run it on the cluster.
But this might not be a good way to think about this stuff.
So what you're seeing here is one of those experiences that
scarred me as a youth.
I went to a soap factory to see how they make soap.
And apparently, they run the soap machines with Excel macros.
It's this very domain specific thing, right?
So maybe we don't actually even know what we mean when we
say domain specific.
And so we took a look at about 200,000 projects in
SourceForge and tried to understand what does it mean
to be domain specific.
And for each project, we got out a few pieces of
information.
So here, we're looking at the Squirrel SQL client.
On the bottom left, we're seeing that it's a client.
It's a front end.
So it's the category of front ends.
And then on the bottom right, we're seeing the programming
language is Java.
So you're going to write this type of front end in Java.
And let's see.
What else do people use Java for in SourceForge?
And so the chart you're seeing here is-- on the x-axis is
different categories.
So one of those dots means blogging.
Another one of those dots means you're writing some sort
of search program in Java.
And what the y-axis means is that higher means it's
more popular for that particular category.
So if somebody's going to write a search client-- it's
actually about 40% chance it'll be in Java.
But if they're doing blogging, it'll only be a 10% chance
that it's in Java.
And so this is in just the SourceForge [INAUDIBLE].
If you're one of those 200,000 programmers, this
is specific to that.
What's cool is I actually started life as a Schemer, and
it's fun to look at the adoption--
how niches work there.
And there we see, OK, apparently build tools is
something people use Scheme for--
if you're going to use Scheme, it's going to
be for a build tool.
And if you notice, though, the y-axis is a little different.
It's smaller.
And the really interesting thing here is that
you don't have this nice spread of Scheme across
different categories, right?
A few things pop out.
And so what I did next is--
or Ari and I--
we ordered the languages by popularity.
So Java is there on your left.
And note, that's the popular one.
And then Scheme is on the right.
And maybe it's not unpopular, but underappreciated.
[LAUGHTER]
LEO A. MEYEROVICH: And so here we are seeing just general
popularity.
And then what we added in was the standard deviation across
different categories.
And notice that the y-axis here is on a log plot.
And so what that means is even though they all kind of look
the same size, because the standard deviation is kind of
at a lower point on the bottom right, that means it's much
smaller than the ones on the other side.
And so, for example, if I divide standard deviation by
the average, you actually see--
or maybe another way of looking at it, if you look at
those two slopes, they're changing at a different rate.
And so the interpretation here is basically that as a language
gets more unpopular, it only shows up in certain
niches-- that's what we saw for Scheme versus Java.
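The niche measure just described--standard deviation of a language's per-category share, divided by its average share--can be sketched in a few lines. The category names and numbers below are made up for illustration, not the SourceForge data:

```python
# Sketch of the niche measure: for each language, take its share of
# projects in every category, then divide the standard deviation of
# those shares by the mean share (the coefficient of variation).  A
# high ratio means the language only shows up in a few niches.
from statistics import mean, pstdev

# Share of each category's projects written in the language (invented).
shares = {
    "java":   {"search": 0.40, "blogging": 0.10, "frontends": 0.30, "build": 0.20},
    "scheme": {"search": 0.01, "blogging": 0.00, "frontends": 0.01, "build": 0.08},
}

for lang, by_category in shares.items():
    values = list(by_category.values())
    cv = pstdev(values) / mean(values)  # spread relative to popularity
    print(lang, round(cv, 2))
```

With these toy numbers, the unpopular language ends up with the larger ratio, matching the Scheme-versus-Java observation.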
And so that kind of leads to a certain type of thinking.
For example, when you're going to talk about language
adoption, you're not going to say that it's getting
generally more popular.
It's getting more popular niche by niche.
You're going to see more of those popping up.
All of them are going to go up, but also the cool
phenomenon there is you start seeing this pop.
And I'll get back into that later.
But now, this already leads into a whole new line of
reasoning about how languages work.
So this is all very high level.
So let's actually zoom in a bit.
So we can ask, well, how do programmers
actually pick languages?
And here we see a picture of a bunch of dogs where they say,
no matter what it is, we want it.
I think better of programmers, and so I was curious what they
actually do.
So here again, we're looking at SourceForge, the same
200,000 projects.
And what you're seeing on the right is just a project--
the second most recent project they wrote.
And then what you're seeing on the bottom axis is,
given that project, the likelihood of them picking
some next language.
So given that you use one language for a project, what's
the likelihood that you're going to use some other ones?
So as an example of what's going on here, no matter what
language you picked on that right axis--
on the y-axis--
you see these vertical strips that good chances are, you're
going to pick any one of those six languages.
So the probability is independent of what
language you use.
You're going to probably use one of those six bars.
And so that means programmers are creatures of habit.
You're going to use a popular language.
What's also cool, if you notice, is there's another
phenomenon going on here, which is we have this very
strong diagonal in this matrix.
And what that means is you're going to
use that same language.
You use one language, there's a good chance the next
language you use is going to be the same exact language.
So now, it's more strongly instilling that programmers
are creatures of habit in two different ways.
And this characterizes most of the projects in SourceForge.
Programmers use the same language, either that's a
popular one or that they used before.
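The habit pattern just described can be sketched as a transition matrix: estimate P(next language | previous language) from project histories and look for the strong diagonal. The project histories below are invented for illustration:

```python
# Minimal sketch of the transition matrix: from each programmer's
# project history, count how often a project in one language is
# followed by a project in another, then normalize each row into
# P(next language | previous language).
from collections import Counter, defaultdict

histories = [
    ["java", "java", "java"],
    ["python", "python", "java"],
    ["c", "c"],
    ["php", "php", "python"],
]

counts = defaultdict(Counter)
for projects in histories:
    for prev, nxt in zip(projects, projects[1:]):
        counts[prev][nxt] += 1

matrix = {
    prev: {nxt: n / sum(row.values()) for nxt, n in row.items()}
    for prev, row in counts.items()
}

# The diagonal entries capture "creatures of habit": staying with the
# same language from one project to the next.
print(matrix["java"]["java"])
```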
So this led us to a question of why.
I'll explain this in a second.
But why are they a creature of habit?
What actually led to those decisions?
Is it just they like popular things, or is there something
else going on?
And so we launched a bunch of visualizations of this stuff.
And we realized that, oh, this might be a nice opportunity to
get our work in the eyes of lots of other people.
In this case, you see we got Slashdotted.
And as soon as we put up those visualizations, we were
talking to the press and it was all really fun.
But what was really going on here was something different.
It was much more insidious.
We wanted to run a survey.
And so the reason we did this viral campaign is we wanted to
see if we could ask programmers how they actually picked
the language for their most recent project.
And so in about a period of two days after this campaign,
we had about 1,600 responses from people on websites like
Slashdot and Wired.
And so what we're seeing on this graph is what they said.
So this is a little noisy graph.
But basically, what we're seeing here on the bottom axis
is different types of reasons people picked their language for
their last project.
The very strong influences are those bars
under the green arrows.
So for example, on the leftmost--
actually, open source libraries were from a strong
to medium influence--
so everything above that horizontal black bar.
And then also something like group legacy which is sort of
what we're seeing in the SourceForge case.
The group was already writing code in this language, so the
next project will use that same language.
And what was really cool was that as we go through all
these green bars-- for example, personal
familiarity, team familiarity, open source libraries-- these
are all about the social properties of a language.
This is how other people use the language.
It's not just about how fast the language is intrinsically.
And as a languages designer, then I started asking, OK,
well, what about the others?
What had slight influence on what languages were picked?
So on the leftmost, you see something like correctness.
That would be like type safety or something-- one of those
very common properties.
And more dear to my heart is something like developer speed,
or productivity-- the inherent productivity of a language.
That, actually, was not a strong influence for picking a
language, when you actually get down to
the concrete decision.
So the programmer is an interesting social beast.
So this is just me asking all the
programmers and showing that.
So the question is what happens when we start picking
what programmer we look at.
And so if social properties are very important, we can
start asking what happens when we look at different types of
programmers and from different types of social organizations.
And so what this chart is showing is those same axes on
the bottom, those same properties.
But now, I'm slicing the programmer data based on what
size company does that programmer work at.
And so the leftmost--
that dark blue-- would be you're working for yourself.
It's a one man team.
And the rightmost would be you're working for an
organization with 500 programmers or more.
And just as a quick surface representation, I asked what's
the slope of the curve across these.
So for example, if we look at correctness on the left, I
wrote a green plus.
And that means that the bigger the organization--
except for one of those bars--
basically means that the bigger the organization, the
more that correctness is a concern.
Well, if we look at something like open source libraries--
the first red minus--
that's a negative slope.
And what you see is, the bigger the organization, the
less that open source matters.
And presumably, they're building their own software.
And these are actually significant changes in
adoption habits, because this is essentially going from
medium influence to a slight influence.
This is a full one point drop.
So in summary, larger companies care more about the
social properties of their language, or how
the language is used.
So the size of the company is just one way of bucketing
programmers.
And so there are lots of different types of programmers
out there and lots of different ways of looking.
And so one thing we've been looking at, actually, was
partially inspired by Philip.
It was actually education.
And so we were wondering, how does education and age and
things like that play into the languages you know?
So as one example, we did another survey, this time on
something called a MOOC, a massive open online course.
And essentially, you can think of it as a programmer who's in
the workforce but wants to take an online course.
So this is an educated programmer-- somebody
interested in languages.
And we were able to get about 1,000 to 2,000 programmers
this way and ask them about how they learned languages.
And one interesting phenomenon we found is that the number
of languages a programmer knows
stagnates once you're 20 or older.
You'll say that, oh, I know--
those red bars we're seeing, the x-axis is age.
So as we go from left to right, the red bar is kind of
constant, right?
We have this nice line in the middle.
The number of languages you've used actually stays kind of
constant, independent of your age.
And the number of languages you know well
is the green bars.
And that also stays fairly invariant.
And again, these are educated people who are working
programmers.
So if we are going to do something about language
education, it sounds like what's going on after you're
in the workforce today is rather stagnant.
So we took a look at what happens in school, like whether what
we do there matters at all.
And so what we asked is if you look at the labels on the left
column, there we asked them for different categories of
programming, like functional languages such as Lisp and
Scheme, or dynamic languages like Perl and Python, or maybe
specialized systems like Assembly or Matlab and
Mathematica.
For each one of those categories, we wanted to see
how what you did in school influenced what languages you
know today.
And there, the first thing we looked at is whether you're a
CS major or not.
Is CS education actually doing anything at large?
And there what we found is that when you ask CS majors
versus non-CS majors about the languages they know in these
different categories, that actually
doesn't really matter.
For example, non-majors will know about
functional programming 19% of the time, and CS majors will
know 24% of the time.
So there's a small 5% jump.
So as somebody interested in education,
this is a small jump.
However, if we look at whether a particular language was
taught in an individual course, like in one of these
families, now we see significant changes in the
statistics.
For example, for functional programming-- that top row--
if you were taught one of these languages, you'll say
that you know it 40% of the time.
But if you weren't taught it, you'll only have picked it up
after school about 15% of the time.
So the takeaway here is that, if you don't know anything else
about a person, it actually doesn't matter to a large extent
whether they were a CS major or not.
But if you ask what was taught in those particular courses,
now that becomes significant.
And when you define a CS curriculum, the languages we
teach actually matters, because otherwise, apparently,
people won't learn them.
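The taught-versus-not-taught comparison above is a pair of conditional probabilities. Here is a tiny sketch with made-up survey rows, shaped to roughly match the proportions quoted in the talk (about 40% versus 15%):

```python
# Made-up survey responses: each pair is
# (was taught a functional language in a course,
#  says they know one today).
responses = (
    [(True, True)] * 2 + [(True, False)] * 3       # taught: 2 of 5 know FP
    + [(False, True)] * 3 + [(False, False)] * 17  # not taught: 3 of 20 know FP
)

def p_knows(taught):
    """P(knows a functional language | was/wasn't taught one)."""
    group = [knows for t, knows in responses if t is taught]
    return sum(group) / len(group)

print(p_knows(True))   # 0.4  -- taught in a course
print(p_knows(False))  # 0.15 -- picked it up on their own
```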
So I actually have lots and lots of statistics.
If you go to my website--
the URL is here--
we actually have a few interactive visualizations,
and this is the thing that launched our viral campaign.
Also, our raw data is up.
If anybody knows statistics much better than me, I would
be curious to see what you have to say.
And with that, I wanted to move on to the social
principles, the more theoretical stuff.
But before then, I think this might be a good stopping point
if anybody wants to ask questions about this more
quantitative analysis.
AUDIENCE: The previous slide here, [INAUDIBLE]
LEO A. MEYEROVICH: Yes.
AUDIENCE: [INAUDIBLE]?
LEO A. MEYEROVICH: So we have that data, but I suggest you
come up to me after and I can pull it up.
This is just a snippet.
Also, I think for those, people use those in practice,
so I don't think it's as surprising.
Mark?
AUDIENCE: Well, the number of languages versus age--
there was never an upward slope, even at the beginning
of the graph.
So that doesn't seem to support the hypothesis that
people tend to learn languages before they [INAUDIBLE].
LEO A. MEYEROVICH: Yes.
AUDIENCE: It seems like as soon as they're qualified to
be on the graph at all that they're already [INAUDIBLE].
LEO A. MEYEROVICH: Yeah.
So our first guess here was that we had some
sample bias going on.
And actually, the first time we did it was actually for the
Slashdot survey.
So we figured we were just asking a bunch of nerds about
nerdy things.
And that's actually why we went back to the online
course, because there, we had much wider demographics.
And so here, I think this really is what's going on.
But maybe what's going on is somehow--
for example, how we asked the question, that maybe somebody
isn't remembering languages.
But one of the things we did was actually give reminders
of different types of languages.
And then we tried different ways of asking the question,
like can you enumerate your answers and can you actually
just give a number?
In both cases, we kind of saw the same stagnant--
so this seems close.
AUDIENCE: Could this be because companies pay
employees to [INAUDIBLE]?
LEO A. MEYEROVICH: Yeah.
So maybe a good thing to do would be to do some more of
the demographic slicing.
For example, we had a lot of international students in this
particular survey.
So that would be interesting to control for.
AUDIENCE: Also, you're assuming that a snapshot of
ages at a given point in time is a good indicator of what
would happen if you followed the programmer over the ages
of the same programmer.
But it might be that the conditions for being 20 now
are different than the conditions for
being 20 twenty years ago.
LEO A. MEYEROVICH: Yeah.
So the computing industry could change, and that will--
so maybe these are statistics of the day.
These are cross sectional, not longitudinal.
AUDIENCE: Yeah.
LEO A. MEYEROVICH: I've actually found indicators that
that's not the case.
But I think we should talk offline about that.
But basically, essentially, as soon as personal computing
happened, then ages became invariant.
But we can talk about that offline.
I think in the back.
AUDIENCE: Are there any [? indications ?] that the set
of languages that a 20-year-old programmer knows
well [INAUDIBLE]
know well [INAUDIBLE]?
LEO A. MEYEROVICH: So this is cross sectional, meaning we
took it at a snapshot in time.
So I really don't know.
But what I will say is that we actually looked at different
age groups to see what languages they know.
Then for example, if you're in college and you haven't left
yet, this is the time to learn Ruby.
You're a Ruby programmer.
But if you've already left, you've missed the curve, and
you probably won't know Ruby, even if you're
just two years older.
So there definitely are cool age-specific phenomena.
John?
AUDIENCE: [INAUDIBLE].
So you would predict that the age or [INAUDIBLE]
programmers [INAUDIBLE]?
LEO A. MEYEROVICH: Yeah.
So the question is are languages somehow
generational?
When Java came out, you have the Java generation
programmers.
That I don't know.
That's kind of what the Ruby comment is, that there are
these generational blips, especially for the less
popular languages.
Something like Java's a little tricky, where it's an old
language, relatively speaking, yet it's still the number one
language for a lot of statistics.
That's a good question.
Unless there's like another really burning question, I do
want to move on to the really, really crazy stuff.
OK.
So now we have a bit of an idea of the numbers.
But we weren't actually doing a random number hunt.
A lot of this was informed by ways that we saw that
sociologists might think about these things.
And it was opportunistic, but there was some picking here.
And so the other side of our research-- we've started
thinking about adoption in a little bit of a different way,
where we realized that it's not just about getting our
language out there.
We have these longstanding challenges and questions in
programming language research, and so the question is, by
looking at them differently in terms of social
theories, could we have alternate explanations for what's
going on, or in some cases, explanations
for the first time?
And I want to talk about three particular cases.
The first one is I think what a lot of people would expect
me to talk about.
And it's very practical, which is, if you are building a
language or any tool, how do we market it so
people will adopt it?
And so there's something called diffusion of
innovation.
That gives us a nice recipe.
The next thing--
that was a bit of a narrow minded view of
how adoption works.
For example, one thing I've come to appreciate through
theories like reinvention is that, basically, the
more people use your system, the
better your system becomes.
And I'll talk about one particular case of how we
might try to harness that.
And then finally, I'm going to argue--
pulling, again, on arguments from sociology--
that we shouldn't just be looking at the technical side
of what's inside your language, that basically, a
lot of the knowledge about what it means to be a language
is actually coming from how people use it.
And when I say people, I mean groups of people.
So let's kick this off.
So I'm going to say something really bizarre here.
So to a sociologist, if I told them, yeah, I think safe sex
is like this thing called a type system, I don't think
they'd laugh me out of the room.
And before we get to safe sex let's talk about corn.
That's a little easier.
And so basically, what happened in 1943 was near the
beginning of this revolution in sociology.
And what happened in one particular case study was this
guy named Ryan, who went out from farm to farm.
And he looked at how people were adopting corn.
And the reason he was doing this is--
the corn you're seeing here is genetically modified, more
in the tame 1940s sense, not the way we're doing it today.
And the reason they were doing this is because they were
concerned about things like, say, world hunger.
And what Ryan found was that it took about 12 years from a
farmer hearing about corn that would increase his yield to the
farmer actually using it on his farm.
And all the farmers eventually did this.
So the question is why did it take 12 years?
So over the next about 20 years, there are a lot of
other studies, some about corn, other things about how
children buy toys and how companies buy microscopes.
About 500 quantitative studies later, somebody named Everett
Rogers came into the picture and said, wait, there are lots
of patterns going on.
I could build a model of how adoption works.
And then for about the next 50 years, this became a really
big, widely studied field, cited more than any computer science
paper I know of.
So I was impressed.
So this is a very general thing, so I think you guys
will enjoy hearing about this.
So how adoption works, Everett Rogers found, is that it
goes through a pipeline.
The first step-- somebody has to hear about your innovation.
In the case of corn, that's very easy.
A door to door salesman went around trying to sell corn.
But that doesn't really work.
Salesmen aren't very compelling.
And so what happened is it progressed to the next step,
which is something called persuasion.
And this is more of an information seeking thing.
You can imagine going through your social network, talk to
your friends about it, go onto the internet, and
try to learn more.
And this is more of an active process.
And eventually, you're going to go through this coin flip.
You're going to have to make a decision.
And this is a very short process, because eventually
you're like yes, I'm going to do this.
However, you're not just going to say yes, I'm going to do
this, and replant your entire farm.
You're actually going to go through a trial period.
So here, I'm showing that you'll maybe plant one piece
of corn, see how that works.
If it works for you, maybe you'll try to deploy it on
your entire farm.
If you're in Alaska, obviously, this won't work.
So it has to be compatible with how things worked.
And so this is basically what Rogers found for all these
innovations.
He was actually very easily able to characterize the
processes they went through.
Now, at the same time that a person is going through this
adoption process, there are catalysts that
influence how it goes.
So for example, is the innovation simple?
Oh, this is magic corn that's resistant to bugs.
It's bred to be resistant from bugs.
I understand what that does, and well, wait, that there's a
relative advantage to this.
So because it's resistant to the bugs, no bugs will eat my
corn and I can sell more corn.
And at the same time, you might not believe that this is
actually the case, like this is a salesman trying to sell
you something.
So you actually want to try it out and see that, maybe, for
example, does my soil type work with this type of corn,
and see if it really works.
And then for the final step--
I just described the final step of compatibility.
Does this innovation actually work for
your particular domain?
Now the reason I wanted to walk through this, first of
all, I think this is one of the most important
things in this talk.
So I didn't even come up with it.
But it's still really cool.
But the thing is now, we can start analyzing innovations
and technologies using those processes and catalysts.
So for example, let's look at safe sex and how that relates
to type systems.
So if we look at the process--
actually, let me backpedal a bit.
So the reason I'm talking about safe sex--
I actually mean a very particular time period, about
early '90s safe sex, which is essentially when the
whole AIDS epidemic was a really serious concern.
And so if we look at, in terms of the process of why safe sex
was becoming an issue of the time, most people knew that
safe sex would prevent problems.
And they might actually talk to their friends to say yes,
do you really believe this is going on?
But somehow somewhere between decision or trying to see if
it works for them, going forward, it fell down.
And the interesting thing is we look at type systems--
most people know about type systems.
A lot of people might have even tried to read up a blog
entry or something about it.
But somehow, it falls down.
And so if we take a look at the catalyst, it actually
seems very similar as well of what the challenges are.
So the relative advantage of both safe sex and type
systems-- oh, we prevent bad things.
You will not die.
Your space shuttle might not explode.
That sounds really good.
And it's pretty simple how they both work.
I don't think I need to belabor the point at somewhere
like Google.
I'm not even talking about crazy type systems, just
simple things.
However, they also fell down with something like
observability.
Did it actually prevent the problem it was
advertised to prevent?
I don't know.
If something didn't happen, why didn't it happen?
It's hard to see causality, right?
And at the same time, through trialability--
you can't just use a type system in isolation.
You have to work with your co-worker.
Same thing for safe sex.
It takes two to tango.
And so as you go through these things, you see that this
actually is the same type of technology
at its essence.
And so the question is, can we learn from the kind of studies
done in other domains for similar things?
So apparently, in two weekends, you can
save a lot of lives.
So I'm going to teach you how to do that.
So some sociologists took a look at how could
we spread safe sex?
How do we get this innovation into people's
hands and make it active?
And so weekend number one, they hung out at a bunch of
gay bars in three different cities.
And they isolated local opinion
leaders in the community.
They said, here are the people that people at the bars
apparently listen to.
What that means, I don't know, but apparently--
why that is--
these are the people of interest.
And once they found those opinion leaders, they came
back the next weekend and said, come to our workshop.
Let us teach you about what this thing is and let us teach
you how to explain it to other people.
And then finally, a cute thing they did, also, was they gave
them a little token.
So they actually gave them a badge with a traffic light.
So basically what happened is somebody would say, oh, why do
you have a traffic light on your lapel?
And he says, well, let me tell you about safe sex.
And that was actually very simple, but it got the
conversation rolling.
I could claim that it worked well, but I could also just
show you that it worked well.
And I don't want to get into the methodology of how they
measured this.
But it's actually much more rigorous than what we normally
see in computer science.
But essentially, what comes out is that when they came
back three months later and also three years later, they
asked, all right, if you had sex maybe several times or
whatever, did you have safe sex?
And more people would say, yes, I did after the
interventions.
And likewise, they asked, did you have unsafe sex, which
is the thing that we're really interested in.
And that's where we see it went down.
So we have this nice gap.
So this was, I think, a very successful case study.
So now the question is, can we go back to languages and see
does the diffusion process mesh with
how things work here?
So we actually have some cool success stories in the
language community.
And I think I could describe them in terms of this
diffusion process.
So that's two very simple examples.
One is observability.
Do you see a benefit of this technology?
So type systems are supposed to be a program analysis that
finds bugs.
We can't even get people to do this for free, right?
But that's the Haskell world.
Now, there's a Stanford startup called Coverity that runs
program analysis.
Again, it's sort of in a sense similar technology, but in
this case, not only do they get people to use it, they get
people to pay them to use it.
And basically what's going on, the analysis runs and then you
see that this long standing bug or this really scary
looking bug actually gets characterized.
So that's an observable result of your tool.
That's something you want.
So the question is, can we do this for type systems?
How would you do that?
Can you give some accountability?
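To make the observability contrast concrete, here's a minimal sketch (my own illustration, not from the talk) of the kind of bug a type checker surfaces before the program ever runs, using Python's optional annotations:

```python
def average(values: list[float]) -> float:
    # The annotations document intent; a checker like mypy
    # verifies annotated call sites before the code runs.
    return sum(values) / len(values)

# Statically, mypy rejects the bad call below with an
# incompatible-argument-type error. Dynamically, the same
# mistake only surfaces as a crash deep inside the function:
try:
    average("1234")  # type: ignore[arg-type]
except TypeError:
    print("caught only at runtime")
```

The observable payoff is the same as Coverity's: the tool points at a concrete, scary-looking bug before anyone hits it.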
Another case here-- and actually, probably really big
at Google--
is something like relative advantage.
If I use this tool, what do I expect to see in terms of a
substantial change to myself or to my organization?
So something like Hadoop or EC2, all of a sudden, oh, I'm
going to scale out and be able to handle more
users, things like that.
And so this targets a particular need.
And it's a very quantifiable need.
And so relative advantage is something you should be
thinking about when you're trying to design your
technology.
You want people to actually use it.
So I can go through all the other catalysts and all the
other processes and do the same thing for other popular
technologies of the day.
I suggest you think about it.
And then if you really are stumped, then maybe look at
the slides, but I think you should just think about it.
I just described very technical solutions.
What's really cool to me about the safe sex advocacy was that
it was very non-technical.
They just went to a bar for a weekend.
We actually have that in the computer world.
For example--
this, luckily, is no longer true.
We dropped the URLs now.
But you actually could look at a website and see what
technology they use, and this is a ringing endorsement for
your technology.
People know about it.
They get persuaded.
So I think a lot of simple solutions will work.
You have to think about them and know how
to think about them.
This was about how to get people to use your
technologies and ways of thinking about it.
As a language designer or as an academic, that sounds fun,
but that's not the only thing I'm interested in.
I just want to make better and cooler things.
And so I'm going to argue that adoption lets you make better
and cooler things.
So I'm going to look at one particular case.
It's something called reinvention.
And I'm going to look at something in the context of
something called the living will law.
This is a very politically charged topic, so instead of
getting into that, what I do want to say is that both
Republicans and Democrats in the US think this is an
important thing.
And so both do lots of legislation on this.
And there are a lot of points where they agree that,
essentially, you need to have laws about what happens when
somebody's in a coma.
What are their legal rights?
So as a Californian, I was very excited.
California wrote one of the first living will laws.
That's good.
And then very soon after, Nevada wrote
another living will law.
And they expressly said, the goal of this legislation is to
become in accordance with recent California legislation,
a living will law.
So they had no innovation in this law.
It was the same thing.
It was just copy and paste.
But the really surprising thing is about 10 years later,
Arkansas, which is not liberal hippy California, made its
own living will legislation.
And what's more, it was actually better than the
legislation that California had in 1976.
So the question is, what happens?
And what's cool is this isn't a singular event.
If you look at something like school policy--
or maybe if you don't like welfare, but you want people
to go off welfare and get jobs, that's something called
welfare reform.
That is actually the same curve that happened for how,
as policy spread throughout the country, it got better.
AUDIENCE: What is your vertical axis?
How are you measuring [INAUDIBLE]?
LEO A. MEYEROVICH: It's very law-specific.
In the case of living will law, the question is does it handle
more scenarios that come up, and independent of whether you
can or cannot do something, how easy is it to exercise?
So think of this like a flexible type system.
Maybe it does what you want or not, but it's flexible.
It gets you there really quickly.
I'm going to stop making fun of type systems.
I've written a paper about them, so don't think I'm
against them.
So what's really going on here is two very cool phenomena
that I think we should be looking at.
One is something called social learning, where basically this
is Arkansas looking at somebody who did a full
deployment of the idea before.
And you can see how that worked for them.
Can we copy the good things, change and fix the bad?
And related to that is something called adaptation,
where you're going to be in a different context than the
person using the innovation before.
So in this case, for example, if the legislation was made
for an urban area, because you're in a rural area, you're
going to need different legislation.
And so then you're going to have to adapt it
based on your context.
So both of these phenomena--
you can learn from them and see how people
work in those scenarios.
This is very hard to do in a lab room environment.
So if you're designing a language, I think it's a very
good question--
how can we harness this reinvention of the community
to be part of the language of design process
or the feature process?
And more in general, if you're making technology, you could
pretend you can invent it all, but I'm going to say history
is against you.
So to be very concrete about how this shows up-- as me, Leo the
language designer, I'll come up with an idea.
I'll maybe prototype something three to nine months.
Maybe I'll send it out for three to six months to people
for feedback, to see if they like
this language feature.
Then maybe I'll have to iterate again.
And so now we're entering this year long period.
And then heaven forbid, I decide to publish a paper
about this thing.
That's an extra year long lead I'm working on this feature.
This is where we are in the language
design community today.
Even an industrial setting--
it's a little faster, but it's not significantly faster.
And so the question is how do we streamline language
evolution so you can involve the community to actually
improve your technology?
So this is a thought experiment.
I want to be clear that I'm not doing this, but this pulls
onto a lot of ideas that people are doing.
So again, let's say we start at the top with an idea.
Now, then you're going to want to design your language
feature, but you're not really interested in all the heavy,
gunky, low level details.
So notions like language as a library, where you can do an
interpreter level or very high level implementation of a
feature, that actually streamlines the process.
And then from there, you can go and get it out into the
community right away.
However, today you can put it on a blog, but who's
going to use it?
So the question is if you're working with a language
community, how do you actually engage with
the language community?
And so for example, we have Mechanical Turk, which will
give you people who work at a call center.
That's not very interesting.
But you can ask is there a Mechanical Turk equivalent for
trying out language features?
And as far as I can tell, there is not.
So if you want to build one, please let me know about it.
But that's not enough.
Then you have to get the data back from these experiments if
you want to learn how the community works.
And so there, you need to get analytics from your compiler.
Maybe, as you saw in the beginning of the talk, you
have to survey people to see what's going on.
Not everything is just in the numbers.
There's explanations for the numbers.
And then finally, you realize that we're iterating.
And so you want to put this all into a central place, save
that data for somebody else, fork, and move
on to the next pipeline.
And so I'm not saying this is necessarily the way to do it,
but hopefully, by the principle of social learning, you
appreciate, oh, there is this untapped resource that we
don't know how to take advantage of today, but that works
for a lot of other people.
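As a toy illustration of the "language as a library" step (my example, not something from the talk): a hypothetical pipeline-style composition feature can be prototyped as an ordinary Python class, so a community can try it out without anyone touching a compiler:

```python
class Pipe:
    # Prototype of a hypothetical 'pipeline' language feature,
    # shipped as a plain library instead of a compiler change.
    def __init__(self, value):
        self.value = value

    def __or__(self, fn):
        # x | f applies f to the wrapped value, so chains of
        # transformations read left to right.
        return Pipe(fn(self.value))

result = (Pipe(3) | (lambda x: x + 1) | (lambda x: x * 10)).value
print(result)  # 40
```

If the feature catches on, usage data from a library like this is exactly the kind of feedback the iteration loop above wants, long before the feature is frozen into a language.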
Mark is making eyes at me.
AUDIENCE: Yeah.
Back up just a little.
So many languages have active communities that seem to me to
already be providing the expert social feedback function
you're describing.
[INAUDIBLE] the Python community
has the PEP process.
When I was [INAUDIBLE], there was this very active
[INAUDIBLE] list, and we were constantly discussing language
features today.
ECMAScript has a very active community.
How do these communities, discussing and proposing
prototypes of features and providing feedback, iterating
fairly quickly, differ from those [INAUDIBLE]?
LEO A. MEYEROVICH: That's a very excellent question, which
is basically we already have involved communities.
Do we want to involve them in a different way?
Is this the type of the involvement I'm talking about?
And so for there, I actually want to draw on an example.
I was talking to one of the Scala developers, and in
particular, a concern that they've been having, which is
Scala is this language that's going under rapid
evolution as we speak.
What basically happens is somebody like Martin Odersky
will have an idea, and they'll put out a patch or talk about
it on a mailing list.
But what they realized pretty quickly on was that
essentially, it's an echo chamber--
that you're talking to this very special demographic, and
that this does not necessarily relate-- it's unclear how this
demographic relates to the demographic you care about.
And so when I'm talking about social learning, I'm really
saying get this out to people, the community or a
representative of the community who really are
working there.
And so I think ECMAScript is doing a really good job of,
for example, getting high level developers at Google and
Microsoft to say what they want in the language.
But unfortunately, those aren't the only people using
the language.
And I say those are the minority.
And so I don't actually have a solution for you.
But I do have the problem for you.
Or I have a proposal for a solution.
This is totally untested.
So yeah, that's a great question.
OK.
So now, I want to go to a super high level.
I talked about different ways of using adoption for some
particular tasks.
Now, I'm going to argue that it shouldn't just be Leo
coming here and talking to you about it.
I think other people should be looking at it.
And I'm going to give two examples of why I found this
interesting.
So here, we have something called an ecological theory.
And somebody named Mark, about, I think 15 years ago,
made an interesting observation that discussing
music with your friends is fun.
And so what led from that is well, does this somehow drive
how we pick music and how music genres emerge?
And what he observed is that individuals have time
constraints.
You can't just listen to all the music or talk to everybody
about your particular music and seek them out.
And what he realized is that somehow, music moves along
demographic lines in a pretty interesting way.
And so what you're seeing on that chart on the left--
on the x-axis is people's age, and on the y-axis, people's
education level.
For example, if you look at person A on the top, they're
somewhere in that middle age group, but
they're very, very educated.
And what's cool is you can actually start drawing niches
around which people listen to what music.
So even if you don't know anything about person A,
there's a good chance, according to this graph, they
like new age music.
If you look at somebody older, person B on the right, they're
part of a much bigger demographic, a big age
demographic.
And country music--
I was surprised by this-- is actually very popular in the
US, even before the recent bluegrass stuff.
And part of the appeal, according to this reasoning,
is that a lot of people could talk to each other about
country music and know what they're talking about.
And so the realization behind this chart I'm showing
you here is that when some sort of innovation, whether
music, technology, whatever, is competing out there in the
market, it's actually not competing for individuals, but
competing for social networks.
So the case of country music has won the demographic of
people who are older, while heavy metal has this nice
niche of younger people.
And so the interesting thing is if I go back to that earlier
graphic I showed in the beginning of how languages
spread throughout SourceForge, I said that DSLs are going
niche by niche.
I'd make a much stronger claim-- that according, at
least, to the ecological theory, if you're a language
designer, you're not just targeting the technology
constraints of the niche-- more generally, you're a
community builder.
You might not necessarily have even fixed any technical
problems in that niche.
According to ecological theory, you have just somehow
spread into that particular social network of the domain.
So when I say domain specific language, I mean community
specific building or something like that.
So this changes exactly how we evaluate
or understand languages.
And so now, I want to finish on one last example.
I grew up in New England, and we liked to
make snowmen there.
And I don't know if you ever made one, but
you roll the snow.
Every time you roll the ball, it gets bigger and bigger.
And technology is like that.
In particular, let's say you make one roll and you add in
some technology on top of the existing technology.
What this is going to do is enable new types of social
interactions.
For example, we added a Twitter wall or something or a
Facebook wall.
Now, all of a sudden, people standing in line at the market
can tweet on it.
And that means they have new social interactions driven by
the technology.
But at the same time, if we turn the snowball again, now
we're going to have, based on those social interactions, new
types of technology emerging.
And for there, I might say, for example, before we had
Twitter, we had Facebook.
And because people were using Facebook, then we were able to
advance to Twitter.
And the relevance here is that if you want to talk about
designing a language, building some new technology, that's
only half of the ball roll.
The other half of the ball roll is understanding how
people are using it.
And that tells you how the next iteration of the
technology works.
And this is good news, bad news.
Good news is that we have this understanding where I claim
there's this relationship.
The bad news is twofold.
One is this relationship is a moving target, because they're
co-dependent.
It's human specific.
It's society specific.
And the really bad news here is I don't think sociologists
are going to do the work of understanding the other side
of the ball roll for us.
So if you really want to understand this, I think I
have to keep doing this.
Other people have to keep doing this.
And we have to be a little more, I would claim,
scientific about it-- or at least start looking at it somehow.
So in conclusion, I showed you two things.
I don't think this is a conclusion.
I think this is the start of a lot of cool stuff.
The first thing is I think we are in a data
era of language research.
It's not just software engineering or analytics.
I think it's how we get data to understand
how languages work.
It'll hopefully be not as much of an art going forward.
And the second thing is, I think when I talk about
principles of programming languages or language
foundations, my argument here is that social theories are
one of the big foundations: for a lot of the things we
care about in languages, the social theories actually
inform whether they work or not and how they should work,
and they give us explanations.
So with that, I'm going to say, if you thought this was
cool, go to my website.
I have data, papers, and an email link probably somewhere
there if you want to talk more.
John?
AUDIENCE: [INAUDIBLE].
LEO A. MEYEROVICH: OK.
Then let's hearken back to the safe sex example.
Apparently, people on their own aren't going to get pushed
to do safe sex.
But somehow, if you hijack the social process and interject
on it, you can change it.
So there's really bad news here, which is one of the
earlier results from the modern school of sociology was
that a lot of these social processes are not
automatically self-sustaining, that even if you do an
intervention--
to design an intervention that keeps working is very hard.
So good news, bad news.
Good news is interventions work.
The bad news is it's hard to do them.
One last thing here.
I was very impressed by the age invariance results,
because what it told me is that older programmers have a
very long shelf life.
A 60-year-old knew all the popular languages.
Statistically, it's fine.
Maybe there's some sample bias here, but I thought that was
a very promising thing.
Hi.
AUDIENCE: On that age thing, how much have you considered
the fact that people shift the set of
languages that they know?
When I was 15, I knew Basic very well.
But if someone asked me today, I probably wouldn't
[INAUDIBLE]
Basic at all.
LEO A. MEYEROVICH: Yes.
So the question is how do we distinguish languages we used
before from languages that we are using actively today?
There, we had two different questions about that.
And we put them in a particular order to help.
The first question was what languages have you ever used?
And even if before-- and we're careful with the phrasing to
get at that.
And then the second question is what languages do
you know well now?
So I'm totally with you on that.
I think there's still problems with the phrasing.
Mark and John--
AUDIENCE: [INAUDIBLE]
LEO A. MEYEROVICH: Yeah.
So I think that's a very good observation, which is that
basically, languages may not necessarily be an entirely
technical innovation, that there's still some perceptions
and beliefs involved.
And I found two very compelling areas of research
that helped align my thinking there.
The first one was historical linguistics, which is asking
questions like, well, why was Italian so
popular after the Romans?
And it's not because Italian is a better language, but it's
because if you worked in the Roman army and learned the
language, you'd become a citizen.
So that was an issue of prestige and other things.
The other community which is a smaller body of research, but
also very interesting, is something called the economics
of religion, where you get a statistician to ask how a
religion works.
And there, you get funny results like, for example, if
you don't care if your language is adopted in wide
scale, but you are interested in if people keep using it--
for example, you could do horrible experiments on them,
which actually, I think, is a great model for academia--
you actually could be a very strict or very polarizing
religion or language.
And if you make it hard for people to get in, but once
they're in, it's hard for them to leave because all your
libraries don't inter-operate with anything
else, people will stay.
And then you can try things on them.
So I totally agree with you that there's a lot of
non-technical stuff going on, or non-utilitarian.
In the middle?
AUDIENCE: It feels like the greatest era of language
exploration was back in the '60s, perhaps through the '70s.
And everything we're seeing now is, oh, well, let's take
something from the '60s and repeat [INAUDIBLE] syntax or
dumb it down so that people [INAUDIBLE].
Is that a reasonable observation?
AUDIENCE: [INAUDIBLE].
[LAUGHTER]
AUDIENCE: [INAUDIBLE]
But seriously, it's like you're arguing over the
details in a sense.
It's like the difference between sect A of religion and
sect B of religion, but way over here is something that's
a totally different view of the world and you're totally
ignoring it.
LEO A. MEYEROVICH: Right.
I think that's what motivated a lot of this work.
I feel like it's not quite the same problem that physics has
today, but we're getting there-- where our ideas are
way past where people can do or what people will use.
But on the other hand, when I look at numbers, the
programmers today use many more languages
than they used before.
Whether those languages are actually very different from
the ones they used before, that's a good question.
For example, the research I do for the language features I
do, those aren't going to show up in your language
in a very long time.
That's also why I think we should understand this.
I don't want that to keep happening.
Mark?
AUDIENCE: So on your diffusion of innovation enumeration and
the characteristics of [INAUDIBLE], there was all
this stuff about compatibility and [INAUDIBLE].
And then, you were focusing on type systems.
I would have expected that, if the innovation you're
imagining one is trying to advance is to promote type
systems to programmers that currently are in dynamically
typed languages, then gradual typing approaches and
optional typing approaches--
things that allow the types to be adopted incrementally and
allow them to be used without having to completely change
what these programmers [INAUDIBLE] use--
would have all followed from your basic principles.
LEO A. MEYEROVICH: Yes.
So the question is something like gradual typing, which
lets you mix in static types into your dynamic language--
let's make this the last question.
So the question is how do gradual types, which are
supposed to be an adoption-oriented approach to
static types, mixing those into popular dynamic
languages--
do they work or not?
I think, in many cases, it does address a lot of issues.
But for example, can you imagine working with three
people-- one of them doesn't know anything about static
types and the other two do use the gradual types.
Could the person who doesn't understand static types very
well write programs that inter-op with them?
My claim is with modern gradual type systems, the
answer is no.
That's my experience with something similar at Adobe.
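For a concrete picture of the interop question (my sketch, not the speaker's; Python's optional annotations, checked externally by a tool like mypy, stand in for a gradual type system here):

```python
def typed_discount(price: float, rate: float) -> float:
    # Fully annotated: a static checker can verify every
    # annotated call site against this signature.
    return price * (1.0 - rate)

def untyped_caller(p, r):
    # Unannotated: the checker treats p and r as 'Any', so a
    # teammate who ignores types can still call in -- but their
    # mistakes (say, swapped arguments) go unchecked statically.
    return typed_discount(p, r)

print(untyped_caller(100.0, 0.25))  # 75.0
```

The two styles interoperate at runtime, but the static guarantees stop at the annotation boundary, which is the mixed-team friction being described.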
AUDIENCE: I think the reason the answer to that is yes is
because of the very common observation that the main purpose
of declared types is documentation.
LEO A. MEYEROVICH: So that's like a
slightly different issue.
And that is what's the purpose of these types?
And there, I agree with you.
AUDIENCE: I think that's the bridge for the person who
doesn't think in terms of type checking.
That bridge will enable him to work well and will enable the
incremental learning without having to explain concepts.
LEO A. MEYEROVICH: Yeah, and actually, our
statistics agree with that.
When we asked people why they thought static types are good,
they didn't think static types were good for bug finding.
They thought unit tests, generally, were better.
But they did think static types were good
for explaining things.
So maybe that's what the static type community should
look at a bit more strongly.
PHILIP: Well, thank you so much, Leo.
[APPLAUSE]