MIKE WINTON: Hi, I'm Mike Winton.
And for those of you who don't know me, I lead Developer
Relations here at Google.
And today I would like to introduce our special guest,
Joel Spolsky.
Joel's an expert on software development
dating back many years.
Long ago, he worked at Microsoft and designed VBA as
a member of the Excel team.
He's also worked at Juno Online Services on their
internet client.
And a little bit more recently, he co-founded Fog
Creek Software and is co-creator of
stackoverflow.com.
He's also got a popular website that's been translated
into over 30 languages.
He's written four books about software development.
And he's here to talk to us today about something else
he's an expert at, which is communities and the dynamics
and the anthropology of the communities.
And I would say we are so thrilled here at Google about
the existence of Stack Overflow.
And that's something that's really one of the fundamentals
about how we do developer support and how we do
developer relations, is going out and participating in the
conversations and in the community that developers are
having there at stackoverflow.com.
So with that, I'll hand you over to Joel.
Thank you, Joel.
JOEL SPOLSKY: Thanks.
[APPLAUSE]
JOEL SPOLSKY: So this talk is called--
Can I walk around?
Yes.
I'm not sure what I'm attached to.
This talk is called The Cultural
Anthropology of Stack Exchange.
Did anybody take anthropology in college?
I did.
It was the worst class I ever took.
It was a class called Cultural Anthropology, and I found
it to be unbelievably boring.
And there was all this stuff about Trobriand Islanders and
there was a tribe somewhere that exchanged blankets called
Potlatch blankets.
This is a Canadian thing.
And there was kind of a lot of stuff that I found
unbelievably irrelevant and boring.
And then I discovered that this turned out to be my day
job, actually studying anthropology, especially as it
applies online.
The Stack Exchange network is 266,000,000 page views every
month, 25 million global monthly uniques.
That's one way we have of counting it.
We have another way of counting it, which is 50
million uniques.
I like that way better but let's go with 25 million
different people visit every single month.
If we were a country, we would be up there with some of the
most reputable countries
with--
[LAUGHTER]
--their governance.
Not really a good--
Not really an impressive list.
If we were a state, we would probably be the third largest
state and we're about to break--
about to pass Texas.
So that's why the only way you can study this large group of
people is kind of the tools of anthropology, the tools of
cultural anthropology, studying people and what they
try to make, and how they create from the smallest
group, which is just a group of people to come together to
do something temporarily and then disband to larger
[INAUDIBLE]
like groups.
Now when I started programming a long time ago-- that's me.
No, I'm just kidding.
We had these tape drives that--
No.
We thought that computing was about computing, like getting
a number, getting a result, solving a
problem using a computer.
And you felt like you were lucky if you could somehow
manage to create computer software that, at the end of
the day, ran the payroll at least once every two weeks
when the payroll was due.
And if you actually got payroll numbers, or tax
reports, or whatever, then you felt like you had
accomplished something.
But then people invented time-sharing in the '60s, and
I actually got my start sort of in this generation.
And so this is a little printing terminal called a
DECwriter and they were very popular.
And it would hook up with--
It had to be a 300 baud modem, because any faster than that
and the print head couldn't keep up, even though there
were faster modems probably already being invented.
They came up with a later version of the DECwriter that
could do 1,200 baud.
And multiple people were online at once, for the first
time, instead of doing the batch jobs.
And that meant that all of a sudden you had the possibility
of communication, but 300 baud was still too slow to put
large amounts of text.
So you didn't really have conversations.
Email hadn't yet been invented.
Until the first terminals came out, the first what they
called-- this was called a smart terminal, because it had
the ability to move the cursor.
The VT-100 was sort of a classic of this era.
And they could go up to 1,200 baud, or
even 9,600 baud later.
And so, actually, that starts to be about as fast as your
eyes can read.
I would say most people read at about 2,400 baud.
And it starts to become reasonable to imagine doing
things like sending emails, so email was invented.
And the first versions of online support groups, and the
first ways that people communicated in groups online,
were just extensions of email.
Google Groups, to this day, you could trace it all the way
back to Usenet.
And Usenet you can trace all the way back to this idea of
just sending around mailing lists, where there would be
some stupid daemon that would just receive email at one
address, and send it out to people
according to a list, right?
A mailer daemon.
So Usenet was really interesting, because I used
Usenet a lot in college, and now it's almost completely
gone, except for distributing illegal things.
But this is a fairly late version of the Usenet
software TRN, the threaded RN.
RN stands for "read news." And up in the right-hand corner,
you can actually see the thread, and the idea of the
thread being that a conversation can have a parent
and multiple conversations can have multiple parents.
It's all wonderful for computer scientists, not for
regular people.
But one of the things that I noticed about Usenet was a
very interesting phenomenon--
I know it's very hard to see this in the back-- but you
would get these conversations that showed artifacts of the
software, OK?
So this is kind of interesting.
This is a conversation on talk.politics.mideast, which I
participated in for about a month, until
it went into a loop.
And then it never emerged, it never got out of.
But an interesting thing to notice is when you hit R to
reply to a message in the RN program, the first thing you
have to know about Usenet is it's not client-server.
It was a totally distributed system, meaning everybody just
connected to each other and sent each other any messages
they hadn't already seen, and every node
didn't have enough disk space.
Hard drives were expensive, and so the average Usenet node
would have maybe three days worth of archives and throw
away everything older than three or four days.
They would keep enough so that you could go away for the weekend,
come back on Monday, and not have missed anything.
But very rarely did anybody have a big enough hard drive
to keep a whole week worth of Usenet, or two weeks.
So--
which is one reason why it was so hard to assemble the Usenet
history, because it was never in one place.
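The exchange-and-expire behavior described here can be sketched in a few lines. This is a toy model, not how any real news server was implemented: the `Node` class, the `sync_from` and `expire` methods, and the `RETENTION_DAYS` constant are all names made up for illustration, and the three-day window is simply the figure mentioned in the talk. It ignores details such as remembering the IDs of articles that have already been expired.

```python
from dataclasses import dataclass, field

RETENTION_DAYS = 3  # assumption: the "three days" figure from the talk


@dataclass
class Node:
    """A hypothetical Usenet-style node holding recent articles only."""
    name: str
    articles: dict = field(default_factory=dict)  # message_id -> day posted

    def post(self, message_id: str, day: int) -> None:
        self.articles[message_id] = day

    def sync_from(self, peer: "Node") -> None:
        """Copy any articles from the peer that this node does not hold."""
        for message_id, day in peer.articles.items():
            self.articles.setdefault(message_id, day)

    def expire(self, today: int) -> None:
        """Throw away everything older than the retention window."""
        self.articles = {
            message_id: day
            for message_id, day in self.articles.items()
            if today - day <= RETENTION_DAYS
        }
```

In this model, a reply posted on day 4 can still be on a node after the day-0 message it answers has been expired, which is the situation that made quoting the original necessary.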
Because you--
when you're replying to a message, if there is a
probability that the person who's seeing your reply does
not have access any more to the thing that you were
replying to, that makes it hard to have a conversation.
And so people started quoting the message they
were replying to.
And then the software said, hey, let's make that a feature
of the software to make that easy.
So when you hit R, which meant reply, you've got this thing--
it still happens in some email clients--
where, the message you are responding to, every single
line got a little greater than in front of it, saying this is
what I'm replying to.
And what people did when confronted with the software
is interesting.
They would then go intersperse their own comments in between
the other people's comments.
And it was very easy to follow what was going on because the
greater thans were the original message.
And then you could have your nitpicky little answers
interspersed on every single sentence.
And so here you have-- clearly this is nonsense, this is
clearly nonsense, this is nonsense, this is
nonsense, et cetera.
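The quoting convention is simple enough to sketch. The two functions below are illustrative only (`quote_for_reply` and `intersperse` are made-up names, not part of RN or any news reader): the first prefixes every line of the original with a greater-than sign, and the second places a reply comment directly under selected quoted lines, which is the line-by-line style described above.

```python
def quote_for_reply(original: str, prefix: str = "> ") -> str:
    """Prefix every line of the original message, as a reply command might."""
    return "\n".join(prefix + line for line in original.splitlines())


def intersperse(original: str, comments: dict) -> str:
    """Build a reply with comments placed under selected quoted lines.

    `comments` maps a 0-based line index of the original to the reply text.
    """
    out = []
    for index, line in enumerate(original.splitlines()):
        out.append("> " + line)
        if index in comments:
            out.append(comments[index])
    return "\n".join(out)
```

Calling `intersperse(message, {0: "This is clearly nonsense."})` yields the first quoted line, then the comment, then the rest of the quoted message.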
You, what you would do is you would pick apart someone's
argument by picking apart every sentence of their
argument, which is a certain type of culture.
It was very popular on Usenet, the nitpicky culture.
Still exists on the internet in certain
places today, I believe.
And the bloggers later reinvented this, and they
called it fisking, if you're a blogger.
But this particular style is an artifact.
It's an accidental thing that happened because the software
happened to behave in a particular way.
And what's interesting is, until you start noticing that
the culture is acting according to the software,
then you may forget to design the software in a way that
makes the culture work.
So this is at Moscone Center actually.
And it's just an architectural image that shows
the concept of if you build something in a certain shape,
the people will go into that shape.
And that happens in architecture all the time.
Sometimes it's accidental.
You happen to have a nice curve that's awesome for
skateboarding, and you'll get kids skateboarding.
Well, this is a little bit more intentional, where if you
build a table that looks like it might be the right size for
a chessboard, and it's got an 8 x 8 grid on it.
And you put two things that are the shape of humanoid
chairs on either side, then old men will come and sit down
and play chess.
Whatever you build, somebody will find a way to come use it
in the way that you built it, sometimes completely
non-intentionally.
This is the Spanish Steps in Rome, where you'll see people
come in and just sort of hang out, sitting on the steps.
It was built because there are two roads at different heights
and they can't be connected with a road.
It would be too steep.
So they built steps.
But actually, that turns out to be a really good place to
sit down if you're a teenager, or you're a backpacker, and
get your hair braided by gypsies, because it's just
sort of the perfect physical environment for that.
And then you can copy it.
So, in Times Square, they built this staircase that
doesn't go anywhere in hopes that backpackers and gypsies
will come sit down there.
If you don't know, if you're not paying attention, here's
what happened on software.
We had Usenet and then everything moved to the web
and they said well let's not use NNTP as
our protocol anymore.
And they started building these web versions of Usenet.
And the web versions of Usenet, which are still out
there all over the place today.
There's software called [INAUDIBLE] and phpBB. There's
a whole bunch of these packages.
They're actually just copying Usenet.
They work in the same way, except that they have the
smiley face that rolls to the left and the right when you
enter a rolling smiley face command.
But other than that, they're functionally equivalent to
Usenet, which is an accidental design that actually,
literally came from email.
So, look, nothing has been sort of innovated here.
So if you want to be Utopian--
I don't know why I have this picture here.
I guess this is a Utopian picture.
You have to actually design stuff a little bit
intentionally.
That's what we started to do with Stack Exchange.
I feel like we're about 10% of the way there.
And I just want to show you some examples of how we've
done this at Stack Overflow and Stack Exchange over the
four years that we've been out and in business.
All right, so one thing which we focused very much on is
first impressions.
Does anybody know what this is a picture of, anybody?
AUDIENCE: Occupy Wall Street.
JOEL SPOLSKY: Occupy Wall Street.
It's already starting to be historical, but
they're still there.
And there's a lot of clues, obviously, besides the, well,
it's on Wall Street, so that helps, if you saw that little
thingy there.
Like, this fellow here, who one of these days is going to
accost me after one of these events because I
don't know who it is.
But he's got a Zapata t-shirt, which is awesome.
I don't know if he knows who that is, he probably knows.
He's got kind of like a kaffiyeh, but it's like
interesting colors, bright colors.
Here we have a person who loves the 99% with really,
really expensive headphones.
[LAUGHTER]
That's OK.
There's just all kinds--
everybody is trying to sort of put out their signals as to
who they are, what they believe in, because they want
to attract more people like them to their cause, right?
So what's the first impression that we got?
I sent Jeff Atwood out, and I said, when we started this
project four years ago, more than four years ago.
I said, go look at some Q&A sites that are out there.
Because here's my idea, it's different.
And what he found was Yahoo! Answers is the big one.
Answers.com is also pretty big.
This is Yahoo! Answers.
What's your first impression of Yahoo! Answers?
This is a screen shot of the home page I
took at some point.
What are you--
I use to clean out my coffee maker.
What is your favorite plant of all time?
Anyone up for a food/drink, true/false survey?
What are you listening to?
What kind of questions are these?
I mean, they're questions, right?
They have question marks--
sometimes.
What are you listening to?
What is the last thing you ate?
Can I die from carbon monoxide?
No, no, it's OK.
No.
This is the clue here.
It's all the way at the bottom.
I keep forgetting to do my homework.
That's the clue, the secret clue.
Does anybody know the secret?
Yahoo! Answers?
AUDIENCE: Six-year olds.
JOEL SPOLSKY: Well, it's slightly--
They're 12.
They're latchkey kids.
It's really, really active in the afternoon when kids get
home from school, especially girls, because they have no
permanent identity here.
It's not like Facebook, where you set up an identity and
then people can be creepy.
You can be totally anonymous, and then you could just be
another person tomorrow when you ask a different question.
So Yahoo! Answers became a chat room for teenagers.
And when you look at the website, you would be deterred
from using Yahoo! Answers for any purpose other than talking
about how to, you
know, what you're listening to.
Here's Answers.com again.
What kind of attorney is needed for advice on getting
someone you know committed to a mental institution?
I like that question.
That's a good question.
So, never mind.
What supplies does the U.S. get from Ghana?
I don't know.
Who cares.
Another clue.
What are some examples of a welcome address for JS prom?
JS prom, anybody?
It's not JavaScript, that's what I thought.
There is no JavaScript prom.
This is obviously a kid that is so young that they don't
even get that this prom that they have at their school,
called the JS prom, is not, like, a
universal internet thing.
It's just specific to their school.
Or they don't get that the internet is a universal thing.
Either way, once again, what's going on here
is this is for kids.
What is the rights of McDonald's?
That's a good one.
Askville.
Amazon bought this little site called Askville and then
proceeded to ignore it.
How can I start making the right choices in my life?
What is the 21 largest states?
What is the interval, some sort of math--
This is actually a question, at least,
that makes some sense.
And the answer is, like, this is homework.
We don't do homework questions here.
So again, what's the first impression on Stack Overflow?
The first impression on Stack Overflow is, if you're a
programmer, you get that these are programmer questions.
You're like, oh look, it's full of programmer questions.
If you're not a programmer, you don't understand a single
thing and you leave.
Which is a good thing, this is what we want.
Here's one of our sites, a network of 90 Stack Exchange
sites, for Jewish life and learning.
Again, unless you actually went to a yeshiva and studied
sort of extensively into the details of orthodox Jewish
law, you probably don't even understand the questions
because they're written in a special language called
Yeshiva English, which has pretty much replaced Yiddish.
It's English but, I mean, all those words are Hebrew.
Take this one.
Is it possible to have hametz on the Shabbat
directly after Pesach?
In English, that would be is it possible to--
Are you permitted to eat leavened bread on the Sabbath
which immediately follows Passover?
So I can translate that to English,
but it's not in English.
Why isn't it in English?
Because you're trying to push away the people that don't
speak Yeshiva English because, this is--
Not only is this a site for Jewish life and learning as
advertised, it's actually kind of orthodox, right?
And we actually kind of don't really want conservative,
uneducated Jews hanging out here.
That's not my decision.
I mean, I founded a conservative Kibbutz.
I'm a strong--
but I also went to Yeshiva and I get what's going on here.
Cross Validated is a site about statistics.
Comparing two methods of sampling from bivariate--
Again, I don't understand it, but if you're into statistics,
you immediately recognize it.
You immediately say, OK, this is a real statistics site.
This is for people that actually do statistics, and
understand it.
And know it.
Now Askville does have sections.
So to be fair, I tried to go into the Askville math section
and see what was going on, what they had going on in
their math section.
Write an algorithm to find the number of between 7 and 100
which is exactly divisible by--
OK.
Apply for apartments online.
Asked 13--
What does that tell you, asked 13 hours ago?
AUDIENCE: It's spam.
JOEL SPOLSKY: It's spam.
What else?
What does it tell you, though?
AUDIENCE: [INAUDIBLE]
JOEL SPOLSKY: Nobody cares.
Like, nobody's cleaning up the spam here, so nobody goes
here, right?
Because there's still spam and nobody's cleaning it up.
What is the size of a plot in the Caribbean?
I like that question.
These are the full text, I think.
40, maybe 50.
OK.
So here's my problem.
I'm 20 years old.
Didn't really attend high school and know only super
basic math, meaning plus.
All right.
So if you're a Fields medallist or a math professor
at Berkeley, you don't go to that site.
You go to this one.
This is our math site.
We've got another.
We have two math sites, because the mathematicians
have bifurcated.
This is the PhD-level mathematics.
Again, I can't understand a single thing there.
I don't even know what they're talking about.
I barely understand the tags in terms of knowing about
areas of mathematics.
But we have two sites.
MathOverflow has a rule that if a math professor is likely
to know the answer to your question, don't
ask it on MathOverflow.
Because MathOverflow is for research-level questions, the
kind of things that most people probably don't
know the answer to.
Otherwise, you have to go to Math Stack Exchange, which as
you can see, is liberal and just allows anybody to ask any
kind of crappy math question that they want on there.
So once again, it's all about what's that first impression
that you give.
This is just a random picture that I took.
You could--
There's a few things going on here that you
can immediately tell.
There's some kids are playing Ultimate Frisbee.
This is Yale.
Because it looks like Yale.
There's no leaves on the trees,
which means it's winter.
And yet they're wearing shorts.
So it's either the first day of spring or, more likely,
they're Californians because they're
playing Ultimate Frisbee.
And they also kind of look like jocks to me, actually,
based on the sunglasses, and the backwards baseball cap,
and all that kind of stuff.
When you see this scene, you're walking down campus and
you see these kids, you immediately say, oh my God, I
want to join that because I am a Californian at Yale who
plays Ultimate Frisbee and I'm kind of a jock.
Or there's a million things there to turn you off.
It doesn't matter.
But everything about a community either draws people
into the community or pushes them away.
And so many people in web design have been trying to
figure out how do we make a web page that sucks every
single person in the known universe in.
And when you're trying to get expert answers to difficult
questions, that is the opposite problem.
You actually want to drive away as many morons as you
possibly can, hopefully as quickly as possible.
All right.
So that was one big area, that's first impressions.
And that's a really important thing to us
in the Stack Exchange.
Number two, voting.
Obviously, it's an important part of Stack Exchange.
I'll breeze through this because you're all
familiar with this.
Questions are voted up.
Answers are even voted up.
Here are some hot questions from the previous week.
People can vote up the questions they like.
They vote down the questions they don't like.
They vote up the answers that they like.
That's really cool because it sorts the answers in the order
of how good they are.
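As a rough sketch of what sorting by votes means: each answer has a score, and the list is shown best-first. The `Answer` class and `ranked` function below are illustrative names only, and scoring as upvotes minus downvotes is an assumption made for the example, not a statement about how the site computes or breaks ties in its ordering.

```python
from dataclasses import dataclass


@dataclass
class Answer:
    author: str
    upvotes: int = 0
    downvotes: int = 0

    @property
    def score(self) -> int:
        # Assumed scoring rule for this sketch: net votes.
        return self.upvotes - self.downvotes


def ranked(answers):
    """Return answers best-first; ties keep their original (posting) order."""
    return sorted(answers, key=lambda answer: answer.score, reverse=True)
```

Because the order depends on score and not on when something was posted, a reply to "the answer above" can end up above or far below the thing it was replying to.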
But what it also means is that you can't have random
conversations because the quotes keep getting quoted out
of context.
So you can't get into a back and forth
argument on Stack Exchange.
You can in the comments, but we'll delete that.
But you can't get into a back and forth argument because
those back and forth arguments are useless.
You're not creating a useful artifact for the internet.
Voting is mostly important because it leads to
reputation.
Reputation is this thing that tells you do I trust people?
This is Colin Powell.
This is what the US Army calls "fruit salad" that he's
wearing, which tells you all kinds of useful stuff about
who he's been and campaigns he's been on.
And his reputation is he's got four little silver stars on
his epaulette.
And we've got the same thing.
Here are some top users from Stack Overflow by reputation.
Jon Skeet, who you may know, works for you guys in London.
JON SKEET: How you doing?
JOEL SPOLSKY: Oh!
Oh my god, he's here.
[LAUGHTER]
One thing that's interesting to notice here--
I'd never noticed this until we started putting
locations on the site.
Marc Gravell, who works for us now, Forest of Dean.
I don't know where that is.
It's in Never Never Land, somewhere, it's
Middle Earth, UK.
We don't care.
He's just a brain in a box.
He types code for us.
It's awesome.
Rouen, France.
Curacao.
Madison, Wisconsin.
New Jersey.
France.
Alex Martelli.
Is Alex here?
He actually works here at this office.
He's probably heard this speech three times.
But other than Alex, I don't think there's anyone here in
California.
Very, very little participation from the Silicon
Valley, guys.
I don't know why.
Those little badges that you were seeing.
When you start out on the site, you start with a little
teeny tiny badge.
And this is somebody who is-- they've made
a name, Geek Matter.
They haven't customized their avatar.
So we gave them something based on their IP address.
We gave them some triangles.
And you start out with one reputation which you get for
successfully typing in your name.
But you start to earn more and more reputation as
people vote you up.
So Favolas, who's there, has 208 points and also some
little badges.
There's a little silver badge and some bronze badges there.
And we'll also let you customize your avatar, and the
accept rate.
At one point, we were trying to encourage people to accept
answers that were good and so we're displaying also what
percentage of answers that you accept.
And you could earn sort of more and more points.
Daniel Hilgarth here--
It's hard to see, but his avatar is displayed with a
drop shadow.
And the drop shadow is the subtle hint that, if you move
your mouse over that, you get like a little customized
profile that shows up.
So when you earn, I believe 10,000 reps, you start to get
the right to customize the profile that shows up when
somebody mouses over you.
And there's Jon, and you can tell what date this is from,
when I took the screen shot, based on
when you had 400--
only 403,000 rep.
JON SKEET: March?
JOEL SPOLSKY: March.
And you were just about to hit 3000 badges over there.
Has a good accept rate, as well.
But you can actually go higher than Jon Skeet, which is you
could become a moderator.
That's an elected position and mostly a burden that involves
having to spend a lot of time on the site deleting bad
behavior, instead of answering questions.
And then you get this extra little diamond that shows up
next to your name.
So that's all the flair.
People wear flair for all kinds of reasons.
It's an important part, flair that you wear in real life.
This fellow, I think, has got five things going on here,
which you may need to notice.
He's got the Confederate flag on his cap, which I have to
explain, when I'm outside the United
States, what that means.
He's got a lot of tattoos.
And I won't even try to explain what
each of these means.
I'm sure they mean something.
He's got big muscles.
He's wearing a tank top, or a wife ***, as we call it.
He's got a spare tank top, in case the main, primary, tank
top fails for some reason.
And a Harley Davidson bike logo.
So just think about all the things that he had to think of
in the morning, when he was like, I'm going to go outside
and I'm going to project this image of myself.
And he sort of decorated himself Shah-of-Iran style, in
a way, just to sort of kind of festoon himself with all those
little graphics.
So that's an important thing that happens in real life.
And we do it in Stack Exchange, too.
We've got these badges.
The badges are interesting, because the first thing people
say to me about badges is, like, I don't really care
about badges.
Who cares about badges.
They're stupid.
How do you get people to care about badges?
They're not worth anything.
The badges probably motivate maybe 1% of the participants
in our site.
Very, very small number of people that actually say I
want to earn this badge.
But everybody on the site knows about them.
And all you have to do is sort of imagine that one person has
seen your flair to actually care about it.
And the neat thing about the badge is that they tell
everybody the behavior that we want to incentivize, even if
they're not directly incentivizing it.
So there's all kinds of things that are norms of Stack
Exchange, and we have communicated to the world,
hey, these are norms, because we give you badges for them.
And so if you were ever wondering whether you're
allowed to answer your own question, then just go look.
You'll see that there's a badge that you can earn for
answering your own question.
If you've ever wondered if we think it's a good idea to answer
a question that was asked two years ago, well, you
earn a badge for that.
And so it's a sort of way of saying, hey, all of these are
behaviors that we want to see on the site.
Reputation translates into jobs, actually, and so the
monetization scheme, if you could call it that, of Stack
Overflow, is to show job listings to people.
And because we have this large audience of people that use
Stack Overflow, and we know what they know and what
they're good at, we can fill positions very well.
All right, government.
Every culture of three or more people has some kind of
government.
Two or more people?
One person and a dog has a government?
There's always all kinds of rules around there.
We try to push a lot of the simple governance, the
policing, on to the population.
And so, as you earn reputation on Stack Overflow, we decide
basically, all right.
You know how the system works, and so we can let you do stuff
to self-moderate, so that the community can sort of
self-moderate itself in many ways.
So for example, if you have 1,500 reputation, you can
create a tag.
If you have 10,000 reputation, you can vote to delete a
question which has already been closed,
et cetera, et cetera.
So as you earn more and more reputation, you get rights to
do stuff on the site.
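The idea of rights unlocked by reputation can be sketched as a simple threshold table. Only the two thresholds mentioned in the talk are included; the key names and the `privileges_for` function are made up for this example, and the real site has more privileges than are shown here.

```python
# Thresholds quoted in the talk; privilege names are illustrative labels.
PRIVILEGES = {
    "create_tag": 1_500,
    "vote_to_delete_closed_question": 10_000,
}


def privileges_for(reputation: int) -> list:
    """Return the privileges unlocked at a given reputation, lowest threshold first."""
    unlocked = [
        (needed, name)
        for name, needed in PRIVILEGES.items()
        if reputation >= needed
    ]
    return [name for _, name in sorted(unlocked)]
```

A brand-new user with 1 reputation gets an empty list, while a user with 1,500 can create tags but cannot yet vote to delete closed questions.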
And that's sort of the mass terrorism of the population
terrorizing itself.
I'm trying to think what Kafka story that corresponds to.
But government also--
Another interesting thing is--
Well, let me do a poll in this room.
How many people in this room use Stack Overflow?
In any way, shape, or form?
Everybody.
How many people have been on Meta Stack Overflow?
If you look around, that's about 10%.
It's much smaller.
Meta Stack Overflow is the place where the actual
policies of the site--
it's the site about Stack Overflow--
are discussed.
And it's sort of the back room, if you're really deeply
involved, deeply interested.
You may actually also be interested in going on there
to see kind of how the things govern themselves.
And this is sort of the equivalent of the civic
society kind of club, or whatever, where people talk
behind the scenes.
And then there's an even deeper level, which is--
How many people have been in our chat system?
That is three.
Four.
So this is the teacher's lounge.
This is where the 275 moderators on the
network hang out.
And there is a chat system there.
Anybody can go in there, and make rooms, and have
conversations out there.
But again, it's sort of the highest level of deep, dark
engagement in Stack Overflow.
As you become bored with the boring little
question-and-answer game that we gave you, this is the place
where you can actually sort of talk about things.
And that's actually where the live government happens.
And this is actually where you'll see live decisions
being taken and discussed by moderators about how to
moderate certain things.
We've also got a blog.
And the blog is all about sort of promulgating the decrees
from Stack Overflow, with Stack Exchange central.
And we have laws, so let me talk about law, because every
society has its rules of the government.
And government essentially comes up with the laws.
We only have one important rule, which is we hate fun.
This is the logo of Stack Overflow hates fun.
And we hate fun is all about how there are millions of
things you're not allowed to do on Stack Overflow,
especially anything that you might enjoy, or that might be
popular, or that might get on Hacker News, or Reddit.
And that comes from sort of an observation that there's a lot
of things you could do in discussion groups online that
don't really leave a useful artifact
behind on the internet.
So a very, very important observation, which I have to
keep repeating again and again and again,
because nobody gets it.
But this is the most important thing.
If somebody asks you about the design principles of Stack
Overflow, of Stack Exchange, the most important thing you
have to remember is that the question is
asked by one person.
It's answered by, let's say, one to
four, five people usually.
But it's viewed by hundreds of people.
And hundreds of people will get benefit from that
question, out of just that one person who asked for it.
So if you ask us who we're optimizing for, it's not the
person asking the question.
It's not the people answering the question, although we want
them both to be somewhat happy.
We're doing this all for the hundreds of people.
A very fundamental part of the initial design direction of
Stack Overflow is that Google is the user
interface to Stack Overflow.
You're on our site because you typed a question on Google.
I used to say search engines.
[LAUGHTER]
Google is our user interface.
You typed a question on Google, and you found a page.
And if we have inventory there, it has
to be really good.
And that means everything is optimized for creating this
great artifact, this historical record.
And the biggest problem with Usenet and the old phpBB
sites, and Experts, whatever it was called.
Can't remember.
And all kinds of other sites, that were out there, the
biggest problem is that the inventory that they had to
give Google was just these crystallized, captured, old
conversations that took place a long time ago.
And there were all kinds of things that
were wrong about it.
First of all, it's a conversation.
In a conversation, you see this back and forth and you
see this ridiculous, hey, did anybody--
Does anybody know how to solve this problem?
I don't know, but did you try X?
Yeah, that didn't work.
Well, did you try this other thing?
Look, I already asked you if you said da da da, and I said,
in my thing, da da da.
OK.
Hey, does anybody know the answer to da da da?
This is now answer number 17.
Hey, does anybody know the answer, because I'm having the
same problem-- one year later.
And then, if you do get an answer, it's on page seven,
and it's wrong, and it involves some kind of a
cross-site scripting vulnerability when
you copied the code.
So if you're not thinking about
creating a useful artifact--
And then Google also has this incidental problem, which is
that the older a page is, the more respectable it looks.
It's PageRank.
And so, a lot of times you would search for things, and
they would get you answers that were just really old.
And that were crystallized and that were not even close to
being the actual solution to a problem.
So we started optimizing around that.
We said, you've got to vote.
You've got to stop having conversations.
You've got to have everything be editable.
Because, for us, it's the artifact that we're creating,
a lot like Wikipedia.
There are people asking questions, and there are
people answering them, but the goal is for the 300 people
that find that question later, to be able
to get a good answer.
So we have these things called close reasons.
And people hate us because we close questions.
And they're like, this is the end of Stack Overflow.
It's going to all collapse because you moderators are
heavy handed.
And you're closing all kinds of useful stuff that would
have been fun, or entertaining, or had thousands
of page views, or went all the way to number
one in Hacker News.
And we have five reasons for closing a question,
specifically.
And these are the five.
And I'm going to talk about them.
They're mostly non-obvious, right?
Exact duplicate.
Pretty obvious.
But remember, the reason we close duplicates is because,
as long as we're trying to create a record for how to
solve a particular problem, we want to create that record in
one single place.
Because then you have everybody going to that page
and it can be more canonical.
It can be more useful, more helpful, if you have 100 eyes
on that question, instead of 50 eyes on this one and 50
eyes on that one.
The question is just going to get better
and better over time.
So here's an example of an exact duplicate question.
Can anyone explain Monads?
And then that goes off to another page.
It goes into great detail on that.
We don't get a lot of things closed as exact duplicates,
because they really have to be exact.
If we have a slightly different question, we do want
to answer it.
OK, this is what we call off-topic.
Toilet issue in my company.
Not strictly a programming problem, but any help will be
appreciated.
I have a lady employee who is joining the company tomorrow,
and I want to convey her a message that the bathroom
facilities in the office are out of order.
How do I tell her to relieve herself before
arriving in the morning?
It is technically not a programming question.
And while it may be important to you, and this is a
respected member of our site, we're not going to answer it.
Sorry.
It just doesn't belong on the site.
Off-topic.
People generally understand what that means, to be off
topic, although we have a rather narrow
definition of on-topic.
OK.
Not constructive.
Again, the word not constructive is not clear what
that means to people.
Question is not a good fit to our format.
We expect answers to involve facts,
references, specific expertise.
This question will likely solicit opinion, debate,
arguments, polling, or extended discussion.
Those are all the things we hate.
Once again, opinion, debate, argument, polling, extended
discussion.
They're great things.
But they do not create an artifact.
They do not create a useful resource that anybody can
learn from.
If you've ever seen one of these Emacs versus Vim, or Mac
versus Windows.
This one is Web forms versus MVC developer.
If you've ever seen one of these debates on the internet,
they always look so fun.
But they're not.
And they're not interesting.
And what's interesting about these
debates, is they get heated.
And then it draws a crowd, because everybody
wants to see the fight.
Because you've evolved.
If there's something dangerous going on, a tiger is eating a
person, you've evolved to pay close attention to that tiger
eating the person.
And that is way more important than the people being friendly
over there.
You pay attention to the tiger.
And so the stuff that draws in the page
views on the internet?
It's conflict.
And it's useless as an artifact.
It's fun to participate in, so go to Hacker News or wherever.
Go elsewhere.
Buy a subscription on experts-exchange
and have your debates.
But on our site, that's not what we're trying to do.
And so we want things that can be answered, that are useful,
that have answers.
We don't want "which is better, x or y?" We don't want
shopping questions. "Which video monitor should I buy?"
That stuff, believe it or not, I know you need to know which
video monitor you should buy.
And I know that the-- or display adapter-- and I know
that the programmers on Stack Overflow all have amazing
opinions about this and they're great people to ask,
but you can't do it on our site because you're going to
make a useless piece of crap that's going to come up in
someone's Google results.
And they're going to be pissed off because they will have
wasted time.
So we close them.
It's not constructive.
Now, not a real question is another rule.
Need ideas about mobile apps, Android iPhone, which has
never been created before.
Thanks for the help, please.
I mean this is--
You can see how it's a question.
It's got a question mark and stuff.
And he's going to get a lot of answers, probably, I think.
But this is what we call the question is ambiguous, vague,
incomplete, overly broad, rhetorical.
You know, one thing we have is we basically say, if your
question is one sentence, and the correct answer would be a
book, just don't ask it.
Because we don't want to see those questions on our site.
OK.
I'm missing a bracket somewhere.
[LAUGHTER]
I think you recognize this question.
[LAUGHTER]
This is-- we have a weird terminology for that.
We call this "too localized." That's because Jeff Atwood
doesn't know anything about localization or
internationalization, so he used that word for another
purpose, which is a question that really only applies to
one person in one circumstance.
It's never going to apply again.
Once again, to the global internet audience, it's just
not going to be useful to anybody ever.
And so, sorry, don't ask it.
We're not your debugger.
A great question would be "I'm using such-and-such a
programming language, with such-and-such debugger.
What's the easiest way to find missing parentheses." That's a
great question.
Or "how would you approach finding missing parentheses?"
That's a good question, and I'm sure it's in there.
And I'm sure it got a good answer.
But find it for me, it's like no.
Go away.
Go away.
This is--
the other example that I give people that are not
programmers of a too-localized question, "Why is there a
green Honda Civic parked on my street?" I don't know.
Is it still there?
Go check.
Where do you live?
So this is Seoul, Korea.
Stack Overflow.
Stack Exchange.
25 million people.
Seoul is about 20 million people, the
population of Seoul.
And so the number of things that are going on in, like,
one of the world's largest cities, that's essentially
what happens on Stack Exchange.
And as you could tell, there are certain things that are
common denominators.
I mean, really, everybody in Seoul speaks Korean, except
me, when I was there.
But everyone else does.
And yet, there's lots of high rises, and
lots of little buildings.
There's all kinds of different cultures.
There's all kinds of subcultures.
There's also some things that are kind of constant.
When you think about the millions of stories that are
going on in a big city, that's what's really happening in
Stack Exchange.
I mean, we started out building software.
When I started building software, we were lucky just
to get computations.
We ended up building software for hundreds, or thousands, or
millions, or millions upon millions of people, where the
actual interaction between the people is what we're trying to
create and what we're trying to make happen.
And that's where you need anthropology.
So thank you very much.
That's my prepared talk.
I will take questions.
[APPLAUSE]
I will take applause.
AUDIENCE: Is it really true that participation in Silicon
Valley is low?
JOEL SPOLSKY: Participation in Silicon Valley--
No, I don't think it's true that participation--
AUDIENCE: Can you repeat the question?
JOEL SPOLSKY: Yeah, the question was is it really true
that participation in Silicon Valley is low.
I've never actually compared it to the actual population,
but Silicon Valley does not have a whole lot of
programmers, as a percentage of the
programmers in the world.
We have our own little theories, like most
programmers in Silicon Valley have other sources.
If you work at Google, actually, there's 800 people
you could ask questions if you need help programming.
Whereas, if you work at the Department of Forestry in
Nebraska, there may be nobody that you can ask those
questions to.
You're also, if you work at the Department of Forestry in
Nebraska, probably under-challenged at work.
And so finding a place where you help other people by
answering questions may help challenge you
a little bit more.
Please don't challenge Jon Skeet any further.
Yes.
AUDIENCE: I also find that Stack Overflow is more useful
to me than a book.
[INAUDIBLE]
small, whereas [INAUDIBLE].
JOEL SPOLSKY: Stack Overflow is more useful than a book,
was the comment.
But the granularity is really tiny, and so it doesn't really
quite scale in the same way.
One of the things that I said, when I was starting this, I
had written some books and so I was friends with the people
at Apress, who published programmer books.
And I actually said to the founder of Apress, and then
the founder of O'Reilly, I said, you know, what we
believe is that people are going to stop learning how to
program from books.
Or that they already have.
And that the way to learn programming is usually going
to be to find a tutorial, or some sample code, or to take
over someone else's code.
And just to start typing and, essentially,
page fault in knowledge.
Every time you get stuck, type a question in Google.
Try to learn that one thing.
And then move on and keep doing that.
It's not a very thorough way to learn programming, but that
seems to be the way most of them are learning it.
And the founder of Apress wanted to invest, and the
O'Reilly people were very insulted that I should say
that books are going away.
And they made a competitive site to Stack Overflow called
O'Reilly Answers, which is still on the web,
believe it or not.
And has at least five pages.
Yes.
AUDIENCE: You mentioned, actually, that there is a
badge for [INAUDIBLE] asking a really good question or
answering your own questions.
JOEL SPOLSKY: There is a badge for
answering your own question.
Yeah.
AUDIENCE: And I was kind of wondering, is there interest
in avoiding dupes?
And also, there are certain question that people would ask
that you could probably deliver an answer to based on
analytics of the existing data.
JOEL SPOLSKY: Right.
So the first question is are we really deliberately trying
to avoid dupes?
And secondly, a lot of times we can answer questions--
We already have the answer to the question that they're
actually typing.
So we have a feature right now that, when you type the title
of a question, we'll actually do some keyword searching.
It's pretty naive, and we show you some other questions which
might answer your question.
That's done in a relatively naive way, without very much
machine learning behind it.
And nevertheless, it does successfully intercept an
awful lot of questions being asked another time.
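A minimal sketch of the kind of naive title keyword matching described above. The talk does not say how the real feature tokenizes or scores titles, so the stop-word list, the Jaccard overlap score, and the names `keywords` and `similar_questions` are all hypothetical, chosen only for illustration.

```python
import re

# Hypothetical stop-word list; the real feature's word handling is not described.
STOP_WORDS = {"a", "an", "the", "how", "do", "i", "to", "in", "is", "of",
              "what", "for", "my"}


def keywords(title):
    """Lowercase a title and return its set of non-stop-word tokens."""
    tokens = re.findall(r"[a-z0-9#+]+", title.lower())
    return {t for t in tokens if t not in STOP_WORDS}


def similar_questions(new_title, existing_titles, limit=3):
    """Rank existing titles by keyword overlap (Jaccard) with the new title."""
    new_kw = keywords(new_title)
    scored = []
    for title in existing_titles:
        kw = keywords(title)
        union = new_kw | kw
        score = len(new_kw & kw) / len(union) if union else 0.0
        if score > 0:  # only suggest titles that share at least one keyword
            scored.append((score, title))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [title for _, title in scored[:limit]]


existing = [
    "Can anyone explain monads?",
    "How do I find a missing bracket in Java?",
    "Which monitor should I buy?",
]
suggestions = similar_questions("What is a monad? Explain monads", existing)
```

Even something this crude can surface an existing question while the title is still being typed, which is the point being made: no machine learning is needed to intercept the most obvious repeats.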
We do very much want to get rid of dupes.
However, there is sort of a syndrome on Stack Overflow we
haven't been able to cure completely, where people will
answer a question that has been asked 1,000 times before,
either just to earn reputation quickly, or because it's more
fun and easier to answer a question the 37th time.
They're just like, hey, I can answer this, da da da, rather
than actually try to search and see if that thing's been
asked before.
So we do get an awful lot of dupes.
We're going to start trying to attack that with the machine
learning, and better machine learning algorithms, over the
next year or two.
But that is-- the dupes are actually still
kind of a bit of a problem.
We're also starting to see one of the problems that happens
with dupes is that the answer has changed over time, right?
Like Android has changed.
Something you couldn't do, you now can do.
So there's 37 things saying you can't do it, all of which are
highly ranked in Google, and then there's one thing saying,
no, you can do it now, or it has changed, or whatever.
And those are too new, actually, to rank.
And so, one of the problems with the lack of de-duping
that we're doing is that we do have stale answers that just
don't get enough eyeballs to fix them, to edit them.
So that's something we want to work on, and we're going to
rely on machine learning for that, because humans obviously
don't want to do it for us.
Yes.
AUDIENCE: So how concept [INAUDIBLE] work outside of
programming.
So I'm envisioning a [INAUDIBLE]
used and successful.
So how successful are we [INAUDIBLE]
can you actually discuss the Middle East, something--
JOEL SPOLSKY: Well, the Middle East, there are no answers.
So you can't discuss the Middle East on Stack Exchange.
So the question was how do you discuss--
how well does this scale outside of
the programming community?
Does it even work outside the programming community?
And how well are other communities going to work?
And could you ever discuss the Middle East on our network?
There are now 90 Stack Exchange sites.
I think 35 of them have graduated from beta.
We have a beta process that's pretty rigorous.
We don't let them graduate until we think they're going
to be around for the long haul.
If you look at the site's statistics, a lot of the
growth is coming from the Stack Exchange-- what we'll
call the Stack Exchange network, which is already
probably about a third of our traffic, and probably gets as
many unique page views as Stack Overflow itself, the
other sites.
However, a lot of them are kind of semi-geeky.
Like, they're not programmers, but they're Server Fault for
system administrators, Super User for PCs, WordPress,
Drupal, database administrators, TeX, the math
typesetting language.
So there's a lot of these sites that are--
like a lot of our traffic comes from
fairly geeky domains.
And it's not clear whether that's a historical accident,
like, we started with an audience of programmers, and
then we said, what else do you want to talk about?
And they said Drupal.
Or if that's because there's actually something about this
mechanism that appeals to programmers and it works
better for programmers.
However, we are getting pretty far afield, and we have some
sites that are pretty successful that are
not geeky at all.
I mean, there's a parenting site.
There's a home improvement site.
Photography, which is geeky in its own way.
But the home improvement site is kind of interesting,
because it's a bunch of contractors talking about
drywall application techniques.
So we are starting to move beyond that.
And all the growth is happening on the Stack
Exchange side.
Not all of it.
Stack Overflow is growing now at about 56% a year, because
we already have-- just in terms of reach, like number of
unique visitors.
Stack Overflow has only grown 56% last year because it's
hard to find more programmers, at this point, that
are not using it.
But the Stack Exchange network itself is growing at 350% a
year, so much, much faster.
And it's already, I think, a third to half of our traffic.
So it's getting bigger faster.
We think that it's sort of like a system of concentric
circles, right?
We've got the programmers.
And now, you ask the programmers what they want to
talk about, and they might say other programmer-y things.
And some of them are going to be photographers, but there
are no programmers who are also lawyers, very few.
Because those are both sort of all encompassing things.
And so we haven't gotten to law, although we have an
announcement coming next week.
So pay attention next week, when we announce something.
Yes.
AUDIENCE: Do you think Quora is a different kind of market?
Or--
JOEL SPOLSKY: The question is would I say Quora is a
different kind of market.
Yeah.
I mean, I don't want to speak for the Quora guys.
To me, it looks like provoked blogging, meaning it's sort of
like a blogging platform, except that there's kind of
all kinds of provocation on there to write an awesome blog
post about your opinion, about a particular thing, that
happens to address a particular question.
So I feel like they don't--
because Quora is wider, and anything is allowed on topic,
it hasn't attracted experts in really anything other than
Silicon Valley start-ups.
And there's no particular field of expertise where
you'll find those experts on Quora yet.
And I don't know if there can be, because
when I look at Yahoo!
Answers, if I were a Fields medallist mathematician, I
would never think to go on Yahoo!
Answers and ask a hard math question.
Just like they won't go to Quora and ask it.
But when they see Math Overflow, they will, because
they see a whole bunch of other really, really hard math
questions around, and so that makes sense.
So I think that, if you start with something that's broad
and horizontal in general, you're never going to attract
the hard-core obsessive experts, or the people who
this is their job, it's their profession.
You can't get vertical from horizontal.
On the other hand, if we can build a lot of verticals, we
can, at some point, be kind of indistinguishable from a
horizontal because we've got all the verticals.
Yes.
AUDIENCE: I seem to remember that Stack Exchange does not
have an open source--
or it's not open source.
So if I'm a company and I want to use something internally or
I want to use it for my [INAUDIBLE] or
something like that, how?
JOEL SPOLSKY: Yeah.
Stack Exchange is not actually open source itself.
So we actually think that the valuable data is the text that
people have typed.
And that is open source in the sense that it's
all Creative Commons.
So we have a public open API, where you can
access any of our data.
We have database dumps where we would take an entire SQL
database and make it available on a monthly basis for anybody
that wants to download it and do anything that
they want with it.
Remix, reuse, whatever.
Just don't--
If you put ads on it, and then remove all the links to Stack
Overflow, then we'll probably be pissed off, but we may not
be able to do anything about it.
But the actual software itself is sort of a little bit
incidental to our system.
So it's not open source.
There are multiple open source clones of Stack Overflow that
you can get.
OSQA is probably the biggest one, which is an open source
project that kind of clones the way it works.
We think, again, the value is in the community and the
questions that they ask and answer.
Jon Skeet.
JON SKEET: You talked about trying to put off people who
aren't natural members of the community.
It seems that every day on Meta, there is someone who has
got upset with being down-voted and says, why are
you so harsh when, OK, I clicked saying I understand
all this and then I posted a rubbish question.
Do you think that you will--
Assuming that has to change in some way, is it going to
change by better education, so they really don't ask a
rubbish question?
Or are we going to put them off so they don't even ask
[INAUDIBLE]?
JOEL SPOLSKY: That's a really good question.
So the question is sort of every day, you have large
numbers of people showing up on Meta being angry that their
question has been closed, because it's idiotic, and it
doesn't follow the rules, or it's badly formed, or
whatever it may be.
And they are very, very adamant about
their rights to--
There's a sense of entitlement that people believe, well,
you've given me an edit box on the internet.
I have a right to type words in that edit box.
And then when it gets down-voted or deleted, they
sort of say, wait, what is going on here?
How do I not have the right for my words to appear on the
internet for everybody to see?
And this is a bit of an ongoing problem.
So we beat those people down and they get upset and
complain and they say that Stack Overflow is getting--
that the moderators are full of themselves and whatever.
I'm being censored, et cetera.
And then they-- and then, when they say they're getting
censored, of course anybody who's listening from far away
is like, you shouldn't be censoring
people on Stack Overflow.
That's horrible.
With a site as large as Stack Overflow, this
is a growing problem.
We have 7,000 questions a day.
They're not all gems.
There is a large category of people who
cannot do their jobs.
They have been hired as programmers, and they're
incapable of being programmers.
And the only hope that they have, somebody has told them,
well, just type whatever your boss told you to do into Stack
Overflow, and somebody will help you.
And the bad thing is that, sort of like if you've ever
done dog training with the intermittent reinforcement,
sometimes they get answers, so they keep trying.
Although they really get swatted down badly enough.
I don't know what the natural end state is going to be.
Education, as you mentioned, would be helpful.
We've tried the thing, which I'm not a big believer in,
where somebody new shows up at the site.
They ask a question.
You say wait stop.
Read all this text.
Make sure you are doing all these things.
It's the Eric Raymond "How to Ask Questions on the
Internet." It's a book this long.
And when you're done, you probably won't
have a question anymore.
You're not one of those people that has to ask questions.
Because the exhaustive nature of what Eric Raymond would
have you do before you can ask your question.
And the truth is sometimes you put the best questions on
Stack Overflow just by not looking something up.
Just by saying, you know what, I'm reading this
documentation.
And it makes me wonder da da da da da.
And maybe I could get the answer somewhere else already,
but I'm just going to ask on Stack Overflow because
somebody's going to answer it.
And that's going to help 100 people who are going to have
that same exact question when they read the same
documentation that I just read.
So I don't know if there's a great answer to this.
There's stuff that we're definitely
doing to work on it.
The number one thing that we're trying to do--
I would say there's sort of two things we're trying to do
to work on it.
One is, in the very short term, we have a contest up on
Kaggle which, if you're into machine learning, go answer
that Kaggle contest, you can win valuable prizes.
And it's a contest to identify questions that would likely be
closed, just based on their text.
So what we're going to try to do is develop some kind of
machine learning algorithm that can look at a question
and predict whether or not it's later going to be closed.
We actually discovered that one of the strongest signals,
before we did this Kaggle contest,
the strongest signal--
Does anyone want to guess what the strongest signal is that a
question is going to be closed?
AUDIENCE: Spelling errors?
JOEL SPOLSKY: Spelling errors, nope.
AUDIENCE: Length.
JOEL SPOLSKY: Length?
AUDIENCE: Length.
JOEL SPOLSKY: Length, no.
AUDIENCE: [INAUDIBLE].
JOEL SPOLSKY: Sorry?
Books?
No.
AUDIENCE: [INAUDIBLE]
poster.
JOEL SPOLSKY: All right, I'm just going to tell you.
Sentences that start with a lowercase letter is one of
the strongest features, essentially, if you try to do
the machine learning.
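A minimal sketch of how the lowercase-sentence signal mentioned above could be turned into a numeric feature. The sentence splitting here is a naive assumption (split after ".", "!" or "?" followed by whitespace), and the name `lowercase_sentence_fraction` is hypothetical; the talk does not describe how the feature was actually computed.

```python
import re


def lowercase_sentence_fraction(text):
    """Return the fraction of sentences whose first letter is lowercase."""
    # Naive split after sentence-ending punctuation; an assumption, not the
    # method used in the Kaggle contest.
    parts = re.split(r"(?<=[.!?])\s+", text)
    sentences = [p.strip() for p in parts if p.strip()]
    if not sentences:
        return 0.0
    lowercase = 0
    for sentence in sentences:
        # Skip leading quotes, digits, or punctuation to find the first letter.
        first_letter = next((c for c in sentence if c.isalpha()), None)
        if first_letter is not None and first_letter.islower():
            lowercase += 1
    return lowercase / len(sentences)


sloppy = lowercase_sentence_fraction(
    "need ideas about mobile apps. thanks for the help, please.")
mixed = lowercase_sentence_fraction(
    "I'm missing a bracket somewhere. where is it?")
```

A value like this would be one input among many to a classifier; on its own it only measures capitalization, which is why blocking on it mostly teaches people to capitalize.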
So we're going to try to improve that so we can
actually try to block some questions earlier.
But all that really does is make people
capitalize their sentences.
They really want to ask their stupid question and then it
needs to be--
We're going to keep working on that.
In the long run, one thing which worries me about
Wikipedia--
and that allows me to launch into a little speech about
what Wikipedia--
Wikipedia also has these non-intuitive rules.
We have this non-intuitive rule.
Don't ask shopping questions, for example.
This is non-intuitive because you might think oh, this is a
great place to ask which 30-inch monitor should I buy?
But it's not.
And don't ask subjective questions.
This is a really, really important rule for us.
And I have tried to explain--
I have now explained to you why we have this rule.
And many of you can go back and say, well, here's why
Stack Overflow closes questions that they think are
too localized.
You may not agree, but at least you know why we're doing that.
But Wikipedia has similar rules that nobody gets.
And they're always confused by these rules.
So for example, there's a rule that says we need tertiary
sources, not primary sources, for Wikipedia.
Wikipedia is making an encyclopedia and we don't
allow say, Joel Spolsky to go into the article about Joel
Spolsky and correct things that I know to be false
because I am not trustworthy.
Who is trustworthy?
The Village Voice.
The New York Times, people that know absolutely nothing
about me and actually, the information that they have,
they got from me, when they interviewed me that time.
So it seems like a funny rule on Wikipedia.
And this rule--
There's sort of another rule about notoriety, which is you
cannot have an article on Wikipedia about something
which is not notorious enough.
And then people start saying, well, the internet is not
running out of pages.
Why do you have to shut down this awesome article I wrote
about the five tires that are stacked up in the garage at
the new Google hacker space, or whatever this is.
Why can't there be an article about this on Wikipedia?
I'll make it, where we're all here.
We all see that we can write down the model numbers and the
paint color, and stuff like that.
But there's no notoriety there.
That's not notorious.
So who cares.
Well, the reason we care is because there's not anything
published that we can go back in a book, in a library, to
check if those facts on Wikipedia are correct.
So you have to have this combination of two rules, that
the thing be notorious and that all
the facts have citations.
If you don't have those two rules, stuff gets in
Wikipedia, which is not verifiable.
We don't care if it's right or wrong.
It just has to be verifiable.
There just has to be a way to check.
Because if there's no way to check, it could
never possibly be right.
And when you think about this logically,
you say, aha, I see.
I now understand why Wikipedia has to have those two rules,
which sound really nasty and anti-democratic.
Some famous writer, I'm trying to remember who
it was, Philip Roth?
Yeah, just went on Wikipedia to try to correct something
about one of his own books.
And the editor told him, look I know you're Philip Roth.
I get it.
We don't accept you as the source unless it's been
published somewhere.
So I think what he did is he did an interview somewhere,
and then cited the interview.
And they were like OK.
And that's usually the way this is resolved.
But that rule sounds so ridiculous.
And people just get angry with Wikipedia and they actually
drop out of Wikipedia.
They say, I am sick and tired of trying to correct Wikipedia
entries that are just wrong because I
can't do this anymore.
And we have the same fear over Stack Overflow.
We keep closing localized questions and people drop out
of the community because they feel like we're too strict, or
we're nasty, or our moderators are cruel.
Then we're going to lose them.
But we have to have those rules.
That's why we have a site that's awesome because we have
sort of strict rules.
So, education.
I'm all out of time.
Thanks very much for coming to hear me.
I really appreciate it all.
You've been great.
Keep answering questions and asking
questions on Stack Overflow.
I'll be hanging out here for at least another
15 minutes or so.
If you have any more questions you want to ask,
come up and ask me.
Thank you.
[APPLAUSE]