[MUSIC PLAYING]
DAVID J. MALAN: This is CS50.
So I was where you are now some years ago.
And when I was a freshman at the time, I wasn't really on
this path of doing computer science, of doing engineering.
Indeed, I came into this place as a government concentrator.
I'd been, in high school, a kid who
liked things like history.
I liked constitutional law, kind of English and math.
It's like kind of well-rounded, but didn't
necessarily know things that I hadn't been
taught in high school.
And so freshman year, I had this trepidation whereby even
though I liked computers, played computer games and the
like, I certainly never thought of myself as a
computer person, a computer scientist.
And frankly, I thought my friends in high school who
were taking computer science were a bit of geeks.
And yet, when I got here on campus, there was
this course, CS50.
And at the time, it had this reputation of really being
something to beware.
It was a good course.
It was a fun course.
But you had to actually get that foot in the door.
And even I did not cross that threshold freshman year.
And I went on my way being a government concentrator, going
through as many of the prerequisites as I could,
cross-counting things for gen ed or core and the like.
And then sophomore year, for some reason, I got up the
nerve to step foot in Science Center B, where CS50 was.
A very famous computer scientist by the name of Brian
Kernighan was teaching here that year.
And even then, I was only willing to actually fill out
my study card by taking this class pass/fail.
I looked around me and I figured everyone in this room's
gotta know way more about computer science, about
programming, about computers.
Everyone must have been programming, in this room,
since they were 12.
But, indeed, that wasn't the case.
And so the very last day, the fifth Monday of the semester,
did I take this leap and change from pass/fail to a
letter grade and ended up changing my concentration that
same day to computer science.
Now, that's not our objective in this class, to turn you all
into computer science concentrators, but really to
propose that there's an opportunity in this field and
in other fields with which you might be quite unfamiliar
given that high schools typically follow a fairly
standard curricular path, but to venture in, in CS50, into
new waters.
And if you are sitting here today thinking you don't
actually belong, so do most of the people to your right and
to your left.
Indeed, last year, 76% of the students in this class had no
prior experience.
So contrary to what you might think, most of the people
sitting around here today do not, in fact, have any prior
experience.
18% have taken one CS class, and 6% have taken two or more.
Meanwhile, we ask our students every year to describe
themselves in terms of comfort level.
And there's no one definition of this.
You just kind of know it if you're not very
comfortable in CS50.
And last year, we had 55% in this green pie slice here
self-describing as less comfortable, students who
frankly had no idea why they'd even shopped the course on
that first day.
But the same 55% remained with us until term's end, as did
35% who were somewhere in between those more comfortable
and those less comfortable.
So, what is computer science?
Well in high school, and really more generally out
there these days, there's this perception or this
misconception that computer science is programming.
And that's absolutely one aspect of computer science.
But programming, whatever the language is, is really just a
tool that computer scientists use to solve problems, either
in the domain of computer science or increasingly these
days in the physical sciences, the natural sciences, in
medicine, in humanities, to analyze large sets of data.
Anywhere now there is a computer and data, there's an
opportunity to apply lessons learned in a class like CS50.
So let's solve a problem that a computer scientist might go
about solving and try to put some jargon, put some
conceptual framework, around what might otherwise be some
fairly abstract idea.
So this is a telephone.
You don't see these things too often, though the college
still seems to have these in the houses and dormitories.
But back in the day when you wanted to use a phone like
this, there was no electronic address book
in your cell phone.
Rather, you pulled up something
known as a phone book.
And these phone books had about 1,000 pages, typically.
They were sorted from A to Z. And you simply had to find the
right page to find the person you're looking for in order to
find their name and their telephone number.
Now how do you go about looking up
someone in this book?
Suppose my goal is to give my friend, Mike Smith, a call.
Well, how do I go about finding Mike Smith?
Well, a very reasonable approach, if naive and
inefficient, would be start here and start flipping to
page 4 to page 5 to page 6, and to sort of linearly, along
a straight line, go through this phone book.
And even though it's gonna be incredibly tedious, if Mike
Smith is in this book, I'm eventually gonna reach him
when I finally flip to the S section of this book.
Now of course, you don't need to be a computer scientist to
know that this is a stupid way of solving this problem.
What would a typical human being do?
Well done.
So you would flip to the middle, right.
So you'd flip roughly to the middle, look here, and I seem
to find myself in the M section.
OK, so M is clearly not what I'm looking for.
And Mike's to the right, so to speak, of this section.
And as some of you have seen before, we can literally now
proceed to tear this problem in half.
[APPLAUSE]
You really shouldn't be that impressed.
Tearing it down the seam is actually not that hard.
The real people do it this way.
But, down the seam, we now have two problems, each of
which is half as big.
And we can literally throw that half of the problem away.
Now we're left not with 1,000 pages but, say, 500.
So now what do I do?
Well, a typical human will go roughly in the middle again.
And I find myself in the R section.
So not quite there.
So again, I can tear this problem in half.
[APPLAUSE]
Thank you.
So now I only have some 250 pages.
And I can do this again and again and again and go from
125 down to roughly 60 to 30 to 15 and so forth.
And finally, I'll get whittled down to one of the S pages on
which, if he's in the phone book, Mike Smith should be.
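The flip-to-the-middle-and-tear approach just described is what is usually called binary search. Here is a minimal Python sketch of it, offered as an illustration rather than anything shown in the lecture. The function name `find_name` and the made-up 1,000-entry phone book are assumptions for the example.

```python
def find_name(names, target):
    """Binary search over an alphabetically sorted list of names.

    Returns (index, looks): the position of target (or None if it is
    absent) and how many entries were examined along the way.
    """
    low, high = 0, len(names) - 1
    looks = 0
    while low <= high:
        middle = (low + high) // 2      # flip roughly to the middle
        looks += 1
        if names[middle] == target:
            return middle, looks
        if names[middle] < target:      # target is to the right
            low = middle + 1            # throw the left half away
        else:                           # target is to the left
            high = middle - 1           # throw the right half away
    return None, looks


# Hypothetical phone book: 1,000 sorted entries, one of them our friend.
phone_book = sorted([f"Person {i:04d}" for i in range(999)] + ["Smith, Mike"])
index, looks = find_name(phone_book, "Smith, Mike")
print(index, looks)
```

With 1,000 entries the loop never needs more than 10 looks, whereas flipping one page at a time could need all 1,000.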
Now, that's an obviously fairly reasonable algorithm,
and it's a one-time-use algorithm in this case.
But what can we sort of take away from that?
Well, the first approach, correct if naive though it
was, can be described by this straight line.
So if on the x-axis here we say this is the size of the
problem, so as the x-axis goes to the right, the
problem gets bigger.
What does it mean to be bigger in the
context of this problem?
More pages in the phone book.
There's more something we can quantify.
On the y-axis, time to solve.
So as the axis goes up, it presumably takes more time.
So that first approach of linearly searching from page 1
to dot dot dot page 1,000 is a linear procedure, a linear
algorithm or process.
And we can describe it by this straight line.
If I add one more page to the phone book, it's going to, in
the worst case, take me one more page flip
to find Mike Smith.
If I add 100 pages, 100 more flips or units of time.
Now, I can be a little clever with this.
I don't need to really turn it one page at a time.
I can do things like 2 at a time or 4 at a time.
But even that's not all that fundamentally better.
Even if it's 2 at a time, yeah, that kind of moves this
line down a bit, and it means that it takes less time given
the same number of pages.
But it's not fundamentally better.
But what did we just do, and what did all of you do
instinctively?
You actually achieved a little something like this,
logarithmic time, whereby the problem can grow and grow and
grow but the cost of solving that problem, the time
required to solve that problem, does not
grow nearly as fast.
This would be a logarithmic curve, log of n, where n is
just the size of the problem, the number of pages in this
phone book.
And what does this mean in real terms?
Well, if we have like 500 people in this room right now,
or rather, if we have--
mixing metaphor, didn't do that example yet this year--
so if we have 500 pages in the phone book and we double it to
1,000, in this more intelligent model of flipping
to the middle, how many more page tears does it take to go
from 500 pages to 1,000?
Well, just one additional page tear.
If you handed me a 2,000 page phone book, no big deal.
I just tear it one additional time.
So in short, the size of the problem can grow much faster
than the cost of actually solving it.
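To put rough numbers on the three approaches compared above, here is a small Python sketch that counts worst-case steps for each. The function names are hypothetical, and "one step" is simplified to one page flip or one tear.

```python
def flips_one_at_a_time(pages):
    """Worst case for the linear approach: look at every page."""
    return pages


def flips_two_at_a_time(pages):
    """Worst case when skipping two pages per flip: half as many flips."""
    return (pages + 1) // 2


def tears_in_half(pages):
    """Number of tears needed to whittle the book down to a single page."""
    tears = 0
    while pages > 1:
        pages = (pages + 1) // 2    # keep one half, rounding up
        tears += 1
    return tears


for pages in (500, 1000, 2000):
    print(pages,
          flips_one_at_a_time(pages),
          flips_two_at_a_time(pages),
          tears_in_half(pages))
```

Doubling the book doubles both flip counts but adds only a single tear, which is the difference between the straight lines and the logarithmic curve.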
Now this is just one such algorithm.
There are others we can solve in the same way.
And so why don't we do this?
If you would humor me, albeit awkwardly here in Sanders, go
ahead, everyone, if you could and stand up in place.
As you see on the screen here, this is an algorithm, a
process, a computer program if you will, to be executed by
humans that has just 3 steps.
We're already on step 1.
You've stood up.
And now think to yourself the number 1.
That is your current number.
Everyone here is number 1.
Step 2, pair off with someone standing, add your numbers
together, and then adopt the sum as your new number.
One of you should sit down, then repeat.
SPEAKER 1: 205.
DAVID J. MALAN: What's that?
SPEAKER 1: 205.
DAVID J. MALAN: OK.
SPEAKER 2: He has the other ones.
DAVID J. MALAN: 205?
SPEAKER 3: Yeah.
DAVID J. MALAN: OK.
3.
SPEAKER 4: 400.
SPEAKER 5: 5.
700.
DAVID J. MALAN: All right.
At this point, fewer and fewer people should be standing.
This is where it gets more awkward.
Someone here.
Here.
The worst part is you also have to very verbally do
arithmetic in front of hundreds of Harvard
undergrads.
OK.
Bit of a bug here.
Okay.
What's your number?
SPEAKER 6: Nine.
DAVID J. MALAN: What's that?
SPEAKER 6: Nine.
DAVID J. MALAN: Nine.
Okay.
What's your number?
SPEAKER 7: 179.
DAVID J. MALAN: 179?
Okay.
Good.
So 188.
So you guys can sit down.
What's your number?
SPEAKER 8: 118.
DAVID J. MALAN: 118.
Some smart undergrad, start doing the math.
Okay.
118, 188.
What else do we got?
SPEAKER 9: 71.
DAVID J. MALAN: 71.
SPEAKER 10: 79.
DAVID J. MALAN: 79.
Okay.
SPEAKER 11: 47.
DAVID J. MALAN: 47.
Which, teaching staff, that gives us how many?
705 is the answer.
And that's, in fact, exactly correct.
No, we were actually a little bit off there.
But how should this have worked?
What should have just happened?
So, on every iteration of this algorithm, we started with
some number of people standing, and that was the
total number n at first.
Then half of you sat down, and we went to n over 2.
Then half of you sat down.
We went to n over 4, n over 8, n over 16, and so forth,
until, even though it kind of disintegrated there at the
end, in theory, had everyone paired off in balcony and
mezzanine and orchestra here, we would have had just one lone
person standing with a total value, in this case, of 705.
Now, what does that mean, though, for the running time?
Well think about if I as the human had done this manually.
I would have started fairly naively but correctly with 1,
2, 3, 4, 5, 6, 7, 8, and so forth.
Takes quite some time.
So I can do better, right?
In grade school, you don't just count in ones.
You count in twos.
So 2, 4, 6, 8, 10, 12.
And that gets much faster.
But now fundamentally, by leveraging the collective
intelligence of everyone in this room, we can achieve a
curve much more like this, whereby now the number of
people in this room could double.
Another 700 people walk into this room for 1,400 people,
but it would only take us one more iteration of this
algorithm to solve.
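The stand-up, pair-off, sit-down procedure can be simulated in a few lines of Python. This is a sketch under simplifying assumptions: everyone pairs off perfectly each round, and an odd person out simply waits for the next round. The function name `count_room` is hypothetical.

```python
def count_room(people):
    """Simulate the pairing algorithm for a room of at least one person.

    Returns (total, rounds): the number held by the last person
    standing, and how many rounds of pairing it took.
    """
    numbers = [1] * people              # step 1: everyone thinks "1"
    rounds = 0
    while len(numbers) > 1:
        paired = []
        # Step 2: adjacent people pair off; one keeps the sum, one sits.
        for i in range(0, len(numbers) - 1, 2):
            paired.append(numbers[i] + numbers[i + 1])
        if len(numbers) % 2 == 1:       # odd person out stays standing
            paired.append(numbers[-1])
        numbers = paired
        rounds += 1
    return numbers[0], rounds


for people in (700, 705, 1400):
    print(people, count_room(people))
```

Under these assumptions, 700 people need 10 rounds and 1,400 people need 11, so doubling the room costs one extra round.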
And so, increasingly these days, when we have these huge
data sets in Facebook and Google and the like, it's
solving problems with a bit of insight, this bit of
cleverness, that's allowing us increasingly to do much, much
more powerful things with computers today.
If you like these kinds of things, you might have seen on
Facebook CS50's own Puzzle Day coming up this Saturday.
If you would like to participate in something like
this whereby you, in 2 or 3 or 4 teams of 4, would like to
solve some puzzles such as this one, you stand a chance
to win some fabulous prizes, among which is a Wii and some
gift cards or some other Facebook swag.
This Saturday, noon to 3:00 PM, go to
cs50.net/rsvp for such.
And this slide is online if you'd like to play around.
The problems this year shall be new.
You may notice in the classroom, too, all the more
cameras this year.
So not only will the course be filmed in the usual way, CS50
may also be taking part in a documentary on higher
education that's looking at the transformative experience
that a student can have these days in an undergraduate
course of study.
So toward these, then, not only will we be filming for
that, we will be filming as well for increasingly our
online audience, as well as on occasion this audience here.
So we welcome to the class this year our Harvard
Extension School students, Graduate School of Design,
Education, the business school, the Kennedy School,
the law school, as well as a number of students from
Belmont, Lexington, Newton, and Watertown high schools.
Welcome to you all.
In addition this year, you may have heard, Harvard and MIT,
and Berkeley now, have entered into a collaborative
partnership, an initiative called edX, which is an
initiative to open up education to all the more
people online and fundamentally start doing
research on a much broader scale as to how people learn.
And so CS50 will be the college's first course
participating in that initiative as well.
Which means you will have access to all the more tools,
all the more curricular content, all the more video
content as a result, as well, as of yesterday morning, the
53,019 people who have registered to take CS50 along
with you this year on the Internet.
So without--
[APPLAUSE]
So what this means, in particular, is that the
teaching staff and I have spent quite a bit of time this
summer preparing for the fall, both on campus and off, so
that we can begin to build up a corpus of interesting, of
compelling, of engaging educational content that
focuses, in particular, on more intimate conveyances of
fairly complex material.
So in addition to the course's lectures and sections and
things called walkthroughs, which we'll revisit in just a
bit, we'll also have these shorts this year that allow
you to engage with the course from a different angle
altogether.
So let's use this as an opportunity to take a quick
peek at one that discusses this notion of binary.
So in computer science, there are things called algorithms--
two of which we just took a look at-- these procedures for
solving problems.
But at the end of the day, you need to
represent information somehow.
And you need to represent it in a way that a computer can
understand.
And even if you don't really understand computers and
you're in that 76% right now, you probably have some vague
sense that computers somehow deal in 0s and 1s, the binary
system, so to speak.
Now why is that the case?
Well, it turns out when computers first came about, if
you needed to represent information, you could do it
with electricity.
And though this is a bit of an oversimplification, a very
easy way of recording information is either by
turning that electricity on--
a 1 in binary, so to speak-- or turning
that electricity off.
So, if Barry, if you wouldn't mind, could we dim the lights
fully for just a moment?
This here is a very gratuitous binary 0.
If we turn the lights back up, now Sanders Theatre is
representing the binary value of 1.
Unfortunately, with just one bit, with just one set of
lights, we can only represent two numbers in the
world, 0 and 1.
And it'd be nice if computers could count a
bit higher than that.
But indeed they can.
So let me pull up on screen here our friend Nate Hardison
who will give us a quick look over the course of just a few
minutes at this notion of binary.
[VIDEO PLAYBACK]
NATE HARDISON: Back when you learned how to read and write
numbers, you learned about the digits 0 to 9.
To write whole numbers larger than 9, you learned that all
you had to do was use some combination of these digits,
as in 52 and 437.
So, this way of writing numbers has a
name, decimal notation.
Why decimal?
Well, the Latin root of decimal, decem, means 10.
And when you have 10 digits in your notation system, 10
becomes a rather special number.
Let's look at the number 437 written in decimal notation to
understand why.
We can first break up 437 into 400 plus 30 plus 7.
We can take it apart even more so that we've got 4 times 100
plus 3 times 10 plus 7 times 1.
Remember learning about the ones place, the tens place,
the hundreds place, and so on?
This is exactly where that comes from.
And finally, we can see we've got a bunch of powers of 10
embedded in here.
We've got 4 times 10 to the 2 plus 3 times 10 to the 1 plus
7 times 10 to the 0.
So now you see why 10 is a special
number in decimal notation.
In fact, we've got a name for it.
It's called the base since it's the base of the exponent
in our arithmetic here.
Decimal notation is not the only way to represent numbers.
In fact, even if we get rid of the digits 2 through 9, we can
still represent all of the numbers that
we could with decimal.
So now that we have two digits, 0 and 1, 2 is our special
number, the base of our notation system.
The name of this notation system is called binary since
the prefix "bi" means 2.
So instead now of having a ones place and tens place and
so on, we now have a ones place, a twos place, a fours
place, and so on, going up by powers of 2.
So let's see this by doing some counting.
So, 0 is still 0, and 1 is still 1.
However, now that we've got a twos place instead of the tens
place, 10 represents the number 2.
To get 3, we add one to that and get 11.
4, since there's now a fours place, is
represented by 100.
5 is 101.
6 is 110.
7 is 111.
8, again, has its own place.
So it's 1000.
And I think you get the point.
[END VIDEO PLAYBACK]
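The place-value idea from the short can be sketched in Python. The two helper names here, `to_binary` and `from_digits`, are hypothetical. The first writes a whole number in base 2. The second adds up digit times base-to-the-place, the same arithmetic as 4 times 10 to the 2 plus 3 times 10 to the 1 plus 7 times 10 to the 0.

```python
def to_binary(number):
    """Write a non-negative whole number using only the digits 0 and 1."""
    if number == 0:
        return "0"
    digits = ""
    while number > 0:
        digits = str(number % 2) + digits   # remainder fills the current place
        number //= 2                        # move on to the next power of 2
    return digits


def from_digits(digits, base):
    """Add up digit * base**place, starting from the rightmost place."""
    total = 0
    for place, digit in enumerate(reversed(digits)):
        total += int(digit) * base ** place
    return total


for n in range(9):
    print(n, to_binary(n))                  # counts 0, 1, 10, 11, 100, ...

print(from_digits("437", 10))               # decimal places: 100s, 10s, 1s
print(from_digits("1000", 2))               # binary places: 8s, 4s, 2s, 1s
```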
DAVID J. MALAN: So, this is to say, what computers do and
what binary is is actually not that dissimilar from what
we've been taking for granted for some years, right?
Back in grade school, you learned to count in precisely
the fashion that Nate proposed.
But you probably haven't really thought about it since,
the fact that there is this ones place, tens place, and
hundreds place.
And that's pretty arbitrary.
And indeed, computers simply use this different base.
But at the end of the day, to actually physically represent
this notion of a 0 and 1, you obviously don't just turn the
lights on and off necessarily.
You need to do it on a much finer-grained scale.
And by finer-grained, you might remember this silly
little toy from childhood, Woolly Willy and these little
magnetic particles.
So magnetic particles are something that you can align
in a couple of different directions, perhaps
north-south or south-north.
And so a lot of physical incarnations of technology
these days that use binary, that use 0s and 1s, simply
have magnetism on the inside that aligns things, up-down or
down-up, thereby representing a 0 or a 1,
respectively.
So indeed, let's move away from the abstract here and
look at the inside of what's a more traditional
computer hard drive.
This one happens to be a bit larger on screen in that it's
from a desktop computer.
Laptops today still have the same technology, but it is
gradually being replaced by more sophisticated things that
have actually no moving parts.
The inside, then, of a hard drive.
[VIDEO PLAYBACK]
SPEAKER 12: The hard drive is where your PC stores most of
its permanent data.
To do that, the data travels from RAM along with software
signals that tell the hard drive how to store that data.
The hard drive circuits translate those signals into
voltage fluctuations.
These in turn control the hard drive's moving parts, some of
the few moving parts left in the modern computer.
Some of the signals control a motor which spins
metal-coated platters.
Your data is actually stored on these platters.
Other signals move the read-write head to read or
write data on the platters.
This machine is so precise that a human hair couldn't
even pass between the heads and spinning platters.
Yet it all works at terrific speeds.
[END VIDEO PLAYBACK]
DAVID J. MALAN: So, if we now zoom in on what's actually
happening on top of these platters in terms of the
magnetism, we have this second of two looks.
[VIDEO PLAYBACK]
SPEAKER 13: Let's look at what we just saw in slow-motion.
When a brief pulse of electricity is sent to the
read-write head, it flips on a tiny electromagnet for a
fraction of a second.
The magnet creates a field which changes the polarity of
a tiny, tiny portion of the metal particles which coat
each platter's surface.
A pattern series of these tiny charged up areas on the disk
represents a single bit of data in the binary number
system used by computers.
Now, if the current is sent one way through the read-write
head, the area is polarized in one direction.
If the current is sent in the opposite direction, the
polarization is reversed.
How do you get data off the hard disk?
Just reverse the process.
So it's the particles on the disk that get the current in
the read-write head moving.
Put together millions of these magnetized segments and you've
got a file.
Now, the pieces of a single file may be scattered all over
a drive's platters, kind of like the mess of
papers on your desk.
So a special extra file keeps track of where everything is.
Don't you wish you had something like that?
[END VIDEO PLAYBACK]
DAVID J. MALAN: Indeed.
So, we have this ability to represent information, numbers
at a very low level.
We have a physical way of representing that same thing.
But we can't really do all that much of interest yet
other than perhaps some arithmetic and mathematics.
We have no way of representing thus far things like
alphabetical letters so that we humans can communicate
using these same devices.
But thankfully there exist encodings, patterns of 0s and
1s, that represent higher level constructs like a and b
and c and entire sentences and paragraphs and the like.
And so there's ASCII, which is an acronym that refers to this
encoding system whereby a number represents a given letter.
For instance, the number that we know as decimal value 65 is
known as the capital letter A to computers.
The decimal value of 97 in computers is known as a
lowercase a.
And what does that really mean?
Well, even though Nate a moment ago only counted up
from 0 to 8, if we were to continue counting up to 65 or
further to 97, the pattern of 0s and 1s that he would have
drawn on the screen would be exactly what a computer uses
to represent the letter A in all caps or
the letter a in lowercase.
And indeed, there's a whole scheme to this.
This is a, at first glance, overwhelming chart of
encodings, but if you focus just on the right half here,
notice in this middle column we have this notion of numbers
followed by letters.
And at top we have 32.
And the character, char, to which 32, the integer, refers
is apparently the Space Bar character.
When you hit the Space Bar character on your laptop,
well, what you're really sending is a number, a pattern
of 0s and 1s, a flow of electricity if you will,
representing those 0s and 1s that the computer then
interprets as a space character on the screen.
An exclamation point is 33.
Double quotes is 34.
And if we scroll down here over to the right, we see that
65 is indeed A, and 97 is indeed lowercase a.
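The rows of the chart just read off can be reproduced with Python's built-in `chr`, which maps a number to its character. This is only an illustrative sketch, and the helper name `ascii_row` is hypothetical.

```python
def ascii_row(code):
    """Return (decimal code, its eight-bit pattern, the character it encodes)."""
    return code, format(code, "08b"), chr(code)


# The codes mentioned above: space, exclamation point, double quote, A, a.
for code in (32, 33, 34, 65, 97):
    number, bits, char = ascii_row(code)
    print(number, bits, repr(char))
```

The middle column printed here is the pattern of 0s and 1s that continuing the earlier count up to that number would produce.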
And so now that we have this encoding scheme, we can start
to spell things out.
Indeed, computers typically express themselves in standard
units, not using an individual bit, which again is not all
that useful to just represent 0 or 1, lights on or off, but
rather using sequences of bits.
And the most common unit of measure, as you probably know
or at least inferred, is a byte.
A byte is just eight bits, eight 0s or 1s in a row.
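As a quick sketch of why grouping bits helps: each extra bit doubles the number of distinct on/off patterns. The helper name `patterns` is hypothetical.

```python
def patterns(bits):
    """How many distinct sequences of 0s and 1s fit in this many bits."""
    return 2 ** bits


print(patterns(1))   # one set of lights: just 0 and 1
print(patterns(8))   # one byte
```

One bit gives 2 patterns, while a byte gives 256, enough to cover every value from 0 through 255.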
So we can start spelling things out.
And so, if we could, why not try this a little bit
collectively here.
Are there eight people in this room who would be willing to
come up on stage?
You have to be comfortable appearing on camera, but you
don't really need to know, otherwise, what's
going on just yet.
I see one person being volunteered over here.
Two, three, four, five, six, seven, and how about eight.
Come on up.
So you are about to represent a byte of people.
Let me have you be the 128's place, you the 64's place, you
the 32's place.
But we're gonna very rapidly have to reverse this.
So let me meet you all over there.
And you should be in the 128's place all the way over here.
Much like the hundreds place and the thousands place would
be farther to the left, we want the biggest placeholder
to be here on the left as well.
We have 64's, 32's, 16's, 8's, 4's, 2's, and 1's.
Excellent.
So now we have--
OK, you can help me.
So now we have-- what's your name?
JOANNE: Joanne.
DAVID J. MALAN: Joanne.
So Joanne and I are now going to advise these guys on how we
can go about spelling something out.
So on the backs of their sheets of paper, they have a
little cheat sheet that's going to tell them whether
they were representing a 0 or a 1.
And why don't for simplicity, we'll represent 0 by just
standing there awkwardly.
Very good.
Or a 1 by raising your hand, representing a 1.
And let's see if we can't spell out a four character
phrase here.
So, go ahead now, volunteers, and execute round one by
raising your hand if you're a 1 or keeping it
down if you're a 0.
So, now that we have these three hands up, what number,
everyone else, are they actually representing?
OK.
67.
Why?
Well, quick sanity check.
64's place, because it's a 1, that's like 1 times 64 plus 1
times 2, so that's 66 plus 1 times 1.
That's plus 1, so 67.
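The sanity check above is just a sum over the raised hands. Here is a small Python sketch of it, writing a raised hand as "1" and a lowered hand as "0", leftmost volunteer first. The names `PLACES` and `byte_value` are assumptions for the example.

```python
# Place values of the eight volunteers, from stage left to stage right.
PLACES = [128, 64, 32, 16, 8, 4, 2, 1]


def byte_value(hands):
    """Add up the place value of every volunteer whose hand is raised."""
    return sum(place for place, hand in zip(PLACES, hands) if hand == "1")


# Round one: the 64's, 2's, and 1's places have their hands up.
print(byte_value("01000011"))
```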
So now these guys are collectively representing 67
which apparently represents what here in ASCII?
OK.
So a C.
All right.
So now let's proceed to round two.
Everyone starting with their hands down.
And in round two--
actually there's not much of a role here, I suppose, but
we'll pretend.
So round two, raise or lower your hands.
All right.
Audience, what are we now expressing? 83.
So you could do the mathematics.
But for anyone whose hand is up, you add in the number that
they represent.
So now we have 83.
Let's expand the cheat sheet a little bit, and we now have--?
[INAUDIBLE]
DAVID J. MALAN: OK.
This might be obvious where we're going here, but
nonetheless, round three.
OK.
Round three's good to go down there.
So round three, what number are these guys now
representing?
OK.
I heard 53, which now represents?
Interesting.
Now why this sort of counterintuitive result, right?
If we want to represent 5-- we all probably know
where this is going--
why don't I just raise the 4's place and the 1's place?
Well, realize that there's a difference, fundamentally,
between how a computer interprets these bits.
If you're trying to represent the number 5, then absolutely,
we just raise hand number 4 and raise hand number 1.
But we're not representing numbers here.
The context here on stage is that we're representing
characters, or chars.
And in this context the computer has to realize that,
oh, this pattern of bits is not a number alone, it's
actually representing a higher level concept, in this case an
alphabetical letter.
So the fact that it is now representing the number 5 with
the value of 53 is because in ASCII the thing we
aesthetically see as the number 5 itself needs a
pattern of bits.
Because why?
Well, the world just decided to use the lower numbers, 0,
1, 2, 3, for what look to be fairly cryptic things.
And indeed, these are the characters that aren't on a
keyboard, special expressions that you need in a computer to
do interesting things, but humans never
actually type them.
So 53 indeed represents 5.
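The difference between the number 5 and the character 5 can be seen directly in Python. This is only a sketch. It relies on the fact that ASCII places the digit characters 0 through 9 at the consecutive codes 48 through 57. The helper name `digit_value` is hypothetical.

```python
def digit_value(char):
    """Turn a digit character like "5" into the number it depicts."""
    return ord(char) - ord("0")     # "0" is code 48, "5" is code 53


print(format(5, "08b"))             # the number 5: 4's place plus 1's place
print(ord("5"))                     # the code for the character "5"
print(format(ord("5"), "08b"))      # that code as a pattern of bits
print(digit_value("5"))             # back from the character to the number
```

The same eight bits can mean a number or a character, and the context determines which interpretation applies.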
Now, just as a final sanity check, what number should they
represent in just a moment?
AUDIENCE: 48.
DAVID J. MALAN: OK.
So 48.
And indeed, go ahead.
Round four.
16 plus 32 is, indeed, 48.
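Putting the four rounds together, here is a Python sketch that decodes each byte into its ASCII character. The bit strings are derived from the four numbers called out on stage (67, 83, 53, 48), not taken from the volunteers' actual cheat sheets, and the names `rounds` and `decode` are hypothetical.

```python
# One string per round, leftmost volunteer (the 128's place) first.
rounds = ["01000011", "01010011", "00110101", "00110000"]


def decode(byte_strings):
    """Read each byte as a number, then look up its ASCII character."""
    return "".join(chr(int(bits, 2)) for bits in byte_strings)


print([int(bits, 2) for bits in rounds])
print(decode(rounds))
```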
And so a big round of applause, if we could, for our
eight volunteers here.
Thanks.
You can keep this one.
If you--
Very well done.
Any direction is fine.
So, we now have a way not only of thinking about how to
represent data and actually representing it physically,
but also doing higher-level things on top of it.
Indeed, this is going to be a theme throughout computer
science of building more and more interesting complex
things on top of fairly simple ideas, in this
case just 0s and 1s.
In terms of why this is useful, well even though in a
course like this we'll focus on fundamentals and on
programming and on solving of problems, you can go off in
computer science in any number of directions.
In this case here, this is a chart that you have at the
back page of your unofficial guide to CS at Harvard, one of
today's two printouts.
This suggests the many different directions in which
you can go after a course like this.
Learning about artificial intelligence, about graphics,
about machine learning, about language itself.
Realize, too, that there are yet other paths.
There are more mathematical paths in computer science.
If you're not even able to take something like CS50 this
fall, there's introductory courses in the spring.
Computer Science 1, for instance, is yet another
on-ramp to this new world.
Now as an aside in the interest of solving problems
related to courses, realize that CS50 set out some time
ago to try to solve one of these problems, problem known
as my.harvard, which many of you might be using to actually
shop for courses.
But if not, check out a tool like this, as well as other
descendants that some of our past students
and staff have created.
But in Harvard Courses, which is a web-based tool--
something that you will be capable of designing and
deploying yourself, as well as yet other things as well, by
the end of the semester.
Realize that this builds upon an open data set, in this case
a course catalog, and allows students in this case to
explore a fairly complex data set.
We dug up last night a few statistics based on the few
thousand folks who have been using this over
the past few days.
If you've been curious to know how many courses your friends
actually tend to shop, well, today's data suggests that
7.6 is the average number of courses on
someone's shopping list.
And now I'll give you, also, the statistic of the most
number of courses on someone's shopping list.
And we all probably know someone like this.
201 is this year's record.
Now some of our former students and staff actually
put together a clip to paint a picture for you of what this
path of computer science and CS50 itself is.
Let me go ahead and pull up, thanks to Mr. Hahvahd here, a
video produced by some of your predecessors.
If we could keep the lights up for this.
[VIDEO PLAYBACK]
SPEAKER 14: (SINGING) We take our time with some scratch,
for loops, events, we can match, compiling using our
bash, this term won't be a bore.
Hacking fun, some free meals, lectures are simply unreal,
our fair is such a big deal, there's so much to adore.
Go, David Malan.
Walkthroughs, I'm not bailing.
Office hours, no one's failing.
Where you think you're coding, baby?
Hey, I just met you and this is crazy,
but here's our reason.
Take CS50.
It's hard to code right without you, baby.
But here's our reason.
Take CS50.
Hey, I just met you and this is crazy,
but here's our reason.
Take CS50.
And every star firm wants to hire me, another reason.
Take CS50.
Before you came into my life, I coded so bad, I coded so
bad, I coded so, so bad.
Before you came into my life, I coded so bad
and I can't go back.
Take CS50.
[END VIDEO PLAYBACK]
DAVID J. MALAN: I had no idea that was going to happen.
So, a more serious look at what lies ahead.
So in terms of the expectations of this course,
you're indeed expected to attend or watch the course's
lectures, submit a problem set, take two quizzes, submit
a final project.
In terms of grades, realize that my comment at the opening
about pass/fail is something that we very much take to
heart in CS50.
There is not nearly enough of a culture at Harvard of trying
something and risking failure.
Indeed, we had numbers of students, and myself, in
particular, who were worried about hurting their GPA or
getting a B in something like CS50.
And the opportunity to take a course like this, and other
gateway courses at the introductory level, pass/fail
is a very underutilized opportunity at this college,
in general.
And so please know even I enrolled in this course
initially for pass/fail credit alone.
And even though I did switch at the end of the day, it was
those five initial weeks, up to the fifth Monday of the
semester which is the cutoff, that allowed me to actually
put foot into these new waters and actually try something
very unfamiliar and very uncomfortable
for me at the time.
So in terms, now, of what role the various angles via which
you can approach this course serve, so lectures, it's up to
you if you engage with us in person at this venue.
Indeed, we know statistically that roughly 40% of you will
kind of come and go over the course of the semester.
And 10% of you, we will never see again after today.
And that's perfectly fine, to be honest.
One of the defining characteristics of CS50 is
that there are these innumerable resources, some of
which we'll rattle through in just a moment, including
lectures and sections and things called walkthroughs and
office hours and the like.
And it's more resources than the typical student should
have to or could physically take advantage of.
But that's because of the disparate learning styles that
any student body manifests.
And so in lectures, the primary role, as I see it, is
not to verbally push out fairly complex material and to
necessarily deliver all of the intricacies of the
fundamentals that we'll explore this semester, but
rather to do things like we've been doing thus far already,
these examples, involving humans onstage, trying to
paint a mental picture, and also create, dare I say, some
of these memorable moments.
So that even as you struggle with certain topics, you have
these memories like, oh, even though that was fairly
abstract, the math, I got lost with carrying the 1, like it
really, at the end of the day, is not all that dissimilar to
something I already know.
And so the role that lectures will serve, either in person
here in Sanders or online on video, is really to set the
stage mentally for you each week for the various concepts
and problems that we'll be diving into.
In terms of the high-level concepts, most of these words
might flow over your head for the moment, and that's fine.
Those of you who come into the course more comfortable
will know of some of these topics.
But for that 10% of the class who have much more background,
having taken AP computer science or having been
programming since they were 12, realize that there will be
opportunities in sections and in problem sets to go all the
more into depth into various topics, filling in whatever
gaps you might have from your high school or prior
background.
In terms of the languages, realize that what language we
use in CS50 is largely irrelevant at
the end of the day.
We happen to use, primarily, a language called C. Toward the
end of the semester, we introduce web-centric
languages like PHP and JavaScript.
But we and others could teach a course like this in most any
modern high-level language.
Python and Ruby and others are quite popular these days.
Because realize at the end of the day, you're not learning
in this course C. You're not learning PHP or JavaScript.
You're learning how to solve problems, whether web-based,
computer-based, or data-oriented itself, using
these simply as tools.
Now, in terms of the logistics, you'll use
something, eventually, called the CS50 Appliance.
Does not matter if you have a Mac, a PC, a Linux computer,
or the like.
You'll have freely available software starting next week
with which to use the CS50 Appliance, a virtual
environment that you'll use on your own computer so that you
and all of your classmates have a uniform Linux desktop
in this case.
It's the problem sets, though, in which you'll really get
your hands dirty in the course.
And at the end of the day, it's the problem sets, I
think, that really define a student's
experience in this course.
Realize that many of the problem sets will be released
in two editions, a standard edition that we expect and
encourage 90% of the class to dive into.
But we also release some problem sets in
so-called hacker editions.
And you know it's the hacker edition because on every page
with a watermark it says hacker edition on it.
And that's for this demographic of you who have AP
computer science with 10 years of programming under your belt
and are looking to fill those gaps and to have more formal,
rather than self-taught, training, perhaps.
Realize that there is a very substantial demographic in the
class that has precisely that same goal.
You'll have five late days.
Problem sets are generally due on Thursdays, but you can
extend five of those deadlines using these
things called late days.
And we'll also drop your lowest score at the end of the
semester per the particulars in the syllabus.
But another defining characteristic of CS50 over
the years has become office hours.
It's an opportunity that you saw visually in photos a bit
ago in which we gather-- previously in house dining
halls, prior to that in the basement of the Science
Center, and this year in Annenberg Hall-- four nights a
week from 8:00 PM to 11:00 PM where you'll have this very
much shared experience of working on, struggling
through, certain problems, but with a substantial support
structure in place.
Indeed, the way this will work is you'll arrive at Annenberg
if you have some question during the week, you'll bring
your laptop, you'll sit down, grab some food, and you'll log
into CS50 Discuss, a web-based utility that the teaching
staff has developed that will allow you to post questions
and see follow-ups in a typical discussion forum
sense, using labels and the like and auto complete to
search the data.
But you'll also be able to, during the hours of office
hours, have your questions escalated to
actual human beings.
Indeed, the goal ultimately is so that one, we begin to build
up over the course of the semester a corpus of hopefully
really useful information, common answers to common
questions, so that you yourself can solve problems
and get unstuck as quickly as possible, but while having the
teaching staff, usually 20 to 30 of the teaching fellows and
course assistants, on staff at once.
We will have what's called the CS50 Greeter in Annenberg.
And when we determine that, you know what, this question,
we can't really answer effectively online.
We need to see your computer.
We want to talk to you one-on-one.
You're really struggling and you, therefore, want to talk
one-on-one alongside someone, you'll be dispatched to the
CS50 Greeter, a teaching fellow holding, literally, an
iPad that has students' names on one side, teaching staff's
names on the other.
We will click your name followed by the name of a
teaching staff, and your computer screen will start
blinking saying please go see Alice or please go see Bob at
the staff table.
And so in this way, we will be able to dispatch things as
efficiently as possible, as well as guide you toward
solutions all the more readily.
In sections, these will be opportunities for more
intimate hands-on opportunities with one of the
teaching fellows and 12 to 16 or so of your classmates in
which each week we'll have problems in the problem set
that ask a number of conceptual questions and a
number of bite-sized programming questions that you
could figure out on your own, and you could work on your
own, but in the context of section where we work through
collectively some of those problems and go where the
different conversation takes us.
In addition, in section will you have opportunities to
review homework submissions, yours and your
classmates', sometimes anonymized, always via opt-in
if you would like to share the work that you've submitted.
So it will really be a two-directional conversation,
an opportunity to review your own work in a much more
dynamic sense, rather than simply looking at a PDF or a
printout and thinking about it for a few seconds and not
necessarily absorbing the feedback that the teaching
staff have provided.
And you'll use a tool here called CS50 Spaces.
For those unfamiliar, this is the language known as C at top
left, and you'll get to know this over time.
But this is a web-based utility that we'll use in
section that will allow you and your 15 or so classmates
to log in with your teaching fellow at the
front of the room.
You'll be able to write code in this window.
You'll be able to chat electronically, if you're not
actually at section at that particular moment.
And your teaching fellow, when it comes time to discuss
Alice's or Bob's solution in class, the teaching fellow can
click a button and voila, project onto the screen,
whatever that student has been working on at that particular
point in time.
So for those of you who have friends who have taken CS50 in
the past, realize that sections have been significantly
rebooted this year to be all the more active, all the more
dynamic, and really a two-way conversation between teaching
staff and students.
And walkthroughs.
So for these problem sets, we also offer not only the
specification itself, which is generally a fairly detailed
PDF, but also things known as walkthroughs whereby one
member of the teaching staff will lead a weekly session
that literally walks you through the problem set, giving
you hints and advice and starting points and is meant
to answer the very frequently asked
question, where do I begin?
Well, you begin either by diving into the spec on its
own or by attending or watching these walkthroughs.
The first walkthrough, in fact, will be this Friday.
They'll be on Fridays, not so much because we think it'll be
a popular time but because we can then film them very early
in the week to get them online by the weekend so that you
have as many days as possible to actually engage in that
content as well.
But more on that in lecture this Friday.
Now in terms of the support structure, the most
significant statistic is perhaps the 108 teaching
fellows and course assistants that this
course currently has.
If some of you who don't have conflicting classes would like
to join me up here on stage, it is these guys who will
ultimately really define your experience in the course.
I had a lot of teaching fellows teaching me classes in
the day, and I remember very few of those frankly.
But to date, I still remember among those few, my CS50 TF
who really helped me answer questions, who really helped
me when I was struggling, and really was a partner in this
experience of learning a very new world.
In a little bit, all of these guys will join you outside for
cake, which is a tradition of CS50, in the transept of
Memorial Hall.
But allow me first to introduce you to Nate
Hardison, again, Rob Bowden, and Tommy MacWilliam, this
year's course heads.
If you guys would join me here in the middle.
They have all prepared some inspirational remarks.
TOMMY MACWILLIAM: I didn't prepare anything
inspirational.
But my name is Tommy.
I'm a senior in Mather.
I'm studying computer science.
I'm really excited to be on the heads team and going
through the CS50 journey with you.
What I really love about CS50 is how it really teaches you
to think about problems in a new way.
This is really a skill that's gonna be invaluable no matter
what field you go into.
And not only that, but we offer more free candy than any
other course on campus.
Yeah, and so I'm really looking forward to seeing what
everyone builds this semester.
And if anyone has any questions now or throughout
the semester, definitely feel free to reach out to me and
I'd be happy to help.
ROB BOWDEN: Hi.
I'm Rob Bowden.
I'm a senior in Kirkland.
Yeah, that's right.
We're all really excited for this next semester.
We hope you're all excited.
I wasn't expecting that.
Yeah.
So we put so much effort into making this
semester really great.
And as long as you're willing to put in the effort, there is
so much you can get out of this course.
Ah, we--
yeah.
You can get a lot of fun out of this course.
We wouldn't have a staff of 108 if you couldn't get a lot
of fun out of it.
So, just try to be involved and you won't regret it.
NATE HARDISON: Hi, guys.
I'm Nate.
I'm the preceptor for the course.
I'm really excited to be here as well.
This is my first year here.
I hope you all take this course and enjoy it as much as
I've enjoyed it so far.
And if you ever want to learn how to count to 9 or 10 in
binary, come talk to me.
DAVID J. MALAN: So at the risk of leaving these guys here on
stage a bit awkwardly, let's whirl through just a few of
the things that await before we adjourn for cake.
What is it that lies ahead?
Well, if we take a quick look back at last year, in problem
set 0, your predecessors dove into a programming
language called Scratch, a graphical programming language
you'll use in the first days of the course starting this
Friday to learn some concepts unfamiliar to some of you.
But realize there will be an advanced aspect of this for
those of you with prior background.
In last year's problem set 2, students dove into the
world of cryptography, the art of enciphering or scrambling
information, implementing programs that encrypted data.
And in the hacker edition last year did students proceed
to crack or decode the passwords in a typical
computer's /etc/passwd file by coming up with algorithms and
heuristics for figuring out by brute force what someone's
password on a computer system was.
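To make the idea of enciphering concrete, here is a minimal sketch of a Caesar cipher in C. The lecture does not say which ciphers the problem set used, so the choice of cipher and the function name `caesar_encrypt` are assumptions made purely for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

// Hypothetical helper (not from the lecture): shifts each letter of text
// forward by key positions, wrapping around the alphabet. Characters that
// are not letters are left unchanged. The string is modified in place.
void caesar_encrypt(char *text, int key)
{
    assert(text != NULL);

    // Normalize the key into the range 0..25 so negative keys work too.
    int shift = ((key % 26) + 26) % 26;

    for (size_t i = 0; text[i] != '\0'; i++)
    {
        unsigned char c = (unsigned char) text[i];
        if (isupper(c))
        {
            text[i] = (char) ('A' + (c - 'A' + shift) % 26);
        }
        else if (islower(c))
        {
            text[i] = (char) ('a' + (c - 'a' + shift) % 26);
        }
    }
}
```

Calling the function again with the negated key undoes the shift, which is also why a cipher this simple is easy to break: there are only 26 keys to try.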
Last year, too, in problem set 3-- in
problem set 4-- did students
implement the game of Sudoku.
And in the hacker edition that year did students not
just implement how to play the game, but actually a solver
whereby the computer can provide you, the human, with
hints by solving that particular problem
more rapidly than you.
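One common way to write such a solver is backtracking: try a value in an empty cell, recurse, and undo the choice if it leads to a dead end. The lecture does not describe how students' solvers actually worked, so the sketch below is only one possible approach, and the names `SIZE`, `can_place`, and `solve_sudoku` are illustrative assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SIZE 9

// Returns true if value does not already appear in the given row,
// column, or 3x3 box.
static bool can_place(int board[SIZE][SIZE], int row, int col, int value)
{
    for (int i = 0; i < SIZE; i++)
    {
        if (board[row][i] == value || board[i][col] == value)
        {
            return false;
        }
    }
    int box_row = row - row % 3;
    int box_col = col - col % 3;
    for (int r = box_row; r < box_row + 3; r++)
    {
        for (int c = box_col; c < box_col + 3; c++)
        {
            if (board[r][c] == value)
            {
                return false;
            }
        }
    }
    return true;
}

// Fills every empty cell (marked 0) in place. Returns false if no
// solution exists. Assumes the starting clues do not already conflict.
bool solve_sudoku(int board[SIZE][SIZE])
{
    assert(board != NULL);

    for (int row = 0; row < SIZE; row++)
    {
        for (int col = 0; col < SIZE; col++)
        {
            if (board[row][col] != 0)
            {
                continue;
            }
            for (int value = 1; value <= SIZE; value++)
            {
                if (can_place(board, row, col, value))
                {
                    board[row][col] = value;
                    if (solve_sudoku(board))
                    {
                        return true;
                    }
                    board[row][col] = 0; // undo and try the next value
                }
            }
            return false; // no value fits this cell
        }
    }
    return true; // no empty cells remain
}
```

A hint for the human player could then be as simple as solving a copy of the board and revealing one cell from the result.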
In problem set 5, we did forensics, this art of
recovering information that was accidentally or very
deliberately deleted from a computer.
Last year, the teaching staff and I strolled around campus
taking photographs of people, places, and things, and then
accidentally formatted the media card on our camera that
had all those photos.
But no problem.
We made a forensic image of this media card, handed it out
to all students in the class, and challenged them to write
programs that recovered all of the JPEGs from that card.
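A recovery program of this kind can be sketched in C as follows. JPEG files begin with the bytes 0xFF 0xD8 0xFF; everything else here is an assumption for illustration and not taken from the lecture: that the card stores data in 512-byte blocks, that each photo starts on a block boundary and is stored contiguously, and the names `BLOCK_SIZE`, `recover_jpegs`, and the `000.jpg`-style output files.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define BLOCK_SIZE 512 // assumed block size of the card's file system

// JPEG files begin with the three bytes 0xFF 0xD8 0xFF.
static bool is_jpeg_start(const unsigned char *block)
{
    return block[0] == 0xFF && block[1] == 0xD8 && block[2] == 0xFF;
}

// Hypothetical helper: reads the forensic image block by block and copies
// each JPEG found to 000.jpg, 001.jpg, and so on in the current directory.
// Returns the number of files written, or -1 on an output error.
int recover_jpegs(FILE *image)
{
    assert(image != NULL);

    unsigned char block[BLOCK_SIZE];
    FILE *out = NULL;
    int count = 0;

    // Only full blocks are considered; a trailing partial block is ignored.
    while (fread(block, 1, BLOCK_SIZE, image) == BLOCK_SIZE)
    {
        if (is_jpeg_start(block))
        {
            // A new signature marks the end of the previous photo.
            if (out != NULL)
            {
                fclose(out);
            }
            char name[16];
            snprintf(name, sizeof name, "%03d.jpg", count);
            out = fopen(name, "wb");
            if (out == NULL)
            {
                return -1;
            }
            count++;
        }

        // Blocks seen before the first signature are skipped.
        if (out != NULL && fwrite(block, 1, BLOCK_SIZE, out) != BLOCK_SIZE)
        {
            fclose(out);
            return -1;
        }
    }

    if (out != NULL)
    {
        fclose(out);
    }
    return count;
}
```

The sketch works only because formatting a card typically leaves the photo data itself in place; if the photos were fragmented or overwritten, a block-by-block scan like this would not be enough.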
And this is actually one of our favorite problem sets.
And I dug up an email from one of your predecessors,
which was great fun to read some time ago.
He wrote-- this is from Matt-- dear David, yesterday my
sister accidentally formatted her camera's SD card and lost
a year's worth of memorable photos.
She unfortunately isn't the best at backing up her data.
But this situation reminded me of pset 5, so I thought I
would try to run her SD card through the recover program
that I wrote all the way back in October.
So after four hours of figuring out how to create a
raw image from the formatted SD card--
Google proved to be pretty unhelpful in this regard until
ironically I happened to come across your instructions on
the Internet--
after tinkering around with some of the command arguments,
I managed to create the forensic image.
And after installing and configuring the CS50
Appliance, I managed to run the forensic image through my
program and recover all 1,027 of my sister's photographs.
Right, Matt.
So in last year's--
[APPLAUSE]
In last year's problem set 6, we gave the students a
dictionary of 150,000 English words and challenged them to
write a spell checker that answered queries of the form
is this word spelled correctly or
incorrectly as fast as possible.
And on an opt-in basis were students allowed to then
challenge classmates by posting their results, the
amount of RAM that they used, the number of CPU cycles or
seconds that they used, so that students were then ranked
on the course's website.
Again, purely optional aspect of it, but great fun in that
very often would a student get to position number 10 or so on
the big board on the website, go off to dinner, and then
come back and realize his or her roommate had just edged in
front of him or her on the big board, thereby pouring in another
two or three hours just to one-up his or her roommate.
So we look forward to something similar
this year as well.
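One way to answer "is this word spelled correctly" quickly is to load the dictionary into a hash table, so that each lookup touches only a handful of words rather than all 150,000. The lecture does not say which data structure students used, so the sketch below is just one option; the table size, the maximum word length, the djb2-style hash, and the names `dictionary_add`, `dictionary_check`, and `dictionary_unload` are all assumptions for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define BUCKETS 65536 // assumed table size, chosen for illustration
#define MAX_WORD 45   // assumed maximum word length

typedef struct node
{
    char word[MAX_WORD + 1];
    struct node *next;
} node;

static node *table[BUCKETS];

// Copies word into out in lowercase; returns false if word is too long.
static bool lower_copy(const char *word, char out[MAX_WORD + 1])
{
    size_t length = strlen(word);
    if (length > MAX_WORD)
    {
        return false;
    }
    for (size_t i = 0; i <= length; i++)
    {
        out[i] = (char) tolower((unsigned char) word[i]);
    }
    return true;
}

// djb2-style string hash, reduced to a bucket index.
static unsigned int hash_word(const char *word)
{
    unsigned long hash = 5381;
    for (size_t i = 0; word[i] != '\0'; i++)
    {
        hash = hash * 33 + (unsigned char) word[i];
    }
    return (unsigned int) (hash % BUCKETS);
}

// Adds word to the dictionary; returns false if it is too long or
// memory runs out.
bool dictionary_add(const char *word)
{
    assert(word != NULL);
    node *entry = malloc(sizeof(node));
    if (entry == NULL)
    {
        return false;
    }
    if (!lower_copy(word, entry->word))
    {
        free(entry);
        return false;
    }
    unsigned int bucket = hash_word(entry->word);
    entry->next = table[bucket];
    table[bucket] = entry;
    return true;
}

// Returns true if word is in the dictionary, ignoring case.
bool dictionary_check(const char *word)
{
    assert(word != NULL);
    char lowered[MAX_WORD + 1];
    if (!lower_copy(word, lowered))
    {
        return false;
    }
    for (node *n = table[hash_word(lowered)]; n != NULL; n = n->next)
    {
        if (strcmp(n->word, lowered) == 0)
        {
            return true;
        }
    }
    return false;
}

// Frees every node so the table can be reused.
void dictionary_unload(void)
{
    for (size_t i = 0; i < BUCKETS; i++)
    {
        while (table[i] != NULL)
        {
            node *next = table[i]->next;
            free(table[i]);
            table[i] = next;
        }
    }
}
```

The trade-off here is the one the big board measured: a larger table uses more RAM but keeps each bucket's chain short, so lookups take less time.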
In problem set 7 did we steer in the direction of web
programming, actually solving problems in the ever
increasingly common environment of a web browser.
Now decreasingly do we download software on Macs and
PCs, but increasingly do we do it all within the web.
And indeed last year, some 88% of students' final projects in
the course were web-based.
And those, too, are skills that you will derive from this
class by course's end.
Because what awaits at course's end is the CS50 Fair,
this exhibition that's based on the idea of a science fair.
But in this version of a fair do all students in the class
bring their laptops and their friends and family and others
to Northwest Science, a large building on campus, set up
their laptop, get some food, get some popcorn and drink,
and then exhibit their final projects for all those in
attendance who last year numbered some 2,500 attendees
from across campus.
And expressions like this and like this were not
uncommon at the fair.
Leading up to the fair is the CS50 Hackathon, an opportunity
to hop on a Harvard shuttle, head down the street to
Microsoft at 8:00 PM, and not go home until 7:00 AM.
We serve first dinner at 8:00 PM, second dinner at 1:00 AM,
and for those still standing at 5:00 AM, do we treat to
pancakes at IHOP.
And the Hackathon is an opportunity, as pictured here,
to dive into your final projects, whether working on
your own or with friends in a collaborative environment,
where the entire teaching staff is working well into the
night with an ample supply of Hong Kong Chinese food.
At 5:00 AM will such images as these be quite common this
year as well.
So as we adjourn in a moment for cake, keep in mind that
76% of the people in this room have no prior experience.
And as per the syllabus, what ultimately matters in this
course is not so much where you end up relative to your
classmates, but where you in week 11 end up relative to
yourself in week 0.
This is CS50.