>> SPEAKER 1: All right.
Welcome back.
This is Week Two of CS50, and we have thus far been using functions but
largely taken them for granted.
We've used printf which has the side effect of printing
things on the screen.
We've used get int, get float.
>> But what if you actually want to create your own functions, as some of
you might have already begun to do for Problem Set One, though
not strictly required?
Well, let's go ahead and revisit that problem of just asking the user for
their name and printing something on the screen, but try to factor out some
of the commonality that we've seen in our code thus far.
So by that I mean the following.
>> I'm going to go ahead and create a new program, just call
it hello.c as usual.
I'm going to go ahead and give myself include standard io.h at the top.
I'm going to also give myself preemptively the CS50 library so that
I don't get yelled at by the compiler.
And now I'm going to go ahead and declare int, main, void.
>> And then in here, this is where I want to begin to outsource functionality to
some other function that I myself am going to write but that doesn't
currently exist.
For instance, suppose that I wanted to write a function that allows me to
print out hello, comma, and then some user's name.
Rather than continuing to do printf hello, %s, wouldn't it be nice if
there were just a function called not printf but print name?
>> So in other words, I want to be able to write a program that does a little
something like this.
First, I'm going to say printf your name, thereby prompting the user to
give me his or her name, and then I'm going to use the familiar string s to
declare a string.
Give me a variable of type string, call it s, and store in that the
result of calling get string.
But now in weeks past, I would have somewhat tediously done hello, %s\n.
>> And in other words, we've seen this example a bunch of times, and it's a
trivial example because there's just one line of code so it's really not a
big deal to keep typing it again.
But suppose that this line of code actually were becoming a burden, and
it's not one line of code but it's 10 lines of code a couple weeks from now,
and you're just getting tired of copying and pasting or
retyping that same code.
Wouldn't it be nice instead of doing printf hello, %s and so forth,
wouldn't it be nice if there were just a function called print name that
takes an argument--
in other words, it takes input--
and then semicolon.
So that function, wouldn't it be nice if that existed?
Then I wouldn't have to worry about what printf is, what %s and all of
these complexities that are not all that interesting.
They are useful.
>> So print name, unfortunately, was not invented some 40 plus years ago.
No one thought to write it.
But that's the beauty of having a programming language, just like in
Scratch you can define custom blocks, so in C and most any language you can
define your own functionality, you can define your own functions.
So even though we get main automatically for free, we can declare
our own functions.
>> So I'm going to make some room up here up top, and I'm going to declare my
own function that's going to look a little strange at first but we'll come
back to this before long.
I'm going to say void, thereby indicating this function does
something, has a side effect, but it doesn't return something to me in the
same way that get int or get string itself does.
And I'm going to give this function a name of print name, and I'm going to
specify that this guy is going to take a string, and I'm going to call that
string name.
I could call it anything I want, but I want my code to be self-documenting.
In other words, if one of you were to open this file and read it, you could
sort of infer from the name of that input what role it's supposed to play.
>> And now below that, I'm going to open curly brace and closed curly brace,
and so notice I've followed the same pattern on lines four through seven as
I've been following for a good week plus now between, say, lines nine and
14 which compose main.
In other words, print name is another function.
Now, the compiler is not going to know to call this thing automatically
because I literally just invented it, but it will know still to call main
automatically, and then of course in line 13, I am calling my own function.
And because I've declared that function up on line four before main,
this is going to teach the compiler what quote, unquote, "print name"
means and what it should do.
So I'm sort of giving it a new custom block in the context of, say, Scratch.
>> So in here, I can put that very common or recurring pattern of code I keep
writing in class, printf "hello, %s\n",--
what do I want to put here?
s?
So I want to put name in this context.
So notice a bit of a dichotomy here.
Because I am declaring my own function and I have somewhat arbitrarily called
it print name, and because I've specified in parentheses that this
function takes one argument, the type of which is a string-- so it's a word
or phrase or something like that-- and I'm calling that argument name, that
means the only variable that's in scope, so to speak, is name.
>> s only exists between what two curly braces, of course?
Well really, just lines 10 through 14, so just like on Monday, I cannot use
s here, but what I can do is pass s into print name.
Print name just so happens to give it an alias, a synonym, a nickname,
calling it name, and now using it in this line.
So now let me save this, zoom out.
>> Let me go ahead and make hello.
Looks good.
Didn't spit out any errors. ./hello Enter.
What's my name?
David.
And hello David.
So not all that exciting, but just think now.
You now have that same ingredient as we did in Scratch to
make our own functions.
>> But there is a bit of a gotcha.
Suppose that I hadn't really thought this through and I actually without
really thinking about it wrote that function down here.
Feels perfectly reasonable.
In Scratch there is no notion of location in your scripts.
You could put one up here, one up here, one up here, and it might start
to look a little messy if you don't lay it out neatly, but it doesn't
matter where physically the scripts were on the screen.
Unfortunately in C-- and this is unlike languages like Java and Python
and others that you might be familiar with-- unfortunately in C, order does
matter because watch what's going to happen now.
>> The default function that's going to execute is, of course, main.
Main is going to call print name on line eight, but unfortunately, the
compiler won't even know that print name exists until it gets to line 11,
which unfortunately is going to be too late.
So let's do make hello.
And now damn, two errors generated.
So now let me scroll up to the very first, as we should always do, and
notice that it's yelling at me, "implicit declaration of function
print name."
>> So we've seen this message before, implicit declaration of function.
When have we seen that kind of error?
When I didn't include a library.
If I forgot cs50.h, I would get yelled at for get string or get int.
But in this case, this function print name isn't in a library, right?
It's literally in this file, so what's really the problem?
>> Well unfortunately in C, it takes you so incredibly literally that if you
want a function called print name to exist, you either have to implement
that function at the very top of your code so that it's accessible to lower
functions, but frankly, that becomes sloppy very quickly.
Personally, I like putting main first because then it's very clear what this
program does at first glance.
And plus, you can get into weird corner cases where if x wants to call
y but y might call x, you just physically can't actually put one
above the other.
>> But it turns out in C, we can solve this very simply.
I'm going to put a little bit of space up here, and I'm just going to
preemptively, albeit somewhat redundantly, teach the
compiler that there exists a function called print name, it takes a string,
and I'm going to call it name, semicolon.
>> So this now in line four, which we haven't seen before, is a declaration
of a function print name, but it's only a promise that this function will
eventually be defined, eventually be implemented.
This now I can leave alone because now this is the definition, the
implementation, sort of the last mile of the implementation of this
particular function.
So frankly it's stupid, it's annoying, but this is the way C is, and it's
because it takes you very literally and, as a computer frankly should,
only does exactly what you tell it to do, and so that ordering is important.
>> So keep that in mind and again, start to notice the recurrence of patterns.
Odds are you will, if you haven't already, start to encounter messages
like this that at first glance seem completely cryptic, but if you start
to look for these key words like "implicit declaration," mention of a
function in this case-- and frankly, you sometimes even get a little green
caret symbol that tells you where the issue probably is--
you can begin to work your way through yet unseen error messages.
Any questions on writing your own function in this way?
>> Let's do something that's a little more compelling.
Rather than just do something that has a side effect of printing, let me go
ahead and save a new file, and we'll call this positive.c, even though it's
going to be a little different versus last time.
And this time, I want to re-implement last time's positive.c example, which
is to force the user to give me a positive integer.
But I had to use get int last time.
Wouldn't it have been nice if there was a function called get positive int
that I could just outsource this piece of functionality to?
So the difference here is we'll implement get positive int, but unlike
print name which had a side effect-- it didn't return something to me like
a number or a string--
get positive int is, of course, going to return, hopefully, a positive int.
>> So let's do this.
Include cs50.h, Include standard io.h.
Int main void.
And now in here, I'm going to go ahead and let's say int, call it n, equals
get positive int.
And just like get int already exists because the staff wrote it, I'm going
to assume for the moment that get positive int exists, and now I'm going
to go ahead and say printf, "thanks for the %i\n", n.
>> So now if I compile this program, what is going to happen in my terminal
window at the bottom of the screen?
I'm going to probably get that same error as before.
So let's try this.
Make positive.
And again, implicit declaration of function, get positive int.
So we can solve this in a couple of ways.
I'm going to keep it simple and just put my declaration up here and get
positive int.
I need the so-called signature.
The signature just refers to the aesthetics of the
first line of the function.
So what should get positive int return?
>> So an int.
I mean ideally, it would return something like positive int, but that
doesn't exist.
We've not seen that among our data types, so we have to deal with the
fact that we have very few data types to work with.
But we can return an int and just trust that it will be positive.
It's going to be called get positive int.
>> And now how about its arguments?
Does it take any input?
Does it need any input?
So it doesn't need to know in advance anything.
Get string doesn't, get int doesn't.
Printf does-- it needs to have some input passed into it-- and print name
needed some input, but get positive int does not.
So I'm going to explicitly tell the compiler void.
Void is the absence of anything else.
So void means nothing is going inside of those parentheses, semicolon.
>> And now at the bottom of my file-- and again, I'm just being kind of picky
here putting main at the top, which is good practice because this way,
anytime you or someone else opens your file, the
functionality is right there.
You can dive in from square one.
So now I'm going to duplicate this, get positive int void, but I'm not
going to hit a semicolon now.
I'm going to open curly braces, and now I need to borrow
some ideas from Monday.
>> So as you recall, we did something like do the following while
something was true.
And what did I do?
I did something like give me a positive integer,
little bit of a prompt.
I could use any words I want.
And then I used what?
Int n equals get int, no arguments to it.
>> And notice the difference.
When you call a function, when you use a function, you don't put in void.
You only do that when declaring a function, teaching the compiler what
it should expect.
So you don't need to put void there yourself.
>> And now what was my condition?
Well, n is not equal to positive, but that's just pseudo-code.
So how do I express this more cleanly?
So less than or equal to zero.
So again, notice you can do less than or equal to.
Even though it's two separate symbols, you can do it on
your keyboard as such.
>> But there's still a bug that I screwed up last time too.
I have to declare--
exactly.
I have to declare n outside of the loop.
So I need to put n up here, and I don't want to re-declare it in here
lest I get a new variable.
I just want to assign a value in here.
>> And now I'm not quite done here.
Let me get ahead of myself and pretend I'm done.
Make positive, and now there's a new error.
Control reaches end of non-void function.
So new error message, but if you kind of tease apart each of the words, it
probably hints at what's wrong.
>> Control.
Control just refers to the order of operations in a program.
The computer's in control and something went wrong.
So it reaches the end of a non-void function.
What function is it apparently referring to?
What function is non-void?
So get positive int, and a little confusing in that well,
it's kind of void.
It has a specification of void for its arguments, but its output is going to
be of type int.
So the word on the left is the so-called return type.
The word on the inside here is the zero or more arguments
that a function takes.
>> So what do I need to do?
At this point in my code, line 21 where the blinking prompt now is, I
have a positive int inside of the variable called n.
How do I give it back to main?
Literally.
Return n semicolon.
>> So just as Colton returned a piece of paper with an answer to me by dropping
that piece of paper in the little black box on the table, to do that in
code, you literally just write, return n, and it's as though Colton were
handing me something physical back.
In this case, what's happening is get positive int is going to hand back
what's presumably a positive integer to whom?
Where does that value end up?
That ends up in this variable, n, and then we proceed with line nine.
>> So in other words, in order of operations, this program starts
executing, and the compiler realizes, oh, you want the CS50 library?
Let me go grab whatever's inside that.
Oh, you want the standard IO library?
Let me go grab whatever's inside that.
What does the compiler say to itself when it hits line four?
Oh, you promised to implement the function called get positive int, but
we'll get back to that later, something along those lines.
>> Int main void just means here's the guts of my program.
Line seven is just a curly brace.
Line eight is saying on the left, give me 32 bits for an integer, call it n.
On the right hand side, it's saying get positive int.
Now let's pause that story because now I don't keep moving my cursor down.
My cursor now goes down here because now get positive int executes.
Int n is declared.
Do the following.
Printf, give me a positive integer.
>> Get an int from the user, store it in n, and maybe do this again and again.
This loop means that this code might execute up and down like this again
and again, but when the user finally cooperates and gives me a positive
int, I hit line 21, at which point the number is handed back, and which one
should I highlight now?
Nine.
Control, so to speak, returns to line nine.
That's the line that's now in charge.
>> So that's what's been happening all this time underneath the hood, but
when we've used functions like printf or even get string that someone else
wrote for you, control was being handed off to someone else's code line
by line by line.
It's just we couldn't see it and we couldn't really depict it in this
program because it's in some other file on the hard drive
unbeknownst to us.
So let's actually compile and run this now.
>> Make positive.
Compile, that's progress.
./positive.
Give me a positive integer.
Let's be difficult.
Negative 1.
Zero.
Let's give it 50.
Thanks for the 50, and so control has now returned.
Any questions, then, on that?
Yeah?
>> [INAUDIBLE].
>> Say again.
Oh, good question.
So you might notice a parallel here that I'm kind of cutting a corner on.
In line 12, I'm saying, get positive int returns an int, but by that same
logic, it now stands to reason that in line six, I'm saying that main returns
an int, but what have we never had in any of our programs?
We've never had mention of this key word return.
>> So it turns out that in C, at least the version of it that we're using
made in 1999, technically, this is happening for you automatically.
Anytime you implement a program and you implement a function called main,
that function will return zero by default if you don't say otherwise,
and zero is just a convention.
The world returns zero thereby indicating that all is well,
effectively leaving us with four billion possible things that could go
wrong so that if we return one, that might signify a code that means this
thing went wrong.
We could return two, which means this other thing went wrong.
We could return four billion, which means this other thing went wrong.
>> And if you now think about your own PC or Mac, you might recall that
sometimes you get cryptic error messages from software that you're
using, and sometimes it has a human friendly description, but there's
often a code or a number on the screen?
If this doesn't come to mind, just keep an eye out for it.
That's typically what these codes are referring to.
They're included in Microsoft Word and other programs so that if you file a
bug report with the company, you can tell them, oh, I got error number 45.
And some programmer back at the company can look that up in his or her
code and say, oh, that's because I made this bug and that's why the user
got this message.
>> But frankly, it's just a little distracting and a little tedious to
include that, at least on our first few programs, so we've
been omitting it.
But all this time every one of your main functions has secretly had this
line automatically added for you by the compiler, just by convention to
save you some time.
>> [INAUDIBLE].
>> You do not need to include it in main.
That's fine.
You do need to include it if you were implementing a function like this.
Otherwise the function flat out would not work.
But in main, it's not necessary.
In a week or two, we'll start getting into that habit once we want to start
signifying errors.
Really good question.
>> So quick verbal break to mention that this Friday, we won't be having lunch
per se, but we'll be having dinner with some of the students and staff.
If you'd like to join us, feel free to go to cs50.net/rsvp.
6:00 PM this Friday.
Space is, as always, limited, but we'll continue doing these on a nearly
weekly basis if space runs out this week.
>> So the cliffhanger that we left off on Monday was that strings can actually
be indexed into, which just means you can get at the first character, the
second character, the third character and so forth, because you can
effectively think of a string, like hello, as being in this case five
letters inside of boxes.
And you can get at each of those boxes with what syntax did we
introduce on Monday?
Those square brackets on your keyboard.
That just meant go to location zero.
>> We start counting at zero, so bracket zero signifies h, bracket one
signifies e, and so forth.
And so all the time when we've been using strings and typing in "hello"
and "world" and other things on the screen, it's been stored
in boxes like this.
And take a guess.
What does each box represent physically inside of your computer?
>> [INAUDIBLE].
>> Sorry?
>> Characters.
>> So a character, certainly in the case of strings, and a character is just
eight bits or one byte.
So you probably are at least vaguely familiar with the fact that your
computer has memory.
It has two types of memory at least.
One is the hard disk where you save stuff permanently, and that's
typically big so you can have movies and music and so forth.
>> Then you have another type of memory called RAM, R-A-M, Random Access
Memory, and this is the type of memory that is used when your computer is
running, but if you lose power or your battery dies, anything that's stored
in RAM disappears because it's not
persistent.
You typically have, these days, a gig of it, two gigs, maybe more.
And the upside of RAM is that it's much, much, much faster than hard disks
or even solid state drives these days, but it's typically more expensive so
you have less of it.
>> So today's conversation really refers to RAM, that type of memory that
exists only while there's power being fed into your computer.
So when you type in H-E-L-L-O, Enter on the keyboard, the H is going in one
byte of RAM, the E is going in another byte of RAM, as is
the rest of the word.
So recall what we were able to do last time was this.
Let me go ahead and open up the file that we called string.c, and recall
that it looked a little something like this.
Let me actually roll back and change it to exactly what it looked like,
string length of s.
>> So look at the program here.
We include the CS50 library so that we can use get string.
We include standard io.h so we can use printf.
Why did we include string.h?
This was new on Monday.
So we wanted string length.
Strlen.
People decided years ago, let's just be succinct.
Instead of calling it "string length," let's call it "strlen" and let the
world figure that out, and so that's what we get access to with string.h.
>> This is familiar.
This is familiar.
This is familiar.
This is a little new.
In line 22-- and we'll come back to this, but for now know--
and you would only know this from having read the documentation or if
you knew C already--
get string sometimes can screw up.
If the user is really adversarial or uncooperative and he or she just
doesn't type anything at the keyboard or types so much at the keyboard that
it overwhelms the computer's memory, in theory, get string could return
something other than a string of characters.
It could return a special value called NULL in all caps, N-U-L-L, and this is
just a so-called sentinel value.
It's a special value that signifies something bad happened in this case.
It is the absence of a string.
>> So I'm checking for NULL simply so that, long story short, strlen and
other functions that come with C, if they expect a string but you pass them
the absence of a string, if you pass them NULL, the computer or the program
will just crash outright.
It will hang.
It will throw up some error message.
Bad things will happen.
So even though this is still not well-defined--
this will make more sense in a week or two-- in line 22, this is just an
example of self defensive error checking just in case one time out of
a million something goes wrong, at least my program won't crash.
>> So if s does not equal something bad, I have this for loop, and this was
where we had that other new piece of syntax.
I have a for loop iterating from zero on up to the length of s.
And then here, I was printing out s bracket i, but why did I use %c all of
a sudden instead of %s even though s is a string?
It's a character, right?
S is a string, but s bracket something, s bracket i where i is zero
or one or two, that's an individual character in the string, and so for
that, printf needs to be informed that it's indeed a character to expect.
>> And then recall, what did this program actually do?
>> Printed it out in columns.
>> Yeah, exactly.
It just printed the word that I type in a column, one character per line.
So let's see this again.
So make string.
Compiled OK. ./string.
Let me type in H-E-L-L-O, Enter, and indeed I get it, one per line.
>> So let me do one optimization here.
If you think about it, especially if you've programmed before, there's
arguably an inefficiency in line 24.
In other words, it's not necessarily the best design.
Straightforward, at least once you remember what strlen is, but it's
doing something dumb potentially.
What might that be?
>> [INAUDIBLE].
>> Exactly.
It's checking for the length of s every single time even though
H-E-L-L-O is always going to be five characters.
Every time through this loop, the five is not changing.
I might be incrementing i, but what is the length of s on every
iteration of this loop?
It's five, it's five, it's five, and yet I am nonetheless asking this
question again and again and again.
Now frankly, the computer is so damn fast, no one's going to notice a
difference in this case, but these kinds of poor design decisions can
start to add up if the compiler itself doesn't try to fix this for you which
it typically wouldn't, at least in the appliance.
>> So I'm going to do this.
I'm going to add a comma after my first variable, i.
I'm going to give myself another variable, calling it n, just by
convention for numbers, and then I'm going to assign n the value of string
length of s.
And then I'm going to change my condition to be what?
I'm going to change my condition to while i is less than n.
>> So now, how many times am I checking the length of s?
Once, but it's OK to check i against n again and again because now that's
just comparing two values, not calling a function.
Now for now, just know that anytime you call a function, there's a bit of
overhead, not enough to discourage you really from ever using functions, but
certainly when there's a line of code like that-- and the lines will get
more interesting before long-- where there's an opportunity to think, if I
type this code, how many times will it execute?
You'll start to see over time the performance of your programs can
indeed change.
>> In fact, one of the problem sets we've done in years past involves
implementing, as you may recall from week zero, a spell checker, but a
spell checker that's designed to support a dictionary of 150,000 plus
words that we give you guys.
You would have to write code that loads those words into RAM, so into
boxes like we saw on the screen a moment ago, and then as fast as you
can, you need to be able to answer a question of the form, is this word
misspelled?
Is this word misspelled?
Is this word misspelled?
>> And in something like that what we've done in years past is turned it into,
albeit on an opt-in optional basis, a competition of sorts, whereby the
students who use the least RAM and least time, the fewest CPU cycles, end up
bubbling up to the top of a little leader board or ranking that we put on
the course's homepage as we've done in years past.
So again, totally optional, but this speaks to the design opportunities
that are ahead once we start building atop some of these
basic building blocks.
>> So let me go back to this diagram for just a moment and reveal a little
something more.
This indeed is a string, and we've taken advantage of a few libraries,
standard io.h which has--
>> Printf.
>> Printf, among other things.
cs50.h, which has get int and get string and so forth, string.h, which
had strlen.
But it turns out there's yet another.
Frankly, there's lots and lots of header files that declare functions
for libraries, but this ctype.h is actually going to be somewhat
advantageous because I'm going to go ahead and implement one
other program here.
>> Let me go ahead and open up something I wrote in advance called
capitalize.c, and let's take a look at how this works.
Notice that I'm using, in this version of it, three familiar files.
Notice that in line 18, I'm getting a line of text.
Notice in line 21, I'm claiming that the following code is going to
capitalize s, whatever the user typed in, and how am I doing that?
Well, I'm taking--
lesson learned from last time--
I'm declaring i and n and iterating over the characters in the string.
And then what is this block of code in line 24 through 27
doing in layman's terms?
>> Lowercase letter back.
>> Exactly.
If s bracket i-- so if the i-th character of s, which is a specific
char in the string, is greater than or equal to lowercase a and--
recall that double ampersands signify and--
and the same character, s bracket i, is less than or equal to lowercase z,
that means it's an a or a b or a c or dot, dot, dot, or a z, which means
it's lowercase.
What do I want to do in that case?
Well, I can do this somewhat cryptically, but
let's tease this apart.
>> I'm going to call printf with %c because I want to reprint this
character on the screen.
I'm then going to take s bracket i, the i-th character in s, and then why
am I doing this little trick here, lowercase a minus capital A?
What is that going to give me, generally speaking?
>> [INAUDIBLE].
>> Exactly.
I don't really remember--
it was 65 for capital A. I don't really remember what lowercase a is,
but no matter.
The computer knows.
So by saying, lowercase a minus capital A, it's weird to be
subtracting one char from another, but what are chars underneath the hood?
They're just numbers.
So whatever those numbers are, let the computer remember it
rather than me the human.
>> So lowercase a minus capital A is going to give me a difference.
It happens to be 32, and that would be the case for lowercase b and capital B
and so forth.
It stays consistent, thankfully.
So I'm essentially saying, take the lowercase letter, subtract off that
standard difference, and that effectively changes s bracket i from
lowercase to, of course, uppercase, without my really having to think
about or remember, what were those numbers we talked about when the eight
volunteers came up on stage?
Now meanwhile, in the else, if it's not a lowercase letter as determined
by line 24, just print it out.
I only want to touch the characters that were
actually originally lowercase.
>> So let's see this.
Make capitalize.
Compiled, OK.
./capitalize.
And let me type in H-E-L-L-O in lowercase, Enter.
And notice that it is converted into uppercase.
Let me do this again with a different word.
How about D-A-V-I-D with the first D capitalized as a name typically is?
Enter.
Notice it's still correct.
It just outputted that first D unchanged via that else construct.
>> So keep in mind, then, a couple of things here.
One, if you ever want to check two conditions at once, you can and them
together as we predicted.
You can compare characters in this way and effectively treat characters as
numbers, but frankly, this is so damn cryptic I'm never going to remember
how to come up with this from scratch without reasoning through it for quite
a bit of time.
>> Wouldn't it have been nice if someone out there wrote a function called is
lower that could answer for me true or false, this character is lowercase?
Well thankfully, whoever wrote ctype.h did exactly that.
Let me go up here and add ctype.h for C types, and now let me go down here and
rewrite this line as follows.
>> So if it's called is lower, I claim, s bracket i, then I'm going to delete
these two lines altogether.
So now someone else, I'm hoping, wrote a function called is lower, and it
turns out they did and they declared it inside of ctype.h.
And now I'm going to leave line 27 alone, I'm going to leave line 31
alone, but notice how much I've tightened up my code.
It's now cleaner.
It's less difficult to look through because now the function, moreover, is
so wonderfully named it just does what it says.
>> So now I'm going to save this.
I'm going to zoom out.
And just as in Scratch you could have Booleans, Boolean values true or
false, that's exactly what is lower effectively returns.
Let me recompile.
Let me re-run.
And now let's try it again, H-E-L-L-O, Enter.
That's pretty good.
And try it again, make sure I didn't screw something up.
That is capitalized as well.
>> But this isn't good enough because the other thing that I'm never going to
remember unless I work through it really carefully on, say, paper is
this damn line.
Wouldn't it be nice if there were a function called to upper?
Well it turns out there is in ctype.h as well.
I'm going to go ahead and type--
let me bring that line back.
Instead of this here, let me go ahead and say, substitute for the %c the
result of calling this function to upper on the i-th character of s.
And now notice it's getting a little balanced.
I have to keep track of how many parentheses I've opened and closed.
>> So now it's even cleaner.
Now this program is getting better and better designed arguably because it's
much, much more readable but it's no less correct.
Make capitalize.
./capitalize.
H-E-L-L-O. Let's run it again, D-A-V-I-D. OK, so we're still in
pretty good shape.
>> But now to upper.
I propose that there's one more refinement we could make that would be
really nice, that could really tighten up this code and really give us five
out of five for design, for instance.
What would be nice to get rid of?
Well, look how damn long this block of code is just to do something simple.
>> Now as an aside, as you might have seen in super section this past
weekend, you don't strictly need the curly braces when you just have one
line of code, even though we proposed keeping them so that it makes much
more clear, like in Scratch's U-shaped blocks, what's inside of the branch.
But wouldn't it be nice if to upper, when given its input, turned it into
uppercase if it's not, and what would be wonderful in the opposite case if
it's already uppercase?
Just pass it through and leave it alone.
>> So maybe it does that.
I could try and just hope that it does, but let me
introduce one other thing.
Instead of using this built-in terminal window down here, recall that
this square black icon gives you a bigger terminal window that I can full
screen if I want?
So it turns out they're sort of oddly named, but there's these things called
man pages, manual pages, man for short, and I can access these by
typing man--
what do I want to type?
Man to upper.
>> And now notice if there exists a function inside of the computer, in
this case the appliance, which is just the operating system Linux, it's going
to give me a somewhat cryptic set of output, but you'll find over time that
it always is formatted pretty much the same so you start to get used to it.
Notice at the top to upper, and apparently it's the same documentation
for to lower.
Whoever wrote it was cutting some corners and put it all on one page.
These things' purpose in life is to convert a
letter to upper or lowercase.
>> Notice that under Synopsis, the man page is teaching me what file I have
to include to use this thing.
It's giving me the signatures for these functions, both of them, even
though we right now only care about one.
Here is now a description.
To upper converts the letter c to uppercase if possible.
>> Still not that instructive, but let me now look under return value, the thing
that's handed back.
So the value returned is that of the converted letter or c if the
conversion was not possible.
What is c?
>> The original character.
>> The original character and we know that by, again, going up to the
synopsis, and whoever wrote this function just decided that the input
to to upper and to lower is just arbitrarily going to be called c.
They could have called it most anything they want, but they kept it
simple as c.
So I've consulted the man page.
This sentence reassures me that if it's not a lowercase letter, it's
going to just give me back c, which is perfect, which means I can get rid of
my else condition.
>> So let me go back to GEdit, and now let me just do this.
I'm going to copy my printf statement.
I'm going to go ahead and right inside the for loop print that out, and get
rid of now this whole if construct.
Wasn't a bad idea, and it was very much correct and consistent with
everything we've preached, but just not necessary.
As soon as you realize some library function exists that someone else
wrote, or maybe you wrote elsewhere in the file, you can use it and really
start to tighten up the code.
>> And when I say things like good style, the fact that this person called the
function to upper, or previously is lower is wonderfully useful because
they're very descriptive.
You wouldn't want to call your functions x and y and z, which have
much, much less meaning.
Any questions on that series of improvements?
>> So suffice it to say one of the takeaways is even as your own problem
set-- maybe problem set one, but certainly P set two and onward, even
when they're correct doesn't necessarily mean they are perfect just
yet or particularly well-designed.
That's the other axis to start thinking about.
So this was a string inside of your computer's memory, but if you have a
whole bunch of characters like H-E-L-L-O inside of RAM, and suppose
that you in your program call get string multiple times such that you
call get string once, then you call get string again.
Well, what's going to happen over time?
>> In other words, if you have a line of code, albeit out of context, like
string s gets--
let's do this.
String name equals get string.
So suppose that line of code is meant to ask the user for his or her name.
This next line of code is meant to ask the user for his or her school, and
this next line, and so forth.
Suppose that we keep asking the user for another and
another and another string.
They're going to stay in memory at the same time.
One is not going to clobber the other.
School is not going to overwrite the other.
But where do they all end up in memory?
>> Well, if we start to draw on the screen, which we can use this thing
here like a chalkboard, if this black rectangle represents my computer's
memory, I'm going to arbitrarily start dividing it up into little squares,
each of which represents one byte of memory.
Frankly, if you have a gigabyte of RAM these days, you have a billion bytes
of memory in your computer, so a billion of these squares.
So suffice it to say, this isn't really to scale.
>> But we could keep drawing all of these clearly not to scale squares, and this
collectively represents my computer's memory.
Now we'll just do dot, dot, dot.
So in other words, when I now prompt the user with get string to give me a
string, what happens?
If the user types in "hello," that ends up in H-E-L-L-O. But suppose the
user then types in--
actually, I shouldn't have done hello because we're asking
them for their names.
So let's go back if I can do this.
>> So if I type in D-A-V-I-D for my name, but recall that the second line of
code was get string again to get their school.
Where is that word that the user types in going to go next?
Well, maybe it's going to go into H-A-R-V-A-R-D. So even though I've
drawn it as two rows, this is just a whole bunch of bytes in your
computer's RAM.
There's a problem now because now if I'm using RAM in this very reasonable
but sort of naive way, what can you not apparently distinguish?
Where one begins and where one ends, right?
They're kind of blurring together.
>> So it turns out the computer doesn't do this.
Let me actually scroll back in time a few characters, and instead of Harvard
going immediately after the user's name, the user actually gets, behind
the scenes, a special character inserted by the
computer for him or her.
\0, otherwise known as the nul character, annoyingly called N-U-L, not
N-U-L-L, but you write it as \0.
It's just all zero bits, a marker in between the first word that the user's
typed and the second.
>> So Harvard actually now ends up as this sequence of characters
and one more \0.
So in other words, by having these sentinel values, eight contiguous zero
bits, you can now begin to distinguish one string from another.
So all this time what was "hello" is actually "hello" with a \0, and
meanwhile, there might very well be quite a bit more RAM
inside of the computer.
>> Let me do one other thing now.
It turns out that all of these squares we've been drawing, they are, yes,
strings, but more generally, these things are arrays.
An array is just a chunk of memory that's back to back to back to back,
and you typically use an array by way of this square bracket notation.
So we're going to see these quite a bit over time, but let me go ahead and
open up, let's call it ages.
And notice what we can do with these same tricks, a little
bit more syntax here.
>> So in line 17 of this program-- actually, let me run the program first
so we can see what this thing does.
Let me call make ages to compile this program.
./ages.
How many people are in the room?
Call it three.
Age of the first person?
18, 19, and 20.
And now somewhat ridiculously, I just have made a program that ages those
three people.
>> So there's clearly an opportunity for some fun arithmetic here.
Thankfully, the math is correct.
18 went to 19, 19 went to 20 and so forth.
But what's really meant to be illustrative here is how we're storing
those three people's ages.
Let me zoom in at what's going on here.
>> So first, these first few lines should be getting pretty familiar.
I'm just prompting the user for the number of people in the room.
Then I'm using get int and do while to do this again and again and again.
We've seen that pattern before, but line 27 is new and actually quite
useful, and will become increasingly useful.
Notice that what's different in line 27 is that I appear to be declaring an
int called ages, but wait.
It's not just int ages.
There's these square brackets, inside of which is n.
>> So the bracket n in this context, not inside of a printf statement here but
in this sole line 27, this line is saying, give me n ints, each of which
is of type int.
So this is a bucket, so to speak, of, in this case, three integers back to
back to back so that I effectively have three variables.
The alternative, to be clear, would be this.
>> If I wanted the first student's age, I might do this.
If I wanted the second student's age I might do this.
If I wanted the third student's age, I might do this.
And god forbid we need everyone's age in this room--
I mean, this is a heck of a lot of copy, paste again and again and again.
And plus once I compile this program, if another student walks in through
that door, now my number of variables is incorrect.
>> So what's nice about an array is as soon as you start feeling yourself
copying and pasting, odds are that's not the best approach.
An array is dynamic potentially.
I don't know in advance how many people are going to be in the room,
but I do know I need n of them, and I'll figure out n when the time comes.
This line of code now means, give me a chunk of memory that looks like this
where the number of boxes on the screen is entirely dependent on n that
the user typed in.
>> So now the rest of this program is actually pretty similar to what we
just did with characters.
Notice I have a for loop starting in line 30.
So right after I get the array, I iterate from i equals zero on up to n.
I just have this instructive printf message just saying, give me the age
of person #%i, so number one, number two, number three.
And why did I do this?
Frankly, humans prefer to count from one on up whereas computer scientists,
zero on up.
Computer scientists are not going to use this kind of program, so we're
going to just start counting at one like normal people.
>> And now in line 33, notice the slightly different piece of syntax.
The i-th age in that variable of type array is going to get an int.
And now lastly, this is just arithmetic down here.
I decided in a separate loop to claim some time passes, and now in this
separate loop, these lines execute.
A year from now, person %i will be %i years old, but notice this isn't the
variable i.
This is now %i, the placeholder for an int.
And notice as the first placeholder, I plug in i plus 1, so we count like a
normal person.
And then for the value of their age, for %i years old, I take ages bracket
i-- and why am I doing plus one here?
They just aged.
It's my stupid choice of programs.
They just aged one year.
I could type in any number that I actually want there.
>> So what's actually all of the relevance here?
Well, let me actually scroll back over here and paint a picture
of what lies ahead.
What we'll be doing with our next Problem Set Two is dabbling in the
world of cryptography.
So this is a string of characters, so a sequence of multiple chars, and what
does this say?
It's not in the online version of the slides.
>> So I claim that this equals this, a stupid advertisement from many years
ago that might actually recall one of its origins.
So this is an example of encryption or cryptography.
It turns out that if you want to actually send information or share
information with someone securely, like a message like this, you can
scramble the letters.
But typically, the words are not scrambled randomly.
They're permuted in some way or changed in some way so that-- oops.
That's a fun spoiler for next time.
>> So you can map what is apparently O to B. Notice that lines up
capitalization-wise.
Apparently r becomes e.
Apparently F-H-E-R becomes S-U-R-E. So it turns out there's a mapping, and in
this case there's a pretty stupid mapping if anyone has figured it out?
This is something called Rot 13, Rotate 13.
It is the stupidest of encryption mechanisms because it literally just
adds 13 to every one of the letters, stupid in the sense that if you just
have a bit of free time on your hands and a pencil, or you just think it
through in your head, you could try all possible additions-- one, two,
three, dot, dot, dot, 25 to just rotate the whole alphabet, and
eventually, you'll figure out what this message is.
So if you did something like this in grade school passing messages to your
best friend, if your grade school teacher simply read through the
message and brute forced the solution, you might have gotten
an answer by that.
>> Now of course, in the real world, cryptography is more sophisticated.
This is a snippet of text from a computer system that has usernames and
passwords, as almost all of ours do, and this is what your password might
look like if stored on your hard drive but in encrypted form.
This is not just a rotation of letters, A is B and B is C. This is
much more sophisticated, but it uses what's generally known as secret key
cryptography.
This picture tells the following story with a few icons.
>> On the left, we have what we'll call plain text.
In the world of cryptography, plain text is just the original message
written in English or French or any language whatsoever.
If you want to encrypt it, we'll pass it pictorially through a padlock, so
some kind of algorithm, some function or program that someone wrote
that scrambles the letters hopefully more complicatedly than just adding 13
to each of them.
>> What you get out of that process in the middle there is called ciphertext.
So kind of a sexy word.
It just means it's the encrypted version of the plain text.
And only if you have that same secret, 13 or minus 13, are you able to
decrypt a message like that.
>> So in Problem Set Two, among the things you'll do if in the Hacker
Edition, you will have to write code to crack these passwords, figuring out
what they were and how they were encrypted, though we do give you a bit
of guidance along the way.
In the Standard Edition, we introduce a couple of ciphers, encryption
mechanisms, one called Caesar, one called Vigenère, that are still
rotational ciphers where A becomes something, B becomes something, but
you have to do it programmatically because there will indeed be a secret
key involved which is typically a number or a keyword that only the
sender and the recipient of these messages should understand.
>> Now, this actually has incarnations in the real world.
This, for instance, is little orphan Annie's secret decoder ring, and you
can actually implement these rotational ciphers--
A becomes something, B becomes something-- with a couple of wheels,
one on the outside, one on the inside such that if you rotate the wheel or
the ring, you can actually line up the letters with different letters,
getting a secret code.
And so as the cliffhanger for today, what I thought I'd do is a bit of
throwback that if you turn on the TV on December 24, you can watch the
movie ad nauseam for 24 hours in a row.
But for today, I'll open it up here and give us just two minutes of a
pedagogically relevant Christmas Story with a little fellow named Ralphie.
>> [VIDEO PLAYBACK]
>> -Be it known to all and sundry that Ralph Parker is hereby appointed a
member of the Little Orphan Annie secret circle and is entitled to all
the honors and benefits occurring thereto.
>> -Signed, Little Orphan Annie.
Countersigned, Pierre Andre in ink.
Honors and benefits already at the age of nine.
>> [SHOUTING ON RADIO]
>> Come on, let's get on with it.
I don't need all that jazz about smugglers and pirates.
>> -Listen tomorrow night for the concluding adventure of the black
pirate ship.
Now, it's time for Annie's Secret Message for you members
of the secret circle.
Remember, kids.
Only members of Annie's Secret Circle can decode Annie's secret message.
Remember, Annie is depending on you.
Set your pins to B2.
Here is the message.
12, 11, 2--
>> -I am in my first secret meeting.
>> -25, 14, 11, 18, 16--
>> -Pierre was in great voice tonight.
I could tell that tonight's message was really important.
>> -3, 25.
That's a message from Annie herself.
Remember, don't tell anyone.
>> -90 seconds later, I'm in the only room in the house where a boy of nine
could sit in privacy and decode.
Aha, B. I went to the next.
E. The first word is "be." S. It was coming easier now.
U. 25.
That's R.
>> -Come on, Ralphie.
I gotta go.
>> -I'll be right down, Ma.
Gee ***.
>> -T. O. Be sure to.
Be sure to what?
What was Little Orphan Annie trying to say?
Be sure to what?
>> -Ralphie, Randy has got to go.
Will you please come out?
>> -All right, Ma.
I'll be right out.
>> -I was getting closer now.
The tension was terrible.
What was it?
The fate of the planet may hang in the balance.
>> -Ralphie, Randy's gotta go.
>> -I'll be right out for crying out loud.
>> -Almost there.
My fingers flew.
My mind was a steel trap.
Every pore vibrated.
It was almost clear.
Yes, yes, yes, yes, yes.
>> -Be sure to drink your Ovaltine.
Ovaltine?
A crummy commercial?
Son of a ***.
>> [END VIDEO PLAYBACK]
>> SPEAKER 1: This is CS50, and that will be Problem Set Two.
See you next week.
>> SPEAKER 2: At the next CS50, this happens.
>> SPEAKER 1: So one topic we haven't looked at thus far is
that of function pointers.
Now, a function pointer is just the address of a public
function, but much like--
son of a--