Tip:
Highlight text to annotate it
X
In this video, we're gonna continue our development of Cool type checking with a
discussion of type environments. Let's begin by doing some more type rules. So
here's one for the constant false. So it's provable that the constant false has the
type [inaudible] and that's not surprising. If I have a string literal S,
then it's provable that, that has type string. And that's also not very
surprising. The expression new T produces an object of type T. And the type rule for
that is very straightforward. New T has type T. And we're just going to ignore
self type for now. As I mentioned in an earlier video, we'll deal with self type
later in a video all on its own. Here are a couple of more rules. If it's provable
that an expression e has type bool then an e Boolean complement of e not e also has
type bool. And finally perhaps our most complex rule so far, the rule for a while
loop, and we call that the e-1 here is the predicate of the loop, this is what
determines if we keep executing the loop or not, and e2 is the body of the loop.
And so type one is required to have type bool. It needs to be provable that e one
had type bool, and we allow e two, the body of the loop, to have an arbitrary
type. It can have any type t. It has to have some type, so it has to be type able
under some. Rules, but we don't care what the type is because the type of the entire
expression is just object. We don't actually return the, this expression
doesn't return an interesting value, doesn't produce an interesting value, and
to discourage people from trying to rely on it, we just type the whole thing as
object. And this is a little bit of a design decision. Now we could have
designed a language, for example, where the type of a while loop is, was type t.
And that you would get the last value of the loop that was that was executed but
the problems is that if E one the protocol loop is false and reaches the loop the
first time Then you never evaluate e two, and no value is produced, and in that case
you would get a, a void value. Which if so mebody tried to dereference, it would
result in a run time error. That's so to discourage programmers from lying On the
loop, producing a meaningful value. We could just type it, as object. So far,
it's been pretty straight forward to define reasonable type rules for every
construct that we've looked at. But now we actually come to a problem. Let's say we
have an expression which consists just of a single variable name, and that's a
perfectly valid, cool expression, and the question is, what is the type of that
variable, call it X And as you can see. When we're just looking at X by itself we
don't have enough information to give X a type. This local structural rule does not
carry any information about the type of X And stepping back one level, inference
rules have the property that all the information needs to be local. Everything
we need to know. To carry out the function of the rule has to be present in the rule
itself. There are no external data structures. There's nothing we're passing
around here that's on the side. Everything has to be encoded in this rule and, so far
at least, we just don't know Enough to say what the type of a variable should be. So
the solution to this problem is just to put more information in the rules, and
that's what we're going to do, so a type environment gives types for free
variables. So what is a free variable, a variable is free in an expression if it is
not defined within that expression. So for example, in the expression X, X is free.
In the expression x plus y (x+y), well, here this expression uses both x and y,
and there's no definition of either x or y in that expression, so x and y are free
And that expression. If I have, let Y... So I'm declaring a variable Y in X + Y.
Well what's free in this expression, well this expression uses X and Y but the use
of Y is governed by a definition of Y that occurs within the expression itself. So we
say here that Y is bound, Y is a bound variable in this expression but X is still
free, so only X is free in that expression. And the ide a here is, that if
I have an expression with three variables, and you want me to type check it, you have
to tell me what the types of those variables are. So I can type check X if
you tell me what the type of X is. I can type check X plus Y if you tell me the
types of X and Y. And I can type check this expression, this line expression, if
you tell me the type of its one free variable X, the type of Y. We will be
given a declaration by the let, but we still have to tell me what the type X is.
So the free variables are just those variables where you have to give me the
information and then I can carry out the type checking. The type environment
encodes this information so a type environment is a function from object
identifiers from variable names to types So let O be a type environment. One of
these functions from object identifier names types. And now we're going to extend
the kinds of logical statements that we prove to look like this. And the way that
this going to be read is that under the assumptions that variables have the types
given by O. So the assumptions go over here on the left side of the turnstile.
These are the assumptions that we're making about the free variables in E. So
the assumption that, excuse me, three variables. Have the types given by o is
provable, that's this turn style here, that the expression e has type t. And, so
this notation very nicely, separates what we're assuming. This is input to our,
process of figuring out what the type is from what we're proving. So if you tell me
the types of the free variables as given by o, then I can tell you, the type e. The
type environment has to be added to all the rules that we've gone through so far.
So for example, for intergal literals if I have some set of assumptions of all the
types of variables, that doesn't really change, it doesn't, in fact it doesn't
change, what the type is an intergal literal. Any intergal literal will still
have type int. And so in this case, for this particular kind of expression, I we
don't use any of our assumptio ns about the types of variables. Now, it's a little
bit different with the case of sum expressions. So if I have the expression E
one plus E two, and I have some assumptions, zero, about the types of
variables, well, then I want to prove that E one has type int, and I'm gonna do that
using the types of the variables given by zero, so E one might contain free
variables and I'll need to look in zero to figure out what the types of those
variables are. And similarly for E two, I will type E two under the same set of
assumptions. And if E1 has type int under the assumptions O and E2 has type int
under the assumptions O. Well, then I can conclude that E1 plus E2 has type int
under the same set of assumptions O. And we can also write new rules, so now our
big problem with free variables becomes a very easy problem. If I want to know what
the type of X is, and there's a missing O here, if I want to know what the type of X
is, I simply look it up in my object environment. So under the assumption that
the variables have the types given by O, what is the type of x? Well, I look up in
O what the type of X is assumed to be, and I then can prove that X has that type T.
So now let's take a look at a rule that actually does something interesting with
the variables from the point of view of the environments. So here is a [inaudible]
expression. And let's remind ourselves what this does. This is a [inaudible]
expression that has no initialization. So it says that X is going to be a new
variable. It's going to have type T0, and that variable is going to be visible in
the sub expression E1. And so, now, how am I going to type check that? Well, I'm
going to type check E1 in some kind of environment. And this is a new notation
here, so let me define what it means. So remember O is a function, it maps a,
variable names to types and OT/X this notation here is also a function. And what
this is, is the function O modified at the single point X to return T. So in
particular a, this function, this whole thing here is one function, this wh ole
thing I'm underlining here is a function, that applied to X is Returns t So that
says that this sort of assumptions says that x has type t and for any other
variable. So I apply it to some other variable y, where x is different from y.
Well then, I just get whatever type y has in [inaudible]. Okay? So what this rule
then says is that I'm going to type check E1 in the same environment O, except that
at point X, it's going to have the type T0. So we're gonna change just the type of
X that have type T0, because that's the type of the new identifier that's bound at
E1. And all the other types will be the same. And using those assumptions, I'll
try to prove that E1 has some type. I will get a type for E1. And then that will be
the type of the entire let expression. Now notice something about the type
environment. What this says is that before we type check E1, we need to modify Our
set of assumption. Modify our type environment to include a new assumption
about x, then we type check e one, and then of course when we leave, type
checking e one, we're going to remove that assumption about x, that new assumption,
because outside of the let, we just have the original set of assumptions, though.
And so, I hope that, that, terminology and that description reminds you of something
that we talked about earlier, because this type environment is really implemented by
the simple table. So, in our rules, The type environment that carries around the
information that will be stored or is typically stored in the symbol table of a
compiler. To summarize this video, The type environment gives types to the free
identifiers in the current scope. And this is very important, because it doesn't even
really makes sense to talk about type checking an expression, unless we have
some information for the types of the free identifiers. And the type environment is
just a way of formalizing that, of giving a name to some set of assumptions about
what the types of those free identifiers are. Notice that the type environment is
passed down the abstract syntax tree from the root towards the leaves. That is, as
we pass through definitions, the type environment is extended with new
definitions For example, [inaudible] expressions. And so the type environment
will grow as you pass from the root of the abstract syntax tree down towards the
leaves of the abstract syntax tree. And then the types are computed up the
abstract syntax tree from the leaves towards the roots. So we begin at the
leaves, get all the types of the leaf expressions, most of which are very easy.
Things like integers and string constants have the obvious types. And we just look
up the types of variables, in the type environment. And then we compute the types
for the more complicated expressions in a bottom up fashion.