11. The Subtype Relation

[music]. In this segment we're going to give a very precise definition to subtyping, and do it in a way that we can add a lot more flexibility to our type system without having to change any of the typing rules we already had in our programming language or have learned earlier in the course. So, this is important because a programming language already has a lot of typing rules and they're complicated and sophisticated. And if we had to make complicated changes to all of them just to support passing a record with three fields where we only needed two of those fields, that would be annoying. We have great typing rules. For example. When you pass an argument to a function, the type of that argument should equal the type of the function parameter. It's a nice rule. It's easy to explain. We can implement it in our language. But that rule in and of itself, is what in the previous segment did not let us pass something that we wanted to pass and that would do no harm. So the way we're going to fix this now, is by adding just two things to our language. The first thing we're going to add, is a notion of subtyping separate from everything else. So on the slides, I'm going to write t1 less than : t2 to mean t1 is a subtype of t2. This does not have to be syntax in your language, this is just binary relation on two types. Given two types, t1 and t2, maybe one is a subtype of the other, and maybe they're not. And what we're going to have to do is come up with a definition of that relation. Just like the less than relation on integers, is a binary relation. It takes two integers, and either the left one is less than the right one or it's not. We're going to define a relation on two types. Now once we have that relation, which we call the subtyping relation. Now all we do to our programming language is we add exactly one rule. And it's the rule you see in blue here. It says if some expression has type t1 and if t1 is a subtype of t2, then e also has type t2, and that's the only rule we have to add and everything will just work out. For example, if x:real y:real color:string is a subtype of x:real y:real. Then anything of the type of three fields also has the type with two fields and since it also has that type, we'll be able to pass it to a function that expects the super type. So we've really separated things out very, very nicely here. Now we do want to do this carefully. There's a common misconception by people who don't actually work on programming languages, that if you're making up a new language, you can do whatever you want, because it's your language, all right? And as we learn repeatedly in the course yes you have a lot of flexibility in language design, but if you do it wrong, your language doesn't work the way you expect. For example, dynamic scoping for closures is a terrible idea for lots of reasons. The same thing is true with typing and subtyping rules, that you can't just use whatever rules you want, at least not if you want your type system to actually accomplish something. Right? Back when we learned about static typing, we learned that the purpose of a type system is to prevent certain operations. In the language we're studying for subtyping, the purpose is to make it impossible to ever have a program that type checks but when you run it you tried to access a field of a record, that is not in that record. So it turns out that with just our typing rules from the previous segment we have that property. We are sound for preventing bad record field accesses. Now when we add subtyping, we want to make sure we stay sound. If we define the subtyping relationship in a way that ruins the soundness of our type system, I would say we've done the wrong thing. And there's a little saying about this in programming languages, which is that subtyping is not a matter of opinion. All right. Either your subtyping rules break your soundness or they don't. It's important to us that they not break the soundness. So the key thing you have to reason about for not breaking it[COUGH] is this idea of substitutability. So if t1 is a subtype of t2, then what we want is, that any value of type t1, can be used in every way a value of t2 can be. So this is going to be true for our example with the record with more fields. In the super type, where we just have x:real, y:real, the only thing you can really do with that record besides pass it around is get the x field, get the y field, set the x field, set the y field. And you can do all of those things, with something of type x:real, y:real, color:string. So the substitutability principle holds. All right? So that's why this works and that's why that's a legitimate thing to add to our subtyping relation. So without further ado. How 'bout I add, I now define, at least a preliminary subtyping relation? These are all things that I'm going to use to say that one type t1 can be less than another type t2. So the first thing is what we've been talking about. This has a name. It's called width subtyping. That you can take a wider record, and make it a subtype. Of a slimmer record, provided that the slimmer record has all the field names and types that are in the wider record. So that's the rule we have been talking about and you can do that. Basically you can take your wider record and drop some of the fields. Fine. Here's another rule that maybe it didn't even occur to you that we might need. How about I have two records that have the same fields and the same types but they're not written down in the same order. Well the order doesn't matter. Right, the you know, when I'm writing down a record type, it doesn't matter if I say x:real y:real. Or if I say y:real x:real. So we should let, one record type be a subtype of another record type that just permutes the order of the fields. Doesn't change the types of the fields, doesn't change what fields are there, but writes them out in a different order. And that's a very nice rule, it seems to clearly pass our substitutability test, because, you know, we'll still be able to get and set all the same fields. My point is if you don't write this down as part of your subtyping relation all sorts of programs don't type check that you really want to type check. So it's a great rule to put in your subtyping relation. And then we have two that, that kind of come along for free that every subtyping relation should have. But we should be explicit about them. The first one is transitivity, that if t1 is a subtype of t2 and t2 is a subtype of t3, then t1 is a subtype of t3. This should follow directly from substitutability. If a value of t1 can be used a value of t2 and a value of t2 can be used as a value of t3, then a value of t1 should be useable as a value of t3. This is a nice rule to have because that way our other rules, our rules that actually do something more interesting, don't have to say everything all at once. That if we want to think of subtyping as dropping a field, and then reordering fields width and then permutation we can use transitivity to get it works that way. And finally one you might think is really weird is called reflexivity. It's a word from discreet mathematics or logic. Every type is a subtype of itself so t is a subtype of t for any type t. And we don't really need that, so far. But when we see some other subtyping rules. In particular the subtyping rules for functions, which we'll get to shortly. It will be nice to have this rule, because it helps simplify other rules. In, in ways that we'll see. So summary. We have one new typing rule. If t1 is a subtype of t2, and e has type t1, then e also has type t2. And then we have to define the notion of two subtypes being, two types being subtypes of each other. And this slide is where we made out preliminary definition. These rules are all fine and in upcoming segments we'll try to add additional rules and see if we can do that in a sound way.