I want to talk about this AIDS vaccine trial before we start talking about the course material, because the issues that come up in it are things we're going to be addressing later in the semester, when we talk about experiments versus observational studies and hypothesis testing. In particular, there's a tie between this and Fisher's exact test, which we'll talk about in some detail. There have been a number of articles in the international press, including this one yesterday from The New York Times. It's funny: The New York Times apparently called me and sent me email while I was lecturing to you, so when I got out I had these messages. They wanted comment on this AIDS trial that took place in Thailand.

The story is that this was a massive trial, with roughly 16,000 people involved, recruited at various centers in Thailand. Some of them were given a placebo: something that has no direct pharmacological effect, though it might have a psychological one. The others were given a mixture of two vaccines; there were two different preparations involved. And it wasn't just the two vaccines once: it was a sequence of something like four injections over some period of time, and people were followed up for some period afterwards. Then you look at how many people contracted AIDS during the follow-up in each of the two groups: the group that got the vaccines versus the group that got the placebo shots. What they found was that slightly fewer people who got the vaccines contracted AIDS than people who got the placebo.

The rates of infection among the people involved in this trial were relatively high by U.S. standards. I think it was over about a six-year period, if I'm remembering right, and something like nine-tenths of a percent of the control group and seven-tenths of a percent of the treated group contracted AIDS during that time, which is an enormous infection rate compared to how things are in the U.S. I don't know whether that's typical for Thailand overall, or whether the centers that recruited subjects were recruiting people at particularly high risk. I know that transmission rates in Africa are very high, and transmission rates in Thailand are very high; I don't know whether this is typical or just what happened in the study group. At any rate, they analyzed the data in a number of ways.
I've got the numbers over here: there were 8,200 in the placebo group and, I think, 8,202 in the vaccinated group, the treatment group. In those two groups, 56 of the people who were vaccinated contracted AIDS, and 76 of those in the placebo group contracted AIDS during the trial. So the question is: can you figure out whether that difference would be surprising if the vaccine didn't actually help? It turned out that fewer people who got the vaccine contracted AIDS than people who got the placebo, but one group or the other is bound to end up with fewer cases; it's just not plausible that the two groups would end up with exactly the same number. So the question is: is the difference between the numbers of people who contracted AIDS in the two groups surprising, if the vaccine really didn't help at all, if the vaccine made no difference?

One of the big issues is how strong the evidence is that the vaccine helps. First, how big is the apparent effect? And second, is that apparent effect big enough, compared to the sample sizes involved, that you can be really confident the vaccine helps? Or is the apparent effect small enough, compared to the size of the experiment, that you can't confidently conclude the vaccine makes a difference? If you look at the difference in the raw rates, as I said, it's something like nine-tenths of a percent versus seven-tenths of a percent: about a 25% difference in the rates at which people contracted AIDS in the two groups, so roughly a 25% reduction in risk. But it's still a very small number; it's a change from a small number to a smaller number. And yet all the numbers are bigger than they are in the U.S. general population. So how do you think about the question of whether this difference between 56 and 76 is surprising or not? That's a fundamentally statistical question, and it's the kind of question we're going to figure out how to answer: how strong is the evidence that the vaccine helps? So you can imagine the following.
Each person in the study, each of the 16,402 people, has a ticket, and on the ticket are two numbers. One represents what's going to happen if the person is vaccinated; the other represents what's going to happen if the person is given the placebo. It might be that one person won't get AIDS if he's vaccinated and won't get AIDS if he's given the placebo. Here's another person, represented by the same kind of ticket, who won't get AIDS if vaccinated but will get AIDS if given the placebo. And here's somebody else who's going to get AIDS no matter what. Does this make sense? We can think of those numbers as being fixed ahead of time. If the vaccine makes no difference whatsoever to who contracts AIDS, then for each person those two numbers are equal: because the vaccine doesn't help, I'm either not going to get AIDS no matter what, or I am going to get AIDS no matter what. The vaccine doesn't change anything. Does that make sense?

So now let's look at how many people in all contracted AIDS in the study group, and the answer is 56 plus 76, which is 132. So I can imagine I have a population of 16,402 tickets. Of those, 132 have a one and a one, and the rest have a zero and a zero. Now I take a random sample: 8,200 of the tickets are labeled placebo, and for those I look at the placebo column; the other 8,202 are the ones who are vaccinated, and for those I look at the vaccination column. I get to see one number or the other. But if the vaccine makes no difference, the two numbers are equal for each person, so it's as if I simply took a random sample from these 16,402 tickets, of which 132 had ones on them and the rest had zeros. How surprising would it be, if I did that, that 56 of the ones would end up in the vaccinated group and 76 in the placebo group? In this little story, whether you're going to contract AIDS is predetermined; whether you get the vaccine makes no difference whatsoever. So what's the chance that, just because I'm randomly assigning 8,200 people to one group and 8,202 to the other, I end up with a disproportionately small number of cases in the vaccine group?

So I can ask: would I be surprised if the difference between these two groups were this large, in either direction? You can imagine taking these 16,402 tickets, reaching in and pulling out 8,200 of them, and saying: you are the placebo group, and the rest of you are the vaccinated group. I do that over and over and over again, and I look at how often the difference between the rate in the vaccinated group and the rate in the placebo group is as big as this or bigger. That tells me how surprising it would be to see this result if the vaccine makes no difference whatsoever.
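The repeated-shuffling procedure just described is easy to simulate. Here is a minimal sketch (the code and its names are mine, not the trial's actual analysis), using the counts from lecture: 16,402 tickets, 132 of them marked, split 8,200 to placebo and 8,202 to vaccine. The lecture quotes roughly eight percent; a simulation like this lands in that neighborhood.

```python
import random

def permutation_test(n_placebo=8200, n_vaccine=8202,
                     cases_placebo=76, cases_vaccine=56,
                     n_reps=10_000, seed=0):
    """Estimate how often a random split of the tickets produces an
    infection-rate difference at least as large (in either direction)
    as the one observed, assuming the vaccine makes no difference."""
    rng = random.Random(seed)
    total = n_placebo + n_vaccine                # 16,402 tickets
    total_cases = cases_placebo + cases_vaccine  # 132 tickets marked "1"
    observed = abs(cases_placebo / n_placebo - cases_vaccine / n_vaccine)
    count = 0
    for _ in range(n_reps):
        # Randomly place the 132 case tickets among the 16,402 slots;
        # slots 0..8199 are the placebo group, the rest are vaccinated.
        positions = rng.sample(range(total), total_cases)
        p_cases = sum(1 for i in positions if i < n_placebo)
        v_cases = total_cases - p_cases
        diff = abs(p_cases / n_placebo - v_cases / n_vaccine)
        if diff >= observed:
            count += 1
    return count / n_reps

print(round(permutation_test(), 3))
```

This is exactly the ticket model: under the null hypothesis the 132 "will get AIDS" tickets are fixed, and only the random assignment to groups varies from repetition to repetition.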
And the answer is that about eight percent of the time you would see a difference this big or bigger. So, is eight percent a small number or a big number? That's kind of hard to say. If you do enough trials, it's pretty nearly guaranteed that an event with eight percent probability will eventually happen. If you ran 100 trials of the vaccine, you'd be pretty confident that in at least one of them you'd see something that was eight percent or less likely to occur, even if the vaccine makes no difference whatsoever.

Alright, let's look at what the reporter said. This 31% number is based on a different analysis, which I actually don't trust as much. And this, with all deference to the reporter, is gibberish: a "sixteen percent chance that the results were due to chance" doesn't make sense. Neither does an "eight percent chance that the results were meaningless." I just explained what the eight percent chance has to do with.
What's going on here is: assume the vaccine doesn't work; what's the chance you would see as big an effect as you saw? That's the calculation. It's not "what's the chance the results are meaningless," or "what's the chance the vaccine doesn't work." The vaccine either works or it doesn't work; there's no probability about that. The probability is: supposing it doesn't work, what's the chance the data would come out the way the data came out? What's the chance the difference in rates between the control group and the vaccinated group would be the roughly 26% difference that it was, the difference between roughly nine-tenths of a percent and seven-tenths of a percent? So this is a very different question, and it has gotten garbled. Does this make sense to everybody?

Alright, the original study is here, in the New England Journal of Medicine. It came out in the middle of the night on Monday, and you can click through to read it and get some idea of what they did. It's a very interesting, very complicated protocol, a very complicated procedure, a massive undertaking, and not a terribly compelling conclusion. That doesn't mean the vaccine doesn't work; it's just not very strong evidence that it does work, nor does the effect seem to be that large. And that doesn't mean it couldn't save a lot of lives. Who knows?

Yes? "So eight percent is the chance that, if the vaccine doesn't work, you would get that kind of difference between the two groups?" Exactly. Eight percent is the chance that, if the vaccine doesn't work, you would see that big a reduction in risk. "So 92% is the chance that, if the vaccine did work, you would see that big a difference?" No: 92% is the chance that you would see a smaller effect than that, on the assumption that the vaccine doesn't work. The whole calculation is on the assumption that the vaccine doesn't work. It's not the probability that it doesn't work; we assume it doesn't work and ask: what's the chance you see what you saw? That's called a P-value. We'll get there. Other questions about this? Okay, let's go back to course material.

We talked about the definition of conditional probability, we talked about independence, and we talked about the tendency to confuse independence with mutual exclusivity. They're very different: if you have two events, neither of which has probability zero, and they're mutually exclusive, then they're dependent; they can't be independent. Now, we can turn the definition of conditional probability around to make something called the multiplication rule. Remember the definition: the probability of an event A given an event B is the chance that both A and B occur, divided by the chance that B occurs, provided P(B) is not zero. Well, there are a lot of situations where it's easier to understand what P(A given B) is than to calculate the probability of the intersection of A and B directly, so it's helpful to turn the definition around: multiply both sides by P(B) and flip it over, and we've got that the probability that A and B both happen is the probability of A given B, times the probability of B. This is called the multiplication rule; all I've done is multiply by P(B).

So why is this helpful? Well, very often the conditional probability is an easy thing to think about. For example: what's the chance that the first two cards in a well-shuffled deck are both kings? That's the intersection of two events: the first card is a king, and the second card is a king. Let's call A the event that the second card is a king, and B the event that the first card is a king, just because that's going to be easier to think about. So what's the chance that the second card is a king, given that the first card is a king? The deck is well shuffled, and you know the top card is a king. What are the options for the second card? There are 51 possibilities, and three of them are kings, and they're all equally likely, because all permutations of the deck were equally likely: we had a uniform distribution on permutations of the deck. So that conditional probability is three out of 51. And what's the unconditional chance that the first card, the top card, is a king? Four out of 52. So the chance that the first card is a king and the second card is a king is three out of 51, times four out of 52. That's an easier way to calculate it than to start thinking about all the ordered pairs you could have in the first two cards.

Very often, if you're talking about a test for a disease, some kind of screening, something like that, the conditional probability is an easy number to come up with. It would be something like: what's the chance that you test positive for steroid use, given that you're using steroids? You can calibrate a test that way: take a bunch of people who are using steroids, run the test on them, and see how often they test positive.
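Going back to the two-kings example for a moment: the multiplication-rule answer can be checked by brute force, enumerating all ordered pairs of distinct cards. A small sketch, with the deck represented abstractly as 52 cards of which cards 0 through 3 are the kings:

```python
from fractions import Fraction
from itertools import permutations

deck = range(52)
kings = set(range(4))  # cards 0-3 stand in for the four kings

# Brute force: all ordered pairs of distinct cards (first card, second card),
# counting how many pairs consist of two kings.
pairs = list(permutations(deck, 2))
both_kings = sum(1 for a, b in pairs if a in kings and b in kings)
brute = Fraction(both_kings, len(pairs))

# Multiplication rule: P(first is a king) * P(second is a king | first is).
rule = Fraction(4, 52) * Fraction(3, 51)

print(brute, rule, brute == rule)  # 1/221 1/221 True
```

Both routes give 12/2652 = 1/221, but the multiplication rule gets there without listing the 2,652 ordered pairs.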
But if you pull somebody at random from a population and you want to know the chance that they are using steroids and test positive for steroids, that's a different thing. I'm getting a little bit ahead of myself, though. Let's do an example that involves screening in a minute; actually, let's slow down and do this first. Exercise 14.8: you're taking a statistics class, and the grading policy says you pass if you get a C or better on the homework and the midterm and the final, or if you get a B or better on the final. Suppose you don't do any work at all; you just guess on everything, and you guess independently on the homework, the midterm, and the final. Suppose the chance you get a C or better on the homework by guessing is 23%, the chance you get a C or better on the midterm by guessing is 18%, and the chance you get a C or better on the final by guessing is 16%. And finally, the chance you get a B or better on the final by guessing is 5%.
So first of all, can you tell from this what the chance of getting exactly a C on the final is, rather than a C or better? Forget about plus and minus; think of the grade as just A, B, C, D. Sorry? "Thirteen?" Okay: the chance you get a C or better on the final by guessing is 16%, and the chance you get a B or better by guessing is 5%. So it would be 11%, right? You've got the event that you get a C or better, and a subset of it is the event that you get a B or better. The bigger event is C or better; the smaller one is B or better; the difference is exactly a C. Yes? So you've taken the event that you get a C or better and partitioned it into two pieces. You know the probability of the whole event, and you know the probability of one of the pieces; what's left over is the probability of exactly a C. So 16% minus 5% is 11%. Okay, what's the chance that you pass the course by guessing?
There are two ways you can pass by guessing: you can get a B or better on the final, or you can get a C or better on everything, yes? Now, those aren't mutually exclusive the way we've stated them, because getting a C or better on everything includes getting a B or better on the final as one possibility. So if we want to divide and conquer, breaking this up into pieces to find the probability, we're going to want to break it into things whose probabilities we can calculate separately. The easiest thing is to break the event that you pass the course into disjoint events, each of which we can find the probability of, so that we can sum those probabilities to find the overall probability. So what are two disjoint ways of passing the course by guessing? "Getting a C or better on the homework and midterm and exactly a C on [INAUDIBLE], or getting a B or better on [INAUDIBLE]." Excellent. One way to pass the class is to get a C or better on the homework, a C or better on the midterm, and exactly a C on the final. The other way is to get a B or better on the final, regardless of what your homework and midterm grades are. Those two ways are now disjoint, because if you get a B or better on the final, you didn't get exactly a C on the final. So we can find the probability that you pass the class by finding the probabilities of those two disjoint ways and adding them.

So call A the event that you get greater than or equal to a C on the homework, greater than or equal to a C on the midterm, and exactly a C on the final. (Either the chalk is particularly fragile or I'm particularly brutal with it today.) And call B the event that you get greater than or equal to a B on the final. Now A intersect B is empty, so they're disjoint, and A union B is all the ways you can pass the class. We've partitioned the event of passing the class into two disjoint pieces; we can find their probabilities separately and add them to find the probability that we pass.

So how about P(B)? That ought to be easy: it's given. P(B) is 5%. And what's P(A)? Well, for A, this has to happen, and this has to happen, and this has to happen: it's the probability of an intersection. One of the assumptions is that you're guessing independently on all these things, so the event that you get at least a C on the homework is independent of the event that you get at least a C on the midterm, and that's independent of the event that you get exactly a C on the final. So we can find the probability that all three happen by multiplying the probabilities that they happen separately, using the independence. What's the chance you get at least a C on the homework? 23%. At least a C on the midterm? 18%. Exactly a C on the final? We figured that was 11%. So the probability of A, using the independence, is 0.23 times 0.18 times 0.11. And if we add 0.05 to that product, that's the chance you pass the course by guessing. Is this alright? Any questions?
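The whole Exercise 14.8 calculation fits in a few lines. A sketch using exact fractions, with the numbers as given in lecture:

```python
from fractions import Fraction  # exact arithmetic, no float rounding

# Chances of each outcome when guessing (from the problem statement).
p_hw_c = Fraction(23, 100)     # C or better on homework
p_mid_c = Fraction(18, 100)    # C or better on midterm
p_final_c = Fraction(16, 100)  # C or better on final
p_final_b = Fraction(5, 100)   # B or better on final

# "B or better" is a subset of "C or better" on the final, so subtracting
# gives the chance of exactly a C on the final.
p_final_exactly_c = p_final_c - p_final_b

# Event A: C or better on homework and midterm, exactly a C on the final.
# The guesses are independent, so multiply.
p_A = p_hw_c * p_mid_c * p_final_exactly_c

# Event B: B or better on the final.  A and B are disjoint, so add.
p_pass = p_A + p_final_b

print(float(p_final_exactly_c))  # 0.11
print(float(p_pass))             # 0.23 * 0.18 * 0.11 + 0.05 = 0.054554
```

Note the two moves mirror the lecture exactly: subtraction for a nested event, multiplication for independent events, addition for disjoint events.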
Okay. So we used independence to find the probability of an intersection, and we first broke things up into disjoint pieces so that we could add their probabilities to find the probability that either of them occurs. Now let's talk about screening for a disease, or a condition, or something like that. So let's suppose that ten percent of some population has benign chronic flatulence. Yes? [INAUDIBLE] Okay: so, when do you multiply, and when do you add? If you want the probability of A union B, and A intersect B is empty, that is, the events are mutually exclusive, then you can take P(A union B) to be P(A) plus P(B). That's one of the axioms; it's just got to be true. So if you're looking for the probability of a union and the events are disjoint, you can find it by adding. If you want the probability of A intersect B, and A and B are independent, you can find it by multiplying: P(AB) equals P(A) times P(B), and that's essentially the definition of independence. If you're looking for the probability of a union and the events aren't mutually exclusive, you can't just add; you've got to subtract off the probability of the intersection. If you're looking for the probability of an intersection and the events aren't independent, you can't just multiply; you've got to do something else. You can use the multiplication rule, because it's always the case that P(AB) equals P(A given B) times P(B); in the special case of independence, P(A given B) is P(A), because learning that B happened doesn't tell you anything about whether A happened: it's the same as the unconditional probability. And over here, it's always the case that P(A union B) equals P(A) plus P(B) minus P(A intersect B); in the special case that the intersection is empty, that last probability is zero, and this reduces to simple addition. We good?

Okay. So suppose we have a way of screening for benign chronic flatulence that has a 90% chance of correctly detecting that you have this condition, which is not fatal, and a ten percent chance of a false positive.
So there's a ten percent chance that the dog did it, or something. You pick a person at random from the population, so everybody has the same chance of being picked; you test the person, and the test reports positive. What's the chance the person actually has the condition? I don't think I said out loud that ten percent of the population has the condition. The test is 90 percent accurate; somebody tests positive; what's the chance they actually have the disease? Now, very often people's intuition leads them astray here. They think: oh, if the screen is 90% accurate, then if somebody tests positive, there's a 90% chance they have the disease. Right? Wrong. What does the 90% say? It's a 90% chance that you test positive if you have the disease, not a 90% chance that you have the disease if you test positive. It's the other way around.
So what we need to do is turn the conditional probability of testing positive given that you have the disease into the conditional probability that you have the disease given that you test positive. They're two different conditional probabilities: one is the probability of A given B, the other is the probability of B given A. Well, let's use D to denote the event that the person has the disease, and T to denote the event that the person tests positive. Translating the words into symbols is the hard part of most of these problems; once you've got it translated into math, the math isn't very difficult. What the problem statement said is that ten percent of the population has the disease, so the probability that a person picked at random from the population has the disease, P(D), is ten percent. And the test has 90 percent accuracy, which we interpret to mean that the chance you test positive, given you have the disease, is 90 percent, and the chance you test positive if you don't have the disease is ten percent. Those are two different pieces of information; you can't derive one from the other. The numbers happen to be complementary because we're saying the test is 90% accurate in both directions: the chance of a false positive is ten percent, and the chance of a false negative is ten percent. Does that make sense?

So what we're supposed to find is the chance that the person has the disease, given that they test positive; what we're given is the chance that they test positive, given that they have the disease, the other way around. By the definition of conditional probability, what we want is simply the chance that they have the disease and test positive, divided by the chance that they test positive. Now, for problems like this that involve conditional probabilities, the easiest way to solve them is to draw a tree. If you draw a tree, you will seldom go astray.
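The tree we're about to draw can also be written out as plain arithmetic. Here is a sketch with this example's numbers (10% base rate, 90% chance of detecting the condition, 10% false-positive rate); the variable names are mine:

```python
# Base rate and test characteristics from the example.
p_disease = 0.10          # P(D)
p_pos_given_d = 0.90      # P(T | D): test positive if you have it
p_pos_given_not_d = 0.10  # P(T | not D): false-positive rate

# The four leaves of the tree (multiplication rule on each branch).
p_d_and_pos = p_disease * p_pos_given_d                    # 0.09
p_d_and_neg = p_disease * (1 - p_pos_given_d)              # 0.01
p_nod_and_pos = (1 - p_disease) * p_pos_given_not_d        # 0.09
p_nod_and_neg = (1 - p_disease) * (1 - p_pos_given_not_d)  # 0.81

# Sanity check: the four leaves partition the whole population.
assert abs(p_d_and_pos + p_d_and_neg + p_nod_and_pos + p_nod_and_neg - 1) < 1e-12

# Bayes' rule: condition on testing positive.
p_pos = p_d_and_pos + p_nod_and_pos  # 0.09 + 0.09 = 0.18
p_d_given_pos = p_d_and_pos / p_pos

print(p_d_given_pos)  # 0.5
```

Each leaf is a conditional probability times an unconditional one, and conditioning on a positive test just divides one leaf by the sum of the two positive-test leaves.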
(I need to put more trees in the book; I haven't done it yet.) Alright, so the population is divided into two groups: those who have the disease and those who don't. There's a ten percent chance that a person picked at random has the disease and a 90% chance that a person picked at random doesn't, so this branch is D and this one is D complement. Then, if you have the disease, there's a 90% chance that you test positive and a ten percent chance that you test negative. If you don't have the disease, there's a ten percent chance that you test positive and a 90% chance that you test negative. What we're putting on these branches are conditional probabilities. So what's the probability that somebody in this group, somebody who has the disease, tests positive? The multiplication rule says the chance of D intersect T, the chance that you have the disease and test positive, is the conditional probability that you test positive given that you have the disease, 90%, times the unconditional probability that you have the disease, ten percent. So the amount of probability at this node, has-the-disease-and-tests-positive, is 0.9 times 0.1, which is 0.09: nine percent. These people have the disease and test positive, and there's a nine percent chance of that happening if you pull a person at random and test them.
Alright, this node is has-the-disease, tests-negative. Ten percent is the conditional probability that you test negative given that you have the disease, and ten percent is the probability that you have the disease, so the probability that you have the disease and test negative is that conditional probability times that unconditional probability: 0.1 times 0.1, which is 0.01, one percent. So this is one percent, and the event is D intersect T complement. Then here, doesn't-have-the-disease, tests-positive: the probability of that intersection is the conditional probability times the unconditional probability, 0.1 times 0.9, which is 0.09, nine percent. So this is nine percent, and the event is D complement intersect T. And then this last pile, the probability that you don't have the disease and test negative, is the conditional probability that you test negative given that you don't have the disease, times the unconditional probability that you don't have the disease: 0.9 times 0.9, which is 81%. That event is D complement intersect T complement. And just as a sanity check, do these add up to 100%? Yep: 81 and nine is 90, and ten more is 100. Good.

Alright, so what are we interested in? We're interested in the probability that you have the disease, given that you test positive. So which of these branches of the tree correspond to testing positive? The ones that have a T in them, not a T complement: this one and this one. When we condition on the event that the person tests positive, we're limiting the universe to the outcomes where somebody tests positive. Of those, what fraction actually have the disease? It's 50%, right? It's nine percent divided by eighteen percent: eighteen percent of people are going to test positive, and of those, one half, the nine percent, actually have the disease; the other half are false positives. Does this make sense?

Okay, so what we've got is: the probability that you have the disease, given that you test positive, is the probability that you have the disease and test positive, divided by the probability that you test positive. We've partitioned the event that you test positive into two pieces; you can test positive in two ways. You can test positive and have the disease, or test positive and not have the disease. Those are the only possibilities.
So the numerator is still P(DT), and the denominator is P(DT) plus P(D complement T): we've partitioned the event T into two disjoint pieces, T and D, and T and not D. And what we just did is equal to nine percent over nine percent plus nine percent, which is a half. So even though the test is 90% accurate, the conditional probability that you have the disease given that you test positive is only 50%, because a lot of the positive test results are false positives. You need to take the base rate of occurrence, in this case of the disease, into account in figuring out the chance that a person has the condition, given that they test positive for it. Base rate matters.

Now, the way we actually did this: we calculated each piece by using a conditional probability times an unconditional probability. We partitioned this event into two disjoint events, and for each of those we found its probability by multiplying a conditional probability times an unconditional probability. We can write down what we did in symbols, and that's called Bayes' rule. Let's look at what we actually did. We said P(D given T) is P(DT) divided by P(T), and that's P(DT) over P(DT) plus P(D complement T), partitioning the denominator into two pieces. Now, how did we actually calculate that? We calculated the numerator, P(DT), as P(T given D) times P(D): 90% times ten percent. How did we calculate the bottom pieces? The first piece is the same as the numerator, P(T given D) times P(D); the other is P(T given D complement) times P(D complement). So we used the multiplication rule to calculate the numerator and to calculate the two pieces of the denominator. This is called Bayes' rule, after the Reverend Thomas Bayes. Bayes' rule is a way of turning a conditional probability into, sort of, the converse conditional probability: it expresses P(D given T) in terms of P(T given D) and P(T given D complement). Memorizing the formula will probably not help you.
Being able to derive the formula, will help you. Being able to draw a tree will
definitely help you. Okay, so if you get a problem like this. Drawing a tree is, the
way to go. You're much less likely to get confused about what goes where. Yes? So
[INAUDIBLE] Okay, so the question is, how did I know to start by branching on who has the disease and who doesn't, rather than branching on who tests positive and who doesn't test positive? Well, the answer is, this is people. Right? And I'm dividing
people into groups. I don't know how many people test positive. That's actually not
given. That's kind of what I'm trying to figure out. I do know how many people have
the disease. That is part of the problem statement. So I actually don't even have
the information I would need to branch on test results first. Does that make sense?
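The tree calculation above can also be sketched in code. Here's a minimal Python version using the lecture's numbers, P of T given D of 90 percent and P of D of ten percent; the false-positive rate P of T given D-complement isn't restated in this excerpt, so the 0.20 below is only a placeholder to make the sketch run.

```python
def p_disease_given_positive(p_d, p_t_given_d, p_t_given_dc):
    # Numerator: P(T and D) = P(T | D) * P(D), by the multiplication rule
    p_td = p_t_given_d * p_d
    # Denominator: P(T) = P(T and D) + P(T and D-complement),
    # each piece a conditional times an unconditional probability
    p_t = p_td + p_t_given_dc * (1 - p_d)
    return p_td / p_t

# P(D) = 0.10 and P(T | D) = 0.90 are from the lecture;
# P(T | D-complement) = 0.20 is a placeholder assumption.
print(p_disease_given_positive(0.10, 0.90, 0.20))
```

This is exactly Bayes' rule: the function computes P of T given D times P of D, divided by the partition of P of T into the D and D-complement pieces.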
Yep. Okay. Other questions? Okay. There is an example I would encourage you to work
through for fun. It's this one, 14-7. But I just have to talk about it briefly, although I'm not gonna go through it. So, it's been said that if you put enough monkeys in front of typewriters and you wait long enough, one
of them will type the complete works of Shakespeare. My colleague Robert Wilensky
in the computer science department said the following. We've all heard that a
million monkeys banging on a million typewriters will eventually reproduce the
entire works of Shakespeare. Now, thanks to the internet, we know this is not true. Anyway, this is an interesting problem; it's worth puzzling over for a little bit. I want to go on and talk about the next chapter, but let's do one slightly more complicated problem first. We toss a coin five times
independently and the question is, What's the chance the total number of heads in
the first three tosses is less than or equal to the total number of heads in the last two tosses? How on earth do we figure this out? We can start by
naming things. Alright? It's always a good idea. Introduce some symbols, give things
a name. So, let X be the number of heads in the first three tosses, and let's let Y
be the number of heads in the last two. So, what we're asking is, what's the
probability that X is less than or equal to Y? Okay. So that's some event. What are
the possibilities for X and Y? What can they each be? How many heads can you get in three tosses of a coin? Zero, one, two, or three, right? Alright, so the possible values are zero, one, two, and three. What are the possible values here? Zero, one, and two, okay? What combinations of values correspond to X being less than or equal to Y? Alright, so, there's four possibilities for the first and three possibilities for the second, so there's twelve possibilities in all. Right? They're not equally likely, but there are twelve possibilities. Some of them correspond to X being less than or
equal to Y, some of them don't. So for example if X is two and Y is zero, X is
not less than or equal to Y. Okay. If X is zero, Y can be anything, and X is less
than or equal to Y. Yes? Okay, so let's just make a little table: X, and then the possible values of Y that correspond to X being less than or equal to Y. So if X is zero, Y can be zero, one, or two, and X is in fact less than or equal to Y. If X is one, Y can just be one or two, in order to have X less than or equal to Y. If X is two, Y can only be two, or else X is not less than or equal to Y. And if X is three, there's no value that works, because Y can't be bigger than two. So these are the outcomes that correspond to the event that X is less than or equal to Y. So far so good. Okay,
are these mutually exclusive? Yeah. If X is equal to zero, X is not equal to one. Right? These rows, if you like, are mutually exclusive. Yep? Okay. So, if I
can find the probability of any of these three things happening, right? This first row is really three outcomes: (0,0), (0,1), (0,2). And this one is two outcomes: (1,1) and (1,2). If I can find the probability of any of these three things
happening I can add it to the probability of either of these two things happening,
and add it to the probability of this happening. And what I've got is the
probability that x is less than or equal to y. Okay. So how do I find this? Well
what's the chance that X is, in fact... So, I'm gonna now divide this up in another way. What's the chance that X is zero and Y is zero? It's the intersection of two events: the event that X is
zero and the event that Y is zero. The number of heads in the first three tosses is independent of the number of heads in the last two tosses. Yes? So the event that X is zero is independent of the event that Y is zero. So I can find the probability that X is zero and Y is zero by multiplying the chance that X is zero times the chance that Y is zero,
using independence. Yes? Okay. What's the chance that Y is zero, one, or two? One, yeah? Those are the only possibilities for Y. Okay, what's the
chance that X is zero? Toss a coin three times. It's a fair coin, and the tosses are independent: 50 percent chance it lands heads the first time, 50 percent chance it lands heads the second time, 50 percent chance it lands heads the third time. In order for X to be zero it has to land tails all three times. Okay. So, since the tosses are independent, the chance it lands tails the first time, and the second time, and the third time is the chance it lands tails the first time, times the chance it lands tails the second time, times the chance it lands tails the third time. That's the independence in action. Right? So a half
times a half times a half, that's an eighth. Good. So the chance that X equals
zero is an eighth. I'm not sure how to do this clearly. I guess I'll put the
probabilities here. So, the chance of that is an eighth. And, what's the chance that
Y is zero, one, or two? We just said it's one. So the chance that X is zero and Y is
zero, one, or two, because X and Y are independent, I can just multiply. That's
an eighth times one, and the whole contribution is an eighth. What's the
chance that X is one? So, this is a probability, this is a probability, and this is a value. Okay, how can X be one? I need to get one head and two tails, yes? That
one head could be on the first toss, or the second toss, or the third toss. That's
a partition. Right? Because if it's on exactly the first toss, it wasn't on the
second toss. If it's on the second toss, it wasn't on the third toss, et cetera.
So, what's the chance that I get a head on the first toss and then two tails? That's
an eighth, right? A half times a half times a half. What's the chance that I get
tails on the first toss, heads on the second, tails on the third? Half times a
half times a half is an eighth. What's the chance I get two tails and then a head?
Half times a half times a half is an eighth. Yes? There's three disjoint ways
that I could get one head and two tails. Each of them has probability an eighth, so
the chance that I get, exactly one head in those three tosses is three eighths. Yes?
What's the chance that I get one or two heads in the second, in the last two
tosses. Well it's the chance of anything, minus the chance that I don't get any
heads at all. Right? We could find that directly, but the complement is easier in this case. The complement of the event that I get one or two heads is the event that I get no heads at all, yes? Cuz those are the only possibilities here. What's
the chance that I get no heads at all. In the last two tosses. Quarter, right?
Tails, tails, one-half, one-half, one-fourth. So the chance that I get
either one or two heads is three quarters. Yep, one minus a quarter, so this is three
quarters. And the overall chance that this row happens, that X is equal to one and Y is equal to one or two, is three-eighths times three-quarters, because X and Y are independent. So 9/32. What about here? What's the chance that X
is equal to two? Need to get two heads and one tail in the first three tosses. Those
could be heads, heads, tails. Heads, tails, heads. Tails, heads, heads.
Alright, I have three choices for which one gets the tail, just like here I had three choices for which one gets the head. Same probability, three-eighths. What's the chance Y is equal to two? A quarter. Okay, so this is three-eighths times a quarter by independence, which is 3/32. Okay, and one-eighth is 4/32. So I've got four plus nine is thirteen, plus three is 16/32, which is one-half. Okay. So overall
this whole probability is a half. Everybody follow all the moves? We used independence a lot, the independence of all the tosses. We used the complement rule to find the chance of getting either one or two heads, by subtracting the chance of getting no heads from the chance of getting zero, one, or two. Right, okay.
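Since a fair coin makes all toss sequences equally likely, the half we just got can be double-checked by brute-force enumeration. A small Python sketch (the variable names are mine, not the lecture's):

```python
from itertools import product

# All 2^5 = 32 sequences of five fair-coin tosses are equally likely.
favorable = 0
for tosses in product('HT', repeat=5):
    x = tosses[:3].count('H')  # heads in the first three tosses
    y = tosses[3:].count('H')  # heads in the last two tosses
    if x <= y:
        favorable += 1
print(favorable, '/ 32')  # 16 / 32, i.e. one-half, matching the table
```

Counting sequences row by row gives the same pieces as the table: 4 sequences for the X = 0 row, 9 for X = 1, and 3 for X = 2.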
Speak now. Alright. Again we have kinda blasted through but I'd like to do this.
Anybody looked at this chapter yet? Chapter fifteen? The Let's Make a Deal problem. Anybody heard of the Let's Make a Deal problem, or the Monty Hall problem? Yeah, there was a game show called Let's Make a Deal, and one of the things that happens in the game is that there are three doors, one of which hides a brand-new Corvette, and one of them has a goat, and the other one has a year's supply of hay for the goat, or something like that. So there's a prize that you want, and then a couple of things that are jokes, okay. And what happens is that you're offered a
chance to pick a door. And, after you've picked the door, the host, Monty Hall,
shows you, that the prize you want is not behind one of the doors you didn't pick.
Okay? So you pick door A, and the host shows you that door B has the year's
supply of hay. And the question now is, is it better strategy to stick with your
original guess? Or to switch your guess to door C? He offers you now a chance to
switch. He shows you that the grand prize is not behind a door you didn't pick, and
offers you a chance to switch. Should you switch? Yeah. No. Yes. Yes. No. Yes. No.
Maybe. Who cares? Uh-huh. Who wants a Corvette anyway? Yeah. Okay. So, what
makes this more entertaining is that this question was posed to Marilyn vos Savant, the world's smartest human, in the Ask Marilyn column. I don't know, is she still an active columnist? Yeah. Okay. So she said that you should switch. And a
handful of University professors wrote back to her saying you're wrong, here's
the math, you shouldn't switch, conditional probability that the prize is
behind C given that it isn't behind door B is 50%, so you're wrong Marilyn. All
right. So the question is, who's smarter? Marilyn or the professors? Hm. So, here's
a, here's a model of the game. So I, you know. I, I, Monty offers me a chance to
pick a door. I pick door A. Monty shows me that the prize is not behind Door B. Now I
have a chance to switch. Should I stick with my original guess or pick Door C? I'm
going to stick with my original guess. Oops! It was actually behind Door C. I
would've lost. Okay so again I pick door A, Monty shows me it isn't behind door C,
I'm going to pick door B. Oh, I won. Proof! No. We're actually going to look at doing
this over and over again and figure out what the probabilities are that you would
win a certain number of times with a particular strategy in repeated play and
see if we can use that to figure out empirically what the right answer is. But
first we're gonna work on it theoretically. Yes? What would you, could
you stay, or would you figure differently? [INAUDIBLE] We'll get there, in
excruciating detail. Alright, so I picked door A, he's showing me it isn't behind
door B. I stick with my original guess, I lost. Okay, these are, by the way this
really is being done as randomly as I can do it with a computer, right. The, the
computer is hiding the prize behind a door at random. And after I click it is showing
me a door that the prize is not behind and it's not then moving the prize. Right?
It's sort of, the prize is staying where, where it was and I get to pick another
door. So I'm gonna stick with A again. Look, that time I won by sticking with door A. Okay? Right. So what's the better strategy? Alright. So
What are the rules of the game? So, we're going to assume the prize is hidden at
random, there's a, there's a chance of a third of it being behind each of the three
doors. I'm going to pick a door at random. It really doesn't, if the prize is hidden
at random it sort of doesn't matter how I pick. But, the point is that I don't have
any information about what door hides the prize. So, I'm not able to sort of exploit
something to do a better job at that stage. Then, an important rule is that the host is never going to open a door and say, "There's the Corvette, you didn't get it." Right? Not until after the next round, right? The door he opens is always going to be a door that doesn't have the prize behind it.
Okay? So far so good? Alright. So, here is the naive argument. So, don't switch,
because after the host shows you what's behind one door, there's two doors left,
the prize is equally likely to be behind either one. There's no advantage to
switching. It doesn't matter whether you switch or not. Okay. Now it was true when
the game started that the prize was equally likely to be behind any of the
three doors. The question is when you rule out a door is it now equally likely to be
behind the two remaining doors or not. Okay, this argument says it is. All right,
so here's a more sophisticated argument which is kind of what the professors were
saying. So let's call the three doors A, B, and C. Capital A is the event that the prize is actually behind door little a, capital B is the event that it's actually behind door little b, and so on. So the event B complement is the event that the prize isn't behind door B. Okay, that is, it's actually behind either door A or door
C. Now. Let's suppose I pick door A. The host then shows me that it isn't behind
door B. What's now the probability, the conditional probability, that it's behind door A versus behind door C? The calculation would be: what's the probability that it's behind door C, given that it isn't behind door B? Well, that's the chance that it's behind door C and not behind door B, divided by the chance that it's not behind door B. This is just the definition, alright. But in order for it to be behind door C it can't be behind door B. So the intersection of C and B complement is just C; C is a subset of B complement. Does that make sense? It's behind
door C, it wasn't behind door B. You already know that. All right? So this is
just P of C divided by P of B complement. P of C, that's the chance the prize was
originally behind door C, that's a third. The chance that it wasn't originally
behind door B is two-thirds. All right? So that conditional probability is a half.
Okay, and similarly if you find the chance that it's behind door A and given that it
isn't behind door C, you get exactly the same number, it's a half. So according to
this argument, there's no advantage to switching. That's what the professors claimed. All right. It's math, it's got to be right, yeah? Well, it isn't. Here's
Marilyn's argument, basically. Suppose you play over and over again. In the long run
a third of the time your first guess is gonna be right. Right? Cuz in the long
run, say you always pick door A, a third of the time it'll really be behind door A. And so if you stick with your original guess you'll win a third of the time. What happens the other two thirds of the time? Right? The
other two thirds of the time it's behind one of the remaining doors, but the host shows you which one it isn't behind. And so, the other two thirds of the time, if you switch, you'll win. Alright. Does that make sense? Two-thirds of the time it's
behind one of the doors you didn't pick. When that happens, if you switch you win.
One-third of the time it's behind the door you did pick. When that happens if you
stick with the original guess you win. Okay. So divide the universe of
possibilities up into two things, it's behind your original guess, or it isn't.
The chance is it's behind your original guess is a third, the chance it isn't is
two-thirds. If you, if you always stick with your original guess, you win a third
of the time, because a third of the time the original guess is right. If you always
switch, you win two-thirds of the time, because two-thirds of the time it's behind
one of the other two doors, and the host just showed you the one that it isn't
behind. The only one left is the one that it is behind. That makes sense? Okay. So.
What's wrong with the math? How can the math be wrong? It calculates the wrong
thing, okay? There's no algebraic error. There's a conceptual error. The right
story, the right thing to calculate is not the probability that it's behind door C
given, given that it isn't behind door B. That's not an accurate description of what
happened. It's: what's the probability that it's behind door C, given that you picked door A and the host revealed to you that it isn't behind door B, right? The fact that you picked door A, the host knew which door you picked, and then used that information to decide which door to show you, which door to open: that essential ingredient is missing. The professors calculated the wrong conditional probability. Okay? So, let's see
if we can now work through the right calculation. So, what's the chance that
it's behind door C, given that you picked door A, and the host reveals that it isn't behind door B? A little more complicated thing to think about. Well, we're going to divide up the universe of possibilities. So under what
circumstances? Okay, I need an extra assumption here, which is what happens if I pick the door that actually hides the prize. Which door is the host then going to show me? And we're gonna pretend that if I picked door A and it really is behind A, then the host flips a coin to decide whether to show me door B or door C, right. If I pick door A and the prize is really behind door B, the host doesn't have a choice; he has to open door C. If I pick A and it's really behind C, he has to show me B. He really doesn't have a choice. Make sense? But if I pick door A, and it really is behind door A, he has a choice, and I'm going to pretend he tosses a coin to make that choice. Alright. So we're now going to use the idea that the
door that I pick is independent of the door it's actually behind. I don't have
any information to help inform my original choice. So, what's the chance that I,
that, that it actually is behind door C and I pick door A? Well, my pick is
independent of how the prize was hidden. So the probability that it is actually
behind door C is a third. The probability that I pick door A is a third under the
assumption that I'm making. The probability of both of these events,
they're independent, so it's the product of the probabilities. That's a ninth.
Okay? So far so good? Alright. So now, I want to think about, what's the
probability that I pick A, and the host shows me that it isn't behind door B. Well
there's two ways that the host can end up showing me that it isn't behind B alright.
One way is I pick A, it's actually behind door C. Host doesn't have a choice, has to
show me door B. Yes? The other way is I pick A, it really is behind A, host tosses
a coin. 50 percent chance the host then shows me that it isn't behind B. 50
percent chance he shows me it isn't behind C. So far so good, alright. So, okay, I'm partitioning the event that I pick A and the host reveals that it isn't behind B into two disjoint events, okay. They're disjoint because if it really is behind A, it isn't behind C. That's what I'm partitioning. And so:
Chance that it is behind A and I pick A and the host shows me that it isn't behind
B. Plus the chance that it really is behind C and I pick A and the host shows
me it isn't behind B. So what's the chance of this? Well, the chance that it's behind A and I pick A, we already figured out, is a ninth. Okay, and if it's behind A and I pick A, half the time the host is gonna show me that it isn't behind B, and half the time the host is gonna show me that it isn't behind C. Right? So this probability is half of a ninth. Okay, what
about this? If it's actually behind C and I pick A, chance of that is a ninth,
right? But the host then doesn't have a choice. If this, if these two things
happen, the host has to show me that it isn't behind B. It has no choice. So the probability of this piece is a ninth. So I have an eighteenth plus a ninth. Does that make sense? Alright. Okay, so now I have all the pieces that I need to do this
calculation. What's the chance that it really is behind door C, given that I pick
A, and the host reveals that it isn't behind door B. Well, that's. The chance
that I, that it is behind C, and I pick A, and the host reveals that it isn't behind
B, divided by, the chance that I pick A and the host reveals that it isn't behind
B. The chance of this piece is the thing we just figured out: that's a ninth. And the other piece is a ninth plus an eighteenth. So we have a ninth divided by a ninth plus an eighteenth, which is two-thirds. Sorry? I'm finding a conditional probability. So, the
conditional probability of C, given this and this is the probability of C and this
and this, divided by the probability of this. Okay. So, this agrees with Marilyn's
solution, it says that switching is the better strategy. If I switch the chance I
win is two thirds, if I stick with my original guess the chance I win is one
third. The, the problem is not in the arithmetic. The problem is in what is the
conditional probability we're trying to compute. The problem is in the conception.
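The corrected calculation can be written out with exact fractions. Here's a minimal sketch of the partition argument, under the stated rules (pick independent of hiding place, host never reveals the prize, coin toss when he has a choice):

```python
from fractions import Fraction

third = Fraction(1, 3)

# P(behind C, I pick A, host shows B): pick and hiding place are
# independent, and the host has no choice, so (1/3)(1/3)(1) = 1/9.
p_C_pickA_showB = third * third
# P(behind A, I pick A, host shows B): the host tosses a coin,
# so (1/3)(1/3)(1/2) = 1/18.
p_A_pickA_showB = third * third * Fraction(1, 2)

# Conditional probability that the prize is behind C,
# given that I picked A and the host revealed B.
answer = p_C_pickA_showB / (p_C_pickA_showB + p_A_pickA_showB)
print(answer)  # 2/3
```

A ninth divided by a ninth plus an eighteenth is two-thirds, agreeing with Marilyn's repeated-play argument.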
Does this make sense? If not, this is a good time to ask questions. Yes? [INAUDIBLE] Okay, so the question is, should this be the probability that it's behind door C, given the host reveals B complement, given what happened first. Okay, so the scenario: at the time you're offered the choice, what has happened? You've picked door A and the host has shown you that it isn't behind door B. That's the information. The information is
you, you, you picked a given door. Host knows that you picked that door. On that
basis, host then opens some other door. Okay, so the, the information that you
really have is, is this, and you get to base your decision on all that
information. I don't know if that makes sense or not. You're conditioning on everything that's already happened at the point that you have to make a decision.
So, you picked a particular door. Host knows you picked that door. Host followed
some rules to decide then what door to, to, to show you. In particular the rule is
if there's only one door he can show you without showing you the prize he shows you
that one. And, if, if there's two doors he can show you he tosses a coin. So, he's
done all that, and now: what should you do? You're not happy yet. But why doesn't C... [INAUDIBLE] Right. Okay, so there's a
couple things going on here, I'm going to try to tease them apart, alright? So we
don't have a notion of an event given another event, we have a notion of a
conditional probability given another event. So, what is the event that's being
conditioned on? Right, so we don't have a notion of probability of A given B given C. That's not something we have a definition for, right? What we have is a chance of A given something. Alright, and sort of, what's the something that's going to go in here? It's the information that we have about the state of the world. Now, the host decides to reveal B after knowing that you picked A.
That's certainly true. Right? But the way the host goes about making that decision, we sort of encapsulated in the rules, right? We're using the way the host makes the decision in doing these probability calculations. Right, but what we're conditioning on is where we are in the
game. Well, what's happened at this point, I picked door A, host has showed me it
isn't behind door B. That's where we are. But yeah, I can tell I'm not satisfying you, and I don't have a better way to say it right now, but I'll think about it. Yeah. Okay. I'm sorry. Other questions? Alright.
So, what's wrong here? The first argument just says, oh, if there's two possibilities they must be equally likely. Well, they're not equally likely. There's no reason to think they're equally likely. In fact, the probability that it's behind the door you picked is one-third, and the probability that it's behind that remaining door that's still closed after the host has shown you one is two-thirds. Not equally likely; you can't just assume that they are. Argument two does a calculation that is algebraically correct. It's just
the algebraically correct calculation of the wrong quantity. It doesn't actually
describe how the game is played. Argument three is nice and simple and it only
depends on the assumption that the prize is initially equally likely to be behind
any of the three doors. Okay, it's clean; I like it a lot. Argument four is more technical; it involves additional assumptions about independence of this and that, and how the host decides which door to show you, and so forth, but it gets the same answer ultimately as argument three.
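Marilyn's repeated-play reasoning can also be checked empirically, as in the clicking demo from earlier. Here's a simulation sketch under the stated rules; the function and its parameters are mine, not from the lecture:

```python
import random

def play(switch, trials=100_000, seed=0):
    """Estimate the win probability for always-switch or always-stick."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        prize = rng.randrange(3)  # prize hidden at random
        pick = rng.randrange(3)   # original guess, independent of the prize
        # Host opens a door that is neither the pick nor the prize;
        # when both unpicked doors are empty, this amounts to a coin toss.
        opened = rng.choice([d for d in range(3) if d != pick and d != prize])
        if switch:
            # Switch to the one remaining closed door.
            pick = next(d for d in range(3) if d != pick and d != opened)
        wins += (pick == prize)
    return wins / trials

print(play(switch=False))  # close to 1/3
print(play(switch=True))   # close to 2/3
```

In repeated play the stick strategy wins about a third of the time and the switch strategy about two-thirds, matching arguments three and four.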
Alright, let's work one of these examples quickly. Okay, I'll give you guys a choice; we have five minutes. Okay, one is Let's Make a Deal with the prize behind one of five doors. And one is: the host follows a different rule in deciding which door to show you. He doesn't always show you a door that doesn't hide the prize; sometimes he might show you the prize. He just picks at random from the remaining doors. And in this one, the host might even open the door that you picked. He's just picking completely at random, including possibly opening the door that you picked. Sorry? It takes away the point of the game. Oh
yes, it does take away the point of the game. Not such a fun game. And then, this
last one is a completely different thing about mice running in mazes at random.
Random mice. Any preference? [INAUDIBLE] Okay, let's do the mice. So you
got two mice in separate mazes. There's only one path that leads to the cheese.
The first maze has four paths, three of them are dead ends. Second maze has five
paths, four of them are dead ends. The mice go at the same speed. All nine of the
paths are the same length. Each mouse decides at random which path to take first, with an equal chance of taking each path. If it goes to a dead end, the mouse goes back to the starting point at the same speed, then picks one of the remaining paths at random, with the same chance of taking each of the untrodden paths. It keeps going until it finds the cheese. The mice are independent of each other, and the
question is what's the chance that the mouse in the first maze takes less time to
find the cheese than the mouse in the second maze? Okay. Does this sound at all
like a problem we've already solved today? No? Cuz that problem was about coins, and that's completely different from random mice. Yeah. This is a lot like the chance that the number of heads in the first three tosses is less than or equal to the number of heads in the last two tosses. Yes? Okay.
What's the chance that, so, what are the possibilities? How, how does the first
mouse win? Well, the first mouse can take one, two, three, or all four paths before finding the cheese, right. So there's four possibilities: the first mouse can take one, two, three, or four. The second mouse can take one, two, three, four, or five, right. So what's the chance the first mouse takes fewer paths? Well, if the second mouse takes one path, the first mouse can't take fewer. Okay. If the second mouse takes two paths, then if the first mouse takes one, he wins. If the second mouse takes three, and the first one takes one or two, he wins, et cetera. Right?
So the outcomes that correspond to the first mouse winning are: one from here with two from there; one or two from here with three from there; one, two, or three from here with four from there; and one, two, three, or four from here with five from there. Yes? Those are the outcomes that correspond to the first mouse taking less time. The mice
are independent. The chance the first mouse takes one path to find the cheese is what? So there's four paths to pick from. What's the chance the right one is the first path the mouse picks? One quarter. One quarter. Okay. What's the chance that the second mouse takes
exactly two paths? So the chance of this is a quarter. What's the chance the second mouse takes exactly two? For that to happen, the first guess of this mouse has to be wrong, and then the second guess has to be right. Yes? What's the chance that the first guess is wrong? Four-fifths. And then, if the first guess is wrong, what's the chance the second guess is right? A quarter, right? There's still one right path left, there's four paths remaining, and the mouse is picking each one with equal probability. So the chance that this one takes exactly two paths is four-fifths times one-quarter. Okay? What about
this chance that it takes one or two? Well, the chance it takes exactly one is a quarter. The chance it takes exactly two is three-quarters times one-third, right? Okay. So a quarter, plus three-quarters times one-third, is the chance of this one taking one or two. What's the chance this mouse takes exactly three? Well, it's going to be wrong the first time, wrong the second time, and right the third time. Yep. Okay. All right, so this is basically the story, and then to find the chance that this happens and this happens, we take this chance times this chance. We partitioned everything. We add the results to get the overall probability.
Okay? Yes? So we're going to take the probability of this column times the probability of that column? No. In each row we're going to multiply this times this. To get a new column? To get a new column, because that's the chance that this happens and this happens. Okay, but this corresponds to one way that the first mouse can win. This is a disjoint way that the first mouse can win, and this is a disjoint way that the first mouse can win. So we're going to add those probabilities. Okay.
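The whole mouse calculation can be sketched the same way as the coin problem, with exact fractions. A minimal version, assuming, as in the problem, that total time is determined by the number of paths tried (every dead end costs the same out-and-back time); the names are mine:

```python
from fractions import Fraction

def path_count_probs(n):
    """P(the mouse needs exactly k tries) in an n-path maze, k = 1..n:
    the first k-1 guesses are wrong, then the k-th is right."""
    probs = {}
    for k in range(1, n + 1):
        p = Fraction(1)
        remaining = n
        for _ in range(k - 1):                    # wrong guesses
            p *= Fraction(remaining - 1, remaining)
            remaining -= 1
        probs[k] = p * Fraction(1, remaining)     # the correct guess
    return probs

mouse1 = path_count_probs(4)   # four-path maze
mouse2 = path_count_probs(5)   # five-path maze

# The mice are independent, and the first mouse is faster exactly
# when it tries fewer paths; multiply within each pair, then add
# over the disjoint winning combinations.
answer = sum(mouse1[a] * mouse2[b] for a in mouse1 for b in mouse2 if a < b)
print(answer)  # 1/2
```

Note the pattern the lecture computed piece by piece: the chance the second mouse takes exactly two paths comes out to four-fifths times one-quarter, which is one-fifth, and summing over all the winning rows gives one-half.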