Hypothesis Testing, Probability, and Distribution of Sample Means - Part b

>> Okay, so we've gone over a little background. Now, let's focus again on the actual study. Let's say we find a school building where lead dust from lead-based paint was discovered in the air, and let's say that the average IQ for the 15 children we learned about was 91. So our sample size is 15 and our sample mean is 91. Already, 91 is below average, but the question is, is that far enough below average for us to say, "Hey, this is a pretty significant finding"? That's what we will determine as we move along. Okay, we also need information about our population, so that we know about both the sample and the population. For IQ tests, the average IQ is 100 and the standard deviation is 16, and this is based on the way that IQ tests are designed. Note that IQ is a scale variable, right? It falls along a scale: the distance between 99 and 100 is the same as the distance between 100 and 101. And IQ is normally distributed; like most things out there that we can measure, it tends to be normally distributed. So when we look at our population distribution of IQ scores, our mean is 100 and our standard deviation is 16. So now it's time to make that comparison between our sample mean and the population mean, and what we're going to use, as discussed previously, is a Z-test. The essence of a Z-test is this: how many standard errors is our sample mean away from the population mean? The larger the Z-test result, the more standard errors the sample mean is away from the population mean. If we have a really huge Z-test result, say a Z-test result of five, that says, "Hey, your sample mean is five standard errors away from the population mean." The probability that this happened due to chance would be essentially zero.
Okay, let's briefly talk about hypothesis testing, because that's going to be essential anytime we want to collect evidence and evaluate whether or not we should reject our null hypothesis and support the research hypothesis. That's the whole basis for doing research. Okay, so for hypothesis testing, our research hypothesis is that the children who inhaled lead dust will have a lower IQ score than the general population. The null hypothesis, that's the competing claim, is that lead has no effect on IQ or could actually improve it. If the sample mean is far enough below the population mean, then we get to reject the null hypothesis. So take a look at our picture above of the distribution of sample means. Notice the area shaded in the bottom left part of the distribution of sample means. That is referred to as the reject zone, and if our sample mean is in that reject zone, we're going to be able to reject the null hypothesis, allowing us to support the research hypothesis that something took place. Okay, now typically, if there's no special treatment going on, a sample mean should be close to the population mean, which is why in general, if you want to estimate a population mean, it's a good idea to go out and collect a sample and use the sample mean to estimate the population mean. But keep in mind that theoretically, any possible sample mean could take place. Generally, the sample mean will be close to the population mean, but again, theoretically, it's possible you could get a really extreme sample mean just due to chance, and that is going to wreak a little bit of havoc with our hypothesis testing. It means that sometimes, just due to chance, you may get bogus evidence, that is, evidence that would make you say, "Hey, something is going on," when really it was just due to chance and nothing was taking place.
The probability that the evidence will make you say, "Oh yes, something took place," when it happened just due to chance, is set to 0.05. That is, in the behavioral sciences and several other sciences, we say we'll allow a 0.05 probability for evidence to make us think we should reject the null when we shouldn't. Okay, if the null hypothesis is true, that lead has no effect or actually helps IQ, there is still a five percent chance of our accidentally rejecting the null hypothesis, right? That's our alpha level. And again, take a look at that distribution of sample means. All of those x-bars that you see, all of them are possible, but most likely, the x-bars will not be in the shaded region. Most likely, the sample means won't be in the shaded region. If our alpha is 0.05, the probability of falling in the non-shaded region would be 0.95. So if the null hypothesis is true, most likely we're not going to reject it, and that's good. 0.05 of the time, we will. Okay, now on the positive side, if there's a real effect of lead, if it really does harm IQ, we'd expect our sample mean to fall in that shaded region. See the vertical line drawn through the distribution of sample means? That's our decision criterion. We say, "Hey, if the sample mean falls on that decision criterion or further into the shaded reject zone, we will reject the null." Okay? So we will reject the null hypothesis if the sample mean is at or beyond the decision criterion in the shaded region; otherwise, we'd retain the null hypothesis. We expect the sample mean to be far below the population mean if our research hypothesis is correct, right? So specifically, we expect the sample mean will be in the shaded reject region if the research hypothesis is correct.
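To make the 0.05 idea concrete, here is a minimal simulation sketch, not part of the original presentation. It assumes the null hypothesis is true, so every sample of 15 children is drawn from a normal population with mean 100 and standard deviation 16, and it counts how often a sample mean lands in the reject zone anyway. The function name `false_rejection_rate`, the number of trials, and the seed are illustrative choices.

```python
import random
from statistics import NormalDist

POP_MEAN, POP_SD, N, ALPHA = 100, 16, 15, 0.05

def false_rejection_rate(trials=20000, seed=0):
    """Share of null-true samples whose mean falls in the lower reject zone."""
    rng = random.Random(seed)
    standard_error = POP_SD / N ** 0.5
    # z-score that leaves ALPHA of the standard normal curve to its left
    cutoff = NormalDist().inv_cdf(ALPHA)
    rejections = 0
    for _ in range(trials):
        # Draw one sample of N scores from the population (no lead effect)
        sample = [rng.gauss(POP_MEAN, POP_SD) for _ in range(N)]
        sample_mean = sum(sample) / N
        z = (sample_mean - POP_MEAN) / standard_error
        if z <= cutoff:
            rejections += 1
    return rejections / trials

rate = false_rejection_rate()
print(f"share of null-true samples in the reject zone: {rate:.3f}")
```

The printed share should come out close to 0.05, which is the "bogus evidence" rate we agreed to tolerate when we set alpha.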
Okay, now as mentioned, the Z-test will let you know how many standard errors you are below the mean. If our sample mean is at least 1.645 standard errors below the mean, then we get to reject the null hypothesis and support the research hypothesis. That negative 1.645, that's where our decision criterion is, and everything to the left is shaded, and that's our reject zone. And you're wondering, where did you get this negative 1.645? You can get it from a Z-table: you would look up the probability of 0.05 and find the corresponding Z-score. Our Z-table will help you to approximate it, but you'd actually need a more sophisticated Z-table to get the actual negative 1.645. But I'll let you know, anytime you do a one-tailed test with an alpha of 0.05 and you expect the sample mean to be below the population mean, it's always going to be negative 1.645, so it's a value we just know, and when you read the section on hypothesis testing, you'll go over that in a little bit more detail. Okay, so the value negative 1.645, that is, 1.645 standard errors below the mean, identifies the start of the shaded region. The probability of getting a sample mean that far out or farther is 0.05 or less, so if the null hypothesis is true, there's a 0.05 chance that the sample mean will end up there even though no effect is going on. That would be bogus evidence, and that would be an "Oh, no" type thing: we would be incorrectly rejecting the null. On the other hand, if our treatment has an effect, that is, if lead is actually harming IQ, well, that's exactly where we'd expect the sample mean to be. We expect it to be below the population mean, and pretty far below the population mean. Okay, with that theoretical background on hypothesis testing covered, let's get down to the actual mechanics. So we need to calculate the Z-test, and the first step for calculating the Z-test is to figure out the standard error.
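If you don't have a detailed Z-table handy, the negative 1.645 cutoff can also be recovered in software. The following is a small sketch using `NormalDist` from Python's standard `statistics` module; the variable names `alpha` and `critical_z` are just illustrative.

```python
from statistics import NormalDist

alpha = 0.05  # probability allowed in the lower (shaded) tail

# inv_cdf returns the z-score with `alpha` of the standard normal curve to its left
critical_z = NormalDist().inv_cdf(alpha)

print(round(critical_z, 3))  # -1.645
```

This is the same lookup a Z-table does in reverse: start from the tail probability of 0.05 and read off the Z-score.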
To figure out the standard error, we take the standard deviation divided by the square root of the sample size. Our standard deviation for the individual scores is 16 and our sample size was 15. So we'll say 16 divided by the square root of 15, and that comes out to be about 4.13. What that tells us is that the variability for sample means is a lot less than the variability for individual scores, right? For individual scores, we have a standard deviation of 16, so lots of variability is possible. For a sample size of 15, those sample means are going to be closer to the population mean. Their variability, if you will, their standard error, is 4.13, which is much less than 16. Okay, once we know our standard error, that becomes kind of our unit of measurement, so then we go ahead and calculate the Z-test. Our Z-test is the sample mean minus the population mean, divided by the standard error. Our sample mean is 91, our population mean is 100, and our standard error is 4.13, and when we do the math, that comes out to be negative 2.18. That is, our sample mean is 2.18 standard errors below the mean. That's pretty far below the mean. Okay, next we're going to evaluate our Z-test result to find out what this means. So, we make a decision: we'll reject the null hypothesis if the Z-test result is 1.645 standard errors below the mean or further out. Notice that the sample mean for our particular research is in that shaded reject zone. Our sample mean was 2.18 standard errors below the mean, right? The negative sign means we're below the mean, and the 2.18 itself says we're 2.18 standard errors away, so we're beyond our decision criterion of negative 1.645. So our sample mean is not one standard error below the mean, not two standard errors below the mean, but 2.18 standard errors below the mean, so we'll reject our null. The probability of this happening due to chance is 0.05 or less.
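The arithmetic above can be checked with a few lines of Python. This is only a sketch of the same calculation using the numbers from the study; the variable names are illustrative.

```python
import math

sample_mean = 91   # average IQ of the 15 children
pop_mean = 100     # population mean for IQ
pop_sd = 16        # population standard deviation for IQ
n = 15             # sample size
critical_z = -1.645  # one-tailed decision criterion for alpha = 0.05

# Standard error: standard deviation divided by the square root of the sample size
standard_error = pop_sd / math.sqrt(n)

# Z-test: how many standard errors the sample mean is from the population mean
z = (sample_mean - pop_mean) / standard_error

# Reject the null if z falls at or beyond the decision criterion
reject_null = z <= critical_z

print(round(standard_error, 2))  # 4.13
print(round(z, 2))               # -2.18
print(reject_null)               # True
```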
Okay, so here, the Z-test result was negative 2.18, indicating the sample mean was far below the population mean and that the probability of this happening just due to chance is 0.05 or less. So we say, well, this could be due to chance, but most likely it's because lead does impair IQ. Our null hypothesis of no effect or actual improvement from lead dust was rejected, based upon the evidence collected: the sample mean was 91. As a result, the research hypothesis was supported, that lead dust actually does impair brain development. So, the conclusion: lead impairs brain development. Notice that to evaluate this evidence, we need to know about probability, the distribution of sample means, and hypothesis testing. You may want to go back through the workbook to review those topics and then listen to this narrated PowerPoint again to further solidify your understanding of these topics and how they're related.
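The presentation only needs to know that the probability is 0.05 or less, but the exact one-tailed probability for a Z-test result of negative 2.18 can also be computed. Here is a minimal sketch, assuming the standard normal curve and using `NormalDist` from Python's `statistics` module; the variable name `p_one_tailed` is illustrative.

```python
from statistics import NormalDist

z = -2.18     # Z-test result from the study
alpha = 0.05  # alpha level

# Area under the standard normal curve to the left of z:
# the chance of a sample mean this far below the mean (or farther) if the null is true
p_one_tailed = NormalDist().cdf(z)

print(f"{p_one_tailed:.4f}")   # about 0.0146
print(p_one_tailed <= alpha)   # True, so the null hypothesis is rejected
```

A probability of roughly 0.015 is comfortably below 0.05, which agrees with the decision reached by comparing negative 2.18 with the criterion of negative 1.645.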