Confidence Intervals & Single Sample T Test Using SPSS

>> Do college students get less sleep? To answer this question we're going to use SPSS to calculate a sample mean, do a single sample T test, and determine a confidence interval. We're going to try to answer the following questions. First, approximately how much sleep do college students get? Second, do college students sleep less than the general population? And third, what's a ballpark range for how much sleep college students get that's 95 percent likely to contain the true population mean? Let's focus on our first question. To answer that we'll use a sample mean as a point estimate for the population mean. Based on our sample of 40 randomly selected college students, we'll go ahead and estimate how much sleep the typical college student gets. We'll keep in mind that if we were to sample repeatedly, each of our sample means would be a little bit different. So we know going into this that there's going to be some error involved. OK, with that in mind we'll go ahead with this scenario and use SPSS. We'll click on Analyze, then from the drop-down menu select Compare Means, and then One-Sample T Test. When the dialog box comes up, we take our sleep variable and move it over to the Test Variable(s) box. For the Test Value we put in what we're going to compare college students to. In this case we're going to compare them to the United States adult population, and recent research shows that they're getting 7 hours of sleep each night on average. Here's our SPSS output. In looking at it we'll focus on the sample mean of 6.36. So we know now, in terms of our estimate, that on average college students get about 6.36 hours of sleep each night. There may be some error, in fact there certainly is some error involved, but that's our best estimate based upon our sample.
Second, do college students sleep less than the general population? Well, we're going to focus specifically on whether we can reject our null hypothesis.
The null hypothesis is that college students are just like everybody else, needing 7 hours or perhaps even more of sleep. We're looking at a distribution here for the amount of sleep required by the adult population in the U.S., and it's centered on 7 hours. Looking at the left tail, because we're going to do this as a one-tail hypothesis test, we can see our .05 shaded region. This will be our reject zone, so that if our sample mean falls in this reject zone, the null hypothesis will be rejected. Coming back to our distribution of our adult population, let's assume for a moment that the null hypothesis is true. If that's the case then .95 of our sample means should fall in the non-reject zone, or the retain-the-null-hypothesis area. So we have a decision. Will we reject the null hypothesis, or retain the null hypothesis? If the null hypothesis is true we know that due to error we'll reject the null .05 of the time. The other .95 of the time we'll make the correct decision. So again, if the null hypothesis is true, retaining the null hypothesis will be a correct decision. Rejecting the null hypothesis would be a Type 1 error. On the other hand, it could be the case that college students really do need less sleep than the typical U.S. adult. In that case, if we were to reject the null, that would be a correct decision, and if we retain the null that would be a Type 2 error. OK, that was a brief review of the decision matrix. Let's come back to our SPSS output, remembering that we get to reject the null hypothesis if our P value is .05 or less. Here's the big picture. You have the null hypothesis and you have some data. If they conflict, then they can't both be true, and we can't throw out the data, assuming we think it's good data, so we reject the hypothesis instead when that P value is .05 or less. Looking at our SPSS output, we have our T value of -2.452, our degrees of freedom of 32, and our significance of .02.
Let's focus first on the T value. How did SPSS calculate that T value? Remember, our formula for the single sample T test is the sample mean minus the population mean, divided by the standard error. We can get this information from our SPSS output. Our sample mean is 6.36; our population mean, assuming the null hypothesis is true, that college students are just like everyone else, would be 7; and our standard error is .26. When we calculate that, we get a T value of -2.46, pretty close to what SPSS gave us. Any difference is likely to be due to rounding error on our part. What does that T value of -2.45 actually mean? It means that our sample mean is approximately 2.45 standard errors below the hypothesized population mean. That is, we would have to go 1, 2, and approximately a half standard errors below the population mean before we get to our sample mean. That's pretty far away. If the null hypothesis is true, this would be a very unlikely sample mean. So again, that sample mean is 2.45 standard errors below the population mean. OK, the next thing in our SPSS output is the degrees of freedom, which is 32, and that's simply calculated as our sample size minus 1. And then finally we have our significance for a two-tailed test. Now remember, we were interested in a one-tail test. We just wanted to show that college students sleep less than the general population. But SPSS doesn't ask us if we're doing a one-tail or two-tail test, it always gives us the two-tail. Given that it gave us a two-tail value of .02, think about it as two tails: we take that .02 and we know that has to mean .01 is below the mean, and .01 is above the mean. So if we added them together we'd get our two-tail value of .02. Since we're doing a one-tail test, our P value is just half the amount of our two-tail value. So for a one-tail test our P value is .01. So keep this in mind. Any time you're doing a one-tail test, and you get the SPSS output for a two-tail test, you can usually make the correction.
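As a rough check on the arithmetic above, here is a minimal Python sketch. It is not what SPSS runs internally: it recomputes the T value from the rounded output figures and approximates the two-tail P value by numerically integrating the t density. The function names `t_pdf` and `upper_tail_p` and the integration settings are illustrative assumptions.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def upper_tail_p(t, df, upper=60.0, steps=20000):
    """Approximate P(T > t) with Simpson's rule on [t, upper]; steps must be even."""
    h = (upper - t) / steps
    total = t_pdf(t, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(t + i * h, df)
    return total * h / 3

# Rounded values read off the SPSS output in the tutorial
sample_mean = 6.36
test_value = 7.0      # hypothesized population mean
std_error = 0.26
df = 32
t_from_spss = -2.452

# T value by hand: (sample mean - population mean) / standard error
t_by_hand = (sample_mean - test_value) / std_error

# Two-tail P value: area beyond |t| in both tails; one-tail is half of that
p_two_tailed = 2 * upper_tail_p(abs(t_from_spss), df)
p_one_tailed = p_two_tailed / 2

print(f"t by hand:    {t_by_hand:.2f}")
print(f"two-tail p:   {p_two_tailed:.3f}")
print(f"one-tail p:   {p_one_tailed:.3f}")
```

Halving the two-tail value is appropriate only when the sample mean falls on the side the one-tail hypothesis predicted, which is the case here because 6.36 is below 7.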
Just take the value, divide it by 2, and that will give you the P value for a one-tail test. OK, remember we get to reject the null hypothesis if P is less than or equal to .05. In this case we'll be able to reject the null hypothesis, since our P value of .01 is less than .05. And we'll conclude that college students sleep less than the general population.
Onto our third question. What's a ballpark range for how much sleep college students get that's 95 percent likely to contain the true population mean? Specifically, we're going to determine a 95 percent confidence interval. Here you can see some examples of what a confidence interval might look like. We have our sample mean in the middle, and then we have a range around it. We're 95 percent confident it contains the true population mean. Keep in mind though that not all confidence intervals will contain the true population mean. If you're doing a 95 percent confidence interval, that means about 5 percent of your confidence intervals will not contain the population mean. Here's an example of 100 confidence intervals generated by a program. For each one it took a sample mean, and that determines whether the confidence interval sits more to the left or to the right. Also notice that they have different widths, depending upon the sample standard deviation. So here are some confidence intervals, and then I generated another 100, and another 100. The ones in red are where the confidence interval does not contain the true population mean at the 95 percent level. So you can see sometimes you might get a few more than 5 out of every 100. So to review, a confidence interval is a range that's likely to contain the population mean. Now let's look at how you determine a 95 percent confidence interval for how much sleep college students get each night. Let's return to our SPSS output, and we'll focus on the 95 percent confidence interval of the difference.
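The batches of 100 confidence intervals described above can be imitated with a short simulation. This is only a sketch under stated assumptions, not the program used in the video: the population (normal, mean 7, standard deviation 1.5), the sample size of 33 (chosen to match 32 degrees of freedom), and the name `count_misses` are all hypothetical. The critical value 2.0369 is the t-table entry for 32 degrees of freedom at the two-tail .05 level.

```python
import random
import statistics

T_CRIT_DF32 = 2.0369  # two-tail .05 critical value for 32 degrees of freedom

def count_misses(n_intervals, rng, n=33, pop_mean=7.0, pop_sd=1.5):
    """Count 95% confidence intervals that fail to contain pop_mean."""
    misses = 0
    for _ in range(n_intervals):
        # Draw one sample from the assumed population
        sample = [rng.gauss(pop_mean, pop_sd) for _ in range(n)]
        mean = statistics.mean(sample)
        std_error = statistics.stdev(sample) / n ** 0.5
        # Interval: sample mean plus or minus T critical times standard error
        lower = mean - T_CRIT_DF32 * std_error
        upper = mean + T_CRIT_DF32 * std_error
        if not (lower <= pop_mean <= upper):
            misses += 1
    return misses

rng = random.Random(0)  # fixed seed so the run is repeatable
for batch in range(1, 4):
    print(f"batch {batch}: {count_misses(100, rng)} of 100 intervals missed")
```

Individual batches will not all show exactly 5 misses; the 5 percent figure describes the long-run rate across many intervals.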
To create our confidence interval we'll begin with our test value of 7, that is, our comparison point of the adult population needing 7 hours. But what about for college students? Well, we take that 7 and we add to it the lower bound value of -1.1655, which comes out to be 5.83. So that's the lower bound of our confidence interval. Then for the upper bound we again take that 7, and we add to it the upper bound of -.1079 that SPSS gave us, and that comes out to be 6.89. So we have a confidence interval for how much sleep college students require that's 95 percent likely to contain the true population mean. It goes from 5.83 to 6.89. Now we could also have done this by hand, and I just want to take a moment to show you that, so you understand where SPSS is getting these values. Our formula is the sample mean plus or minus T critical times the standard error. T critical is perhaps the only new thing to you, so let's go through this step by step. Our sample mean we get from the SPSS output, it's 6.36. Our standard error we also get from the SPSS output, it's .26. And for the T critical we can go to a T table. If our confidence interval is 95 percent, that leaves 5 percent left over. So for a confidence interval of 95 percent, in the T table we look up the two-tail .05 column. Our degrees of freedom, as you may recall from the SPSS output, was 32. So in our T table we look up the row for 32 degrees of freedom, and where our column and row intersect we find 2.0369. That is our T critical value. So let's plug that into our confidence interval formula. The 95 percent confidence interval is equal to our sample mean, 6.36, plus or minus the T critical, 2.0369, times the standard error, .26. We'll simplify, so we get that the 95 percent confidence interval is equal to 6.36 plus or minus .53, and that gives us our confidence interval range of 5.83 to 6.89, which matches what SPSS gave us.
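Both routes to the interval described above can be checked with a few lines of Python. This is a minimal sketch that uses only the rounded figures quoted from the SPSS output and the t table; the variable names are illustrative.

```python
# Values quoted from the SPSS output and the t table
sample_mean = 6.36
std_error = 0.26
t_critical = 2.0369   # 32 degrees of freedom, two-tail .05 column
test_value = 7.0

# By hand: sample mean plus or minus T critical times the standard error
margin = t_critical * std_error
by_hand = (sample_mean - margin, sample_mean + margin)

# SPSS route: add the bounds of the "confidence interval of the difference"
# to the test value
diff_lower, diff_upper = -1.1655, -0.1079
from_spss = (test_value + diff_lower, test_value + diff_upper)

print(f"margin of error: {margin:.2f}")
print(f"by hand:   {by_hand[0]:.2f} to {by_hand[1]:.2f}")
print(f"from SPSS: {from_spss[0]:.2f} to {from_spss[1]:.2f}")
```

The two results agree to two decimal places; any difference further out comes from using the rounded mean and standard error in the hand calculation.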
Alright now, let's say you wanted a different confidence interval. With the T table, if you wanted a confidence interval of 99 percent you would use an alpha of .01. If you wanted a confidence interval of 90 percent you would use an alpha of .1. Everything else stays the same. With SPSS, you'll first select the single sample T test again, then when the dialog box comes up for the One-Sample T Test you click on Options, and for the confidence interval percentage you put in the percentage you're interested in, for example 99 percent.
OK, so to recap. For a 95 percent confidence interval there's a 95 percent probability that the confidence interval contains the population mean, keeping in mind that approximately 5 out of every 100 of them will not. Let's say we want a shorter confidence interval. Well, the confidence interval will become smaller if we're willing to be wrong more often. For example, up above we see the confidence interval for 99 percent. If we instead want a confidence interval for 95 percent, the confidence interval is now shorter, and it's less likely to contain the true population mean. If we had a confidence interval of 50 percent, our confidence interval would become even smaller, and it would be even less likely to contain the true population mean. OK, what would be another way to get a smaller confidence interval, one that would imply greater accuracy? Well, if we increase our sample size, that will also result in a decrease in the width of our confidence interval. So if we increase our sample size, for example from 10 to 20, that would make a smaller confidence interval. And over here are another 100 confidence intervals that I ran, this time increasing that sample size. In this case these are still 95 percent confidence intervals, but we're now working with a smaller range, which can be helpful. So a larger sample size gives us more information, allowing us to have a smaller confidence interval.
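The two ways of shrinking a confidence interval mentioned above, lowering the confidence level and raising the sample size, can be illustrated with a small simulation. This is a sketch under assumptions: the population (normal, mean 7, standard deviation 1.5) and the name `mean_ci_width` are hypothetical, and the critical values are standard two-tail t-table entries for 9 and 19 degrees of freedom.

```python
import random
import statistics

# Two-tail t critical values from a standard t table, keyed by (df, confidence)
T_CRIT = {
    (9, 0.90): 1.833,
    (9, 0.95): 2.262,
    (9, 0.99): 3.250,
    (19, 0.95): 2.093,
}

def mean_ci_width(n, confidence, rng, n_intervals=100, pop_mean=7.0, pop_sd=1.5):
    """Average width of n_intervals confidence intervals from samples of size n."""
    t_crit = T_CRIT[(n - 1, confidence)]
    widths = []
    for _ in range(n_intervals):
        sample = [rng.gauss(pop_mean, pop_sd) for _ in range(n)]
        std_error = statistics.stdev(sample) / n ** 0.5
        # Width is twice the margin of error
        widths.append(2 * t_crit * std_error)
    return statistics.mean(widths)

rng = random.Random(0)  # fixed seed so the run is repeatable

# Same sample size, decreasing confidence level
for confidence in (0.99, 0.95, 0.90):
    width = mean_ci_width(10, confidence, rng)
    print(f"n=10, {confidence:.0%} interval: average width {width:.2f} hours")

# Same confidence level, larger sample size
for n in (10, 20):
    width = mean_ci_width(n, 0.95, rng)
    print(f"n={n}, 95% interval: average width {width:.2f} hours")
```

The sample size helps twice: the standard error shrinks with the square root of n, and the T critical value itself gets smaller as the degrees of freedom increase.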
OK, we've looked at how SPSS can help answer our questions about whether college students get less sleep. Based on our sample, approximately how much sleep do college students get? We said college students get approximately 6.36 hours of sleep. Do college students sleep less than the general population? We concluded they sleep less than the general population, with P less than or equal to .05. And finally, what's a ballpark range for how much sleep college students get that's 95 percent likely to contain the true population mean? We concluded that the 95 percent confidence interval for how much sleep college students get ranges from 5.83 to 6.89 hours. I hope you've found this tutorial helpful, and my appreciation to those who provide materials on the web that helped me in creating this presentation. Thank you.