This presentation is the first of two presentations on understanding, calculating, and interpreting
an independent samples t-test. The independent samples t-test is a versatile test that works when we have two subsets of
a larger sample of data, and we want to test whether or not there are significant mean
differences between the subsets. There are many instances where this test is appropriate:
for instance, regions of the United States such as the Confederacy and the Northeast,
Hispanic American and Asian American demographics in a survey, members of Congress in the Tea
Party Caucus and the Congressional Progressive Caucus. In each instance, these sub-samples
are part of a larger sample: states in the U.S., respondents to a survey in the U.S.,
and members of Congress, respectively.
Here are fundamental criteria for the independent samples t-test. First, the samples that we
are comparing can have different summary statistics. This means that the samples may have
different mean, median, mode, variance, and standard deviation values. Next, the sample
sizes can be different; they do not have to match. Finally, cases cannot appear in both samples. For instance, for
this test to work, a U.S. state cannot both be in the Confederacy and in the Northeast,
a respondent cannot be both Hispanic American and Asian American, and the member of Congress
cannot be in both the Tea Party Caucus and the Congressional Progressive Caucus.
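The rule that no case may appear in both samples can be checked mechanically before running the test. Here is a minimal sketch, assuming each sample is held as a set of case labels. The function name is hypothetical, and the county groupings are only a small illustrative subset, not the full regions.

```python
def samples_are_independent(first, second):
    """Return True when no case appears in both samples."""
    return set(first).isdisjoint(second)

# Illustrative subset of Idaho counties; not the complete regional lists.
north = {"Boundary", "Idaho", "Nez Perce", "Latah"}
southwest = {"Ada"}

print(samples_are_independent(north, southwest))            # True
print(samples_are_independent(north, southwest | {"Latah"}))  # False
```

If the check returns False, at least one case sits in both groups and the independent samples t-test is not appropriate for that split.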
And here is an example of the criteria being met. You're looking at a map of the state
of Idaho, with all 44 counties represented. If you grew up attending elementary and secondary
school in Idaho, you probably know the Idaho counties song. (If you didn't and you're curious,
there are YouTube videos of the tune that you can watch.) You'll notice that the map
of Idaho is divided into six planning districts. In 1972, the state created these planning
districts so that the counties could better work together on planning matters. The planning
districts correspond to different regions of the state.
It's easy to see that planning districts I and II comprise the northern region of the
state. Idaho County is where Grangeville is located and it is at the bottom of District
II, whereas Boundary County is at the top of District I and borders Canada. Planning
districts III and IV comprise the southwest region of Idaho, whereas districts V and VI
are in the southeastern part of the state.
One general hypothesis about Idaho politics and society is that it is heavily influenced
by regions. For instance, the territorial capital of Idaho was first placed in Lewiston,
in the north, in Nez Perce County. By 1867, the capital was moved to Boise in Ada County,
and the land grant state university was placed in Moscow, in Latah County, in exchange. The
regional rivalry sure has been bitter at times! Additionally, each region now has a major
university: the University of Idaho in Moscow, Boise State University in Boise, and Idaho
State University in Pocatello. Some folks use the phrase "Great State of Ada" to deride
residents of Boise in particular, accusing them of being out of touch with the rest of
the state. And so it goes....
But are there meaningful differences between the regions of Idaho? One way to test this
is to use an independent samples t-test. Here's the formula, which I'll explain a couple of
times, so don't sweat it if you don't get it at first.
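The slide showing the formula is not part of this transcript. Going by the step-by-step description in this presentation, it can be written as follows. The symbols n_1, n_2 (numbers of cases) and s_1, s_2 (standard deviations) are assumed names; only X bar sub 1 and X bar sub 2 are named aloud.

```latex
t = \frac{\bar{X}_1 - \bar{X}_2}
         {\sqrt{\dfrac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2 - 2}}
          \;\sqrt{\dfrac{n_1 + n_2}{n_1 n_2}}}
```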
Let's focus on the numerator for starters. The numerator is easy: X bar sub 1 minus X bar
sub 2 means that you subtract the mean value of the second independent sample from the mean
value of the first independent sample.
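As a quick illustration of the numerator, here is a small sketch using Python's standard library. The two lists are made-up numbers chosen for easy arithmetic; they are not data from the presentation.

```python
from statistics import mean

# Made-up values for illustration only.
sample_1 = [4.0, 6.0, 8.0]        # mean is 6.0
sample_2 = [1.0, 2.0, 3.0, 6.0]   # mean is 3.0

# Numerator: mean of the first sample minus mean of the second sample.
numerator = mean(sample_1) - mean(sample_2)
print(numerator)  # 3.0
```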
The denominator is more complex. On the left-hand side of the denominator, the first operation
is to take the number of cases in the first independent sample and multiply that number
by the standard deviation of the first independent sample squared. Then the second operation
is to take the number of cases in the second independent sample and multiply that number
by the standard deviation of the second independent sample squared. Add the results of the first
operation to the results of the second operation. Then divide by the first sample size added
to the second sample size minus two. Then take the square root of that division result.
This is your overall first denominator result.
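The left-hand side of the denominator might be sketched like this. One assumption is worth flagging: the presentation does not say which standard deviation it means, so this sketch uses the version that divides by the number of cases (Python's `statistics.pstdev`). Under that reading, multiplying the squared standard deviation by the number of cases gives back the sum of squared deviations. The function name and the sample values are hypothetical.

```python
from math import sqrt
from statistics import pstdev

def first_denominator(sample_1, sample_2):
    """Left-hand term of the denominator, following the spoken steps."""
    n1, n2 = len(sample_1), len(sample_2)
    # Assumption: standard deviation with divisor n (pstdev).
    s1, s2 = pstdev(sample_1), pstdev(sample_2)
    first_operation = n1 * s1 ** 2
    second_operation = n2 * s2 ** 2
    return sqrt((first_operation + second_operation) / (n1 + n2 - 2))

# Made-up values for illustration only.
print(first_denominator([4.0, 6.0, 8.0], [1.0, 2.0, 3.0, 6.0]))
```

For these made-up values the two operations give 8 and 14, so the result is the square root of 22 divided by 5.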
On the right-hand side of the denominator, add the number of cases in the first independent
sample to the number of cases in the second independent sample, then divide by the number
of cases in the first independent sample multiplied by the number of cases in the second independent
sample. Then take the square root of that division result. This is your overall second
denominator result.
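The right-hand side only needs the two sample sizes. A minimal sketch, with a hypothetical function name and made-up sizes:

```python
from math import sqrt

def second_denominator(n1, n2):
    """Right-hand term: square root of (n1 + n2) divided by (n1 * n2)."""
    return sqrt((n1 + n2) / (n1 * n2))

# Made-up sample sizes for illustration only: 3 cases and 4 cases.
print(second_denominator(3, 4))
```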
Your next step is to multiply your overall first denominator result by your overall
second denominator result. The result of this multiplication is your overall third denominator
result.
Your final step is to divide your numerator result by your overall third denominator
result. The result of that division is your t value.
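Putting the steps together, the whole calculation might look like the sketch below. It is one reading of the spoken description, not a definitive implementation: the function name is hypothetical, the data are made up, and the standard deviation is assumed to be the divisor-n version (`statistics.pstdev`), since the presentation does not specify.

```python
from math import sqrt
from statistics import mean, pstdev

def independent_t(sample_1, sample_2):
    """t value computed in the order the steps are described."""
    n1, n2 = len(sample_1), len(sample_2)
    numerator = mean(sample_1) - mean(sample_2)
    # Assumption: standard deviation with divisor n (pstdev).
    first = sqrt((n1 * pstdev(sample_1) ** 2 + n2 * pstdev(sample_2) ** 2)
                 / (n1 + n2 - 2))
    second = sqrt((n1 + n2) / (n1 * n2))
    third = first * second  # overall third denominator result
    return numerator / third

# Made-up values for illustration only.
sample_1 = [4.0, 6.0, 8.0]
sample_2 = [1.0, 2.0, 3.0, 6.0]
print(round(independent_t(sample_1, sample_2), 3))
```

Swapping the two samples flips the sign of the result but not its size, because only the numerator depends on the order.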
In our next presentation, we will work through a data example of the independent samples
t-test and interpret the results.