Hello, friends.
We will continue our journey through comparative analysis
looking at effect size and power.
Though I have chosen to discuss effect size and power
under our ANOVA videos, I want to point out that, much
like kurtosis and skewness, these concepts reach well
beyond ANOVA. They apply to the t-test, ANOVA, MANOVA,
or any of a group of other procedures.
Effect size is the distance of the actual value that we
obtain for the mean from the data set from the anticipated
value for the mean.
Consider the following.
We might have a group of cars, and we expect that group
of cars to have a mean of 30.5 miles per gallon.
We take a sample, and we find that the mean for
the sample is 28.3.
So we have a distance between what was anticipated and what
the actual value really is.
We call this the effect size.
And the effect size in this case would be
2.2 miles per gallon.
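That arithmetic can be sketched in a couple of lines of Python; the numbers are the ones from the example above.

```python
# A minimal sketch of the miles-per-gallon example.
expected_mpg = 30.5   # anticipated population mean
sample_mean = 28.3    # mean observed in our sample

# The raw effect size here is simply the distance between the two means.
effect = abs(expected_mpg - sample_mean)
print(f"Effect size: {effect:.1f} mpg")  # 2.2 mpg
```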
Now the effect size is the distance between the means of
the variables.
You will notice that in ANOVA, we have three groups, and we
have a distribution of the value that we're looking at
for each of those groups, so we would have a difference
between the mean of each of these.
The effect size is the distance of the actual value
from the anticipated value.
Again, you want to consider the following.
An effect size may be strong, moderate, or weak.
So those are the three levels of effect size
that we can have: strong, moderate, or weak.
Now, we will use a coefficient known as the partial eta
squared to discuss effect size.
And of course, we in math and statistics love these little
Greek symbols.
Eta squared will stand for the coefficient that gives us the
effect size.
Now, a strong effect size, which is an eta squared greater
than 0.14, means that if we go into the samples, and we're
comparing two or three groups, and we select a value
at random, it is very likely that we can look at that
data point based on its value and determine which group
it belongs to, because the distances between the groups
are so profound.
A moderate effect size is between 0.06 and 0.14.
And that means if we randomly select a data point, it might
be identified as to the group which it belongs to
based on its value.
And weak is between 0.01 and 0.06, and that means that
it's not likely that we would be able to take that data
point and look at its value and determine the group that
it belongs to.
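Those cutoffs can be captured in a small helper function. This is a sketch in Python rather than SPSS; the function name, and the "negligible" label for values below 0.01, are my own additions.

```python
def effect_size_label(partial_eta_sq):
    """Classify a partial eta squared using the cutoffs from the text:
    weak 0.01-0.06, moderate 0.06-0.14, strong above 0.14."""
    if partial_eta_sq > 0.14:
        return "strong"
    if partial_eta_sq >= 0.06:
        return "moderate"
    if partial_eta_sq >= 0.01:
        return "weak"
    return "negligible"   # below 0.01: effectively no effect

print(effect_size_label(0.167))  # strong
```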
Now, I've done some very clever things for you here.
In discussing effect size, I have drawn you a parallel:
the effect size tells us, if we randomly select a data
point, how likely it is that we would be able to predict
the group that it belongs to.
That's what effect size really is all about.
Now, I want to give you an example of where you might
have a strong significance but a very weak effect size.
In the 1970s, the analysis of GRE scores indicated that men
scored higher than women.
Well, you know the guys jumped all over that and said, that
means we're smarter than women.
Well, I'm not going to make that statement.
I don't believe that at all.
Sometimes my wife scares me to death.
But what we actually had in the GRE test at that time, and
what you still have today, is a bias toward the engineering
fields, the math fields.
And in the 1970s, of course, those were dominated by men.
But they found a significant difference between the scores.
Now, the difference in the averages
was very, very minute.
However, the number in the sample was enormous.
So this made that small difference significant.
You remember from your introductory statistics that the
difference gets divided by s over the square root of n?
So the more that you had in the sample, the smaller that
denominator becomes, and you divide it in.
Man, it makes a big z score, and lo and behold, you've got
great significance with that.
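The point can be sketched numerically: hold a tiny mean difference and the standard deviation fixed, and the z score grows with the square root of the sample size. The numbers here are illustrative, not the actual GRE figures.

```python
import math

# Hold the mean difference and standard deviation fixed and
# watch the z statistic grow as the sample size grows.
diff = 0.5    # tiny difference between the group means
s = 10.0      # sample standard deviation

for n in (25, 2500, 250000):
    z = diff / (s / math.sqrt(n))   # z = difference / (s / sqrt(n))
    print(f"n={n:>6}  z={z:.2f}")
# The same 0.5-point difference goes from nowhere near
# significant to overwhelmingly "significant" as n grows.
```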
Well, the fact is that randomly selecting a
participant yielded no real likelihood of predicting the
group based on the scores.
I mean, the values differed only minutely.
But because the number was so high, it made that little bit
of difference significant.
Well in fact, that was a very, very weak effect size, which
meant that it really didn't have any meaning.
You can have significance and lack any meaning whatsoever in
the results that you have.
Now, power is about the probability that the test will
reject the null hypothesis when the null
hypothesis is false.
Power analysis can be utilized to calculate the minimal
sample size required so that one can be reasonably likely
to detect an effect of a given size.
Power can also be used to compare different statistical
testing procedures.
We might compare a parametric design to a non-parametric
design, and find that one has more power than the other, the
power being the probability that we will reject the null
hypothesis when the null hypothesis is false.
I want to share this table with you for just a minute.
It might make the options for evaluating the null
hypothesis a little bit more clear.
Now, the possibilities are that the null hypothesis is
true or it's false.
And our actions could be that we do not reject it or
that we reject it.
So it's either true or false.
We're either not going to reject it or reject it.
If we have a true null hypothesis and we do not
reject, we made a correct decision.
And that's where alpha comes in.
Alpha is equal to the significance, the likelihood,
the little error that we're willing to live with.
If we reject the null hypothesis when it's true,
this is called a type I error.
And if we fail to reject the null hypothesis when it's
false, that's a type II error.
Now, the probability of a type I error is alpha, so the
probability of that correct non-rejection is 1 minus alpha.
And the probability of a type II error is beta, and
1 minus beta is the power.
If the null hypothesis is false and we reject it, that's
a correct decision.
That's where power comes in.
Power is 1 minus beta: the probability that we will
reject the null hypothesis when it's false, where beta is
the probability of a type II error.
Alpha is the probability that we will reject the
null hypothesis when it is true.
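Here is a small simulation of that table, assuming a two-sided z-test on normal data with known sigma; the sample size, trial count, and means are illustrative choices of mine, not anything from the video's data.

```python
import math
import random
from statistics import NormalDist, mean

# How often a two-sided z-test rejects when the null is true
# (should be about alpha) versus when it is false (that rate
# is the power).
random.seed(42)
alpha = 0.05
crit = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96
n, trials = 50, 4000

def reject_rate(true_mean):
    hits = 0
    for _ in range(trials):
        xs = [random.gauss(true_mean, 1.0) for _ in range(n)]
        z = mean(xs) / (1.0 / math.sqrt(n))  # test H0: mu = 0, sigma = 1 known
        if abs(z) > crit:
            hits += 1
    return hits / trials

type1 = reject_rate(0.0)   # null true: rejections are type I errors
power = reject_rate(0.5)   # null false: rejections are correct
print("Type I error rate:", type1)   # near alpha = 0.05
print("Power at mu = 0.5:", power)   # high for this n
```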
Well, my friends, I will show you how to do
effect size and power.
You'll recognize the data set again, the percent
women and the group.
We'll go up to Analyze.
Now, if you were doing ANOVA, you'd go to Compare Means and
One-Way ANOVA.
We're not doing ANOVA.
Let's go to General Linear Model.
Let's go to Univariate, because we have one
independent variable, group.
We will move percent women into the dependent
variable box, and group into the fixed factor box.
We will go to the Options.
In the Options, we will select Estimates of Effect Size
and Observed Power.
And you see we could put a lot of other things in there.
And we want to display those for overall.
Here we go.
And let's say git 'r done.
And here comes our analysis, just that quickly.
Well, let's take this SPSS readout now, and let's go
through and see if we can find what we look for.
Now, keep in mind that we did some statistics that we really
didn't have to do.
We have descriptives and a Levene's test for homogeneity
of variance.
We didn't need that.
What we want to look at, though, is this thing, Tests
of Between-Subjects Effects, and this little area, partial
eta squared--
0.167 for the corrected model.
That is a strong, strong partial eta squared, and a
strong effect size.
That means that these variables in the groups differ
so much that if we randomly pick one, we're very likely to
be able to tell which group that it goes to--
we'll look at the post hoc test shortly--
one set or the other.
Then also, when we come in here, we want to look at the
power. Let's look at the observed power:
0.995.
This is a very powerful test.
We've done well to get this far.
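For reference, partial eta squared comes from the sums of squares in that same table: SS_effect divided by (SS_effect + SS_error). A minimal sketch, using made-up sums of squares (not the values from the video's data set) that happen to land on the same 0.167:

```python
# Partial eta squared from an ANOVA table's sums of squares.
def partial_eta_squared(ss_effect, ss_error):
    return ss_effect / (ss_effect + ss_error)

# Illustrative numbers only: 20 / (20 + 100)
eta_sq = partial_eta_squared(ss_effect=20.0, ss_error=100.0)
print(round(eta_sq, 3))   # 0.167, a strong effect by the > 0.14 rule
```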
Well, my friends, I would never want to close one of
these videos without thanking you for your patronage.
Your patronage keeps my family fed.
Enjoy watching these videos.
You take care of me and I'll take care of you.
May the odds be ever in your favor.