>> Okay, so we've gone
over a little bit of background.
Now, let's focus again
on the actual study.
So let's say we find a school
building where lead dust was
discovered in the air
from those lead-based paints
and let's say
that the average IQ
for these 15 children
that we learned about was 91.
Okay, so our sample size is 15
and our sample mean is 91--
oh, already,
91 is below average,
but the question is,
is that far enough below
average for us to say,
"Hey, this is a pretty
significant finding"?
That's what we will determine
as we move along.
Okay, we also need information
about our population,
so that we know
about the sample
and the population.
For IQ tests,
the average IQ is 100
and the standard deviation is
16, and this is based
on the way
that IQ tests are designed.
Note that IQ is a scale
variable, right?
It falls along a scale:
the distance between 99
and 100 is the same
as between 100 and 101,
so it's a scale,
and IQ is normally distributed.
Like most things out there
that we can measure,
it tends to be
normally distributed.
So when we look
at our population
distribution of IQ scores,
our mean is 100,
our standard deviation is 16.
So now, it's time to make
that comparison
between our sample mean
and the population mean,
and what we're going to use,
as discussed previously, is a
Z-test. The essence
of a Z-test is: what is the
number of standard errors
that our sample mean is away
from the population mean?
The larger the Z-test result,
the more standard errors
the sample mean is away
from the population mean.
If we have a really huge
Z-test result,
say a Z-test result
of five, that says, "Hey,
your sample mean is five
standard errors away
from the population mean."
The probability
that happened due
to chance would be
essentially zero.
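
To put a rough number on "essentially zero," here is a minimal sketch. It assumes the Z-test result follows a standard normal distribution, and it uses Python's standard library; the variable names are just for illustration.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

# Hypothetical Z-test result from the example: five standard errors out.
z = 5

# Probability of a sample mean landing five or more standard errors
# BELOW the population mean purely by chance.
p_beyond = standard_normal.cdf(-z)

print(f"P(Z <= -{z}) = {p_beyond:.2e}")
```

The printed probability is far smaller than one in a million, which is why a result that extreme is so hard to blame on chance.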
Okay, let's briefly talk
about hypothesis testing
because that's going
to be essential anytime we
want to collect evidence
and evaluate whether
or not we should reject our
null hypothesis
and support the research
hypothesis,
that's the whole basis
for doing research.
Okay, so for hypothesis
testing, our research
hypothesis is
that the children
who inhaled lead dust will
have a lower IQ score
than the general population.
The null hypothesis,
that's the opposite claim,
is that lead has no effect
on IQ or could actually
improve it.
If the sample mean is far
enough below the population
mean, then we get
to reject the null hypothesis.
So take a look at our picture
above of the distribution
of sample means. Notice the
shaded area
in the bottom left part
of the distribution of sample
means. That is referred
to as the reject zone,
and if our sample mean is
in that reject zone,
we're going to be able
to reject the null hypothesis,
allowing us
to support the research
hypothesis
that something took place.
Okay, now typically,
if there's no special
treatment going on,
a sample mean should be close
to the population mean,
which is why in general,
if you want
to estimate a population mean,
it's a good idea to go out
and collect a sample
and use the sample mean
to estimate the population
mean, but keep in mind
that theoretically,
any possible sample mean could
take place.
Generally,
the sample mean will be close
to the population mean,
but again, theoretically,
it's possible,
you could get a really extreme
sample mean just due
to chance, and that is going
to wreak a little bit of havoc
with our hypothesis testing.
It means that sometimes, just
due to chance,
you may get bogus evidence,
that is, evidence
that would make you say, "Hey,
something is going on,"
when really,
it was just due to chance,
and nothing really was
taking place.
The probability
that the evidence will make
you say, "Oh yes,
something took place,"
when it happened just due to
chance, is set to 0.05.
That is, in the behavioral
sciences and several other
sciences,
we say we'll allow a 0.05
probability for evidence
to make us think we should
reject the null
when we shouldn't.
Okay, if the null hypothesis
is true that lead has no
effect, or actually helps IQ,
there is still a five percent
chance of our accidentally
rejecting the null hypothesis,
right?
That's our alpha level
and again, take a look
at that distribution of sample
means. All of those x-bars
that you see,
all of them are possible,
but most likely,
the x-bars will not be
in the shaded region.
Most likely,
the sample means won't be
in the shaded region.
If our alpha is 0.05,
the probability of falling
in the non-shaded region would
be 0.95.
So, if the null hypothesis is
true, most likely,
we're not going to reject it
and that's good.
0.05 of the time, we will.
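
One way to see that 0.05 in action is a small simulation. This is only a sketch under stated assumptions: the null hypothesis is true, IQ scores are drawn from a normal distribution with mean 100 and standard deviation 16, each sample has 15 children, and the number of trials and the random seed are arbitrary choices.

```python
import math
import random
from statistics import NormalDist, fmean

POP_MEAN = 100   # population mean IQ
POP_SD = 16      # population standard deviation
N = 15           # children per sample
ALPHA = 0.05     # allowed probability of a false rejection
TRIALS = 20_000  # arbitrary number of simulated studies

# Cutoff that leaves ALPHA of the standard normal curve in the lower tail.
z_cutoff = NormalDist().inv_cdf(ALPHA)

# Standard deviation of the sample means for samples of size N.
standard_error = POP_SD / math.sqrt(N)

rng = random.Random(0)  # fixed seed so the run is repeatable
rejections = 0
for _ in range(TRIALS):
    # Draw one sample from the population with NO lead effect at all.
    sample = [rng.gauss(POP_MEAN, POP_SD) for _ in range(N)]
    z = (fmean(sample) - POP_MEAN) / standard_error
    if z <= z_cutoff:
        rejections += 1  # landed in the shaded reject zone by chance

false_rejection_rate = rejections / TRIALS
print(f"Share of samples in the reject zone: {false_rejection_rate:.3f}")
```

Even though nothing is going on in these simulated samples, roughly 5 percent of them should land in the shaded region, and the other 95 percent or so should not.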
Okay, now on the positive
side, if there's a real
effect,
that is, if there's a real effect
of lead, if it really does
harm IQ, we'd expect our
sample mean to fall
in that shaded region.
See the line drawn,
the vertical line drawn
through the distribution
of sample means?
That's our decision criterion.
We say, "Hey,
if the sample mean falls
on that decision criterion
or further
into the shaded reject zone,
we will reject the null."
Okay? So we will reject the
null hypothesis
if the sample mean is beyond
the decision criterion
in the shaded region,
otherwise,
we'd retain the
null hypothesis.
We expect the sample mean
to be far below the population
mean if our research
hypothesis is correct, right?
So specifically,
we expect the sample mean will
be in the shaded reject region
if the research hypothesis
is correct.
Okay, now as mentioned,
the Z-test will let you know
how many standard errors you
are below the mean.
If our sample mean is
at least 1.645 standard errors
below the mean, then we get
to reject the null hypothesis
and support the research
hypothesis,
that negative 1.645,
that's where our decision
criterion is and everything
to the left is shaded
and that's our reject zone,
and you're wondering,
where did you get this
negative 1.645?
You can get it from a Z-table:
you'd have to look up a
probability of 0.05 and find
out what's the
corresponding Z-score.
Our Z-table will help you
to approximate it,
but you'd actually need a more
sophisticated Z-table
to get the actual negative
1.645. But I'll let you know,
anytime you do a one-tailed
test with an alpha of 0.05,
if you expect the sample mean
to be below the population
mean, it's always going
to be negative 1.645,
so it's a value we just know,
and when you read the section
on hypothesis testing,
you'll go over that
in a little bit more detail.
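
If you don't have a detailed Z-table handy, the same lookup can be sketched in code. This assumes a one-tailed test with alpha of 0.05 in the lower tail, and it uses the inverse of the standard normal cumulative distribution from Python's standard library.

```python
from statistics import NormalDist

ALPHA = 0.05  # probability we allow in the shaded lower tail

# inv_cdf answers: which Z-score has ALPHA of the curve to its left?
z_crit = NormalDist().inv_cdf(ALPHA)

print(f"Decision criterion: {z_crit:.3f}")
```

Rounded to three decimals, this gives the negative 1.645 used as the decision criterion.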
Okay, so the value negative
1.645, that is standard errors
below the mean,
identifies the start
of the shaded region.
The probability of a sample mean
landing in that shaded region
is 0.05, so if the null
hypothesis is true,
there's a 0.05 chance
that the sample mean will end
up there even though no effect
is going on,
that would be bogus evidence
and that would be an "Oh,
no" type thing,
we would be incorrectly
rejecting the null.
On the other hand,
if our treatment has an
effect, that is, if lead
is actually harming IQ,
well, that's exactly
where we'd expect the sample
mean to be. We expect it
to be below the population
mean, and pretty far below the
population mean.
Okay, with that theoretical
background covered
on hypothesis testing,
let's get down to the
actual mechanics.
So we need
to calculate the Z-test.
The first step
for calculating the Z-test is
to figure out what is the
standard error.
The standard
error is
equal to the standard
deviation divided
by the square root
of the sample size.
Our standard deviation
for the individual scores is
16 and our sample size was 15.
So we'll take 16 divided
by the square root of 15,
and that comes out to be 4.13.
What that tells us is the
variability
for sample means is a lot
less than the variability
for individual scores, right?
For individual scores,
we have a standard deviation
of 16, lots
of variability possible.
For a sample size of 15,
those sample means are going
to be closer
to the population mean.
Their variability,
if you will,
their standard error, is 4.13,
which is much less than 16.
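
As a quick check on that arithmetic, here is a minimal sketch of the standard error calculation, using the population standard deviation of 16 and the sample size of 15 from this example. The variable names are just for illustration.

```python
import math

POP_SD = 16  # standard deviation of individual IQ scores
N = 15       # number of children in the sample

# Standard error: standard deviation divided by the square root of n.
standard_error = POP_SD / math.sqrt(N)

print(f"Standard error: {standard_error:.2f}")
```

The result is well under the 16 points of spread you see in individual scores, which is the point being made about sample means clustering closer to the population mean.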
Okay, once we know our
standard error,
that becomes kind
of our unit of measurement,
so then we go ahead
and we calculate the Z-test.
Our Z-test is the sample
mean minus the population mean
divided by the standard error.
Our sample mean is 91,
our population mean is 100,
and our standard error is
4.13, and when we do the math,
that comes
out to be negative 2.18.
That is, our sample mean is
2.18 standard errors below
the mean. That's pretty far
below the mean.
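
The same calculation can be sketched in a few lines, using the numbers from this example: sample mean 91, population mean 100, standard deviation 16, and sample size 15. The names here are illustrative only.

```python
import math

SAMPLE_MEAN = 91
POP_MEAN = 100
POP_SD = 16
N = 15

# Unit of measurement: the standard error of the sample mean.
standard_error = POP_SD / math.sqrt(N)

# Z-test: how many standard errors the sample mean is from the population mean.
z = (SAMPLE_MEAN - POP_MEAN) / standard_error

print(f"Z = {z:.2f}")
```

The negative sign shows that the sample mean sits below the population mean.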
Okay, then next, we're going
to evaluate our Z-test result
to find out what this means.
So, we make a decision:
we'll reject the null
hypothesis
if the Z-test is 1.645
standard errors below
the mean or further out.
Notice that the sample mean
for our particular research
is in that shaded reject zone.
Our sample mean was 2.18
standard errors below the
mean, right?
That negative sign means we're
below the mean
and that 2.18 itself says
we're 2.18
standard errors away from the
mean. That means
that we're beyond our decision
criterion of negative 1.645.
So our sample mean is not one
standard error below the mean,
not two standard errors below
the mean, but 2.18 standard
errors below the mean,
so we'll reject our null.
The probability of this
happening due
to chance is 0.05 or less.
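
Here is a minimal sketch of that decision step. It assumes a one-tailed test in the lower tail with alpha of 0.05 and reuses the numbers from this example. The chance probability it prints is computed from the standard normal curve rather than read off a table.

```python
import math
from statistics import NormalDist

SAMPLE_MEAN, POP_MEAN, POP_SD, N = 91, 100, 16, 15
ALPHA = 0.05

standard_normal = NormalDist()

# Z-test result for the sample of 15 children.
z = (SAMPLE_MEAN - POP_MEAN) / (POP_SD / math.sqrt(N))

# Decision criterion: the Z-score with ALPHA of the curve to its left.
z_crit = standard_normal.inv_cdf(ALPHA)

# Probability of a result this far below the mean if the null is true.
p_value = standard_normal.cdf(z)

# Reject when the result is on the criterion or further into the tail.
reject_null = z <= z_crit

print(f"Z = {z:.2f}, criterion = {z_crit:.3f}, p = {p_value:.4f}")
print("Reject the null" if reject_null else "Retain the null")
```

With these numbers the Z-test result is beyond the criterion, and the computed probability comes out well under 0.05, at roughly 0.015.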
Okay, so here,
the Z-test result was negative
2.18, indicating the sample
mean was far below the
population mean
and that the probability of
this happening just due
to chance is 0.05 or less.
So we say, well,
this could be due to chance,
but most likely,
it's because lead does
impair IQ.
Our null hypothesis
of no effect
or actual improvement
from lead dust was rejected,
based upon the evidence
collected:
the sample mean was 91.
As a result,
the research hypothesis was
supported,
that lead dust actually does
impair brain development.
So, conclusion:
lead impairs brain
development. Notice that
to evaluate this evidence,
we need to know
about probability,
the distribution of sample
means, and hypothesis testing.
You may want to go back
through the workbook
to review those topics
and then listen
to this narrated PowerPoint
again to further solidify your
understanding of these topics
and how they're related.