Well, my friends, now that you've had the opportunity to look
at what a linear regression is and to examine the linear
regression model, it is important that we take just a
moment to review the assumptions that are made
regarding multiple linear regression.
Now, multiple linear regression requires several
conditions, and I'm going to outline these and then discuss
each of them with you.
The first of these is that only relevant
variables are included.
The next is that a linear relationship is required.
All variables must be normally distributed.
And homoscedasticity is assumed.
It's an awesome word.
Most of us prefer homogeneity of variance.
These assumptions relate directly to the validity of
the research findings.
So let's look at each of these assumptions and see what we
might discover about it.
Only relevant variables are included.
I believe that almost goes without saying.
Variables can be modeled to predict other variables
without regard to their relevancy.
That happens all the time.
For example, the cost of individual Christmas gifts and
shoe size could be modeled to predict the academic GPA of
entering college freshmen.
What would be the purpose of such research?
What would it mean?
Christmas gifts and their cost and shoe size, how in the
world would that ever relate to GPA?
I laugh sometimes, because you see this going on all the time
with the media.
I mean, they'll take a hedgehog somewhere or a
groundhog or a flying squirrel, or anything else,
and they'll say, well, when a flying squirrel can jump from
this limb to that limb and not see its shadow, then the stock
market's going to go up.
I don't know.
Seems to me that they might be violating the first
assumption, that only relevant variables are included.
Now, this also means that you don't have collinearity.
In other words, if one variable is the same as the
other variable, just under a different wrapper, then you,
in fact, don't have two variables.
You only have one.
We discussed this earlier in some of the other videos.
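
To make the "same variable under a different wrapper" point concrete, here is a minimal Python sketch. The data are made up for illustration, and the function names centered_sums and looks_collinear are assumptions, not anything from the lecture. It relies on a standard fact: with two predictors, least squares cannot assign them separate coefficients when the determinant of their centered cross-product matrix is zero.

```python
def centered_sums(xs, zs):
    """Return (Sxx, Szz, Sxz): centered sums of squares and cross-products."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_z = sum(zs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    szz = sum((z - mean_z) ** 2 for z in zs)
    sxz = sum((x - mean_x) * (z - mean_z) for x, z in zip(xs, zs))
    return sxx, szz, sxz


def looks_collinear(xs, zs, rel_tol=1e-9):
    """True when the 2x2 cross-product matrix is (numerically) singular."""
    sxx, szz, sxz = centered_sums(xs, zs)
    determinant = sxx * szz - sxz ** 2
    return abs(determinant) <= rel_tol * sxx * szz


# Hypothetical data: one temperature series expressed in two units.
celsius = [0.0, 10.0, 20.0, 30.0, 40.0]
fahrenheit = [c * 9 / 5 + 32 for c in celsius]

# Hypothetical second predictor that is not a rescaling of the first.
other = [3.0, 1.0, 4.0, 1.0, 5.0]

same_wrapper = looks_collinear(celsius, fahrenheit)
different_variable = looks_collinear(celsius, other)
print(same_wrapper, different_variable)
```

Celsius and Fahrenheit are flagged as collinear because one is an exact linear rescaling of the other, so together they count as a single variable.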
Assumption two is that a linear
relationship is required.
Obviously, linear regression is about linear modeling.
Linear implies a linear relationship between the
independent variables and the dependent variable.
That just goes without saying.
Now one of the neat things is that our correlation
coefficients will let us know if we have a linear
relationship or not.
If you will recall, the Pearson r tells us if we have
a linear relationship.
That's what it examines, linear correlation.
So now you know the Pearson r is going to
continue to be important.
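
As a rough illustration of the Pearson r measuring linear correlation specifically, the sketch below compares a straight-line relationship with a curved one. The data and the helper name pearson_r are assumptions made for this example.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical predictor values, symmetric around zero.
x = [-3, -2, -1, 0, 1, 2, 3]

y_line = [2 * v + 1 for v in x]   # exactly linear in x
y_curve = [v ** 2 for v in x]     # fully determined by x, but not linear

r_line = pearson_r(x, y_line)     # close to 1
r_curve = pearson_r(x, y_curve)   # close to 0
print(r_line, r_curve)
```

The curved case is fully determined by x, yet its r is near zero, so a low r rules out a straight-line fit without ruling out a relationship.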
The third assumption is that all variables are normally
distributed.
Normality is very important in multiple linear regression.
The condition may be hard to meet, however, by the data
sets that you extract and that you analyze.
There are many researchers that say that the process is
considered robust, even if you don't meet this assumption.
If the data sets do not contain extreme outliers, then
the research that you've done may in fact be robust.
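
One way to look for extreme outliers is sketched below. The 1.5 x IQR fence is a common convention swapped in here as an assumption; the lecture does not name a specific rule. The scores are made up, and the name find_outliers is hypothetical.

```python
import statistics


def find_outliers(values, k=1.5):
    """Return values lying more than k * IQR beyond the quartiles."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    return [v for v in values if v < lower_fence or v > upper_fence]


# Hypothetical scores with one value far from the rest.
scores = [52, 55, 57, 58, 60, 61, 63, 64, 66, 140]

outliers = find_outliers(scores)
print(outliers)
```

This only flags candidates. Whether a flagged value is an error or a genuine observation is a judgment the researcher has to make and report.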
And guys, this is kind of an approach when you have
assumptions.
If you have an assumption that is not met, then declare that
it's not met.
Then go out and find a researcher that says that the
process is still robust.
Let me tell you, it doesn't matter what you look for.
There's a researcher out there somewhere that says
it's going to be OK.
Pretty interesting.
Assumption four.
Now, I love this word.
Homoscedasticity is assumed.
Homoscedasticity is often called
homogeneity of variance.
Those of us with terrible Texas accents find it easier
to say homogeneity of variance than we do homoscedasticity.
Now, what homogeneity of variance assumes is that the
variance across all of the little data sets, all of the
variables, is the same.
In other words, you have this.
I want you to notice that you have curves
that are spread equally.
Their variance is the same.
That means their standard deviations are the same.
And they look alike.
So when you have homogeneity of variance, your variables
look like this.
They're the same width.
Now, if you don't have homogeneity of variance, or
homoscedasticity is violated, then you have a
situation like this.
You may have a very narrow curve, you may have a medium
curve, you may have a big fat curve.
I resemble that remark, I don't know.
You may have a broad curve.
No, I still resemble that.
I'm like, well, you may have a wide--
I'd better give up.
OK, but in other words, if you have homogeneity of variance,
your curves are all the same.
And if you don't have homogeneity of variance, you
have a little problem that you're measuring curves that
are not the same width.
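
The "same width" idea can be checked roughly by comparing sample variances across groups. The sketch below is a descriptive check under assumptions, not a formal test. The groups are made-up data and the name variance_ratio is hypothetical.

```python
import statistics


def variance_ratio(groups):
    """Ratio of the largest to the smallest sample variance among groups."""
    variances = [statistics.variance(g) for g in groups]
    return max(variances) / min(variances)


# Hypothetical groups with the same spread, shifted to different centers.
same_width = [
    [8, 10, 12, 9, 11],
    [18, 20, 22, 19, 21],
    [28, 30, 32, 29, 31],
]

# Hypothetical groups: a narrow curve, a medium curve, and a wide curve.
different_width = [
    [9, 10, 11, 10, 10],
    [6, 10, 14, 8, 12],
    [0, 10, 20, 5, 15],
]

ratio_same = variance_ratio(same_width)
ratio_different = variance_ratio(different_width)
print(ratio_same, ratio_different)
```

A ratio near 1 means the groups have about the same spread. A large ratio points to the narrow-versus-wide situation described above.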
Now, how did we do?
We wanted to give you the assumptions for multiple
linear regression.
Only relevant variables are included.
A linear relationship is required.
All variables are normally distributed.
Homoscedasticity is assumed.
In other words, you have homogeneity of variance.
And these assumptions relate directly to the validity of
your research findings.
Now when you don't meet an assumption, I want to remind
you, that you would declare that.
And then you go find a researcher that says your
study is still robust, even if it doesn't meet that
assumption.
But what you're doing is you're declaring that you
didn't meet it, and leaving the findings of your research
in the hands of the informed reader.
Well, may the odds be ever in your favor.
And of course, I still like the whole
Vulcan thing, you know.
But may the odds be ever in your favor.
I'm trying to be cool.
Last night I was watching television, the
Turner Classic Movies.
And they were-- that's what old folks do--
and they were talking about how cool Steve
McQueen was in the movie.
Because he was man enough that when the movie, when they woke
him up, you know, in his film, he's in these old dorky
looking pajamas.
And he gets up and it takes him about 30 minutes to get
his body moving.
And that really brought hope to me.
Because my pajamas are dorkier looking than his were, and it
takes me more than 30 minutes to get moving in the morning.
Must be cool.
May the odds be ever in your favor.
May the odds be ever in my favor.
You have a good day.
This is the Dog, signing off.