Hello, my friends.
We're going to continue our discussion on linear
regression by exploring multiple linear regression.
This video is simply an introduction to the concept.
I will try to represent it visually so that you
can understand what multiple linear
regression is all about.
As we start this thing out, we just finished simple linear
regression.
Simple linear regression is about modeling a data set
with one independent variable to model
one dependent variable.
Multiple linear regression is about modeling a data set with
more than one independent variable to model one
dependent variable.
In other words, we might examine, say, five independent variables and
use them to model the results of one dependent variable.
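
As a rough illustration of modeling one dependent variable with several independent variables, here is a minimal sketch using NumPy least squares. The data are synthetic and the names (`X`, `y`, `true_coefs`) are assumptions made for the example, not anything from the video. Three independent variables are used, but the count is arbitrary.

```python
import numpy as np

# Synthetic, illustrative data: 200 observations of three independent variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))

# The dependent variable is built from known coefficients plus noise,
# so we can check that the fit roughly recovers them.
true_intercept = 1.0
true_coefs = np.array([2.0, -1.0, 0.5])
y = true_intercept + X @ true_coefs + rng.normal(scale=0.1, size=200)

# Add a column of ones so the model can estimate an intercept.
design = np.column_stack([np.ones(len(X)), X])

# Ordinary least squares: coefficients that minimize the squared error.
beta, *_ = np.linalg.lstsq(design, y, rcond=None)
intercept, coefs = beta[0], beta[1:]

print("intercept:", intercept)
print("coefficients:", coefs)
```

Each fitted coefficient estimates how much the dependent variable changes when that independent variable changes by one unit and the others are held constant.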
Multiple linear regression, of course, is founded in
correlational analysis just as simple linear regression was
founded in correlational analysis.
Now, what is multiple linear
regression, you may ask yourself?
Now, we'll start out here and we will look at one dependent
variable that we may want to model.
And if we want to model that dependent variable, we may
identify several independent variables.
In this picture, we have identified three distinct
independent variables and we want to use them to model a
dependent variable.
Now what happens is that
multiple linear regression will begin to examine each of
these three independent variables to determine which
one has the most impact on the variance of
the dependent variable.
Now, don't let the word variance scare you, because I
will come back to that again.
All statistics is about the mean and about variance.
So, that's very important.
So multiple linear regression will look for the independent
variable that is having the most impact first.
And of course, the goal is that it will examine
all three of those.
And it will look at how much variance in the dependent
variable that each independent variable explains.
Now, it will identify then the one that
explains the most variance.
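
One way to picture "how much variance each independent variable explains" on its own is the squared correlation between that variable and the dependent variable. The sketch below assumes synthetic data and made-up names (`r2_each`, `strongest`). It is an illustration of the idea, not the speaker's own computation.

```python
import numpy as np

# Synthetic data: three independent variables with different strengths.
rng = np.random.default_rng(2)
n = 400
X = rng.normal(size=(n, 3))
y = 3.0 * X[:, 0] + 1.5 * X[:, 1] + 0.5 * X[:, 2] + rng.normal(size=n)

# For a single predictor, the share of variance explained by a simple
# linear regression equals the squared correlation coefficient.
r2_each = [np.corrcoef(X[:, j], y)[0, 1] ** 2 for j in range(X.shape[1])]

# The variable with the largest share is the one examined "first".
strongest = int(np.argmax(r2_each))

print("variance explained by each variable alone:", r2_each)
print("strongest single variable: column", strongest)
```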
And once it identifies the independent variable that
explains the most variance in the dependent variable, then
it will more or less freeze that independent variable to
examine the other two independent variables, and
identify the independent variable that explains the
next-most variance in the dependent variable.
And of course, this process goes on, and on, and on.
Once a variable is identified as having impact on the
dependent variable, it is frozen, and then the other
variables are examined to determine in a priority order,
which one explains the most, which one explains the next,
which one explains the next?
If one of those independent variables explains 50%, then
the next one may explain 25%.
And the next one may explain 10%.
So between those three independent variables, they
thus explain 85% of the variance in
the dependent variable.
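
The freeze-and-examine process described above resembles what is often called forward stepwise selection. That name, and everything in the sketch below, is an assumption made for illustration: the data are synthetic, and `r_squared` and `forward_selection` are hypothetical helpers, not the speaker's method. The coefficients were chosen so the shares come out in the neighborhood of 50%, 25%, and 10%, though the exact numbers depend on the random sample.

```python
import numpy as np

def r_squared(X, y, columns):
    """Share of the variance in y explained by an OLS fit on the given columns."""
    design = np.column_stack([np.ones(len(y))] + [X[:, c] for c in columns])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ beta
    return 1.0 - residuals.var() / y.var()

def forward_selection(X, y):
    """Greedily order the columns of X by the extra variance each one explains."""
    selected, remaining = [], list(range(X.shape[1]))
    steps, explained = [], 0.0
    while remaining:
        # Keep the already chosen ("frozen") variables and try each remaining one.
        scores = {c: r_squared(X, y, selected + [c]) for c in remaining}
        best = max(scores, key=scores.get)
        steps.append((best, scores[best] - explained))
        explained = scores[best]
        selected.append(best)
        remaining.remove(best)
    return steps, explained

# Synthetic data with three uncorrelated independent variables.
rng = np.random.default_rng(1)
n = 500
X = rng.normal(size=(n, 3))
y = 2.2 * X[:, 0] + 1.6 * X[:, 1] + 1.0 * X[:, 2] + rng.normal(scale=1.2, size=n)

steps, explained = forward_selection(X, y)
for column, gain in steps:
    print(f"column {column} adds {gain:.1%} of the variance")
print(f"total explained: {explained:.1%}")
```

The shares add up this neatly only because each step reports the additional variance explained once the earlier variables are frozen. When independent variables are correlated with each other, their stand-alone shares overlap and cannot simply be summed.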
The goal is to explain as much of the variance via the
independent variables as is meaningful.
Now notice the word meaningful.
To explain 100% of the variance, you have to explain
everything.
And that's just not possible.
It's never possible to explain 100%.
But how much is meaningful?
What if you found a good 75% fit model?
Multiple linear regression is about replication and the
identification of the most
important independent variables.
Now I put dependent there.
I apologize.
It's independent variables.
Which one is the most important in impacting the
dependent variable?
Once you identify one, you freeze it and then you examine
the others.
Many researchers criticize multiple linear regression
because replication also replicates error.
But friends, I will tell you this.
That anything that is replicated, replicates error.
And a lot of--
I hate to quote the great statistician, George Box, who
said "all models are wrong." He was correct.
But he did add that some are, however, useful.
There is no perfect methodology.
Multiple linear regression's by no means perfect.
But it may prove useful.
All linear regression is contingent upon the lower and
the upper bounds.
Your regression is between your lowest x values and your
highest x values.
And outside those bounds, it really
doesn't have much meaning.
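
A small sketch of that idea: fit a line, remember the lowest and highest x values, and decline to predict outside them. The data points and the `predict` helper are made up for illustration. Raising an error is just one possible design choice, and a warning would serve equally well.

```python
import numpy as np

# Illustrative data: x ranges from 1 to 5.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Fit a straight line; polyfit returns the slope first, then the intercept.
slope, intercept = np.polyfit(x, y, deg=1)
x_min, x_max = x.min(), x.max()

def predict(new_x):
    """Return the fitted value, refusing to extrapolate beyond the data."""
    if not (x_min <= new_x <= x_max):
        raise ValueError(
            f"{new_x} is outside the observed range [{x_min}, {x_max}]"
        )
    return slope * new_x + intercept

print(predict(3.0))  # inside the observed range, so a value is returned
```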
Again, I want to thank you very much for your support.
We'll keep working on this material.
We're going to master multiple linear regression.
Just want you to know what it is.
And I've been kind of beat up for saying,
live long and prosper.
My daughter told me I needed to come into the new age, so
we'll do this.
May the odds be ever in your favor.
This is Dog, signing off.