Let's observe in more detail why the hospital example gives us such a wrong conclusion.
We study two variables--being in the hospital and passing away.
We rightfully observe that these two things are correlated.
If we were to do a scatter plot where we have two categories--
whether or not we're in the hospital and whether or not a person passed away--
you find there's an increased occurrence of data over here
and of data over here relative to the other two data points over here.
That means the data correlates.
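To make that pattern concrete, here is a minimal sketch in Python. The counts are made up purely for illustration and are not taken from the lecture; the names counts and death_rate are hypothetical too. It compares the share of people who passed away in each of the two groups.

```python
# Hypothetical counts, invented for illustration only (not from the lecture).
counts = {
    ("in_hospital", "died"): 40,
    ("in_hospital", "survived"): 960,
    ("not_in_hospital", "died"): 20,
    ("not_in_hospital", "survived"): 8980,
}


def death_rate(place):
    """Share of people in the given group who passed away."""
    died = counts[(place, "died")]
    survived = counts[(place, "survived")]
    return died / (died + survived)


rate_in = death_rate("in_hospital")
rate_out = death_rate("not_in_hospital")

# If the two rates differ, knowing where a person is tells us something
# about whether they passed away, so the two variables are correlated.
correlated = rate_in != rate_out

print(f"death rate in hospital:     {rate_in:.4f}")
print(f"death rate not in hospital: {rate_out:.4f}")
print(f"correlated: {correlated}")
```

With these made-up numbers the rate is higher in the hospital group. That only shows the two variables move together; it says nothing about whether one causes the other.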
What does correlation mean?
In any plot, data is correlated if knowledge about one variable tells us something about the other.
This is a correlated data plot. Here's another data plot.
Correlated or not? Yes or no?