Interpreting Regression Results

Recall that in the preceding presentation, we considered the theory of regression analysis. The basic idea is that in regression analysis we draw a best fit line through a series of data points. This best fit line has all of the algebraic properties that we learned about many years ago. The y-intercept and slope, in particular, are important properties to keep in mind. Our task in this presentation is to apply this theory to real-world data points, to figure out what we can learn about politics and public policy by drawing best fit lines.
What we are looking at is a set of data points. As before, each circle denotes a state in the United States. The independent variable, presented on the x-axis, is the amount of money states spend per pupil on education. Note that there is substantial variation in education spending across the states. The y-axis presents the dependent variable, the percent of the state’s population that has a bachelor’s degree. Once again, the percentages vary widely across the states. What do you think the best fit line for this set of data points looks like? Is it upward or downward sloping? Is it shallow or steep in slope?
Rather than look at the best fit line in picture form, what we are going to do is look at it in number form. That’s what the information we are looking at conveys. This information includes the slope of the best fit line that has been drawn through the set of data points. So where exactly do we look? And how exactly do we interpret what we find? Let’s focus on the slope of the best fit line. The slope of the line is .001, the number at the bottom of the column labeled “B.” What do we take away from this slope? First of all, the slope is positive in sign, which tells us that the line is upward sloping. In other words, as per pupil spending increases, the percent of a state’s population with bachelor’s degrees also increases.
What is the strength of the association between per pupil spending and the percent of a state’s population with bachelor’s degrees? Recall precisely what a slope tells us in the equation of a line. A unit increase in x is associated with a [fill in the blank with the slope] change in y. In our example, each one-unit increase in per pupil spending is associated with a .001 unit increase in the percent of a state’s population with a bachelor’s degree. Is that a big change or a small change?
To answer this question, we first need to consider the way in which the independent and dependent variables are measured. The independent variable, per pupil spending, is measured in dollars. The dependent variable is measured in percent, so one unit is one percentage point. Looking at the scatter plot once again, we can see that states range from less than $4,000 in per pupil spending to nearly $10,000 in per pupil spending. This matters because a one-unit increase in per pupil spending means going from $1 to $2, or from $4,000 to $4,001, or from $9,999 to $10,000. Such changes are, of course, trivial. No state is going to enact a new policy whereby it increases its per pupil spending by a single dollar.
Take the case of Alabama, which spends $4,564 per pupil. Let’s say that the governor and state legislature in Alabama agree to increase per pupil spending up to $6,000. Based on the best fit line, what might policymakers expect to be the associated change in the percent of the state’s population with bachelor’s degrees? This would be a 1,436 unit increase in spending (6,000 - 4,564 = 1,436). If a one-unit increase in spending is associated with a .001 increase in the percent of a state’s population with bachelor’s degrees, then a 1,436 unit increase in per pupil spending is associated with a change of .001 * 1,436, or 1.436, units in the percent of the population with bachelor’s degrees.
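The arithmetic in the Alabama example can be sketched in a few lines of Python. This is only an illustration: the slope (.001) and the two spending figures come from the presentation, while the function name expected_change is a hypothetical helper made up for this sketch, not part of any statistics package.

```python
# Values quoted in the presentation.
SLOPE = 0.001             # change in percent with bachelor's degrees per $1 of spending
CURRENT_SPENDING = 4564   # Alabama's per pupil spending, in dollars
PROPOSED_SPENDING = 6000  # the hypothetical new spending level, in dollars


def expected_change(slope, x_old, x_new):
    """Change in y associated with moving x from x_old to x_new along the best fit line.

    Hypothetical helper: it simply multiplies the slope by the change in x.
    """
    return slope * (x_new - x_old)


spending_increase = PROPOSED_SPENDING - CURRENT_SPENDING
change = expected_change(SLOPE, CURRENT_SPENDING, PROPOSED_SPENDING)

print(f"Increase in per pupil spending: {spending_increase} dollars")
print(f"Associated change in percent with bachelor's degrees: {change:.3f}")
```

Because the slope is expressed per dollar, any proposed spending level can be swapped in for PROPOSED_SPENDING to see the associated change along the same line.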
In other words, the percent of Alabama’s population with bachelor’s degrees would be expected to increase by nearly one and a half percentage points, from its current level of 20.4 percent to an expected level of 21.8 percent. Is that a big or a small change? Is it worth the money? Those are issues for each one of us to decide, based on our values toward education, government spending, and so forth. What regression analysis has told us is what size outcome we can expect if a particular policy change is made. This is what the practical application of regression analysis looks like.
One question that we have not yet addressed is whether the association between per pupil spending and the percent of a state’s population with bachelor’s degrees is statistically significant. This is something we can find out very easily from the regression results that we are looking at. Look over at the right-hand column of numbers, the column labeled “Sig.” The bottom number in that column tells us whether or not the association is statistically significant. Here is the decision rule that we will use. If the number in the “Sig.” column is .05 or smaller, then we can reject the null hypothesis of no association with 95 percent confidence. In our example, since the number is .009, we can indeed reject the null. This means that the association between per pupil spending and the percent of a state’s population with bachelor’s degrees is statistically significant. Our best guess is that this association did not occur in our data set by random chance. Rather, the association is indicative of a true underlying association between per pupil spending and the percent of a state’s population with bachelor’s degrees.
This wraps up our introduction on how to interpret regression results.
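The decision rule for the “Sig.” column can be written out as a short Python sketch. The .05 threshold and the .009 value are taken from the presentation; the function name is_significant is a hypothetical helper introduced only for illustration.

```python
# Threshold from the decision rule stated in the presentation.
ALPHA = 0.05


def is_significant(sig_value, alpha=ALPHA):
    """Return True when the "Sig." value is at or below the threshold.

    Hypothetical helper: True means we reject the null hypothesis of no association.
    """
    return sig_value <= alpha


SIG = 0.009  # bottom number in the "Sig." column of the regression results

if is_significant(SIG):
    print(f"Sig. = {SIG}: reject the null hypothesis of no association")
else:
    print(f"Sig. = {SIG}: cannot reject the null hypothesis of no association")
```

Note that the rule uses "or smaller," so a value of exactly .05 still counts as statistically significant under this rule.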
In the following presentation, we will consider the circumstances in which we have not a single independent variable, as we have focused on here, but rather a number of independent variables that we hypothesize are associated with a dependent variable of interest.