Normality test using SPSS: how to check whether data are normally distributed. As you know, in
statistical analysis there are dependent variables and independent variables. A dependent
variable is a variable that may depend on other factors. For example, exam scores as a variable
may change depending on the students' gender. An independent variable, on the other hand, is a
variable that doesn't change in response to the dependent variable. For example, gender doesn't
change depending on exam scores.
Many parametric statistical methods require that the dependent variable is approximately
normally distributed for each category of the independent variable. The normal curve is the
familiar, classic bell-shaped curve. In our example, exam scores need to be approximately
normally distributed for both males and females.
Let's use SPSS to verify this. We must investigate the following numerical and visual outputs.
The skewness and kurtosis z-values should be somewhere in the span from minus 1.96 to plus 1.96.
The Shapiro-Wilk test p-value should be above 0.05. The histograms, normal Q-Q plots, and box
plots should visually indicate that our data are approximately normally distributed.
Remember that your data don't have to be perfectly normally distributed - the main thing here
is that they are approximately normally distributed, and that you check each category of the
independent variable. In our example, we must check both male and female data.
Now I will show you how to do it with the help of SPSS. Afterwards, I will provide references
and show examples of how you can write out your results in your paper or article manuscript.
In the SPSS menu, click on Analyze, select Descriptive Statistics, and then Explore. In our
example, exam scores is the dependent variable because, as I said, we assume that they may
change depending on gender, and gender is our independent variable.
Next, click on Plots, and select Histogram - you don't need Stem-and-leaf. Select Normality plots
with tests, and Continue. Click OK to execute and generate the output. First, focus on skewness
and kurtosis. The measures are in the left column, and the standard errors are in the right column.
The skewness and kurtosis measures should be as close to zero as possible in SPSS. In reality,
however, data are often skewed and kurtotic, as you know. A small departure from zero is therefore
no problem, as long as the measures aren't too large compared to their standard errors. As a
consequence, you must divide each measure by its standard error, and you need to do this by hand,
using a calculator. This will give you the z-value, which, as I said, should be somewhere
between minus 1.96 and plus 1.96.
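The division described above can also be sketched in a few lines of Python instead of on a calculator. This is only an illustration: the function names and the input values (0.5 and 0.25) are hypothetical and do not come from the SPSS output in this tutorial.

```python
def z_value(measure: float, standard_error: float) -> float:
    """Divide a skewness or kurtosis measure by its standard error."""
    if standard_error <= 0:
        raise ValueError("standard error must be positive")
    return measure / standard_error


def within_bounds(z: float, limit: float = 1.96) -> bool:
    """Return True if z lies between minus `limit` and plus `limit`."""
    return -limit <= z <= limit


# Hypothetical measure and standard error, for illustration only.
z = z_value(0.5, 0.25)
print(f"z = {z:.2f}, within +/-1.96: {within_bounds(z)}")
```

You would call `z_value` once for skewness and once for kurtosis, for each category of the independent variable.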
Let us start with the males in our example. To calculate the skewness z-value, divide the
skewness measure by its standard error. Here, it is 1.02. This value is neither below minus
1.96 nor above plus 1.96, which is exactly what we want.
Next, calculate the kurtosis z-value for the males. In this example, it is 0.81, which is also
within plus or minus 1.96. Next, calculate the skewness and kurtosis z-values for the female
data. They are minus 0.03 and minus 1.16. All four z-values in our example are within plus or
minus 1.96. Hence, we end this part about skewness and kurtosis by concluding that the exam score
data are a little skewed and kurtotic for both males and females, but they don't differ
significantly from normality.
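If you want to see where such measures and standard errors come from, here is a minimal sketch that computes them from raw scores. It uses the common bias-adjusted sample formulas for skewness and excess kurtosis and their usual standard errors, which I assume (but have not verified against the SPSS documentation here) are the ones behind the SPSS output. The function name and the scores are hypothetical.

```python
import math


def skewness_and_kurtosis(values):
    """Return (skewness, SE of skewness, excess kurtosis, SE of kurtosis).

    Uses the bias-adjusted sample formulas; assumed, not confirmed,
    to correspond to what SPSS reports.
    """
    n = len(values)
    if n < 4:
        raise ValueError("need at least 4 observations")
    mean = sum(values) / n
    sum2 = sum((x - mean) ** 2 for x in values)
    sum3 = sum((x - mean) ** 3 for x in values)
    sum4 = sum((x - mean) ** 4 for x in values)
    variance = sum2 / (n - 1)  # sample variance
    if variance == 0:
        raise ValueError("all observations are identical")
    skew = n * sum3 / ((n - 1) * (n - 2) * variance ** 1.5)
    kurt = (n * (n + 1) * sum4 / ((n - 1) * (n - 2) * (n - 3) * variance ** 2)
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))
    se_skew = math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
    se_kurt = 2 * se_skew * math.sqrt((n * n - 1) / ((n - 3) * (n + 5)))
    return skew, se_skew, kurt, se_kurt


# Hypothetical exam scores for one group, for illustration only.
scores = [52, 61, 64, 67, 70, 71, 74, 78, 83, 90]
skew, se_skew, kurt, se_kurt = skewness_and_kurtosis(scores)
print(f"skewness z = {skew / se_skew:.2f}, kurtosis z = {kurt / se_kurt:.2f}")
```

Note that the standard errors depend only on the number of observations, so small groups give wide standard errors and correspondingly small z-values.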
Next, let us focus on the Shapiro-Wilk test statistic. The null hypothesis for this test of normality
is that the data are normally distributed. The null hypothesis is rejected if the p-value is below
0.05. In SPSS output, the p-value is labeled "SIG."
In our example, the p-values for males (0.456) and females (0.493) are both above 0.05, so we
keep the null hypothesis. The Shapiro-Wilk test thus gives us no reason to doubt that our
example data are approximately normally distributed.
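The decision rule for the Shapiro-Wilk test can be written down as a small sketch. The two p-values are the ones read from the SPSS output in this example; the function name and the dictionary layout are my own assumptions. The sketch only applies the rule, it does not compute the test itself (outside SPSS, a function such as `scipy.stats.shapiro` could supply the p-values).

```python
ALPHA = 0.05  # significance level used in this tutorial


def normality_rejected(p_value: float, alpha: float = ALPHA) -> bool:
    """The null hypothesis of normality is rejected if p is below alpha."""
    return p_value < alpha


# p-values ("Sig.") taken from the example output, one per category.
p_values = {"male": 0.456, "female": 0.493}

for group, p in p_values.items():
    verdict = "reject" if normality_rejected(p) else "keep"
    print(f"{group}: p = {p:.3f} -> {verdict} the null hypothesis")
```

Every category has to pass; a single group with a p-value below 0.05 is enough to question the normality assumption.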
Next, let us look at the graphical figures for both male and female data. Start by inspecting the
histograms visually - they should have the approximate shape of a normal curve. And I think
they do in our example, so everything is okay here. Then look at the normal Q-Q plots: the
dots should lie along the line. This indicates that the data are approximately normally distributed.
In our example, I think the dots lie close to the line, so that's okay.
Skip the detrended Q-Q plots - you don't need them. Look at the box plots: they should be
approximately symmetrical. Although they are not perfectly symmetrical in our example, I think
they are good enough.
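To make it clearer what the dots in a normal Q-Q plot are, here is a rough sketch that pairs each sorted observation with the value a normal distribution would predict at the same rank. The plotting positions (i - 0.5) / n are one common convention and an assumption on my part; SPSS may use a different one, so the numbers need not match its plot exactly. The function name and the scores are hypothetical.

```python
from statistics import NormalDist, mean, stdev


def qq_points(values):
    """Return (expected normal value, observed value) pairs for a Q-Q plot."""
    data = sorted(values)
    n = len(data)
    # Reference normal distribution with the sample's mean and standard deviation.
    reference = NormalDist(mean(data), stdev(data))
    return [
        (reference.inv_cdf((i - 0.5) / n), observed)
        for i, observed in enumerate(data, start=1)
    ]


# Hypothetical exam scores for one group, for illustration only.
scores = [52, 61, 64, 67, 70, 71, 74, 78, 83, 90]
for expected, observed in qq_points(scores):
    print(f"expected {expected:6.1f}   observed {observed:3d}")
```

If the data are approximately normal, the expected and observed values in each pair are close, which is what "the dots lie along the line" means.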
Finally, before I show you how to write out your results, let me provide resources. These are the
books and articles that are the basis for this tutorial.
This is how I would write out the results.
I would put it under the sub-heading "Sample characteristics", and I would phrase it something
like this. Feel free to pause the tutorial now to read my example text in more detail.
In case you are wondering, you don't need to report the skewness and kurtosis z-values - it's
enough to report the measures and their standard errors.
SE is the abbreviation for standard error.
In this tutorial, I've shown you how to check if a dependent variable is approximately normally
distributed for each category of an independent variable. I did this because I assume that you
will eventually want to use certain parametric statistical methods to explore and investigate your
data, such as t-tests.
If it turns out that your dependent variable is not approximately normally distributed for each
category of the independent variable, it is still no problem. In that case, you will have to use
non-parametric methods, because they do not assume that the data are normally distributed.
Thank you very much for watching and let me end by wishing you success with your research,
and your paper or article manuscript.
Captions by GetTranscribed.com