Tip:
Highlight text to annotate it
X
In this video, you learn to sort data with Base SAS.
In this video, you learn to sort data with Base SAS.
Let me show you how you use the SORT procedure to sort data in SAS.
There’s a data set named titanic1 in the libsas library. Let’s take a look at the data set.
I’ll use SAS Explorer, drill into Libraries, libsas, and then open the titanic1 data set.
It contains demographic data for some of the passengers on the Titanic. The data is not sorted. There are various reasons that you
might want to sort a data set: to prepare it for a detailed listing report, for grouped analysis, or for a merge.
Suppose I want to group the data by survivors and non-survivors. I’ll use the numeric variable, Survived.
A value of 1 indicates a survivor and 0 is a non- survivor. I’ll close table viewer by clicking the X.
Let’s take a look at my program. A SORT procedure sorts a data set in ascending order by default, arranging the observations from
lowest to highest based on the sort variables that you specify. This step calls PROC SORT to sort the input data set, libsas.titanic1.
PROC SORT overwrites the original data set unless you use the OUT= option.
I’ve used out=work.titanic2, so the sorted observations are written to a data set named titanic2 in the work library.
I specified the sort variable, Survived, in the BY statement. The RUN statement ends the SORT step.
Let me highlight and submit this program by clicking Run.
PROC SORT doesn’t generate a report, so I need to check the log to see if it worked. There are no errors or warnings in the log.
The notes indicate that 117 observations were read from the input data set and the same number were written to the output data set.
This looks good. I’ll use Explorer to view the resulting data set. It is currently displaying the libsas library.
Now I need to go up one level, and then drill into the work library and double-click the output data set, titanic2.
The non-survivors are listed first, followed by the survivors. This is because an ascending numeric sort was performed, and zero comes before 1.
Later, you’ll see a descending sort, but first I’ll sort on multiple variables. Look at the Age column.
What if I want to sort by age within each group? That’s easy. I just add the Age variable to the BY statement.
I need to close the table viewer before I recreate titanic2, because SAS will not overwrite a data set that is in use.
Look at the BY statement in this step. The data will be sorted by Survived, and then within each group it will be sorted by Age.
I’ll run this step and use Explorer to open work.titanic2 again.
Why are there dots in the Age column? Remember, SAS does an ascending sort by default.
The dots indicate missing numeric values, which are treated as the smallest possible value, so they appear first in the sorted data set.
I’ll scroll through the output, there you go, missing values begin at observation 41. This looks good, the non-survivors are listed first
from the youngest to the oldest. When I page down to see the survivors they are also listed from youngest to oldest.
What if I want to sort from the oldest to the youngest instead, and I want to list the survivors before the non-survivors?
That’s easy too. Let me close the table and go back to my program.
To sort from highest to lowest, or from Z-to-A, I used the keyword DESCENDING before each variable that I want sorted in descending order.
I’ll run this step and use Explorer one last time to look at titanic2.
I see the survivors from oldest to youngest, and when I page down to observation 78, I see the non-survivors displayed from oldest to youngest.
In this demonstration, you have seen how to sort a SAS data set with Base SAS.
Thank you for your interest in SAS.