Let's see an example of a frequency distribution.
Example, okay,
on a scale
of one to five,
how satisfied
are you
with your
car?
Okay, and just to clarify a bit, if
a customer gives a score of one,
this corresponds to "would never buy again."
Never buy again, that's how
dissatisfied, unsatisfied they are with the car. All the way up to five,
which would be, um, let's say, "want to buy
one for a friend."
Okay, meaning you're so incredibly satisfied that
you want to spread the love and you think everybody should have this car.
Want to buy one for a
friend, okay.
So let's say
that we have a certain number of people respond and so
maybe out of that set of people that respond. Let's define X to be
satisfaction score. Satisfaction score.
So let's think about this. We're going to create the frequency distribution, and
the idea of a distribution is that you give all the acceptable values
of a variable and then list how frequently they occur.
Um, X could take on which values?
X could take on the value one,
two, three, four, or five.
Right, we don't allow partial satisfaction. If we did, we would give it
a different scale, maybe one to ten or something. So here are the five values and
for our frequency, let's say, that we had
in our survey we had two people say that they were not satisfied.
We had five people say that they were very,
uh, not dissatisfied but not that satisfied. Three people,
three people said that they were somewhat satisfied.
Um, let's say that six said that they were pretty satisfied,
probably would buy another one. And let's say that 10 people said that they love
the car so much
that they would yes, buy one for themselves again and they want to buy
one for a friend.
Okay, so what we've created here is called the frequency distribution.
Okay, the frequency distribution shows all the values that this
variable could take on. So the satisfaction score could be a value of 1, 2, 3, 4, or 5.
And since it's a frequency distribution, not just some distribution (we'll talk about
others later, maybe),
it also shows how frequently somebody
gave each score. So two out of the total amount gave it a score of one,
five out of the total amount gave it a score of two, and so forth.
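As a rough sketch of the tallying idea described above, the frequency distribution could be built in Python as follows. The `responses` list is an assumption: the lecture only gives the counts, so the raw answers are reconstructed to match them, and the function name `frequency_distribution` is just illustrative.

```python
from collections import Counter

# Assumed raw survey answers, reconstructed to match the counts in the lecture:
# two 1s, five 2s, three 3s, six 4s, and ten 5s.
responses = [1] * 2 + [2] * 5 + [3] * 3 + [4] * 6 + [5] * 10


def frequency_distribution(data, values):
    """List every acceptable value of the variable with how often it occurs."""
    counts = Counter(data)
    # Counter returns 0 for a value nobody picked, so every value gets a row.
    return {x: counts[x] for x in values}


freq = frequency_distribution(responses, range(1, 6))

print("X  f")
for x, f in freq.items():
    print(f"{x}  {f}")
```

Passing the acceptable values explicitly keeps a score in the table even if nobody chose it, which matches the idea that the distribution lists all the values the variable could take on.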
Okay, so if we look at this,
if we want to know the total number of participants.
I would say the total number, N. Well this is a sample so I'm going to use
lower case n, for sample.
The total number, and I want this color,
n is going to be equal to the sum of all these values.
Okay, so the sum of the frequencies. So
two plus five, plus three, plus
six, plus ten. Uh, two and five is seven,
and three is ten, and six is sixteen, and ten is 26. So, 26 people
responded. Okay, and
we can create what's called the relative frequency.
Alright, let me put another vertical bar in here, and so if I took
that F value over the sample
size, that would tell me the relative frequency.
So the proportion, out of the whole, of people that agreed with statement
one, statement two, so on and so forth. So for this one
we would have two out of 26
gave it a score of one. Two out of 26.
Five out of 26 gave it a score of two.
Three out of 26 gave it a score of three. Six out of 26 gave it a score of four. And ten out of
26 gave it a score of five. So what we're going to do is look at
this data, let's see how well I can circle that.
Okay, this data, this data set where you have the scores, the frequencies, even the
relative frequencies.
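To make the arithmetic concrete, here is a minimal Python sketch of the sample size and the relative frequency column. The variable names (`frequency`, `n`, `relative`) are illustrative choices, not something from the lecture; the numbers are the ones in the table.

```python
from fractions import Fraction

# Frequencies from the lecture's table: score -> number of people.
frequency = {1: 2, 2: 5, 3: 3, 4: 6, 5: 10}

# Sample size: the sum of all the frequencies.
n = sum(frequency.values())

# Relative frequency: each frequency divided by the sample size.
# Fraction keeps the value exact (it stores 2/26 in lowest terms as 1/13).
relative = {x: Fraction(f, n) for x, f in frequency.items()}

print(f"n = {n}")
print("X  f   f/n    decimal")
for x, f in frequency.items():
    print(f"{x}  {f:<3} {f}/{n:<4} {f / n:.3f}")
```

Because every response lands in exactly one row, the relative frequencies should add up to 1, which is a quick check that the table was filled in correctly.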
So two maybe is a lot, maybe it's not, but if we compare it to the total number then
we can get a
size comparison. Hey, is that a lot of the people, two?
Well, you can compare it to, say, two out of 26 versus
ten out of 26. That's a much larger percentage of the people
that agree with statement five. Okay, so
this is a nice table. Um,
often though we want to look at this information in a graphical form.
Okay, so we have a few choices available to us. We have the histogram or the polygon,
and those two choices are based on the fact
that we have quantitative data here. Right, so
the values of X, the variable that we're looking at,
those are measured on,
well, an ordinal scale,
kind of, because we're ranking, but it's not first, second, third.
Okay, we're just putting a, kind of, interval value to it. We're saying that
this, the number one, corresponds to a statement. It's kind of a nominal scale,
but since it's a number we're gonna say that it's more like the
interval scale. There's no absolute zero, so it's definitely not the ratio scale.
So because it's a quantitative value
we can put it on a histogram.
Histogram, or a polygon graph.
Okay, if it was categorical, so maybe I'm looking at colors and people's
preferences in color. So
if I asked, hey, what's your favorite color? And, say, two people said blue,
five people said red, three people said yellow, six people said green, and ten
people said
orange. Then
X over here, these would be
values of a categorical variable, listed out on a nominal scale, and we
couldn't use these graphs. We'd have to use something called
a bar chart. Okay, it's very similar to a histogram but there are differences, so
this is
definitely for the nominal scale, sometimes
we use it for the ordinal scale,
and that's it though. Okay, so let's take a look at creating a
histogram
by hand and a polygon and I hope to give you enough difference between them so you
can understand.
Okay, so our dataset we had, for our X values, we had
1,2,3,4, and 5.
For the frequencies we had 2,
5, 3, 6, and 10.
Alright, so I'm going to try really hard to make a good graph here,
it might be tough. This line, and let's do it
white, let's get a vertical line and a horizontal line.
Alright, so on the horizontal we are going to use,
put the X values and on the vertical we're gonna put the frequencies.
Okay so the ordered pairs, we're going to plot some points based on these ordered
pairs. So this is the ordered pair (1,2),
(2,5), (3,3),
(4,6), and (5,10).
Oops, 10.
Okay, so up here at 1,
2, 3, 4,
5 and on the vertical, let's just go by
2. 2, 4, 6,
8, 10. And that's 10,
and that's 8, 6, 2,
1, 2, 3, 4,
5. Okay, so here we're going to do, and I'll do this in blue,
whatever color this is. I'm gonna do the polygon graph,
it's not blue, is it?
Alright, so this is going to work by just plotting the points. So we have,
two people said the value one. For the value two we have five people, so we're
up here.
For the value three we had three people,
so over there. Value four we had six people, all the way up here.
And for the value five we had all the way up here at ten.
Okay, so a polygon
meaning just that shape. We're going to connect these with
lines. Let's actually do it with a straight line
and keep that color. So connecting these dots, here we go
here we go, here we go, and
up. So we get a quick
kinda visual snapshot of what's going on and this graph here
is called a polygon graph. It graphically represents this data in
yellow.
Alright so I probably should put a title on this,
let's call this graph the distribution of,
so this will be satisfaction scores.
Okay, and it's a visual interpretation of this data set in yellow.
Okay, what's nice about it is that, just by looking at
this graph, it quickly tells us something.
We see that it's kinda got this upward trend, and so what we're seeing is that it
appears that more people are satisfied than not satisfied.
And just comparing the scores here, look at five.
There's more people that agree with five than any of the other ones.
Three and one had the lowest frequencies. One is definitely the lowest, so I
would say overall people are satisfied with the product.
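The polygon graph boils down to a list of ordered pairs joined by straight segments, so a small Python sketch can stand in for the drawing. The names `points` and `segments` are illustrative assumptions; the data are the lecture's X values and frequencies.

```python
# X values and frequencies from the lecture's data set.
x_values = [1, 2, 3, 4, 5]
frequencies = [2, 5, 3, 6, 10]

# Each dot on the polygon graph is an ordered pair (score, frequency).
points = list(zip(x_values, frequencies))

# The polygon connects each dot to the next one with a straight line.
segments = list(zip(points, points[1:]))

# Reading the graph: the highest and lowest dots.
most_common = max(points, key=lambda p: p[1])
least_common = min(points, key=lambda p: p[1])

print("points:", points)
for start, end in segments:
    change = end[1] - start[1]
    print(f"{start} -> {end}: frequency changes by {change:+d}")
print("highest point:", most_common)
print("lowest point:", least_common)
```

The same `points` list could be handed to a plotting library to draw the graph; that step is left out here to keep the sketch free of extra dependencies.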
Okay, we can also create, let's do this in
green, we can also create what's called a histogram.
By the way, I'm color blind so I might be calling out colors that
aren't real. I don't know. I think this is green. I think this is blue but
it could be purple, it could be turquoise and I wouldn't be able to tell you that.
Anyway, so okay, histogram. Histogram, what we're going to do is very similar in the
sense of
creating a kind of height statement for the frequency, but
instead of having a dot what we're going to do is
split the distance between our numbers. So between one and two
we have 1 1/2, between 0 and 1 we have 1/2.
2 1/2, 3 1/2, 4 1/2, and 5 1/2.
What we're going to do is, we're going to draw rectangles
starting from one halfway mark to the next halfway mark,
and the height of that rectangle is going to be the frequency.
Okay, so in other words, we want to center the value. In
this case, one, we're gonna center that in the middle of this rectangle that has height
two. So here's the rectangle, and then for the value two we're going to straddle
again, a rectangle that's centered on two. We're going to go up to a height of 5.
Very poor artistic skills.
For three, there are three scores and so,
we're going to do this, okay.
And then for the value four, there are six. Okay,
I put that in,
and down, and for 5, there are, all the way up here,
10 scores. Alright, so
a lot of times we'll shade this in, so I'll just rough that
shading in. This is how I would do it on paper anyway, so I'm just getting it by
hand.
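Here is a minimal Python sketch of the rectangles just drawn, assuming each one runs from the halfway mark below its score to the halfway mark above it. The function name `histogram_bars` and the text rendering with `#` characters are illustrative choices (the text bars run sideways rather than upward).

```python
# Frequencies from the lecture's table: score -> number of people.
frequency = {1: 2, 2: 5, 3: 3, 4: 6, 5: 10}


def histogram_bars(freq, half_width=0.5):
    """Return (left edge, right edge, height) for the rectangle centered on each score."""
    return [(x - half_width, x + half_width, f) for x, f in sorted(freq.items())]


bars = histogram_bars(frequency)

# Rough text version of the histogram: one row per rectangle.
for left, right, height in bars:
    print(f"{left} to {right} | " + "#" * height)
```

Each rectangle's right edge is the next rectangle's left edge, so the bars touch with no gaps between them.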
Alright,
what's nice about the histogram is that if this were
truly a value, so if X
was a variable that was coming from the ratio scale,
if the scale of measurement was the ratio, what the histogram will do is put in
the real limits.
So this lower bound here for one, for example, is the lower real limit. And this
1 1/2 is the upper real limit.
Okay, so in other words this shows how people would round. So
if somebody was kind of split between score 1 and 2
and it's like, you know, I'm not half way
between one and two, I'm a little bit closer to one, so I'm going to round down. Okay, so if somebody was,
you know, having a hard time ranking their satisfaction
with the car and they're like, I'm not sure if I'm a one or two.
Well, some of them are going to round down. Some are going to round up, and so the histogram
captures that kind of
approximate rounding issue.
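The rounding picture can be sketched in Python under a loud assumption: that behind each answer there is some hypothetical in-between satisfaction level (here called `feeling`) that gets rounded to the nearest score. Nothing like that is actually measured in the survey; the function names and the rule for a value sitting exactly on a halfway mark (it goes up) are illustrative choices.

```python
import math


def real_limits(score, half_width=0.5):
    """Return (lower real limit, upper real limit) for a whole-number score."""
    return (score - half_width, score + half_width)


def score_for(feeling, lowest=1, highest=5):
    """Round a hypothetical in-between satisfaction level to a score on the scale.

    Assumption: a value exactly on a halfway mark rounds up, and anything
    outside the scale is pulled back to the nearest end of it.
    """
    rounded = math.floor(feeling + 0.5)
    return min(highest, max(lowest, rounded))


print(real_limits(1))   # limits of the rectangle centered on 1
print(score_for(1.3))   # a little closer to one, so rounds down
print(score_for(1.7))   # a little closer to two, so rounds up
```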
Not that we're actually talking about people rounding, because you have to choose one or two, but
you could be in the middle and you
kind of decide to go one way or the other. And so having these lower and upper real
limits allows us to kind of
capture that possibility. Okay, the other nice thing about
a histogram is it gives you a better sense of the overall
shape of the graph, and we're gonna talk about shape of the graph in a later
video
corresponding with the material in Chapter three.