Ch 2 Hist And Poly

Let's see an example of a frequency distribution. The example: on a scale of one to five, how satisfied are you with your car? Just to clarify a bit, if a customer gives a score of one, this corresponds to "would never buy again." That's how dissatisfied they are with the car. It goes all the way up to five, which would be, let's say, "want to buy one for a friend," meaning you're so incredibly satisfied that you want to spread the love and you think everybody should have this car.
So let's say that a certain number of people respond. Let's define X to be the satisfaction score. If we're going to create the frequency distribution, the idea of a distribution is that you list all the acceptable values of the variable and then list how frequently they occur. Which values could X take on? X could take on the value one, two, three, four, or five. We don't allow partial satisfaction; if we did, we would give it a different scale, maybe one to ten or something.
So here are the five values, and for our frequencies, let's say that in our survey two people said that they were not satisfied (a score of one). Five people said that they were not dissatisfied but not that satisfied (a score of two). Three people said that they were somewhat satisfied (three). Six said that they were pretty satisfied and probably would buy another one (four). And 10 people said that they love the car so much that they would buy one for themselves again and they want to buy one for a friend (five).
What we've created here is called the frequency distribution. It shows all the values that this variable could take on: the satisfaction score could be 1, 2, 3, 4, or 5. And it's specifically a frequency distribution, not just some distribution; we may talk about others later.
The frequency distribution also shows how frequently somebody gave each score: two out of the total gave it a score of one, five out of the total gave it a score of two, and so forth. So if we want to know the total number of participants, I would call that N. Well, this is a sample, so I'm going to use lowercase n. The total number n is going to be equal to the sum of all the frequencies: two plus five, plus three, plus six, plus ten. Two and five is seven, plus three is ten, plus six is sixteen, plus ten is 26. So 26 people responded.
We can also create what's called the relative frequency. Let me put another vertical bar in the table here. If I take each f value over the sample size, that tells me the relative frequency: the portion, out of the whole, of people that agreed with statement one, statement two, and so on. So two out of 26 gave it a score of one. Five out of 26 gave it a score of two. Three out of 26 gave it a score of three. Six out of 26 gave it a score of four. And ten out of 26 gave it a score of five.
So what we have is this data set with the score, the frequency, and even the relative frequency. Two people: maybe that's a lot, maybe it's not, but if we compare it to the total number, then we can get a size comparison. Is two a lot of the people? When you compare two out of 26 versus ten out of 26, a much larger percentage of the people agree with statement five.
Okay, so this is a nice table. Often, though, we want to look at this information in graphical form, and we have a few choices available to us. We have the histogram or the polygon, and those two choices are based on the fact that we have quantitative data here.
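The table described above can be reproduced in a few lines of Python. This is only a sketch of the arithmetic from the lecture; the variable names (`freq`, `n`, `rel_freq`) are chosen here for illustration and are not part of the original material.

```python
# Frequencies from the survey example: satisfaction score -> number of responses.
freq = {1: 2, 2: 5, 3: 3, 4: 6, 5: 10}

# The sample size n is the sum of the frequencies.
n = sum(freq.values())

# Relative frequency is each frequency divided by the sample size (f / n).
rel_freq = {x: f / n for x, f in freq.items()}

# Print the three columns of the table: score, frequency, relative frequency.
print(f"n = {n}")
print(f"{'X':>2} {'f':>3} {'f/n':>6}")
for x, f in freq.items():
    print(f"{x:>2} {f:>3} {rel_freq[x]:>6.3f}")
```

The relative frequencies always add up to 1, which is a quick way to check that no response was dropped from the table.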
Right, so the variable X that we're looking at is measured on, well, kind of an ordinal scale, because we're ranking, but it's not first, second, third. We're putting a kind of interval value on it. If number one corresponds to a label, it's kind of like a nominal scale, but since it's a number, we're going to say that it's more like the interval scale. There's no absolute zero, so it's definitely not the ratio scale. So because it's a quantitative value, we can put it on a histogram or a polygon graph.
If it were categorical, say I'm looking at people's preferences in color and I asked, "Hey, what's your favorite color?" and two people said blue, five people said red, three people said yellow, six people said green, and ten people said orange, then the X values over here would be categories listed on a nominal scale, and we couldn't use these graphs. We'd have to use something called a bar chart. It's very similar to a histogram, but there are differences. The bar chart is definitely for the nominal scale, sometimes we use it for the ordinal scale, and that's it, though.
Okay, so let's take a look at creating a histogram and a polygon by hand, and I hope to show you enough of the difference between them that you can understand it. In our data set, for our X values we had 1, 2, 3, 4, and 5. For the frequencies we had 2, 5, 3, 6, and 10. I'm going to try really hard to make a good graph here; it might be tough. Let's draw a vertical line and a horizontal line. On the horizontal we're going to put the X values, and on the vertical we're going to put the frequencies. We're going to plot some points based on these ordered pairs: (1,2), (2,5), (3,3), (4,6), and (5,10). So along the horizontal we have 1, 2, 3, 4, 5, and on the vertical let's just go by twos: 2, 4, 6, 8, 10.
Okay, so here I'm going to do the polygon graph, and I'll do this in blue, or whatever color this is. It's not blue, is it? All right, this is going to work by just plotting the points. For the value one we had two people. For the value two we had five people, so we're up here. For the value three we had three people, so over there. For the value four we had six people, all the way up here. And for the value five we're all the way up here at ten. Polygon just means that shape: we're going to connect these points with straight lines. So connecting these dots: here we go, here we go, here we go, and up. We get a quick visual snapshot of what's going on, and this graph is called a polygon graph. It graphically represents the data in the table, here in yellow. I probably should put a title on this, so let's call this graph "Distribution of Satisfaction Scores."
What's nice about it is that just by looking at the graph we see that it's kind of got this upward trend, so it appears that more people are satisfied than not satisfied. Comparing the scores, more people agree with five than with any of the other ones. Three and one had the lowest frequencies, and one is definitely the lowest, so I would say overall people are satisfied with the product.
Okay, let's do this in green: we can also create what's called a histogram. By the way, I'm color blind, so I might be calling out colors that aren't real. I don't know. I think this is green. I think this is blue, but it could be purple, it could be turquoise, and I wouldn't be able to tell you.
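Going back to the polygon graph for a moment: the same picture can be drawn with software instead of by hand. The sketch below assumes the matplotlib library is installed, which the lecture itself does not use; the axis labels are my own wording, and the title is the one given to the graph above.

```python
import matplotlib
matplotlib.use("Agg")  # draw off-screen; remove this line to get an interactive window
import matplotlib.pyplot as plt

# Ordered pairs from the lecture: (score, frequency).
scores = [1, 2, 3, 4, 5]
freqs = [2, 5, 3, 6, 10]

fig, ax = plt.subplots()

# Plot a dot at each ordered pair and connect the dots with straight lines.
ax.plot(scores, freqs, marker="o")

# Label the axes the way the hand-drawn graph does: 1..5 across, by twos going up.
ax.set_xticks(scores)
ax.set_yticks(range(0, 12, 2))
ax.set_xlabel("Satisfaction score (X)")
ax.set_ylabel("Frequency (f)")
ax.set_title("Distribution of Satisfaction Scores")
```

To keep the picture, one option is `fig.savefig("polygon.png")`, where the file name is just an example.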
For the histogram, what we're going to do is very similar in the sense of creating a height for each frequency, but instead of plotting a dot, we're going to split the distance between our numbers. Between 0 and 1 we have 1/2, between one and two we have 1 1/2, and then 2 1/2, 3 1/2, 4 1/2, and 5 1/2. We're going to draw rectangles starting from one halfway mark to the next halfway mark, and the height of each rectangle is going to be the frequency. In other words, we want to center each value. For the value one, we're going to center it in the middle of a rectangle that has height two. So here's the rectangle. Then for the value two, we're again going to straddle it with a rectangle that's centered on two, and we're going to go up to a height of 5. Very poor artistic skills. For three, there are three scores, so we do this. For the value four, there are six. And for five, we go all the way up here to 10 scores. A lot of times we'll shade this in, so I'll just rough that shading in. This is how I would do it on paper anyway, so I'm just doing it by hand.
What's nice about the histogram, if X were a variable whose scale of measurement was the ratio scale, is that it puts in the real limits. So this lower bound here for one, at 1/2, is the lower real limit, and this 1 1/2 is the upper real limit. In other words, it shows how people would round. Say somebody is kind of split between a score of one and two and thinks, "You know, I'm not halfway between one and two, I'm a little bit closer to one, so I'm going to round down." If somebody was having a hard time ranking their satisfaction with the car and they're like, "I'm not sure if I'm a one or a two," well, some of them are going to round down.
Some are going to round up, and so the histogram captures that kind of approximate rounding issue. Not that we're actually talking about people rounding, because you have to choose one or two, but you could be in the middle and decide to go one way or the other. Having these lower and upper real limits allows us to capture that possibility. The other nice thing about a histogram is that it gives you a better sense of the overall shape of the graph, and we're going to talk about the shape of the graph in a later video corresponding with the material in Chapter three.
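The real limits of each histogram bar can also be worked out directly: each bar runs from half a unit below its score to half a unit above. Here is a minimal sketch in Python, assuming whole-number scores so that each bar is one unit wide. The function name `real_limits` is made up for this illustration, and the rows of `#` characters are only a rough text stand-in for the shaded rectangles.

```python
# Frequency distribution from the lecture: satisfaction score -> frequency.
freq = {1: 2, 2: 5, 3: 3, 4: 6, 5: 10}

def real_limits(score, width=1):
    """Return the (lower, upper) real limits of the bar centered on `score`."""
    half = width / 2
    return score - half, score + half

# Each bar spans its real limits and has a height equal to the frequency.
bars = [(real_limits(x), f) for x, f in freq.items()]

for (lower, upper), f in bars:
    # One '#' per response, so a longer row means a taller bar.
    print(f"{lower:>3} to {upper:<3} | {'#' * f} ({f})")
```

Because the upper real limit of one bar equals the lower real limit of the next, the bars touch with no gaps between them, which is one of the visible differences from a bar chart.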