What is Cpk?

Minitab is the leading global provider of software and services for quality improvement and statistics education. Quality. Analysis. Results. For more information, visit Minitab.com. This podcast is available at KeithBower.com.
Hello, today I'm going to talk about the capability index Cpk. Now, to begin with, let's consider Cpk as a capability index: we usually use Cpk alongside Cp. So what is Cp? Well, if you think of the formula for Cp, it's just the upper specification limit minus the lower specification limit, and you divide all of that by 6 standard deviations.
Why do we use Cpk then? Well, if you think of the formula for Cp... nowhere in that formula does it use the sample mean [xbar]. In other words, we do not take into account where this distribution is centered. For Cpk, on the other hand, you take the minimum of two values: (the upper specification limit minus the mean) divided by 3 standard deviations, and (the mean minus the lower specification limit) divided by 3 standard deviations. We take the smaller of those two values, and it's going to give us an idea of how capable this process is when we take into account where the distribution is centered. Of course, if Cp and Cpk are exactly the same value (or pretty darned close to each other) then that means the distribution is centered midway between the specification limits.
Notice that Cpk does not take into account a "target value" - it's just looking at whether or not you're centered midway between the specs. If you are interested in a specific target value, then you may want to look at a capability index like Cpm, and for a further discussion of Cpm and these other capability indices I'll point you to my website, which has some references that will hopefully be useful to you.
Now, what is it that we need to keep in mind when we look at a capability index, given that capability indices within the [statistical] community are highly controversial?
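Before turning to those caveats, the two formulas described above can be sketched in a few lines of Python. This is only an illustration: the function names, the measurements, and the spec limits are made up, and the sketch assumes the overall sample standard deviation is the one to use (statistical packages may estimate it differently, for example from within-subgroup variation).

```python
from statistics import mean, stdev

def cp(data, lsl, usl):
    """Cp: spec width divided by 6 sample standard deviations (ignores the mean)."""
    return (usl - lsl) / (6 * stdev(data))

def cpk(data, lsl, usl):
    """Cpk: the smaller of the two one-sided ratios, so centering matters."""
    m, s = mean(data), stdev(data)
    upper = (usl - m) / (3 * s)  # room between the mean and the upper spec
    lower = (m - lsl) / (3 * s)  # room between the mean and the lower spec
    return min(upper, lower)

# Hypothetical measurements and spec limits, for illustration only.
measurements = [9.8, 10.1, 10.0, 10.3, 9.9, 10.2, 10.1, 10.0, 9.7, 10.4]
LSL, USL = 9.0, 11.0

print(f"Cp  = {cp(measurements, LSL, USL):.2f}")
print(f"Cpk = {cpk(measurements, LSL, USL):.2f}")
```

When the sample mean sits exactly midway between the limits, the two functions return the same value; as the mean moves toward either limit, Cpk drops below Cp.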
Well, if the process itself is not in statistical control then we can't really put our hand on our heart and say whether or not we believe that this mean and standard deviation are valid estimates of the (true) parameters - the population mean and standard deviation. They could be utterly erroneous. Think about it this way: let's say that the process is drifting downward over time, and you just take the mean of all of that. The estimate of the mean, at the end, is going to be misleading because, heck, the process is down here now! So the process needs to be in statistical control. As W. Edwards Deming once said: "A process has no measurable capability unless it is in statistical control," and I completely agree with that.
A second assumption that we have in place for Cpk is that the distribution is well modeled by a Normal distribution. If it is not - if another distribution (let's say a Weibull distribution) is a more adequate fit to this stable process - then by incorrectly assuming a Normal distribution you could be miles off the mark as to the actual proportion falling outside of these spec limits. So you need to assess the assumption of Normality for the underlying distribution that you're looking at.
The third consideration is that when we are doing a capability analysis we need to think about the amount of data being employed in the study. If somebody has a Cpk of, let's say, 1.33 and somebody else has a Cpk of 1.5, but the estimate of 1.5 comes from only 10 data points, whereas the 1.33 comes from 10,000 data points, I'm going to have a lot more assurance in the validity of the estimate of 1.33 (with more data) than in the one based on 10 data points. Therefore, I think it's very important for us to consider confidence intervals for these capability indices.
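To put rough numbers on that comparison, here is a minimal sketch of an approximate confidence interval for Cpk. It is not from the podcast: it assumes a commonly quoted large-sample Normal approximation (often attributed to Bissell), the function name is hypothetical, and a real statistical package may use a different method. With only 10 data points the approximation itself is crude, so treat the output as indicative only.

```python
from math import sqrt
from statistics import NormalDist

def cpk_confidence_interval(cpk_hat, n, confidence=0.95):
    """Approximate two-sided interval for Cpk from a point estimate and sample size.

    Assumes: half-width = z * sqrt(1/(9n) + cpk_hat^2 / (2(n - 1))),
    a Normal-theory approximation that also relies on a stable, Normal process.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sqrt(1 / (9 * n) + cpk_hat ** 2 / (2 * (n - 1)))
    return cpk_hat - half_width, cpk_hat + half_width

# The two cases from the discussion: 1.5 from 10 points vs. 1.33 from 10,000.
for cpk_hat, n in [(1.5, 10), (1.33, 10_000)]:
    low, high = cpk_confidence_interval(cpk_hat, n)
    print(f"Cpk = {cpk_hat}, n = {n}: approx. 95% CI ({low:.2f}, {high:.2f})")
```

Under this approximation the interval around the 10-point estimate is far wider than the one around the 10,000-point estimate, and its lower end falls well below 1.33.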
Any good statistical software package will offer confidence intervals as an option (they may even be printed out automatically), but I think it's very important for you to consider the amount of data being used for these estimates. Otherwise the values that you get for Cp and Cpk... these are just point estimates... that's all they are. With confidence intervals, you're going to have a better idea of what the true capability - best case and worst case - really is.
So I hope this has been useful. Cpk is widely used within the Quality community, but we need to keep in mind the assumptions so that we get a worthwhile estimate of this process capability index. Of course, if you've got any questions on this or anything else, please feel free to email them to me through my website, KeithBower.com.
For more information on statistical methods for quality improvement, visit KeithBower.com.