Flowing Requirements from the VoC or VoP

In a previous post, I talked about the voice of the customer (VoC), voice of the process (VoP) and the necessity of combining the two when specifying a product. Here, I’d like to offer a general method for applying this in the real world, which can be implemented as a template in Excel.

Recap

I showed that there was a cost function associated with any specification that derived from both the VoC (expressed as tolerances or specification limits) and from the process capability. An example cost function for a two-sided tolerance is reproduced below.

Percent of target production costs given an average production weight and four different process capabilities.

I argued that, given this cost function, specifying a product requires specifying both the product specification limits (or tolerances) and the minimally acceptable process capability, Cpk. Ideally, both of these should flow down from a customer needs analysis to the finished product, and from the finished product to the components, and so on to materials.

Requirements flow down and up

To flow all requirements down like this, we would need to know the transfer functions, $Y = f(X)$ , for each requirement Y and each subcomponent characteristic X. There are methods for doing this, like Design for X or QFD, but they can be difficult to implement. In the real world, we don’t always know these transfer functions, and determining them can require non-trivial research projects that are best left to academia.

As an illustration, we will use the design of a battery (somewhat simplified), where we have to meet a minimum requirement that is the sum of component parts. The illustration below shows the component parts of a battery, or cell. It includes a container (or “cell wall”), positive and negative electrodes (or positive and negative “plates”), electrolyte and terminals that provide electrical connection to the outside world. Usually, we prefer lighter batteries to heavier ones, but for this example, we’ll suppose that a customer requires a minimum weight. This requirement naturally places limits on the weight of all components.

In the absence of transfer functions, we often make our best guess, build a few prototypes, and then adjust the design. This may take several iterations. A better approach is to estimate the weight specification limits and minimum Cpk by calculation before any cells are actually built.

General drawing of the structure of an aircraft battery’s vented-type NiCd cell. Ransu. Wikipedia, [http://en.wikipedia.org/wiki/File:Aircraft_battery_cell.gif]. Accessed 2014-04-04.

Suppose the customer specifies a cell minimum weight of 100 kg. From similar designs, we know the components that contribute to the cell mass and have an idea of the percentage of total weight that each component contributes.

$m_{cell}=m_{container}+m_{terminals}+m_{electrolyte}+m_{poselect}+m_{negelect}$

Each individual component is therefore a fraction $f_{m}$ of the total cell mass, e.g.

$m_{container}=f_{m,container}m_{cell}$

More generally, for a measurable characteristic c, component i has an expected mean or target value of $T_{i,c}=f_{i,c}\mu_{parent,c}$ or $T_{i,c}=f_{i,c}T_{parent,c}$ .

In our example, we may know from similar products or from design considerations that we want to target the following percentages for each fraction $f_{m}$:

5% for container
19% for terminals
24% for electrolyte
26% for positive electrodes
26% for negative electrodes

Specification Limits

Upper Specification Limit (USL): The maximum allowed value of the characteristic. Also referred to as the upper tolerance.
Lower Specification Limit (LSL): The minimum allowed value of the characteristic. Also referred to as the lower tolerance.

Since the customer will always want to pay as little as possible, a specified lower weight of 100 kg is equivalent to saying that they are only willing to pay for 100 kg of material; any extra material is added cost that reduces our profit margin. If we tried to charge them for 150 kg of material, they would go buy from our competitors. The lower specification limit, or lower tolerance, of the cell weight is then 100 kg.

If the customer does not specify a maximum weight, or upper specification limit, then we determine the upper limit by the maximum extra material cost that we are willing to bear. In this example, we decide that we are willing to absorb up to 5% additional cost per part. Assuming that material and construction contribute 50% of the total cell cost, a 5% increase in total cost corresponds to a 10% increase in material, so the USL is 110 kg. To allow for some variation, we can set a target weight in the middle: 105 kg. From data on previous designs and the design goals, we can apportion the target weight to each component of the design, as shown in the table below.

We can apply the same fractions to the cell USL and LSL to obtain a USL and LSL for each component. As long as parts are built within these limits, the cell will be within specification. The resulting specification for the cell and its major subcomponents is shown in the table below. Further refinement of the allocation of USL and LSL to the components is possible and may be needed if the limits do not make sense from a production or cost perspective.

Part	Percent	Target /kg	LSL /kg	USL /kg
Cell	100%	105	100	110
Container	5%	5.2	5	5.5
Terminals	19%	19.9	19	20.9
Electrolyte	24%	25.2	24	26.4
Positive electrodes	26%	27.3	26	28.6
Negative electrodes	26%	27.3	26	28.6
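
The apportionment in the table can be reproduced with a few lines of R. This is a minimal sketch rather than part of the Excel template; the names `fractions`, `cell` and `spec` are chosen here purely for illustration.

```r
# Mass fractions for each component (from the list of target percents)
fractions <- c(container = 0.05, terminals = 0.19, electrolyte = 0.24,
               positive_electrodes = 0.26, negative_electrodes = 0.26)

# Cell-level specification in kg
cell <- c(target = 105, lsl = 100, usl = 110)

# Each component value is the parent value scaled by the component fraction
spec <- data.frame(
  part    = names(fractions),
  percent = 100 * fractions,
  target  = fractions * cell[["target"]],
  lsl     = fractions * cell[["lsl"]],
  usl     = fractions * cell[["usl"]],
  row.names = NULL
)
print(spec, digits = 3)
```

The values printed here are unrounded (the container target is 5.25 kg, for instance), whereas the table shows them to one decimal place.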

Variance of components and Cpk

When a characteristic is due to the sum of the part’s components, as with cell mass, the part-to-part variation in the characteristic is likewise due to the variation in the components. However, where the characteristic adds as the sum of the components,

$m_{cell}=m_{container}+m_{terminals}+m_{electrolyte}+m_{poselect}+m_{negelect}$

the variance, $\sigma^{2}$, adds as the sum of squares (assuming the component variations are independent of one another)

$\sigma_{cell}^{2}=\sigma_{container}^{2}+\sigma_{terminal}^{2}+\sigma_{electrolyte}^{2}+\sigma_{poselect}^{2}+\sigma_{negelect}^{2}$

The variance of any individual component is therefore a function of the total parent part variance,

$\sigma_{container}^{2}=\sigma_{cell}^{2}-\sigma_{terminal}^{2}-\sigma_{electrolyte}^{2}-\sigma_{poselect}^{2}-\sigma_{negelect}^{2}$

which we can express as a fraction $f_{\sigma}$ of the parent variance:

$\displaystyle \sigma_{container,mass}^{2}=f_{\sigma,container}\sigma_{cell,mass}^{2}$

Since this is true for all components, the two fractions $f_{m}$ and $f_{\sigma}$ will be approximately equal. Therefore, if we don’t know the fractions $f_{\sigma}$ , we can use the fractions $f_{m}$ , which are usually easier to work out, to allocate the variance to each component:

$\displaystyle \sigma_{container,mass}^{2}=f_{m,container}\times\sigma_{cell,mass}^{2}$

More generally, for measurable characteristic $c$ of a subcomponent $i$ of a parent component,

$\displaystyle \sigma_{i,c}=\sqrt{f_{i,c}}\:\sigma_{parent,c}$

Since the given $\sigma$ is the maximum allowed for the parent to meet the desired Cpk, this means that $\sigma_{i}^{2}$ is an estimate for the maximum allowed component variance. Manufacturing can produce parts better than this specification, but any greater variance will drive the parent part out of specification.
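
As a rough illustration of this allocation, the following R sketch applies $\sigma_{i}=\sqrt{f_{i}}\,\sigma_{parent}$ to the battery fractions and confirms that the component variances add back up to the parent variance. The parent standard deviation of 1 kg is an assumed value chosen only to make the arithmetic easy to follow.

```r
# Mass fractions, used here in place of the unknown variance fractions
f_m <- c(container = 0.05, terminals = 0.19, electrolyte = 0.24,
         positive_electrodes = 0.26, negative_electrodes = 0.26)

# Assumed maximum allowed standard deviation of the cell mass, in kg
sigma_cell <- 1

# Maximum allowed standard deviation of each component
sigma_i <- sqrt(f_m) * sigma_cell
round(sigma_i, 3)

# The component variances sum to the cell variance
all.equal(sum(sigma_i^2), sigma_cell^2)
```

Note that the standard deviations themselves do not sum to the parent value; only the variances do.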

Calculating Specification Limits

In general, there are two conflicting goals in setting specifications:

Make them as wide as possible to allow for manufacturing variation while still meeting the VoC.
Make them as narrow as possible to stay near the minimum of the cost function.

For this, Crystal Ball or iGrafx are very useful tools during development, as we can simulate a set of parts or processes, analyze the allowed variation in the product and easily flow that variation down to each component. In the absence of these tools, Minitab or Excel can be used to derive slightly less robust solutions.

Calculating from Customer Requirements

1. Identify any customer requirements and set specification limits (USL and LSL) accordingly. If the customer requirements are one-sided, determine the maximum additional cost we are willing to accept, and set the other specification limit accordingly. Some approximation of costs may be needed.
2. If no target is given, set the target specification for each requirement as the average of USL and LSL.
3. Set the minimum acceptable Cpk for each specification. Cpk = 1.67 is a good starting value. Use customer requirements for Cpk, where appropriate, and consider, also, whether the application requires a higher Cpk (weakest link in the chain…).
4. Calculate the maximum allowed standard deviation to meet the Cpk requirement as $\sigma_{parent}=\left(USL-LSL\right)/\left(6\times Cpk\right)$ .
5. For each subcomponent (e.g. the cell has subcomponents of container, electrodes, electrolyte, and so on), apportion the target specification to each of the subcomponents based on engineering considerations and judgement. If the fractions $f$ are known, $T_{i}=f_{i}\times T_{parent}$ .
6. Calculate the fraction $f_{i}$ (or percent) of the parent total for each subcomponent if not already established in step (5).
7. Calculate the USL and LSL for each subcomponent by multiplying the parent USL and LSL by the component’s fraction of parent (from step 6). $USL_{i}=f_{i}\times USL_{parent}$ and $LSL_{i}=f_{i}\times LSL_{parent}$ .
8. Estimate the allowed standard deviation $\sigma_{i}$ for each subcomponent as
$\displaystyle \sigma_{i}=\mathtt{SQRT}\left(f_{i}\right)\times\sigma_{parent}.$
9. Calculate the minimum allowed Cpk for each subcomponent from the results of (5), (7) and (8), using the target, $T$ , for the mean, $\mu$ .
$\displaystyle Cpk_{i}=minimum\begin{cases}\frac{USL_{i}-T_{i}}{3\sigma_{i}}\\\frac{T_{i}-LSL_{i}}{3\sigma_{i}}\end{cases}$
10. Repeat steps (5) through (9) until all components have been specified.
11. For each component, report the specified USL, LSL, target T and minimum Cpk.
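
The calculation steps above can be sketched as a single R function. This is only an illustration under the assumptions already made in this post (known fractions, a target midway between the limits); the function name `flow_down` and its argument names are invented here and are not part of any package or of the Excel template.

```r
# Flow a parent specification down to its subcomponents (steps 4 to 9).
# `fractions` is a named vector giving each subcomponent's share of the parent.
flow_down <- function(usl_parent, lsl_parent, fractions, cpk_parent = 1.67,
                      target_parent = (usl_parent + lsl_parent) / 2) {
  # Step 4: maximum allowed parent standard deviation
  sigma_parent <- (usl_parent - lsl_parent) / (6 * cpk_parent)
  # Steps 5 to 7: apportion target and limits by fraction
  target <- fractions * target_parent
  usl    <- fractions * usl_parent
  lsl    <- fractions * lsl_parent
  # Step 8: allowed standard deviation of each subcomponent
  sigma  <- sqrt(fractions) * sigma_parent
  # Step 9: minimum Cpk, using the target as the mean
  cpk    <- pmin((usl - target) / (3 * sigma), (target - lsl) / (3 * sigma))
  data.frame(part = names(fractions), target, lsl, usl, sigma, cpk,
             row.names = NULL)
}

fractions <- c(container = 0.05, terminals = 0.19, electrolyte = 0.24,
               positive_electrodes = 0.26, negative_electrodes = 0.26)
battery_spec <- flow_down(usl_parent = 110, lsl_parent = 100,
                          fractions = fractions)
print(battery_spec, digits = 3)
```

With these inputs the container's minimum Cpk works out to about 0.37. Because the limits scale with $f_{i}$ while the allowed standard deviation scales with $\sqrt{f_{i}}$, each component Cpk equals $\sqrt{f_{i}}$ times the parent Cpk when the target is centred.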

Calculating from Process Data

When there is no clear customer-driven requirement or clear requirement from parent parts (e.g. dimensional specifications that can be driven by the fit of parts), but specification limits are still reasonably needed, we can start from existing process data.

This is undesirable because any change to the process can force a change to the product specification, without any clear understanding of the impact on customer needs or requirements; the VoC is lost.

The calculation of USL and LSL from process data is also somewhat more complicated, as we have to use the population mean and standard deviation to determine where to set the USL and LSL, without really knowing what that mean and standard deviation are.

In the real world, we have to live with such constraints. To deal with these limitations, we will use as much data as is available and calculate the confidence intervals on both the mean and the standard deviation. The calculation for USL and LSL becomes

$\setlength\arraycolsep{2pt}\begin{array}{rl}\displaystyle USL &=\textrm{upper 95\% confidence on the mean}\smallskip\\ \displaystyle &\quad +k\times\textrm{upper 95\% confidence on the standard deviation}\end{array}$
$\setlength\arraycolsep{2pt}\begin{array}{rl}\displaystyle LSL &=\textrm{lower 95\% confidence on the mean}\smallskip\\ &\quad -k\times\textrm{upper 95\% confidence on the standard deviation}\end{array}$

where $k$ is the number of process Sigmas desired, based on the tolerance cost function. Most of the time, we will use $k=5$ , to achieve a Cpk of 1.67.

We always use the upper 95% confidence limit on the standard deviation. We don’t care about the lower confidence limit here, since a small $\sigma$ will not help us in setting specification limits.

1. Calculate the mean ( $\mu_{parent}$ ) from recent production data. In Excel, use the AVERAGE() function on the data range.
2. Calculate the standard deviation ( $\sigma_{parent}$ ) from recent production data. In Excel, you can use the STDEV() function on the data range.
   If the order of production data is known, or SPC is in use, a better method is to use the range-based estimate from the control charts. This will be discussed in subsequent training on control charts.
3. Count the number of data points, $n$ , that were used for the calculations in steps (1) and (2). You can use the COUNT() function on the data range.
4. Calculate the half-width, $CL$ , of the 95% confidence interval on the mean. In Excel, this is accomplished with
$CL=\mathtt{TINV}\left(\left(1-0.95\right);n-1\right)\times\sigma_{parent}/\mathtt{SQRT}\left(n\right)$

In Excel 2010 and later, TINV() should be replaced with T.INV.2T().
5. Calculate the 95% confidence interval on the mean as $CI_{upper}=\mu+CL$ and $CI_{lower}=\mu-CL$ .
6. Calculate the upper and lower 95% confidence limits on the standard deviation. CHIINV() takes a right-tail probability, so the upper limit on $\sigma$ comes from the smaller chi-squared value. In Excel, this is accomplished with
$\sigma_{upper}=\sigma_{parent}\times\mathtt{SQRT}\left(\left(n-1\right)/\mathtt{CHIINV}\left(1-\left(1-0.95\right)/2;n-1\right)\right)$

and

$\sigma_{lower}=\sigma_{parent}\times\mathtt{SQRT}\left(\left(n-1\right)/\mathtt{CHIINV}\left(\left(1-0.95\right)/2;n-1\right)\right)$

In Excel 2010 and later, CHIINV() can be replaced with CHISQ.INV.RT() for improved accuracy.
7. Calculate the LSL as $LSL_{parent}=CI_{lower}-k\sigma_{upper}$ . You might use a value of $k$ other than 5 if the customer requirements or application require a higher process Sigma.
8. Calculate the USL as $USL_{parent}=CI_{upper}+k\sigma_{upper}$ .
9. For each subcomponent (e.g. the cell has subcomponents of positive electrode, negative electrode, electrolyte, and so on), apportion the parent part mean to each of the subcomponents based on engineering considerations and judgement. If the fractions $f$ are known, $T_{i}=f_{i}\times\mu_{parent}$ .
10. If the fraction (or percent) $f_{i}$ of the parent total for each subcomponent is not known, calculate it using the results of step (9).
11. Calculate the USL and LSL for each subcomponent by multiplying the parent USL and LSL by the component’s fraction of parent (from step 10). $USL_{i}=f_{i}\times USL_{parent}$ and $LSL_{i}=f_{i}\times LSL_{parent}$ .
12. Estimate the allowed standard deviation $\sigma_{i}$ for each subcomponent as $\sigma_{i}=\mathtt{SQRT}\left(f_{i}\right)\times\sigma_{lower}$
13. Calculate the minimum allowed Cpk for each subcomponent from the results of (9), (11) and (12), using the target $T_{i}$ for the mean, $\mu_{i}$ .
$\displaystyle Cpk_{i}=minimum\begin{cases}\frac{USL_{i}-T_{i}}{3\sigma_{i}}\\\frac{T_{i}-LSL_{i}}{3\sigma_{i}}\end{cases}$
14. Repeat steps (9) through (13) until all components have been specified.
15. For each component, report the specified USL, LSL, target T and minimum Cpk.
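
For readers working in R rather than Excel, the parent-level part of this procedure (steps 1 to 8) might look like the sketch below. The function name `limits_from_process` is made up for illustration, and the data are simulated rather than real production measurements. R's qt() and qchisq() work with lower-tail probabilities, so the larger limit on the standard deviation comes from the smaller chi-squared quantile.

```r
# Specification limits from process data (steps 1 to 8).
# `x` is a numeric vector of recent production measurements.
limits_from_process <- function(x, k = 5, conf = 0.95) {
  mu    <- mean(x)      # step 1
  sigma <- sd(x)        # step 2
  n     <- length(x)    # step 3
  alpha <- 1 - conf
  # Steps 4 and 5: two-sided confidence interval on the mean
  cl       <- qt(1 - alpha / 2, df = n - 1) * sigma / sqrt(n)
  ci_lower <- mu - cl
  ci_upper <- mu + cl
  # Step 6: confidence limits on the standard deviation
  sigma_upper <- sigma * sqrt((n - 1) / qchisq(alpha / 2, df = n - 1))
  sigma_lower <- sigma * sqrt((n - 1) / qchisq(1 - alpha / 2, df = n - 1))
  # Steps 7 and 8: specification limits
  c(mean = mu, sd = sigma, n = n,
    sigma_lower = sigma_lower, sigma_upper = sigma_upper,
    lsl = ci_lower - k * sigma_upper, usl = ci_upper + k * sigma_upper)
}

# Simulated stand-in for production data: 30 cell weights in kg
set.seed(1)
x <- rnorm(30, mean = 105, sd = 1)
res <- limits_from_process(x)
round(res, 2)
```

The returned `sigma_lower` can then be passed on to the subcomponent allocation in step 12.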

Can We do Better than R-squared?

If you're anything like me, you've used Excel to plot data, then used the built-in “add fitted line” feature to overlay a fitted line to show the trend, and displayed the “goodness of fit,” the r-squared (R²) value, on the chart by checking the provided box in the chart dialog.

The R² calculated in Excel is often used as a measure of how well a model explains a response variable, so that “R² = 0.8” is interpreted as “80% of the variation in the 'y' variable is explained by my model.” I think that the ease with which the R² value can be calculated and added to a plot is one of the reasons for its popularity.

There's a hidden trap, though. R² will increase as you add terms to a model, even if those terms offer no real explanatory power. By using the R² that Excel so helpfully provides, we can fool ourselves into believing that a model is better than it is.

Below I'll demonstrate this and show an alternative that can be implemented easily in R.

Some data to work with

First, let's create a simple, random data set, with factors a, b, c and response variable y.

head(my.df)

##       y a       b      c
## 1 2.189 1 -1.2935 -0.126
## 2 3.912 2 -0.4662  1.623
## 3 4.886 3  0.1338  2.865
## 4 5.121 4  1.2945  4.692
## 5 4.917 5  0.1178  5.102
## 6 4.745 6  0.4045  5.936
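
The code that generated my.df is not shown here. A data set with a similar shape could be simulated along the following lines; this is a hypothetical stand-in, with a seed and coefficients picked arbitrarily, so the numbers will not match the output printed above.

```r
# Hypothetical stand-in for my.df: 10 rows, factors a, b, c and response y
set.seed(42)
n <- 10
my.df <- data.frame(a = 1:n, b = rnorm(n))
my.df$c <- my.df$a + rnorm(n)              # c tracks a, plus noise
my.df$y <- 2 + 0.5 * my.df$c + rnorm(n)    # y depends on c, plus noise
my.df <- my.df[, c("y", "a", "b", "c")]    # same column order as above
head(my.df)
```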

Here is what this data looks like:

Calculating R-squared

What Excel does when it displays the R² is create a linear least-squares model, which in R looks something like:

my.lm <- lm(y ~ a + b + c, data = my.df)

Excel also does this when we call RSQ() in a worksheet. In fact, we can do this explicitly in Excel using the Regression analysis option in the Analysis ToolPak add-in, but I don't know many people who use this, and Excel isn't known for its reliability in producing good output from the Analysis ToolPak.

In R, we can obtain R² via the summary() function on a linear model.

summary(my.lm)

## 
## Call:
## lm(formula = y ~ a + b + c, data = my.df)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.2790 -0.6006  0.0473  0.5177  1.5299 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)  
## (Intercept)    2.080      0.763    2.72    0.034 *
## a             -0.337      0.776   -0.43    0.679  
## b             -0.489      0.707   -0.69    0.515  
## c              1.038      0.817    1.27    0.250  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.1 on 6 degrees of freedom
## Multiple R-squared:  0.833,  Adjusted R-squared:  0.75 
## F-statistic:   10 on 3 and 6 DF,  p-value: 0.00948

Since summary() produces a list object as output, we can grab just the R² value.

summary(my.lm)$r.squared

## [1] 0.8333

Normally, we would (somewhat loosely) interpret this as telling us that about 83% of the variation in the response y is explained by the model.

Notice that there is also an “adjusted R-squared” value given by summary(). This tells us that only 75% of the variation is explained by the model. Which is right?

The problem with R-squared

Models that have many terms will always give higher R² values, just because more terms will slightly improve the model fit to the given data. The unadjusted R² is wrong. The calculation for adjusted R² is intended to partially compensate for that “overfit,” so it's better.
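
To see this effect directly, here is a small simulation sketch. The data are invented for the purpose: one real predictor `x` plus five columns of pure noise that have nothing to do with `y`. Fitting nested models shows that R² never goes down as noise terms are added, while adjusted R² is free to fall.

```r
set.seed(1)
n <- 30
sim <- data.frame(x = rnorm(n))
sim$y <- 1 + 2 * sim$x + rnorm(n)
# Five columns of pure noise, unrelated to y
for (j in 1:5) sim[[paste0("noise", j)]] <- rnorm(n)

# Fit nested models, adding one noise column at a time
r2 <- sapply(0:5, function(k) {
  terms <- c("x", if (k > 0) paste0("noise", seq_len(k)))
  fit <- lm(reformulate(terms, response = "y"), data = sim)
  c(r.squared = summary(fit)$r.squared,
    adj.r.squared = summary(fit)$adj.r.squared)
})
colnames(r2) <- paste0("noise_terms_", 0:5)
round(r2, 3)
```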

It's nice that R shows us both values, and a pity that Excel won't show the adjusted value. The only way to get an adjusted R² in Excel is to run the Regression analysis; otherwise, we have to calculate adjusted R² manually.
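
The manual calculation is short. The usual formula is $R_{adj}^{2}=1-\left(1-R^{2}\right)\left(n-1\right)/\left(n-p-1\right)$ , where $n$ is the number of observations and $p$ the number of predictors, not counting the intercept. A minimal sketch in R, using a helper name (`adjusted_r_squared`) invented here:

```r
# Adjusted R-squared from R-squared, sample size n and number of predictors p
adjusted_r_squared <- function(r.squared, n, p) {
  1 - (1 - r.squared) * (n - 1) / (n - p - 1)
}

# Values from the summary() output: 3 predictors and 6 residual degrees of
# freedom imply n = 6 + 3 + 1 = 10 observations
adjusted_r_squared(0.8333, n = 10, p = 3)
```

This returns approximately 0.75, in agreement with the adjusted R-squared reported by summary().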

Both R² and adjusted R² are measures of how well the model explains the given data. However, in industry we usually want to know something a little different. We don't build regression models to explain only the data we have; we build them to think about future results. We want R² to tell us how well the model predicts the future. That is, we want a predictive R². Minitab has added the ability to calculate predictive R² in Minitab 17, and has a nice blog post explaining this statistic.

Calculating predictive R-squared

Neither R nor Excel provides a means of calculating the predictive R² within the default functions. While some free R add-on packages provide this ability (DAAG, at least), we can easily do it ourselves. We'll need a linear model, created with lm(), for the residuals so we can calculate the “PRESS” statistic, and then we need the sum of squares of the terms so we can calculate a predictive R².

Since the predictive R² depends entirely on the PRESS statistic, we could skip the added work of calculating predictive R² and just use PRESS, as some authors advocate. The lower the PRESS, the better the model is at fitting future data from the same process, so we can use PRESS to compare different models. Personally, I'm used to thinking in terms of R², and I like having the ability to compare to the old R² statistic that I'm familiar with.

To calculate PRESS, first we calculate the predictive residuals, then take the sum of squares (thanks to Walker’s helpful blog post for this). This is pretty easy if we already have a linear model. It would take a little more work in Excel.

pr <- residuals(my.lm)/(1 - lm.influence(my.lm)$hat)
PRESS <- sum(pr^2)
PRESS

## [1] 19.9
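
The one-line formula for the predictive residuals is a shortcut for leave-one-out cross-validation: each predictive residual is the error made when predicting an observation from a model fitted without it. The sketch below checks this by brute force. It uses simulated stand-in data (the data frame `d` and its coefficients are assumptions for illustration, not the my.df used above).

```r
# Simulated stand-in data with factors a, b, c and response y
set.seed(42)
n <- 10
d <- data.frame(a = 1:n, b = rnorm(n))
d$c <- d$a + rnorm(n)
d$y <- 2 + 0.5 * d$c + rnorm(n)
fit <- lm(y ~ a + b + c, data = d)

# Shortcut: predictive residuals from ordinary residuals and hat values
press_shortcut <- sum((residuals(fit) / (1 - lm.influence(fit)$hat))^2)

# Brute force: refit without row i, then predict row i
loo_resid <- sapply(seq_len(n), function(i) {
  fit_i <- lm(y ~ a + b + c, data = d[-i, ])
  d$y[i] - predict(fit_i, newdata = d[i, ])
})
press_loo <- sum(loo_resid^2)

all.equal(press_shortcut, press_loo)
```

The shortcut needs only one model fit, which is why it is preferred over refitting the model n times.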

The predictive R² is then (from a helpful comment by Ibanescu on LinkedIn) the PRESS divided by the total sum of squares, subtracted from one. The total sum of squares can be calculated directly as the sum of the squared deviations of the response from its mean, or obtained by summing over Sum Sq from an anova() on our linear model. I prefer using the anova function, as any statistical subtleties are more likely to be properly accounted for there than in my simple code.

# anova to calculate the total sum of squares
my.anova <- anova(my.lm)
tss <- sum(my.anova$"Sum Sq")
# predictive R^2
pred.r.squared <- 1 - PRESS/(tss)
pred.r.squared

## [1] 0.5401
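
For a model with an intercept, the Sum Sq column of anova() adds up to the total sum of squares, that is, the sum of squared deviations of the response from its mean. A quick check on simulated stand-in data (the data frame `d` is an assumption for illustration):

```r
# Simulated stand-in data with factors a, b, c and response y
set.seed(42)
n <- 10
d <- data.frame(a = 1:n, b = rnorm(n))
d$c <- d$a + rnorm(n)
d$y <- 2 + 0.5 * d$c + rnorm(n)
fit <- lm(y ~ a + b + c, data = d)

# Total sum of squares, two ways
tss_anova  <- sum(anova(fit)$"Sum Sq")
tss_direct <- sum((d$y - mean(d$y))^2)

all.equal(tss_anova, tss_direct)
```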

You'll notice that this is smaller than the adjusted R², which is itself smaller than the basic R². This is the point of the exercise. We don't want to fool ourselves into thinking we have a better model than we actually do. One way to think of this is that 29% (83% – 54%) of the apparent fit is due to too many factors and random correlations, which we would have attributed to our model if we were just using Excel's built-in function.

When the model is good and has few terms, the differences are small. For example, working through the examples in Mitsa's two posts, we see that for her model 3, R² = 0.96 and the predictive R² = 0.94, so calculating the predictive R² wasn't really worth the extra effort for that model. Unfortunately, we can't know, in advance, which models are “good.” For Mitsa's model 1 we have R² = 0.95 and predictive R² = 0.32. Even the adjusted R² looks pretty good for model 1, at 0.94, but we see from the predictive R² that our model is not very useful. This is the sort of thing we need to know to make correct decisions.

Automating

In R, we can easily wrap these in functions that we can source() and call directly, reducing the typing. Just create a linear model with lm() (or an equivalent) and pass that to either function. Note that pred_r_squared() calls PRESS(), so both functions have to be sourced.

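# Predictive R^2: 1 - PRESS / total sum of squares
pred_r_squared <- function(linear.model) {
    # total sum of squares, summed over all rows of the anova table
    lm.anova <- anova(linear.model)
    tss <- sum(lm.anova$"Sum Sq")
    # predictive R^2
    pred.r.squared <- 1 - PRESS(linear.model)/(tss)
    return(pred.r.squared)
}

# PRESS: sum of squared predictive (leave-one-out) residuals
PRESS <- function(linear.model) {
    # predictive residuals from the ordinary residuals and hat values
    pr <- residuals(linear.model)/(1 - lm.influence(linear.model)$hat)
    press <- sum(pr^2)
    return(press)
}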

Then we just call the function to get the result:

pred.r.squared <- pred_r_squared(my.lm)
pred.r.squared

## [1] 0.5401
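
Since both functions accept any lm() object, they can be used to compare candidate models side by side. The sketch below is one way to do that; it redefines the two functions so that it runs on its own, and it uses simulated stand-in data and a hypothetical reduced model (`y ~ c`) rather than anything from the analysis above.

```r
PRESS <- function(linear.model) {
  pr <- residuals(linear.model) / (1 - lm.influence(linear.model)$hat)
  sum(pr^2)
}
pred_r_squared <- function(linear.model) {
  tss <- sum(anova(linear.model)$"Sum Sq")
  1 - PRESS(linear.model) / tss
}

# Simulated stand-in data with factors a, b, c and response y
set.seed(42)
n <- 10
d <- data.frame(a = 1:n, b = rnorm(n))
d$c <- d$a + rnorm(n)
d$y <- 2 + 0.5 * d$c + rnorm(n)

# Two candidate models: all three factors, and c alone
models <- list(full = lm(y ~ a + b + c, data = d),
               reduced = lm(y ~ c, data = d))

# One column per model, one row per statistic
comparison <- sapply(models, function(m) {
  c(r.squared = summary(m)$r.squared,
    adj.r.squared = summary(m)$adj.r.squared,
    pred.r.squared = pred_r_squared(m))
})
round(comparison, 3)
```

The predictive R² can never exceed the ordinary R², because every predictive residual is at least as large in magnitude as the corresponding ordinary residual.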

I've posted these as Gists on GitHub, with extra comments, so you can copy and paste from here or fork or copy them there.

References and further reading

Mitsa, T. Use PRESS, not R squared to judge predictive power of regression. 12 May 2013. Analytical Bridge. Accessed 14 May 2014. Shows how r-squared can lead to a misleading interpretation of model fit and provides an explanation of the PRESS statistic, with examples comparing three linear models in R.
Mitsa, T. Cross-validation in R: a do-it-yourself and a black box approach. 22 May 2013. Accessed 14 May 2014. Shows how to use the package DAAG to calculate PRESS, or to calculate it manually.
Walker, S. Calculating the PRESS statistic in R. 18 June 2013. ecology & stats. Accessed 14 May 2014. Provides a simple function for calculating PRESS in R.
Multiple Regression Analysis: Use Adjusted R-Squared and Predicted R-Squared to Include the Correct Number of Variables
Adjusted R-Square or Predicted R-Square. LinkedIn. Accessed 14 May 2014. Forum discussion thread discussing the relative merits of adjusted and predicted R², in which the equation for calculating predicted R² is given.
Why is adjusted R-squared less than R-squared if adjusted R-squared predicts the model better?. StackExchange. Accessed 10 May 2014. Q&A thread discussing the relative merits of R² and adjusted R².
R Core Team (2014). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL http://www.R-project.org/.
H. Wickham. ggplot2: elegant graphics for data analysis. Springer New York, 2009.

Tom Hopper

Month: May 2014