Update to plot.qcc using ggplot2 and grid

Two years ago, I blogged about my experience rewriting the plot.qcc() function in the qcc package to use ggplot2 and grid. My goal was to allow manipulation of qcc’s quality control plots using grid graphics, especially to combine range charts with their associated individuals or moving range charts, as these two diagnostic tools should be used together. At the time, I posted the code on my GitHub.

I recently discovered that the update to ggplot2 v2.0 broke my code, so that attempting to generate a qcc plot would throw an obscure error from someplace deep in ggplot2. The fix turned out to be pretty easy. The original code used aes_string() instead of aes() because of a barely-documented problem of calling aes() inside a function. It looks like this has been quietly corrected with ggplot2 2.0, and aes_string() is no longer needed for this.

The updated code is up on GitHub. As before, load the qcc library, then source() qcc.plot.R. For the rest of the current session, calls to qcc() will automatically use the new plot.qcc() function.

Sample Size Matters: Design and Cost

We’ve seen in the previous posts that in designing products we need to know characteristics like the mean and standard deviation of the population, but can only measure sample means and standard deviations. This leaves us with uncertainty in our knowledge of population characteristics, and that uncertainty directly impacts our ability to make better products. In this post, we’ll see how business financial requirements and estimation uncertainties due to sample size interact both to further limit our available design options and to drive up our sample size requirements.

Impact on specifications

Looking back at our graph of Cpk, Cpk values below the target value (blue line) increase production and sales costs through increased rework, scrap and warranty. Above the blue line, we’ve added product or production costs by over-designing the product or process. Since the price of a product is determined by the market, any increase in cost decreases our gross profit margin:

$\text{Gross Profit} = f\left(\text{Price}, \text{Cost} \right) = \text{Price} - \text{Cost}$

As outlined in the first post of this series, we are going to cut material costs by 10% on a part that had to weigh at least 100 kg. That was a $6 reduction in costs on a $120 part.

Our first issue is that we have to be sure that we have a good baseline for improvement. If the existing parts are very different than our expectations, we may be creating more trouble by making changes. We also don’t know how much variation there is in part weight.

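We collect production data over a week and determine that the current mean part weight is, as expected, 120 kg with a standard deviation of 6.7 kg. With 120 kg of material and a lower specification limit of 100 kg, we calculate a Cpk of

$Cpk = \frac{\mu - LSL}{3\sigma} = \frac{120 - 100}{3 \times 6.7} \approx 1.0$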

We calculate that we have to remove $6/0.5 = 12$ kg, reducing mean weight from 120 kg to 108 kg.

Any single product below 100 kg runs the risk of being rejected by the customer, possibly at great cost (e.g. they may require special field service on older parts, or decide to buy from a competitor in the future, or both). We therefore don’t want a higher defect rate with the new product and process than with the old, because this would increase labor and overhead costs. Holding the Cpk at 1.0 with the new mean of 108 kg, we want to target a standard deviation of at most

$\sigma = \frac{\mu - LSL}{3 \times Cpk} = \frac{108 - 100}{3 \times 1.0} \approx 2.7$ kg

We might stop there, and say that when we have a design for 108 kg and prototypes that weigh on average 108 kg with a standard deviation of up to 2.7 kg, we’re done. Our specification now looks like this:

	Minimum	Maximum	Target
Part Weight	100	108	?
Standard Deviation	0	2.7	?
Cpk	1.0	?	?
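The naive targets in the table above follow from a few lines of arithmetic. The sketch below reproduces them in R using the figures quoted in the post; the variable names are illustrative, and the Cpk is computed against the lower specification limit only, since that is the only limit the customer has set.

```r
# Figures taken from the post; variable names are illustrative.
material_cost <- 60    # $ of material per part
current_mean  <- 120   # kg, current mean part weight
current_sd    <- 6.7   # kg, current standard deviation
lsl           <- 100   # kg, customer's lower specification limit
saving        <- 0.10 * material_cost   # target reduction of $6

cost_per_kg <- material_cost / current_mean   # $0.50 per kg
kg_removed  <- saving / cost_per_kg           # 12 kg
new_mean    <- current_mean - kg_removed      # 108 kg

# One-sided Cpk against the lower limit, for the current process
cpk_current <- (current_mean - lsl) / (3 * current_sd)

# Largest sigma that keeps Cpk at 1.0 once the mean moves to 108 kg
target_cpk <- 1.0
sd_max <- (new_mean - lsl) / (3 * target_cpk)

round(c(cpk_current = cpk_current, new_mean = new_mean, sd_max = sd_max), 2)
```

These are point values only; they say nothing yet about how well a sample can confirm them.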

However, there would be substantial risk that we would not achieve our goals of both meeting the customer requirement of 100 kg and reducing material costs by 10%. Using these numbers as our target, we have a 50% chance that we will be over the cost target, and a 50% chance that our defect rate will be higher than target.

In order to meet customer requirements, we want to be confident that all parts weigh at least 100 kg. In order to meet business needs, we have to be 95% confident that at least half of our product weighs at most 108 kg.

For the customer requirement, we need to calculate the Cpk. In the past, “all” product really meant a Cpk of 1.0, or 93% of product. To calculate this we need our 95% confidence estimate of the mean, $\overline{X}_{\text{lower 95\%}}$ , and our 95% confidence estimate of the standard deviation, $S_{\text{upper 95\%}}$ .

For the business requirement, we need the confidence bounds on our estimate of the mean, $\overline{X}_{\text{upper 95\%}}$ .

Now we need to design and build our prototypes. How many parts do we build and weigh? Recognizing that there will be uncertainty in our estimate of $\mu$ and $\sigma$ from such trials, we cannot simply calculate $\overline{X}$ and $S$ and then calculate the estimated Cpk based on the sample, since there is a 50% chance that our products will be worse than we measure from our study. We have to be more careful with our customer base than that.

We have to use the confidence bounds on $\overline{X}$ and $S$ :

[Equations: calculation of $S_{\text{upper 95\%}}$ , $\overline{X}_{\text{upper 95\%}}$ and $\overline{X}_{\text{lower 95\%}}$ ]
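Bounds of this kind can be sketched in R as follows. This is a minimal sketch under assumptions that are mine rather than the post's: the weights are normally distributed, each bound is a one-sided 95% limit, and the helper name conf_bounds() and the example values (n = 10, a sample mean of 104 kg, S = 1.0 kg) are purely illustrative.

```r
# One-sided confidence bounds on the sample mean and standard deviation.
# Assumes normally distributed data; conf_bounds() is a hypothetical helper.
conf_bounds <- function(xbar, s, n, conf = 0.95) {
  t_crit <- qt(conf, df = n - 1)
  half_width <- t_crit * s / sqrt(n)
  list(
    xbar_lower = xbar - half_width,
    xbar_upper = xbar + half_width,
    # The upper bound on sigma uses the lower-tail chi-squared quantile
    s_upper = s * sqrt((n - 1) / qchisq(1 - conf, df = n - 1))
  )
}

b <- conf_bounds(xbar = 104, s = 1.0, n = 10)
lsl <- 100  # kg, customer's lower specification limit

# Pessimistic Cpk: lowest plausible mean combined with largest plausible sigma
cpk_lower <- (b$xbar_lower - lsl) / (3 * b$s_upper)
round(c(unlist(b), cpk_lower = cpk_lower), 3)
```

With these illustrative numbers the pessimistic Cpk falls well below 1.0, even though the point estimate from the same sample is (104 - 100) / (3 x 1.0), or about 1.33.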

These equations are easier to understand if we graph them for several values of $S$ .

This graph shows the maximum and minimum possible $\overline{X}$ to assure compliance with customer requirements on both weight (solid green line) and the cost targets (dashed blue line), for four different values of $S$ . Red regions indicate that both sets of requirements cannot be met; green shaded regions indicate possible $\overline{X}$ that meet both sets of requirements.

As can be seen, while we calculated a naive target for standard deviation, $\sigma$ , of 2.7 kg, the measured sample standard deviation, $S$ , must be much smaller to assure that we meet requirements. Likewise, small sample sizes can make it impossible to assure that we meet requirements.

We can now amend our requirements:

	Minimum	Maximum	Target
Part Weight	100	108	105.5
Standard Deviation	0	2.7	(from Cpk)
Cpk	1.0	f(n, S)	1.0
$\overline{X}$	f(n, S)	f(n, S)	104
$S$	0	1.2	1.0
n (for sampling)	10	100	10

Not only do we have to design our product and process to be more stringent than the naive requirements, we have to test more than we might otherwise wish to.

Importantly, our specification now contains the tolerance ranges on the weight, the standard deviation of the weight, and the Cpk. This is the minimum set of information that we need to fully specify a part. For the purposes of testing and checking short-term process performance, we also need to specify the number of samples to collect and sample mean and standard deviation.

UPDATE 2016-08-25: Equations were no longer rendering correctly; this was fixed.

References

R Core Team (2014). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL http://www.R-project.org/.
H. Wickham. ggplot2: elegant graphics for data analysis. Springer New York, 2009.

Sample Size Matters: Design and Experiments

Previously, I introduced the idea that samples do not look exactly like the populations that they are drawn from, and had a closer look at what impact sample size has on our ability to estimate population statistics like mean, proportion or Cpk from samples. Here, I will have a closer look at how this uncertainty impacts our engineering process. In the next post, I will tie in the engineering impacts and decisions to the business value and costs.

Difference to detect

When we are testing, we’re either testing to determine that the new product or process performs better than the old or, for cost reduction projects, that the cheaper product or process is at least as good as the existing one.

This means that we need to detect a difference between old and new values, such as a difference in the mean weight between the new and old parts. The larger the sample size, $n$ , the smaller the difference, $\Delta$ , that we can detect. The error, $\epsilon$ , in our estimate of the differences gets smaller as sample size increases:

$\epsilon_{\Delta} \propto \frac{\sigma}{\sqrt{n}}$

Given the uncertainties in our estimate of $\mu$ and $\sigma$ , illustrated above, it should be clear now that with small sample sizes we can only detect large differences of many multiples of the sample standard deviation, $S$ .

Mean

When trying to determine if a new product or process is better than an old one, we are usually interested in shifting the mean. We want a product to be lighter, provide more power, or a process to work faster. In such cases, we need to estimate the difference of the means, $\Delta = \mu_2 - \mu_1$ and ensure that it is different from 0 (or some other pre-determined value). The minimum difference that we can reliably detect is plotted below for different sample sizes.
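As a rough sketch of that relationship, R's built-in power.t.test() can solve for the smallest detectable difference at each sample size. The 80% power, 5% significance level, two-sample design and unit standard deviation are my assumptions, not values from the post; with sd = 1 the result reads as a multiple of $\sigma$ .

```r
# Smallest difference in means detectable with 80% power at alpha = 0.05,
# for a two-sample t-test with n parts per group and sd = 1 (assumed values).
n <- c(5, 10, 30, 100)
min_delta <- sapply(n, function(k) {
  power.t.test(n = k, sd = 1, sig.level = 0.05, power = 0.8)$delta
})
data.frame(n = n, min_delta = round(min_delta, 2))
```

Leaving delta unspecified tells power.t.test() to solve for it, so the same call can be turned around to solve for n once a target difference is known.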

Standard deviation

In many Six Sigma projects, and any time we want to shift the mean closer to a specification limit, we need to compare the new population standard deviation with the old. The simplest way of making this comparison is by taking the ratio $F = \sigma_{2}^{2} / \sigma_{1}^{2}$ , where $\sigma_{2}^{2}$ is the larger of the two variances. The dependence on sample size is illustrated below.

You can see from the inset plot, which includes sample sizes of 2 and 3, that small sample sizes really hurt comparisons of variance, and that interesting differences in variance can’t be detected until we have more than 10 samples.
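One way to see why small samples hurt is to look at the critical value of the F ratio, which is how large the ratio of two sample variances must be before a test calls it significant. The sketch below assumes equal sample sizes in both groups and a one-sided 5% test. It ignores power, so the ratios needed for reliable detection are larger still.

```r
# Critical variance ratio for a one-sided F test at the 5% level,
# with n samples in each group (equal group sizes are an assumption).
n <- c(2, 3, 5, 10, 30, 100)
f_crit <- qf(0.95, df1 = n - 1, df2 = n - 1)
data.frame(n = n, f_crit = signif(f_crit, 3))
```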

Proportions

Proportions, such as fraction of defective parts between a new and old design, can be compared by looking at the difference between the two proportions, $\Delta = \left| p_1 - p_0 \right|$ .

You can see from this that proportions data provides much less information than variable data; we need much larger sample sizes to achieve usefully small $\Delta$ .
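To put a rough number on this, base R's power.prop.test() can solve for the sample size needed to distinguish two defect rates. The rates of 10% and 5%, the 80% power and the 5% significance level are illustrative assumptions of mine, not figures from the post.

```r
# Parts per group needed to detect a drop in defect rate from 10% to 5%
# (illustrative rates) with 80% power at a 5% significance level.
res <- power.prop.test(p1 = 0.10, p2 = 0.05, sig.level = 0.05, power = 0.8)
n_per_group <- ceiling(res$n)
n_per_group
```

Even for a halving of the defect rate, the answer is in the hundreds of parts per group, far more than a comparison of means on variable data would typically need.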

Summary and look forward

When designing experiments, the goal is to detect some difference between two populations. The uncertainty in our measurements and the variation in the parts have a big impact on how many parts we need to test, and can greatly limit what we can learn from an experiment.

Next time, I’ll show how these calculations of sample size and uncertainty impact the business.

Sample Size Matters: Uncertainty in Measurement

In my previous post, I gave a brief introduction to populations and samples, and stated that sample size impacts our ability to know what a population really looks like. In this post, I want to show this relationship in more detail. In future posts, I will look at how sample size considerations impact our engineering process and what impacts this has on the business.

Mean and sample size

The error in our estimate of the mean, $E$ , is proportional to the standard deviation of the sample, $S$ , and inversely proportional to the square root of the sample size, $n$ .

$E \propto \frac{S}{\sqrt{n}}$

We can visualize this easily enough by plotting the 95% confidence interval. When we sample and calculate the sample mean ( $\overline{X}$ ), the true population mean, $\mu$ , (what we really want to know) is likely to be anywhere in the shaded region of the graph below.

This graph shows the 95% confidence region for the true population mean, $\mu$ ; there’s a 95% chance that the true population mean is within this band. The “0” line on the y axis is our estimate of the mean, $\overline{X}$ . We can’t know what the true population mean is, but it’s clear that if we use more samples, we can be sure that our estimate is closer to the true mean.
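The width of that band can be computed directly. A minimal sketch, assuming normally distributed data and a two-sided 95% interval, gives the half-width of the interval in multiples of $S$ :

```r
# Half-width of the two-sided 95% confidence interval for the mean,
# expressed in units of the sample standard deviation S.
n <- c(2, 5, 10, 30, 100)
half_width <- qt(0.975, df = n - 1) / sqrt(n)
data.frame(n = n, half_width_in_S = round(half_width, 2))
```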

Standard deviation and sample size

Likewise, when we calculate the sample standard deviation, $S$ , the true standard deviation, $\sigma$ , has a 95% chance of being within the confidence band below. For small sample sizes (roughly less than 10), the measured standard deviation can differ from the true standard deviation by a factor of several. Even for ten samples, the potential error is nearly $\pm 1$ standard deviation.
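The corresponding band for the standard deviation comes from the chi-squared distribution. The sketch below assumes normally distributed data and a two-sided 95% interval, and reports the bounds on $\sigma$ as multiples of the measured $S$ :

```r
# Two-sided 95% confidence bounds on sigma, as multiples of S.
n  <- c(3, 5, 10, 30, 100)
df <- n - 1
sigma_lower <- sqrt(df / qchisq(0.975, df))
sigma_upper <- sqrt(df / qchisq(0.025, df))
data.frame(n = n, lower = round(sigma_lower, 2), upper = round(sigma_upper, 2))
```

The interval is not symmetric: the upper bound sits much further from $S$ than the lower bound, which is why underestimating variation is the larger risk with small samples.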

Proportion and sample size

For proportions, the situation is similar: there is a 95% chance that the true population proportion, $p$ , is within the shaded band based on the measured sample proportion $\hat{p}$ . Since this confidence interval depends on $\hat{p}$ and cannot be standardized the way $\mu$ and $\sigma$ can be, confidence intervals for two different proportions are plotted.

For small $n$ , proportions data tells us very little.
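For instance, the exact (Clopper-Pearson) interval from base R's binom.test() shows how wide the band is when the observed defect rate is 10%. The counts below are illustrative, not from the post, and prop_ci() is a hypothetical helper name.

```r
# Exact 95% confidence interval for a proportion, via binom.test().
prop_ci <- function(x, n) {
  ci <- binom.test(x, n)$conf.int
  c(lower = ci[1], upper = ci[2])
}

# Same observed proportion (10% defective) at three sample sizes
ci_table <- rbind(
  "1 of 10"     = prop_ci(1, 10),
  "10 of 100"   = prop_ci(10, 100),
  "100 of 1000" = prop_ci(100, 1000)
)
round(ci_table, 3)
```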

Process capability and production costs

The cost of poor quality in product or process design can be characterized by the Cpk:

$Cpk = \mathrm{minimum} \begin{cases}\frac{USL - \mu}{3\sigma} \\\frac{\mu - LSL}{3\sigma}\end{cases}$

Where USL is the upper specification limit (also called the upper tolerance) and LSL is the lower specification limit (or lower tolerance).

We can estimate the defect rate (defects per opportunity, or DPO) from the Cpk:

$DPO = 1 - \Pr\left(X < 3 \times Cpk - 1.5\right)$

That probability function is calculated in R with pnorm(3 * Cpk - 1.5) and in Excel with NORMSDIST(3 * Cpk - 1.5). The 1.5 is a typical value used to account for uncorrected or undetected process drift.
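Wrapped in a small function, the same calculation converts a handful of Cpk values to defect rates. This is only a sketch of the formula above; the function name dpo_from_cpk() and the choice of Cpk values are illustrative.

```r
# Defects per opportunity from Cpk, with the 1.5 sigma allowance for drift.
dpo_from_cpk <- function(cpk, shift = 1.5) {
  1 - pnorm(3 * cpk - shift)
}

cpk <- c(1.0, 1.33, 1.67, 2.0)
dpo <- dpo_from_cpk(cpk)
data.frame(cpk = cpk, dpo = signif(dpo, 3), ppm = round(1e6 * dpo, 1))
```

Setting shift = 0 gives the defect rate for a process with no drift allowance.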

Since we don’t know $\mu$ and $\sigma$ , we have to substitute $\overline{X}$ and $S$ . The uncertainty in these estimates of the population $\mu$ and $\sigma$ means that we have uncertainty in what the true process Cpk (or defect rates) will be once we’re in production. When our sample testing tells us that the Cpk should be 1.67 (the blue line), the true process Cpk will actually turn out to be somewhere in the shaded band:

Below the blue line, our product or process is failing to meet customer expectations, and will result in lost customers or higher warranty costs. Above the blue line, we’ve added more cost to the production of the product than we need to, reducing our gross profit margin. Since that gray band doesn’t completely disappear, even at 100 samples, we can never eliminate these risks; we have to find a way to manage them effectively.

The impact of this may be more evident when we convert from Cpk to defect rates (ppm):

Summary and a look forward

With a fair sampling process, samples will look similar to—and statistically indistinguishable from—the population that they were drawn from. How much they look like the population depends critically on how many samples are tested. The uncertainties, or errors in our estimates, resulting from sample size decisions have impacts all through our design analysis and production planning.

In the next post, I will explore in more detail how these uncertainties impact our experiment designs.

Sample Size Matters

I find that Six Sigma and Design for Six Sigma courses are often eye-opening experiences for participants. There is an experience of discovering that there are tools available to answer problems that have vexed them, and learning that good engineering and science decisions can lead directly to good business outcomes through logical steps.

One of the most remarkable such moments is when students realize the importance of sample size. In the best cases, there is a forehead-slapping moment where the student realizes that much of the testing they’ve done in the past has probably been a complete waste of time; that while they thought they were seeing interesting differences and making good decisions, they were in fact only fooling themselves by comparing too-small data sets.

I want to show in the next few blog posts why sample size matters, both from a technical perspective and from a business perspective.

Design example

Throughout the next few posts, I’ll use the example of a manufactured product that the customer requires to weigh at least 100 kg, that sells for about $140, and that costs $120 to manufacture and convert to a sale (the cost of goods sold, or COGS, is $120).

		Amount
Sales		140
COGS		120
Material	60
Labor and Overhead	60
—	—	—
Gross Profit		20

We want to develop a new version of the product, using a modified design and a new process that, by design, will reduce the cost of material by 10%. The old cost of material was 50% of COGS, or $60. To achieve the material cost reduction of 10%, we have to remove $6 in material costs, improving gross profit to $26.

We believe that the current design masses 120 kg, so we estimate that our new part mass should be $120 - 0.1 \times 120 = 108$ kg.

	Current Design	New Design Target
Part Weight	120	108

Seems like we might be done at this point, and I’ve seen plenty of engineering projects that stop here. Unfortunately, this isn’t the whole story. Manufacturing will be unable to produce parts of exactly 108 kg, so they’ll need a tolerance range to check parts against. We have that customer requirement for at least 100 kg, so any variation has to stay above that. We also want to save money relative to the current design, so we don’t want many parts to weigh much more than this, especially since the customer isn’t really willing to pay us for the “extra” material beyond 100 kg.

Population versus sample statistics

Most of process or product improvement is concerned with reducing the standard deviation, $\sigma$ , shifting the mean (a.k.a. average), $\mu$ , or reducing a proportion, $p$ , of a process or product characteristic. These summary statistics refer to the population characteristics—the mean, standard deviation or proportion of all parts of a certain design that will ever be produced, or all times that a production step will ever be completed in the intended manner.

Since we can’t measure the whole population up front—we will be producing parts for a long time—we have to draw a sample from the population, and use the statistics of that sample to gain insight into the total population. We can visualize this, somewhat crudely, with the following:

We can imagine that the blue circles are conforming parts, and the orange octagons are non-conforming parts. If the sampling process is fair, then the sample proportion $\hat{p}$ will be close to, and statistically indistinguishable from, the true population proportion $p$ . In the population we have 44 parts total, 8 defective parts and 36 conforming parts. In the sample that we drew, we have 10 parts total, 9 conforming and 1 defective. While $p = 8/44 = 2/11 \ne \hat{p} = 1/10$ (or, as odds of defective to conforming parts, $8/36 = 2/9 \ne 1/9$ ), statistically we have

chisq.test(matrix(c(1, 8, 10 - 1, 44 - 8), ncol = 2),
           simulate.p.value = TRUE)

## 
##  Pearson's Chi-squared test with simulated p-value (based on 2000
##  replicates)
## 
## data:  matrix(c(1, 8, 10 - 1, 44 - 8), ncol = 2)
## X-squared = 0.3927, df = NA, p-value = 0.6692

With such a high p-value (0.67), we fail to reject the null hypothesis that $\hat{p} = p$ ; in more colloquial terms, we have no evidence that the apparent difference between 8/36 and 1/9 is anything more than random sampling error. (For larger counts of successes and failures, prop.test() would also work and would be more informative.)

From our perspective, of course, we don’t know what the population looks like. We don’t have any way of knowing with certainty—or accessing data about—future performance, so there is no way for us to know what the total population looks like. In lieu of population data, we develop a sampling process that allows us to fairly draw a sample from that population.

While we want to know the true population mean, $\mu$ , the true population standard deviation, $\sigma$ , or the true population proportion $p$ , we can only calculate the sample mean, $\overline{X}$ , the sample standard deviation, $S$ , or the sample proportion $\hat{p}$ .

From the known sample, we then reason backward to what the true population looks like. This is where statistics comes into play; statistics allows us to place rigorous boundaries on what the population may look like, without fooling ourselves. Sample size is critical to controlling the uncertainty in these boundaries.

Summary and a look forward

Testing in product development—and usually in production—involves sampling a product or process. Samples never look exactly like the population that we are concerned about, but if the sampling process is fair then the samples will be statistically indistinguishable from the population. With due awareness of the statistical uncertainties, we can use samples to make decisions about the population.

In the next post, I will look at how sample size impacts the uncertainty in our estimation of population statistics like the mean and standard deviation. In a later post, I will look at how this uncertainty impacts the business.

A short aside on statistical tests for proportions

The usual way to compare two proportions would be a proportions test (prop.test() in R), but because we have so few samples to compare, the results may be unreliable and prop.test() generates an appropriate warning. fisher.test() provides an exact estimate of the p-value, but the assumptions are violated with data like this, where we are sampling a fixed number of parts (i.e. row sums are fixed, but column sums are not controlled). This leaves us with using a chi-squared test (chisq.test() in R) which is less informative but does the job. Either the Barnard test or Bayesian estimation based on Monte Carlo simulation would be more informative and possibly more robust.

Normality and Testing for Normality

Many of our statistical tests make assumptions about the distribution of the underlying population. Many of the most common—ImR (XmR) and XbarR control charts, ANOVA, t-tests—assume normal distributions in the underlying population (or normal distributions in the residuals, in the case of ANOVA), and we’re often told that we must carefully check the assumptions.

At the same time, there’s a lot of conflicting advice about how to test for normality. There are the statistical tests for normality, such as Shapiro-Wilk or Anderson-Darling. There’s the “fat pencil” test, where we just eye-ball the distribution and use our best judgement. We could even use control charts, as they’re designed to detect deviations from the expected distribution. We are discouraged from using the “fat pencil” because it will result in a lot of variation from person to person. We’re often told not to rely too heavily on the statistical tests because they are not sensitive with small sample sizes and too sensitive to the tails. In industrial settings, our data is often messy, and the tails are likely to be the least reliable portion of our data.

I’d like to explore what the above objections really look like. I’ll use R to generate some fake data based on the normal distribution and the t distribution, and compare the frequency of p-values obtained from the Shapiro-Wilk test for normality.

A Function to test normality many times

First, we need to load our libraries

library(ggplot2)
library(reshape2)

To make this easy to run, I’ll create a function to perform a large number of normality tests (Shapiro-Wilk) for sample sizes n = 5, 10 and 1000, all drawn from the same data:

#' @name assign_vector
#' @param data A vector of data to draw samples from for the Shapiro-Wilk tests.
#' @param n An integer indicating the number of tests to perform at each sample size. Default is 1000
#' @return A data frame in "tall" format
assign_vector <- function(data, n = 1000) {
  # replicate the call to shapiro.test n times to build up a vector of p-values;
  # sample from the data argument rather than from a global variable
  p.5 <- replicate(n=n, expr=shapiro.test(sample(data, 5, replace=TRUE))$p.value)
  p.10 <- replicate(n=n, expr=shapiro.test(sample(data, 10, replace=TRUE))$p.value)
  p.1000 <- replicate(n=n, expr=shapiro.test(sample(data, 1000, replace=TRUE))$p.value)
  # Combine the data into a data frame,
  # one column for each number of samples tested.
  p.df <- cbind(p.5, p.10, p.1000)
  p.df <- as.data.frame(p.df)
  colnames(p.df) <- c("5 samples","10 samples","1000 samples")
  # Put the data in "tall" format, one column for number of samples
  # and one column for the p-value.
  p.df.m <- melt(p.df)
  # Make sure the levels are sorted correctly.
  p.df.m <- transform(p.df.m, variable = factor(variable, levels = c("5 samples","10 samples","1000 samples")))
  return(p.df.m)
}

Clean, random data

I want to simulate real-world conditions, where we have an underlying population from which we sample a limited number of times. To start, I’ll generate 100000 values from a normal distribution. To keep runtimes low I’ll have assign_vector() sample from that distribution when performing the test for normality.

n.rand <- 100000
n.test <- 10000
my.data <- rnorm(n.rand)
p.df.m <- assign_vector(my.data, n = n.test)

We would expect p-values from normally distributed random data to be uniformly distributed: 5% of the time we’ll see p-value ≤ 0.05, 5% of the time we’ll see p-value > 0.05 and ≤ 0.10, and so on through > 0.95 and ≤ 1.00. Let’s graph that and see what we get for each sample size:

ggplot(p.df.m, aes(x = value)) + 
  geom_histogram(binwidth = 1/10) + 
  facet_grid(variable ~ ., scales="free_y") + 
  xlim(0,1) +
  ylab("Count of p-values") +
  xlab("p-values") +
  theme(text = element_text(size = 16))

Histogram of p-values for the normal distribution, for sample sizes 5, 10 and 1000.

This is, indeed, what we expected.
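The same expectation can be checked numerically without a plot. The sketch below is a simplified stand-in for the simulation above: it draws fresh samples of 10 directly from rnorm() rather than resampling a stored population, and the seed and replicate count are arbitrary choices of mine.

```r
set.seed(42)  # arbitrary seed, for reproducibility only

# p-values from 2000 Shapiro-Wilk tests on samples of 10 normal values
p <- replicate(2000, shapiro.test(rnorm(10))$p.value)

# Fraction of p-values falling in each 5% bin; each should be near 0.05
bins <- cut(p, breaks = seq(0, 1, by = 0.05), include.lowest = TRUE)
bin_fraction <- as.numeric(table(bins)) / length(p)
round(range(bin_fraction), 3)

# Fraction of tests that would reject normality at the 5% level
reject_rate <- mean(p <= 0.05)
reject_rate
```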

Now let’s compare the normal distribution to a t distribution. The t distribution would pass the “fat pencil” test—it looks normal to the eye:

ggplot(NULL, aes(x=x, colour = distribution)) + 
  stat_function(fun=dnorm, data = data.frame(x = c(-6,6), distribution = factor(1)), size = 1) + 
  stat_function(fun=dt, args = list( df = 20), data = data.frame(x = c(-6,6), distribution = factor(2)), linetype = "dashed", size = 1) + 
  scale_colour_manual(values = c("blue","red"), labels = c("Normal","T-Distribution")) +
  theme(text = element_text(size = 12),
        legend.position = c(0.85, 0.75)) +
  xlim(-4, 4) +
  xlab(NULL) +
  ylab(NULL)

Starting with random data generated from the t-distribution:

my.data <- rt(n.rand, df = 20)

Histogram of p-values for the t distribution, for sample sizes 5, 10 and 1000.

The tests for normality are not very sensitive for small sample sizes, and are much more sensitive for large sample sizes. Even with a sample size of 1000, the data from a t distribution only fails the test for normality about 50% of the time (add up the frequencies for p-value > 0.05 to see this).

Testing the tails

Since the t distribution is narrower in the middle range and has longer tails than the normal distribution, the normality test might be failing because the entire distribution doesn’t look quite normal; we haven’t learned anything specifically about the tails.

To test the tails, we can construct a data set that uses the t distribution for the middle 99% of the data, and the normal distribution for the tails.

my.data <- rt(n.rand, df = 20)
my.data.2 <- rnorm(n.rand)
# Trim off the tails
my.data <- my.data[which(my.data < 3 & my.data > -3)]
# Add in tails from the other distribution
my.data <- c(my.data, my.data.2[which(my.data.2 < -3 | my.data.2 > 3)])

Histogram of p-values for sample sizes 5, 10 and 1000, from a data set constructed from the t distribution in the range -3 to +3 sigmas, with tails from the normal distribution below -3 and above +3.

Despite 99% of the data being from the t distribution, this is almost identical to our test with data from just the normal distribution. It looks like the tails may be having a larger impact on the normality test than the rest of the data.

Now let’s flip this around: data that is 99% normally-distributed, but using the t distribution in the extreme tails.

my.data <- rnorm(n.rand)
my.data.2 <- rt(n.rand, df = 20)
# Trim off the tails
my.data <- my.data[which(my.data < 3 & my.data > -3)]
# Add in tails from the other distribution
my.data <- c(my.data, my.data.2[which(my.data.2 < -3 | my.data.2 > 3)])

Histogram of p-values for sample sizes 5, 10 and 1000, from a data set constructed from the normal distribution in the range -3 to +3 sigmas, with tails from the t-distribution below -3 and above +3.

Here, 99% of the data is from the normal distribution, yet the normality test looks almost the same as the normality test for just the t-distribution. If you check the y-axis scales carefully, you’ll see that the chance of getting p-value ≤ 0.05 is a bit lower here than for the t distribution.

To make the point further, suppose we have highly skewed data:

my.data <- rlnorm(n.rand, 0, 0.4)

This looks like:

For small sample sizes, even this is likely to pass a test for normality:
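A rough numerical version of that comparison, as a sketch: the function below estimates how often a sample from the same log-normal distribution passes the Shapiro-Wilk test at the 5% level. It samples directly from rlnorm() rather than from a stored population, and the seed, replicate count and helper name pass_rate() are arbitrary choices of mine.

```r
set.seed(1)  # arbitrary seed, for reproducibility only

# Fraction of samples of size n that pass Shapiro-Wilk (p > 0.05)
# when drawn from the skewed log-normal distribution used above.
pass_rate <- function(n, reps = 500) {
  p <- replicate(reps, shapiro.test(rlnorm(n, 0, 0.4))$p.value)
  mean(p > 0.05)
}

rates <- sapply(c("n = 5" = 5, "n = 10" = 10, "n = 1000" = 1000), pass_rate)
round(rates, 2)
```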

What have we learned?

With small sample sizes, everything looks normal.
The normality tests are, indeed, very sensitive to what goes on in the extreme tails.

In other words, if we have enough data to fail a normality test, we always will because our real-world data won’t be clean enough. If we don’t have enough data to reliably fail a normality test, then there’s no point in performing the test, and we have to rely on the fat pencil test or our own understanding of the underlying processes.

Don’t get too hung up on whether your data is normally distributed or not. When evaluating and summarizing data, rely mainly on your brain and use the statistics only to catch really big errors in judgement. When attempting to make predictions about future performance, e.g. calculating Cpk or simulating a process, recognize the opportunities for errors in judgment and explicitly state your assumptions.

Individuals and Moving Range Charts in R

Individuals and moving range charts, abbreviated as ImR or XmR charts, are an important tool for keeping a wide range of business and industrial processes in the zone of economic production, where a process produces the maximum value at the minimum costs.

While there are many commercial applications that will produce such charts, one of my favorites is the free and open-source software package R. The freely available add-on package qcc will do all the heavy-lifting. There is little documentation on how to create a moving range chart, but the code is actually quite simple, as shown below.

The individuals chart requires a simple vector of data. The moving range chart needs a two-column matrix arranged so that qcc() can calculate the moving range from each row.

library(qcc)
#' The data, from a sample published by Donald Wheeler
my.xmr.raw <- c(5045,4350,4350,3975,4290,4430,4485,4285,3980,3925,3645,3760,3300,3685,3463,5200)
#' Create the individuals chart and qcc object
my.xmr.x <- qcc(my.xmr.raw, type = "xbar.one", plot = TRUE)
#' Create the moving range chart and qcc object. qcc takes a two-column matrix
#' that is used to calculate the moving range: each row holds one value and the next.
#' head(x, -1) drops the last value and tail(x, -1) drops the first.
my.xmr.raw.r <- cbind(head(my.xmr.raw, -1), tail(my.xmr.raw, -1))
my.xmr.mr <- qcc(my.xmr.raw.r, type="R", plot = TRUE)

This produces the individuals chart:

The qcc individuals chart.

and the moving range chart:

The qcc moving range chart.

The code is also available as a gist.
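For readers who want to see roughly what the charts are built from, the moving ranges and the conventional XmR limits can be sketched in base R. The scaling constants 2.66 and 3.267 are the standard XmR chart constants for ranges of two consecutive points; they are not taken from the post, and the limits qcc reports may differ slightly in rounding.

```r
# Same data as above
x <- c(5045,4350,4350,3975,4290,4430,4485,4285,3980,3925,3645,3760,3300,3685,3463,5200)

# Moving range: absolute difference between consecutive values
mr <- abs(diff(x))
x_bar  <- mean(x)
mr_bar <- mean(mr)

# Conventional XmR limits (the moving range chart has no lower limit above 0)
limits <- c(
  x_lcl  = x_bar - 2.66 * mr_bar,
  x_ucl  = x_bar + 2.66 * mr_bar,
  mr_ucl = 3.267 * mr_bar
)
round(limits, 1)

# Indices of individual values outside the limits
out_of_limits <- unname(which(x > limits["x_ucl"] | x < limits["x_lcl"]))
out_of_limits
```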

References

R Core Team (2013). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL http://www.R-project.org/.
Scrucca, L. (2004). qcc: an R package for quality control charting and statistical process control. R News 4/1, 11-17.
Wheeler, Donald. “Individual Charts Done Right and Wrong.” Quality Digest. 2 Feb 2010. Print. <http://www.spcpress.com/pdf/DJW206.pdf>.

Tom Hopper

Competitive organizations through high-performance learning

six sigma
