The free and open-source R statistics package is a great tool for data analysis. The free add-on package qcc provides a wide array of statistical process control charts and other quality tools, which can be used for monitoring and controlling industrial processes, business processes or data collection processes. It’s a great package and highly customizable, but the one feature I wanted was the ability to manipulate the control charts within the grid graphics system, and that turned out to be not so easy.
I went all-in and completely rewrote qcc’s plot.qcc() function to use Hadley Wickham’s ggplot2 package, which itself is built on top of grid graphics. I have tested the new code against all the examples provided on the qcc help page, and the new ggplot2 version works for all the plots, including X-bar and R, p-, u- and c-charts.
In qcc, an individuals and moving range (XmR or ImR) chart can be created simply:
library(qcc)

# Sample data from Wheeler
my.xmr.raw <- c(5045, 4350, 4350, 3975, 4290, 4430, 4485, 4285,
                3980, 3925, 3645, 3760, 3300, 3685, 3463, 5200)
n <- length(my.xmr.raw)

# Individuals chart
x <- qcc(my.xmr.raw, type = "xbar.one",
         title = "Individuals Chart\nfor Wheeler sample data")

# Moving range chart: each row holds a pair of consecutive observations
x <- qcc(cbind(my.xmr.raw[1:(n - 1)], my.xmr.raw[2:n]), type = "R",
         title = "Moving Range Chart\nfor Wheeler sample data")
This both generates the plot and creates a qcc object, assigning it to the variable x. You can generate another copy of the plot with plot(x).
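As a rough cross-check on what the two charts display, the moving ranges and the conventional XmR limits can be computed by hand in base R. This is a minimal sketch, not qcc's own code: the constants 2.66 and 3.268 are the usual XmR scaling factors, and the variable names moving.range, mr.bar and xmr.limits are made up for illustration. The numbers may differ slightly from qcc's output in the last digits.

```r
# Wheeler sample data, as in the example above
my.xmr.raw <- c(5045, 4350, 4350, 3975, 4290, 4430, 4485, 4285,
                3980, 3925, 3645, 3760, 3300, 3685, 3463, 5200)

# Moving ranges: absolute differences between consecutive observations
moving.range <- abs(diff(my.xmr.raw))

center <- mean(my.xmr.raw)     # center line of the individuals chart
mr.bar <- mean(moving.range)   # center line of the moving range chart

# Conventional XmR scaling constants (assumed here, not taken from qcc)
xmr.limits <- c(lcl    = center - 2.66 * mr.bar,
                center = center,
                ucl    = center + 2.66 * mr.bar,
                mr.ucl = 3.268 * mr.bar)
print(round(xmr.limits, 1))
```

Using abs(diff()) gives the same ranges as the two-column matrix of consecutive pairs passed to qcc with type = "R".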
To use my new plot function, you will need to have the packages ggplot2, gtable, qcc and grid installed. Download my code from the qcc_ggplot project on GitHub, load qcc in R and then run source("qcc.plot.R"). The ggplot2-based version of the plotting function will be used whenever a qcc object is plotted.
library(qcc)
source("qcc.plot.R")  # replaces qcc's plot.qcc() with the ggplot2 version

my.xmr.raw <- c(5045, 4350, 4350, 3975, 4290, 4430, 4485, 4285,
                3980, 3925, 3645, 3760, 3300, 3685, 3463, 5200)
n <- length(my.xmr.raw)

# Individuals chart
x <- qcc(my.xmr.raw, type = "xbar.one",
         title = "Individuals Chart\nfor Wheeler sample data")

# Moving range chart: each row holds a pair of consecutive observations
x <- qcc(cbind(my.xmr.raw[1:(n - 1)], my.xmr.raw[2:n]), type = "R",
         title = "Moving Range Chart\nfor Wheeler sample data")
Below, you can compare the individuals and moving range charts generated by qcc and by my new implementation of plot.qcc():
New features
In addition to the standard features in qcc plots, I’ve added a few new options.
- size or cex: Sets the size of the points used in the plot. This is passed directly to geom_point().
- font.size: Sets the size of text elements. Passed directly to ggplot() and grid’s viewport().
- title = element_blank(): Eliminates the main graph title completely, and expands the data region to fill the empty space. As with qcc, with the default title = NULL a title will be created, or a user-defined text string may be passed to title.
- new.plot: If TRUE, creates a new graph (grid.newpage()). Otherwise, will write into the existing device and viewport. Intended to simplify the creation of multi-panel or composite charts.
- digits: The argument digits is provided by the qcc package to control the number of digits printed on the graph, where it either uses the default option set for R or a user-supplied value. I have tried to add some intelligence to calculating a default value under the assumption that we can tell something about the measurement from the data supplied. You can see the results in the sample graphs above.
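To give an idea of the kind of composite chart that writing into an existing viewport makes possible, here is a minimal sketch that stacks two plain ggplot2 plots in a two-row grid layout. It does not call the rewritten plot.qcc() or its new.plot option; the data frame chart.data and the plot names are invented for illustration.

```r
library(ggplot2)
library(grid)

# Made-up observations and their moving ranges
chart.data <- data.frame(obs = 1:8, value = c(5, 7, 6, 9, 8, 6, 7, 10))
chart.data$mr <- c(NA, abs(diff(chart.data$value)))

top.plot <- ggplot(chart.data, aes(x = obs, y = value)) +
  geom_line() + geom_point()

# Drop the first row, which has no moving range
bottom.plot <- ggplot(chart.data[-1, ], aes(x = obs, y = mr)) +
  geom_line() + geom_point()

# One page, two rows; each plot is printed into its own layout cell
grid.newpage()
pushViewport(viewport(layout = grid.layout(nrow = 2, ncol = 1)))
print(top.plot, vp = viewport(layout.pos.row = 1, layout.pos.col = 1))
print(bottom.plot, vp = viewport(layout.pos.row = 2, layout.pos.col = 1))
popViewport()
```

Passing a viewport through vp keeps print() from starting a new page, which is what allows both plots to share one device.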
Lessons Learned
This little project turned out to be somewhat more difficult than I had envisioned, and there are several lessons learned, particularly in the use of ggplot2.
First, ggplot2 really needs data frames when plotting. Passing discrete values or variables not connected to a data frame will often result in errors or just incorrect results. This is different from both base graphics and grid graphics, and while Hadley Wickham has mentioned this before, I hadn’t fully appreciated it. For instance, this doesn’t work very well:
library(ggplot2)

my.test.data <- data.frame(x = 1:10, y = round(runif(10, 100, 300)))
my.test.gplot <- ggplot(my.test.data, aes(x = x, y = y)) +
  geom_point(shape = 20)

index.1 <- c(5, 6, 7)
# Unreliable: subsetting inside aes() without a matching data frame
my.test.gplot <- my.test.gplot +
  geom_point(aes(x = x[index.1], y = y[index.1]), col = "red")
my.test.gplot
Different variations of this sometimes worked, or sometimes only plotted some of the points that are supposed to be colored red.
However, if I use index.1 to build a subset data frame and pass that to geom_point(), it works perfectly:
my.test.data <- data.frame(x = 1:10, y = round(runif(10, 100, 300)))
my.test.gplot <- ggplot(my.test.data, aes(x = x, y = y)) +
  geom_point(shape = 20)

index.1 <- c(5, 6, 7)
# Build a data frame holding only the points to highlight
my.test.subdata <- my.test.data[index.1, ]
my.test.gplot <- my.test.gplot +
  geom_point(data = my.test.subdata, aes(x = x, y = y), col = "red")
my.test.gplot
Another nice lesson was that aes() doesn’t always work properly when ggplot2 is called from within a function. In this case, aes_string() usually works. There’s less documentation than I would like on this, but you can search the ggplot2 Google Group or Stack Overflow for more information.
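A note for readers on newer ggplot2 releases: aes_string() has been deprecated since ggplot2 3.0.0, and the .data pronoun is the documented replacement when column names arrive as strings. The following is a minimal sketch assuming ggplot2 3.0.0 or later; the helper name plot_columns and its arguments are hypothetical and not part of qcc.plot.R.

```r
library(ggplot2)

# Hypothetical helper: the columns to plot are passed in as strings
plot_columns <- function(df, x.name, y.name) {
  ggplot(df, aes(x = .data[[x.name]], y = .data[[y.name]])) +
    geom_point(shape = 20)
}

my.test.data <- data.frame(x = 1:10, y = (1:10)^2)
my.test.gplot <- plot_columns(my.test.data, "x", "y")
my.test.gplot
```

The string lookup happens against the data frame when the plot is built, so the function does not depend on what is defined in the global environment.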
One of the bigger surprises was discovering that aes() looks in the global environment for any variables it cannot find in the supplied data frame. When ggplot() is used from within a function, though, any variables created within that function are not visible in the global environment. The work-around is to tell ggplot which environment to search in, and a simple addition of environment = environment() within the ggplot() call seems to do the trick. This is captured in a Stack Overflow post and the ggplot2 issue log.
my.test.data <- data.frame(x = 1:10, y = round(runif(10, 100, 300)))
# environment = environment() tells ggplot to look in the current environment
my.test.gplot <- ggplot(my.test.data, environment = environment(),
                        aes(x = x, y = y)) +
  geom_point(shape = 20)

index.1 <- c(5, 6, 7)
my.test.subdata <- my.test.data[index.1, ]
my.test.gplot <- my.test.gplot +
  geom_point(data = my.test.subdata, aes(x = x, y = y), col = "blue")
my.test.gplot
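For comparison, here is a sketch assuming ggplot2 3.0.0 or later, where aes() captures the environment it is called from and the environment argument is documented as deprecated. In that setting a variable local to the function should be found without any extra argument. The function name plot_with_offset and the variable local.offset are hypothetical.

```r
library(ggplot2)

# Hypothetical function: local.offset exists only inside the function,
# not in the data frame and not in the global environment
plot_with_offset <- function(df) {
  local.offset <- 100
  ggplot(df, aes(x = x, y = y + local.offset)) +
    geom_point(shape = 20)
}

offset.plot <- plot_with_offset(data.frame(x = 1:3, y = c(10, 20, 30)))
offset.plot
```

On older ggplot2 versions, like the one this post was written against, the same function is the case that needs environment = environment().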
Finally, it is possible to completely and seamlessly replace a function created in a package and loaded in that package’s namespace. When I set out, I wanted to end up with a complete replacement for qcc’s internal plot.qcc() function, but wasn’t quite sure this would be possible. Luckily, the code below, called after the function declaration, worked. One thing I found was that I needed to name my function the same as the one in the qcc package in order for the replacement to work in all cases. If I used a different name for my function, it would work when I called plot() with a qcc object, but qcc’s base graphics version would be used when calling qcc() with the parameter plot = TRUE.
# Replace qcc's plot.qcc() with the ggplot2-based version defined above
unlockBinding(sym = "plot.qcc", env = getNamespace("qcc"))
assignInNamespace(x = "plot.qcc", value = plot.qcc, ns = asNamespace("qcc"),
                  envir = getNamespace("qcc"))
assign("plot.qcc", plot.qcc, envir = getNamespace("qcc"))
lockBinding(sym = "plot.qcc", env = getNamespace("qcc"))
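The unlock, assign, relock sequence can be tried out safely on an ordinary environment before touching a real package. This is only a sketch of the pattern: fake.ns is a stand-in environment, not the qcc namespace, and plot.thing is a made-up function name.

```r
# Stand-in for a package namespace (an assumption for illustration)
fake.ns <- new.env()
assign("plot.thing", function() "original", envir = fake.ns)
lockBinding("plot.thing", fake.ns)

# Assigning to a locked binding raises an error, which is caught here
locked.result <- tryCatch({
  assign("plot.thing", function() "replacement", envir = fake.ns)
  "assigned"
}, error = function(e) "blocked")

# Unlock, replace, and lock again, mirroring the sequence used for plot.qcc()
unlockBinding("plot.thing", fake.ns)
assign("plot.thing", function() "replacement", envir = fake.ns)
lockBinding("plot.thing", fake.ns)

after.result <- get("plot.thing", envir = fake.ns)()
print(c(locked = locked.result, after = after.result))
```

Relocking at the end leaves the environment in the same protected state it started in, which is why the original code finishes with lockBinding().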
Outlook
For now, the code suits my immediate needs, and I hope that you will find it useful. I have some ideas for additional features that I may implement in the future. There are some parts of the code that can and should be further cleaned up, and I’ll tweak the code as needed. I am certainly interested in any bug reports and in seeing any forks; good ideas are always welcome.
References
- R Core Team (2013). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL http://www.R-project.org/.
- Scrucca, L. (2004). qcc: an R package for quality control charting and statistical process control. R News 4/1, 11-17.
- Wickham, H. (2009). ggplot2: Elegant Graphics for Data Analysis. Springer New York.
- Wheeler, Donald. “Individual Charts Done Right and Wrong.” Quality Digest. 2 Feb 2010. Print. <http://www.spcpress.com/pdf/DJW206.pdf>.
Would you be interested in using the grid to update https://public.opencpu.org/ocpu/github/qitools/charts/www/ ? The last update of qcc stopped allowing overlay regions and lines for targets in the setting of a trial.
I really like what you have done, but I ran into a problem. Here is a reproducible version.
library(qcc)
source("SPCinR/qcc.plot.R")
set.seed(749)
mydata <- rnorm(12)
newdata <- c(3.25)
qcc(mydata ,type="xbar.one",newdata=newdata)
I receive this error:
Error: `breaks` and `labels` must have the same length
Thank you, Todd. I just noticed this the other day, myself. What version of ggplot2 are you using?
How do you use qcc or your ggplot2-compatible version to draw an XmR-chart with both the individuals-chart and the corresponding range-chart combined? I only managed to plot a single chart using qcc, not a compound XmR-chart as suggested by Don Wheeler. Any hints how to achieve this in R?
qcc will only plot one at a time, so you have to create a type = "xbar.one" and a separate type = "R". It’s explained somewhere in the qcc documentation, but see my gist for a full example: https://gist.github.com/tomhopper/9000495
Using my ggplot2-based version of qcc.plot, you can use the package egg (available on GitHub via the devtools package), or grid, or gridExtra to combine the two into a single graph object.
Apologies for the delay in responding; I missed WordPress’s notification.
Hi Thomas,
I’m an R novice. In the last block of code in your qcc.plot.R file I got an error:
Error in lockBinding(sym = "plot.qcc", envir = getNamespace("qcc")) :
unused argument (envir = getNamespace("qcc"))
At first I thought it was because line 643 used env=getNamespace instead of envir=getNamespace. And indeed the error went away when I made that change. But once I saved the file, the error showed up again.
Any ideas on what I should change?
Ross
Ross, before sourcing my code, did you load the qcc package with “library(qcc)” or “require(qcc)”?
Hi,
Great work. I want to show both I-MR charts in one window. Can I do that?
And currently, I am getting this error while using your function.
Error in assignInNamespace(x = "plot.qcc", value = plot.qcc, ns = asNamespace("qcc"), :
no slot of name "methods" for this object of class "derivedDefaultMethod"
Plotting individuals and moving range together on one plot area is why I re-wrote plot.qcc(). However, for backward compatibility with qcc(), there is no built-in option to do this. You’ll need to do it either by using library(grid) and plot the qcc objects to viewports that you set up, or you’ll want to try to use functions for arranging ggplot2 plots, such as in cowplot or ggpubr.
Regarding the error…it looks like a change in R is causing an intermittent problem with assignInNamespace(). I’ll work on a permanent solution, but until I can update the code, try source() again, or read in the code (e.g. open the qcc.plot.R file and execute the code with the “run” command). It seems to sometimes throw an error, and sometimes work.