Combining Expert Judgements

Both product development and project planning often require making educated guesses. There are two models that I’ve seen for doing this. The first is to have a designated subject matter expert (SME) who provides The Answer. The second is to get a group of SMEs together to discuss and arrive at a consensus answer.

Under most circumstances, I’m not a fan of the first approach. Individuals are simply too fallible, too easily swayed by anecdotes rather than real data or too busy to consider a problem fully and in depth. The exception is when you are estimating work; then the SME is the person who will actually have to do the work, and their estimate is better than anyone else’s. The second solution often suffers from several other problems. Too many people working a problem can take a long time, answers can be driven by the most vocal or the most risk-averse members of the group and groups sometimes deliver bad answers due to diffusion of responsibility or other “group think” effects.

One solution to the problems of group decisions is to use Affinity Diagramming and the Analytical Hierarchy Process (AHP) to structure the problem, gather individual judgements of the alternatives, and then determine their relative values or importance. Affinity diagramming followed by AHP is incredibly powerful, and does a great job taking the individual bias out of the equation. I’ve used it with teams to rank the importance of product requirements, as a training exercise to accurately estimate the relative area of geometric shapes and personally to decide which car to buy. It works.

Unfortunately, AHP takes time, and for any but the simplest assessments, you really need custom software to support the process. Affinity diagramming can be done very effectively with Post-It notes, but the calculations of AHP cannot be easily set up in a program like Excel. It also requires discrete alternatives to choose from. For estimates of a single variable, such as lifetime, or other performance characteristics, I have had to develop a different approach.

The technique below works when you want to create a point estimate of a continuous variable. For instance, you might need to estimate product lifetime and establish a warrant period, or you might need to estimate a performance level that can be communicated to customers (implying a performance guarantee), or you might need to estimate the duration of a set of project activities. We can easily implement the calculations in Excel, R, or a similar tool.

Example

What we’re going to do is get the SMEs to individually estimate a range of possible results (low, most likely, high). Then we’ll generate triangular probability distributions from each estimate and finally combine those estimates. To get a robust result, we will account for the SME’s estimation of their own accuracy, and treat their different estimates discrete distributions when combining.

The below is the data entered by the SMEs.

Subject Matter Experts data entries for estimating a continuous variable

Using the above data, we can calculate triangular probability distributions for each, then combine them by treating them as discrete distributions. This produces the sequence of distributions, below.

Combining estimation distribution from Subject Matter Experts

We can then summarize the combined distribution with some useful values:

most likely: 7087
likely low: 4360
likely high: 10711
50% between 5162 and 8370

The process

  1. Bring a few SMEs together. While the technique will work with any number, usually five to six is more than sufficient, and ten should be considered the maximum for useful input. We’ll call this number N so we can refer to it later.
  2. Define the problem. Everyone should agree precisely what you are estimating.
  3. Agree on a basis for the evaluation. This might be an agreed conceptual model of how the product behaves or a set of criteria or goals to evaluate against.
  4. Collect three estimates—low, most likely, high—from each SMEs for the variable of interest (call this variable X).

Collecting a range is important; we need to be honest with ourselves that we don’t know what the value will actually be: there is uncertainty in the SME judgement—if we had the data to be more precise then we’d use that—and there’s variation to consider. Ranges allow us to derive probabilistic estimates that represent both the limits of our SME’s knowledge and the natural variation.

  1. Ask each SME for their assessment of three probabilities on a scale of 0% to 100%:
    1. How likely are we to see the real data fall within this range if your reasoning or model is correct? This will usually be very high, like 95% or higher.
    2. How likely are we to see the real data fall within this range even if your reasoning has some flaw? This may also be high, but you can use 50% if the answer is “I don’t know.”
    3. How likely is it that the reasoning is correct? Again, this will usually be pretty high. That’s why they’re SMEs.

At this point, you have all the information that you need from the SMEs and can proceed to the calculations.

  1. Using the three probabilities of the argument accuracy, calculate the values of X at end points for a triangular probability distribution.
  2. For each SME’s guesstimates, use the triangle distributions to generate a large number of probabilities across the total range of possible Xs.
  3. For each value of X used in the generation of the probability distributions, randomly choose a probability from one of the SMEs. For each value of X, we now have N probabilities that we treat as part of a discrete distribution.
  4. Use the resulting combined distribution to determine values of interest, e.g. the median, fifth percentile, ninety-fifth percentile and so on.

From a diverse set of estimates, you now have a single, robust estimation for a variable. The resulting probability distribution will not be smooth, but you will be able to pull out single values that are meaningful and robust, such as the ninety-fifth percentile for duration estimates.

A few tips

How you ask for the estimates matters. Just asking for “low,” “middle” and “high” estimates will get you very inconsistent results. Likewise, asking for “worst-case” or “best-case” will often get you some pretty wild estimates. You want to ask questions like “how long would this take if many things went right,” “how long would the product last in more severe operating conditions” or “what is the most likely power available?” You don’t want the “middle” estimate, but a “most likely” estimate; you’re not shooting for half-way between the low and high estimates, but for a high-probability point estimate.

When some SMEs know more about the variable being estimated than others, you could also weight their judgements. This weighting is used in the second to last step by adjusting the probability of randomly selecting a value from each SME’s probability distribution.

Issue Logs and Risk Registers

Every product development project includes uncertainty over what will happen. The uncertainty—each assumption or best guess—reduces our chances of project success. The job of the project manager and team members is to ensure success by managing risk.

When something goes wrong—deviates from the plan—it stops being a risk and becomes an issue that must be addressed to ensure success. Issues are those conditions that are having a negative impact on your ability to execute the project plan. You can easily identify them because they directly cause schedule slippage and extra work.

There are two simple tools that can—and should—be used on every project to manage risks and issues to prevent disaster. One is the risk register; the other is the issue log. In my experience, these two documents are often conflated, but they are distinct documents that should contain different information and drive different actions.

The risk register is a means of capturing risks that we want to monitor over the life of the project so that we can take action before they have a negative impact on the project. These are conditions that you have decided not to explicitly work into the plan, but don’t want to let “slip under the radar” to create big issues for you later.

The issue log is where you record any problems that were not accounted for in the plan and that threaten to delay the project, push it off budget or reduce the scope (e.g. reduce product performance).

Issue Log Risk Register
Description of the issue Description of the risk
Underlying problem or cause of the issue Risk profile—sources of uncertainty and the potential impact
Action plan Potential actions
Priority or scheduling Monitoring plan
Who is responsible for assuring this issue is resolved Who is responsible for monitoring
Date opened and date resolved, sometimes a tracking number or other ID Date last updated, tracking ID

Issue Log

The issue log is fundamentally about corrective actions. The project has deviated from the plan, and now we need to get back on course to complete the project on time, on budget and with the agreed goals. The issue log is used to capture this information.

While the cause of the problem is often obvious, it is always a good idea to probe for deeper, systemic causes that could lead to further delays. Asking “why?” five times in order to permanently and irrevocably fix a problem doesn’t take very long compared to the total delays that a project can experience.

Risk Register

The hard part of a risk register is the risk profile. Different people respond differently to risk, and some are more comfortable with thinking about uncertain outcomes than others. These differences between people lead to a lot of variation and debate in identifying risks; a good strategy for making risk registers easy is to standardize. The best practices are to focus on the causes of the risk and the probable impacts and to standardize the process.

There has been a lot written about risk management. Some of the best, in my opinion, is the work by De Meyer, Loch and Pich, which was first brought to my attention by Glenn Alleman over at the Herding Cats blog. In their excellent book, Managing the Unknown: A New Approach to Managing High Uncertainty and Risk in Projects, they break down risk into two major components: relationship complexity and task complexity.

When the relationships of stakeholders or partners are complex—groups aren’t aligned—then you can expect disagreements and conflict. Successful strategies for dealing with relationship complexity include increased communication and more rigidly defined relationships.

When tasks are complex—there are many links between tasks, so that changing one task can affect many, or there is a high degree of uncertainty in what needs to be done—then the successful strategies range from critical path management to an entrepreneurial approach of working multiple solutions in parallel (see also De Meyer 2001).

By implementing these pairings of source of risk with management strategy in a risk register template, we can greatly simplify the process and drive more consistent risk management results. Adding in a simple analysis of the impact can help us with prioritization (where do we spend our resources monitoring) and monitoring frequency.

Monitoring is all about how you will know when to do something about the risk. i.e. You want to decide in advance what condition will trigger you to transition this risk to the issue log. Measures should be relevant to the risk, quantitative where possible and the method of measurement should be clearly defined (you don’t want people disagreeing over the project plan just because they measure something differently). Set up measurement intervals that make sense by asking yourself how long you can go without knowing that you have a problem. Plot the results as a time series or on a control chart to allow you to distinguish between normal variation in the measurement and a condition that requires action.

References

  • Loch, Christoph H, Arnoud De Meyer, and Michael T Pich. Managing the Unknown. Hoboken, New Jersey: John Wiley & Sons, 2006. Print.
  • De Meyer, Arnoud, Christoph H Loch, and Michael T Pich. “Uncertainty and Project Management: Beyond the Critical Path Mentality.” 2001 : 1–23. Print.

Apple-Google Wage Fixing and Systems Thinking

It seems that some of the most successful people in the world, at some of the world’s largest and most respected companies, engaged in an apparently wide-spread and illegal effort to limit employees’ job opportunities and wages.

This was stupid. It was stupid for the obvious reason: it was illegal. It was stupid for a more insidious reason, though: it will backfire on the company. We can explore why if we use a couple of system archetypes to think about the situation. These archetypes will come in handy in a wide range of situations.

The problem was retaining talent, who could be easily enticed away by more attractive compensation packages or to work at more exciting companies. Either it was easier to get an attractive compensation packages at a competitor, or the work did not stay sufficiently interesting or engaging. Simply put: employees were not happy enough.

The fix was to limit recruitment among competing companies.

In systems thinking, this is a classic “shifting the burden” dynamic. In shifting the burden, pictured below, you have two types of solutions to the symptoms of a problem: the fundamental solution—a corrective action for the root cause—and the symptomatic solution. The symptomatic solution reduces the symptom, but also creates a side effect that has a negative impact on the fundamental problem. Symptomatic solutions only have a temporary benefit before things get worse.

The classic shifting the burden system archetype describes wage-fixing practices as solutions to employee turnover.

The classic shifting the burden system archetype describes wage-fixing practices as solutions to employee turnover.

The green links, labelled “+,” indicate that the two conditions at either end of the tail increase and decrease together. The application of more symptomatic solution causes an increase in the side effect; reducing the use of symptomatic solutions causes a decrease in the side effect. The red links, labelled “–,” indicate that the two conditions at either end work in opposite directions. An increase in the side effect causes a decrease in the effectiveness of the fundamental solution.

There are three cycles, or loops, in this diagram. Two of them are “balancing loops;” over time, one factor tends to balance out the other, and the situation stabilizes. The third loop is a “self-reinforcing loop;” such loops will “snow-ball” or continue increasing over time:

  1. Applying a symptomatic solution
  2. increases the side effect
  3. which reduces the effectiveness of the fundamental solution
  4. which increases the symptom
  5. which drives more application of the symptomatic solution
  6. and increases the side effect.

The solution is to focus on fundamental solutions—get at the root cause—and avoid or limit reliance on symptomatic solutions. Symptomatic solutions are always temporary and usually make things worse in the long term; fundamental solutions are permanent and don’t have negative side effects.

Why employees leave is also explained, in broad strokes, by another system archetype: the limits to growth.

The classic limits to growth system archetype describes why it’s hard to keep employees engaged and hold down turnover.

Here we have a self-reinforcing dynamic created by successes and interesting work with good compensation. Over time, employees should generate more success and have more interesting work and better compensation. However, this is coupled to a balancing loop that becomes stronger over time, and slows down the reinforcing loop, like the brakes in a car. Employees burn out, or generally stop producing as much. This balancing loop is driven by some limiting condition, which makes the slowing action—burnout or disengagement—stronger over time. While this simple version looks like it will lead to a steady state, more realistic versions often result in a crash, where the results not only level off, but actually decrease.

The solution to a limits to growth system is to attack the limiting condition. If employees get bored doing the same thing over time, then you have to find a way for them to be engaged with enough new, interesting work. If they can earn substantially better compensation packages at competitors, then you have to (approximately) match those packages.

If you don’t fix the limiting condition, you might see a temporary improvement via the dynamics of shifting the burden (or the closely-related archetype, fixes that fail), but in the long term the problem will only get worse.

A Simple Introduction to the Graphing Philosophy of ggplot2

“The emphasis in ggplot2 is reducing the amount of thinking time by making it easier to go from the plot in your brain to the plot on the page.” (Wickham, 2012)

“Base graphics are good for drawing pictures; ggplot2 graphics are good for understanding the data.” (Wickham, 2012)

I’m not ggplot2′s creator, Hadley Wickham, but I do find myself in discussions trying to explain how to build graphs in ggplot2. It’s a very elegant system, but also very different from other graphing systems. Once you understand the organizing philosophy, ggplot2 becomes very easy to work with.

The grammar of ggplot2 graphics

There is a basic grammar to all graphics production. In R‘s base graphics or in Excel, you feed ranges of data to a plot as x and y elements, then manipulate colors, scale dimensions and other parts of the graph as graphical elements or options.

ggplot2′s grammar makes a clear distinction between your data and what gets displayed on the screen or page. You feed ggplot2 your data, then apply a series of mappings and transformations to create a visual representation of that data. Even with base graphics or Excel we never really plot the data itself, we only create a representation; ggplot2 makes this distinction explicit. In addition, ggplot2′s structure makes it very easy to tweak a graph to look the way you want by adding mappings.

A ggplot2 graph is built up from a few basic elements:

1. Data The raw data that you want to plot
2. Geometries geom_ The geometric shapes that will represent the data.
3. Aethetics aes() Aesthetics of the geometric and statistical objects, such as color, size, shape and position.
4. Scales scale_ Maps between the data and the aesthetic dimensions, such as data range to plot width or factor values to colors

Putting it together, the code to build a ggplot2 graph looks something like:

data
+ geometry to represent the data,
+ aesthetic mappings of data to plot coordinates like position, color and size
+ scaling of ranges of the data to ranges of the aesthetics

A real example shows off how this all fits together.

library(ggplot2)
# Create some data for our example
some.data <- data.frame(timer = 1:12, 
                        countdown = 12:1, 
                        category = factor(letters[1:3]))
# Generate the plot
some.plot <- ggplot(data = some.data, aes(x = timer, y = countdown)) +
  geom_point(aes(colour = category)) +
  scale_x_continuous(limits = c(0, 15)) +
  scale_colour_brewer(palette = "Dark2") +
  coord_fixed(ratio=1)
# Display the plot
some.plot
Demonstration of the key concepts in the grammar of graphics: data, geometries, aesthetic mappings and scale mappings.

Demonstration of the key concepts in the grammar of graphics: data, geometries, aesthetic mappings and scale mappings.

Here you can see that the data is passed to ggplot(), aesthetic mappings between the data and the plot coordinates, a geometry to represent the data and a couple of scales to map between the data range and the plot ranges.

More advanced parts of the ggplot2 grammar

The above will get you a basic graph, but ggplot2 includes a few more parts of the grammar that you’ll want to be aware of as you try to visualize more complex data:

5. Statistical transformations stat_ Statistical summaries of the data that can be plotted, such as quantiles, fitted curves (loess, linear models, etc.), sums and so o.
6. Coordinate systems coord_ The transformation used for mapping data coordinates into the plane of the data rectangle.
7. Facets facet_ The arrangement of the data into a grid of plots (also known as latticing, trellising or creating small multiples).
8. Visual Themes theme The overall visual defaults of a plot: background, grids, axe, default typeface, sizes, colors, etc.

Hadley Wickham describes various pieces of this grammar in recorded presentations on Vimeo and YouTube and the online documentation to ggplot2. The most complete explanation is in his book ggplot2: Elegant Graphics for Data Analysis (Use R!) (Wickham, 2009).

References

Wickham, Hadley. ggplot2: Elegant Graphics for Data Analysis. Dordrecht, Heibelberg, London, New York: Springer, 2009. Print.
Wickham, Hadley. A Backstage Tour of ggplot2 with Hadley Wickham. 2012. Video. YouTube. Web. 21 Mar 2014. . Contributed by REvolutionAnalytics.

Normality and Testing for Normality

Many of our statistical tests make assumptions about the distribution of the underlying population. Many of the most common—ImR (XmR) and XbarR control charts, ANOVA, t-tests—assume normal distributions in the underlying population (or normal distributions in the residuals, in the case of ANOVA), and we’re often told that we must carefully check the assumptions.

At the same time, there’s a lot of conflicting advice about how to test for normality. There are the statistical tests for normality, such as Shapiro-Wilk or Anderson-Darling. There’s the “fat pencil” test, where we just eye-ball the distribution and use our best judgement. We could even use control charts, as they’re designed to detect deviations from the expected distribution. We are discouraged from using the “fat pencil” because it will result in a lot of variation from person to person. We’re often told not to rely too heavily on the statistical tests because they are not sensitive with small sample sizes and too sensitive to the tails. In industrial settings, our data is often messy, and the tails are likely to be the least reliable portion of our data.

I’d like to explore what the above objections really look like. I’ll use R to generate some fake data based on the normal distribution and the t distribution, and compare the frequency of p-values obtained from the Shapiro-Wilk test for normality.

A Function to test normality many times

First, we need to load our libraries

library(ggplot2)
library(reshape2)

To make this easy to run, I’ll create a function to perform a large number of normality tests (Shapiro-Wilk) for sample sizes n = 5, 10 and 1000, all drawn from the same data:

#' @name assign_vector
#' @param data A vector of data to perform the t-test on.
#' @param n An integer indicating the number of t-tests to perform. Default is 1000
#' @return A data frame in "tall" format
assign_vector <- function(data, n = 1000) {
  # replicate the call to shapiro.test n times to build up a vector of p-values
  p.5 <- replicate(n=n, expr=shapiro.test(sample(my.data, 5, replace=TRUE))$p.value)
  p.10 <- replicate(n=n, expr=shapiro.test(sample(my.data, 10, replace=TRUE))$p.value)
  p.1000 <- replicate(n=n, expr=shapiro.test(sample(my.data, 1000, replace=TRUE))$p.value)
  #' Combine the data into a data frame, 
  #' one column for each number of samples tested.
  p.df <- cbind(p.5, p.10, p.1000)
  p.df <- as.data.frame(p.df)
  colnames(p.df) <- c("5 samples","10 samples","1000 samples")
  #' Put the data in "tall" format, one column for number of samples
  #' and one column for the p-value.
  p.df.m <- melt(p.df)
  #' Make sure the levels are sorted correctly.
  p.df.m <- transform(p.df.m, variable = factor(variable, levels = c("5 samples","10 samples","1000 samples")))
  return(p.df.m)  
}

Clean, random data

I want to simulate real-word conditions, where we have an underlying population from which we sample a limited number of times. To start, I’ll generate 100000 values from a normal distribution. To keep runtimes low I’ll have assign_vector() sample from that distribution when performing the test for normality.

n.rand <- 100000
n.test <- 10000
my.data <- rnorm(n.rand)
p.df.m <- assign_vector(my.data, n = n.test)

We would expect that normally distributed random data will have an equal probability of any given p-value. i.e. 5% of the time we’ll see p-value ≤ 0.05, 5% of the time we’ll see p-value > 0.05 and ≤ 0.10, and so on through > 0.95 and ≤ 1.00. Let’s graph that and see what we get for each sample size:

ggplot(p.df.m, aes(x = value)) + 
  geom_histogram(binwidth = 1/10) + 
  facet_grid(facets=variable ~ ., scales="free_y") + 
  xlim(0,1) +
  ylab("Count of p-values") +
  xlab("p-values") +
  theme(text = element_text(size = 16))
Histogram of p-values for the normal distribution, for sample sizes 5, 10 and 1000.

Histogram of p-values for the normal distribution, for sample sizes 5, 10 and 1000.

This is, indeed, what we expected.

Now let’s compare the normal distribution to a t distribution. The t distribution would pass the “fat pencil” test—it looks normal to the eye:

ggplot(NULL, aes(x=x, colour = distribution)) + 
  stat_function(fun=dnorm, data = data.frame(x = c(-6,6), distribution = factor(1)), size = 1) + 
  stat_function(fun=dt, args = list( df = 20), data = data.frame(x = c(-6,6), distribution = factor(2)), linetype = "dashed", size = 1) + 
  scale_colour_manual(values = c("blue","red"), labels = c("Normal","T-Distribution")) +
  theme(text = element_text(size = 12),
        legend.position = c(0.85, 0.75)) +
  xlim(-4, 4) +
  xlab(NULL) +
  ylab(NULL)

Density plot of normal and t distributions

Starting with random data generated from the t-distribution:

my.data <- rt(n.rand, df = 20)
Histogram of p-values for the t distribution, for sample sizes 5, 10 and 1000.

Histogram of p-values for the t distribution, for sample sizes 5, 10 and 1000.

The tests for normality are not very sensitive for small sample sizes, and are much more sensitive for large sample sizes. Even with a sample size of 1000, the data from a t distribution only fails the test for normality about 50% of the time (add up the frequencies for p-value > 0.05 to see this).

Testing the tails

Since the t distribution is narrower in the middle range and has longer tails than the normal distribution, the normality test might be failing because the entire distribution doesn’t look quite normal; we haven’t learned anything specifically about the tails.

To test the tails, we can construct a data set that uses the t distribution for the middle 99% of the data, and the normal distribution for the tails.

my.data <- rt(n.rand, df = 20)
my.data.2 <- rnorm(n.rand)
# Trim off the tails
my.data <- my.data[which(my.data < 3 & my.data > -3)]
# Add in tails from the other distribution
my.data <- c(my.data, my.data.2[which(my.data.2 < -3 | my.data.2 > 3)])
Histogram of p-values for sample sizes 5, 10 and 1000, from a data set constructed from the t distribution in the range -3 to +3 sigmas, with tails from the normal distribution below -3 and above +3.

Histogram of p-values for sample sizes 5, 10 and 1000, from a data set constructed from the t distribution in the range -3 to +3 sigmas, with tails from the normal distribution below -3 and above +3.

Despite 99% of the data being from the t distribution, this is almost identical to our test with data from just the normal distribution. It looks like the tails may be having a larger impact on the normality test than rest of the data

Now let’s flip this around: data that is 99% normally-distributed, but using the t distribution in the extreme tails.

my.data <- rnorm(n.rand)
my.data.2 <- rt(n.rand, df = 20)
# Trim off the tails
my.data <- my.data[which(my.data < 3 & my.data > -3)]
# Add in tails from the other distribution
my.data <- c(my.data, my.data.2[which(my.data.2 < -3 | my.data.2 > 3)])
Histogram of p-values for sample sizes 5, 10 and 1000, from a data set constructed from the normal distribution in the range -3 to +3 sigmas, with tails from the t-distribution below -3 and above +3.

Histogram of p-values for sample sizes 5, 10 and 1000, from a data set constructed from the normal distribution in the range -3 to +3 sigmas, with tails from the t-distribution below -3 and above +3.

Here, 99% of the data is from the normal distribution, yet the normality test looks almost the same as the normality test for just the t-distribution. If you check the y-axis scales carefully, you’ll see that the chance of getting p-value ≤ 0.05 is a bit lower here than for the t distribution.

To make the point further, suppose we have highly skewed data:

my.data <- rlnorm(n.rand, 0, 0.4)

This looks like:
Histogram of log-normal data

For small sample sizes, even this is likely to pass a test for normality:
Histogram of p-values for a log-normal distribution

What have we learned?

  • With small sample sizes, everything looks normal.
  • The normality tests are, indeed, very sensitive to what goes on in the extreme tails.

In other words, if we have enough data to fail a normality test, we always will because our real-world data won’t be clean enough. If we don’t have enough data to reliably fail a normality test, then there’s no point in performing the test, and we have to rely on the fat pencil test or our own understanding of the underlying processes.

Don’t get too hung up on whether your data is normally distributed or not. When evaluating and summarizing data, rely mainly on your brain and use the statistics only to catch really big errors in judgement. When attempting to make predictions about future performance, e.g. calculating Cpk or simulating a process, recognize the opportunities for errors in judgment and explicitly state you assumptions.

Rewriting plot.qcc using ggplot2 and grid

The free and open-source R statistics package is a great tool for data analysis. The free add-on package qcc provides a wide array of statistical process control charts and other quality tools, which can be used for monitoring and controlling industrial processes, business processes or data collection processes. It’s a great package and highly customizable, but the one feature I wanted was the ability to manipulate the control charts within the grid graphics system, and that turned out to be not so easy.

I went all-in and completely rewrote qcc’s plot.qcc() function to use Hadley Wickham’s ggplot2 package, which itself is built on top of grid graphics. I have tested the new code against all the examples provided on the qcc help page, and the new ggplot2 version works for all the plots, including X-bar and R, p- and u- and c-charts.

In qcc, an individuals and moving range (XmR or ImR) chart can be created simply:

library(qcc)
my.xmr.raw <- c(5045,4350,4350,3975,4290,4430,4485,4285,3980,3925,3645,3760,3300,3685,3463,5200)
x <- qcc(my.xmr.raw, type = "xbar.one", title = "Individuals Chart\nfor Wheeler sample data")
x <- qcc(matrix(cbind(my.xmr.raw[1:length(my.xmr.raw)-1], my.xmr.raw[2:length(my.xmr.raw)]), ncol = 2), type = "R", title = "Moving Range Chart\nfor Wheeler sample data")

This both generates the plot and creates a qcc object, assigning it to the variable x. You can generate another copy of the plot with plot(x).

To use my new plot function, you will need to have the packages ggplot2, gtable, qcc and grid installed. Download my code from the qcc_ggplot project on Github, load qcc in R and then run source("qcc.plot.R"). The ggplot2-based version of the plotting function will be used whenever a qcc object is plotted.

library(qcc)
source("qcc.plot.R")
my.xmr.raw <- c(5045,4350,4350,3975,4290,4430,4485,4285,3980,3925,3645,3760,3300,3685,3463,5200)
x <- qcc(my.xmr.raw, type = "xbar.one", title = "Individuals Chart\nfor Wheeler sample data")
x <- qcc(matrix(cbind(my.xmr.raw[1:length(my.xmr.raw)-1], my.xmr.raw[2:length(my.xmr.raw)]), ncol = 2), type = "R", title = "Moving Range Chart\nfor Wheeler sample data")

Below, you can compare the individuals and moving range charts generated by qcc and by my new implementation of plot.qcc():

The qcc individuals chart as implemented in the qcc package.

The qcc individuals chart as implemented in the qcc package.

The qcc individuals chart as implemented using ggplot2 and grid graphics.

The qcc individuals chart as implemented using ggplot2 and grid graphics.

The qcc moving range chart as implemented in the qcc package.

The qcc moving range chart as implemented in the qcc package.

The qcc moving range chart as implemented using ggplot2 and grid graphics.

The qcc moving range chart as implemented using ggplot2 and grid graphics.

New features

In addition to the standard features in qcc plots, I’ve added a few new options.

size or cex
Set the size of the points used in the plot. This is passed directly to geom_point().
font.size
Sets the size of text elements. Passed directly to ggplot() and grid’s viewport().
title = element_blank()
Eliminate the main graph title completely, and expand the data region to fill the empty space. As with qcc, with the default title = NULL a title will be created, or a user-defined text string may be passed to title.
new.plot
If TRUE, creates a new graph (grid.newpage()). Otherwise, will write into the existing device and viewport. Intended to simplify the creation of multi-panel or composite charts.
digits
The argument digits is provided by the qcc package to control the number of digits printed on the graph, where it either uses the default option set for R or a user-supplied value. I have tried to add some intelligence to calculating a default value under the assumption that we can tell something about the measurement from the data supplied. You can see the results in the sample graphs above.

Lessons Learned

This little project turned out to be somewhat more difficult than I had envisioned, and there are several lessons-learned, particularly in the use of ggplot2.

First, ggplot2 really needs data frames when plotting. Passing discrete values or variables not connected to a data frame will often result in errors or just incorrect results. This is different than either base graphics or grid graphics, and while Hadley Wickham has mentioned this before, I hadn’t fully appreciated it. For instance, this doesn’t work very well:

my.test.data <- data.frame(x = seq(1:10), y = round(runif(10, 100, 300)))
my.test.gplot <- ggplot(my.test.data, aes(x = x, y = y)) + 
  geom_point(shape = 20)
index.1 <- c(5, 6, 7)
my.test.gplot <- my.test.gplot +
  geom_point(aes(x = x[index.1], y = y[index.1]), col = "red")
my.test.gplot

Different variations of this sometimes worked, or sometimes only plotted some of the points that are supposed to be colored red.

However, if I wrap that index.1 into a data frame, it works perfectly:

my.test.data <- data.frame(x = seq(1:10), y = round(runif(10, 100, 300)))
my.test.gplot <- ggplot(my.test.data, aes(x = x, y = y)) + 
  geom_point(shape = 20)
index.1 <- c(5, 6, 7)
my.test.subdata <- my.test.data[index.1,]
my.test.gplot <- my.test.gplot +
  geom_point(data = my.test.subdata, aes(x = x, y = y), col = "red")
my.test.gplot

Another nice lesson was that aes() doesn’t always work properly when ggplot2 is called from within a function. In this case, aes_string() usually works. There’s less documentation than I would like on this, but you can search the ggplot2 Google Group or Stack Overflow for more information.

One of the bigger surprises was discovering that aes() searches for data frames in the global environment. When ggplot() is used from within a function, though, any variables created within that function are not accessible in the global environment. The work-around is to tell ggplot which environment to search in, and a simple addition of environment = environment() within the ggplot() call seems to do the trick. This is captured in a stack overflow post and the ggplot2 issue log.

my.test.data <- data.frame(x = seq(1:10), y = round(runif(10, 100, 300)))
my.test.gplot <- ggplot(my.test.data, environment = environment(), aes(x = x, y = y)) + 
  geom_point(shape = 20)
index.1 <- c(5, 6, 7)
my.test.subdata <- my.test.data[index.1,]
my.test.gplot <- my.test.gplot +
  geom_point(data = my.test.subdata, aes(x = x, y = y), col = "blue")
my.test.gplot

Finally, it is possible to completely and seamlessly replace a function created in a package and loaded in that package’s namespace. When I set out, I wanted to end up with a complete replacement for qcc’s internal plot.qcc() function, but wasn’t quite sure this would be possible. Luckily, the below code, called after the function declaration, worked. One thing I found was that I needed to name my function the same as the one in the qcc package in order for the replacement to work in all cases. If I used a different name for my function, it would work when I called plot() with a qcc object, but qcc’s base graphics version would be used when calling qcc() with the parameter plot = TRUE.

unlockBinding(sym="plot.qcc", env=getNamespace("qcc"));
assignInNamespace(x="plot.qcc", value=plot.qcc, ns=asNamespace("qcc"), envir=getNamespace("qcc"));
assign("plot.qcc", plot.qcc, envir=getNamespace("qcc"));
lockBinding(sym="plot.qcc", env=getNamespace("qcc"));

Outlook

For now, the code suits my immediate needs, and I hope that you will find it useful. I have some ideas for additional features that I may implement in the future. There are some parts of the code that can and should be further cleaned up, and I’ll tweak the code as needed. I am certainly interested in any bug reports and in seeing any forks; good ideas are always welcome.

References

  • R Core Team (2013). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL http://www.R-project.org/.
  • Scrucca, L. (2004). qcc: an R package for quality control charting and statistical process control. R News 4/1, 11-17.
  • H. Wickham. ggplot2: elegant graphics for data analysis. Springer New York, 2009.
  • Wheeler, Donald. “Individual Charts Done Right and Wrong.” Quality Digest. 2 Feb 20102 Feb 2010. Print. <http://www.spcpress.com/pdf/DJW206.pdf>.

Individuals and Moving Range Charts in R

Individuals and moving range charts, abbreviated as ImR or XmR charts, are an important tool for keeping a wide range of business and industrial processes in the zone of economic production, where a process produces the maximum value at the minimum costs.

While there are many commercial applications that will produce such charts, one of my favorites is the free and open-source software package R. The freely available add-on package qcc will do all the heavy-lifting. There is little documentation on how to create a moving range chart, but the code is actually quite simple, as shown below.

The individuals chart requires a simple vector of data. The moving range chart needs a two-column matrix arranged so that qcc() can calculate the moving range from each row.

library(qcc)
#' The data, from sample published by Donald Wheeler
my.xmr.raw <- c(5045,4350,4350,3975,4290,4430,4485,4285,3980,3925,3645,3760,3300,3685,3463,5200)
#' Create the individuals chart and qcc object
my.xmr.x <- qcc(my.xmr.raw, type = "xbar.one", plot = TRUE)
#' Create the moving range chart and qcc object. qcc takes a two-column matrix
#' that is used to calculate the moving range.
my.xmr.raw.r <- matrix(cbind(my.xmr.raw[1:length(my.xmr.raw)-1], my.xmr.raw[2:length(my.xmr.raw)]), ncol=2)
my.xmr.mr <- qcc(my.xmr.raw.r, type="R", plot = TRUE)

This produces the individuals chart:

The qcc individuals chart as implemented in the qcc package.

The qcc individuals chart.

and the moving range chart:

The qcc moving range chart as implemented in the qcc package.

The qcc moving range chart.

The code is also available as a gist.

References

  • R Core Team (2013). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL http://www.R-project.org/.
  • Scrucca, L. (2004). qcc: an R package for quality control charting and statistical process control. R News 4/1, 11-17.
  • Wheeler, Donald. “Individual Charts Done Right and Wrong.” Quality Digest. 2 Feb 20102 Feb 2010. Print. <http://www.spcpress.com/pdf/DJW206.pdf>.