Definitions

I was recently asked a question that raised some good design issues. The question went “why should changing this cause a change in that characteristic?”

The immediate and obvious answer was that it wouldn’t and couldn’t. Theoretically, a large decrease in this (X) might cause an increase of a few percent in that (Y); nothing more. Yet someone was claiming that decreasing X decreased Y, too.

They were right. No, the theoretical relationship isn’t wrong. It’s right.

The theoretical calculation is fairly straightforward. You put so much of X in, and, after some calculation, you get so much of Y out. The less X you have, the more Y you get. The hard part is figuring out just how much of X you’re putting in.

The measurement of Y introduces a bunch of variation based on other factors. You measure by changing certain conditions A, B and C. These, in turn, affect some other factors, M and N. X, A, M and N together determine what value you measure for Y.

So decreasing X affects the other factors in such a way that the net effect is a decrease in the measured value of Y.

“Oh, sure,” you respond. “But the theoretical calculation should account for that.”

Not really. The theoretical calculation should tell us what the best case is…what our target should be. The actual measurement is going to produce different results based on various factors, some of which we control and some we can’t. A calculation based on the measurement process would require uncertainty ranges and return a probability distribution; not a single value. Messy.
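To make that concrete, here’s a minimal sketch in R. The relationship between X and Y, the uncertainties and the measurement noise are all invented for illustration; they have nothing to do with the actual characteristics in question. The point is only that the theoretical definition hands back a single number while the operational one hands back a spread:

```r
# Hypothetical illustration: theoretical definition vs. operational definition.
# The formula and all the numbers below are invented.

theoretical_y <- function(x) 100 / (1 + x)   # invented relationship: less X in, more Y out

x_nominal <- 4
theoretical_y(x_nominal)                     # the single "best case" number

# Operationally, how much X went in and the measurement conditions (A, B, C...)
# are only known within some uncertainty, so simulate the measurement process.
set.seed(1)
n <- 10000
x_actual   <- rnorm(n, mean = x_nominal, sd = 0.3)  # uncertainty in the X input
condition  <- rnorm(n, mean = 1.0, sd = 0.05)       # drift in the measurement conditions
measured_y <- theoretical_y(x_actual) * condition + rnorm(n, sd = 0.5)  # measurement noise

# The operational result is a distribution, not a single value.
quantile(measured_y, c(0.025, 0.5, 0.975))
hist(measured_y, breaks = 50, main = "Measured Y", xlab = "Y")
```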

Engineers and researchers need to consider both of these as definitions. If you’re designing for some characteristic, as a researcher or engineer you’re usually going to be concerned with the theoretical calculation. This is how you were taught in school, and you’ll naturally be interested in getting as close to the best case as possible. However, not everyone is going to be interested in the theoretical calculation. The folks in Quality who are checking the product for conformance will be more interested in how it’s measured, the operational definition, than in the theoretical definition. The manufacturing plant only wants to hear about the operational definition; for them, the world would be a better place without the theoretical definition.

As a design engineer, you need to be more concerned with the operational definition. You’ll be arguing that you designed a part for Y performance (or to “do Y”). The next question that management and your customers should (and probably will) ask is: how do you know you designed it to do that? The answer is always by data analysis. How do you get the data? Via the operational definition. What you know is determined by how you measure, and that’s the operational definition.

This has applicability well outside of engineering design. Physicists have been arguing this very point ever since Bohr and Heisenberg developed the Copenhagen interpretation of quantum physics. Management by objectives depends on the ability to close the loop by measuring outcomes. This means that management by objectives requires operational definitions of every objective (though few organizations actually get this far, and management by objectives becomes management by manager gut feeling). Even more enlightened management techniques, such as those advocated by Deming and Scholtes, require operational definitions to enable an organization’s performance improvement (e.g. through the use of control charts, which are only possible with operational definitions).

Use the theoretical definition to tell you the best possible case, but be sure to design according to the operational definition.

Successful Labs

I’ve worked in and managed a few R&D and test labs in my career. Lately, I’ve been thinking about success factors for those labs. At a high level, I believe that a successful lab has four key attributes: good data; the right data; the ability to communicate the data clearly and effectively; and a relentless pursuit of perfection.

Good Data

Good data is accurate, has a known precision, and is reproducible by your lab and by third parties. This requires good calibration practices, careful evaluation of measurement uncertainty and documented test methods. In short, you need:

  • good calibration procedures and a calibration schedule that keeps equipment in calibration;
  • good procedures for measuring and documenting the measurement uncertainty and sources of error;
  • good procedures for how to set up tests, collect data and then fold the information from calibration and measurement system analysis into your data analysis (a rough sketch of this last step follows the list).
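As a rough sketch of that last item, here’s how calibration information and repeatability might be folded into a reported result in R. The readings, the calibration correction and the uncertainties are all invented; the technique is the usual root-sum-square combination with a coverage factor:

```r
# Hypothetical illustration: combine calibration and repeatability information
# into a reported result. All numbers are invented.

readings <- c(10.21, 10.18, 10.24, 10.20, 10.22)  # repeat measurements of the same part

cal_offset <- -0.03   # correction from the calibration certificate
u_cal      <-  0.02   # standard uncertainty of that correction

corrected_mean <- mean(readings) + cal_offset
u_repeat <- sd(readings) / sqrt(length(readings))  # repeatability of the mean

# Combine the independent uncertainty components in quadrature,
# then expand with a coverage factor of k = 2 (roughly 95 % coverage).
u_combined <- sqrt(u_repeat^2 + u_cal^2)
U_expanded <- 2 * u_combined

cat(sprintf("Result: %.3f +/- %.3f (k = 2)\n", corrected_mean, U_expanded))
```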

Yes, this all boils down to standard work. I’ve said it before, and I’ll say it again. Despite the common opinion that standard work is an impediment to creative R&D-type work, I’ve found just the opposite to be true.

This is harder to accomplish than it sounds, but there are plenty of resources available to help. There’s even an international standard that a lab can be accredited against: ISO/IEC 17025.

The Right Data

Ensuring that you’re collecting the right data makes collecting good data look easy. Getting the right data means performing a test that provides useful information. There are two components to this: testing the expected conditions; and testing the boundary conditions.

If you’re doing your own testing, then testing the expected conditions is easy. You know what you’re thinking and what you expect, and you go test it. If I want to test whether water freezes at zero degrees Celsius, then I test it. However, if the testing is outsourced, then things get complicated. Suppose I live in Denver, Colorado, and I want to test whether water freezes at zero degrees Celsius in the winter. To test it myself, I might stick a thermometer in a glass of water, put it outside and wait. Suppose, though, that the testing is outsourced to a lab somewhere like Bangladesh, which is hotter, lower in altitude and more humid than Denver. They can provide an answer, but will they address my intent? As the test requester, I may not ask the right question; as the test group, they may leap to test without fully understanding why I’m asking. This sort of confusion actually happens quite frequently, even when the test group and the requester are in the same building. It can be months before people realize that their question was only partially answered.

Paradoxically, testing the boundary conditions is difficult when you’re doing your own testing, while the communication errors described above make it easier for an outside lab to test the boundary conditions. In fact, it’s almost inevitable that an outside lab will test some of them. The reason is that boundary conditions are defined by one’s assumptions, and people are generally pretty poor at identifying and thinking through their own assumptions. Mentally, we tear right through the assumptions to get to the interesting bits. The outside lab, though, isn’t going to be testing quite the expected conditions; they’ll always be nearer to at least one set of boundary conditions.

One common solution used by labs is to develop a detailed questionnaire to try to force their customers to detail their request in the lab’s terms. I’ve done this myself. It doesn’t work. A questionnaire, like a checklist, can help capture the things you know you’d otherwise overlook, but neither a questionnaire nor a checklist can bridge a communication gap.

The solution that works is to send the lab personnel out into the gemba; to go and see the customer’s world and understand what they’re doing and why they’re making their request. This is difficult. The lab may be geographically distant from the gemba. People working in a lab often got there by being independent thinkers and workers, and not by being very gregarious. Labs are also paid or evaluated for the testing they do; not for the customer visits they make. A lab manager needs to be able to overcome these obstacles.

Communication

Once the lab has the right, good data, there’s still one big challenge left. All that data goes to waste if it isn’t communicated effectively. This means understanding that the data tells a story, knowing what that story is, and then telling that story honestly and with clarity. One could write several books on this subject. I direct your attention to the exceptional works of Edward Tufte, especially The Visual Display of Quantitative Information, an excellent NASA report by S. Katzoff titled Clarity in Technical Reporting, and the deep works of William S. Cleveland, including The Elements of Graphing Data and Visualizing Data. If you don’t have these in your library, then you’re probably not communicating as clearly and effectively as you should.

Effective communication is critical even when you’re doing your own testing. It provides a record of your work so that others can follow in your footsteps. Without effective communication, whatever you learn stays with you, and you lose the ability to leverage the ideas and experience of others.

Pursuing Perfection

While we have to get the job done today, we probably haven’t delivered everything our customers needed. There’s always opportunity for improvement. A good lab recognizes this, constantly engages in self-reflection and finds ways to improve. This is a people-based activity. A good manager enables this critical self-reflection and supports and encourages the needed changes.

Such critical self-reflection is not always easy. In many business environments, an admission of imperfection is an invitation to be attacked, demoted, or fired. Encouraging the self-reflection that is crucial to improvement requires building trust with your employees; providing a safe environment. Employees have to be comfortable talking about their own professional faults and having others talk about those faults, and they have to believe that they can improve.

Being genuine and honest can help a manager move their group in this direction. Ensuring that there are no negative consequences to the pursuit of perfection will also help. Unfortunately, it’s not entirely up to the manager; it’s a question of politics, policies and corporate culture. Effective managers and leaders need to navigate these waters for the good of their team. They’ll do this better with the support and involvement of their team.

Innumeracy

A number of blogs that I follow are talking about a recent article in the Wall Street Journal, We’re Number One, Alas. The author argues that the U.S.’s corporate tax rate is too high, claiming that countries with somewhat lower corporate tax rates generate more revenue from those taxes as a fraction of GDP. He uses the graph below to make his point.

Corporate Taxes and Revenue, 2004

The Laffer Curve on this graph is claimed to show the relationship between revenue and tax rate. A Laffer curve is based on the hypothesis that a government generates zero revenue when the tax rate is at either zero percent or one hundred percent, and that the maximum revenue from taxes falls somewhere in between these two extremes. The author is claiming that the optimum is below the current U.S. rate, and illustrates this by placing the U.S. on the far side of a big cliff.

This is an egregious case of innumeracy. I told myself when I started blogging that I would steer clear of the blog echo chamber as much as possible, but I’m making an exception here because similar presentations are not at all uncommon in the corporate world. Some data points are plotted, and then some chartjunk is added to tell a story…a story that may not be supported by the data at all. There are a few things that managers and engineers can do to combat this. For instance, if there’s supposed to be a correlation between two quantities, we can ask for the correlation coefficient.

In this case, the correlation coefficient is about 0.1. The fraction of variation explained is the square of the correlation coefficient, so this says that only about one percent of the variation in revenues from taxes is attributable to the tax rate. If you were working on process improvement, you would not want to focus on a factor that accounted for so small a share of the variation; you would be looking for a factor that explained greater than fifty percent.

Another approach would be to ask for a hypothesis test. The null hypothesis is that there is no correlation; the alternate hypothesis is that there is some correlation. As a business manager, you want to select the level of risk that you’re willing to accept. This is an economic decision, as risk analysis usually is. For the sake of argument, we will accept a risk of five percent. This is our “alpha” value (α-value), which we’ll express as a fraction: 0.05. We now need to perform the appropriate hypothesis test and compare the resulting p-value against our α-value. If the p-value is lower than the α-value, we conclude that there is a statistically significant correlation; if the p-value is greater than the α-value, we cannot conclude that there is any correlation.

There are plenty of statistics packages out there that can perform these analyses for us. Some are easier to use than others; some are more powerful than others. We use Minitab at work, and I find it indispensable. I also drop into R occasionally. R is much more powerful, and it’s free, but it’s all command-line programming, so it also has a much steeper learning curve.
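Here’s roughly what that hypothesis test looks like in R. The tax-rate and revenue numbers below are made up for illustration; they are not the OECD data behind the Wall Street Journal graph:

```r
# Hypothetical illustration: test for a linear correlation between corporate
# tax rate and tax revenue as a fraction of GDP. The data are invented.

tax_rate <- c(0.12, 0.18, 0.25, 0.28, 0.30, 0.33, 0.35, 0.39)
revenue  <- c(0.025, 0.031, 0.028, 0.035, 0.033, 0.030, 0.037, 0.021)

result <- cor.test(tax_rate, revenue)  # Pearson correlation; H0: no correlation

result$estimate     # the correlation coefficient r
result$estimate^2   # r squared: the fraction of variation explained
result$p.value      # to be compared against the alpha chosen in advance

alpha <- 0.05
if (result$p.value < alpha) {
  cat("Statistically significant linear correlation at the chosen alpha.\n")
} else {
  cat("No evidence of a linear correlation at the chosen alpha.\n")
}
```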

The p-value on this data is greater than 0.05, which means we cannot conclude that there is any linear correlation between the revenue and the tax rate.

Linear, though, means you have a straight line, and the Laffer curve is not linear by definition. The data fails our first tests, but the assumptions in our tests may have driven us to a false failure.

Let’s go back and start by plotting just the data.

Corporate Taxes and Revenue, 2004, data only

The exaggerated Laffer curve in the original presentation is not evident in this data. Excluding the outlier where revenue is 0.1 of GDP (looking at the Wall Street Journal’s graph, we see that this is for Norway), the data is roughly linear: zero revenue at a zero percent tax rate, and slightly increasing revenue with increasing tax rate. There may be a slight rounding-off or flattening in the tax rate range 0.20 to 0.35.

Since we do not know what shape the Laffer curve should take—where the maximum should be—and we don’t have enough data to find it, we can use the Lowess algorithm to create a smoothed curve.

Corporate Taxes and Revenue, 2004, with Lowess curve

This confirms our observation that the relationship is essentially linear, with a possible rounding off above 0.20. I’ve added a rug plot to the axes, which gives a tick for every data point. This is useful because it helps us to focus on the distribution of the data, much as a separate histogram would.
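For the curious, a plot along these lines can be put together in R with the base lowess() and rug() functions. The data here are invented stand-ins, not the article’s numbers:

```r
# Hypothetical illustration: scatter plot with a lowess smooth and rug marks.
# The data are invented stand-ins for the tax rate / revenue figures.

set.seed(2)
tax_rate <- runif(25, 0.05, 0.40)
revenue  <- 0.08 * tax_rate + rnorm(25, sd = 0.01)  # roughly linear, with noise

plot(tax_rate, revenue,
     xlab = "Corporate tax rate",
     ylab = "Tax revenue (fraction of GDP)",
     main = "Corporate Taxes and Revenue (illustrative data)")
lines(lowess(tax_rate, revenue), col = "blue")  # smooth without assuming a shape
rug(tax_rate, side = 1)  # a tick for every data point along each axis
rug(revenue,  side = 2)
```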

Where does all this get us? It tells us that the author’s curve was most likely drawn in just to make his point and does not fit any data or data analysis. It also tells us that the author’s story had nothing to do with the data.

I have seen this many times in the corporate world. Graphs of neat lines, where all the data points have been averaged out and left out. Graphs where a preconceived curve is fitted to data without regard for how well (or poorly) the curve fits the data. Data may be removed completely and fitted curves smoothed until they “look right.”

Combating this is not hard. It just takes some thought and a few questions.

First, make sure you actually see the data, and not just some prettified lines. Graphs should contain data points, and real data usually is not terribly clean or smooth.

Second, when a curve is fit to the data, ask how well it fits: ask for the correlation, or better, for r², the fraction of the variation explained. Anything less than 0.5 (or 50%) means there is no useful relationship; the closer to 1 (or 100%), the better. Ask, too, what the basis of the fitted line is: is this just some high-order polynomial or spline that was selected because of the high correlation, or is there a solid physical or theoretical basis for the selected line? If there is no physical basis, straight lines are preferable to curves.
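To see why a high correlation from an arbitrary curve proves little, here’s a small illustration in R with invented data: a ninth-order polynomial chases the noise and reports a much higher r² than the straight line, even though the underlying relationship really is a straight line plus noise.

```r
# Hypothetical illustration: an arbitrary high-order polynomial "fits" noisy
# data far better than a straight line, but the high r-squared is meaningless.

set.seed(3)
x <- seq(0, 1, length.out = 12)
y <- 2 * x + rnorm(12, sd = 0.7)  # truly linear relationship plus noise

fit_line <- lm(y ~ x)             # the honest straight line
fit_poly <- lm(y ~ poly(x, 9))    # ninth-order polynomial chasing the noise

summary(fit_line)$r.squared  # modest
summary(fit_poly)$r.squared  # close to 1, but physically meaningless
```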

Third, ask for a numerical analysis. Graphs are powerful; they allow us to see all the data, to see the trends, and to determine what sorts of numerical analyses are possible. However, graphs are not a substitute for numerical, or statistical, analysis. The two complement each other. So ask for the hypothesis statement and the alternate hypothesis. Ask for the p-value and the α-value (which should have been selected before the experiment was conducted).

I realize that this is unfamiliar territory for a lot of managers, who have little mathematical background and often no training in statistics. It’s not hard to ask the questions, though, and the answers are pretty straightforward to understand; I don’t think you need a deep understanding of the mathematics. Let the experts be the experts, but ask questions to make sure that you are covering the risk.