Let’s just dump the lab data on the Internet – not that easy

July 17, 2004

When I was at Chiron Diagnostics, a marketing manager gave a talk about his informatics strategy. He noted the difficulty that labs had in accessing data from a variety of sources, especially “legacy” databases, and offered as part of his strategy: “let’s just dump all of this data on the internet”.

I attended a talk recently about reducing errors in POCT (point of care testing). The speaker was advocating using web browsers to access lab data on the internet and offered as an advantage that “non-programmers can help design the software”.

These statements imply that setting up a program that uses the internet to access lab data is easy. Since almost anyone can design a web page, can anyone design a web-based system to access lab data? The answer is no, and here is why.

There is no “dumping the data on the internet”. Whether it will be reached through the internet or over a network, the data still resides in a database (often SQL Server or Oracle), and one must be knowledgeable about these databases: the table structures, the SQL query language, database security, and so on. So one must start with this knowledge, but using the internet makes things more complicated, not easier.
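As a small illustration of the point, even a trivial data pull requires knowledge of the schema and of SQL. The table and column names below are invented, and SQLite stands in for SQL Server or Oracle:

```python
import sqlite3  # stands in for a SQL Server or Oracle connection library

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (patient_id TEXT, analyte TEXT, value REAL)")
conn.execute("INSERT INTO results VALUES ('P001', 'glucose', 95.0)")

# Even this "simple" query assumes one knows the table structure,
# the SQL language, and the parameter-binding syntax:
rows = conn.execute(
    "SELECT patient_id, value FROM results WHERE analyte = ?", ("glucose",)
).fetchall()
print(rows)  # [('P001', 95.0)]
```

And this is before any of the internet-related complications discussed next.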

For example, the internet uses a stateless protocol. To see what a stateless protocol is, consider the task of going to a web site, for example http://www.aacc.org. Depending on your connection speed, after a short time your screen fills with the AACC start page, and it is common to say “I’m at the AACC site and reading the content.” Although everyone says this, what has really happened is that when you clicked on a hyperlink such as the AACC link, a request was sent from your PC to the AACC server to download the start page to your PC. When the download is complete, you are no longer connected to the AACC web site – you are reading the content from your own PC. In fact, if the AACC server failed and went offline after the download was complete, you wouldn’t be aware of it unless you pressed reload.

This makes retrieving data from a database through the internet more complicated than over a network. Of course, it’s done all the time, by using middleware such as ASP.NET, a web application framework in which the data-access code is programmed. Unlike designing a web page, this kind of programming is not for everyone. Finally, one must also understand the web server on which this service runs, especially its security and authentication methods.

So non-programmers really can’t help design the software.

How to specify and estimate outlier rates

July 17, 2004

Outliers are often distinguished from other error sources because the root cause of the outlier may differ from other error sources, or because some authors recommend a different disposition of outliers once they are detected (often as in “don’t worry about that result – it’s an outlier”). Unfortunately, some of these practices have led to the neglect of outliers. Outliers are errors just like all other errors; just larger. Moreover, outliers are often the source of medical errors, since a large assay result error can lead to an incorrect medical treatment (1).

Setting outlier goals

An outlier goal is met if the number of observations in region A in Figure 1 is below a specified rate. A total error goal is met if the percentage of observations in region B is at or greater than a specified percentage (often 95% or 99%) – see for example the NCCLS standard EP21-A (2). The space between regions A and B is specified to contain a percentage of observations equal to the difference between the two goals.

Figure 1. Outlier and Total Error Limits



[Figure not reproduced; it showed the total error limits inside the wider outlier limits.]


Estimating outlier rates

The difficulty in estimating outlier rates is that one is trying to prove that an unlikely event does not happen. There are two possible ways to do this, and each has its advantages and disadvantages. Moreover, outliers often come from a different distribution than most of the other results. This makes it impossible to estimate outlier rates by simply assuming that all results come from a normal distribution.

Method     Advantage                 Disadvantage
Modeling   Requires fewer samples    Modeling is difficult (and time consuming); if the model is wrong, the estimated outlier rate will be wrong
Counting   No modeling is required   Requires a huge number of samples



There are several types of modeling methods. One is to create a cause and effect or fishbone diagram of an assay and simulate assay results by selecting random observations from each assumed or observed distribution of assay variables to create an “assay result” and subtracting an assumed reference value from this result to obtain an assay error. The distribution of these differences allows one to estimate outlier rates.
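A rough sketch of this simulation approach follows; the error components, their standard deviations, and the outlier limit are all invented for illustration:

```python
import random

random.seed(1)

N = 100_000
REFERENCE = 100.0        # hypothetical true value
OUTLIER_LIMIT = 15.0     # hypothetical outlier limit, same units as the assay

outliers = 0
for _ in range(N):
    # Error components drawn from assumed (invented) distributions,
    # one per branch of the fishbone diagram:
    calibration = random.gauss(0, 2.0)
    pipetting = random.gauss(0, 3.0)
    detector = random.gauss(0, 2.5)
    result = REFERENCE + calibration + pipetting + detector
    if abs(result - REFERENCE) > OUTLIER_LIMIT:
        outliers += 1

print(f"estimated outlier rate: {outliers / N:.6f}")  # roughly 6e-4 for these SDs
```

The quality of the estimate depends entirely on how well the assumed distributions match the real assay, which is the disadvantage noted in the table above.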

The “GUM method” (guide to the expression of uncertainty in measurement) also starts with a cause and effect or fishbone diagram of an assay. In the GUM method, a mathematical model is used to link all random and systematic error sources. All systematic errors are either corrected by adjustment or, when the error is unexplained, converted into random errors. All (resulting) random errors are combined using the mathematical model, following the rules of the propagation of error, to yield a standard deviation which expresses the combined uncertainty of all error sources. A multiple of this standard deviation (the coverage factor) provides a range for the differences between an assay and its reference for a percentage of the population of results. By selecting a suitable multiplier, one may estimate the magnitude of this range of differences (e.g., the outlier limits) for the desired percentage of the population (e.g., the outlier rate) that corresponds to the outlier goal. A concern with the GUM method is that it requires modeling all known errors. If an error is unknown, it won’t be modeled and the GUM standard deviation will be underestimated (3).
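The combination and coverage steps can be sketched in a few lines; the component uncertainties and the coverage factor below are invented for illustration:

```python
from math import sqrt

# Standard deviations of independent random error sources (invented
# values), all expressed in the same units as the assay result:
components = [2.0, 3.0, 2.5]

# Root-sum-of-squares combination (propagation of error for a simple
# additive model):
combined_u = sqrt(sum(u ** 2 for u in components))

# A coverage factor k expands the combined uncertainty into an interval
# covering ~95% of results for a normal distribution:
k = 2
expanded_u = k * combined_u

print(round(combined_u, 3), round(expanded_u, 3))  # 4.387 8.775
```

Note that any unknown error source simply never enters the `components` list, which is exactly how the underestimation described above occurs.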


An FMEA (Failure Mode Effects Analysis) seeks to identify all possible failure modes; for the modes ranked as most important, mitigations are implemented to reduce risk. Thus, at the end of an FMEA one has the potential to quantify outlier rates, although in clinical chemistry practice the final outlier risk is rarely quantified. FMEA is important because risk is assessed for non-continuous variables, such as the risk of reporting an assay value for the wrong patient.


In the counting method, outliers are considered as discrete events. That is, each assay result is judged independently from every other result to be either an outlier or not, based on the magnitude of the difference between the result and reference. Of course, the choice of reference method is important. If the reference method is not a true reference method but a comparison method (another field method), then there is no way to know that a large difference that is being called an outlier is due to the new method or existing method.

The rate of outliers is simply the number of outliers found divided by the total number of samples assayed, converted to a percent.

Outlier rate = (x/n) * 100

where   x = the number of outliers found

n = the total number of samples assayed

This rate is not exact because it is based on a sample. Hahn and Meeker present a method to account for this uncertainty (4). The table shows, for various numbers of total observations and outliers found, the maximum percent outlier rate at a stated level of confidence. This gives one an idea of the sample sizes required to demonstrate a maximum outlier rate.

Sample Size  Outliers Found  Max % Outlier Rate (95% Conf.)  Max % Outlier Rate (99% Conf.)  ppm (95%)  ppm (99%)
10           0               25.9                            36.9                            259,000    369,000
100          0               3.0                             4.5                             30,000     45,000
1,000        0               0.3                             0.5                             3,000      5,000
1,000        1               0.5                             0.7                             5,000      7,000
10,000       0               0.03                            0.05                            300        500
10,000       1               0.05                            0.07                            500        700
10,000       10              0.2                             0.2                             2,000      2,000
The following entry corresponds to a “six sigma” process:
881,000      0               0.00034                         0.00052                         3.4        5.2
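The table entries can be reproduced from the binomial distribution; a sketch using a bisection search for the Hahn and Meeker style upper bound:

```python
import math

def max_outlier_rate(n, x, conf=0.95):
    """Upper confidence bound on the true outlier rate after observing
    x outliers in n samples (Clopper-Pearson style upper limit)."""
    alpha = 1 - conf

    def binom_cdf(p):  # P(X <= x) when X ~ Binomial(n, p)
        return sum(math.comb(n, k) * p ** k * (1 - p) ** (n - k)
                   for k in range(x + 1))

    lo, hi = 0.0, 1.0
    for _ in range(100):  # bisection: binom_cdf decreases as p increases
        mid = (lo + hi) / 2
        if binom_cdf(mid) > alpha:
            lo = mid
        else:
            hi = mid
    return lo

# 1,000 samples, no outliers found, 95% confidence:
print(round(100 * max_outlier_rate(1000, 0), 2))  # 0.3 (percent)
```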


Understanding the table entries

Using the third row as an example, 1,000 samples have been run and no outliers were found. The estimated outlier rate is zero. However, this is only a sample and is subject to sampling variation. Using properties of the binomial distribution, one can state with 95% confidence that the true outlier rate could be no more than 0.3%. This is equivalent to saying that in 1,000,000 samples there could be no more than 3,000 outliers.

“Six sigma” and outliers

The popular six sigma paradigm assumes that for a process whose mean has shifted by 1.5 standard deviations and whose limits lie 6 standard deviations from the (unshifted) mean, the number of defects will be 3.4 per million. Defects per million for 1 to 6 sigma are shown in the following table.

Sigma (SL)  NORMSDIST(SL)  NORMSDIST(SL+1.5)  NORMSDIST(1.5-SL)  Prob. Good  Prob. Defect  Defects per Million
1           0.841345       0.993790           0.691462           0.302328    0.697672      697,672.1
2           0.977250       0.999767           0.308538           0.691230    0.308770      308,770.2
3           0.998650       0.999997           0.066807           0.933189    0.066811      66,810.6
4           0.999968       1.000000           0.006210           0.993790    0.006210      6,209.7
5           1.000000       1.000000           0.000233           0.999767    0.000233      232.7
6           1.000000       1.000000           0.000003           0.999997    0.000003      3.4
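The table can be reproduced with a few lines of Python, with the 1.5 sigma shift built in as the six sigma convention assumes:

```python
from math import erf, sqrt

def norm_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def defects_per_million(sigma, shift=1.5):
    # Probability of a result inside the +/- sigma limits when the
    # process mean has drifted by `shift` standard deviations:
    p_good = norm_cdf(sigma - shift) - norm_cdf(-sigma - shift)
    return (1 - p_good) * 1e6

for s in range(1, 7):
    print(s, round(defects_per_million(s), 1))  # matches the table rows
```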


These results assume a normal distribution. In a diagnostic assay, it would be difficult if not impossible to prove that all results are normally distributed. Note that the bottom entry of the first table corresponds to a six sigma process with its 3.4 defects per million.


Laboratories are not going to run 10,000 samples (nor should they) to prove that there are no outliers. Unfortunately, there are proposals to get laboratories to perform a limited type of GUM modeling, which is totally inadequate and would prove nothing (3). Manufacturers could (and do) run large numbers of samples during assay development but don’t want to include estimation of outlier rates in their product labeling.

Thus, outliers remain an ignored topic and only surface when they cause problems. One possible remedy would be a uniform way for manufacturers to report outlier studies as part of their product labeling.


  1. Cole LA, Rinne KM, Shahabi S, Omrani A. False-positive hCG assay results leading to unnecessary surgery and chemotherapy and needless occurrences of diabetes and coma. Clin Chem 1999;45:313-314.
  2. National Committee for Clinical Laboratory Standards. Estimation of total analytical error for clinical laboratory methods; approved guideline. NCCLS document EP21-A. Villanova, PA: NCCLS, 2003.
  3. Krouwer JS. Critique of the Guide to the Expression of Uncertainty in Measurement method of estimating and reporting uncertainty in diagnostic assays. Clin Chem 2003;49:1818-1821.
  4. Hahn GJ, Meeker WQ. Statistical intervals: a guide for practitioners. New York: Wiley, 1991, p. 104.

Six Sigma: An example, or a more serious technical issue 7/2004

July 14, 2004

In the first part of this series, some technical comments about six sigma were made – however, these comments are largely curiosities, as six sigma is really a collection of quality management tools.

Here the attempt is made to work through an example for a diagnostic assay.

The assay selected is a home glucose assay and the first step is to set a goal. Although there are a lot of ways to do this, there is an ISO document (15197) which contains goals. According to this standard, the “minimum acceptable accuracy” goal is:

“Ninety-five percent (95%) of the individual glucose results shall fall within ± 0,83 mmol/L (15 mg/dL) of the results of the manufacturer’s measurement procedure at glucose concentrations < 4,2 mmol/L (75 mg/dL) and within ±20 % at glucose concentrations >= 4,2 mmol/L (75 mg/dL).”

The ISO document says that “the minimum acceptable accuracy criteria are based on the medical requirements for glucose monitoring”.

First a word on terminology. In the ISO world, accuracy does not mean freedom from bias, rather it means freedom from all sources of error and is comparable to total error.

Now here is the problem. If one looks at the range of 75 mg/dL or less, the goal states that one should not get values outside of ± 15 mg/dL, based on medical requirements. The problem is that this is stated for only 95% of all values. So for an assay that just meets the requirement, just under 5% of values could exceed the goal, which means the number of defects one will see is 50,000 per million assays! In six sigma terms, this is close to a 3 sigma process (3.14487 sigma). This means that this ISO document says a 3 sigma process is acceptable.


Sigma  Defects per Million
1      697,672.1
2      308,770.2
3      66,810.6
4      6,209.7
5      232.6
6      3.4
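The sigma level implied by an observed defect rate can be found by simple bisection; a sketch, assuming the conventional 1.5 sigma shift:

```python
from math import erf, sqrt

def norm_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def sigma_level(defect_rate, shift=1.5):
    """Sigma level whose defect probability (with a 1.5 sigma mean
    shift, both tails counted) equals defect_rate."""
    lo, hi = 0.0, 10.0
    for _ in range(100):  # bisection: defect probability falls as sigma rises
        mid = (lo + hi) / 2
        p_defect = 1 - (norm_cdf(mid - shift) - norm_cdf(-mid - shift))
        if p_defect > defect_rate:
            lo = mid
        else:
            hi = mid
    return lo

# Just under 5% defects, as for the ISO 15197 glucose goal:
print(round(sigma_level(0.05), 5))  # ~3.14487, close to a 3 sigma process
```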


With 3 sigma goals, one should not expect six sigma processes.

Some technical comments about Six Sigma in the lab 7/2004

July 13, 2004

To recall, six sigma implies a process with 3.4 defects per million. Assume that one has a set of measurements that are normally distributed, for example lab glucose differences from reference. Many people remember that ± 2 standard deviations includes 95% of the values. The idea behind six sigma is that if one sets goals of ± 6 standard deviations, there will be very few defects (where defects are defined as values outside the ± 6 standard deviation limits). Of course, these limits must be clinically meaningful; one can’t just multiply the observed standard deviation by six to get the limits. So one starts with the limits, sees where the measured standard deviation falls, and calculates how many “sigma” the process exhibits.

How do you get 3.4 defects – When I go through the math (1) to get the number of defects per million for a six sigma process, I get 0.00197, not 3.4:

Defects per million = 1,000,000 * probability of a defect

where probability of a defect = (1 – probability good)

where probability good = F(6 sigma) – F(-6 sigma)

where F() = the cumulative distribution function of the standard normal distribution

What’s wrong? Six sigma assumes a 1.5 sigma bias, so if one repeats the calculation with probability good = F(1.5 + 6 sigma) – F(1.5 – 6 sigma), one gets 3.4.
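Both numbers can be checked directly from the normal distribution; a minimal sketch:

```python
from math import erf, sqrt

def norm_cdf(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1 + erf(z / sqrt(2)))

# No bias: defects fall outside +/- 6 sigma
no_bias = 1e6 * (1 - (norm_cdf(6) - norm_cdf(-6)))
# With the assumed 1.5 sigma bias, the limits sit at +4.5 and -7.5 sigma
with_bias = 1e6 * (1 - (norm_cdf(6 - 1.5) - norm_cdf(-6 - 1.5)))

print(round(no_bias, 5))    # 0.00197
print(round(with_bias, 1))  # 3.4
```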

Six sigma is defined for a normally distributed process – One might make the case that lab glucose differences from reference are normally distributed, but then again they might not be, especially since when things go wrong there are outliers, which tend to make the distribution non-normal. If the data are non-normal, the six sigma numbers no longer apply.

But what does six sigma mean if one is talking about attribute data, such as patient ID problems, rather than continuous variables, such as glucose differences? A patient ID mix-up is clearly a defect, but it does not come from a normally distributed process. Each attempt to match a patient ID and sample is either correct or incorrect. While one can count defects per million, this count has nothing to do with a “sigma” (i.e., a standard deviation).

A six sigma process has bias – Hard to believe, but a six sigma process is defined to have a bias of 1.5 sigma. So what’s the big deal with this bias, if inclusion of a 1.5 sigma bias for a six sigma process still results in only 3.4 defects per million? The problem has been explained by George Klee, who shows that even a small bias can lead to increased patient misclassifications (2), which raises the risk of a wrong patient treatment. As an aside, a six sigma process with its 1.5 sigma bias would not be suitable for the ISO standard GUM (the guide to the expression of uncertainty in measurement), since GUM requires all biases to be eliminated. So where did this bias come from? The originators of the six sigma concept were from Motorola, where it was observed that the average process tended to have a 1.5 sigma bias (drift). Hence this bias was included to be representative of real processes.

Does six sigma help labs, hospitals, or diagnostic companies – The above comments have nothing to do with the desirability or usefulness of six sigma. They are technical comments and answers to questions that I had. Of course, no one will build in a bias to try to achieve six sigma! Six sigma is really a collection of quality improvement tools (3), and as such the name is somewhat misleading. A better name might be Total Quality Management – II (or a higher number). For example, FMEA (Failure Mode Effects Analysis) is part of six sigma and may involve an entire program without the measurement of a single standard deviation. Most “black belts” are managers, not statisticians or people who have made a career of data analysis and quality. The success of six sigma in organizations is a function of management commitment and training.


  1. Lucas JL. The essential six sigma. Quality Progress 2002;35:27-31.
  2. Klee GG. Analytic performance goals based on direct effect of analytic bias on medical classification decisions. CDC 1995 Institute: Frontiers in Laboratory Practice Research. pp 219-226. Available online at: www.phppo.cdc.gov/dls/pdf/institute/klee.pdf
  3. Hahn GJ, Hill WJ, Hoerl RW, Zinkgraf SA. The impact of six sigma improvement – a glimpse into the future of statistics. The American Statistician 1999;53:208-215.