|I have been interested in ways to compare different assays. The recent guest essay on the Westgard web site on this topic stimulated the following comments.
The title of the essay is “The Quality Goal Index – Its Use in Benchmarking and Improving Sigma Quality Performance of Automated Analytic Tests.” The essay starts with:
“The long-term goal of six sigma quality management is to achieve an error rate of 3.4 or less per million opportunities for all laboratory processes. In percent terms, that’s an error rate of less than 0.001%.”
A few sentences later, the author is describing how to calculate “sigma performance” as a measure of quality performance and states that for the CV estimate:
“Data integrity can be assured only when procedures are in place and rigorously practiced to exclude erroneous quality control results due to procedural blunders and statistical outliers.”
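For reference, the “sigma performance” the essay computes is the familiar sigma metric, which combines the allowable total error (TEa), bias, and CV. A minimal sketch, with hypothetical glucose numbers (the values are illustrative, not from the essay):

```python
def sigma_metric(tea_pct, bias_pct, cv_pct):
    """Sigma performance from allowable total error, bias, and CV (all in %)."""
    return (tea_pct - abs(bias_pct)) / cv_pct

# Hypothetical example: TEa = 10%, bias = 2%, CV = 2%
print(sigma_metric(10.0, 2.0, 2.0))  # 4.0
```

Note that the metric depends directly on the CV estimate, which is why the essay's rules for excluding data before estimating the CV matter so much.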
This sentence causes me to view whatever follows with suspicion. The author excludes two types of errors: blunders and statistical outliers. From a clinician's point of view, the interest is in obtaining the correct answer (meaning “correct enough”). If an incorrect answer is produced by a blunder, it is nevertheless wrong. So right off the bat, one knows that whatever is being measured by the author is a subset of real quality performance, and one has no doubt heard that the majority of clinical laboratory errors come from pre- or post-analytical problems.
But perhaps the author intends to measure a subset of quality performance – that due to the analytical process. Then I have a problem with excluding statistical outliers. There is no justification for this exclusion. From a simple numbers view, assume that excluded data are greater than 3 standard deviations from the average. This means excluding about 0.3% of the data, if the data are normally distributed. Well, that’s one way to get to an error rate of less than 0.001%! One might argue that exclusion of a specific outlier is acceptable because the outlier must have been due to a blunder, but that is speculation; it is possible that the outlier occurred as part of the analytical process – one simply does not know.
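The 3-standard-deviation arithmetic can be checked directly; a short sketch:

```python
import math

# Two-sided fraction of a normal distribution beyond 3 standard deviations:
# 2 * (1 - Phi(3)) = erfc(3 / sqrt(2))
p_outside_3sd = math.erfc(3 / math.sqrt(2))
print(p_outside_3sd)  # ≈ 0.0027, i.e. about 0.27% of results

# The six sigma goal quoted in the essay: 3.4 errors per million opportunities
six_sigma_rate = 3.4e-6

# The excluded fraction dwarfs the error rate one claims to be measuring
print(p_outside_3sd / six_sigma_rate)
```

So a 3-SD exclusion rule discards roughly 0.27% of normally distributed data, hundreds of times larger than the 3.4-per-million rate the essay holds up as the quality goal.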
From another point of view, if one constructs a Clarke or Parkes type of error grid (e.g., similar to that used for glucose), then dangerous errors will be (by definition of this grid) large errors in certain regions and are likely to be outliers (in a statistical sense) and could easily be excluded. But these are the very errors that should be measured (see essay on FDA’s new waiver guidance).
Along these lines, note that serious assay errors are associated with patient harm, and these rare values are precisely the ones likely to be flagged and discarded as statistical outliers.
The same arguments apply to proficiency testing, which often has automated outlier rules that “clean” the data.
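The effect of such automated cleaning is easy to demonstrate. In the small simulation below (hypothetical numbers, with a simple 3-SD rule standing in for whatever rule a given program actually applies), the gross errors – the dangerous results – are exactly the points the rule removes, and the reported SD improves accordingly:

```python
import random
import statistics

# Simulated QC or proficiency data: well-behaved results plus a few
# gross errors (the rare, dangerous values). Purely illustrative.
random.seed(1)
results = [random.gauss(100.0, 2.0) for _ in range(1000)]
results += [120.0, 85.0, 130.0]  # hypothetical gross errors

mean = statistics.mean(results)
sd = statistics.stdev(results)

# Automated "cleaning" rule: drop anything beyond 3 SD from the mean
cleaned = [x for x in results if abs(x - mean) <= 3 * sd]

print(len(results) - len(cleaned), "results excluded")
print(round(statistics.stdev(results), 2), "SD with outliers")
print(round(statistics.stdev(cleaned), 2), "SD after cleaning")
```

The cleaned SD looks better, but the assay's performance has not changed; the worst results have simply been defined out of the analysis.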
This same trend to exclude outliers has been prevalent in discussions to publish a CLSI standard for GUM (Guide to the Expression of Uncertainty in Measurement).
But not excluding outliers messes up our analysis
Welcome to the real world. That’s true. This is why I recommend assessing quality control data in a way that does not require discarding outliers.
Even if outliers are included
Remember that even if outliers are included, the quality performance measured is still a subset of the analytical performance, because random biases, especially those due to patient interferences, will be missed; quality control samples do not contain the interfering substances found in individual patient specimens.
This is described in more detail in another essay.
In the list of essays, I used Outlier……………..s. I first saw this in: Beckman, R. J., and R. D. Cook (1983). Outlier……………..s. Technometrics, vol. 25, pp. 119–149.
Outliers in Quality Control and Proficiency Testing