“I so anoint myself”

June 29, 2006

“The following list presents 10 persons who have made a significant impact on the IVD industry.” This is how the magazine IVD Technology begins its article, which then gives a short description of each of the 10 people (1). Two of the people listed in the top 10 happen to be on the editorial advisory board of IVD Technology (2). Hmmm…

About half of the editorial advisory board are in regulatory affairs, and four of the top 10 are also in regulatory affairs (including the two above). In case you’re wondering, Leonard Skeggs, the inventor of the auto-analyzer, didn’t make the list! OK, to be fair, the text also says “Efforts were made to ensure that this list reflects contributions in both the regulatory and scientific areas,” but the title and first sentence are misleading.


  1. Top 10 Persons in the IVD Industry IVD Technology April 2005, see http://www.devicelink.com/ivdt/archive/05/04/002.html
  2. See, http://www.devicelink.com/ivdt/eab.html


“No reaction”

June 29, 2006

In November of 1998, I was invited to attend the chairholder’s council of NCCLS (now called CLSI). This is a meeting of the leaders of the committees that produce clinical laboratory standards. During the meeting, NCCLS launched a quality initiative, kicked off with a keynote speech presenting the rationale for the program by David Nevalainen, listed at the time as from the Abbott Quality Institute (1). He presented a quality system quite similar to ISO 9000. I commented at the presentation that in my experience, ISO 9000 (upon which the NCCLS quality system is based) has had virtually no impact on quality in industry. (I believe this is still true.) There was no reaction to my comment.

One year later, I was attending the November 1999 chairholder’s council. In the lobby of the hotel, I was reading the Wall Street Journal when I noticed that one of the top stories was about an FDA action over Abbott quality problems: Abbott was fined 100 million dollars and ordered to stop selling certain assays (2). When I tried to point out to NCCLS senior management the connection among the NCCLS quality system, Nevalainen, and Abbott, I got no reaction.


  1. See, for example: http://arpa.allenpress.com/arpaonline/?request=get-document&doi=10.1043%2F0003-9985(1999)123%3C0566:TQSA%3E2.0.CO%3B2
  2. Abbott to pay $100 million in fine to U.S. The Wall Street Journal, November 3, 1999.


“If it isn’t in ISO, it doesn’t exist”

June 29, 2006

There is a CLSI subcommittee that deals with risk management. One of the European participants had trouble with the word mitigation, as in the term “risk mitigation.” It was pointed out that the ISO standard on risk management, ISO 14971, does not contain the term risk mitigation, primarily because of translation difficulties, and that therefore the CLSI standard should not use the term either.

Now this translation problem baffles me, as ISO standards are in English. Moreover, a Google search for risk mitigation returns over 4 million hits.


Medical diagnostics industry participates in fake news

June 23, 2006

You may (or may not) be aware that some news stories aired by television news stations are provided by companies and the news station fails to disclose this. Hence, this is often referred to as “fake news”. The medical diagnostic industry participates in fake news. For a segment on allergy testing provided by Quest Diagnostics and aired by KABC-7 (Los Angeles), go here.

Why Bland Altman plots should use X, not (X+Y)/2 when X is a reference method

June 18, 2006

This essay has been published in Statistics in Medicine. (Jan S. Krouwer: Why Bland-Altman plots should use X, not (Y+X)/2 when X is a reference method. Statistics in Medicine, 2008;27:778-780). It is no longer available on this web site.

The Excel simulation file is still available below. 
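For readers without Excel, the simulation idea can be sketched in a few lines of Python (this is my own minimal sketch, not the published file; all numbers are invented): when X is a reference method, the differences Y − X are pure error, so they are uncorrelated with X but artificially correlated with (X+Y)/2.

```python
# Sketch of the Bland-Altman artifact when X is a reference method:
# Y = X + random error, so the differences (Y - X) are pure error.
# Plotting differences against (X + Y)/2 induces a spurious correlation,
# while plotting against X does not.
import random

random.seed(42)

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = sum((x - ma) ** 2 for x in a) ** 0.5
    sb = sum((y - mb) ** 2 for y in b) ** 0.5
    return cov / (sa * sb)

x = [random.uniform(50, 150) for _ in range(5000)]   # reference method values
y = [xi + random.gauss(0, 10) for xi in x]           # test method: X + error
diff = [yi - xi for xi, yi in zip(x, y)]
mean_xy = [(xi + yi) / 2 for xi, yi in zip(x, y)]

r_vs_x = pearson(diff, x)            # near 0: the error is independent of X
r_vs_mean = pearson(diff, mean_xy)   # positive: artifact correlation

print(f"corr(diff, X)       = {r_vs_x:.3f}")
print(f"corr(diff, (X+Y)/2) = {r_vs_mean:.3f}")
```

The artifact correlation grows as the assay error becomes large relative to the spread of the true values.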


More on GUM

June 17, 2006

I have critiqued the use of GUM (Guide to the expression of uncertainty in measurement) for commercial diagnostic assays (1) and also commented on a Letter about GUM (2).

To review why I don’t favor the use of GUM for commercial diagnostic assays:

  • GUM is an extremely complicated modeling method relative to the capabilities of most clinical laboratories.
    • This leads to “simplified” versions of GUM for clinical laboratories which are completely inadequate (3). While one can argue that these simplified methods aren’t GUM, they may nevertheless be claimed as such.
  • GUM requirements won’t be met in many cases. For example:
    • many assays don’t meet the definition of a well-defined physical quantity
    • one must correct known errors, which is impractical if not impossible for users of commercial diagnostic assays: users would have to know what the errors are and how to fix them, and many assays do have problems (although most results are within medically acceptable limits)
  • GUM typically estimates the 95% limits of the error distribution. While this is useful information, GUM provides no information about the remaining 5% of errors – note that the assumption that all data are, or have been transformed to, Normality is a big stretch.
    • This focus on 95% of the error distribution goes against the patient safety movement’s focus on the largest errors (e.g., the remaining 5%).
  • GUM is unnecessary, as one can simply count errors in various severity categories to get rates, without complicated modeling built on assumptions that may be wrong.
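To illustrate the counting alternative, here is a minimal sketch (the severity categories and all counts are invented for illustration):

```python
# Sketch: instead of modeling an error distribution, count observed errors
# in severity categories to obtain rates directly. Categories and counts
# below are hypothetical.
from collections import Counter

# Each entry is the severity category assigned to one erroneous result.
observed_errors = (
    ["negligible"] * 180 + ["minor"] * 15 + ["serious"] * 4 + ["critical"] * 1
)
total_results = 10_000  # total results reported in the same period

counts = Counter(observed_errors)
rates = {severity: n / total_results for severity, n in counts.items()}

for severity in ("negligible", "minor", "serious", "critical"):
    print(f"{severity:10s} {counts[severity]:4d}  rate = {rates[severity]:.4%}")
```

No distributional assumption is needed; the largest (rarest) errors are counted on the same footing as the common small ones.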

Having said all this, I am still on board with the use of GUM for reference materials.

This essay is about another GUM article for which I published a Letter (4), which prompted a reply from the authors (5). Their article was about use of GUM for serological assays (6). What follows was sent as an eLetter to Clinical Chemistry.

I appreciate the response by Dr. Dimech and understand that analyzing real data is never easy. Of course, I was unaware of Dr. Dimech’s response – I can only react to the words in the paper, not material that was omitted for whatever reason – thus my Letter.

Here is my response to Dr. Dimech’s reply to my Letter combined with his original paper.

Right after the statement to exclude outliers comes the advice: “It is suggested that results reported by each laboratory are checked for normality by use of a bar graph (See Fig. 1 in the online Data Supplement) or a statistical method such as Grubbs test.”

Normality is usually tested graphically with histograms and/or normal probability plots, not bar graphs. Grubbs’ test is not a test for normality – it is a test for outliers, and it requires normal data! Statistical tests for normality include the Shapiro-Wilk, Kolmogorov-Smirnov, and Anderson-Darling tests.
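To show what a formal normality check looks like, here is a minimal sketch of a Jarque-Bera-style statistic built from sample skewness and kurtosis (in practice one would use Shapiro-Wilk or Anderson-Darling from a statistics package; the data here are simulated):

```python
# Sketch of a Jarque-Bera-style normality check. Under normality the
# statistic is approximately chi-square with 2 df, so values well above
# the 5% critical value (~5.99) suggest non-normal data.
import random

def jarque_bera(data):
    """Jarque-Bera statistic from sample skewness and kurtosis."""
    n = len(data)
    m = sum(data) / n
    s2 = sum((x - m) ** 2 for x in data) / n
    skew = (sum((x - m) ** 3 for x in data) / n) / s2 ** 1.5
    kurt = (sum((x - m) ** 4 for x in data) / n) / s2 ** 2
    return n / 6 * (skew ** 2 + (kurt - 3) ** 2 / 4)

random.seed(1)
normal_data = [random.gauss(100, 10) for _ in range(2000)]   # plausibly normal
skewed_data = [random.expovariate(1.0) for _ in range(2000)] # clearly skewed

jb_normal = jarque_bera(normal_data)
jb_skewed = jarque_bera(skewed_data)
print(f"JB(normal) = {jb_normal:.2f}, JB(skewed) = {jb_skewed:.2f}")
```

Grubbs’ test applied to the skewed data would flag “outliers” that are simply the tail of a non-normal distribution, which is exactly why it cannot serve as a normality check.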

Perhaps more importantly, consider the authors’ first sentence in the paper: “Most regulatory authorities that use International Organization for Standardization (ISO) Standards to assess laboratory competence require an estimate of the uncertainty of measurement (MU) of assay test results.”

At best this sentence is ambiguous. Perhaps the authors mean that one of the components of laboratory competence is an uncertainty interval, but one could also interpret this sentence to equate an uncertainty interval with laboratory competence, even though to a clinician, laboratory competence would suggest an acceptable rate of errors from all sources.

In the case of laboratory data, the distribution of errors can be of any shape and can contain large errors, which may or may not be detached from the rest of the error distribution. To a clinician, wrong answers are dangerous, regardless of their source. So blunders such as typographical errors are part of the population of interest to a clinician. Now for certain purposes, one can define a subset of the population of errors that contains only analytical error sources and excludes pre- and post-analytical error sources. However, this subset can be quickly confused with the total population, and the first sentence in this paper will add to this confusion.


  1. Krouwer JS. Critique of the Guide to the Expression of Uncertainty in Measurement method of estimating and reporting uncertainty in diagnostic assays. Clin Chem 2003;49:1818-1821.
  2. Stöckl D, Van Uytfanghe K, Rodríguez Cabaleiro D, Thienpont LM, Patriarca M, Castelli M, Corsetti F, Menditto A. Calculation of measurement uncertainty in clinical chemistry. Clin Chem 2005;51:276-277.
  3. White GH, Farrance I. Uncertainty of measurement in quantitative medical testing: a laboratory implementation guide. Clin Biochem Rev 2004;25:Supplement ii,S1-S24. Available at http://www.aacb.asn.au/pubs/Uncertainty%20of%20measurement.pdf
  4. Krouwer JS. Uncertainty intervals based on deleting data are not useful. Clin Chem 2006;52:1204-1205.
  5. Dimech W. Uncertainty intervals based on deleting data are not useful: Reply. Clin Chem 2006;52:1205.
  6. Dimech W, Francis B, Kox J, Roberts G. Calculating uncertainty of measurement for serology assays by use of precision and bias. Clin Chem 2006;52:526-529.

Detection Systems – Fault Isolation, Automation, and Diagnostic Accuracy – 6/2006

June 12, 2006

First, a quick review

A clinical laboratory’s product is the report provided to clinicians; its main element is the assay result. The result needs to be as error free as possible to prevent harm to patients. Assay performance goals can be expressed in terms of error grids, such as are available for glucose. It is helpful to conceptualize clinical laboratory errors in terms of a fault tree or FMEA. The top-level error one wants to prevent is providing an incorrect result to a clinician.

Another possible top-level error is delay in the reporting of a result – to keep things simple, that is not considered here, but it could also lead to patient harm.

This top-level error is the “effect” of many possible lower-level errors (i.e., causes). In order to prevent the top-level error, the clinical laboratory’s quality program tries to address lower-level errors either by

  • preventing errors or
  • detecting and recovering from errors.

Note that detection without recovery is not useful and that these are two (separate) steps.

The use of quality control

Quality control is a means of detecting errors. The recovery part of quality control is simple – after a failed quality control result is observed, no patient results since the last successful quality control are reported. This raises an immediate concern about the CMS proposal to allow quality control to be run once a month, as this makes recovery rather useless – all of these potentially incorrect patient results will already have been reported to clinicians. To summarize, quality control detects lower-level errors and prevents the effect of these errors. In this way, it blocks the error cascade expressed by a fault tree or FMEA.
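The recovery rule can be sketched as follows (a hypothetical illustration – the event names and data are invented): the results produced between the last successful QC and a failed QC are the ones that must be withheld or recalled.

```python
# Sketch of the QC recovery rule: after a failed QC event, every patient
# result produced since the last successful QC is suspect. Event names
# and data are made up for illustration.
def results_to_withhold(events):
    """events: time-ordered list of "qc_pass", "qc_fail", or a result id."""
    pending, withheld = [], []
    for event in events:
        if event == "qc_pass":
            pending.clear()           # results since last QC are now verified
        elif event == "qc_fail":
            withheld.extend(pending)  # everything since last good QC is suspect
            pending.clear()
        else:
            pending.append(event)     # a patient result awaiting QC verification
    return withheld

events = ["qc_pass", "r1", "r2", "qc_pass", "r3", "r4", "r5", "qc_fail", "r6"]
print(results_to_withhold(events))  # r3, r4, r5 sit between good QC and failed QC
```

The CMS concern is visible in this sketch: the longer the interval between QC events, the larger the suspect set grows, and in practice those results have already gone out the door.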

There is another task that clinical laboratories must do after a failed quality control, and that is to determine why the quality control failed, so as to correct the problem. This is where fault isolation plays a role.

Fault Isolation – Why it’s important

Fault isolation, when it is present, means that the detection system points to a single root cause for the failure. To see why this is important, consider the following case, where incorrect results are generated by an assay system because of reagent degradation caused by the reagent being stored above its maximum allowable storage temperature. To prevent this error, one would use training and perhaps redundant refrigeration systems. In addition, consider two different detection systems to deal with this failure.

Fault isolation absent

Quality Control – The bad reagent can lead to a failed QC. Since failed QC can be caused by many factors, there is no fault isolation. So one must follow a troubleshooting protocol to determine the root cause of failed QC. This troubleshooting ensures that the next set of results will not fail QC – at least not for that root cause!

Fault isolation present

Temperature Sensor on Reagent – A sensor on the reagent box that indicates storage at too high a temperature by a color change does have fault isolation. Of course, this relies on another detection step, where someone must look at the temperature sensor.


Ideally, one would like all detection systems to have fault isolation, since no troubleshooting is required, which returns the system more quickly to an error-free state. But to design in detection systems with fault isolation for all errors, one must have complete knowledge of all the ways a system can fail.

For the reasons this knowledge is often lacking, see the AACC expert session.

The value of quality control is that in many cases it detects errors, even though no one (the clinical laboratory or the manufacturer) has knowledge that such an error may occur. The disadvantage of quality control is that there is no fault isolation and a corrective action could involve a substantial amount of work. When this corrective action occurs before product release, it is simply part of product development, but when it occurs after product release in a clinical laboratory, it is also product development but conducted in part by the clinical laboratory.

Automated detection recovery systems

Automated detection recovery systems are desirable and are prevalent on instrument systems. As an example, a sample’s response curve is evaluated by an algorithm. The algorithm can detect whether the response is too noisy and, if so, signal the analyzer to suppress reporting that result (i.e., the recovery). Note that both the temperature sensor detection system above and quality control are manual detection recovery systems.
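As a sketch of such an algorithm (the threshold, timing points, and responses are all invented for illustration), one can fit a line to the response curve and suppress the result when the residual noise exceeds a limit:

```python
# Sketch of an automated detection-recovery step: fit a least-squares line
# to a sample's response curve and suppress the result if the residual
# noise exceeds a threshold. All numbers are hypothetical.
def residual_rms(times, responses):
    """Root-mean-square residual around a least-squares line fit."""
    n = len(times)
    mt, mr = sum(times) / n, sum(responses) / n
    slope = (sum((t - mt) * (r - mr) for t, r in zip(times, responses))
             / sum((t - mt) ** 2 for t in times))
    intercept = mr - slope * mt
    resid = [r - (intercept + slope * t) for t, r in zip(times, responses)]
    return (sum(e * e for e in resid) / n) ** 0.5

NOISE_LIMIT = 0.5  # assumed acceptance threshold, in response units

def evaluate(times, responses):
    return "report result" if residual_rms(times, responses) <= NOISE_LIMIT \
        else "suppress result"

times = [0, 1, 2, 3, 4, 5]
clean = [0.1, 1.0, 2.1, 3.0, 4.1, 5.0]  # nearly linear response curve
noisy = [0.1, 2.5, 0.9, 4.8, 2.0, 6.5]  # erratic response curve
print(evaluate(times, clean))   # clean curve passes
print(evaluate(times, noisy))   # noisy curve is suppressed
```

Note that the sketch shows the fault isolation problem directly: the noise statistic says nothing about whether a lipemic specimen or a dirty reaction chamber caused it.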

There is no guarantee that an automated detection recovery system has fault isolation. In the noisy response example, there is no indication of what is causing the noise. For example, it could be a lipemic specimen or alternatively a dirty reaction chamber.

Diagnostic accuracy

The final dimension in this essay is the diagnostic accuracy of the detection system. This was also covered in the AACC expert session and relates to the number of false positives and false negatives that occur with the detection process.
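These quantities follow directly from counts, as in this sketch (all counts are invented for illustration):

```python
# Sketch: diagnostic accuracy of a detection system, computed from counts
# of true/false positives and negatives. The counts below are hypothetical.
def diagnostic_accuracy(tp, fp, fn, tn):
    return {
        "sensitivity": tp / (tp + fn),          # flagged when an error exists
        "specificity": tn / (tn + fp),          # not flagged when no error
        "false_positive_rate": fp / (fp + tn),  # needless troubleshooting
        "false_negative_rate": fn / (fn + tp),  # errors that slip through
    }

# e.g., a detection rule that flags 45 of 50 true error conditions but
# also flags 20 of 950 error-free runs:
acc = diagnostic_accuracy(tp=45, fp=20, fn=5, tn=930)
for name, value in acc.items():
    print(f"{name:20s} {value:.3f}")
```

False positives cost troubleshooting time for nothing; false negatives are the incorrect results that reach clinicians.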

Final Summary

With sufficient knowledge, one would either design a system without errors or employ detection systems for all possible failures. However, one does not have this knowledge. Good detection systems have high diagnostic accuracy, are automated, and have fault isolation. The value of quality control is that in spite of not having fault isolation or being automated, it can catch errors that are missed by other detection systems.