Commutability and revival of a 39-year-old model

March 13, 2018

Commutability is a hot topic these days and it should be. One would like to think that someone tested on one system will get the same result if they are tested on another system.

In reading the second paper (1) in a three-part series of articles, I note that a term for interferences is present (in addition to average bias and imprecision) to estimate error. This was suggested almost forty years ago (see reference 2).

Although reference 2 was not cited in the Clinical Chemistry paper, at least a model accounting for interferences is being used.



  1. Clinical Chemistry 64:3 455–464 (2018)
  2. Lawton WH, Sylvester EA, Young-Ferraro BJ. Statistical comparison of multiple analytic procedures: application to clinical chemistry. Technometrics. 1979;21:397-409.

Performance specifications, lawsuits, and irrelevant statistics

March 11, 2018

Readers of this blog know that I’m in favor of specifications that account for 100% of the results. The danger of specifications that cover only 95% or 99% of the results is that errors causing serious patient harm can occur for assays that meet specifications! Large and harmful errors are rare and certainly occur in fewer than 1% of results. But hospitals might not want specifications that account for 100% of results (and remember that hospital clinical chemists populate standards committees). A potential reason is that if a large error occurs, the 95% or 99% specification can be an advantage for a hospital if there is a lawsuit.

I’m thinking of an example where I was an expert witness. Of course, I can’t go into the details, but this was a case where there was a large error, the patient was harmed, and the hospital lab was clearly at fault. (In this case it was a user error.) The hospital lab’s defense was that they followed all procedures and met all standards, i.e., sorry but stuff happens.

As for irrelevant statistics, I’ve heard two well-known people in the area of diabetes (Dr. David B. Sacks and Dr. Andreas Pfützner) say in public meetings that one should not specify glucose meter performance for 100% of the results because one can never prove that the number of large errors is zero.

It is true that one can never prove that the number of large errors is zero, but this does not mean one should abandon a specification for 100% of the results.
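
Even though zero cannot be proven, the data still put an upper bound on how often large errors could be occurring. Here is a minimal sketch of that calculation, assuming results are independent and that no large errors were seen in n results. The function name and the sample sizes are mine, chosen for illustration only.

```python
def upper_bound_error_rate(n_results: int, confidence: float = 0.95) -> float:
    """One-sided upper bound on the large-error rate when zero large
    errors were observed in n_results independent results.

    Solves (1 - p) ** n = 1 - confidence for p.
    """
    if n_results <= 0:
        raise ValueError("n_results must be positive")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be between 0 and 1")
    return 1.0 - (1.0 - confidence) ** (1.0 / n_results)


# Hypothetical evaluation sizes: the bound shrinks with more data
# but never reaches zero.
for n in (100, 10_000, 1_000_000):
    print(f"n = {n:>9,}: large-error rate could be up to "
          f"{upper_bound_error_rate(n):.6%}")
```

The bound gets smaller as more results are collected but never reaches zero, which is the statistical point being made. It says nothing about whether a 100% specification is worth having.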

Here, I’m reminded of blood gas. For blood gas, obtaining a result is critical. Hospital labs realize that blood gas instruments can break down and fail to produce a result. Since this is unacceptable, one can calculate the failure rate and reduce the risk of no result with redundancy (meaning using multiple instruments). No matter how many instruments are used, the possibility that all instruments will fail at the same time is not zero!
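
The redundancy calculation can be sketched as follows, assuming instruments fail independently and each has the same probability of being down when a sample arrives. The 2% figure is a made-up number for illustration, not a measured failure rate.

```python
def prob_no_result(p_fail: float, n_instruments: int) -> float:
    """Probability that every instrument is down at the same time,
    assuming independent failures with identical probability p_fail."""
    if not 0.0 <= p_fail <= 1.0:
        raise ValueError("p_fail must be between 0 and 1")
    if n_instruments < 1:
        raise ValueError("need at least one instrument")
    return p_fail ** n_instruments


p_fail = 0.02  # hypothetical: an instrument is unavailable 2% of the time
for n in range(1, 5):
    print(f"{n} instrument(s): P(no result) = {prob_no_result(p_fail, n):.2e}")
```

Each added instrument multiplies the risk by p_fail, so the risk drops quickly but stays above zero. In practice failures may not be independent (a power outage affects all instruments), which would make the real risk higher than this sketch suggests.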

A final problem with not specifying 100% of the results is that it may cause labs to put less thought into procedures that minimize the risk of large errors.

And in industry (at least at Ciba-Corning) we always had specifications for 100% of the results, as did the original version of the CLSI total error document, EP21-A (this was dropped in the A2 version).

Assumptions – often a missing piece in data analysis for lab medicine

February 24, 2018

A few blog entries ago, I described a case where calculating the SD did not provide an estimate of random error because the observations contained drift.

Any time that data analysis is used to estimate a parameter, there is usually a set of assumptions that must be checked to ensure that the parameter estimate will be valid. In the case of estimating random error from a set of observations from the same sample, an assumption is that the errors are IIDN, which means that the errors are independently and identically distributed in a normal distribution with mean zero and variance sigma squared. This can be checked visually by examining a plot of the observations vs. time, the distribution of the residuals, the residuals vs. time, or any other plot that makes sense.

The model is $Y_i = \eta_i + \varepsilon_i$, and the residuals are simply $\hat{Y}_i - Y_i$ (predicted minus observed).
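
To show why the assumption matters, here is a small simulation sketch. It assumes the drift is linear in run order, which is my simplification and not necessarily what the earlier entry's data looked like. The true SD, drift rate, and number of observations are invented values.

```python
import random
import statistics


def residual_sd_after_linear_fit(y):
    """SD of the residuals after removing a straight-line trend
    (least squares fit of the observations against run order)."""
    n = len(y)
    t = list(range(n))
    t_mean = statistics.fmean(t)
    y_mean = statistics.fmean(y)
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    # Residual = predicted minus observed, as in the text.
    residuals = [(intercept + slope * ti) - yi for ti, yi in zip(t, y)]
    # Two fitted parameters, so n - 2 degrees of freedom.
    return (sum(r * r for r in residuals) / (n - 2)) ** 0.5


random.seed(1)
true_sd = 1.0        # hypothetical random error
drift_per_run = 0.1  # hypothetical linear drift per observation
y = [100 + drift_per_run * i + random.gauss(0, true_sd) for i in range(50)]

raw_sd = statistics.stdev(y)                    # ignores the drift
detrended_sd = residual_sd_after_linear_fit(y)  # drift removed first

print(f"SD of raw observations:  {raw_sd:.2f}")
print(f"SD after removing drift: {detrended_sd:.2f} (true value {true_sd})")
```

The raw SD is inflated by the drift, while the residual SD is close to the random error that was simulated. Plotting the observations against run order would show the same thing visually.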


Total error, EP21, and vindication

February 18, 2018

To recall, total analytical error was proposed by Westgard in 1974. It made a lot of sense to me and I proposed to CLSI that a total analytical error standard should be written. This proposal was approved and I formed a subcommittee, which I chaired, and in 2003 the CLSI standard EP21-A, which is about total analytical error, was published.

When it was time to revise the standard – all standards are considered for revision – I realized that the standard had some flaws. Although the original Westgard article was specific to total analytical error, it seemed that to a clinician, any error that contributed to the final result was important regardless of its source. And for me, who often worked in blood gas evaluations, user error was an important contribution to total error.

Hence, I suggested that the revision be about total error, not total analytical error, and EP21-A2 drafts had total error in the title. Some people within the subcommittee, and particularly one or two people not on the subcommittee but in CLSI management, hated the idea. They threw me off my own subcommittee and ultimately out of CLSI.

But recently (in 2018) a total error task force published an article which contained the statement, to which I have previously referred:

“Lately, efforts have been made to expand the TAE concept to the evaluation of results of patient samples, including all phases of the total testing process.” (I put in the bolding).

Hence, I’m hoping that the next revision, EP21-A3, will be about total error, not total analytical error.

An observation from the ATTD glucose Conference

February 14, 2018

The 11th International Conference on Advanced Technologies and Treatments for Diabetes (ATTD) is underway in Vienna, Austria. The abstracts from the conference are available here. Here’s an interesting observation: I searched for the term MARD and it was found 48 times whereas the term error grid was found only 10 times. I published a paper describing problems with the MARD statistic and offered alternatives.

Comments about clinical chemistry goals based on biological variation – Revised Feb. 7, 2018

February 5, 2018

There is a recent article which says that measurement uncertainty should contain a term for biological variation. The rationale is that diagnostic uncertainty is caused in part by biological variation. My concerns are with how biological variation is turned into goals.

On the Westgard web site, there are some formulas on how to convert biological variation into goals and on another page, there is a list of analytes with biological variation entries and total error goals.

Here are my concerns:

  1. There are three basic uses of diagnostic tests: screening, diagnosis, and monitoring. It is not clear to me what the goals refer to.
  2. Monitoring is an important use of diagnostic tests. It makes no sense to construct a total error goal for monitoring that takes between-patient biological variation into account. The PSA total error goal is listed at 33.7%. Example: For a patient tested every 3 months after undergoing radiation therapy, a total error goal of 33.7% is too big. Thus, for values of 1.03, 0.94, 1.02, and 1.33, the last value is within goals but in reality would be cause for alarm.
  3. The web site listing goals has only one goal per assay. Yet, goals often depend on the analyte value, especially for monitoring. For example, the glucose goal is listed at 6.96%. But if one examines a Parkes glucose meter error grid, at 200 mg/dL, the error goal to separate harm from no harm is 25%. Hence, the biological goal is too small.
  4. The formulas on the web site are hard to believe. For example, I < 0.5 * within-person biological variation, where I is imprecision. Why 0.5, and why is it the same for all analytes?
  5. Biological variation can be thought to have two sources of variation, explained and unexplained, much like in a previous entry where the measured imprecision could be not just random error, but inflated with biases. Thus, PSA could rise due to asymptomatic prostatitis (a condition that by definition has no symptoms and could be present in a “healthy” cohort). Have explained sources of variation been excluded from the databases? And there can be causes of explained variation other than diseases. For example, exercise can cause PSA to rise in an otherwise healthy person.
  6. Biological variation makes no sense for a bunch of analytes. For example, blood lead measures exposure to lead. Without lead in the environment, the blood lead would be zero. Similar arguments apply to drugs of abuse and infectious diseases.
  7. The goals are based on 95% limits from a normal distribution. This leaves up to 5% of results as unspecified. Putting things another way, up to 5% of results could cause serious problems for an assay that meets goals.
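
The PSA example in concern 2 can be checked with a few lines of arithmetic. This sketch assumes that "within goals" means the percent change from the previous value is smaller than the 33.7% total error goal. That reading is my assumption, and the variable names are for illustration.

```python
psa_values = [1.03, 0.94, 1.02, 1.33]  # serial results from the example
goal_pct = 33.7                        # listed PSA total error goal

# Percent change of each result relative to the one before it.
changes = [
    100.0 * (current - previous) / previous
    for previous, current in zip(psa_values, psa_values[1:])
]

for previous, current, change in zip(psa_values, psa_values[1:], changes):
    status = "within goal" if abs(change) < goal_pct else "exceeds goal"
    print(f"{previous:.2f} -> {current:.2f}: {change:+.1f}% ({status})")
```

The rise from 1.02 to 1.33 is about 30%, which is inside the 33.7% goal. An error of that size could therefore not be distinguished from a real rise, even though a real rise would be cause for alarm.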

A simple improvement to total error and measurement uncertainty

January 15, 2018

There has been some recent discussion about the differences between total error and measurement uncertainty, regarding which is better and which should be used. Rather than rehash the differences, let’s examine some similarities:

  1. Both specifications are probability based.
  2. Both are models.

Being probability based is the bigger problem. If you specify limits for a high percentage of results (say 95% or 99%), then either 5% or 1% of results are unspecified. If all of the unspecified results caused problems, this would be a disaster when one considers how many tests are performed in a lab. There are instances of medical errors due to lab test error but these are (probably?) rare (meaning much less than 5% or 1%). But the point is that probability-based specifications cannot account for 100% of the results, because the limits would have to run from minus infinity to plus infinity.

The fact that both total error and measurement uncertainty are models is only a problem because the models are incorrect. Rather than rehash why, here’s a simple solution to both problems.

Add to the specification (either total error or measurement uncertainty) the requirement that zero results are allowed beyond a set of limits. To clarify, there are two sets of limits: an inner set to contain 95% or 99% of results and an outer set that no result should exceed.
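
A check against the two sets of limits could look like the sketch below. It assumes errors are expressed as percent differences from a reference result and that the limits are symmetric. The function name, the limits, and the data are hypothetical.

```python
def meets_two_tier_spec(errors, inner_limit, outer_limit, inner_fraction=0.95):
    """Return True only if both requirements hold:
    - at least inner_fraction of errors fall within +/- inner_limit
    - no error at all falls beyond +/- outer_limit
    """
    if not errors:
        raise ValueError("no results to evaluate")
    n = len(errors)
    within_inner = sum(abs(e) <= inner_limit for e in errors)
    beyond_outer = sum(abs(e) > outer_limit for e in errors)
    return within_inner / n >= inner_fraction and beyond_outer == 0


# Hypothetical evaluation: 100 percent errors, limits of 10% and 20%.
good_run = [5.0] * 96 + [12.0] * 4
bad_run = [5.0] * 96 + [12.0] * 3 + [40.0]  # one large error

print(meets_two_tier_spec(good_run, inner_limit=10.0, outer_limit=20.0))
print(meets_two_tier_spec(bad_run, inner_limit=10.0, outer_limit=20.0))
```

Both runs have 96% of results inside the inner limits, so both would pass a specification based on 95% of results alone. Only the outer limits catch the single large error in the second run.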

Without this addition, one cannot claim that meeting either a total error or measurement uncertainty specification will guarantee quality of results, where quality means that the lab result will not lead to a medical error.