A selected catalog of critiques

July 12, 2018

The highlighted articles can be viewed without a subscription.

Imprecision calculations – Evaluations have commonly reported total imprecision as less than within-run imprecision, which is impossible since total imprecision includes the within-run component. The correct calculations are explained.

How to Improve Estimates of Imprecision Clin. Chem., 30, 290-292 (1984)
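A sketch of the correct calculation (illustrative numbers, not from the article): total imprecision combines the within-run and between-run variance components from a nested design, so it can never come out less than within-run imprecision.

```python
import numpy as np

# Hypothetical data: 5 runs, 3 replicates per run (illustrative values)
runs = np.array([
    [100.2, 100.5,  99.8],
    [101.1, 101.4, 100.9],
    [ 99.5,  99.9,  99.7],
    [100.8, 100.6, 101.0],
    [ 99.9, 100.3, 100.1],
])
k, n = runs.shape
run_means = runs.mean(axis=1)

# Within-run variance: pooled variance of replicates about their run means
var_within = ((runs - run_means[:, None]) ** 2).sum() / (k * (n - 1))

# Between-run variance component from one-way ANOVA
ms_between = n * ((run_means - runs.mean()) ** 2).sum() / (k - 1)
var_between = max((ms_between - var_within) / n, 0.0)

# Total imprecision includes both components, so sd_total >= sd_within
sd_within = np.sqrt(var_within)
sd_total = np.sqrt(var_within + var_between)
```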

Total error models – Modeling total error by adding imprecision to bias is popular but fails to account for several other error sources. These articles (and others) provide alternative models.

Estimating Total Analytical Error and Its Sources: Techniques to Improve Method Evaluation Arch Pathol Lab Med., 116, 726-731 (1992)

Setting Performance Goals and Evaluating Total Analytical Error for Diagnostic Assays Clin. Chem., 48: 919-927 (2002)
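A minimal sketch of why the bias-plus-imprecision model can understate error (all numbers invented for illustration): the model is blind to error sources that act on individual samples, such as an interference.

```python
# Popular total error model: TE = |bias| + 1.96 * SD (95% limit)
bias, sd = 2.0, 1.5                 # illustrative average bias and imprecision
te_model = abs(bias) + 1.96 * sd    # = 4.94

# An interference adds +4.0 to this particular specimen -- a source
# the bias + imprecision model never sees
interference = 4.0
actual_error = bias + interference  # 6.0, well beyond the modeled 4.94
```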

Overly optimistic project completion schedules – Project managers forecast completion dates that were never met. The article shows how to get better completion estimates using past data.

Beware the Percent Completion Metric Research Technology Management, 41, 13-15, (1998)
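One generic way to use past data, sketched below with made-up numbers (not necessarily the article's exact method), is to inflate the current forecast by the historical ratio of actual to forecast duration:

```python
# Hypothetical history of (forecast_days, actual_days) for past projects
history = [(100, 150), (80, 130), (120, 170)]

# Average overrun ratio from past data
ratio = sum(actual for _, actual in history) / sum(fc for fc, _ in history)

# Adjust the current manager's estimate by the historical ratio
current_estimate = 90
adjusted_estimate = current_estimate * ratio  # 90 * 1.5 = 135 days
```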

GUM – It was suggested that hospital labs perform the guide to the expression of uncertainty in measurement (GUM). There’s no way a hospital lab could carry out this work.

A Critique of the GUM Method of Estimating and Reporting Uncertainty in Diagnostic Assays Clin. Chem., 49:1818-1821 (2003)

ISO 9001 – There have been many valuable quality initiatives. In the late 80s, ISO 9001 was a program to certify that companies that passed it had high quality. But it amounted to nothing more than documentation – it did nothing to improve quality. Maybe the lab equivalent, ISO 15189, is the same.

ISO 9001 has had no effect on quality in the in-vitro medical diagnostics industry Accred. Qual. Assur., 9: 39-43 (2004)

Bland-Altman plots – Bland-Altman plots (difference plots) suggest plotting the difference y-x vs. (y+x)/2 in order to prevent spurious correlations. But the article below shows that if x is a reference method, following Bland and Altman’s advice will itself produce a spurious correlation. When x is a reference method, y-x should be plotted against x.

Why Bland-Altman plots should use X, not (Y+X)/2 when X is a reference method Statistics in Medicine, 27 778-780 (2008)
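The effect is easy to simulate (assuming a reference method with negligible error and a test method with random error only):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000
true = rng.uniform(90, 110, n)        # true analyte concentrations
x = true                              # reference method: negligible error
y = true + rng.normal(0, 6, n)        # test method: random error only, no bias

d = y - x
r_vs_mean = np.corrcoef(d, (x + y) / 2)[0, 1]  # spurious correlation appears
r_vs_x = np.corrcoef(d, x)[0, 1]               # near zero, as it should be
```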

Six Sigma – This metric is often presented as a sole quality measure, but it basically measures only average bias and imprecision. As this article shows, there can be severe problems with an assay even when it has a high sigma.

Six Sigma can be dangerous to your health Accred Qual Assur 14 49-52 (2009)
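The usual sigma-metric calculation, sketched with invented numbers, shows what the metric can and cannot see:

```python
# Sigma metric from allowable total error, average bias, and imprecision
# (all values in percent; numbers are illustrative)
def sigma_metric(tea_pct, bias_pct, cv_pct):
    return (tea_pct - abs(bias_pct)) / cv_pct

sigma = sigma_metric(tea_pct=10.0, bias_pct=1.0, cv_pct=1.5)  # 6.0

# A "six sigma" assay by this formula says nothing about rare large
# errors (interferences, sample mix-ups), which the metric cannot see.
```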

Glucose standards – The glucose meter standard ISO 15197 has flaws. This letter pointed out what the experts missed in a question and answer forum.

Wrong thinking about glucose standards Clin Chem, 56 874-875 (2010)

POCT12-A3 – The article explains flaws in this CLSI glucose standard.

The new glucose standard POCT12-A3 misses the mark Journal of Diabetes Science and Technology, 7, 1400–1402 (2013)

Regulatory approval evaluations – The performance of assays during regulatory evaluations is often considerably better than when the assays are in the field. The article gives some reasons why.

Biases in clinical trials performed for regulatory approval Accred Qual Assur, 20:437-439 (2015)

MARD – This metric to classify glucose meter quality leaves a lot to be desired. The article below suggests an alternative.

Improving the Glucose Meter Error Grid with the Taguchi Loss Function Journal of Diabetes Science and Technology, 10 967-970 (2016)
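For illustration (invented data; the article's actual proposal combines the error grid with a Taguchi loss function), here is MARD alongside a simple quadratic, Taguchi-style loss that weights large errors more heavily instead of averaging them away:

```python
import numpy as np

ref   = np.array([ 80.0, 120.0, 200.0,  60.0, 150.0])  # reference glucose (mg/dL)
meter = np.array([ 85.0, 110.0, 210.0,  70.0, 140.0])  # meter readings

rel_err = (meter - ref) / ref
mard = 100 * np.mean(np.abs(rel_err))  # mean absolute relative difference, %

# A quadratic (Taguchi-style) loss: the one large error (+16.7% at
# 60 mg/dL) dominates rather than being diluted by the average
loss = np.mean(rel_err ** 2)
```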


How to insult manufacturers

June 22, 2018

In the third article about commutability, one of the reasons given for non-commutability is said to be use of the wrong calibration model for the assay.

First of all, an incorrect calibration model has nothing to do with commutability. Sure, it causes errors, but so do a lot of things, like coding the software incorrectly.

But what’s worse is the example given. There are a bunch of points that are linear up to a certain level, after which the response drops off. In this example, the calibration model chosen is linear, which of course is wrong. But come on, people – do you really think a manufacturer would mess this up?
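The example is easy to reproduce: fit a straight line to a response that is linear and then flattens, and the residuals show the systematic misfit (numbers below are made up):

```python
import numpy as np

conc = np.array([0.0, 10.0, 20.0, 30.0, 40.0, 50.0, 60.0])
# Response is linear up to ~40, then flattens (saturation)
resp = np.array([0.0, 10.0, 20.0, 30.0, 40.0, 42.0, 43.0])

slope, intercept = np.polyfit(conc, resp, 1)
fitted = slope * conc + intercept
residuals = resp - fitted
# Residuals are negative at both ends and positive in the middle:
# the classic systematic pattern of a straight line fit to curved data
```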

Advice to prevent another Theranos

May 28, 2018

Not surprisingly, there are a bunch of articles about Theranos. An article here from Clin Chem Lab Med wrote: “We highlight the importance of transparency and the unacceptability of fraud and false claims.”

And one of the items in the table that followed was:

“Do not make false claims about products…”

Is the above really worth publishing? On the other hand, the article talks about an upcoming movie about Theranos starring Jennifer Lawrence. Now that is worth publishing.

Big errors and little errors

May 27, 2018

In clinical assay evaluations, most of the time the focus is on “little” errors. What I mean by little errors is average bias and imprecision that exceed goals. I don’t mean to be pejorative about little errors, since if bias or imprecision don’t meet goals, the assay is unsuitable. One reason to distinguish between big and little errors is that in evaluations, big errors are often discarded as outliers. This is especially true in proficiency surveys, but even in a simple method comparison, one is justified in discarding an outlier because the value would otherwise perturb the bias and imprecision estimates.

But big errors cause big problems, and most evaluations focus on little errors, so how are big errors studied? Other than running thousands of samples, a valuable technique is to perform an FMEA (Failure Mode Effects Analysis). This can, and should, cover user error, software, and interferences, in addition to the usual items. An FMEA study is often not received enthusiastically, but it is a necessary step in trying to ensure that an assay is free from both big and little errors. Of course, even with a completed FMEA, there are no guarantees.
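A minimal sketch of the FMEA bookkeeping (failure modes and ratings invented for illustration): each mode gets severity, occurrence, and detectability ratings, and their product – the risk priority number – ranks what to address first.

```python
# Minimal FMEA sketch: rank failure modes by risk priority number (RPN).
# Ratings (1-10 scales) and failure modes are invented for illustration.
failure_modes = [
    # (description, severity, occurrence, detectability)
    ("user enters wrong sample ID", 9, 4, 6),
    ("interfering substance biases result", 8, 3, 8),
    ("software rounding bug", 6, 2, 9),
]
ranked = sorted(failure_modes, key=lambda m: m[1] * m[2] * m[3], reverse=True)
# Highest RPN (9 * 4 * 6 = 216) gets attention first
```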


More on Theranos and Bad Blood

May 25, 2018

I finished the book Bad Blood, which chronicles the Theranos events. It was hard to put down, as one seldom hears about events like this in one’s own field.

I experienced some of the things that happened at Theranos, as I suspect many others have, since they are not unique to Theranos:

  • Bad upper management, including a charismatic leader
  • Hiring unqualified people
  • Establishing unrealistic product development schedules
  • Losing good people
  • Having design requirements that make little sense but cause project delays
  • Poor communication among groups

But I never experienced falsifying data.

Theranos started in 2003. By 2006, they were reporting patient results on a prototype. From 2006 until 2015, when the Wall Street Journal article appeared, they were unable to get their system to work reliably. Nine years is way too long for a me-too technology – the above bullet points may be an explanation.

Finally, a pathologist who wrote an amateur pathology blog was the source of the tip to the Wall Street Journal reporter (and Bad Blood author).

Added 5/26/18 – Among the Theranos claims was the ability to report hundreds of tests from a drop of blood. Had this been achieved, it would have been remarkable. Another claim was that with all of these blood tests performed by Theranos, healthcare would be dramatically improved. This never made any sense: most people today already get as many blood tests as they need.


Theranos and Bad Blood

May 22, 2018

Having watched the 60 minutes story about Theranos, I bought the book that was mentioned, Bad Blood. Actually, I preordered it (for Kindle) but it was available the next day.

Here is an early observation. Apparently, even back in 2006, demonstrations of the instrument were faked. That is, if a result could not be obtained with the instrument, it was nevertheless obtained through trickery.

At the time, there were several small POC analyzers on the market, so I don’t see what was so special about the Theranos product. And with all those scientists and engineers, why was it so difficult to reliably obtain a result? Early on, the Theranos product used microfluidics but then changed to robotics.

More to follow.

A simple example of why the CLSI EP7 standard for interference testing is flawed

May 10, 2018

I have recently suggested that the CLSI EP7 standard causes problems (1). Basically, EP7 says that if an interfering substance results in an interference less than the goal (commonly set at 10%), then the substance can be reported not to interfere. Of course, this makes no sense. If a substance interferes at a level less than 10%, it still interferes!

Here’s a real example from the literature (2). Lorenz and coworkers say “substances frequently reported to interfere with enzymatic, electrochemical-based transcutaneous CGM systems, such as acetaminophen and ascorbic acid, did not affect Eversense readings.”

Yet in their table of interference results they show:

at 74 mg/dL of glucose, interference from 3 mg/dL of acetaminophen is -8.7 mg/dL

at 77 mg/dL of glucose, interference from 2 mg/dL of ascorbic acid is 7.7 mg/dL
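Converting their own table values to percent interference (a two-line calculation) puts both substances at or beyond the common 10% goal at these low glucose levels:

```python
# (interference in mg/dL, glucose level in mg/dL), from the table in (2)
cases = {
    "acetaminophen": (-8.7, 74.0),
    "ascorbic acid": (7.7, 77.0),
}
pct = {name: 100 * d / g for name, (d, g) in cases.items()}
# acetaminophen: about -11.8%; ascorbic acid: 10.0% -- hardly "no interference"
```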


  1. Krouwer, J.S. Accred Qual Assur (2018). https://doi.org/10.1007/s00769-018-1315-y
  2. Lorenz C, Sandoval W, and Mortellaro M. Interference Assessment of Various Endogenous and Exogenous Substances on the Performance of the Eversense Long-Term Implantable Continuous Glucose Monitoring System. Diabetes Technology & Therapeutics, 20(5), 2018. DOI: 10.1089/dia.2018.0028.