Advice to prevent another Theranos

May 28, 2018

It is not surprising that there are a bunch of articles about Theranos. One article, from Clin Chem Lab Med, stated: “We highlight the importance of transparency and the unacceptability of fraud and false claims.”

And one of the items in the table that followed was:

“Do not make false claims about products…”

Is the above really worth publishing? On the other hand, the article talks about an upcoming movie about Theranos starring Jennifer Lawrence. Now that is worth publishing.

Big errors and little errors

May 27, 2018

In clinical assay evaluations, most of the time, focus is on “little” errors. What I mean by little errors are average bias and imprecision that exceed goals. Now I don’t mean to be pejorative about little errors since if bias or imprecision don’t meet goals, the assay is unsuitable. One of the reasons to distinguish between big and little errors is that often in evaluations, big errors are discarded as outliers. This is especially true in proficiency surveys but even for a simple method comparison, one is justified in discarding an outlier because the value would otherwise perturb the bias and imprecision estimates.

But big errors cause big problems, and most evaluations focus on little errors, so how are big errors studied? Other than running thousands of samples, a valuable technique is to perform an FMEA (Failure Mode and Effects Analysis). Besides the usual items, this can and should cover user error, software, and interferences. An FMEA study is often not received very enthusiastically, but it is a necessary step in trying to ensure that an assay is free from both big and little errors. Of course, even with a completed FMEA, there are no guarantees.


More on Theranos and Bad Blood

May 25, 2018

I finished the book Bad Blood, which chronicles the Theranos events. It was hard to put the book down, as one seldom hears about events like this in one's own field.

I have experienced some of the things that happened at Theranos, as I suspect many others have, since they are not unique to Theranos. For example:

  • Bad upper management, including a charismatic leader
  • Hiring unqualified people
  • Establishing unrealistic product development schedules
  • Losing good people
  • Having design requirements that make little sense but cause project delays
  • Poor communication among groups

But I never experienced falsifying data.

Theranos started in 2003. By 2006, they were reporting patient results on a prototype. From 2006 until 2015, when the Wall Street Journal article appeared, they were unable to get their system to work reliably. Nine years is way too long for me-too technology; the bullet points above may be an explanation.

Finally, a pathologist who wrote an amateur pathology blog was the source of the tip to the Wall Street Journal reporter (and Bad Blood author).

Added 5/26/18 – Among the Theranos claims was to be able to report hundreds of tests from a drop of blood. Had this been achieved it would have been remarkable. Another was that with all of these blood tests performed by Theranos, healthcare would be dramatically improved. This claim never made any sense. Most people are tested today with as many blood tests as needed.


CLSI EP7 3rd Edition

May 24, 2018

I have critiqued how results are presented in the previous version of EP7, where an example is given that if an interference is found to be less than 10% (also implied as less than whatever goal is chosen), the substance can be said not to interfere.

This is in Section 9 of the 2nd Edition. I am curious if this problem is in the 3rd edition but not curious enough to buy the standard.

Theranos and Bad Blood

May 22, 2018

Having watched the 60 Minutes story about Theranos, I bought the book that was mentioned, Bad Blood. Actually, I preordered it (for Kindle) but it was available the next day.

Here is an early observation. Apparently, even back in 2006, demonstrations of the instrument were faked. That is, if a result could not be obtained with the instrument, it was nevertheless obtained through trickery.

At the time, there were several small POC analyzers on the market. So I don’t see what was so special about the Theranos product.  And with all those scientists and engineers, why was it so difficult to reliably obtain a result? Early on, the Theranos product used microfluidics but then changed to robotics.

More to follow.

Mandel and Westgard

May 20, 2018

Readers may know that I have been known to critique Westgard’s total error model.

But let’s step it back to 1964 with Mandel’s representation of total error (1), where:

Total Error (TE) = x - R = (x - mu) + (mu - R) with

x = the sample measurement
R = the reference value and
mu = the population mean of the sample

Thus, mu-R is the bias and x-mu the imprecision – the same as the Westgard model. There is an implicit assumption that the replicates of x which estimate mu are only affected by random error. For example, if the observations of the replicates contain drift, the Mandel model would be incorrect. For replicates sampled close in time, this is a reasonable assumption, although it is rarely if ever tested.
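The decomposition can be checked numerically. The following is a minimal sketch, not taken from Mandel: the replicate values and the reference value are made up for illustration, and the mean of the replicates stands in as the estimate of mu.

```python
from statistics import mean, stdev

# Hypothetical replicate measurements of one sample and a hypothetical
# reference value R (illustrative numbers only, not from any study).
replicates = [101.2, 99.8, 100.9, 102.1, 100.4]
reference = 98.0

mu_hat = mean(replicates)                            # estimate of mu
bias = mu_hat - reference                            # mu - R
random_errors = [x - mu_hat for x in replicates]     # x - mu
total_errors = [x - reference for x in replicates]   # x - R

# Each total error is the sum of its random part and the common bias.
for te, re in zip(total_errors, random_errors):
    assert abs(te - (re + bias)) < 1e-9

print(f"bias = {bias:.2f}, SD of replicates = {stdev(replicates):.2f}")
```

The sketch only confirms the identity. It says nothing about whether the replicates are free of drift, which is the assumption the model rests on.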

Interferences are not a problem because even if they exist, there is only one sample. Thus, interference bias is mixed in with any other biases in the sample.

Total error is often expressed for 95% of the results. I have argued that 5% of results are unspecified but if the assumption of random error is true for the repeated measurements, this is not a problem because these results come from a Normal distribution. Thus, the probability is extremely remote that high multiples of the standard deviation will occur.

But outliers are a problem. Typically for these studies, outliers (if found) are deleted because they will perturb the estimates. The problem is that the outliers are usually not dealt with, and now the 5% of unspecified results become a problem.

If no outliers are observed, this is a good thing, but here are the upper 95% confidence limits for the outlier rate when 0 outliers have been found in the indicated number of sample replicates.

N          Maximum outlier rate (upper 95% confidence limit)

10         25.9%
100        3.0%
1,000      0.3%
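These limits can be reproduced with a short calculation. This is a sketch under one assumption: each replicate is treated as an independent trial, so the chance of seeing 0 outliers in N results is (1 - p)^N, and the limit is the value of p at which that chance falls to 5%. The function name is made up for illustration.

```python
def max_outlier_rate(n, confidence=0.95):
    """Upper confidence limit on the outlier rate when 0 outliers are seen in n results."""
    # Solve (1 - p)**n = 1 - confidence for p.
    return 1.0 - (1.0 - confidence) ** (1.0 / n)

for n in (10, 100, 1000):
    print(f"{n:>5}  {100 * max_outlier_rate(n):.1f}%")
# prints 25.9%, 3.0% and 0.3% for N = 10, 100 and 1000
```

For large N the result is close to 3/N, the familiar "rule of three".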

So if one is measuring TE for a control or patient pool and keeping the time between replicates short, then the Westgard model estimate of total error is reasonable, although one still has to worry about outliers.

But when one applies the Westgard model to patient samples, it is no longer correct, since each patient sample can have a different amount of interference bias. And while large interferences are rare, interferences can come in small amounts and affect every sample, inflating the total error. Moreover, other sources of bias can be expected with patient samples, such as user error in sample preparation. And with patient samples, outliers, while still rare, can occur.
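A small simulation can illustrate the point. It is only a sketch: the bias, imprecision, and spread of interference values are made-up numbers, and modeling the per-sample interference bias as a Normal variable is an assumption chosen for convenience.

```python
import random

def total_error_95(errors):
    """Empirical limit that contains 95% of the absolute errors."""
    ordered = sorted(abs(e) for e in errors)
    return ordered[int(0.95 * len(ordered)) - 1]

rng = random.Random(1)   # fixed seed so the run is repeatable
N = 20000
BIAS = 2.0               # assumed average bias
SD = 3.0                 # assumed imprecision
INTERFERENCE_SD = 3.0    # assumed spread of per-patient interference bias

# Control or pool: one sample measured repeatedly, so only random error varies.
control_errors = [BIAS + rng.gauss(0, SD) for _ in range(N)]

# Patients: every sample also carries its own interference bias.
patient_errors = [
    BIAS + rng.gauss(0, SD) + rng.gauss(0, INTERFERENCE_SD) for _ in range(N)
]

te_control = total_error_95(control_errors)
te_patient = total_error_95(patient_errors)
print(f"95% total error, control:  {te_control:.1f}")
print(f"95% total error, patients: {te_patient:.1f}")
```

With these assumed values, the patient limit is larger than the control limit even though the average bias and imprecision are identical, which is why an estimate from controls can understate the error seen with patient samples.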

This raises the question as to the interpretation of results from a study that uses the Westgard model (such as a Six Sigma study). These studies typically use controls but the implication is that they inform about the quality of the assay – meaning of course for patient samples. This is a problem for the reasons stated above. So one can say that if an assay has a bad six sigma value, the assay has a problem, but if the assay has a good six sigma value, one cannot say the assay is without problems.



  1. Mandel J. The statistical analysis of experimental data. Dover, NY, 1964, p 105.


Speaking of interferences …

May 13, 2018

I have discussed some shortcomings about how interferences are handled. This reminded me of something that I and my coworker published a number of years ago (1).

The origin of this publication came from Dr. Stan Bauer at Technicon Instruments. He was a pathologist with a passion for statistics. He had hired Cuthbert Daniel, a well-known consulting statistician who developed a protocol for the SMA analyzer. This was a nine sample long run of three concentration levels that provided an estimate of precision, proportional and constant bias, sample carryover, linear drift, and nonlinearity. The reason that the protocol worked was the choice of the sample order provided by Cuthbert Daniel.

In 1985, I chose to make a CLSI standard out of the protocol – EP10. It is now in version A3 AMD. (I have no idea what the AMD means).

The protocol could be extended to provide even more information by adding a candidate interfering substance to up to all three concentration levels. Since each level is repeated three times, the interference is added to only one replicate. Using multiple regression, one can now estimate 8 parameters: the original parameters plus the bias (if any) for each of the three interfering substances.

Now one run is virtually useless, but at Ciba Corning, we ran these protocols repeatedly during the development of an assay, so that with multiple runs, if a substance interfered, it would be detected.


  1. Krouwer JS and Monti KL. A Modification of EP10 to Include Interference Screening. Clin. Chem., 41, 325-6 (1995).

A simple example of why the CLSI EP7 standard for interference testing is flawed

May 10, 2018

I have recently suggested that the CLSI EP7 standard causes problems (1). Basically, EP7 says that if an interfering substance results in an interference less than the goal (commonly set at 10%), then the substance can be reported not to interfere. Of course, this makes no sense. If a substance interferes at a level less than 10%, it still interferes!

Here’s a real example from the literature (2). Lorenz and coworkers say “substances frequently reported to interfere with enzymatic, electrochemical-based transcutaneous CGM systems, such as acetaminophen and ascorbic acid, did not affect Eversense readings.”

Yet in their table of interference results they show:

at 74 mg/dL of glucose, interference from 3 mg/dL of acetaminophen is -8.7 mg/dL

at 77 mg/dL of glucose, interference from 2 mg/dL of ascorbic acid is 7.7 mg/dL
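To put the two table entries on the same scale as a percentage goal, each interference can be expressed as a percent of the glucose level. This is a minimal arithmetic sketch using only the two values quoted above; the function name is made up for illustration.

```python
def percent_interference(glucose_mg_dl, interference_mg_dl):
    """Interference expressed as a percentage of the glucose concentration."""
    return 100.0 * interference_mg_dl / glucose_mg_dl

# (substance, glucose level in mg/dL, reported interference in mg/dL)
reported = [
    ("acetaminophen", 74.0, -8.7),
    ("ascorbic acid", 77.0, 7.7),
]

for substance, glucose, interference in reported:
    pct = percent_interference(glucose, interference)
    print(f"{substance}: {interference:+.1f} mg/dL = {pct:+.1f}%")
# acetaminophen works out to about -11.8% and ascorbic acid to about +10.0%
```

Either way the percentage is read against a goal, a nonzero effect of this size is hard to describe as "did not affect" the readings.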


  1. Krouwer, J.S. Accred Qual Assur (2018).
  2. Lorenz C., Sandoval W, and Mortellaro M. Interference Assessment of Various Endogenous and Exogenous Substances on the Performance of the Eversense Long-Term Implantable Continuous Glucose Monitoring System. DIABETES TECHNOLOGY & THERAPEUTICS Volume 20, Number 5, 2018 Mary Ann Liebert, Inc. DOI: 10.1089/dia.2018.0028.

Sending protocols to the FDA

May 6, 2018

It seems that recently, companies are sending proposed evaluation protocols to the FDA for review. This never made sense to me. After all, what is expected in an evaluation protocol is not a mystery but prescribed in most cases as available requirements (e.g., three clinical sites). In some cases such as for a waiver assay, FDA guidance is quite specific as to the protocol and analysis. Moreover, sending anything to the FDA has the possible consequence that it will come back with comments (as in we think that you should do XYZ rather than ABC). Hence, before collecting any data, you already have a problem!

Now, when sending the protocol was unavoidable, I recommended that the proposed protocol be as “vanilla” as possible. For example, I recommended as a precision analysis plan “we will follow EP5.” The problem is that if you propose something detailed, you will be expected to follow it, and often the course of an analysis depends on the data. Thus, you may have to deviate from what you said.