EFLM – after three years it’s disappointing

February 15, 2017


Thanks to Sten Westgard, whose website alerted me to an article about analytical performance specifications. Thanks also to Clin Chem Lab Med for making this article available without a subscription.

To recall, the EFLM task group was going to fill in details about performance specifications as initially described by the Milan conference held in 2014.

Basically, what this paper does is assign analytes (not all analytes that can be measured, but a subset) to one of three categories for how to arrive at analytical performance specifications: clinical outcomes, biological variation, or state of the art. Note that no specifications are provided – only which analytes are in which categories. It doesn’t seem like this should have taken three years.

And I don’t agree with this paper.

For one, talking about “analytical” performance specifications implies that user error or other mishaps that cause errors are not part of the deal. This is crazy because the preferred option is the effect of assay error on clinical outcomes. It makes no sense to exclude errors just because their source is not analytical.

I don’t agree with the second and third options ever playing a role (biological variation and state of the art). My reasoning follows:

If a clinician orders an assay, the test must have some use for the clinician to decide on treatment. If this is not the case, the only reason a clinician would order such an assay is that he has to make a boat payment and needs the funds.

So, for example, say the clinician will provide treatment A (often no treatment) if the result falls within X1-X2, and treatment B if the result is greater than X2. Of course this is oversimplified, since other factors besides the assay result are involved. But if the assay result is 10 times X2 while the true value is between X1 and X2, the clinician will make the wrong treatment decision because of laboratory error. I submit that this model applies to all assays and that if one assembles clinician opinion, one can construct error specifications (see last sentence at bottom).
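The decision model above can be sketched in a few lines of code. Everything here is hypothetical – the thresholds, the values, and the function name are invented purely for illustration:

```python
# Minimal sketch of the decision model above; thresholds and values
# are hypothetical, chosen only for illustration.
def treatment(result, x1, x2):
    """Simplified rule: treatment A if the result falls within X1-X2,
    treatment B if the result is greater than X2."""
    return "B" if result > x2 else "A"

x1, x2 = 70, 110       # hypothetical decision limits
true_value = 100       # truth is within X1-X2, so treatment A is correct
reported = 10 * x2     # grossly erroneous assay result
# treatment(true_value, x1, x2) gives "A", but treatment(reported, x1, x2)
# gives "B" - a wrong treatment decision caused purely by assay error.
```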

Other comments:

In the event that outcome studies do not exist, the authors encourage double-blind randomized controlled trials. Get real, people – these studies would never be approved! (e.g., feeding clinicians the wrong answer to see what happens).

The authors also suggest simulation studies, but I have previously commented that the premier simulation study they cite is flawed (the Boyd and Bruns glucose meter simulations).

The Milan 2014 conference rejected the use of clinician opinion to establish performance specifications. I don’t see how clinical chemists and pathologists trump clinicians.

Revisiting Bland Altman plots and a paranoia

February 13, 2017


Over 10 years ago I submitted a paper critiquing Bland Altman plots. Since the original publication of Bland Altman plots was the most cited paper ever in The Lancet, I submitted my paper with some temerity.

Briefly, the issue is this. When one is comparing two methods, Bland and Altman suggest plotting the difference (Y-X) vs. the average of the two methods, (Y+X)/2. They also stated in a later paper (1) that even if the X method is a reference method (they use the term gold standard), one should still plot the difference against the average, and that not doing so is misguided and will lead to spurious correlation. They attempted to prove this with formulas.

Not being so great in math, but doubting their premise, I did some simulations. The results are shown in the table below. Basically, this says that when you have two field methods, you should plot the difference vs. (Y+X)/2, as Bland and Altman suggest. But when you have a field method and a reference method, you should plot the difference vs. X. The values in the table are the correlation coefficients for Y-X vs. X and for Y-X vs. (Y+X)/2 (after repeated simulations where Y is always a field method and X is either a field method or a reference method).


Case                   Y-X vs. X    Y-X vs. (Y+X)/2
X = Reference method   ~0           ~0.1
X = Field method       ~-0.12       ~0
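For readers who want to reproduce this, here is a minimal sketch of the kind of simulation involved (my original program was different; the distribution, error SDs, and sample size here are assumptions chosen only to show the pattern):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 10_000
truth = rng.normal(100, 20, n)   # true analyte values (assumed distribution)
field_sd = 4                     # assumed field-method error SD

def correlations(x_err_sd):
    """Return corr(Y-X, X) and corr(Y-X, (Y+X)/2)."""
    x = truth + rng.normal(0, x_err_sd, n)   # X method
    y = truth + rng.normal(0, field_sd, n)   # Y is always a field method
    d = y - x
    return np.corrcoef(d, x)[0, 1], np.corrcoef(d, (x + y) / 2)[0, 1]

ref_vs_x, ref_vs_avg = correlations(0)         # X = reference (no error)
fld_vs_x, fld_vs_avg = correlations(field_sd)  # X = field method
# ref_vs_x and fld_vs_avg come out near zero, while ref_vs_avg is
# positive and fld_vs_x is negative - the pattern shown in the table.
```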


The paranoia

I submitted my paper as a technical brief to Clin Chem and included my simulation program as an appendix. After being told to recast the paper as a Letter, it was rejected. I submitted it to another journal (I think it was Clin Chem Lab Med) and it was also rejected. I then submitted my letter to Statistics in Medicine (2) where it was accepted.

Now in the lab medicine field, I am known by the other statisticians, and sometimes have published papers not to their liking. Regarding Statistics in Medicine, I am an unknown and lab medicine is a small part of Statistics in Medicine. So maybe, my paper was judged solely on merit or maybe I’m just paranoid.


  1. Bland JM, Altman DG. (1995) Comparing methods of measurement – why plotting difference against standard method is misleading. Lancet, 346, 1085-1087.
  2. Krouwer JS. (2008) Why Bland-Altman plots should use X, not (Y+X)/2 when X is a reference method. Statistics in Medicine, 27, 778-780.

Help with sigma metric analysis

January 27, 2017


I’ve been interested in glucose meter specifications and evaluations. There are three sources of glucose meter specifications:

FDA glucose meter guidance
ISO 15197:2013
glucose meter error grids

There are various ways to evaluate glucose meter performance. What I wished to look at was the combination of sigma metric analysis and the error grid. I found this article about the sigma metric analysis and glucose meters.

After looking at this, I understand how to construct these so-called method decision charts (MEDx). But here’s my problem: in these charts, the total allowable error (TEa) is a constant – this is not the case for the TEa of error grids, where the TEa changes with the glucose concentration. Moreover, it is not even the same at a specific glucose concentration, because the “A” zone limits of an error grid (I’m using the Parkes error grid) are not symmetrical.

I have simulated data with a fixed bias and constant CV throughout the glucose meter range. But with a changing TEa, the estimated sigma also changes with glucose concentration.
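The difficulty can be made concrete with a sketch of the sigma metric calculation, sigma = (TEa% - |bias%|) / CV%, using a glucose-dependent TEa. The bias, CV, and allowable-error percentages below are invented placeholders, not the actual Parkes A-zone limits:

```python
# Sigma metric: sigma = (TEa% - |bias%|) / CV%.
# All numbers below are hypothetical, for illustration only.
def sigma_metric(tea_pct, bias_pct, cv_pct):
    return (tea_pct - abs(bias_pct)) / cv_pct

bias_pct, cv_pct = 2.0, 4.0   # assumed constant bias and CV across the range
# (glucose mg/dL, assumed allowable error % at that concentration)
tea_by_level = [(50, 25.0), (100, 15.0), (200, 12.0), (300, 10.0)]
sigmas = {glucose: sigma_metric(tea, bias_pct, cv_pct)
          for glucose, tea in tea_by_level}
# Even with fixed bias and CV, sigma differs at every glucose level
# because TEa differs - which is exactly the difficulty described above.
```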

So I’m not sure how to proceed.

The Diabetes Technology Society (DTS) surveillance protocol doesn’t seem right

January 16, 2017


The Diabetes Technology Society (DTS) has published a protocol that allows a glucose meter to be tested to see if it meets the DTS seal of approval. This was instituted because, for some FDA-approved glucose meters, the performance of meters after release for sale did not meet ISO standards.

Before the DTS published their protocol, they published a new glucose meter error grid – the surveillance error grid.

But what I don’t understand is that the error grid is not part of the DTS acceptance criteria for the seal of approval (the error grid is only plotted as supplemental material). Basically, to get DTS approval, one has to show that enough samples have differences from reference that fall within the ISO 15197:2013 standard. To be fair, the ISO standard and the “A” zone of the error grid have similar limits. But why not use the error grid, since it was developed by clinicians, whereas the ISO standard is weighted toward industry members? And the error grid, unlike the ISO standard, deals with results in the higher zones.

Moreover, the DTS does not deal with outliers other than to categorize them – their presence does not disqualify a meter from getting DTS acceptance as long as the percentage of results within ISO limits is high enough.

So if a meter has a 1% rate of values that could kill a patient, it could still gain DTS seal of approval. This doesn’t seem right.
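To illustrate the point, here is a sketch of the ISO 15197:2013 accuracy criterion as I understand it (at least 95% of results within ±15 mg/dL of reference below 100 mg/dL, or within ±15% at or above 100 mg/dL). The sample data are invented:

```python
# Hedged sketch of the ISO 15197:2013 accuracy criterion; the sample
# data are invented to show that a meter with a dangerous outlier can pass.
def within_iso(meter, reference):
    if reference < 100:
        return abs(meter - reference) <= 15            # within +/-15 mg/dL
    return abs(meter - reference) <= 0.15 * reference  # within +/-15%

def meter_passes(pairs):
    ok = sum(within_iso(m, r) for m, r in pairs)
    return ok / len(pairs) >= 0.95

# 99 accurate results plus one result reading 300 mg/dL when the
# patient is actually hypoglycemic at 40 mg/dL:
pairs = [(102.0, 100.0)] * 99 + [(300.0, 40.0)]
# meter_passes(pairs) is True: 99% within limits clears the 95% bar,
# so the life-threatening outlier does not disqualify the meter.
```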


Book about noninvasive glucose meters

December 12, 2016


Noninvasive glucose meters are the Holy Grail in glucose testing. To be able to get a glucose value without a finger stick would be a tremendous benefit to the millions of people who have to test themselves several times each day.

So there have been scores of scientists who have worked on the problem, backed by diagnostic companies, since the profit potential is huge.

I remember while at Ciba Corning, attending a lecture on near infrared spectroscopy given by a professor whom I think we were supporting to try to come up with a noninvasive glucose meter.

On a website devoted to diabetes, I became aware of a book which chronicles the quest for a noninvasive glucose meter. It is recent (2015 publication date), free, and written by a former chief scientific officer and VP of LifeScan who has been involved in this search for years.

I found it fascinating.

Test error and healthcare costs

December 7, 2016


Conventional wisdom says that regulatory authorities approve assays that have the highest quality, meaning that the errors are small enough that little or no harm will arise from a clinician making a wrong medical decision based on test error.

It is also true, although not talked about, that in most countries healthcare is rationed – the cost of treating everyone with every possible treatment is too high.

So here’s a hypothetical example using glucose meters.

First, we start out with the status quo for existing glucose meter quality and assume that, on average across all tests, there will be some harm due to glucose meter error. The percentage of tests that harm people is unknown, as is the range of harm, but assume that these can be ascertained and do occur.

As for the hypothetical part…

There are two new glucose meters seeking approval.

Meter A costs 100 times as much as current meters and is guaranteed to have zero error, as it is a breakthrough technology. Its use will reduce patient harm due to test error to zero.

Meter B costs one-hundredth as much as current meters but isn’t quite as accurate or reliable. Patient harm will increase with the use of meter B.

If meter A is approved, because of healthcare rationing, costs will have to be transferred from other parts of healthcare to pay for meter A.

If meter B is approved, costs can be transferred from glucose meter testing to other parts of healthcare.

The point is not to try to answer whether meter A or meter B should be approved, but to illustrate that the cost issues associated with healthcare policy always exist but are rarely discussed.

Letter to be published

November 15, 2016


Recently, I alerted readers to the fact that the updated FDA POCT glucose meter standard no longer specifies 100% of the results.

So I submitted a letter to the editor to the Journal of Diabetes Science and Technology.

This letter has been accepted – it seemed to take a long time for the editors to decide about my letter. I can think of several possible reasons:

  1. I was just impatient – the time to reach a decision was average.
  2. The editors were exceptionally busy due to their annual conference, which just took place.
  3. By waiting until the conference, the editors could ask the FDA if they wanted to respond to my letter.

I’m hoping that #3 is the reason so I can understand why the FDA changed things.