Glucose meter recalls

November 30, 2013


Glucose meters are an example of unit-use devices, meaning that each time a sample is assayed, a new reagent strip (the unit) must be used. Some years ago, manufacturers of unit-use devices argued that QC is less important for their products because, among other reasons, the manufacturing process is more rigorously controlled.

I have been doing some work with glucose meters and note that at least twice this year there have been recalls of reagent strips from two different manufacturers. Here are my reasons why these recalls continue to happen.

  1. Vendors that supply raw materials have provided different lots from those used to design and evaluate the original reagent strip.
  2. Vendor processes have changed.
  3. The glucose meter manufacturer's processes have changed.
  4. The process used to release reagent strip lots is imperfect. It is not as rigorous as a full-blown method comparison, and the parameters measured may not reflect all aspects of performance.
    1. The process parameter limits may not be correct.
    2. Some key variables may not be measured.
    3. The sample size may not be adequate.
    4. And last but not least, people make mistakes!

From my time working for manufacturers, the recall sequence was usually: the service department received complaints from customers, the complaints were verified in-house, and a recall was initiated.

Don’t forget the mountain plot

November 28, 2013


The mountain plot, created by my colleague Mike Lynch while we were at Ciba Corning, is part of the CLSI standard EP21-A. We used it extensively at Ciba Corning, but it has not been very popular, as it is not often cited. The Bland-Altman plot, on the other hand, is frequently cited.

But this is not a competition. Sometimes one plot is better, sometimes the other, and often both should be shown. I was at a glucose meter conference this September in Washington DC where someone was presenting data for two glucose meters vs. reference using a Bland-Altman plot. He should have been using a mountain plot. I don't have his data, but this is an example of when the mountain plot is better than the Bland-Altman plot.


With the Bland-Altman plot, the pattern of the “bad” vs. “good” assay is harder to see than with the mountain plot. Moreover, as more data gets added, the Bland-Altman plot becomes a mess of dots, whereas the mountain plot remains sharp. If there were 3 or 4 glucose meters, the mountain plot would be even better.


To construct a mountain plot in a spreadsheet:

  1. Calculate the differences between the candidate and reference assay
  2. Sort the differences from low to high
  3. Rank the sorted differences
  4. Calculate the cumulative probability as rank / (number of observations + 1)
  5. Calculate the adjusted (folded) cumulative probability: if the cumulative probability is greater than 0.5, use 1 − cumulative probability; otherwise, use the cumulative probability itself. Plotting the adjusted values against the sorted differences produces the mountain shape.
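The spreadsheet steps above can be sketched in a few lines of Python. This is a minimal illustration, not the original implementation; the function names and the simulated data are my own, and the simulated meter values are purely for demonstration.

```python
import numpy as np

def mountain_plot_points(candidate, reference):
    """Return (sorted differences, folded cumulative probabilities)."""
    # Steps 1-2: differences between candidate and reference, sorted low to high
    diffs = np.sort(np.asarray(candidate) - np.asarray(reference))
    n = len(diffs)
    # Step 3: ranks of the sorted differences
    ranks = np.arange(1, n + 1)
    # Step 4: cumulative probability = rank / (n + 1)
    cum_prob = ranks / (n + 1)
    # Step 5: fold the upper half so the plot peaks near the median difference
    folded = np.where(cum_prob > 0.5, 1 - cum_prob, cum_prob)
    return diffs, folded

# Hypothetical example: one simulated meter vs. a common reference (mg/dL)
rng = np.random.default_rng(0)
ref = rng.uniform(70, 300, 200)
meter = ref + rng.normal(2, 8, 200)  # small bias plus random noise
x, y = mountain_plot_points(meter, ref)
# Plot y vs. x for each meter on the same axes to compare assays, e.g.:
# import matplotlib.pyplot as plt; plt.plot(x, y); plt.show()
```

Overlaying the (x, y) curves for several meters on one set of axes is what makes the comparison easy to read; the peak of each curve sits at that meter's median difference from reference.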

Pay for Performance (P4P) and total error

November 9, 2013


P4P has been around for a number of years as a way to reward and punish physicians financially based on selected measures. P4P has been widely criticized, here for example and also in the NEJM.

I disagree with one P4P criticism: that the work of a physician is so complex that judging performance is impossible (that it would be easier to split the atom). The problem is rather that only a subset of performance measures has been chosen, and without the entire set of performance measures, the result is similar to my last post, although in this case it is like having a dictionary with only the letter "C."

An alternative would be to use a total error concept. That is, for any patient care episode, what errors have been made? Errors could include not only harm but financial waste as well. Thus, if a physician ordered unnecessary blood tests, financial waste has occurred. The value of total error is that there is no modeling: all errors would be captured by examining whether there is harm (or waste) in the care of a patient. Thus, there is no list of performance measures. And yes, there can be a physician error even when the patient presents with a complex set of symptoms. But the concept is impractical because one would need a panel of experts for each patient encounter – see for example a NEJM case study.

Another approach would be to evaluate known cases of patient harm in an NTSB type of approach. Here the goal would be to understand causes in order to reduce error rates. In certain cases, such as incompetence, physicians would be punished, but in other cases process improvements, including better training, would be used.

A dictionary without the letter W, or the black hole of data

November 3, 2013


A recent review article on glucose meters deserves comment. Its title is: Assessing the Analytical Performance of Systems for Self-Monitoring of Blood Glucose: Concepts of Performance Evaluation and Definition of Metrological Key Terms.

In this article, the Westgard model for total error is used. There are the usual four pictures of data superimposed on bull's-eyes, showing all combinations of high and low precision and trueness. Below is a picture of assay drift that never makes it into these discussions; the problem is that it doesn't fit into any of the four combinations of high and low precision and trueness. The authors reference an article I wrote in which I critiqued the Westgard model and suggested a different and more complete total error model. The authors of the current review say that factors such as drift and interferences "may go beyond the scope of this review." But it makes no sense to leave out error sources and at the same time claim you have total error. It's like publishing a dictionary of the English language but leaving out words that begin with the letter W.