HbA1c – use the right model, please

August 31, 2017

I had occasion to read a paper (CCLM paper) about HbA1c goals and evaluation results. This paper refers to an earlier paper (CC paper) which says that Sigma Metrics should be used for HbA1c.

So here are some problems with all of this.

The CC paper says that TAE (which they use) is derived from bias and imprecision. Now I have many blog entries as well as peer reviewed publications going back to 1991 saying that this approach is flawed. That the authors chose to ignore this prior work doesn’t mean the prior work doesn’t exist – it does – or that it is somehow not relevant – it is.

In the CC paper, controls were used to arrive at conclusions. But real data involves patient samples so the conclusions are not necessarily transferable. And in the CCLM paper, patient samples are used without any mention as to whether the CC paper conclusions still apply.

In the CCLM paper, precision studies, a method comparison, linearity, and interferences were carried out. This is hard to understand since the TAE model of (absolute) average bias + 2x imprecision does not account for either linearity or interference studies.
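A minimal sketch of the TAE model as described above, assuming bias and imprecision (SD) are in the same units. The function name and the numbers are hypothetical and are not taken from either paper. Note what is missing from the signature: there is no argument for nonlinearity or for interference, so results from those studies have nowhere to go.

```python
def tae(average_bias, sd, k=2.0):
    """Total analytical error under the simple model: |average bias| + k * SD."""
    return abs(average_bias) + k * sd


# Hypothetical values in %HbA1c units, for illustration only.
print(tae(average_bias=-0.10, sd=0.08))
```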

The linearity study says it followed CLSI EP6 but there are no results to show this (e.g., no reported higher order polynomial regressions). The graphs shown do look linear.

But the interference studies are more troubling. From what I can make of it, the target values are given ± 10% bands and any candidate interfering substance whose data does not fall outside of these bands is said to not clinically interfere (i.e., the absolute bias is less than 10%). But that does not mean there is no bias! To see how silly this is, one could say if the average bias from regression was less than 10% in absolute value, it should be set to zero since there was no clinical interference.

The real problem is that the authors’ chosen TAE model cannot account for interferences – such biases are not in their model. But interference biases still contribute to TAE! And what do the reported values of six sigma mean? They are valid only for samples containing no interfering substances. That’s neither practical nor meaningful.

Now one could better model things by adding an interference term to TAE and simulating various patient populations as a function of interfering substances (including the occurrence of multiple interfering substances). But Sigma Metrics, to my knowledge, cannot do this.
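What such a simulation might look like, as a rough sketch only. It assumes each interfering substance occurs independently with some prevalence and adds a fixed bias shift when present. The function names, prevalences, and shifts are all made up for illustration and do not come from either paper.

```python
import random


def simulate_errors(n, bias, sd, interferents, seed=0):
    """Simulate n patient-sample errors (test minus reference).

    interferents is a list of (prevalence, bias_shift) pairs. Each substance
    is drawn independently, so one sample can carry several at once.
    """
    rng = random.Random(seed)
    errors = []
    for _ in range(n):
        error = bias + rng.gauss(0.0, sd)
        for prevalence, shift in interferents:
            if rng.random() < prevalence:
                error += shift
        errors.append(error)
    return errors


def abs_error_quantile(errors, q):
    """Simple empirical quantile of the absolute errors."""
    ordered = sorted(abs(e) for e in errors)
    index = min(len(ordered) - 1, int(q * len(ordered)))
    return ordered[index]


# Hypothetical population: 5% of samples carry a substance that adds +0.4,
# and 2% carry one that adds -0.3 (same units as the bias and SD).
clean = simulate_errors(10000, bias=0.1, sd=0.08, interferents=[])
mixed = simulate_errors(10000, bias=0.1, sd=0.08,
                        interferents=[(0.05, 0.4), (0.02, -0.3)])

print("95% limit, no interferents:", abs_error_quantile(clean, 0.95))
print("95% limit, with interferents:", abs_error_quantile(mixed, 0.95))
print("largest error, with interferents:", max(abs(e) for e in mixed))
```

Comparing the two 95% limits shows how much the "clean sample" figure understates the error seen across the whole simulated population.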

Another comment is that whereas HbA1c is not glucose, the subject matter is diabetes and in the glucose meter world, error grids are well known as a way to evaluate required clinical performance. But the term “error grid” does not appear in either paper.

Error grids account for the entire range of the assay. It seems that Sigma Metrics are chosen to apply at only one point in the assay.


Blog Review

May 26, 2017

I started this blog 13 years ago in March 2004 – the first two articles are about six sigma, here and here. The blog entry being posted now is my 344th blog entry.

Although the blog has an eclectic range of topics, one unifying theme for many entries is specifications, how to set them and how to evaluate them.

A few years ago, I was working on a hematology analyzer, which has a multitude of reported parameters. The company was evaluating parameters with the usual means of precision studies and accuracy using regression. I asked them:

  a) what are the limits that, when differences from reference are contained within these limits, will ensure that no wrong medical decisions would be made based on the reported result (resulting in patient harm) and
  b) what are the (wider) limits that, when differences from reference are contained within these limits, will ensure that no wrong medical decisions would be made based on the reported result (resulting in severe patient harm)

This was a way of asking for an error grid for each parameter. I believe, then and now, that constructing an error grid is the best way to set specifications for any assay.
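The two questions translate into a very simple classification rule. The sketch below assumes one pair of symmetric limits that is constant across the range, which is a simplification: a real error grid has limits that vary with concentration and need not be symmetric. The zone names and the numeric limits are hypothetical.

```python
def classify(test, reference, inner_limit, outer_limit):
    """Assign a result to a zone using two limits on |test - reference|.

    inner_limit answers question (a); outer_limit answers question (b).
    Both are hypothetical placeholders for limits a clinician would supply.
    """
    if inner_limit > outer_limit:
        raise ValueError("inner_limit must not exceed outer_limit")
    difference = abs(test - reference)
    if difference <= inner_limit:
        return "A"  # no wrong medical decision expected
    if difference <= outer_limit:
        return "B"  # patient harm possible, but not severe harm
    return "C"      # severe patient harm possible


# Hypothetical parameter with limits of 0.5 and 1.5 units.
for test_value in (10.2, 11.0, 12.0):
    print(test_value, classify(test_value, reference=10.0,
                               inner_limit=0.5, outer_limit=1.5))
```

Every result lands in some zone, so 100% of the results are accounted for.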

As an example of the importance of specifications, there was a case for which I was an expert witness whereby the lab had produced an incorrect result that led to patient harm. The lab’s defense was that they had followed all procedures. Thus, as long as they followed procedures, they were not to blame. But procedures, which contain specifications, are not always adequate. As an example, remember the CMS program “equivalent quality control”?

Antwerp talk about total error

March 12, 2017

Looking at my blog stats, I see that a lot of people are reading the total analytical error vs. total error post. So, below are the slides from a talk that I gave at a conference in Antwerp in 2016 called The “total” in total error. The slides have been updated. Because it is a talk, the slides are not as effective as the talk.




Help with sigma metric analysis

January 27, 2017


I’ve been interested in glucose meter specifications and evaluations. There are three glucose meter specifications sources:

FDA glucose meter guidance
ISO 15197:2013
glucose meter error grids

There are various ways to evaluate glucose meter performance. What I wished to look at was the combination of sigma metric analysis and the error grid. I found this article about the sigma metric analysis and glucose meters.

After looking at this, I understand how to construct these so-called method decision charts (MEDX). But here’s my problem. In these charts, the total allowable error TEa is a constant – this is not the case for TEa for error grids. The TEa changes with the glucose concentration. Moreover, it is not even the same at a specific glucose concentration because the “A” zone limits of an error grid (I’m using the Parkes error grid) are not symmetrical.

I have simulated data with a fixed bias and constant CV throughout the glucose meter range. But with a changing TEa, the estimated sigma also changes with glucose concentration.
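To make the effect concrete, here is a small sketch. It is not the simulation described above. It assumes the fixed bias is in mg/dL and uses the usual sigma metric, (TEa − |bias|) / CV, with everything in percent. Because error grid limits are not reproduced here, the TEa function is a symmetric stand-in borrowed from the ISO 15197:2013 accuracy limits (±15 mg/dL below 100 mg/dL, ±15% at or above), not the Parkes "A" zone. The bias and CV values are made up.

```python
def tea_percent(glucose_mg_dl):
    """Stand-in TEa in percent: 15 mg/dL below 100 mg/dL, 15% at or above."""
    if glucose_mg_dl < 100:
        return 15.0 / glucose_mg_dl * 100.0
    return 15.0


def sigma_metric(glucose_mg_dl, bias_mg_dl, cv_percent):
    """Sigma = (TEa% - |bias%|) / CV%, evaluated at one concentration."""
    bias_percent = abs(bias_mg_dl) / glucose_mg_dl * 100.0
    return (tea_percent(glucose_mg_dl) - bias_percent) / cv_percent


# Hypothetical meter: fixed bias of 5 mg/dL and a constant CV of 4%.
for glucose in (50, 100, 200, 400):
    print(glucose, round(sigma_metric(glucose, bias_mg_dl=5.0, cv_percent=4.0), 2))
```

Even with this simple symmetric TEa, the same meter gets a different sigma at each concentration, so a single point on a method decision chart cannot describe it. Asymmetric limits would additionally require separate upper and lower calculations.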

So I’m not sure how to proceed.

Letter to be published

November 15, 2016


Recently, I alerted readers to the fact that the updated FDA POCT glucose meter standard no longer specifies 100% of the results.

So I submitted a letter to the editor to the Journal of Diabetes Science and Technology.

This letter has been accepted. It seemed to take a long time for the editors to decide about my letter. I can think of several possible reasons:

  1. I was just impatient – the time to reach a decision was average.
  2. The editors were exceptionally busy due to their annual conference which just took place.
  3. By waiting until the conference, the editors could ask the FDA if they wanted to respond to my letter.

I’m hoping that #3 is the reason so I can understand why the FDA changed things.

Glucose Error Grids – Well Known?

September 24, 2016


The picture shows a possible stranded sea creature at low tide, taken from 3,500 feet.

I was talking to a colleague about a project I’m working on and in order to explain, I asked him if he was familiar with glucose error grids. He said no, which surprised me. My colleague has been developing immunoassay reagents for a long time and while development and not evaluation is his specialty, as part of development, one must prove precision and accuracy.

I took this to mean that the concept of error grids is not that well known outside of diabetes. This is unfortunate, since error grids make more sense to me than total error, measurement uncertainty, or separate requirements for precision and accuracy.

MU vs TE vs EG

July 29, 2016


Picture is aerial view from a Cirrus of Foxwoods casino in CT

MU = measurement uncertainty, TE = total error, EG = error grid

Having looked at a blog entry by the Westgards, which is always interesting, here are my thoughts.

To recall, MU is a “bottom-up” way to model error in a clinical chemistry assay (TE uses a “top-down” model) and EG has no model at all.

MU is a bad idea for clinical chemistry – Here are the problems with MU:

  1. Unless things have changed, MU doesn’t allow for bias in its modeling process. If a bias is found, it must be eliminated. Yet in the real world, there are many uncorrected biases in assays (calibration bias, interferences).
  2. The modeling required by MU is not practical for a typical clinical chemistry lab. One can view the modeling as having two major components: the biological equations that govern the assay (e.g., Michaelis-Menten kinetics) and the instrumentation (e.g., the properties of the syringe that picks up the sample). Whereas clinical chemists may know the biological equations, they won’t have access to the manufacturer’s instrumentation data.
  3. The math required to perform the analysis is extremely complicated.
  4. Some of the errors that occur cannot be modeled (e.g., user errors, manufacturing mistakes, software errors).
  5. The MU result is typically reported as the location of 95% of the results. But one needs to account for 100% of the results.
  6. So some people get the SD for a bunch of controls and call this MU – a joke.

TE has been much more useful than MU, but still has problems:

  1. The Westgard model for TE doesn’t account for some important errors, such as patient interferences.
  2. Other errors that occur (e.g., user errors, manufacturing mistakes, software errors) may be captured by TE but the potential for these errors is often excluded from experiments (e.g., users in these experiments are often more highly trained than typical users).
  3. Although both MU and TE rely on experimental data, TE relies solely on an experiment (method comparison or quality control). There are likely to be biases in the experiment which will cause TE to be underestimated. (See #2).
  4. The TE result is typically reported as the location of 95% of the results. But one needs to account for 100% of the results.
  5. TE is often overstated; e.g., the sigma value is said to provide a specific (numeric) quality for patient results. But this is untrue since TE underestimates the true total error.
  6. TE fails to account for the importance of bias. That is, one can have results that are within TE goals but can still cause harm due to bias. Klee has shown this, as have I. For example, bias for a glucose meter can cause diabetic complications but still be within TE goals.

I favor error grids.


  1. Error grids still have the problem that they rely on experimental data and hence there may be bias in the studies.
  2. But 100% of the results are accounted for.
  3. There is the notion of increasing patient harm in EG. With either MU or TE, there is only the concept of harm vs no harm. This is not the real world. A glucose meter result of 95 mg/dL (truth=160 mg/dL) causes much less harm than a glucose meter result of 350 mg/dL (truth=45 mg/dL).
  4. EG simply plots test vs. reference. There are no models (but there is no way to tell the origin of the error source).