Blog Review

May 26, 2017

I started this blog 13 years ago in March 2004 – the first two articles are about six sigma, here and here. The blog entry being posted now is my 344th blog entry.

Although the blog has an eclectic range of topics, one unifying theme for many entries is specifications, how to set them and how to evaluate them.

A few years ago, I was working on a hematology analyzer, which has a multitude of reported parameters. The company was evaluating parameters with the usual means of precision studies and accuracy using regression. I asked them:

  a) what are the limits that, when differences from reference are contained within them, will ensure that no wrong medical decision resulting in patient harm would be made based on the reported result, and
  b) what are the (wider) limits that, when differences from reference are contained within them, will ensure that no wrong medical decision resulting in severe patient harm would be made based on the reported result?

This was a way of asking for an error grid for each parameter. I believed then, and still believe, that constructing an error grid is the best way to set specifications for any assay.
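As a minimal sketch of what the answers to those two questions would give you, the snippet below sorts a result into one of three zones by its difference from reference. The function name, the zone labels, and the limits of 10 and 25 are hypothetical placeholders, not values for any real hematology parameter; a real error grid comes from clinician input, and its limits usually change with concentration rather than staying constant as assumed here.

```python
def classify_zone(result, reference, inner_limit, outer_limit):
    """Classify one result by its absolute difference from reference.

    Assumes constant-width limits (a simplification for illustration).
    "A": within the inner limits, no harmful wrong decision expected
    "B": outside the inner but within the wider limits, harm possible
         but not severe harm
    "C": outside the wider limits, severe harm possible
    """
    if not 0 <= inner_limit <= outer_limit:
        raise ValueError("limits must satisfy 0 <= inner_limit <= outer_limit")
    diff = abs(result - reference)
    if diff <= inner_limit:
        return "A"
    if diff <= outer_limit:
        return "B"
    return "C"


# Hypothetical limits of 10 and 25 units around a reference value of 100.
for value in (105, 80, 140):
    print(value, classify_zone(value, 100, inner_limit=10, outer_limit=25))
```

Counting how many results from an evaluation fall in each zone is then the evaluation against the specification.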

As an example of the importance of specifications, I was an expert witness in a case in which a lab had produced an incorrect result that led to patient harm. The lab’s defense was that they had followed all procedures; thus, as long as they followed procedures, they were not to blame. But procedures, which contain specifications, are not always adequate. As an example, remember the CMS program “equivalent quality control”?


Biases in clinical trials performed for regulatory approval

May 31, 2015


An article with the title of this post has been accepted for publication in the journal Accreditation and Quality Assurance. The article describes common biases and how they might be avoided.

Hemoglobin A1c quality targets

March 16, 2015


There is a new article in Clinical Chemistry presenting a complicated (to me) analysis of quality targets for A1c, when it would seem that a simple error grid, prepared by surveying clinicians, would fit the bill.

Thus, this paper has problems. They are:

  1. The total error model is limited to average bias and imprecision. Error from interferences, user error, or other sources is not included. It is unfortunate to call this “total” error, since there is nothing total about it.
  2. A pass/fail system is mentioned, which is dichotomous, unlike an error grid, which allows for varying degrees of error with respect to severity of harm to patients.
  3. A hierarchy of possible goals is mentioned. This comes from a 1999 conference. But there is really only one way to set patient goals (listed near the top of the 1999 conference hierarchy): namely, a survey of clinician opinions.
  4. Discussed in the Clinical Chemistry paper is the use of biological variation based goals for quality targets. Someone needs to explain to me how this could ever be useful.
  5. The analysis is based on proficiency survey materials, which lack patient interferences (see #1), so it captures only a subset of total error.
  6. From what I could tell from their NICE reference (#11 in the paper), the authors have inferred that total allowable error should be 0.46%, but this did not come from surveying clinicians.
  7. I’m on-board with six sigma in its original use at Motorola. But I don’t see its usefulness in laboratory medicine compared to an error grid.

More glucose fiction

December 1, 2014


In the latest issue of Clinical Chemistry, there are two articles (1-2) about how much glucose meter error is ok and an editorial (3) which discusses these papers. Once again, my work on this topic has been ignored (4-12). Ok, to be fair, not all of my articles are directly relevant, but the gist of my articles, and particularly reference #10, is that if you use the wrong model, the outcome of a simulation is not relevant to the real world.

How are the authors’ models wrong?

In paper #1, the authors state: “The measurement error was assumed to be uncorrelated and normally distributed with zero mean…”

In paper #2, the authors state: “We ignored other analytical errors (such as nonlinear bias and drift) and user errors in this model.”

In both papers, the objective is to state a maximum glucose error that will be medically ok. But since the modeling omits errors that occur in the real world, the results and conclusions are unwarranted.

Ok, here’s a thought, people: instead of simulations based on the wrong model, why not construct simulations based on actual glucose evaluations? An example of such a study is: Brazg RL, Klaff LJ, Parkin CG. Performance variability of seven commonly used self-monitoring of blood glucose systems: clinical considerations for patients and providers. J Diabetes Sci Technol. 2013;7:144-152. Given sufficient method comparison data, one could construct an empirical distribution of differences and randomly sample from it.
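A minimal sketch of that last idea follows: rather than drawing errors from a zero-mean normal distribution, simulated meter readings are produced by resampling (with replacement) observed meter-minus-reference differences. The function name and the ten differences are made up purely for illustration and are not taken from the Brazg study or from any real evaluation. The sketch also assumes the differences do not depend on glucose level, which real data may well contradict; in that case one would resample within glucose ranges.

```python
import random


def sample_meter_readings(true_glucose, observed_differences, n, seed=None):
    """Simulate n meter readings for one true glucose value.

    Each simulated reading is the true value plus a difference drawn at
    random, with replacement, from the observed (meter - reference)
    differences, so outliers appear as often as they did in the data.
    """
    if not observed_differences:
        raise ValueError("need at least one observed difference")
    rng = random.Random(seed)
    return [true_glucose + rng.choice(observed_differences) for _ in range(n)]


# Hypothetical differences in mg/dL (placeholder numbers, not real data).
# Note the single large outlier, which a normal-error model would
# essentially never generate.
differences = [-4, 2, 7, -11, 3, 0, -6, 58, 5, -2]

readings = sample_meter_readings(120, differences, n=1000, seed=1)
large_errors = sum(1 for r in readings if abs(r - 120) > 30)
print(f"{large_errors} of {len(readings)} simulated readings are off by more than 30 mg/dL")
```

The simulated readings can then be fed into the same insulin dosing models the papers use, with the difference that the error distribution now contains whatever actually happened in the evaluation.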

And finally, I’m sick of seeing the Box quote (reference 3): “Essentially, all models are wrong, but some are useful.” Give it a rest – it doesn’t apply here.


  1. Malgorzata E. Wilinska and Roman Hovorka: Glucose Control in the Intensive Care Unit by Use of Continuous Glucose Monitoring: What Level of Measurement Error Is Acceptable? Clinical Chemistry, 2014;60:1500-1509.
  2. Tom Van Herpe, Bart De Moor, Greet Van den Berghe, and Dieter Mesotten: Modeling of Effect of Glucose Sensor Errors on Insulin Dosage and Glucose Bolus Computed by LOGIC-Insulin. Clinical Chemistry, 2014;60:1510-1518.
  3. James C. Boyd and David E. Bruns: Performance Requirements for Glucose Assays in Intensive Care Units. Clinical Chemistry, 2014;60:1463-1465.
  4. Jan S. Krouwer: Wrong thinking about glucose standards. Clin Chem, 2010;56:874-875.
  5. Jan S. Krouwer and George S. Cembrowski: A review of standards and statistics used to describe blood glucose monitor performance. Journal of Diabetes Science and Technology, 2010;4:75-83.
  6. Jan S. Krouwer: Analysis of the Performance of the OneTouch SelectSimple Blood Glucose Monitoring System: Why Ease of Use Studies Need to Be Part of Accuracy Studies. Journal of Diabetes Science and Technology, 2011;5:610-611.
  7. Jan S. Krouwer: Evaluation of the Analytical Performance of the Coulometry-Based Optium Omega Blood Glucose Meter: What Do Such Evaluations Show? Journal of Diabetes Science and Technology, 2011;5:618-620.
  8. Jan S. Krouwer: Why specifications for allowable glucose meter errors should include 100% of the data. Clinical Chemistry and Laboratory Medicine, 2013;51:1543-1544.
  9. Jan S. Krouwer: The new glucose standard, POCT12-A3 misses the mark. Journal of Diabetes Science and Technology, 2013;7:1400-1402.
  10. Jan S. Krouwer: The danger of using total error models to compare glucose meter performance. Journal of Diabetes Science and Technology, 2014;8:419-421.
  11. Jan S. Krouwer and George S. Cembrowski: Acute Versus Chronic Injury in Error Grids. Journal of Diabetes Science and Technology, 2014;8:1057.
  12. Jan S. Krouwer and George S. Cembrowski: The chronic injury glucose error grid. A tool to reduce diabetes complications. Journal of Diabetes Science and Technology, in press (available online).

QC (Quality Control) is not quality

May 14, 2013


Based on recent events, I’m restating that for a clinical assay, good quality control results do not imply good quality. Of course, good quality control results are a good thing and poor quality control results mean that there are problems, but here are some examples where good quality control results don’t mean good quality.

  1. QC samples do not inform about patient sample interferences, which can cause large errors and result in patient harm. Such events could occur with perfect QC results.
  2. QC informs about biases that persist across time. For example, if QC is performed twice per day, a bad calibration (where calibration lasts for a month) will likely be detected. But short-term biases that come and go between QC runs will likely be missed.

So if anyone claims that you can select your lab’s quality by running QC according to some scheme, it’s simply not true.

The basis of a spec

May 16, 2012

I proposed and was the chairholder of EP27, the CLSI standard about error grids. A while ago, during the document development committee discussions, I suggested that an error limit specification contain two items – the level of error (e.g., ± 10%) and the percentage of results that would meet the limit (e.g., 95%). A committee member strongly objected and said no – the spec should be the level of error only. So I said, would it be acceptable for the 10% spec if 20% of results met the spec? He said, of course not. So I said, what about 60%? He again said no and commented – I see where you’re going. Yes, the percentage of results is used to determine acceptability, but it is not part of the spec. So I said, a spec is a set of criteria, and an evaluation is conducted to determine if those criteria have been met, but this line of reasoning didn’t convince him. There might have been more to this story, but then I was unexpectedly and rather unceremoniously thrown off the document development committee.
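The point of the argument can be put in a few lines of code. In the sketch below, which is only an illustration and not anything taken from EP27, the evaluation cannot return a pass or fail answer unless it is given both numbers: the error limit and the required percentage of results within it. The function name and the data are hypothetical.

```python
def meets_spec(results, references, limit_pct, required_pct):
    """Evaluate paired results against a two-part specification.

    limit_pct:    allowed difference from reference, in percent (e.g. 10)
    required_pct: percentage of results that must be within the limit (e.g. 95)
    Returns (passed, observed_pct).
    """
    if not results or len(results) != len(references):
        raise ValueError("results and references must be non-empty and paired")
    within = sum(
        1
        for result, ref in zip(results, references)
        # Same as |result - ref| / |ref| <= limit_pct / 100, without dividing.
        if abs(result - ref) * 100 <= limit_pct * abs(ref)
    )
    observed_pct = 100 * within / len(results)
    return observed_pct >= required_pct, observed_pct


# Hypothetical data: three of four results are within 10% of reference.
results = [100, 105, 95, 130]
references = [100, 100, 100, 100]

print(meets_spec(results, references, limit_pct=10, required_pct=95))  # (False, 75.0)
print(meets_spec(results, references, limit_pct=10, required_pct=60))  # (True, 75.0)
```

The same data pass or fail depending on the second number, which is why leaving it out of the spec leaves the spec incomplete.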

Patients – the missing voice

May 24, 2011

An important question about an assay, whether its performance is good enough, is usually answered by a standards group. An example is the ISO 15197 standard for glucose meters.

The usual participants in standards groups are manufacturers, clinicians, regulators, and laboratorians. Among these, manufacturers tend to dominate. This was true for ISO 15197.

But one voice is often missing – that of patients. This is particularly important for glucose, where patients act as clinicians and laboratorians.

The FDA meeting last year did have a patient advocate and patients have commented on the FDA meeting, here and here.