The CLSI document EP19

February 6, 2015


I recently had occasion to see a final draft of CLSI EP19, which is a framework for using CLSI evaluation documents. I may review it when it is officially released, but here are three comments.

  1. EP19 contains a cause-and-effect diagram listing assay attributes (precision, interferences, and so on) and the CLSI documents used to evaluate those attributes. I published a diagram of the attributes in 1992 (1) and later adapted it to include the associated CLSI documents; that version appeared in a 2002 publication (2). In 2005, I proposed to CLSI that the diagram appear in all CLSI evaluation standards – it is in EP10, EP18, and EP21, although not in more recent documents. Now I know that CLSI is a consensus organization whose documents are collaborative efforts, and my diagram has been modified further, but there should be a citation to my prior work, and there isn't.
  2. In the clinical performance section, there is no mention of error grids (EP27). In fact, a search of EP19 shows that EP27 is never mentioned, which is most strange. After all, error grids are used to determine whether an assay is good enough – which is the whole point of an evaluation! Error grids are part of the FDA's recommended CLIA waiver guidance and are fundamental in glucose meter evaluations. I don't understand how, in years of developing EP19, EP27 received zero mention. I did check the list of CLSI publications on their website to make sure that EP27 is still for sale.
  3. There is mention that assay claims should be clear – that's it! No more details are given. Sadly, an entire document about uniformity of claims (EP11) was killed by CLSI management after one manufacturer threatened to quit.


  1. Krouwer JS. Estimating Total Analytical Error and Its Sources: Techniques to Improve Method Evaluation. Arch Pathol Lab Med, 1992;116:726-731.
  2. Krouwer JS. Setting Performance Goals and Evaluating Total Analytical Error for Diagnostic Assays. Clin Chem, 2002;48:919-927.

It’s up to the lab director – not really

February 4, 2015


I have previously commented that many CLSI evaluation standards at some point ask the question “is the assay performance good enough” and answer that question with “it’s up to the lab director.”

The problem is that lab directors are not clinicians and do not treat patients. Most lab directors are either PhD clinical chemists or pathologists, and although pathologists are MDs, they are not clinicians in this sense because they do not treat patients.

Of course, lab directors have a great deal of knowledge about assay performance. But in my experience – especially in working on CLSI standards – lab directors tend to focus on analytical errors, whereas only total error matters to clinicians, and the sources that contribute to total error are a combination of analytical, pre-analytical, and post-analytical error.

So how should the “is it good enough” question be answered? An example appeared recently in the literature (1) where clinicians were surveyed as to what size glucose meter errors would start to cause problems for diabetics under several scenarios. The results provided limits for a glucose meter error grid. Note that there was no attempt to identify error limit sources – the limits simply reflect the observed error, regardless of its source.
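The survey-derived limits in an error grid can be turned directly into a classification rule for meter readings. The sketch below is purely illustrative: the zone cut-offs are invented placeholders, not the published surveillance error grid boundaries, which vary with glucose concentration.

```python
def risk_zone(true_glucose, measured_glucose):
    """Classify a meter reading's clinical risk from its percent error.
    The cut-offs here are invented for illustration; a real error grid
    derives them from clinician surveys and they depend on the
    glucose concentration, not just the percent error."""
    pct_error = abs(measured_glucose - true_glucose) / true_glucose * 100
    if pct_error <= 15:
        return "A"   # no effect on clinical action
    if pct_error <= 40:
        return "B"   # altered action, little or no effect on outcome
    return "C+"      # risk grows with the size of the error

print(risk_zone(100, 110))  # small error
print(risk_zone(50, 200))   # gross error
```

Note that the classification needs no knowledge of where the error came from – analytical, pre-analytical, or user error – which is exactly the point made above.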


  1. Klonoff DC, Lias C, Vigersky R, et al The surveillance error grid. J Diabetes Sci Technol. 2014;8:658-672.

Reply to Letter Published

January 30, 2015


[The photo is Martha’s Vineyard snow removal operations taken during a flight two days after two feet of snow fell in the Boston area.]

I had mentioned that a Letter to the editor of Clinical Chemistry had been accepted. It is now online, with a reply to my Letter (subscription required).

I had previously mentioned in this blog that the editor of Clinical Chemistry is not fond of letters and replies, so any thought of replying to the reply would be a lost cause – not that I would anyway. The authors who replied were kind in their comments, and I have only one comment, which I make at the end of this entry.

One cynical comment about these glucose meter models that relate precision and bias to total error: you can make beautiful contour graphs because there are three variables. Add interferences, and there are no more simple contour graphs.

But what does it take to add interferences to a glucose meter simulation model? First, one needs to list all candidate interfering substances and test them. Manufacturers have already done this, but unfortunately, don't try to use the information in the package insert. You can thank CLSI EP7 for this: it allows a manufacturer to say that compound XYZ does not interfere whenever the observed interference is less than the goal – say, 10%. So there could be a bunch of compounds that interfere, but at levels below 10%. This means that unless one can access the original manufacturer's data, one would have to redo all of the interference studies. Then one needs the patient distribution of the concentration of each interfering substance. With this information, one can randomly select a concentration of each interfering substance and apply the appropriate equation to generate a bias.
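The steps above can be sketched as a simulation. Everything in this sketch is an assumption for illustration: the interferent names, the concentration distributions, the linear bias model, and the bias coefficients are invented placeholders, not data from any package insert.

```python
import random

# Hypothetical interferents: mean and SD of the patient concentration
# distribution (mg/dL) and a bias coefficient (% glucose bias per mg/dL
# of interferent). All numbers are invented for illustration.
INTERFERENTS = {
    "compound_A": {"mean": 2.0, "sd": 0.5, "pct_bias_per_unit": 1.5},
    "compound_B": {"mean": 0.8, "sd": 0.3, "pct_bias_per_unit": -2.0},
}

def simulated_interference_bias(true_glucose, rng=random):
    """Randomly draw a concentration for each interferent and sum the
    resulting glucose biases (a linear bias model is assumed)."""
    total_bias = 0.0
    for props in INTERFERENTS.values():
        conc = max(0.0, rng.gauss(props["mean"], props["sd"]))
        total_bias += true_glucose * props["pct_bias_per_unit"] / 100.0 * conc
    return total_bias

random.seed(1)
biases = [simulated_interference_bias(100.0) for _ in range(10000)]
print(min(biases), max(biases))
```

In a real study, each entry in the table would require its own interference experiment plus a patient concentration distribution – which is why, as noted below, such simulations can require a significant amount of work.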

Thus, simulations, while still models and subject to the possibility of being incorrect, can require a significant amount of work.

My comment to the authors who replied to my Letter deals with their statement: "This is exactly the reason we advised in our work to adopt accuracy requirements more stringent than those resulting from simulations." A similar statement was made by Boyd and Bruns back when I similarly critiqued their model. Now for sure, if the required bias is reduced and interferences are small, this will work, because the total error will meet goals. The problem is, one has no knowledge of the bias contributed by interferences. And perhaps more importantly, this strategy will not work to prevent errors in the D zone of an error grid. I mentioned in my last post that with a bias of zero and a CV of 5%, one could get a D zone error only if the observation were 60 standard deviations away. This will not happen anytime soon, but a gross interference is possible.

Published and Bad Model

January 2, 2015


I complained about two glucose modeling papers and an accompanying editorial in the December issue of Clinical Chemistry. My Letter to the editor about one of the papers has been accepted by Clinical Chemistry.

Although not in the Letter, here's another example of why modeling glucose meter error as average bias plus multiples of the standard deviation (e.g., sampling from a Gaussian distribution) can be misleading. Say truth is 50 mg/dL, which is hypoglycemic, and the meter reads 200 mg/dL, which is hyperglycemic. This would be a serious error, because the provider (or the patient, if self-monitoring) would administer insulin when in fact sugar is needed.

But in terms of modeling, say the bias is zero and the glucose CV is 5%. This means the SD at 50 mg/dL is 2.5 mg/dL. To get a value of 200 due to imprecision alone requires (200 - 50) / 2.5 = 60 standard deviations! Using a spreadsheet, I can't even get this probability; already at 30 standard deviations, the probability has roughly 200 zeros to the right of the decimal point before the first nonzero digit. In other words, not going to happen.
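As a check on the spreadsheet arithmetic, the Gaussian tail probability can be computed directly; a minimal sketch (the only assumption is the standard normal model itself):

```python
import math

def normal_tail(z):
    """P(Z > z) for a standard normal variable, computed via the
    complementary error function to avoid underflow in 1 - CDF."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

# Probability of an observation 30 standard deviations above the mean:
p30 = normal_tail(30)
print(p30)  # on the order of 1e-198 -- effectively impossible
```

The `0.5 * erfc(z / sqrt(2))` form matters here: computing `1 - CDF(z)` directly would round to zero long before z = 30.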

But such errors do occur – albeit rarely – and much more frequently than a 60 standard deviation Gaussian model would predict.

A comment about terms used in EP5-A3 and bias

December 11, 2014



I have the new version of EP5-A3, which is CLSI's document about precision. Having been kicked out of CLSI, I was loath to buy it, but if one is consulting on assay evaluation, it's required.

As I read through the document, one note on terminology (this was the case in the A2 version as well): the term "total precision" has been dropped and replaced with either "within-laboratory precision" or "within-device precision."

All three terms have issues, and the replacement does not solve them. The problem is that whichever term one uses does not account for all sources of error, although the terms imply that it does. In an experiment such as EP5, the goal is to randomly sample sources of imprecision from the population of interest. Take reagents, for example. The study may use one reagent lot or, in many industry cases, three or more lots. But these lots are not a random sample from the population of reagent lots – that is of course impossible, because for a new assay only a few lots have been made and future lots don't exist. Are future lots the same? That's hard to say, as raw materials change, vendors and manufacturing procedures change, QC procedures for approving lots change, personnel change, and so on.

The same can be said for the 20 days. Say the assay's projected life is 10 years. One cannot randomly select 20 days from all future 20-day sequences in those 10 years – one is stuck with the current 20 days.

Formally, these are forms of bias and thus the EP5 protocol is biased. This is not some bad, deliberate bias – it is unavoidable bias, but bias nevertheless.

So in reality, the EP5 experiment estimates precision based only on the error sources that are allowed into the experiment. Whatever term is used – "total precision," "within-laboratory precision," or "within-device precision" – precision has likely been underestimated.
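The underestimation can be made concrete with a simple variance-components calculation. The component values below are invented for illustration; the point is only the arithmetic: any component not sampled by the experiment is missing from the reported SD.

```python
import math

# Hypothetical precision components (SDs in mg/dL); values are invented.
sd_repeatability = 2.0   # within-run imprecision
sd_between_day   = 1.5   # day-to-day (sampled by a 20-day EP5-style design)
sd_between_lot   = 2.5   # reagent lot-to-lot (NOT sampled with one lot)

# "Within-laboratory" SD as a single-lot study sees it -- components
# combine as the square root of the sum of their variances:
sd_within_lab = math.sqrt(sd_repeatability**2 + sd_between_day**2)

# SD that results would actually show across reagent lots:
sd_total = math.sqrt(sd_repeatability**2 + sd_between_day**2
                     + sd_between_lot**2)

print(round(sd_within_lab, 2), round(sd_total, 2))
```

With these made-up numbers, the study reports 2.5 mg/dL while the SD across lots is about 3.54 mg/dL – a roughly 40% understatement, invisible from within the experiment.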

More glucose fiction

December 1, 2014


In the latest issue of Clinical Chemistry, there are two articles (1-2) about how much glucose meter error is OK and an editorial (3) which discusses these papers. Once again, my work on this topic has been ignored (4-12). OK, to be fair, not all of my articles are directly relevant, but the gist of them – particularly reference 10 – is that if you use the wrong model, the outcome of a simulation is not relevant to the real world.

How are the authors’ models wrong?

In paper #1, the authors state: "The measurement error was assumed to be uncorrelated and normally distributed with zero mean…"

In paper #2, the authors state: "We ignored other analytical errors (such as nonlinear bias and drift) and user errors in this model."

In both papers, the objective is to state a maximum glucose error that will be medically ok. But since the modeling omits errors that occur in the real world, the results and conclusions are unwarranted.

Ok, here’s a thought people – instead of simulations based on the wrong model, why not construct simulations based on actual glucose evaluations. An example of such study is: Brazg RL, Klaff LJ, Parkin CG. Performance variability of seven commonly used self-monitoring of blood glucose systems: clinical considerations for patients and providers. J Diabetes Sci Technol. 2013;7:144-152. Given sufficient method comparison data, one could construct an empirical distribution of differences and randomly sample from it.

And finally, I’m sick of seeing the Box quote (reference 3): “Essentially, all models are wrong, but some are useful.” Give it a rest – it doesn’t apply here.


  1. Malgorzata E. Wilinska and Roman Hovorka: Glucose Control in the Intensive Care Unit by Use of Continuous Glucose Monitoring: What Level of Measurement Error Is Acceptable? Clinical Chemistry, 2014;60:1500-1509.
  2. Tom Van Herpe, Bart De Moor, Greet Van den Berghe, and Dieter Mesotten: Modeling of Effect of Glucose Sensor Errors on Insulin Dosage and Glucose Bolus Computed by LOGIC-Insulin. Clinical Chemistry, 2014;60:1510-1518.
  3. James C. Boyd and David E. Bruns: Performance Requirements for Glucose Assays in Intensive Care Units. Clinical Chemistry, 2014;60:1463-1465.
  4. Jan S. Krouwer: Wrong thinking about glucose standards. Clin Chem, 2010;56:874-875.
  5. Jan S. Krouwer and George S. Cembrowski A review of standards and statistics used to describe blood glucose monitor performance. Journal of Diabetes Science and Technology, 2010;4:75-83.
  6. Jan S. Krouwer: Analysis of the Performance of the OneTouch SelectSimple Blood Glucose Monitoring System: Why Ease of Use Studies Need to Be Part of Accuracy Studies. Journal of Diabetes Science and Technology, 2011;5:610-611.
  7. Jan S. Krouwer: Evaluation of the Analytical Performance of the Coulometry-Based Optium Omega Blood Glucose Meter: What Do Such Evaluations Show? Journal of Diabetes Science and Technology, 2011;5:618-620.
  8. Jan S. Krouwer: Why specifications for allowable glucose meter errors should include 100% of the data. Clinical Chemistry and Laboratory Medicine, 2013;51:1543-1544.
  9. Jan S. Krouwer: The new glucose standard, POCT12-A3 misses the mark. Journal of Diabetes Science and Technology, 2013;7:1400-1402.
  10. Jan S. Krouwer: The danger of using total error models to compare glucose meter performance. Journal of Diabetes Science and Technology, 2014;8:419-421.
  11. Jan S. Krouwer and George S. Cembrowski: Acute Versus Chronic Injury in Error Grids. Journal of Diabetes Science and Technology, 2014;8:1057.
  12. Jan S. Krouwer and George S. Cembrowski. The chronic injury glucose error grid. A tool to reduce diabetes complications. Journal of Diabetes Science and Technology, in press (available online)

How the journal Clinical Chemistry has become elitist

November 21, 2014


At a recent AACC dinner meeting, I heard an interesting talk by Nader Rifai, the editor of Clinical Chemistry. About halfway through his talk, I remembered an event that took place a couple of years ago, so I asked him a question after his talk ended. My question and Nader’s responses went something like this:

Me: “A while ago, I read a commentary article that I didn’t agree with and submitted a Letter to the editor about it. The response from the journal was…”
Rifai: “It wouldn’t be reviewed because it wasn’t about an original article, right?”
Me: “Yes, that’s right, then I looked at a few issues and saw that the percentage of original articles is only about 50% of the journal. This means that one can’t comment about a large portion of the journal.”
Rifai: “Well, we were seeing Letters to the editor about other Letters to the editor and with commentary articles it is common that many people won’t have the same opinion as the author, so we don’t want to fill up the journal with such stuff.”

This is sort of what I remembered, not verbatim but that is the gist of it.

So basically, Rifai is putting Letters to the editor into a generic category similar to junk mail or the endless comments on Twitter or a blog, and at the same time giving authors – other than those who write original articles – immunity from any kind of comment.

But the problem is that commentary articles in Clinical Chemistry are about science, and if the authors get the science wrong, it is a mistake to prevent people from pointing that out. That is unscientific and elitist. Perhaps contributing to this elitism, Rifai mentioned that articles in Clinical Chemistry are of high quality due to the extensive review process. But an extensive review process doesn't guarantee correctness.

And Clinical Chemistry has changed its policy. I commented briefly on this topic before in this blog. My 2010 Letter to the editor about a "Question and Answer" type article was published, and I think that Letter had a role in shaping glucose meter standards – but these days, it would not even have been considered.

So now I have less interest in reading Clinical Chemistry.

