Comparison of company vs. standards organization specifications

April 11, 2017

For almost all of my career, I’ve been working to determine performance specifications for assays, including the protocol and data analysis methods to see if performance has been met. This work has been performed mainly for companies but occasionally also for standards groups. There are some big differences.

Within a company, the specifications are very important:

If the product is released too soon, before the required performance has been met, the product may be recalled, patients may suffer harm, and overall the company may suffer financially.

If the product is released too late, the company will definitely suffer financially as “time to market” has been shown in financial models to be a key success factor in achieving profit goals.

Company specifications are built around two main factors – what performance is competitive and how can the company be sure that no patients will be harmed. In my experience this has simply led to two goals – 95% of the differences between the company assay and reference should be within limits which guarantee a competitive assay and no differences should be large enough to cause patient harm (a clinical standard).

Standards groups seem to have a different outlook. Without being overly cynical, the standards adopted often seem designed to guarantee that no company’s assay will fail the specification. Thus, 95% of differences between the assay and reference should be within these limits. There is almost never any mention of larger errors which may cause patient harm.

Thus, it is somewhat ironic that company specifications are usually more difficult to achieve than specifications published by the standards organizations.
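The difference between the two kinds of specification described above can be shown in a short sketch. Everything here is an assumption made for illustration: the function names, the limits, and the data are hypothetical and do not come from any company or published standard.

```python
def fraction_within(differences, limit):
    """Fraction of assay-minus-reference differences with |difference| <= limit."""
    if not differences:
        raise ValueError("at least one difference is required")
    return sum(abs(d) <= limit for d in differences) / len(differences)


def meets_standards_style_spec(differences, limit, fraction=0.95):
    """Pass if the required fraction is within the limit; larger errors are ignored."""
    return fraction_within(differences, limit) >= fraction


def meets_company_style_spec(differences, competitive_limit, harm_limit, fraction=0.95):
    """Pass only if the fraction goal is met AND no difference reaches the harm limit."""
    no_harmful_error = all(abs(d) < harm_limit for d in differences)
    fraction_ok = meets_standards_style_spec(differences, competitive_limit, fraction)
    return fraction_ok and no_harmful_error


# Hypothetical data: 96 small differences, 3 moderate ones, and 1 very large one.
differences = [2.0] * 96 + [8.0] * 3 + [40.0]

print(meets_standards_style_spec(differences, limit=5.0))
print(meets_company_style_spec(differences, competitive_limit=5.0, harm_limit=30.0))
```

With this made-up data set, the 95%-only check passes while the two-goal check fails, because a single difference exceeds the (hypothetical) clinical harm limit.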

Antwerp talk about total error

March 12, 2017

Looking at my blog stats, I see that a lot of people are reading the total analytical error vs. total error post. So, below are the slides from a talk that I gave at a conference in Antwerp in 2016 called The “total” in total error. The slides have been updated. Because it is a talk, the slides are not as effective as the talk.




Letter to be published

November 15, 2016


Recently, I alerted readers to the fact that the updated FDA POCT glucose meter standard no longer specifies 100% of the results.

So I submitted a letter to the editor to the Journal of Diabetes Science and Technology.

This letter has been accepted, although it seemed to take a long time for the editors to decide. I can think of several possible reasons:

  1. I was just impatient – the time to reach a decision was average
  2. The editors were exceptionally busy due to their annual conference which just took place.
  3. By waiting until the conference, the editors could ask the FDA if they wanted to respond to my letter.

I’m hoping that #3 is the reason so I can understand why the FDA changed things.

Comparison of Parkes glucose meter error grid with CLSI POCT 12-A3 Standard

November 8, 2016


The graph below shows the Parkes error grid in blue. Each zone in the Parkes error grid shows increasing patient harm with the innermost zone A having no harm. The zones (unlabeled) start with A (innermost) and go to D or E.

The red lines are the POCT 12-A3 standard. The innermost line should contain 95% of the results. Since no more than 2% can be outside of the outermost red lines, these outermost red lines should contain 98% of the data.


The red lines correspond roughly with the A zone of the Parkes error grid – the region of no patient harm.

Of course the problem is that in the CLSI guideline, 2% of the results are allowed to occur in the higher zones of the Parkes error grid corresponding to severe patient harm.

Westgards Detection and IQCP

November 1, 2016


I received an email recently that alerted me to three seminars from the 2016 AACC meeting that are online. One is by the Westgards, so I had a look. This is quite an interesting presentation and shows the breadth of the contributions that the Westgards have made to quality in laboratory medicine.

Yet one thing caught my eye, so here are my comments. The Westgards complain that in risk management as espoused by CLSI EP23, detectability has been omitted.

What they mean is that for each failure event, EP23 wants one to estimate the severity and probability of occurrence of that failure event. The Westgards suggest that the detectability of the failure event needs to be assessed as well and state that this is how industry does it.

Well maybe some industries, but I worked in industry and our company did not use detectability (we used severity and probability of occurrence).

Now in the context of EP23, I agree with the Westgards’ use of detectability. The problem is that EP23 itself is a poor adaptation of risk management. I commented on this before but here it is again.

As an example of a failure mode of a process step, assume that the failure is sample hemolysis which occurs during the process step to turn a whole blood sample into serum. As you go across the rows in an EP23 style risk analysis, you might see that a mitigation for this failure mode is to visually check whether the sample has been hemolyzed and how effective this check is. In this case – for this row item – you could add detectability to severity and probability of occurrence.

Here are the problems with this approach, whether you have added detectability or not.

For most labs, this (example) is already established laboratory practice. That is, labs already check to see whether samples are hemolyzed. All that has been done is to document it. Not much in the way of formal risk analysis has been done although there will be some benefit to this documentation.

The problem is that the row is “collapsed.” It really has two additional process steps embedded in it. Here it is uncollapsed:

Process step – process whole blood into serum
Detection step – examine serum for the possibility of hemolysis
Recovery step – if the serum has been hemolyzed, request a new sample

One can see that it makes no sense to ask for the detectability of a detection step.

I note in passing that one of the most important detection process steps for any assay is running quality control.

Note that each of the steps above is a process step and each can fail. Whereas the severity of a failure will be the same for each of these steps, the probability of occurrence may differ. Because each step can fail, one needs to assess whether a mitigation step is required.

BTW, one should not discount failures in the recovery step. In the Challenger accident, engineers warned about the potential problem (detection) but delaying the launch failed (recovery). And of course, recovery steps are only performed if detection steps detect something.

Disclaimer – I may not have the latest version of EP23, but another problem in developing the potential failure modes (EP23 calls these hazards) is that the process is not fully delineated – it is too high level. In a more traditional FMEA, the list of process steps is long and reflects what is actually done, not some high level description.

And each process step can fail in multiple ways. EP23 is a hazard based list. A process based list is better since one can ask how else each process step can fail. Although EP23 does some of this, it’s embedded within a row and makes things too complicated. Here’s an example of a few ways the above detection step of examining the sample for hemolysis can fail:

  1. Technician simply misses seeing a hemolyzed sample (non-cognitive error – we all make them)
  2. Technician misses seeing a sample due to inadequate training (cognitive error)
  3. Technician misses seeing a sample due to distraction (phone call, or talking to colleague).
  4. Technician ignores hemolyzed sample due to pressure from management to process samples.
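A process-based list like the one described above can be sketched as a small data structure. This is only an illustration: the class names, the 1 to 5 scoring scales, the individual scores, and the severity times probability ranking are all assumptions, not something taken from EP23 or from any particular company's FMEA.

```python
from dataclasses import dataclass, field


@dataclass
class FailureMode:
    description: str
    probability: int  # hypothetical score: 1 (rare) to 5 (frequent)


@dataclass
class ProcessStep:
    kind: str  # "process", "detection", or "recovery"
    name: str
    failure_modes: list = field(default_factory=list)


# Severity is the same for every step in this chain (a hemolyzed sample is
# reported); the value 4 is a made-up score on a 1 to 5 scale.
SEVERITY = 4

steps = [
    ProcessStep("process", "process whole blood into serum", [
        FailureMode("sample is hemolyzed during processing", 2),
    ]),
    ProcessStep("detection", "examine serum for the possibility of hemolysis", [
        FailureMode("technician simply misses a hemolyzed sample", 2),
        FailureMode("technician misses it due to inadequate training", 1),
        FailureMode("technician misses it due to distraction", 3),
        FailureMode("technician ignores it due to pressure from management", 1),
    ]),
    ProcessStep("recovery", "if the serum has been hemolyzed, request a new sample", [
        FailureMode("new sample is never requested", 1),
    ]),
]


def rank_failures(steps, severity):
    """Return (score, step kind, description) tuples, highest score first."""
    rows = [
        (severity * fm.probability, step.kind, fm.description)
        for step in steps
        for fm in step.failure_modes
    ]
    return sorted(rows, key=lambda row: row[0], reverse=True)


for score, kind, description in rank_failures(steps, SEVERITY):
    print(score, kind, description)
```

Because every step is its own row, there is no need for a separate detectability score: the detection step simply appears in the list with its own ways of failing.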

On a separate note, how does IQCP help one modify QC frequency?

The problem with the FDA standard explained

October 25, 2016


The previous blog entry criticized the updated FDA POCT glucose meter performance standard, which now allows 2% of the results to be unspecified.

What follows is an explanation of why this is wrong. My logic applies to:

  1. Total error performance standards which state that 95% (or 99%) of results should be within stated limits
  2. Measurement uncertainty performance standards which state that 95% (or 99%) of results should be within stated limits
  3. The above FDA standard which states that 98% of results should be within stated limits

One argument that surfaces for allowing results to be unspecified is that one cannot prove that 100% of results are within limits. This is of course true. But here’s the problem of using that fact to allow unspecified results.

Consider a glucose meter example with truth = 30 mg/dL. Assume the glucose meter has a 5% CV (an SD of 1.5 mg/dL) and that the precision results are normally distributed. One can calculate the size of glucose meter errors at various SD multiples, note their location in a Parkes error grid, and calculate how rarely an error of that size due to precision alone would occur.

Truth (mg/dL)   SD multiple   Observed glucose (mg/dL)   Parkes grid   Occurs 1 in
30              2             33                         A zone        20
30              3             34.5                       A zone        370
30              8             42                         A zone        7E+14
30              22            63                         C zone        1E+106


(To get an error in the E zone, an extremely dangerous result, would require 90 multiples of the standard deviation, and Excel refuses to tell me how rare this is). I think it’s clear that not specifying a portion of the results is not justified by worrying about precision and / or the normal distribution.
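The arithmetic behind the table can be reproduced with a short sketch. It assumes a two-sided normal tail (an error of that size in either direction), so the figures may differ slightly from the table depending on rounding and on whether one or two tails are counted. The function names are hypothetical. For very large SD multiples the tail probability underflows to zero in ordinary floating point, so the sketch falls back on the standard asymptotic approximation of the normal tail and works in powers of ten.

```python
import math


def observed_glucose(truth, cv, sd_multiple):
    """Observed value when the error equals sd_multiple standard deviations."""
    sd = truth * cv
    return truth + sd_multiple * sd


def log10_one_in(sd_multiple):
    """log10 of N, where an error beyond +/- sd_multiple SDs occurs about 1 in N times."""
    k = sd_multiple
    p = math.erfc(k / math.sqrt(2))  # two-sided normal tail probability
    if p > 0.0:
        return -math.log10(p)
    # erfc underflowed to 0.0: use the asymptotic tail 2 * pdf(k) / k instead.
    log10_p = (
        math.log10(2)
        - (k * k / 2) * math.log10(math.e)
        - math.log10(k)
        - 0.5 * math.log10(2 * math.pi)
    )
    return -log10_p


for k in (2, 3, 8, 22, 90):
    value = observed_glucose(30, 0.05, k)
    print(f"{k:>3} SD: observed {value:6.1f} mg/dL, occurs about 1 in 10^{log10_one_in(k):.1f}")
```

For 3 SD this gives about 1 in 370, matching the table, and for 90 SD the exponent runs well past a thousand, which is why a spreadsheet cannot display the result.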

Errors in higher zones of the Parkes error grid do occur, including E zone errors, and clearly this has nothing to do with precision. These errors come from other sources, such as interferences.

A better way to think of these errors is as “attribute” errors – they either occur or don’t occur. For more on this, see: Krouwer JS. Recommendation to treat continuous variable errors like attribute errors. Clinical Chemistry and Laboratory Medicine 2006;44(7):797–798.

Note that one cannot prove that attribute errors won’t occur. But no one allows results to be unspecified the way clinical chemistry standards committees do. For example you don’t hear “we want 98% of surgeries to be performed on the correct organ on the correct patient.”
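The point that one cannot prove attribute errors won’t occur can be put in numbers. The sketch below is an illustration only and is not taken from the cited paper; the function name is hypothetical. It uses the exact binomial upper confidence bound for the case where zero failures are seen in n results.

```python
def upper_bound_failure_rate(n_results, confidence=0.95):
    """Upper confidence bound on the true failure rate when 0 failures
    are observed in n_results independent results (exact binomial)."""
    if n_results <= 0:
        raise ValueError("n_results must be positive")
    return 1.0 - (1.0 - confidence) ** (1.0 / n_results)


for n in (100, 1000, 10000):
    bound = upper_bound_failure_rate(n)
    print(f"0 failures in {n} results: true rate could still be up to {bound:.4%}")
```

Even a clean study only bounds the failure rate; it never shows that the rate is zero. That is true of surgeries as well, yet no one writes a specification that leaves 2% of them unspecified.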

The updated FDA POCT glucose meter performance standard has a big problem

October 21, 2016


As readers may be aware, I have ranted against glucose meter standards for some time. Although the standards have many flaws, the most egregious one is the failure to specify 100% of the results. For POCT glucose meters, the CLSI standard C30-A2 (2003) adopted the ISO glucose meter standard 15197, which accounts for 95% of the results.

In 2013, CLSI updated its standard, now called POCT 12-A3, to include 98% of the results.

In 2014, FDA issued a draft POCT glucose meter guidance which covers 100% of the results.

But, now FDA has updated its POCT glucose meter guidance to cover only 98% of the results.

There’s no reason to allow 2% of the results to be unspecified – I don’t know why the FDA did this.