IQCP – It’s about the money

April 22, 2016


There is an article in CAP Today about IQCP. I was struck by a quote in the beginning of the article:

“I didn’t stop to calculate what it would cost to do liquid quality control on all the i-Stat cartridge types every eight hours because the number would have been through the roof”

Now I understand that cost is a real issue, but so is harm to patients.

The original idea of EQC (equivalent quality control) was to reduce the frequency of QC if an experiment showed good QC results for 10 days. This was of course without merit and had the potential to cause patient harm.

The current notion of IQCP is to perform risk analysis and reduce the frequency of QC. This also makes no sense. Risk analysis should always be performed and so should QC, at a frequency that allows questionable results to be repeated so that patients will not be harmed.

More on Antwerp, EFLM and pre-analytical error

April 14, 2016


One of the talks in the Antwerp conference referred to an EFLM working group to come up with performance specifications for pre-analytical error. There is a talk on the EFLM website about this.

The recognition that pre-analytical error is a big part of ensuring the quality of laboratory tests is of course important; however, it’s hard to see how separate performance specifications for pre-analytical error (e.g., separate from analytical error) can be useful. [Actually, the presenter from the Antwerp conference agreed with my skepticism during a break.]

Pre-analytical error can be classified into three types:

Type 1 – An error that is completely independent of the analytical process. Example: failure to wash the site that is sampled for a glucose meter test. If the site is contaminated with glucose, any glucose meter will report an elevated (and erroneous) result.

Type 2 – An error that is partly dependent on the analytical process. Example: a short sample for a glucose meter test that has an algorithm in the meter to detect short samples. If the algorithm is defective (an analytical error) and there is a short sample (a pre-analytical error), the glucose result may be erroneous.

Type 3 – A pre-analytical error that is indistinguishable from the analytical process. Example: air bubbles in a pO2 blood gas syringe. No matter who is performing the test, there is the possibility of having bubbles in the sample, a pre-analytical error which can cause an erroneous result.


One of the problems is that a typical evaluation will attempt to meet (analytical) performance specifications with type 1 and type 2 errors excluded from the evaluation. This is of course recognized by this EFLM group, hence their task. I note in passing that when type 3 errors occur, the performance evaluations include such pre-analytical errors even when trying not to (by excluding the possibility of type 1 and type 2 errors).

One fear is that the EFLM group will come up with a bunch of separate performance specifications for pre-analytical error, independent of specifications for analytical error. I don’t see how this can work.

What would I do? I would use a reliability growth metric, which counts all errors (regardless of source) – see this paper.

Finally, where I wrote pre-analytical error, it should be pre- and post-analytical error.

Responding to Prof. Jim W.

April 13, 2016


I need to speak up due to a summary made by Jim Westgard regarding my talk at the Quality in the Spotlight conference in Antwerp.

  1. Jim referred to my presentation where I said the “total” in total analytical error left out too many errors. Jim suggested I was referring to pre-pre-analytical errors, among others, but I definitely stated that analytical errors are also left out of the Westgard total error model. I’m not sure what a pre-pre-analytical error is anyway. It is true that there are some rare errors that will be very difficult for a lab to detect, such as software errors or manufacturing mistakes.
  2. Jim suggested that total analytical error (e.g., the Westgard model) is broader than separate estimates of precision and bias. I don’t see how.
  3. He said that labs don’t want more complex equations / models. I’m sure this is true, but what our company did was even simpler than the Westgard model – we simply looked at the differences between the candidate and comparison methods for all of the data. There were no models. The data were ranked to show the error limits achieved by 95% and 100% of the data. Not being constrained by models makes things simple.
  4. Jim said that ISO 15189 does not require measurement uncertainty to include pre- and post-analytical error. That may be, but it doesn’t make it right.
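The ranking approach described in point 3 can be sketched in a few lines. This is only a minimal illustration under my own assumptions – the data and names below are hypothetical, not our company’s actual code:

```python
import math

def error_limits(candidate, comparison, coverage=0.95):
    """Rank the absolute differences (candidate minus comparison) and
    return the error limit achieved by `coverage` of the data, along
    with the limit covering 100% of the data (the maximum)."""
    diffs = sorted(abs(c - r) for c, r in zip(candidate, comparison))
    k = math.ceil(coverage * len(diffs)) - 1  # smallest limit covering >= coverage
    return diffs[k], diffs[-1]

# Hypothetical paired results: 20 candidate values vs. a comparison method
deltas = [0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9, 25]
comparison = [100] * len(deltas)
candidate = [100 + d for d in deltas]

lim95, lim100 = error_limits(candidate, comparison)
# lim95 -> 9 (95% of results within 9 units), lim100 -> 25 (all results within 25)
```

No distributional model is assumed anywhere – the limits come straight from the ranked data.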

The Lone Dissenter

April 12, 2016


The picture is a photo of Linda Thienpont receiving the Westgard quality award, presented by Jim Westgard. This was a highlight of the Antwerp meeting in which Linda’s contributions to laboratory medicine were recognized.



I was amused to see a photo on the Westgard blog about the Antwerp conference – Quality in the Spotlight. The photo is incidental to the blog content – it shows people holding up green cards, with the exception of one person holding up a red card. It’s hard to see the person holding up the red card, but it’s me! This was voting by the attendees on questions asked by the convener – Henk Goldschmidt – at the end of the day’s session.

The question to which everyone agreed except me went something like – “should analytical variation always be less than biological variation?”

So here’s my reason for dissenting.

The Ricos database for glucose, available on the Westgard website, lists the TAE for glucose at either 5.5% or 6.96%. Yet the 2013 ISO 15197 performance standard for glucose meters is: TAE (95% of results) of ±15 mg/dL below 100 mg/dL and ±15% above 100 mg/dL. Hence, the answer to the question “should analytical variation always be less than biological variation?” is no!
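To make the comparison concrete, the ISO 15197 limit can be written as a small function. One caveat: the behavior at exactly 100 mg/dL is my assumption, not quoted from the standard.

```python
def iso15197_limit(reference_mgdl):
    """Allowable error per the 2013 ISO 15197 standard (95% of results):
    +/-15 mg/dL below 100 mg/dL, +/-15% otherwise. The behavior at
    exactly 100 mg/dL is an assumption, not quoted from the standard."""
    return 15.0 if reference_mgdl < 100 else 0.15 * reference_mgdl

# At 150 mg/dL the standard allows +/-22.5 mg/dL, i.e., a 15% error -
# far wider than the 5.5% or 6.96% TAE derived from biological variation.
```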

In my one-man response paper (subscription required) to the Milan conference, I had a section discussing the merits of biological variation vs. clinician opinion but dropped it in the final version. This material was in my Antwerp talk, however – basically I said that I understand the rationale behind biological variation and it makes sense to me, but I don’t see how biological variation can trump clinician opinion, and glucose meters were the example I used.

I note in passing that Callum Fraser, the guru of biological variation, was in the audience – earlier in the day he presented a fabulous historical overview of biological variation. During his presentation I was nevertheless struck by some of the equations used for biological variation. For example, one of the equations was

CV(analytical) < ½ CV(within-subject biological variation)

So why is it exactly 0.5? Why not 0.496 or 0.503? And how can it be 0.5 for all assays? Is there something about the 0.5 that is like pi?

A difference between the IFCC HbA1c goals and an HbA1c error grid – continued

April 5, 2016


Some more thoughts …

Anyone who’s ever looked at CAP summary statistics knows that CAP deletes outlier data as part of their process. One can view this in several ways…

From a statistics standpoint, it makes sense because the main parameter of interest is imprecision, which would be inflated by outlier data.

But the original goal of six sigma (which also requires a precise estimate of imprecision) was to be able to predict DEFECTS, so why in the world would you delete the defects (outliers) that you wish to predict? From that standpoint, the analysis is biased.

Moreover, the outliers could in fact be real analytical problems; whatever their cause, they are still problems. And because outliers are by definition large errors, these values could be associated with serious patient harm.

So this is another reason to favor error grids – which always include all data.

A difference between the IFCC HbA1c goals and an HbA1c error grid

April 4, 2016


Recently, an IFCC committee published recommended goals for hemoglobin A1c (HbA1c). Their recommended sigma metrics establish a pass-fail criterion and their discussion revolves around the risk of passing or failing the allowable total error using control samples.
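For readers unfamiliar with sigma metrics, a commonly used form is (TEa − |bias|) / CV. I am not claiming this is the IFCC committee’s exact computation, and the numbers below are hypothetical:

```python
def sigma_metric(tea_pct, bias_pct, cv_pct):
    """Commonly used sigma-metric form: (TEa - |bias|) / CV, with all
    terms expressed in percent. Not necessarily the IFCC committee's
    exact computation; inputs below are hypothetical."""
    return (tea_pct - abs(bias_pct)) / cv_pct

# Hypothetical HbA1c assay: TEa = 6%, bias = 1%, CV = 1.5%
sigma = sigma_metric(6.0, 1.0, 1.5)  # (6 - 1) / 1.5 = 3.33...
```

A pass-fail rule then reduces to a single threshold on sigma, which is what drives the limitation discussed below: assays just above and just below the threshold perform almost identically.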

An error grid has zones and zones outside of the “A” zone are associated with increasing harm to patients (e.g., likelihood of an incorrect medical decision made by a clinician based on test error). Typically, patient samples are used to populate an error grid and the likelihood of observing errors is more faithful to the real world.
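A toy sketch of zone assignment by percent error follows; the zone limits are hypothetical placeholders, not the boundaries of any published HbA1c error grid:

```python
def zone(reference, measured, a_limit=6.0, b_limit=12.0):
    """Assign a paired result to an error-grid zone by percent error.
    The zone limits here are hypothetical placeholders, not the
    boundaries of any published HbA1c error grid."""
    pct_error = abs(measured - reference) / reference * 100.0
    if pct_error <= a_limit:
        return "A"  # little or no harm expected
    if pct_error <= b_limit:
        return "B"  # possible incorrect medical decision
    return "C"      # likely incorrect decision, potential serious harm

# zone(7.0, 7.2) -> "A"; zone(7.0, 7.8) -> "B"; zone(7.0, 9.0) -> "C"
```

Note that every paired result lands in some zone – nothing is deleted as an outlier, which is the point made in the previous post.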

Each type of goal has its advantages and limitations.

The IFCC pass-fail goal is useful for proficiency surveys to compare different assays, especially over large datasets. Thus, one can spot poorly performing assays. But the limitations are:

  • Because control samples are evaluated, certain errors, such as patient interferences, cannot be detected.
  • Even if patient samples were used, the Westgard model would not detect interferences or other sources of random bias.
  • The pass-fail criterion has its own limitations. An assay that just passes and one that just fails should have similar performance – yet one is acceptable and the other isn’t.

An error grid is more suitable to understanding how an assay will perform in a hospital laboratory, assuming that the error grid is populated using patient samples (requires a reference assay for comparison for each sample). The advantage is that more error sources are sampled and the harm associated with the results is shown (assumes that the zone limits are correct). The limitations are:

  • Populating an error grid is impractical across sites so conclusions are limited to the site that conducted the experiment.

So the IFCC goals provide a high level view of an HbA1c assay whereas the error grid provides the detailed view.