Mandel and Westgard

May 20, 2018

Readers may know that I have been known to critique Westgard’s total error model.

But let’s step back to 1964 with Mandel’s representation of total error (1), where:

Total Error (TE) = x - R = (x - mu) + (mu - R), where

x = the sample measurement,
R = the reference value, and
mu = the population mean of the sample.

Thus, mu-R is the bias and x-mu the imprecision – the same as the Westgard model. There is an implicit assumption that the replicates of x which estimate mu are only affected by random error. For example, if the observations of the replicates contain drift, the Mandel model would be incorrect. For replicates sampled close in time, this is a reasonable assumption, although it is rarely if ever tested.
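To make the decomposition concrete, here is a minimal simulation sketch (my own illustration with hypothetical values for R, mu, and the SD; it is not from Mandel’s text and assumes pure random error with no drift):

```python
import numpy as np

rng = np.random.default_rng(0)

R = 100.0   # reference value (hypothetical)
mu = 103.0  # population mean of the sample, so bias = mu - R = 3
sd = 2.0    # pure random error; no drift, per the model's assumption

x = rng.normal(mu, sd, size=1000)   # replicate measurements of one sample

total_error = x - R        # TE = x - R
imprecision = x - mu       # random component
bias = mu - R              # systematic component

# The identity TE = (x - mu) + (mu - R) holds for every replicate
assert np.allclose(total_error, imprecision + bias)
print(f"bias = {bias:.2f}, SD of imprecision = {imprecision.std(ddof=1):.2f}")
```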

Interferences are not a problem because even if they exist, there is only one sample. Thus, interference bias is mixed in with any other biases in the sample.

Total error is often expressed for 95% of the results. I have argued that the remaining 5% of results are unspecified, but if the assumption of random error is true for the repeated measurements, this is not a problem because these results come from a Normal distribution. Thus, the probability that results will occur at high multiples of the standard deviation is extremely remote.
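As a quick check (assuming the repeated measurements really are Normal), the two-sided probability of a result falling beyond k standard deviations drops off very fast:

```python
import math

# Two-sided Normal tail probability beyond k standard deviations:
# P(|Z| > k) = erfc(k / sqrt(2))
for k in (2, 3, 4, 5, 6):
    print(f"beyond {k} SD: {math.erfc(k / math.sqrt(2)):.1e}")
```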

But outliers are a problem. Typically in these studies, outliers (if found) are deleted because they would perturb the estimates. The problem is that the outliers are usually not otherwise dealt with, and the 5% of unspecified results then becomes a problem.

If no outliers are observed, this is a good thing, but here are the 95% upper confidence limits for the outlier rate (i.e., the maximum plausible rate) when 0 outliers have been found in the indicated number of sample replicates; a short calculation follows the table.

N          Maximum outlier rate (95% upper limit)

10         25.9%
100        3.0%
1,000      0.3%
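These figures follow from the standard “zero observed events” binomial bound: if 0 outliers are seen in N replicates, the one-sided 95% upper limit on the outlier rate p solves (1 - p)^N = 0.05. A short sketch that reproduces the table:

```python
# One-sided 95% upper confidence limit on the outlier rate when
# 0 outliers are observed in N replicates: 1 - 0.05**(1/N)
for n in (10, 100, 1000):
    upper = 1 - 0.05 ** (1 / n)
    print(f"N = {n:>5}: {upper:.1%}")
```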

So if one is measuring TE for a control or patient pool and keeping the time between replicates short, then the Westgard model estimate of total error is reasonable, although one still has to worry about outliers.

But when one applies the Westgard model to patient samples, it is no longer correct since each patient sample can have a different amount of interference bias. And while large interferences are rare, interferences can come in small amounts and affect every sample – inflating the total error. Moreover, other sources of bias can be expected with patient samples, such as user error in sample preparation. And with patient samples, outliers, while still rare, can occur.

This raises the question of how to interpret results from a study that uses the Westgard model (such as a Six Sigma study). These studies typically use controls, but the implication is that they inform about the quality of the assay – meaning, of course, for patient samples. This is a problem for the reasons stated above. So one can say that if an assay has a bad Six Sigma value, the assay has a problem, but if the assay has a good Six Sigma value, one cannot say the assay is without problems.
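For context, the sigma metric in such studies is typically computed from control data as (allowable total error - |bias|) / CV, with everything in percent. A minimal sketch with illustrative numbers only:

```python
def sigma_metric(tea_pct: float, bias_pct: float, cv_pct: float) -> float:
    """Sigma metric from control data: (TEa - |bias|) / CV, all in percent."""
    return (tea_pct - abs(bias_pct)) / cv_pct

# Illustrative values only: TEa = 10%, bias = 2%, CV = 2%  ->  sigma = 4.0
print(sigma_metric(tea_pct=10.0, bias_pct=2.0, cv_pct=2.0))
```

By construction, this metric uses only the bias and imprecision measured on controls, which is why a good value cannot rule out the patient-sample problems described above.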

 

Reference

  1. Mandel J. The Statistical Analysis of Experimental Data. Dover, New York, 1964, p. 105.

 


When large lab errors don’t cause bad patient outcomes

April 21, 2018

At the Milan conference, the preferred specification is based on the effect of assay error on patient outcomes. This seems reasonable enough, but consider the following two cases.

Case 1, a glucose meter reads 350 mg/dL, truth is 50 mg/dL; the clinician administers insulin resulting in severe harm to the patient.

Case 2, a glucose meter reads 350 mg/dL, truth is 50 mg/dL; the clinician questions the result and repeats the test. The second test is 50 mg/dL; the clinician administers sugar resulting in no harm to the patient.

One must realize that lab tests by themselves cannot cause harm to patients; only clinicians can cause harm by making an incorrect medical decision based in part on a lab test. The lab test in cases 1 and 2 has the potential (a high potential) to result in patient harm. Case 2 could also be considered a near miss. From a performance vs. specification standpoint, both cases should be treated equally in spite of different patient outcomes.

Thus, the original Milan statement should really be the effect of assay error on potential patient outcomes.


New publication about interferences

April 20, 2018

My article “Interferences, a neglected error source for clinical assays” has been published. This article may be viewed using the following link: https://rdcu.be/L6O2


Performance specifications, lawsuits, and irrelevant statistics

March 11, 2018

Readers of this blog know that I’m in favor of specifications that account for 100% of the results. The danger of specifications that cover only 95% or 99% of the results is that errors can occur that cause serious patient harm even for assays that meet specifications! Large and harmful errors are rare, certainly occurring in less than 1% of results. But hospitals might not want specifications that account for 100% of results (and remember that hospital clinical chemists populate standards committees). A potential reason is that if a large error occurs, the 95% or 99% specification can be an advantage for a hospital if there is a lawsuit.

I’m thinking of an example where I was an expert witness. Of course, I can’t go into the details, but this was a case where there was a large error, the patient was harmed, and the hospital lab was clearly at fault. (In this case it was a user error.) The hospital lab’s defense was that they followed all procedures and met all standards – in effect, sorry, but stuff happens.

As for irrelevant statistics, I’ve heard two well-known people in the area of diabetes (Dr. David B Sachs and Dr. Andreas Pfützner) say in public meetings that one should not specify glucose meter performance for 100% of the results because one can never prove that the number of large errors is zero.

That one can never prove that the number of large errors is zero is true, but this does not mean one should abandon a specification for 100% of the results.

Here, I’m reminded of blood gas. For blood gas, obtaining a result is critical. Hospital labs realize that blood gas instruments can break down and fail to produce a result. Since this is unacceptable, one can calculate the failure rate and reduce the risk of no result with redundancy (meaning using multiple instruments). No matter how many instruments are used, the possibility that all instruments will fail at the same time is not zero!
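As a rough sketch of the redundancy argument (assuming independent failures and a purely illustrative per-instrument downtime probability):

```python
# Probability that every blood gas instrument is down at the same time,
# assuming independent failures (p is a hypothetical downtime probability)
p = 0.01
for n in (1, 2, 3, 4):
    print(f"{n} instrument(s): {p ** n:.0e}")
```

Each added instrument shrinks the risk by another factor of p, but the product never reaches zero.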

A final problem with not specifying 100% of the results is that it may cause labs to not put that much thought into procedures to minimize the risk of large errors.

And in industry (at least at Ciba-Corning) we always had specifications for 100% of the results, as did the original version of the CLSI total error document, EP21-A (this was dropped in the A2 version).


A flaw in almost all lab medicine evaluations

January 25, 2018

Anyone who has even briefly ventured into the realm of statistics has seen the standard setup. One states a hypothesis, plans a protocol, collects and analyzes data and finally concludes that the hypothesis is true or false.

Yet a typical lab medicine evaluation will state the importance of the assay, present data about precision, bias, and other parameters and then launch into a discussion.

What’s missing is the hypothesis, or in terms that we used in industry – the specifications. For example, assay A should have a CV of 5% or less in the range of XX to YY. After data analysis, the conclusion is that assay A met (or didn’t meet) the precision specification.
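Here is a minimal sketch of what such a conclusion looks like in practice (hypothetical specification and data; a real evaluation would also report a confidence interval on the estimated CV):

```python
import statistics

CV_SPEC_PCT = 5.0   # hypothetical specification: CV <= 5%

replicates = [98.1, 101.2, 99.5, 100.8, 97.9, 100.3]   # illustrative results
cv_pct = 100 * statistics.stdev(replicates) / statistics.mean(replicates)

verdict = "met" if cv_pct <= CV_SPEC_PCT else "did not meet"
print(f"Observed CV = {cv_pct:.1f}%: assay {verdict} the precision specification")
```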

These specifications are rarely if ever present in evaluation publications. Try to find a specification the next time you read an evaluation paper. And without specifications, there are usually no meaningful conclusions.


A requirement for specifications – Duh

January 17, 2018

Doing my infrequent journal scan, I came across the following paper – “The use of error and uncertainty methods in the medical laboratory,” available here. Ok, another sentence floored me (it’s in the abstract)… “Performance specifications for diagnostic tests should include the diagnostic uncertainty of the entire testing process.” It’s a little hard to understand what “diagnostic uncertainty” means. The sentence would be clearer if it read “Performance specifications for diagnostic tests should include the entire testing process.” But isn’t this obvious? Does this need to be stated as a principle in 2018?


Allowable limit for blood lead – why does it keep changing?

August 19, 2017

A recent article suggests the CDC limit for blood lead may be lowered again. The logic for this is to base the limit on the 97.5th percentile of NHANES data, and to revisit the limit every 4 years. An article in Pediatrics has the details. Basically, the 97.5th percentile for blood lead has been decreasing – it was around 7 µg/dL in 2000. And in the Pediatrics article it is stated that “No safe blood lead concentration in children has been identified.” Nor has human physiology changed!

It’s hard to understand the logic behind the limit. If a child had a blood lead of 6 µg/dL in 2011, the child was ok according to the CDC standard, but not ok in 2013. Similarly, a blood lead of 4 µg/dL was ok in 2016 but not in 2017?

Here is a summary of lead standards in the USA through time.

1960s   60 µg/dL
1978    30 µg/dL
1985    25 µg/dL
1991    10 µg/dL
2012     5 µg/dL
2017?    3.48 µg/dL