CLSI EP22 EP23 Review

August 4, 2008

EP22 was created as a means to use risk management to allow manufacturers to recommend the frequency of external quality control run by clinical laboratories. This was the so-called option 4. Options 1-3 were part of the original CMS proposal to allow clinical laboratories to reduce the frequency of external quality control to once a month (provided certain conditions were met).

EP23 was the clinical laboratory follow on document to EP22.

Here’s my take on these two documents.

1.       Manufacturers won’t provide the information as suggested by EP22. (This information consists of experiments to demonstrate the efficacy of internal control measures.) It would be a lot of work (i.e., cost) and there’s no regulatory requirement to do so. Moreover, if this information were provided, it would be labeling, which would require FDA to review it. It is not clear that FDA has accepted this review task.

Update on 8/4/08: During a CLSI session at the AACC meeting in Washington, Alberto Gutierrez of the FDA gave a presentation. Afterwards, I asked him if FDA would review the material about internal control experiments that manufacturers might present as part of the package insert. He said that FDA would review this material, but from what was said it seemed that the review would be superficial and that only egregious problems would be flagged.

2.       Clinical laboratory staff do not have the expertise to review this information, were it provided. This does not mean that clinical laboratory staff are incapable of reviewing it (they could acquire the expertise); it just seems unlikely that they would.

3.       Should manufacturers provide this information and clinical laboratory staff review it, there would be no benefit with respect to improving QC. This is illustrated by an example in EP22 where the failure mode of “incorrect results due to low volume sample” is examined. After presenting the results of an experiment to show how an internal system control works, the user control measure is to “ensure that adequate volume of sample is presented to instrument.” But clinical laboratory staff would (or should) do this anyway. They don’t need EP22 and EP23 to know that one should follow the manufacturer’s instructions and refrain from doing something stupid.

In clinical chemistry, risk management is “in.” But there are signs that its popularity is already starting to wane. This is unfortunate, as there is a great opportunity to use risk management tools to reduce both the risk and occurrence of laboratory errors. But one must focus not just on potential system errors, as EP22 and EP23 do, but on human errors as well.


Westgard Quality Control Workshop – Part 3

June 5, 2008

I just returned from the Westgard Quality Control Workshop, where I was a speaker, and have a few blog entries’ worth of comments; this is the third.

EQC – Equivalent Quality Control

This is the CMS proposal (1) to allow clinical laboratories to reduce the frequency of quality control from twice per day to once a month, provided that 10 days of running QC shows no out-of-control values (and given some other conditions).

Let’s try to construct a hypothesis on which to base such a recommendation. For example:

Given any possible error condition that could be detected by external quality control, internal quality control would detect the same error 100% of the time.

This is about the best I can think of, which would result in the recommendation:

Stop running external quality control.

What does running 10 days of external QC with no out-of-control results show? The answer is nothing. During these 10 days, either there were no errors or, if there were errors, external QC was not able to detect them. (It is possible that internal QC detected errors during these 10 days.) In fact, this experiment is guaranteed to be meaningless. To see this, one must realize that internal QC is always “on” and precedes external QC. So for external QC to be redundant to internal QC for a given error, internal QC would have to detect the error and either shut down the system or prevent the result (here, the external QC sample) from being reported, so external QC would never get the chance to flag it.

However, one can get different information by running external QC for a longer period: if internal QC misses an error but external QC detects it, then one has proved that external QC is not redundant to internal QC. This was shown to me (2): across a range of assays, out-of-control results occurred from 1 to 10 times per year, and these were real problems. Since controls are run only twice per day, the number of affected patient samples is larger than the number of out-of-control results.
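A rough way to see how little a clean 10-day run says is sketched below. It assumes, purely for illustration, that errors detectable only by external QC occur at random at a constant rate (a Poisson model) and uses the 1 to 10 per year range quoted above. Neither the model nor the function name `prob_no_event` comes from the CMS proposal.

```python
import math


def prob_no_event(events_per_year: float, window_days: float) -> float:
    """Probability of seeing zero events in the window.

    Assumption (not from the source): events follow a Poisson process
    with a constant yearly rate.
    """
    expected_events = events_per_year * window_days / 365.0
    return math.exp(-expected_events)


# The low and high ends of the range of real out-of-control events per year.
for rate in (1, 10):
    p = prob_no_event(rate, window_days=10)
    print(f"{rate:>2} events/year: P(clean 10-day run) = {p:.2f}")
```

Under this assumption a clean 10-day run is the most likely outcome even at the high end of the range, so observing one does not distinguish a system that needs external QC from one that does not.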

So a lab that reduces external QC to once a month is putting an even larger number of patient samples at risk, and the situation is made worse because the clinician has probably already acted on the erroneous results.

Rather than do the experiment suggested by CMS, a lab can simply examine its external QC records for a sufficient length of time.

References

1.       For a review, see http://www.aacc.org/events/expert_access/2005/eqc/Pages/default.aspx

2.       Personal communication from Greg Miller of Virginia Commonwealth University


Westgard Quality Control Workshop – Part 1

June 5, 2008

I just returned from the Westgard Quality Control Workshop, where I was a speaker, and have a few blog entries’ worth of comments; this is the first.

What’s Missing from Clinical Laboratory Inspections

At the Westgard Workshop, most of the participants were from clinical laboratories and I was impressed with how smart these people are. I also got a sense of a tremendous regulatory burden. From the CAP CD I obtained at the Workshop:

      The mission statement of the CAP Laboratory Accreditation Program is:

“The CAP Laboratory Accreditation Program improves patient safety by advancing the quality of pathology and laboratory services through education and standard setting, and ensuring laboratories meet or exceed regulatory requirements.”

I have had mixed feelings about inspections that certify quality and have previously reported my experience with an industry quality program – ISO 9001 (1).

Here’s my assessment of clinical laboratory inspections to certify laboratories. It would seem that the premise of these inspections is that ensuring specific policies and procedures are in place and executed, as proven largely by documentation, guarantees high quality. So what’s missing? As far as I can tell (and it is with great difficulty that one reads through these materials), there is no measurement of error rates. Without such measurements, quality is unknown.

Recommendation

The regulatory bodies would describe a list of errors and their associated severities. The severities would be given numerical values, as in the VA hospital system, which uses 1-4. Every clinical laboratory would record each error (failure mode) that occurs in the laboratory, along with its severity and its frequency (the default frequency is of course 1). The laboratory would multiply frequency x severity for each unique error (failure mode), add these products up, and get a rate by dividing by the number of tests reported per year.
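The calculation described above can be sketched in a few lines. The failure modes, counts, and test volume below are invented for illustration, and the names `ErrorRecord` and `weighted_error_rate` are hypothetical. Only the 1-4 severity scale and the frequency x severity sum divided by tests per year come from the text.

```python
from dataclasses import dataclass


@dataclass
class ErrorRecord:
    failure_mode: str
    severity: int        # 1-4, as in the VA scale mentioned above
    frequency: int = 1   # default frequency is 1


def weighted_error_rate(records, tests_per_year: int) -> float:
    """Sum of frequency x severity over all failure modes, per reported test."""
    if tests_per_year <= 0:
        raise ValueError("tests_per_year must be positive")
    total = sum(r.severity * r.frequency for r in records)
    return total / tests_per_year


# Hypothetical one-year error log for a single laboratory.
records = [
    ErrorRecord("patient sample mix-up", severity=4, frequency=2),
    ErrorRecord("result reported late", severity=1, frequency=30),
    ErrorRecord("hemolyzed sample not flagged", severity=2),
]
rate = weighted_error_rate(records, tests_per_year=500_000)
print(f"{rate * 1e6:.0f} severity-weighted errors per million tests")
```

A single number like this can be tracked year over year, which is what closes the loop: if the rate is unacceptable, the laboratory changes its process and measures again.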

Failing to count errors would be a serious violation.

This would be the start of a new premise for the regulatory bodies. Measure quality – if it’s unacceptable, the clinical laboratory would suggest and implement process changes. It’s a simple closed loop process. With emphasis on measurement, reliance on documentation should decrease and inspections should be less burdensome.

[Figure: closed loop process]

References

1.       Krouwer JS. ISO 9001 has had no effect on quality in the in-vitro medical diagnostics industry. Accred. Qual. Assur. 2004;9:39-43


Acceptable Risk – Easy to talk about, but no one knows what it means

May 4, 2008

Standards about risk management always talk about “acceptable risk.” This is a qualitative term. Unfortunately, for much of healthcare there is no matching quantitative assessment or goal. Consider two examples.

Statement                      Because
Precision is acceptable        CV is 8% and goal is 10%
Residual risk is acceptable    ?

It is possible to estimate the probability of a severe adverse event and to have an associated goal for such a probability, but no one in healthcare does this. So one will see things like, “with this mitigation we have reduced the risk of the adverse event to an acceptable level,” but the reality is that no one knows what this really means.


Six Sigma can be dangerous to your health

March 13, 2008

At a recent conference, there were several presentations about six sigma for clinical laboratory assays. To recall, sigma is calculated as Sigma = (TEa – bias)/CV where

TEa is the total allowable error
Bias is the inaccuracy of the measurement procedure
CV is the imprecision of the measurement procedure
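As a small worked example of the formula above, here is a sketch with TEa, bias, and CV all expressed in percent. The numbers are made up, the function name `sigma_metric` is hypothetical, and taking the absolute value of the bias is an assumption so that a negative bias is treated the same as a positive one.

```python
def sigma_metric(tea: float, bias: float, cv: float) -> float:
    """Sigma = (TEa - bias) / CV, with all three inputs in the same units."""
    if cv <= 0:
        raise ValueError("CV must be positive")
    # Assumption: the size of the bias matters, not its sign.
    return (tea - abs(bias)) / cv


# Hypothetical assay: TEa 10%, bias 1%, CV 1.5%.
print(sigma_metric(tea=10.0, bias=1.0, cv=1.5))  # 6.0
```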

The problem with six sigma is that it’s taken as the sole measure of quality – that is, if you have a high sigma value (greater than 6) then your assay is assured of high quality. The rest of this entry explains why this is wrong.

First, TEa (total allowable error) is often specifically called out as medically acceptable limits. One need only read the ISO 15197 standard for glucose to see this connection. I have previously commented about this standard. The implied meaning of medically acceptable limits is shown below.

figure 1

This is simply not the real world. Taguchi long ago specified a more realistic quadratic model of worth, which is shown below, superimposed on the original figure but in green.

figure 2

Thus points A and B are similar in bias and are similar in causing (or not causing) medically unacceptable results. It is also likely then that if point A is ok, then so is point B. It is only when one gets far away from these limits that one is almost certain to have results that can cause harm. This is shown below with point C.

figure 3

This can also be expressed as an error grid such as those for glucose. So the “sigma” calculations really only express the zone A region (grey) where 95% or more of the results should be. Zone B (white) can contain up to 5% of the results and zone C (dark grey) should contain no results. The error grid contains more information since each set of limits is different for each concentration. An error grid is shown below, taken from FDA guidance. In the guidance, WM is the test method and CM is the reference method. (In the document WM=waiver method and CM=comparative method).

figure 4

So the problem is that sigma only accounts for zone A, but patients are harmed by values in zone C!

Now one might argue that there is nevertheless a relationship between sigma and the three zones, meaning that high sigma values are unlikely to have values in zone C and low sigma values are likely to have such values. This is also not true. Here is why.

1.       Often incorrect models are used to assess total error – see here.

2.       In estimating bias and CV, outliers – the very values that cause harm - are often thrown out.

3.       All sigma calculations are based on the assumption that the data are normally distributed. Most data do not fulfill this criterion. This means that there are often more values in the tails of the distribution (again, this is zone C) than expected by calculations based on the normal distribution.

4.       And maybe the biggest reason of all, values can occur in zone C that have nothing to do with the analytical process. If there is a patient sample mix-up, this can occur and these values are excluded (when detected) from virtually all analytical evaluations.
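The normality point in item 3 can be illustrated numerically. The sketch below compares the fraction of results beyond six standard deviations for a pure normal distribution with the fraction for a mixture in which a small share of results comes from a much wider distribution. The 1% share and the fivefold wider spread are arbitrary assumptions chosen only to show the effect, and the function names are hypothetical.

```python
from statistics import NormalDist


def two_sided_tail(dist: NormalDist, limit: float) -> float:
    """Fraction of a zero-centered normal distribution beyond +/- limit."""
    return 2.0 * dist.cdf(-limit)


def mixture_tail(limit: float, outlier_fraction: float, outlier_sd: float) -> float:
    """Tail fraction when some results come from a wider normal distribution."""
    core = NormalDist(0.0, 1.0)
    wide = NormalDist(0.0, outlier_sd)
    return ((1.0 - outlier_fraction) * two_sided_tail(core, limit)
            + outlier_fraction * two_sided_tail(wide, limit))


limit = 6.0  # error limits placed six standard deviations from the mean
normal_p = two_sided_tail(NormalDist(0.0, 1.0), limit)
# Assumption for illustration: 1% of results have five times the spread.
mixed_p = mixture_tail(limit, outlier_fraction=0.01, outlier_sd=5.0)

print(f"pure normal:       {normal_p:.2e}")
print(f"with wide 1% part: {mixed_p:.2e}")
```

The bulk of the data, and therefore the estimated CV, barely changes when the wide component is this small, yet the fraction of results far outside the limits rises by several orders of magnitude. If those values are also discarded as outliers (item 2), the sigma calculation never sees them.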

Think of it this way. If a loved one suffered medical harm, due in part to an erroneous lab result, would it make you feel better to know that the assay had a high sigma value? And would you associate that assay with quality?

I will comment on how one can address these issues in a future entry.


At risk behavior

March 3, 2008

I am involved in risk management standards for clinical laboratories, where the focus has been on understanding how manufacturers’ devices can fail and how a clinical laboratory can put in place control measures to prevent these failures from causing harm.

My concern with these standards is that there is not enough emphasis given to the clinical laboratory’s own source of error: its people. Among problems related to human errors are cognitive errors, non-cognitive errors, reckless behavior, and at risk behavior, which is the topic of this entry.

At risk behavior is behavior that increases risk where the risk is not recognized, or is mistakenly believed to be justified. Anyone who manages people must have had the experience of hearing (perhaps secondhand) “I don’t think that’s necessary and I’m not going to do it.” And of course, parents are familiar with at risk behavior practiced by their children.

An example of healthcare at risk behavior is reusing syringes. This occurred recently at an endoscopy clinic in Nevada and has affected up to 40,000 people. In reading the patient empowerment blog, one learns about other cases of reused syringes. In a case in Long Island, the physician reused syringes only for the same patient, but the syringes were used with multi-dose vials and these vials were used across patients.

In the recent case of reducing central line infections, Dr. Peter Pronovost observed that, of the steps associated with placing a central line, doctors skipped at least one step in a third of patients. Whereas some of this could be attributed to non-cognitive errors (slips), it could also be associated with at risk behavior. The control measure that worked here was a double check step, whereby another healthcare provider would check to make sure each step was followed.

Discovering at risk behavior may not be easy, hence it needs to be on one’s radar screen.


Software Verification and Validation

January 24, 2008

In spending two sessions with groups of people who verify and validate medical device software, I got the impression that most effort is spent on testing code (to the requirements that exist). In part, I based this assessment on the number of questions (i.e., the interest shown by the audience) when code testing was discussed vs. when examining requirements was discussed. Yet, in reviewing recalls, and from my experience in the IVD industry, I suspect that most errors are caused by wrong requirements (see figure).

 

[Figure: coderequirements.jpg]

This makes me recall some definitions.

Bug – A coding error that prevents the software from meeting its stated requirement. A divide by zero error is a bug, but if the denominator can never be zero, this bug will never be a failure. Never be zero means the value can never be zero without a code logic statement such as If X <> 0, then … If the code logic statement were present, there would be no divide by zero bug.

Failure – Any deviation from customer expectations. This rather liberal statement is similar to the general definition of quality by ASQ. Each failure must be evaluated by the software / product development team to decide whether they agree and of course deviations have non software causes.

Example – A home glucose meter produces a value over 500 mg/dL. The meter displays ERR1. This is a requirements error. It is known that the value is too high (it could be 501 or 1,000). The meter should say something like HIGH.
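The glucose meter example can be turned into a small sketch of what the corrected requirement might look like in code. The function name `display_reading` and the output format are hypothetical. Only the 500 mg/dL limit and the HIGH message come from the example.

```python
UPPER_LIMIT_MG_DL = 500  # upper limit used in the example above


def display_reading(value_mg_dl: float) -> str:
    """Return the text the meter shows for a measured glucose value."""
    if value_mg_dl > UPPER_LIMIT_MG_DL:
        # The value is known to be too high, so say so instead of
        # showing a generic error code such as ERR1.
        return "HIGH"
    return f"{value_mg_dl:.0f} mg/dL"


print(display_reading(120))    # 120 mg/dL
print(display_reading(1000))   # HIGH
```

Code that faithfully displayed ERR1 would pass every test written against the original requirement, which is why testing code to existing requirements does not catch this kind of failure.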


FMEA vs. FRACAS

January 4, 2008

I have previously compared FMEA and FRACAS, here. Another simple difference is:

(Successful) FMEA reduces risk.

(Successful) FRACAS reduces failure rates.

Now, one often hears about successful FMEAs. In my experience, these are not FMEAs; they are examples of FRACAS. An example is here. How can one tell that this is FRACAS and not FMEA? It’s simple: what is described is the reduction of a failure rate that was too high to a lower rate. With FMEA, the failure rate is zero because the event has not happened. What one does is to reduce the risk of this potential failure from some amount to a lower amount. This is perhaps one of the reasons one does not hear too much about FMEA successes. As I said before, to say that something that has never happened is now even less likely to happen (due to FMEA) just isn’t too exciting.

To reduce failure rates is a good thing and it is not a big deal to call this FMEA when it is FRACAS. However, it is simple to use the correct terms, and if one doesn’t, one might wind up neglecting to perform FMEA when it’s needed.


A Different Animal

January 1, 2008

I have spent my career in industry in R&D in a quality role. As I continue to interact with people who deal with quality in the in vitro diagnostics industry, I get the impression that most of these people are not from R&D but rather from regulatory affairs. What’s the difference? My perception is that regulatory affairs professionals focus more on compliance, whereas I have focused on measuring things. Compliance is often assessed through audits, with documentation a large part of audits. Measuring things forces activities to focus on improving the metric of interest. Documentation is of less importance.

What’s another difference? Whenever I write an article for publication on quality, it’s reviewed by regulatory affairs professionals. I can tell by the comments (i.e., they disagree with most of what I say). R&D people agree with me.


Frequency of QC in the clinical laboratory

December 9, 2007

Kent Dooley has written an interesting essay, which is here. One of the points he makes is that not all clinical laboratory errors result in patient harm because clinicians will not always act on the erroneous result. So if an assay result doesn’t agree with other clinical data, the clinician may suspect the result might be wrong and ask to have it repeated. Dooley suggests that the minimum QC frequency should follow the time course for the likelihood of a clinician requesting a repeat sample, so that upon repeat, if the result had been in error, the new result will be correct (because now QC has been run).

Now, I am unencumbered by the knowledge and experience of working in a lab but my view of things is somewhat different. It seems to me that there are several error/detection/recovery possibilities as shown in the figure below. (Note, better pictures are here).

[Figure: Error Detection Recovery]

The problem with waiting for a clinician (or, for that matter, a patient) to question a result before running QC is that it doesn’t take advantage of the purpose of QC, which is shown below.

[Figure: QC]

That is, one runs the assay and at some time QC. If the QC is ok, then the results are released to the clinician. If not, one troubleshoots the assay including possibly rerunning patient samples. Using this scheme, QC frequency should not be determined by a retest time course but rather by the turn-around-time requirement for the assay.

Now if the clinician requests that the assay be repeated, and QC had already been run, it is unlikely that running a second QC will detect anything. QC has limitations in its ability to detect error (see figure below). Random biases and random patient interferences will not be detected by QC.

[Figure: QC properties]

This figure came from previous considerations about equivalent QC, which are here, and here.

Besides suspecting assay error, many assay results are repeated because a condition is being monitored. Delta checks are a type of QC that is performed on these samples to determine whether the difference between results is expected. Exactly how the clinical laboratory could act on the knowledge that the clinician suspects that something is wrong with the assay result is a topic for clinical laboratorians to answer.
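For readers unfamiliar with delta checks, a minimal sketch of the idea follows. It uses the simplest possible rule, a fixed limit on the absolute change between consecutive results for the same patient. The limit and the example values are hypothetical, and a real laboratory might also consider percent change or the time between samples.

```python
def delta_check(previous: float, current: float, max_change: float) -> bool:
    """Return True when the change between two results is within the limit."""
    return abs(current - previous) <= max_change


# Hypothetical consecutive results for one patient, with a made-up limit of 1.0.
print(delta_check(previous=4.1, current=4.3, max_change=1.0))  # True
print(delta_check(previous=4.1, current=6.0, max_change=1.0))  # False
```

A failed check does not say which of the two results is wrong, or whether the patient's condition has really changed. It only flags the pair for review.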