Whooping Cough and False Positives

January 22, 2007

A recent incident has set the quality folks abuzz. As reported in The New York Times, a hospital treated a number of its workers for whooping cough because they had tested positive for the condition. It was later determined that no one had whooping cough; all of the test results were false positives. In a standards committee, I cited this article as an example of why it is important to perform a FMEA (Failure Mode and Effects Analysis), since there has been some resistance on the grounds that FMEAs are too complicated for hospital laboratories.

Westgard cited the Times article as a reason to stress the need for method validation skills. I agree with most of what he says, although I suggest that, in addition to performing a method validation, one must also consider pre- and post-analytical issues, which is a reason to perform FMEA.

However, I disagree with one of Westgard’s points:

“Finally, there are those damned statistics that get in the way of a practical understanding of experimental results. As evidence of this problem, Clinical Chemistry (the International Journal of Molecular Diagnostics and Laboratory Medicine) recommends that authors utilize the Bland-Altman approach (difference plot with t-test statistics) for analyzing method comparison data, in spite of the fact that regression techniques are usually much more informative, particularly in identifying proportional analytical errors that invalidate the error estimates from t-test analysis. Evidently, laboratory scientists are not sophisticated enough to understand and utilize regression analysis correctly. That again speaks to the inadequacy of our education and training programs and the lack of proper guidance in the validation of molecular tests, even by a leading international journal.”

This advice is incorrect. Total error is more informative than regression and a better first step in assessing assay performance. Proportional error does not invalidate the t-test. Among the methods for assessing total error are:

| Technique | Issues | Origin* | CLSI** Standard |
| --- | --- | --- | --- |
| Model: combine systematic and random errors | Can discard outliers, model can be wrong, only accounts for 95% of results, specs are for components | Westgard, Peterson, others | None, but components based on EP5 and EP9 |
| Model: GUM | Can discard outliers, model can be wrong, very complicated, only accounts for 95% of results | ISO | Clin. Chem. 51 |
| Bland Altman | Can discard outliers?, normal data assumption can mislead, only accounts for 95% of results | Bland Altman | EP21 |
| Mountain Plot | Need a lot of data | Krouwer, Monti, Lynch | EP21 |

*Or champions
**Clinical and Laboratory Standards Institute

If total error is unacceptable, further analysis may be warranted, such as regression.
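As a rough illustration of two of the approaches in the table, the sketch below computes Bland-Altman limits of agreement and a percentile-based interval of the observed differences, which is the kind of quantity one reads off a mountain plot. The function names, the 1.96 multiplier, and the sample data are assumptions for illustration only; this is not the EP21 procedure.

```python
import math
import statistics


def bland_altman_limits(candidate, comparison):
    """Return (mean difference, lower limit, upper limit) for paired results."""
    diffs = [c - r for c, r in zip(candidate, comparison)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation
    # Mean +/- 1.96 SD assumes roughly normal differences and is meant to
    # cover only about 95% of results.
    return mean_diff, mean_diff - 1.96 * sd_diff, mean_diff + 1.96 * sd_diff


def percentile_limits(candidate, comparison, coverage=0.95):
    """Return (lower, upper) observed differences bracketing `coverage`.

    No distribution is assumed, but many paired results are needed before
    the tails are estimated well.
    """
    diffs = sorted(c - r for c, r in zip(candidate, comparison))
    tail = (1.0 - coverage) / 2.0
    last = len(diffs) - 1
    lower = diffs[math.floor(tail * last)]
    upper = diffs[math.ceil((1.0 - tail) * last)]
    return lower, upper


# Hypothetical paired results (candidate method vs. comparison method).
candidate = [4.1, 5.3, 6.0, 7.2, 8.4, 9.1]
comparison = [4.0, 5.0, 6.1, 7.0, 8.0, 9.3]

print(bland_altman_limits(candidate, comparison))
print(percentile_limits(candidate, comparison))
```

With only a handful of results the percentile interval simply returns the extreme differences, which reflects the "need a lot of data" issue noted for the mountain plot.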

The reverse Pareto – not a good thing

January 11, 2007

I have previously commented on why, in a ranking system, one should rank severity and probability and not include detection, e.g., use criticality = [severity] x [probability of occurrence] rather than RPN (Risk Priority Number) = [severity] x [probability of occurrence] x [likelihood of detection].
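The two formulas can be compared directly. In this minimal sketch the event descriptions and rankings are made up, and a 1-4 scale with higher meaning worse is assumed for all three factors.

```python
def criticality(severity, probability):
    """Criticality = severity x probability of occurrence."""
    return severity * probability


def rpn(severity, probability, detection):
    """Risk Priority Number = severity x probability x likelihood of detection."""
    return severity * probability * detection


# Hypothetical rankings on an assumed 1-4 scale (higher is worse).
# (name, severity, probability, detection)
events = [
    ("severe, rare, easily detected", 4, 1, 1),
    ("minor, occasional, hard to detect", 2, 2, 4),
]

for name, sev, prob, det in events:
    print(name, "criticality:", criticality(sev, prob), "RPN:", rpn(sev, prob, det))
```

Both hypothetical events have the same criticality, yet the detection factor alone places the minor event well above the severe one when RPN is used.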

This essay describes the perils of trying to achieve, in a FMEA, a numerical reduction in criticality (or RPN) after instituting control measures (mitigations). Reduction in RPN is recommended on the IHI (Institute for Healthcare Improvement) website, http://www.ihi.org/ihi/workspace/tools/fmea/.

If one focuses on severity and probability, almost all dangerous events will (by definition) have the highest (worst) severity and the lowest (best) probability. As an example, for patient death caused by wrong site surgery, one would expect this event to be exceedingly rare (in the once in 5-30 year category) as opposed to more frequent. Less severe events could be expected to be more frequent. For example, a patient who waits longer than the prescribed time for an appointment could be a frequent event.

Criticality (severity times probability of occurrence) is a semi-quantitative measure. The Veterans Administration HFMEA has rankings of 1-4 for both severity and probability. This results in criticality from 1 to 16 to cover all cases. As mentioned above, in a criticality grid, the cell that contains 16 is likely to be devoid of events.

Although there are a few exceptions, in most cases control measures reduce probability of occurrence, not severity. This means that for the most dangerous events, one would institute a control measure on a criticality of 4 (severity = 4, probability = 1) and wind up with a criticality that is still 4 (severity = 4, probability = 1), because the probability ranking is already at its lowest value. On the other hand, it is possible to improve the numerical criticality of non-severe events such as the excessive patient waiting times. For example, one could improve criticality from 4 (severity = 1, probability = 4) to a criticality of 1 (severity = 1, probability = 1).

It’s not hard to see where this is going. If one starts to add up all of the criticality numbers and seeks to have an improvement of criticality due to control measures, then the most likely way to do this is to focus on the least severe events! This is the reverse Pareto and not a good idea.
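The arithmetic behind the reverse Pareto can be sketched with the two events used above, on the 1-4 rankings described for HFMEA. The class and field names are hypothetical, and the "after" rankings simply restate the example in the text.

```python
from dataclasses import dataclass


@dataclass
class Event:
    name: str
    severity: int            # 1 (least severe) to 4 (most severe)
    probability_before: int  # 1 (rarest) to 4 (most frequent)
    probability_after: int   # ranking after the control measure

    def criticality_before(self):
        return self.severity * self.probability_before

    def criticality_after(self):
        return self.severity * self.probability_after

    def reduction(self):
        return self.criticality_before() - self.criticality_after()


events = [
    # Already at the lowest probability ranking, so the number cannot improve.
    Event("patient death from wrong site surgery", 4, 1, 1),
    # Frequent but minor, so the number improves easily.
    Event("excessive patient waiting time", 1, 4, 1),
]

for e in events:
    print(e.name, e.criticality_before(), "->", e.criticality_after())

total_reduction = sum(e.reduction() for e in events)
print("total reduction in criticality:", total_reduction)
```

All of the summed reduction comes from the least severe event, even though the control measure on the most severe event may have lowered real risk.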

Pareto analysis is an important part of FMEA. One must focus on items at the top of the Pareto chart, and know that the criticality numbers are not likely to change for these items, even though the risk has been reduced. If one had a true quantitative ranking of probability of occurrence, this would be a different story, but quantitative rankings are not in use in healthcare.

FMEA timing and new vs. previously implemented control measures

January 5, 2007

When is the best time to do a FMEA? To answer this question, consider the phases of product development:

Phase 1 – Researching new opportunities (funding, design starts)
Phase 2 – Proving feasibility (breadboards)
Phase 3 – Scheduled development (prototypes)
Phase 4 – Validation (clinical trials)
Phase 5 – Commercialization (launch)

The number of phases and their names differ among companies, but conceptually these phases describe the product development process (with a similar set of phases for processes). For medical diagnostic manufacturers, FMEA is often conducted as part of a FDA submission, and this can occur in Phases 2, 3, or 4, with Phase 3 being common.

However, the best time to start a FMEA is in Phase 1. The FMEA should be revisited in each phase. Here are some reasons for starting the FMEA this early.

Purpose of FMEA – The purpose of FMEA is to affect the design, and the most practical time to do this is when competing designs are being considered, not after choices have been made, contracts signed, and prototypes built and tested. Of course, there are other reasons for performing a FMEA. FDA requires risk assessment (hazard analysis), but a FMEA conducted late in the product development process is largely a documentation exercise; rarely does this FMEA affect the design.

Design Reviews and FMEA – Most projects have design reviews, including in Phase 1, and one might think that these can be considered FMEA activities. They can’t. In a typical design review, the design is presented and questions are allowed. Whereas these questions could affect the design, they are less likely to do so than a FMEA, which is a challenge of the design in the form of a brainstorming session with much more time devoted to proposing potential errors, their causes, effects, and control measures (mitigations) that could affect the design.

Formal vs. Informal FMEA – If one reviews the documentation version of a FMEA (e.g., a FMEA that has been carried out late in the product development cycle), it is clear that the design has been affected by considering error prevention and detection. A FMEA is a documentation exercise when the control measures have been previously chosen. Putting in control measures to prevent errors outside of FMEA can be considered an informal FMEA. The danger in this approach is that the control measures are often supplied by the designer alone; there is neither a team nor a challenge session.

Actually, it would be naïve to think that without a FMEA, all products will have a spate of serious flaws. Designers (or people who implement the design) who incorporate effective control measures as part of the design make the need for a FMEA less apparent. However, the team brainstorming design challenge session in a FMEA reduces the risk that the designer has missed something.

Moreover, when FMEA is informal, one might be adding control measures to a design that should not have been chosen in the first place. A designer does not often question his or her own design, and the designer and the person who implements the design (the engineer) may be different people.

An Example – A small instrument is being developed. Rather than choosing an OEM keypad, a designer specifies a unique keypad which must be developed from scratch, including all of the interfaces. The risks of such a design choice are much higher than those of selecting a keypad with known characteristics. The unique keypad increases the risk of events that could lead to patient harm, worse reliability, and delay to the product launch. (It is up to the designer to supply the benefits of the unique keypad.) The time to challenge this choice is before work on the unique keypad starts, not after prototypes have been built and are under test.

To make this example more challenging, assume that the designer is charismatic, cozy with upper management, and claims that the unique keypad and other similar design features will win design awards and dramatically increase sales. How does someone who is less charismatic and less well known by management successfully challenge the design? One needs to speak in the language of management – money. One should request a marketing study such as a conjoint analysis to prove the assumption of increased sales. (In a conjoint analysis, one asks customers to rank potential product features such as elegant vs. ordinary design, size of sample, cost, and so on). One should also demonstrate the downside risk to the design by showing the effect on profitability of a delay in the product launch (caused by the design choice). If one is using decision analysis financial models, one could show the decrease in expected net present value caused by the increased risk of a patient harm event and lower reliability (as well as the delay in launch).
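The financial argument can be sketched as an expected net present value comparison. Everything in the example below is hypothetical: the cash flows, the discount rate, the probabilities of a launch delay, and the function names. It only shows the shape of the calculation, not a real decision analysis model, and it covers the launch delay only.

```python
def npv(cash_flows, rate):
    """Net present value of yearly cash flows, with year 0 first."""
    return sum(cf / (1 + rate) ** year for year, cf in enumerate(cash_flows))


def expected_npv(scenarios, rate):
    """Probability-weighted NPV over (probability, cash_flows) scenarios."""
    return sum(p * npv(cash_flows, rate) for p, cash_flows in scenarios)


RATE = 0.10  # assumed discount rate

# Hypothetical cash flows: development cost in year 0, then yearly profit.
on_time = [-100, 60, 60, 60]
delayed = [-100, 0, 60, 60]  # a one-year launch delay loses a year of profit

# Hypothetical probabilities of launching on time vs. late for each design.
oem_keypad = [(0.9, on_time), (0.1, delayed)]
unique_keypad = [(0.5, on_time), (0.5, delayed)]

print("OEM keypad expected NPV:   ", round(expected_npv(oem_keypad, RATE), 1))
print("Unique keypad expected NPV:", round(expected_npv(unique_keypad, RATE), 1))
```

A fuller model would add scenarios for a patient harm event and for lower reliability, and would credit the unique keypad with whatever sales increase the marketing study supports.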

A suggested new FMEA item – distinguishing between new vs. previously implemented control measures

Consider another example:

Error – incorrect result reported
Cause – clot in system
Effect – potential wrong medical decision
Control measure – clot detection system prevents result from being reported

This is a pretty standard (partial) entry in a FMEA table. However, it is most likely that the clot detection system was not implemented as a result of the FMEA and may have been part of the original design requirements [1]. Hence this is a case of FMEA as documentation, since any design for this product will have a clot detection system. Yet, regardless of when the FMEA is conducted, it is possible that some control measures have resulted from the FMEA exercise. If this is the case, one would like to know it and to distinguish these control measures from previously implemented ones. This distinction (which control measure is new vs. previously implemented) is missing in most FMEAs.

Whereas there is no guarantee that new control measures are required to lower risk, one can evaluate past projects by examining risk outcomes and the historical FMEAs; it should be possible to distinguish, after the fact, between previously implemented and new control measures. This presents one with a one-way test. If past risk outcomes have been poor and there have been no (or few) new control measures in FMEAs, then it would be prudent to track new control measures in future FMEAs.
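One way to record the suggested item is to tag each control measure with its origin, so that new control measures can be counted across historical FMEAs. The sketch below is only an illustration; the class names, the field names, and the `fraction_new` helper are hypothetical and not part of any FMEA standard.

```python
from dataclasses import dataclass, field
from enum import Enum


class Origin(Enum):
    PREVIOUSLY_IMPLEMENTED = "previously implemented"
    NEW = "new"  # introduced as a result of the FMEA itself


@dataclass
class ControlMeasure:
    description: str
    origin: Origin


@dataclass
class FmeaEntry:
    error: str
    cause: str
    effect: str
    control_measures: list = field(default_factory=list)


def fraction_new(entries):
    """Fraction of all control measures that originated in the FMEA."""
    measures = [m for e in entries for m in e.control_measures]
    if not measures:
        return 0.0
    return sum(m.origin is Origin.NEW for m in measures) / len(measures)


entry = FmeaEntry(
    error="incorrect result reported",
    cause="clot in system",
    effect="potential wrong medical decision",
    control_measures=[
        ControlMeasure(
            "clot detection system prevents result from being reported",
            Origin.PREVIOUSLY_IMPLEMENTED,
        )
    ],
)

print(fraction_new([entry]))
```

A fraction at or near zero across past FMEAs, combined with poor risk outcomes, is the situation the one-way test is meant to flag.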

[1] It is possible to perform a FMEA on the design requirements, e.g., before any design work starts. However, it is also likely that clot detection is simply a feature required by marketing to be competitive.