Reading Quality Digest can be dangerous to your health

June 17, 2008

In the June 2008 issue of Quality Digest, there is an article by Jay Arthur entitled “Statistical Process Control for Healthcare” (1). After the usual boilerplate introduction, something caught my eye: the so-called good news that there is “inexpensive Excel based software to create control charts ….” This made me go to the end of the article, where, sure enough, the author just happens to sell such software. This may have been a good place for the author to introduce the term bias.

To understand a more serious problem with this article, consider a hospital process, namely the analysis of blood glucose in a hospital laboratory. Because such a process has error, quality control samples are run. Say such a control has a target value of 100 mg/dL. The values of the quality control samples are plotted by SPC software and rules are formulated. If the glucose control value is too high or too low, the process is said to be out of control and action is taken.
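As a rough illustration of what such a rule looks like, here is a minimal sketch in Python. The 100 mg/dL target comes from the example above; the standard deviation of 3 mg/dL, the choice of a simple "outside target ± 3 SD" rule, and the sample values are all assumptions made for illustration, not anything a particular laboratory or SPC package uses.

```python
TARGET = 100.0     # mg/dL, target value from the example in the text
ASSUMED_SD = 3.0   # mg/dL, hypothetical standard deviation of the control


def control_limits(target, sd, k=3.0):
    """Return (lower, upper) control limits at target +/- k standard deviations."""
    return target - k * sd, target + k * sd


def out_of_control(values, target=TARGET, sd=ASSUMED_SD, k=3.0):
    """Return (index, value) pairs for control results outside the limits."""
    low, high = control_limits(target, sd, k)
    return [(i, v) for i, v in enumerate(values) if v < low or v > high]


# Hypothetical quality control results, in run order.
qc_values = [99.0, 101.5, 97.2, 110.4, 100.3, 89.9]
print(control_limits(TARGET, ASSUMED_SD))
print(out_of_control(qc_values))
```

The point of the sketch is that the rule is built around a nonzero target with limits on both sides, which is what distinguishes controlling a process from tracking an error rate.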

Now, Mr. Arthur is trying to push SPC software not for a process but for errors in the process. For example, he uses the infection rate in a hospital. But the infection rate is an error rate, not a process that one wants to hold at a target. Of course one does not want it to become worse, but its target is zero.

A more useful example than the hypothetical one provided by Mr. Arthur was published recently (2). Here, the authors were faced with an undesirable hospital infection error rate and set out to observe where errors occurred in the process of placing central lines. They then provided control measures and continued to track the error rate, which was reduced to zero. This is not SPC! It is much more like a FRACAS (Failure Reporting And Corrective Action System).

In another part of the article, Mr. Arthur suggests that “never events” can be tracked by SPC. Never events (a list of 28 such events has been put forth by the National Quality Forum) have, as the name implies, targets of zero. One such event is wrong site surgery. One should use something like FMEA (Failure Mode and Effects Analysis) to reduce the risk of such events. It is silly to suggest SPC software for never events.

References

1.   See http://www.qualitydigest.com/currentmag/articles/03_article.shtml

2.   Pronovost P, Needham D, Berenholtz S, Sinopoli D, Chu H, Cosgrove S, Sexton B, Hyzy R, Welsh R, Roth G, Bander J, Kepros J, Goeschel C. An Intervention to Decrease Catheter-Related Bloodstream Infections in the ICU. N Engl J Med 2006;355:2725-32.


Westgard Quality Control Workshop – Part 1

June 5, 2008

 

I just returned from the Westgard Quality Control Workshop, where I was a speaker, and have a few blogs’ worth of comments. This is the first.

What’s Missing from Clinical Laboratory Inspections

At the Westgard Workshop, most of the participants were from clinical laboratories and I was impressed with how smart these people are. I also got a sense of a tremendous regulatory burden. From the CAP CD that I obtained at the Workshop:

      The mission statement of the CAP Laboratory Accreditation Program is:

“The CAP Laboratory Accreditation Program improves patient safety by advancing the quality of pathology and laboratory services through education and standard setting, and ensuring laboratories meet or exceed regulatory requirements.”

I have had mixed feelings about inspections that certify quality and have previously reported my experience with an industry quality program – ISO 9001 (1).

Here’s my assessment of clinical laboratory inspections to certify laboratories. It would seem that the premise of these inspections is that having specific policies and procedures in place and executed, as proven largely by documentation, guarantees high quality. So what’s missing? As far as I can tell (and these materials are very difficult to read through), there is no measurement of error rates. Without such measurements, quality is unknown.

Recommendation

The regulatory bodies would describe a list of errors and their associated severities. The severities would be given numerical values, as in the VA hospital system, which uses 1-4. Every clinical laboratory would record each error (failure mode) that occurs in the laboratory, its severity, and its frequency (default frequency is of course 1). It would multiply frequency x severity for each unique error (failure mode), add these products up, and get a rate by dividing by the number of tests reported per year.
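The calculation described above can be sketched in a few lines of Python. The failure mode names, their severities on the 1-4 scale, and the annual test count are hypothetical values chosen only to show the arithmetic.

```python
from collections import Counter


def weighted_error_rate(error_log, severities, tests_reported):
    """Sum of frequency x severity over unique failure modes, per test reported.

    error_log      -- one entry per observed error, naming its failure mode
    severities     -- severity score (e.g., 1-4) for each failure mode
    tests_reported -- number of tests reported in the same period
    """
    frequency = Counter(error_log)  # occurrences of each unique failure mode
    weighted = sum(severities[mode] * n for mode, n in frequency.items())
    return weighted / tests_reported


# Hypothetical severities and one year's hypothetical error log.
severities = {
    "mislabeled specimen": 3,
    "delayed critical value": 4,
    "hemolyzed sample reported": 2,
}
error_log = [
    "mislabeled specimen",
    "hemolyzed sample reported",
    "mislabeled specimen",
    "delayed critical value",
]

rate = weighted_error_rate(error_log, severities, tests_reported=250_000)
print(f"severity-weighted errors per million tests: {rate * 1_000_000:.1f}")
```

Here the weighted sum is 3 x 2 + 2 x 1 + 4 x 1 = 12, which is then divided by the number of tests reported.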

Failing to count errors would be a serious violation.

This would be the start of a new premise for the regulatory bodies. Measure quality – if it’s unacceptable, the clinical laboratory would suggest and implement process changes. It’s a simple closed loop process. With emphasis on measurement, reliance on documentation should decrease and inspections should be less burdensome.

[Figure: closed loop]

References

1.       Krouwer JS. ISO 9001 has had no effect on quality in the in-vitro medical diagnostics industry. Accred. Qual. Assur. 2004;9:39-43


Alternatives to Six Sigma

March 19, 2008

This entry continues where the entry (Six Sigma can be dangerous to your health) left off. Given the problems with six sigma, what are some solutions to estimate the quality of an assay, using hCG as an example assay?

First, when total analytical error is calculated to estimate the values in zones A-C in an error grid, one should use conservative methods such as the empirical distributions suggested by the CLSI EP21A method, with no data deleted. Let’s say a clinical laboratory has done this evaluation with 40 patient samples for a new and reference method and found no results in zone C for an hCG assay. What can one conclude? Although there are 0% of the values in zone C, the 95% confidence interval extends to 7.2%. This means that for every million hCG results performed, up to 72,000 results could be in zone C. This is not very comforting and these types of evaluations don’t prove much, although one knows that the 7.2% rate is unlikely (because if this rate occurred, it would be noticed).
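The 7.2% figure is consistent with a one-sided exact binomial bound for zero events in 40 trials: the largest rate p for which seeing no events is still at least 5% likely, that is, (1 - p)^40 = 0.05. That this is the method behind the number is my assumption; the sketch below simply reproduces the arithmetic.

```python
def upper_bound_zero_events(n, confidence=0.95):
    """One-sided exact upper bound on a proportion when 0 of n events are seen.

    Solves (1 - p) ** n = 1 - confidence for p.
    """
    return 1.0 - (1.0 - confidence) ** (1.0 / n)


bound = upper_bound_zero_events(40)
print(f"upper bound on the zone C rate: {bound:.1%}")
print(f"results per million at that rate: {bound * 1_000_000:,.0f}")
```

The familiar "rule of three" approximation, 3/n, gives 7.5% for n = 40, which is close to the exact bound. Either way, the bound shrinks only in proportion to 1/n, so a very large study would be needed to demonstrate a low zone C rate directly.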

FMEA is an approach that will provide an answer to the quality question but in its complete form, it requires considerable effort. To complete a FMEA analysis, one has to postulate all possible reasons why a result could fall into zone C. To get an idea of what is involved, take two possible failure modes, HAMA interference and a patient sample mix-up.

HAMA interference – To estimate the likelihood of a zone C result from HAMA interference, one needs to know the level of HAMA that will cause erroneous results in the assay and the probability of such levels in the population being sampled. Contacting the manufacturer might give one the level of HAMA to watch out for – I am not familiar with data about the distribution of HAMA in patient samples. Yet, one knows HAMA interference occurs (Clinical Chemistry. 2001;47:1332-1333).  

Patient sample mix-up – There are some data for patient sample mix-ups (Archives of Pathology and Laboratory Medicine: Vol. 130, No. 11, pp. 1662–1668). However, it seems that these cases are caught within the laboratory. One would need to determine how many cases actually are not caught within the laboratory. One could then model the likelihood of a zone C result by sampling from the empirical distribution of hCG results that are observed in the lab to see the likelihood of a mix-up causing a zone C result.
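A minimal sketch of that sampling idea follows, in Python. Everything specific in it is a placeholder: the `in_zone_c` rule (a result flips between "below 5" and "above 25") is a hypothetical stand-in for a real error grid, and the list of hCG results is invented. The simulation only estimates the chance that a mix-up, given that one occurs, lands in zone C; it would still have to be multiplied by the rate of uncaught mix-ups.

```python
import random


def in_zone_c(true_value, reported_value):
    """Hypothetical zone C rule: the mix-up flips a clearly negative result
    to a clearly positive one, or the reverse. Not a real error grid."""
    negative, positive = 5.0, 25.0
    return (true_value < negative and reported_value > positive) or (
        true_value > positive and reported_value < negative
    )


def mixup_zone_c_probability(results, trials=100_000, seed=0):
    """Estimate P(zone C | mix-up) by drawing two different patients' results
    from the observed (empirical) distribution and swapping them."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        true_value, reported_value = rng.sample(results, 2)
        hits += in_zone_c(true_value, reported_value)
    return hits / trials


# Invented hCG results standing in for a laboratory's observed distribution.
observed = [1.2, 0.8, 2.5, 3.1, 150.0, 4200.0, 0.5, 18.0, 65000.0, 2.0]
print(mixup_zone_c_probability(observed))
```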

Because there are so many existing data in a clinical laboratory, one may also have the opportunity to perform FRACAS types of analyses. That is, in addition to modeling probabilities, one could use existing data to count actual failures.

One must then continue:

  • for each remaining failure mode, calculate the probability of zone C results
  • calculate the overall probability of zone C results (from all failure modes) and determine if that risk is acceptable
    • special software is typically used to perform these calculations
  • construct a Pareto table if the overall probability of zone C results is too high and
  • propose control measures to lower the overall risk to an acceptable level
    • the control measures must of course be affordable
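To make the combination and Pareto steps concrete, here is a small Python sketch. The per-mode probabilities are invented, and treating the failure modes as independent is a simplifying assumption; dedicated risk software handles dependencies that this does not.

```python
import math


def overall_probability(mode_probs):
    """P(at least one failure mode yields a zone C result), assuming the
    failure modes are independent (a simplifying assumption)."""
    return 1.0 - math.prod(1.0 - p for p in mode_probs.values())


def pareto_table(mode_probs):
    """Rows of (mode, probability, cumulative share), largest contributor first."""
    total = sum(mode_probs.values())
    rows, running = [], 0.0
    for mode, p in sorted(mode_probs.items(), key=lambda kv: kv[1], reverse=True):
        running += p
        rows.append((mode, p, running / total))
    return rows


# Invented per-result probabilities of a zone C result for each failure mode.
mode_probs = {
    "HAMA interference": 2e-5,
    "patient sample mix-up": 5e-5,
    "all other modes (placeholder)": 1e-5,
}

print(f"overall: {overall_probability(mode_probs):.2e}")
for mode, p, cumulative in pareto_table(mode_probs):
    print(f"{mode:32s} {p:.1e} {cumulative:6.1%}")
```

The Pareto ordering shows where a control measure would buy the largest reduction, which is the input needed for the affordability question.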

At this point, one can get the idea that this level of effort is out of reach for clinical laboratories since the level of expertise and work needed just to estimate the likelihood of a zone C result is huge. Even if a clinical laboratory could perform this task, it makes no sense to require every clinical laboratory to do so.

One possibility is to have a standards group tackle such a task, although this too has limitations, as was shown for a (universal) control measure to prevent wrong site surgery.

Another possibility is to perhaps leverage resources beyond the clinical laboratory. For example, one could insist that before treatment for trophoblastic carcinoma, an hCG result should be confirmed either by performing a reference assay or perhaps by treating the sample and rerunning it. This requires an interaction between the clinical laboratory and clinicians.

So there are no easy answers to preventing severe, low frequency failures (that cause patient harm), but as discussed before, coming up with a sigma estimate for an hCG assay is also not the answer. Nor is doing nothing.


Should one focus on a failure in a procedure or the outcome of such a failure?

February 14, 2008

Withholding payment for adverse events is a financial incentive to promote patient safety. Whether this incentive makes financial sense is something I will comment on later or perhaps not at all. For now, my comments are about the policy as it recently appeared (1).

 

 

The authors suggest the following criteria to withhold payment.

·         Evidence demonstrates that the bulk of the adverse events in question can be prevented by widespread adoption of achievable practices.

·         The events can be measured accurately, in a way that is auditable.

·         The events resulted in clinically significant patient harm.

·         It is possible, through chart review, to differentiate the adverse events that began in the hospital from those that were “present on admission” (POA).

The problem is with the third bullet and can perhaps be illustrated by the following figure.

[Figure: FMEA and FRACAS]

In this figure FMEA events are shown by the dashed line.  The red dashed line is before FMEA. The green dashed line shows that after a successful FMEA, risk of failures has been reduced. FRACAS events are shown by the solid lines. The green line shows a reduction in the failure rate after FRACAS.

Keep in mind, for the dashed lines (FMEA), no failures have occurred, while for the solid lines, failures have occurred.

Now the policy defines a failure as an adverse patient outcome. One can view outcomes as the end of  an event cascade as in the next figure.

[Figure: error cascade]

Assume that event C is an adverse patient outcome. According to the policy, payment is withheld only when event C is observed. In the first figure, the relevant concern area is shown by the ellipse as it is assumed that these are all high severity (severe patient harm) events.

This policy therefore excludes the following cases:

All FMEA events. That is, a procedure with a correctable high risk will be excluded from this policy because the event has not yet occurred. Consider the case of the Duke transplant error (2), before it happened. One can infer that this was a high risk procedure that would have benefited from a FMEA. In essence, this policy waits for disasters to happen.

All near miss events. Consider the case of the patient who had an MRI (3). Blood pressure monitor tubing had to be disconnected for the MRI. After the procedure, the tubing was incorrectly connected to an IV line. Before air was delivered from the automated blood pressure monitor, a family member noticed that things didn’t look right and contacted a nurse, who corrected the problem. Thus, there was no adverse event.

All defective procedures that don’t result in severe patient harm. Consider a healthcare worker who violates hospital policy (at risk behavior according to Marx (4)), which results in a patient fall. In this case, the fall results in a minor injury.  This is an important case because the policy fails to properly reflect risk management principles.

For a procedure that has a problem (e.g., a failed event), one has to classify the severity of the failed event and its probability (FMEA) or frequency of occurrence (FRACAS). The severity is classified not necessarily by the failed event but by the effect of the failed event. The effect is itself an event and can be a spectrum of severities. In the case of a patient fall, there is a distribution of harm associated with the fall event – some falls will result in severe harm, some will result in minor harm. Traditionally, in risk management, if severe harm is possible, then severity is associated with severe harm, even if the probability of severe harm is low. In this sense, severity is equated with potential outcome, regardless of whether that specific outcome has occurred.

One also has to classify the probability (FMEA) or frequency of occurrence of the event (FRACAS). Here, assuming FMEA, one could choose between the probability of the failed event or the probability of the effect of the event (the adverse outcome). It is recommended to use the probability of the failed event, not the probability of the effect of the event. This is because one usually has control over the failed event and does not have control over the effect of the event.

Example: If a clinical laboratory provides a clinician with an erroneous result and the effect of that could be patient harm, the event is classified as severe. The probability is the probability of erroneous result, not the probability of patient harm, because patient harm is outside of control of the clinical laboratory (the clinician might not act on the result, might suspect it is erroneous and request it to be repeated, and so on).

Summary

This policy will miss many quality issues and deviates from traditional risk management.

References

  1. Wachter RM, Foster NE, Dudley RA. Medicare’s Decision to Withhold Payment for Hospital Errors: The Devil Is in the Details. The Joint Commission Journal on Quality and Patient Safety 2008;34:116-123, see http://psnet.ahrq.gov/resource.aspx?resourceID=6760
  2. See http://www.cbsnews.com/stories/2003/03/16/60minutes/main544162.shtml
  3. See http://www.ismp.org/newsletters/acutecare/articles/20030612.asp
  4. Marx, D. Patient Safety and the “Just Culture”: A Primer for Health Care Executives http://www.mers-tm.net/support/Marx_Primer.pdf


FMEA vs. FRACAS

January 4, 2008

I have previously compared FMEA and FRACAS, here. Another simple difference is:

(Successful) FMEA reduces risk.

(Successful) FRACAS reduces failure rates.

Now, one often hears about successful FMEAs. In my experience, these are not FMEAs; they are examples of FRACAS. An example is here. How can one tell that this is FRACAS and not FMEA? It’s simple: what is described is the reduction of a too high failure rate to a lower rate. With FMEA, the failure rate is zero because the event has not happened. What one does is to reduce the risk of this potential failure, from some amount to a lower amount. This is perhaps one of the reasons one does not hear too much about FMEA successes. As I said before, to say that something that has never happened is now even less likely to happen (due to FMEA) just isn’t too exciting.

To reduce failure rates is a good thing, and it is not a big deal to call this FMEA when it is FRACAS. However, it is simple to use the correct terms, and if one doesn’t, one might wind up neglecting to perform FMEA when it’s needed.


Central lines and FRACAS

December 7, 2007

One hears of FRACAS success stories (like the one below) and FMEA failure stories (like the wrong blood type organs transplanted at Duke). A reason one doesn’t hear of FMEA success stories is that to say that something that has never happened is now even less likely to happen (due to FMEA) just isn’t too exciting. FMEA success stories are often not cases of FMEA, they are FRACAS, since rate improvements are discussed. FRACAS failures – we tried something, it didn’t work – are not very interesting.

A recent article in The New Yorker (1) provides an example of a FRACAS success story.

In the article, there is no mention of FRACAS but many of the steps were followed. The issue was an infection rate in central lines that was too high. It is important that one can measure this rate. One knows how many central lines are used, infections manifest themselves, and their cause can be determined by culturing the lines. Some undercounting is possible but the rate seems fairly reliable.

The man behind the work, Dr. Peter Pronovost, first observed events for a month within the context of the process of placing central lines (e.g., process mapping). Errors in the process steps were identified. Since these steps were simple, such as washing hands, one could partly view these errors as non cognitive errors. This suggests a control measure such as a double check to prevent such “slips”. Actually, besides slips, there may have been some at-risk behavior (2). This is behavior that increases risk where risk is not recognized, or is mistakenly believed to be justified. The main control measure used was a checklist, with the addition of having nurses double check to see that the checklist steps were properly done. Then the rate was measured again and found to be considerably lower. All of this was published (3).

It was mentioned that an alternative control measure had been tried, namely using central lines coated with antimicrobials. This expensive control measure failed to provide a substantial reduction in infection rates. This illustrates that one must be open minded when selecting control measures. There is sometimes a bias towards fixing the “system” (e.g., with coated lines) rather than fixing a people issue (which often implies blame). Dr. Pronovost implemented some system control measures by getting the manufacturer of central lines to include drapes and chlorhexidine, items that should have been available at the bedside but often were not.

Another big part of this story is ongoing resistance towards implementing this control measure more widely, even after it has been shown to be effective and low cost. Any control measure can be viewed as a standard and standards are not very popular. People will argue “but our situation is different”, “ICUs are too complicated for standards”, and so on. Financial incentives (or disincentives) for standards (e.g., P4P) loom. Dr. Gawande goes on to say how complicated things are in an ICU, yet that is precisely where standards helped. A similar situation happened in anesthesiology in the late 70s and early 80s. (Here, critical incident analysis was used and is basically the same as FRACAS.) The error rate was too high, effective control measures were developed, and widespread implementation of the control measures took considerable effort. You can read about that story here.

References

1.       Gawande A. Annals of Medicine. The checklist. The New Yorker, Dec. 7th issue, 2007, see here (don’t know how long this link will work).

2.       Marx, D. Patient Safety and the “Just Culture”: A Primer for Health Care Executives http://www.mers-tm.net/support/Marx_Primer.pdf

3.       Pronovost P. et al. An Intervention to Decrease Catheter-Related Bloodstream Infections in the ICU. N Engl J Med 2006;355:2725-32.


ISO 14971 and Residual Risk

November 21, 2007

The last entry was about FMEA goals, yet, the word “goal” isn’t in ISO 14971. Maybe “goal” suffered the same fate as the word “mitigation” – banned from ISO. There is an implied goal in ISO 14971 - the residual risk must be acceptable. To recall, residual risk is the risk that remains after control measures have been taken. Here’s where things get a little tricky.

In cases where the residual risk is unacceptable, one is supposed to perform a risk benefit analysis to determine if benefits of the medical procedure performed by the device outweigh any possible residual risk.

To frame this discussion, consider two types of residual risk:

 

 

1.       A residual risk from a known issue, such as an interference, where eliminating this risk is not “practical”

2.       The overall residual risk from unknown issues. A certain amount of effort is used to search for risks (e.g., through FMEA, FTA, and FRACAS). At some point, more effort is considered not practical. Note: One can look at FDA recalls to see that unknown risks are often found in released products and lead to recalls (1).

Use of the word practical in ISO 14971 implies that in some cases, risk reduction is too expensive. This is not meant to be pejorative since everyone has limited resources.

In most cases in the standard, the cost benefit analysis is positioned as an analysis of the medical device’s clinical benefit to the patient vs. its risk. But ISO 14971 does point out an additional frame for the discussion.

“Those involved in making risk/benefit judgments have a responsibility to understand and take into account the technical, clinical, regulatory, economic, sociological and political context of their risk management decisions.”

To understand the issue, consider Type 1 diabetes as an example with the medical procedure being use of a home glucose meter. Because of risks 1 and 2 above, the glucose meter will fail and provide an erroneous result, albeit rarely. This is the current status and it is clear the benefit of the home glucose meter outweighs the risk (e.g., ADA recommendations to test for glucose). Yet, if one conducts a thought experiment and starts raising the frequency of (all) home glucose meter failures, simple decision analysis (2) still warrants use of the device. That is, measuring glucose, even if it occasionally (e.g., more often than rarely) gives an erroneous result, is better (clinically) than not measuring it.

If a company is working on a home glucose meter that provides an erroneous result too often (e.g., compared to existing meters), it will keep developing the meter until its failure rate is competitive. That is, there is a hierarchy of requirements for release for sale, and often the competitive requirements (features needed to sell the product, including quality) are more stringent than any medical need or regulatory requirement (3).

Would you pay 2.5 million dollars to go to Cleveland?

Richard Fogoros suggests that there is a limit to what we can spend for healthcare (4). To make this point, he says that if a plane could be built that made most crashes survivable, most people would not pay the astronomical ticket price.

So regulators could require lower failure rates (less risk), causing companies to invest more, which would result in higher healthcare prices. This is not done because it is unaffordable; hence the level of risk allowed is usually driven by competition. This is risk management, but it is not the clinical benefit risk analysis described in ISO 14971. It is financial risk management.

References

1.       See http://www.accessdata.fda.gov/scripts/cdrh/cfdocs/cfRES/res.cfm

2.       Krouwer JS. Assay Development and Evaluation: A Manufacturer’s Perspective, AACC Press, Washington DC, 2002, Chapter 3.

3.       Krouwer JS. Assay Development and Evaluation: A Manufacturer’s Perspective, AACC Press, Washington DC, 2002, pp 38-39.

4.       Fogoros RN. Fixing American Healthcare. Publish or Perish Press, Pittsburgh, 2007.


FMEA goals in healthcare

November 17, 2007

FMEA is now a common risk management tool used in healthcare. Here’s a quick test. If the words “minimal cut set” and “Petri net” don’t mean anything to you, then you probably don’t have a quantitative FMEA goal. The rest of this entry explains some things to know about goals.

A quantitative goal must also be measurable and realistic. For example, a goal for imprecision (reproducibility) for a clinical laboratory sodium assay might be 4% CV. One can measure performance against this goal using a variety of experiments, including those defined by standards such as the CLSI standard EP5A2.

FMEA deals with risk. Some common pitfalls about risk goals are:

·         A goal that an event should never happen. For example, the NQF (National Quality Forum) implies such by talking about “never events.” Risk is probabilistic and can never be zero. It is possible that an estimated risk is so low that in lay terms, it may be said to never be possible to occur but this lay usage is different from a formal quantitative assessment.

·         Too many goals. The NQF has a list of 28 “never events.” Virtually all of these cause serious patient harm. A goal could be restated in terms of patient harm, as the combination of risk from any of the 28 events.

·         The Institute for Healthcare Improvement (IHI) implies goals in terms of evaluating the RPN (risk priority number) before and after implementing control measures. Some problems here are:

o   One may improve this metric by reducing the risk of less severe events (without reducing risk of severe events)

o   A severe risk with the lowest (categorical) probability of occurrence may be ignored as a candidate for improvement, since its RPN won’t change, but there still may be a way to lower risk (and still have the same (categorical) probability of occurrence rank).
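A small numeric illustration of the first RPN pitfall, as a Python sketch. It assumes the common convention that RPN is the product of severity, occurrence, and detection scores on 1-10 scales; the two failure modes and their scores are invented.

```python
def rpn(severity, occurrence, detection):
    """Risk priority number as the product of three categorical scores
    (assumed 1-10 scales, a common but not universal convention)."""
    return severity * occurrence * detection


def total_rpn(modes):
    """Sum of RPNs over all failure modes."""
    return sum(rpn(*scores) for scores in modes.values())


# Invented (severity, occurrence, detection) scores before and after
# control measures that address only the minor failure mode.
before = {"severe mode": (10, 1, 5), "minor mode": (2, 8, 5)}
after = {"severe mode": (10, 1, 5), "minor mode": (2, 3, 5)}

print("total RPN before:", total_rpn(before))
print("total RPN after: ", total_rpn(after))
print("severe mode RPN unchanged:", rpn(*before["severe mode"]) == rpn(*after["severe mode"]))
```

The total drops from 130 to 80 even though the severe failure mode is untouched. The severe mode also shows the second pitfall: its occurrence score is already at the lowest category, so no real reduction in its probability can register in the RPN.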

Quantitative FMEA goals are possible and are used in the nuclear power industry although fault trees are used instead of FMEAs. Quantitative fault trees are evaluated among other ways using “minimal cut sets” and “Petri nets.”

A reasonable non quantitative goal for FMEA is to learn more about potential failure modes. However, one should realize that it is difficult to assess how much is learned.

It is easy to have a quantitative FRACAS goal because it is easy to measure failure rates from observed failures, before and after implementing control measures.


Why FRACAS is important for medical device manufacturers

November 10, 2007

I have commented before that FMEA (and FTA) are used to prevent potential errors and that FRACAS is used to prevent the recurrence of observed errors. FRACAS is easier than FMEA and FTA because for FRACAS:

·         no modeling is required with respect to enumerating the possible failure modes (errors) – one simply observes the errors

·         one can easily calculate a failure rate, which can also help  predict when a failure rate goal will be achieved

From a user’s perspective (e.g., medical device customer), it is of course more important to prevent errors than to prevent their recurrence (e.g., no melt down vs. preventing another melt down). However, if FRACAS is completed before release for sale, then the FRACAS activity of preventing the recurrence of observed errors is also preventing potential errors from the user’s perspective, because (again, from the user’s perspective) the clock is at zero – no errors have occurred yet because the system hasn’t been used. This is summarized in the following table.

Tool         Before release for sale                         After release for sale
             Errors are:    Control measures used to:        Effect of tool:
FMEA, FTA    enumerated     Prevent potential errors         Errors prevented
FRACAS       observed       Prevent recurrence of errors     Errors prevented

This does not mean that FMEA, FTA should be dropped. If a potential error has never been observed, one still must be sure that adequate control measures are in place.

So FRACAS is part of risk management in spite of the fact that it is not mentioned in ISO 14971.

Terms

FMEA – Failure Mode and Effects Analysis
FTA – Fault Tree Analysis
FRACAS – Failure Reporting And Corrective Action System
Failure Mode - Error


FDA Classes

October 28, 2007

Bob had a comment about my previous FRACAS post, which reminds me of something. In his comment, he refers to FDA device classes and says that Class II devices do not require as much rigor. FDA classes can cause some confusion because there are two types of classes - device classes and recall classes.

Device classes are class I, class II, or class III. It is class III that requires the most data and can “present a potential, unreasonable risk of illness or injury.”

Recall classes are also class I, class II, or class III. It is class I that is the most dangerous type of incident, one that “predictably could cause serious health problems or death.”

Can one get a class I recall for anything other than a class III device? I don’t know the answer to this question, but to a company it is somewhat beside the point. Recalls are expensive, regardless of what device class they belong to or what data the FDA requires, and are to be avoided (e.g., by using tools such as FRACAS).