Building and quantifying fault trees – an example – 10/2005

October 13, 2005

This example will show how a fault tree helps in completing a FMEA. The example will also demonstrate some quantification.

Introduction and starting point


hCG – human chorionic gonadotropin

HAMA – human anti mouse antibodies

FMEA is a “bottom-up” approach and a fault tree is a “top-down” approach. Both approaches are useful. A difficulty with FMEA is that its entries sit in a table, and unlike a fault tree, a table cannot express much of the structure that relates those entries. This example is an hCG blood test. The component being investigated is the reagent. The starting point for this section of the FMEA is:

Failure mode – outlier result
Failure cause – HAMA interference in assay

The questions are:

  • What is the failure effect?
  • What is the severity of the failure effect?
  • What is the frequency of occurrence of the root cause?

This gives the following FMEA fragment:

Component | Function | Failure effect | Failure mode | Failure cause | Severity | Freq. | RI
Reagent | Measure hCG through immunochemical reaction | ? | Outlier | HAMA interference | ? | ? | ?

A corresponding fault tree fragment is:

Outlier result
    OR HAMA interference in assay

Assume that this part of the FMEA is concerned with potential harm to the patient. There could be other effects of outliers too, such as customer complaints. An outlier result by itself does not inform one about severity with respect to patient harm, because patients are not directly connected to the assay. To assess the importance of the outlier, one must know what happens with outlier results.

There are many possibilities. Consider one, where the hCG value is elevated (falsely) leading to a diagnosis and treatment of trophoblastic carcinoma, when the patient does not have this condition.

The fault tree now looks like this:

Error – Patient harm
    OR – Outlier result
        OR HAMA interference in assay

In the VA scheme for severity (1), this would be severity 3 (severe injury, but not death).

What about the frequency of occurrence of the root cause? In this case, “HAMA interference” is treated as a discrete event: because of its design and formulation, an assay either interferes or it doesn’t. This assay interferes, so in principle the frequency of occurrence is “always”. But that would imply that every hCG sample assayed results in an outlier, which is not the case. What’s missing is that for an outlier to occur, the patient sample must also contain human anti mouse antibodies in sufficient quantity. So now the fault tree looks like the one below. Note that there are two AND events and that the original root cause, HAMA interference in the assay, has been changed from an OR to an AND gate. Both AND events have to occur for an outlier to result. Assume that 1% of patient samples have human anti mouse antibodies. Because the assay always interferes (100%), the frequency of occurrence for outliers is 1% x 100% = 1%.

Error – Patient harm
    OR – Outlier result: freq. 1%
        AND Human anti mouse antibodies in patient sample: freq. 1%
        AND HAMA interference in assay: freq. 100%
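The arithmetic behind this AND gate can be sketched in a few lines of Python. This is only an illustration under the assumption that the two input events are independent; the function name `and_gate` and the variable names are hypothetical and do not come from any fault tree package.

```python
def and_gate(*probabilities):
    """Probability that all input events occur, assuming they are independent."""
    result = 1.0
    for p in probabilities:
        result *= p
    return result

# Values assumed in the text
p_hama_in_sample = 0.01    # 1% of patient samples contain human anti mouse antibodies
p_assay_interferes = 1.0   # this assay design always interferes (100%)

p_outlier = and_gate(p_hama_in_sample, p_assay_interferes)
print(f"Outlier frequency: {p_outlier:.1%}")
```

If a different assay interfered for only some antibody levels, the second input would drop below 100% and the outlier frequency would fall in proportion.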

However, in a real lab, results are reviewed before they are reported, so this step must be added. The result review can be considered as an error, detection, recovery scheme.

HAMA assay interference is a known problem with immunoassays and there are methods to detect it, which may be performed for certain assay results according to the lab’s rules. Assume that detection is successful 75% of the time (in detecting errors due to HAMA interference). Recovery means that the assay will be repeated to eliminate the interference and the new result reported to the clinician. Be aware that recovery is not always 100% effective; it can fail. Assume that in this case it is 99% effective. What is the rate of reported outliers, given these assumptions? In this case,

  • assume there are 10,000 reported results per year
  • the 1% outlier rate gives 100 outliers
  • of the 100 outliers, 74 (74%) will be detected and corrected through recovery (75% x 99% is about 74%) and will no longer be an issue, while 26 (26%) will be reported to clinicians as incorrect results (for this example, numbers have been rounded)

This gives:

Error – Patient harm
    OR – Outlier result reported: freq. 26 per year
        OR – Result review fails: EDR Sequence*
            OR – Outlier result
                AND Human anti mouse antibodies in patient sample
                AND HAMA interference in assay

*EDR = error, detection, recovery
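The error, detection, recovery arithmetic can be checked with a short Python sketch. It uses only the assumed values stated in the text (10,000 results per year, a 1% outlier rate, 75% detection, 99% recovery); the variable names are illustrative, and 1% of 10,000 results is 100 outliers.

```python
reported_results_per_year = 10_000
outlier_rate = 0.01   # 1% of results are outliers caused by HAMA
p_detect = 0.75       # detection succeeds 75% of the time
p_recover = 0.99      # recovery succeeds 99% of the time once an error is detected

outliers = reported_results_per_year * outlier_rate

# An outlier is corrected only if it is both detected and successfully recovered.
corrected = outliers * p_detect * p_recover
reported_wrong = outliers - corrected

print(f"Outliers per year:        {outliers:.0f}")
print(f"Detected and corrected:   {corrected:.2f}")
print(f"Reported to clinicians:   {reported_wrong:.2f}")
```

Rounded to whole results, this corresponds to 74 corrected outliers and 26 incorrect results reported per year.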

Two more things must still be taken into account. First, the outlier result must fall into a specific region of a (Parkes type) error grid (2) to cause this level of patient harm. Second, the clinician must act on the incorrect result.

The error grid reflects the fact that outliers (e.g., large errors) that don’t cross medical decision limits are not as dangerous as errors that do cross medical decision limits. In addition, the clinician has the opportunity to question the result and (for any reason) not act on it. If the clinician does not act, there may be no patient harm, and any harm that does occur is not caused by the outlier. Assume that 5% of reported outliers fall in the dangerous region and that the clinician acts on the incorrect result 50% of the time:

26 x (outlier percent in dangerous region) x (percent clinician acts on incorrect result) =

26 x (5%) x (50%) = 0.65 events per year
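As a quick check of this product, and of what it means as an interval between events, here is a minimal Python sketch. The 5% and 50% figures are the assumed values from the text; the variable names are illustrative.

```python
outliers_reported_per_year = 26   # incorrect results that pass result review
p_dangerous_region = 0.05         # share of outliers in the dangerous error grid region
p_clinician_acts = 0.50           # share of incorrect results the clinician acts on

harm_events_per_year = (
    outliers_reported_per_year * p_dangerous_region * p_clinician_acts
)
years_between_events = 1 / harm_events_per_year

print(f"Harm events per year: {harm_events_per_year:.2f}")
print(f"Average years between events: {years_between_events:.2f}")
```

A rate of 0.65 per year works out to one event roughly every year and a half, which is somewhat more often than once in two years.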

This gives the final fault tree for this cause and effect:

Error – Patient harm (1): frequency of occurrence about 0.65 per year (slightly more than once in two years)
    AND – Clinician acts on incorrect result (2)
    AND – Outlier falls into dangerous region of error grid (3)
    AND – Outlier result reported (4)
        OR – Result review fails: EDR Sequence* (5)
            OR – Outlier result (6)
                AND Human anti mouse antibodies in patient sample (7)
                AND HAMA interference in assay (8)
                OR – Other causes

*EDR = error, detection, recovery

Note that there are other possible causes for the outlier to occur (the bottommost OR gate), which would raise the frequency of this type of patient harm, but these causes are distinct from HAMA interference. Also, the original question about the frequency of occurrence of the root cause is answered by the frequency of occurrence of its effect.

Thus, the fault tree has helped to inform the FMEA. An outlier has many possible failure effects. The one studied here causes serious harm to the patient (severity 3) and is estimated to occur slightly more than once in two years, which in the VA frequency scheme is the second highest frequency of occurrence. It’s hard to imagine this level of analysis with only a FMEA table.

Component | Function | Failure effect | Failure mode | Failure cause | Severity | Freq. | RI
Reagent | Measure hCG through immunochemical reaction | Unneeded, harmful treatment | Outlier | HAMA interference | 3 | 3 | 9

Further discussion

This fault tree could still be considered simplified, and of course all of the numbers have been made up. Note, however, that 12 cases have been reported in which unnecessary treatment was carried out due to incorrect hCG results caused by HAMA interference (3).

A quantification of an entire fault tree (or a large subsection) requires algorithms which are available only in advanced (and expensive) fault tree software. This software is warranted in these cases, provided that one has good input data.
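To make the idea of quantifying a tree concrete, the following Python sketch evaluates AND and OR gates recursively. It is a toy illustration, not a substitute for fault tree software: it assumes every basic event is independent and appears only once in the tree, and it does not compute minimal cut sets or handle dependent events. The `Node` class, the `evaluate` function, and the flattened tree below are hypothetical constructions that use the assumed values from this example.

```python
from dataclasses import dataclass, field
from math import prod

@dataclass
class Node:
    name: str
    gate: str = "BASIC"          # "AND", "OR", or "BASIC"
    probability: float = 0.0     # used only when gate == "BASIC"
    children: list = field(default_factory=list)

def evaluate(node):
    """Return the probability of a node, assuming independent basic events."""
    if node.gate == "BASIC":
        return node.probability
    child_probs = [evaluate(child) for child in node.children]
    if node.gate == "AND":
        return prod(child_probs)
    if node.gate == "OR":
        # At least one child occurs: 1 minus the chance that none occurs.
        return 1.0 - prod(1.0 - p for p in child_probs)
    raise ValueError(f"Unknown gate type: {node.gate}")

# Flattened version of the example, per assayed result
outlier = Node("Outlier result", "AND", children=[
    Node("Antibodies in patient sample", probability=0.01),
    Node("HAMA interference in assay", probability=1.0),
])
harm = Node("Patient harm", "AND", children=[
    outlier,
    Node("Result review fails", probability=1 - 0.75 * 0.99),
    Node("Outlier in dangerous region", probability=0.05),
    Node("Clinician acts on result", probability=0.50),
])

p_harm_per_result = evaluate(harm)
events_per_year = p_harm_per_result * 10_000   # assumed 10,000 results per year
print(f"Harm events per year: {events_per_year:.3f}")
```

The result is about 0.64 events per year, slightly below the 0.65 in the text because the text rounds the number of reported outliers to 26 before multiplying.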

The fault tree helps to suggest risk mitigations. For this example, among the possible lab risk mitigations are:

  • One should of course try to select an assay which has been shown to have no HAMA interference, or, if there is interference, one in which it affects only a smaller subset of patients (e.g., those with much higher levels of human anti mouse antibodies).
  • One could try to improve the detection success percentage. If this were 95%, for example, the rate of patient harm would be reduced to 0.15 events per year (about once in 6.7 years).
  • Not mentioned in the fault tree is the interface between the lab and clinician, which also represents a lab risk mitigation opportunity. That is, the clinician’s focus is on patient care, and lab personnel focus on lab assays. There would be a benefit in the lab being aware of the actions clinicians take in response to lab results, so that a feedback loop could be added to the detection scheme.
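The effect of a better detection rate can be explored with a small Python sketch. It reuses the assumed values from the example (100 outliers per year, 99% recovery, 5% in the dangerous region, 50% clinician action); the function name `harm_events_per_year` and its parameters are illustrative. Unlike the text, it does not round the number of reported outliers, so its figures differ slightly in the second decimal place.

```python
def harm_events_per_year(p_detect, outliers_per_year=100, p_recover=0.99,
                         p_dangerous=0.05, p_acts=0.50):
    """Expected harm events per year for a given detection success rate."""
    # Outliers that escape review: not both detected and recovered
    reported_wrong = outliers_per_year * (1 - p_detect * p_recover)
    return reported_wrong * p_dangerous * p_acts

for p_detect in (0.75, 0.95):
    rate = harm_events_per_year(p_detect)
    print(f"detection {p_detect:.0%}: {rate:.3f} events/year, "
          f"about once in {1 / rate:.1f} years")
```

Changing the other keyword arguments in the same way shows how each mitigation (a less susceptible assay, better recovery, closer contact with clinicians) moves the final rate.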

A manufacturer’s risk mitigation would require an expanded fault tree, with causes listed for the HAMA interference. This also illustrates the concept of not enumerating causes when they are not relevant. That is, a lab may know possible reagent causes for HAMA interference, but if the lab must use a manufacturer’s assay without reagent modification, these causes are not relevant.

Finally, note that although a risk mitigation (or initial analysis) may result in a very tiny frequency of occurrence (e.g., once in 1,000 years), it still won’t be zero.

Building fault trees using a top down approach

This example was for illustration purposes only: it built a fault tree from a FMEA, which, while possible, is not how a fault tree is typically built. Normally, one would use a top-down approach, and the end result would be the same. Starting from the top, the main error types are:

Lab error
    OR Complaints
    OR hazards
        OR harm to patient
        OR harm to operator
    OR others

Expanding the harm to patient:

Lab error
    OR Complaints
    OR hazards
        OR harm to patient
            OR outlier result
            OR patient ID mix up

Expanding the outlier event, with the help of a process flowchart:

Lab error
    OR Complaints
    OR hazards
        OR harm to patient
            OR outlier result
                OR Interference
                    OR HAMA interference
                OR Random noise

Continuing with this tree would give the same results as above. Note that the process flowchart does not help with all parts of the fault tree.
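The top-down expansion can also be kept as a small nested data structure and printed as an outline. The Python sketch below is one possible representation, using the node names from the trees above; the tuple layout and the `render` function are hypothetical choices made for illustration, and only the OR relationships shown in this section are encoded.

```python
# Each node is (name, list of child nodes); all gates in this outline are OR gates.
tree = ("Lab error", [
    ("Complaints", []),
    ("hazards", [
        ("harm to patient", [
            ("outlier result", [
                ("Interference", [("HAMA interference", [])]),
                ("Random noise", []),
            ]),
            ("patient ID mix up", []),
        ]),
        ("harm to operator", []),
    ]),
    ("others", []),
])

def render(node, depth=0):
    """Return the tree as a list of indented outline lines."""
    name, children = node
    prefix = "" if depth == 0 else "    " * depth + "OR "
    lines = [prefix + name]
    for child in children:
        lines.extend(render(child, depth + 1))
    return lines

print("\n".join(render(tree)))
```

Expanding a branch is then a matter of adding children to the relevant node, which mirrors how each step above expands one event at a time.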


  1. The Basics of Healthcare Failure Mode and Effect Analysis, available at
  2. Parkes JL, Slatin SL, Pardo S, and Ginsberg BH. A new consensus error grid to evaluate the clinical significance of inaccuracies in the measurement of blood glucose. Diabetes Care 2000;23:1143-1148.
  3. Rotmensch S, Cole LA. False diagnosis and needless therapy of presumed malignant disease in women with false-positive human chorionic gonadotropin concentrations. Lancet 2000;355:712-5.