Error Reporting Systems for Clinical Laboratories

November 17, 2006

A study on determining how to reduce specimen identification errors in a clinical laboratory was recently published (1) and is well worth reading. Although I suggest some improvements, the work in reference 1 contains the essentials required for improving quality – a formal error reporting system, measuring error rates, implementing mitigations, and determining mitigation effectiveness by re-measuring error rates.


As occurred with anesthesiology in the 70s/80s, improving patient safety requires two steps:

  1. determining how to reduce errors
  2. putting in place a policy to apply what has been learned in #1

FMEA (Failure Mode Effects Analysis) is used to prevent potential errors and FRACAS* (Failure Review And Corrective Action System) is used to prevent the recurrence of observed errors. Since there are many observed errors in hospitals, FRACAS plays a key role. This essay focuses on error reporting systems, a key element of FRACAS, with discussion of reference 1. *FRACAS is used here but has many other names.

Description of error reporting systems

The major attributes of an error reporting system are:

manual or automated input – Manual input means that human observers are inputting errors that they observe. In an automated system, computer programs (often with hardware) input all data without the need for observers. A manual example is a person observing a patient specimen without a label and entering that error. An automated input is failed quality control (QC) whereby the failure is automatically transmitted to a database.

error classification – One must decide what is and is not an error, how to group similar errors, and how to assign a frequency and severity to each error.

paper or electronic – A complete paper system requires only pencil and paper. Electronic systems often involve, besides computer guided input, use of databases and perhaps analysis and reporting systems. Hybrid systems are also common where errors are recorded on paper and then transferred to electronic storage.

manual or automated analysis and reporting – Given a set of error events, analysis and reporting can either be built in (automated) or performed as needed (manual). Automated analysis and reporting implies agreed upon techniques whereas manual analysis and reporting can differ each time they are carried out. With manual analysis, there can still be some automated reporting (e.g., a list of errors is reported independently of whether analysis has been carried out).

Reference 1 used a manual input, electronic system, with manual analysis and (some) automated reporting. Classification lacked a severity ranking.

Before describing some of these attributes in more detail, consider an advanced electronic system with automated input, analysis and reporting:

An Airbus 340 with 4 GE engines is an hour into its 11 hour flight from Hong Kong to New Zealand. Inside one of the engines, small bits of insulating skin peel off and fly out the back. The breached surface lets in cold air, which causes the temperature to drop. The pilots are unaware of this situation. Three hours of temperature data, recorded by thermocouples within the engine compartment, are uploaded to a satellite, which relays the information to a computer at a GE site near Cincinnati. This computer analyzes the temperature data, previous failure patterns, and the airplane’s maintenance records to correctly identify the problem as skin delamination in the engine’s thrust reverser. The airline is notified and when the plane lands, maintenance workers repair the problem without any delay to the schedule (2-3). Had the delamination been allowed to continue until it was noticed by visual inspection, the plane would have had to be taken out of service for a lengthy repair.

These systems are used to some extent by diagnostic device companies and show what is possible.

More on error reporting systems

Commercial error reporting systems – These systems are usually 100% manual, because electronic input of data requires customized programming that depends on factors the vendor cannot know ahead of time. Combined systems (e.g., manual + electronic) exist, often facilitated by having software developers on staff.

Classification of errors – Reference 1 describes 16,632 specimen identification errors out of 4.29 million specimens. To enable analysis, similar errors must be grouped together, which is the essence of classification. In reference 1, errors were grouped into 15 categories.
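As a quick check on the scale of the problem, the overall error rate implied by these two counts can be worked out directly. This is a minimal sketch; the function name is my own, and since the specimen total is only given as "4.29 million" the result is good to about three significant figures.

```python
# Counts reported in reference 1.
errors = 16_632
specimens = 4_290_000  # "4.29 million", so only three significant figures


def error_rate(n_errors: int, n_total: int) -> float:
    """Return the fraction of specimens with an identification error."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return n_errors / n_total


rate = error_rate(errors, specimens)
print(f"{rate:.2%} of specimens")                              # 0.39% of specimens
print(f"{rate * 1_000_000:.0f} errors per million specimens")  # 3877 errors per million specimens
```

An overall rate like this is only the starting point; the per-category rates are what the mitigations act on.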

Classification requires more than deciding into which bucket to place an observed error. It also involves classifying the frequency and severity of the error. Frequency is often simple – in reference 1, each patient specimen is one occurrence. Severity is more complicated and usually is decided ahead of time and is associated with the error category (e.g., one of the 15 categories in reference 1). There did not appear to be a formal classification of severity in reference 1 – it was discussed informally – which means that the criticality of error events (severity x frequency) can’t be calculated. This means that resources devoted to solving problems may not be optimized.
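To illustrate why a severity ranking matters, here is a minimal sketch of a criticality ranking. Everything in it is an assumption for illustration: the three category names, the counts, and the 1 to 5 severity scale are hypothetical and are not taken from reference 1.

```python
from dataclasses import dataclass


@dataclass
class ErrorCategory:
    name: str
    count: int     # frequency: number of observed events (hypothetical)
    severity: int  # assumed scale: 1 (minor) to 5 (potential patient harm)

    @property
    def criticality(self) -> int:
        # criticality = severity x frequency
        return self.severity * self.count


# Hypothetical categories and numbers, for illustration only.
categories = [
    ErrorCategory("requisition mismatch", count=500, severity=2),
    ErrorCategory("mislabeled specimen", count=300, severity=5),
    ErrorCategory("unlabeled specimen", count=400, severity=3),
]

# Rank so that the most critical category comes first.
ranked = sorted(categories, key=lambda c: c.criticality, reverse=True)
for c in ranked:
    print(f"{c.name}: count={c.count}, severity={c.severity}, criticality={c.criticality}")
```

With these made-up numbers the most frequent category ranks last once severity is taken into account, which is the kind of reordering that a frequency-only analysis cannot show.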

Continuing with reference 1, consider the events “mislabeled specimen” and “requisition mismatch”. The authors suggested that the mislabeled specimen is likely to be undercounted, since it might only be detected by a clinician, and that it is likely to be the most severe type of error (an informal severity classification). A requisition mismatch is likely to be detected by the laboratory. These two errors are examined by the following generic figure.

For either error, the event in the top box, “Specimen labeling error”, has occurred. The differences are:

  • The requisition mismatch will most likely be detected by the laboratory. This means that the effect of this error (wrong result reported) won’t occur. Not shown in this figure is another effect that is likely; namely a delay in reporting results, which has its own severity.
  • The mislabeled specimen is likely to remain undetected in the figure – hence the error effect of sending the wrong result to a clinician will occur (whether or not it is observed). This event may be detected by a clinician in a later error-detection-recovery sequence of steps. However, one could envision that many mislabeled specimen errors will never be detected. This is because they are not inherently detectable by the laboratory, and if patient A’s sodium of 140.4 mmol/L is mixed up with patient B’s sodium of 140.8 mmol/L, the clinician will never detect this error. In addition, a mislabeled specimen that causes patient harm may also not be detected if it is not traceable to the laboratory.

A fault tree would be helpful in fully describing specimen errors. A top level error event would be patient harm. Some of the above discussion would be seen graphically in a fault tree. For example, one cause of patient harm would be three events joined by an AND gate:

  1. undetected mislabeled specimen
  2. result very different from true result of patient (different as defined by a Parkes type glucose error grid)
  3. clinician uses result to make incorrect medical decision
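The AND gate above can be sketched numerically. This is only an illustration under stated assumptions: the three probabilities are hypothetical, and multiplying them assumes the events are independent, which may not hold in practice. The OR gate is included because a full fault tree would combine several such branches.

```python
from math import prod


def and_gate(probabilities):
    """Probability that all input events occur, assuming independence."""
    if not all(0.0 <= p <= 1.0 for p in probabilities):
        raise ValueError("probabilities must lie in [0, 1]")
    return prod(probabilities)


def or_gate(probabilities):
    """Probability that at least one input event occurs, assuming independence."""
    if not all(0.0 <= p <= 1.0 for p in probabilities):
        raise ValueError("probabilities must lie in [0, 1]")
    return 1.0 - prod(1.0 - p for p in probabilities)


# Hypothetical per-specimen probabilities, for illustration only.
p_undetected_mislabel = 0.001   # 1. undetected mislabeled specimen
p_result_very_different = 0.1   # 2. result very different from true result
p_wrong_decision = 0.5          # 3. clinician acts on the wrong result

p_harm_branch = and_gate(
    [p_undetected_mislabel, p_result_very_different, p_wrong_decision]
)
print(f"Probability of this branch of patient harm: {p_harm_branch:.1e}")
```

Because all three events must occur, reducing any one of the three probabilities reduces the branch probability in proportion.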

Finally, the likely undercounting of mislabeled specimens, which the authors of reference 1 recognize, is a problem that requires attention, although I can’t think of any solutions.

Training of observers – Any manual reporting system requires adequate training for observers (i.e., the people who input errors). This is not as simple as it might appear. The errors to be reported include human errors, and reluctance to report them can inhibit accurate reporting. The proper policies must be in place, such as those described by Marx (4).

It is often helpful to have periodic meetings where the most recent events are reviewed. The observers that input data make tentative classification decisions (often hurried). The purpose of these meetings is to resolve any misclassified data and in some cases create new classifications.

Analysis and reporting – There are a variety of possible analysis and reporting methods. Fundamental to any analysis are error rates, which are analyzed in reference 1 with respect to how mitigations affected rates. The data in reference 1 are also amenable to reliability growth methods (5), which require goals and permit prediction of when they will be reached.
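To show what predicting when a goal will be reached can look like, here is a minimal sketch that fits a simple power-law learning curve, rate = a × period^(−b), by least squares on log-log axes. This is one common form of learning curve and is not necessarily the method used in reference 5. The monthly rates and the goal are hypothetical and are not data from reference 1.

```python
import math


def fit_power_law(periods, rates):
    """Least-squares fit of rate = a * period**(-b) on log-log axes."""
    xs = [math.log(t) for t in periods]
    ys = [math.log(r) for r in rates]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    a = math.exp(y_mean - slope * x_mean)
    return a, -slope  # b is the negated log-log slope


def period_to_reach(goal, a, b):
    """Period at which the fitted curve falls to the goal rate."""
    if b <= 0:
        raise ValueError("error rate is not decreasing; goal will not be reached")
    return (a / goal) ** (1.0 / b)


# Hypothetical monthly error rates (errors per million specimens).
months = [1, 2, 3, 4, 5, 6]
monthly_rates = [4000, 3100, 2700, 2400, 2250, 2100]
goal_rate = 1500  # hypothetical goal

a, b = fit_power_law(months, monthly_rates)
print(f"fitted a={a:.0f}, b={b:.2f}")
print(f"goal of {goal_rate} predicted around month {period_to_reach(goal_rate, a, b):.1f}")
```

The prediction is an extrapolation, so it is only as good as the assumption that the error rate keeps following the same curve.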

Acknowledgement – Helpful comments were provided by Elizabeth A. Wagar, M.D., Laboratory Director, UCLA Clinical Laboratories.


  1. Wagar EA, Tamashiro L, Yasin B, Hilborne L, Bruckner DA. Patient Safety in the Clinical Laboratory: A Longitudinal Analysis of Specimen Identification Errors. Arch Pathol Lab Med 2006;130:1662-1668.
  2. Pool R. If it ain’t broke, fix it. Technology Review 2001;104:64-69.
  3. Krouwer JS. Assay Development and Evaluation: A Manufacturer’s Perspective. AACC Press, Washington DC, 2002, pp 93-94 (discusses some of the automated analysis methods).
  4. Marx D. Patient safety and the “just culture”: a primer for health care executives. Medical Event Reporting System for Transfusion Medicine 2001. Available at:
  5. Krouwer JS. Using a Learning Curve Approach to Reduce Laboratory Error. Accred Qual Assur 2002;7:461-467.

Unit-use devices, POC, and Quality Control

November 12, 2006

Unit-use devices have existed for many years. It may be useful to consider two types of unit use devices – those that are used in the main clinical laboratory and Point of Care (POC) devices. Since POC devices are often operated outside of the clinical laboratory, one challenge is that it is difficult for personnel outside the clinical laboratory to perform external quality control (QC). CMS proposed reducing the frequency of external QC (for any assay) – called equivalent QC – provided certain criteria were met (1).

There has been some confusion with respect to unit use and non unit use (called here continuous flow) devices. It is often suggested that external QC is of no value in unit use devices, because whatever the outcome of external QC with the unit use device, that specific device has been used up and the next specimen will see a new unit use device. There is in fact not that much difference between unit use and continuous flow devices. Consider external QC in four cases.

  1. Continuous flow device – The reagent lot is bad. External QC will detect the failure in all samples.
  2. Unit use device – The reagent lot is bad. If a clinical laboratory receives a shipment of unit use devices all from the same lot, which is likely, external QC will detect the failure in all samples.
  3. Continuous flow device – Random failures occur. External QC will not detect the failure in all samples.
  4. Unit use device – Random failures occur. External QC will not detect the failure in all samples.

There are some differences between what happens at the manufacturing plant and at the clinical laboratory. For either device type, the reagent is made and tested by the manufacturer, whereas recalibration occurs at the clinical laboratory only for continuous flow devices. Nevertheless, one should not think that procedures performed at a manufacturing plant are immune to problems or that the only issues that occur are due to shipping and storage.

It may be helpful to understand quality tools as related to failures in the clinical laboratory (for all devices). Failures may be considered to be of three types:

  • reliability – an error occurs preventing the device from reporting a result. This may be a hardware error such as a failed power supply or the result of a detection algorithm, whereby software has detected that there is something wrong with the response signal, so the result is suppressed. Note that reliability errors can be considered to be either persistent (failed power supply) or non persistent (isolated response signal problem).
  • persistent performance errors – an error in a result that repeats across several samples. For example a calibration error is persistent because each sample will have a bias until the system is recalibrated.
  • non persistent performance (random) errors – an error in a result that occurs in an apparently random fashion. Examples include interference in a patient sample or a noisy signal (and the noisy signal escapes the detection algorithm).

These failures may also be classified as:

Failure | Result reported | Result not reported | Potential patient harm
Reliability |  | x | Delay in obtaining results
Persistent performance error | x |  | Wrong results –> wrong medical decision
Non persistent performance error | x |  | Wrong results –> wrong medical decision

The following table shows the effectiveness of various quality tools to deal with failure types.

Failure | FMEA | FRACAS | External QC | Internal QC
Reliability | x | x |  | x
Persistent performance error | x | x | x | x
Non persistent performance error | x | x |  | x

FMEA=Failure Mode Effects Analysis, FRACAS=Failure Review And Corrective Action System

One tool that has been omitted is attribute (also called acceptance) sampling. This technique can detect non persistent errors (both reliability and performance), but it is impractical for the clinical laboratory because guaranteeing with high confidence that a high proportion of a lot of materials will not exhibit non persistent errors usually requires very large sample sizes.

This can be shown using the hypergeometric distribution. However, the use of this distribution could be questioned since it involves knowledge of lot attributes that clinical laboratories are unlikely to have. The binomial distribution is a good approximation. For example, if one sampled 10 units and found 0 defectives, one could only guarantee with 95% confidence that no more than 25.9% of units are defective. To obtain better results, one has to sample many more units (see this post).
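The binomial calculation behind the 25.9% figure can be sketched as follows. With 0 defectives found in n sampled units, the upper bound p on the defect fraction at a given confidence satisfies (1 − p)^n = 1 − confidence. The function names are my own, and the target fractions in the sample size examples are chosen only for illustration.

```python
import math


def max_defect_fraction(n_sampled: int, confidence: float = 0.95) -> float:
    """Upper bound on the defect fraction when 0 defectives are found
    in n_sampled units (binomial approximation)."""
    return 1.0 - (1.0 - confidence) ** (1.0 / n_sampled)


def sample_size_for(max_fraction: float, confidence: float = 0.95) -> int:
    """Smallest sample size with 0 defectives that supports the claim
    'no more than max_fraction of units are defective'."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - max_fraction))


print(f"{max_defect_fraction(10):.1%}")  # 25.9%
print(sample_size_for(0.01))             # 299
print(sample_size_for(0.001))            # 2995
```

Claiming that no more than 1% of units are defective already requires about 300 units with no defectives found, which shows why this approach is impractical for a clinical laboratory.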

The table above shows the importance of internal QC, which, contrary to some recent suggestions, is not new but has been in virtually all systems since assays were automated. However, internal QC methods are largely proprietary and thus details are generally not known to clinical laboratory users.

FMEA and FRACAS represent tools that the clinical laboratory can carry out and are effective for all errors.

A final table shows how each of the quality tools works.

Quality Tool | Prevention | Detection | Recovery
FMEA | x | x | x
FRACAS | x | x | x
External QC |  | x | x
Internal QC |  | x | x

The meaning of prevention, detection, and recovery is explained in reference 2.


  1. See
  2. Managing risk in hospitals using integrated Fault Trees / FMECAs. Jan S. Krouwer, AACC Press, Washington DC, 2004.