Equivalent QC

March 17, 2005

If you are unfamiliar with equivalent QC, see references 1-2. This essay explores some issues with equivalent QC.


What causes errors in assays and the role of QC

The figure above shows four generic types of error and how QC can prevent two of them. The four error types are explained below:

random patient interference – is an interfering substance or mix of substances that causes a bias in the (patient) result and is often different (e.g., apparently random) in each patient specimen. The incorrect results are repeatable on re-assay. For some or many patient specimens, there may be no observed interferences. QC does not detect this error.

random bias – is any short term bias such as a clog in an analyzer that lasts for a few samples and is not specific to a particular patient specimen. The incorrect results are often not repeatable on re-assay (because the bias has disappeared). QC probably won’t detect this error since the probability of the error occurring during a QC sample is low. Note: The “clog in analyzer” is a case of an error that may be detected by an internal monitoring system. In this example, the error has not been detected by the internal monitoring system.

long-term bias – is any bias such as most calibration error that lasts for at least a day and is thus detected by routine quality control. The “clog in an analyzer” failure could also last for more than a day. This definition is somewhat arbitrary, since some calibration error is short term (e.g., blood gas systems are calibrated more frequently than once a day).

imprecision – comprises all biases that are very short term (they occur in less time than one assay result and are modeled as random error), plus longer-term uncompensated biases (for example, drift). Note that imprecision as typically measured in clinical chemistry assays is apparent random error; that is, true random error plus uncompensated biases such as drift. QC can detect poor imprecision.

The effect of various QC schemes on detecting these errors:

Error source                  Increased QC          Current (2 per day)    Reduced QC
Random patient interference   No effect             No effect              No effect
Short-term bias               Catches more errors   Catches fewer errors   Catches even fewer errors
Long-term bias                No effect             No effect              Catches fewer errors1
Imprecision                   No effect             No effect              No effect

1For example, if a system is calibrated weekly and there is a calibration error, running QC monthly will frequently miss this error.

Internal monitoring systems

The rationale behind the reduction in QC frequency is the assertion that internal monitoring systems adequately detect errors and prevent incorrect results from being reported. Here are some problems with that assertion.

Calibration is hard to control through internal monitoring – It is unlikely that any internal monitoring system can detect all calibration problems. The whole basis of calibration is to associate an assay’s response (signal) with a known concentration. This sets up a calibration equation. Then, for each unknown (patient sample), the response that is found is assigned a concentration according to that equation.

Although there can be limits set on the expected calibrator response, as well as checks on the shape of the response curve, there is no real way to prevent all other errors. The result can be a calibration bias, which QC can detect.
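To make this concrete, here is a minimal sketch of the calibration argument. The two-point linear model and all of the numbers are illustrative assumptions, not data from any particular analyzer: a degraded calibrator produces a signal that still passes a plausibility check, yet biases every patient result computed from the resulting calibration equation.

```python
# Sketch: two-point linear calibration with a hypothetical calibrator
# problem.  All concentrations and signals are made-up values.

def fit_two_point(conc, signal):
    """Return (slope, intercept) mapping signal -> concentration."""
    slope = (conc[1] - conc[0]) / (signal[1] - signal[0])
    intercept = conc[0] - slope * signal[0]
    return slope, intercept

true_conc = [0.0, 10.0]          # assigned calibrator concentrations
true_signal = [0.05, 1.05]       # expected instrument response

# Suppose the high calibrator degraded and reads 10% low in signal.
# The response is still within plausible limits, so an internal
# response-range check does not flag it.
observed_signal = [0.05, 0.95]

slope, intercept = fit_two_point(true_conc, observed_signal)

def to_conc(signal):
    """Apply the (biased) calibration equation to an unknown."""
    return slope * signal + intercept

# A patient sample whose true concentration is 5.0 gives signal 0.55
patient = to_conc(0.55)
print(round(patient, 2))  # ~5.56 instead of 5.00, about +11% bias
```

A QC sample with a known target of 5.0 would likewise read about 5.56, which is exactly how running QC exposes a calibration bias that the internal checks missed.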

Internal monitoring systems are models and can be wrong – An internal monitoring system is the result of a model of how the system can fail. These models are often based on fault trees and FMECAs. Mitigations are applied to detect and prevent errors through hardware and software. But there is no guarantee either that the model is correct (e.g., that all possible failure modes are included) or that the mitigations applied are 100% effective. In fact, experience has shown that assay development usually starts with a relatively large number of errors. Mitigations are repeatedly applied until a decision is made to release the product, and mitigations continue to be applied after product release. Of course, errors that affect patient results are classified as the most severe and are given the most attention. This process of repeatedly applying fixes (formally known as reliability growth management) is the most efficient way of developing complex instrument systems and is used because the knowledge required to “design things right the first time” doesn’t exist.

Another view of QC vs. internal monitoring systems

There is another fundamental difference between QC and internal monitoring systems. As stated above, internal monitoring systems are based on a model, whereas QC is largely observational. Observational means that, given reasonable quality control rules, one requires no knowledge about how the system can fail; one must only run QC. Put another way, you can forecast the weather with models (which can be quite sophisticated) or you can go outside. In terms of the equivalent QC issue, one could argue that one should have the best internal monitoring systems possible and also run QC to detect anything they missed.

The problem with the validation protocol

The suggested validation protocol is 2 QC samples per day for 30 days. One failed QC sample that does not repeat is allowed. One can show that this proves, with 95% confidence, that the proportion of all QC failures is no more than 7.7% (see reference 3). This is “equivalent” – in Six Sigma terms – to a 2.9 sigma process. This is actually a best case, because one is not really interested in the QC samples but in the patient samples.
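The 7.7% figure can be reproduced with a short calculation. The sketch below is a minimal illustration (it is the standard Clopper-Pearson binomial bound, not code from reference 3): it finds the largest failure proportion consistent, at 95% confidence, with at most 1 failure in the 60 QC samples of the protocol, then converts that proportion to a sigma level using the conventional 1.5-sigma shift.

```python
from math import comb
from statistics import NormalDist

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def upper_bound(failures, n, confidence=0.95):
    """Clopper-Pearson upper confidence bound on the failure proportion:
    the largest p for which observing <= `failures` failures in n trials
    is still plausible at the given confidence level."""
    lo, hi = 0.0, 1.0
    for _ in range(60):  # bisection on p
        mid = (lo + hi) / 2
        if binom_cdf(failures, n, mid) > 1 - confidence:
            lo = mid  # p still plausible; the bound is larger
        else:
            hi = mid
    return (lo + hi) / 2

# 30 days x 2 QC samples/day = 60 trials, 1 failure allowed
p = upper_bound(1, 60)
print(round(p, 3))        # 0.077, i.e. 7.7%

# Six Sigma convention: sigma level = z(1 - p) + 1.5-sigma shift
sigma = NormalDist().inv_cdf(1 - p) + 1.5
print(round(sigma, 1))    # ~2.9
```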

Cost must always be considered

All of the above does not deal with cost. If cost did not enter into the equation, one would increase QC frequency, not decrease it. However, cost is important: running QC samples adds cost. The more lab tests cost, the fewer people can be tested, and this lack of information will increase morbidity and mortality. Yet if QC frequency is reduced, more errors may result, which also increases morbidity and mortality (see reference 4).

Conclusion

The proposal to reduce QC implies that QC is redundant with internal monitoring systems. I have suggested why this might not be the case. The cost-benefit tradeoff of equivalent QC must be addressed with data, and that does not mean asking each lab to answer the question for itself.

References

  1. http://www.cms.hhs.gov/CLIA/downloads/6066bk.pdf
  2. http://www.westgard.com/cliafinalrule7.htm
  3. Hahn GJ and Meeker WQ. Statistical intervals. A guide for practitioners. Wiley: New York, 1991, p. 104
  4. Krouwer JS. Assay Development and Evaluation: A Manufacturer’s Perspective. AACC Press, Washington DC, 2002, p. 6.

Why “latent errors” is not a good term

March 17, 2005

One occasionally hears the term “latent errors” in articles about error reduction techniques (1). The purpose of this essay is to explain problems with this term and to suggest alternatives.

Latent implies hidden. Berwick uses the term “latent failures” and equates this with “the little things that go wrong all the time”. These are misleading concepts. Consider an example. In a recent presentation, Astion presents some examples of latent errors and their effects (2).

  • Computers: the lack of an instrument interface is responsible for many active data-entry errors.
  • Staffing: one latent error (suboptimal staffing) leads to multiple active errors by staff who are forced to multitask.
  • Policy and procedure: a bad strategy for handling phone calls can lead to multiple errors.

Before analyzing one of these examples, consider a fault tree model of lab error. This is a hierarchical (“top down”) chart of errors that has the following properties:

  • The severe errors are at the top
  • Errors are connected through parent – child connections
  • The parent errors are the “effects” of the child errors
  • The child errors are the causes of the parent errors
  • The tree uses “gates,” which include:
      o or gate – any child event in that branch that occurs will cause the parent error
      o and gate – all child events in that branch must occur to cause the parent error
      o basic gate – the end (cause) of a tree branch
  • Each error is classified as to its:
      o severity
      o probability (likelihood of occurrence)

Consider the staffing error mentioned by Astion. One could postulate one of many branches of a fault tree containing this error:

Top – outlier (e.g., incorrect result) reported to clinician

    AND – assay has interference to lipemia

    AND – sample is lipemic

    AND – (visual) detection for lipemic sample failed

        OR – technician not available to perform test

            OR – technician called away

                BASIC – inadequate staffing

Translating the events of this tree into English (roughly): an outlier will be reported if the assay has an interference to lipemia, AND the sample is too lipemic, AND the step for visually examining the sample has failed. The visual examination step can fail for several reasons, one of which is that the technician does not perform it. That, in turn, has several causes, one of which is that the technician has been called away because staffing is inadequate (e.g., there is a problem somewhere else that should be handled by staff, but inadequate staffing prevents this).
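The branch above can be sketched as code. This is a minimal illustration with the event names taken from the tree (the helper functions are assumptions, not a real fault-tree library); in Boolean terms, an OR gate is any() over its children and an AND gate is all().

```python
# A minimal sketch of evaluating one branch of the fault tree above.

def and_gate(*children):
    """Parent occurs only if ALL child events occur."""
    return all(children)

def or_gate(*children):
    """Parent occurs if ANY child event occurs."""
    return any(children)

def outlier_reported(interference, lipemic, staffing_ok):
    """Evaluate the top event for one scenario of basic events."""
    technician_called_away = not staffing_ok            # BASIC cause
    technician_not_available = or_gate(technician_called_away)
    visual_detection_failed = or_gate(technician_not_available)
    return and_gate(interference, lipemic, visual_detection_failed)

# Inadequate staffing pulls the technician away, the lipemic sample
# is not caught, and an outlier is reported:
print(outlier_reported(True, True, staffing_ok=False))  # True

# With adequate staffing the visual check happens and the branch
# does not fire, even though the interference still exists:
print(outlier_reported(True, True, staffing_ok=True))   # False
```

The second call shows why a staffing problem deep in the tree is just as "active" as the top event: flipping that one basic event changes whether the outlier is reported.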

Severity

An outlier that is reported to a clinician is among the most severe errors that a lab can make. Every event in this branch of the tree has the same severity classification because the effect of any of these errors is the top level error. The importance (ranking) of these errors may be different because the probability of each of these errors may be different.

Because the severity is high, there is no reason to call any of these events “the little things that go wrong.” They are all severe events. There is also no reason to call them latent (e.g., hidden), or to call only the top-level error “active,” which implies that the lower-level errors are not active. Any of these events that occurs is active. Whether these events can be detected depends on the programs in place (3) to expose such errors.

To summarize what FMEA recommends:

  1. Flowchart the process
  2. Add process steps to a fault tree
  3. Add causes to each potential process step error
  4. Add FMEA information to each event
  5. Rank the errors
  6. Propose mitigations
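Steps 4 and 5 above can be sketched as follows. The severities and probabilities are illustrative assumptions, not data from the essay; with equal severities (as in the branch discussed above), the ranking is driven by probability alone.

```python
# A sketch of FMEA steps 4-5: attach severity and probability to each
# fault-tree event, compute a criticality score, and rank.

events = [
    {"name": "inadequate staffing",          "severity": 5, "probability": 0.10},
    {"name": "assay lipemia interference",   "severity": 5, "probability": 0.02},
    {"name": "visual detection step failed", "severity": 5, "probability": 0.05},
]

# All events in this branch share the top event's severity (see the
# Severity section), so probability alone decides the ranking.
for e in events:
    e["criticality"] = e["severity"] * e["probability"]

ranked = sorted(events, key=lambda e: e["criticality"], reverse=True)
for e in ranked:
    print(f'{e["name"]}: {e["criticality"]:.2f}')
```

In this (assumed) scenario, inadequate staffing ranks first and would be the first target for mitigation (step 6).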

Recommendation

Before deciding that one has thought of all possible causes for an error, one should consult a list of “QSEs” (Quality System Essentials) (3). These are generic activities that apply to virtually all aspects of a service. These activities will typically not appear in flowcharts because they are so pervasive that they would make flowcharts too complicated.

For an additional critique of reference 1, see reference 4.

References

  1. Berwick DM. Errors Today and Errors Tomorrow. N Engl J Med 2003;348:2570-2572.
  2. Astion M. Developing a Patient Safety Culture in the Clinical Laboratory. http://www.aacc.org/AACC/events/expert_access/2005/saftey/
  3. Application of a Quality System Model for Laboratory Services; Approved Guideline—Third Edition. GP26-A3. NCCLS: Wayne, PA, 2004.
  4. Krouwer JS. There is nothing wrong with the concept of a root cause. Int J Qual Health Care 2004;16:263.

See also additional references at the end of the systems not people essay.