Bias in Laboratory Medicine Standards

October 22, 2010

Inspired by a blog that I read, here is my contribution. CLSI, a clinical laboratory standards group, writes laboratory standards that have been adopted by the FDA; hence these standards can be thought of as quasi-regulatory.

CLSI standards attempt to provide a balanced view between industry, government, and hospital laboratories. The balance is provided by having approximately equal numbers of representatives from each of these three groups on the committees that write standards and on the board of directors. However, in the area in which I have been involved – the committee that writes statistical (Evaluation Protocol) standards – many industry members are not statisticians, nor are they from R&D or manufacturing – they are from regulatory affairs. Unfortunately, these regulatory affairs members often take an obstructionist role when a standard is perceived as providing information that is not to their company’s liking. This is bias number 1.

An example is the standard EP11, which is about uniformity of claims and was canceled by the CLSI board of directors even though it had been approved by its committee. EP11 would have provided a consistent way for manufacturers to state performance claims. The obstructionists said it was no longer needed because it had been superseded by other documents (which was false). An example of a poor claim is in EP7, the standard about interferences, which states that if a substance causes less than a 10% bias, it can be claimed that it doesn’t interfere. This is of course an incorrect statement, and it suppresses information. EP11 would have changed this.

Bias number 2 is that there is an industry trade group – AdvaMed – which can mobilize the industry members to influence standards. There are no such groups for hospitals or government. AdvaMed did mobilize to influence the demise of EP11.

Bias number 3 has to do with CLSI membership fees. Manufacturers pay up to 70 times more in fees than a hospital laboratory. With this much money, an unhappy manufacturer can influence standards by threatening to drop its membership.

So the balanced way of producing standards is not so balanced; it is dominated by industry.

Rare errors need to be evaluated differently than frequent errors

October 15, 2010

A paper about error grids has been rejected twice and one of the stumbling blocks seems to be that risk management and a method comparison are proposed to be used to evaluate error grid performance.

To recall, an error grid (see CLSI EP27P) is a way to specify and evaluate assay performance. One plots the values for a candidate and comparison method on an XY plot where zones describe the amount of patient harm possible. In a simple error grid, the closest zone to the identity line demarcates no harm from minor harm. The outermost zone demarcates minor harm from major harm. Error grids are well known in glucose monitoring and little used elsewhere.

To evaluate the performance of an assay with respect to the innermost zone, one performs a method comparison and calculates the percentage of points that fall within the zone. One can also calculate confidence limits. This result tells one (for good assays) that most assay results will cause no patient harm while a few assay results may or will cause minor patient harm.
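The counting step and its confidence limits can be sketched in a few lines of Python. This is only an illustration: the zone rule (candidate within 15% of the comparison value), the function names, and the data pairs are all made up for the example, and the Wilson score interval is one common choice of confidence limits, not necessarily the one a given standard prescribes.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a binomial proportion."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)

def in_inner_zone(comparison, candidate, rel_limit=0.15):
    # Hypothetical zone rule: candidate within +/-15% of the comparison value.
    # Real error grids define zones clinically, often as polygons.
    return abs(candidate - comparison) <= rel_limit * comparison

# Made-up (comparison, candidate) pairs, for illustration only.
pairs = [(100, 104), (80, 95), (150, 149), (60, 61), (200, 228), (120, 118)]
inside = sum(in_inner_zone(x, y) for x, y in pairs)
low, high = wilson_interval(inside, len(pairs))
print(f"{inside}/{len(pairs)} in zone, 95% CI {low:.2f} to {high:.2f}")
```

With a real method comparison of about 100 samples, the same calculation gives a usefully narrow interval for the innermost zone, because most results are expected to fall there.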

But this method won’t work for the outermost zone. The reason is that the expected number of results that will cause serious patient harm is extremely low – often less than 1 in a million. A method comparison experiment almost always uses a small number of samples (around 100 and often fewer), and while it is expected that no results fall in the outermost zone, observing none does not prove much. To demonstrate with high confidence by means of a method comparison experiment that no results will fall in the outermost zone, one would have to run millions of samples, and this is impractical.
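To put rough numbers on this, here is a minimal sketch, assuming independent samples and a simple binomial model (both assumptions, as are the function names). If zero results fall in the outermost zone in n samples, the one-sided upper confidence bound p on the true rate solves (1 - p)^n = 1 - confidence.

```python
import math

def upper_bound_zero_events(n, confidence=0.95):
    """One-sided upper bound on the event rate when 0 events are seen in n trials."""
    return 1 - (1 - confidence) ** (1 / n)

def samples_needed(max_rate, confidence=0.95):
    """Smallest n for which 0 events in n trials bounds the rate below max_rate."""
    return math.ceil(math.log(1 - confidence) / math.log1p(-max_rate))

# What 100 clean samples can claim, and what a 1-in-a-million claim would take.
print(upper_bound_zero_events(100))
print(samples_needed(1e-6))
```

Under these assumptions, 100 samples with no results in the outermost zone only supports a claim that the rate is below roughly 3%, while supporting a rate below 1 in a million takes about 3 million samples.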

If one simply stops at the 100 sample method comparison, then one has demonstrated performance with respect to no harm and minor harm, and nothing about serious harm. What can one do? Perform risk management on the assay (FMEA – Failure Mode and Effects Analysis – and fault trees) to demonstrate that the most serious potential errors have mitigations in place to reduce risk to an acceptable level (as low as possible within financial constraints).

So why is risk management so difficult for clinical chemists? Perhaps because most current performance requirements specify limits for only 95% of the results. This is unfortunate since it is the remaining 5% of results which can cause serious harm. Error grids address this problem by specifying limits for 100% of the data.