Thoughts on the 510(k) process

February 25, 2010

FDA recently held an all-day conference about the 510(k) process. It was available via the web to people who didn’t attend (like me) and has since been archived. My reaction follows:

In-vitro diagnostic (IVD) devices (such as blood analyzers) are just one type of medical device and are quite different from most others, since IVDs don’t interact directly with patients. I don’t know the breakdown between IVDs and other devices.

Post-market surveillance was described as an issue that FDA would like to improve. I believe it is essential. For example, a typical assay may be cleared through the 510(k) process after running a few hundred specimens during evaluations, yet the number of patients assayed with that device over its life is likely to be many millions. Only a small sample is tested at the beginning of the device’s history. Here is what’s needed – I don’t know the best way of making it happen.

FDA needs to collect data on:

  1. A usage factor. For IVDs, perhaps the number of reported results for a particular assay, by manufacturer and model.
  2. All events, where an event is either:
    1. An error with the assay itself, or
    2. An adverse effect (for example, patient harm) connected to an error in the assay

FDA needs to provide guidance so that event reports contain the severity and frequency of each event, including criteria for how to classify events.
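To make this concrete, here is a minimal sketch of what structured event and usage records might look like. The field names, severity scale, and example classes are my own assumptions for illustration, not anything FDA has specified.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    """Hypothetical severity scale; real guidance would define the categories."""
    NO_HARM = 1        # e.g., system failure caught before a result was reported
    MINOR_HARM = 2     # erroneous result reported, no lasting patient effect
    SERIOUS_HARM = 3   # erroneous result contributed to patient harm


@dataclass
class AssayEvent:
    """One reported event for a specific assay, manufacturer, and model."""
    manufacturer: str
    model: str
    assay: str          # e.g., "glucose"
    severity: Severity
    description: str


@dataclass
class UsageRecord:
    """Usage factor: number of reported results for an assay by manufacturer and model."""
    manufacturer: str
    model: str
    assay: str
    results_reported: int
```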

FDA then needs to analyze these data to determine the error rate for:

  1. Assays that harm patients
  2. Assays that have other errors such as failure of the system without patient harm

FDA can then provide reports as to which assays and errors require improvement. This is how many IVD companies improve their products (through FRACAS, a failure reporting and corrective action process).
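Given records like those sketched above, the analysis step amounts to normalizing event counts by the usage factor and splitting them by severity. This is only a sketch of the idea; the grouping key and the choice of events per million reported results are assumptions.

```python
from collections import Counter


def event_rates(events, usage_records, per=1_000_000):
    """Events per `per` reported results, keyed by (manufacturer, model, assay, severity)."""
    usage = {(u.manufacturer, u.model, u.assay): u.results_reported
             for u in usage_records}
    counts = Counter((e.manufacturer, e.model, e.assay, e.severity) for e in events)
    rates = {}
    for (mfr, model, assay, severity), n in counts.items():
        reported = usage.get((mfr, model, assay))
        if reported:  # skip combinations with no reported usage
            rates[(mfr, model, assay, severity)] = per * n / reported
    return rates


# Example with made-up numbers: 12 serious-harm events against 3,000,000 reported
# glucose results would give a rate of 4 events per million results.
```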


EPCA-2 update number 4

February 20, 2010

Dr. Diamandis has published an opinion about EPCA-2 at: http://www.clinchem.org/cgi/reprint/clinchem.2009.140061v1 Since a subscription is required, here is the gist of Dr. Diamandis’s contribution. First, he mentions that Dr. Getzenberg has published a new article about EPCA-2 (1) (subscription also required). According to Dr. Diamandis, in this new article Dr. Getzenberg has dropped the claim that EPCA-2 distinguishes between organ-confined and non-organ-confined prostate cancer. Otherwise, the Prostate article validates Getzenberg’s other previous claims about the EPCA-2 marker. The abstract is at: http://www3.interscience.wiley.com/journal/122373646/abstract?CRETRY=1&SRETRY=0

Dr. Diamandis’s Clinical Chemistry article says that, according to ELISA assay principles and as now demonstrated experimentally, the Getzenberg assay was incapable of measuring EPCA-2 at the levels claimed, and thus the results could only be explained by some sort of bias.

So the story continues.

References

  1. Leman ES, Magheli A, Canon GW, Mangold L, Partin AW, Getzenberg RH. Analysis of a serum test for prostate cancer that detects a second epitope of EPCA-2. Prostate 2009;69:1188–94.

Wrong thinking about evaluating assays

February 3, 2010

I published a Letter to the Editor as well as an article about wrong thinking regarding (glucose) standards (subscription required for both). Here is a companion piece about evaluating assays.

Ideally, the goal in evaluating an assay is to determine the population of differences between the candidate assay and truth for the analyte over the life of the candidate assay. This is not directly attainable, because it would require assaying every patient sample with a definitive reference method. So one takes a small sample (say 100 patient samples) to estimate these differences, and one usually uses a comparative assay rather than a definitive reference assay.
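As a minimal sketch of what estimating these differences looks like, assuming paired results from the candidate and a comparative method are available. The data below are simulated, with an assumed bias and imprecision, purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical paired glucose results (mg/dL) for 100 patient samples.
# "comparison" is the comparative method standing in for truth; "candidate"
# is the assay under evaluation, simulated with an assumed bias and SD.
comparison = rng.uniform(50, 400, size=100)
candidate = comparison + rng.normal(loc=2.0, scale=8.0, size=100)

differences = candidate - comparison
print(f"average bias:      {differences.mean():.1f} mg/dL")
print(f"SD of differences: {differences.std(ddof=1):.1f} mg/dL")
```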

So far there is nothing wrong with the above, but here is where things go bad. In many cases, people run the evaluation experiment far from the way that the assay will be run routinely. This is unavoidable to a certain extent; for example, the results of an evaluation experiment are not sent to clinicians, because the assay is not yet in use. However, it is easy to perform the evaluation in ways that fail to match routine use. For example, a glucose meter designed to have nurses perform a fingerstick might instead be run with venous samples, perhaps because the fingerstick procedure would add error and one wishes to observe just the “analytical” properties of the assay. But the experiment then no longer answers the question set forth in the goal, because a potential source of error has been removed from the evaluation.

Another problem is how the results are analyzed. I have argued that the only meaningful analysis is an error grid analysis, yet other analyses persist, such as estimating total error by adding two times the imprecision (SD) to the average bias, or calculating six sigma metrics.
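To make the contrast concrete, here is a rough sketch of the competing analyses, continuing with the simulated `differences`, `comparison`, and `candidate` arrays from the sketch above. The total error and sigma-metric formulas are the conventional ones (|average bias| + 2 * SD, and one common form of the sigma metric, (TEa - |bias|) / SD); the error grid zone limits are invented for illustration and are not the published Clarke or Parkes limits.

```python
def total_error(differences):
    """Conventional estimate: |average bias| + 2 * SD of the differences."""
    return abs(differences.mean()) + 2 * differences.std(ddof=1)


def sigma_metric(differences, tea):
    """Six sigma metric for an allowable total error `tea` (same units as the data)."""
    return (tea - abs(differences.mean())) / differences.std(ddof=1)


def error_grid_zone(reference, result):
    """Toy zone classification by percent difference. The 10% and 25% limits
    are hypothetical, not the published Clarke/Parkes boundaries."""
    pct = abs(result - reference) / reference * 100.0
    if pct <= 10:
        return "A"    # clinically negligible error
    if pct <= 25:
        return "B"    # error unlikely to change treatment
    return "C+"       # error large enough to risk a wrong treatment decision


# Using the hypothetical data from the previous sketch:
print(total_error(differences))                 # one summary number
print(sigma_metric(differences, tea=15.0))      # assumes a 15 mg/dL allowable error
zones = [error_grid_zone(r, c) for r, c in zip(comparison, candidate)]
print({z: zones.count(z) for z in set(zones)})  # how many results fall in each zone
```

The point of the contrast is that the total error and sigma numbers collapse everything into a single summary statistic, whereas the error grid keeps track of how many individual results land in zones that could affect treatment.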

However, there is an even bigger issue. Say one runs 100 patient samples and the candidate assay is expected to be used for one million patient samples. This experiment samples 0.01% of that population. The issue is how to interpret the results of this 100-sample experiment. If the results are bad, then one should question the acceptability of the assay. However, if the results are good, one cannot say much. Again, the experiment should be done and it is nice to know the results are good, but more is needed.

To understand what else is needed, consider elements that definitely or probably have not been tested in the 100-sample experiment, using glucose as an example:

  • Different interfering substances (some may have been present) including extremes of hematocrit
  • Different lots of reagents, age of reagents, storage of reagents
  • Different environmental conditions (temperature, humidity)
  • Different operators with representative skill levels
  • Evaluating the software
  • Determining the percentage of times the assay fails to produce a result
  • And so on

There are two ways this information can be assessed. The first is for the manufacturer to perform special studies such as factorial experiments, software evaluation, FMEA, FRACAS, and so on.
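For example, here is a sketch of a small full-factorial design over a few of the factors listed above; the factor names and levels are my own assumptions, not a prescribed design.

```python
from itertools import product

# Illustrative factors and levels for a full-factorial evaluation study;
# chosen from the list of untested elements above.
factors = {
    "reagent_lot": ["lot_A", "lot_B", "lot_C"],
    "temperature": ["18 C", "23 C", "30 C"],
    "hematocrit":  ["low", "normal", "high"],
    "operator":    ["novice", "experienced"],
}

# Every combination of levels is one experimental condition:
# 3 * 3 * 3 * 2 = 54 runs before any replication.
conditions = [dict(zip(factors, combo)) for combo in product(*factors.values())]
print(len(conditions))   # 54
print(conditions[0])     # e.g. {'reagent_lot': 'lot_A', 'temperature': '18 C', ...}
```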

The second way is for the clinical laboratory to perform its own FMEA and FRACAS to deal with conditions in its laboratory, since the manufacturer cannot anticipate all laboratory procedures. And because roughly 85% of laboratory error is due to pre- and post-analytical error rather than analytical error, the effect of laboratory procedures should not be underestimated.

To summarize:

  1. The 100-sample evaluation (often with fewer samples) performed by the clinical laboratory is not much more than a cursory check to make sure nothing has gone wrong with the assay in the hands of the laboratory.
  2. The manufacturer performs most of the analytical validation of the assay and some (often simulated) user validation, with the FDA evaluating the results.
  3. The laboratory performs FMEA and FRACAS in the context of its own procedures.

CLSI documents that support this approach are EP27 (error grids) and EP18 (risk management).