The bad and the good in lab medicine

May 5, 2020

First, the bad. Well, bad is maybe too strong a word.


If you have ever read an ISO standard, you will notice that something is missing. There is no list of authors or committee members. The people who write the standard should be listed!

ISO 9001, ISO 15189

In the 90s, companies would display banners stating they were ISO 9001 certified. From the ISO website: “Using ISO 9001 helps ensure that customers get consistent, good-quality products and services, which in turn brings many business benefits.” But accreditation success is judged by the documentation that the organization has to show that it is following the processes that it has developed. A company could have poor quality, but if it can prove through documentation that it follows its processes, it will be accredited. The same applies to ISO 15189, the clinical laboratory version.

Krouwer JS. ISO 9001 has had no effect on quality in the in-vitro medical diagnostics industry. Accred Qual Assur, 2004;9:39-43.

ISO 15197

This standard describes the required accuracy for patients with diabetes who self-monitor their glucose with glucose meters. It came out in 2003 and was updated in 2013. The 2003 version allowed 5% of the results to have any difference from reference (hence 5% unspecified). The 2013 version reduced the unspecified amount to 1%. People with diabetes do a lot of testing. To have 5% unspecified meant that once a week you could get a result that could kill you from an ISO-acceptable glucose meter (2003 version). For the 2013 version, this was once a month. I was invited to attend an early meeting of the 15197 committee. It was not run by endocrinologists, but rather by regulatory affairs people from industry.
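The once-a-week and once-a-month figures are simple arithmetic. Here is a minimal sketch, assuming three tests per day (the testing frequency used elsewhere on this blog; the function name is hypothetical). Note that it counts results whose error is unspecified, which may or may not be dangerous in any given case.

```python
def days_between_unspecified_results(tests_per_day, unspecified_fraction):
    """Average number of days between results whose error is unspecified.

    tests_per_day is an assumption about the user, not part of the standard.
    unspecified_fraction is the share of results the standard places no
    limit on (0.05 for the 2003 version, 0.01 for the 2013 version).
    """
    expected_per_day = tests_per_day * unspecified_fraction
    return 1.0 / expected_per_day


# Assumed testing frequency: 3 tests per day.
days_2003 = days_between_unspecified_results(3, 0.05)  # about 6.7 days
days_2013 = days_between_unspecified_results(3, 0.01)  # about 33 days
```

With these assumptions, the 2003 version works out to roughly one unspecified result a week and the 2013 version to roughly one a month.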

Krouwer JS. Wrong thinking about glucose standards. Clin Chem, 2010;56:874-875.


I spent many years contributing to CLSI in the area of evaluations. This group is dominated by regulatory affairs people and it was always difficult to finish any evaluation standard. For example, when I became chairholder of the committee, I finally published a bunch of standards which had been sitting around for 14 years!

I thought Jim Westgard’s original idea about total error made a lot of sense, so I established and chaired a standard about total error (EP21). When it was time to revise the standard, it occurred to me that the original document was about “total analytical error.” I suggested that in the revision, we include pre- and post-analytical error. There was strong opposition to this and not just by the regulatory affairs people but also by hospital clinical chemists. After a while, the CLSI management threw me out of CLSI.

Clinical Chemistry (Journal)

There have been a lot of good things in the journal Clinical Chemistry, but here is one that is not so good. The journal will not accept a Letter to the Editor for review unless the Letter is about an original article. That means that about half of the content in the journal (case studies, opinions, editorials, and so on) is off limits. I asked the editor of the journal about this during a local AACC meeting, and his response went along these lines: we vet our articles very carefully and don’t wish to burden the journal with useless blather. My Letter to Clinical Chemistry published in 2010 about the ISO glucose standard would not be considered today.

The good

I like all sections of the AACC Artery. I look at it every day.

Flaws in the ISO 15197 standard (for glucose meters)

August 13, 2019

Having had occasion to read the ISO 15197 standard (for glucose meters), I noticed these statements:

One of the reasons allowed to discard data is: “the blood-glucose monitoring system user recognizes that an error was made and documents the details”

This makes ISO a biased standard because in the real world there will be user error which generates outlier data.

And compounding things is this statement:

“Outlier data may not be eliminated from the data used in determining acceptable system accuracy, but may be excluded from the calculation of parametric statistics to avoid distorting estimates of central tendency and dispersion.”

The problem is that outliers that are representative of what happens in the real world should not be thrown out to keep statistics such as regression and precision from being distorted. Rather, these statistics should not be used. An error grid is a perfectly adequate statistic to handle 100% of the data.

Stakeholders that participate in performance standards

May 11, 2019

Performance standards are used in several ways: to gain FDA approval, to make marketing claims, and to test assays that are in routine use after release for sale.

Using glucose meters as an example…

Endocrinologists, who care for people with diabetes, would be highly suited to writing standards. They are in a position to know the magnitude of error that will cause an incorrect treatment decision.

FDA would also be suited, with its statisticians, biochemists, and physicians.

Companies, through their regulatory affairs people, know their systems better than anyone, although one can argue that their main goal is to create a standard that is as little of a burden as possible.

So in the case of glucose meters, at least for the 2003 ISO 15197 standard, regulatory affairs people ran the show.

Just published

May 8, 2019

The article, “Getting More Information From Glucose Meter Evaluations” has just been published in the Journal of Diabetes Science and Technology.

Our article makes several points. In the ISO 15197 glucose meter standard (2013 edition), one is supposed to prepare a table showing the percentage of results in system accuracy within 5, 10, and 15 mg/dL. Our recommendation is to graph these results in a mountain plot – it is a perfect example of when a mountain plot should be used.
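For readers who have not seen one, a mountain plot is a folded empirical cumulative distribution of the differences from reference: sort the differences, compute each one's percentile, and for percentiles above 50 plot 100 minus the percentile. Below is a minimal sketch that computes only the plot coordinates. The percentile formula (rank - 0.5) / n is one common convention and is an assumption here, the function name is hypothetical, and the data are made up for illustration.

```python
def mountain_plot_points(differences):
    """Return (difference, folded percentile) pairs for a mountain plot.

    differences: meter minus reference values, in mg/dL.
    Percentiles use the (rank - 0.5) / n convention (an assumption).
    """
    ordered = sorted(differences)
    n = len(ordered)
    points = []
    for rank, diff in enumerate(ordered, start=1):
        percentile = 100.0 * (rank - 0.5) / n
        # Fold the upper half of the distribution back down to form the peak.
        folded = percentile if percentile <= 50.0 else 100.0 - percentile
        points.append((diff, folded))
    return points


# Hypothetical differences from reference (mg/dL), not real study data.
example_points = mountain_plot_points([4, -12, 20, 1, -3])
```

Plotting the folded percentile against the difference gives the mountain. The peak sits near the median difference, and the tails show how far the largest errors reach, which the 5, 10, and 15 mg/dL table cannot show.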

Now I must confess that until we prepared this paper, I had not read ISO 15197 (2013). But based on some reviewer comments, it was clear that I had to bite the bullet, send money to ISO and get the standard. Reading it was an eye opener. The accuracy requirement is:

95% within ± 15 mg/dL (< 100 mg/dL) and within ± 15% (> 100 mg/dL) and
99% within the A and B zones of an error grid

I knew this. But what I didn’t know until I read the standard is that user error from the intended population is excluded from this accuracy protocol. Moreover, even the healthcare professionals performing this study could exclude any result if they thought they made an error. I can imagine how this might work: That result can’t be right…

In any case, as previously mentioned in this blog, in the section where users are tested, the requirement for 99% of the results to be within the A and B zones of an error grid was dropped.

In the section where results may be excluded, failure to obtain a result is listed since if there’s no result, you can’t get a difference from reference. But there’s no requirement for the percentage of times a result can be obtained. This is ironic since section 5 is devoted to reliability. How can you have a section on reliability without a failure rate metric?
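A failure rate metric would be simple to tally. Here is a minimal sketch, assuming every attempted measurement is logged and a failed attempt is recorded as None. The function name and the data are hypothetical; nothing here comes from the standard, which specifies no such metric.

```python
def failure_rate_percent(attempts):
    """Percent of attempted measurements that produced no result.

    attempts: one entry per attempt; None marks a failure to obtain a result.
    """
    if not attempts:
        raise ValueError("no attempts recorded")
    failures = sum(1 for result in attempts if result is None)
    return 100.0 * failures / len(attempts)


# Hypothetical log of ten attempts (mg/dL), two of which gave no result.
example_rate = failure_rate_percent(
    [102, None, 95, 88, None, 110, 97, 101, 99, 93]
)
```

The point is that the denominator has to be all attempts, not just the attempts that produced a number.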

Summary of what’s wrong with the ISO 15197 2013 glucose meter standard

March 24, 2019

  1. Minimum system accuracy performance criteria (6.3.3) – I previously commented that the word “minimum” is silly. One either meets or does not meet the requirements. But the big problem is Notes 1 and 2 in this section, which say that the test is not to be carried out by actual users. Thus, the protocol is biased by excluding user error. In the section where users are included, the acceptance criteria (8.2) drop the requirement for 99% of the results to be within the A and B zones of an error grid. The requirement for 95% of the results to be within ± 15 mg/dL below 100 and within ± 15% above 100 remains. Thus 5% of the results are unspecified, same as the 2003 version. This means that people who test 3 times daily could get a dangerous error from their meter once a week even though the meter meets the ISO 15197 standard.
  2. Safety and Reliability Testing (Section 5) – A hallmark of reliability testing is the frequency of failures to obtain a result. There is nothing in this section (or elsewhere in the standard) that tallies the frequency of failed results or specifies limits for the percentage of failures. This makes no sense for a standard about a POC test that is needed emergently. Failure to obtain a result is a frequent event in the FDA adverse event database for glucose meters.
  3. If you want to see who wrote the standard, you can’t. As with all ISO standards, there is no list of authors or members who served on the committee.
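The 95% criterion in item 1 can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, the data are made up, a reference value of exactly 100 mg/dL is treated with the percentage limit (both limits give 15 mg/dL there), and the error grid requirement is not implemented.

```python
def within_iso_limits(meter, reference):
    """True if one result is within 15 mg/dL (reference below 100 mg/dL)
    or within 15% of reference (reference at or above 100 mg/dL)."""
    error = abs(meter - reference)
    if reference < 100:
        return error <= 15
    return error <= 0.15 * reference


def fraction_within_limits(pairs):
    """Fraction of (meter, reference) pairs that meet the limits.

    Every pair counts, including any outliers, so nothing is discarded.
    """
    hits = sum(1 for meter, reference in pairs if within_iso_limits(meter, reference))
    return hits / len(pairs)


# Hypothetical (meter, reference) pairs in mg/dL, not real study data.
example_fraction = fraction_within_limits(
    [(80, 90), (60, 90), (200, 180), (250, 200)]
)
```

A meter would need this fraction to be at least 0.95. The results in the remaining 5% can be off by any amount and the meter still passes.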

Minimum system accuracy performance criteria – part 2

February 13, 2019

I had occasion to read the ISO 15197:2013 standard about blood glucose meters, Section 6.3.3, “minimum system accuracy performance criteria.”

Note that this accuracy requirement is what is typically cited as the accuracy requirement for glucose meters.

But the two Notes in this section say that testing with actual users is covered elsewhere in the document (section 8). Thus, because of the protocol used, the system accuracy estimate does not account for all errors since user errors are excluded. Hence, the system accuracy requirement is not the total error of the meter but rather a subset of total error.

Moreover, in the user test section, the acceptance goals are different from the system accuracy section!

Ok, I get it. The authors of the standard want to separate two major error sources: error from the instrument and reagents (the system error) and errors caused by users.

But there is no attempt to reconcile the two estimates. And if one considers the user test as a total error test, which is reasonable (i.e., it includes system accuracy and user error), then the percentage of results that must meet goals is 95%. The 99% requirement went poof.


Minimum system accuracy performance criteria

February 13, 2019

I had occasion to read the ISO 15197:2013 standard about blood glucose meters and was struck by the words “minimum system accuracy performance criteria” (6.3.3).

This reminds me of the movie “Office Space”, where Jennifer Aniston, who plays a waitress, is being chastised for wearing just the minimum number of pieces of flair (buttons on her uniform). Sorry if you haven’t seen the movie.

Or when I participated in an earlier version of the CLSI method comparison standard EP9. The discussion at the time was to arrive at a minimum sample size. The A3 version says at least 40 samples should be run. I pointed out that 40 would become the default sample size.

Back to glucose meters. No one will report that they have met the minimum accuracy requirements. They will always report they have exceeded the accuracy requirements.