Who influences CMS and CDC?

March 23, 2019

A recent editorial, published online in J Diabetes Science and Technology (The Need for Accuracy in Hemoglobin A1c Proficiency Testing: Why the Proposed CLIA Rule of 2019 Is a Step Backward), disagrees with the CLIA limits for HbA1c proposed by CMS and CDC. The proposed CLIA limits are ± 10%, whereas the NGSP limits are 5% and the CAP limits are 6%. Having read the Federal Register, I don’t understand the basis for the 10%.
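
To see what these percentages mean in practice, here is a minimal sketch that turns each limit into an acceptance window around a proficiency-testing target. The target of 7.0 (HbA1c in % units) and the function name are assumptions chosen for illustration; they do not come from the editorial or the Federal Register.

```python
def acceptance_window(target, limit_percent):
    """Return (low, high) bounds for results within +/- limit_percent of target."""
    half_width = target * limit_percent / 100.0
    return (round(target - half_width, 2), round(target + half_width, 2))


# Limits quoted in the post; the target value is a hypothetical example.
limits = {"NGSP": 5.0, "CAP": 6.0, "Proposed CLIA": 10.0}
target = 7.0

windows = {name: acceptance_window(target, pct) for name, pct in limits.items()}

for name, (low, high) in windows.items():
    print(f"{name}: a result passes if it falls between {low} and {high}")
```

Under these assumptions, the proposed CLIA window is twice as wide as the NGSP window for the same target.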

This reminds me of another CMS decree in the early 2000s – Equivalent Quality Control. Under this program, a lab director could run quality control for 10 days as well as the automated internal quality checks and decide whether the two were equivalent. If the answer was yes, the frequency of quality control could be reduced to once a month. This made no sense!

New statistics will not help bad science

November 27, 2018

An article in Clinical Chemistry (1) refers to another article by Ioannidis (2) with a recommendation to change the traditional level of statistical significance for P values from 0.05 to 0.005.

The reasons presented for the proposed change make no sense. Here’s why:

The first limitation is that P values are often misinterpreted …

If people misinterpret P values, then the remedy is better training, not a different threshold!

The second limitation is that P values are overtrusted, when the P value can be highly influenced by factors such as sample size or selective reporting of data. 

Any introductory statistics textbook provides guidance on how to calculate the proper sample size for an experiment. Once again, this is a training issue. The second part of this reason is more insidious. If selective reporting of data occurs, the experiment is biased and no P value is valid!
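
As a reminder of what that textbook calculation looks like, here is a minimal sketch using the common normal-approximation formula for comparing two means. The effect size (delta), the SD (sigma), and the 80% power are assumptions picked for illustration, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided, two-sample z-test.

    delta: smallest difference in means worth detecting (assumed)
    sigma: common standard deviation of the measurements (assumed)
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)           # quantile for desired power
    return math.ceil(2 * ((z_alpha + z_beta) * sigma / delta) ** 2)


# Same hypothetical experiment planned at the two significance levels.
n_05 = n_per_group(delta=0.5, sigma=1.0, alpha=0.05)
n_005 = n_per_group(delta=0.5, sigma=1.0, alpha=0.005)
print(n_05, n_005)
```

The sample size depends on the chosen significance level, so whichever threshold is used, the experiment still has to be planned for it.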

The third limitation discussed by Ioannidis is that P values are often misused to draw conclusions about the research.

Another plea for training. And how will changing the level of statistical significance prevent wrong conclusions?

Actually, I prefer confidence limits to P values, but they provide no guarantees either. A famous example by Youden showed that for 15 estimates of the astronomical unit made from 1895 to 1961, each confidence interval did not overlap its predecessor.


  1. Hackenmueller SA. What’s the Value of the P Value? Clin Chem 2018;64:1675.
  2. Ioannidis JPA. The proposal to lower P value thresholds to .005. JAMA 2018;319:1429–30.

Reviving an old accuracy hierarchy in clinical chemistry

September 3, 2018

Things that simplify are good, and I recently had occasion to review one of them: an article by Tietz, which is here. He describes a hierarchy of accuracy for clinical chemistry methods as follows:

Definitive method – methods that provide the highest accuracy, such as isotope dilution mass spectrometry

Reference method – documented methods that are not quite as accurate but are practical at a wider variety of sites. Often these are manual methods using protein-free filtrates

Field method – All of the commercial methods

Unfortunately, the ponderous and unhelpful metrology terminology now dominates and the clarity of Tietz has taken a backseat. For example, if one searches through VIM, the word definitive does not appear. But the word measurand is all over the place.


It’s hard to be a clinical chemist

March 25, 2018

What I mean by a clinical chemist is anyone associated with clinical chemistry, which includes people who work in hospitals and anybody who works for a manufacturer.

A recent example involves a blood lead assay, a product for which I consulted. As reported recently, the electrochemical method was at times giving the wrong answers. It was finally determined that a compound in the rubber stoppers of blood collection tubes was dissolving in blood and absorbing lead. Thus, nothing can be assumed – anything, including the blood collection tubes, can cause problems.

Commutability and déjà vu

March 18, 2018

Reading the series of articles and editorial in March 2018 Clinical Chemistry about commutability reminds me of my job that started almost 40 years ago at Technicon Instruments. My group, under the leadership of Dr. Stan Bauer, was responsible for putting the right values on calibrators for all of our assays. Back then, when customers complained that they weren’t getting the right result, the calibrator value was often blamed. I seem to recall that the customer even had the ability to choose a different value for the calibrator (we called the calibrator values “set points”).

In any case, what we did was as follows. We occupied space at the hospital of New York Medical College in nearby Valhalla (Technicon was in Tarrytown). We acquired patient samples that were no longer needed by the hospital and ran them on both our instruments and reference methods. Then, through data analysis, we assigned a calibrator value to the master lot of calibrator that would make the patient sample results from the Technicon method equal those obtained with the reference method. For some assays, such as bilirubin if I remember correctly, the calibrator contained a dye and thus no analyte at all! Suffice it to say that whereas commutability of our calibrators didn’t exist, the patient samples nevertheless came out right (same as reference method).
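
For readers who want the idea in concrete terms, here is a minimal sketch of one way such a value assignment could work. It is not a description of the actual Technicon procedure: it assumes a linear assay calibrated with a single calibrator and no intercept, so that changing the set point rescales every patient result by the same factor. The function name and all numbers are hypothetical.

```python
def assign_set_point(provisional_set_point, method_results, reference_results):
    """Rescale a provisional calibrator value so patient results match the reference.

    Assumes results are proportional to the set point (single-calibrator,
    zero-intercept assay). The factor is the least-squares slope through the
    origin of the reference results on the field-method results.
    """
    sxy = sum(m * r for m, r in zip(method_results, reference_results))
    sxx = sum(m * m for m in method_results)
    factor = sxy / sxx
    return provisional_set_point * factor


# Hypothetical patient samples run on both methods with a provisional set point.
method = [4.0, 8.0, 12.0]      # field method results
reference = [4.4, 8.8, 13.2]   # reference method results

new_set_point = assign_set_point(10.0, method, reference)
print(round(new_set_point, 3))
```

Note that nothing in this calculation requires the calibrator to behave like a patient sample, which is why a non-commutable calibrator can still give correct patient results.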

It was this data analysis work that turned me into a statistician. I enjoyed the work and was finding out properties of our Technicon assays that the biostatisticians had missed and some of these properties were critical in calibrator value assignment.

On another note, I was at a small company a few years ago on a sales call. As I was describing my background including Technicon, I asked the small group – anyone hear of Technicon? No one raised their hand.

Articles accompanied by an editorial

March 16, 2018

Ever notice how in Clinical Chemistry (and other journals), an editorial accompanies an article (or series of articles) in the same issue? The editorial is saying – hey! listen up people, these articles are really important. And then the editorial goes on to explain what the article is about and why it’s important. It’s the book explaining the book.

Misuse of the term random error

January 31, 2018


In clinical chemistry, one often hears that there are two contributions to error – systematic error and random error. Random error is often estimated by taking the SD of a set of observations of the same sample. But does the SD estimate random error? And are repeatability and reproducibility forms of random error? (Recall that repeatability = within-run imprecision and reproducibility = long-term (or total) imprecision.)

Example 1 – An assay with linear drift with 10 observations run one after the other.

The SD of these 10 observations = 1.89. But if one sets up a regression with Y = drift + error, the error term is 0.81. Hence, the real random error is much less than the random error estimated by the SD, because the observations are contaminated with a bias (namely drift). So here is a case where taking the SD of repeatability data doesn’t measure random error; one has to investigate further.
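
The comparison can be sketched in a few lines. The data below are hypothetical (they are not the observations behind the 1.89 and 0.81 quoted above); the sketch simply compares the raw SD with the residual SD after regressing the results on run order.

```python
import math
from statistics import mean, stdev

# Hypothetical data: 10 consecutive results on one sample, drifting upward.
order = list(range(1, 11))
results = [101.1, 100.4, 102.1, 103.4, 102.4, 103.4, 105.1, 103.7, 105.8, 105.6]

# Naive estimate of "random error": SD of the raw observations.
total_sd = stdev(results)

# Ordinary least-squares fit of result versus run order (the drift).
x_bar, y_bar = mean(order), mean(results)
sxx = sum((x - x_bar) ** 2 for x in order)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(order, results))
slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Residual SD (n - 2 degrees of freedom): the scatter left once drift is removed.
residuals = [y - (intercept + slope * x) for x, y in zip(order, results)]
residual_sd = math.sqrt(sum(r ** 2 for r in residuals) / (len(results) - 2))

print(f"SD of raw observations: {total_sd:.2f}")
print(f"Residual SD after removing drift: {residual_sd:.2f}")
```

A linear model is itself an assumption here; if the drift has another shape, the residual SD will still contain some bias.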


Example 2 – An assay with calibration (drift) bias using the same figure as above (OK, I used the same numbers, but this doesn’t matter).

Assume that in the above figure, each N is the average of a month of observations, corresponding to a calibration. Each subsequent month has a new calibration.

Clearly, the same argument applies. There is now calibration bias, which inflates the apparent imprecision, so once again the real random error is much less than what one measures by taking the SD.
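
When the individual observations within each calibration are available, one way to look past the calibration bias is to pool the within-calibration scatter instead of taking one SD over everything. The following is a minimal sketch with hypothetical data and labels, assuming each calibration shifts all of its results by a constant amount.

```python
import math
from statistics import mean, stdev

# Hypothetical results on the same sample, grouped by calibration.
by_calibration = {
    "calibration 1": [99.0, 100.0, 101.0, 100.0],
    "calibration 2": [102.0, 103.0, 104.0, 103.0],
    "calibration 3": [97.0, 98.0, 99.0, 98.0],
}

# Single SD over all results: mixes random error with calibration bias.
all_results = [x for group in by_calibration.values() for x in group]
total_sd = stdev(all_results)

# Pooled within-calibration SD: deviations are taken from each group's own mean,
# so a constant shift per calibration does not contribute.
within_ss = sum(
    sum((x - mean(group)) ** 2 for x in group)
    for group in by_calibration.values()
)
within_df = len(all_results) - len(by_calibration)
pooled_within_sd = math.sqrt(within_ss / within_df)

print(f"SD over all results: {total_sd:.2f}")
print(f"Pooled within-calibration SD: {pooled_within_sd:.2f}")
```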