CLSI Evaluation Protocol guidelines often contain statistical procedures, and statistics is challenging for most people. One can think of CLSI documents as having three parts: the explanatory text, the examples, and the appendices. The text is often lacking in spite of many revisions, simply because statistical explanations are hard to follow. The justifications for some of the statistics are in the appendices – even harder to follow.
This leaves the examples as an important part of these guidelines. If one understands the examples, then one can carry out the procedure, even if some of the text can’t be followed. This has become somewhat less important with the introduction of CLSI’s StatisPro software, but some users may choose not to buy StatisPro, and StatisPro doesn’t cover all of the guidelines.
Examples can be completely made up, or they can use real data, which is much more useful. EP21 (total error) has two examples. One uses real data (LDL cholesterol) and has a few outliers. During the comment period for EP21, several people wanted to change or delete the example because of the outliers, but outliers happen in the real world. The second example is made up because I wanted normally distributed data, and although I worked for a manufacturer at the time, I couldn’t find a real example of normally distributed data.
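To see why made-up data looks so much neater, here is a small simulation (not from EP21; the numbers are hypothetical). It generates normally distributed method-comparison differences, then injects a few outliers of the kind real datasets like the LDL cholesterol example contain, and shows how the summary statistics change:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical method-comparison differences: "neat" normally
# distributed data, as in an idealized made-up example.
neat = rng.normal(loc=0.0, scale=5.0, size=200)

# The same data with a few outliers injected, closer to what a
# real dataset looks like, warts and all.
real = neat.copy()
real[:3] += np.array([40.0, -35.0, 50.0])

for name, d in [("neat", neat), ("with outliers", real)]:
    print(f"{name}: mean={d.mean():.2f}  SD={d.std(ddof=1):.2f}  "
          f"max |diff|={np.abs(d).max():.1f}")
```

With the neat data, the SD tells the whole story; with the outliers added, the SD inflates and any conclusion requires judgment about how to treat the extreme points.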
So the tradeoff is between a made-up example that neatly illustrates the statistical method with no-brainer conclusions and a real example – warts and all – that also illustrates the statistical method but doesn’t look very appealing, or leads to conclusions that require judgment.
The same issue occurs in EP27 (error grids), but it is much more intense there, because error grids require judgment in their creation, and this judgment can seem (and often is) arbitrary. But this is the real world. With glucose, for example, clinicians are still debating the location of the innermost zone of the error grid. The error grids in EP27 are real (blood lead, prothrombin time, and urine albumin). So now the comments complain that the error grids seem arbitrary, and the commentators would rather have a neat, made-up example that is an abstraction of an error grid, with clearly defined clinical consequence zones. But that is not the real world and won’t help anyone.
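The mechanics of an error grid are simple; the hard part is exactly the judgment described above. A minimal sketch, with zone boundaries that are entirely hypothetical (not from EP27 or any published grid), makes the point: the code is trivial, and all the clinical judgment lives in the two threshold numbers.

```python
# Minimal error-grid sketch. The relative-error cutoffs below are
# hypothetical placeholders, NOT clinically validated boundaries:
# choosing them is precisely where the judgment (and debate) lies.
def classify(reference: float, measured: float) -> str:
    """Assign a (reference, measured) result pair to a consequence zone."""
    rel_error = abs(measured - reference) / reference
    if rel_error <= 0.10:   # hypothetical: no clinical consequence
        return "A"
    if rel_error <= 0.25:   # hypothetical: minor clinical consequence
        return "B"
    return "C"              # hypothetical: serious clinical consequence

pairs = [(100, 104), (100, 118), (100, 140)]
print([classify(r, m) for r, m in pairs])  # → ['A', 'B', 'C']
```

In a real grid the boundaries would also vary with the reference concentration (the debate over glucose’s innermost zone is exactly an argument about where such lines belong), which is why an abstract example with tidy zones teaches little.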