Protocols, Family Feud, and data analysis

If the audience in the TV game show Family Feud were made up of scientists and were asked, “What is the most common question about protocols?”, the number one answer would be, “What should my sample size be?” This is what I am usually asked, and my first response is, “What is your goal?” That is usually met with a blank stare because there are no goals, so determining goals becomes the first task.

With goals and a proposed protocol in hand, another key question is how best to analyze the data. Often this is asked because the protocol document is required to include a statistical analysis plan. This question is harder to answer (in terms of protocol inclusion) because the real answer is that the statistical methods to be used are whatever it takes to find out everything that’s going on in the data. Spelling that out would require a book. For example, you perform a regression with data that look slightly non-linear, so you try quadratic and cubic terms. There are also a few points that appear to be influential, so you repeat everything excluding all possible combinations of those points. And that’s just the start.
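To give a feel for how quickly that regression example grows, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the data are made up, the function names (polyfit_ls, rss, fit_all_variants) are hypothetical, and the two "suspect" points are simply chosen by hand. It fits linear, quadratic, and cubic models with every possible subset of the suspect points excluded.

```python
from itertools import combinations

def polyfit_ls(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations.
    Returns [b0, b1, ...] for the model b0 + b1*x + b2*x^2 + ..."""
    n = degree + 1
    # Augmented normal-equation matrix [X'X | X'y]
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)]
         + [sum(y * x ** i for x, y in zip(xs, ys))]
         for i in range(n)]
    # Gaussian elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= f * a[col][c]
    # Back substitution
    coefs = [0.0] * n
    for i in reversed(range(n)):
        s = sum(a[i][j] * coefs[j] for j in range(i + 1, n))
        coefs[i] = (a[i][n] - s) / a[i][i]
    return coefs

def rss(xs, ys, coefs):
    """Residual sum of squares for a fitted polynomial."""
    return sum((y - sum(b * x ** k for k, b in enumerate(coefs))) ** 2
               for x, y in zip(xs, ys))

def fit_all_variants(xs, ys, suspects, degrees=(1, 2, 3)):
    """Fit each degree with every subset of suspect points dropped."""
    results = []
    for k in range(len(suspects) + 1):
        for dropped in combinations(suspects, k):
            kept = [(x, y) for i, (x, y) in enumerate(zip(xs, ys))
                    if i not in dropped]
            kx, ky = zip(*kept)
            for d in degrees:
                coefs = polyfit_ls(kx, ky, d)
                results.append({"dropped": dropped, "degree": d,
                                "coefs": coefs, "rss": rss(kx, ky, coefs)})
    return results

# Made-up data: a mildly curved line with two points perturbed by hand.
xs = [float(i) for i in range(10)]
ys = [2.0 + 0.5 * x + 0.05 * x * x for x in xs]
ys[3] += 1.5
ys[8] -= 2.0

results = fit_all_variants(xs, ys, suspects=[3, 8])
for r in results:
    print(r["dropped"], r["degree"], round(r["rss"], 4))
```

With k suspect points and three candidate models there are 3 × 2^k fits to look at, so even this toy case produces a dozen. None of that branching can sensibly be written into a protocol in advance.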

What goes in the report depends on where it’s going. If the report is internal to the company, then all of the warts in the data are discussed, possibly with some speculation as to causes. It’s up to the company to decide if these “warts” are important (see last post). If the report is going to an agency, then it contains what the agency requires and nothing more, and certainly no speculation.

Once, at Ciba Corning, a customer informed us that one of the blood gas analytes was being influenced by another blood gas analyte. And sure enough, a look at previous evaluation data showed the effect to be present. To me, this was a disaster. Our group had missed pointing out something that was going on in the data. Much earlier, when my job was to run a reference lab for Technicon where we assigned values to master lots of calibrators, I was the one pointing out things missed by the statistical group.


