Randomized controlled trials (RCTs) are experiments in which a set of similar patients is *randomly* assigned to one of two treatments. Often, one treatment is a new experimental treatment and the other is a placebo. The new treatment is judged a success if there is a statistically significant difference between the two treatments with respect to the outcome measure, which is defined for the study in advance (for example, the 5-year survival rate). With historical data, patients who have already been treated in two or more ways are analyzed with respect to the outcome measure. Since these patients were not randomly assigned to treatment groups, one attempts to find groups of patients who are similar.

RCTs are widely regarded as the gold standard for assessing treatments. They are also advocated for comparing *existing treatments*, not just for establishing the efficacy of a new one. This contribution is limited to weighing RCTs against historical data for comparing existing treatments.

To make this discussion less abstract, an example is used, namely treatments for prostate cancer. One would like to compare the existing treatments on both success and side-effect outcome measures. A typical outcome would be the percentage of patients free from biochemical evidence of disease (as defined by PSA measurements) at 5 years. Side effects include incontinence, impotence, and others.

**Round 1** – Treatment selection – In an RCT, typically two or perhaps three treatments are selected. Many treatment categories can be further subdivided. For example, prostatectomy divides into the open procedure (itself subdivided by where the incision is made) and the laparoscopic procedure, which may or may not be robotic. If one studies only one type of prostatectomy, the results don’t apply to the other types; if one wishes to include all types, the trial becomes too large.

One must also consider patient eligibility. Typically, besides excluding patients for a number of reasons, patients are grouped into low, medium, and high risk categories. This again strains the sample size.

Consider historical data. Over 200,000 men are diagnosed with prostate cancer each year. Going back ten years, that is 2 million men. Provided the data for these 2 million men are accessible (or a subset that is still large), there will be sufficient sample size to compare different treatments, *including* treatment subcategories and patient risk groups.

**Round 2** – Randomization – Patients in an RCT are already a biased set. Consider an eligibility requirement: if the trial compares radical prostatectomy (RP) to external beam radiation therapy (EBRT), each patient must be eligible for either treatment. But suppose someone who needed therapy had had a previous stroke and was therefore not a candidate for surgery. This person would be excluded from the study, so the set of patients in the study is a *medical* subset of the general population of patients who need treatment. This subset is biased toward healthier patients, because surgery requires them. The study set is a subset in another way as well: some patients, after having the two treatments explained to them, may opt out of the study because they prefer one treatment. Thus the patients who remain in the study are indifferent to which treatment they will get, and indifference vs. preference may be important.

Historical data is more relevant to the real world. Some patients will have researched different treatments and selected the one they preferred; others will have accepted whatever their physician advised. The same biases as in the RCT will be present in historical data; that is, the stroke patient won’t get RP. But the effect of these biases can be explored through data analysis.

**Round 3** – Time – An RCT will, by definition, not provide results for 5 years (here the outcome measure is the 5-year rate of no biochemical evidence of disease).

If treatments have existed for a number of years, historical data will provide results as soon as the data analysis is complete.

**Round 4** – Cost – An RCT is expensive. Analysis of historical data isn’t.

**Round 5** – Reliability – Reliability can be measured by sample size and bias. Assume for a moment that the bias is the same in both cases. The following gives an idea of reliability.

RCT – assume 1,000 patients are treated in either of two ways and stratified into low, medium, and high risk. This means 500 patients per treatment and about 167 per risk group per treatment (the numbers would not, of course, be exactly equal in each case).

Historical data – assume 2 million patients over ten years with half of the data usable. This leaves 100,000 per year. Following the same logic as for the RCT gives roughly 16,700 patients per risk stratum per treatment per year. Using 5 years of data gives about 83,500.
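
The arithmetic above can be checked with a short calculation. The patient counts are the illustrative figures from the text, not real data, and the text rounds 16,667 up to 16,700 (hence 83,500 rather than 83,335):

```python
# Illustrative sample-size arithmetic from the text (not real data).

# RCT: 1,000 patients, 2 treatments, 3 risk groups.
rct_total = 1000
per_treatment = rct_total // 2              # 500 per treatment
per_risk_group = round(per_treatment / 3)   # ~167 per risk group per treatment

# Historical data: 2 million records over ten years, half usable.
usable = 2_000_000 // 2                     # 1 million usable records
usable_per_year = usable // 10              # 100,000 usable per year
per_cell_per_year = round(usable_per_year / 2 / 3)  # ~16,667/yr; text rounds to 16,700
five_year_cell = per_cell_per_year * 5      # ~83,335; text uses 83,500

print(per_risk_group, per_cell_per_year, five_year_cell)
```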

What is the 95% confidence interval for an 80% success rate for a treatment, from either the RCT or the historical data?

| Trial Type | Total N | Success N | Pct | Low CI | High CI |
|---|---|---|---|---|---|
| RCT | 167 | 134 | 80.24% | 73.38% | 85.99% |
| Historical data | 83,500 | 66,800 | 80% | 79.73% | 80.27% |
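
The interval widths can be reproduced approximately with a binomial confidence interval. The sketch below uses the Wilson score interval; the table's small-sample bounds appear to come from an exact (Clopper-Pearson) method and so differ slightly for the RCT row, while the large-sample bounds agree to rounding:

```python
import math

def wilson_ci(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score confidence interval for a binomial proportion (z=1.96 gives ~95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

# RCT cell: 134 successes out of 167 patients
print(wilson_ci(134, 167))        # roughly (0.735, 0.856)

# Historical-data cell: 66,800 successes out of 83,500 patients
print(wilson_ci(66_800, 83_500))  # roughly (0.7973, 0.8027)
```

The point of the comparison survives the choice of method: at n = 167 the interval is about 12 percentage points wide, while at n = 83,500 it is about half a point wide.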

**Round 6** – Exploring the data – Clearly, with such a large sample, one can explore the data in ways not possible with an RCT. For example, for RP, one can examine the outcome measure and each side effect as a function of the hospital type where the surgery was performed, the number of surgeries performed by the surgeon, the time of day the surgery was performed, and so on. This can be done for all types of surgery.
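
As a sketch of what such exploration might look like, the snippet below tabulates success rates within each level of a chosen factor. The records and field names (`hospital_type`, `success`) are invented for illustration; real analysis would run over the historical database:

```python
from collections import defaultdict

# Hypothetical patient records; field names are invented for illustration.
records = [
    {"hospital_type": "academic",  "success": True},
    {"hospital_type": "academic",  "success": True},
    {"hospital_type": "community", "success": False},
    {"hospital_type": "community", "success": True},
]

def success_rate_by(records, key):
    """Success rate of the outcome measure within each level of `key`."""
    totals, successes = defaultdict(int), defaultdict(int)
    for r in records:
        totals[r[key]] += 1
        successes[r[key]] += r["success"]
    return {level: successes[level] / totals[level] for level in totals}

print(success_rate_by(records, "hospital_type"))
# {'academic': 1.0, 'community': 0.5}
```

The same function could be applied to surgeon volume, time of day, or any other recorded factor, which is exactly the kind of post-hoc slicing an RCT's fixed design cannot support.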

**The winner** – In this example, historical data is the winner, provided the data exists. If it doesn’t, it is worth spending money to make the data available rather than spending money on an RCT.

**Added 1/4/09** – All of the above concerns two treatments that are thought to be equivalent. Consider instead the case where treatment A is generally considered superior to treatment B on the basis of historical or anecdotal data, but without a randomized clinical trial. Proponents of treatment B might insist that “the jury is still out” and that superiority can only be established by a randomized clinical trial. But it is unlikely that enough patients indifferent to which treatment they receive could be found, so the trial will never take place.