BMJ. 2001 April 14; 322(7291): 879–880.
Copyright © 2001, BMJ
Any casualties in the clash of randomised and observational evidence?
No—recent comparisons have studied selected questions, but we do need more data
John P A Ioannidis, associate professor and chairman
Anna-Bettina Haidich, research fellow
Clinical Trials and Evidence-Based Medicine Unit, Department of Hygiene and Epidemiology, University of Ioannina School of Medicine, Ioannina 45110, Greece
Joseph Lau, professor
Division of Clinical Care Research, Department of Medicine, New England Medical Center, Tufts University School of Medicine, Boston, MA 02111, USA
Randomised controlled trials and observational studies are often seen as mutually exclusive, if not opposing, methods of clinical research. Two recent reports, however, identified clinical questions (19 in one report,1 five in the other2) where both randomised trials and observational methods had been used to evaluate the same question, and performed a head to head comparison of them. In contrast to the belief that randomised controlled trials are more reliable estimators of how much a treatment works, both reports found that observational studies did not overestimate the size of the treatment effect compared with their randomised counterparts. The authors argue that the merits of well designed observational studies may need to be re-evaluated: case-control and cohort studies may deserve more respect in assessing medical therapies, and large scale observational databases should be better exploited.1,2 The first claim flies in the face of half a century of thinking, so are these authors right?
The combined results from the two reports indeed show a striking concordance between the estimates obtained with the two research designs. A correlation analysis we performed on their combined databases found that the correlation coefficient between the odds ratios of randomised trials and the odds ratios of observational designs is 0.84 (P<0.001). This represents excellent concordance (figure). In fact, it is better than that observed when the results of small randomised trials and their meta-analyses were compared with the results of large randomised trials.3 To complicate matters, the concordance has been worse when the results of specific large randomised trials on the same topic were compared among themselves.3 Concato et al further observe that, for the five clinical questions they evaluated, the observational studies addressing each question yielded very similar odds ratios, whereas the results of the randomised trials were often very heterogeneous.2 Popular wisdom has it that a "gold standard" method should give more or less the same results when repeated several times, while a poor method would suffer from lots of variability. So should observational studies be the gold standard instead of randomised trials?
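The kind of correlation analysis described above can be sketched in a few lines. The odds ratio values below are hypothetical illustrations, not the data from the two reports; as is conventional, the correlation is computed on the log odds ratios, since odds ratios are multiplicative and skewed on the raw scale.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical summary odds ratios for six clinical questions,
# one estimate per design (these are illustrative values only).
rct_or = [0.5, 0.8, 1.2, 0.6, 1.0, 0.7]   # randomised trials
obs_or = [0.6, 0.9, 1.1, 0.5, 1.1, 0.8]   # observational studies

# Correlate on the log scale.
r = pearson_r([math.log(o) for o in rct_or],
              [math.log(o) for o in obs_or])
print(round(r, 2))
```

A high r here would indicate that the two designs rank and scale the treatment effects similarly across questions, which is the sense in which the two reports found "concordance".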
Such a thought would be anathema to most clinical trialists.4 A closer inspection of the data suggests several caveats. Firstly, in six of 25 comparisons the 95% confidence interval of the summary effect from observational studies does not include the summary point estimate of the randomised trials. Moreover, in three cases the pooled point estimates are in opposite directions (one suggests harm, the other benefit); in two more cases one pooled odds ratio estimate is exactly 1.00 while the other documents benefit. So perhaps the concordance is not all that perfect, depending on how one looks at it.
Secondly, variability may be a blessing and not a nuisance. Variable results in randomised trials suggest that these trials have indeed managed to study diverse patient populations and treatment circumstances where the efficacy of a treatment may differ.5 Observational studies may tend to amalgamate large populations and reach average population-wide effects where there is less variability but where it is also more difficult to discern which patients are likely to benefit from an intervention.