Two important articles have recently appeared on the subject of what we might properly describe as junk science. At the heart of the issue lies epidemiology which, whilst it has its uses, finds false positives more often than not. The problem is two-fold: there are too many epidemiologists chasing too few real associations and it is too easy to use statistics to 'prove' whatever you (or your funders) want to prove.
I would argue that the rot set in with the passive smoking studies (particularly after 1990). Certainly, the heart attack miracles represented the moment when epidemiology jumped the shark, but the corruption and folly were evident before that and have infected countless areas of research since. Epidemiology has been hopelessly debased and it is dragging the reputation of real science down with it.
Passive smoking is just one small part of this, but it is an important case study because it demonstrated that junk science would be tolerated if it was in the name of a 'good cause'. But by accepting nonsignificant relative risks of 1.10-1.30 (i.e. a 10-30% increase in risk), it opened a Pandora's box which could not easily be closed.
In Velvet Glove, Iron Fist, I quote John Brignell, whose book The Epidemiologists I warmly recommend. What he says about the EPA's 1992 secondhand smoke report has come to pass:
There is no doubt about it - every study of passive smoking, if evaluated on the basis of statistical probity, shows that it is harmless; but probity had been jettisoned. It was a deeply symbolic and decisive moment in time. Once you could get the world to accept a relative risk of 1.19 at a significance level of 10%, you could prove that anything caused anything. The scientific era that had started with Bacon four centuries earlier had come to an end and the world was ready to return to the rule of mumbo-jumbo.
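To put some illustrative numbers on what a '10% significance level' does, here is a minimal sketch. The relative risk of 1.19 is the figure Brignell refers to, but the standard error below is hypothetical, chosen only to show how the same point estimate can clear a 10% bar while failing the conventional 5% one.

```python
# Illustrative sketch: how one and the same relative risk estimate can be
# "significant" at the 10% level yet not at the conventional 5% level.
# The standard error below is hypothetical, chosen purely for illustration.
from math import log, exp

rr = 1.19          # point estimate of the relative risk
se_log_rr = 0.095  # hypothetical standard error of log(RR)

# two-sided critical z-values for 5% and 10% significance levels
for alpha, z in ((0.05, 1.960), (0.10, 1.645)):
    lo = exp(log(rr) - z * se_log_rr)
    hi = exp(log(rr) + z * se_log_rr)
    verdict = "significant" if lo > 1.0 else "not significant"
    print(f"alpha = {alpha:.2f}: RR {rr} (CI {lo:.2f}-{hi:.2f}) -> {verdict}")
```

With these invented numbers the 95% interval runs from 0.99 to 1.43 (not significant) while the 90% interval runs from 1.02 to 1.39 (suddenly 'significant'). Nothing about the data has changed; only the bar has been lowered.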
The problem is exacerbated by the bastardisation of the peer-review process. All too often, peer-review is a rubber stamp from like-minded people who sometimes seem not to have even read the studies they are approving. How else can we explain basic mathematical errors appearing in studies by Konrad Jamrozik and Stanton Glantz?
Dr Richard Smith, the former editor of the British Medical Journal, discussed the peer-review process on his BMJ blog last week in the wake of more junk science being exposed and came up with a radical solution.
Prepublication peer review is faith based not evidence based, and Sudlow’s story shows how it failed badly at Science. Her anecdote joins a mountain of evidence of the failures of peer review: it is slow, expensive, largely a lottery, poor at detecting errors and fraud, anti-innovatory, biased, and prone to abuse.
So rather than bolster traditional peer review at “top journals,” we should abandon prepublication review and paying excessive attention to “top journals.” Instead, let people publish and let the world decide. This is ultimately what happens anyway in that what is published is digested with some of it absorbed into “what we know” and much of it never being cited and simply disappearing.
My answer to this objection is that this happens now. Much of what is published in journals is scientifically poor—as the Science article shows. Then, many studies are presented at scientific meetings without peer review, and scientists and their employers are increasingly likely to report their results through the mass media.
Smith's solution is controversial. I tend to prefer James Le Fanu's idea of closing down every department of epidemiology. Making a training course on epidemiology and statistics compulsory for journalists who report on science wouldn't be a bad idea either.
Smith's idea of scrapping peer review might backfire. It could lead to more false positives being reported but, as he says, how much worse can things really get? It would at least stop defenders of junk science hiding behind the halo of peer review. Science would have to stand on its merits rather than relying on the appeal to authority.
The other important article appeared in Science News, discussing why most epidemiological findings are false.
It’s science’s dirtiest secret: The “scientific method” of testing hypotheses by statistical analysis stands on a flimsy foundation. Statistical tests are supposed to guide scientists in judging whether an experimental result reflects some real effect or is merely a random fluke, but the standard methods mix mutually inconsistent philosophies and offer no meaningful basis for making such decisions. Even when performed correctly, statistical tests are widely misunderstood and frequently misinterpreted. As a result, countless conclusions in the scientific literature are erroneous, and tests of medical dangers or treatments are often contradictory and confusing.

As I have argued before, the 95% confidence interval beloved of epidemiologists is nothing of the sort. It depends on all things being equal, which they never are when human beings are involved.
Replicating a result helps establish its validity more securely, but the common tactic of combining numerous studies into one analysis, while sound in principle, is seldom conducted properly in practice.
He's talking about meta-analysis, in which a bunch of shoddy, nonsignificant statistical associations are combined to manufacture a single significant finding. That is what the EPA did in the secondhand smoke report that Brignell mentions above.
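For readers unfamiliar with the mechanics, here is a bare-bones sketch of fixed-effect, inverse-variance pooling, the standard way such results are combined. The study numbers are made up. Each invented study is nonsignificant on its own (every 95% interval straddles 1.0), yet the pooled estimate squeaks past the threshold, because pooling shrinks the standard error.

```python
# Minimal sketch of fixed-effect (inverse-variance) meta-analysis on the
# log scale, with made-up relative risks and confidence intervals.
# Each individual study is nonsignificant (95% CI crosses 1.0), yet the
# pooled estimate comes out "significant".
from math import log, exp, sqrt

# Hypothetical studies: (relative risk, lower 95% CI, upper 95% CI)
studies = [
    (1.20, 0.85, 1.70),
    (1.15, 0.80, 1.65),
    (1.25, 0.90, 1.74),
    (1.10, 0.78, 1.55),
    (1.22, 0.86, 1.73),
]

weights_sum = 0.0
weighted_log_rr = 0.0
for rr, lo, hi in studies:
    se = (log(hi) - log(lo)) / (2 * 1.96)   # back out SE of log(RR) from the CI
    w = 1.0 / se**2                          # inverse-variance weight
    weights_sum += w
    weighted_log_rr += w * log(rr)

pooled_log_rr = weighted_log_rr / weights_sum
pooled_se = sqrt(1.0 / weights_sum)
pooled_rr = exp(pooled_log_rr)
ci_lo = exp(pooled_log_rr - 1.96 * pooled_se)
ci_hi = exp(pooled_log_rr + 1.96 * pooled_se)

print(f"Pooled RR = {pooled_rr:.2f} (95% CI {ci_lo:.2f}-{ci_hi:.2f})")
# With these invented numbers the pooled 95% CI excludes 1.0 (about
# 1.18, 1.01-1.38) even though no single study's interval does.
```

The technique is sound in principle, as the Science News piece says; the trouble starts when the individual studies being pooled are weak, biased or cherry-picked, because the pooling then manufactures precision the underlying data never had.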
“There is increasing concern,” declared epidemiologist John Ioannidis in a highly cited 2005 paper in PLoS Medicine, “that in modern research, false findings may be the majority or even the vast majority of published research claims.”

I am pleasantly surprised to hear that Ioannidis's study is "highly cited". It is, in my opinion, one of the most important articles ever written about epidemiology. I quoted it at length in the collection of extended footnotes I recently published as an addendum to Velvet Glove, Iron Fist. If you only read one article about the science of statistics, read that.
As Ioannidis writes:
A meta-analytic finding from inconclusive studies where pooling is used to “correct” the low power of single studies, is probably false if R ≤ 1:3.
That's your EPA and SCOTH meta-analyses right there (1.19 and 1.24 respectively for passive smoking/lung cancer).
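For anyone who wants to see the arithmetic behind that corollary, here is a short sketch of the positive predictive value formula from Ioannidis's paper, with pre-study odds R, type I error alpha, type II error beta and a bias term u. Plugging in the figures the paper uses for a meta-analysis of small inconclusive studies (80% power, pre-study odds of 1:3, bias of 0.4) gives a finding that is more likely to be false than true.

```python
# Positive predictive value (the probability that a claimed finding is
# actually true) as set out in Ioannidis (2005), "Why Most Published
# Research Findings Are False".
#   R     = pre-study odds that a probed relationship is true
#   alpha = type I error rate (conventionally 0.05)
#   beta  = type II error rate (1 - power)
#   u     = bias (the proportion of analyses that would not otherwise
#           have been "positive" but end up reported as such)
def ppv(R, alpha=0.05, beta=0.20, u=0.0):
    numerator = (1 - beta) * R + u * beta * R
    denominator = R + alpha - beta * R + u - u * alpha + u * beta * R
    return numerator / denominator

# The paper's example of a meta-analysis of small, inconclusive studies:
# 80% power, pre-study odds R = 1:3, bias u = 0.4.
print(round(ppv(R=1/3, beta=0.20, u=0.4), 2))   # ~0.41: probably false

# And the effect of chasing ever less plausible associations (no bias):
for R in (1/3, 1/10, 1/100):
    print(f"R = 1:{round(1/R)}  PPV = {ppv(R):.2f}")
```

Note how quickly the predictive value collapses as the pre-study odds fall: when a field is dredging through long lists of implausible associations, most 'positive' findings are false even before any bias is added.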
Ioannidis has much to say about the effect of bias. He makes the obvious, but rarely spoken, point that bias does not need to be financial, but can just as readily be ideological.
Prejudice may not necessarily have financial roots. Scientists in a given field may be prejudiced purely because of their belief in a scientific theory or commitment to their own findings.
It is surely difficult to argue that anti-smoking campaigners turned epidemiologists do not have an inherent bias, quite apart from the financial rewards associated with coming up with the 'right' result. Indeed, all the factors Ioannidis identifies as being likely to lead to false positives apply to secondhand smoke studies.
Corollary 1: The smaller the studies conducted in a scientific field, the less likely the research findings are to be true.
Corollary 2: The smaller the effect sizes in a scientific field, the less likely the research findings are to be true.
Corollary 3: The greater the number and the lesser the selection of tested relationships in a scientific field, the less likely the research findings are to be true.
Corollary 4: The greater the flexibility in designs, definitions, outcomes, and analytical modes in a scientific field, the less likely the research findings are to be true.
Corollary 5: The greater the financial and other interests and prejudices in a scientific field, the less likely the research findings are to be true.
Corollary 6: The hotter a scientific field (with more scientific teams involved), the less likely the research findings are to be true.
And, remember, even with all these biases, most passive smoking studies have not found a statistically significant association with lung cancer.
Much of what S. Stanley Young says in this podcast for American Scientist also applies, although most of his examples relate to diet, another area where junk statistics are endemic. He concludes that 90% of epidemiological findings are false. If he's right, it would turn the 95% confidence interval on its head. I don't think he's far off.
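Much of Young's argument turns on multiple testing: cross a food questionnaire with a list of health outcomes and you have generated hundreds of hypothesis tests, and at the 5% level chance alone will deliver a steady crop of 'findings'. A back-of-the-envelope sketch, with entirely hypothetical numbers, shows how a 90% false-discovery rate is not at all far-fetched.

```python
# Back-of-the-envelope sketch of the multiple-testing problem, using
# hypothetical numbers: a questionnaire of 60 foods crossed with 20
# health outcomes, where only a handful of associations are real.
tests = 60 * 20          # 1,200 food/outcome comparisons
alpha = 0.05             # conventional significance threshold
power = 0.8              # assumed power to detect a real effect
true_associations = 10   # assumed number of genuinely real effects

false_positives = (tests - true_associations) * alpha
true_positives = true_associations * power
reported = false_positives + true_positives

print(f"Expected false positives: {false_positives:.0f}")
print(f"Expected true positives:  {true_positives:.0f}")
print(f"Share of 'findings' that are false: {false_positives / reported:.0%}")
```

With these made-up but not implausible figures, nearly nine out of ten 'statistically significant' associations are flukes, which is roughly where Young ends up.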
“The problem is exacerbated by the bastardisation of the peer-review process.”
Read this article (the problem and the solution):
http://www.uic.edu/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/2642/2287
"All too often, peer-review is a rubber stamp from like-minded people who sometimes seem not to have even read the studies they are approving."
Maybe in the area you write about. However, ditching it as a process would result in more delightful incidents like this.
It's a flawed process, like most complex human systems, but it's still the least worst one out there.