How to Tell When a Scientific Study Is Total B.S.
Truth be told, most of the medical studies you read about are either not as astounding as they’re cracked up to be, or downright worthless. The more amazing the breakthrough or novel the insight into the workings of the human mind or body, the more skeptical I am about the report that trumpets it. Very few will withstand much scrutiny or the test of time.
Several years ago, Dr. John Ioannidis, a professor at Stanford University School of Medicine (then at Tufts), wrote an article for the online journal PLoS Medicine titled “Why Most Published Research Findings Are False.”
“There is increasing concern that in modern research, false findings may be the majority or even the vast majority of published research claims,” he wrote, and went on to describe in meticulous mathematical detail exactly why most of the studies that see the light of day are deeply flawed. If dense biostatistical analysis is your bag, it’s well worth reading. (Even if you skip the math, merely scanning his list of corollaries is helpful.)
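For readers who want a feel for that math without wading through the paper, here is a minimal sketch of its central formula: the chance that a statistically significant finding is actually true, before any allowance for bias. The function name and the input values are illustrative assumptions chosen for this example, not figures taken from Dr. Ioannidis’ article.

```python
def positive_predictive_value(prior_odds, power, alpha=0.05):
    """Chance that a statistically significant finding is actually true.

    prior_odds: ratio of true to false relationships among those being
                tested in a field (an assumed input, rarely known exactly).
    power:      probability the study detects a real effect (1 - beta).
    alpha:      probability of a false positive when there is no effect.
    This is the bias-free form of the formula: (1 - beta)R / (R - beta*R + alpha).
    """
    beta = 1.0 - power
    return (power * prior_odds) / (prior_odds - beta * prior_odds + alpha)


# A well-powered study testing a hypothesis with even odds of being true.
well_powered = positive_predictive_value(prior_odds=1.0, power=0.8)

# A small, underpowered study testing a long-shot hypothesis (1-in-10 odds).
long_shot = positive_predictive_value(prior_odds=0.1, power=0.2)

print(f"{well_powered:.2f}")  # 0.94
print(f"{long_shot:.2f}")     # 0.29
```

Under these assumed inputs, a significant result from the long-shot study is more likely to be false than true, which is the heart of the paper’s argument.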
Dr. Ioannidis’ concerns extend to the entire body of scientific research, not merely those studies that end up gaining broader media attention. But for those that do, how can a reader unfamiliar with such notions as positive predictive value or logistic regression determine if what he or she is reading is really worth the attention it’s receiving? If most of what makes it even into the peer-reviewed journals is flawed, what about the ones that make it into the news?
“Extraordinary claims require extraordinary evidence,” Dr. Clay Jones, a Boston-area pediatrician and contributor to the flagship blog of the Society for Science-Based Medicine, told The Daily Beast. “Essentially, as a skeptic I proportion my acceptance to the evidence, and I have a higher bar for acceptance depending on the claim. Unfortunately, the lay public often doesn’t have the fund of knowledge or awareness of the history of bogus claims to know where to set their own bar for acceptance.”
One of the easiest ways to determine if a study is even worth discussing is to see where it was actually published, if at all. While the peer-review process isn’t perfect (Dr. Ioannidis’ qualms apply even to most of the studies that make it through), it’s the bare minimum for taking research seriously. Having the study’s methods evaluated by a group of experts in the field is the lowest bar that any decent article should pass.
For those that do, considering the impact ranking of the journal that published it is an additional step recommended by Dr. Jones. An article appearing in a prestigious publication like The New England Journal of Medicine, which other researchers often cite in their own studies, is more likely to be worth reading than one in the Proceedings of the Society of Left-handed Rheumatologists. This isn’t a perfect method of evaluating studies, since a journal like Pediatrics has a more modest impact due to its limited scope but is well respected within the specialty I share with Dr. Jones. And even The Lancet and Nature publish their share of shaky studies. But it’s a decent frame of reference.
“Countless studies are touted on websites promoting a seemingly limitless variety of products, services, alternative healing modalities, etc. that are revealed after a little digging to have never been published in a peer-reviewed journal, even a crappy one,” continued Dr. Jones. “They may be in-house studies performed by a company to show that their product has some benefit. They may have been presented at a conference but never accepted for publication. They could in fact be completely made up.”
It was for reasons along those lines that I had such trouble accepting the findings of a survey sponsored by the MAC AIDS Fund, which reported that roughly a third of American teenagers could not identify HIV as a sexually transmitted disease. (My concerns about the study do not dim my admiration for the fund itself or the work that it does.) A marketing firm, not a peer-reviewed journal, published the survey. Since an ambiguously worded survey could easily account for such a startling finding, I was unable to take it seriously when I wasn’t given more information about the study’s methods. In fact, what I found interesting enough to write about were the results the survey actually downplayed.
This highlights the problem with much of the research out there, which is bias. Dr. Ioannidis defines bias as “the combination of various design, data, analysis, and presentation factors that tend to produce research findings when they should not be produced.” Put simply, it is anything from how a study is put together to how it is reported that makes a finding seem real when it’s not. It can be very tricky to spot.
Sometimes, however, it’s obvious. It makes sense, for example, for an AIDS charity to highlight findings that show how information about HIV is lacking, and those findings are then helpfully passed on in the media without question. Any study that is sponsored by a group that would have an interest in the results turning out the way they did should be viewed with suspicion.
The bigger the claim made in the headline, the bigger the red flag it raises for me. Studies that confirm a large body of well-established findings are more likely to be sound, but they also make for lousier news. A wonderful new treatment or breakthrough diagnostic test is much more exciting. Take, for example, the possibility of an objective test to diagnose attention-deficit hyperactivity disorder (ADHD), which is currently diagnosed based on observations by parents and teachers, and thus susceptible to subjectivity. A “foolproof” way to make the diagnosis would be news indeed! Those were the claims of an author of a study passed along to me recently, who says that measuring a patient’s involuntary eye movements offers just that kind of reliability.
“With other tests, you can slip up, make ‘mistakes’—intentionally or not. But our test cannot be fooled,” said Dr. Moshe Fried of Tel Aviv University. “Eye movements tracked in this test are involuntary, so they constitute a sound physiological marker of ADHD.”
However, as soon as I saw that the study included only 22 adults in each of the two study groups, I knew it wasn’t nearly powerful enough to lift a claim that big. While the sample size may be large enough to detect a statistically significant difference between the groups, it’s far too small to firmly establish the test’s usefulness as a diagnostic tool. It’s an interesting early finding, and it appears the researchers plan to do further study on a larger group of subjects, but we’re a long way from foolproof.
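To see why 22 subjects per group can’t support a word like “foolproof,” here is a rough sketch of how much uncertainty surrounds an accuracy estimate from a group that size. It uses the Wilson score interval, a standard method for putting a 95 percent confidence range around a proportion. The count of 20 correct identifications is a made-up number for illustration, not a result from the study.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score confidence interval for a proportion."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Hypothetical: the test correctly flags 20 of 22 adults with ADHD.
small_low, small_high = wilson_interval(20, 22)

# The same hit rate in a hypothetical study 100 times larger.
large_low, large_high = wilson_interval(2000, 2200)

print(f"n=22:   {small_low:.2f} to {small_high:.2f}")
print(f"n=2200: {large_low:.2f} to {large_high:.2f}")
```

With only 22 people, an apparent hit rate of about 91 percent is consistent with a true rate anywhere from the low 70s to the high 90s. The much larger hypothetical study narrows that range to a few percentage points, which is the kind of precision a diagnostic claim needs.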
“If a study looking at the effect of a new treatment is small, uncontrolled, unblinded and unreplicated, don’t rush into acceptance,” cautioned Dr. Jones. Unfortunately, it’s often just these studies that make for the splashiest headlines.
It’s hard enough for medical providers to tease through the data and determine how much credence to give any given finding, even with access to the text of the study itself. Unfortunately, most media reports contain none of that information for lay readers to consider, which leaves them relying on rules of thumb. But rules of thumb are better than nothing.
If a result seems crazy, it probably is. Studies that build on the established body of evidence are more likely to be true than ones that appear to overturn it. If it seems that the study’s authors or sponsors might benefit from the result they got, it should be viewed with intense scrutiny. Big claims from small studies are almost certainly false. And ones that never got published in the first place aren’t worth your time.