The worlds of academic and commercial research are currently being riven by concerns and accusations about how poor much published research, and the conclusions drawn from it, really is. The problem is not specific to market research; it covers health research, machine learning, biochemistry, neuroscience, and much more. It relates to the way that tests are designed and interpreted. One of the key people highlighting these concerns is John Ioannidis of Stanford University, whose work has been reported in both academic and popular forums (for example, The Economist). The quote “most published research findings are probably false” comes from Ioannidis.
Here are some of the quotes and worries floating about at the moment:
- America’s National Institutes of Health (NIH) – researchers would find it hard to reproduce at least three-quarters of all published biomedical findings
- Sandy Pentland, a computer scientist at the Massachusetts Institute of Technology – three-quarters of published scientific papers in the field of machine learning are bunk because of “overfitting” (fitting a model so closely to one data set that it picks up noise rather than real patterns)
- John Bohannon, a biologist at Harvard – submitted an error-strewn paper on a cancer drug derived from lichen to 304 journals (as an experiment); 157 accepted it for publication
Key problems that Ioannidis has highlighted, and which relate to market research, are:
1. Studies that show an unhelpful result are often not published, partly because they are seen as uninteresting. For example, if 100 teams look for a way of improving a process and all test the same idea, and the idea actually makes no difference, we’d still expect about 5 of them to get results that are significant at the 95% confidence level, just by chance. The 95 tests that did not show significant results are not interesting, so they are less likely to be published. The 5 ‘significant’ results are likely to be published, and the researchers on those teams are likely to be convinced that the results are valid and meaningful. However, these 5 results would not count as significant if all 100 tests were considered together, because the threshold for significance has to be tightened when many tests are run. This selective publication has been widely linked to difficulties in replicating results.
2. Another version of the multiple tests problem is when researchers gather a large amount of data and then trawl it for differences. With a large enough data set (e.g. Big Data), you will almost always find things that look like patterns. Significance tests are only valid if the hypotheses are created BEFORE looking at the results.
3. Ioannidis has highlighted that researchers often base their study design on implicit knowledge, without necessarily intending to, and often without documenting it. This implicit process can push the results in one direction or another. For example, a researcher looking to show that two methods produce the same results might gravitate towards questions that are more likely to produce the same answers. Asking people to say if they are male or female is likely to produce the same result across a wide range of question types and contexts. By contrast, questions about products that participants are less attached to, asked as emotional associations on a 10-point scale, are likely to be more variable, and therefore less likely to be consistent across different treatments.
4. Tests have a property called their statistical power, which in general terms is the ability of the test to avoid Type II errors (false negatives). The tests in use in neuroscience, biology, and market research typically have a much lower statistical power than is desirable. Low power matters for false positives too: when fewer real effects are detected, chance findings make up a larger share of the ‘significant’ results. This led John Ioannidis in 2005 to assert that “most published research findings are probably false”.
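The arithmetic behind points 1 and 4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Ioannidis's own model: it assumes the tests are independent, that the idea being tested in point 1 has no real effect, and (for point 4) that 10% of the hypotheses being tested are actually true. The function names and the 10%, 80% and 40% figures are hypothetical choices made for the example, not numbers from the research quoted above.

```python
# Illustrative arithmetic only: every number here is an assumption, not data.


def expected_false_positives(n_tests: int, alpha: float) -> float:
    """Expected count of 'significant' results when no real effect exists."""
    return n_tests * alpha


def prob_at_least_one_false_positive(n_tests: int, alpha: float) -> float:
    """Chance that at least one of n independent null tests looks significant."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(n_tests: int, alpha: float) -> float:
    """Per-test p-value cut-off when all n tests are considered together
    (Bonferroni correction, one common and conservative adjustment)."""
    return alpha / n_tests


def share_of_significant_results_that_are_true(
    prior_true: float, power: float, alpha: float
) -> float:
    """Of all results that come out 'significant', the fraction that reflect
    a real effect, given the share of tested hypotheses that are true."""
    true_positives = prior_true * power
    false_positives = (1.0 - prior_true) * alpha
    return true_positives / (true_positives + false_positives)


if __name__ == "__main__":
    n_teams, alpha = 100, 0.05  # the 100 teams and 95% level from point 1

    print("Expected chance 'findings':",
          expected_false_positives(n_teams, alpha))
    print("P(at least one chance 'finding'):",
          round(prob_at_least_one_false_positive(n_teams, alpha), 4))
    print("Per-test threshold if all 100 are considered together:",
          bonferroni_threshold(n_teams, alpha))

    # Point 4: the same 5% threshold, but different (assumed) levels of power.
    for power in (0.8, 0.4):
        share = share_of_significant_results_that_are_true(0.10, power, alpha)
        print(f"Power {power:.0%}: {share:.0%} of significant results are real")
```

Under these assumptions, the chance that at least one of the 100 teams gets a ‘significant’ result is over 99%, and a result would need a p-value below 0.0005 rather than 0.05 to survive once all 100 tests are considered together. Dropping power from 80% to 40% takes the share of significant results that are real from 64% to about 47%, which is the sense in which low power can leave most published findings false.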
What should market researchers make of these tests and their limitations? Test data is a basic component of evidence for market research. Researchers should seek to add any new evidence they can acquire to that which they already know, and where necessary do their own checking. In general, researchers should seek to find theoretical reasons for the phenomena they observe in testing – rather than relying solely on test data.
However, let’s stop saying tests “prove” something works, and let’s stop quoting academic research as if it were “truth”. Things are more or less likely to be true; in market research, and indeed in most of science, there are few things that are definitely true.
The ‘science’ underpinning behavioural economics, neuroscience, and Big Data (to name just three) should be taken as work in progress, not ‘fact’.
Is Ioannidis Right?
If we are in the business of doubting academic research, then it behoves us to doubt the academic telling us to be more sceptical. There are people who are challenging the claims. For example, this article from January 2013 claims that the real figure for bad biomedical research is ‘just’ 14%, rather than three-quarters.