Do statistical reporting standards affect what is published?

Publication bias can also explain why false positives are so common in peer-reviewed journals. There is a widespread perception among journal editors, reviewers, and authors that only results significant at the 95% confidence level (that is, with p-values less than 0.05) are worth publishing. In 2008, political scientist Alan Gerber and political economist Neil Malhotra examined how often reported results fell just above and just below that significance threshold in articles published in two leading political science journals: the American Political Science Review (APSR) and the American Journal of Political Science (AJPS).

As you watch this video and learn about their findings, ask yourself: what does such a bias mean for studies that produce statistically non-significant results, but may nonetheless ask important questions and use rigorous methods?

This article assesses two prestigious journals for publication bias caused by a “reliance on the 0.05 significance level.” Authors Alan Gerber and Neil Malhotra define publication bias as “the outcome that occurs when, for whatever reason, publication practices lead to bias in the published parameter estimates.”

The authors list four ways in which bias can occur:

  1. Editors and reviewers may prefer significant results and reject methodologically sound articles that do not achieve statistical significance thresholds.
  2. Scholars may only submit studies with statistically significant results to journals and place the rest in “file drawers.”
  3. Investigators may adjust sample sizes after observing that results narrowly fail tests of significance.
  4. Researchers may engage in data mining to find model specifications and sub-samples that achieve significance thresholds, or they may keep collecting data until the p-value falls below the 0.05 threshold. Together with the third item, these practices are known as “p-hacking.” We’ll get more into this in the next activity.
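To see why collecting data until a result turns significant is a problem, consider a small simulation. This is only an illustrative sketch and is not taken from the article: the function name `false_positive_rate`, the batch size, the number of looks, and the use of a simple z-test on standard normal data are all assumptions chosen for the example. Every simulated study has a true effect of zero, so any "significant" result is a false positive.

```python
import random
from math import sqrt

def false_positive_rate(peek, n_sims=2000, batch=10, max_batches=10, seed=0):
    """Share of simulated null studies that end up 'significant' at p < 0.05.

    peek=True  : test after every batch and stop at the first significant result.
    peek=False : test once, after all the data have been collected.
    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        total, n = 0.0, 0
        significant = False
        for b in range(max_batches):
            # Collect one more batch of observations with true mean 0 and sd 1.
            for _obs in range(batch):
                total += rng.gauss(0.0, 1.0)
            n += batch
            last_look = (b == max_batches - 1)
            if peek or last_look:
                z = total / sqrt(n)  # z-statistic for the sample mean, known sd = 1
                if abs(z) > 1.96:    # two-sided test at the 0.05 level
                    significant = True
                    break
        hits += significant
    return hits / n_sims

if __name__ == "__main__":
    print("fixed sample size :", false_positive_rate(peek=False))
    print("stop when p < 0.05:", false_positive_rate(peek=True))
```

Under these assumptions, the fixed-sample design should produce false positives at a rate close to the nominal 5%, while the design that stops at the first significant look should produce them noticeably more often, even though nothing else about the study has changed.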

To detect publication bias, Gerber and Malhotra conducted a “caliper test” on results published in the two journals, comparing the number of reported test statistics that fell just above and just below the critical value for significance at the 0.05 level. Because sampling distributions should be continuous, roughly equal numbers of results should fall in narrow intervals just above and just below an arbitrary cut-off.
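The logic of a caliper test can be sketched in a few lines of code. This is a minimal illustration, not the authors' own procedure: the function name `caliper_test`, the caliper width of 10% of the critical value, the exact one-sided binomial comparison, and the list of z-scores are all assumptions made for the example.

```python
from math import comb

def caliper_test(z_scores, critical=1.96, caliper=0.10):
    """Count |z| values just below and just above `critical`.

    The caliper is a band of width caliper * critical on each side of the
    critical value. Returns (below, above, p_value), where p_value is the
    one-sided binomial probability of seeing at least `above` results over
    the threshold if each result in the band were equally likely to fall on
    either side.
    """
    width = caliper * critical
    below = sum(1 for z in z_scores if critical - width <= abs(z) < critical)
    above = sum(1 for z in z_scores if critical <= abs(z) <= critical + width)
    n = below + above
    # P(X >= above) for X ~ Binomial(n, 0.5), computed exactly.
    p_value = sum(comb(n, k) for k in range(above, n + 1)) / 2 ** n
    return below, above, p_value

if __name__ == "__main__":
    # Made-up z-scores for illustration only; not data from the article.
    example = [1.80, 1.85, 1.97, 1.99, 2.00, 2.02, 2.05,
               2.08, 2.10, 2.12, 2.15, 0.50, 3.20]
    below, above, p = caliper_test(example)
    print(f"just below: {below}, just above: {above}, p = {p:.3f}")
```

With this made-up list, 2 results fall just below the critical value and 9 just above, giving a one-sided p-value of about 0.033. An imbalance of that kind would be hard to attribute to chance alone. Results far from the threshold (0.50 and 3.20 here) are ignored, because the test only compares the two narrow bands.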

Their results showed a dramatic spike in published results with test statistics just above the critical value. They concluded that many of the findings in these journals could be false positives as a result of this bias.

Gerber and Malhotra discuss the consequences of publication bias:

“First, publication bias may result in a significant understatement of the chance of a Type I error, which lends false confidence and may misdirect subsequent research. Second, anticipation of journal practices may distort how studies are conducted, encouraging data mining, specification searches, and post hoc sample size adjustments. Third, and perhaps most important, holding work to the arbitrary standard of p < 0.05 may discourage the pursuit and publication of work that is well designed and on important topics but unlikely to produce precisely measured estimates.”

There is value in well-designed, robust, innovative studies, even if their statistical power is low. Gerber and Malhotra propose that, along with greater attention to research design, study registries should be implemented to reduce publication bias. We’ll learn more about these registries next week.

You can read the full article via the reference below.


Gerber, Alan, and Neil Malhotra. 2008. “Do Statistical Reporting Standards Affect What Is Published? Publication Bias in Two Leading Political Science Journals.” Quarterly Journal of Political Science 3 (3): 313–26. doi:10.1561/100.00008024.