BITSS is currently holding its first summer institute on transparency practices for empirical research. The meeting is taking place in Berkeley, CA, with 30+ graduate students and junior faculty in attendance.
Ted Miguel (Economics, UC Berkeley), one of the founding members of BITSS, started with an overview of conceptual issues in current research practices. Across the social sciences, academic incentives reward striking, newsworthy, and statistically significant results at the expense of scientific integrity. This creates several problems, including publication bias and an incomplete body of evidence. Fortunately, new norms and practices are emerging, driven mostly by bottom-up efforts among social science researchers.
Scott Desposato (Political Science, UCSD) followed with a fascinating talk on the ethics of field experiments. “Social scientists are venturing overseas to conduct an increasing number of experiments, which are increasingly larger in scope.” Yet many of these experiments are illegal under local legislation, involve unconsenting subjects, and generate risks for bystanders. “One day, a researcher is going to push too hard and things will get out of hand. This is likely to create a backlash, which would limit our future ability to access a lot of important information […] You can’t outsource ethical judgment to the IRB – you need to think carefully about what you are doing and what the consequences will be.” Ignoring these issues has potentially serious consequences for subjects, enumerators, investigators, and entire scientific disciplines.
Leif Nelson (Psychology/Behavioral Science, UC Berkeley) presented his seminal work on p-hacking, the common practice of manipulating and selectively reporting data to reach statistical significance (i.e. p-value < 0.05). This generates too many false positives in the literature – a potential catastrophe for scientific inference. The solution to p-hacking is transparent reporting. “Instead of only reporting the good stuff, people should report all the stuff” (i.e. the whole set of hypothesis-testing strategies used to generate findings). Nelson introduced the p-curve, a tool to detect whether a set of findings retains any evidential value or whether p-hacking has eliminated it. The p-curve is particularly useful for “concerning” sets of results whose p-values fall just under the threshold of statistical significance. Researchers, reviewers, and readers alike can use the p-curve at www.p-curve.com. Leif and his collaborators Uri Simonsohn and Joe Simmons are blogging at datacolada.org.
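To see why p-hacking inflates false positives, consider a toy simulation (not from the talk – a minimal sketch with made-up parameters): each “study” measures several outcomes that are pure noise, and a p-hacking researcher reports only the best one. Even though every null hypothesis is true, the reported false-positive rate climbs well above the nominal 5%.

```python
import math
import random

random.seed(1)

def p_value(sample):
    """Two-sided z-test of mean 0 (known sd = 1)."""
    n = len(sample)
    z = (sum(sample) / n) * math.sqrt(n)
    # Normal CDF via the error function
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

def run_study(n_outcomes, n=30):
    # Under the null: every outcome is random noise.
    pvals = [p_value([random.gauss(0, 1) for _ in range(n)])
             for _ in range(n_outcomes)]
    return min(pvals)  # p-hacked: report only the best outcome

trials = 2000
honest = sum(run_study(n_outcomes=1) < 0.05 for _ in range(trials)) / trials
hacked = sum(run_study(n_outcomes=5) < 0.05 for _ in range(trials)) / trials
print(f"honest false-positive rate:  {honest:.3f}")  # roughly 0.05
print(f"p-hacked (best of 5):        {hacked:.3f}")  # roughly 1 - 0.95**5 ≈ 0.23
```

The sample sizes and number of outcomes here are arbitrary; the point is the mechanism – trying five noise outcomes and reporting the winner pushes the false-positive rate from about 5% to about 23%, which is exactly the pattern the p-curve is designed to detect.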
Drawing on some of his own research projects, Kevin Esterling (Political Science, UC Riverside) closed the first day with a checklist of reporting standards for randomized controlled trials. Esterling went over a list of questions about study hypotheses, subjects, allocation method, context, treatment, and results that research papers should answer. Unlike in biomedicine, there are no community-wide accepted standards for the reporting of social science RCTs.
A group of 32 graduate students and junior faculty are attending the workshop. The participants, who were selected through a competitive application process, represent no fewer than 13 academic institutions in the US and 6 abroad (Brazil, Rwanda, Kenya, Italy, the UK, and the Netherlands), as well as 4 research non-profits.