By BITSS Program Manager Kelsey Mulcahy
You’ve probably noticed the growing interest in research transparency and reproducibility – in conversations with your colleagues, in a growing number of high-profile panels, and, of course, in a series of BITSS workshops this spring, from UC Merced, California, to Cuernavaca, Mexico, to Delhi, India. We just finished our two-day workshop in New Delhi on March 16–17, designed to i) improve understanding of the importance of and challenges for research transparency, and ii) demonstrate new tools and techniques for reproducible workflows – and what an exciting one it was!
Led by Garret Christensen (BITSS Project Scientist) and Julia Clark (UCSD/PDEL), the workshop brought together 25 participants from six institutions in Chennai and New Delhi: Athena Infonomics, the Center for Disease Dynamics, Economics & Policy (CDDEP), IDinsight, the International Initiative for Impact Evaluation (3ie), the Indian School of Business (ISB), and the Abdul Latif Jameel Poverty Action Lab (J-PAL SA). Most participants were Research Associates and Research Managers, allowing us to focus the material on more technical issues and practices.
After motivating the workshop with an overview of research transparency on Day 1, participants delved into reproducible coding, data de-identification, and replication, and received introductions to registration and pre-analysis plans with the OSF and the AEA registry. Day 2 featured more work with the OSF, version control with Git and GitHub, and dynamic documents in Stata (MarkDoc) and R (R Markdown). All workshop materials are available on GitHub.
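To give a flavor of the de-identification exercise, here is a minimal sketch of the basic idea: drop direct identifiers and replace unique IDs with salted one-way hashes so that records can still be linked across files without exposing who they belong to. The workshop materials use Stata and R; this is an illustrative Python version, and the column names, salt, and `pseudonymize` helper are all hypothetical, not taken from the workshop code.

```python
import hashlib

# A project-specific salt, kept OUT of the public replication package,
# prevents anyone from re-identifying IDs by hashing guessed values.
SALT = "project-specific-secret"

def pseudonymize(value: str) -> str:
    """Replace a direct identifier with a truncated, salted SHA-256 hash."""
    return hashlib.sha256((SALT + value).encode()).hexdigest()[:12]

def deidentify_row(row: dict) -> dict:
    """Return a copy of the row with PII removed and the ID pseudonymized."""
    out = dict(row)
    out["respondent_id"] = pseudonymize(row["respondent_id"])
    for pii_field in ("name", "phone", "gps_lat", "gps_lon"):
        out.pop(pii_field, None)  # drop direct identifiers if present
    return out

# Toy example with made-up survey data:
rows = [{"respondent_id": "R-001", "name": "A. Gupta",
         "phone": "99999", "age": "34"}]
clean = [deidentify_row(r) for r in rows]
```

Because the hash is deterministic for a given salt, the same respondent gets the same pseudonym in every file, so merges still work on the shareable data.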
As with any of our workshops, the learning went both ways! Over the course of the workshop, participants raised a number of topics and questions that went beyond the scope of this short workshop, but are no less important and warrant further discussion:
Pre-registration and Pre-Analysis Plans (PAPs)
Do pre-registrations and PAPs limit a researcher’s ability to publish new and interesting results that may be of interest to policy-makers or other researchers?
What is “required” for a PAP to actually be useful? What are the criteria? Are there templates? Does this differ by discipline?
How can we more effectively document interventions and methods to assist in replications/reproducibility? Without complete and thorough documentation of both the intervention and how it was implemented, anyone doing a full replication—meaning starting from the beginning and collecting new data—cannot expect to end up with the same results as the original study.
How do we incentivize researchers to share data? What are the social barriers to doing so? Participants reflected that academics and researchers in India (though this is hardly unique to India) are often reluctant to share code and data, which makes it very difficult to build on existing studies, let alone replicate them.
These are not simple questions with easy answers, but they are helpful in motivating us as we head into our Research Transparency and Reproducibility Training (RT2) event (previously called the Summer Institute) in Berkeley this June.
As always, we look forward to continuing the discussion. Feel free to comment below!