It Depends… (and not on the weather)

A readout on key questions from Day 2 of our Research Transparency and Reproducibility Workshop (RT2) in Berkeley, California.

Well, Day 1 of RT2 was foggy – but Day 2 brought the Northern California rain. If you participated in Day 2, or visited our OSF page to follow along on your own, hopefully you weren’t looking for ALL THE ANSWERS and ways to clear the fog. While we talk about best practices and tools available for making research more transparent and reproducible, we certainly don’t have all the solutions!

Take pre-registration and pre-analysis plans. Yes, these are important tools for starting to tackle publication bias and p-hacking. But as BITSS Faculty Director Ted Miguel reflected in his presentation, there is a lot of wiggle room in when, what, and where to publish a pre-analysis plan (PAP) and pre-registration.

The when question revolves around balancing the need to define your research analysis clearly and get feedback that helps improve the design against the risk of getting “scooped”. We discussed the pros and cons of making your PAP public or keeping it private for a period of time (OSF allows you to timestamp your PAP and embargo it for up to 4 years, for example) and came away with: it depends.

The what question is also complicated. Defining your hypothesis early and following your PAP can help reduce p-hacking and searching for “flashy” results, but many question whether this ties a researcher’s hands too much by diminishing the flexibility for exploratory analysis (see Ben Olken’s discussion here). Back in 2012, David McKenzie was writing about the high-level items that belong in a PAP. While there are some guidelines for what to include (the frameworks of the OSF and AEA registries and guidelines like PRISMA are useful references), actual requirements for what belongs in a PAP aren’t necessarily defined. Beyond the basic fields required by registries during the pre-registration process, researchers can also leverage the functionality of platforms like OSF to timestamp supplemental documents like literature reviews, detailed protocols, surveys, and more. To go even more in depth, you could consider Standard Operating Procedures in case your PAP doesn’t cover all methodological contingencies.

The where question is certainly a big one. The social sciences are fragmented, and while there are a few registries – OSF, AEA RCT Registry, EGAP, etc. – there really isn’t one go-to registry that works for all types of social science research. There may, however, be a more common registry if your research fits a particular method – like RCTs in development economics, which can be registered in the AEA Registry.

But we certainly discussed a lot of reasons why we train on developing and using PAPs and pre-registration. One benefit of PAPs is that they create a firewall protecting the integrity of the research question from decisions made ex post: you don’t want your research questions or analysis to be shaped by political shifts, for example. As Ted said yesterday, social science is so diverse that it would be tough to use a one-size-fits-all approach to pre-registration and PAPs. Our goal is to have a conversation about this… not to be prescriptive.

Another thorny issue is data publication. Research transparency hinges on making data and code public, but what if that data includes direct and indirect identifiers that may compromise the privacy of your research subjects? As researchers, we need to consider our commitments to protect the human subjects involved in our research. And as discussed in Danae Roumis’ presentation on Privacy and Transparency, we need to balance the benefits of making our data public with protecting those human subjects and keeping the promises of confidentiality we make to them.

This balancing act means there is no one right way to publish data. It takes a careful review of each study and an understanding of the sample, the risks of re-identification through direct and indirect identifiers, and the potential consequences of re-identification in the cultural context. These issues should not be considered at the end of the research project, but rather throughout the research life cycle – much like many other practices to make research transparent. As we delve deeper into the emerging methods portion of RT2 on Day 3, with a focus on innovative tools and solutions, we keep in mind that our research is shaped by the choices that we make throughout the process, and there is no perfect answer for the questions the research community is posing.
