Reflections on the 9th BITSS Annual Meeting: Toward a More Open and Inclusive Scientific Ecosystem

By Katie Hoeberling and Aleksandar Bogdanoski

BITSS Faculty Director Ted Miguel gives the introductory presentation at the 9th BITSS Annual Meeting.

We look forward each year to the BITSS Annual Meeting as a space for discussing new meta-research and open science initiatives, as well as for reuniting with old colleagues and welcoming new ones, normally with handshakes and hugs. This year has been anything but normal, though. And while many of the usual suspects joined us to continue deliberating issues we typically discuss, the program and our conversations reflected a world shaken up by a global pandemic, social unrest, and widespread questioning of what “normal” means within the scientific community and beyond.

While the policies and social processes underlying the COVID-19 crisis and the widening awareness of structural racism are by no means new, there is a new sense of urgency to address the roots of these issues, as well as an understanding that they are deeply connected. Our presentations and discussions revealed how scientists are rethinking how science is and should be done, partly in response to these events.

In addition to improving the credibility of social science, speakers highlighted how transparency has facilitated access, collaboration, and co-creation. Broad access to rich datasets, such as those produced by the Landsat program and the US Census Bureau, helps diversify the community of scientists who use the data, as well as the research topics of the publications that draw on them. Relatedly, distributed data collection or analysis through meta-analyses with individual participant data, multi-lab replications, forecast collection, or crowdsourced reproductions can grow the pool of people involved in research and help us understand biases and credibility. Transparently reporting findings and their underlying uncertainties can make research more reusable, especially for evidence synthesis. And perhaps most presciently, open science policy has encouraged and enabled broad access to data that can be used to understand the spread of the novel coronavirus.

Efforts like these are critical to democratizing science, but they can only get us so far. We heard from several initiatives that are working to widely share educational and pedagogical materials, as well as open source infrastructure. These resources can help meet the growing demand for open science training and ensure a diverse group of researchers can participate in a rapidly evolving scientific landscape. But as our second keynote panelists reminded us, open science is limited in its ability to promote equity and inclusion if it ignores the processes that create and perpetuate systemic exclusion outside of academia.

We’re still processing many of the conversations we had at our first virtual Annual Meeting. In the spirit of transparency, we’ve provided a “TV Guide” below with descriptions of each presentation, plus links to video recordings, slides, and other presentation materials. We welcome you to watch, read, and reflect with us on our collective understanding of normalcy. What can we learn from this last year and what should we carry forward? As always, thank you for joining us from wherever you are. We hope to see you in person at next year’s Annual Meeting!

Day 1

We kicked off with an introduction by BITSS Faculty Director Ted Miguel, who reflected on several BITSS projects that provide infrastructural solutions to transparency-related problems, such as the recently launched Social Science Prediction Platform and the soon-to-launch Social Science Reproduction Platform. He also discussed how BITSS training initiatives, such as the first-ever virtual Research Transparency and Reproducibility Training (RT2) and the Catalyst program, have adapted to virtual settings. Visit our website to find materials from our first virtual RT2, and to read about our new Catalyst training grants. (View: Slides, Video)

Video of Keynote Panel: Challenges and Opportunities of Transparency in COVID-19 Research. Panelists: Carrie D. Wolinetz (National Institutes of Health, Office of Science Policy), Samir Bhatt (Imperial College London), and Joakim Weill (UC Davis). Moderated by Maya Petersen (UC Berkeley).

Keynote Panel: Challenges and Opportunities of Transparency in COVID-19 Research

In a panel moderated by Maya Petersen (UC Berkeley), Carrie D. Wolinetz (NIH), Samir Bhatt (Imperial College London), and Joakim Weill (UC Davis) reflected on how open data has fostered discovery and facilitated collaboration in the context of COVID-19 research. Fortunately, actors in both the public and private sectors have recognized the urgency created by the pandemic and worked to increase access to data and code. This access has also spurred interdisciplinary collaboration and increased scientific output, which in turn has underscored the value of peer review and domain expertise. To maintain and build on this progress, the panelists underlined the need to invest in infrastructure that incentivizes and rewards transparency practices such as data sharing. They also pointed out that greater data transparency has sharpened the tension between openness and the need to protect the privacy of study participants—an issue that is yet to be resolved.


Breakout Room A: Evidence Synthesis

  • The use of behavioral-science informed interventions to promote latrine use in rural India: a synthesis of findings – Charlotte Lane (3ie) (View: Video, Slides, Preprint)

Charlotte discussed how the International Initiative for Impact Evaluation (3ie), a producer and synthesizer of development impact research, is building on their pre-registration and pre-analysis plan policy to promote the adoption of practices that facilitate rigorous meta-analysis. By requiring researchers to use a common set of data collection standards and indicators, they’re able to conduct individual participant data (IPD) meta-analysis across their projects. She also discussed the challenges involved, and how their protocols are changing as they learn more (for example, using common sampling approaches and asking for more detailed readme files or codebooks).

  • Analyzing data of a multi-lab replication project with individual participant data meta-analysis: a tutorial – Robbie Van Aert (Tilburg University) (View: Video, Slides)

Robbie presented a different approach to leveraging multi-lab replications for IPD meta-analysis. Beyond increasing statistical power, multi-lab replications allow researchers to handle data consistently across studies, standardize protocols for removing outliers, avoid transforming effect sizes, and use participant-level moderators. Robbie demonstrated that a one-step IPD meta-analysis, which fits a single multilevel model to the pooled participant-level data rather than computing an effect size for each replication and then meta-analyzing those, allows valid conclusions to be drawn at the participant level as well as the lab level.
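
For readers less familiar with the one-step approach, here is a minimal sketch, using hypothetical column and file names and the Python statsmodels library, of fitting a single multilevel model directly to pooled participant-level data instead of meta-analyzing lab-level effect sizes:

```python
# Minimal sketch of a one-step IPD meta-analysis (data layout is hypothetical):
# one row per participant, with the lab, the experimental condition, and the outcome.
import pandas as pd
import statsmodels.formula.api as smf

# Assumed long-format dataset pooled across all replication labs.
ipd = pd.read_csv("pooled_replications.csv")  # columns: lab, condition, outcome

# Fit a single multilevel model: a fixed treatment effect plus
# lab-specific random intercepts and random treatment slopes.
model = smf.mixedlm(
    "outcome ~ condition",
    data=ipd,
    groups=ipd["lab"],
    re_formula="~condition",
)
result = model.fit()
print(result.summary())  # the 'condition' coefficient is the pooled effect
```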

  • Why do large-scale replications and meta-analyses diverge? A case study of infant-directed speech – Molly Lewis (Carnegie Mellon University) (View: Video, Slides)

Meta-analyses and multi-lab replications each have their merits and disadvantages: meta-analyses require fewer resources but are seldom pre-registered, while multi-lab replications require more time and resources but are typically pre-registered. Molly Lewis compared the results of a meta-analysis and a multi-lab replication testing infants’ preference for infant-directed over adult-directed speech and found meaningful differences. Sensitivity analyses showed that publication bias could not fully account for these differences, and further multi-lab replications are planned.

Breakout Room B: Transparent Reporting

  • Selective hypothesis reporting in the field of psychology – Olmo van den Akker (Tilburg University) (View: Video, Slides)

Olmo van den Akker presented joint work with colleagues from the Meta-Research Center at Tilburg University showing that selective hypothesis reporting remains prevalent in psychology despite the uptake of preregistration. Comparing 48 pairs of pre-registrations and published papers, they found that researchers omitted about half of the pre-registered hypotheses from subsequent papers, often changed the status of primary and secondary hypotheses, and sometimes even reversed their direction. They recommended that researchers be more specific when formulating hypotheses, both in pre-registrations and in papers.

From Edlin and Love (unpublished): Magnitude and precision measures in headline results in economics, political science, sociology, and medicine.

  • Magnitude and Precision vs. Sign and Significance in the Social Sciences – Aaron Edlin (UC Berkeley) (View: Video)

Focusing on three leading journals in economics, political science, and sociology, Aaron Edlin and Michael Love examined the prevalence of numerical measures of precision and magnitude in “headline” results, which are commonly featured in paper abstracts. They found that measures of magnitude were missing from 64% ± 3% of empirical economics papers and 92% ± 1% of empirical political science and sociology papers, and that researchers almost never reported measures of precision (0.1% ± 0.1%).
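
To make the distinction concrete, here is a toy illustration (all numbers invented) of the same headline result reported in sign-and-significance terms versus magnitude-and-precision terms:

```python
# Hypothetical headline result, reported two ways (numbers are invented).
estimate_usd = 312.0   # estimated increase in annual earnings, in dollars
std_error_usd = 120.0  # standard error of the estimate

# Sign-and-significance style: only direction and a p-value threshold.
print("The program significantly increased earnings (p < 0.05).")

# Magnitude-and-precision style: the size of the effect and its uncertainty.
ci_low = estimate_usd - 1.96 * std_error_usd
ci_high = estimate_usd + 1.96 * std_error_usd
print(f"The program increased annual earnings by ${estimate_usd:.0f} "
      f"(SE ${std_error_usd:.0f}; 95% CI ${ci_low:.0f} to ${ci_high:.0f}).")
```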

  • Open Policy Analysis of Deworming Interventions – Fernando Hoces de la Guardia (BITSS) (View: Video, Slides)

Fernando Hoces de la Guardia presented a new open policy analysis (OPA) that combines several prominent cost-benefit studies of deworming interventions. This OPA makes fully accessible the assumptions, analytic decisions, and data used in the analysis through an interactive app, an open policy report, and a GitHub repository. Using the OPA, analysts, researchers, and policymakers can better estimate the expected benefits of deworming across different policy settings.
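
As a toy illustration of the OPA idea (not the actual deworming model), the point is that every assumption behind a policy estimate becomes a named, inspectable parameter that readers can change:

```python
# Toy illustration of the open-policy-analysis idea (not the actual deworming OPA):
# every assumption is a named, documented parameter that readers can modify.
def net_present_value(benefit_per_year, years, discount_rate, cost_per_person):
    """Discounted benefits of an intervention minus its up-front cost per person."""
    discounted_benefits = sum(
        benefit_per_year / (1 + discount_rate) ** t for t in range(1, years + 1)
    )
    return discounted_benefits - cost_per_person

# All numbers below are placeholders; a real OPA documents and sources each one.
print(net_present_value(benefit_per_year=10.0, years=20,
                        discount_rate=0.05, cost_per_person=1.0))
```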


Video of Panel: Forecasting Social Science Results. Panelists: Arun Advani (University of Warwick), Eva Vivalt (University of Toronto), and Nick Otis (UC Berkeley). Moderated by Stefano DellaVigna (UC Berkeley).

Panel: Forecasting Social Science Results (View: Slides)

Moderated by Stefano DellaVigna, this panel discussed the recently launched Social Science Prediction Platform (SSPP), which enables the systematic crowdsourcing of forecasts of research results. Nicholas Otis first discussed how collecting forecasts can improve research designs, mitigate publication bias, and help us study hindsight bias. He also shared usage statistics from the platform—2,275 users contributing 8,225 predictions on 13 projects, largely in economics—and helpful resources for those interested in using it. Eva Vivalt and Arun Advani then discussed how they used the SSPP for specific projects. Eva presented a project with Aidan Coville in which forecasters predicted relatively accurately how much other researchers, policymakers, and development practitioners trust different kinds of evidence. Her team’s main challenge, she said, was explaining the study clearly and concisely enough to elicit informed predictions. Arun and three colleagues gathered predictions of how often economists study race, finding that economists correctly predicted that they publish on race less often than political scientists and sociologists but overestimated the share of race-related research in economics journals. In general, both found the SSPP easy to use and advocated for its use in a wide variety of projects.

Lightning Talks (View: Video)

  • Andrew Little presented forthcoming work with Thomas Pepinsky examining how researchers’ directional motives regarding a study’s conclusions may influence their perceptions of its credibility.
  • Maya Mathur and Tyler VanderWeele found that the conclusions of the vast majority of large meta-analyses (those with 40+ studies) in psychology and medicine were robust to the publication bias found in the individual meta-analyzed studies.
  • Francesca Parente and Chad Hazlett advocated for a sensitivity-based approach in observational studies, which quantifies the amount of confounding necessary to change the conclusion.

Video of Keynote Panel: Open Science for a more Democratic and Inclusive Scholarship. Panelists: Juan Pablo Alperin (ScholComm Lab, University of British Columbia), Leslie Chan (University of Toronto), and Antoinette Foster (PREreview). Moderated by: Ted Miguel (UC Berkeley, BITSS).

Day 2

Keynote Panel: Open Science for a more Democratic and Inclusive Scholarship

Beginning with the premise that Mertonian and democratic norms are fundamental to a just and well-functioning scientific ecosystem, this panel sought to explore how departures from these norms can disproportionately harm marginalized groups. The panelists represented a variety of initiatives advancing openness and inclusion through open educational tools and alternative metrics for career advancement (Juan Pablo Alperin, Scholarly Communications Lab), open peer review and mentorship (Antoinette Foster, PREreview), and knowledge equity and open access (Leslie Chan, Knowledge Equity Lab). While the conversation explored how open science tools and practices can advance equity and inclusion, the speakers also helped us take a step back to discuss how centering standards of whiteness and Western perspectives creates structural barriers that begin long before students enter the academic pipeline and perpetuate inequality throughout it. They left us with several important questions:

Who makes the rules, and whom do they impact? Where do norms and senses of legitimacy come from, and whom do they serve? How do norms and policies manifest in our personal and interpersonal lives, as well as in our institutions and structures?

While we don’t have all the answers, we’re excited to infuse these questions into everything we do, and hope that others in the research transparency movement do the same.

Breakout Room A: Democratizing Science

  • Improving Data Access Democratizes and Diversifies Science – Esther Shears (UC Berkeley) (View: Video, Slides)

Esther used descriptive statistics and a difference-in-differences approach to show how lifting restrictions on, and reducing the costs of, access to Landsat data led to an increase in research articles using the data, with especially large increases in highly cited publications. Use by researchers in South America, Africa, and Asia grew more dramatically than in the US and Europe, where researchers generally had better access to data and financial resources. Moreover, newer publications were written by scientists earlier in their careers and focused on new topics and study locations.
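
For readers unfamiliar with the design, here is a minimal sketch, with hypothetical variable and file names, of the kind of two-way fixed-effects difference-in-differences regression such an analysis implies:

```python
# Hypothetical sketch of a difference-in-differences specification:
# publication counts by region-year, with an indicator for open data access.
import pandas as pd
import statsmodels.formula.api as smf

panel = pd.read_csv("landsat_publications_panel.csv")
# assumed columns: region, year, publications, open_access (1 once data are freely available)

did = smf.ols(
    "publications ~ open_access + C(region) + C(year)",  # two-way fixed effects
    data=panel,
).fit(cov_type="cluster", cov_kwds={"groups": panel["region"]})
print(did.params["open_access"])  # estimated effect of open data access
```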

  • Bringing Reproducible Practices to Methods Training: Introducing the Framework for Open and Reproducible Research Training (FORRT) – Flávio Azevedo (Friedrich Schiller University) and Sam Parsons (University of Oxford) (View: Video, Slides)

Sam and Flávio presented the Framework for Open and Reproducible Research Training (FORRT), a collaborative, crowdsourced, grassroots project focused on reforming science education and fostering social justice in academia. FORRT provides an e-learning platform and curates and shares curricular materials, pedagogical approaches, and research summaries to help educators, researchers, and students teach, practice, and understand open and reusable science. Visit forrt.org to learn more about these resources, as well as open office hours and an upcoming remote mentorship program.

  • How Do Data Shape Science? Evidence from U.S. Census’ Federal Research Data Center Program – Matteo Tranchero (UC Berkeley) (View: Video, Slides)

Administrative data in the United States, such as those generated by the Census Bureau and the Internal Revenue Service, can only be accessed in situ at centers usually managed by the agencies that collect the data and by university consortia. Opening such centers can be challenging, as it often relies on the efforts of a few key leaders and requires a competitively awarded grant from the National Science Foundation. Matteo found that after new centers open, use of these data increases, as does the number of articles using the data that are published in top journals. Future research may focus on the scope and novelty of research using these data.

Breakout Room B: Publication Bias

  • From One Dataset to 198 Different Conclusions: Lifting the Hood on the Social Research Machinery – Nate Breznau (University of Bremen) (View: Video, Slides)

Nate Breznau, Eike Mark Rinke, and Alexander Wuttke presented the results of a crowdsourced replication project that studied inter-researcher reliability in replications. Although the 73 participating research teams tested a single hypothesis using the same dataset, their conclusions varied widely: across 1,261 models, the teams reached 88 different conclusions. Because the organizers limited the influence of common threats to research validity, such as questionable research practices and perverse incentives, they concluded that the high inter-researcher variability was probably due to idiosyncratic variation inherent to data analysis.

  • Is Peer Review Biased Toward Statistical Significance? – Abel Brodeur (University of Ottawa) (View: Video, Slides)

Using data from 400 manuscripts submitted to the Journal of Human Resources during 2013–2018, Abel Brodeur, Scott Carrell, David Figlio, and Lester Lusher tried to identify sources of bias at various stages of the peer-review process. They found that test statistics were already abnormally concentrated at the 10% and 5% significance thresholds at the submission stage, suggesting that publication bias cannot be explained by the peer-review process alone. Journal editors were able to “sniff out” and desk-reject papers with marginally significant results; peer reviewers, however, were still swayed by statistical significance thresholds.
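
A rough sketch of the intuition behind such a check (the paper’s actual tests are more elaborate, and the input file here is hypothetical): compare how many reported test statistics fall just below versus just above a conventional significance threshold.

```python
# Rough sketch of a bunching check around the 5% threshold (|z| = 1.96).
# The real analysis is more sophisticated; the input file here is assumed.
import numpy as np

z_stats = np.abs(np.loadtxt("submitted_test_statistics.txt"))  # one z-statistic per line

window = 0.20  # caliper width around the threshold
just_below = np.sum((z_stats > 1.96 - window) & (z_stats <= 1.96))
just_above = np.sum((z_stats > 1.96) & (z_stats <= 1.96 + window))

# Absent selective reporting, counts on either side should be roughly similar;
# a large excess just above the threshold suggests bunching.
print(f"just below: {just_below}, just above: {just_above}")
```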

  • The Influence of Hidden Researcher Decisions in Applied Microeconomics – Nick Huntington-Klein (Seattle University) (View: Video, Slides, Paper)

Using a many-analysts approach with seven replicators, Nick Huntington-Klein and colleagues examined the influence of “hidden” researcher decisions (e.g., in data cleaning, software commands, and sample construction) on the replicability of results. They found large differences in data cleaning and analysis decisions across researchers, which produced a standard deviation of estimates across replications that was 3–4 times larger than the typical reported standard error.
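
The headline comparison is simple to illustrate with invented numbers: measure the spread of point estimates across independent replicators against the statistical uncertainty any single analysis would report.

```python
# Hypothetical illustration of the headline comparison: spread of estimates
# across independent replicators versus the standard error each one reports.
import statistics

point_estimates = [0.42, 0.18, 0.55, 0.31, 0.07, 0.49, 0.26]  # one per replicator
reported_ses    = [0.05, 0.06, 0.05, 0.04, 0.06, 0.05, 0.05]

spread_across_replicators = statistics.stdev(point_estimates)
typical_reported_se = statistics.mean(reported_ses)

print(spread_across_replicators / typical_reported_se)  # roughly 3x with these toy numbers
```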


Video of Panel: Accelerating Computational Reproducibility in Economics. Panelists: Fernando Hoces de la Guardia (BITSS) and Lars Vilhuber (Cornell University). Moderated by Aleksandar Bogdanoski (BITSS).

Panel: Accelerating Computational Reproducibility in Economics (View: Video)

Fernando Hoces de la Guardia (BITSS) and Lars Vilhuber, the American Economic Association Data Editor, discussed efforts to improve the computational reproducibility of published work in economics and other social sciences. Fernando presented two complementary teaching resources developed by BITSS: the Guide for Accelerating Computational Reproducibility (ACRe Guide) and the Social Science Reproduction Platform (SSRP). Instructors can use the ACRe Guide and the SSRP to lead students through reproducing published papers and assessing and improving their computational reproducibility. Reflecting on his experience verifying the computational reproducibility of prospective papers in AEA journals, Lars spoke about the need for economics researchers to facilitate reproductions of their work using reproducibility tools (e.g., GitHub, Docker) and recognized data repositories. He also shared a template README file for social science replication packages, developed by the data editors of several social science journals.
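
As one small, hypothetical illustration of what makes a replication package easy to reproduce (not a BITSS or AEA requirement), packages often benefit from a single entry-point script that reruns the entire pipeline; the script and file names below are invented:

```python
# Hypothetical example of a single entry point for a replication package,
# so a reproducer can rerun the full pipeline with one command.
import subprocess

STEPS = [
    ["python", "01_clean_data.py"],     # build the analysis dataset from raw inputs
    ["python", "02_main_analysis.py"],  # estimate the paper's main results
    ["python", "03_make_tables.py"],    # export tables and figures
]

for step in STEPS:
    print("Running:", " ".join(step))
    subprocess.run(step, check=True)  # stop immediately if a step fails
```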

Lightning Talks (View: Video)

