Resource Library

The BITSS Resource Library contains resources for learning, teaching, and practicing research transparency and reproducibility, including curricula, slide decks, books, guidelines, templates, software, and other tools. All resources are categorized by i) topic, ii) type, and iii) discipline. Filter results by applying criteria along these parameters or use the search bar to find what you’re looking for.

Know of a great resource that we haven’t included or have questions about the existing resources? Email us!

Disseminate

Design

Collect & Analyze Data

39 Results

↗

PGRP Onboarding Materials for Collaborative Reproducible Workflows Data Management+EconomicsInterdisciplinaryPolitical ScienceReproducibilityVersion Control

Catalyst Thomas Brailey developed a set of training materials to help transition J-PAL’s Payments and Governance Research Program (PGRP) towards a version-controlled research pipeline by onboarding all research team members to GitHub, GitHub desktop, and R. These teaching materials can be applied to onboard other research/lab teams across a variety of contexts in social science research.

↗

Coding style guides for collaborators (in R, Stata, and Python) Dynamic Documents and Coding Practices+InterdisciplinaryReproducibility

Developed by Sean Higgins (Northwestern University) and colleagues, this guide provides instructions for using R on research projects. Its purpose is to use with collaborators and research assistants to make code consistent, easier to read, transparent, and reproducible. See also the Python Guide and Stata Guide.

↗

TIER Protocol 4.0 Data Management+InterdisciplinaryReproducibility

The TIER Protocol specifies the contents and organization of reproduction documentation for a project involving computations with statistical data.

↗

Lab Manual for Jade Benjamin-Chung’s Lab Data Management+InterdisciplinaryPublic HealthReproducibility

This is a lab manual for students and staff working with Jade Benjamin-Chung at Stanford University. Its goal is to support collaborative, transparent, and reproducible workflows and it contains guidance on tools and good practices in communications, coding, version control, and data sharing, among others. It also features an internal replication process that increases reproducibility by identifying and resolving errors prior to publication.

↗

Reproducible Data Science with Python Data Visualization+InterdisciplinaryReproducibilityStatistics and Data ScienceVersion Control

Written by Valentin Danchev, “Reproducible Data Science with Python” is a textbook that uses real-world social data sets related to the COVID-19 pandemic to provide an accessible introduction to open, reproducible, and ethical data analysis using hands-on Python coding, modern open-source computational tools, and data science techniques. Topics include open reproducible research workflows, data wrangling, exploratory data analysis, data visualization, pattern discovery (e.g., clustering), prediction & machine learning, causal inference, and network analysis.

↗

Framework for Open and Reproducible Research Training (FORRT) Data Management+Dynamic Documents and Coding PracticesInterdisciplinaryPre-Analysis PlansStatistical LiteracyTransparent Reporting

FORRT is a pedagogical infrastructure designed to recognize and support the teaching and mentoring of open and reproducible science tenets in tandem with prototypical subject matters in higher education. FORRT also advocates for the opening of teaching and mentoring materials as a means to facilitate access, discovery, and learning to those who otherwise would be educationally disenfranchised.

↗

Videos: Research Transparency and Reproducibility Training (RT2) – Washington, D.C. Data Management+InterdisciplinaryMeta-AnalysesPower analysisPre-Analysis PlansPreprintsRegistriesReplicationsStatistical LiteracyTransparent ReportingVersion Control

BITSS hosted a Research Transparency and Reproducibility Training (RT2) in Washington DC, September 11-13, 2019. This was the eighth training event of this kind organized by BITSS since 2014.

RT2 provides participants with an overview of tools and best practices for transparent and reproducible social science research. Click here to videos of presentations given during the training. Find slide decks and other useful materials on this OSF project page (https://osf.io/3mxrw/).

↗

Nextjournal Dynamic Documents and Coding Practices+Version Control

Nextjournal is a container tool with features like polyglot notebooks, automatic versioning and real-time collaboration.

↗

Transparent and Open Social Science Research (FR) Dynamic Documents and Coding Practices+

Demand is growing for evidence-based policy making, but there is also growing recognition in the social science community that limited transparency and openness in research have contributed to widespread problems. With this course created and administered by BITSS, you can explore the causes of limited transparency in social science research, as well as tools to make your own work more open and reproducible.

↗

Software Carpentry Data Management+Dynamic Documents and Coding PracticesEngineering and Computer ScienceInterdisciplinaryStatistics and Data ScienceVersion Control

Software Carpentry offers online tutorials for data analysis including Version Control with Git, Using Databases and SQL, Programming with Python, Programming with R and Programming with MATLAB.

↗

ResonsibleData.io Data Management+Dynamic Documents and Coding PracticesInterdisciplinaryStatistics and Data Science

Using data for social change work offers many opportunities, but it brings challenges, too. The RD community develops practical ways to deal with the unintended consequences of using data in social change work, establishes best practices, and shares approaches between leading thinkers and doers from different sectors. We discuss thorny topics in-person, facilitate online group discussions on the RD mailing list, and share resources on this site.

↗

Conda Data Visualization+InterdisciplinaryStatistics and Data Science

Conda is an open source package management system and environment management system that runs on Windows, macOS and Linux. Conda installs, runs and updates packages and their dependencies and is operable in multiple languages, including Python, R, Ruby, Lua, Scala, Java, JavaScript, C/ C++, FORTRAN.

↗

PhD Course Materials: Transparent, Open, and Reproducible Policy Research Data Management+Dynamic Documents and Coding PracticesHealth SciencesInterdisciplinaryMeta-AnalysesOpen PublishingPre-Analysis PlansPreprintsPublic PolicyRegistriesReplicationsStatistical LiteracyTransparent ReportingVersion Control

BITSS Catalyst Sean Grant developed and delivered a PhD course on Transparent, Open, and Reproducible Policy Research at the Pardee RAND Graduate School in Policy Analysis. Find all course materials at the project’s OSF page.

↗

Transparency Training Module for Undergraduate Experimental Economics Dynamic Documents and Coding Practices+Meta-AnalysesPre-Analysis PlansReplicationsStatistical Literacy

These materials were used in the final weeks of an undergraduate course experimental economics at Wesleyan University taught by Professor Jeffrey Naecker.

These materials were developed as part of a BITSS Catalyst Training Project “Incorporating Reproducibility and Transparency in an Undergraduate Economics Course” led by Catalyst Jeffrey Naecker.

↗

Course Syllabi for Open and Reproducible Methods Anthropology, Archaeology, and Ethnography+Data RepositoriesData VisualizationDynamic Documents and Coding PracticesEconomics and FinanceEngineering and Computer ScienceHealth SciencesHumanitiesInterdisciplinaryLife SciencesLinguisticsMeta-AnalysesOpen PublishingOther Social SciencesPolitical SciencePower analysisPre-Analysis PlansPsychologyPublic PolicyRegistriesReplicationsSociologyStatistical LiteracyStatistics and Data ScienceTransparent ReportingVersion Control

A collection of course syllabi from any discipline featuring content to examine or improve open and reproducible research practices. Housed on the OSF.

↗

rOpenSci Packages Data Management+Dynamic Documents and Coding PracticesInterdisciplinaryMeta-AnalysesPower analysisReplicationsStatistics and Data ScienceVersion Control

These packages are carefully vetted, staff- and community-contributed R software tools that lower barriers to working with scientific data sources and data that support research applications on the web.

↗

Improving the Credibility of Social Science Research: A Practical Guide for Researchers Data Management+Economics and FinanceInterdisciplinaryPolitical SciencePre-Analysis PlansPsychologyPublic PolicyRegistriesReplicationsSociology

Created by the Policy Design and Evaluation Lab (PDEL) at UCSD, this teaching module was developed to demonstrate the credibility crisis in the social sciences caused by a variety of incentives and practices at both the disciplinary and individual levels, and provide practical steps for researchers to improve the credibility of their work throughout the lifecycle of a project. It is intended for use in graduate-level social science methodology courses—including those in political science, economics, sociology, and psychology—at UCSD and beyond.

These materials were developed as part of a BITSS Catalyst Training Project “Creating Pedagogical Materials to Enhance Research Transparency at UCSD” led by Catalysts Scott Desposato and Craig McIntosh along with Julia Clark, PhD candidate at UCSD.

↗

Accountable Replications Policy “Pottery Barn” Dynamic Documents and Coding Practices+Open PublishingPsychologyReplications

The Accountable Replication Policy commits the Psychology and Cognitive Neuroscience section of Royal Society Open Science to publishing replications of studies previously published within the journal. Authors can either submit a replication study that is already completed or a proposal to replicate a previous study. To ensure that the review process is unbiased by the results, submissions will be reviewed with existing results initially redacted (where applicable), or in the case of study proposals, before the results exist. Submissions that report close, clear and valid replications of the original methodology will be offered in principle acceptance, which virtually guarantees publication of the replication regardless of the study outcome.

↗

Improving Your Statistical Inference Dynamic Documents and Coding Practices+Power analysisPsychologyStatistical Literacy

This course aims to help you to draw better statistical inferences from empirical research. Students discuss how to correctly interpret p-values, effect sizes, confidence intervals, Bayes Factors, and likelihood ratios, and how these statistics answer different questions you might be interested in. Then, they learn how to design experiments where the false positive rate is controlled, and how to decide upon the sample size for a study, for example in order to achieve high statistical power. Subsequently, students learn how to interpret evidence in the scientific literature given widespread publication bias, for example by learning about p-curve analysis. Finally, the course discusses how to do philosophy of science, theory construction, and cumulative science, including how to perform replication studies, why and how to pre-register an experiment, and how to share results following Open Science principles.

↗

Nicebread Data Management+Data VisualizationDynamic Documents and Coding PracticesInterdisciplinaryMeta-AnalysesOpen PublishingPower analysisPre-Analysis PlansPreprintsPsychologyRegistriesReplicationsTransparent ReportingVersion Control

Dr. Felix Schönbrodt’s blog promoting research transparency and open science.

↗

Jupyter Notebooks Data Visualization+InterdisciplinaryReplicationsStatistics and Data ScienceVersion Control

The Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and explanatory text. Uses include: data cleaning and transformation, numerical simulation, statistical modeling, machine learning and much more.

↗

Docker Data Visualization+InterdisciplinaryReplicationsVersion Control

Docker is the world’s leading software container platform. Developers use Docker to eliminate “works on my machine” problems when collaborating on code with co-workers. Operators use Docker to run and manage apps side-by-side in isolated containers to get better compute density. Enterprises use Docker to build agile software delivery pipelines to ship new features faster, more securely and with confidence for both Linux and Windows Server apps.

↗

DeclareDesign Dynamic Documents and Coding Practices+InterdisciplinaryPolitical SciencePower analysisPre-Analysis PlansStatistics and Data Science

DeclareDesign is statistical software to aid researchers in characterizing and diagnosing research designs — including experiments, quasi-experiments, and observational studies. DeclareDesign consists of a core package, as well as three companion packages that stand on their own but can also be used to complement the core package: randomizr: Easy-to-use tools for common forms of random assignment and sampling; fabricatr: Tools for fabricating data to enable frontloading analysis decisions in social science research; estimatr: Fast estimators for social science research.

↗

The New Statistics (+OSF Learning Page) Data Management+Dynamic Documents and Coding PracticesInterdisciplinaryMeta-AnalysesOpen PublishingPower analysisPre-Analysis PlansPsychologyReplicationsStatistical LiteracyStatistics and Data ScienceTransparent ReportingVersion Control

This OSF project helps organize resources for teaching the “New Statistics” — an approach that emphasizes asking quantitative questions, focusing on effect sizes, using confidence intervals to express uncertainty about effect sizes, using modern data visualizations, seeking replication, and using meta-analysis as a matter of course.

↗

Databrary Data Management+Data VisualizationDynamic Documents and Coding PracticesPsychology

Databrary is a video data library for developmental science. Anyone collecting shareable research data will be able to store and organize their data within Databrary after completing the registration process.

↗

rpsychologist Data Management+Dynamic Documents and Coding PracticesInterdisciplinaryOpen PublishingPsychology

Kristoffer Magnusson’s blog about R, Statistics, Psychology, Open Science, and Data Visualization.

↗

JASP Dynamic Documents and Coding Practices+Meta-AnalysesStatistical LiteracyStatistics and Data ScienceVersion Control

JASP is a cross-platform software program with a state-of-the-art graphical user interface. The JASP interface allows you to conduct statistical analyses in seconds, and without having to learn programming or risking a programming mistake. JASP is statistically inclusive as it offers both frequentist and Bayesian analysis methods. Open source and free of charge.

↗

p-curve Dynamic Documents and Coding Practices+Power analysisStatistics and Data Science

P-curve is a tool for determining if reported effects in literature are true or if they merely reflect selective reporting. P-curve is the distribution of statistically significant p-values for a set of studies (ps < .05). Because only true effects are expected to generate right-skewed p-curves – containing more low (.01s) than high (.04s) significant p-values – only right-skewed p-curves are diagnostic of evidential value. By telling us whether we can rule out selective reporting as the sole explanation for a set of findings, p-curve offers a solution to the age-old inferential problems caused by file-drawers of failed studies and analyses.

↗

Transparent and Open Social Science Research Dynamic Documents and Coding Practices+Meta-AnalysesPre-Analysis PlansRegistriesReplicationsStatistical LiteracyTransparent Reporting

Demand is growing for evidence-based policymaking, but there is also growing recognition in the social science community that limited transparency and openness in research have contributed to widespread problems. With this course created by BITSS, you can explore the causes of limited transparency in social science research, as well as tools to make your own work more open and reproducible.

You can access the course videos for self-paced learning on the BITSS YouTube channel here, (also available with subtitles in French here). You can also enroll for free during curated course runs on the FutureLearn platform.

↗

Manual of Best Practices Dynamic Documents and Coding Practices+Pre-Analysis PlansTransparent Reporting

Manual of Best Practices, written by Garret Christensen (BITSS), is a working guide to the latest best practices for transparent quantitative social science research. The manual is also available, and occasionally updated on GitHub. For suggestions or feedback, contact garret@berkeley.edu.

↗

Open Science Training Initiative Data Management+InterdisciplinaryVersion Control

Open Science Training Initiative (OSTI), provides a series of lectures in open science, data management, licensing and reproducibility, for use with graduate students and postdoctoral researchers. The lectures can be used individually as one-off information lectures in aspects of open science, or can be integrated into existing course curriculum. Content, slides and advice sheets for the lectures and other training materials are being gradually released on the GitHub repository as the official release versions become available.

↗

Swirl Data Visualization+Interdisciplinary

Swirl is a software package for the R programming language that turns the R console into an interactive learning environment. Users receive immediate feedback as they are guided through self-paced lessons in data science and R programming.

↗

Data Science Certificate Data Visualization+Engineering and Computer ScienceInterdisciplinaryStatistical LiteracyStatistics and Data Science

Data Science Certificate offered on Coursera, is set of nine classes that cover the concepts and tools needed to analyze data starting with asking the right kinds of questions to making inferences and publishing results.

↗

Reproducible Research Data Management+InterdisciplinaryStatistical LiteracyStatistics and Data Science

Reproducible Research taught by Roger D. Peng, Jeff Leek, and Brian Caffoof of Johns Hopkins University is a course on Coursera that teaches methods to organize data analysis so that it is reproducible and accessible to others. In this course students will learn to write a document using R markdown, integrate live R code into a literate statistical program and compile R markdown documents using knitr and related tools.

↗

OpenIntro Statistics Data Management+Dynamic Documents and Coding PracticesInterdisciplinaryStatistical LiteracyStatistics and Data Science

OpenIntro Statistics is a free comprehensive 400 page online textbook and suite of educational material on statistics and data analysis.

↗

Implementing Reproducible Research Dynamic Documents and Coding Practices+Statistics and Data ScienceTransparent ReportingVersion Control

Implementing Reproducible Research by Victoria Stodden, Friedrich Leisch, and Roger D. Peng covers many of the elements necessary for conducting and distributing reproducible research. The book focuses on the tools, practices, and dissemination platforms for ensuring reproducibility in computational science.

↗

The Workflow of Data Analysis Using Stata Data Management+InterdisciplinaryStatistical LiteracyStatistics and Data Science

Stata by J. Scott Long, explains how to manage aspects of data analysis including cleaning data; creating, renaming, and verifying variables; performing and presenting statistical analyses and producing replicable results.

↗

Political Science Replication Dynamic Documents and Coding Practices+Replications

Political Science Replication is a blog about reproducibility, replication, pre-registration, research transparency and open peer review.

↗

RStudio Dynamic Documents and Coding PracticesInterdisciplinary

RStudio is a popular and free user interface for R. R Markdown offers an easy way to implement dynamic documents, which are reproducible scripts that contain data, analysis, and nicely formatted outputs all in one file. For Stata users, dynamic documents can be created with Markdoc.