NRIN Collection of Resources on Research Integrity Education, Interdisciplinary, Software
Code Ocean (in beta) Software
Code Ocean is a cloud-based computational reproducibility platform that provides researchers and developers an easy way to share, discover and run code published in academic journals and conferences. Upload code and data in 10 programming languages and link working code in a computational environment with the associated article for free. Code Ocean assigns a Digital Object Identifier (DOI) to the algorithm, providing correct attribution and a connection to the published research.
Metametrik is a prototype of a platform for storing and searching econometric results, a project led by the Open Economics Group of the Open Knowledge Foundation. In this prototype, an informed researcher enters regression results into a spreadsheet at the level of a single regression. The platform then enables search across several facets, including dependent variable, independent variable, model, controls, journal, year, authors, JEL codes and keywords.
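The faceted search described above can be illustrated with a minimal sketch. The `RegressionResult` schema, the `search` helper and the sample records below are all hypothetical and made up for illustration; they are not Metametrik's actual data model or API.

```python
from dataclasses import dataclass, field


@dataclass
class RegressionResult:
    """One stored regression (hypothetical schema, not Metametrik's own)."""
    dependent: str
    independents: list
    model: str
    journal: str
    year: int
    authors: list = field(default_factory=list)


def search(results, **facets):
    """Return the results that match every requested facet.

    List-valued fields (independents, authors) match on membership;
    all other fields match on equality.
    """
    matches = []
    for result in results:
        ok = True
        for name, wanted in facets.items():
            value = getattr(result, name)
            ok = wanted in value if isinstance(value, list) else value == wanted
            if not ok:
                break
        if ok:
            matches.append(result)
    return matches


# Made-up records, for illustration only.
results = [
    RegressionResult("wage", ["education", "experience"], "OLS", "Journal A", 2010, ["Doe"]),
    RegressionResult("wage", ["union"], "IV", "Journal B", 2012, ["Roe"]),
    RegressionResult("employment", ["education"], "Probit", "Journal A", 2012, ["Doe", "Roe"]),
]

hits = search(results, dependent="wage", independents="education")
print([r.model for r in hits])  # ['OLS']
```

Combining facets narrows the result set, while calling `search(results)` with no facets returns every stored regression.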
statcheck Web App Interdisciplinary, Software
statcheck is a program that checks for errors in statistical reporting in APA-formatted documents. It was originally written in the R programming language. statcheck/web is a web-based implementation of statcheck. Using statcheck/web, you can check any PDF for statistical errors without installing the R programming language on your computer.
TextThresher is mass collaboration software that lets researchers direct hundreds of volunteers – working through the internet – to label tens of thousands of text documents according to all the concepts vital to researchers’ theories and questions. With TextThresher, projects that would have required a decade of effort, and the close training of wave after wave of research assistants, can be completed in about a year and a half online. The project will likely begin beta-testing the software in late 2017, with plans to release it to the general public in early 2018.
Jupyter Notebooks Software
The Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and explanatory text. Uses include: data cleaning and transformation, numerical simulation, statistical modeling, machine learning and much more.
Docker is the world’s leading software container platform. Developers use Docker to eliminate “works on my machine” problems when collaborating on code with co-workers. Operators use Docker to run and manage apps side-by-side in isolated containers to get better compute density. Enterprises use Docker to build agile software delivery pipelines to ship new features faster, more securely and with confidence for both Linux and Windows Server apps.
DeclareDesign is statistical software to aid researchers in characterizing and diagnosing research designs — including experiments, quasi-experiments, and observational studies. DeclareDesign consists of a core package, as well as three companion packages that stand on their own but can also be used to complement the core package: randomizr: Easy-to-use tools for common forms of random assignment and sampling; fabricatr: Tools for fabricating data to enable frontloading analysis decisions in social science research; estimatr: Fast estimators for social science research.
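As an illustration of the kind of random assignment that randomizr supports, here is a minimal Python sketch of complete random assignment, in which exactly a fixed number of units receive treatment. The function name and parameters are assumptions for this example and do not mirror the randomizr R interface.

```python
import random


def complete_random_assignment(n_units, n_treated, seed=None):
    """Assign exactly n_treated of n_units to treatment (1); the rest get control (0)."""
    if not 0 <= n_treated <= n_units:
        raise ValueError("n_treated must be between 0 and n_units")
    rng = random.Random(seed)  # a seeded generator keeps the assignment reproducible
    treated = set(rng.sample(range(n_units), n_treated))
    return [1 if i in treated else 0 for i in range(n_units)]


assignment = complete_random_assignment(n_units=10, n_treated=5, seed=42)
print(assignment, sum(assignment))
```

Fixing the seed lets the assignment be regenerated later, which is useful when the randomization procedure is part of a declared design.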
The New Statistics (+OSF Learning Page) Education, Meta-Analysis, Psychology, Replications, Software
This OSF project helps organize resources for teaching the “New Statistics”, an approach that emphasizes asking quantitative questions, focusing on effect sizes, using confidence intervals to express uncertainty about effect sizes, using modern data visualizations, seeking replication, and using meta-analysis as a matter of course (Cumming, 2011).
JASP is a cross-platform software program with a state-of-the-art graphical user interface. The JASP interface allows you to conduct statistical analyses in seconds, without having to learn programming or risk a programming mistake. JASP is statistically inclusive, as it offers both frequentist and Bayesian analysis methods. It is open source and free of charge.
The p-uniform package provides meta-analysis methods that correct for publication bias. Three methods are currently included in the package. The p-uniform method can be used for estimating effect size, testing the null hypothesis of no effect, and testing for publication bias. The second method in the package is the hybrid method, a meta-analysis method for combining an original study and a replication while taking into account the statistical significance of the original study. The p-uniform and hybrid methods are based on the statistical theory that the distribution of p-values is uniform conditional on the population effect size. The third method in the package is the Snapshot Bayesian Hybrid Meta-Analysis Method. This method computes posterior probabilities for four true effect sizes (no, small, medium, and large) based on an original study and a replication while taking into account publication bias in the original study. The method can also be used for computing the required sample size of the replication, akin to power analysis in null hypothesis significance testing.
P-curve is a tool for determining whether effects reported in the literature are true or merely reflect selective reporting. P-curve is the distribution of statistically significant p-values for a set of studies (ps < .05). Because only true effects are expected to generate right-skewed p-curves – containing more low (.01s) than high (.04s) significant p-values – only right-skewed p-curves are diagnostic of evidential value. By telling us whether we can rule out selective reporting as the sole explanation for a set of findings, p-curve offers a solution to the age-old inferential problems caused by file-drawers of failed studies and analyses.
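The right-skew idea can be sketched with a simple binomial test: if there is no true effect, significant p-values are uniform between 0 and .05, so about half should fall below .025. The sketch below is a simplified illustration under that assumption, not the full p-curve analysis; the function name and the example p-values are made up.

```python
from math import comb


def right_skew_binomial_test(p_values, alpha=0.05):
    """One-sided binomial test for right skew among significant p-values.

    Returns (k, n, tail), where n is the number of p-values below alpha,
    k is how many of those fall below alpha / 2, and tail is
    P(X >= k) for X ~ Binomial(n, 0.5), the null of a flat p-curve.
    """
    significant = [p for p in p_values if p < alpha]
    n = len(significant)
    k = sum(1 for p in significant if p < alpha / 2)
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return k, n, tail


# Made-up p-values for illustration; the non-significant 0.30 is ignored.
example = [0.001, 0.004, 0.012, 0.020, 0.003, 0.041, 0.008, 0.30]
k, n, tail = right_skew_binomial_test(example)
print(k, n, tail)  # 6 7 0.0625
```

A small tail probability indicates more low p-values than a flat curve would produce, which is the pattern described above as diagnostic of evidential value.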
The Distributed Meta-Analysis System is an online tool to help scientists analyze, explore, combine, and communicate results from existing empirical studies. Its primary purpose is to support meta-analyses by providing a database for empirically estimated models and methods to integrate their results. The current version supports a range of tools that are useful for analyzing empirical climate impact results, but its creators intend to expand its applicability to other fields including social sciences, medicine, ecology, and geophysics.
MetaLab is a research tool for aggregating across studies in the language acquisition literature. Currently, MetaLab contains 887 effect sizes across meta-analyses in 13 domains of language acquisition, based on data from 252 papers collecting 11363 subjects. These studies can be used to obtain better estimates of effect sizes across different domains, methods, and ages. Using our power calculator, researchers can use these estimates to plan appropriate sample sizes for prospective studies. More generally, MetaLab can be used as a theoretical tool for exploring patterns in development across language acquisition domains.
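To show how an effect size estimate feeds into sample size planning, here is a minimal sketch using the standard normal approximation for a two-group comparison. It is not MetaLab's power calculator; the function name and default values are assumptions, and exact t-based calculations give slightly larger samples.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate participants per group for a two-sided, two-sample comparison.

    Uses n = 2 * (z_{1 - alpha/2} + z_{power})^2 / d^2, where d is the
    standardized mean difference (Cohen's d).
    """
    if d == 0:
        raise ValueError("d must be non-zero")
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return ceil(2 * (z_alpha + z_power) ** 2 / d ** 2)


print(n_per_group(0.5))  # 63
print(n_per_group(0.2))  # 393
```

Because the required sample grows with the inverse square of the effect size, a more accurate meta-analytic estimate of d can change the planned sample substantially.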
statcheck is an R package that checks for errors in statistical reporting in APA-formatted documents. It can help estimate the prevalence of reporting errors and is a tool to check your own work before submitting. The package can be used to automatically extract statistics from articles and recompute p values. It is also available as a web app.
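The extract-and-recompute idea can be sketched in a few lines of Python. This is not statcheck's implementation: it handles only z tests, the regular expression is a simplified assumption about APA-style reporting, and the fixed tolerance is a crude stand-in for proper handling of rounding.

```python
import re
from math import erfc, sqrt

# Simplified, hypothetical pattern for results such as "z = 2.20, p = .03".
PATTERN = re.compile(r"\bz\s*=\s*(-?\d*\.?\d+),\s*p\s*([<=>])\s*(\d*\.\d+)", re.IGNORECASE)


def two_sided_p_from_z(z):
    """Two-sided p-value for a standard normal test statistic."""
    return erfc(abs(z) / sqrt(2))


def check_text(text, tolerance=0.005):
    """Return (matched text, recomputed p, consistent?) for each z test found."""
    findings = []
    for match in PATTERN.finditer(text):
        z = float(match.group(1))
        comparison = match.group(2)
        reported = float(match.group(3))
        computed = two_sided_p_from_z(z)
        if comparison == "=":
            consistent = abs(computed - reported) <= tolerance
        elif comparison == "<":
            consistent = computed < reported
        else:
            consistent = computed > reported
        findings.append((match.group(0), round(computed, 4), consistent))
    return findings


text = ("The effect was significant, z = 2.20, p = .03. "
        "A second test, z = 1.50, p = .04, was also reported.")
for finding in check_text(text):
    print(finding)
```

In this made-up example the first result is consistent with its reported p-value, while the second is flagged because a z of 1.50 corresponds to a two-sided p of about .13.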
This package performs power calculations for randomized experiments that use panel data. Unlike the existing programs “sampsi” and “power”, this package accommodates arbitrary serial correlation. The program “pc_simulate” performs simulation-based power calculations using a pre-existing dataset (stored in memory), and accommodates cross-sectional, multi-wave panel, difference-in-differences, and ANCOVA designs. The program “pc_dd_analytic” performs analytical power calculations for a difference-in-differences experimental design, applying the formula derived in Burlig, Preonas, and Woerman (2017) that is robust to serial correlation. Users may either input parameters to characterize the assumed variance-covariance structure of the outcome variable, or allow the subprogram “pc_dd_covar” to estimate the variance-covariance structure from pre-existing data.
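The simulation-based approach can be sketched as follows. This Python example is an illustration only and is not the package's code: it generates outcomes from an assumed AR(1) error process rather than a pre-existing dataset, collapses each unit to a post-minus-pre change so that serial correlation within units does not distort the test, and uses a normal critical value. All function names and parameter defaults are hypothetical.

```python
import random
from statistics import NormalDist, mean, variance


def simulate_unit(rng, n_pre, n_post, rho, effect):
    """One unit's change in mean outcome (post minus pre) with AR(1) errors."""
    e = rng.gauss(0, 1)  # start from the stationary distribution (unit variance)
    innovation_sd = (1 - rho ** 2) ** 0.5
    series = []
    for t in range(n_pre + n_post):
        if t > 0:
            e = rho * e + rng.gauss(0, innovation_sd)
        series.append(e + (effect if t >= n_pre else 0.0))
    return mean(series[n_pre:]) - mean(series[:n_pre])


def simulated_power(effect, n_units=60, n_pre=3, n_post=3, rho=0.5,
                    alpha=0.05, n_sims=200, seed=1):
    """Share of simulated experiments in which the difference-in-differences
    estimate is significant at level alpha (half the units are treated)."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    half = n_units // 2
    rejections = 0
    for _ in range(n_sims):
        treated = [simulate_unit(rng, n_pre, n_post, rho, effect) for _ in range(half)]
        control = [simulate_unit(rng, n_pre, n_post, rho, 0.0) for _ in range(half)]
        did = mean(treated) - mean(control)
        se = (variance(treated) / half + variance(control) / half) ** 0.5
        if abs(did / se) > z_crit:
            rejections += 1
    return rejections / n_sims


print("power at effect 1.0:", simulated_power(1.0))
print("rejection rate at effect 0:", simulated_power(0.0))
```

With a zero effect the rejection rate should sit near the nominal level, which is a useful check that the test accounts for serial correlation before trusting the power figure.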
OSF Data Repository, Registry, Software, Version Control
Open Science Framework (OSF) is part version control system, part data repository, and part collaboration software. It allows researchers to move study materials to the cloud, share and find materials, detail individual contributions, make research design more visible, and register materials to certify that the research design was not modified to alter outcomes. To increase workflow flexibility, OSF offers a system where researchers can register a description of their study and its goals. The OSF emphasizes versatility with a very wide range of tools and features, including add-ons from other related sites such as Dataverse and GitHub. Uploaded materials can also be archived and receive a Digital Object Identifier (DOI) or Archival Resource Key (ARK).
Scan.R searches all Stata (.dta), SAS (.sas7bdat), and comma-separated values (.csv) files found in the specified directory for variables that may contain personally identifiable information (PII) using strings that commonly appear as part of variable names or labels that contain PII. (Note: Scan.R does not search labels in .csv files.) Results are displayed to the screen and saved to a comma-separated values file in the current working directory containing the variables and data flagged as potential PII.
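The name-matching step can be sketched in Python for the .csv case. This is not Scan.R itself: the keyword list is a short hypothetical example, the function name is made up, and the sketch only flags column names rather than saving the flagged data.

```python
import csv
import io

# Hypothetical keyword list; a real scan would use a longer, project-specific list.
PII_KEYWORDS = ["name", "phone", "email", "address", "birth", "gps", "ssn"]


def flag_pii_columns(csv_file, keywords=PII_KEYWORDS):
    """Return the column names whose text contains any PII keyword (case-insensitive)."""
    reader = csv.reader(csv_file)
    header = next(reader, [])  # an empty file yields no columns
    flagged = []
    for column in header:
        lowered = column.lower()
        if any(keyword in lowered for keyword in keywords):
            flagged.append(column)
    return flagged


# Made-up data standing in for a file opened with open(path, newline="").
sample = io.StringIO(
    "id,first_name,Phone_Number,income,village_gps_lat\n"
    "1,Ana,555-0100,1200,0.5\n"
)
print(flag_pii_columns(sample))  # ['first_name', 'Phone_Number', 'village_gps_lat']
```

Substring matching is deliberately broad, so flagged columns are candidates for manual review rather than confirmed PII.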
Git Software, Version Control
Git is a free and widely used version control system. It allows researchers to preserve, track, and revert to different versions of their project files in what are called Git repositories. Software Carpentry offers useful tutorials for version control with Git. GitHub is a well-designed and popular host for Git repositories, and also offers a graphical application for managing repositories. It is used for sharing project files and collaborating. GitHub Guides are excellent tutorials for learning how to use GitHub.
Zotero is the only research tool that automatically senses content in your web browser, allowing you to add it to your personal library with a single click. Whether you’re searching for a preprint on arXiv.org, a journal article from JSTOR, a news story from the New York Times, or a book from your university library catalog, Zotero has you covered with support for thousands of sites.