Garret Christensen, BITSS Project Scientist
BITSS participated in a pair of conferences/workshops recently that we should tell you about. First, BITSS was part of a research transparency conference in Washington DC put together by the Laura and John Arnold Foundation. Many of the presentations from the conference can be found here. The idea was to bring together academics, researchers on federal contracts, and federal government research sponsors and policy makers. A few things that were new to me or that stuck out:
- BITSS partner Innovations for Poverty Action is now offering a code check for its affiliates. That is, PIs working on IPA projects can get an RA to run their code and see whether it actually reproduces the numbers and tables in their paper. E-mail Stephanie Wykstra at email@example.com. (This is in addition to IPA’s data repository and their data publication guidelines.) This isn’t open to everyone, but it’s a step in the right direction; it would be great to see journals or some other organization adopt this on a larger scale.
- Brett Hemenway of the University of Pennsylvania spoke on secure multiparty computation (MPC). Basically, this uses public, proven, sometimes open-source cryptographic protocols to share the results of an analysis without sharing the underlying data. So I could merge my data with a sensitive or restricted-access dataset and run a regression, seeing only the results and never the sensitive personally identifying information.
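To give a flavor of how that can work, here is a toy sketch of additive secret sharing, one of the basic building blocks underneath many MPC protocols. Each party splits its private value into random shares that individually reveal nothing; only the combined shares reconstruct the aggregate. This is purely illustrative (the function names and the choice of a secure sum are my own), not the protocol Hemenway presented:

```python
import secrets

# All arithmetic is modulo a large prime so that each share, on its
# own, looks like uniform random noise.
P = 2**61 - 1

def share(value, n_parties):
    """Split `value` into n_parties random additive shares mod P."""
    shares = [secrets.randbelow(P) for _ in range(n_parties - 1)]
    # The final share is chosen so that all shares sum back to `value`.
    shares.append((value - sum(shares)) % P)
    return shares

def secure_sum(private_values):
    """Compute the sum of the parties' private values.

    Each party distributes one share of its value to every party;
    each party then sums the shares it holds locally, and only these
    partial sums are combined. No party ever sees another party's
    raw input, yet the total comes out exactly right.
    """
    n = len(private_values)
    all_shares = [share(v, n) for v in private_values]
    # Party i holds the i-th share from every participant.
    partials = [sum(s[i] for s in all_shares) % P for i in range(n)]
    return sum(partials) % P

# Three parties, each holding one sensitive number (e.g. an income):
incomes = [52000, 61000, 48000]
print(secure_sum(incomes))  # prints 161000, the total, without
                            # revealing any individual value
```

Real MPC systems extend this idea to multiplication and full computations like regressions, which is what makes the merge-without-seeing scenario above possible.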
- George Alter of ICPSR gave a nice presentation on ways to protect confidential data, and he discussed a simple catalog of the options: safe data (modifying the data to reduce identification risk), safe places (physical isolation or secure access technologies), safe people (data use agreements and training), and safe outputs (review of results before release to researchers).
I also gave a presentation at UC Riverside’s Graduate Quantitative Methods Program on software tools for reproducibility. Specifically, version control with GitHub, reproducible workflow with R Markdown and Knitr, data sharing with Harvard’s Dataverse, and linking all these parts together plus pre-registering your analysis with the Open Science Framework. The video may be posted under the Resources tab on their page in the future, but you can find my slides here.