Open Source Software for Reproducible Social Science

Garret Christensen, BITSS Project Scientist


 

BITSS offers grad student workshops in reproducible research, where we give a hands-on introduction to software that can help make your work more reproducible. Much of that software is listed in the Software section of our Resources page, but I wanted to write a quick narrative summary of it here.

1: Dynamic Documents

You can write your code and your paper in one place. This means you won’t screw anything up copying and pasting, and you’ll never have to wonder which code produced which figure, where on earth you saved it, or whether the paper has the updated version.
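To make the idea concrete, here’s a minimal sketch in plain Python. It is not R Markdown or MarkDoc, just an illustration of the principle: the sentence in the "paper" is generated from the data, so the number can’t go stale. The data values and the `render_report` function are made up for this example.

```python
from statistics import mean

# Made-up illustrative data, not from any real study.
scores = [2.0, 3.5, 4.0, 4.5]


def render_report(values):
    """Build the report sentence directly from the data.

    Because the number is computed at render time, changing the data
    and re-running updates the text with no copying and pasting.
    """
    avg = mean(values)
    return f"The mean score across {len(values)} respondents was {avg:.2f}."


print(render_report(scores))
```

The dynamic document tools below do the same thing at the scale of a whole paper, with the analysis code and the prose living in one file.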

In R, this can be done with R Markdown, which is built into RStudio. So download and install R and RStudio. When you open a new R Markdown file in RStudio, it starts with a really simple example, or you can learn more here.

In Stata this can be done with the user-written command MarkDoc. So you’ll have to pay an arm and a leg for Stata, and then run:

ssc install markdoc

ssc install weaver

ssc install statax

The package may have been updated recently, so you might want to run “adoupdate” if you installed it a while ago. The syntax is explained in the built-in help file. For MarkDoc to work you also need to install Pandoc, a pretty cool Swiss-army knife that converts almost any markup file to almost any other, as well as wkhtmltopdf. If you install as above, these may be installed automatically, but you may have to click on a link that will show up inside Stata.

2: Version Control

The date-and-initials method of keeping track of changes to your files doesn’t really cut it when you’re doing something complicated or you’ve got a lot of co-authors. If you really want your work to be reproducible, use real version control. It’s got a learning curve even for xkcd-type people, but it’s worth it. (Read Gentzkow and Shapiro chapter 3 on why.) Software Carpentry and GitHub have great tutorials.

To get started, download the GitHub Desktop GUI app. Note that this app is only available for Windows and Mac users. Linux users can use the command line or pick one of the other GUIs listed here. If you are comfortable using the command line, I also recommend that Windows users install Git Bash.

Next, create an account with GitHub.com. GitHub is a popular online host for your repositories (folders/projects) that are version-controlled with Git.

3: LaTeX

Word is nice and easy for writing short papers, but when you start writing longer papers, or you want to include any equations or formatting, it quickly becomes cumbersome. LaTeX is better for reproducibility: when you include your figures, you just refer to files, so there’s no question of whether you remembered to update them. LaTeX (download here) is also used by R Markdown when you make PDFs, so you have to at least have it installed in the background. This is a huge file, and you have to install the full version, so don’t leave this until the last minute.

4: The Open Science Framework (OSF)

There’s no installation for this; so far it’s just a web application. The OSF allows you to store your research files and link together all your research across several platforms, such as Dropbox, Harvard’s Dataverse, and GitHub. It version controls any files you upload, and you can register a project to create a frozen, time-stamped version with a persistent URL. So by writing a pre-analysis plan and registering it, you could prove to the world that your significant results aren’t just a successful fishing expedition. Sign up for a free account here.

5: A good text editor

You need a good way to edit plain text. On a Mac, you can set TextEdit so that plain text, not rich text (rtf), is the default format. On Windows, you can use Notepad. I’d suggest something a little more powerful, like Atom, Notepad++, or Sublime Text. These have syntax highlighting, add-on packages that can render markdown, and things like that.

So that’s it, and hopefully it’s not too difficult. Download a handful of pieces of software and you’re on your way. If you’re at a BITSS workshop, we’ll be happy to help, but please try to do as much installation as you can before the workshop starts.
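If you’d like a quick way to see what’s already on your machine, here’s a rough sketch in Python that looks for a few of the command-line tools mentioned above. The executable names in `TOOLS` are my assumptions about a typical install (for example, `Rscript` for R and `pdflatex` for LaTeX), and `find_tools` is just a helper name I made up. A tool that shows as not found may still be installed but not on your PATH, which is common on Windows, so treat the output as a hint rather than a verdict.

```python
import shutil

# Executable names are assumptions about a typical install; adjust as needed.
TOOLS = {
    "Git": "git",
    "Pandoc": "pandoc",
    "wkhtmltopdf": "wkhtmltopdf",
    "R": "Rscript",
    "LaTeX": "pdflatex",
}


def find_tools(tools):
    """Map each tool name to the path of its executable, or None if not on PATH."""
    return {name: shutil.which(exe) for name, exe in tools.items()}


if __name__ == "__main__":
    for name, path in find_tools(TOOLS).items():
        status = f"found at {path}" if path else "not found on PATH"
        print(f"{name:12} {status}")
```

GUI apps like RStudio, GitHub Desktop, and your text editor aren’t covered here; the easiest check for those is to open them.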
