Garret Christensen, BITSS Project Scientist
CEGA faculty director Ted Miguel was quoted in a Wall Street Journal blog post by Anna Louie Sussman today:
“At this point, everybody doing with [sic] work with data and economics has an expectation that their data is very likely to get posted online, that someone is going to scrutinize it, that someone is going to try to replicate it. Knowing that’s the case is going to make people document their data better and be much more careful.”
The post discusses the recent Federal Reserve Board working paper released by Andrew C. Chang and Phillip Li that attempted to replicate 67 macroeconomics papers. The authors concluded that they could successfully replicate the results from only about 50% of the papers.
The working paper may not be perfect, but it does end with what I think are a few good suggestions:
- Mandatory data and code files should be a condition of publication.
- An entry in the journal’s data and code archive should indicate whether a paper without replication files in the journal’s archive is exempt from the journal’s replication policy.
- Readme files should indicate the operating system–software version combination used in the analysis.
- Readme files should contain an expected model estimation time.
- Code that relies on random number generators should set seeds and specify the random number generator.
- Readme files should clearly delineate which files should be executed in what order to produce desired results.
- Authors should provide raw data in addition to transformed series.
- Programs that replicate estimation results should carry out the estimation.
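The seeding suggestion above can be sketched briefly. This is just an illustration in Python with NumPy (the seed value and the simulated draws are hypothetical, not from the working paper): fixing the seed and naming the generator makes simulated results bit-for-bit reproducible across runs.

```python
import numpy as np

SEED = 20150901  # assumption: any fixed, documented seed works

# Specify the generator explicitly (here NumPy's PCG64) rather than
# relying on an unstated global default.
rng = np.random.Generator(np.random.PCG64(SEED))
draws = rng.normal(loc=0.0, scale=1.0, size=5)  # e.g., simulated error terms

# A fresh generator with the same seed reproduces the draws exactly.
rng2 = np.random.Generator(np.random.PCG64(SEED))
assert np.array_equal(draws, rng2.normal(loc=0.0, scale=1.0, size=5))
```

A readme would then record both the seed and the generator (PCG64, NumPy version), so that a replicator can reproduce the exact same random draws.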