Pre-results review reaches the (economic) lab: Experimental Economics follows the Journal of Development Economics in piloting pre-results review

In its April 2019 issue, the journal Experimental Economics issued a Call for Submissions for a virtual Symposium of 5-7 papers to be published under “pre-results review”. BITSS Senior Program Associate Aleksandar Bogdanoski talked to Irenaeus Wolff of the University of Konstanz, who along with Urs Fischbacher is a guest editor for the Symposium.

1. Why do you think pre-results review can also contribute to increased rigor in lab research in economics?

Some would argue that pre-results review has limited value in lab experiments, because it’s much easier to replicate such work. Field experiments, on the other hand, are difficult, or often even impossible, to replicate, and researchers usually measure hundreds of outcomes, which leaves a lot of room for maneuvering. However, publication bias is an issue in both field and lab experimental research.

2. Is your main concern publication bias?

Yes. Under the current system, researchers have no incentive to write up lab experiments with null results, so we mostly hear about positive findings, a substantial fraction of which are false positives. It is important to note that a false positive usually is not the researcher’s fault: researchers don’t know that they are the 25th person to test the relationship between A and B, or whether the 24 earlier tests were pre-registered. In my view, these mechanisms clearly show that we also need pre-results review for lab studies: the way the publication process works is by no means unique to field research. As another indication, just look at all the experimental psychology journals that focus on lab research and have started to introduce pre-results review!

In the end, research should be evaluated based on the importance of the question and the appropriateness of the methods. If the important question is subsequently answered by “A leads to B”, then that’s fine. If, on the other hand, the question instead is answered by “A does not lead to B”, that’s just as fine.

“…[R]esearch should be evaluated based on the importance of the question and the appropriateness of the methods”.

3. When did you first suggest introducing pre-results review at Experimental Economics?

In the field of experimental economics, the idea of pre-results review has been on the table at least since 2007, when Martin Dufwenberg proposed a very similar procedure at the Economic Science Association (ESA) World Meeting in Rome. Nonetheless, for some reason we were not convincing enough when we first sent in our proposal to implement pre-results review at Experimental Economics in 2012, or when we suggested making this the “unique selling point” for the to-be-launched Journal of the Economic Science Association (JESA) a year later.

However, things changed as more people in experimental economics and the research community at large, including members of editorial boards, grew more concerned with the replication crisis and publication bias. Also, it’s certainly helpful that more and more journals across the sciences have adopted the format, including many journals in psychology, and JDE in economics.

4. Was preparing the Symposium a lot of work?

The preparation for the launch was relatively easy thanks to the wealth of materials prepared by people like Chris Chambers at the Center for Open Science and by BITSS as part of the pilot with the Journal of Development Economics. The first thing we did was to contact BITSS and the JDE editors and ask for permission to use a substantial part of their materials, which they happily granted. That was extremely helpful. Then, we worked out our ideal scenario of how the pilot should be run and discussed it with other people from both within our research group and outside. And then, of course, we discussed our ideal scenario with the Experimental Economics editors as well.

5. Did you have a hard time convincing the Editorial Board to adopt pre-results review?

There was some discussion, but most of that probably took place among the board members. In the end, there were essentially only two things that we changed based on the editors’ suggestions. First, we had suggested double-blind review at the first submission stage, which we were asked to drop to stay in line with the general Exp. Econ. policy. The board’s other concern was the possibility that authors would submit their paper to a top-five journal once the data are in and come back to Exp. Econ. after a rejection. Authors can of course withdraw their paper from Exp. Econ. to submit it to another top journal, be it before or after an in-principle acceptance, but the way back is closed: once you withdraw the paper, the in-principle acceptance is forfeited. Actually, when we proposed the pre-results review track to the editors of Exp. Econ., one of them suggested that we follow a procedure similar to the JDE’s in that respect to make the option more attractive. The first elaborated proposal envisaged that the Editorial Board would publish a citable editorial and that authors could try a top journal first. They would only have to cite that editorial, so that Exp. Econ. would get at least some benefit from the review process. However, the Editorial Board rejected that idea for two reasons. First, they did not want to waste journal space on an editorial that would not be cited by more than one or two papers after all. Second, and perhaps more importantly, they did not want to serve as a quality-insurance agency for top journals.

“(…) [W]e hope that pre-results review becomes a standard for experimental research. This does not mean that traditional submissions shouldn’t be possible, even though in an ideal world they probably would be “reserved” for exploratory research”.

6. What do you hope that pre-results review will achieve in your discipline?

First, we hope to improve the validity of research. If there is some idea that many people think is true even though it is not, then it is tremendously important that null results get out. Otherwise, researcher after researcher is going to waste money on trying to document the nonexistent effect, and eventually one of them will succeed. Then, the “news” gets out, and we have some new bogus “knowledge” that will take a lot of resources to correct.

Second, consider the case in which a researcher asks an important question that opens the door to a completely new research area, but does not immediately find the perfect design to answer it. Then, even an inconclusive result could help others to design better experiments earlier on, so publishing the results could increase the speed with which important ideas spread. Closely related to this, null results can help refine our thinking about true relationships: if we see that in everyday life A seems to lead to B, but we cannot find the effect in the lab, we might simply have overlooked that A leads to B only under specific circumstances. So, seeing null results could be an important step in figuring out the true relationship between A and B.

Looking at the broader picture, we hope that pre-results review becomes a standard for experimental research. This does not mean that traditional submissions shouldn’t be possible, even though in an ideal world they probably would be “reserved” for exploratory research.

7. What do you see as the main advantages of pre-results review?

I think they are pretty clear, and a lot has been said about them. Two things might still be worth mentioning. First, contrary to what intuition might tell us, the effort required to publish a paper may actually decrease: the analyses contained in the published version would also have been necessary under the traditional regime, whereas it is no longer necessary to rewrite full papers time and again to make them fit an ever-changing “story”. So, less effort will be spent on rewriting the “story”, and more on the planning stage. Doesn’t that sound like an efficiency enhancement?

The second point concerns pre-registration. Many people seem to think that pre-registration will go a long way toward solving the replication crisis. I don’t think so, and not just because people do not follow the registered protocol closely. To me, it seems pretty obvious that pre-registration does little to address publication bias, in particular because it does not increase the incentives to complete an article after a null result, or to accept it as an editor. In that sense, I worry that the fast spread of pre-registration might in the end block the more important step towards widespread use of results-blind review, because it might make people believe we have done enough. Another approach that people have been suggesting is to just tell editors and referees that a null result is not a reason for rejection. In fact, this policy has been in place at a number of journals for lab experiments, and my feeling is that only a very select few null results get written up. And I think that’s no wonder: given humans’ hindsight bias and our desire for novelty, it is often difficult to appreciate the importance of a null result in particular when it lands on our desks as referees or editors.

8. Do you foresee any challenges, and what are common concerns that you’ve heard?

A point that is raised time and again (which in my view is not really an issue) is that pre-results review does not allow for exploratory analysis after surprise results. This is, of course, simply wrong: the only thing we require is that exploratory analysis is marked as such. A much more pressing issue is the fear that a lot of “laboratory mishaps” get published because researchers weren’t as careful in running the experiment, given that the study had been pre-accepted anyway. I think the opposite will be true: because the research design and implementation details become more important in deciding about acceptance or rejection, people will pay more attention to the details. And so, many problematic choices that might otherwise have gone unnoticed will be addressed up front. Also, even after pre-acceptance, people will have to take responsibility for the implementation: reviewers will check whether everything was implemented and analyzed as promised, and researchers still have to care about their reputation, too.

9. A common line of criticism against pre-results review is that it may negatively affect the impact factor of the journal because null results may be less “interesting”. What is your response to such criticism?

The big question is: what is an uninteresting result? If the question is interesting and important and the methods are perfectly appropriate to study that question (and all these aspects are taken care of by pre-results review), how can it be uninteresting to learn what comes out? In their contribution to a recent special section of The American Statistician on what to do with, or instead of, p-values (which should be a must-read for every researcher), Amrhein, Trafimow, and Greenland call on journal editors to take pride in their journals’ exhaustive methods sections rather than in surprising results. And I’d like to add: editors should be proud of the reliability of the findings that they publish, not of how “attractive” they are. After all, what good is having the journal filled with surprising results if we know quite a few of them won’t replicate, but not which ones? Wouldn’t it be better to at least know that a paper was not the third attempt of the fifth author to show that A causes B, where the first four authors didn’t publish the paper because they did not find a relationship? We have to internalize finally that null results are not uninteresting, but important information.

We know that pre-results review can solve a number of issues, but it is not meant to solve every research-related issue we face. For example, it is not meant to solve the problem that some researchers might engage in outright fraud. That said, as people like Martin Dufwenberg and Peter Martinsson have pointed out in a paper that will soon be published, it will actually decrease the incentives for engaging in fraud. But for us, this is a side issue. The main point is to address the publication bias that arises as a result of the well-intended actions of many faithful researchers.

“We have to internalize finally that null results are not uninteresting, but important information”.

10. In the end, what do you hope to achieve by the time the special issue is out?

We hope that there will be a substantial number of high-quality submissions, and that the quality of the papers that end up in the Symposium is convincing. Then, we might be able to convince the editorial board, and potentially the boards of other similar journals, that there is enough demand for such a submission option and that it does not entail a drop in paper quality. As I’ve heard from other people in charge of a pre-results review process, there is another positive side effect: the average referee report is more constructive than is currently the case. And in our view, that would be an achievement in its own right!

Irenaeus Wolff is a Postdoc at the Department of Economics at the University of Konstanz and the Thurgau Institute of Economics (TWI). His research includes topics in experimental and behavioral economics, models of bounded rationality, behavioral public choice, evolution of institutions and cooperation, and evolutionary biology.
