Guest post by Arnaud Vaganay (Meta-Lab)
This post is the second of two dedicated to the reproducible interpretation of empirical results in the social sciences.
In my previous post on the interpretation of study results, I contrasted the notions of:
- Analytic reproducibility, which is concerned with the reproducibility of empirical analyses (typically reported in the findings/results section of manuscripts); and
- Hermeneutic reproducibility, which is concerned with the reproducibility of interpretations (typically reported in the discussion section).
I also argued that hermeneutic reproducibility matters, but did not address the all-important question of how to achieve it. That is the aim of this post.
I first assume that you:
- Acknowledge the replicative component of your study (not just its innovative component); and
- Have identified all the studies that addressed your research question before you, including in other disciplines. This can be done by using a recent systematic review or by conducting a reproducible literature review (since renamed cumulative literature review to emphasise its reusability). Unstructured literature reviews should be avoided because they (i) do not guarantee that the most relevant studies were reviewed; and (ii) do not compare studies in all important respects.
A reproducible discussion involves two main steps. The first step is the systematic comparison of your results with results from previous studies (as mentioned above). As far as possible, results should be compared head-to-head using both:
- Unstandardized values: by comparing the direction and statistical significance of your results with the same quantities in previous studies;
- Standardized values: by comparing the magnitude/size of your effect with the magnitude/size of effects in previous studies. Ideally, an additional test should assess whether the difference between these effects is statistically significant.
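To make the additional test mentioned above concrete, here is a minimal sketch in Python. It is only one possible approach, not a prescribed method: it assumes that the two studies used independent samples, that both effects are expressed on the same standardized scale, and that each estimate is approximately normally distributed, so that a simple z-test on the difference is reasonable. The function name `compare_effects` and all the numbers are hypothetical and serve only as an illustration.

```python
import math


def compare_effects(effect_a, se_a, effect_b, se_b):
    """Two-sided z-test for the difference between two independent effects.

    Assumes both effects are on the same standardized scale and that the
    two estimates come from independent samples (illustrative assumption).
    Returns the difference, the z statistic and the two-sided p-value.
    """
    diff = effect_a - effect_b
    # The standard error of a difference between independent estimates
    # is the square root of the sum of the squared standard errors.
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    z = diff / se_diff
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return diff, z, p_value


# Hypothetical values: your effect (0.45, SE 0.10) against a previous
# study's effect (0.20, SE 0.08).
diff, z, p_value = compare_effects(0.45, 0.10, 0.20, 0.08)
print(f"difference = {diff:.2f}, z = {z:.2f}, p = {p_value:.3f}")
```

When several previous studies are available, the same comparison could be made against a pooled (meta-analytic) estimate rather than against each study in turn; that choice is left to the author and should be stated in the pre-registered protocol.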
If your results cannot be directly compared (for example because your study analysed the data in a novel way), you should clearly mention it and invite further replications. As previously mentioned, it is through replication that the credibility of a theory can be ascertained.
The second step consists in correctly interpreting findings. If your results are in line with previous results, the effect can be considered robust and the theory corroborated (assuming no p-hacking, of course). If the results are significantly different, you should discuss the plausibility of the following scenarios:
- Your study differs significantly in terms of analysis: for example, it could be that your study is a far or very far replication of an existing study (or corpus of studies, see p.9 of the linked document). As already explained, the closer the replication, the stronger the expectation of finding a similar result;
- Your study differs significantly in terms of intervention/independent variable: if your study evaluates a policy intervention, I would recommend that you compare these interventions using the TIDieR checklist;
- Your study differs significantly in terms of sample;
- Your study differs significantly in terms of social, cultural or institutional context.
These hypotheses should be tested in subsequent studies.
These two steps should be pre-registered and any change to the original protocol flagged and justified.
Readers should bear in mind that this methodology is most effective in highly structured literatures (i.e. in studies with a clear replicative component). However, its guiding principles will hopefully be useful to all. As always, any feedback is much appreciated: firstname.lastname@example.org