“It’s not only peer-reviewed, it’s reproducible!”

Peer review is one of the oldest and most respected instruments of quality control in science and research. Peer review means that a paper is evaluated by a number of experts on the topic of the article (the peers). The criteria may vary, but most of the time they include methodological and technical soundness, scientific relevance, and presentation.

“Peer-reviewed” is a widely accepted sign of quality of a scientific paper. Peer review has its problems, but you won’t find many researchers who favour a non-peer-reviewed paper over a peer-reviewed one. As a result, if you want your paper to be scientifically acknowledged, you most likely have to submit it to a peer-reviewed journal, even though it will take more time and effort to get it published than in a non-peer-reviewed publication outlet.

Peer review helps to weed out bad science and pseudo-science, but it also has serious limitations. One of these limitations is that the primary data and other supplementary material, such as documentation and source code, are usually not available. The results of the paper are thus not reproducible. When I review such a paper, I usually have to trust the authors on a number of issues: that they have described the process of achieving the results as accurately as possible, that they have not left out any crucial pre-processing steps, and so on. When I suspect a certain bias in a survey, for example, I can only note that in the review; I cannot test for that bias in the data myself. When the results of an experiment seem too good to be true, I cannot inspect the data pre-processing to see if the authors left out any important steps.

As a result, later efforts to reproduce research results can lead to devastating outcomes. Wang et al. (2010), for example, found that almost all of the literature on a certain topic in computer science could not be reproduced.

“Reproducible”: a new quality criterion

Needless to say, this is not a desirable state. Therefore, I argue that we should start promoting a new quality criterion: “reproducible”. Reproducible means that the results achieved in the paper can be reproduced by anyone because all of the necessary supplementary resources have been openly provided along with the paper.

It is easy to see why a peer-reviewed and reproducible paper is of higher quality than just a peer-reviewed one. You do not have to take the researchers’ word for how they calculated their results – you can reconstruct them yourself. As a welcome side-effect, this would make more datasets and source code openly available. Thus, we could start building on each other’s work and aggregate data from different sources to gain new insights.

In my opinion, reproducible papers could be published alongside non-reproducible papers, just as peer-reviewed articles are usually published alongside editorials, letters, and other non-peer-reviewed content. I would think, however, that over time, reproducible would become the overall quality standard of choice – just as peer-reviewed is the preferred standard right now. To help this process, journals and conferences could designate a certain share of their space to reproducible papers. I would imagine that they would not have to do that for too long, though. Researchers will aim for a higher quality standard, even if it takes more time and effort.

I do not claim that reproducibility solves all of the problems that we see in science and research right now. For example, it will still be possible to manipulate the data to a certain degree. I do, however, believe that reproducibility as an additional quality criterion would be an important step for open and reproducible science and research.

So that you can say to your colleague one day: “Let’s go with the method described in this paper. It’s not only peer-reviewed, it’s reproducible!”

24 responses to “It’s not only peer-reviewed, it’s reproducible!”

  1. I replied to this post via several tweets https://twitter.com/PeterKraker/statuses/391216531780038656

    @PeterKraker asked me to add the comments here, which I’m more than happy to do

    · I like the sentiment of “Reproducible”. However, the advances of most physical science research cannot be added to a paper.

    · Frequently, it is not the data analysis, but rather the act of data collection itself that is the advancement.

    · That can’t be conveyed in a PDF, it can only be reproduced in a lab, and the equipment can be expensive (e.g. TEM).

    • Scott, thanks for your comment. I agree, reproducibility can mean different things depending on the discipline and the kind of scientific innovation. In the case of data collection in the physical sciences, I could imagine that a link to the experimental setup in a remote laboratory would do the trick. What do you think?

  2. I’m really surprised to see reproducibility described as a “new” quality criterion for science. It’s long been recognised (if not practised) as a defining characteristic of science.

    • Robert, by describing it as a new criterion, I was referring to the apparent discrepancy between theory and practice. In my perception, reproducibility has not been considered by journals and conferences so far, even though it is one of the defining characteristics of science. But I wouldn’t mind talking about the “renaissance” of reproducible as a quality criterion either.

  3. I’d like to draw attention to the keynote given in Berlin last July by professor Carole Goble (U of Manchester) on the topic of reproducibility, the slides of which can be found on Slideshare: http://www.slideshare.net/carolegoble/ismb2013-keynotecleangoble

    • Thanks Jan for pointing out this very comprehensive presentation. C. Goble looks at the matter of reproducibility from many different points of view, and does so in a very entertaining way. Above all, the presentation does a very good job of explaining why reproducibility is simply NOT a matter of a clear description of sample and method in the paper alone.

      I’d also like to point out Victoria Stodden’s outstanding work on reproducibility in the computational sciences: http://www.stanford.edu/~vcs/talks/UMN-Oct102013-STODDEN.pdf

  4. It appears that you are talking about reproducibility of analysis rather than the reproducibility of an experiment.

    If an honest researcher gives you data and tells you how the analysis was performed, then how could you not be able to reproduce the analysis (up to the machine precision of your computer)?

    • That depends on what you mean by “tells you how the analysis was performed”. I’d like to cite a paper that I wrote together with D. Leony, W. Reinhardt and G. Beham (you can find it here): “Knorr-Cetina (1981) already showed that papers do not contain all the methodological information needed to reproduce a certain research result. An elimination process is taking place during the production of the paper in which information is decontextualized and typified. Scientific work is usually done in a different way than it is reported on later. This is also backed by Latour (1979) who found that science is not a structured process but rather an array of incoherent observations, which need to be ordered subsequently. Furthermore, there are certain procedural remarks that are too detailed to be included in a publication. The way that these procedural remarks look like differs greatly depending on the method used. Currently, these procedural details are mostly exchanged through personal communication or joint work.”

  5. Of course I am in broad agreement with your point, but reproducibility can be tricky.

    My favorite illustration comes from my chemistry postdoc experience. One of my associates stumbled across a novel synthetic method, by the classic means of sorting through the products of a reaction that failed to go as anticipated. What an exciting time! The whole lab pulled together as we churned out example after example of the new reaction class.

    Yet, in the course of nailing down some details for the big publication, my colleague tried and failed to replicate the original experiment. To this day, no one has ever managed to make the reaction go again on the original target molecule – yet the original vial of product still exists in a freezer.

    So where was the reproducibility? Not in the literal replication of the original synthesis, but in the supporting evidence of extending and generalising the original work.

  6. This post reminds me very much of Bruce Charlton’s “Peer usage versus peer review.” http://www.bmj.com/content/335/7617/451

  7. I like these ideas. How would you embed reproducibility in the peer review process to be able to label a study as ‘reproduced & peer-reviewed’? Could journals team up with statistical training centers and integrate students (guided by a lecturer) into this? Another blog recommends that groups of scholars meet on Google Hangout or Skype to reproduce papers together (http://ivory.idyll.org/blog/a-conversation-on-reproducibility.html). I’d like to discuss some concrete ideas either here or on my own blog (http://politicalsciencereplication.wordpress.com) if anyone is interested in taking this further. Best, Nicole

    • Nicole, thanks for your comment. I like your suggestion because it decouples the reproducibility effort from the peer review process. This would take away some burden from the reviewers, and the results from the reproducibility effort could potentially inform their decision.

      The idea posted by C. Titus Brown might work very well in the computational sciences, as reviewers should usually have all of the necessary equipment to reproduce a result.

  8. I totally agree that a “reproducible paper” meets a higher standard. The question of who is ultimately responsible for this kind of review is one I tried to unpack in a blog post here: http://isps.yale.edu/news/blog/2013/07/the-role-of-data-repositories-in-reproducible-research

    • Limor, thanks for pointing out your comprehensive review. It was interesting to learn that the data curation process in your repository is driven by replication requirements. I am especially impressed that you check whether the submitted materials can actually be used to reproduce the results from the papers.

  9. I also like the idea of promoting “reproducibility” as a new, important criterion for dissemination quality. I’d also underline the importance of this kind of quality metric with regard to information visualization. For instance, Plaisant, Fekete et al. (2008) promote the development of benchmarks to facilitate the comparison of certain visualization techniques. North (2006) and others have already tried to introduce and describe methods to produce comparable visualization evaluation results. I also underlined the importance of repositories for comparison in one of my latest publications. Thank you for introducing this term as a quality metric and reminding me of its importance!

  10. @Peter: First and foremost, thank you for linking to the open access version of the paper! 😉 To your question: Science mapping is surely another interesting example usage, and I like the idea of visually approaching the question of mapping science! Doing this visually can reduce complexity. Currently, much research is being published on the topic of time visualization, and this could benefit science mapping research too. I also recently came across certain research questions on model evolution. One of the latest case studies I participated in dealt with the question of how (biological) model visualizations evolve over time. This is also a really exciting area for new project ideas! 😉

  11. Nice article! As a reproducibility advocate, one of the criticisms I’ve often run into is that making sure things are reproducible takes time away from “getting real work done”. I hope these people realize how critical reproducibility truly is — without it, what is science? We throw away so much information about how our data analysis was performed that it’s sickening!

    • Thanks, Matt! I agree, investing time into reproducibility will pay off in the long run. In the short term, however, it may seem that you are losing time. That’s why I think that if reproducibility were seen as a quality standard, it would be easier to get people to commit to it.
