Automatizing experiment reproducibility using semantic models and container virtualization

Tracking #: 2128-3341

This paper is currently under review
Carlos Buil Aranda
Idafen Santana
Maximiliano Osorio

Responsible editor: Guest Editors Semantic E-Science 2018

Submission type: Full Paper
Experimental reproducibility is a cornerstone of the scientific method: it allows an experiment to be rerun to verify its validity, and it advances science by letting others build on previous results and introduce changes to them. To achieve this for current in-silico experiments, the underlying infrastructure in which the experiment is executed (i.e., computational resources and software components) must be conserved. This is a major challenge, since running the same experiment in different execution environments may produce significantly different results, assuming the scientist manages to run the experiment at all. In this work, we propose a method that extends existing semantic models and systems to automatically describe the execution environment of scientific workflows. Our approach makes it possible to identify issues between different execution environments, easing experimental reproducibility. We also propose the use of container virtualization to distribute and disseminate experiments. We evaluated our approach using three different workflow management systems for a total of five different experiments, showing that it is feasible both to reproduce the experiments and to identify potential execution issues.