Automatizing experiment reproducibility using semantic models and container virtualization

Tracking #: 2264-3477

This paper is currently under review
Carlos Buil Aranda
Maximiliano Osorio
Idafen Santana

Responsible editor: 
Guest Editors Semantic E-Science 2018

Submission type: 
Full Paper
Experimental reproducibility is a cornerstone of the scientific method: it allows an experiment to be re-run to verify its validity, and it advances science by letting researchers build on previous results and introduce changes to them. To achieve this goal in the context of current in-silico experiments, it is mandatory to address the conservation of the underlying infrastructure (i.e., computational resources and software components) in which the experiment is executed. This represents a major challenge, since executing the same experiment in different execution environments may lead to significantly different results, assuming the scientist manages to run the experiment at all. In this work, we propose a method that extends existing semantic models and systems to automatically describe the execution environment of scientific workflows. Our approach makes it possible to identify issues between different execution environments, easing experimental reproducibility. We have evaluated our approach using three different workflow management systems for a total of five different experiments, running on a container virtualization system (i.e., Docker). The results show the feasibility of our approach both for reproducing the experiments and for identifying potential execution issues.