The concept of reproducibility is essential for building confidence in science. Open access to the software and data used to produce a scientific result is one of the necessary, but not sufficient, conditions for ensuring the reproducibility of that result.

The reusability of data and code is a major issue for reproducibility.

There are several key concepts (see K. Hinsen, ReScience, 2017):

  • Rerunnable
    • Can you re-run your program?
    • One day, one week, one month, one year (just kidding) apart?
  • Repeatable
    • Can you re-run your program and get the same results?
    • Did you save everything, including the random seed?
  • Reproducible
    • Can someone re-run your program and get the same results?
    • Did you save the software stack?
  • Replicable
    • Can someone reimplement your model and get the same results?
    • Did you describe everything?
  • Reusable
    • Can someone reuse your program using different data?
    • Is your software data-dependent?
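
The "Repeatable" question above (did you save everything, including the random seed?) can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the function names, the toy computation and the file name run_record.json are hypothetical and do not come from the cited work.

```python
import json
import random


def run_experiment(seed: int, n_samples: int = 5) -> dict:
    """Run a toy stochastic computation and return the result with its parameters."""
    # A local generator avoids hidden global random state.
    rng = random.Random(seed)
    samples = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    return {"seed": seed, "n_samples": n_samples, "mean": sum(samples) / n_samples}


def save_record(record: dict, path: str) -> None:
    """Store the seed, the parameters and the result next to each other."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(record, fh, indent=2)


def replay(path: str) -> dict:
    """Re-run the computation from the saved seed and parameters."""
    with open(path, encoding="utf-8") as fh:
        saved = json.load(fh)
    return run_experiment(saved["seed"], saved["n_samples"])


if __name__ == "__main__":
    record = run_experiment(seed=20170101)
    save_record(record, "run_record.json")  # hypothetical file name
    # Repeatable: the replay reproduces the stored result exactly.
    assert replay("run_record.json") == record
```

Saving the seed makes a run repeatable on the same machine with the same software. It says nothing about what happens when the Python version or the libraries change, which is the concern of the "Reproducible" question.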

To meet these needs for reproducibility, releasing the code under an open licence is an essential first step towards making it accessible. It is far from sufficient, however: the code runs in a complex environment whose dependencies are rarely fully under control.
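
To make the dependency problem concrete, here is a minimal Python sketch that records part of the environment a result was produced in: the interpreter version, the platform and the installed Python distributions. The function name and the output layout are assumptions made for this example. Such a snapshot only describes the Python layer; it does not capture system libraries or compilers, and it does not by itself allow the environment to be rebuilt.

```python
import json
import platform
import sys
from importlib import metadata


def snapshot_environment() -> dict:
    """Collect a description of the Python environment (hypothetical helper)."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip distributions with broken metadata
            packages[name] = dist.version
    return {
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        "platform": platform.platform(),
        "executable": sys.executable,
        "packages": packages,
    }


if __name__ == "__main__":
    # Print the snapshot so it can be archived alongside the results.
    print(json.dumps(snapshot_environment(), indent=2, sort_keys=True))
```

Recording the environment is not the same as being able to redeploy it: the list tells a reader what was installed, not how to obtain exactly the same binaries again.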

There are tools available to help control and preserve the environment in which the results were obtained: these include package managers such as Nix or Guix.

Notebooks integrate narrative text, computer code and the results of that code in a single document. They are therefore also useful tools for reproducibility, particularly because they document how the software is used (which encourages replicability).

Package management systems

Reproducibility of software environments, i.e. the ability to deploy the exact software stack used for a piece of scientific research, is one of the most challenging aspects of reproducibility. Having the source code alone does not guarantee that calculations can be reproduced.

The usual package management systems (apt, rpm, etc.) have limitations that make them of little use in the context we are interested in: they require system administration rights, and they only allow you to deploy one software environment at a time.

Guix

Guix and Nix are tools that do not have these constraints and allow you to create completely reproducible software environments.

These two systems are available on the GRICAD computing infrastructure.
Documentation is available here.
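
As a rough illustration of how Guix can pin an environment, the Python sketch below only assembles a command line; it does not run Guix. It assumes that the Guix revision was recorded beforehand with "guix describe --format=channels > channels.scm" and that the required packages are listed in a manifest file. The helper name and the file names channels.scm, manifest.scm and analysis.py are hypothetical, and the setup on a given cluster may differ, so the infrastructure's own documentation takes precedence.

```python
import shlex


def guix_replay_command(channels_file: str, manifest_file: str, command: list) -> list:
    """Build the argv that re-runs `command` in a pinned Guix environment.

    `guix time-machine` selects the Guix revision described in the channels
    file; `guix shell --pure` then creates an environment that contains only
    the packages listed in the manifest.
    """
    return [
        "guix", "time-machine", f"--channels={channels_file}",
        "--", "shell", "--pure", f"--manifest={manifest_file}",
        "--", *command,
    ]


if __name__ == "__main__":
    argv = guix_replay_command(
        "channels.scm",               # hypothetical: saved `guix describe` output
        "manifest.scm",               # hypothetical: list of required packages
        ["python3", "analysis.py"],   # hypothetical analysis script
    )
    # Print the command instead of running it, since Guix may not be installed.
    print(shlex.join(argv))
```

Keeping the channels file and the manifest under version control next to the code is what lets someone else redeploy the same software stack later.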

“Guix Coffees” are organised on a regular basis.