Speaker
Description
Providing dependencies for computational workflows in shared environments, like HPC clusters, requires users to define the software stack a workflow needs in order to execute. The Environment Modules system is the tool of choice on BinAC for this purpose. In this work, we present a use case that requires an application stack not available via Environment Modules on BinAC. For this particular problem, we propose a containerized software stack built with the Singularity and Docker container platforms. We also present a solution for the reproducible provisioning of identical software stacks across HPC and non-HPC environments. The approach uses a Docker image as the basis for a Singularity container. This makes it possible to define arbitrary software stacks and to execute workflows across different environments, from local workstations to HPC clusters, with identical software versions and hence the same computational output.
Abstract (optional)
Providing runtime dependencies for computational workflows in shared environments, like HPC clusters, requires appropriate management effort from users and administrators.
Cluster users define the software stack a workflow requires to execute successfully, while administrators maintain the mechanisms that offer libraries and applications in different versions and combinations, giving users maximum flexibility.
The Environment Modules system is the tool of choice on bwForCluster BinAC for this purpose.
In this work, we present a workflow that requires an application stack not available via Environment Modules on BinAC.
For this particular problem, we propose a containerized, user-defined software stack built with the Singularity and Docker container platforms.
Additionally, we present a solution for the reproducible provisioning of identical software stacks across HPC and non-HPC environments.
The approach uses a Docker image as the basis for a Singularity container.
This allows users to define arbitrary software constellations and to execute their workflows across different environments, from local workstations to HPC clusters, always with identical versions of software and libraries and hence the same computational output.
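As a rough illustration of the Docker-to-Singularity step described above, the following minimal sketch assembles the two Singularity command lines involved: one that builds a Singularity image from a Docker image, and one that runs a workflow command inside it. The image reference `example/workflow-stack:1.0`, the file name `workflow-stack.sif`, and the script `run_workflow.py` are hypothetical placeholders, not names from this work, and the sketch only prints the commands rather than executing them.

```python
import shlex


def singularity_build_cmd(docker_ref, sif_path):
    """Return the command that builds a Singularity image from a Docker image.

    `docker_ref` is a Docker image reference such as "repo/name:tag".
    """
    return ["singularity", "build", sif_path, f"docker://{docker_ref}"]


def singularity_exec_cmd(sif_path, workflow_cmd):
    """Return the command that runs `workflow_cmd` inside the built image."""
    return ["singularity", "exec", sif_path, *workflow_cmd]


if __name__ == "__main__":
    # Hypothetical names, chosen only for illustration.
    docker_ref = "example/workflow-stack:1.0"
    sif_path = "workflow-stack.sif"

    # The same two commands can be issued on a workstation or on a cluster node.
    print(shlex.join(singularity_build_cmd(docker_ref, sif_path)))
    print(shlex.join(singularity_exec_cmd(sif_path, ["python3", "run_workflow.py"])))
```

Referencing the Docker image by a fixed tag, as assumed here, is one way to keep the software stack the same in every environment where the container is built.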