Run containers on HPC with Shifter (and Singularity)

Overview

Teaching: 15 min
Exercises: 5 min
Questions
  • How can I manage and run containers on an HPC system?
Objectives
  • Learn how to manage and run containers on an HPC cluster with Shifter

Why not Docker on HPC?

There are a few issues preventing Docker from being used as a container engine on HPC systems:

  • the Docker daemon runs with root privileges, and effectively grants its users root access to the host, which is not acceptable on shared, multi-user systems;
  • Docker is not designed to integrate with HPC resource managers (such as SLURM or PBS) or with parallel filesystems;
  • Docker provides no straightforward way of accessing the host interconnect and MPI libraries for distributed workloads.

Fortunately, a number of alternatives are available to run containers at HPC facilities, including:

  • Shifter (in its NERSC and CSCS flavours);
  • Singularity;
  • Charliecloud;
  • udocker.

At the moment, Pawsey is using CSCS Shifter on its HPC systems, and therefore this will be the tool of choice in this tutorial.

NCI (the National Computational Infrastructure, in Canberra) uses Singularity on its HPC systems; examples for Singularity will be provided as well.

How to log in to Pawsey HPC systems?

Pawsey currently has two systems, Magnus and Zeus; we're using Zeus for this tutorial. You can log in using the ssh command and your Pawsey access credentials (these will be provided for live workshops):

$ ssh <your-pawsey-account-name>@zeus.pawsey.org.au

After this, you’ll be prompted to enter your Pawsey account password.

Pulling and managing images with Shifter

To use Shifter on Pawsey HPC systems, we first need to load the corresponding module:

$ module load shifter

In principle, the command to pull container images is very similar to Docker's: shifter pull.
However, to avoid disk quota issues on Pawsey HPC systems, the following syntax is recommended; it makes use of the Linux sg command to run the pull under your project group, for instance:

$ sg $PAWSEY_PROJECT -c 'shifter pull ubuntu'
# image     : index.docker.io/library/ubuntu/latest
# cacheDir  : /group/shifterrepos/mdelapierre/.shifter/cache
# tmpDir    : /tmp
# imageDir  : /group/shifterrepos/mdelapierre/.shifter/images
> save image layers ...
> pulling        : sha256:f85999a86bef2603a9e9a4fa488a7c1f82e471cbb76c3b5068e54e1a9320964a
> pulling        : sha256:da1315cffa03c17988ae5c66f56d5f50517652a622afc1611a8bdd6c00b1fde3
[..]
> extracting     : /group/shifterrepos/mdelapierre/.shifter/cache/sha256:f85999a86bef2603a9e9a4fa488a7c1f82e471cbb76c3b5068e54e1a9320964a.tar
> make squashfs ...
> create metadata ...
# created: /group/shifterrepos/mdelapierre/.shifter/images/index.docker.io/library/ubuntu/latest.squashfs
# created: /group/shifterrepos/mdelapierre/.shifter/images/index.docker.io/library/ubuntu/latest.meta
$ sg $PAWSEY_PROJECT -c 'shifter pull centos'
$ sg $PAWSEY_PROJECT -c 'shifter pull busybox'

Again similarly to Docker, we can list locally pulled images with shifter images:

$ shifter images
library/centos                   latest                       ea4b646d9000   2018-11-27T07:05:23   69.62MB      index.docker.io
library/busybox                  latest                       7dc9d60af829   2018-12-19T22:31:48   704.00KB     index.docker.io
library/ubuntu                   latest                       d71fc6939e16   2018-12-19T22:30:41   29.94MB      index.docker.io

and remove undesired images with shifter rmi:

$ shifter rmi busybox
removed image index.docker.io/library/busybox/latest

Running images with Shifter

Let us change to our group directory with:

$ cd $MYGROUP

and then run ls using the Ubuntu image we just pulled, via shifter run:

$ shifter run ubuntu ls

The output will display the content of the current host directory!

A few differences in behaviour can be noticed compared to Docker, so that using Shifter typically requires fewer options and flags:

  • host directories, including the current working directory, are visible inside the container by default, so no bind-mounting flags such as -v are required;
  • inside the container you run as your host user, not as root, so no user-mapping flags are required;
  • no -i or -t flags are required to run interactively (see below).

As additional examples, you might want to run:

$ shifter run ubuntu ls /
$ shifter run ubuntu whoami

Note how no flags are required to run a container interactively. To launch an interactive shell inside the container, just run it without any command, for instance:

$ shifter run ubuntu
mdelapierre@zeus-1:/group/pawsey0001/mdelapierre$ exit   # or hit CTRL-D

Shifter has support to run containers exploiting MPI parallelism and GPU acceleration (the latter only through CSCS Shifter).
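As a hedged sketch only (the --mpi flag is specific to CSCS Shifter, <mpi-enabled-image> and ./my_mpi_program are placeholders, and the image needs to ship an MPI library compatible with the host one), an MPI run launched through the scheduler, which is covered in the next section, might look like:

$ srun --export=all -n 4 shifter run --mpi <mpi-enabled-image> ./my_mpi_program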

Using Shifter with a job scheduler

Shifter is compatible with SLURM, the job scheduler installed on Pawsey HPC systems. In particular, the SLURM job launcher srun is compatible with shifter run, and the two syntaxes can be combined.

As an example, the following script uses an Ubuntu container to output the machine hostname:

#!/bin/bash -l
  
#SBATCH --account=<your-pawsey-project>
#SBATCH --partition=workq
#SBATCH --ntasks=1
#SBATCH --time=00:05:00
#SBATCH --export=NONE
#SBATCH --job-name=hostname

module load shifter

srun --export=all shifter run ubuntu hostname

Now use your favourite text editor to copy and paste the script above into a file called hostname.sh somewhere under $MYSCRATCH or $MYGROUP (remember to specify your Pawsey project ID in the script!), and then submit it with SLURM. If you are running this during a live workshop, add the flag --reservation <your-pawsey-reservation> to use the compute nodes that have been reserved for the event:

$ sbatch --reservation <your-pawsey-reservation> hostname.sh
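Once the job has completed, the hostname of the compute node it ran on will be in the SLURM output file, which by default is created as slurm-<jobid>.out in the submission directory:

$ cat slurm-<jobid>.out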

Can Shifter build container images?

Shifter cannot build container images. The best way to create an image to be pulled and run with Shifter is to build it with Docker on a separate machine (see the previous episode).
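As a minimal sketch of that workflow (the image name myname/myapp:1.0 is hypothetical), you would build and push the image with Docker on a machine where Docker is available, then pull it on the HPC system with Shifter:

# on your own machine
$ docker build -t myname/myapp:1.0 .
$ docker login
$ docker push myname/myapp:1.0

# on the Pawsey HPC system
$ sg $PAWSEY_PROJECT -c 'shifter pull myname/myapp:1.0'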

Run a Python app in a container on HPC

First, pull the container continuumio/miniconda3:4.5.12.

Then, with your favourite text editor create a file called app.py with the following content:

import sys

def print_sums(data):
    # write the sum of each input row to the file "row_sums",
    # and also print it to standard output
    with open("row_sums",'w') as output:
        for line in data:
            row = 0
            for word in line.strip().split():
                row += int(word)
            output.write(str(row)+"\n")
            print("Sum of the row is ",row)

# read from the file given as first argument, or fall back to standard input
if len(sys.argv) > 1 and sys.argv[1] != "-":
    with open(sys.argv[1], 'r') as infile:
        print_sums(infile)
else:
    print_sums(sys.stdin)

and an input file input containing:

1 2 3
4 5 6
7 8 9

The app reads rows of integers and outputs their sums line by line. Input can be given through a file or via standard input. The output is printed in human-readable form to standard output, and also written in raw form to a file named row_sums.

Now, run python app.py using the container image you have just pulled. For instance, give the input filename as an argument to the app.

Finally, re-run it by means of a SLURM script called python_slurm.sh.

Solution

Pull the container image:

$ sg $PAWSEY_PROJECT -c 'shifter pull continuumio/miniconda3:4.5.12'

Run the app:

$ shifter run continuumio/miniconda3:4.5.12 python app.py input
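Since the app falls back to standard input when no filename is given, you could equivalently feed it the input file through a shell redirection:

$ shifter run continuumio/miniconda3:4.5.12 python app.py < input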

SLURM script for scheduler submission, python_slurm.sh (insert Pawsey Project ID!):

#!/bin/bash -l

#SBATCH --account=<your-pawsey-project>
#SBATCH --partition=workq
#SBATCH --ntasks=1
#SBATCH --time=00:05:00
#SBATCH --export=NONE
#SBATCH --job-name=python

module load shifter

srun --export=all shifter run continuumio/miniconda3:4.5.12 python app.py input

SLURM submission:

$ sbatch --reservation <your-pawsey-reservation> python_slurm.sh

HPC containers with Singularity at NCI

Raijin, the NCI HPC system, uses Singularity as the container engine, instead of Shifter. However, much of the user-facing interface is similar, or even the same. The biggest difference is that on Raijin you cannot pull and use your own images; instead, you’ll need to contact the NCI Help Desk at help@nci.org.au and ask for your image to be added to the library.

This allows the NCI staff to inspect and sanitise the containers before use, for example to ensure that they can use the system MPI libraries, or at least contain a compatible version. Images must be able to be built from a build script, rather than being distributed as an opaque filesystem image. Mount points for the system's Lustre filesystems are also included at build time, so that all of your user data is available at the same locations as in the native environment (e.g. /home, /short, and /g/data).
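For reference, a Singularity build script is a definition file; a minimal, purely illustrative sketch that starts from a Docker base image might look like the following (the package installed and the run command are assumptions for the example):

Bootstrap: docker
From: centos:7

%post
    # build-time commands, run as root inside the container
    yum install -y vim

%runscript
    # default action for "singularity run"
    cat /etc/centos-release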

First of all, let us load the Singularity module:

$ module load singularity

You can run an interactive shell inside the container using the singularity shell command. Here, we are using a CentOS image:

$ singularity shell /apps/singularity/images/centos7/centos7-latest.simg 
Singularity: Invoking an interactive shell within container...
 
Singularity centos7-2018051701.simg:~> cat /etc/centos-release
CentOS Linux release 7.5.1804 (Core) 
Singularity centos7-2018051701.simg:~> 
Singularity centos7-2018051701.simg:~> whoami
bjm900
Singularity centos7-2018051701.simg:~> exit

You can run a specific command within the container using the singularity exec command:

$ singularity exec /apps/singularity/images/centos7/centos7-latest.simg cat /etc/centos-release
CentOS Linux release 7.5.1804 (Core) 

Note that the CentOS version in the container image is 7.5, whereas Raijin’s native OS is (currently) CentOS 6.10:

$ cat /etc/centos-release
CentOS release 6.10 (Final)

On Raijin, Singularity is also integrated with the PBS batch scheduling system. This allows you to specify the image to use via the directive #PBS -l image in your job script (or similarly on the qsub command line):

#!/bin/bash
#PBS -P <your-nci-project>
#PBS -q normal
#PBS -l ncpus=1
#PBS -l walltime=00:05:00
#PBS -l image=centos7

module load singularity

cat /etc/centos-release

Of course, you can also just use singularity exec within your job script as in the example above:

#!/bin/bash
#PBS -P <your-nci-project>
#PBS -q normal
#PBS -l ncpus=1
#PBS -l walltime=00:05:00

module load singularity

singularity exec /apps/singularity/images/centos7/centos7-latest.simg cat /etc/centos-release

Key Points

  • Shifter has a simple syntax that allows you to pull, manage and run containers on HPC systems

  • shifter pull and shifter run are the key commands