Deploy containers on a supercomputer
Overview
Teaching: 15 min
Exercises: 20 minQuestions
How can I execute commands in a container with Singularity or Shifter?
How are variables and directories shared between host and container?
Objectives
Download and run containers on a supercomputer
Manage sharing of variables and directories with the host
Run a real-world bioinformatics application in a container
Download and run containers
Before starting, let us cd into the exercises
subdirectory of the tutorial repository directory (see also Setup page):
cd ~/isc-tutorial/exercises
Singularity
Download an old Ubuntu image using:
singularity pull docker://ubuntu:18.04
INFO: Converting OCI blobs to SIF format
INFO: Starting build...
Getting image source signatures
Copying blob 01bf7da0a88c done
Copying blob f3b4a5f15c7a done
Copying blob 57ffbe87baa1 done
Copying config 4e0d916041 done
Writing manifest to image destination
Storing signatures
2021/05/19 08:32:37 info unpack layer: sha256:01bf7da0a88c9e37ae418d17c0aeed0621524848d80ccb9e38c67e7ab8e11928
2021/05/19 08:32:38 info unpack layer: sha256:f3b4a5f15c7a0722b4f22e61b5387317eaf2602c27ffb2bceac9a25f19fbd156
2021/05/19 08:32:38 info unpack layer: sha256:57ffbe87baa135002dddb7a7460082c5d6a352186e1be9464c5f31db81378824
INFO: Creating SIF file...
INFO: Build complete: ubuntu_18.04.sif
Note how you need to prepend docker://
to the image name, to tell Singularity you’re downloading an image in Docker format (the default would be to download a SIF image).
The image has been converted into singularity format and the file has been saved in your current directory:
ls
ubuntu_18.04.sif
Now let’s execute some Linux commands from within the container, whoami
and cat /etc/os-release
:
singularity exec ubuntu_18.04.sif whoami
tutorial
singularity exec ubuntu_18.04.sif cat /etc/os-release
NAME="Ubuntu"
VERSION="18.04.5 LTS (Bionic Beaver)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 18.04.5 LTS"
VERSION_ID="18.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=bionic
UBUNTU_CODENAME=bionic
Note that, with Singularity, the user in the container is the same as in the host machine. Also note that the operating system information comes from the container and not from the host.
Singularity has a dedicated syntax to open an interactive shell prompt in a container:
singularity shell ubuntu_18.04.sif
Singularity>
Exit the shell by typing exit
or hitting Ctrl-D
.
Finally, note that you can request Singularity to execute a container straight away, if the image is not available locally it will be pulled first, and stored in the Singularity cache:
singularity exec docker://ubuntu:16.04 cat /etc/os-release
INFO: Converting OCI blobs to SIF format
INFO: Starting build...
Getting image source signatures
Copying blob 92473f7ef455 done
Copying blob fb52bde70123 done
Copying blob 64788f86be3f done
Copying blob 33f6d5f2e001 done
Copying config f911e561e9 done
Writing manifest to image destination
Storing signatures
[..]
INFO: Creating SIF file...
NAME="Ubuntu"
VERSION="16.04.7 LTS (Xenial Xerus)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 16.04.7 LTS"
VERSION_ID="16.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
VERSION_CODENAME=xenial
UBUNTU_CODENAME=xenial
By default, the cache is stored in ~/.singularity
; this location can be customised using the environment variable SINGULARITY_CACHEDIR
.
A subcommand, singularity cache
, can be used to manage the cache.
Shifter
Let’s download the same Ubuntu image as above, using shifterimg
:
shifterimg pull ubuntu:18.04
2020-11-02T22:38:46 Pulling Image: docker:ubuntu:18.04, status: READY
Locally stored images are managed by Shifter itself:
shifterimg images
mycluster docker READY 0e855866b8 2020-11-02T22:38:46 ubuntu:18.04
What’s the container user with Shifter? Let’s use both id -u
and whoami
:
shifter --image=ubuntu:18.04 whoami
tutorial
shifter --image=ubuntu:18.04 id -u
1001
Again, these are propagated from the host.
You can try more Linux commands:
shifter --image=ubuntu:18.04 cat /etc/os-release
NAME="Ubuntu"
VERSION="18.04.5 LTS (Bionic Beaver)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 18.04.5 LTS"
VERSION_ID="18.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=bionic
UBUNTU_CODENAME=bionic
NOTE: If you need to open an interactive shell in the container with Shifter, just execute bash
in the container.
Share host environment variables
With Singularity and Shifter, host environment variables are shared with the container by default:
export HELLO="world"
You can access that variable inside the container with both Singularity:
singularity exec ubuntu_18.04.sif bash -c 'echo $HELLO'
world
and Shifter:
shifter --image=ubuntu:18.04 bash -c 'echo $HELLO'
world
Note the use of bash -c
and single quotes to force the evaluation of the environment variable inside the container and not on the host commnad line.
There are some additional user options worth discussing, for further tuning of the shell environment. In some cases, e.g. when using Python containers, you might need to isolate the container shell from the host and not mess up Python environment variable definitions that already exist inside the container. To this end, use -e
with Singularity:
singularity exec -e ubuntu_18.04.sif bash -c 'echo $HELLO'
and -E
with Shifter:
shifter -E --image=ubuntu:18.04 bash -c 'echo $HELLO'
If you need to pass variables to the container in this situation, you can use a dedicated syntax. With Singularity, variables named with the prefix SINGULARITYENV_
are shared with the container:
export SINGULARITYENV_BYE="moon"
singularity exec -e ubuntu_18.04.sif bash -c 'echo $BYE'
moon
or, from version 3.6.x
on, variables can be defined in the command line:
singularity exec -e --env BYE="moon" ubuntu_18.04.sif bash -c 'echo $BYE'
moon
And with Shifter:
shifter -E --env BYE="moon" --image=ubuntu:18.04 bash -c 'echo $BYE'
moon
What about Podman?
By default and similar to Docker, Podman isolates host and container shell environments:
export HELLO="world"
podman run ubuntu:18.04 bash -c 'echo $HELLO'
You can pass specific variables to the container by using the flag --env
(or -e
) :
podman run --env HELLO ubuntu:18.04 bash -c 'echo $HELLO'
world
Or even redefine variables, with the same flag:
podman run --env BYE=moon ubuntu:18.04 bash -c 'echo $BYE'
moon
Use host directories
Regarding the default working directory, the two container engines have different behaviours. Let’s see this with an example, the container image marcodelapierre/ubuntu_workdir:18.04
, which has with a custom WORKDIR
in the Dockerfile:
FROM ubuntu:18.04
WORKDIR "/workdir"
How does Singularity behave?
singularity exec docker://marcodelapierre/ubuntu_workdir:18.04 pwd
/home/tutorial/isc-tutorial/exercises
Singularity always uses the host current working directory.
Now, how about Shifter?
shifterimg pull marcodelapierre/ubuntu_workdir:18.04
shifter --image=marcodelapierre/ubuntu_workdir:18.04 pwd
/home/tutorial/isc-tutorial/exercises
Shifter uses the host current working directory, too.
With both Singularity and Shifter, the container internal filesystem is read-only, so if you want to write output files you must do it in a bind-mounted host directory.
Typically, HPC administrators will configure the container engine for you, so that host filesystems containing data and software are mounted by default.
In the unlikely scenario where you need to bind-mount additional paths, Singularity offers handy methods for users. For instance:
singularity exec ubuntu_18.04.sif ls /data2
ls: cannot access /data2: No such file or directory
singularity exec -B /data2 ubuntu_18.04.sif ls /data2
file2
or using the SINGULARITY_BINDPATH
(or SINGULARITY_BIND
) variables:
export SINGULARITY_BINDPATH="/data2"
singularity exec ubuntu_18.04.sif ls /data2
file2
What about Podman?
podman run marcodelapierre/ubuntu_workdir:18.04 pwd
/workdir
podman run ubuntu:18.04 pwd
/
Similar to Docker, Podman follows WORKDIR
as defined in the Dockerfile; if undefined it defaults to the root dir /
.
Also remember that, similar to Docker, Podman does not mount any host directories (although the system administrators may enable default mounting of relevant directories):
podman run -w $(pwd) marcodelapierre/ubuntu_workdir:18.04 pwd
Error: workdir "/home/tutorial/isc-tutorial/exercises" does not exist on container d42af9a0aaed56b840b5a6be3aa0e2ba19114ead456036573491e54816e185ab
So, if you want to run the container in the host current directory, you need to use -w
to specify the work directory, and -v
to mount the desired host directory:
podman run -v $(pwd):$(pwd) -w $(pwd) marcodelapierre/ubuntu_workdir:18.04 pwd
/home/tutorial/isc-tutorial/exercises
Do It Yourself: BLAST example
Now you’re going to run a BLAST (Basic Local Alignment Search Tool) example with a container from BioContainers.
BLAST is a tool bioinformaticians use to compare a sample genetic sequence to a database of known sequences; it’s one of the most widely used bioinformatics packages.
This example is adapted from the BioContainers documentation.
For this exercise, use Singularity.
Try and achieve what you’re asked to do, use the solution only if you’re lost.
Before you start, change directory to blast
:
cd ~/isc-tutorial/exercises/blast
Pull the image
First, download the following container image:
quay.io/biocontainers/blast:2.9.0--pl526h3066fca_4
Solution
singularity pull docker://quay.io/biocontainers/blast:2.9.0--pl526h3066fca_4
Run a test command
Now, run a simple command using that image, for instance
blastp -help
, to verify that it actually works.Solution
$ singularity exec blast_2.9.0--pl526h3066fca_4.sif blastp -help
USAGE blastp [-h] [-help] [-import_search_strategy filename] [..] -use_sw_tback Compute locally optimal Smith-Waterman alignments?
Now, the exercise directory contains a human prion FASTA sequence, P04156.fasta
and a gzipped reference database to blast against, zebrafish.1.protein.faa.gz
.
First, uncompress the database (you can use host commands for this):
gunzip zebrafish.1.protein.faa.gz
Run the analysis
We need to perform two tasks:
- prepare the zebrafish database with
makeblastdb
:makeblastdb -in zebrafish.1.protein.faa -dbtype prot
- run the alignment with
blastp
:blastp -query P04156.fasta -db zebrafish.1.protein.faa -out results.txt
Start from these commands and adapt them so you can run them from within a container.
Solution
singularity exec blast_2.9.0--pl526h3066fca_4.sif makeblastdb -in zebrafish.1.protein.faa -dbtype prot
Building a new DB, current time: 11/16/2019 19:14:43 New DB name: /home/ubuntu/singularity-containers/demos/blast_db/zebrafish.1.protein.faa New DB title: zebrafish.1.protein.faa Sequence type: Protein Keep Linkouts: T Keep MBits: T Maximum file size: 1000000000B Adding sequences from FASTA; added 52951 sequences in 1.34541 seconds.
singularity exec blast_2.9.0--pl526h3066fca_4.sif blastp -query P04156.fasta -db zebrafish.1.protein.faa -out results.txt
The final results are stored in results.txt
:
less results.txt
Score E
Sequences producing significant alignments: (Bits) Value
XP_017207509.1 protein piccolo isoform X2 [Danio rerio] 43.9 2e-04
XP_017207511.1 mucin-16 isoform X4 [Danio rerio] 43.9 2e-04
XP_021323434.1 protein piccolo isoform X5 [Danio rerio] 43.5 3e-04
XP_017207510.1 protein piccolo isoform X3 [Danio rerio] 43.5 3e-04
XP_021323433.1 protein piccolo isoform X1 [Danio rerio] 43.5 3e-04
XP_009291733.1 protein piccolo isoform X1 [Danio rerio] 43.5 3e-04
NP_001268391.1 chromodomain-helicase-DNA-binding protein 2 [Dan... 35.8 0.072
[..]
When you’re done, quit the view by hitting the q
button.
Key Points
Download a container image with
singularity pull
orshifterimg pull
Execute commands in a container with
singularity exec
orshifter
By default Singularity and Shifter pass all host variables to the container
By default Singularity uses the host current directory as the container working directory, whereas Shifter gives precedence to
WORKDIR
from the DockerfileDefine container specific shell variables with Singularity by prefixing them with
SINGULARITYENV_
Mount additional host directories with Singularity with the flag
-B
, or the variablesSINGULARITY_BINDPATH
orSINGULARITY_BIND