Build your own container image with Docker

Overview

Teaching: 20 min
Exercises: 10 min
Questions
Objectives
  • Learn what a Dockerfile is and its basic syntax

  • Learn how to build a container and push it to a web registry

What is a Dockerfile?

A Dockerfile is a recipe to build a container image with Docker. It is basically a collection of the standard shell commands you would type at the prompt to build your software; in addition, it contains Docker-specific instructions that handle the build process. We will see some examples below.

Let’s write a Dockerfile

We will build a very simple container image: an Ubuntu box featuring tools for building software and a text editor. Its Dockerfile will contain most of the basic Docker instructions that can also be used to build more complicated images.

First let us create a directory where we’ll store the Dockerfile. This directory will be the so-called Docker build context. Docker sends every file in this directory to the Docker daemon at the start of the build, and any of them can then be copied into the image. Unneeded files therefore make the build slower and may make the image larger, so we want the directory to contain only the files strictly required for the build, or none at all if possible.

$ mkdir build_dockerfile
$ cd build_dockerfile
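
One way to keep the build context lean, when the directory has to hold other files as well, is a .dockerignore file placed in the context directory. The sketch below is only an illustration: the ignore patterns are hypothetical examples and should be adapted to your own project.

```shell
# Sketch: keep the Docker build context small with a .dockerignore file.
# The patterns below are hypothetical examples, not part of this lesson.
mkdir -p build_dockerfile

cat > build_dockerfile/.dockerignore <<'EOF'
# Files matching these patterns are left out of the build context
.git
*.log
data/
EOF

# Show what would be excluded from the build context
cat build_dockerfile/.dockerignore
```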

Now use your favourite text editor to create a file named Dockerfile and edit it. Here are its contents:

FROM ubuntu:18.04
  
LABEL maintainer="Your Name <youremail@yourdomain>"

RUN apt-get update && \
    apt-get install -y \
        autoconf \
        automake \
        g++ \
        gcc \
        gfortran \
        make \
        nano \
    && apt-get clean all \
    && rm -rf /var/lib/apt/lists/*

VOLUME ["/data"]
WORKDIR /data

CMD ["/bin/bash"]

Building the image

Once the Dockerfile is ready, let us build the image with docker build:

$ docker build -t build:2019May29 .
Sending build context to Docker daemon  2.048kB
Step 1/6 : FROM ubuntu:18.04
 ---> d131e0fa2585
[..]
Step 6/6 : CMD ["/bin/bash"]
 ---> Running in fb003b87b020
Removing intermediate container fb003b87b020
 ---> 8986ee76d9a9
Successfully built 8986ee76d9a9
Successfully tagged build:2019May29

In the command above, . is the location of the build context (i.e. the directory for the Dockerfile).
The -t flag is used to specify the image name (compulsory) and tag (optional).

Any lowercase alphanumeric string can be used as image name; here we’ve used build. The image tag (following the colon) can be optionally used to maintain a set of different image versions on Docker Hub, and is a key feature in enabling reproducibility of your computations through containers; here we’ve used 2019May29.

Adding the prefix <Your Docker Hub account>/ to the image name is also optional, and allows you to push the built image to your Docker Hub registry (see below).

The complete format for the image name looks like: <Your Docker Hub account>/<Image name>:<Image tag>, where the account prefix and the tag are optional.
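
To make the naming format concrete, here is a small sketch that splits an image name into its three parts. The function name parse_image is hypothetical, and the sketch assumes the simple form described above, with no registry host or port in front of the account.

```shell
# Sketch: split an image reference into account, name and tag.
# Assumes the simple form [<account>/]<name>[:<tag>].
parse_image() {
    ref="$1"
    account=""
    tag="latest"   # Docker falls back to the "latest" tag when none is given

    # Strip the optional "<account>/" prefix
    case "$ref" in
        */*) account="${ref%%/*}"; ref="${ref#*/}" ;;
    esac
    # Strip the optional ":<tag>" suffix
    case "$ref" in
        *:*) tag="${ref##*:}"; ref="${ref%:*}" ;;
    esac
    name="$ref"

    echo "account=${account:-<none>} name=$name tag=$tag"
}

parse_image build:2019May29
parse_image marcodelapierre/build:2019May29
parse_image build
```

Only the image name is compulsory; the other two parts fall back to "no account" and the latest tag when omitted.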

Layers in a container image

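Note how the RUN instruction above is used to execute a sequence of commands to:

  • update the list of available packages (apt-get update)
  • install the required packages (apt-get install)
  • clean up the package cache and lists (apt-get clean, rm -rf)

We have concatenated all these commands into one using the && shell operator, and used the \ symbol to break them across multiple lines for readability.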

We could have used one RUN instruction per command, so why concatenate them instead?

Well, each RUN creates a distinct layer in the final image, and files deleted by a later RUN still take up space in the earlier layer that created them. It is good practice to use as few layers, and thus RUN instructions, as possible, to keep the image smaller.
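
The difference can be sketched by writing the same steps in both styles. The file names Dockerfile.split and Dockerfile.single are hypothetical, and the package list is shortened to nano for brevity.

```shell
# Sketch: compare two ways of writing the same build steps.
workdir="$(mktemp -d)"

# Variant A: one RUN per command, hence three layers. The apt lists removed
# in the last step still take up space in the layer created by the first.
cat > "$workdir/Dockerfile.split" <<'EOF'
FROM ubuntu:18.04
RUN apt-get update
RUN apt-get install -y nano
RUN rm -rf /var/lib/apt/lists/*
EOF

# Variant B: a single RUN, hence one layer. The cleanup happens before the
# layer is saved, so the removed files never reach the image.
cat > "$workdir/Dockerfile.single" <<'EOF'
FROM ubuntu:18.04
RUN apt-get update && \
    apt-get install -y nano && \
    rm -rf /var/lib/apt/lists/*
EOF

# Count the RUN instructions (one layer each) in the two variants
split_runs="$(grep -c '^RUN' "$workdir/Dockerfile.split")"
single_runs="$(grep -c '^RUN' "$workdir/Dockerfile.single")"
echo "RUN instructions: split=$split_runs single=$single_runs"
```

After building either variant with docker build -f <file> -t <name> ., the command docker history <name> lists the resulting layers together with their sizes.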

More Dockerfile instructions

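Several other instructions are available that we haven’t covered in this introduction. You can find more information on them in the Dockerfile reference. Just to mention a few possibilities: COPY adds files from the build context to the image, ENV sets environment variables, and ENTRYPOINT defines the executable that runs when a container starts.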
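
As a rough illustration of a few further instructions, the sketch below writes a hypothetical Dockerfile that uses LABEL, ENV, COPY and ENTRYPOINT. The directory more_instructions, the script run_analysis.sh and the variable DATA_DIR are made up for this example and are not part of the lesson.

```shell
# Sketch: a hypothetical Dockerfile using a few further instructions.
mkdir -p more_instructions

# A tiny made-up script to copy into the image
cat > more_instructions/run_analysis.sh <<'EOF'
#!/bin/bash
echo "Reading input from ${DATA_DIR}"
EOF

cat > more_instructions/Dockerfile <<'EOF'
FROM ubuntu:18.04

# LABEL attaches metadata (key=value pairs) to the image
LABEL maintainer="Your Name <youremail@yourdomain>"

# ENV sets an environment variable for later build steps and for containers
ENV DATA_DIR=/data

# COPY adds a file from the build context to the image
COPY run_analysis.sh /usr/local/bin/run_analysis.sh
RUN chmod +x /usr/local/bin/run_analysis.sh

# ENTRYPOINT sets the executable that runs when the container starts
ENTRYPOINT ["/usr/local/bin/run_analysis.sh"]
EOF

# List the instructions introduced in this sketch
grep -E '^(LABEL|ENV|COPY|ENTRYPOINT)' more_instructions/Dockerfile
```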

Pushing the image to Docker Hub

If you have a (free) Docker Hub account, you must first log in to Docker.

$ docker login

You are now ready to push your newly created image to the Docker Hub web registry.

First, let us create a second tag for the image that includes your Docker Hub account. To this end we’ll use docker tag:

$ docker tag build:2019May29 <your-dockerhub-account>/build:2019May29

Now we can push the image:

$ docker push <your-dockerhub-account>/build:2019May29
The push refers to repository [docker.io/marcodelapierre/build]
cab15c00fd34: Pushed 
cf5522ba3624: Pushed 
[..]
2019May29: digest: sha256:bcb0e09927291c7a36a37ee686aa870939ab6c2cee2ef06ae4e742dba4bb1dd4 size: 1569
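
The tag and push steps can be wrapped in a small helper, sketched below. The variables DOCKERHUB_ACCOUNT and DRY_RUN and the functions run and publish are hypothetical names chosen for this example. By default the helper only prints the commands, so it is safe to try without a Docker Hub account.

```shell
# Sketch: tag a local image for Docker Hub and push it.
DOCKERHUB_ACCOUNT="${DOCKERHUB_ACCOUNT:-your-dockerhub-account}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = run them

# Print the command in dry-run mode, execute it otherwise
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Add the account prefix to a local image name, then tag and push
publish() {
    local_image="$1"
    remote_image="$DOCKERHUB_ACCOUNT/$local_image"
    run docker tag "$local_image" "$remote_image"
    run docker push "$remote_image"
}

publish build:2019May29
```

Setting DRY_RUN=0 and DOCKERHUB_ACCOUNT to your own account name would execute the two docker commands for real.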

Congratulations! Your image is now publicly available for anyone to pull.

Base images for Python

continuumio/miniconda2 and continuumio/miniconda3 are Docker images provided by the maintainers of the Anaconda project. They ship with Python 2 and 3, respectively, as well as pip and conda to install and manage packages. At the time of writing, the most recent version is 4.5.12, which is based on Python 2.7.15 (miniconda2) and Python 3.7.1 (miniconda3).

Among other use cases, these base images can be very useful for maintaining Python containers, as well as bioinformatics containers based on the Bioconda project.

If you need interactive Jupyter Notebooks, Jupyter Docker Stacks are a series of dedicated container images. Among others, there is the base SciPy image jupyter/scipy-notebook, the data science image jupyter/datascience-notebook, and the machine learning image jupyter/tensorflow-notebook.

Base images for R

The Rocker Project maintains a number of good R base images. Of particular relevance is rocker/tidyverse, which embeds the basic R distribution, an RStudio web-server installation and the tidyverse collection of packages for data science, which are also quite popular across the Bioconductor bioinformatics community. At the time of writing, the most recent version is 3.5.3.

Other more basic images are rocker/r-ver (R only) and rocker/rstudio (R + RStudio).

Build your own Scientific Python container

Using continuumio/miniconda3:4.5.12 as base image, create an image called mypython, which includes the Python packages numpy, scipy and pandas. Hint: you can use pip install or conda install -y in the Dockerfile to this end.

If you have a Docker Hub account, for the image name use the format <Your Docker Hub account>/<Image name>:<Version tag>. Then, push the image to the web registry.

Solution

Dockerfile:

FROM continuumio/miniconda3:4.5.12

LABEL maintainer="Marco De La Pierre <marco.delapierre@pawsey.org.au>"

RUN pip install numpy scipy pandas

VOLUME ["/data"]
WORKDIR /data

CMD ["/bin/bash"]

Build the image:

a) Plain (no Docker Hub account):

$ docker build -t mypython:2019Apr23 .

b) With a Docker Hub account:

$ docker build -t <your-dockerhub-account>/mypython:2019Apr23 .

Push the image (optional):

$ docker push <your-dockerhub-account>/mypython:2019Apr23
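
Alternative with conda (sketch):

The hint also mentions conda install -y. The sketch below writes a variant of the solution Dockerfile that uses conda instead of pip; the directory name mypython_conda is hypothetical, and cleaning the conda cache in the same RUN instruction is an optional extra rather than part of the exercise.

```shell
# Sketch: the same exercise solved with conda instead of pip.
mkdir -p mypython_conda

cat > mypython_conda/Dockerfile <<'EOF'
FROM continuumio/miniconda3:4.5.12

# Install the packages with conda, then clear its package cache
# in the same RUN instruction to keep the layer small
RUN conda install -y numpy scipy pandas \
    && conda clean -a -y

VOLUME ["/data"]
WORKDIR /data

CMD ["/bin/bash"]
EOF

cat mypython_conda/Dockerfile
```

Once an image has been built from either Dockerfile, a quick check such as docker run --rm mypython:2019Apr23 python -c "import numpy, scipy, pandas" should exit without errors if the packages were installed.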

Best practices

  • For stand-alone packages, it is suggested to use the policy of one container per package
  • For Python or R pipelines, it may be handier to use the policy of a single container for the entire pipeline
  • Best practices for writing Dockerfiles can be found on the Docker website

Key Points

  • A Dockerfile is a recipe that uses specific instructions to direct the image building process

  • docker build is used to build images

  • docker push is used to push images to a web registry