Introduction to containers

Overview

Teaching: 10 min
Exercises: 0 min
Questions
Objectives
  • Define the term “container”

  • Discuss when you would benefit from using containers in your workflow

Containers vs Virtual Machines

A container is an entity providing an isolated software environment (or filesystem) for an application and its dependencies.

If you have used a virtual machine (VM) before, you are already familiar with some of the concepts behind containers.

[Figure: Containers vs. VMs]

The key difference is that VMs virtualise hardware, while containers virtualise the operating system. There are other differences (and benefits); in particular, containers are:

Containers and your workflow

There are a number of reasons for using containers in your daily work:

A few examples of how containers are being used at Pawsey include:

Here’s an overview of what a typical workflow looks like:

[Figure: Container workflow]

Terminology

An image is a file (or set of files) that contains an application together with everything it needs to run: dependencies, libraries, run-time systems, and so on. You can copy, upload and download images like any other file.

A container is an instantiation of an image. That is, it is a running process that was spawned from an image. You can run multiple containers from the same image, much as you might run the same application several times with different options or arguments.

In short, an image corresponds to a file, while a container corresponds to a process.

A registry is a server application where images are stored and can be accessed by users. It can be public (e.g. Docker Hub) or private.
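The relationship between these three terms can be sketched with a toy model in Python. This is only an illustration of the vocabulary, not how any container engine is implemented: the class names, the `run` helper and the image name `demo/tool:1.0` are all hypothetical.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Image:
    """Toy stand-in for an image: a named, read-only bundle of files."""
    name: str
    files: tuple  # application, libraries and other dependencies


@dataclass
class Container:
    """Toy stand-in for a container: one running instance of an image."""
    image: Image
    args: tuple
    container_id: int


class Registry:
    """Toy stand-in for a registry: stores images and serves them by name."""

    def __init__(self):
        self._images = {}

    def push(self, image):
        self._images[image.name] = image

    def pull(self, name):
        return self._images[name]


_next_id = count(1)


def run(image, *args):
    """Start a new container from an image; the image itself is unchanged."""
    return Container(image=image, args=args, container_id=next(_next_id))


registry = Registry()
registry.push(Image(name="demo/tool:1.0", files=("bin/tool", "lib/libdep.so")))

image = registry.pull("demo/tool:1.0")
first = run(image, "--input", "a.txt")
second = run(image, "--input", "b.txt")

# Two distinct containers share one image.
print(first.image is second.image)                # True
print(first.container_id != second.container_id)  # True
```

The image is declared immutable (`frozen=True`) to mirror the idea that running a container does not alter the image it came from, while each call to `run` produces a separate container with its own arguments.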

To build an image we need a recipe. The recipe file is called a Definition File (or def file) in Singularity jargon, and a Dockerfile in the Docker world.
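To give a feel for what a recipe looks like, the sketch below holds a minimal Dockerfile as a Python string and lists the instruction keyword on each line. The Dockerfile contents (base image, installed package and default command) are made up for illustration, and the `instructions` helper is hypothetical; it does not handle multi-line instructions.

```python
# Hypothetical minimal Dockerfile, held as a string for illustration only.
DOCKERFILE = """\
# Start from a base image
FROM ubuntu:22.04
# Install the application's dependencies
RUN apt-get update && apt-get install -y python3
# Default command executed when a container starts
CMD ["python3", "--version"]
"""


def instructions(recipe):
    """Return the leading keyword of each non-empty, non-comment line."""
    keywords = []
    for line in recipe.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            keywords.append(stripped.split()[0])
    return keywords


print(instructions(DOCKERFILE))  # ['FROM', 'RUN', 'CMD']
```

Each instruction describes one step of the build: choosing a starting point, installing software on top of it, and declaring what runs by default.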

Container engines

A number of tools are available to create, deploy and run containerised applications. Some of these will be covered throughout this tutorial:

Key Points

  • Containers enable you to package up an application and its dependencies.

  • By using containers, you can improve the reproducibility, portability and shareability of your computational workflows.