#DockerDays Day 4- Data Persistence

So far we have been mostly interested in running containers, without thinking too much about data. We have also briefly described how a running container adds a read-write layer on top of the image’s read-only layers. But what does that actually mean? What happens to the read-write layer when the container is not running? Is the data persisted and, if not, how does one persist data written inside the container? Can we share the same data with another container?

As you can see, there are already way too many questions regarding the persistence of data. In this part of the tutorial, we will look into data persistence.

Agenda

  • Docker and Data Persistence
  • Different ways to persist data
    • Docker Volume
      • Create Docker volume
      • List volumes
      • Inspect volume
      • Remove volume
      • Prune volume
      • Mount volume
    • Bind Mount

Docker and Data Persistence

If you recall, containers are instances of the application stack defined in a Docker image. Images themselves are made of multiple read-only layers stacked using a union file system. A union file system allows files and directories from different file systems to be virtually overlaid so that they appear as a single file system. In Docker, the read-only layers are made to appear as one file system, with each upper layer superseding the ones below it.

When a container runs, a new read-write layer is created on top of the image’s layers. Any change made while the container executes is written to this layer; modifying a file that comes from the image first creates a copy of that file in the writable layer (copy-on-write). This implies a couple of things:

  • The writable layer survives a stop and restart of the container, but when the container is removed, any changes made to the writable layer are lost.
  • Since each container gets its own writable layer, that layer cannot be shared with another container. This obstructs sharing of data between containers.
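
The first point can be checked with a short experiment. The following is a minimal sketch, not part of the original walkthrough: the container name layer.demo and the alpine image are illustrative assumptions, and the guard simply skips everything when no Docker daemon is reachable.

```shell
# Sketch: data in a container's writable layer disappears with the container.
# CONTAINER and the alpine image are illustrative choices (assumptions).
CONTAINER="layer.demo"

# Guard: do nothing on machines without a reachable Docker daemon.
if docker info >/dev/null 2>&1; then
  # Write a file into the container's writable layer; the container then exits.
  docker container run --name "$CONTAINER" alpine sh -c 'echo hello > /note.txt'

  # The stopped container still exists, so the file can still be copied out.
  docker container cp "$CONTAINER":/note.txt - >/dev/null && echo "file still present"

  # Removing the container discards its writable layer.
  docker container rm "$CONTAINER"

  # A new container from the same image starts from a clean layer.
  docker container run --rm alpine sh -c 'test -e /note.txt || echo "file is gone"'
fi
```

Stopping a container keeps its data; only removing it throws the writable layer away.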

If you want to persist data or share data between containers, you need to store it outside the container, on the host’s file system. Docker offers two ways to store data on the host machine: bind mounts and Docker volumes. Additionally, Docker lets containers store files in the host’s memory through a tmpfs mount, which does not persist the data.

Docker Volume

Docker volumes are the commonly preferred storage/persistence mechanism when working with Docker. Volumes are directories within the host file system that are managed by Docker. When you create a volume using the Docker commands, it is placed under a special directory on the host (on Linux, typically /var/lib/docker/volumes). When the volume is mounted, this directory is mounted into the container. The key fact to remember here is that this directory is managed by Docker and is isolated from the host’s core functionality.

A volume can be shared between multiple containers because it is independent of any single container. This also means that it remains intact even when the container is removed, ensuring the persistence of your data.
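
As a rough illustration of sharing, the sketch below lets one container write into a volume and a second, unrelated container read the same file. The volume name shared.volume.demo, the /data mount path and the alpine image are assumptions made for this example; the volume commands and the -v flag are covered in the sections of this tutorial that follow.

```shell
# Sketch: two containers sharing one volume.
# VOLUME, the /data path and the alpine image are illustrative assumptions.
VOLUME="shared.volume.demo"

# Guard: do nothing on machines without a reachable Docker daemon.
if docker info >/dev/null 2>&1; then
  docker volume create "$VOLUME"

  # The first container writes a file into the volume and is removed on exit (--rm).
  docker container run --rm -v "$VOLUME":/data alpine \
    sh -c 'echo "written by the first container" > /data/message.txt'

  # A second container mounts the same volume and reads the file back.
  docker container run --rm -v "$VOLUME":/data alpine cat /data/message.txt

  # Clean up the demo volume.
  docker volume rm "$VOLUME"
fi
```

The file outlives the first container because it lives in the volume, not in that container's writable layer.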

Create Docker Volume

Docker volumes can be created using the docker volume create command.

$ docker volume create volumeName

If you do not specify a volume name, Docker generates a random, unique name for the volume. For example,

$ docker volume create
2f477c8c6484e8ac726700b03c27ee5336c81b28b1643a8514cec90e7b3ca92e  # volume name generated by Docker

List Volumes

As with containers and images, you can use the ls subcommand to list all the volumes known to docker.

$ docker volume ls

If you have too many volumes and would like to narrow down the result, the --filter flag lets you do so.

$ docker volume ls --filter dangling=true
$ docker volume ls --filter name=my

The dangling filter matches all volumes not referenced by any container. The --filter flag can also filter volumes by label, name, or driver.
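
Filtering by label or driver could look like the sketch below. The volume name myVolume and the label isDb=yes are illustrative values, and the guard skips the commands when no Docker daemon is reachable.

```shell
# Sketch: filtering volumes by label, driver and name.
# VOLUME and LABEL are illustrative values (assumptions).
VOLUME="myVolume"
LABEL="isDb=yes"

# Guard: do nothing on machines without a reachable Docker daemon.
if docker info >/dev/null 2>&1; then
  # Labels are attached when the volume is created.
  docker volume create --label "$LABEL" "$VOLUME"

  # List only the volumes carrying that label (label=<key>=<value>).
  docker volume ls --filter label="$LABEL"

  # List only the volumes created by the default "local" driver.
  docker volume ls --filter driver=local

  # -q prints just the volume names, which is convenient in scripts.
  docker volume ls -q --filter name=my
fi
```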

Inspect Volume

You can inspect a volume using the docker volume inspect command to get more information about the volume.

$ docker volume inspect myVolume
[
    {
        "CreatedAt": "2022-02-25T00:55:05Z",
        "Driver": "local",
        "Labels": {
            "isDb": "yes"
        },
        "Mountpoint": "/var/lib/docker/volumes/myVolume/_data",
        "Name": "myVolume",
        "Options": {},
        "Scope": "local"
    }
]
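
When only one field of this output is needed, the --format flag with a Go template can extract it. A small sketch, assuming a volume named myVolume already exists:

```shell
# Sketch: pulling single fields out of the inspect output.
# Assumes a volume called myVolume exists (illustrative name).
VOLUME="myVolume"

# Guard: do nothing on machines without a reachable Docker daemon.
if docker info >/dev/null 2>&1; then
  # Print only the host directory that backs the volume.
  docker volume inspect --format '{{ .Mountpoint }}' "$VOLUME"

  # Print the labels as JSON.
  docker volume inspect --format '{{ json .Labels }}' "$VOLUME"
fi
```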

Remove Volume

A docker volume can be removed using the docker volume rm command.

$ docker volume rm volumeName

You can use the -f or --force flag to force removal of volumes. Note that a volume still in use by a container cannot be removed; remove the container first.

Prune Volume

If you want to remove all the unused volumes in the host machine, you can use the docker volume prune command.

$ docker volume prune

Unused volumes are the ones that are not referenced by any container. The docker volume prune command also accepts a --filter flag, for example to restrict the removal to unused volumes carrying a given label. On recent Docker releases (23.0 and later), prune removes only anonymous volumes by default; add --all to include unused named volumes.
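
A cautious way to prune is to preview the candidates first and then restrict the removal with a label filter. The sketch below assumes the illustrative label isDb=yes. Be aware that prune deletes data permanently.

```shell
# Sketch: pruning only unused volumes that carry a given label.
# LABEL is an illustrative value (assumption). Pruning deletes data permanently.
LABEL="isDb=yes"

# Guard: do nothing on machines without a reachable Docker daemon.
if docker info >/dev/null 2>&1; then
  # Preview: unused (dangling) volumes carrying the label.
  docker volume ls --filter dangling=true --filter label="$LABEL"

  # Remove them; --force skips the confirmation prompt.
  # On Docker 23.0 and later, add --all to include named volumes as well.
  docker volume prune --force --filter label="$LABEL"
fi
```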

Mounting Docker Volume

Let us now mount a docker volume. Let us first create a docker volume.

$ docker volume create mysql.volume.demo

Now let us run our MySQL container with our newly created volume mounted.

$ docker container run -d --name mysql.container.demo -e MYSQL_ALLOW_EMPTY_PASSWORD=True -v mysql.volume.demo:/var/lib/mysql mysql

That’s all we need to do to mount our volume and make our container support persistence. Do note that we can also use --mount instead of -v.
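
For comparison, here is a sketch of the same run written with --mount, which spells out the mount as key=value pairs. The container name mysql.mount.demo is a hypothetical name chosen so that it does not clash with the container started above.

```shell
# Sketch: the volume mount from above, expressed with --mount instead of -v.
# CONTAINER is a hypothetical name (assumption).
VOLUME="mysql.volume.demo"
CONTAINER="mysql.mount.demo"

# Guard: do nothing on machines without a reachable Docker daemon.
if docker info >/dev/null 2>&1; then
  docker container run -d --name "$CONTAINER" \
    -e MYSQL_ALLOW_EMPTY_PASSWORD=True \
    --mount type=volume,source="$VOLUME",target=/var/lib/mysql \
    mysql

  # Show which volume is mounted where inside the container.
  docker container inspect --format '{{ json .Mounts }}' "$CONTAINER"
fi
```

The --mount form is more verbose than -v, but each part of the mount (type, source, target) is named explicitly.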

Bind Mount

In some ways, a bind mount may sound similar to a Docker volume. Like a volume, a bind mount mounts a directory from the host file system into the container. The difference, however, lies in a couple of factors.

A bind mount relies on the host machine having the specified directory structure (with -v, Docker creates the directory if it doesn’t exist). The directory is referenced by its absolute path. In other words, bind mounts let us specify the exact location on the host. This path doesn’t need to be specific to Docker and could contain sensitive files required by the host machine. This effectively means that a container can change sensitive files in the host file system, which might have larger repercussions.

In contrast, a Docker volume lives within Docker’s own storage directories and is managed directly by Docker. Docker recommends named volumes as the preferred persistence strategy.

$ docker container run --name nginx.container.demo -d -p 80:80 -v "$(pwd)":/usr/share/nginx/html nginx


As you can see, we created the bind mount by specifying an exact location on the host file system, in this case the current working directory.
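
Since a bind mount gives the container access to real host files, one way to limit the risk is to mount the directory read-only. The sketch below uses --mount with the readonly option; the container name nginx.readonly.demo and the host port 8080 are assumptions chosen to avoid clashing with the container started above.

```shell
# Sketch: a read-only bind mount, so the container cannot modify host files.
# CONTAINER and port 8080 are illustrative choices (assumptions).
HOST_DIR="$(pwd)"
CONTAINER="nginx.readonly.demo"

# Guard: do nothing on machines without a reachable Docker daemon.
if docker info >/dev/null 2>&1; then
  # type=bind requires an absolute host path that already exists.
  docker container run -d --name "$CONTAINER" -p 8080:80 \
    --mount type=bind,source="$HOST_DIR",target=/usr/share/nginx/html,readonly \
    nginx
fi
```

With this setup nginx can serve the files in the current directory but cannot change them.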

Summary

In this part of the tutorial, we addressed persistence in Docker containers. We saw that there are two main ways to support persistence in Docker: volumes and bind mounts. We also learned that Docker volumes are the more common and recommended way to support persistence when compared to bind mounts.

We will continue our exploration of docker in this series. Until the next part, happy coding.
