Like any new technology, container technology such as Docker may appear magical to newcomers like myself. So let's grasp a few concepts first. Containers are also known as virtual environments (VE). To understand what a VE is, it helps to think of the virtual machine (VM) that came before it. And before the VM, there was Unix, a time-sharing system. Plentiful information on the notions of Unix and VM is available online.
A VM virtualises hardware so that multiple guest operating systems and their applications can run on the same host. VMs make it possible to share the host hardware among many users, who are then not limited to applications packaged for the host OS. Each VM runs its own OS. Yes, that's fat, but it was still a resource saver compared with dedicating one physical machine to each OS.
A VE virtualises the operating environment: one or more applications run on the same host with the same "core operating system". For Linux, the core OS means a common and shared kernel. An application runs inside an isolated and sandboxed environment. Except for the shared kernel, all user-space libraries can be tailored for each VE or container. For example, one container could run Alpine Linux. Another container could run Arch Linux. Here by Linux, we mean the user-space packages, that is, the libraries and applications: anything from the distribution but the kernel. What's so nice about VE is that applications inside each container see a distinct and complete operating environment, as if each were running its own OS and were the sole owner of the host hardware. Each container can even have its own process ID 1.
It sounds magical, but container technologies are the result and natural evolution of earlier work done in the Linux kernel, namely cgroups and namespaces, as well as early efforts in user space such as Linux Containers (LXC) that attempted to bring such features to the masses. So sooner or later something like Docker was bound to happen. In fact, early Docker releases ran containers through LXC; Docker's own libcontainer became the default in v0.9, and LXC support was dropped in v1.10. From physical machines to VM and then to VE, each generation is a paradigm shift that changes how we deploy applications, and how we work as developers.
Now let's grasp a few key concepts in Docker. Docker creates and manages images. Docker runs an image by instantiating it into one or more containers. An image consists of multiple layers - each layer incrementally builds on top of the immediate layer underneath. A container sees the final and merged result of all the layers of an image. A layer is a collection of added, deleted or modified files with respect to the layer below.
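To make the layer idea concrete, here is a minimal sketch in plain shell that merges two toy "layers" into one view, with no Docker involved. The directory names and files are made up for illustration. It assumes the convention used in image layer tarballs, where a deleted file is recorded as an empty marker named '.wh.<name>'; real storage drivers such as overlay2 do this merge in the kernel rather than by copying files.

```shell
#!/bin/sh
# Toy illustration of layer merging: later layers win, and a
# ".wh.<name>" marker in a layer deletes <name> from the merged view.
work=$(mktemp -d)
mkdir -p "$work/layer1" "$work/layer2" "$work/merged"

# Layer 1: the base layer with two files.
echo "v1"  > "$work/layer1/app.conf"
echo "old" > "$work/layer1/tmp.log"

# Layer 2: modifies app.conf, adds extra.txt, deletes tmp.log.
echo "v2"  > "$work/layer2/app.conf"
echo "new" > "$work/layer2/extra.txt"
: > "$work/layer2/.wh.tmp.log"

# Apply the layers bottom-up into the merged directory.
for layer in layer1 layer2; do
    for f in "$work/$layer"/* "$work/$layer"/.wh.*; do
        [ -e "$f" ] || continue          # skip globs that matched nothing
        name=$(basename "$f")
        case "$name" in
            .wh.*) rm -f "$work/merged/${name#.wh.}" ;;  # deletion marker
            *)     cp "$f" "$work/merged/$name" ;;       # add or overwrite
        esac
    done
done

ls "$work/merged"            # lists app.conf and extra.txt
cat "$work/merged/app.conf"  # prints v2
```

The container only ever sees the equivalent of 'merged'; the individual layers stay untouched and can be shared by other images.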
Docker builds an image from a Dockerfile, which names a base image and lists the modifications to make on top of it. A Dockerfile defines how an image is built. An image is the result of software compilation, package installation, config editing and/or any other actions taken to provision an operating environment. Think of installing an OS, libraries and applications on a bare-metal machine until it is fully operational.
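As a small taste, the sketch below writes a three-line Dockerfile into a scratch directory. The choice of curl as the package and the image name 'alpine-curl' in the note that follows are purely illustrative assumptions, not something used elsewhere in this post.

```shell
#!/bin/sh
# Write a minimal Dockerfile into a scratch build directory.
workdir=$(mktemp -d)

cat > "$workdir/Dockerfile" <<'EOF'
# Base image to build on top of
FROM alpine:latest
# Provisioning step: install one package (adds a new layer)
RUN apk add --no-cache curl
# Default command when a container starts from this image
CMD ["curl", "--version"]
EOF

cat "$workdir/Dockerfile"
```

With Docker installed, something like 'docker build -t alpine-curl "$workdir"' followed by 'docker run --rm alpine-curl' should build and run it.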
After a Docker image is built, you use Docker to run the image by creating a container. Think of it as an instantiation, or an instance, of the image. Containers can be deleted when you no longer need them. Images can be deleted when no containers are instantiated from them. If a base image is needed by other images built on top of it, this base image cannot be deleted until its dependants are deleted.
A Docker repository holds different versions or builds of the same image, each identified by a tag (for example, 'alpine:latest'). An image in this sense means a named line of builds, such as 'alpine', rather than one specific build. A Docker registry is a collection of Docker repositories. Registries can be private or public. Docker Hub is an example of a public registry.
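To see how registry, repository and tag fit together, here is a rough sketch of how a short image name expands to a full one. The function 'expand_image_name' is my own hypothetical helper, not a Docker command. It follows the defaults as I understand them: Docker Hub ('docker.io') as the default registry, 'library/' for official images, and 'latest' as the default tag. Digest references (name@sha256:...) are not handled.

```shell
#!/bin/sh
# Hypothetical helper: expand a short image name into
# registry/repository:tag using Docker's usual defaults.
expand_image_name() {
    ref=$1
    last=${ref##*/}                 # final path component, e.g. "alpine:3.9"
    case "$last" in
        *:*) tag=${last##*:}; ref=${ref%:*} ;;   # explicit tag given
        *)   tag=latest ;;                       # default tag
    esac
    first=${ref%%/*}
    case "$ref" in
        */*)
            case "$first" in
                # A first component with a dot or port, or "localhost",
                # is treated as a registry host.
                *.*|*:*|localhost) registry=$first; repo=${ref#*/} ;;
                *)                 registry=docker.io; repo=$ref ;;
            esac ;;
        *)  registry=docker.io; repo=library/$ref ;;  # official image
    esac
    echo "$registry/$repo:$tag"
}

expand_image_name alpine                 # docker.io/library/alpine:latest
expand_image_name alpine:3.9             # docker.io/library/alpine:3.9
expand_image_name localhost:5000/myapp   # localhost:5000/myapp:latest
```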
Enough abstract talk. Let's get hands-on. The following is a demonstration of essential Docker operations. I use Linux as the host for both the Docker daemon (aka Docker Engine in Docker Inc's terminology) and the Docker client, the command-line tool: Docker v18.09.4 on Arch Linux, to be exact.
I found the easiest way to start with Docker is to try images already built by other people. So let's try Alpine Linux from Docker Hub. To get the image, we pull it:
    $ docker pull alpine
    Using default tag: latest
    latest: Pulling from library/alpine
    Digest: sha256:28ef97b8686a0b5399129e9b763d5b7e5ff03576aa5580d6f4182a49c5fe1913
    Status: Downloaded newer image for alpine:latest
To check the image (aka repository) we just pulled from Docker Hub (aka a public registry):
    $ docker images
    REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
    alpine              latest              5cb3aa00f899        4 weeks ago         5.53MB
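Image ID is a unique identifier for the image known by the name "alpine". What you see here is a truncated form of a sha256 digest computed over the image's configuration (which in turn references its layers), not over the Dockerfile.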
Notice that this Alpine image is only 5.53MB. From my long introduction above, we can conclude that the image doesn't and shouldn't include the kernel. So the size is essentially all user-space tools and libraries. Debian and Ubuntu base images are on the order of a hundred megabytes. Because of its tiny footprint, Alpine is a very popular base image for containerising applications.
To run the image for the first time, the following command creates a container and runs it in one go (there are Docker commands to do this in two separate steps, but let's not get into that here). It logs you in and gives you a shell prompt.
    $ docker run -it --name alpine_linux 5cb3aa00f899 /bin/sh
    / # ps
    PID   USER     TIME  COMMAND
        1 root      0:00 /bin/sh
        6 root      0:00 ps
We then run 'ps' inside the container and only see two processes - the shell we started and the 'ps' itself. Notice that '/bin/sh' has PID 1.
From the perspective of the host, a container is merely a collection of processes under containerd-shim, which is started by containerd, which in turn was started by dockerd. Together the three processes are the essence of a Docker Engine.
Notice that '/bin/sh' has PID 16414 on the host. It's the only process we run in this container, and it is the container's PID 1, so exiting it would stop the container. To leave the shell without killing the container, remember to detach with Ctrl-P Ctrl-Q.
To check running containers, use 'docker ps'. Adding the '-a' option shows stopped containers as well.
    $ docker ps -a
    CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES
    72e7496aff60        5cb3aa00f899        "/bin/sh"           8 minutes ago       Up 8 minutes                            alpine_linux
So our example container is still running.
To run a command in a running container, use 'docker exec'. For example, to get inside and take another peek, we could run the shell again:
    $ docker exec -i -t 72e7496aff60 /bin/sh
    / # ps
    PID   USER     TIME  COMMAND
        1 root      0:00 /bin/sh
       15 root      0:00 /bin/sh
       20 root      0:00 ps
    / # exit
    $
The second '/bin/sh' has PID 15 inside the container. On the host, however, it has PID 16646. To quit, we can use 'exit' just like quitting any shell.
To stop a running container:
    $ docker stop 72e7496aff60
    $ docker ps -a
    CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS                       PORTS               NAMES
    72e7496aff60        5cb3aa00f899        "/bin/sh"           36 minutes ago      Exited (137) 7 seconds ago                       alpine_linux
To delete a container:
    $ docker rm 72e7496aff60
    72e7496aff60
    $ docker ps -a
    CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES
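To export the contents of a container (do this before removing it), use 'docker export'. This creates a plain, uncompressed tarball of all the container's files; despite the '.tgz' name used below, the output is not gzipped unless you pipe it through gzip yourself.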
    $ docker export 72e7496aff60 > 72e7496aff60.tgz
    $ ls -l
    total 5664
    -rw-r--r-- 1 root root 5793792 Apr  8 03:09 72e7496aff60.tgz
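A file extension says nothing about what is inside the file. The sketch below uses an ordinary 'tar' as a stand-in for 'docker export' (an assumption made so that it runs without Docker) and checks each result with 'gzip -t', which succeeds only on real gzip data. The file names are made up.

```shell
#!/bin/sh
# Compare a plain tar saved under a .tgz name with a really gzipped one.
tmp=$(mktemp -d)
echo "hello" > "$tmp/file.txt"

# Stand-in for "docker export ID > name.tgz": plain tar, misleading name.
tar -cf "$tmp/plain.tgz" -C "$tmp" file.txt

# Stand-in for "docker export ID | gzip > name.tgz": actually compressed.
tar -cf - -C "$tmp" file.txt | gzip > "$tmp/real.tgz"

for f in plain.tgz real.tgz; do
    if gzip -t "$tmp/$f" 2>/dev/null; then
        echo "$f: gzip-compressed"
    else
        echo "$f: not gzip (plain tar)"
    fi
done
```

For a container, the compressed variant would be 'docker export 72e7496aff60 | gzip > 72e7496aff60.tgz'.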
docker image rm
To remove an image:
    $ docker images
    REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
    alpine              latest              5cb3aa00f899        4 weeks ago         5.53MB
    $ docker image rm 5cb3aa00f899
    Error response from daemon: conflict: unable to delete 5cb3aa00f899 (cannot be forced) - image is being used by running container a455762f0f24
    $ docker stop a455762f0f24
    $ docker image rm 5cb3aa00f899
    Error response from daemon: conflict: unable to delete 5cb3aa00f899 (cannot be forced) - image is being used by running container a455762f0f24
    $ docker image rm 5cb3aa00f899
    Error response from daemon: conflict: unable to delete 5cb3aa00f899 (must be forced) - image is being used by stopped container a455762f0f24
    $ docker rm a455762f0f24
    a455762f0f24
    $ docker image rm 5cb3aa00f899
    Untagged: alpine:latest
    Untagged: alpine@sha256:644fcb1a676b5165371437feaa922943aaf7afcfa8bfee4472f6860aad1ef2a0
As illustrated above, an image can only be deleted if no containers and/or other images depend on it.
To show the logs of a given container, use 'docker logs'. The following example comes from a container with id 5b91ce running Apache:
    $ docker logs 5b91ce
    [Wed Apr 10 04:01:59.406657 2019] [mpm_prefork:notice] [pid 1] AH00163: Apache/2.4.25 (Debian) PHP/7.2.17 configured -- resuming normal operations
    [Wed Apr 10 04:01:59.407060 2019] [core:notice] [pid 1] AH00094: Command line: 'apache2 -D FOREGROUND'
    192.168.1.100 - - [10/Apr/2019:04:02:02 +0000] "GET / HTTP/1.1" 302 261 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.103 Safari/537.36"
    192.168.1.100 - - [10/Apr/2019:04:02:02 +0000] "GET /install/index.php HTTP/1.1" 200 14119 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.103 Safari/537.36"
To copy files between the host and a container:
    ### copy a file from container (with id 5b91ce)
    $ docker cp 5b91ce:/etc/resolv.conf .
    ### do some editing and then copy back to the container
    $ docker cp resolv.conf 5b91ce:/etc/
Host storage for Docker images and containers
Docker stores data under '/var/lib/docker' on the host. Images and containers are saved under the 'overlay2' sub-directory, which is named after the storage driver used on the host. My Docker package from Arch Linux uses overlay2. Other major distributions should be the same. Here are the contents of '/var/lib/docker/overlay2' after we pulled the Alpine image from Docker Hub:
    $ du -ks /var/lib/docker/overlay2/*
    5964    /var/lib/docker/overlay2/a0975c9c2852a21e06f37aad0dfed2196351f59afb5b366836b729381781f52a
    16      /var/lib/docker/overlay2/l
5964KB for 'a0975c9c2852a...' roughly matches the reported 5.53MB image size of Alpine Linux. Drilling into the 'diff' directory inside, you'll see all the added, modified and deleted files with respect to the layer below. Since the Alpine image has no parent image, its 'diff' directory simply holds all the files created.
After running the Alpine image once, the contents of 'overlay2' look like:
    $ du -ks /var/lib/docker/overlay2/*
    5980    /var/lib/docker/overlay2/85d1cba7ae0074e9ace95f7aa6c318faa96a95fc05cdbe524ffa102ed1f2bd57
    40      /var/lib/docker/overlay2/85d1cba7ae0074e9ace95f7aa6c318faa96a95fc05cdbe524ffa102ed1f2bd57-init
    5964    /var/lib/docker/overlay2/a0975c9c2852a21e06f37aad0dfed2196351f59afb5b366836b729381781f52a
    16      /var/lib/docker/overlay2/l
Two new directories are created for the one container instantiated from the Alpine image. Looking inside '85d1cba7ae...', we can see a sub-directory named 'merged' where the bulk of its size resides. In fact, 'merged' is a union of all files of all layers. Its apparent size does not reflect physical disk usage, which is in fact far less. We can tell from the 'df' output on the host. Before pulling the Alpine image:
    $ df
    Filesystem     1K-blocks    Used Available Use% Mounted on
    /dev/sda2       50826732 6275392  41939784  14% /
After pulling the Alpine image:
    $ df
    Filesystem     1K-blocks    Used Available Use% Mounted on
    /dev/sda2       50826732 6281400  41933776  14% /
That's 6008KB, or roughly 5.9MB, of disk space consumed. Here is the 'df' output after creating a container from the Alpine image:
    $ df
    Filesystem     1K-blocks    Used Available Use% Mounted on
    /dev/sda2       50826732 6281532  41933644  14% /
    overlay         50826732 6281532  41933644  14% /var/lib/docker/overlay2/85d1cba7ae0074e9ace95f7aa6c318faa96a95fc05cdbe524ffa102ed1f2bd57/merged
That's only an additional 132KB of disk usage on the host. Additionally, we can see that 'merged' is an OverlayFS mount point!
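The arithmetic behind those figures, using the 'Used' column (in 1K-blocks) from the three 'df' runs above. The variable names are my own.

```shell
#!/bin/sh
# "Used" values in 1K-blocks, copied from the three df outputs.
before_pull=6275392
after_pull=6281400
after_run=6281532

image_kb=$((after_pull - before_pull))      # cost of pulling the image
container_kb=$((after_run - after_pull))    # cost of creating the container

echo "image:     ${image_kb} KB"       # image:     6008 KB
echo "container: ${container_kb} KB"   # container: 132 KB
```

The container costs a small fraction of the image because it only adds a thin writable layer on top of the shared, read-only image layer.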
Linux Kernel for Docker
Let me share one issue I ran into on my custom kernel. As previously mentioned, Docker requires kernel features such as namespaces and cgroups. Kernels shipped by major distributions should be generic enough and capable of running Docker Engine. If, like me, you compile a custom kernel, many kernel features may be turned off to keep it lean. Docker supports a few modes of networking among containers. That is where I got burned, by the default bridge mode:
docker: Error response from daemon: failed to create endpoint pixelserv-tls on network dnet: failed to add the host (vethd0ec620) <=> sandbox (vethdb5c1d5) pair interfaces: operation not supported.
I had a strong feeling the error was related to some of the network features being turned off in my custom kernel. For a couple of days, I just did not know which feature was missing. Sifting through Docker's source code on GitHub yielded nothing about the error message. Apparently that was my fault for not knowing where to look. Eventually I figured out that my kernel was missing the driver for the "virtual ethernet pair" (CONFIG_VETH). While there, you also want to enable the macvlan driver (CONFIG_MACVLAN), since macvlan is another of the networking modes supported by Docker.
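In hindsight, a quick check of the kernel config would have saved me those days. Below is a minimal sketch of such a check. The function 'check_kconfig' and the sample config file are made up for illustration; on a real system you would point it at your own kernel config instead.

```shell
#!/bin/sh
# Hypothetical helper: report whether each named kernel option is
# built in (=y) or built as a module (=m) in the given config file.
check_kconfig() {
    config_file=$1
    shift
    for opt in "$@"; do
        if grep -Eq "^${opt}=(y|m)$" "$config_file"; then
            echo "$opt: enabled"
        else
            echo "$opt: MISSING"
        fi
    done
}

# Made-up sample config that mimics my stripped-down kernel.
sample=$(mktemp)
cat > "$sample" <<'EOF'
CONFIG_NAMESPACES=y
CONFIG_CGROUPS=y
CONFIG_BRIDGE=m
# CONFIG_VETH is not set
CONFIG_MACVLAN=m
EOF

check_kconfig "$sample" CONFIG_VETH CONFIG_MACVLAN
# CONFIG_VETH: MISSING
# CONFIG_MACVLAN: enabled
```

Where to find the real config varies: kernels built with CONFIG_IKCONFIG_PROC expose it as '/proc/config.gz' (decompress it with 'zcat' first), and some distributions install it as '/boot/config-$(uname -r)'.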
Docker Desktop for Mac
Docker Desktop for Mac is an all-in-one package for Mac users from Docker Inc. To understand what it is, we should briefly review how Docker was run on Mac before it. Docker is tightly coupled with Linux for running containers. All Docker deployments need a Linux host that runs a compatible kernel. macOS runs the XNU kernel, a hybrid of Mach and BSD-derived components, which is completely different from Linux. Hence, Docker needs a Linux VM where its Docker Engine and containers can run.
In the beginning Docker used VirtualBox to start a Linux VM (known as boot2docker) in the background. The Docker client (the command-line tool), together with a GUI to configure and start the VM, ran natively on Mac. VirtualBox networking was reportedly not that reliable for Docker usage (which I can neither confirm nor deny). Then Docker Inc decided to ditch VirtualBox and came up with its own VM solution.
This new VM solution is HyperKit, a hypervisor implemented on top of Apple's Hypervisor.framework, a thin native macOS layer available since OS X Yosemite. The framework exposes the VT-x capabilities of the CPU and saves hypervisor vendors the effort of writing their own kernel extensions to achieve the same. Docker Desktop for Mac uses HyperKit to run a minimal Linux VM built with Docker's LinuxKit. The VM process is known as com.docker.hyperkit in Activity Monitor on macOS.
Docker Inc promoted the new way as superior to VirtualBox. Users however found that the HyperKit VM allocates the maximum assigned RAM upfront, instead of growing on demand like VirtualBox. Many users thought this was a memory leak, and some claimed their memory usage kept growing to the point where their Macs slowed to a crawl. Plenty of good discussion in this thread. Interestingly, the memory phenomenon has been observed since 2016, when Docker Desktop for Mac was first released. Docker Inc went silent about the issue. If I had to make a guess, I would think that it's a limitation of Apple's Hypervisor.framework. So any change on this issue (if it ever happens) perhaps has to wait for Apple.
When I was looking at this memory issue, I came across some unrelated posts by VirtualBox that said they had years of experience and optimisation in their product. There is truth in it. I would stick to VirtualBox for experimenting with Docker on macOS. For completeness, there is also Docker Desktop for Windows, which uses Microsoft's Hyper-V to run the Linux VM on Windows for hosting Docker Engine and its containers.
Conclusions and Next Steps
This sums up my recent revisit of, and deeper dive into, Docker since my very first and short encounter back in 2017. I will be deploying Docker more often. It certainly seems like a mature time to start doing so. The next steps will be creating the first Dockerfiles and building your own images, having multiple containers talk to each other, and exploring the benefits of this new paradigm for your own applications. Until my next post, here are more resources that I found interesting: one article on how not to build an excessively fat Docker image, details on images, ids and layers from under the hood, and more inner workings of Docker.