Diving Into Containers




When I say containers, you may think, "Oh, I know this one, Docker!" That is where I was starting from as well. In reality, Docker is the Kleenex of container technology, just the name brand that everyone knows. I wanted to go beyond just how to use Docker and understand container technology, and the differences between the various offerings, at a more fundamental level. I set out with a few main questions in mind: What are the differences between Docker and Kubernetes? What about Docker and other container solutions like Podman? I know containers can be created with something called a "Dockerfile", but is there another way? How does one create a base container OS? As I learned, I asked and answered many other questions too. Let's get into it!

Overview

Containers are often compared to Virtual Machines (a topic for another time perhaps), but where a Virtual Machine runs a second operating system on a computer, a container is a bundle of an application (or applications) and the libraries and binaries that application needs to run.

“Containers are a technology that helps to orchestrate application development and deployment by sharing the same operating system kernel and isolate the application from the rest of the system providing portability.” [1] Put more simply, container technology lets a developer bundle an application and all of its dependencies in a way that can be deployed on any server and see the same operating environment. This Reddit thread also has some great explanations, including one comparing containers to lunch boxes. [19]

Containers can be used for a multitude of use cases; they are commonly found in Continuous Integration / Continuous Delivery (CI/CD) environments, home labs, and edge computing environments. Using a container requires minimal setup, since the majority of the configuration is built into the container itself. Once the container is built it can be moved to different servers and keep the same functionality, or you can even run multiple instances of the same container to get the same job done faster.

Understanding the Basics

Using a Container

You can use a container without needing to build it yourself; in fact, that is probably the most common use case. To use a container, a "container engine" needs to be installed, and this is most commonly where Docker comes in. Docker is a container tool that allows developers to build, package, and run containers. Developers who build containers publish their container image to a container image repository. The container image is the bundle of code and software needed to execute whatever the container was built to do; the container image repository can be thought of as a library. There are public image repositories, like Docker Hub, and private image repositories. To use a container image, you first need to pull, or "check out", the container image from the repository. Once the image is on the computer you can run the container. With Docker this is done using the pull and run commands. Once a container is running you can interact with it. I have a container running on my Raspberry Pi for Home Assistant; it required minimal configuration to get set up, and I can now interact with the container to control my smart lights.
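
As a rough sketch of that workflow (the nginx image and port numbers here are just illustrative examples, not anything specific from my setup), pulling and running an image with Docker looks something like this:

```bash
# Pull the image from a repository (Docker Hub is the default)
docker pull nginx:latest

# Run the container in the background and map a host port to the container's port
docker run -d --name my-web-server -p 8080:80 nginx:latest
```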

Benefits to Using A Container

Before containers, applications needed to be downloaded, installed, and configured, and this process had to happen every time and everywhere the application needed to run. With advancements in virtualization technology, containers became possible. As I mentioned earlier, a container bundles everything an application needs, including its runtime environment, into a package that can be distributed through image repositories. This makes it easier to move the application, as well as to deploy multiple instances of the same application, say on multiple servers. These benefits have obvious business use cases: you can run multiple containers on the same server, and each gets its own resource management. The same benefits apply to personal use as well. The Home Assistant container I mentioned earlier lets me run the Home Assistant application on my Raspberry Pi 2, even though the application isn't officially supported on that device. Since the container packages everything the application needs, I can run the container and still use the application. This also made using the application as simple as pulling the container image and running it. I didn't need to download code or executables and make sure they were in the right place or compiled correctly. Granted, for most non-server environments containers are a little overkill. One exception to "a container can be bundled and used everywhere" is containers with a Windows base layer, which can only run on Windows. I'll get into both what a base OS is and why you can't run a Windows base layer on a Linux operating system in a little bit.
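
To make that concrete, getting Home Assistant going on the Pi really was just a pull and a run. The flags below are a simplified sketch from memory (the official install instructions add a few more options), so treat it as illustrative rather than a copy-paste recipe:

```bash
# Pull the Home Assistant container image
docker pull ghcr.io/home-assistant/home-assistant:stable

# Run it, keeping the configuration on the host and using the host's network
docker run -d --name homeassistant \
  -v ~/homeassistant:/config \
  --network=host \
  ghcr.io/home-assistant/home-assistant:stable
```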

How to Build A Container

For most people, building a container with Docker is sufficient, and that is primarily what I will cover in this section. I will begin to explore alternatives later on, but as I discovered in my research, that may be enough for its own article! In the "Using a Container" section we briefly touched on the container engine; this same container engine will be needed to build your container. The University of Wisconsin-Madison has a great overview of building a container [12] that covers most of what I will touch on here. If you want a step-by-step guide to creating your first container, follow that article!

Container images almost always start by using a pre-existing image as the starting point, or "base layer". While I won't get fully into the container structure, it is helpful to know at this point that containers are made up of layers; you can add new layers on top of existing containers to create a new container with different functionality. The "base layer" is the start of your container; on top of this layer, application files can be added, as well as any other configuration the application needs to work correctly. In most cases this "base layer" is some form of minimal Linux operating system (OS).

Containers built using Docker use a file called a "Dockerfile" to outline how to construct the container. The start of the Dockerfile declares which "base layer" to use through the `FROM` command. This command is followed by a repository, image, and tag that identify the base layer you wish to use. After the `FROM` line, files can be copied into the container with the `COPY` command, and commands to install software or perform other actions can be run with the `RUN` command.
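
Here is a minimal, hypothetical Dockerfile to show the shape of those instructions; the base image, package, and file names are made up for illustration:

```dockerfile
# Start from a minimal Linux base layer
FROM ubuntu:22.04

# Install anything the application needs (hypothetical dependency)
RUN apt-get update && apt-get install -y python3

# Copy the application files into the image
COPY app.py /app/app.py

# Tell the container what to run when it starts
CMD ["python3", "/app/app.py"]
```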

Docker is the most common container solution, but it is not the only one; others such as Podman and Colima exist as well. These solutions can read Dockerfiles to create container images, and they also have their own native image formats you could write instead. For the majority of users the differences between these solutions are not important; they deal mostly with how the solution keeps track of which containers to run and how network access is configured.

After creating your Dockerfile, the container image needs to be built. To do this with Docker you can simply run the `docker build` command from the same directory as the Dockerfile. This generates the image, which can then be started with the same `docker run` command discussed in the "Using a Container" section.
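
In practice that is just two commands (the image name below is an arbitrary example):

```bash
# From the directory containing the Dockerfile, build the image and give it a name (tag)
docker build -t my-app:latest .

# Start a container from the image you just built, removing it when it exits
docker run --rm my-app:latest
```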

What is Kubernetes?

One of the main questions I wanted to answer as I learned more about containers was the difference between Docker and Kubernetes. Luckily for me the answer was easy; understanding the answer more deeply will probably have to be a separate topic! Docker is a tool for building and running containers. Kubernetes is a tool for container orchestration.

Many tools now exist for container orchestration, and, like Docker for containers, Kubernetes seems to be the "Kleenex" of them all. Container orchestration tools manage the provisioning of infrastructure and the deployment and scaling of a container solution. [14] If I have a container that handles incoming web requests, I would want to scale the number of containers up or down in response to the volume of incoming requests; this is what container orchestration helps with. Beyond that will have to wait for its own Dincher's Digest!
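
Still, as a tiny taste of what that looks like in practice (assuming a Kubernetes cluster already exists and has a Deployment named "web", both hypothetical here), scaling is a one-line request to the orchestrator:

```bash
# Ask Kubernetes to run 5 copies (replicas) of the container behind the "web" deployment
kubectl scale deployment web --replicas=5

# Or let Kubernetes scale the count automatically based on CPU load
kubectl autoscale deployment web --min=2 --max=10 --cpu-percent=80
```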

Deep Dive

From here on things may get more technical as I try to understand the underlying technology!

Building a Container From Nothing

Maybe you noticed I said "Container Images almost always start with a base image"? Well, I was curious how those base images were created! There are a few ways to build an image without a "base image". The first is a little more straightforward and, depending on who you ask, may not count. Docker has the concept of a "scratch" image. This "scratch" image is just an empty file system [8] that allows you to customize the container as you desire. "Scratch" is not really an image, just a keyword letting Docker know to use the lowest file system layer as the starting point. You can build upon a scratch layer in the same way as a normal Dockerfile, with a few notable exceptions: you won't have an OS, so any attempt to use normal shell commands won't work! You can still use the Docker `COPY` command to copy an application binary in and the Docker `CMD` command to start that application, but anything the application needs will have to be copied into the image along with it!
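
A sketch of what that looks like (the binary name is hypothetical, and it would need to be statically compiled, since there is no OS underneath to provide shared libraries):

```dockerfile
# "scratch" tells Docker to start from an empty file system
FROM scratch

# Copy in a statically compiled application binary (hypothetical name)
COPY my-app /my-app

# There is no shell, so CMD must point directly at the binary
CMD ["/my-app"]
```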

So why build from scratch? Pretty much every OS in existence has security vulnerabilities of some kind; by building from scratch you can eliminate any vulnerability not directly related to your application. Another use case is edge computing with size limitations. An application built from scratch will have a much smaller image size than the same application built on a normal base layer.

For the purists out there, starting from "scratch" may not count as starting from nothing, since it still leverages Docker to build the image. For all of you there is the Open Container Initiative (OCI) [2]. This initiative outlines the standards for container images, runtimes, and distribution. Its image specification [9] outlines how to create the file system, image manifest, and much more. While this may be a fun read, it is way more detail than I care for at present! If you want to go somewhere between scratch and nothing, you can take a look at Buildah [15]. Buildah is a tool that helps you write your own containers that comply with OCI standards while still giving you a little help; it only runs on Linux at the moment.
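
Roughly, building from nothing with Buildah looks something like the following sketch (the binary name is hypothetical, and some steps may need root or `buildah unshare` depending on your setup):

```bash
# Start a new working container from an empty (scratch) base
container=$(buildah from scratch)

# Copy a statically compiled binary into it (hypothetical name)
buildah copy $container my-app /my-app

# Set the command the container should run, then commit it as an OCI image
buildah config --cmd /my-app $container
buildah commit $container my-scratch-image
```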

Docker vs. Podman

As previously stated, for most users the difference between container tools is not important, so what is the difference? The primary and original difference (as far as I could tell) is in how the two tools manage running containers. Docker creates a system daemon, dockerd, to watch for events and spin up and manage containers, whereas Podman does not. This difference allows Podman to run without root permissions, while Docker originally required root and still defaults to requiring it. That means a root user inside a Docker container has root on the underlying OS. As a result Podman is typically seen as a more secure alternative, although it does not always have the same capabilities that Docker provides. [16]
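
A nice side effect of the daemonless design is that Podman's command line is nearly a drop-in replacement for Docker's; a quick (purely illustrative) comparison:

```bash
# With Docker, the client talks to the dockerd daemon, which usually runs as root
sudo docker run --rm alpine echo "hello from docker"

# With Podman there is no daemon; the container process runs under your own (non-root) user on the host
podman run --rm alpine echo "hello from podman"
```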

What Makes Containers Work

Docker, and container solutions like it, are built on top of the Linux kernel's containment capabilities. These capabilities, best known through LXC, provide the features needed to isolate processes using AppArmor, SELinux profiles, kernel namespaces, and more [17]. These features were added to the Linux kernel over time to help close security gaps. AppArmor is a Linux application security system used to supplement Unix Discretionary Access Control (DAC) with Mandatory Access Control (MAC) [18]. AppArmor was added to the Linux kernel to provide more robust security features, but it is not always enabled by default. You can build native Linux containers with LXC; the LXC packages can be installed on most Linux distributions with apt-get. Docker makes use of the same technology that LXC does (and that FreeBSD Jails did before it). As a result, Docker primarily works on Linux; it even creates a lightweight Linux virtual machine in order to run on Macs (macOS). The Codementor article [20] explains the concepts behind how Docker works much better than I could.
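
If you want to poke at this layer directly, the LXC tools can create a native Linux container without Docker at all. A sketch on a Debian/Ubuntu-style system (the container name, distribution, and release below are just example values):

```bash
# Install the LXC userspace tools
sudo apt-get install lxc

# Create a container from a downloaded root file system image
sudo lxc-create -n mycontainer -t download -- -d ubuntu -r jammy -a amd64

# Start it, then attach a shell inside the container's namespaces
sudo lxc-start -n mycontainer
sudo lxc-attach -n mycontainer
```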

Windows Containers

While the original containers are Linux specific, the Windows operating system has similar features to achieve the same results. Windows has its own program, runhcs, which is a fork of the Linux runc (the runtime that enables Docker and containerization to work on Linux) [21]. This allows the Windows OS to run both Windows and Linux containers. To use containers on Windows, certain virtualization features need to be enabled, even if you are using Docker for Windows. When Docker is installed on Windows it sets up a Linux virtual machine in order to run Linux containers.
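
For completeness, a Windows-based Dockerfile looks much like a Linux one. The Nano Server base image below is a real Microsoft-published image, but the tag and the application are just illustrative:

```dockerfile
# A Windows base layer (Nano Server) published by Microsoft
FROM mcr.microsoft.com/windows/nanoserver:ltsc2022

# Copy in a hypothetical Windows application and run it when the container starts
COPY app.exe app.exe
CMD ["app.exe"]
```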

So while both Linux and Windows have containerization features, neither Linux nor Windows containers ever truly run on a different OS. Even when you run a Linux container on Windows, it is actually running in a Linux virtual machine. The reverse is not possible: the ability to create a Windows virtual machine on Linux does not exist, so you cannot run a container with a Windows base layer on a Linux operating system. One minor exception to this is image repositories. Both Linux and Windows images can be hosted in the same image repository, since the repository does not care about the contents of the different image layers. If you try to pull a Windows image with a Linux daemon, though, you will get an error, because the daemon does care about the contents.


References

Below are all articles and sites I either directly referenced or found useful in my pursuit of understanding containers!

  1. https://www.redhat.com/en/topics/containers

  2. https://opencontainers.org/

  3. https://www.aquasec.com/cloud-native-academy/container-security/image-repository/

  4. https://www.freecodecamp.org/news/kubernetes-vs-docker-whats-the-difference-explained-with-examples/

  5. https://medium.com/javarevisited/difference-between-docker-kubernetes-and-podman-8b03a4cf03bc

  6. https://www.techtarget.com/whatis/definition/daemon

  7. https://www.linode.com/docs/guides/podman-vs-docker/

  8. https://www.howtogeek.com/devops/how-to-create-your-own-docker-base-images-from-scratch/

  9. https://github.com/opencontainers/image-spec/blob/main/spec.md

  10. https://betterstack.com/community/guides/scaling-docker/podman-vs-docker/

  11. https://blog.iron.io/how-to-create-a-docker-container/

  12. https://chtc.cs.wisc.edu/uw-research-computing/docker-build

  13. https://devopscube.com/podman-tutorial-beginners/

  14. https://cloud.google.com/discover/what-is-container-orchestration#section-1

  15. https://buildah.io/

  16. https://news.ycombinator.com/item?id=38981844

  17. https://linuxcontainers.org/lxc/introduction/

  18. https://apparmor.net/

  19. https://www.reddit.com/r/docker/comments/keq9el/please_someone_explain_docker_to_me_like_i_am_an/?rdt=33578

  20. https://www.codementor.io/blog/docker-technology-5x1kilcbow

  21. https://www.techtarget.com/searchwindowsserver/definition/Microsoft-Windows-Containers

  22. https://www.rancher.cn/the-similarities-and-differences-between-windows-and-linux-containers

  23. https://stackoverflow.com/questions/42158596/can-windows-containers-be-hosted-on-linux

  24. https://forums.docker.com/t/private-repository-linux-and-windows-images/23092/3