Harbor Image Registry in PKS and Kubernetes

Overview

Just as virtual machines can be created from disk images plus metadata, so containers are created from container images. When you create a Deployment in Kubernetes, you include a Pod template that, among other things, specifies the image to use. If your Nodes are already configured with a registry, this can be just the name and tag; otherwise, you can use a fully qualified reference that starts with the registry’s hostname. Let’s look at each component of this process one by one.
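To make the Pod template concrete, here is a minimal sketch that builds a Deployment manifest as a Python dictionary and prints it as JSON, which kubectl accepts alongside YAML. The app name “web”, the replica count, and the registry host harbor.example.com are assumptions made up for illustration; only the overall shape of the manifest matters here.

```python
import json


def deployment_manifest(name: str, image: str, replicas: int = 2) -> dict:
    """Build a minimal apps/v1 Deployment whose Pod template runs `image`."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            # The Pod template: every Pod the Deployment creates is stamped from it.
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


# Short form: name and tag only; the Node resolves which registry to pull from.
short = deployment_manifest("web", "nginx:1.15-alpine")
# Fully qualified form: hypothetical registry host and project in front of the name.
full = deployment_manifest("web", "harbor.example.com/library/nginx:1.15-alpine")

if __name__ == "__main__":
    print(json.dumps(full, indent=2))
```

If you saved the output as deployment.json, you could hand it to the cluster with “kubectl apply -f deployment.json”. The only field that changes between the two variants is the image string inside the Pod template.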

Container Images

Kubernetes orchestrates running containers, but has nothing to do with building the images they are based on. Typically, you will build your images with Docker, or with some build tool that does it for you. You create the Docker image and push it to a container registry first; then you can refer to it and deploy it to Kubernetes.

To build a Docker image, you start with an empty folder that will become the context for the image. You will at least need a Dockerfile, which is literally a file named “Dockerfile” with no file extension. In it, you describe the steps required to build the image you desire. You can place other files and folders in that initial folder, and then include instructions in the Dockerfile to copy some of those to the image. Here’s a simple example of a Dockerfile (from the Docker docs):

FROM ubuntu:15.04

COPY . /app

RUN make /app

CMD python /app/app.py

The first line in this Dockerfile specifies the base image to start with, in this case, Ubuntu 15.04. The second line instructs Docker to copy the contents of the build context (the folder that contains the Dockerfile, written as “.”) into the /app folder in the image. Next, the RUN instruction runs the make command against that folder while the image is being built. Finally, the CMD instruction sets the default command for containers started from the image: have Python run the file app.py.

To build this folder into an image, you could simply navigate to the folder with the Dockerfile and, assuming you have Docker installed, run:

docker build .

This tells Docker to build an image with the context being “.” (the folder you are currently in). Alternatively, you could have been in any folder and just specified the full path, or even specified a tarball to use as the context. In any case, Docker will create the image and store it locally. Next, you would push it to a registry, which we will get to in the next section.

One thing to note about Docker images is that they are highly layered; in fact, they use a union file system, and each instruction in the Dockerfile above that changes the filesystem (here, COPY and RUN) adds a separate layer on top of the base image’s layers. Image layers are read-only; when a container starts, Docker adds a thin read/write layer on top of them. That read/write layer is one of the main differences between a Docker image and a running Docker container.
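The layering idea can be sketched with Python’s collections.ChainMap, which looks keys up through a stack of dictionaries and sends all writes to the first one. This is only an analogy under simplifying assumptions: the file paths and contents are made up, and a real union file system also handles things like deletions and copying files up from lower layers, which this toy model ignores.

```python
from collections import ChainMap

# Read-only image layers, oldest first; each maps a file path to its content.
base_layer = {"/etc/os-release": "Ubuntu 15.04"}
copy_layer = {"/app/app.py": "print('v1')", "/app/Makefile": "all:"}
run_layer = {"/app/build.log": "make: done"}
image_layers = [base_layer, copy_layer, run_layer]


def start_container(layers):
    """Return a merged view: a fresh writable layer on top of the image layers."""
    writable = {}
    # ChainMap searches its maps left to right, so the newest layer wins on
    # reads, and every write lands in the first map (the writable layer).
    return ChainMap(writable, *reversed(layers))


container = start_container(image_layers)
container["/app/app.py"] = "print('patched')"  # the write goes to the top layer only

assert container["/app/app.py"] == "print('patched')"  # the container sees its change
assert copy_layer["/app/app.py"] == "print('v1')"      # the image layer is untouched
assert container["/etc/os-release"] == "Ubuntu 15.04"  # reads fall through to the base

# A second container started from the same image begins with a clean top layer.
other = start_container(image_layers)
assert other["/app/app.py"] == "print('v1')"
```

Because the image layers are never modified, any number of containers can share them, and each one only pays for the changes it makes in its own top layer.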

Images are also highly layered in the sense that you rarely create a base Docker image; instead, you almost always start with the FROM instruction and specify a base image to start with. For example, you might base your web server on the official NGINX image, and if you chose the Alpine version, then the image you are using as your base (NGINX) used Alpine as its base image. Similarly, you can base new images off your own previously created images.

Once your image is created, whether local or in a registry, you could use Docker to run it with a simple CLI command:

docker run [IMAGE_NAME]

But if you want to deploy that image to Pods in Kubernetes, you’ll need to push it to a container registry.

Container Registries

Container registries are similar in convention to Git repos. You use the “docker push” command to push an updated local copy of an image up to a remote registry, and conversely, you use “docker pull” to get an updated copy of the image from the remote registry to your local machine. You can even run “docker commit” on a running container to build a new local image based on the changes made to the container while it was running.
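As a rough sketch of that push and pull workflow, the helper below assembles the Docker CLI commands you would run to publish a local image and fetch it again. The registry host harbor.example.com and the project name “library” are hypothetical placeholders; “docker tag” is the step that gives a local image its registry-qualified name before pushing.

```python
import shlex

REGISTRY = "harbor.example.com"  # hypothetical registry host
PROJECT = "library"              # hypothetical project (namespace) on that registry


def remote_ref(name: str, tag: str) -> str:
    """Fully qualified reference for an image on the assumed registry."""
    return f"{REGISTRY}/{PROJECT}/{name}:{tag}"


def publish_commands(name: str, tag: str) -> list[list[str]]:
    """Docker CLI invocations that tag a local image and push it."""
    local = f"{name}:{tag}"
    remote = remote_ref(name, tag)
    return [
        ["docker", "tag", local, remote],  # give the local image its remote name
        ["docker", "push", remote],        # upload it to the registry
    ]


def fetch_command(name: str, tag: str) -> list[str]:
    """Docker CLI invocation that downloads the image from the registry."""
    return ["docker", "pull", remote_ref(name, tag)]


if __name__ == "__main__":
    for argv in publish_commands("myapp", "1.0") + [fetch_command("myapp", "1.0")]:
        print(shlex.join(argv))
```

The script only prints the commands. To actually execute one, you could pass the argument list to subprocess.run(argv, check=True) on a machine that has Docker installed and is logged in to the registry.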

Docker images are registered with a name and a tag. That tag is often a version, like myapp:1.0 and later myapp:2.0, but can also represent other types of variants, like which type of Linux the image was based on. For example, as we mentioned above, you can get NGINX built on Alpine Linux by referencing the image “nginx:alpine”, or pin a specific NGINX version built on Alpine with “nginx:1.15-alpine”.
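To show how a name and tag fit together with an optional registry host, here is a simplified parser for image references. It is a sketch of the usual defaulting conventions (Docker Hub as the default registry, “library” as the namespace for official images, “latest” as the default tag), not Docker’s actual implementation, and it deliberately ignores digest references such as name@sha256:... The function and class names are made up for this example.

```python
from typing import NamedTuple


class ImageRef(NamedTuple):
    registry: str
    repository: str
    tag: str


def parse_image_ref(ref: str) -> ImageRef:
    """Split 'registry/repository:tag', filling in the usual defaults."""
    first, slash, rest = ref.partition("/")
    # The first path component counts as a registry host only if it looks like one.
    if slash and ("." in first or ":" in first or first == "localhost"):
        registry, remainder = first, rest
    else:
        registry, remainder = "docker.io", ref

    # A tag, if present, follows the ':' in the final path component.
    head, sep, last = remainder.rpartition("/")
    name, colon, tag = last.partition(":")
    repository = f"{head}{sep}{name}"
    if not colon:
        tag = "latest"

    # Official Docker Hub images live under the implicit 'library' namespace.
    if registry == "docker.io" and "/" not in repository:
        repository = f"library/{repository}"
    return ImageRef(registry, repository, tag)


assert parse_image_ref("nginx:alpine") == ImageRef("docker.io", "library/nginx", "alpine")
assert parse_image_ref("myapp") == ImageRef("docker.io", "library/myapp", "latest")
assert parse_image_ref("localhost:5000/myapp") == ImageRef("localhost:5000", "myapp", "latest")
```

Note how “localhost:5000/myapp” contains a colon that belongs to the registry port, not to a tag, which is why the parser looks for the tag only in the last path component.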

To configure Docker to work with a specific image registry, you just log in to it with the “docker login” command, passing the registry’s hostname; with no hostname, Docker Hub is assumed. That is really the only sense in which you log in with Docker; in fact, you can’t register for an account at docker.com; rather, you create an account at hub.docker.com, their official container registry.

Like GitHub for code, Docker Hub is a very popular home for public images, and it can also host private ones. There are a variety of other options on the public internet; for example, Google Container Registry. Alternatively, you can host a registry locally, either with Docker’s own open-source registry or with a feature-rich enterprise solution like Harbor.

Harbor and PKS

Harbor is an open-source, enterprise-grade registry, and PKS is built to use it by default. PKS and Harbor can both be run on-prem, on company-owned hardware, which can help some industries meet compliance and security requirements. Harbor also improves security with features like container signing and vulnerability scanning.

Just as signed certificates back HTTPS and let us shop online confident that a website is run by who it claims to be, signed containers ensure the image you are deploying comes from where you think it does. Imagine how bad it could be if you were running Kubernetes in a production environment and ran a malicious image instead of the one you were expecting to run.

Harbor can automatically scan images for vulnerabilities as they are pushed into it. This is great for a microservice architecture, because you may end up with many small teams each managing versions and dependencies separately. That can make it tough to keep on top of all the possible vulnerabilities that could be introduced to your environment. Harbor can be configured to block images with known vulnerabilities from being pulled, which keeps them from running in your Kubernetes clusters.
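The blocking rule described above boils down to a severity threshold. The sketch below models that decision in plain Python; it is not Harbor’s API or implementation, and the severity levels, image names, and scan results are all hypothetical values chosen for illustration.

```python
from enum import IntEnum


class Severity(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


def pull_allowed(findings: list[Severity], block_at: Severity) -> bool:
    """Allow the pull only if no finding is at or above the blocking threshold."""
    return all(finding < block_at for finding in findings)


# Hypothetical scan results, keyed by image reference.
scan_results = {
    "library/web:1.0": [Severity.LOW, Severity.MEDIUM],
    "library/api:2.3": [Severity.LOW, Severity.CRITICAL],
    "library/worker:0.9": [],
}

# Images that a policy of "block HIGH and above" would refuse to serve.
blocked = sorted(
    image
    for image, findings in scan_results.items()
    if not pull_allowed(findings, block_at=Severity.HIGH)
)

assert blocked == ["library/api:2.3"]
```

The useful property of a rule like this is that it is enforced in one place, the registry, so each small team does not have to remember to check its own images before deploying.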

Conclusion

In summary, Kubernetes orchestrates the running of containers. Containers are run from images, and images are stored in registries. Before deploying workloads with Kubernetes, you must build your images, typically with Docker, and push them to a container registry. That registry could be Docker Hub, a home-grown registry, a cloud provider’s registry, or an open-source registry like Harbor.

PKS and Harbor both fit VMware’s vision of offering the convenience of the cloud for your on-prem data center, supported by enterprise-grade features. The goal is to enable large companies that prefer their own data centers to benefit from Kubernetes and the container revolution. Harbor rounds out the other VMware solutions, like NSX for networking and vSAN for storage, that make up the full PKS package.