Kubernetes YAML Tutorial
An interesting question to address first is: why not a GUI? Why not push buttons and turn knobs to dial in the K8s config you desire? In general, a GUI takes longer, is less standardized, and can't easily be tracked in version control. As DevOps has brought a developer's mindset to IT, GUIs have receded into the darkness, and text-based config has come into the light.
These config files are especially popular with declarative systems that function off "desired state," of which Kubernetes is one. You describe the desired state of your cluster in one or more config files, K8s parses that and loads your desired state into a database, then brings the cluster into compliance with it, and keeps it there using health checks. Many other systems, such as Terraform, work on this same principle, while other systems like Chef and Ansible are procedural, in which you define exactly what they should do step by step.
So that answers why we are configuring our IT with a config file, but that doesn't answer why we are using YAML. JSON is probably the most commonly known format, but it is meant primarily for machines to read and write, though it is reasonable for humans, too. Conversely, YAML is designed specifically for humans to read and write, but is also reasonably easy for computers to parse. A major limitation of JSON is that it does not allow comments, whereas YAML and other formats do. Many have pointed out that TOML has some distinct advantages, but YAML is the most widely known human-centric config format, so that's what K8s works on.
Technically speaking, JSON is (as of YAML 1.2) essentially a subset of YAML, which means that a YAML parser can read virtually any JSON document, but a JSON parser cannot read YAML in general. That means that you could actually use JSON in your K8s YAML config files, but in general it's recommended that you don't do that, since YAML is more user-friendly. On the other hand, if for some reason you were generating your config files programmatically, instead of writing them by hand, then JSON might be appropriate.
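To make the relationship concrete, here is a small sketch in which the same data appears twice in one YAML document, once in the usual block style and once as JSON. The top-level keys block_style and json_style are invented for this illustration and mean nothing to Kubernetes:

```yaml
# Block style, the usual way to write YAML by hand
block_style:
  apiVersion: v1
  kind: Service

# The same data written as JSON, which a YAML parser also accepts
json_style: {"apiVersion": "v1", "kind": "Service"}
```

A YAML parser should load both values as identical maps, whereas a JSON parser would reject the file because of the comments and the block-style section.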
There are two primary data structures inside YAML, maps and lists. Maps are key value pairs, just like a property in JSON. Here's a simple example:
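As a minimal illustration (the key and value are invented here and are not part of any K8s schema):

```yaml
# One key, one value: the simplest possible map
favorite_color: blue
```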
Or here's a more relevant example - at the beginning of a K8s config file, you would specify the version like this:
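A sketch of that opening line, assuming the core v1 API version that the rest of this tutorial uses:

```yaml
# The key is apiVersion, the value is v1
apiVersion: v1
```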
By contrast, a list is like an array - it is a set of values. Here's an example:
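A minimal sketch of a list, with made-up values:

```yaml
# Three elements, each introduced by a dash and a space
- Cat
- Dog
- Goldfish
```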
Notice that items with a preceding dash are considered to be elements of a list.
You can have maps with values that are lists:
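For instance, a map with two keys, each holding a list (the names are illustrative only):

```yaml
# Each key maps to a list of values
Marvel:
  - Wolverine
  - Spider-Man
DC:
  - Superman
  - Batman
```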
And lists of maps:
- Superman: Clark Kent
- Batman: Bruce Wayne
- Wolverine: James Howlett
As you start to describe configurations, you'll need to use complex nesting of both combined (from the official K8s example, guestbook-all-in-one):
spec:
  containers:
  - name: slave
    env:
    - name: GET_HOSTS_FROM
      value: dns
      # If your cluster config does not include a dns service, then to
      # instead access an environment variable to find the master
      # service's host, comment out the 'value: dns' line above, and
      # uncomment the line below:
      # value: env
    ports:
    - containerPort: 6379
There is a lot of additional nuance to YAML, and a bunch of little rules and options. You might have noticed in previous examples that comments begin with a #, but you might not have noticed that when a comment follows other content on the same line, it's essential to have white space (a single space will do) before the #; otherwise the # is read as part of the value. Hopefully, you have enough of an understanding to start reading and writing K8s config files.
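A short sketch of how comments behave; the keys and the URL are invented for illustration:

```yaml
# A comment on a line of its own
image: redis   # a trailing comment, separated from the value by white space
url: http://example.com/page#section   # the '#' inside the value has no space before it, so it stays part of the value
```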
YAML in Kubernetes
When writing a config YAML for Kubernetes, the first thing you do is specify the K8s API version you are writing for. For core objects such as Service, this should just be v1 at the time of this writing (late 2018). You also need to specify the kind of object this config describes - most often you will say Service - though other options include ReplicaSet and Deployment (both of which belong to the apps/v1 API version rather than v1). Here's an example of how a common K8s config YAML would start:
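A minimal sketch of such an opening, assuming the object being described is a Service:

```yaml
apiVersion: v1   # the API version this object belongs to
kind: Service    # the type of object being described
```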
Notice these are both simple maps. We have not yet needed to use a list. Next, we would specify metadata, which does not use a list, but does use nested maps:
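A sketch of what that metadata block might look like. The name and label values are modeled on the guestbook sample and should be treated as illustrative:

```yaml
metadata:
  name: redis-master
  labels:          # a nested map, not a list: no dashes
    app: redis
    tier: backend
    role: master
```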
Notice that indentation is used to denote hierarchy, much as in Python, where indentation is required; most other programming languages use it at least by convention. It might seem that the key value pairs inside of labels are a list, but technically they are not; they are instead a single object with multiple properties. If you were to use an online YAML to JSON converter, you'd notice that's exactly what labels converts into in JSON.
In the next section, we'd get to the spec which includes ports, and ports is a list:
spec:
  ports:
  - port: 6379
    targetPort: 6379
This list happens to only have one element in it, but it is nonetheless a list, and could take multiple elements like this:
spec:
  ports:
  - port: 6379
    targetPort: 6379
  - port: 22
    targetPort: 22
If you converted that to JSON, you'd see that ports is a key whose value is an array of objects, and each of those objects has two properties: port and targetPort.
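As a rough sketch of that conversion, the two-element list might come out as follows. The targetPort values are assumed for illustration, and since JSON is also valid YAML, this form could be placed in a config file as-is:

```yaml
{
  "spec": {
    "ports": [
      {"port": 6379, "targetPort": 6379},
      {"port": 22, "targetPort": 22}
    ]
  }
}
```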
One last important detail of YAML is the separator ---, a line of three dashes that lets you split a single file into multiple documents. That way you can have many K8s services described in a single YAML file, rather than having to create a literal separate file for each service (which you could do, and you could process them all at once by using the K8s CLI, kubectl, and pointing it at an entire folder). However, it's often cleaner to just have all the services in a single file and use kubectl to point to that one file.
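A minimal sketch of one file holding two documents. The service names and ports are assumptions modeled on the guestbook sample, and a real Service would usually also carry a selector:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: redis-master
spec:
  ports:
  - port: 6379
    targetPort: 6379
---
# A second, independent document in the same file
apiVersion: v1
kind: Service
metadata:
  name: redis-slave
spec:
  ports:
  - port: 6379
```

Saved under a hypothetical name such as services.yaml, this could be applied with kubectl apply -f services.yaml, while kubectl apply -f followed by a folder path would process every config file in that folder.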
Armed with this knowledge, you should be ready to read through the official Kubernetes sample file, guestbook-all-in-one.yaml. Spend some time looking through it and try to understand its structure and what it's communicating to Kubernetes. Hopefully, you'll start to feel that YAML config files are not a burdensome difficulty, but rather a clean and simple way to work. Moving from GUIs to text is a critical part of the transformation from traditional IT to DevOps.