"System software that manages computer hardware, software resources, and provides common services for computer programs."
That's the quote that you'll find if you search Wikipedia for "Operating System".
Originating as Google's Borg, Kubernetes was created to perform this resource management on a grand scale. Translating to "Helmsman" or "Pilot" in Greek, it was constructed to efficiently command the resources of many machines at once.
Through many iterations, it achieved its current form - a multi-machine operating system complete with an init system, a package manager, and a selection of network managers.
The Init System
Quintessential to the role of the OS is the init system, or process manager. As a Linux SysAdmin, you've almost certainly worked with one - most likely, systemd.
Interactions with Kubernetes are made through calls to its REST API. This API is hosted on the control plane node(s) of your cluster, which command the worker nodes your applications run on. But don't worry - you don't have to write the API calls yourself.
The default client for this API is kubectl, the Kubernetes CLI. It parses YAML files called manifests, which are comparable to the service files in /etc/systemd/system. These manifests specify which binaries to run, and how to run them.
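To make the comparison concrete, here's a minimal manifest of the kind kubectl parses - a hypothetical Deployment (the names `web` and the nginx image tag are just illustrative) that keeps two copies of a container running, much like a service file keeps a daemon running:

```yaml
apiVersion: apps/v1
kind: Deployment          # the "service file" equivalent: a long-running workload
metadata:
  name: web               # illustrative name
spec:
  replicas: 2             # keep two copies running at all times
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: nginx
        image: nginx:1.21 # which "binary" (image) to run
        ports:
        - containerPort: 80
```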
The basic syntax for deploying resources from a manifest is this:
kubectl apply -f <filename.yaml>
To delete these resources, the syntax is similar:
kubectl delete -f <filename.yaml>
Under the hood, kubectl converts them to JSON, then blasts them into the API server. It authenticates itself with a password, token, or TLS certificate. You can read more about that here.
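You can actually watch this happen by turning up kubectl's log verbosity - at higher levels it prints the REST calls it makes, including request URLs and JSON bodies (this requires a reachable cluster, and `deployment.yaml` is a placeholder filename):

```shell
# Print the HTTP requests kubectl makes (method, URL, request body)
# while applying a manifest. -v=9 shows even more detail.
kubectl apply -f deployment.yaml -v=8
```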
When the API server ingests these calls, the data within them gets stored in etcd, a consistent & highly-available key-value database.
kubectl is not the only Kubernetes API client. There are others like Rancher, which provide a GUI for commanding your cluster.
The package manager
You're probably familiar with apt. Especially since this article was written in 2021, following the CentOS scandal! 🙄
The Kubernetes package manager is called helm. It's an add-on to kubectl that provides templating and a more convenient way to install complex, multi-resource applications.
Let's look at the Nextcloud package as an example ⬇️
First, you've got to add a repository. Helm packages are called charts, and they're distributed through repositories. Unlike with apt, there is no "default" Helm repo; you have to add them manually.
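Adding the Nextcloud repo looks something like this (the URL below is an assumption - check the chart's own documentation for the current one):

```shell
# Register the repo under the local name "nextcloud"
# (URL is an assumption - verify against the chart's docs)
helm repo add nextcloud https://nextcloud.github.io/helm/
# Refresh the local index of available charts, like `apt update`
helm repo update
```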
Now you can install the application with a command like this:

# ⬇️ Install the application under the name nextcloud,
# ⬇️ setting config variables for the app's installation with --set,
# ⬇️ from the `nextcloud` application (#2) in the `nextcloud` repo (#1).
helm install --name nextcloud \
  --set nextcloud.username=admin,nextcloud.password=password,mariadb.rootUser.password=secretpassword \
  nextcloud/nextcloud
The network managers
Kubernetes has different kinds of network managers playing at different levels of the stack, for different use cases. It has "system" network managers like Flannel, Calico, and many others. These play at layers 3/4 and handle IP address assignment & DNS.*
There are also "application layer" network managers, called ingress controllers. These you can think of like reverse proxies. Many are built to handle things like TLS termination for you. The main difference between these and "traditional" reverse proxies is that they're built to detect services using the Kubernetes APIs. You authorize them to access segregated components of the cluster using Kubernetes' RBAC features. It's good ol' users & groups, but rebuilt on another level.
I'd love to provide examples of the usage of these network managers. Unfortunately, providing any of value to you would be too complicated for the scope of this article. But you can read more about Kubernetes networking here.
*Yes: You need to address objects by DNS in the cluster. Sorry, I know, but there's no way around it, because you can't count on a static IP for anything in an OS made of ephemeral resources. Services, made up of many different containers, get created and destroyed all the time, as do the worker nodes they run on.
⚡️ Lightning Round ⚡️ (FAQ++)
I am still really confused!!
Kubernetes is quite possibly the most confusing, convoluted, poorly product-managed operating system ever created for commercial use. So...that's normal.
Here's what I recommend: When you come across a component you don't understand - Google its purpose. If that doesn't answer your question, recursively Google all its subcomponents.
Is Kubernetes actually an operating system?
Ok, fine, no it isn't...not yet, anyway.* Right now, most implementations are actually just userland software which run on top of a regular Linux distro like Ubuntu or CentOS.
I'm explaining it to you as an operating system because it performs the OS-like functions you expect as a system administrator. And if you're running it in the cloud - where you don't provision or maintain the nodes manually - it really will work like an OS from your perspective.
*There are distros like Rancher's k3OS, which seek to turn Kubernetes into a full-blown Linux distro capable of natively commodifying hardware into cluster compute capacity. I believe this is the future of Kubernetes.
What is the package format for Kubernetes?
The exclusive package format for Kubernetes is the OCI-compliant container image. So, Docker images, basically. You don't need to remember that. Just remember that any images you build with Docker will run on Kubernetes.
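That means the build-and-deploy loop can stay in familiar Docker territory until the very last step. A sketch, with made-up image and registry names:

```shell
# Build an image with Docker and push it somewhere the cluster can pull from
# (registry.example.com and myapp are placeholder names)
docker build -t registry.example.com/myapp:1.0 .
docker push registry.example.com/myapp:1.0

# Run that same image on Kubernetes
kubectl create deployment myapp --image=registry.example.com/myapp:1.0
```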
Physically, an image is an archive of a container's filesystem + some metadata. When the image gets "pulled" (downloaded), it gets saved to a temp location and extracted before being run.
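You can see this for yourself by exporting a local image with Docker and listing the archive's contents - it's just filesystem layers plus metadata (this requires Docker and a locally pulled nginx image):

```shell
# Export a local image to a tar archive...
docker save nginx:1.21 -o nginx.tar
# ...and list what's inside: a manifest, a config file,
# and one tarball per filesystem layer
tar -tf nginx.tar
```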
Are containers virtual machines?
No. While running, a container is just a process constrained by Linux kernel features: cgroups (control groups), which limit the resources it can use, and namespaces, which control what it can see. The container engine controls which data containers can access, and lies to them about the context in which they're running. So it can tell your Wordpress container that ~/projectname/html is actually /var/www/html. It's like a chroot environment, if you're familiar.
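If you want to feel how thin this illusion is, you can build a crude version of it yourself with the same kernel features, no container engine required (requires root, on Linux):

```shell
# Give a shell its own PID and mount namespaces, like a container engine does.
# --fork and --mount-proc make the new shell see itself as PID 1.
sudo unshare --pid --mount --fork --mount-proc /bin/bash
# Inside: `ps aux` now shows only this shell and its children
```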
Importantly: Containers do not have their own kernel. The "multi-tenancy" features of Kubernetes are considered "soft" multi-tenancy, and there is a whole genre of security vulnerabilities called container escapes.
Should I deploy multiple applications per cluster?
It depends. If by "applications" you mean deploying a highly-available Postgres cluster to power your highly-available Django application? Yes, absolutely! This is exactly what Kubernetes is for.
If, however, you're a law firm wondering if you should run the Nextcloud server with all your client data on the same cluster as your website? Nah. Don't do that.
Remember: Kubernetes is just an operating system. A "cluster" maps roughly to a server. Think of what you'd be comfortable installing on a single server, and work from there.
How is Kubernetes different from Docker?
Docker is a set of Linux packages for conventional distros, built on the containerd engine. It contains image-building tools, and a friendly CLI for interacting directly with the containers you create. Kubernetes also uses containerd under the hood - but it doesn't come with the rest of the Docker tools, because it doesn't need them.
Docker used to have its own Kubernetes competitor called Swarm, but, unfortunately, this product was a commercial failure and was discontinued. So, at the moment? Docker is a developer tool, while Kubernetes is an operating system.