You may have noticed that every company and their dog is now embracing Kubernetes. Amazon Web Services (AWS), IBM/Red Hat, Microsoft, VMware, the list goes on and on. Even Docker, which has its own container orchestration program, Docker Swarm, now supports Kubernetes. So, what is it exactly and why is everyone going all in on it?
Once upon a time, and it wasn't so long ago, we still ran server programs directly on server hardware. Then along came virtual machines (VMs), and we could run multiple operating systems and applications on a single platform. This empowered companies to run ten or more server instances on a single physical server. This, in turn, enabled us to run on-demand clouds on top of these VMs, which both saved us money and gave us great flexibility.
But what if you could run even more server programs on a single server if you left behind the bulk of the VM's operating system? That would give us even more cost savings and flexibility. That's exactly what containers deliver.
You see, VM hypervisors, such as Hyper-V, KVM, and Xen, work by emulating virtual hardware. That makes them "fat" in system-requirement terms. Containers, however, share a single operating system kernel. This makes them much more efficient than VMs.
Instead of virtualizing hardware, containers rest on top of a single Linux instance. This means you can run your applications in small, neat capsules. In practical terms, you can run four to ten times the number of server application instances on the same hardware as you can with VMs.
Containers also lend themselves to Continuous Integration/Continuous Deployment (CI/CD). This is a DevOps methodology designed to encourage developers to integrate their code into a shared repository early and often, and then to deploy that code quickly and efficiently.
Last, containers enable developers to easily pack, ship and run any application as a lightweight, portable, self-sufficient package, which can run virtually anywhere.
There's only one little -- well, huge, actually -- problem: How do you manage all those containers? With containers, a single server might have dozens of workloads starting, working, and ending at any moment. That's where Kubernetes comes in.
Kubernetes' history
Kubernetes began inside Google as Borg, a large-scale internal cluster management system for Google-sized job management. A decade or so later, in 2014, Google released the first version of Kubernetes. Originally, it was named Seven of Nine, keeping with Borg's Star Trek theme, but that name was quickly dropped. This open-source container orchestration program deploys containers onto a fleet of machines, provides health management and replication capabilities, and makes it easy for containers to connect to one another and to other programs.
Thus, Kubernetes started with two major advantages. Thanks to its Borg ancestry, it had already been battle-tested by Google, the world's biggest container user. And, by making it open source, Kubernetes was freed of the burden of being a Google-specific program. This was underlined in August 2018, when Google turned over the Kubernetes project's cloud resources to its vendor-neutral home, the independent Cloud Native Computing Foundation (CNCF).
Since the CNCF took over Kubernetes, it's gone from being a project dominated by contributions from Google and Red Hat to one with thousands of contributors. Along the way, it pretty much eliminated all of its competition. There are still rivals out there, but even by 2017, according to the developer research company RedMonk, over 50 percent of Fortune 100 companies were using Kubernetes as their container orchestration platform.
What Kubernetes does
According to Brian Grant, a Google principal engineer and Kubernetes lead architect -- and he should know -- "Kubernetes is a portable, extensible open-source platform for managing containerized workloads and services, that facilitates both declarative configuration and automation." So what does that mean?
Kubernetes runs on top of Linux and works with pods: groups of one or more containers deployed together to a single node, which may be a physical server or a VM. Your commands, usually sent via kubectl, Kubernetes' command-line configuration tool, define the metadata and specifications for your job. These specifications are declarative statements written in JavaScript Object Notation (JSON) or YAML. Via the Kubernetes API, they describe which applications you want to run, what container images they need, and what network and storage resources they'll require to create your cluster's desired state.
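To make that concrete, here is a minimal sketch of such a declarative specification: a pod manifest in YAML. The names are hypothetical, and nginx is used purely as an illustrative image:

```yaml
# pod.yaml -- a minimal, illustrative pod specification
apiVersion: v1
kind: Pod
metadata:
  name: web-pod        # hypothetical pod name
  labels:
    app: web           # label other objects can select on
spec:
  containers:
  - name: web
    image: nginx:1.25  # the container image to run
    ports:
    - containerPort: 80  # the port the container listens on
```

You would hand this to the cluster with `kubectl apply -f pod.yaml`, and Kubernetes takes it from there.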
The Kubernetes master takes your commands, works out how best to run them with the resources available, and relays your marching orders to the nodes, where each node's kubelet agent starts and monitors the pods. You don't need to worry about the exact details. Kubernetes works out which nodes are best suited for the task. Behind the scenes, Kubernetes allocates resources and assigns the pods needed to do the job. Thus, Kubernetes automates setting up, monitoring, and managing containers.
Specifically, Kubernetes enables you to accomplish the following:
- Control and automate application deployments and updates. With it, you describe the desired state for your deployed containers. Kubernetes then changes the actual state of the containerized applications to the desired state at a controlled rate. For instance, you can create new containers, remove existing containers, or update a container's software content.
- Automate container configuration. You provide Kubernetes with a cluster of nodes it can use to run containerized tasks. You then tell Kubernetes how much CPU and memory (RAM) each container needs. Kubernetes then automatically fits containers onto your nodes to make the best use of available resources. In short, it scales containerized applications and their resources on the fly.
- Storage orchestration. Kubernetes enables you to automatically mount the storage system of your choice, including local storage, public cloud providers, or storage-area networks (SANs).
- Service discovery and load balancing. Kubernetes exposes containers using their Domain Name System (DNS) names or IP addresses. If traffic to a container is high, Kubernetes can also load balance and distribute the network traffic to keep the deployment stable.
- Self-healing. Kubernetes restarts containers that fail, replaces containers, kills containers that don’t respond to your user-defined health check, and doesn’t advertise them to clients until they are ready.
- Secrets management. Kubernetes lets you store and manage sensitive data, such as passwords, OAuth tokens, and SSH keys.
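Several of the capabilities above come together in a single Deployment manifest. The following sketch is illustrative -- the names, replica counts, and resource numbers are hypothetical, and nginx again stands in for a real application:

```yaml
# deployment.yaml -- an illustrative Deployment tying together
# desired state, resource requests, and a health check
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-deployment        # hypothetical name
spec:
  replicas: 3                 # desired state: three identical pods
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: nginx:1.25     # changing this tag triggers a controlled rollout
        resources:
          requests:           # what the scheduler uses to place each pod
            cpu: "250m"
            memory: "128Mi"
          limits:             # hard caps on what each pod may consume
            cpu: "500m"
            memory: "256Mi"
        livenessProbe:        # user-defined health check for self-healing
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 5
          periodSeconds: 10
```

Apply it once and Kubernetes continuously reconciles reality against it: if a pod dies or a node drops out, the controller starts replacements until three healthy replicas are running again.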
Put all this together and you get three major business advantages.
First, there's stability, which comes from Kubernetes managing your containers. If a VM, a server, or even a cluster goes down, Kubernetes will automatically spin up more containers to carry the load.
Then, there's continuity. Kubernetes enables you to patch your applications, or swap them out entirely, without affecting your operations. The new containers are brought online while the old ones expire, and the services they deliver keep on running. IT stability used to be all about server uptime. Now, with containers and Kubernetes, it's all about service uptime.
A related plus is resilience. Kubernetes automatically maintains a set number of running containers, using what are called replica sets. When a pod fails, taking its containerized applications down with it, the replica set's remaining containers are already up and running, ready to take over the load.
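This resilience works because clients never talk to an individual pod. A Service -- sketched below with hypothetical names -- load-balances traffic across whichever labeled pods are currently healthy, so a failed pod's replacement picks up work transparently:

```yaml
# service.yaml -- an illustrative Service fronting the replica pods
apiVersion: v1
kind: Service
metadata:
  name: web-service   # hypothetical name; clients address this, not pods
spec:
  selector:
    app: web          # routes to any healthy pod carrying this label
  ports:
  - protocol: TCP
    port: 80          # the port the Service exposes
    targetPort: 80    # the port on the pods behind it
```

The Service's stable name and IP outlive any individual pod, which is what turns "server uptime" into "service uptime."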
In short, a Kubernetes-based IT structure keeps running with up-to-date software when previous methods would either fail or require significant downtime. You can't beat this with a stick.
What Kubernetes isn't
Sounds great, doesn't it? Kubernetes can't do everything, though.
For example, Kubernetes is not a platform as a service (PaaS) system. Kubernetes provides the building blocks for PaaS-like services, but, unlike, say, Cloud Foundry, it doesn't provide the programming tools needed to build cloud-native applications. And, while you can certainly build and use CI/CD systems on top of Kubernetes, it's not inherently a CI/CD system. Also, as a way to manage containers, it doesn't come with application-level services, such as middleware or databases.
It's because of this that there are so many Kubernetes distributions. These include, in a far from comprehensive list, Amazon Elastic Container Service for Kubernetes (EKS), Azure Kubernetes Service (AKS), Google Kubernetes Engine (GKE), IBM Cloud Kubernetes Service, Red Hat's OpenShift, Pivotal Container Service (PKS), and VMware Kubernetes Engine. Besides providing the tools you need to deploy Kubernetes on a particular cloud, each has its own special sauce of additional functionality.
Why Kubernetes is your future
As we continue to move our applications from servers and virtual machines to containers, Kubernetes is inevitable. There's simply no practical way for a system administrator to manage hundreds or thousands of ephemeral containers, even with such DevOps tools as Ansible, Puppet, or Salt. The job needs a dedicated tool, and that tool is Kubernetes.
There have been, and still are, other container orchestration tools. For some specialized uses, they might be better for your company. Generally speaking, though, Kubernetes will be the default container management choice for most businesses.
Because it can run across multiple platforms, even ones run by rival cloud businesses, Kubernetes is becoming a popular choice for hybrid clouds. It's not easy. Not yet. But many Kubernetes distributors are working as hard and as fast as they can to deliver easy-to-deploy, Kubernetes-based hybrid clouds. These will enable you to do such things as keeping your back-end data on your private cloud while using a public cloud for your front-end interface.
Put it all together, and if you're not running Kubernetes yet, you soon will be. Just as Linux fundamentally changed the server operating system space, and VMs led to the rise of the cloud, Kubernetes will lead us to container-based, distributed computing.