Following our first blog post of the Getting Started with Kubermatic Kubernetes Platform webinar series, this second part continues on the path of showing you how to use KKP to automate your Kubernetes operations at scale.
Specifically, we will show you how Kubermatic KubeOne fits into the KKP architecture. Kubermatic KubeOne is an open source, provider-neutral and feature-complete tool to deploy and manage the initial Kubernetes cluster needed to install KKP.
It installs and provisions Kubernetes, and it upgrades, repairs, and unprovisions the cluster.
We walk you through these steps in our webinar about deploying a Kubernetes cluster using KubeOne.
Get to Know KubeOne
Kubermatic Kubernetes Platform is an operator that needs to run on Kubernetes; KubeOne helps manage clusters much as Kubernetes helps manage workloads.
KubeOne supports:
- Various node operating systems
- All upstream-supported Kubernetes versions
- The full Kubernetes cluster lifecycle (provision, upgrade, repair, unprovision clusters)
- Any provider and infrastructure - including on-prem and bare metal
- Integrating with Terraform
- Accessing non-publicly exposed instances over a bastion host (SSH jump host)
- Proxy environments
KubeOne automates control plane and worker provisioning, stays compatible with kubeadm because it uses kubeadm itself, and its declarative cluster definition makes deployments reproducible.
Creating Clusters on AWS
Step 1: Create the instances and infrastructure to be used by Kubernetes
- KubeOne comes with example Terraform scripts that can be used to get started
Step 2: Build KubeOne configuration manifest
- This defines:
  - Which Kubernetes version will be installed
  - What machines will be used
  - How the cluster will be provisioned
  - What features will be enabled
- The easiest way to get a manifest is to use the kubeone config print command
The KubeOne configuration manifest looks like:
apiVersion: kubeone.io/v1beta1
kind: KubeOneCluster
versions:
  kubernetes: '1.18.6'
cloudProvider:
  aws: {}
Step 3: Run the kubeone install command
Upgrading KubeOne Clusters
Scope of the Upgrade Process
KubeOne takes care of:
- Upgrading kubeadm and kubelet binaries
- Running kubeadm upgrade on all control plane nodes
- Upgrading components and addons deployed by KubeOne
- Optionally upgrading all MachineDeployment objects to the desired Kubernetes version
Upgrades are done in-place; KubeOne connects to nodes over SSH and runs commands needed to upgrade the node.
Worker nodes managed by Kubermatic machine-controller are upgraded using the rolling-upgrade strategy, meaning the old nodes are replaced with the new ones. KubeOne Static Workers are upgraded in-place, similar to the control plane nodes.
Prerequisites
KubeOne does a set of ‘preflight checks’ to make sure the prerequisites are satisfied, including:
- The cluster has been provisioned; Docker, Kubelet and Kubeadm are installed
- Information about nodes from the Kubernetes API matches what we have in the KubeOne configuration (and Terraform state file)
- All nodes are healthy
- The Kubernetes version skew policy is satisfied
Once the upgrade process starts for a node, KubeOne applies the kubeone.io/upgrade-in-progress label to the Node object. The label acts as a lock: if an upgrade has failed or is already in progress, you can’t start another one until the label is removed.
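To make the lock idea concrete, here is a minimal Python sketch that models a node's labels as a plain dictionary. It is a toy model and not KubeOne's implementation: the function names, the exception type, and the empty label value are assumptions made for illustration only.

```python
LOCK_LABEL = "kubeone.io/upgrade-in-progress"


class UpgradeLocked(Exception):
    """Raised when a node already carries the upgrade lock label."""


def begin_upgrade(node_labels: dict) -> None:
    """Place the lock label on a node, refusing if it is already present."""
    if LOCK_LABEL in node_labels:
        raise UpgradeLocked("upgrade already in progress or previously failed")
    # The label value is an assumption; only the presence of the key matters here.
    node_labels[LOCK_LABEL] = ""


def finish_upgrade(node_labels: dict) -> None:
    """Remove the lock label, as happens after a successful upgrade."""
    node_labels.pop(LOCK_LABEL, None)
```

In this model, a failed upgrade never reaches finish_upgrade, so the label stays in place and the next begin_upgrade call is rejected until someone removes the label by hand.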
We recommend that you back up your cluster before running the upgrade process, which you can do using the Backups addon.
Before running an upgrade, always ensure that your KubeOne version supports upgrading to the desired Kubernetes version. You can find more information about supported Kubernetes versions in the Compatibility document. You can check what KubeOne version you’re running using the kubeone version command.
Upgrading the Cluster
You need to update the KubeOne configuration manifest to use the desired Kubernetes version by changing the versions.kubernetes field. As per the Kubernetes version skew policy, it’s only possible to upgrade to the next minor release, or to any patch release as long as the minor version is the same or the next one.
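The rule above can be sketched as a small Python check. This is only an illustration of the "same or next minor version" rule as stated here, not the check KubeOne runs; the function names are hypothetical, and the sketch assumes plain MAJOR.MINOR.PATCH version strings.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Split a 'MAJOR.MINOR.PATCH' string (optional leading 'v') into integers."""
    major, minor, patch = version.lstrip("v").split(".")
    return int(major), int(minor), int(patch)


def upgrade_allowed(current: str, target: str) -> bool:
    """Return True if target is newer and at most one minor release ahead."""
    cur = parse_version(current)
    tgt = parse_version(target)
    if tgt <= cur:
        return False  # same version or a downgrade, so not an upgrade
    if tgt[0] != cur[0]:
        return False  # crossing a major version is outside this sketch
    return tgt[1] - cur[1] in (0, 1)  # patch bump or the next minor release
```

For example, moving from 1.18.5 to 1.18.6 or from 1.18.6 to 1.19.0 passes this check, while jumping from 1.18.6 straight to 1.20.0 does not.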
After modifying the configuration manifest, you can use the apply command to run an upgrade. The kubeone.yaml file is the configuration manifest and the tf.json file is the Terraform state file (which can be omitted if the Terraform Integration is not used). The --upgrade-machine-deployments flag ensures that worker nodes will be upgraded as well.
kubeone apply --manifest kubeone.yaml -t tf.json --upgrade-machine-deployments
The apply command:
- Analyzes the given instances
- Verifies that there is Kubernetes running on those instances
- Runs the preflight checks
- Offers you the chance to upgrade the cluster if needed
You’ll be asked to confirm your intention to upgrade the cluster by typing yes.
INFO[13:59:27 CEST] Determine hostname…
INFO[13:59:31 CEST] Determine operating system…
INFO[13:59:32 CEST] Running host probes…
INFO[13:59:33 CEST] Electing cluster leader…
INFO[13:59:33 CEST] Elected leader "ip-172-31-220-51.eu-west-3.compute.internal"…
INFO[13:59:36 CEST] Building Kubernetes clientset…
INFO[13:59:36 CEST] Running cluster probes…
The following actions will be taken:
Run with --verbose flag for more information.
~ upgrade control plane node "ip-172-31-220-51.eu-west-3.compute.internal" (172.31.220.51): 1.18.5 -> 1.18.6
~ upgrade control plane node "ip-172-31-221-177.eu-west-3.compute.internal" (172.31.221.177): 1.18.5 -> 1.18.6
~ upgrade control plane node "ip-172-31-222-48.eu-west-3.compute.internal" (172.31.222.48): 1.18.5 -> 1.18.6
~ ensure nodelocaldns
~ ensure CNI
~ ensure credential
~ ensure machine-controller
~ upgrade MachineDeployments
Do you want to proceed (yes/no):
After you confirm your intention to upgrade the cluster, the process will start. It usually takes 5-10 minutes for a cluster to be upgraded. At the end, you should see output similar to the following:
INFO[13:59:55 CEST] Determine hostname…
INFO[13:59:55 CEST] Determine operating system…
INFO[13:59:55 CEST] Generating kubeadm config file…
INFO[13:59:56 CEST] Uploading config files… node=172.31.222.48
INFO[13:59:56 CEST] Uploading config files… node=172.31.220.51
INFO[13:59:56 CEST] Uploading config files… node=172.31.221.177
INFO[13:59:57 CEST] Building Kubernetes clientset…
INFO[13:59:58 CEST] Running preflight checks…
INFO[13:59:58 CEST] Verifying that Docker, Kubelet and Kubeadm are installed…
INFO[13:59:58 CEST] Verifying that nodes in the cluster match nodes defined in the manifest…
INFO[13:59:58 CEST] Verifying that all nodes in the cluster are ready…
INFO[13:59:58 CEST] Verifying that there is no upgrade in the progress…
INFO[13:59:58 CEST] Verifying is it possible to upgrade to the desired version…
INFO[13:59:58 CEST] Labeling leader control plane… node=172.31.220.51
INFO[13:59:58 CEST] Draining leader control plane… node=172.31.220.51
INFO[14:00:07 CEST] Upgrading kubeadm binary on the leader control plane… node=172.31.220.51
INFO[14:00:21 CEST] Running 'kubeadm upgrade' on leader control plane node… node=172.31.220.51
INFO[14:00:44 CEST] Upgrading kubernetes system binaries on the leader control plane… node=172.31.220.51
INFO[14:00:59 CEST] Uncordoning leader control plane… node=172.31.220.51
INFO[14:01:00 CEST] Waiting 30s to ensure all components are up… node=172.31.220.51
INFO[14:01:30 CEST] Unlabeling leader control plane… node=172.31.220.51
INFO[14:01:30 CEST] Labeling follower control plane… node=172.31.221.177
INFO[14:01:30 CEST] Draining follower control plane… node=172.31.221.177
INFO[14:01:30 CEST] Upgrading Kubernetes binaries on follower control plane… node=172.31.221.177
INFO[14:01:44 CEST] Running 'kubeadm upgrade' on the follower control plane node… node=172.31.221.177
INFO[14:01:55 CEST] Upgrading kubernetes system binaries on the follower control plane… node=172.31.221.177
INFO[14:02:14 CEST] Uncordoning follower control plane… node=172.31.221.177
INFO[14:02:14 CEST] Waiting 30s to ensure all components are up… node=172.31.221.177
INFO[14:02:44 CEST] Unlabeling follower control plane… node=172.31.221.177
INFO[14:02:44 CEST] Labeling follower control plane… node=172.31.222.48
INFO[14:02:44 CEST] Draining follower control plane… node=172.31.222.48
INFO[14:02:53 CEST] Upgrading Kubernetes binaries on follower control plane… node=172.31.222.48
INFO[14:03:10 CEST] Running 'kubeadm upgrade' on the follower control plane node… node=172.31.222.48
INFO[14:03:24 CEST] Upgrading kubernetes system binaries on the follower control plane… node=172.31.222.48
INFO[14:03:48 CEST] Uncordoning follower control plane… node=172.31.222.48
INFO[14:03:48 CEST] Waiting 30s to ensure all components are up… node=172.31.222.48
INFO[14:04:18 CEST] Unlabeling follower control plane… node=172.31.222.48
INFO[14:04:18 CEST] Downloading PKI…
INFO[14:04:19 CEST] Downloading PKI files… node=172.31.220.51
INFO[14:04:20 CEST] Creating local backup… node=172.31.220.51
INFO[14:04:20 CEST] Ensure node local DNS cache…
INFO[14:04:21 CEST] Activating additional features…
INFO[14:04:22 CEST] Applying canal CNI plugin…
INFO[14:04:34 CEST] Creating credentials secret…
INFO[14:04:34 CEST] Installing machine-controller…
INFO[14:04:37 CEST] Installing machine-controller webhooks…
INFO[14:04:37 CEST] Waiting for machine-controller to come up…
INFO[14:05:03 CEST] Upgrade MachineDeployments…
If the upgrade process fails, it’s recommended to continue manually and resolve errors. In that case, the kubeone.io/upgrade-in-progress label will prevent you from running KubeOne again, but you can remove it by using kubectl, such as: kubectl label node <node-name> kubeone.io/upgrade-in-progress-.
Changing Cluster Properties Using KubeOne Upgrade
If you want to change some of the cluster properties (e.g. enable a new feature), you can use the upgrade command to reconcile the changes.
Modify your manifest to include the desired changes, but don’t change the Kubernetes version (unless you want to upgrade the cluster), and then run the upgrade command with the --force flag:
kubeone upgrade --manifest kubeone.yaml -t tf.json --force
Alternatively, the kubeone apply command can also be used:
kubeone apply --manifest kubeone.yaml -t tf.json --force-upgrade
The --force flag (--force-upgrade when using kubeone apply) instructs KubeOne to ignore the preflight errors, including the error saying that you’re trying to upgrade to the already running version.
At upgrade time, KubeOne ensures that the actual cluster configuration matches the expected configuration, which is why the upgrade command can also be used to modify cluster properties.
Introduction to Machine-Controller and Cluster API
Machine-Controller
Kubermatic machine-controller is an open-source Cluster API implementation that takes care of:
- Creating and managing instances for worker nodes
- Joining worker nodes to a cluster
- Reconciling worker nodes and ensuring they are healthy
Kubermatic machine-controller allows you to define all worker nodes as Kubernetes objects and, more precisely, as MachineDeployments. MachineDeployments work similarly to core Deployments.
You provide the information needed to create instances, while machine-controller creates underlying MachineSet and Machine objects and, based on that, (cloud) provider instances. The (cloud) provider instances are then provisioned and joined to the cluster automatically by machine-controller.
machine-controller continuously watches all MachineDeployment, MachineSet, and Machine objects. If any change happens, it ensures that the actual state matches the desired state.
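The idea of matching actual state to desired state can be illustrated with a short Python sketch. It is a deliberately simplified model, not machine-controller's code: the reconcile function, the machine naming scheme, and the choice of which machines to delete are all assumptions for illustration, and the real controller also handles MachineSets, health checks, and provider API calls.

```python
from typing import List, Tuple


def reconcile(desired_replicas: int, machines: List[str],
              prefix: str = "worker") -> Tuple[List[str], List[str]]:
    """Return (to_create, to_delete) so the machine count matches the desired count."""
    if desired_replicas < 0:
        raise ValueError("desired_replicas must not be negative")
    diff = desired_replicas - len(machines)
    if diff > 0:
        # Too few machines: pick new names that are not already taken.
        existing = set(machines)
        to_create: List[str] = []
        index = 0
        while len(to_create) < diff:
            name = f"{prefix}-{index}"
            if name not in existing:
                to_create.append(name)
            index += 1
        return to_create, []
    if diff < 0:
        # Too many machines: in this sketch, drop the ones at the end of the list.
        return [], machines[desired_replicas:]
    return [], []  # already in the desired state
```

Calling reconcile repeatedly with the same inputs yields the same plan, and once the plan has been carried out, the next call returns two empty lists.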
As all worker nodes are defined as Kubernetes objects, you can manage them using kubectl or by interacting with the Kubernetes API directly. This is a powerful mechanism because you can create new worker nodes, delete existing ones, or scale them up and down, using a single kubectl command.
Kubermatic machine-controller works only with natively-supported providers. If your provider is natively-supported, we highly recommend using machine-controller. Otherwise, you can use KubeOne Static Workers.
Cluster API
Cluster API is a Kubernetes sub-project focused on providing declarative APIs and tooling to simplify provisioning, upgrading, and operating multiple Kubernetes clusters.
We use Cluster API for managing worker nodes, while control plane nodes are managed as described in the Cluster Provisioning and Management section.
The Cluster API controller (e.g. Kubermatic machine-controller) is responsible for acting on Cluster API objects — Machines, MachineSets, and MachineDeployments. The controller takes care of reconciling the desired state and ensuring that the requested machines exist and are part of the cluster.
You can learn more about the Cluster API by checking out the Cluster API repository and the Cluster API documentation website.
Managing Worker Nodes
KubeOne Static Workers are worker nodes provisioned by KubeOne using kubeadm. Similar to the control plane nodes, it’s expected that the user will create and maintain instances for static worker nodes.
This is useful in cases where the infrastructure provider is not natively-supported. In this case, KubeOne will use the static worker nodes provided in the KubeOne Configuration Manifest.
Static worker nodes are defined similarly to the control plane hosts, but they have their own API field called staticWorkers:
# The list of nodes can be overwritten by providing Terraform output.
# You are strongly encouraged to provide an odd number of nodes and
# have at least three of them.
# Remember to only specify your *master* nodes.
controlPlane:
  hosts:
  - publicAddress: '1.2.3.4'
    privateAddress: '172.18.0.1'
    bastion: '4.3.2.1'
    bastionPort: 22 # can be left out if using the default (22)
    bastionUser: 'root' # can be left out if using the default ('root')
    sshPort: 22 # can be left out if using the default (22)
    sshUsername: ubuntu
    # You usually want to configure either a private key OR an
    # agent socket, but never both. The socket value can be
    # prefixed with "env:" to refer to an environment variable.
    sshPrivateKeyFile: '/home/me/.ssh/id_rsa'
    sshAgentSocket: 'env:SSH_AUTH_SOCK'
    # Taints is used to apply taints to the node.
    # If not provided defaults to TaintEffectNoSchedule, with key
    # node-role.kubernetes.io/master for control plane nodes.
    # Explicitly empty (i.e. taints: {}) means no taints will be applied.
    taints:
    - key: "node-role.kubernetes.io/master"
      effect: "NoSchedule"
# A list of static workers, not managed by MachineController.
# The list of nodes can be overwritten by providing Terraform output.
staticWorkers:
  hosts:
  - publicAddress: '1.2.3.5'
    privateAddress: '172.18.0.2'
    bastion: '4.3.2.1'
    bastionPort: 22 # can be left out if using the default (22)
    bastionUser: 'root' # can be left out if using the default ('root')
    sshPort: 22 # can be left out if using the default (22)
    sshUsername: ubuntu
    # You usually want to configure either a private key OR an
    # agent socket, but never both. The socket value can be
    # prefixed with "env:" to refer to an environment variable.
    sshPrivateKeyFile: '/home/me/.ssh/id_rsa'
    sshAgentSocket: 'env:SSH_AUTH_SOCK'
    # Taints is used to apply taints to the node.
    # Explicitly empty (i.e. taints: {}) means no taints will be applied.
    # taints:
    # - key: ""
    #   effect: ""
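The comments in the manifest above contain two pieces of advice: use an odd number of at least three control plane nodes, and configure either a private key or an agent socket per host, not both. A hypothetical Python helper can turn that advice into warnings. This is a sketch, not part of KubeOne: lint_hosts is an invented name, and it assumes the manifest has already been parsed into a dictionary and that hosts are listed in the manifest rather than supplied by Terraform output.

```python
from typing import Dict, List


def lint_hosts(manifest: Dict) -> List[str]:
    """Return warnings based on the advice in the manifest comments."""
    warnings: List[str] = []
    control_plane = manifest.get("controlPlane", {}).get("hosts", [])
    static_workers = manifest.get("staticWorkers", {}).get("hosts", [])

    # An odd number of at least three control plane nodes is encouraged.
    count = len(control_plane)
    if count < 3 or count % 2 == 0:
        warnings.append(
            f"controlPlane: {count} host(s); an odd number of at least three is encouraged"
        )

    # Each host should use a private key OR an agent socket, not both.
    for section, hosts in (("controlPlane", control_plane),
                           ("staticWorkers", static_workers)):
        for index, host in enumerate(hosts):
            if host.get("sshPrivateKeyFile") and host.get("sshAgentSocket"):
                warnings.append(
                    f"{section}.hosts[{index}]: both sshPrivateKeyFile and sshAgentSocket are set"
                )
    return warnings
```

Run against the example above, which lists every available field for reference, this helper would flag both the single control plane host and the hosts that set a key file and an agent socket together.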
Next Up
If you have any questions or comments, please get in touch with our team.
The next installment in this series will continue guiding you through getting started with Kubermatic Kubernetes Platform.
Where to Learn More
- Visit our product page
- Watch the demo: Automate Your Clusters Across Multi-Cloud with KKP
- Find KKP on GitHub