With Kubernetes becoming a ubiquitous platform for running software at scale, an obvious but sometimes overlooked topic is security. Over the past years, several guidelines for securing Kubernetes clusters have been released. Among them is the CIS Benchmark for Kubernetes, which has been published and regularly updated since 2017.
Kubermatic is committed to delivering a modern and secure Kubernetes platform. To meet that commitment, we introduced several changes and new features in our open source Kubermatic Kubernetes Platform (KKP) 2.19 that help platform operators score better on the CIS Benchmark and, in turn, provide a standardized security level to their developers and end users.
The CIS Benchmark recommendations can also be validated automatically with various tools. However, those tools make certain assumptions about cluster architecture. Because KKP’s control plane design differs from those assumptions, the tools mark some tests as “failed” that should be considered “passed”. For example, some recommendations in chapter 1 call for certain file permissions on manifests such as the one for kube-apiserver, but KKP does not write those manifests to disk at all, so these recommendations do not apply to KKP.
We broadly categorized recommendations made by the benchmark into three categories:
- Recommendations affecting running workloads on user clusters.
- Optional features that cluster operators can enable and configure to improve their compliance.
- Changes to how KKP deploys and runs a Kubernetes cluster.
For the first category in particular, we highly recommend downloading the CIS Benchmark and reading chapter 5, “Policies”. Several recommendations there are “soft”: they ask cluster operators to “minimize” the usage of security-relevant settings rather than eliminate them completely, as that might not be possible.
KKP components running on user clusters have been adjusted to follow those recommendations where possible, but we recommend that users review the recommendations carefully and implement them in a way that works for their organization. As an example, KKP offers integration with Open Policy Agent (OPA) to implement such policies.
New Features in KKP 2.19
Building on our existing feature set, the 2.19 release adds functionality that lets cluster operators adhere more closely to the CIS Benchmark and other security guidelines, in accordance with their own requirements. Let’s jump right in with some of the highlights from this release.
Audit Logging Presets
While KKP already supported audit logging, the new audit logging preset feature allows cluster operators to select a pre-defined audit policy maintained by KKP developers. The selection is available when enabling audit logging in the cluster creation wizard. A full description of the feature is available in the KKP documentation. In short, it is built for cluster operators who want a solid baseline for Kubernetes API audit logs that places them in compliance with the CIS Benchmark (both the “minimal” and “recommended” presets cover CIS Benchmark recommendation 3.2.2). Those with more specific requirements can still define a custom audit policy.
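To make the idea of an audit policy more concrete, here is a minimal Python sketch of how Kubernetes picks an audit level: rules are checked in order and the first matching rule decides. The rules below are a hypothetical example and not the contents of either KKP preset; the `AuditRule` class and `audit_level` function are names made up for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuditRule:
    """One simplified audit policy rule (illustrative, not the real API type)."""
    level: str                            # "None", "Metadata", "Request" or "RequestResponse"
    resources: frozenset = frozenset()    # empty set matches every resource
    verbs: frozenset = frozenset()        # empty set matches every verb

    def matches(self, resource: str, verb: str) -> bool:
        resource_ok = not self.resources or resource in self.resources
        verb_ok = not self.verbs or verb in self.verbs
        return resource_ok and verb_ok


# Hypothetical policy for illustration only; NOT a KKP preset.
POLICY = [
    # Events are noisy, so this example does not audit them.
    AuditRule("None", resources=frozenset({"events"})),
    # Never log request bodies for objects that may hold sensitive data.
    AuditRule("Metadata", resources=frozenset({"secrets", "configmaps"})),
    # Log the request body for all other write operations.
    AuditRule("Request", verbs=frozenset({"create", "update", "patch", "delete"})),
    # Catch-all: record metadata for everything else.
    AuditRule("Metadata"),
]


def audit_level(resource: str, verb: str, policy=POLICY) -> str:
    """Return the level of the first rule that matches the request."""
    for rule in policy:
        if rule.matches(resource, verb):
            return rule.level
    # Assumption in this sketch: requests that match no rule are not logged.
    return "None"
```

Because the first match wins, rule order matters: in this example, creating a secret is logged at the “Metadata” level even though a later rule would log other writes at the “Request” level.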
EventRateLimit Admission Plugin
KKP 2.19 also adds support for the EventRateLimit admission plugin, which the CIS Benchmark recommends enabling (item 1.2.9). The plugin allows cluster operators to limit the number of Kubernetes Events generated from a specific namespace, thus preventing short-term “flooding” of the Kubernetes API server if a component (like a third-party operator) generates too many errors.
Because the plugin is still considered alpha by Kubernetes and requires some thought about which limits are acceptable (since rate-limited events might be dropped), we chose not to enable it by default. It can, however, be selected from the dropdown list of admission plugins in the cluster creation wizard. The Kubermatic UI will then present additional elements to configure sensible limits for your cluster. The plugin is also covered in our documentation.
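The trade-off behind those limits can be illustrated with a token bucket, the general technique behind “qps” and “burst” style limits. The following Python sketch is a simplified model for intuition only and is not the plugin’s actual implementation; the class names and the explicit `now` parameter are assumptions made for this example.

```python
class TokenBucket:
    """Allows short bursts up to `burst`, refilled at `qps` tokens per second."""

    def __init__(self, qps: float, burst: int, now: float):
        self.qps = qps
        self.burst = burst
        self.tokens = float(burst)   # start with a full bucket
        self.last = now              # time of the last refill, in seconds

    def allow(self, now: float) -> bool:
        # Refill according to the elapsed time, capped at the burst size.
        elapsed = now - self.last
        self.tokens = min(self.burst, self.tokens + elapsed * self.qps)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False                 # the event would be rejected


class NamespaceEventLimiter:
    """Keeps one independent bucket per namespace (hypothetical helper)."""

    def __init__(self, qps: float, burst: int):
        self.qps = qps
        self.burst = burst
        self.buckets = {}

    def allow(self, namespace: str, now: float) -> bool:
        if namespace not in self.buckets:
            self.buckets[namespace] = TokenBucket(self.qps, self.burst, now)
        return self.buckets[namespace].allow(now)


limiter = NamespaceEventLimiter(qps=1, burst=2)
decisions = [limiter.allow("noisy-operator", now=0.0) for _ in range(3)]
# The third event in the same instant exceeds the burst and is rejected,
# while another namespace is unaffected.
other = limiter.allow("quiet-app", now=0.0)
```

A larger burst tolerates short spikes, while a low refill rate caps sustained event volume. Picking values that are too strict means legitimate events are dropped, which is why the limits deserve some consideration before the plugin is enabled.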
Enhanced Out-of-the-Box Security
The great thing about some of the security improvements in KKP 2.19 is that they do not require any user interaction apart from running the KKP upgrade! With 2.19, KKP delivers even better security built into the platform. A few examples:
- Explicitly setting the allowed TLS ciphers for communication between Kubernetes components. While Go already provides a limited set of ciphers, the CIS Benchmark recommends an even smaller subset to optimize transport security. KKP follows those recommendations now.
- Enabling the NodeRestriction admission plugin by default. This plugin limits the ability to modify resources at the Kubernetes API with credentials used by Kubernetes nodes. That way, the ability for an attacker to use a compromised node’s permissions is greatly reduced.
- When using etcd-launcher for the user cluster etcd rings, peer connections between the etcd instances are automatically upgraded to TLS for enhanced security. This live upgrade requires the enhanced functionality that etcd-launcher offers, so without that feature enabled, the upgrade will not happen.
- Disabling the profiling endpoint on all Kubernetes components. While this debugging endpoint was not accessible in KKP’s control plane architecture anyway, completely turning it off will prevent any attempts to abuse debugging capabilities.
And this list isn’t even exhaustive! In addition, we deliver support for recent Kubernetes patch releases, which include a significant number of CVE fixes. See the full KKP 2.19.0 changelog if you want to know more.
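Several of the defaults listed above surface as command-line flags on kube-apiserver. As a rough sketch of how such settings could be verified, the following Python snippet checks an argument list for `--profiling=false`, the NodeRestriction admission plugin, and an explicit TLS cipher list. The cipher allow-list here is an illustrative assumption and not necessarily the exact set KKP configures; the function names are made up for this example.

```python
def parse_flags(argv):
    """Collect --name=value style arguments into a dict."""
    flags = {}
    for arg in argv:
        if arg.startswith("--") and "=" in arg:
            name, value = arg[2:].split("=", 1)
            flags[name] = value
    return flags


# Assumption: a small illustrative allow-list, not KKP's actual configuration.
ALLOWED_CIPHERS = {
    "TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256",
    "TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256",
    "TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384",
    "TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384",
    "TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305",
    "TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305",
}


def split_list(value):
    """Split a comma-separated flag value, ignoring empty entries."""
    return {item for item in value.split(",") if item}


def check_apiserver_flags(argv):
    """Return a list of human-readable findings; empty means all checks passed."""
    flags = parse_flags(argv)
    findings = []

    if flags.get("profiling") != "false":
        findings.append("profiling endpoint is not explicitly disabled")

    plugins = split_list(flags.get("enable-admission-plugins", ""))
    if "NodeRestriction" not in plugins:
        findings.append("NodeRestriction admission plugin is not enabled")

    ciphers = split_list(flags.get("tls-cipher-suites", ""))
    if not ciphers:
        findings.append("no explicit TLS cipher suites are set")
    elif not ciphers <= ALLOWED_CIPHERS:
        extra = ", ".join(sorted(ciphers - ALLOWED_CIPHERS))
        findings.append("unexpected TLS cipher suites: " + extra)

    return findings
```

In KKP, the control plane runs in the Seed cluster, so the arguments would come from the kube-apiserver container spec rather than from a manifest on a node’s disk.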
How to Increase Security in a Kubernetes Platform
As mentioned earlier in this post, there are automation tools that help with benchmarking your Kubernetes security. Unfortunately, their false positives (and false negatives) made it necessary to verify everything manually to be sure we actually delivered the security improvements we intended. For us, that meant carefully dissecting the requirements and understanding their impact on and applicability to KKP. As pointed out earlier, some of the recommendations simply did not apply to KKP.
Our process for incorporating recommendations started with one of those tools, kube-bench. After running it, we went over every single recommendation and reviewed both the result and how the tool arrived at it. Of course, we focused on the failures, but we also had to make sure that passing tests were correct.
After comparing the actual result to what we would expect it to be, we sorted the failing tests into three categories:
- False positives: recommendations that KKP actually fulfilled but that the automation did not recognize as fulfilled.
- Recommendations that do not apply, either because of platform architecture (take the file permission recommendation for control planes as an example: KKP does not write those resources to disk at all, so no file permissions can be set) or because they ask to enable deprecated Kubernetes features that will be removed in the foreseeable future.
- Recommendations that we wanted to implement for KKP 2.19 and beyond.
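The triage described above can be sketched as a small Python script. It assumes scan results have already been reduced to simple records with an `id` and a `status`, which is a simplification and not kube-bench’s actual output format. The recommendation IDs and verdicts below are invented sample data for illustration, not our real findings.

```python
from collections import defaultdict

# Invented sample data: simplified scan results, one record per recommendation.
SAMPLE_RESULTS = [
    {"id": "1.1.1", "status": "FAIL"},
    {"id": "1.2.9", "status": "FAIL"},
    {"id": "1.2.16", "status": "PASS"},
    {"id": "4.2.6", "status": "FAIL"},
    {"id": "5.1.1", "status": "FAIL"},
]

# Hypothetical verdicts recorded during manual review, keyed by recommendation ID.
REVIEW = {
    "1.1.1": "not_applicable",   # e.g. file permissions on files that do not exist
    "1.2.9": "implement",        # work to be scheduled
    "4.2.6": "false_positive",   # fulfilled, but not detected by the tool
}


def triage(results, review):
    """Group failing recommendation IDs by their manual review verdict."""
    buckets = defaultdict(list)
    for result in results:
        if result["status"] != "FAIL":
            continue
        # Failures without a verdict still need a human to look at them.
        verdict = review.get(result["id"], "needs_review")
        buckets[verdict].append(result["id"])
    return dict(buckets)


summary = triage(SAMPLE_RESULTS, REVIEW)
```

Keeping the verdicts in a separate mapping makes it easy to re-run the scan after each change and see which failures remain unreviewed.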
After identifying which recommendations we could implement, we started applying those changes across the KKP code base. Some changes affected the control plane components spawned in Seed clusters (kube-apiserver, kube-controller-manager, etc.), some affected the way Kubernetes nodes are deployed and configured, and some required new features that added configuration options to our APIs and user interface.
As you can see, it took a lot of tiny steps until those recommendations ended up in KKP 2.19. Thankfully, the actual rollout of those security improvements scales very well, powered by the high level of automation in KKP. There is no need to manually connect to control plane nodes, adjust a flag for kube-apiserver, restart it and verify that the flag was applied: KKP does it all for you when you upgrade to 2.19!
Outlook
While we have significantly increased our coverage of the CIS Benchmark recommendations, we are not done yet and always strive to improve our security posture further. We expect to continue delivering “out of the box” security enhancements that might “fly under the radar”, and we also hope to support features like encryption-at-rest for data stored in etcd. This feature will allow cluster operators to define encryption keys or use a key management service (KMS), and to rotate keys when security policies schedule them for rotation or when they have been compromised.
In addition, we hope to cover further security recommendations coming out of the Kubernetes ecosystem.