Kubevious Guard - Kubernetes Validation Enforcer. Where are we headed, and how does it affect the project?
Ruben Hakopian
June 16, 2022
5 minute read


As the name suggests, Kubevious Guard is about safety, assurance, and peace of mind. The idea behind Guard is to validate changes before they have a chance to enter Kubernetes clusters, cause application downtime, or violate cloud-native best practices. In this article, we’ll do a deep dive into Kubevious Guard: what problem it solves, why and how we are solving that problem, and how it affects the Kubevious project overall.

Kubernetes has gone mainstream and is the platform of choice for many new projects. Yet, there is a strong shortage of Kubernetes talent. As if Kubernetes were not complex enough, now everyone is creating their own CRDs for API gateways, networking, app deployment, cluster management, public cloud management, etc., bringing the complexity to another level. Yes, that’s the beauty of the Kubernetes ecosystem, but unfortunately, it does not make it easier to operate applications in production safely.

Kubernetes is a distributed system, and so is the configuration used to configure, deploy, and run applications on top of it. A combination of name references, labels, selectors, bindings, numbers, annotations, and other loosely coupled conventions tells Kubernetes and the underlying ecosystem products the specifics of an application deployment. Such a declarative approach and its abstractions allow engineers to build reusable configurations and sometimes self-healing applications - at the cost of operational complexity. For example, in a recent discovery, nearly half a million Kubernetes servers were left open to the Internet as a result of misconfiguration (full story). There are many other examples where a simple typo brought down an entire application globally, forcing SREs to deal with the consequences for hours (full story). Every new K8s distro that introduces its own way of configuring apps, and every Kubernetes ecosystem project that extends the platform’s capabilities, aggravates this problem even further, increasing variability and introducing more chances for misconfiguration.

Every once in a while, there are attempts to create a layer on top of complex infrastructure platforms such as Kubernetes and public clouds, promising to simplify usage and to produce the most optimal, error-free, and secure configuration under the hood. While those intentions are genuinely good, such layers come with several issues. They are hard to introduce into existing stacks. They also impose their own opinionated configuration styles, and they expose only a subset of the underlying platform’s capabilities. The latter usually becomes apparent when one is already locked into the layer and deploying the 100th microservice. As a matter of fact, I had my own prior attempt at creating such a layer on top of K8s+AWS.

Are we in the cloud-native quicksand, and anything we do makes the problem worse? Should we go back to monoliths and VMs? Yes, pretty much. Sorry for disappointing you. Haha, I know, you’re way too smart to take that bait.

With Kubevious, we took a different approach. Kubevious started solving the problem of safety and complexity as an application-centric viewer and introspection tool. It helps Kubernetes SREs clearly understand what is going on in their applications and clusters. The way loosely coupled K8s manifests are correlated and grouped under Application nodes proved very attractive to users with vast expertise. The best part was that it required no changes to application deployments, labels, annotations, or other settings.

With time, it grew into a sophisticated rule engine for detecting configuration and state anomalies, best-practices violations, and errors (typos, misconfigurations, conflicts, inconsistencies). Kubevious was equipped with built-in rules to detect such issues in native K8s manifests, for example, missing Services referenced by Ingresses or invalid Pod labels used in Services. It also allows extending the rules using JavaScript-like scripts that have access to the application graph to detect additional issues and industry- or organization-specific bad practices. The key difference was the ability to not just lint manifests one by one but to validate entire applications as a whole package, considering the current runtime state of the cluster.
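As a rough illustration, a custom rule might be sketched as follows. This is pseudocode in the spirit of the JavaScript-like rule scripts described above; the `select`, `hasChildren`, and `error` names are assumptions for illustration, not the actual Kubevious rule API:

```
// Hypothetical rule sketch - the real Kubevious rule API may differ.

// Target script: select every Ingress node in the application graph.
select('Ingress')

// Rule script: flag Ingresses whose backend Service is missing
// from the correlated application graph.
if (!item.hasChildren('Service')) {
    error('Ingress refers to a Service that does not exist');
}
```

Because the rule operates on the correlated application graph rather than a single manifest, it can catch cross-manifest problems that per-file linters miss.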

We learned from the feedback that the viewer that displays issues is just not enough. People don’t want to see the errors and violations in the UI. They just want those breaking changes not to enter the clusters in the first place. Pretty fair ask. That’s why we created Kubevious Guard!

Kubevious Guard is a CLI extension that validates an entire change package against the built-in and custom validation rules present in Kubevious. The change package can be a plain YAML file with multiple manifests, or the output of helm, helmfile, kustomize, or any other manifest generator tool. The CLI then ships this package to the Guard service (which runs in the cluster) to check whether the change package introduces new errors and violations to the cluster or resolves them. The entire change gets rejected if it raises new issues. Such issues can be as simple as the usage of “latest” image tags, or a cross-manifest misconfiguration such as a missing ConfigMap mounted as a volume.
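For instance, a change package containing a Deployment like the hypothetical one below (all names are illustrative) would be rejected if the ConfigMap it mounts is neither in the package nor already present in the cluster:

```yaml
# Hypothetical change package; names are illustrative.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
spec:
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
    spec:
      containers:
        - name: web
          image: example/web-app:latest  # "latest" tag - a simple per-manifest violation
          volumeMounts:
            - name: config
              mountPath: /etc/web-app
      volumes:
        - name: config
          configMap:
            name: web-app-config  # if this ConfigMap exists nowhere, it is a cross-manifest violation
```

Note that the ConfigMap check depends on the runtime state of the cluster, which is exactly why Guard validates the package against the live cluster rather than in isolation.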

Kubevious Guard Intro

The first thing that comes to mind is a Validating Admission Webhook, right? It turns out that it is very hard to run cross-manifest validations using the webhook. Maybe impossible. The execution order would matter a lot: if the webhook for a new Deployment is triggered before the Secret it mounts is created, it would report a false-positive violation. Maybe we should execute only single-manifest validations in the webhook and leave cross-manifest validations to the Guard and the continuous Kubevious validator. If you have an opinion on this, please let us know on Slack or raise a feature request on GitHub.
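For context, registering such a single-manifest webhook would look roughly like the sketch below. The service name, namespace, and path are assumptions for illustration, not the actual Guard deployment:

```yaml
# Hypothetical ValidatingWebhookConfiguration sketch; names are illustrative.
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: guard-webhook-example
webhooks:
  - name: guard.example.kubevious.io
    admissionReviewVersions: ["v1"]
    sideEffects: None
    clientConfig:
      service:
        name: guard-service        # assumed in-cluster validation service
        namespace: kubevious
        path: /validate
    rules:
      - apiGroups: ["apps"]
        apiVersions: ["v1"]
        operations: ["CREATE", "UPDATE"]
        resources: ["deployments"]
```

Each such webhook sees only the one manifest being admitted, which illustrates the ordering problem described above: the Deployment and the Secret it mounts arrive as separate admission requests.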

With Kubevious Guard, we are creating an assurance enforcer for Kubernetes and its ecosystem. At the same time, the continuous Kubevious validation functionality acts as a final audit for issues that happen to slip through the Guard or the admission controller.

I want to mention one more recent development that indirectly improves Guard’s validation capabilities. We want to validate manifests beyond native K8s resources. The first development in that direction is the full integration of Traefik Proxy, which validates the correctness of IngressRoutes, TraefikServices, Middlewares, and TLSOptions. Traefik Proxy would bring down an entire route in the case of a typo in a Middleware. Guard would detect such breaking changes and prevent them from being applied. If there are other integrations you would like to see, please let us know.

Integrating Kubevious Guard with a CI/CD pipeline only takes piping the change package into a bash script:

# Validate Changes with Guard
$ cat changes.yaml | sh <(curl -sfL https://run.kubevious.io/validate.sh)

# Apply Changes
$ kubectl apply -f changes.yaml
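In a CI pipeline, the same two steps slot in before deployment. The sketch below uses hypothetical GitHub Actions steps; the release name and chart path are illustrative, and the validate.sh invocation is the one shown above:

```yaml
# Hypothetical CI steps; release name and chart path are illustrative.
- name: Render change package
  run: helm template my-release ./chart > changes.yaml

- name: Validate changes with Kubevious Guard
  run: cat changes.yaml | sh <(curl -sfL https://run.kubevious.io/validate.sh)

- name: Apply changes
  run: kubectl apply -f changes.yaml
```

If the validation step exits with an error, the pipeline stops before `kubectl apply`, so breaking changes never reach the cluster.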

Are we going to stop further development of the Kubevious viewer and focus purely on developing new validation rules? Absolutely not! The Kubevious UI is used not only to troubleshoot and get instant clarity into the application structure but also to compose new validation rules. The rules engine follows the same correlated navigation structure as the UI, which makes writing custom rules much faster and easier. Any new K8s ecosystem integration immediately appears in the Kubevious UI, the continuous validator, and the Guard!

Follow instructions on GitHub to try out Kubevious Guard on your clusters.