How MayaData Solved Persistent Storage Problems in Kubernetes with Kubera
Harshit Mehndiratta
September 21, 2020

Over the years, Kubernetes has increasingly been adopted as a data layer platform, which has made the Container Attached Storage (CAS) approach more prevalent.

CAS solutions such as OpenEBS have emerged, giving stateful applications easy access to dynamic local Persistent Volumes (PVs) or replicated PVs. Users of the Container Attached Storage approach have also reported lower costs, easier management, and more control for their teams.

OpenEBS, both as a storage layer and as a means to dynamically provision and manage local disks, is an excellent CAS solution for Kubernetes. Founded by MayaData as an open-source storage project, OpenEBS has been accepted by the Cloud Native Computing Foundation (CNCF) and is used by many organizations, including Bloomberg, Comcast, Arista, Orange, and Intuit.

Having built OpenEBS, the leading open-source storage solution, MayaData has turned its attention to the operations of Kubernetes as a data plane by announcing Kubera. Delivered as SaaS, Kubera simplifies operating Kubernetes as a data plane and makes it easy to use OpenEBS for Kubernetes workloads.

Kubera

Kubera is the best way to implement OpenEBS on Kubernetes. It wraps around the Enterprise Edition of OpenEBS and includes 24x7 support for OpenEBS. Everyone from home users to teams building CI/CD or data pipelines can benefit from using Kubera to operate their Kubernetes-based data layer.

Kubera reduces operational costs and increases data resilience by simplifying workflows such as upgrading workloads and upgrading OpenEBS itself.

Kubera provides cluster logging, alerting, and reporting, which saves time and money while improving operations. Kubera can also help you see and manage the underlying environments to ensure that resources deployed across availability zones, and their dynamic provisioning, work as required.

Kubera’s open-source license suits users who do not want to upgrade their data layer as frequently. MayaData supports users via live chat that connects them to the large MayaData team 24x7, 365 days a year.

Kubera supports simple backup of any workload, whether or not that workload uses OpenEBS. That is especially handy for StatefulSet deployments where you might want high performance plus resilience and replication of both node and cluster data.

MayaData Kubera is also the first Kubernetes-native storage solution to support MayaStor, which uses NVMe to deliver per-workload storage services. While MayaStor is not as evolved as OpenEBS, MayaData has still added initial support for MayaStor within Kubera.

Per Workload Backup

Data protection requirements for stateful workloads on Kubernetes are fundamentally different from prior approaches to backups, disaster recovery, and cloud migration. Kubera from MayaData addresses these data resilience issues by enabling per-workload backup policies.

These backup policies are managed from the user’s Kubera account. Each policy specifies an S3-compatible backup target (such as AWS, GCP, or Cloudian HyperStore), the region of that S3 target, a retention count, and a time interval.

Once these policies are in place, they run on the defined schedule. Many typical workloads are preconfigured in Kubera with these policies, so when you ask Kubera to log, visualize, back up, and report on a set of containers, Kubera will recognize the workload and apply a relevant set of defaults.
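As a rough illustration of what such a policy carries, the sketch below models the fields named above (target, region, retention count, and time interval) in Python. The class name, field names, and sample values are all hypothetical and are not taken from Kubera’s API; the code only shows how a schedule and a retention count might interact.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass(frozen=True)
class BackupPolicy:
    """Hypothetical per-workload backup policy; names are illustrative only."""
    workload: str          # workload the policy applies to
    target: str            # label for the S3-compatible backup target
    region: str            # region of the S3 target
    retention_count: int   # number of backups to keep
    interval: timedelta    # time between scheduled backups

    def next_run(self, last_run: datetime) -> datetime:
        """Return when the next scheduled backup is due."""
        return last_run + self.interval

    def prune(self, backup_times: List[datetime]) -> List[datetime]:
        """Return the newest `retention_count` backups, newest first."""
        newest_first = sorted(backup_times, reverse=True)
        return newest_first[: self.retention_count]


# Example values are assumptions made up for this sketch.
policy = BackupPolicy(
    workload="cassandra",
    target="aws-s3",
    region="us-east-1",
    retention_count=3,
    interval=timedelta(hours=6),
)
```

With this policy, five backups taken six hours apart would be pruned down to the three most recent ones.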

Kubera can also detect a growing number of stateful workloads and provide backup features specific to each application. That gives teams using Kubera a more comprehensive, per-application view of backups and analytics.

Automation and Flexibility

OpenEBS is well known for offering several storage engines, such as cStor, Jiva, and LocalPV, which can be matched to different application workloads through a configuration policy.

cStor is a copy-on-write storage engine, meaning that snapshots take the least amount of space. Jiva is useful for its simplicity, low requirements, and node-accessible host path. LocalPV is the preferred choice if you need high speed, dynamic provisioning, or storage that’s not in the data path; dynamic LocalPV is the most commonly used storage engine for OpenEBS and Kubera.
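The trade-offs above can be read as a simple rule of thumb. The following sketch is only that: a hypothetical helper (`suggest_engine` is not part of OpenEBS or Kubera) that encodes the traits mentioned in this section, with the assumption that a need for space-efficient snapshots takes priority over raw speed.

```python
def suggest_engine(needs_snapshots: bool, needs_high_speed: bool) -> str:
    """Rule-of-thumb engine pick based only on the traits described above."""
    if needs_snapshots:
        # cStor is copy-on-write, so snapshots are space-efficient.
        return "cStor"
    if needs_high_speed:
        # LocalPV is the preferred choice when speed matters most.
        return "LocalPV"
    # Jiva: simple, with low requirements.
    return "Jiva"


print(suggest_engine(needs_snapshots=False, needs_high_speed=True))
```

Real deployments weigh more factors than two booleans, so treat this as a reading aid rather than a selection procedure.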

Some workloads require resiliency at several levels, such as the host, availability zone, or data center level, which takes a lot of implementation work. cStor can be used in these situations to pool underlying cloud volumes for expansion capabilities, create multiple replicas to distribute across availability zones, and back up data to any S3 object storage.
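To make the idea of distributing replicas across availability zones concrete, here is a minimal sketch that spreads a replica count over a list of zones round-robin. It is not how cStor or the Kubernetes scheduler places replicas; the function name and zone labels are assumptions for illustration.

```python
from itertools import cycle
from typing import Dict, List


def spread_replicas(replica_count: int, zones: List[str]) -> Dict[str, int]:
    """Assign replicas to zones round-robin so counts differ by at most one."""
    if not zones:
        raise ValueError("at least one zone is required")
    placement = {zone: 0 for zone in zones}
    # zip stops once range() is exhausted, so exactly replica_count
    # assignments are made while cycling through the zones.
    for _, zone in zip(range(replica_count), cycle(zones)):
        placement[zone] += 1
    return placement


print(spread_replicas(3, ["zone-a", "zone-b", "zone-c"]))
```

With three replicas and three zones, each zone holds one replica, so losing a single zone leaves two copies available.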

For all the above data resiliency levels, Kubera can automate the process of backup and data recovery.

Kubera also supports self-resilient workloads such as Cassandra and Kafka. Where you do not need to deploy a multi-level resiliency model with cStor, you may opt for the higher performance offered by a flavor of OpenEBS LocalPV while still getting the third level of resilience at the cluster or data center level.

In this case, Kubera’s approach is to back up one or more nodes of the self-resilient workload as additional protection against data center and cluster failure. Since OpenEBS cStor is not used in the data path, Kubera takes less efficient backups using the Restic backup manager to provide a third level of resilience.

Highly Fault Tolerant

Building fault-tolerant systems is one of the most counter-intuitive aspects of improving resilience in cloud-native architectures. Every component in the underlying infrastructure must be assumed to fail at some point.

OpenEBS is among the well-known fault-tolerant storage systems when it comes to cloud-native design. Every OpenEBS component is a microservice, and these components are loosely coupled so that workloads and the underlying infrastructure are easy to separate.

Unlike traditional storage systems, OpenEBS and Kubera do not store their metadata in a special-purpose database that must be kept resilient at all times, on top of keeping etcd alive in a Kubernetes environment.

In Kubera, the metadata that describes disks is stored via custom resource definitions in etcd itself. The volume metadata, in turn, is stored within the storage volume itself. That means that when a storage volume is attached, it can “find itself” without querying a remote metadata database.

Lastly, OpenEBS partitions the problems of storage performance, resilience, and management on a per-workload basis. If OpenEBS itself fails, it fails only for a particular workload.

Migration and Recovery

In a Kubera-enabled CAS environment, you have granular control over your workload storage resources. Workloads can easily be cloned if you want to develop against a read-only copy of production data.

Suppose a team wants its database workloads deployed across availability zones. It can easily do so using Kubera, which automatically creates an instance of the primary database and replicates it synchronously to a standby instance in a different availability zone.

Teams can also deploy a common pattern using Kubera for production database workloads. Such patterns can serve as default templates for new workloads created by the team or the entire organization.

Kubera’s per-workload backup feature also comes in handy when you want to restore or migrate workloads to a different cloud platform. Cloud platforms often implement different storage solutions, which can lock you in to a particular vendor. With workload-based backup, you can steadily shift your applications, packages, and configuration files to the new cloud platform without any issues.

Fully Open Source

Kubera is 100 percent open-source, and you can get started for free with its open-source license. As your usage increases, you’ll graduate to a scale of operations where you license Kubera on a per-worker-node basis. If a worker node is used by OpenEBS to store data or manage stored data, it is a licensed node, and you pay only for licensed nodes.

Kubera tracks usage and notifies you if your usage has exceeded the provisioned amount. If users have spun up a large number of stateful workloads toward the end of a quarter, you will receive an invoice and a 30-day fix period.

Kubera also provides SLAs, proactive consulting, and more if you require an enterprise support level. The consulting team will also run a version of your environment at MayaData so that all releases are tried and tested.

Flexible Pricing

Kubera pricing starts with a free tier. Individual and team licenses cost $79 per month. That covers all of Kubera’s functionality, including 24x7 support for OpenEBS. As you move to per-cluster or per-node licenses, the price becomes $2,500 per storage node per year.

Kubera users can also choose a different pricing model by committing upfront to an annual license. The prepayment discount on a yearly license is 20% per year, and if a payment is delayed, MayaData charges only a fixed monthly amount beyond the first 30 days.
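For a back-of-the-envelope estimate, the sketch below combines the per-node price and the prepayment discount quoted above. It assumes the 20% discount applies to the whole per-node total, which the pricing description does not spell out, and the node dictionary keys (`stores_data`, `manages_data`) are hypothetical stand-ins for the licensed-node definition given earlier.

```python
from typing import Dict, List

PRICE_PER_NODE_PER_YEAR = 2500   # USD per storage node per year, as quoted above
PREPAY_DISCOUNT_PERCENT = 20     # annual prepayment discount, as quoted above


def count_licensed_nodes(nodes: List[Dict[str, bool]]) -> int:
    """Count worker nodes that store data or manage stored data."""
    return sum(1 for n in nodes if n["stores_data"] or n["manages_data"])


def annual_cost(licensed_nodes: int, prepaid: bool = False) -> float:
    """Estimate the yearly per-node licensing cost in USD."""
    if licensed_nodes < 0:
        raise ValueError("licensed_nodes must be non-negative")
    cost = licensed_nodes * PRICE_PER_NODE_PER_YEAR
    if prepaid:
        # Assumption: the discount applies to the full yearly total.
        cost = cost * (100 - PREPAY_DISCOUNT_PERCENT) / 100
    return float(cost)


nodes = [
    {"stores_data": True, "manages_data": False},
    {"stores_data": False, "manages_data": False},  # not a licensed node
    {"stores_data": False, "manages_data": True},
]
print(annual_cost(count_licensed_nodes(nodes)))
print(annual_cost(count_licensed_nodes(nodes), prepaid=True))
```

Under these assumptions, two licensed nodes come to $5,000 per year, or $4,000 with annual prepayment.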

Large-scale users can also use site licenses, which provide unlimited access to a certain number of users in a facility or on a site. A prepaid site license is best suited to users operating at a scale of 4,000 or 5,000 nodes with OpenEBS implemented on them.

Final Words

Kubera provides a better route to use Kubernetes as a data layer. It provides access to many data agility features that other solutions cannot.

Kubera is sometimes confused with VMware’s vSAN, but the two differ in their data management capabilities. vSAN runs on vSphere to provide storage benefits to workloads that are running on vSphere. In contrast, Kubera runs on OpenEBS to provide data management benefits to underlying clusters running only on Kubernetes.

Kubera is built with a different philosophy in mind compared with vSAN. vSAN, as an enterprise-class storage virtualization software, is built to provide overall management of storage in a single platform. In contrast, Kubera centers around the utilization of Kubernetes as a data layer rather than overall management.
