Kubernetes storage class

March 14, 2024

Understanding the nature of storage in Kubernetes is crucial to effective operations. At its heart, Kubernetes is about running containerized applications, and while containers are ephemeral, the data often isn't. This is where Kubernetes storage classes come into play.

A Kubernetes storage class acts as a blueprint defining storage types, provisioning, and behavior. They enable automated provisioning that helps Kubernetes administrators scale their deployments and reduce tedious manual work.

The benefits of Kubernetes storage classes are magnified in multi-tenant environments. When multiple tenants have their own storage requirements, storage classes can ensure each tenant gets the right storage and mitigate the risk of conflicts.

This article will delve into the intricacies of Kubernetes storage classes, exploring their dynamic nature, their integration with multi-tenant virtual clusters, and how they can be optimized for many real-world scenarios. By the end, you'll comprehensively understand storage classes and why they're indispensable to the Kubernetes ecosystem.

Summary of key Kubernetes storage class concepts

The table below summarizes the Kubernetes storage class concepts this article will explore in detail.

Concept	Description
Kubernetes storage classes	A mechanism for defining different types of storage offered by a Kubernetes cluster.
Dynamic volume provisioning	A mechanism that allows storage volumes to be created on-demand without a manual call to a cloud or storage provider.
Kubernetes storage class best practices	Recommended strategies and methods for optimal storage class utilization, including retention policies, mount options, labeling, version control, and monitoring.
Securing Kubernetes storage classes	Best practices for access modes, RBAC, namespace isolation, and more.
Real-world Kubernetes storage class use cases	Uses of Kubernetes storage classes in scenarios like CI/CD, logging, and database management.
Troubleshooting Kubernetes storage class issues	Exploring possible solutions for issues like PV-PVC binding, failed provisioning, and capacity issues.

What is a Kubernetes storage class?

Before we jump into the details, let’s take a look at Kubernetes storage class basics. Storage classes act as a blueprint for creating storage. They define the types of storage available, how they're provisioned, and how they should behave. Instead of manually provisioning storage whenever needed, storage classes allow for automated provisioning based on the specified criteria.

Why are Kubernetes storage classes important?

Before diving into storage classes, consider the traditional way of managing storage. Without a defined storage class, the onus would fall on the system administrators whenever an application requires storage. They would have to provision the storage manually, ensure it meets the required specifications, and then bind it to the application in need. This model is cumbersome, error-prone, and certainly not scalable.

The multi-tenancy advantage

The real benefits of storage classes shine in multi-tenancy environments. When multiple users or teams share the same Kubernetes cluster, each with its own requirements and nuances, storage classes ensure each tenant gets the right kind of storage without any overlap or conflict. This is especially valuable when considering setting up a virtual cluster-based developer platform within the primary Kubernetes cluster, enhancing resource allocation, security, and management in a multi-tenant setup.

As the image below illustrates, each team can have its own storage classes (SC-A1, SC-A2, etc.), isolated from those of the other teams.

Storage class (SC) in a multi-tenant Kubernetes environment.

Dynamic volume provisioning

Kubernetes stands out from traditional infrastructure solutions due to its inherently dynamic nature. While containers are transient and can be spun up or down based on demand, the data associated with some containers often requires a more persistent solution. That's where Dynamic Volume Provisioning (DVP) comes into the picture.

There are two primary ways Persistent Volumes (PVs) can be provisioned in Kubernetes:

Static provisioning: Administrators manually allocate storage ahead of time. Because capacity has to be estimated in advance, this practice often results in over-provisioning or under-provisioning of resources. It lacks flexibility, is often wasteful, and does not scale.

A logical overview of static volume provisioning.

Dynamic volume provisioning (DVP): Storage volumes are created on the fly based on the application's demands. When a user creates a Persistent Volume Claim (PVC) and no existing PV matches the claim, Kubernetes can automatically create a PV for it according to a predefined storage class. This improves resource utilization and lets storage scale with the demands of growing applications.

Dynamic volume provisioning using storage classes.
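As a minimal sketch of the dynamic path, the claim below asks for 10Gi from a storage class named standard. The claim name and size are illustrative assumptions; the class is expected to exist in the cluster already.

```yaml
# Hypothetical claim: the name and requested size are placeholders.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data
spec:
  storageClassName: standard   # the storage class that drives provisioning
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
```

If no existing PV satisfies the claim, the provisioner named in the storage class creates one and Kubernetes binds it to the claim.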

The following section will explore how dynamic provisioning is crucial in more complex environments, particularly in multi-tenant developer platforms, and examine how dynamic provisioning integrates with ephemeral storage, virtual clusters, and host cluster synchronization, providing a robust solution for such platforms.

Dynamic Provisioning for a Multi-tenant Developer Platform

In a multi-tenant Kubernetes environment, different teams or users may each have their own virtual cluster. In such a layered environment, managing storage efficiently becomes paramount. Dynamic provisioning shines here because it:

Allows individual virtual clusters to claim storage based on their immediate requirements.
Supports data isolation, since each claim gets its own volume, ensuring that one virtual cluster cannot access another's data unless explicitly permitted.

This feature is particularly crucial in developer platforms, where the nature of work often fluctuates between long-term projects and short-lived, experimental tasks. For a deeper dive into multi-tenancy, refer to our Kubernetes multi-tenancy article.

Ephemeral Storage and Ephemeral Environments

The ephemerality of specific tasks requires a storage solution that can keep pace. Ephemeral storage addresses this demand, providing a temporary solution that aligns with transient environments. For example, in tasks such as testing, staging, or one-off experiments, data persistence beyond the task's lifecycle is unnecessary. These volumes can be rapidly provisioned and discarded once their purpose is served, aligning perfectly with the dynamic nature of developer workflows.

Using dynamic provisioning can help automatically cater to the immediate storage needs of ephemeral environments. It ensures that resources are optimally utilized and promptly freed post-usage, preventing wastage and ensuring agility.
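One way to combine the two ideas is a generic ephemeral volume, where the pod itself carries a claim template and the resulting PVC is removed together with the pod. The sketch below assumes a hypothetical storage class named fast-scratch with reclaimPolicy: Delete; the pod name, image, and command are placeholders as well.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: build-job                      # hypothetical name
spec:
  restartPolicy: Never
  containers:
    - name: builder
      image: busybox:1.36
      command: ["sh", "-c", "echo building > /scratch/out.txt && sleep 30"]
      volumeMounts:
        - name: scratch
          mountPath: /scratch
  volumes:
    - name: scratch
      ephemeral:
        volumeClaimTemplate:           # a PVC is created for, and owned by, this pod
          spec:
            accessModes: ["ReadWriteOnce"]
            storageClassName: fast-scratch   # assumed class with reclaimPolicy: Delete
            resources:
              requests:
                storage: 1Gi
```

Whether the backing volume is also removed when the pod goes away depends on the reclaim policy of the storage class, which is why a Delete policy is assumed here.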

Synchronization: Bridging Virtual and Host Clusters

Ensuring data consistency is critical in a multi-tenant Kubernetes environment. Virtual clusters operate within a host cluster, and while each virtual cluster might have its storage needs, the actual provisioned storage is within the host cluster. There arises a need to synchronize data changes between the two, especially when data in a virtual cluster’s volume needs to be replicated, backed up, or synchronized with a volume in the host cluster or another virtual cluster. How do we ensure seamless integration?

Dynamic provisioning facilitates this by automatically catering to the storage requirements of virtual clusters, ensuring both the source (in the virtual cluster) and target volumes (in the host cluster) have the same or compatible storage classes. This eliminates the need for manual intervention and ensures that the provisioning parameters and configuration for both volumes are consistent, making data synchronization, replication, or backup processes smoother.

Features such as dynamic provisioning, coupled with ephemeral storage and robust synchronization mechanisms, work together to support a complex, multi-layered architecture of the multi-tenant developer platform.

A step-by-step guide to defining storage classes

Storage classes in Kubernetes provide a way for administrators to describe their “classes” of storage. Different classes might map to quality-of-service levels, backup policies, or other administrative policies.

To define a StorageClass resource, specify a provisioner, parameters, and a reclaim policy. Here's an example that shows you how to define a storage class:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard
  labels:
    environment: development
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
  zone: us-west-01
reclaimPolicy: Retain
allowVolumeExpansion: true
volumeBindingMode: Immediate
mountOptions:
  - noexec

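In this example, any PVC that sets storageClassName to “standard” gets a volume of type ‘gp2’ provisioned in the zone us-west-01, provided the cluster runs on AWS. Treat the zone value as a placeholder: real AWS availability zones have names such as us-west-2a. Note also that kubernetes.io/aws-ebs is the legacy in-tree provisioner; recent Kubernetes releases rely on the AWS EBS CSI driver (ebs.csi.aws.com) instead.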

The key components of the storage class used in the example above are:

Provisioner: This required field determines which volume plugin is used to provision PVs.
Parameters: This section is provisioner-specific, meaning each provisioner may require, allow, or ignore certain key-value pairs. You can set parameters such as type, zone, iopsPerGB, and encryption mode to fine-tune the kind of storage and its performance.
ReclaimPolicy: This field can have two values: Retain or Delete. When set to Retain, the PV remains even after the PVC is deleted. With Delete, the PV is deleted automatically. If not specified, the default behavior is Delete.
AllowVolumeExpansion: When set to true, this field allows the volume to be resized. It can only be used to grow a volume, not to shrink it.
VolumeBindingMode: Determines when a Persistent Volume (PV) will be provisioned and bound to a Persistent Volume Claim (PVC). The available modes are:
- Immediate: The default mode. Provisioning and binding happen as soon as the PVC is created.
- WaitForFirstConsumer: Delays the binding and provisioning of a PV until a pod using the PVC is created. This mode ensures that the PV chosen is accessible by the node on which the pod will be scheduled. It's beneficial for multi-zone clusters to avoid unschedulable pods due to a zone mismatch between the PV and the pod.
MountOptions: Mount options are specific to the type of file system being used. Kubernetes does not validate them when the storage class is created, so an unsupported option only surfaces later as a failed mount. Always refer to the provider documentation to check the mount options it supports.
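The same fields apply when the provisioner is a CSI driver. The sketch below is an assumed equivalent for the AWS EBS CSI driver (ebs.csi.aws.com), which must already be installed in the cluster; the class name is hypothetical and the parameter values are illustrative.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard-csi              # hypothetical name
  labels:
    environment: development
provisioner: ebs.csi.aws.com      # CSI driver, assumed to be installed
parameters:
  type: gp3                       # EBS volume type
  encrypted: "true"               # parameter values are strings
reclaimPolicy: Retain
allowVolumeExpansion: true
volumeBindingMode: WaitForFirstConsumer   # provision in the zone where the pod is scheduled
```

With WaitForFirstConsumer, no zone needs to be hardcoded in the parameters because the volume is created where the consuming pod lands.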

After defining a StorageClass, you must create it within the Kubernetes cluster. Here's how:

> kubectl apply -f my-storageclass.yaml

Where “my-storageclass.yaml” contains your storage class definition.

Once applied, ensure that the storage class was created successfully:

> kubectl get storageclass

You should see the newly defined storage class in the list:

NAME       PROVISIONER            RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
standard   kubernetes.io/aws-ebs  Retain          Immediate              true                   1m

Congratulations! You have successfully created a storage class in your cluster.

Kubernetes storage class best practices

Managing storage effectively within Kubernetes is paramount for application reliability and performance. Let’s look into best practices when configuring storage classes.

Implement retention and reclaim policies

The choice between Retain and Delete hinges on how critical the data is. With Retain, if a PVC is deleted, the underlying PV and data remain, safeguarding against accidental data loss. This is valuable for critical data, such as databases or financial systems that handle sensitive transactions and banking operations.

The Delete policy cleans up resources and is useful for temporary or less-critical data to ensure efficient resource utilization. An example is a volume that holds user-uploaded files only until they are processed and moved elsewhere.

The fate of a persistent volume after its PVC is deleted is determined by the storage class reclaimPolicy.

Understand and utilize volume binding modes

Two primary modes exist:

Immediate
WaitForFirstConsumer

Immediate mode binds and provisions the volume as soon as the PVC is created. It suits general-purpose workloads, such as a backend service that processes user orders, where having the volume ready immediately matters more than where the data is located.

The WaitForFirstConsumer binding mode delays PV provisioning and binding until a pod that uses the PVC is created. This preserves locality by ensuring the PV is created in the same zone as the consuming pod, which matters most in multi-zone clusters. An example is a distributed, multi-zone video streaming platform, where data must sit close to the consumer to reduce latency and give viewers the best streaming quality.
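A storage class can pair WaitForFirstConsumer with allowedTopologies to limit where volumes may be created. The following is a sketch only: csi.example.com is a placeholder provisioner, the zone names are invented, and the topology key can differ between drivers.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: zonal-fast                    # hypothetical name
provisioner: csi.example.com          # placeholder; substitute your CSI driver
volumeBindingMode: WaitForFirstConsumer
allowedTopologies:
  - matchLabelExpressions:
      - key: topology.kubernetes.io/zone   # assumed key; check your driver's documentation
        values:
          - zone-a                    # placeholder zone names
          - zone-b
```

Volumes from this class are provisioned only after a pod is scheduled, and only in the listed zones.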

Optimize Kubernetes storage class performance

The storage type (e.g., SSD vs. HDD) impacts I/O performance, so select the type that fits the workload. For example, databases benefit from the high IOPS provided by SSDs, while archival data can reside on slower, cheaper HDDs.

Use consistent labeling

Labeling helps in managing, querying, and filtering resources. For example, grouping storage classes by environment or team allows for efficient resource tracking and access controls. In the above storage class definition example we added a label ‘environment: development’. We can filter resources based on this label:

> kubectl get storageclass -l environment=development

Version control your storage classes

Version control is vital for storage class change tracking. It eases rollbacks and enhances collaboration. The versions can be labeled or annotated. For example:

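metadata:
  ...
  labels:
    version: "v1.0"
  annotations:
    example.com/storageclass-version: "v1.0"
    example.com/storageclass-notes: "Initial version of the storage class"

Use both labels and annotations for storage class versions. Labels help quickly select resources with kubectl, while annotations store the detailed metadata. The example.com prefix is a placeholder for a domain you control, because the kubernetes.io and k8s.io prefixes are reserved for Kubernetes core components. Also, keep the manifests in Git and use Git tags to mark specific versions.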

Regularly monitor and audit

Monitoring provides the continuous oversight essential for spotting potential issues before they become major problems. Monitor metrics like storage consumption rate, available capacity, and I/O operations.

Auditing tools can track changes, helping in troubleshooting and compliance. Prometheus is a great open-source monitoring tool. Combined with kube-state-metrics, it allows users to gather detailed metrics on the storage resources in a Kubernetes cluster. Here are some recommended metrics to monitor:

kube_persistentvolumeclaim_info
kube_persistentvolumeclaim_resource_requests_storage_bytes
kube_persistentvolumeclaim_status_condition

Grafana is another open-source platform for monitoring and observability, and it can integrate with Prometheus to visualize the data.

Remember, when setting up monitoring, it's crucial not just to collect data but to have a plan for alerting on and responding to specific events or anomalies. In addition, many cloud providers or storage vendors offer tools for monitoring storage backends' performance, capacity, and health.
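As one possible starting point for alerting, the Prometheus rule file below fires when a claim stays in the Pending phase. It is a sketch that assumes kube-state-metrics is being scraped and exposes kube_persistentvolumeclaim_status_phase; the group name, alert name, threshold, and severity label are illustrative choices.

```yaml
groups:
  - name: storage-alerts                 # hypothetical group name
    rules:
      - alert: PersistentVolumeClaimPending
        # 1 when the claim is currently in the Pending phase
        expr: kube_persistentvolumeclaim_status_phase{phase="Pending"} == 1
        for: 15m                         # assumed threshold; tune to your environment
        labels:
          severity: warning
        annotations:
          summary: "PVC {{ $labels.namespace }}/{{ $labels.persistentvolumeclaim }} has been Pending for 15 minutes"
```

A claim stuck in Pending usually points to one of the binding or provisioning problems covered in the troubleshooting section of this article.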

Securing Kubernetes Storage Classes

Ensuring the security of storage classes in Kubernetes is a must. There's a lot to consider, from restricting access to sensitive data to ensuring that storage isn't misused. The sections below explain how to keep your Kubernetes storage solutions secure.

Be mindful of access modes

Access modes define how volumes can be accessed from a pod. Matching the mode to the application's requirements is crucial to ensure both functionality and data integrity. The three most commonly used modes are:

ReadWriteOnce (RWO): The volume can be mounted as read-write by a single node. Suitable for databases.
ReadOnlyMany (ROX): The volume can be mounted read-only by many nodes.
ReadWriteMany (RWX): The volume can be mounted as read-write by many nodes. Suitable for shared storage use cases like CMS.

A fourth mode, ReadWriteOncePod (RWOP), restricts read-write access to a single pod and is available only for CSI volumes. Access modes are not specified within the StorageClass itself but rather in the specifications of the PersistentVolume (PV) and PersistentVolumeClaim (PVC).

apiVersion: v1
kind: PersistentVolume
...
spec:
  accessModes:
    - ReadWriteOnce

apiVersion: v1
kind: PersistentVolumeClaim
...
spec:
  accessModes:
    - ReadWriteOnce

Implement storage class and namespace isolation via RBAC

Role-based access control (RBAC) restricts permissions on Kubernetes resources. Organizations can prevent unauthorized changes by setting specific roles for storage classes and ensure only trusted entities can allocate or modify storage, which is especially vital in multi-tenant clusters.

Suppose you have a storage class “gold,” and you want only one team, working in the namespace “team-a,” to create PersistentVolumeClaims (PVCs) that use it.

RBAC rules match on resource type, verb, and (optionally) object name; they cannot match on a field such as a PVC's storageClassName. A resourceNames entry of “gold” would therefore match a PVC object named “gold,” not the storage class, and resourceNames cannot restrict create requests at all. The practical pattern is to use RBAC to control who may manage PVCs in each namespace, and to control which storage classes a namespace may consume with a per-storage-class ResourceQuota or an admission policy.

First, define a role that allows managing PVCs in the team-a namespace:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: team-a
  name: gold-pvc-role
rules:
- apiGroups: [""]
  resources: ["persistentvolumeclaims"]
  verbs: ["get", "list", "watch", "create", "delete"]

Then, bind the role to a user, group, or ServiceAccount. For this example, let's say we have a ServiceAccount named team-a-user in the team-a namespace:

apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: gold-pvc-rolebinding
  namespace: team-a
subjects:
- kind: ServiceAccount
  name: team-a-user
  namespace: team-a
roleRef:
  kind: Role
  name: gold-pvc-role
  apiGroup: rbac.authorization.k8s.io

With this configuration, workloads that authenticate as the team-a-user ServiceAccount can manage PVCs in the “team-a” namespace, and subjects without an equivalent binding receive a forbidden error when they try. To keep other namespaces away from the “gold” storage class, set a ResourceQuota in those namespaces that limits gold.storageclass.storage.k8s.io/persistentvolumeclaims to 0.
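Because RBAC cannot filter PVCs by their storage class, one way to keep a class such as “gold” out of a namespace is a zero quota for that class. The sketch below assumes a hypothetical namespace team-b that must not consume “gold” storage; the quota name is also a placeholder.

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: deny-gold-storage          # hypothetical name
  namespace: team-b                # hypothetical namespace that must not use "gold"
spec:
  hard:
    # No PVCs, and no capacity, may be requested from the "gold" storage class here.
    gold.storageclass.storage.k8s.io/persistentvolumeclaims: "0"
    gold.storageclass.storage.k8s.io/requests.storage: "0"
```

With this quota in place, a PVC in team-b that names the “gold” storage class is rejected at admission, while claims against other classes are unaffected.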

Restrict storage class creation with RBAC

You might want only certain administrators to create new storage classes but allow developers to create PVCs. You can achieve this using RBAC as well:

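# StorageClass is a cluster-scoped resource, so the permission must be
# granted with a ClusterRole (which has no namespace) rather than a Role.
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: storage-class-manager
rules:
- apiGroups: ["storage.k8s.io"]
  resources: ["storageclasses"]
  verbs: ["get", "list", "create", "delete"]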
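Because StorageClass objects are cluster-scoped, a grant like this takes effect through a ClusterRole bound with a ClusterRoleBinding. The sketch below assumes a ClusterRole named storage-class-manager and a hypothetical group called storage-admins; substitute the users or groups defined by your identity provider.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: storage-class-manager-binding   # hypothetical name
subjects:
  - kind: Group
    name: storage-admins                # hypothetical group of administrators
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: storage-class-manager           # assumed to be defined as a ClusterRole
  apiGroup: rbac.authorization.k8s.io
```

Developers who are not in this group can still be given PVC permissions through namespaced roles, without gaining the ability to create or delete storage classes.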

Securely manage secrets

Often, storage classes require credentials to interact with backend storage, especially if it's cloud-based. Kubernetes Secrets can be used to manage these credentials safely. Instead of hardcoding credentials, you should reference a secret.

For example, you'd store your storage provider credentials in a secret and then reference that secret in your storage class or provisioner:

apiVersion: v1
kind: Secret
metadata:
  name: storage-provider-credentials
type: Opaque
data:
  access-key: <base64-encoded-access-key>
  secret-key: <base64-encoded-secret-key>
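For CSI-based provisioners, the reference is typically made through reserved parameter keys on the storage class. The sketch below assumes a placeholder driver named csi.example.com and a hypothetical storage-system namespace holding the secret; which secret keys a driver expects is driver-specific, so check its documentation.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: secured-storage                  # hypothetical name
provisioner: csi.example.com             # placeholder CSI driver
parameters:
  # Reserved keys read by the CSI external provisioner; they are not passed
  # to the driver as ordinary parameters.
  csi.storage.k8s.io/provisioner-secret-name: storage-provider-credentials
  csi.storage.k8s.io/provisioner-secret-namespace: storage-system   # assumed namespace
```

Keeping the secret in a dedicated namespace lets you limit read access to it with RBAC, separately from the namespaces where applications create PVCs.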

Create secure network policies

While primarily for network traffic, network policies can impact storage, especially when considering solutions operating over the network (like NFS or certain cloud providers). Ensure that only authorized pods can communicate with these storage backends.

Real-world Kubernetes storage class use cases

Storage classes prove their value in many real-world scenarios, including the following:

Multi-tenant environments

Different teams or customers share the same cluster resources in a multi-tenant Kubernetes cluster but need logical separation for security and quota enforcement. Kubernetes storage classes are beneficial in these cases because they provide:

Dedicated storage classes per tenant: Each tenant or team can be assigned dedicated storage classes. This ensures that the provisioned storage meets the specific needs and SLAs of each tenant. For instance, 'Team A' might require high-speed SSDs while 'Team B' is okay with slower HDDs.
Quota management: By pairing storage classes with ResourceQuotas, it's possible to enforce how much storage each tenant can consume.

apiVersion: v1
kind: ResourceQuota
metadata:
  name: tenant-a-quota
spec:
  hard:
    gold.storageclass.storage.k8s.io/requests.storage: 100Gi

Database storage management and stateful applications

Databases and other stateful applications have unique storage requirements, like consistent I/O performance:

High throughput: Use storage class parameters to request the performance tier the workload needs. For example, a storage class might utilize provisioned IOPS for databases requiring consistent I/O. Depending on the storage backend, a class can also provision replicated or highly available volumes.

apiVersion: storage.k8s.io/v1
kind: StorageClass
…
parameters:
  type: io1
  iopsPerGB: "10"

Backup and snapshot: Some storage backends support snapshots. Creating a storage class specific to snapshot-enabled volumes makes it easier to automate backup processes for stateful applications.
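In current Kubernetes, snapshot behavior is configured through a VolumeSnapshotClass that sits alongside the storage class. The sketch below assumes the CSI snapshot CRDs and snapshot controller are installed, a placeholder driver named csi.example.com, and an existing PVC called db-data; all names are hypothetical.

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: db-snapshots               # hypothetical name
driver: csi.example.com            # placeholder; must match the PVC's CSI driver
deletionPolicy: Retain             # keep the backend snapshot if the object is deleted
---
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: db-data-snapshot           # hypothetical name
spec:
  volumeSnapshotClassName: db-snapshots
  source:
    persistentVolumeClaimName: db-data   # assumed existing PVC in the same namespace
```

A backup job can create VolumeSnapshot objects like this on a schedule and later restore one by using it as the dataSource of a new PVC.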

CI/CD pipelines, logging, and monitoring

Continuous integration and continuous deployment (CI/CD) pipelines, and logging and monitoring solutions produce or require significant storage.

Ephemeral storage for CI/CD: When running build jobs, it might be necessary to provision temporary storage for the job's lifecycle. An ephemeral storage class can be defined for such use cases, often with faster provisioning times but lower durability guarantees.
Logging and monitoring data: Solutions like ELK (Elasticsearch, Logstash, Kibana) or Grafana with Prometheus require persistent storage to retain metrics and logs. A dedicated storage class tailored for write-intensive operations can be beneficial.

These real-world scenarios highlight the versatility and practicality of Kubernetes storage classes.

Common Kubernetes storage class issues

Even with the best configurations, issues can arise. Knowing how to troubleshoot these problems is essential for cluster administrators. Let's explore some common issues related to storage classes:

PVC and PV binding issues

Sometimes, a persistent volume claim (PVC) may not bind to a persistent volume (PV). This can be due to several reasons.

Mismatched storage class

Symptom: The PVC remains in the pending state.
Resolution: Ensure the storageClassName specified in the PVC matches an existing storage class's name.

Insufficient storage

Symptom: The PVC remains in the pending state if no PV of the required size is available.
Resolution: Either create a PV with the required storage size, check if a ResourceQuota is set to restrict the storage class size, or adjust the size requested in the PVC.

Access mode mismatch

Symptom: A PVC requests an access mode not supported by the available PVs.
Resolution: Ensure that the access mode in the PVC (e.g., ReadWriteOnce) matches the capabilities of the PV or the storage backend.

To diagnose binding issues, describe the PVC to get more details:

> kubectl describe pvc <pvc-name>

Failed provisioning attempts

There could be instances where a storage class fails to provision a PV dynamically.

Cloud provider quotas

Symptom: Dynamic provisioning fails because of cloud provider limitations like reaching the quota for disk resources.
Resolution: Check the cloud provider's console or logs for quota-related messages. Increase the quota or clean up unused resources.

Misconfiguration in storage class parameters

Symptom: The provisioning process fails due to parameters the provisioner doesn't recognize or support.
Resolution: Validate the parameters specified in the storage class. Consult the documentation of the specific provider to ensure compatibility. A related common misconfiguration is a mountOptions entry that the file system or provisioner doesn't support: mount options are not validated when the storage class is created, so the failure appears only when the volume is mounted.

Insufficient permissions

Symptom: The provisioner might not have the required permissions to create resources.
Resolution: Ensure that the provisioner (or the service account it runs under) has adequate RBAC permissions.

To diagnose provisioning failure issues, describe the PVC to get more details:

> kubectl describe pvc <pvc-name>

Other common issues

In addition to the typical binding and provisioning concerns, several other problems can occur, often due to misconfigurations or unexpected environment behaviors. Addressing these requires a keen understanding of the system and sometimes manual intervention. Let's discuss some of these issues:

Deleting PVC doesn't release storage

Symptom: Storage resources aren't freed after deleting a PVC.
Resolution: Ensure the reclaimPolicy of the storage class isn't set to Retain. If it is, manual intervention is required to clean up the PV and associated storage.

Slow performance

Symptom: Applications experience slow I/O.
Resolution: Ensure the storage class parameters are optimized for the workload. For instance, use SSD-backed storage for I/O-intensive applications.

Volume expansion failures

Symptom: After trying to resize a PVC, the new storage isn't reflected, or the resizing process fails.
Resolution: Several resolutions are possible:

The storage class might have allowVolumeExpansion set to false. Ensure it's set to true to enable volume expansion.
Not all storage backends support resizing. Check if the provisioner and the storage backend support dynamic volume expansion.
The resizing request might have exceeded the quota or physical constraints of the backend storage system. Ensure that there's enough available storage and that the requested size doesn't exceed any limits or quotas.
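For reference, an expansion is requested by raising the storage request on the existing PVC and re-applying it. The sketch below assumes a hypothetical claim named app-data that was originally created with 10Gi from a storage class that has allowVolumeExpansion set to true.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data                 # hypothetical existing claim
spec:
  storageClassName: standard     # assumed class with allowVolumeExpansion: true
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi              # raised from 10Gi; requests below the current size are rejected
```

If the new size does not appear, describing the PVC shows the resize conditions and events that indicate which of the causes above applies.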

Understanding the symptoms and resolutions of common issues can save a lot of time and prevent potential data loss. Continuously monitor storage events, set up alerts for unusual patterns, and regularly check the health of PVs and PVCs.
