Kubernetes resource quotas are crucial for managing resource consumption and ensuring fair allocation among tenants in multi-tenant clusters. By setting resource quotas at the namespace level, administrators can control resource usage and prevent overconsumption.
This article explores resource quota concepts and the best practices Kubernetes administrators can follow to manage multi-tenant clusters effectively and provide a quality experience for all tenants, including the steps to deploy and test resource quotas.
The table below summarizes the Kubernetes resource quota concepts we will explore in this article.

| Concept | Summary |
| --- | --- |
| What resource quotas are | Native Kubernetes objects that impose per-namespace limits on resource consumption |
| Types of resource quotas | Quotas can restrict storage, compute resources (CPU and memory), and object counts |
| Deploying and testing resource quotas | Creating a quota and verifying that it blocks pods breaching its limits |
| Resource quota scopes | Selectively applying quotas to pods based on attributes like Priority Class |
| Limit ranges | Setting default requests and limits so quota enforcement applies to every pod |
| Resource quota limitations | Operational overhead, security considerations, and the inability to restrict CRDs |
| Best practices | Recommendations for managing resources effectively in multi-tenant clusters |
Kubernetes resource quotas are native Kubernetes objects that impose limits on resource consumption. Administrators can set up resource quotas to control resource consumption on a per-namespace level, ensuring each namespace only uses a fair share of resources.
Kubernetes resource quotas are critical in multi-tenant clusters because they ensure no tenant's namespace overconsumes the cluster's resources. A key challenge for multi-tenant cluster administrators is distributing resources equitably among tenants while maintaining a reasonable quality of service for all workloads.
Examples of resources that can be limited using resource quotas include:
- CPU and memory (both requests and limits)
- Storage (total persistent volume storage requested)
- Object counts (such as Services, PersistentVolumeClaims, Secrets, and ConfigMaps)
The Resource Quota object below illustrates an example configuration. This object restricts the total CPU requests in the "team-1" namespace to 12, and the total memory requests to 8Gi.
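A manifest matching this description might look like the following (the object name is illustrative):

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-1-quota   # illustrative name
  namespace: team-1
spec:
  hard:
    requests.cpu: "12"     # total CPU requests allowed in the namespace
    requests.memory: 8Gi   # total memory requests allowed in the namespace
```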
The above example will allow any combination of CPU and memory requests for pods deployed to the "team-1" namespace as long as the total values do not exceed the specified Resource Quota restriction. For example, the above Resource Quota will allow the following pod configurations to be deployed to the "team-1" namespace:
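- A single pod requesting 12 CPUs and 8Gi of memory
- Four pods, each requesting 3 CPUs and 2Gi of memory (totals: 12 CPUs, 8Gi)
- Ten pods, each requesting 1 CPU and 512Mi of memory (totals: 10 CPUs, 5Gi)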
Any combination of CPU/memory requests is allowed as long as the totals do not breach the limits defined in the Resource Quota. Once a limit is reached, new pods will fail to deploy to the relevant namespace: the API server uses the ResourceQuota admission controller to determine whether creating a pod would breach any restrictions and rejects the request if so.
Remember that resource quotas and namespace-level isolation are not considered high-security solutions for restricting resource access. Tools like virtual clusters will be more appropriate for administrators with sensitive workloads requiring stricter boundaries for security purposes.
There are three types of resource quotas cluster administrators can implement: storage quotas, compute resource quotas, and object count quotas.
Administrators can limit the total persistent volume storage capacity requested by a namespace. Limiting storage access is useful when the cluster's storage capacity is limited and must be allocated fairly among tenants.
Resource quotas support limiting storage utilization based on either:
- The total storage requested across all persistent volume claims in the namespace, optionally broken down per storage class
- The number of persistent volume claims that can exist in the namespace
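A sketch of a storage quota combining these options (the object name and values are illustrative, and the per-class key assumes a storage class named "standard" exists):

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: storage-quota   # illustrative name
  namespace: team-1
spec:
  hard:
    requests.storage: 100Gi        # total storage requested across all PVCs
    persistentvolumeclaims: "10"   # total number of PVCs allowed
    # per-storage-class limit (assumes a "standard" storage class)
    standard.storageclass.storage.k8s.io/requests.storage: 50Gi
```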
A common use case for resource quotas is to limit the amount of CPU and memory resources requested by a namespace. Pods can specify requests and limits for each compute resource, and resource quotas can restrict both.
A pod's compute requests specify the minimum resources it needs to run. The kube-scheduler uses this value to determine which worker node has available capacity to run the pod. A pod's compute limits specify the maximum resources the pod can consume: exceeding the CPU limit causes throttling, while exceeding the memory limit causes the container to be terminated, protecting neighboring pods from disruption. A pod can consume more resources than specified in its requests, up to the specified limit values.
The example below shows how to define requests and limits in a pod specification:
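Here is a sketch of a pod with both values set (the pod name, image, and specific numbers are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - name: nginx
    image: nginx
    resources:
      requests:
        cpu: 500m       # reserved CPU, used for scheduling decisions
        memory: 256Mi   # reserved memory
      limits:
        cpu: "1"        # CPU usage is throttled beyond this value
        memory: 512Mi   # exceeding this risks the container being terminated
```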
Kubernetes resource quotas for compute resources are typically configured to restrict both requests and limits to maintain a healthy cluster. Restricting request values ensures worker nodes have enough capacity to schedule pods successfully, while restricting limit values keeps pods operating within reasonable boundaries. These restrictions are often based on the maximum compute resources available on the cluster's worker nodes: allowing requests and limits beyond a worker node's capacity will lead to issues like unschedulable pods.
The example below shows a resource quota that restricts both requests and limits for CPU and memory:
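A sketch of such a quota (the object name and values are illustrative):

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota   # illustrative name
  namespace: team-1
spec:
  hard:
    requests.cpu: "12"      # total CPU requests allowed in the namespace
    requests.memory: 8Gi    # total memory requests allowed
    limits.cpu: "24"        # total CPU limits allowed
    limits.memory: 16Gi     # total memory limits allowed
```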
Kubernetes resource quotas can restrict the number of objects deployed to ensure cluster infrastructure isn't overconsumed. For example, administrators can restrict the number of NodePort Services a tenant can create to ensure a particular tenant doesn't overconsume the cluster's worker node ports. NodePort Services reserve a port number across all nodes in the cluster, so these ports are finite resources to which administrators should limit access. Another object that resource quotas can restrict is the LoadBalancer Service. Creating these Services may trigger the creation of additional cluster infrastructure (depending on the cluster's configuration), so limiting access to them can be important for cost control.
The example below shows a Kubernetes resource quota that restricts both NodePort and LoadBalancer Services:
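A sketch of such a quota (the object name and counts are illustrative):

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: service-quota   # illustrative name
  namespace: team-1
spec:
  hard:
    services.nodeports: "2"       # max NodePort Services in the namespace
    services.loadbalancers: "1"   # max LoadBalancer Services in the namespace
```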
Objects such as PVCs, Secrets, and ConfigMaps can also be restricted.
Let's try deploying a Kubernetes resource quota object and a deployment to consume some compute resources to see what happens. You’ll need a running Kubernetes cluster to deploy these objects. You can use tools like KIND and Minikube to set up a local cluster for testing purposes.
Create a text file called "cpu-quota.yaml" and paste the below contents:
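The quota below caps total CPU requests at 2 (this value is inferred from the walkthrough that follows; the object name is illustrative):

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: cpu-quota
  namespace: team-1
spec:
  hard:
    requests.cpu: "2"   # total CPU requests allowed in the namespace
```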
This object will apply a CPU request limit in the "team-1" namespace. Let's create the namespace and apply the resource quota:
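```bash
kubectl create namespace team-1
kubectl apply -f cpu-quota.yaml
```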
We can verify the resource quota was created by describing the object via Kubectl:
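```bash
kubectl describe resourcequota cpu-quota --namespace team-1
```

The output should look something like this (illustrative):

```
Name:         cpu-quota
Namespace:    team-1
Resource      Used  Hard
--------      ----  ----
requests.cpu  0     2
```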
The output shows the resource quota created and how many CPU requests have been consumed in the "team-1" namespace. This is useful for checking whether a namespace is close to reaching its limit so the administrator may take action like warning the tenant or raising the quota. Enabling alerts with tools like Prometheus is useful in combination with resource quotas.
Now let's create a Deployment in an "nginx-deployment.yaml" file that consumes 2 CPUs, which our quota allows:
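A sketch of such a Deployment (the names and image are illustrative):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  namespace: team-1
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        resources:
          requests:
            cpu: "2"   # consumes the entire CPU request quota
```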
Let's create the deployment:
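```bash
kubectl apply -f nginx-deployment.yaml
kubectl get pods --namespace team-1   # verify the pod is running
```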
The deployment’s pod has been created successfully. Now let's describe the resource quota object again:
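```bash
kubectl describe resourcequota cpu-quota --namespace team-1
```

This time, the output should show the quota fully consumed (illustrative):

```
Name:         cpu-quota
Namespace:    team-1
Resource      Used  Hard
--------      ----  ----
requests.cpu  2     2
```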
We can see the "Used" value in the resource quota has changed to reflect the CPU requests consumed by the Deployment we created earlier. The resource quota will not allow any more CPU requests in the "team-1" namespace because the "Used" CPU requests value has reached the restriction we specified when creating the Resource Quota.
We can verify the resource quota will now block any new pods from deploying because the CPU requests will breach the quota. Create a "pod.yaml" file with the contents below:
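Any pod that requests additional CPU will do; this one asks for 1 CPU (the pod name and image are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-2
  namespace: team-1
spec:
  containers:
  - name: nginx
    image: nginx
    resources:
      requests:
        cpu: "1"   # the quota has no CPU requests remaining
```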
Now let's try deploying this pod and observe what happens:
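```bash
kubectl apply -f pod.yaml
```

The request should be rejected with an error along these lines:

```
Error from server (Forbidden): error when creating "pod.yaml": pods "nginx-2" is forbidden: exceeded quota: cpu-quota, requested: requests.cpu=1, used: requests.cpu=2, limited: requests.cpu=2
```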
The output shows the new pod will exceed the CPU requests quota, so the pod cannot be created. This demonstrates how a Kubernetes resource quota can restrict resource consumption within a namespace.
Administrators can validate their ResourceQuota objects with tools like Kubeconform and “kubectl --dry-run” to ensure the object schema matches the Kubernetes API schema. This helps detect configuration issues before objects are deployed to the cluster.
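For example, assuming the "cpu-quota.yaml" manifest from earlier (Kubeconform must be installed separately):

```bash
# Server-side dry run: the API server validates the object without persisting it
kubectl apply -f cpu-quota.yaml --dry-run=server

# Offline validation against published Kubernetes JSON schemas
kubeconform cpu-quota.yaml
```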
Kubernetes resource quota scopes are a helpful tool for applying multiple quotas in the same namespace to different workloads. Administrators may want a quota to restrict certain types of pods while allowing other pods more freedom, and the resource quota "scope" feature provides a way to enforce restrictions on resource consumption selectively.
A resource quota scope defines which pods the resource quota will restrict. Scopes can match pod attributes such as the Priority Class or Quality of Service class.
Let's implement an example where two resource quotas are configured to target different pod Priority Classes. Create a file called "quotas-and-classes.yaml" with the following contents:
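Here is a sketch of such a manifest. The quota names and CPU values follow from the walkthrough below, while the Priority Class definitions and their priority values are illustrative assumptions:

```yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: low
value: 1000        # illustrative priority value
---
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high
value: 2000        # illustrative priority value
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: low-priority-resource-quota
  namespace: team-1
spec:
  hard:
    requests.cpu: "2"   # pods with the "low" Priority Class share 2 CPUs of requests
  scopeSelector:
    matchExpressions:
    - operator: In
      scopeName: PriorityClass
      values: ["low"]
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: high-priority-resource-quota
  namespace: team-1
spec:
  hard:
    requests.cpu: "4"   # pods with the "high" Priority Class share 4 CPUs of requests
  scopeSelector:
    matchExpressions:
    - operator: In
      scopeName: PriorityClass
      values: ["high"]
```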
The above YAML manifest will create:
- Two Priority Classes named "low" and "high"
- A Resource Quota named "low-priority-resource-quota" that limits pods with the "low" Priority Class to 2 CPUs of requests
- A Resource Quota named "high-priority-resource-quota" that limits pods with the "high" Priority Class to 4 CPUs of requests
Next, deploy the objects:
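```bash
kubectl apply -f quotas-and-classes.yaml
```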
Now, when we create a pod, we expect the restriction on CPU requests to vary based on the pod's Priority Class name. Let's test this out by creating a "pod.yaml" defining a pod with a "low" Priority Class:
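A sketch of such a pod, requesting 4 CPUs with the "low" Priority Class (the pod name and image are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  namespace: team-1
spec:
  priorityClassName: low
  containers:
  - name: nginx
    image: nginx
    resources:
      requests:
        cpu: "4"   # exceeds the 2-CPU quota for "low" priority pods
```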
Let's apply the above pod and see what happens:
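```bash
kubectl apply -f pod.yaml
```

The pod should be rejected with an error along these lines:

```
Error from server (Forbidden): error when creating "pod.yaml": pods "nginx" is forbidden: exceeded quota: low-priority-resource-quota, requested: requests.cpu=4, used: requests.cpu=0, limited: requests.cpu=2
```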
The error message shows that the pod with the "low" Priority Class is rejected because the Resource Quota called "low-priority-resource-quota" only allows 2 CPUs of requests, while this pod requests 4. The scope of the Resource Quota targets pods with the "low" Priority Class.
To deploy a pod with 4 CPU requests, we need to change the Priority Class of the above pod to "high." This will trigger a different resource quota called "high-priority-resource-quota," which allows 4 CPU requests. The scope of this resource quota will target pods with the "high" Priority Class.
Try redeploying the pod:
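```bash
# After editing pod.yaml to set priorityClassName: high
kubectl apply -f pod.yaml
```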
Now, the pod is deploying successfully. The resource quota called "high-priority-resource-quota" can select pods with a "high" Priority Class, and this resource quota allows the 4 CPU requests.
Limit range objects are another feature built into all Kubernetes clusters. They are used to set default requests and limits for any pods that do not have these values set explicitly.
This object is relevant for users implementing resource quotas because a Resource Quota can only account for requests and limits that pods explicitly declare. Kubernetes allows pods to omit resource requests and limits, but in a namespace with a compute resource quota, the quota system will reject any pod that does not specify the restricted values.
Limit Range objects help mitigate this problem by enforcing a default value for pod resources in a given namespace. Administrators can implement this object to guarantee that every deployed pod will have resource values set. For example, if we want to use a Resource Quota to restrict memory requests, we can use Limit Range objects to ensure every pod has a memory request value specified.
Limit Range objects can also specify the minimum and maximum resources requested by a pod. While a Resource Quota enforces restrictions on the sum total of all resource requests in the whole namespace, a Limit Range can be more granular by restricting how many resources individual pods can request. This can be helpful to ensure that a single pod doesn't monopolize all resources within the namespace's quota and that reasonable minimum values are applied.
Let's create a Limit Range object and a pod to see how they interact. Create a file called "limit-range-pod.yaml" with the following contents:
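A sketch of the two objects; the Limit Range name and default values are illustrative:

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits   # illustrative name
  namespace: team-1
spec:
  limits:
  - type: Container
    defaultRequest:      # applied when a container omits resource requests
      cpu: 500m
      memory: 256Mi
    default:             # applied when a container omits resource limits
      cpu: "1"
      memory: 512Mi
---
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  namespace: team-1
spec:
  containers:
  - name: nginx
    image: nginx   # note: no resources are specified
```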
Now let's apply both objects:
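```bash
kubectl apply -f limit-range-pod.yaml
```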
Notice that we did not apply any resource requests in the pod's schema. Let's check the pod's attributes by running:
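```bash
kubectl get pod nginx --namespace team-1 --output yaml
```

Under spec.containers[0].resources, the output should include something like this (matching the Limit Range defaults above):

```yaml
resources:
  limits:
    cpu: "1"
    memory: 512Mi
  requests:
    cpu: 500m
    memory: 256Mi
```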
The limit range object injected the CPU and memory resource requests. The values match the defaults we specified in the Limit Range, and this will now ensure all pods have a default value applied when deployed to the "team-1" namespace. Implementing this object will allow resource quotas to enforce CPU and memory request limitations because the limit range fills in the missing values.
Implementing limit range objects is commonly done alongside resource quota objects because their functionalities complement each other.
There are drawbacks to Kubernetes resource quota objects that administrators should consider. Let’s take a closer look at three resource quota limitations.
Creating and maintaining separate resource quota and limit range objects manually across every namespace increases operational overhead and the risk of human error. A misconfigured or missing resource quota can significantly impact other workloads in the cluster because some workloads will operate without any limits on resource consumption. Misconfigured limit range objects will prevent sensible default resource allocations and stop resource quotas from functioning correctly. This is a critical risk for administrators relying on resource quotas as their only approach to resource isolation in a multi-tenant cluster.
Implementing virtual clusters can provide a better experience for administrators because resource quotas and limit range objects can be enabled by default, reducing the operational overhead and the risk of human error. This approach guarantees that every virtual cluster has a default resource quota to limit resource consumption and a default Limit Range resource to set sensible defaults, maintaining a healthy multi-tenant environment.
Security of resource quotas and limit range objects is also a significant administrative challenge. The only approach natively available in Kubernetes to restrict access to these objects is role-based access control (RBAC). RBAC will allow administrators to prevent tenants from accessing sensitive objects like Kubernetes resource quotas. However, configuring RBAC to avoid unauthorized access in a multi-tenant cluster may be complicated and involve further operational overhead. It is essential to prevent tenants from having access to these object types due to the impact of an unwanted configuration change. Relying exclusively on RBAC to avoid this access is a thin layer of defense.
Resource quotas can also only restrict a limited set of Kubernetes object types. Custom resource definitions (CRDs) cannot be restricted, and these are important objects to control in a multi-tenant cluster. CRDs are cluster-wide by default, so unwanted modifications or excessive numbers of CRDs can potentially impact many tenants. Virtual clusters help restrict CRDs to a particular namespace, which Kubernetes cannot do natively, enabling administrators to manage the use of CRDs in a multi-tenant cluster more effectively.
Virtual clusters provide an additional layer of isolation and help prevent tenants from accessing objects outside of their own virtual cluster. Since each virtual cluster has its own control plane, a security boundary prevents cross-namespace API access to foreign Kubernetes objects. This improves the security posture of multi-tenant environments.
There are some key best practices administrators will benefit from following to ensure resources in their multi-tenant clusters are managed effectively:
- Pair resource quotas with limit range objects so every pod has sensible default requests and limits.
- Monitor quota consumption and configure alerts (e.g., with Prometheus) to act before tenants reach their limits.
- Validate quota manifests with tools like Kubeconform or "kubectl --dry-run" before deploying them to the cluster.
- Use resource quota scopes to apply different restrictions to different classes of workloads, such as Priority Classes.
- Consider virtual clusters when tenants run sensitive workloads requiring stricter isolation.
Resource quotas are a key tool for administrators managing multi-tenant Kubernetes clusters. They enable administrators to allocate resources on a per-tenant basis logically and block pods that breach the configured limits. The tool ensures healthy cluster operations by mitigating resource overconsumption and allowing each tenant fair access to resources. A misconfigured quota will have a negative impact on tenants and cause increased operational overhead for administrators, so it's important to follow best practices when implementing resource quotas to ensure this feature is helping rather than hindering operations.
For administrators looking to manage resource allocation with a higher focus on security, namespace-based isolation with resource quotas may not be enough for sensitive workloads. Tools like virtual clusters will provide stricter resource isolation capabilities for a superior security setup.