Autoscaling during rolling update

A Deployment manages its underlying ReplicaSets when performing a rolling update. When autoscaling has been set up for a Deployment, a HorizontalPodAutoscaler (HPA) is bound to it. The HPA controls the number of replicas used by the Deployment through its replicas field, which it adjusts based on observed resource usage.

During a rolling update:

  • The Deployment controller makes sure that the total number of pods across all old and new ReplicaSets involved matches the desired count.
  • The HPA keeps an eye on metric events and adjusts the total replica count as needed.

The scenario differs substantially for StatefulSets. StatefulSets manage their pods directly, without a ReplicaSet or similar intermediate resource. When performing a rolling update on an autoscaled StatefulSet:

  • Each pod is managed directly by the StatefulSet controller.
  • When the HPA modifies the StatefulSet’s replica count, the number of pods the StatefulSet maintains changes directly.
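As an illustration, an HPA bound to a Deployment might look like the following. This is a minimal sketch: the names (`my-app`, `my-app-hpa`), the replica bounds, and the 50% CPU target are placeholders, not values from this article.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment          # the HPA owns this Deployment's replica count
    name: my-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50   # keep average CPU usage near 50% of requests
```

During a rolling update, the Deployment controller shifts pods between old and new ReplicaSets while the HPA continues to adjust the total desired replica count on the Deployment itself.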

Container resource metrics

Kubernetes uses container resource metrics to track and control how much of each resource every container in a cluster consumes. These metrics help ensure that resources are used efficiently and that workloads perform well. CPU and memory usage are the most significant metrics and are commonly used for autoscaling and performance monitoring. A summary of Kubernetes’ container resource metrics is given below:

Key Metrics

  1. CPU Usage:
    • Measured in millicores (m): 1000m = 1 CPU core.
    • Usage: The amount of CPU time the container consumes.
    • Limit: The maximum amount of CPU the container may use.
    • Request: The minimum amount of CPU the container is guaranteed to receive.
  2. Memory Usage:
    • Measured in bytes (B), kibibytes (Ki), mebibytes (Mi), etc.
    • Usage: The amount of memory the container is using at a given point in time.
    • Limit: The maximum amount of memory the container is permitted to use.
    • Request: The minimum amount of memory the container is guaranteed.
  • Resource Requests and Limits: When defining a container in a pod specification, you can set resource requests and limits to make sure the container gets the resources it needs and to prevent it from consuming more than it should. This improves the cluster’s ability to allocate resources and deliver reliable service.
  • Collecting Metrics: The Kubernetes Metrics Server is a cluster-wide aggregator of resource usage data. It collects metrics from each node’s Kubelet and makes them accessible through the Kubernetes API.
  • Using Metrics for Autoscaling: The HorizontalPodAutoscaler (HPA) uses these metrics to automatically adjust the number of pod replicas based on observed resource utilization.
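For example, requests and limits are set per container in the pod spec. The sketch below uses placeholder names, a placeholder image, and illustrative values; adjust them to your workload.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: resource-demo        # placeholder name
spec:
  containers:
    - name: app
      image: nginx:1.25      # placeholder image
      resources:
        requests:
          cpu: "250m"        # guaranteed: a quarter of a CPU core
          memory: "64Mi"     # guaranteed: 64 mebibytes
        limits:
          cpu: "500m"        # hard ceiling: half a CPU core
          memory: "128Mi"    # hard ceiling: 128 mebibytes
```

Note that CPU-based HPA utilization targets are expressed as a percentage of the container’s CPU request, so setting sensible requests is a prerequisite for autoscaling.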

Viewing Metrics: The Kubernetes Dashboard or the kubectl top commands may be employed to view metrics.

  • View Node Metrics: kubectl top nodes
  • View Pod Metrics: kubectl top pods

How to Use Kubernetes Horizontal Pod Autoscaler?

Autoscaling is the process of automatically scaling resources in and out. There are three types of autoscalers in Kubernetes: the Cluster Autoscaler, the Horizontal Pod Autoscaler, and the Vertical Pod Autoscaler. In this article, we’ll look at the Horizontal Pod Autoscaler.

An application workload can be scaled manually by changing the replicas field in its workload manifest. Manual scaling is fine when you can anticipate load spikes in advance or when the load changes gradually over long periods, but requiring manual intervention to handle sudden, unpredictable traffic increases isn’t ideal.
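Concretely, manual scaling just means editing the replicas field and re-applying the manifest. This is a minimal sketch with a placeholder name and image:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app               # placeholder name
spec:
  replicas: 5                # manual scaling: change this value and re-apply
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: nginx:1.25  # placeholder image
```

The same change can be made imperatively with kubectl scale, e.g. `kubectl scale deployment my-app --replicas=5`.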

To solve this problem, Kubernetes provides a resource called the HorizontalPodAutoscaler that monitors pods and scales them automatically as soon as it detects an increase in CPU or memory usage (based on a defined metric). Horizontal pod autoscaling is the process of automatically scaling the number of pod replicas managed by a controller to match demand, driven by the usage of that defined metric.
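Under the hood, the HPA controller derives the desired replica count from the ratio of the current metric value to the target. The Python sketch below illustrates that rule; the function name is made up for illustration, and the 10% tolerance band mirrors the controller's default, within which no scaling happens.

```python
import math

def desired_replicas(current_replicas: int,
                     current_metric: float,
                     target_metric: float,
                     tolerance: float = 0.10) -> int:
    """Sketch of the HPA scaling rule:
    desired = ceil(currentReplicas * currentMetric / targetMetric).
    If the ratio is within the tolerance band, the replica count is unchanged."""
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas          # close enough to target: do nothing
    return math.ceil(current_replicas * ratio)

# 4 pods averaging 90% CPU against a 50% target -> scale out to 8
print(desired_replicas(4, 90, 50))   # 8
# 4 pods averaging 20% CPU against a 50% target -> scale in to 2
print(desired_replicas(4, 20, 50))   # 2
```

Because the formula uses a ceiling, scale-out is aggressive (rounding up), while the tolerance band prevents the controller from thrashing on small metric fluctuations.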
