Kubernetes Horizontal Pod Autoscaler

Horizontal Pod Autoscaler (HPA) is a controller that scales most pod-based resources up and down based on your application workload. It does this by changing the number of replicas of your pod once certain preconfigured thresholds are met. For many of the applications we deploy, scaling depends mostly on a single metric, which is CPU usage. To use HPA, we need to define the minimum and maximum number of pods that we want for a particular application, along with the target CPU (and optionally memory) utilization percentage. Once HPA is enabled for an application, Kubernetes automatically monitors it and scales the pods up and down within the minimum and maximum limits we have defined.
 

For example, consider an application like Airbnb that runs in Kubernetes and experiences a surge in user traffic whenever there is an offer on hotel or flight bookings. If the application is not prepared to handle this traffic, users may experience slow response times or even downtime. With HPA, you can specify a target CPU utilization percentage, a minimum and a maximum number of running pods, and other parameters. When average CPU utilization rises above the specified target, Kubernetes automatically increases the number of pods to handle the extra traffic.

YAML code for HPA:

# API version of the HorizontalPodAutoscaler resource
apiVersion: autoscaling/v2
# Kind of Kubernetes object being created (here, an HPA)
kind: HorizontalPodAutoscaler
metadata:
  # Placeholder name; Kubernetes object names cannot contain underscores
  name: name-of-app
spec:
  # The workload that this HPA scales
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: name-of-app
  minReplicas: 1
  maxReplicas: 10
  metrics:
    # Target 40% average CPU utilization across the pods
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 40
    # Target 40% average memory utilization across the pods
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 40

 

In each metrics entry, the ‘averageUtilization’ field specifies the target utilization percentage that the HPA aims for when scaling the deployment. In this case, it is set to 40% for both CPU and memory, meaning that the HPA will try to keep the average utilization across the pods, measured relative to their resource requests, close to 40%. This YAML code will automatically scale the specified deployment between a minimum of 1 and a maximum of 10 replicas. If the average CPU or memory utilization of the pods exceeds 40%, the HPA will scale up the deployment to maintain performance, and when utilization falls well below the target it will scale the deployment back down.
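To make the scaling decision more concrete, here is a minimal Python sketch of the replica calculation described in the Kubernetes documentation: desired replicas = ceil(current replicas × current utilization / target utilization), skipped when the ratio is within a tolerance of 1 (0.1 by default), and then clamped to the minimum and maximum replica counts. The function names are hypothetical, and the sketch is a simplification: it ignores stabilization windows, pods that are not ready, and missing metrics, so treat it as an illustration rather than the controller's actual implementation.

```python
import math


def desired_replicas(current_replicas, current_utilization, target_utilization,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Approximate the HPA replica calculation for a single metric.

    Hypothetical helper: utilization values are percentages (e.g. 80 for 80%).
    """
    ratio = current_utilization / target_utilization
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: keep the current replica count.
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    # Never go below minReplicas or above maxReplicas.
    return max(min_replicas, min(max_replicas, desired))


def desired_for_all_metrics(current_replicas, utilizations, target_utilization=40,
                            **limits):
    """With several metrics, the largest proposed replica count is used."""
    return max(
        desired_replicas(current_replicas, value, target_utilization, **limits)
        for value in utilizations.values()
    )


if __name__ == "__main__":
    # 2 pods averaging 80% CPU against a 40% target
    print(desired_replicas(2, 80, 40))
    # CPU is below target but memory is above it, so memory drives the result
    print(desired_for_all_metrics(2, {"cpu": 30, "memory": 60}))
```

With the values from the YAML above (target 40, minimum 1, maximum 10), two pods averaging 80% CPU would be scaled to four, while eight pods averaging 90% would be capped at the maximum of ten.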

 

Kubernetes – Autoscaling

Pre-requisite: Kubernetes 

Before Kubernetes, we would write our code and push it to physical servers in a data center, managing the resources those servers needed to run the application smoothly. The alternative was deploying the code in virtual machines (VMs). VMs have their own problems: the hardware and software they require are costly, and they carry some security risks. This is where Kubernetes comes in. It is an open-source platform that lets users deploy, manage, and maintain groups of containers, acting like a tool that manages multiple Docker environments together. Kubernetes (K8s) addresses many of the problems we faced with VMs.
