Scaling is the ability to handle increased usage by expanding the existing resources (nodes/pods).

Scaling types#

Scaling in Kubernetes can be done automatically or manually.

Automatic Scaling#

Kubernetes can provision resources automatically to match demand.

Horizontal Pod Autoscaler (HPA)#

HPA automatically adjusts the number of pods to match demand.

For HPA to work, you have to:

  1. Install the metrics server

    kubectl apply -f

  2. Define resource requests for the metric used
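Requests matter because HPA expresses CPU utilization as a percentage of the CPU requested by the pod's containers; without a request there is no baseline to compare against. A minimal sketch of that arithmetic, assuming values in millicores (the function name `cpu_utilization_percent` is made up for illustration):

```shell
# Hypothetical helper: CPU utilization as a percentage of the requested CPU.
# Both arguments are in millicores (for example, 250 means 250m).
cpu_utilization_percent() {
  if [ "$2" -le 0 ]; then
    echo "a positive CPU request is required" >&2
    return 1
  fi
  # Integer arithmetic, rounded down: usage * 100 / request.
  echo $(( $1 * 100 / $2 ))
}

# A pod using 400m of CPU with a 500m request is at 80% utilization.
cpu_utilization_percent 400 500
```

With a 50% target, a pod at 80% utilization would be a candidate for scaling out.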

HPA is configured and enabled by default in the deployed jans components.

The default configuration scales in and out based on the CPU utilization of the pods.

      enabled: true
      minReplicas: 1
      maxReplicas: 10
      targetCPUUtilizationPercentage: 50
      # -- metrics if targetCPUUtilizationPercentage is not set
      metrics: []
      # -- Scaling Policies
      behavior: {}
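To make the default values concrete, the sketch below applies the replica formula from the Kubernetes HPA documentation, desiredReplicas = ceil(currentReplicas × currentUtilization / targetUtilization), and clamps the result to `minReplicas` and `maxReplicas`. It is a simplified illustration: the function name is hypothetical, and the real controller also applies a tolerance and any configured scaling behavior, which are ignored here.

```shell
# Hypothetical helper: approximate the replica count HPA would aim for.
# Args: current replicas, current CPU utilization (%), target (%), min, max.
hpa_desired_replicas() {
  current=$1
  utilization=$2
  target=$3
  min=$4
  max=$5
  # Ceiling division using integer arithmetic.
  desired=$(( (current * utilization + target - 1) / target ))
  # Clamp to the configured bounds.
  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  if [ "$desired" -gt "$max" ]; then desired=$max; fi
  echo "$desired"
}

# 2 pods at 80% CPU with the default target of 50%, min 1, max 10.
hpa_desired_replicas 2 80 50 1 10
```

In this example 2 × 80 / 50 = 3.2, which rounds up to 4 replicas.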

Cluster Autoscaler#

Cluster Autoscaler automatically adjusts the number of nodes in a given node pool based on the demands of your workloads.

Cluster Autoscaler is available in AWS, GCP, and Azure.

Manual Scaling#

Kubernetes also offers the option to manually scale your resources.

For example, you can manually increase the number of pod replicas of the auth-server deployment using the following command:

kubectl scale --replicas=3 deployment/auth-server -n <namespace>
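If you scale by hand often, a small wrapper can guard against typos in the replica count. The sketch below is illustrative only: the function name `scale_deployment`, the `DRY_RUN` variable, and the `jans` namespace are assumptions, not part of the Janssen tooling. With `DRY_RUN=1` it prints the command instead of running it.

```shell
# Hypothetical wrapper around `kubectl scale`; all names are illustrative.
# Args: deployment name, replica count, namespace.
scale_deployment() {
  deployment=$1
  replicas=$2
  namespace=$3
  # Reject anything that is not a non-negative integer.
  case $replicas in
    ''|*[!0-9]*)
      echo "replicas must be a non-negative integer" >&2
      return 1
      ;;
  esac
  set -- kubectl scale --replicas="$replicas" "deployment/$deployment" -n "$namespace"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"   # print the command without contacting the cluster
  else
    "$@"        # run kubectl for real
  fi
}

# Preview the command for auth-server in an assumed "jans" namespace.
DRY_RUN=1 scale_deployment auth-server 3 jans
```

Keep in mind that while HPA is enabled for a deployment, it may adjust the replica count again after a manual change.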

Last update: 2024-09-27
Created: 2022-07-21