Scaling
Overview#
Scaling is the ability to handle increased usage by expanding the existing resources (nodes/pods).
Scaling types#
Scaling in Kubernetes can be done automatically or manually.
Automatic Scaling#
Kubernetes can provision resources automatically to match demand.
Horizontal Pod Autoscaler (HPA)#
HPA automatically adjusts the number of pods to match demand.
For HPA to work, you have to:
- Install the metrics server:
  kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
- Define requests for the metric used.
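Both prerequisites can be checked from the command line. The following is a minimal sketch, not part of the jans tooling: the function name check_hpa_prereqs is hypothetical, and it assumes the workload is a Deployment and that CPU is the metric in use.

```shell
# Hypothetical helper: warns about each missing HPA prerequisite.
# Usage: check_hpa_prereqs <namespace> <deployment>
check_hpa_prereqs() (
  namespace=$1; deployment=$2
  status=0

  # `kubectl top` relies on the resource metrics API served by the metrics server.
  if ! kubectl top pods -n "$namespace" >/dev/null 2>&1; then
    echo "metrics API unavailable: is the metrics server installed and ready?"
    status=1
  fi

  # CPU utilization is computed relative to the CPU request, so one must be set.
  cpu_requests=$(kubectl get deployment "$deployment" -n "$namespace" \
    -o jsonpath='{.spec.template.spec.containers[*].resources.requests.cpu}' 2>/dev/null)
  if [ -z "$cpu_requests" ]; then
    echo "no CPU request found on deployment/$deployment"
    status=1
  fi

  # The function's exit status is 0 only when both checks passed.
  [ "$status" -eq 0 ]
)
```

For example, `check_hpa_prereqs <namespace> auth-server` prints nothing and returns 0 when both prerequisites are met.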
HPA is configured and enabled by default in the deployed jans components.
The default configuration scales in and out based on the CPU utilization of the pods.
<component-name>:
  hpa:
    enabled: true
    minReplicas: 1
    maxReplicas: 10
    targetCPUUtilizationPercentage: 50
    # -- metrics if targetCPUUtilizationPercentage is not set
    metrics: []
    # -- Scaling Policies
    behavior: {}
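To see how these defaults interact, the sketch below applies the replica formula from the Kubernetes HPA documentation, desiredReplicas = ceil(currentReplicas * currentUtilization / targetUtilization), and clamps the result to minReplicas and maxReplicas. The function name desired_replicas is hypothetical. The sketch ignores the controller's tolerance band and any behavior policies, so treat it as an approximation rather than the controller's exact logic.

```shell
# Hypothetical helper: approximate the HPA replica calculation.
# Usage: desired_replicas <current_replicas> <current_cpu_%> <target_cpu_%> <min> <max>
# The body runs in a subshell, so its variables stay local to the function.
desired_replicas() (
  current_replicas=$1; current_util=$2; target_util=$3; min=$4; max=$5

  # Integer ceiling division: ceil(a / b) == (a + b - 1) / b for positive b.
  desired=$(( (current_replicas * current_util + target_util - 1) / target_util ))

  # Clamp to the configured bounds (minReplicas / maxReplicas).
  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  if [ "$desired" -gt "$max" ]; then desired=$max; fi
  echo "$desired"
)

desired_replicas 4 90 50 1 10    # 4 pods at 90% CPU, target 50%: prints 8
desired_replicas 4 10 50 1 10    # load drops to 10%: prints 1
desired_replicas 8 100 50 1 10   # would need 16, capped by maxReplicas: prints 10
```

With a 50% target, average utilization of 90% across 4 pods roughly doubles the replica count, and maxReplicas caps how far a sustained spike can scale out.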
Cluster Autoscaler#
Cluster Autoscaler automatically adjusts the number of nodes in a given node pool, based on the demands of your workloads.
Cluster Autoscaler is available in AWS, GCP, and Azure.
Manual Scaling#
Kubernetes also offers the option to manually scale your resources.
For example, you can manually increase the pod replicas of the auth-server deployment using the following command:
kubectl scale --replicas=3 deployment/auth-server -n <namespace>
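After scaling manually, it can help to wait until the new replicas are actually available. The following is a minimal sketch: the function name scale_and_wait and the 120-second timeout are illustrative assumptions. If HPA is enabled for the same deployment, the HPA controller may change the replica count again afterwards.

```shell
# Hypothetical helper: scale a deployment, then wait for the replicas to become available.
# Usage: scale_and_wait <namespace> <deployment> <replicas>
scale_and_wait() (
  namespace=$1; deployment=$2; replicas=$3

  # Set the desired replica count; stop if the scale request is rejected.
  kubectl scale --replicas="$replicas" "deployment/$deployment" -n "$namespace" || exit 1

  # Block until the deployment reports the requested replicas as available,
  # or until the (assumed) 120-second timeout expires.
  kubectl rollout status "deployment/$deployment" -n "$namespace" --timeout=120s
)
```

For example, `scale_and_wait <namespace> auth-server 3` issues the same scale command as above and returns non-zero if the replicas are not ready in time.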
Created: 2022-07-21