In this episode we will cover the Kubernetes Vertical Pod Autoscaler (VPA) for automatic resource sizing. We will learn how VPA works, how to install and configure it, its update modes, and best practices for right-sizing Pod resources automatically.

Note

If you want to read the previous episode, you can click the episode 30 thumbnail below.

In the previous episode we learned about the Horizontal Pod Autoscaler (HPA), which scales the number of Pods. In this episode 31, we will cover the Vertical Pod Autoscaler (VPA), which automatically adjusts CPU and memory requests and limits for containers based on actual usage.

Note: here I will use a Kubernetes cluster installed via K3s.

While HPA scales horizontally (more Pods), VPA scales vertically (bigger Pods). Setting the right resource requests is challenging: too low causes OOMKills and throttling, too high wastes resources. VPA solves this by continuously analyzing usage and recommending or applying optimal resource values.

The Vertical Pod Autoscaler (VPA) automatically adjusts CPU and memory requests and limits for containers based on historical and current resource usage.

Think of VPA as a tailor: it measures your actual size (resource usage) and adjusts your clothes (resource requests/limits) to fit perfectly. Instead of guessing sizes, VPA uses real data to right-size your Pods.
Key characteristics of VPA: it adjusts requests and limits per container based on usage history, it is implemented as a CRD with its own controllers, and it currently needs to recreate Pods to apply changes.

Understanding the key differences:
| Aspect | VPA | HPA |
|---|---|---|
| Scaling Direction | Vertical (resource size) | Horizontal (replica count) |
| What Changes | CPU/memory requests/limits | Number of Pods |
| Requires Restart | Yes (in Auto/Recreate mode) | No |
| Use Case | Right-sizing resources | Handling traffic spikes |
| Metrics | Historical usage | Current metrics |
| Response Time | Slower (requires restart) | Faster (adds Pods) |
| Best For | Stateful workloads | Stateless workloads |

They can be used together:
Warning

Warning: do not use VPA and HPA on the same CPU/memory metrics simultaneously, because they can conflict. Use HPA for CPU/memory and VPA for other resources, or use HPA for scaling and VPA in recommendation mode.
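As a sketch of how the two can safely coexist (the name web-api and all values here are illustrative, not from the original), you can let HPA scale replicas on CPU utilization while restricting VPA to memory via controlledResources:

```yaml
# Assumption: a Deployment named web-api already exists.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-api-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-api
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
---
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: web-api-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-api
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
      - containerName: "*"
        controlledResources:
          - memory   # VPA manages memory only; HPA owns CPU-based scaling
```

With this split, each autoscaler owns a different resource, so they never fight over the same signal.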
VPA solves a critical resource management challenge:

Without VPA, you either waste resources (over-provisioning) or risk failures (under-provisioning), and you have to adjust manually as the application changes.
VPA consists of three components:

1. Recommender: monitors current and historical resource usage and computes recommended CPU and memory values for each container.
2. Updater: compares running Pods against the recommendations and evicts Pods whose resources should change (in Auto/Recreate mode), so they can be recreated with the new values.
3. Admission Controller: a mutating webhook that sets the recommended resource values on new Pods as they are created.
VPA is not installed by default, so let's install it.

Clone the VPA repository:

```shell
git clone https://github.com/kubernetes/autoscaler.git
cd autoscaler/vertical-pod-autoscaler
```

Install VPA:

```shell
./hack/vpa-up.sh
```

This installs the VPA components (recommender, updater, admission controller) and the VPA CRDs.

Verify the installation:

```shell
kubectl get pods -n kube-system | grep vpa
```

Output:

```
vpa-admission-controller-xxx   1/1   Running   0   1m
vpa-recommender-xxx            1/1   Running   0   1m
vpa-updater-xxx                1/1   Running   0   1m
```

Check the VPA CRDs:

```shell
kubectl get crd | grep verticalpodautoscaler
```

Output:

```
verticalpodautoscalercheckpoints.autoscaling.k8s.io
verticalpodautoscalers.autoscaling.k8s.io
```

VPA supports different update modes:
In "Off" mode, VPA calculates recommendations but does not apply them:

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Off"
```

Use case: observing recommendations without any disruption, for example to validate VPA before enabling automatic updates, or for sensitive stateful workloads.
In "Initial" mode, VPA sets resources only when a Pod is created and never updates running Pods:

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Initial"
```

Use case: applying recommendations without ever evicting running Pods; new values take effect whenever Pods are recreated by a rollout or restart.
In "Recreate" mode, VPA evicts and recreates Pods with the new resource values:

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Recreate"
```

Behavior: when the current requests drift too far from the recommendation, VPA evicts the Pod so it is recreated with the updated values.

Use case: workloads that should always run close to the recommended values and can tolerate restarts.
In "Auto" mode, VPA updates Pods automatically (currently the same as Recreate):

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Auto"
```

Note

Auto mode currently behaves like Recreate. In-place updates (without Pod restarts) are planned for future Kubernetes versions.
Create a Deployment:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: app
          image: nginx:1.25
          resources:
            requests:
              cpu: "100m"
              memory: "128Mi"
            limits:
              cpu: "200m"
              memory: "256Mi"
```

Create the VPA:

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Auto"
```

Apply both:

```shell
kubectl apply -f app-deployment.yml
kubectl apply -f app-vpa.yml
```

You can control which resources VPA is allowed to modify:
```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
      - containerName: "app"
        mode: "Auto"
        minAllowed:
          cpu: "50m"
          memory: "64Mi"
        maxAllowed:
          cpu: "1000m"
          memory: "1Gi"
        controlledResources:
          - cpu
          - memory
```

Resource policy options: minAllowed/maxAllowed bound the recommendations, mode turns VPA on or off per container, and controlledResources limits which resources VPA manages.
Target a specific container in a multi-container Pod:

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
      # Main application container
      - containerName: "app"
        mode: "Auto"
        minAllowed:
          cpu: "100m"
          memory: "128Mi"
        maxAllowed:
          cpu: "2000m"
          memory: "2Gi"
      # Sidecar container
      - containerName: "sidecar"
        mode: "Off"   # Do not modify the sidecar
```

Check the VPA:

```shell
kubectl get vpa
```

Output:

```
NAME         MODE   CPU    MEM     PROVIDED   AGE
my-app-vpa   Auto   150m   256Mi   True       5m
```

Describe it:

```shell
kubectl describe vpa my-app-vpa
```

The output shows the recommendations:
```
Name:         my-app-vpa
Namespace:    default
API Version:  autoscaling.k8s.io/v1
Kind:         VerticalPodAutoscaler
Recommendation:
  Container Recommendations:
    Container Name:  app
    Lower Bound:
      Cpu:     100m
      Memory:  128Mi
    Target:
      Cpu:     150m
      Memory:  256Mi
    Uncapped Target:
      Cpu:     150m
      Memory:  256Mi
    Upper Bound:
      Cpu:     300m
      Memory:  512Mi
```

Recommendation fields: Lower Bound is the minimum the container is expected to need, Target is the value VPA applies, Uncapped Target is the recommendation before resourcePolicy limits are applied, and Upper Bound is the maximum reasonable value.
For the full details in YAML form:

```shell
kubectl get vpa my-app-vpa -o yaml
```

Example: web application with VPA and safe boundaries:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: nginx
          image: nginx:1.25
          ports:
            - containerPort: 80
          resources:
            requests:
              cpu: "100m"
              memory: "128Mi"
            limits:
              cpu: "500m"
              memory: "512Mi"
---
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: web-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
      - containerName: "nginx"
        minAllowed:
          cpu: "50m"
          memory: "64Mi"
        maxAllowed:
          cpu: "1000m"
          memory: "1Gi"
```

Example: database in recommendation-only mode:

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  serviceName: postgres
  replicas: 1
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
        - name: postgres
          image: postgres:15
          resources:
            requests:
              cpu: "500m"
              memory: "512Mi"
            limits:
              cpu: "2000m"
              memory: "2Gi"
---
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: postgres-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: postgres
  updatePolicy:
    updateMode: "Off"   # Recommendation only for the database
```

Example: multi-container service with per-container policies:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-service
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
        - name: api
          image: myapi:latest
          resources:
            requests:
              cpu: "200m"
              memory: "256Mi"
        - name: log-agent
          image: fluent/fluentd:v1.16
          resources:
            requests:
              cpu: "50m"
              memory: "64Mi"
---
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: api-service-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-service
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
      - containerName: "api"
        mode: "Auto"
        minAllowed:
          cpu: "100m"
          memory: "128Mi"
        maxAllowed:
          cpu: "2000m"
          memory: "2Gi"
      - containerName: "log-agent"
        mode: "Auto"
        minAllowed:
          cpu: "25m"
          memory: "32Mi"
        maxAllowed:
          cpu: "200m"
          memory: "256Mi"
```

Example: batch worker with Initial mode:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: worker
spec:
  replicas: 5
  selector:
    matchLabels:
      app: worker
  template:
    metadata:
      labels:
        app: worker
    spec:
      containers:
        - name: worker
          image: myworker:latest
          resources:
            requests:
              cpu: "100m"
              memory: "128Mi"
---
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: worker-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: worker
  updatePolicy:
    updateMode: "Initial"   # Only set at Pod creation
```

Apply:

```shell
kubectl apply -f app-deployment.yml
kubectl apply -f app-vpa.yml
```

Create load to trigger resource usage:
```shell
kubectl run -it --rm load-generator --image=busybox:1.36 --restart=Never -- /bin/sh

# Generate CPU load
while true; do :; done
```

Watch the VPA:

```shell
kubectl get vpa my-app-vpa --watch
```

Before VPA:

```shell
kubectl get pod <pod-name> -o yaml | grep -A 5 resources
```

After the VPA update (in Auto mode):

```shell
# VPA will evict and recreate the Pod
kubectl get pods -w

# Check the new resource values
kubectl get pod <new-pod-name> -o yaml | grep -A 5 resources
```

VPA has several limitations:

1. Requires Pod restart: in Auto/Recreate mode, applying new values means evicting and recreating the Pod.
2. Not for horizontal scaling: VPA never changes the replica count; use HPA for that.
3. Conflicts with HPA: the two must not manage the same CPU/memory metrics.
4. No downscaling protection: recommendations can shrink resources when usage drops, so set minAllowed boundaries.
5. Limited history: the Recommender only considers a bounded window of usage history.
6. Experimental status: VPA is developed outside core Kubernetes and its behavior may still change.
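One safeguard related to the eviction behavior above, shown here as a sketch (verify the field against the VPA version you install): updatePolicy accepts a minReplicas setting, and the Updater will not evict Pods when fewer replicas than this are running:

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Auto"
    minReplicas: 2   # Updater only evicts while at least 2 replicas are alive
```

This keeps a single-replica workload from being taken down entirely by a VPA-triggered eviction.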
Problem: VPA and HPA conflict when both target CPU/memory.

```yaml
# Bad: both VPA and HPA act on CPU
# VPA adjusts the CPU request
# HPA scales based on CPU utilization
# They fight each other
```

Solution: use different metrics or modes:

```yaml
# Option 1: VPA in Off mode (recommendation only)
updatePolicy:
  updateMode: "Off"

# Option 2: HPA on CPU, VPA on memory only
resourcePolicy:
  containerPolicies:
    - containerName: "app"
      controlledResources:
        - memory   # VPA manages memory only
```

Problem: VPA can set extreme values.
Solution: always set boundaries:

```yaml
resourcePolicy:
  containerPolicies:
    - containerName: "app"
      minAllowed:
        cpu: "50m"
        memory: "64Mi"
      maxAllowed:
        cpu: "2000m"
        memory: "4Gi"
```

Problem: Pod eviction causes data loss or downtime.

Solution: use Off or Initial mode for stateful apps:

```yaml
# For databases, use Off mode
updatePolicy:
  updateMode: "Off"
```

Problem: VPA needs a baseline to start from.
Solution: always set initial requests:

```yaml
resources:
  requests:
    cpu: "100m"      # Set a reasonable initial value
    memory: "128Mi"
```

Problem: running VPA in Off mode but never checking the recommendations.

Solution: regularly review and apply the recommendations:

```shell
kubectl describe vpa my-app-vpa
# Review the Target recommendation
# Update the Deployment manually if needed
```

Test VPA before enabling Auto mode:
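As a sketch of that manual step (the values here are illustrative, taken from the Target recommendation shown earlier, and scaling limits proportionally is just a common convention, not a VPA rule), the Deployment's container resources would be updated to match the Target:

```yaml
# Illustrative: container resources updated to the VPA Target recommendation
resources:
  requests:
    cpu: "150m"      # Target CPU from kubectl describe vpa
    memory: "256Mi"  # Target memory from kubectl describe vpa
  limits:
    cpu: "300m"      # Limits kept at 2x requests, matching the original ratio
    memory: "512Mi"
```

After editing, apply the Deployment and the Pods roll out with the right-sized values at a time you control.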
```yaml
# Phase 1: observe recommendations
updatePolicy:
  updateMode: "Off"

# Phase 2: after validation, enable Auto
updatePolicy:
  updateMode: "Auto"
```

Define min/max based on the workload:
```yaml
resourcePolicy:
  containerPolicies:
    - containerName: "app"
      minAllowed:
        cpu: "100m"      # Minimum for functionality
        memory: "128Mi"
      maxAllowed:
        cpu: "2000m"     # Maximum for cost control
        memory: "2Gi"
```

Avoid disrupting running Pods:

```yaml
updatePolicy:
  updateMode: "Initial"   # Only affects new Pods
```

Track VPA behavior:
```shell
# Watch VPA recommendations
kubectl get vpa --watch

# Check VPA events
kubectl describe vpa my-app-vpa

# Monitor Pod evictions
kubectl get events --sort-by='.lastTimestamp'
```

Protect availability during updates:
```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: my-app
```

Use VPA for resource optimization, not for traffic handling.

Add an annotation explaining the choice:

```yaml
metadata:
  annotations:
    vpa.note: "Auto mode with a 100m-2000m CPU range based on load testing"
```

Problem: no recommendation available:

```shell
kubectl describe vpa my-app-vpa
# Status shows: No recommendation available
```

Causes: the VPA was just created and metrics have not accumulated yet, the VPA components are not running, or the targetRef does not match an existing workload.
Solution:

```shell
# Check the VPA components
kubectl get pods -n kube-system | grep vpa

# Check that the target exists
kubectl get deployment my-app

# Wait for metric collection (5-10 minutes)
```

Problem: Pods are not being evicted despite a recommendation. Causes: the update mode is Off or Initial, a PodDisruptionBudget blocks eviction, or the recommendation is close enough to the current requests.

Solution:

```shell
# Check the update mode
kubectl get vpa my-app-vpa -o yaml | grep updateMode

# Check PDBs
kubectl get pdb

# Check whether the recommendation differs from the current values
kubectl describe vpa my-app-vpa
```

Problem: VPA keeps evicting Pods. Causes: usage fluctuates around the recommendation boundaries, or the allowed min/max range is too narrow.
Solution:

```yaml
# Widen the min/max range
maxAllowed:
  cpu: "4000m"     # Increased from 2000m
  memory: "4Gi"    # Increased from 2Gi

# Or switch to Off mode
updatePolicy:
  updateMode: "Off"
```

Problem: the recommendation looks too high:

```shell
kubectl describe vpa my-app-vpa
# Target: 4000m CPU (seems too high)
```

Solution:
```yaml
# Set maxAllowed to cap the recommendation
resourcePolicy:
  containerPolicies:
    - containerName: "app"
      maxAllowed:
        cpu: "2000m"
        memory: "2Gi"
```

To uninstall VPA:

```shell
cd autoscaler/vertical-pod-autoscaler
./hack/vpa-down.sh
```

This removes the VPA components and their CRDs.

Useful commands:

```shell
kubectl get vpa
kubectl get vpa -o wide
kubectl get vpa --all-namespaces
kubectl describe vpa my-app-vpa
kubectl get vpa my-app-vpa -o yaml
kubectl get pods -n kube-system | grep vpa
kubectl logs -n kube-system <vpa-recommender-pod>
```

To delete a VPA object:

```shell
kubectl delete vpa my-app-vpa
```

The Pods continue running with their current resource values.
In this episode 31, we covered the Vertical Pod Autoscaler (VPA) in Kubernetes in depth: how VPA automatically right-sizes Pod resources based on actual usage, the different update modes, and best practices for production use.

Key takeaways: VPA right-sizes requests and limits from real usage data, Off and Initial modes are the safest starting points, always set minAllowed/maxAllowed boundaries, and never let VPA and HPA manage the same CPU/memory metrics.

The Vertical Pod Autoscaler is essential for optimizing resource utilization in Kubernetes. By understanding VPA configuration and its limitations, you can automatically right-size Pods, reduce waste, and prevent resource-related failures without manual tuning.

So, is the Vertical Pod Autoscaler in Kubernetes clearer now? Keep learning and look forward to the next episode!

Note

If you want to continue to the next episode, you can click the episode 32 thumbnail below.