In this episode we'll look at the Kubernetes Job, a controller designed to run tasks to completion. We'll learn how a Job handles batch processing, one-time tasks, and parallel execution in Kubernetes.

In the previous episode we learned about DaemonSet, which ensures that a Pod runs on every node in the cluster. In this episode 14 we'll look at a different kind of controller: the Job.
Note: I'll be using a Kubernetes cluster installed with K3s.
Unlike the controllers we've covered so far (ReplicaSet, DaemonSet), which keep Pods running continuously, a Job is designed for tasks that run to completion. Think of running a script or batch process that needs to finish successfully and then stop.
A Job creates one or more Pods and ensures that a specified number of them terminate successfully. The Job tracks successful Pod completions, and once the specified number is reached, the Job itself is complete.
Think of a Job like running a cron task or batch script: it starts, does its work, and finishes. In Kubernetes, the Job manages this process and handles failures and retries automatically.
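Key characteristics of a Job, as described above: it runs Pods to completion, it tracks how many Pods have succeeded, and it retries failed Pods automatically.
A Job is designed for workloads that run once (or periodically) and then complete, such as the database migration, data processing, backup, and image processing examples later in this episode.
Without a Job, you would need to handle all of this yourself: start the Pod, watch for failures, retry, and track when the work is done.
Let's look at the key differences: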
| Aspek | Job | ReplicaSet | DaemonSet |
|---|---|---|---|
| Purpose | Run to completion | Keep running | Keep running on node |
| Pod lifecycle | Terminate on success | Run continuously | Run continuously |
| Restart policy | OnFailure atau Never | Always | Always |
| Completion tracking | Yes | No | No |
| Use case | Batch task | Application | Node-level service |
| Cleanup | Can auto-delete | Manual | Manual |
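Example scenarios: a batch task fits a Job, a long-running application fits a ReplicaSet, and a node-level service fits a DaemonSet.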
Let's create a basic Job.
Create a file named job-basic.yml:

    apiVersion: batch/v1
    kind: Job
    metadata:
      name: hello-job
    spec:
      template:
        spec:
          containers:
            - name: hello
              image: busybox:1.36
              command:
                - /bin/sh
                - -c
                - echo "Hello from Kubernetes Job!"; sleep 5; echo "Job completed!"
          restartPolicy: Never

Important: Job Pods must use restartPolicy: Never or restartPolicy: OnFailure. The default, Always, is not allowed for a Job.
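Since a missing or wrong restartPolicy only shows up when the manifest is applied, a small pre-apply check can catch it earlier. The function below is a rough sketch and not part of kubectl: check_restart_policy is a hypothetical helper name, and it only looks at the first restartPolicy line it finds instead of parsing YAML, so it assumes one Pod template per file and an unquoted value.

```shell
# Rough pre-apply check (hypothetical helper, not a kubectl feature).
# Succeeds only if the first restartPolicy in the file is Never or OnFailure.
check_restart_policy() {
  manifest="$1"
  # Take the value of the first line whose first field is "restartPolicy:".
  policy=$(awk '$1 == "restartPolicy:" { print $2; exit }' "$manifest")
  case "$policy" in
    Never|OnFailure)
      echo "ok: restartPolicy is $policy"
      ;;
    "")
      echo "error: restartPolicy is not set (it would default to Always)" >&2
      return 1
      ;;
    *)
      echo "error: restartPolicy $policy is not allowed for a Job" >&2
      return 1
      ;;
  esac
}
```

You could call it as `check_restart_policy job-basic.yml` before running `kubectl apply`. A real YAML-aware linter would be more reliable; this only illustrates the rule.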
Apply the configuration:

    sudo kubectl apply -f job-basic.yml

Verify that the Job was created:

    sudo kubectl get jobs

Output:

    NAME        COMPLETIONS   DURATION   AGE
    hello-job   1/1           8s         10s

Check the Pods:

    sudo kubectl get pods

Output:

    NAME              READY   STATUS      RESTARTS   AGE
    hello-job-abc12   0/1     Completed   0          15s

Note that the Pod status is Completed, not Running.
View the Pod log:

    sudo kubectl logs hello-job-abc12

Output:

    Hello from Kubernetes Job!
    Job completed!

Jobs support different completion modes:
Running a single Pod to completion:

    apiVersion: batch/v1
    kind: Job
    metadata:
      name: single-job
    spec:
      template:
        spec:
          containers:
            - name: task
              image: busybox:1.36
              command: ["echo", "Single task completed"]
          restartPolicy: Never

This creates one Pod. If it fails, the Job creates a new Pod until one succeeds or the retry limit (backoffLimit, 6 by default) is reached.
Running multiple Pods in parallel until a specified number complete successfully:

    apiVersion: batch/v1
    kind: Job
    metadata:
      name: parallel-job
    spec:
      completions: 5
      parallelism: 2
      template:
        spec:
          containers:
            - name: task
              image: busybox:1.36
              command:
                - /bin/sh
                - -c
                - echo "Processing task"; sleep 10; echo "Task completed"
          restartPolicy: Never

This Job:

- Needs 5 successful completions in total (completions: 5)
- Runs at most 2 Pods at the same time (parallelism: 2)

Running multiple Pods in parallel without a fixed completion count:

    apiVersion: batch/v1
    kind: Job
    metadata:
      name: work-queue-job
    spec:
      parallelism: 3
      template:
        spec:
          containers:
            - name: worker
              image: busybox:1.36
              command:
                - /bin/sh
                - -c
                - echo "Processing work item"; sleep 5; echo "Done"
          restartPolicy: Never

This Job:

- Starts 3 Pods in parallel (parallelism: 3)
- Leaves completions unset, so the Pods themselves decide when the work is done
- Is complete once at least one Pod has succeeded and all Pods have terminated
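To make the fixed-completion mode more concrete, here is a small simulation of how many Pods run at the same time for the parallel-job example above. It is a simplified model, not what the controller literally does: it assumes every Pod succeeds and that all Pods in a "wave" finish together. job_waves is a hypothetical helper name.

```shell
# Print the number of Pods in each wave for a fixed-completion Job.
# Simplified model: every Pod succeeds and each wave finishes together.
job_waves() {
  remaining="$1"     # .spec.completions
  parallelism="$2"   # .spec.parallelism
  if [ "$parallelism" -le 0 ]; then
    echo "parallelism must be at least 1" >&2
    return 1
  fi
  waves=""
  while [ "$remaining" -gt 0 ]; do
    # Never run more Pods than there are completions left.
    if [ "$remaining" -lt "$parallelism" ]; then
      batch="$remaining"
    else
      batch="$parallelism"
    fi
    waves="${waves:+$waves }$batch"
    remaining=$((remaining - batch))
  done
  echo "$waves"
}

job_waves 5 2   # parallel-job: completions 5, parallelism 2
```

For parallel-job this prints three waves, so with a 10-second task per Pod the Job needs roughly 30 seconds plus scheduling and image-pull overhead. In a real cluster Pods finish at different times and the controller starts a replacement as soon as a slot frees up, so treat the waves as an approximation.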
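Job Pods support two restart policies.
With restartPolicy: Never, a Pod is never restarted. If it fails, the Job creates a new Pod:

    spec:
      template:
        spec:
          restartPolicy: Never

Behavior: the failed Pod is kept in the Error state, which helps with debugging, and each retry runs in a new Pod.
With restartPolicy: OnFailure, the failed container is restarted in place, inside the same Pod on the same node:

    spec:
      template:
        spec:
          restartPolicy: OnFailure

Behavior: no new Pod is created for a retry, which is more efficient, but the failed attempt is not preserved as a separate Pod.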
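Control how many times a Job retries failed Pods:

    apiVersion: batch/v1
    kind: Job
    metadata:
      name: retry-job
    spec:
      backoffLimit: 3
      template:
        spec:
          containers:
            - name: task
              image: busybox:1.36
              command:
                - /bin/sh
                - -c
                - exit 1
          restartPolicy: Never

This Job:

- Always fails, because the container exits with status 1
- Is retried up to 3 times (backoffLimit: 3) and is then marked as failed
- Would be retried up to 6 times without the setting, because the default backoffLimit is 6

Set a time limit for Job execution:

    apiVersion: batch/v1
    kind: Job
    metadata:
      name: deadline-job
    spec:
      activeDeadlineSeconds: 60
      template:
        spec:
          containers:
            - name: task
              image: busybox:1.36
              command:
                - /bin/sh
                - -c
                - sleep 120
          restartPolicy: Never

This Job:

- Tries to sleep for 120 seconds but only has 60 seconds to run (activeDeadlineSeconds: 60)
- Is terminated, together with its Pods, when the deadline passes, and is marked as failed with reason DeadlineExceeded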
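When a Job fails, it helps to know which of the two limits was hit. The sketch below reads the reason from the Job's Failed condition with a kubectl JSONPath filter. job_failure_reason is a hypothetical helper name, and the KUBECTL variable is an assumption added here so the command can be swapped out (it defaults to `sudo kubectl`, as used on K3s in this episode).

```shell
# Command used to talk to the cluster; override KUBECTL if you do not need sudo.
KUBECTL="${KUBECTL:-sudo kubectl}"

# Print the reason of the Job's "Failed" condition.
# Prints nothing if the Job has no Failed condition.
job_failure_reason() {
  $KUBECTL get job "$1" \
    -o jsonpath='{.status.conditions[?(@.type=="Failed")].reason}'
}
```

For the two examples above, `job_failure_reason retry-job` should report BackoffLimitExceeded once the retries are used up, and `job_failure_reason deadline-job` should report DeadlineExceeded after the deadline.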
To see detailed information about a Job:

    sudo kubectl describe job hello-job

Output:

    Name:             hello-job
    Namespace:        default
    Selector:         controller-uid=abc123
    Labels:           <none>
    Annotations:      <none>
    Parallelism:      1
    Completions:      1
    Completion Mode:  NonIndexed
    Start Time:       Sun, 01 Mar 2026 10:00:00 +0000
    Completed At:     Sun, 01 Mar 2026 10:00:08 +0000
    Duration:         8s
    Pods Statuses:    0 Active / 1 Succeeded / 0 Failed
    Pod Template:
      Labels:  controller-uid=abc123
      Containers:
       hello:
        Image:      busybox:1.36
        Command:
          /bin/sh
          -c
          echo "Hello from Kubernetes Job!"; sleep 5; echo "Job completed!"
        Environment:  <none>
        Mounts:       <none>
      Volumes:        <none>
    Events:
      Type    Reason            Age   From            Message
      ----    ------            ----  ----            -------
      Normal  SuccessfulCreate  2m    job-controller  Created pod: hello-job-abc12
      Normal  Completed         2m    job-controller  Job completed

Example: a database migration Job (db-migration):

    apiVersion: batch/v1
    kind: Job
    metadata:
      name: db-migration
      labels:
        app: database
        task: migration
    spec:
      backoffLimit: 2
      activeDeadlineSeconds: 300
      template:
        metadata:
          labels:
            app: database
            task: migration
        spec:
          containers:
            - name: migrate
              image: migrate/migrate:v4.16.2
              command:
                - migrate
                - -path=/migrations
                # Placeholder credentials; prefer a Secret in a real cluster
                - -database=postgres://user:pass@db:5432/mydb?sslmode=disable
                - up
              volumeMounts:
                - name: migrations
                  mountPath: /migrations
          volumes:
            - name: migrations
              configMap:
                name: db-migrations
          restartPolicy: Never

This Job:

- Runs the migrate tool against migration files mounted from the db-migrations ConfigMap at /migrations
- Is retried at most 2 times (backoffLimit: 2)
- Must finish within 5 minutes (activeDeadlineSeconds: 300)
- Uses restartPolicy: Never, so every retry runs in a new Pod
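Example: a parallel data processing Job (data-processor):

    apiVersion: batch/v1
    kind: Job
    metadata:
      name: data-processor
    spec:
      completions: 10
      parallelism: 3
      template:
        spec:
          containers:
            - name: processor
              image: python:3.11-slim
              command:
                - python
                - -c
                - |
                  import time
                  import random
                  print("Processing data batch...")
                  time.sleep(random.randint(5, 15))
                  print("Batch processing completed!")
              resources:
                requests:
                  memory: "256Mi"
                  cpu: "250m"
                limits:
                  memory: "512Mi"
                  cpu: "500m"
          restartPolicy: OnFailure

This Job:

- Needs 10 successful completions and runs at most 3 Pods at a time
- Requests 256Mi of memory and 250m of CPU per Pod, limited to 512Mi and 500m
- Uses restartPolicy: OnFailure, so a failed container is restarted in place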
Example: a database backup Job (database-backup):

    apiVersion: batch/v1
    kind: Job
    metadata:
      name: database-backup
      labels:
        app: backup
        type: database
    spec:
      backoffLimit: 1
      activeDeadlineSeconds: 600
      template:
        metadata:
          labels:
            app: backup
            type: database
        spec:
          containers:
            - name: backup
              image: postgres:15-alpine
              command:
                - /bin/sh
                - -c
                - |
                  # Stop at the first error so a failed pg_dump fails the Job
                  set -e
                  pg_dump -h "$DB_HOST" -U "$DB_USER" -d "$DB_NAME" > /backup/backup-$(date +%Y%m%d-%H%M%S).sql
                  echo "Backup completed successfully"
              env:
                - name: DB_HOST
                  value: "postgres-service"
                - name: DB_USER
                  valueFrom:
                    secretKeyRef:
                      name: db-credentials
                      key: username
                - name: DB_NAME
                  value: "production"
                - name: PGPASSWORD
                  valueFrom:
                    secretKeyRef:
                      name: db-credentials
                      key: password
              volumeMounts:
                - name: backup-storage
                  mountPath: /backup
          volumes:
            - name: backup-storage
              persistentVolumeClaim:
                claimName: backup-pvc
          restartPolicy: Never

This Job:

- Dumps the production database with pg_dump into a timestamped file on the backup-pvc volume
- Reads the database username and password from the db-credentials Secret
- Is retried at most once (backoffLimit: 1) and must finish within 10 minutes (activeDeadlineSeconds: 600)
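Example: an image processing Job (image-processor):

    apiVersion: batch/v1
    kind: Job
    metadata:
      name: image-processor
    spec:
      completions: 100
      parallelism: 10
      template:
        spec:
          containers:
            - name: processor
              image: imagemagick:latest
              command:
                - /bin/sh
                - -c
                - |
                  # Stop at the first error so a failed convert fails the Pod
                  set -e
                  echo "Processing image..."
                  convert input.jpg -resize 800x600 output.jpg
                  echo "Image processed successfully"
              resources:
                requests:
                  memory: "512Mi"
                  cpu: "500m"
                limits:
                  memory: "1Gi"
                  cpu: "1000m"
          restartPolicy: OnFailure

This Job:

- Needs 100 successful completions and runs at most 10 Pods at a time
- Requests 512Mi of memory and 500m of CPU per Pod, limited to 1Gi and 1000m
- Uses restartPolicy: OnFailure, so a failed container is restarted in place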
For tasks that may fail but should be retried:

    spec:
      backoffLimit: 5
      template:
        spec:
          restartPolicy: Never

For processing a known number of items:

    spec:
      completions: 100
      parallelism: 10

For processing items from a queue:

    spec:
      parallelism: 5
      # completions is not specified

The Pods coordinate through an external queue (Redis, RabbitMQ, etc.).
For tasks that must complete within a time limit:

    spec:
      activeDeadlineSeconds: 300
      backoffLimit: 3

Delete a completed Job:

    sudo kubectl delete job hello-job

This deletes the Job and its Pods.
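Deleting Jobs one by one gets tedious once several have finished. As a sketch, Jobs can be selected by their number of succeeded Pods with the status.successful field selector. delete_succeeded_jobs is a hypothetical helper name, and the KUBECTL variable is an assumption that defaults to `sudo kubectl` as used in this episode.

```shell
# Command used to talk to the cluster; override KUBECTL if you do not need sudo.
KUBECTL="${KUBECTL:-sudo kubectl}"

# Delete every Job in the current namespace whose status.successful equals
# the given count. The default of 1 matches finished single-completion Jobs.
delete_succeeded_jobs() {
  count="${1:-1}"
  $KUBECTL delete jobs --field-selector "status.successful=$count"
}
```

Note that the selector matches an exact count, so a Job with completions: 5 is only matched by `delete_succeeded_jobs 5`. It also deletes without asking, so list the matching Jobs with `kubectl get jobs --field-selector status.successful=1` first if you are unsure.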
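Use a TTL (Time To Live) to clean up finished Jobs automatically:

    apiVersion: batch/v1
    kind: Job
    metadata:
      name: cleanup-job
    spec:
      ttlSecondsAfterFinished: 100
      template:
        spec:
          containers:
            - name: task
              image: busybox:1.36
              command: ["echo", "Task completed"]
          restartPolicy: Never

This Job:

- Runs a single task to completion
- Is deleted automatically, together with its Pods, 100 seconds after it finishes (ttlSecondsAfterFinished: 100)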
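The TTL does not have to be set when the Job is created; the field can also be set on an existing Job, including one that has already finished. A minimal sketch using a merge patch follows. set_job_ttl is a hypothetical helper name, and the KUBECTL variable is an assumption that defaults to `sudo kubectl`.

```shell
# Command used to talk to the cluster; override KUBECTL if you do not need sudo.
KUBECTL="${KUBECTL:-sudo kubectl}"

# Set .spec.ttlSecondsAfterFinished on an existing Job.
# $1 = Job name, $2 = TTL in seconds
set_job_ttl() {
  job="$1"
  seconds="$2"
  $KUBECTL patch job "$job" --type=merge \
    -p "{\"spec\":{\"ttlSecondsAfterFinished\":$seconds}}"
}
```

For example, `set_job_ttl hello-job 0` would make an already finished hello-job eligible for cleanup right away.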
Control when a Job is cleaned up:

    spec:
      ttlSecondsAfterFinished: 0  # Delete immediately after the Job finishes

Or keep finished Jobs around for debugging:

    spec:
      ttlSecondsAfterFinished: 86400  # Keep for 24 hours

List Jobs:

    sudo kubectl get jobs

Watch Jobs as their status changes:

    sudo kubectl get jobs -w

List the Pods that belong to a Job:

    sudo kubectl get pods --selector=job-name=hello-job

View the log of a Job's Pod:

    # Get the Pod name from the Job
    POD_NAME=$(sudo kubectl get pods --selector=job-name=hello-job -o jsonpath='{.items[0].metadata.name}')
    # View the log
    sudo kubectl logs "$POD_NAME"

View Job-related events:

    sudo kubectl get events --sort-by='.lastTimestamp' | grep Job

Problem: Using restartPolicy: Always for a Job.
Solution: Use Never or OnFailure:

    spec:
      template:
        spec:
          restartPolicy: Never  # or OnFailure

Problem: A failing Job keeps retrying longer than intended (the default backoffLimit is 6).
Solution: Set an appropriate backoffLimit:

    spec:
      backoffLimit: 3

Problem: A Job runs forever if the task hangs.
Solution: Set activeDeadlineSeconds:

    spec:
      activeDeadlineSeconds: 300

Problem: Completed Jobs and Pods accumulate.
Solution: Use a TTL for automatic cleanup:

    spec:
      ttlSecondsAfterFinished: 100

Problem: Setting parallelism higher than completions. The extra parallelism has no effect, because a Job never runs more Pods than it has completions remaining.
Solution: Keep parallelism <= completions:

    spec:
      completions: 10
      parallelism: 5  # Not more than completions

Problem: Job Pods consume excessive resources.
Solution: Always set resource limits:

    resources:
      requests:
        memory: "256Mi"
        cpu: "250m"
      limits:
        memory: "512Mi"
        cpu: "500m"

Prevent runaway retries:

    spec:
      backoffLimit: 3

Prevent a Job from running too long:

    spec:
      activeDeadlineSeconds: 600

Use a TTL to clean up completed Jobs:

    spec:
      ttlSecondsAfterFinished: 100

Prevent resource exhaustion:

    resources:
      requests:
        memory: "256Mi"
        cpu: "250m"
      limits:
        memory: "512Mi"
        cpu: "500m"

Add meaningful labels:

    metadata:
      labels:
        app: data-processor
        task: batch-import
        environment: production

Choose the restart policy that fits the task:

- Never for debugging (keeps failed Pods)
- OnFailure for efficiency (restarts in place)

Balance speed and resource usage:

    spec:
      completions: 100
      parallelism: 10  # Process 10 at a time

Don't hardcode credentials:

    env:
      - name: DB_PASSWORD
        valueFrom:
          secretKeyRef:
            name: db-credentials
            key: password

A Job runs once, but what if you need to run it on a schedule?
Job: runs once.

    apiVersion: batch/v1
    kind: Job

CronJob: runs on a schedule.

    apiVersion: batch/v1
    kind: CronJob
    metadata:
      name: scheduled-backup
    spec:
      schedule: "0 2 * * *"  # Every day at 02:00
      jobTemplate:
        spec:
          template:
            spec:
              containers:
                - name: backup
                  image: backup-tool:latest
              restartPolicy: Never

A CronJob creates Jobs on a schedule. We'll cover CronJob in detail in the next episode.
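Because a CronJob only wraps a Job template, you can also start a one-off Job from it without waiting for the schedule, which is handy for testing. A small sketch follows. run_cronjob_now and the default "-manual" name suffix are hypothetical choices made for this example, and the KUBECTL variable is an assumption that defaults to `sudo kubectl`.

```shell
# Command used to talk to the cluster; override KUBECTL if you do not need sudo.
KUBECTL="${KUBECTL:-sudo kubectl}"

# Create a one-off Job from an existing CronJob's jobTemplate.
# $1 = CronJob name, $2 = optional Job name (defaults to "<cronjob>-manual")
run_cronjob_now() {
  cronjob="$1"
  job_name="${2:-$cronjob-manual}"
  $KUBECTL create job "$job_name" --from="cronjob/$cronjob"
}
```

With the CronJob above, `run_cronjob_now scheduled-backup` would create a Job named scheduled-backup-manual that you can inspect with the same commands used for any other Job.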
In this episode 14 we took an in-depth look at Jobs in Kubernetes. We learned what a Job is, how it differs from other controllers, and how to use it for batch processing and one-time tasks.
Key takeaways:

- Control retries with backoffLimit
- Control parallel execution with parallelism and completions
- Use activeDeadlineSeconds to time-limit a Job
- Use ttlSecondsAfterFinished for automatic cleanup

Jobs are essential for running batch workloads, data processing, and one-time tasks in Kubernetes. Once you understand Jobs, you can effectively manage tasks that need to run to completion, handle failures gracefully, and clean up resources automatically.
So, is the Kubernetes Job clearer now? In the next episode, episode 15, we'll cover CronJob, which builds on Job to provide scheduled, recurring task execution. Keep up the learning spirit and stay tuned for the next episode!