Learning Kubernetes - Episode 14 - Introduction to and Explanation of Job

In this episode we will look at the Kubernetes Job, a controller designed to run a task to completion. We will learn how a Job manages batch processing, one-time tasks, and parallel execution in Kubernetes.

Arman Dwi Pangestu
March 17, 2026

Introduction

Note: If you would like to read the previous episode first, see Episode 13.

In the previous episode we learned about DaemonSet, which ensures that a Pod runs on every node in the cluster. In this episode 14, we will look at a different type of controller: the Job.

Note: Here I will be using a Kubernetes cluster installed via K3s.

Unlike the controllers we have already covered (ReplicaSet, DaemonSet), which keep Pods running continuously, a Job is designed for tasks that run to completion. Think of running a script or batch process that needs to finish successfully and then stop.

What Is a Job?

A Job creates one or more Pods and ensures that a specified number of them terminate successfully. The Job tracks the successful completions of its Pods, and once the specified number of successful completions is reached, the Job itself is complete.

Think of a Job like running a one-off task or batch script: it starts, does its work, and finishes. In Kubernetes, the Job manages this process and handles failures and retries automatically.

Key characteristics of a Job:

  • Run to completion - Pods are expected to finish and exit successfully
  • Automatic retry - Failed Pods are retried automatically (restarted in place or replaced, depending on the restart policy)
  • Completion tracking - Tracks how many Pods have completed successfully
  • Parallel execution - Can run multiple Pods in parallel
  • Cleanup - Completed Jobs can be cleaned up automatically
  • One-time or batch tasks - A good fit for migrations, backups, and data processing

Why Do We Need Jobs?

Jobs are designed for workloads that need to run once (or periodically, when created by a CronJob) and then complete:

  • Database migration - Run schema updates or data migrations
  • Batch processing - Process large datasets or generate reports
  • Backup tasks - Create backups of databases or files
  • Data import/export - Load data into a system or export it for analysis
  • Image processing - Resize images, generate thumbnails
  • ETL operations - Extract, transform, and load data
  • One-time setup tasks - Initialize a system or seed data
  • Cleanup operations - Remove old data or temporary files

Without Jobs, you would need to:

  • Manually create Pods for one-time tasks
  • Monitor Pod completion status yourself
  • Handle failures and retries manually
  • Clean up completed Pods yourself

Job vs Other Controllers

Let's look at the key differences:

Aspect               Job                   ReplicaSet        DaemonSet
-------------------  --------------------  ----------------  --------------------
Purpose              Run to completion     Keep running      Keep running on node
Pod lifecycle        Terminate on success  Run continuously  Run continuously
Restart policy       OnFailure or Never    Always            Always
Completion tracking  Yes                   No                No
Use case             Batch task            Application       Node-level service
Cleanup              Can auto-delete       Manual            Manual

Example scenarios:

  • Job: Run a database migration script once
  • ReplicaSet: Run 3 replicas of a web application continuously
  • DaemonSet: Run a log collector on every node continuously

Creating a Job

Let's create a basic Job:

Example 1: Basic Job

Create a file named job-basic.yml:

job-basic.yml
apiVersion: batch/v1
kind: Job
metadata:
    name: hello-job
spec:
    template:
        spec:
            containers:
                - name: hello
                  image: busybox:1.36
                  command:
                      - /bin/sh
                      - -c
                      - echo "Hello from Kubernetes Job!"; sleep 5; echo "Job completed!"
            restartPolicy: Never

Important: A Job's Pod template must use restartPolicy: Never or restartPolicy: OnFailure. The Pod default, Always, is not allowed for a Job.
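The rule above can be sketched as a small pre-flight check. The Python below is only an illustration: check_job_restart_policy is a hypothetical helper, not part of kubectl or any Kubernetes client library, and it assumes the manifest has already been loaded into a plain dictionary. The API server performs the real validation when the Job is applied.

```python
# Hypothetical pre-flight check; the API server does the real validation.
ALLOWED_JOB_RESTART_POLICIES = {"Never", "OnFailure"}


def check_job_restart_policy(manifest: dict) -> str:
    """Return the Pod restartPolicy of a Job manifest, or raise ValueError."""
    pod_spec = manifest["spec"]["template"]["spec"]
    # A Pod without an explicit restartPolicy defaults to Always,
    # which is not allowed in a Job's Pod template.
    policy = pod_spec.get("restartPolicy", "Always")
    if policy not in ALLOWED_JOB_RESTART_POLICIES:
        raise ValueError(
            f"restartPolicy {policy!r} is not allowed for a Job; "
            "use Never or OnFailure"
        )
    return policy


# Same structure as job-basic.yml, written as a dictionary.
hello_job = {
    "apiVersion": "batch/v1",
    "kind": "Job",
    "metadata": {"name": "hello-job"},
    "spec": {
        "template": {
            "spec": {
                "containers": [{"name": "hello", "image": "busybox:1.36"}],
                "restartPolicy": "Never",
            }
        }
    },
}

print(check_job_restart_policy(hello_job))  # Never
```

Removing the restartPolicy key from the dictionary makes the helper raise ValueError, which mirrors the rejection you would get from the cluster.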

Apply the configuration:

sudo kubectl apply -f job-basic.yml

Verify that the Job was created:

sudo kubectl get jobs

Output:

NAME        COMPLETIONS   DURATION   AGE
hello-job   1/1           8s         10s

Check the Pods:

sudo kubectl get pods

Output:

NAME              READY   STATUS      RESTARTS   AGE
hello-job-abc12   0/1     Completed   0          15s

Note that the Pod status is Completed, not Running.

View the Pod log (the suffix of the Pod name is generated, so yours will differ):

sudo kubectl logs hello-job-abc12

Output:

Hello from Kubernetes Job!
Job completed!

Job Completion Modes

A Job supports several completion modes:

Non-Parallel Job (Default)

Runs a single Pod to completion:

job-single.yml
apiVersion: batch/v1
kind: Job
metadata:
    name: single-job
spec:
    template:
        spec:
            containers:
                - name: task
                  image: busybox:1.36
                  command: ["echo", "Single task completed"]
            restartPolicy: Never

This creates one Pod. If it fails, the Job creates a new Pod until one succeeds or the backoff limit is reached.

Parallel Job with Fixed Completion Count

Runs multiple Pods in parallel until the specified number complete successfully:

job-parallel-fixed.yml
apiVersion: batch/v1
kind: Job
metadata:
    name: parallel-job
spec:
    completions: 5
    parallelism: 2
    template:
        spec:
            containers:
                - name: task
                  image: busybox:1.36
                  command:
                      - /bin/sh
                      - -c
                      - echo "Processing task"; sleep 10; echo "Task completed"
            restartPolicy: Never

This Job:

  • Needs 5 successful completions (completions: 5)
  • Runs at most 2 Pods at a time (parallelism: 2)
  • Creates new Pods until 5 have completed successfully
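To get a feel for how completions and parallelism interact, here is a rough Python model. It is a simplification, not how the Job controller is implemented: wave_sizes is a hypothetical helper that assumes every Pod succeeds on the first attempt and takes the same time, so the Pods run in neat batches. The real controller starts a replacement Pod as soon as one finishes, and scheduling and image pulls add time on top.

```python
def wave_sizes(completions: int, parallelism: int) -> list[int]:
    """Split `completions` into batches of at most `parallelism` Pods."""
    if completions < 1 or parallelism < 1:
        raise ValueError("completions and parallelism must be positive")
    waves = []
    remaining = completions
    while remaining > 0:
        # Never run more Pods than there are completions still needed.
        running = min(parallelism, remaining)
        waves.append(running)
        remaining -= running
    return waves


waves = wave_sizes(completions=5, parallelism=2)
print(waves)            # [2, 2, 1]
print(len(waves) * 10)  # 30, a rough lower bound in seconds at 10 s per Pod
```

With the values from job-parallel-fixed.yml this gives three batches, so the Job should take at least about 30 seconds given the sleep 10 in each Pod.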

Parallel Job with Work Queue

Runs multiple Pods in parallel without a fixed completion count:

job-work-queue.yml
apiVersion: batch/v1
kind: Job
metadata:
    name: work-queue-job
spec:
    parallelism: 3
    template:
        spec:
            containers:
                - name: worker
                  image: busybox:1.36
                  command:
                      - /bin/sh
                      - -c
                      - echo "Processing work item"; sleep 5; echo "Done"
            restartPolicy: Never

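This Job:

  • Runs 3 Pods in parallel
  • Has no fixed completion count (completions is left unset)
  • Expects the Pods to coordinate through an external work queue (the demo command above only simulates the work)
  • Is complete once at least one Pod has succeeded and all Pods have terminated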
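The worker logic that each Pod would run in a work-queue Job can be sketched as follows. This is a minimal, single-process illustration under stated assumptions: Python's in-memory queue.Queue stands in for an external queue service, three threads stand in for the three Pods, and "processing" an item just records it. None of this code comes from the original example.

```python
import queue
import threading

# Stand-in for an external queue such as Redis or RabbitMQ (assumption:
# a real setup would connect to a queue service instead).
work_items = queue.Queue()
for item in range(10):
    work_items.put(item)

processed = []
processed_lock = threading.Lock()


def worker() -> None:
    """Take items until the queue is empty, then return (exit with success)."""
    while True:
        try:
            item = work_items.get_nowait()
        except queue.Empty:
            return  # nothing left: this worker is done
        with processed_lock:
            processed.append(item)


# Three threads play the role of the three Pods (parallelism: 3).
threads = [threading.Thread(target=worker) for _ in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(processed) == list(range(10)))  # True
```

The important design point is that each worker decides for itself when it is done: it exits successfully once the queue is empty, which is what lets the Job finish without a fixed completion count.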

Restart Policy

Job Pods support two restart policies:

Never

The containers are never restarted. If the Pod fails, the Job creates a new Pod:

spec:
    template:
        spec:
            restartPolicy: Never

Behavior:

  • The failed Pod stays in the Error state
  • A new Pod is created for the retry
  • Good for debugging (you can inspect the failed Pod)

OnFailure

If a container fails, it is restarted in place, in the same Pod on the same node:

spec:
    template:
        spec:
            restartPolicy: OnFailure

Behavior:

  • The failed container is restarted in place
  • No new Pod is created
  • Good for resource efficiency

Backoff Limit

Controls how many times the Job retries failed Pods:

job-backoff.yml
apiVersion: batch/v1
kind: Job
metadata:
    name: retry-job
spec:
    backoffLimit: 3
    template:
        spec:
            containers:
                - name: task
                  image: busybox:1.36
                  command:
                      - /bin/sh
                      - -c
                      - exit 1
            restartPolicy: Never

This Job:

  • Retries up to 3 times (backoffLimit: 3)
  • Is marked as Failed once those retries are exhausted
  • Would use the default backoffLimit of 6 if the field were omitted
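Retries do not happen back to back. The Kubernetes documentation describes an exponential back-off delay between recreated Pods (10s, 20s, 40s, and so on), capped at six minutes. The Python sketch below reproduces only that documented schedule; backoff_delays is a hypothetical helper, and the exact timing on a real cluster can differ.

```python
def backoff_delays(retries: int, base: int = 10, cap: int = 360) -> list[int]:
    """Delay in seconds before each retry: 10, 20, 40, ... capped at 360."""
    return [min(base * 2**attempt, cap) for attempt in range(retries)]


print(backoff_delays(3))       # [10, 20, 40]
print(backoff_delays(6))       # [10, 20, 40, 80, 160, 320]
print(sum(backoff_delays(6)))  # 630
```

Under this model, the default backoffLimit of 6 can spend more than ten minutes waiting between attempts before the Job is marked as Failed, which is one reason to choose the value deliberately.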

Active Deadline Seconds

Sets a time limit for Job execution:

job-deadline.yml
apiVersion: batch/v1
kind: Job
metadata:
    name: deadline-job
spec:
    activeDeadlineSeconds: 60
    template:
        spec:
            containers:
                - name: task
                  image: busybox:1.36
                  command:
                      - /bin/sh
                      - -c
                      - sleep 120
            restartPolicy: Never

This Job:

  • Must complete within 60 seconds
  • Is terminated once the 60 seconds have elapsed and is marked as Failed (reason DeadlineExceeded)
  • Has all of its running Pods killed at that point

Viewing Job Details

To see detailed information about a Job:

sudo kubectl describe job hello-job

Output:

Name:             hello-job
Namespace:        default
Selector:         controller-uid=abc123
Labels:           <none>
Annotations:      <none>
Parallelism:      1
Completions:      1
Completion Mode:  NonIndexed
Start Time:       Sun, 01 Mar 2026 10:00:00 +0000
Completed At:     Sun, 01 Mar 2026 10:00:08 +0000
Duration:         8s
Pods Statuses:    0 Active / 1 Succeeded / 0 Failed
Pod Template:
  Labels:  controller-uid=abc123
  Containers:
   hello:
    Image:      busybox:1.36
    Command:
      /bin/sh
      -c
      echo "Hello from Kubernetes Job!"; sleep 5; echo "Job completed!"
    Environment:  <none>
    Mounts:       <none>
  Volumes:        <none>
Events:
  Type    Reason            Age   From            Message
  ----    ------            ----  ----            -------
  Normal  SuccessfulCreate  2m    job-controller  Created pod: hello-job-abc12
  Normal  Completed         2m    job-controller  Job completed

Practical Examples

Example 1: Database Migration Job

migration-job.yml
apiVersion: batch/v1
kind: Job
metadata:
    name: db-migration
    labels:
        app: database
        task: migration
spec:
    backoffLimit: 2
    activeDeadlineSeconds: 300
    template:
        metadata:
            labels:
                app: database
                task: migration
        spec:
            containers:
                - name: migrate
                  image: migrate/migrate:v4.16.2
                  command:
                      - migrate
                      - -path=/migrations
                      # Demo only: the credentials are hardcoded here; in practice, load them from a Secret
                      - -database=postgres://user:pass@db:5432/mydb?sslmode=disable
                      - up
                  volumeMounts:
                      - name: migrations
                        mountPath: /migrations
            volumes:
                - name: migrations
                  configMap:
                      name: db-migrations
            restartPolicy: Never

This Job:

  • Runs a database migration
  • Retries up to 2 times on failure
  • Must complete within 5 minutes
  • Loads the migration files from a ConfigMap (db-migrations, which must already exist)

Example 2: Batch Data Processing Job

batch-processing-job.yml
apiVersion: batch/v1
kind: Job
metadata:
    name: data-processor
spec:
    completions: 10
    parallelism: 3
    template:
        spec:
            containers:
                - name: processor
                  image: python:3.11-slim
                  command:
                      - python
                      - -c
                      - |
                          import time
                          import random
                          print("Processing data batch...")
                          time.sleep(random.randint(5, 15))
                          print("Batch processing completed!")
                  resources:
                      requests:
                          memory: "256Mi"
                          cpu: "250m"
                      limits:
                          memory: "512Mi"
                          cpu: "500m"
            restartPolicy: OnFailure

This Job:

  • Processes 10 batches of data
  • Runs up to 3 batches in parallel
  • Sets resource requests and limits
  • Restarts failed containers in place, on the same node

Example 3: Backup Job

backup-job.yml
apiVersion: batch/v1
kind: Job
metadata:
    name: database-backup
    labels:
        app: backup
        type: database
spec:
    backoffLimit: 1
    activeDeadlineSeconds: 600
    template:
        metadata:
            labels:
                app: backup
                type: database
        spec:
            containers:
                - name: backup
                  image: postgres:15-alpine
                  command:
                      - /bin/sh
                      - -c
                      - |
                          set -e  # exit non-zero if pg_dump fails so the Job is marked as failed
                          pg_dump -h "$DB_HOST" -U "$DB_USER" -d "$DB_NAME" > /backup/backup-$(date +%Y%m%d-%H%M%S).sql
                          echo "Backup completed successfully"
                  env:
                      - name: DB_HOST
                        value: "postgres-service"
                      - name: DB_USER
                        valueFrom:
                            secretKeyRef:
                                name: db-credentials
                                key: username
                      - name: DB_NAME
                        value: "production"
                      - name: PGPASSWORD
                        valueFrom:
                            secretKeyRef:
                                name: db-credentials
                                key: password
                  volumeMounts:
                      - name: backup-storage
                        mountPath: /backup
            volumes:
                - name: backup-storage
                  persistentVolumeClaim:
                      claimName: backup-pvc
            restartPolicy: Never

This Job:

  • Creates a database backup
  • Stores the backup on a persistent volume
  • Uses a Secret for the credentials
  • Must complete within 10 minutes

Example 4: Image Processing Job

image-processing-job.yml
apiVersion: batch/v1
kind: Job
metadata:
    name: image-processor
spec:
    completions: 100
    parallelism: 10
    template:
        spec:
            containers:
                - name: processor
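                  image: imagemagick:latest  # placeholder image name; use an image that ships ImageMagick
                  command:
                      - /bin/sh
                      - -c
                      - |
                          set -e  # stop at the first failing command
                          echo "Processing image..."
                          # input.jpg is a placeholder; a real Job would mount or download its input
                          convert input.jpg -resize 800x600 output.jpg
                          echo "Image processed successfully"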
                  resources:
                      requests:
                          memory: "512Mi"
                          cpu: "500m"
                      limits:
                          memory: "1Gi"
                          cpu: "1000m"
            restartPolicy: OnFailure

This Job:

  • Processes 100 images
  • Runs up to 10 processing tasks in parallel
  • Sets resource requests and limits suited to image processing

Job Patterns

Pattern 1: Single Job with Multiple Attempts

For a task that may fail but should be retried:

spec:
    backoffLimit: 5
    template:
        spec:
            restartPolicy: Never

Pattern 2: Parallel Processing with Fixed Count

For processing a known number of items:

spec:
    completions: 100
    parallelism: 10

Pattern 3: Work Queue Pattern

For processing items from a queue:

spec:
    parallelism: 5
    # completions is intentionally left unset

The Pods coordinate through an external queue (Redis, RabbitMQ, etc.).

Pattern 4: Time-Limited Job

For a task that must complete within a time limit:

spec:
    activeDeadlineSeconds: 300
    backoffLimit: 3
Cleaning Up Jobs

Manual Cleanup

Delete a completed Job:

sudo kubectl delete job hello-job

This deletes the Job and its Pods.

Automatic Cleanup

Use a TTL (time to live) to clean up finished Jobs automatically:

job-ttl.yml
apiVersion: batch/v1
kind: Job
metadata:
    name: cleanup-job
spec:
    ttlSecondsAfterFinished: 100
    template:
        spec:
            containers:
                - name: task
                  image: busybox:1.36
                  command: ["echo", "Task completed"]
            restartPolicy: Never

This Job:

  • Is deleted automatically 100 seconds after it finishes
  • Is covered by the TTL whether it succeeded or failed
  • Helps prevent finished Jobs from accumulating
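The TTL is counted from the moment the Job finishes, not from when it was created. A small Python sketch of that arithmetic follows; eligible_for_deletion_at is a hypothetical helper, and the finish time is made up for the example. The result is the earliest moment the Job becomes eligible for cleanup, and the actual deletion may happen slightly later.

```python
from datetime import datetime, timedelta, timezone


def eligible_for_deletion_at(finished_at: datetime, ttl_seconds: int) -> datetime:
    """Earliest time a finished Job may be cleaned up by its TTL."""
    return finished_at + timedelta(seconds=ttl_seconds)


# Hypothetical finish time, chosen only for the example.
finished = datetime(2026, 3, 1, 10, 0, 8, tzinfo=timezone.utc)
print(eligible_for_deletion_at(finished, 100).isoformat())
# 2026-03-01T10:01:48+00:00
```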

Cleanup Policy

Control when a Job is cleaned up:

spec:
    ttlSecondsAfterFinished: 0  # Delete immediately after the Job finishes

Or keep finished Jobs around longer, for example to debug failures:

spec:
    ttlSecondsAfterFinished: 86400  # Keep for 24 hours

Monitoring Jobs

Check Job Status

sudo kubectl get jobs

Watch Job Progress

sudo kubectl get jobs -w

List a Job's Pods

sudo kubectl get pods --selector=job-name=hello-job

Check Job Logs

# Get the Pod name from the Job
POD_NAME=$(sudo kubectl get pods --selector=job-name=hello-job -o jsonpath='{.items[0].metadata.name}')

# View the log
sudo kubectl logs "$POD_NAME"

Monitor Job Events

sudo kubectl get events --sort-by='.lastTimestamp' | grep -i job
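For scripting, the same status is available as JSON, for example from sudo kubectl get job hello-job -o json. The Python below is a hedged sketch of how such output could be reduced to a single word: summarize_job is a hypothetical helper, and the sample dictionary is hand-written and trimmed rather than captured from a real cluster. It relies only on the Job status fields active and conditions.

```python
def summarize_job(job: dict) -> str:
    """Reduce a Job object (as parsed JSON) to a one-word state."""
    status = job.get("status", {})
    # A finished Job carries a condition of type Complete or Failed
    # whose status is the string "True".
    for condition in status.get("conditions", []):
        if condition.get("status") == "True" and condition.get("type") in (
            "Complete",
            "Failed",
        ):
            return condition["type"]
    if status.get("active", 0) > 0:
        return "Running"
    return "Pending"


# Hand-written sample, trimmed to the fields the helper reads.
sample = {
    "metadata": {"name": "hello-job"},
    "status": {
        "succeeded": 1,
        "conditions": [{"type": "Complete", "status": "True"}],
    },
}

print(summarize_job(sample))  # Complete
```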

Common Mistakes and Pitfalls

Mistake 1: Using the Wrong Restart Policy

Problem: Using restartPolicy: Always (or leaving it unset, which defaults to Always) for a Job.

Solution: Use Never or OnFailure:

spec:
    template:
        spec:
            restartPolicy: Never  # or OnFailure

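Mistake 2: Not Setting a Backoff Limit

Problem: Without an explicit backoffLimit, a failing Job falls back to the default of 6 retries with increasing delays, which may be far more attempts than the task warrants.

Solution: Set an appropriate backoffLimit:

spec:
    backoffLimit: 3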

Mistake 3: No Time Limit

Problem: The Job runs forever if the task hangs.

Solution: Set activeDeadlineSeconds:

spec:
    activeDeadlineSeconds: 300

Mistake 4: Not Cleaning Up Completed Jobs

Problem: Completed Jobs and their Pods accumulate.

Solution: Use a TTL for automatic cleanup:

spec:
    ttlSecondsAfterFinished: 100

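Mistake 5: Incorrect Parallelism Configuration

Problem: Setting parallelism higher than completions. For a Job with a fixed completion count, Kubernetes never runs more Pods than there are completions remaining, so the extra parallelism has no effect and only makes the manifest misleading.

Solution: Keep parallelism <= completions:

spec:
    completions: 10
    parallelism: 5  # Not more than completions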

Mistake 6: Not Setting Resource Limits

Problem: Job Pods consume excessive resources.

Solution: Always set resource requests and limits:

resources:
    requests:
        memory: "256Mi"
        cpu: "250m"
    limits:
        memory: "512Mi"
        cpu: "500m"

Best Practices

Set an Appropriate Backoff Limit

Cap the number of retries explicitly:

spec:
    backoffLimit: 3

Use an Active Deadline

Prevent a Job from running too long:

spec:
    activeDeadlineSeconds: 600

Enable Automatic Cleanup

Use a TTL to clean up finished Jobs:

spec:
    ttlSecondsAfterFinished: 100

Set Resource Limits

Prevent resource exhaustion:

resources:
    requests:
        memory: "256Mi"
        cpu: "250m"
    limits:
        memory: "512Mi"
        cpu: "500m"

Use Labels for Organization

Add meaningful labels:

metadata:
    labels:
        app: data-processor
        task: batch-import
        environment: production

Choose the Right Restart Policy

  • Use Never for debugging (failed Pods are kept)
  • Use OnFailure for efficiency (restart in place)

Configure Parallelism Wisely

Balance speed against resource usage:

spec:
    completions: 100
    parallelism: 10  # Process 10 at a time

Use Secrets for Sensitive Data

Do not hardcode credentials:

env:
    - name: DB_PASSWORD
      valueFrom:
          secretKeyRef:
              name: db-credentials
              key: password
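The secretKeyRef above assumes that a Secret named db-credentials already exists. As a rough illustration of what such a Secret contains, the Python sketch below builds one as a dictionary and prints it as JSON, which kubectl apply accepts alongside YAML. build_secret is a hypothetical helper and the credential values are placeholders. Note that values under data are base64-encoded, which is an encoding and not encryption.

```python
import base64
import json


def build_secret(name: str, values: dict[str, str]) -> dict:
    """Build a Secret manifest; values under `data` must be base64-encoded."""
    encoded = {
        key: base64.b64encode(value.encode("utf-8")).decode("ascii")
        for key, value in values.items()
    }
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name},
        "type": "Opaque",
        "data": encoded,
    }


# Placeholder credentials for illustration only.
secret = build_secret("db-credentials", {"username": "app", "password": "change-me"})
print(json.dumps(secret, indent=2))
```

The printed JSON could be saved to a file and applied like any other manifest, although in practice you would keep real credentials out of files that end up in version control.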

Job vs CronJob

A Job runs once, but what if you need to run it on a schedule?

Job - Runs once:

apiVersion: batch/v1
kind: Job

CronJob - Runs on a schedule:

apiVersion: batch/v1
kind: CronJob
metadata:
    name: scheduled-backup
spec:
    schedule: "0 2 * * *"  # Every day at 2 AM
    jobTemplate:
        spec:
            template:
                spec:
                    containers:
                        - name: backup
                          image: backup-tool:latest  # placeholder image name
                    restartPolicy: Never

A CronJob creates Jobs on a schedule. We will cover CronJob in detail in the next episode.
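The schedule string uses the standard five-field cron format. As a reading aid, the Python sketch below labels each field of the schedule from the example. split_schedule is a hypothetical helper that only names the fields; it does not validate value ranges or handle shorthand forms.

```python
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def split_schedule(schedule: str) -> dict[str, str]:
    """Label the five space-separated fields of a cron schedule string."""
    parts = schedule.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(f"expected 5 fields, got {len(parts)}")
    return dict(zip(CRON_FIELDS, parts))


print(split_schedule("0 2 * * *"))
# {'minute': '0', 'hour': '2', 'day_of_month': '*', 'month': '*', 'day_of_week': '*'}
```

Read this way, "0 2 * * *" means minute 0 of hour 2 on every day of every month, which matches the comment in the manifest.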

Closing

In this episode 14, we covered the Job in Kubernetes in depth. We learned what a Job is, how it differs from other controllers, and how to use it for batch processing and one-time tasks.

Key takeaways:

  • A Job runs Pods to completion, not continuously
  • Failed Pods are retried automatically, up to backoffLimit
  • Parallel execution is supported through parallelism and completions
  • Two restart policies: Never (create a new Pod) or OnFailure (restart in place)
  • Use activeDeadlineSeconds to put a time limit on a Job
  • Use ttlSecondsAfterFinished for automatic cleanup
  • A good fit for batch processing, migrations, backups, and one-time tasks
  • Different from ReplicaSet/DaemonSet, which keep Pods running
  • Can run a single Pod or multiple Pods in parallel
  • Always set resource limits and a backoff limit

Jobs are essential for running batch workloads, data processing, and one-time tasks in Kubernetes. By understanding Jobs, you can effectively manage tasks that need to run to completion, handle failures gracefully, and clean up resources automatically.

So, is the Job in Kubernetes clearer now? In the next episode, episode 15, we will cover CronJob, which builds on Job to provide scheduled, recurring task execution. Keep up the learning spirit and stay tuned for the next episode!

