Learning Kubernetes - Episode 27 - Introduction and Explanation of StatefulSet

In this episode, we'll discuss Kubernetes StatefulSet for managing stateful applications. We'll learn about stable network identities, persistent storage, ordered deployment, and best practices for databases and stateful workloads.

Arman Dwi Pangestu
April 2, 2026

Introduction

In the previous episode, we learned about Deployment for managing stateless applications with rolling updates and easy scaling. In episode 27, we'll discuss StatefulSet, designed specifically for stateful applications that require stable network identities and persistent storage.

Note: Here I'll be using a Kubernetes cluster installed through K3s, which is why the commands below run kubectl with sudo.

While Deployments work great for stateless applications, stateful applications like databases, message queues, and distributed systems need guarantees about Pod identity, ordering, and storage persistence. StatefulSet provides these guarantees.

What Is a StatefulSet?

A StatefulSet is a Kubernetes workload resource that manages stateful applications, providing stable network identities, persistent storage, and ordered deployment and scaling.

Think of a StatefulSet as a numbered team in which each member keeps the same number and identity: member-0, member-1, and so on. Your application can rely on those numbers to assign roles (for example, treating member-0 as the primary), although Kubernetes itself assigns no roles. Unlike a Deployment, where all Pods are interchangeable, StatefulSet Pods have unique, persistent identities.

Key characteristics of StatefulSet:

  • Stable network identity - Each Pod gets a predictable hostname
  • Persistent storage - Each Pod can have its own PersistentVolume
  • Ordered deployment - Pods created sequentially (0, 1, 2...)
  • Ordered scaling - Pods scaled up/down in order
  • Ordered updates - Pods updated one at a time, in reverse ordinal order by default
  • Stable DNS names - Pods accessible via predictable DNS
  • Sticky identity - Pod identity persists across rescheduling

StatefulSet vs Deployment

Understanding the key differences:

| Aspect           | StatefulSet                   | Deployment           |
| ---------------- | ----------------------------- | -------------------- |
| Pod Identity     | Stable, unique (web-0, web-1) | Random (web-abc123)  |
| Network Identity | Stable hostname               | Random hostname      |
| Storage          | Individual PVC per Pod        | Shared or no storage |
| Deployment Order | Sequential (0→1→2)            | Parallel             |
| Scaling Order    | Sequential                    | Parallel             |
| Use Case         | Databases, stateful apps      | Web servers, APIs    |
| Pod Replacement  | Same identity preserved       | New random identity  |

Why Use StatefulSet?

StatefulSet solves critical challenges for stateful applications:

  • Database clusters - Master-slave replication with stable identities
  • Distributed systems - Nodes need to know each other's addresses
  • Message queues - Persistent storage for message durability
  • Caching systems - Stable identities for cache distribution
  • Consensus systems - Ordered deployment for leader election
  • Data persistence - Each Pod maintains its own data
  • Predictable scaling - Controlled order for adding/removing nodes

Without StatefulSet, managing stateful applications would require complex custom logic for identity management, storage allocation, and ordered operations.

Creating a StatefulSet

Let's create a basic StatefulSet.

Basic StatefulSet

Kubernetesweb-statefulset.yml
apiVersion: v1
kind: Service
metadata:
    name: nginx
    labels:
        app: nginx
spec:
    ports:
        - port: 80
          name: web
    clusterIP: None
    selector:
        app: nginx
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
    name: web
spec:
    serviceName: "nginx"
    replicas: 3
    selector:
        matchLabels:
            app: nginx
    template:
        metadata:
            labels:
                app: nginx
        spec:
            containers:
                - name: nginx
                  image: nginx:1.25
                  ports:
                      - containerPort: 80
                        name: web

Apply the StatefulSet:

Kubernetesbash
sudo kubectl apply -f web-statefulset.yml

Watch Pods being created:

Kubernetesbash
sudo kubectl get pods -w -l app=nginx

Output shows sequential creation:

Kubernetesbash
NAME    READY   STATUS              RESTARTS   AGE
web-0   0/1     Pending             0          0s
web-0   0/1     ContainerCreating   0          0s
web-0   1/1     Running             0          10s
web-1   0/1     Pending             0          0s
web-1   0/1     ContainerCreating   0          0s
web-1   1/1     Running             0          10s
web-2   0/1     Pending             0          0s
web-2   0/1     ContainerCreating   0          0s
web-2   1/1     Running             0          10s

Notice:

  • Pods created sequentially: web-0, then web-1, then web-2
  • Each Pod has a stable, predictable name
  • Next Pod only starts after previous is Running and Ready
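
To block until every replica has gone through that sequence, a small wrapper around kubectl rollout status can help. This is only a sketch: the function name wait_for_statefulset and the KUBECTL override variable are hypothetical conveniences, not part of kubectl, and the 120s default timeout is an arbitrary choice.

```shell
#!/bin/sh
# Wait until all Pods of a StatefulSet are Running and Ready.
# $1: StatefulSet name, $2: optional timeout (defaults to 120s).
# KUBECTL is a hypothetical override; it defaults to "sudo kubectl" as used on K3s.
wait_for_statefulset() {
    ${KUBECTL:-sudo kubectl} rollout status "statefulset/$1" --timeout="${2:-120s}"
}

# Usage (requires a running cluster):
#   wait_for_statefulset web
```

The command returns once web-0, web-1, and web-2 are all ready, or fails when the timeout expires.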

StatefulSet with Persistent Storage

Kubernetesstatefulset-storage.yml
apiVersion: v1
kind: Service
metadata:
    name: nginx
spec:
    ports:
        - port: 80
          name: web
    clusterIP: None
    selector:
        app: nginx
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
    name: web
spec:
    serviceName: "nginx"
    replicas: 3
    selector:
        matchLabels:
            app: nginx
    template:
        metadata:
            labels:
                app: nginx
        spec:
            containers:
                - name: nginx
                  image: nginx:1.25
                  ports:
                      - containerPort: 80
                        name: web
                  volumeMounts:
                      - name: www
                        mountPath: /usr/share/nginx/html
    volumeClaimTemplates:
        - metadata:
              name: www
          spec:
              accessModes: ["ReadWriteOnce"]
              resources:
                  requests:
                      storage: 1Gi

This creates:

  • 3 Pods: web-0, web-1, web-2
  • 3 PVCs: www-web-0, www-web-1, www-web-2
  • Each Pod gets its own persistent storage
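
The PVC names follow the pattern <template-name>-<statefulset-name>-<ordinal>. As a small illustration of that naming rule, the sketch below prints the names for any combination; pvc_names is a hypothetical helper, not a kubectl feature.

```shell
#!/bin/sh
# Print the PVC names a StatefulSet derives from one volumeClaimTemplate.
# $1: volumeClaimTemplate name, $2: StatefulSet name, $3: replica count.
pvc_names() {
    template=$1
    set_name=$2
    replicas=$3
    i=0
    while [ "$i" -lt "$replicas" ]; do
        echo "${template}-${set_name}-${i}"
        i=$((i + 1))
    done
}

# Prints www-web-0, www-web-1, www-web-2 on separate lines.
pvc_names www web 3
```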

Headless Service

StatefulSets require a Headless Service for network identity.

What Is a Headless Service?

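A Headless Service (clusterIP: None) has no cluster IP and does no load balancing. Instead, a DNS lookup of the Service name returns the IP addresses of the individual Pods, and each Pod also gets its own DNS record, enabling direct Pod-to-Pod communication.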

Kubernetesheadless-service.yml
apiVersion: v1
kind: Service
metadata:
    name: nginx
spec:
    ports:
        - port: 80
          name: web
    clusterIP: None  # Makes it headless
    selector:
        app: nginx

DNS for StatefulSet Pods

Each Pod gets a predictable DNS name:

plaintext
<pod-name>.<service-name>.<namespace>.svc.cluster.local

Examples:

  • web-0.nginx.default.svc.cluster.local
  • web-1.nginx.default.svc.cluster.local
  • web-2.nginx.default.svc.cluster.local
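
The pattern can be assembled mechanically. The sketch below is a hypothetical helper (pod_fqdn is not a Kubernetes tool) that assumes the default namespace and the common cluster.local cluster domain unless told otherwise.

```shell
#!/bin/sh
# Build the DNS name of a StatefulSet Pod behind a headless Service.
# $1: Pod name, $2: Service name,
# $3: namespace (assumed "default"), $4: cluster domain (assumed "cluster.local").
pod_fqdn() {
    pod=$1
    service=$2
    namespace=${3:-default}
    domain=${4:-cluster.local}
    echo "${pod}.${service}.${namespace}.svc.${domain}"
}

# Prints web-0.nginx.default.svc.cluster.local
pod_fqdn web-0 nginx
```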

Test DNS resolution:

Kubernetesbash
sudo kubectl run -it --rm debug --image=busybox:1.36 --restart=Never -- nslookup web-0.nginx

Scaling StatefulSets

Scale up and down in order.

Scale Up

Kubernetesbash
sudo kubectl scale statefulset web --replicas=5

Pods created sequentially:

  • web-3 created and becomes Ready
  • Then web-4 created

Scale Down

Kubernetesbash
sudo kubectl scale statefulset web --replicas=2

Pods deleted in reverse order:

  • web-4 deleted first
  • Then web-3 deleted
  • Then web-2 deleted
  • web-0 and web-1 remain
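
To see which Pods a scale-down removes, and in what order, the rule can be written out as a loop from the highest ordinal down to the new replica count. scale_down_order is a hypothetical helper for illustration only; it does not touch the cluster.

```shell
#!/bin/sh
# Print the Pods removed when scaling a StatefulSet down, in deletion order.
# $1: StatefulSet name, $2: current replicas, $3: target replicas.
scale_down_order() {
    name=$1
    current=$2
    target=$3
    i=$((current - 1))
    # Pods with ordinal >= target are removed, highest ordinal first.
    while [ "$i" -ge "$target" ]; do
        echo "${name}-${i}"
        i=$((i - 1))
    done
}

# Prints web-4, web-3, web-2 on separate lines.
scale_down_order web 5 2
```

Going from 5 replicas to 2 therefore removes three Pods and leaves web-0 and web-1.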

Important

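Important: By default, scaling down doesn't delete PersistentVolumeClaims. They remain for data safety and are reattached to the same ordinals if you scale back up. Newer Kubernetes versions let you change this behavior through spec.persistentVolumeClaimRetentionPolicy.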

Update Strategies

StatefulSet supports two update strategies.

RollingUpdate (Default)

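Updates Pods one at a time in reverse ordinal order (highest ordinal first), waiting for each updated Pod to be Running and Ready before moving on: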

Kubernetesrolling-update.yml
spec:
    updateStrategy:
        type: RollingUpdate
        rollingUpdate:
            partition: 0

Partition: Only Pods with ordinal >= partition are updated.

Example with partition=2 and 5 replicas:

  • web-2, web-3, web-4 updated
  • web-0, web-1 remain on old version
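
The partition rule can be sketched as a loop, together with the JSON patch that would set the partition on a live StatefulSet. Both pods_to_update and partition_patch are hypothetical helper names; the patch body mirrors the updateStrategy fields shown above.

```shell
#!/bin/sh
# Print the Pods a rolling update touches for a given partition, in update order.
# $1: StatefulSet name, $2: replicas, $3: partition.
pods_to_update() {
    name=$1
    replicas=$2
    partition=$3
    i=$((replicas - 1))
    # Only ordinals >= partition are updated, highest ordinal first.
    while [ "$i" -ge "$partition" ]; do
        echo "${name}-${i}"
        i=$((i - 1))
    done
}

# Print a JSON patch that sets the rolling-update partition to $1.
partition_patch() {
    printf '{"spec":{"updateStrategy":{"type":"RollingUpdate","rollingUpdate":{"partition":%d}}}}\n' "$1"
}

# Prints web-4, web-3, web-2 on separate lines.
pods_to_update web 5 2

# Apply on a running cluster with:
#   sudo kubectl patch statefulset web -p "$(partition_patch 2)"
partition_patch 2
```

Lowering the partition step by step gives a staged (canary-style) rollout; setting it back to 0 updates the remaining Pods.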

OnDelete

Pods only updated when manually deleted:

Kubernetesondelete-update.yml
spec:
    updateStrategy:
        type: OnDelete

Useful for manual control over updates.

Practical Examples

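Example 1: MySQL Master-Slave Replication

This manifest is the StatefulSet skeleton for such a setup: it starts three MySQL Pods with stable names and their own volumes, but it does not configure replication itself. It also expects a Secret named mysql-secret with a password key to exist already.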

Kubernetesmysql-statefulset.yml
apiVersion: v1
kind: Service
metadata:
    name: mysql
spec:
    ports:
        - port: 3306
          name: mysql
    clusterIP: None
    selector:
        app: mysql
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
    name: mysql
spec:
    serviceName: mysql
    replicas: 3
    selector:
        matchLabels:
            app: mysql
    template:
        metadata:
            labels:
                app: mysql
        spec:
            containers:
                - name: mysql
                  image: mysql:8.0
                  ports:
                      - containerPort: 3306
                        name: mysql
                  env:
                      - name: MYSQL_ROOT_PASSWORD
                        valueFrom:
                            secretKeyRef:
                                name: mysql-secret
                                key: password
                  volumeMounts:
                      - name: data
                        mountPath: /var/lib/mysql
                  resources:
                      requests:
                          memory: "512Mi"
                          cpu: "500m"
                      limits:
                          memory: "1Gi"
                          cpu: "1000m"
    volumeClaimTemplates:
        - metadata:
              name: data
          spec:
              accessModes: ["ReadWriteOnce"]
              resources:
                  requests:
                      storage: 10Gi
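
The manifest reads the root password from a Secret named mysql-secret with the key password, so that Secret has to exist before the Pods can start. A minimal sketch for creating it follows; the function name create_mysql_secret and the KUBECTL override variable are hypothetical, and the password is whatever you pass in.

```shell
#!/bin/sh
# Create the Secret referenced by the MySQL StatefulSet.
# $1: the root password to store under the key "password".
# KUBECTL is a hypothetical override; it defaults to "sudo kubectl" as used on K3s.
create_mysql_secret() {
    ${KUBECTL:-sudo kubectl} create secret generic mysql-secret \
        --from-literal=password="$1"
}

# Usage (requires a running cluster):
#   create_mysql_secret 'choose-a-strong-password'
```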

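Example 2: PostgreSQL Cluster

As written, this manifest runs three independent PostgreSQL instances; replication between them needs extra tooling. The plain-text password in the Secret is for demonstration only.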

Kubernetespostgres-statefulset.yml
apiVersion: v1
kind: ConfigMap
metadata:
    name: postgres-config
data:
    POSTGRES_DB: "myapp"
    POSTGRES_USER: "appuser"
---
apiVersion: v1
kind: Secret
metadata:
    name: postgres-secret
type: Opaque
stringData:
    POSTGRES_PASSWORD: "secretpassword"
---
apiVersion: v1
kind: Service
metadata:
    name: postgres
spec:
    ports:
        - port: 5432
          name: postgres
    clusterIP: None
    selector:
        app: postgres
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
    name: postgres
spec:
    serviceName: postgres
    replicas: 3
    selector:
        matchLabels:
            app: postgres
    template:
        metadata:
            labels:
                app: postgres
        spec:
            containers:
                - name: postgres
                  image: postgres:15
                  ports:
                      - containerPort: 5432
                        name: postgres
                  envFrom:
                      - configMapRef:
                            name: postgres-config
                      - secretRef:
                            name: postgres-secret
                  volumeMounts:
                      - name: data
                        mountPath: /var/lib/postgresql/data
                  livenessProbe:
                      exec:
                          command:
                              - pg_isready
                              - -U
                              - appuser
                      initialDelaySeconds: 30
                      periodSeconds: 10
                  readinessProbe:
                      exec:
                          command:
                              - pg_isready
                              - -U
                              - appuser
                      initialDelaySeconds: 5
                      periodSeconds: 5
    volumeClaimTemplates:
        - metadata:
              name: data
          spec:
              accessModes: ["ReadWriteOnce"]
              resources:
                  requests:
                      storage: 20Gi

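Example 3: Redis Cluster

This manifest runs three standalone Redis servers with append-only-file persistence; Redis Cluster mode or replication would need additional configuration.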

Kubernetesredis-statefulset.yml
apiVersion: v1
kind: ConfigMap
metadata:
    name: redis-config
data:
    redis.conf: |
        appendonly yes
        appendfilename "appendonly.aof"
---
apiVersion: v1
kind: Service
metadata:
    name: redis
spec:
    ports:
        - port: 6379
          name: redis
    clusterIP: None
    selector:
        app: redis
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
    name: redis
spec:
    serviceName: redis
    replicas: 3
    selector:
        matchLabels:
            app: redis
    template:
        metadata:
            labels:
                app: redis
        spec:
            containers:
                - name: redis
                  image: redis:7.2
                  ports:
                      - containerPort: 6379
                        name: redis
                  command:
                      - redis-server
                      - /etc/redis/redis.conf
                  volumeMounts:
                      - name: data
                        mountPath: /data
                      - name: config
                        mountPath: /etc/redis
                  resources:
                      requests:
                          memory: "256Mi"
                          cpu: "250m"
                      limits:
                          memory: "512Mi"
                          cpu: "500m"
            volumes:
                - name: config
                  configMap:
                      name: redis-config
    volumeClaimTemplates:
        - metadata:
              name: data
          spec:
              accessModes: ["ReadWriteOnce"]
              resources:
                  requests:
                      storage: 5Gi

Example 4: Kafka Cluster

Kuberneteskafka-statefulset.yml
apiVersion: v1
kind: Service
metadata:
    name: kafka
spec:
    ports:
        - port: 9092
          name: kafka
    clusterIP: None
    selector:
        app: kafka
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
    name: kafka
spec:
    serviceName: kafka
    replicas: 3
    selector:
        matchLabels:
            app: kafka
    template:
        metadata:
            labels:
                app: kafka
        spec:
            containers:
                - name: kafka
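                  # Pinned to a ZooKeeper-capable release instead of latest (tag is an assumption; adjust as needed)
                  image: confluentinc/cp-kafka:7.4.0
                  ports:
                      - containerPort: 9092
                        name: kafka
                  env:
                      # Declared first: $(POD_NAME) only expands variables defined earlier in this list
                      - name: POD_NAME
                        valueFrom:
                            fieldRef:
                                fieldPath: metadata.name
                      # Broker IDs must be integers, so use the Pod's ordinal label
                      # (set on StatefulSet Pods in Kubernetes 1.28+) rather than its name
                      - name: KAFKA_BROKER_ID
                        valueFrom:
                            fieldRef:
                                fieldPath: metadata.labels['apps.kubernetes.io/pod-index']
                      # Assumes a ZooKeeper Service named zookeeper already exists
                      - name: KAFKA_ZOOKEEPER_CONNECT
                        value: "zookeeper:2181"
                      - name: KAFKA_ADVERTISED_LISTENERS
                        value: "PLAINTEXT://$(POD_NAME).kafka:9092"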
                  volumeMounts:
                      - name: data
                        mountPath: /var/lib/kafka/data
    volumeClaimTemplates:
        - metadata:
              name: data
          spec:
              accessModes: ["ReadWriteOnce"]
              resources:
                  requests:
                      storage: 10Gi
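
Kafka expects its broker ID to be an integer, while a StatefulSet Pod name such as kafka-0 is a string. One common way to derive a numeric ID inside the container is to take the ordinal suffix of the Pod name, which is also the Pod's hostname. The sketch below shows only that string operation; pod_ordinal is a hypothetical helper name.

```shell
#!/bin/sh
# Extract the ordinal from a StatefulSet Pod name.
# $1: Pod name such as kafka-2.
pod_ordinal() {
    # Strip everything up to and including the last hyphen.
    echo "${1##*-}"
}

# Prints 2
pod_ordinal kafka-2
```

Inside a Pod the same expansion can be applied to the hostname, for example in a startup script that exports the broker ID before launching Kafka.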

Pod Management Policy

Control how Pods are managed during operations.

OrderedReady (Default)

Pods created/deleted sequentially, waiting for each to be Ready:

Kubernetesyml
spec:
    podManagementPolicy: OrderedReady

Parallel

Pods created/deleted in parallel (like Deployment):

Kubernetesyml
spec:
    podManagementPolicy: Parallel

Useful when Pod order doesn't matter but you still need stable identities. This policy only affects scaling operations; updates still follow the update strategy.

Deleting StatefulSets

Delete StatefulSet (Keep Pods)

Kubernetesbash
sudo kubectl delete statefulset web --cascade=orphan

Pods remain running but are no longer managed.

Delete StatefulSet (Delete Pods)

Kubernetesbash
sudo kubectl delete statefulset web

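Pods are deleted along with the StatefulSet, but with no ordering guarantee. For an ordered, graceful shutdown, scale the StatefulSet to 0 replicas before deleting it.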
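
Kubernetes does not guarantee a termination order when a StatefulSet is deleted directly, whereas a scale-down removes Pods from the highest ordinal down. A teardown that relies on this could be sketched as follows; the function name delete_statefulset_in_order, the KUBECTL override variable, and the 120s timeout are assumptions made for illustration.

```shell
#!/bin/sh
# Scale a StatefulSet to zero, wait for its Pods to go away, then delete it.
# $1: StatefulSet name, $2: label selector matching its Pods (e.g. app=nginx).
# KUBECTL is a hypothetical override; it defaults to "sudo kubectl" as used on K3s.
delete_statefulset_in_order() {
    name=$1
    selector=$2
    kubectl_cmd=${KUBECTL:-sudo kubectl}
    # Each step runs only if the previous one succeeded.
    $kubectl_cmd scale statefulset "$name" --replicas=0 &&
        $kubectl_cmd wait --for=delete pod -l "$selector" --timeout=120s &&
        $kubectl_cmd delete statefulset "$name"
}

# Usage (requires a running cluster):
#   delete_statefulset_in_order web app=nginx
```

The PVCs are left in place either way and have to be removed separately.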

Delete PVCs

Kubernetesbash
sudo kubectl delete pvc -l app=nginx

Warning

Warning: Deleting PVCs usually deletes the data permanently, because with the common Delete reclaim policy the underlying volume is removed too. Always back up before deleting.

Common Mistakes and Pitfalls

Mistake 1: No Headless Service

Problem: StatefulSet requires a headless Service.

Solution: Always create headless Service first:

Kubernetesyml
apiVersion: v1
kind: Service
metadata:
    name: nginx
spec:
    clusterIP: None  # Required for StatefulSet
    selector:
        app: nginx

Mistake 2: Wrong Service Name

Problem: serviceName doesn't match Service metadata.name.

Solution: Ensure names match:

Kubernetesyml
# Service
metadata:
    name: nginx
 
# StatefulSet
spec:
    serviceName: "nginx"  # Must match

Mistake 3: No Storage Class

Problem: PVCs can't be provisioned.

Solution: Ensure a StorageClass exists (K3s ships a default one named local-path) or specify one explicitly; fast-ssd below is a placeholder name:

Kubernetesyml
volumeClaimTemplates:
    - metadata:
          name: data
      spec:
          storageClassName: fast-ssd
          accessModes: ["ReadWriteOnce"]
          resources:
              requests:
                  storage: 10Gi

Mistake 4: Deleting PVCs Accidentally

Problem: Data is lost when leftover PVCs are deleted by hand after scaling down.

Solution: PVCs are preserved by design. Delete manually only when certain:

Kubernetesbash
# Scaling down doesn't delete PVCs
sudo kubectl scale statefulset web --replicas=1
 
# PVCs remain for web-1, web-2
sudo kubectl get pvc

Mistake 5: Not Setting Resource Limits

Problem: Pods can consume unlimited resources.

Solution: Always set limits for stateful workloads:

Kubernetesyml
resources:
    requests:
        memory: "512Mi"
        cpu: "500m"
    limits:
        memory: "1Gi"
        cpu: "1000m"

Best Practices

Use Appropriate Storage

Choose storage based on requirements:

Kubernetesyml
# Fast SSD for databases
storageClassName: fast-ssd
 
# Standard HDD for logs
storageClassName: standard

Set Pod Disruption Budget

Protect against voluntary disruptions:

Kubernetespdb.yml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
    name: web-pdb
spec:
    minAvailable: 2
    selector:
        matchLabels:
            app: nginx
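
With three healthy replicas and minAvailable: 2, only one Pod may be evicted voluntarily at a time. The arithmetic is sketched below in simplified form; allowed_disruptions is a hypothetical helper, and the real controller works from the Pods it currently counts as healthy.

```shell
#!/bin/sh
# Number of Pods that may be voluntarily disrupted under a minAvailable budget.
# $1: currently healthy Pods, $2: minAvailable.
allowed_disruptions() {
    healthy=$1
    min_available=$2
    allowed=$((healthy - min_available))
    # A budget never goes negative: below minAvailable, no evictions are allowed.
    if [ "$allowed" -lt 0 ]; then
        allowed=0
    fi
    echo "$allowed"
}

# Prints 1
allowed_disruptions 3 2
```

If one Pod is already down, the result drops to 0 and a node drain waits until the Pod is healthy again.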

Use Init Containers

Prepare environment before main container starts:

Kubernetesyml
initContainers:
    - name: init-config
      image: busybox:1.36
      command:
          - sh
          - -c
          - |
              echo "Initializing..."
              # Setup configuration

Add Health Checks

Monitor Pod health:

Kubernetesyml
livenessProbe:
    tcpSocket:
        port: 3306
    initialDelaySeconds: 30
    periodSeconds: 10
readinessProbe:
    exec:
        command:
            - mysqladmin
            - ping
    initialDelaySeconds: 5
    periodSeconds: 5

Use Anti-Affinity

Spread Pods across nodes:

Kubernetesyml
affinity:
    podAntiAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                  matchExpressions:
                      - key: app
                        operator: In
                        values:
                            - mysql
              topologyKey: kubernetes.io/hostname

Backup Data Regularly

Implement backup strategy:

Kubernetesbash
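# Example: Backup MySQL (the password is read from the container's own MYSQL_ROOT_PASSWORD variable)
sudo kubectl exec mysql-0 -- sh -c 'exec mysqldump -u root -p"$MYSQL_ROOT_PASSWORD" --all-databases' > backup.sql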

Viewing StatefulSet Details

Get StatefulSets

Kubernetesbash
sudo kubectl get statefulsets
sudo kubectl get statefulsets -o wide

Describe StatefulSet

Kubernetesbash
sudo kubectl describe statefulset web

Get Pods

Kubernetesbash
sudo kubectl get pods -l app=nginx

Get PVCs

Kubernetesbash
sudo kubectl get pvc

Check Pod DNS

Kubernetesbash
sudo kubectl run -it --rm debug --image=busybox:1.36 --restart=Never -- nslookup web-0.nginx

Troubleshooting StatefulSets

Check StatefulSet Status

Kubernetesbash
sudo kubectl get statefulset web
sudo kubectl describe statefulset web

Check Pods

Kubernetesbash
sudo kubectl get pods -l app=nginx
sudo kubectl describe pod web-0
sudo kubectl logs web-0

Check PVCs

Kubernetesbash
sudo kubectl get pvc
sudo kubectl describe pvc www-web-0

Check Service

Kubernetesbash
sudo kubectl get service nginx
sudo kubectl describe service nginx

Check Events

Kubernetesbash
sudo kubectl get events --sort-by='.lastTimestamp'
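
The checks above can be bundled into one helper for a quick first look. This is a convenience sketch only: diagnose_statefulset and the KUBECTL override variable are hypothetical names, and the commands are the same ones listed in this section.

```shell
#!/bin/sh
# Run the basic StatefulSet checks in one go.
# $1: StatefulSet name, $2: label selector matching its Pods and PVCs (e.g. app=nginx).
# KUBECTL is a hypothetical override; it defaults to "sudo kubectl" as used on K3s.
diagnose_statefulset() {
    name=$1
    selector=$2
    kubectl_cmd=${KUBECTL:-sudo kubectl}
    $kubectl_cmd get statefulset "$name"
    $kubectl_cmd get pods -l "$selector"
    $kubectl_cmd get pvc -l "$selector"
    $kubectl_cmd get events --sort-by=.lastTimestamp
}

# Usage (requires a running cluster):
#   diagnose_statefulset web app=nginx
```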

Conclusion

In episode 27, we've explored StatefulSet in Kubernetes in depth. We've learned how to manage stateful applications with stable identities, persistent storage, and ordered operations.

Key takeaways:

  • StatefulSet manages stateful applications with unique identities
  • Each Pod gets stable network identity (web-0, web-1, web-2)
  • Headless Service required for DNS-based Pod discovery
  • volumeClaimTemplates create individual PVC per Pod
  • Pods created/deleted sequentially by default
  • Predictable DNS names enable direct Pod communication
  • Scaling preserves PVCs for data safety
  • Two update strategies: RollingUpdate and OnDelete
  • Pod Management Policy controls parallel vs ordered operations
  • Use for databases, message queues, distributed systems
  • Always set resource limits for stateful workloads
  • Implement Pod Disruption Budget for availability
  • Use anti-affinity to spread Pods across nodes
  • Backup data regularly for disaster recovery
  • Different from Deployment: stable identity vs random identity

StatefulSet is essential for running stateful applications in Kubernetes. By understanding StatefulSets, you can confidently deploy and manage databases, distributed systems, and other stateful workloads with guaranteed identity and storage persistence.

Are you getting a clearer understanding of StatefulSet in Kubernetes? Keep your learning momentum going and look forward to the next episode!
