StatefulSets

Deploy stateful applications like databases with stable identities and ordered deployment.

In the previous tutorial, we learned all about DNS and service discovery — how services find each other inside the cluster. Now let's tackle a big challenge: running stateful applications.

Deployments work great for stateless apps where any pod can handle any request. But databases? They need to remember things. They need stable identities, persistent storage, and ordered startup. You can't just kill a database pod and hope a random replacement picks up where it left off. That's chaos.

That's what StatefulSets are for. Think of the difference like this: Deployments treat pods like identical workers in a call center — any one can take any call. StatefulSets treat pods like named employees with assigned desks and personal filing cabinets.

StatefulSet vs Deployment

Let's make this crystal clear:

| Feature        | Deployment             | StatefulSet                              |
|----------------|------------------------|------------------------------------------|
| Pod names      | Random (nginx-abc123)  | Ordered (nginx-0, nginx-1)               |
| Pod identity   | Interchangeable        | Stable, unique                           |
| Storage        | Shared or ephemeral    | Per-pod persistent                       |
| Scaling        | All at once            | One at a time, ordered                   |
| Startup order  | Parallel               | Sequential (0, then 1, then 2)           |
| Deletion order | Random                 | Reverse sequential (2, then 1, then 0)   |

Use StatefulSets for:

  • Databases (PostgreSQL, MySQL, MongoDB)
  • Message queues (Kafka, RabbitMQ)
  • Distributed systems (Elasticsearch, Zookeeper)

Basically, anything that says "I need to remember who I am."

StatefulSet Requirements

"Can I just create a StatefulSet like a Deployment?"

Almost — but StatefulSets need a headless service for network identity. Remember headless services from the DNS tutorial?

apiVersion: v1
kind: Service
metadata:
  name: postgres-headless
spec:
  clusterIP: None  # Headless
  selector:
    app: postgres
  ports:
  - port: 5432

Create a StatefulSet

Here's a PostgreSQL StatefulSet — notice the differences from a Deployment:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  serviceName: postgres-headless  # Links to headless service
  replicas: 3
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
      - name: postgres
        image: postgres:15
        ports:
        - containerPort: 5432
        env:
        - name: POSTGRES_PASSWORD
          valueFrom:
            secretKeyRef:
              name: postgres-secret
              key: password
        - name: PGDATA
          value: /var/lib/postgresql/data/pgdata
        volumeMounts:
        - name: data
          mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 10Gi

Key differences from Deployment:

  • serviceName links to the headless service (required!)
  • volumeClaimTemplates creates a unique PVC per pod (not shared)

This is the magic — each pod gets its own dedicated storage. No sharing, no conflicts.

Stable Pod Identity

Apply and watch the pods come up:

kubectl apply -f postgres-statefulset.yaml
kubectl get pods -w
NAME         READY   STATUS    RESTARTS   AGE
postgres-0   1/1     Running   0          30s
postgres-1   1/1     Running   0          25s
postgres-2   1/1     Running   0          20s

Pods are created sequentially: 0, then 1, then 2. Names are predictable. Not postgres-abc123 but postgres-0. Always. Even if you delete it and it comes back — still postgres-0. How cool is that?
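You can verify the stable identity yourself (this assumes the StatefulSet above is running):

```shell
# Delete a pod and watch its replacement come up
kubectl delete pod postgres-0
kubectl get pods -w
# The replacement is named postgres-0 again — no random suffix
```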

Stable Network Identity

Each pod gets its own DNS name that persists across restarts:

<pod-name>.<service-name>.<namespace>.svc.cluster.local

For our example:

postgres-0.postgres-headless.default.svc.cluster.local
postgres-1.postgres-headless.default.svc.cluster.local
postgres-2.postgres-headless.default.svc.cluster.local

Test DNS resolution:

kubectl run debug --rm -it --restart=Never --image=busybox -- nslookup postgres-0.postgres-headless

Stable Storage

"What happens to the data when a pod dies?"

Each pod gets its own PVC:

kubectl get pvc
NAME             STATUS   VOLUME     CAPACITY   ACCESS MODES   AGE
data-postgres-0  Bound    pvc-xxx    10Gi       RWO            5m
data-postgres-1  Bound    pvc-yyy    10Gi       RWO            4m
data-postgres-2  Bound    pvc-zzz    10Gi       RWO            3m

If a pod is deleted and recreated, it reattaches to the same PVC. Data persists. The pod might die, but its filing cabinet stays right where it was.
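To convince yourself, write some data and then kill the pod. This sketch assumes the official postgres image's default trust auth for local connections; the table name is made up:

```shell
# Create a table on postgres-0, then delete the pod
kubectl exec postgres-0 -- psql -U postgres -c 'CREATE TABLE survival_test (id int);'
kubectl delete pod postgres-0

# Once the replacement pod is Ready, the table is still there —
# the new postgres-0 mounted the same PVC (data-postgres-0)
kubectl exec postgres-0 -- psql -U postgres -c '\dt'
```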

Pod Management Policy

OrderedReady (Default)

Pods start/stop one at a time, each waiting for the previous to be Ready. Like a roll call:

spec:
  podManagementPolicy: OrderedReady

  • Scale up: 0 → 1 → 2 (waits for Ready between each)
  • Scale down: 2 → 1 → 0 (reverse order, like a stack)

Parallel

Start/stop all pods simultaneously (like a Deployment). Use when pods don't depend on each other's startup order:

spec:
  podManagementPolicy: Parallel


Update Strategies

RollingUpdate (Default)

Update pods one at a time, in reverse order:

spec:
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      partition: 0  # Update all pods

  • Updates: 2 → 1 → 0
  • Each pod must be Ready before updating the next

Partition Updates (Canary)

"Can I update just one pod to test before rolling out to all?"

Yes! Only update pods with index >= partition:

spec:
  updateStrategy:
    rollingUpdate:
      partition: 2  # Only update pod 2

Set partition: 2 → only postgres-2 updates. Set partition: 1 → postgres-2 and postgres-1 update. Set partition: 0 → all pods update.

Useful for canary deployments of database updates. Test on one replica before risking your primary. Smart.
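A sketch of that workflow with kubectl (the image tag is illustrative):

```shell
# Gate the rollout: only pods with index >= 2 may update
kubectl patch statefulset postgres \
  -p '{"spec":{"updateStrategy":{"rollingUpdate":{"partition":2}}}}'

# Change the image — only postgres-2 restarts with it
kubectl set image statefulset/postgres postgres=postgres:15.6

# Happy with the canary? Lower the partition to roll out to the rest
kubectl patch statefulset postgres \
  -p '{"spec":{"updateStrategy":{"rollingUpdate":{"partition":0}}}}'
```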

OnDelete

Only updates when a pod is manually deleted — maximum control:

spec:
  updateStrategy:
    type: OnDelete

Scaling StatefulSets

Scale up:

kubectl scale statefulset postgres --replicas=5

Pods are added sequentially: postgres-3, then postgres-4.

Scale down:

kubectl scale statefulset postgres --replicas=2

Pods are removed in reverse: postgres-4, then postgres-3, then postgres-2.

Important: Scaling down does NOT delete PVCs. Data is preserved. Kubernetes is cautious about deleting your data. Good.
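If you do want Kubernetes to clean up PVCs automatically, newer clusters support an opt-in retention policy on the StatefulSet spec (the field went stable in v1.27):

```yaml
spec:
  persistentVolumeClaimRetentionPolicy:
    whenDeleted: Retain   # keep PVCs when the StatefulSet is deleted (the default)
    whenScaled: Delete    # delete a PVC when its pod is removed by scale-down
```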

Real-World Example: PostgreSQL with Replication

Okay, let's build something real. A primary-replica PostgreSQL setup where postgres-0 is the primary and the rest are read replicas:

apiVersion: v1
kind: ConfigMap
metadata:
  name: postgres-config
data:
  primary.conf: |
    wal_level = replica
    max_wal_senders = 3
    synchronous_commit = on
  replica.conf: |
    primary_conninfo = 'host=postgres-0.postgres-headless port=5432 user=replicator password=replpass'
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  serviceName: postgres-headless
  replicas: 3
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
      - name: postgres
        image: postgres:15
        ports:
        - containerPort: 5432
        env:
        - name: POSTGRES_PASSWORD
          valueFrom:
            secretKeyRef:
              name: postgres-secret
              key: password
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: PGDATA  # subdirectory, so lost+found in the volume root doesn't break initdb
          value: /var/lib/postgresql/data/pgdata
        command:
        - bash
        - -c
        - |
          if [[ $POD_NAME == "postgres-0" ]]; then
            # Primary: accepts writes, streams WAL to the replicas
            docker-entrypoint.sh postgres -c config_file=/etc/postgres/primary.conf
          else
            # Replica: clone the primary's data dir, then start as a standby
            PGPASSWORD=replpass pg_basebackup -h postgres-0.postgres-headless -U replicator -D "$PGDATA" -Fp -Xs -R
            docker-entrypoint.sh postgres -c config_file=/etc/postgres/replica.conf
          fi
        volumeMounts:
        - name: data
          mountPath: /var/lib/postgresql/data
        - name: config
          mountPath: /etc/postgres
      volumes:
      - name: config
        configMap:
          name: postgres-config
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 10Gi
---
apiVersion: v1
kind: Service
metadata:
  name: postgres-headless
spec:
  clusterIP: None
  selector:
    app: postgres
  ports:
  - port: 5432
---
# Service for primary only (writes)
apiVersion: v1
kind: Service
metadata:
  name: postgres-primary
spec:
  selector:
    app: postgres
    statefulset.kubernetes.io/pod-name: postgres-0
  ports:
  - port: 5432
---
# Service for reads (note: this selector matches every pod, including the primary)
apiVersion: v1
kind: Service
metadata:
  name: postgres-replicas
spec:
  selector:
    app: postgres
  ports:
  - port: 5432

Applications connect to:

  • postgres-primary for writes (always goes to postgres-0)
  • postgres-replicas for reads (load balanced across all pods — with the selector above, that includes the primary)
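An application can then split traffic by pointing its read and write connections at the two services. Something like this in the app's pod spec (the env var names are made up for illustration):

```yaml
env:
- name: DB_WRITE_HOST
  value: postgres-primary     # always resolves to postgres-0
- name: DB_READ_HOST
  value: postgres-replicas    # load balanced across the pods
- name: DB_PORT
  value: "5432"
```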

That's a real database cluster running in Kubernetes with read/write splitting. Pretty slick, right?

Deleting StatefulSets

Here's something important to know:

Delete the StatefulSet:

kubectl delete statefulset postgres

PVCs are NOT deleted — data is preserved. Kubernetes doesn't throw away your data just because you deleted the StatefulSet. You have to explicitly delete PVCs:

kubectl delete pvc -l app=postgres

This is a safety feature. Appreciate it.

Delete with Cascade

By default, deletion cascades in the background: pods are removed, but the command returns immediately. Foreground cascading waits until all pods are gone before removing the StatefulSet itself (PVCs still survive either way):

kubectl delete statefulset postgres --cascade=foreground

To delete the StatefulSet but leave its pods running, use --cascade=orphan.

Troubleshooting

StatefulSets can be finicky. Here's your debugging playbook.

Pod Stuck in Pending

Usually a storage issue. Check PVC:

kubectl get pvc
kubectl describe pvc data-postgres-0

Common issues:

  • No available PersistentVolume (did you set up a StorageClass?)
  • StorageClass doesn't exist (typo, maybe?)
  • Insufficient storage on the node

Pod Won't Start

kubectl describe pod postgres-0
kubectl logs postgres-0

Check:

  • Init containers completed
  • Volume mounted correctly
  • Application configuration

Pods Not Scaling

kubectl describe statefulset postgres

With OrderedReady, each pod must be Ready before the next starts. If postgres-1 isn't healthy, postgres-2 will never start. Check the stuck pod first.

When NOT to Use StatefulSets

"Should I use a StatefulSet for everything?"

Nope! StatefulSets add complexity. Use them only when you actually need stable identity and per-pod storage.

  • Stateless applications? Use Deployment
  • All pods can share the same storage? Use Deployment + PVC
  • Ordering doesn't matter? Use Deployment

Don't reach for StatefulSets just because it sounds fancier. Use the simplest tool that solves your problem.

Clean Up

kubectl delete statefulset postgres
kubectl delete svc postgres-headless postgres-primary postgres-replicas
kubectl delete pvc -l app=postgres
kubectl delete configmap postgres-config
kubectl delete secret postgres-secret

What's Next?

Awesome work! You now know how to run stateful applications in Kubernetes with stable identities, predictable DNS names, and per-pod persistent storage. Databases, message queues, distributed systems — you've got them covered.

But what about running a pod on every node in your cluster? Like log collectors, monitoring agents, or security tools? That's where DaemonSets come in. Let's go!